Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Methods

158 machine learning methods and techniques


Poincaré Embeddings

Poincaré Embeddings learn hierarchical representations of symbolic data by embedding them into hyperbolic space -- more precisely, into an n-dimensional Poincaré ball. The underlying hyperbolic geometry allows parsimonious representations of symbolic data that capture hierarchy and similarity simultaneously. Embeddings are learned via Riemannian optimization.
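The key property is the hyperbolic distance on the ball: points near the boundary are far from each other even when Euclidean-close, which is what lets a tree's root sit near the origin with leaves fanned out toward the boundary. A minimal NumPy sketch of that distance (illustrative only, not the authors' implementation):

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Hyperbolic distance between two points inside the unit Poincaré ball:
    d(u, v) = arccosh(1 + 2 ||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    sq_dist = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq_dist / max(denom, eps))

# A leaf embedded near the boundary is hyperbolically far from the origin,
# even though its Euclidean distance is under 1.
origin = np.zeros(2)
leaf = np.array([0.9, 0.0])
d = poincare_distance(origin, leaf)
```

For a point at radius r, the distance to the origin reduces to 2·artanh(r), which grows without bound as r approaches 1.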

Natural Language Processing · Introduced 2000 · 1 paper

Adaptively Sparse Transformer

The Adaptively Sparse Transformer is a Transformer variant that replaces the softmax in attention with α-entmax, a differentiable sparse normalizing transformation that can assign exactly zero weight to irrelevant tokens. The α parameter is learned per attention head, so each head can adapt its own degree of sparsity, ranging from dense softmax-like behavior to highly selective attention.
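The α = 2 member of the entmax family is sparsemax, which has a simple closed form (Euclidean projection onto the probability simplex). A sketch of sparsemax to illustrate how an attention distribution can be exactly sparse (the model itself learns α per head; this fixed-α case is only for illustration):

```python
import numpy as np

def sparsemax(z):
    """Sparsemax: Euclidean projection of scores z onto the probability
    simplex. Unlike softmax, it can assign exactly zero probability to
    low scores. It is the alpha = 2 special case of alpha-entmax."""
    z = np.asarray(z, float)
    z_sorted = np.sort(z)[::-1]
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1.0 + k * z_sorted > cumsum      # which scores stay nonzero
    k_max = k[support][-1]
    tau = (cumsum[k_max - 1] - 1.0) / k_max    # threshold
    return np.maximum(z - tau, 0.0)

# One dominant score: all mass goes to it, the rest are exactly zero.
p = sparsemax([3.0, 1.0, -2.0])
# Two close scores: mass is shared, still summing to 1.
q = sparsemax([1.0, 0.9])
```

Softmax on the same inputs would give every token a strictly positive weight; sparsemax truncates the tail outright.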

Natural Language Processing · Introduced 2000 · 1 paper

Lbl2TransformerVec

Natural Language Processing · Introduced 2000 · 1 paper

KE-MLM

Knowledge Enhanced Masked Language Model

Natural Language Processing · Introduced 2000 · 1 paper

TaxoExpan

TaxoExpan is a self-supervised taxonomy expansion framework. It automatically generates a set of ⟨query concept, anchor concept⟩ pairs from the existing taxonomy as training data. Using such self-supervision data, TaxoExpan learns a model to predict whether a query concept is the direct hyponym of an anchor concept. TaxoExpan features: (1) a position-enhanced graph neural network that encodes the local structure of an anchor concept in the existing taxonomy, and (2) a noise-robust training objective that makes the learned model insensitive to label noise in the self-supervision data.
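The self-supervision step can be sketched as follows: each hypernym edge yields a positive ⟨query, anchor⟩ pair, and negatives are sampled from other nodes (function and variable names here are illustrative, not the authors' API, and the real framework scores candidates with the graph neural network rather than raw node names):

```python
import random

def make_self_supervision(edges, num_negatives=1, seed=0):
    """Build <query, anchor> training examples from an existing taxonomy.

    edges: list of (parent, child) hypernym edges.
    Each edge yields one positive (query=child, anchor=parent, label=1)
    plus sampled negative anchors (label=0), which is why the training
    objective must tolerate some label noise.
    """
    rng = random.Random(seed)
    nodes = sorted({n for edge in edges for n in edge})
    examples = []
    for parent, child in edges:
        examples.append((child, parent, 1))           # true hypernym pair
        candidates = [n for n in nodes if n not in (parent, child)]
        for anchor in rng.sample(candidates, min(num_negatives, len(candidates))):
            examples.append((child, anchor, 0))       # sampled negative
    return examples

taxonomy = [("science", "biology"), ("biology", "genetics"), ("science", "physics")]
data = make_self_supervision(taxonomy)
```

Because negatives are sampled blindly, a sampled "negative" can occasionally be a valid hypernym, which is the label noise the noise-robust objective is designed to absorb.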

Natural Language Processing · Introduced 2000 · 1 paper

RealFormer

RealFormer is a type of Transformer based on the idea of residual attention. It adds skip edges to the backbone Transformer to create multiple direct paths, one for each type of attention module. It adds no parameters or hyper-parameters. Specifically, RealFormer uses a Post-LN style Transformer as backbone and adds skip edges to connect Multi-Head Attention modules in adjacent layers.
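The skip edge operates on the raw (pre-softmax) attention score matrices: each layer adds the previous layer's scores to its own before normalizing. A single-head NumPy sketch (shapes and names are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def residual_attention(q, k, v, prev_scores=None):
    """One attention module with a RealFormer-style residual score edge.

    Raw scores are computed as in scaled dot-product attention; the
    previous layer's raw scores are added *before* the softmax, giving a
    direct path through the score matrices across layers. No parameters
    are added."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    if prev_scores is not None:
        scores = scores + prev_scores       # the skip edge
    out = softmax(scores) @ v
    return out, scores                      # pass raw scores to the next layer

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out1, s1 = residual_attention(x, x, x)                     # layer 1
out2, s2 = residual_attention(x, x, x, prev_scores=s1)     # layer 2
```

Since the edge carries raw scores rather than outputs, it composes freely with the Post-LN backbone and adds no parameters or hyper-parameters, as stated above.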

Natural Language Processing · Introduced 2000 · 1 paper

Factorized Random Synthesized Attention

Factorized Random Synthesized Attention, introduced with the Synthesizer architecture, is similar to factorized dense synthesized attention, but for random synthesizers. Letting $R$ be a randomly initialized matrix, we factorize it into low-rank matrices $R_1, R_2 \in \mathbb{R}^{\ell \times k}$ in the attention function:

$$ Y = \text{Softmax}\left(R_1R_2^{\top}\right)G\left(X\right) $$

Here $G\left(\cdot\right)$ is a parameterized function that is equivalent to $V$ in Scaled Dot-Product Attention. For each head, the factorization reduces the parameter cost from $\ell^{2}$ to $2\ell k$, where $k \ll \ell$, and hence helps prevent overfitting. In practice, a small value of $k$ is used. The basic idea of a Random Synthesizer is to not rely on pairwise token interactions or any information from individual tokens, but rather to learn a task-specific alignment that works well globally across many samples.
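A minimal sketch of the equation above, taking $G(X) = XW_v$ as the value projection (names are illustrative; this is not the Synthesizer codebase):

```python
import numpy as np

def factorized_random_synth_attention(X, R1, R2, W_v):
    """Y = Softmax(R1 R2^T) G(X), with G(X) = X W_v.

    The alignment matrix does not depend on X at all: it is the low-rank
    product of two learned (l, k) matrices, costing 2*l*k parameters
    instead of l*l for a full random score matrix."""
    scores = R1 @ R2.T                               # (l, l), input-independent
    scores = scores - scores.max(axis=-1, keepdims=True)
    A = np.exp(scores)
    A = A / A.sum(axis=-1, keepdims=True)            # row-wise softmax
    return A @ (X @ W_v)

l, k, d = 6, 2, 8                                     # k << l is the saving
rng = np.random.default_rng(0)
X = rng.normal(size=(l, d))
R1, R2 = rng.normal(size=(l, k)), rng.normal(size=(l, k))
W_v = rng.normal(size=(d, d))
Y = factorized_random_synth_attention(X, R1, R2, W_v)
```

Note that the same alignment matrix is applied to every input sequence, which is exactly the "global, task-specific alignment" idea described above.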

Natural Language Processing · Introduced 2000 · 1 paper

EDLPS

Encoder-Decoder model with local and pairwise loss along with shared encoder and discriminator network (EDLPS)

EDLPS obtains sentence-level embeddings as a by-product of solving the paraphrase generation task. A sequential encoder-decoder model generates a paraphrase, and the training objective constrains embeddings of true paraphrases to be close while pushing embeddings of unrelated candidate sentences apart. This constraint is enforced by a sequential pairwise discriminator that shares weights with the encoder and is trained with a loss that penalizes large distances between paraphrase embeddings. The resulting embeddings are additionally validated on a sentiment analysis task, and the method gives competitive, statistically significant results on standard paraphrase generation and sentiment analysis datasets. Code: https://github.com/dev-chauhan/PQG-pytorch. The PQG dataset is built from the Quora question-pairs release (https://www.quora.com/q/quoradata/First-Quora-Dataset-Release-Question-Pairs); sentiment analysis uses www.kaggle.com/c/sentiment-analysis-on-movie-reviews.
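The pairwise constraint described above can be sketched as a standard contrastive loss: pull paraphrase embeddings together, push non-paraphrases at least a margin apart. This is a generic formulation for illustration; the paper's exact loss may differ.

```python
import numpy as np

def pairwise_discriminator_loss(emb_a, emb_b, is_paraphrase, margin=1.0):
    """Contrastive sketch of the pairwise constraint: penalize paraphrase
    embeddings for being far apart, and non-paraphrase embeddings for
    being closer than `margin`."""
    dist = np.linalg.norm(emb_a - emb_b)
    if is_paraphrase:
        return dist ** 2                       # paraphrases should be close
    return max(0.0, margin - dist) ** 2        # non-paraphrases should be far

a = np.array([1.0, 0.0])
b = np.array([1.0, 0.1])    # near-duplicate of a
c = np.array([-1.0, 0.0])   # far from a
```

In EDLPS this term is combined with the usual sequence-to-sequence generation loss, so the encoder is shaped by both objectives at once.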

Natural Language Processing · Introduced 2000