Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


MPNet

Natural Language Processing · Introduced 2020 · 17 papers
Source Paper

Description

MPNet is a pre-training method for language models that combines masked language modeling (MLM) and permuted language modeling (PLM) in a single view. Through permuted language modeling it takes the dependency among the predicted tokens into account, avoiding the output dependency issue of BERT's MLM. At the same time, it takes the position information of all tokens as input, so the model sees the positions of the full sentence, alleviating the position discrepancy of XLNet.

The training objective of MPNet is:

$$\mathbb{E}_{z\in\mathcal{Z}_{n}} \sum_{t=c+1}^{n}\log P\left(x_{z_{t}}\mid x_{z_{<t}}, M_{z_{>c}}; \theta\right)$$

As can be seen, MPNet conditions on $x_{z_{<t}}$ (the tokens preceding the current predicted token $x_{z_{t}}$) rather than only on the non-predicted tokens $x_{z_{\le c}}$ as in MLM; compared with PLM, MPNet takes more information (i.e., the mask symbols $[M]$ at positions $z_{>c}$) as input. Although the objective looks simple, implementing the model efficiently is challenging. For details, see the paper.
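The input construction behind this objective can be sketched in a few lines. The following is a simplified, hypothetical illustration (not the authors' implementation): given a token sequence, a permutation $z$, and a split point $c$, the first $c$ permuted tokens are kept as the non-predicted part, while the remaining positions are filled with mask symbols $[M]$ so that every prediction step still sees the position information of all $n$ tokens.

```python
# Simplified sketch of MPNet-style input construction (illustrative only;
# the real model uses two-stream attention and operates on token ids).

MASK = "[M]"

def mpnet_inputs(tokens, z, c):
    """Build (content stream, position ids, targets) for one sequence.

    tokens: the sequence x_1..x_n (0-indexed here)
    z:      a permutation of range(len(tokens))
    c:      number of non-predicted tokens
    """
    n = len(tokens)
    assert sorted(z) == list(range(n)) and 0 <= c <= n
    # Non-predicted part: the first c permuted tokens, kept as-is.
    content = [tokens[z[t]] for t in range(c)]
    positions = [z[t] for t in range(c)]
    # Predicted part: mask symbols that still carry the positions z_{>c},
    # so the model sees the position information of all n tokens.
    content += [MASK] * (n - c)
    positions += [z[t] for t in range(c, n)]
    # Targets: the tokens at the predicted positions, in permuted order.
    targets = [tokens[z[t]] for t in range(c, n)]
    return content, positions, targets

tokens = ["the", "task", "is", "sentence", "classification"]
z = [4, 0, 2, 1, 3]   # permuted order
c = 3                 # first 3 permuted tokens are non-predicted
content, positions, targets = mpnet_inputs(tokens, z, c)
# content   -> ["classification", "the", "is", "[M]", "[M]"]
# positions -> [4, 0, 2, 1, 3]
# targets   -> ["task", "sentence"]
```

Note how, unlike MLM, each target is predicted conditioned on the permuted prefix rather than only the non-predicted tokens, and, unlike PLM, the mask symbols expose the positions of all remaining tokens.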

Papers Using This Method

- Computational Detection of Intertextual Parallels in Biblical Hebrew: A Benchmark Study Using Transformer-Based Language Models (2025-06-30)
- Large Language Model Guided Progressive Feature Alignment for Multimodal UAV Object Detection (2025-03-10)
- "Actionable Help" in Crises: A Novel Dataset and Resource-Efficient Models for Identifying Request and Offer Social Media Posts (2025-02-24)
- Explicit Depth-Aware Blurry Video Frame Interpolation Guided by Differential Curves (2025-01-01)
- CReMa: Crisis Response through Computational Identification and Matching of Cross-Lingual Requests and Offers Shared on Social Media (2024-05-20)
- Harnessing PubMed User Query Logs for Post Hoc Explanations of Recommended Similar Articles (2024-02-05)
- RECipe: Does a Multi-Modal Recipe Knowledge Graph Fit a Multi-Purpose Recommendation System? (2023-08-08)
- Specious Sites: Tracking the Spread and Sway of Spurious News Stories at Scale (2023-08-03)
- Identifying Misinformation on YouTube through Transcript Contextual Analysis with Transformer Models (2023-07-22)
- Utilizing ChatGPT Generated Data to Retrieve Depression Symptoms from Social Media (2023-07-05)
- Vec2Vec: A Compact Neural Network Approach for Transforming Text Embeddings with High Fidelity (2023-06-22)
- Partial Mobilization: Tracking Multilingual Information Flows Amongst Russian Media Outlets and Telegram (2023-01-25)
- Using Large Pre-Trained Language Model to Assist FDA in Premarket Medical Device (2022-11-03)
- Happenstance: Utilizing Semantic Search to Track Russian State Media Narratives about the Russo-Ukrainian War On Reddit (2022-05-28)
- YoungSheldon at SemEval-2021 Task 5: Fine-tuning Pre-trained Language Models for Toxic Spans Detection using Token classification Objective (2021-08-01)
- mpNet: variable depth unfolded neural network for massive MIMO channel estimation (2020-08-07)
- MPNet: Masked and Permuted Pre-training for Language Understanding (2020-04-20)