Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Language Model Pre-Training with Sparse Latent Typing

Liliang Ren, Zixuan Zhang, Han Wang, Clare R. Voss, ChengXiang Zhai, Heng Ji

2022-10-23 · Few-shot NER · Language Modelling
Paper · PDF · Code (official)

Abstract

Modern large-scale Pre-trained Language Models (PLMs) have achieved tremendous success on a wide range of downstream tasks. However, most LM pre-training objectives focus only on text reconstruction and do not seek to learn latent-level interpretable representations of sentences. In this paper, we push language models toward a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model learns interpretable latent type categories in a self-supervised manner without using any external knowledge. Moreover, a language model pre-trained with this objective also significantly improves Information Extraction-related downstream tasks in both supervised and few-shot settings. Our code is publicly available at: https://github.com/renll/SparseLT.
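To make the idea concrete, here is a minimal, hypothetical sketch of the core mechanism the abstract describes: each token representation is classified into one of several latent types or a "null" (non-keyword) slot, and a sparsity penalty discourages assigning real types to most tokens. All names, shapes, and the penalty form are illustrative assumptions, not the paper's actual implementation (which trains this end-to-end with the LM objective).

```python
import numpy as np

def sparse_latent_typing(token_reprs, type_proj, temp=1.0):
    """Toy illustration of sparse latent typing (not the paper's code).

    token_reprs: (seq_len, hidden) token representations from an encoder.
    type_proj:   (hidden, num_types + 1) projection; column 0 is a
                 hypothetical "null" slot meaning "not a keyword".
    """
    logits = token_reprs @ type_proj                      # (seq_len, num_types + 1)
    probs = np.exp(logits / temp)
    probs /= probs.sum(axis=1, keepdims=True)             # softmax over type slots
    assignments = probs.argmax(axis=1)                    # hard type per token
    # Sparsity objective: penalize the fraction of tokens that take a
    # non-null type, so only a few "keyword" tokens receive latent types.
    keyword_mask = assignments != 0
    sparsity_penalty = keyword_mask.mean()
    return assignments, sparsity_penalty

# Tiny usage example with random inputs (illustrative only).
rng = np.random.default_rng(0)
seq_len, hidden, num_types = 8, 16, 4
token_reprs = rng.normal(size=(seq_len, hidden))
type_proj = rng.normal(size=(hidden, num_types + 1))
assignments, penalty = sparse_latent_typing(token_reprs, type_proj)
```

In the actual model this penalty would be added to the reconstruction loss and optimized jointly, so the encoder learns both which tokens are keywords and which latent type each one takes.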

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Named Entity Recognition (NER) | Few-NERD (INTRA) | 10 way 1~2 shot | 40.48 | BERT-SparseLT + CONTaiNER |
| Named Entity Recognition (NER) | Few-NERD (INTRA) | 10 way 5~10 shot | 53.04 | BERT-SparseLT + CONTaiNER |
| Named Entity Recognition (NER) | Few-NERD (INTRA) | 5 way 1~2 shot | 47.2 | BERT-SparseLT + CONTaiNER |
| Named Entity Recognition (NER) | Few-NERD (INTRA) | 5 way 5~10 shot | 59.67 | BERT-SparseLT + CONTaiNER |
| Named Entity Recognition (NER) | Few-NERD (INTER) | 10 way 1~2 shot | 52.75 | BERT-SparseLT + CONTaiNER |
| Named Entity Recognition (NER) | Few-NERD (INTER) | 10 way 5~10 shot | 62.43 | BERT-SparseLT + CONTaiNER |
| Named Entity Recognition (NER) | Few-NERD (INTER) | 5 way 1~2 shot | 57.14 | BERT-SparseLT + CONTaiNER |
| Named Entity Recognition (NER) | Few-NERD (INTER) | 5 way 5~10 shot | 66.17 | BERT-SparseLT + CONTaiNER |

Related Papers

- Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
- Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
- VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
- The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)
- Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
- Assay2Mol: large language model-based drug design using BioAssay context (2025-07-16)
- Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)
- InstructFLIP: Exploring Unified Vision-Language Model for Face Anti-spoofing (2025-07-16)