Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


BERT

Natural Language Processing · Introduced 2018 · 6938 papers
Source Paper

Description

BERT, or Bidirectional Encoder Representations from Transformers, removes the unidirectionality constraint of standard left-to-right Transformers by using a masked language model (MLM) pre-training objective. The masked language model randomly masks some of the input tokens, and the objective is to predict the original vocabulary id of each masked token based only on its context. Unlike left-to-right language model pre-training, the MLM objective lets the representation fuse the left and the right context, which makes it possible to pre-train a deep bidirectional Transformer. In addition to the masked language model, BERT uses a next sentence prediction (NSP) task that jointly pre-trains text-pair representations.
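The masking procedure described above can be sketched in plain Python. This is a minimal illustration, not the reference implementation: the token ids, `mask_id`, and `vocab_size` below are placeholder values, and the 80/10/10 split (mask token / random token / unchanged) follows the scheme described in the BERT paper.

```python
import random

def mask_tokens(token_ids, mask_id, vocab_size, mask_prob=0.15, rng=None):
    """BERT-style MLM masking sketch: select ~15% of positions; of those,
    80% become the mask token, 10% a random token, 10% stay unchanged.
    Returns (masked_ids, labels); labels is -100 at unselected positions,
    a common convention for "ignore this position in the loss"."""
    rng = rng or random.Random()
    masked = list(token_ids)
    labels = [-100] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if rng.random() < mask_prob:
            labels[i] = tok            # the model must recover the original id
            r = rng.random()
            if r < 0.8:
                masked[i] = mask_id    # replace with [MASK]
            elif r < 0.9:
                masked[i] = rng.randrange(vocab_size)  # replace with random token
            # else: keep the original token at a predicted position

# Placeholder ids; real inputs would come from a WordPiece tokenizer.
    return masked, labels

ids = [101, 2023, 2003, 1037, 7953, 102]
masked, labels = mask_tokens(ids, mask_id=103, vocab_size=30522,
                             rng=random.Random(1))
```

Keeping 10% of selected tokens unchanged forces the model to produce useful representations for every input position, since it cannot tell which tokens were corrupted.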

There are two steps in BERT: pre-training and fine-tuning. During pre-training, the model is trained on unlabeled data over the different pre-training tasks. For fine-tuning, the model is initialized with the pre-trained parameters, and all of the parameters are then fine-tuned using labeled data from the downstream task. Each downstream task gets its own fine-tuned model, even though all of them start from the same pre-trained parameters.
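The key point in the paragraph above — one shared pre-trained initialization, a separate fine-tuned copy per task — can be sketched in a few lines of framework-free Python. The parameter dict, the gradient values, and the single update step are all hypothetical stand-ins for real BERT weights and a real training loop:

```python
import copy

# Hypothetical stand-in for the shared pre-trained BERT parameters.
pretrained = {"embeddings": [0.1, 0.2], "encoder.layer0": [0.3, 0.4]}

def fine_tune(params, task_gradients, lr=0.01):
    """One illustrative gradient step on a *copy* of the parameters,
    so the shared pre-trained weights are never mutated."""
    tuned = copy.deepcopy(params)
    for name, grads in task_gradients.items():
        tuned[name] = [w - lr * g for w, g in zip(tuned[name], grads)]
    return tuned

# Each downstream task yields a separate model from the same initialization.
sentiment_model = fine_tune(pretrained, {"encoder.layer0": [1.0, -1.0]})
qa_model        = fine_tune(pretrained, {"embeddings": [0.5, 0.5]})
```

After both calls, `pretrained` is unchanged, while `sentiment_model` and `qa_model` have diverged from it (and from each other) according to their task-specific updates.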

Papers Using This Method

Developing Visual Augmented Q&A System using Scalable Vision Embedding Retrieval & Late Interaction Re-ranker (2025-07-16)
Addressing Data Imbalance in Transformer-Based Multi-Label Emotion Detection with Weighted Loss (2025-07-15)
SentiDrop: A Multi Modal Machine Learning model for Predicting Dropout in Distance Learning (2025-07-14)
Leveraging RAG-LLMs for Urban Mobility Simulation and Analysis (2025-07-14)
The Dark Side of LLMs Agent-based Attacks for Complete Computer Takeover (2025-07-09)
Orchestrator-Agent Trust: A Modular Agentic AI Visual Classification System with Trust-Aware Orchestration and RAG-Based Reasoning (2025-07-09)
SARA: Selective and Adaptive Retrieval-augmented Generation with Context Compression (2025-07-08)
AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models (2025-07-07)
Knowledge Protocol Engineering: A New Paradigm for AI in Domain-Specific Knowledge Work (2025-07-03)
CyberRAG: An agentic RAG cyber attack classification and reporting tool (2025-07-03)
Robustness of Misinformation Classification Systems to Adversarial Examples Through BeamAttack (2025-06-30)
Knowledge Augmented Finetuning Matters in both RAG and Agent Based Dialog Systems (2025-06-28)
ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation (2025-06-27)
Response Quality Assessment for Retrieval-Augmented Generation via Conditional Conformal Factuality (2025-06-26)
PsyLite Technical Report (2025-06-26)
EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora (2025-06-26)
Leveraging LLM-Assisted Query Understanding for Live Retrieval-Augmented Generation (2025-06-26)
AI Assistants to Enhance and Exploit the PETSc Knowledge Base (2025-06-25)
CCRS: A Zero-Shot LLM-as-a-Judge Framework for Comprehensive RAG Evaluation (2025-06-25)
Knowledge-Aware Diverse Reranking for Cross-Source Question Answering (2025-06-25)