Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


LLaMA

Natural Language Processing · Introduced 2023 · 1062 papers
Source Paper: LLaMA: Open and Efficient Foundation Language Models (Touvron et al., 2023)

Description

LLaMA is a collection of foundation language models ranging from 7B to 65B parameters. It is based on the transformer architecture, incorporating several improvements that were proposed after the original design. The main differences from the original architecture are listed below; a minimal code sketch of the three components follows the list.

  • The RMSNorm normalizing function is used to improve training stability by normalizing the input of each transformer sub-layer, instead of the output.
  • The ReLU non-linearity is replaced by the SwiGLU activation function to improve performance.
  • Absolute positional embeddings are removed; instead, rotary position embeddings (RoPE) are applied at each layer of the network.
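
For concreteness, the sketch below gives minimal PyTorch implementations of these three components. It is an illustrative sketch, not the reference LLaMA code: the module names, the eps default, the RoPE base of 10000, and the rotate-half RoPE convention (LLaMA's own implementation rotates interleaved channel pairs) are assumptions made for readability.

```python
# Minimal PyTorch sketches of the three LLaMA modifications described above.
# Names and hyperparameter defaults are illustrative, not the paper's exact code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    """Root-mean-square layer norm: rescales by the RMS of the input,
    with no mean subtraction and no bias; applied to sub-layer *inputs*."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)


class SwiGLUFeedForward(nn.Module):
    """Feed-forward block with the SwiGLU activation: a SiLU-gated
    linear unit replacing the original ReLU MLP."""

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))


def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotary position embeddings (rotate-half convention) for x of shape
    (batch, seq_len, heads, head_dim): rotate each channel pair by a
    position-dependent angle instead of adding absolute position vectors."""
    _, seq_len, _, d = x.shape
    half = d // 2
    freqs = base ** (-torch.arange(0, half, dtype=torch.float32) / half)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
    cos = angles.cos()[None, :, None, :]  # shape (1, seq, 1, half)
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```

In a LLaMA-style transformer block, RMSNorm is applied to the input of both the attention and feed-forward sub-layers, and RoPE is applied to the query and key tensors before the attention product.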

Papers Using This Method

  • Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
  • Simplifications are Absolutists: How Simplified Language Reduces Word Sense Awareness in LLM-Generated Definitions (2025-07-16)
  • Seq vs Seq: An Open Suite of Paired Encoders and Decoders (2025-07-15)
  • Compactor: Calibrated Query-Agnostic KV Cache Compression with Approximate Leverage Scores (2025-07-10)
  • Evaluation of Habitat Robotics using Large Language Models (2025-07-08)
  • MusiScene: Leveraging MU-LLaMA for Scene Imagination and Enhanced Video Background Music Generation (2025-07-08)
  • any4: Learned 4-bit Numeric Representation for LLMs (2025-07-07)
  • Model Inversion Attacks on Llama 3: Extracting PII from Large Language Models (2025-07-06)
  • Large Language Models Acing Chartered Accountancy (2025-06-26)
  • CCRS: A Zero-Shot LLM-as-a-Judge Framework for Comprehensive RAG Evaluation (2025-06-25)
  • OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling (2025-06-25)
  • Can LLMs Replace Humans During Code Chunking? (2025-06-24)
  • Shrinking the Generation-Verification Gap with Weak Verifiers (2025-06-22)
  • Mental Health Equity in LLMs: Leveraging Multi-Hop Question Answering to Detect Amplified and Silenced Perspectives (2025-06-22)
  • Pre-Trained LLM is a Semantic-Aware and Generalizable Segmentation Booster (2025-06-22)
  • A Minimalist Optimizer Design for LLM Pretraining (2025-06-20)
  • I Know Which LLM Wrote Your Code Last Summer: LLM generated Code Stylometry for Authorship Attribution (2025-06-18)
  • PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning (2025-06-18)
  • All is Not Lost: LLM Recovery without Checkpoints (2025-06-18)
  • Evaluating Large Language Models for Phishing Detection, Self-Consistency, Faithfulness, and Explainability (2025-06-16)