Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


LLaMA

Natural Language Processing · Introduced 2023 · 1062 papers
Source Paper: LLaMA: Open and Efficient Foundation Language Models (Touvron et al., 2023)

Description

LLaMA is a collection of foundation language models ranging from 7B to 65B parameters. It is based on the transformer architecture, incorporating several improvements that were proposed after the original design. The main differences from the original architecture are listed below; a minimal code sketch of the three components follows the list.

  • The RMSNorm normalizing function is used to improve training stability by normalizing the input of each transformer sub-layer, instead of the output.
  • The ReLU non-linearity is replaced by the SwiGLU activation function to improve performance.
  • Absolute positional embeddings are removed; instead, rotary position embeddings (RoPE) are applied at each layer of the network.
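
For concreteness, the sketch below gives minimal PyTorch implementations of these three components. It is an illustrative sketch, not the reference LLaMA code: the module names, the eps default, the RoPE base of 10000, and the rotate-half RoPE convention (LLaMA's own implementation rotates interleaved channel pairs) are assumptions made for readability.

```python
# Minimal PyTorch sketches of the three LLaMA modifications described above.
# Names and hyperparameter defaults are illustrative, not the paper's exact code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RMSNorm(nn.Module):
    """Root-mean-square layer norm: rescales by the RMS of the input,
    with no mean subtraction and no bias; applied to sub-layer *inputs*."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)


class SwiGLUFeedForward(nn.Module):
    """Feed-forward block with the SwiGLU activation: a SiLU-gated
    linear unit replacing the original ReLU MLP."""

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))


def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotary position embeddings (rotate-half convention) for x of shape
    (batch, seq_len, heads, head_dim): rotate each channel pair by a
    position-dependent angle instead of adding absolute position vectors."""
    _, seq_len, _, d = x.shape
    half = d // 2
    freqs = base ** (-torch.arange(0, half, dtype=torch.float32) / half)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
    cos = angles.cos()[None, :, None, :]  # shape (1, seq, 1, half)
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```

In a LLaMA-style transformer block, RMSNorm is applied to the input of both the attention and feed-forward sub-layers, and RoPE is applied to the query and key tensors before the attention product.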

Papers Using This Method

  • Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
  • Simplifications are Absolutists: How Simplified Language Reduces Word Sense Awareness in LLM-Generated Definitions (2025-07-16)
  • Seq vs Seq: An Open Suite of Paired Encoders and Decoders (2025-07-15)
  • Compactor: Calibrated Query-Agnostic KV Cache Compression with Approximate Leverage Scores (2025-07-10)
  • Evaluation of Habitat Robotics using Large Language Models (2025-07-08)
  • MusiScene: Leveraging MU-LLaMA for Scene Imagination and Enhanced Video Background Music Generation (2025-07-08)
  • any4: Learned 4-bit Numeric Representation for LLMs (2025-07-07)
  • Model Inversion Attacks on Llama 3: Extracting PII from Large Language Models (2025-07-06)
  • Large Language Models Acing Chartered Accountancy (2025-06-26)
  • CCRS: A Zero-Shot LLM-as-a-Judge Framework for Comprehensive RAG Evaluation (2025-06-25)
  • OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling (2025-06-25)
  • Can LLMs Replace Humans During Code Chunking? (2025-06-24)
  • Shrinking the Generation-Verification Gap with Weak Verifiers (2025-06-22)
  • Mental Health Equity in LLMs: Leveraging Multi-Hop Question Answering to Detect Amplified and Silenced Perspectives (2025-06-22)
  • Pre-Trained LLM is a Semantic-Aware and Generalizable Segmentation Booster (2025-06-22)
  • A Minimalist Optimizer Design for LLM Pretraining (2025-06-20)
  • I Know Which LLM Wrote Your Code Last Summer: LLM generated Code Stylometry for Authorship Attribution (2025-06-18)
  • PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning (2025-06-18)
  • All is Not Lost: LLM Recovery without Checkpoints (2025-06-18)
  • Evaluating Large Language Models for Phishing Detection, Self-Consistency, Faithfulness, and Explainability (2025-06-16)