Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Attention Dropout

General · Introduced 2018 · 10892 papers

Description

Attention Dropout is a type of dropout used in attention-based architectures, in which elements of the softmax output (the attention weights) are randomly dropped. For example, in scaled dot-product attention, elements are dropped from the first factor of the product below, the softmax term, before it is multiplied by $V$:

${\text{Attention}}(Q, K, V) = \text{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$
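As a rough illustration of the equation above, here is a minimal NumPy sketch of scaled dot-product attention with dropout applied to the softmax output. The function names, the `drop_prob` and `training` parameters, and the inverted-dropout rescaling by `1 / (1 - drop_prob)` are assumptions made for this example and do not come from the description; real libraries may differ in detail.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def attention_with_dropout(Q, K, V, drop_prob=0.1, training=True, rng=None):
    """Scaled dot-product attention with dropout on the attention weights.

    Hypothetical helper for illustration; assumes 0 <= drop_prob < 1.
    """
    d_k = Q.shape[-1]
    # softmax(Q K^T / sqrt(d_k)): the first factor in the attention equation.
    scores = Q @ np.swapaxes(K, -1, -2) / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)

    if training and drop_prob > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        # Keep each weight with probability 1 - drop_prob, zero the rest.
        keep_mask = rng.random(weights.shape) >= drop_prob
        # Assumed inverted-dropout scaling so the expected value is unchanged.
        weights = weights * keep_mask / (1.0 - drop_prob)

    return weights @ V
```

With `training=False` the function reduces to plain scaled dot-product attention. Note that after dropout the rows of the weight matrix generally no longer sum to one, since the surviving entries are rescaled rather than renormalized.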

Papers Using This Method

Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
Generative Click-through Rate Prediction with Applications to Search Advertising (2025-07-15)
Chat-Ghosting: A Comparative Study of Methods for Auto-Completion in Dialog Systems (2025-07-08)
SARA: Selective and Adaptive Retrieval-augmented Generation with Context Compression (2025-07-08)
AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models (2025-07-07)
Behaviour Space Analysis of LLM-driven Meta-heuristic Discovery (2025-07-04)
Knowledge Protocol Engineering: A New Paradigm for AI in Domain-Specific Knowledge Work (2025-07-03)
CyberRAG: An agentic RAG cyber attack classification and reporting tool (2025-07-03)
Robustness of Misinformation Classification Systems to Adversarial Examples Through BeamAttack (2025-06-30)
Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models (2025-06-28)
Response Quality Assessment for Retrieval-Augmented Generation via Conditional Conformal Factuality (2025-06-26)
PsyLite Technical Report (2025-06-26)
EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora (2025-06-26)
Leveraging LLM-Assisted Query Understanding for Live Retrieval-Augmented Generation (2025-06-26)
Large Language Models Acing Chartered Accountancy (2025-06-26)
Cat and Mouse -- Can Fake Text Generation Outpace Detector Systems? (2025-06-26)
AI Assistants to Enhance and Exploit the PETSc Knowledge Base (2025-06-25)
CCRS: A Zero-Shot LLM-as-a-Judge Framework for Comprehensive RAG Evaluation (2025-06-25)
Large Language Model-Driven Code Compliance Checking in Building Information Modeling (2025-06-25)
Knowledge-Aware Diverse Reranking for Cross-Source Question Answering (2025-06-25)