Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Attention Dropout

General · Introduced 2018 · 10892 papers

Description

Attention Dropout is a type of dropout used in attention-based architectures, in which elements of the softmax output (the attention weights) are randomly dropped. For example, in scaled dot-product attention, elements are dropped from the first term, the softmax over the scaled query-key scores, before it is multiplied by V:

$$\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$
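As a rough illustration of the description above, the following NumPy sketch applies dropout to the softmax term of scaled dot-product attention. The function names (`softmax`, `attention_with_dropout`) and parameters (`drop_prob`, `training`, `rng`) are hypothetical and chosen for this example only. The sketch also assumes "inverted" dropout, where surviving weights are rescaled by 1 / (1 - drop_prob); this is a common convention in deep learning frameworks, but the description here does not specify it.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def attention_with_dropout(q, k, v, drop_prob=0.1, training=True, rng=None):
    """Scaled dot-product attention with dropout on the attention weights.

    q: (..., n_queries, d_k), k: (..., n_keys, d_k), v: (..., n_keys, d_v).
    Returns the attention output and the (possibly dropped) weights.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")

    d_k = q.shape[-1]
    # First term of the attention equation: softmax(Q K^T / sqrt(d_k)).
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)

    if training and drop_prob > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        # Keep each weight independently with probability 1 - drop_prob.
        keep = rng.random(weights.shape) >= drop_prob
        # Assumption: inverted dropout, so the expected weight is unchanged.
        weights = weights * keep / (1.0 - drop_prob)

    return weights @ v, weights
```

With `training=False` the function reduces to plain scaled dot-product attention. Note that after dropout the rows of the weight matrix generally no longer sum to one; in this sketch they are not renormalized.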

Papers Using This Method

Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
Generative Click-through Rate Prediction with Applications to Search Advertising (2025-07-15)
Chat-Ghosting: A Comparative Study of Methods for Auto-Completion in Dialog Systems (2025-07-08)
SARA: Selective and Adaptive Retrieval-augmented Generation with Context Compression (2025-07-08)
AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models (2025-07-07)
Behaviour Space Analysis of LLM-driven Meta-heuristic Discovery (2025-07-04)
Knowledge Protocol Engineering: A New Paradigm for AI in Domain-Specific Knowledge Work (2025-07-03)
CyberRAG: An agentic RAG cyber attack classification and reporting tool (2025-07-03)
Robustness of Misinformation Classification Systems to Adversarial Examples Through BeamAttack (2025-06-30)
Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models (2025-06-28)
Response Quality Assessment for Retrieval-Augmented Generation via Conditional Conformal Factuality (2025-06-26)
PsyLite Technical Report (2025-06-26)
EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora (2025-06-26)
Leveraging LLM-Assisted Query Understanding for Live Retrieval-Augmented Generation (2025-06-26)
Large Language Models Acing Chartered Accountancy (2025-06-26)
Cat and Mouse -- Can Fake Text Generation Outpace Detector Systems? (2025-06-26)
AI Assistants to Enhance and Exploit the PETSc Knowledge Base (2025-06-25)
CCRS: A Zero-Shot LLM-as-a-Judge Framework for Comprehensive RAG Evaluation (2025-06-25)
Large Language Model-Driven Code Compliance Checking in Building Information Modeling (2025-06-25)
Knowledge-Aware Diverse Reranking for Cross-Source Question Answering (2025-06-25)