Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Gated Linear Unit

General · Introduced 2000 · 798 papers

Description

A Gated Linear Unit (GLU) computes:

$$\mathrm{GLU}(a, b) = a \otimes \sigma(b)$$

Here $\otimes$ denotes element-wise multiplication and $\sigma$ is the logistic sigmoid. The GLU is used in natural language processing architectures such as the Gated CNN, where $\sigma(b)$ acts as a gate that controls what information from $a$ is passed up to the following layer. Intuitively, for a language modeling task, the gating mechanism allows selection of the words or features that are important for predicting the next word. The GLU is non-linear, but it keeps a linear path for the gradient through $a$, which reduces the vanishing gradient problem.
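As a rough illustration of the formula above, the following NumPy sketch computes the gate element-wise. The function names `sigmoid`, `glu`, and `glu_split` are hypothetical and chosen for this example. Splitting a single input in half along one axis to obtain $a$ and $b$ is an assumption, a convention found in several deep learning libraries, and is not something the description above specifies.

```python
import numpy as np


def sigmoid(x):
    # Logistic sigmoid written via tanh, which avoids overflow for large |x|.
    return 0.5 * (1.0 + np.tanh(0.5 * x))


def glu(a, b):
    # GLU(a, b) = a * sigmoid(b), applied element-wise.
    # sigmoid(b) lies in (0, 1) and scales how much of `a` passes through.
    return a * sigmoid(b)


def glu_split(x, axis=-1):
    # Assumed convention (not from the description above): split one tensor
    # into two equal halves along `axis` and gate the first with the second.
    a, b = np.split(x, 2, axis=axis)
    return glu(a, b)


if __name__ == "__main__":
    a = np.array([2.0, -1.0, 3.0])
    b = np.array([0.0, 100.0, -100.0])
    # Gate is 0.5 at b = 0, close to 1 for large positive b,
    # and close to 0 for large negative b.
    print(glu(a, b))
```

When the gate saturates near 1 the unit passes $a$ through almost unchanged, which is the linear path the description refers to. When it saturates near 0 the corresponding feature is suppressed.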

Papers Using This Method

LiLM-RDB-SFC: Lightweight Language Model with Relational Database-Guided DRL for Optimized SFC Provisioning (2025-07-15)
Chat-Ghosting: A Comparative Study of Methods for Auto-Completion in Dialog Systems (2025-07-08)
I Know Which LLM Wrote Your Code Last Summer: LLM generated Code Stylometry for Authorship Attribution (2025-06-18)
Fretting-Transformer: Encoder-Decoder Model for MIDI to Tablature Transcription (2025-06-17)
A Comprehensive Study of Decoder-Only LLMs for Text-to-Image Generation (2025-06-09)
The Impact of Feature Scaling In Machine Learning: Effects on Regression and Classification Tasks (2025-06-09)
A Multi-Dataset Evaluation of Models for Automated Vulnerability Repair (2025-06-05)
DuAL-Net: A Hybrid Framework for Alzheimer's Disease Prediction from Whole-Genome Sequencing via Local SNP Windows and Global Annotations (2025-05-31)
Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking (2025-05-29)
ShIOEnv: A CLI Behavior-Capturing Environment Enabling Grammar-Guided Command Synthesis for Dataset Curation (2025-05-23)
Fusion of Foundation and Vision Transformer Model Features for Dermatoscopic Image Classification (2025-05-22)
LogiCase: Effective Test Case Generation from Logical Description in Competitive Programming (2025-05-21)
EEG-to-Text Translation: A Model for Deciphering Human Brain Activity (2025-05-20)
Masking in Multi-hop QA: An Analysis of How Language Models Perform with Context Permutation (2025-05-16)
Multilingual Machine Translation with Quantum Encoder Decoder Attention-based Convolutional Variational Circuits (2025-05-14)
Performance Evaluation of Large Language Models in Bangla Consumer Health Query Summarization (2025-05-08)
GASCADE: Grouped Summarization of Adverse Drug Event for Enhanced Cancer Pharmacovigilance (2025-05-07)
Benchmarking Traditional Machine Learning and Deep Learning Models for Fault Detection in Power Transformers (2025-05-07)
A review of DNA restriction-free overlapping sequence cloning techniques for synthetic biology (2025-05-06)
JaccDiv: A Metric and Benchmark for Quantifying Diversity of Generated Marketing Text in the Music Industry (2025-04-29)