Direct Output Connection for a High-Rank Language Model

Sho Takase, Jun Suzuki, Masaaki Nagata

2018-08-30 · EMNLP 2018

Tasks: Machine Translation, Headline Generation, Constituency Parsing, Translation, Language Modelling

Paper · PDF · Code (official)

Abstract

This paper proposes a state-of-the-art recurrent neural network (RNN) language model that combines probability distributions computed not only from the final RNN layer but also from intermediate layers. The proposed method increases the expressive power of the language model, building on the matrix factorization interpretation of language modeling introduced by Yang et al. (2018). It improves on the previous state-of-the-art language model and achieves the best scores on the standard Penn Treebank and WikiText-2 benchmark datasets. Moreover, we show that the proposed method also benefits two application tasks: machine translation and headline generation. Our code is publicly available at: https://github.com/nttcslab-nlp/doc_lm.
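The core idea is easy to sketch: every LSTM layer gets its own output projection and softmax, and the final next-word distribution is a learned weighted mixture of the per-layer distributions. Below is a minimal PyTorch illustration of that idea; the layer sizes, the single projection per layer, and the mixture-weight parameterization are simplifying assumptions for readability, not the paper's exact AWD-LSTM-DOC configuration (see the official repository for that).

```python
# Minimal sketch of the direct output connection (DOC) idea, assuming a
# plain stacked LSTM; hyperparameters are illustrative, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DOCLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=400, hidden_dim=960, num_layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.layers = nn.ModuleList(
            [nn.LSTM(embed_dim if i == 0 else hidden_dim, hidden_dim,
                     batch_first=True) for i in range(num_layers)]
        )
        # One vocabulary projection per layer: the "direct output connections".
        self.out_projs = nn.ModuleList(
            [nn.Linear(hidden_dim, vocab_size) for _ in range(num_layers)]
        )
        # Mixture weights over layers, predicted from the top layer's state.
        self.mix = nn.Linear(hidden_dim, num_layers)

    def forward(self, tokens):                 # tokens: (batch, seq) int64
        x = self.embed(tokens)
        hiddens = []
        for lstm in self.layers:
            x, _ = lstm(x)
            hiddens.append(x)                  # (batch, seq, hidden_dim) each
        # Per-position mixture weights over the layers; they sum to 1.
        pi = F.softmax(self.mix(hiddens[-1]), dim=-1)   # (batch, seq, layers)
        # Weighted sum of per-layer softmaxes over the vocabulary.
        probs = sum(pi[..., j:j + 1] * F.softmax(proj(h), dim=-1)
                    for j, (proj, h) in enumerate(zip(self.out_projs, hiddens)))
        return probs.clamp_min(1e-9).log()     # log-probs for use with nll_loss

# Usage: loss = F.nll_loss(model(tokens).transpose(1, 2), targets)
```

Because a mixture of softmaxes cannot in general be written as a single softmax of one low-rank logit matrix, the combined log-probability matrix can exceed the rank bound of a conventional single-softmax output layer; that is the high-rank argument of Yang et al. (2018) that the paper builds on.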

Results

Task                 | Dataset                    | Metric                | Value | Model
Language Modelling   | Penn Treebank (Word Level) | Test perplexity       | 47.17 | AWD-LSTM-DOC x5
Language Modelling   | Penn Treebank (Word Level) | Validation perplexity | 48.63 | AWD-LSTM-DOC x5
Language Modelling   | Penn Treebank (Word Level) | Test perplexity       | 52.38 | AWD-LSTM-DOC
Language Modelling   | Penn Treebank (Word Level) | Validation perplexity | 54.12 | AWD-LSTM-DOC
Language Modelling   | WikiText-2                 | Test perplexity       | 53.09 | AWD-LSTM-DOC x5
Language Modelling   | WikiText-2                 | Validation perplexity | 54.19 | AWD-LSTM-DOC x5
Language Modelling   | WikiText-2                 | Test perplexity       | 58.03 | AWD-LSTM-DOC
Language Modelling   | WikiText-2                 | Validation perplexity | 60.29 | AWD-LSTM-DOC
Constituency Parsing | Penn Treebank              | F1 score              | 94.47 | LSTM Encoder-Decoder + LSTM-LM
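For reading the table: perplexity is the exponential of the average per-token negative log-likelihood on the held-out text, so lower is better. A toy illustration with made-up numbers (not outputs of the paper's models):

```python
# Perplexity = exp(mean negative log-likelihood per token).
# Toy numbers for illustration only; not outputs of the paper's models.
import math

def perplexity(token_log_probs):
    """token_log_probs: natural-log probabilities the model assigned to
    each ground-truth token of the evaluation set."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model assigning probability 1/47 to every token scores perplexity 47,
# on the order of the 47.17 test perplexity of AWD-LSTM-DOC x5 above.
print(perplexity([math.log(1 / 47.0)] * 1000))  # -> ~47.0
```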

Related Papers

- Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
- A Translation of Probabilistic Event Calculus into Markov Decision Processes (2025-07-17)
- Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
- VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
- The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)
- Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
- Assay2Mol: large language model-based drug design using BioAssay context (2025-07-16)
- Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)