GPT-Neo

Natural Language Processing · Introduced 2020 · 38 papers

Description

An implementation of model- and data-parallel GPT-3-like models using the mesh-tensorflow library.

Source: EleutherAI/GPT-Neo
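The released checkpoints are also available through the Hugging Face transformers library, which reimplements the architecture as `GPTNeoForCausalLM`. A distinctive feature of GPT-Neo relative to GPT-3 is its alternation of global and local (windowed) attention layers. The sketch below builds a deliberately tiny, randomly initialized model to show the configuration shape; the layer sizes here are illustrative only, not those of any released checkpoint.

```python
import torch
from transformers import GPTNeoConfig, GPTNeoForCausalLM

# Tiny illustrative configuration (hypothetical sizes, far smaller than
# the released 125M/1.3B/2.7B checkpoints).
config = GPTNeoConfig(
    vocab_size=1000,
    hidden_size=64,
    num_layers=2,
    num_heads=4,
    # GPT-Neo alternates global and local (sliding-window) attention;
    # this pattern repeated once yields the 2 layers configured above.
    attention_types=[[["global", "local"], 1]],
    max_position_embeddings=128,
)
model = GPTNeoForCausalLM(config)

# Forward pass on a random token batch; logits have shape
# (batch, sequence_length, vocab_size).
input_ids = torch.randint(0, 1000, (1, 16))
with torch.no_grad():
    out = model(input_ids)
print(tuple(out.logits.shape))  # (1, 16, 1000)
```

A pretrained checkpoint can be loaded the same way, e.g. `GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")` (downloads the weights from the Hugging Face Hub).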

Papers Using This Method

- IRepair: An Intent-Aware Approach to Repair Data-Driven Errors in Large Language Models (2025-02-10)
- Robust Hybrid Classical-Quantum Transfer Learning Model for Text Classification Using GPT-Neo 125M with LoRA & SMOTE Enhancement (2025-01-12)
- LLM Vocabulary Compression for Low-Compute Environments (2024-11-10)
- BERTtime Stories: Investigating the Role of Synthetic Story Data in Language Pre-training (2024-10-20)
- Reconstruction of Differentially Private Text Sanitization via Large Language Models (2024-10-16)
- The Unreasonable Ineffectiveness of Nucleus Sampling on Mitigating Text Memorization (2024-08-29)
- WPN: An Unlearning Method Based on N-pair Contrastive Learning in Language Models (2024-08-18)
- Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs (2024-08-13)
- Semantic Membership Inference Attack against Large Language Models (2024-06-14)
- Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit's Showerthoughts (2024-05-02)
- More than Correlation: Do Large Language Models Learn Causal Representations of Space? (2023-12-26)
- Fairness-Aware Structured Pruning in Transformers (2023-12-24)
- Scalable Extraction of Training Data from (Production) Language Models (2023-11-28)
- Heaps' Law in GPT-Neo Large Language Model Emulated Corpora (2023-11-10)
- Watermarking LLMs with Weight Quantization (2023-10-17)
- TART: A plug-and-play Transformer module for task-agnostic reasoning (2023-09-21)
- Fine-Tuning Large Language Models for Answering Programming Questions with Code Snippets (2023-06-26)
- Exposing Bias in Online Communities through Large-Scale Language Models (2023-06-04)
- Test-Time Training on Nearest Neighbors for Large Language Models (2023-05-29)
- Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning (2023-05-19)