Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected?

Qihui Zhang, Chujie Gao, Dongping Chen, Yue Huang, Yixin Huang, Zhenyang Sun, Shilin Zhang, Weiye Li, Zhengyan Fu, Yao Wan, Lichao Sun

Date: 2024-01-11
Task: Binary text classification

Abstract

With the rapid development and widespread application of Large Language Models (LLMs), the use of Machine-Generated Text (MGT) has become increasingly common, bringing with it potential risks, especially to quality and integrity in fields like news, education, and science. Current research mainly focuses on detecting pure MGT without adequately addressing mixed scenarios, including AI-revised Human-Written Text (HWT) and human-revised MGT. To tackle this challenge, we define mixtext, a form of mixed text involving both AI-generated and human-written content. We then introduce MixSet, the first dataset dedicated to studying these mixtext scenarios. Using MixSet, we ran comprehensive experiments to assess how well prevalent MGT detectors handle mixtext, evaluating their effectiveness, robustness, and generalization. Our findings reveal that existing detectors struggle to identify mixtext, particularly when dealing with subtle modifications and style adaptability. This research underscores the urgent need for more fine-grained detectors tailored to mixtext, offering valuable insights for future research. Code and models are available at https://github.com/Dongping-Chen/MixSet.

Results

Task | Dataset | Metric | Value | Model
Binary text classification | MixSet (Binary) | F1 score | 0.876 | Radar
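
For readers unfamiliar with the metric, the following is a minimal sketch of how a binary F1 score is computed from predictions. It assumes F1 is taken over a single positive class; this page does not state which class is treated as positive or whether the reported 0.876 uses a different averaging scheme. The function name `f1_score` and the toy labels are hypothetical and are not taken from the MixSet code.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (assumption for illustration): 1 = flagged class, 0 = other class.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

score = f1_score(y_true, y_pred)
print(round(score, 3))
```

In this toy case there are 4 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.8 and the F1 score is 0.8.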

Related Papers

Reliable Decision Support with LLMs: A Framework for Evaluating Consistency in Binary Text Classification Applications (2025-05-20)
GigaCheck: Detecting LLM-generated Content (2024-10-31)
Calibrated Large Language Models for Binary Question Answering (2024-07-01)
Hoaxpedia: A Unified Wikipedia Hoax Articles Dataset (2024-05-03)
Identification of the Relevance of Comments in Codes Using Bag of Words and Transformer Based Models (2023-08-11)
DACCORD : un jeu de données pour la Détection Automatique d'énonCés COntRaDictoires en français (2023-06-08)
Ghostbuster: Detecting Text Ghostwritten by Large Language Models (2023-05-24)
MAGE: Machine-generated Text Detection in the Wild (2023-05-22)