Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks

Yu Zhang, Changhao Pan, Wenxiang Guo, RuiQi Li, Zhiyuan Zhu, Jialei Wang, Wenhao Xu, Jingyu Lu, Zhiqing Hong, Chuxin Wang, Lichao Zhang, Jinzheng He, Ziyue Jiang, Yuxin Chen, Chen Yang, Jiecheng Zhou, Xinyu Cheng, Zhou Zhao

2024-09-20
Tasks: Vocal technique classification, Style Transfer, Singing Voice Synthesis

Abstract

The scarcity of high-quality and multi-task singing datasets significantly hinders the development of diverse controllable and personalized singing tasks, as existing singing datasets suffer from low quality, limited diversity of languages and singers, absence of multi-technique information and realistic music scores, and poor task suitability. To tackle these problems, we present GTSinger, a large global, multi-technique, free-to-use, high-quality singing corpus with realistic music scores, designed for all singing tasks, along with its benchmarks. Particularly, (1) we collect 80.59 hours of high-quality singing voices, forming the largest recorded singing dataset; (2) 20 professional singers across nine widely spoken languages offer diverse timbres and styles; (3) we provide controlled comparison and phoneme-level annotations of six commonly used singing techniques, helping technique modeling and control; (4) GTSinger offers realistic music scores, assisting real-world musical composition; (5) singing voices are accompanied by manual phoneme-to-audio alignments, global style labels, and 16.16 hours of paired speech for various singing tasks. Moreover, to facilitate the use of GTSinger, we conduct four benchmark experiments: technique-controllable singing voice synthesis, technique recognition, style transfer, and speech-to-singing conversion. The corpus and demos can be found at http://aaronz345.github.io/GTSingerDemo/. We provide the dataset and the code for processing data and conducting benchmarks at https://huggingface.co/datasets/AaronZ345/GTSinger and https://github.com/AaronZ345/GTSinger.
