Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations

Aarne Talman, Antti Suni, Hande Celikkanat, Sofoklis Kakouros, Jörg Tiedemann, Martti Vainio

2019-08-06 | WS (NoDaLiDa) 2019 9 | Prosody Prediction

Abstract

In this paper we introduce a new natural language processing dataset and benchmark for predicting prosodic prominence from written text. To our knowledge this will be the largest publicly available dataset with prosodic labels. We describe the dataset construction and the resulting benchmark dataset in detail and train a number of different models ranging from feature-based classifiers to neural network systems for the prediction of discretized prosodic prominence. We show that pre-trained contextualized word representations from BERT outperform the other models even with less than 10% of the training data. Finally we discuss the dataset in light of the results and point to future research and plans for further improving both the dataset and methods of predicting prosodic prominence from text. The dataset and the code for the models are publicly available.

Results

Task                      Dataset                  Metric    Value  Model
Text-To-Speech Synthesis  Helsinki Prosody Corpus  Accuracy  83.2   BERT
Text-To-Speech Synthesis  Helsinki Prosody Corpus  Accuracy  82.1   BiLSTM
Text-To-Speech Synthesis  Helsinki Prosody Corpus  Accuracy  81.8   CRF (MarMoT)
Text-To-Speech Synthesis  Helsinki Prosody Corpus  Accuracy  80.8   SVM (Minitagger)
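
The page does not say how the Accuracy values above are computed. A plausible reading, given that the abstract describes predicting discretized prominence per word, is token-level accuracy: the percentage of words whose predicted class matches the gold class. The sketch below illustrates that reading only. The function name, the integer class labels, and the toy data are all hypothetical and are not taken from the paper or its code.

```python
from typing import Sequence


def token_accuracy(
    gold: Sequence[Sequence[int]], pred: Sequence[Sequence[int]]
) -> float:
    """Percentage of tokens whose predicted label equals the gold label.

    Assumption: each sentence is a sequence of discrete prominence classes,
    one integer per word. This is an illustrative metric, not the paper's
    evaluation script.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")
    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("each sentence needs exactly one prediction per token")
        correct += sum(g == p for g, p in zip(gold_sent, pred_sent))
        total += len(gold_sent)
    if total == 0:
        raise ValueError("no tokens to score")
    return 100.0 * correct / total


# Toy labels, made up for illustration (not drawn from the corpus).
gold = [[0, 2, 0, 1], [1, 0, 0, 2, 0, 0]]
pred = [[0, 2, 1, 1], [1, 0, 0, 0, 0, 0]]
print(f"{token_accuracy(gold, pred):.1f}")  # 8 of 10 tokens match -> 80.0
```

Counting over all tokens (a micro average) rather than averaging per-sentence scores keeps long sentences from being under-weighted; whether the reported figures use this convention is not stated on the page.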

Related Papers

VisualSpeech: Enhance Prosody with Visual Context in TTS (2025-01-31)
DiffStyleTTS: Diffusion-based Hierarchical Prosody Modeling for Text-to-Speech with Diverse and Controllable Styles (2024-12-04)
Word-wise intonation model for cross-language TTS systems (2024-09-30)
PRESENT: Zero-Shot Text-to-Prosody Control (2024-08-13)
Prosody Analysis of Audiobooks (2023-10-10)
A Comparative Analysis of Pretrained Language Models for Text-to-Speech (2023-09-04)
Learning Multilingual Expressive Speech Representation for Prosody Prediction without Parallel Data (2023-06-29)
What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model (2023-06-10)