Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Scoring Time Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription

Yujia Yan, Zhiyao Duan

Date: 2024-04-15
Task: Music Transcription

Abstract

The neural semi-Markov Conditional Random Field (semi-CRF) framework has demonstrated promise for event-based piano transcription. In this framework, all events (notes or pedals) are represented as closed time intervals tied to specific event types. The neural semi-CRF approach requires an interval scoring matrix that assigns a score for every candidate interval. However, designing an efficient and expressive architecture for scoring intervals is not trivial. This paper introduces a simple method for scoring intervals using scaled inner product operations that resemble how attention scoring is done in transformers. We show theoretically that, due to the special structure from encoding the non-overlapping intervals, under a mild condition, the inner product operations are expressive enough to represent an ideal scoring matrix that can yield the correct transcription result. We then demonstrate that an encoder-only structured non-hierarchical transformer backbone, operating only on a low-time-resolution feature map, is capable of transcribing piano notes and pedals with high accuracy and time precision. The experiment shows that our approach achieves the new state-of-the-art performance across all subtasks in terms of the F1 measure on the Maestro dataset.
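
The two ideas in the abstract, scoring every candidate interval with a scaled inner product and then selecting non-overlapping intervals, can be illustrated with a small sketch. This is not the authors' implementation. The function names `score_intervals` and `decode`, the per-frame query/key arrays, the zero score for uncovered frames, and the rule that selected intervals may not share a frame are all assumptions made for illustration. The decoder is a plain dynamic program over interval end points, a simplified stand-in for semi-CRF inference.

```python
import numpy as np

def score_intervals(q, k):
    """Score every candidate closed interval [i, j] for one event type.

    q, k: (T, d) arrays of per-frame "start" and "end" embeddings
    (hypothetical names). Returns a (T, T) matrix S where
    S[i, j] = q[i] . k[j] / sqrt(d) for i <= j and -inf otherwise.
    """
    T, d = q.shape
    s = q @ k.T / np.sqrt(d)              # attention-style scaled inner product
    valid = np.triu(np.ones((T, T), dtype=bool))  # an interval needs start <= end
    return np.where(valid, s, -np.inf)

def decode(score):
    """Pick frame-disjoint intervals that maximize the total score.

    Frames not covered by any interval contribute 0 (an assumption).
    Returns a list of (start, end) frame indices in time order.
    """
    T = score.shape[0]
    best = np.zeros(T + 1)    # best[t]: best total over frames 0..t-1
    back = [None] * (T + 1)   # start of the interval ending at frame t-1, if any
    for j in range(T):
        best[j + 1] = best[j]             # option: no interval ends at frame j
        for i in range(j + 1):
            cand = best[i] + score[i, j]  # option: interval [i, j] ends here
            if cand > best[j + 1]:
                best[j + 1] = cand
                back[j + 1] = i
    intervals, t = [], T
    while t > 0:                          # walk the back-pointers from the end
        if back[t] is None:
            t -= 1
        else:
            intervals.append((back[t], t - 1))
            t = back[t]
    return intervals[::-1]
```

In a real system `q` and `k` would come from the transformer backbone, with one score matrix per event type (each pitch and each pedal). Here they can be any arrays of matching shape, for example `decode(score_intervals(q, k))` with random inputs.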

Results

Task                 Dataset    Metric    Value  Model
Music Transcription  MAESTRO    Onset F1  98.32  Transkun V2 (SemiCRF)
Music Transcription  MAPS       Onset F1  90.38  Transkun V2 (SemiCRF) with Data Augmentation
Music Transcription  MAPS       Onset F1  86.1   Transkun V2 (SemiCRF)
Music Transcription  SMD Piano  Onset F1  98.71  Transkun V2 (SemiCRF) with Data Augmentation
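
For readers unfamiliar with the metric, the sketch below shows one simple way an onset F1 score can be computed. The page does not state the exact evaluation protocol, so the 50 ms tolerance, the requirement that pitches match exactly, and the greedy one-to-one matching are assumptions. Standard evaluation toolkits use an optimal matching instead, which can give slightly different counts. The function name `onset_f1` is hypothetical.

```python
def onset_f1(ref, est, tol=0.05):
    """Onset-only note F1 in the range 0..1.

    ref, est: lists of (onset_seconds, midi_pitch) tuples.
    An estimated note matches a reference note when the pitches are equal
    and the onsets differ by at most `tol` seconds. Each estimate is
    matched at most once (greedy, in reference onset order).
    """
    used = [False] * len(est)
    true_positives = 0
    for r_onset, r_pitch in sorted(ref):
        for idx, (e_onset, e_pitch) in enumerate(est):
            if not used[idx] and e_pitch == r_pitch and abs(e_onset - r_onset) <= tol:
                used[idx] = True
                true_positives += 1
                break
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(est)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)
```

The table reports values as percentages, so a result from this function would be multiplied by 100 to compare on the same scale.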

Related Papers

Fretting-Transformer: Encoder-Decoder Model for MIDI to Tablature Transcription (2025-06-17)
Dialogue in Resonance: An Interactive Music Piece for Piano and Real-Time Automatic Transcription System (2025-05-22)
Unified Cross-modal Translation of Score Images, Symbolic Music, and Performance Audio (2025-05-19)
Automatic Music Transcription using Convolutional Neural Networks and Constant-Q transform (2025-05-07)
Music Tempo Estimation on Solo Instrumental Performance (2025-04-25)
Scalable Approximate Algorithms for Optimal Transport Linear Models (2025-04-06)
Multi-task learning-based temporal pattern matching network for guitar tablature transcription (2025-04-03)
D3RM: A Discrete Denoising Diffusion Refinement Model for Piano Transcription (2025-01-09)