


Convolutional Generative Adversarial Networks with Binary Neurons for Polyphonic Music Generation

Hao-Wen Dong, Yi-Hsuan Yang

2018-04-25, Music Generation

Abstract

It has been shown recently that deep convolutional generative adversarial networks (GANs) can learn to generate music in the form of piano-rolls, which represent music by binary-valued time-pitch matrices. However, existing models can only generate real-valued piano-rolls and require further post-processing, such as hard thresholding (HT) or Bernoulli sampling (BS), to obtain the final binary-valued results. In this paper, we study whether we can have a convolutional GAN model that directly creates binary-valued piano-rolls by using binary neurons. Specifically, we propose to append to the generator an additional refiner network, which uses binary neurons at the output layer. The whole network is trained in two stages. Firstly, the generator and the discriminator are pretrained. Then, the refiner network is trained along with the discriminator to learn to binarize the real-valued piano-rolls the pretrained generator creates. Experimental results show that using binary neurons instead of HT or BS indeed leads to better results in a number of objective measures. Moreover, deterministic binary neurons perform better than stochastic ones in both objective measures and a subjective test. The source code, training data and audio examples of the generated results can be found at https://salu133445.github.io/bmusegan/ .
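
The two post-processing steps named in the abstract, hard thresholding (HT) and Bernoulli sampling (BS), can be illustrated on a toy real-valued piano-roll. This is only a minimal sketch and not the authors' code: the function names, the 0.5 threshold, and the example matrix are assumptions made for illustration, and the refiner network with binary neurons is not modeled here.

```python
import random


def hard_threshold(roll, threshold=0.5):
    """Deterministic binarization: a note is on when its value exceeds the threshold.

    The 0.5 default is an assumption for this sketch.
    """
    return [[1 if p > threshold else 0 for p in row] for row in roll]


def bernoulli_sample(roll, rng):
    """Stochastic binarization: each value is used as the probability that the note is on."""
    return [[1 if rng.random() < p else 0 for p in row] for row in roll]


# Hypothetical real-valued piano-roll: 2 time steps x 4 pitches, values in [0, 1].
real_valued = [
    [0.9, 0.1, 0.6, 0.0],
    [0.2, 1.0, 0.4, 0.7],
]

ht = hard_threshold(real_valued)
bs = bernoulli_sample(real_valued, random.Random(0))  # seeded for repeatability

print("HT:", ht)
print("BS:", bs)
```

HT always maps the same input to the same binary matrix, while BS can differ between runs unless the random generator is seeded. Entries of exactly 0.0 or 1.0 come out the same under both.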
