
A new model for symbolic music generation using musical metadata

Artificial intelligence (AI) has opened interesting new opportunities for the music industry, for instance enabling the development of tools that can automatically generate musical compositions or specific instrument tracks. Yet most existing tools are designed to be used by musicians, composers and music producers, rather than by non-expert users.

Researchers at LG AI Research recently developed a new interactive system that allows any user to easily translate their ideas into music. The system, outlined in a paper published on the arXiv preprint server, combines a decoder-only autoregressive transformer trained on music datasets with an intuitive user interface.

"We introduce the demonstration of symbolic music generation, focusing on providing short musical motifs that serve as the central theme of the narrative," Sangjun Han, Jiwon Ham and their colleagues wrote in their paper. "For the generation, we adopt an autoregressive model which takes musical metadata as inputs and generates 4 bars of multitrack MIDI sequences."

The transformer-based model underpinning the team's symbolic music generation system was trained on two musical datasets, the Lakh MIDI dataset and the MetaMIDI dataset. Collectively, these datasets contain over 400,000 MIDI (musical instrument digital interface) files, data files that encode information about musical tracks (e.g., which notes are played, how long each note lasts and the tempo at which they are played).
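As a concrete illustration of what such a file exposes, the snippet below reads note events from a single MIDI file using the widely used pretty_midi library; the file path is a placeholder, and the library choice is ours rather than the paper's.

```python
import pretty_midi  # third-party library: pip install pretty_midi

# Load one MIDI file and inspect its note events per instrument track.
pm = pretty_midi.PrettyMIDI("example.mid")  # placeholder path
print("Estimated tempo (BPM):", pm.estimate_tempo())

for inst in pm.instruments:
    name = pretty_midi.program_to_instrument_name(inst.program)
    for note in inst.notes[:5]:  # first few notes of each track
        print(name, "pitch:", note.pitch, "velocity:", note.velocity,
              "start:", note.start, "end:", note.end)
```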

To train their model, the team converted each MIDI file into a REMI (revamped MIDI-derived events) representation. This format encodes MIDI data as a sequence of tokens representing musical features such as pitch, velocity, duration and position within a bar. REMI sequences capture the dynamics of music in ways that are particularly favorable for training AI models for music generation.
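To make the encoding concrete, here is a deliberately simplified, self-contained sketch of a REMI-style tokenization that emits bar, position, pitch, velocity and duration events; the grid resolution and token names are assumptions for illustration, not the team's exact tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Note:
    start: float      # onset time in beats
    pitch: int        # MIDI pitch, 0-127
    velocity: int     # MIDI velocity, 1-127
    duration: float   # note length in beats


def notes_to_remi(notes, beats_per_bar=4, positions_per_bar=16):
    """Convert a list of notes into a REMI-style token sequence."""
    tokens = []
    current_bar = -1
    for note in sorted(notes, key=lambda n: (n.start, n.pitch)):
        bar = int(note.start // beats_per_bar)
        # Emit a Bar token whenever a new bar begins.
        while current_bar < bar:
            current_bar += 1
            tokens.append("Bar")
        # Quantize the onset to a positional grid inside the bar.
        pos = int(round((note.start - bar * beats_per_bar)
                        / beats_per_bar * positions_per_bar))
        tokens.append(f"Position_{min(pos, positions_per_bar - 1)}")
        tokens.append(f"Pitch_{note.pitch}")
        tokens.append(f"Velocity_{note.velocity}")
        dur_steps = max(1, round(note.duration / beats_per_bar * positions_per_bar))
        tokens.append(f"Duration_{dur_steps}")
    return tokens


# Example: two notes at the start of the first bar.
print(notes_to_remi([Note(0.0, 60, 100, 1.0), Note(1.0, 64, 90, 0.5)]))
```

A token sequence like this can then be mapped to integer ids and fed to the autoregressive model sketched above.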
