MusicGen by Meta

Meta's AudioCraft team introduces MusicGen, an open-source deep-learning language model for music generation, available at

What Does It Do?

MusicGen generates music from text prompts and can also take an existing melody as a guide, so it supports both composing from scratch and building new material around a tune you supply.
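The two modes described here can be sketched with the open-source audiocraft Python package. This is a minimal sketch, not an official recipe. It assumes audiocraft, torch and torchaudio are installed, and that the checkpoint names "facebook/musicgen-small" and "facebook/musicgen-melody" resolve in your installed version (older releases used short names such as "small"). The function names and the output_stem helper are hypothetical, introduced only for illustration.

```python
def output_stem(index, prefix="clip"):
    """Hypothetical helper: file name (without extension) for the index-th clip."""
    return f"{prefix}_{index}"


def generate_from_text(prompts, duration_s=8, checkpoint="facebook/musicgen-small"):
    """Text-to-music: write one audio file per prompt and return the file paths."""
    # Imported lazily so this module still loads where audiocraft is not installed.
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    model = MusicGen.get_pretrained(checkpoint)        # downloads weights on first use
    model.set_generation_params(duration=duration_s)   # clip length in seconds
    wavs = model.generate(prompts)                     # tensor: [batch, channels, samples]
    return [
        audio_write(output_stem(i), wav.cpu(), model.sample_rate, strategy="loudness")
        for i, wav in enumerate(wavs)
    ]


def generate_from_melody(prompts, melody_path, duration_s=8):
    """Melody-guided generation: each prompt is conditioned on the same input melody."""
    import torchaudio
    from audiocraft.models import MusicGen

    model = MusicGen.get_pretrained("facebook/musicgen-melody")
    model.set_generation_params(duration=duration_s)
    melody, sample_rate = torchaudio.load(melody_path)      # [channels, samples]
    batch = melody[None].expand(len(prompts), -1, -1)       # repeat melody per prompt
    return model.generate_with_chroma(prompts, batch, sample_rate)
```

Calling generate_from_text(["lo-fi beat with mellow piano"]) should produce a single audio file. The first call downloads the model weights, so expect it to take a while.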

How Does It Work?

Using Meta's 32 kHz EnCodec audio tokeniser, MusicGen represents audio as short discrete segments (tokens) and predicts several token streams in parallel, which lets it generate new music within minutes.
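To make the tokenisation concrete, here is a small back-of-the-envelope sketch. Only the 32 kHz sample rate comes from the text above. The frame rate of 50 token frames per second and the 4 parallel codebooks are assumptions taken from the published MusicGen description. The delay_pattern function illustrates one interleaving scheme in which codebook k is shifted by k steps, so a single model step can emit one token per codebook. It is an illustration of the idea, not the library's own implementation.

```python
SAMPLE_RATE_HZ = 32_000  # from the text: 32 kHz EnCodec tokeniser
FRAME_RATE_HZ = 50       # assumption: token frames per second of audio
NUM_CODEBOOKS = 4        # assumption: parallel token streams per frame


def samples_per_frame():
    """Number of raw audio samples summarised by one token frame."""
    return SAMPLE_RATE_HZ // FRAME_RATE_HZ


def token_budget(duration_s):
    """Return (frames, total tokens) needed to represent duration_s seconds."""
    frames = int(duration_s * FRAME_RATE_HZ)
    return frames, frames * NUM_CODEBOOKS


def delay_pattern(num_frames, num_codebooks=NUM_CODEBOOKS, pad=None):
    """Lay out tokens so that codebook k is delayed by k decoding steps.

    Each grid cell holds (frame index, codebook index), or pad where no
    token is scheduled. Column s lists what is predicted at step s.
    """
    steps = num_frames + num_codebooks - 1
    grid = [[pad] * steps for _ in range(num_codebooks)]
    for k in range(num_codebooks):
        for t in range(num_frames):
            grid[k][t + k] = (t, k)
    return grid


if __name__ == "__main__":
    print(samples_per_frame())   # samples covered by one frame
    print(token_budget(30))      # frames and tokens for a 30-second clip
    for row in delay_pattern(3, num_codebooks=2):
        print(row)
```

Under these assumptions a 30-second clip needs 1,500 decoding frames rather than 960,000 raw samples, which is why working on tokens is so much cheaper than working on the waveform directly.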

Behind the Language Model

MusicGen's language model was trained on 20,000 hours of licensed music: 10,000 high-quality tracks plus 390,000 instrumental tracks from Shutterstock and Pond5.

Like Google's Language Model

MusicGen, like Google's MusicLM, is a text-to-music model and offers similar functionality. Try MusicLM at ''.

Multiple Model Sizes

MusicGen was released in four variants: small (300M parameters), medium (1.5B), large (3.3B), and melody (1.5B, with melody conditioning). The largest is capable of producing the most intricate and complex compositions.

System Requirements

MusicGen can be tried online via Hugging Face, but for best performance it should be run locally on a GPU with at least 16 GB of memory.
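A rough way to see why GPU memory matters is to estimate how much space the model weights alone occupy. The sketch below is a lower bound only: it ignores activations, the EnCodec decoder, the text encoder and framework overhead. The parameter counts (300M, 1.5B, 3.3B) are assumptions taken from the published model descriptions, and the function name is hypothetical.

```python
# Assumed approximate parameter counts for the released checkpoints.
PARAMS = {
    "small": 300e6,
    "medium": 1.5e9,
    "melody": 1.5e9,
    "large": 3.3e9,
}


def weight_gib(num_params, bytes_per_param=4):
    """Memory for the weights alone, in GiB (4 bytes = fp32, 2 bytes = fp16)."""
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    for name, count in PARAMS.items():
        fp32 = weight_gib(count, 4)
        fp16 = weight_gib(count, 2)
        print(f"{name:>6}: {fp32:5.1f} GiB in fp32, {fp16:5.1f} GiB in fp16")
```

Actual usage during generation will be higher than these figures, so treat them as a sanity check when choosing a checkpoint for a given card rather than as a guarantee that it will fit.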