m-a-p

197 models

MERT-v1-95M

The development log of our Music Audio Pre-training (m-a-p) model family: - 02/06/2023: arXiv pre-print and training code released. - 17/03/2023: we released two advanced music understanding models...

license:cc-by-nc-4.0
91,332
38

MERT-v1-330M

The development log of our Music Audio Pre-training (m-a-p) model family:

- 02/06/2023: arXiv pre-print and training code released.
- 17/03/2023: we released two advanced music understanding models, MERT-v1-95M and MERT-v1-330M, trained with a new paradigm and dataset. They outperform the previous models and generalize better to more tasks.
- 14/03/2023: we retrained the MERT-v0 model with an open-source-only music dataset: MERT-v0-public.
- 29/12/2022: a music understanding model, MERT-v0, trained with the MLM paradigm, which performs better on downstream tasks.
- 29/10/2022: a pre-trained MIR model, music2vec, trained with the BYOL paradigm.

| Name | Pre-train Paradigm | Training Data (hour) | Pre-train Context (second) | Model Size | Transformer Layer-Dimension | Feature Rate | Sample Rate | Release Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MERT-v1-330M | MLM | 160K | 5 | 330M | 24-1024 | 75 Hz | 24K Hz | 17/03/2023 |
| MERT-v1-95M | MLM | 20K | 5 | 95M | 12-768 | 75 Hz | 24K Hz | 17/03/2023 |
| MERT-v0-public | MLM | 900 | 5 | 95M | 12-768 | 50 Hz | 16K Hz | 14/03/2023 |
| MERT-v0 | MLM | 1000 | 5 | 95M | 12-768 | 50 Hz | 16K Hz | 29/12/2022 |
| music2vec-v1 | BYOL | 1000 | 30 | 95M | 12-768 | 50 Hz | 16K Hz | 30/10/2022 |

The m-a-p models share a similar architecture; the most distinguishing difference is the pre-training paradigm. Beyond that, there are a few technical configurations to know before use:

- Model Size: the number of parameters loaded into memory. Please select a size appropriate for your hardware.
- Transformer Layer-Dimension: the number of transformer layers and the corresponding feature dimensions the model can output. This is called out because features extracted from different layers can perform differently depending on the task.
- Feature Rate: the number of features the model outputs for 1 second of audio input.
- Sample Rate: the audio sampling frequency the model was trained with.

Compared to MERT-v0, MERT-v1 pre-training introduces several changes:

- Pseudo labels changed to 8 codebooks from EnCodec, which potentially have higher quality and empower the model to support music generation.
- MLM prediction with in-batch noise mixture.
- Training at a higher audio frequency (24K Hz).
- Training with more audio data (up to 160 thousand hours).
- More available model sizes: 95M and 330M.

More details will be given in our coming-soon paper.
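The table above implies a simple relationship between clip length and model output: at MERT-v1's 75 Hz feature rate, each second of audio yields 75 feature vectors per transformer layer. A minimal sketch of that arithmetic (the helper name is ours, not part of the model API):

```python
def expected_feature_frames(duration_s: float, feature_rate_hz: int = 75) -> int:
    """Approximate number of feature vectors per layer for an audio clip.

    MERT-v1 models emit features at 75 Hz; MERT-v0, MERT-v0-public, and
    music2vec-v1 emit features at 50 Hz (see the table above).
    """
    return int(duration_s * feature_rate_hz)

# A 5-second pre-training context at 75 Hz gives 375 frames per layer;
# the same clip through a 50 Hz model gives 250 frames.
print(expected_feature_frames(5))      # 375
print(expected_feature_frames(5, 50))  # 250
```

This is only the frame count; the per-frame dimension is the second number in the Layer-Dimension column (1024 for MERT-v1-330M, 768 for the 95M models).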

license:cc-by-nc-4.0
51,415
76

YuE-s1-7B-anneal-en-cot

YuE-s1-7B-anneal-en-cot 🤗  |  YuE-s1-7B-anneal-en-icl 🤗  |  YuE-s1-7B-anneal-jp-kr-cot 🤗 YuE-s1-7B-anneal-jp-kr-icl 🤗  |  YuE-s1-7B-anneal-zh-cot 🤗  |  YuE-s1-7B-anneal-zh-icl 🤗 YuE-s2-1B-general 🤗  |  YuE-upsampler 🤗 --- Our model's name is YuE (乐). In Chinese, the word means "music" and "happiness." Some of you may find words that start with Yu hard to pronounce. If so, you can just call it "yeah." We wrote a song with our model's name. YuE is a groundbreaking series of open-source foundation models designed for music generation, specifically for transforming lyrics into full songs (lyrics2song). It can generate a complete song, lasting several minutes, that includes both a catchy vocal track and accompaniment track. YuE is capable of modeling diverse genres/languages/vocal techniques. Please visit the Demo Page for amazing vocal performance. 2025.03.12 🔥 Paper Released🎉: We now release YuE technical report!!! We discuss all the technical details, findings, and lessons learned. Enjoy, and feel free to cite us~ 2025.03.11 🫶 Now YuE supports incremental song generation!!! See YuE-UI by joeljuvel. YuE-UI is a Gradio-based interface supporting batch generation, output selection, and continuation. You can flexibly experiment with audio prompts and different model settings, visualize your progress on an interactive timeline, rewind actions, quickly preview audio outputs at stage 1 before committing to refinement, and fully save/load your sessions (JSON format). Optimized to run smoothly even on GPUs with just 8GB VRAM using quantized models. 2025.02.17 🫶 Now YuE supports music continuation and Google Colab! See YuE-extend by Mozer. 2025.02.07 🎉 Get YuE for Windows on pinokio. 2025.01.30 🔥 Inference Update: We now support dual-track ICL mode! You can prompt the model with a reference song, and it will generate a new song in a similar style (voice cloning demo by @abrakjamson, music style transfer demo by @cocktailpeanut, etc.). Try it out! 🔥🔥🔥 P.S. 
Be sure to check out the demos first—they're truly impressive. 2025.01.30 🔥 Announcement: A New Era Under Apache 2.0 🔥: We are thrilled to announce that, in response to overwhelming requests from our community, YuE is now officially licensed under the Apache 2.0 license. We sincerely hope this marks a watershed moment—akin to what Stable Diffusion and LLaMA have achieved in their respective fields—for music generation and creative AI. 🎉🎉🎉 2025.01.29 🎉: We have updated the license description. We ENCOURAGE artists and content creators to sample and incorporate outputs generated by our model into their own works, and even monetize them. The only requirement is to credit our name: YuE by HKUST/M-A-P (alphabetic order). 2025.01.28 🫶: Thanks to Fahd for creating a tutorial on how to quickly get started with YuE. Here is his demonstration. 2025.01.26 🔥: We have released the YuE series. License Agreement & Disclaimer - The YuE model (including its weights) is now released under the Apache License, Version 2.0. We do not make any profit from this model, and we hope it can be used for the betterment of human creativity. - Use & Attribution: - We encourage artists and content creators to freely incorporate outputs generated by YuE into their own works, including commercial projects. - We encourage attribution to the model's name ("YuE by HKUST/M-A-P"), especially for public and commercial use. - Originality & Plagiarism: It is the sole responsibility of creators to ensure that their works, derived from or inspired by YuE outputs, do not plagiarize or unlawfully reproduce existing material. We strongly urge users to perform their own due diligence to avoid copyright infringement or other legal violations. - Recommended Labeling: When uploading works to streaming platforms or sharing them publicly, we recommend labeling them with terms such as "AI-generated", "YuE-generated", "AI-assisted", or "AI-auxiliated". This helps maintain transparency about the creative process.
- Disclaimer of Liability: - We do not assume any responsibility for the misuse of this model, including (but not limited to) illegal, malicious, or unethical activities. - Users are solely responsible for any content generated using the YuE model and for any consequences arising from its use. - By using this model, you agree that you understand and comply with all applicable laws and regulations regarding your generated content. Acknowledgements The project is co-led by HKUST and M-A-P (alphabetic order). Thanks also to moonshot.ai, ByteDance, 01.ai, and Geely for supporting the project. A friendly link to the HKUST Audio group's Hugging Face space. We deeply appreciate all the support we received along the way. Long live open-source AI! If you find our paper and code useful in your research, please consider giving a star :star: and citation :pencil: :)

NaNK
llama
31,966
434

YuE-s2-1B-general

NaNK
llama
6,971
56

MERT-v0-public

license:cc-by-nc-4.0
3,757
5

ChatMusician

llama
1,086
149

OpenCodeInterpreter-DS-6.7B

NaNK
llama
907
135

OpenCodeInterpreter-CL-13B

NaNK
llama
846
9

OpenCodeInterpreter-CL-7B

NaNK
llama
810
11

YuE-s1-7B-anneal-en-icl

(Model card text identical to YuE-s1-7B-anneal-en-cot above.)

NaNK
llama
734
51

ChatMusician-Base

llama
486
14

music2vec-v1

license:cc-by-nc-4.0
484
42

MERT-v0

license:cc-by-nc-4.0
390
19

YuE-s1-7B-anneal-zh-cot

(Model card text identical to YuE-s1-7B-anneal-en-cot above.)

NaNK
llama
287
40

YuE-s1-7B-anneal-jp-kr-cot

NaNK
llama
279
21

MuPT-v0-4096-190M

llama
196
1

OpenCodeInterpreter-DS-33B

NaNK
llama
188
148

YuE-s1-7B-anneal-jp-kr-icl

NaNK
llama
152
11

YuE-s1-7B-anneal-zh-icl

NaNK
llama
143
16

MIO-7B-Base

NaNK
llama
136
0

Kun-PrimaryChatModel

133
0

MuPT-v1-8192-1.97B

NaNK
llama
119
12

Amber-Reproduce-599.79B

NaNK
94
0

Amber-Reproduce-301.99B

NaNK
93
0

Amber-Reproduce-20.97B

NaNK
91
0

Amber-Reproduce-71.30B

NaNK
90
0

CT-LLM-Base

llama
87
11

MuPT-v1-8192-190M

llama
87
4

TreePO-Qwen2.5-7B

We release the resources for the paper TreePO: - Checkpoint with average weighted subgroup advantages + more diverse initial divergence (the final one). ← You are here. - Checkpoint with average weighted subgroup advantages + fixed divergence. - The training dataset consists of DeepScaleR and SimpleRL math reasoning data. More links: - Hugging Face Paper - Project Page - X/Twitter Thread - GitHub Repo If you find this work useful, please consider citing the paper:

NaNK
86
2

OpenLLaMA-Reproduce-218.1B

NaNK
llama
85
0

CriticLeanGPT-Qwen3-8B-RL

NaNK
84
3

CriticLeanGPT-Qwen3-32B-RL

NaNK
84
0

YuE-upsampler

license:apache-2.0
83
24

MuPT-v0-8192-1.97B

NaNK
llama
82
19

MuPT-v0-8192-550M

llama
82
1

MuPT-v0-4096-550M

llama
82
0

MuPT-v0-4096-1.07B

NaNK
llama
81
0

neo_7b

NaNK
llama
80
56

OpenCodeInterpreter-CL-70B

NaNK
llama
80
24

OpenCodeInterpreter-CL-34B

NaNK
llama
80
14

MuPT-v0-8192-190M

llama
80
0

OpenLLaMA-Reproduce-1409.29B

NaNK
llama
80
0

OpenCodeInterpreter-SC2-7B

NaNK
license:apache-2.0
79
14

YuE-s1-0.5B

NaNK
llama
79
3

CriticLeanGPT-Qwen2.5-7B-Instruct-SFT-RL

NaNK
79
1

MuPT-v0-8192-1.07B

NaNK
llama
79
0

Amber-Reproduce-100.66B

NaNK
79
0

OpenLLaMA-Reproduce-2030.04B

NaNK
llama
79
0

Qwen2.5-Instruct-7B-COIG-P

This repository contains the Qwen2.5-Instruct-7B-COIG-P model, a 7B parameter Large Language Model fine-tuned for instruction following using the COIG-P dataset, as described in the paper COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values. - Developed by: [More Information Needed] - Funded by: [More Information Needed] - Shared by: [More Information Needed] - Model type: Large Language Model (LLM) - Language(s) (NLP): Chinese (zh) - License: cc-by-nc-4.0 - Finetuned from model: Qwen2 - Repository: [More Information Needed] - Paper: COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values This model is designed for text generation tasks and is particularly well-suited for Chinese language processing. It can be used for generating creative text formats, translating languages, and answering questions. The model can be fine-tuned for various downstream tasks, including chatbots, code generation, summarization, question answering, and other NLP tasks. The Llama-Factory can be used for fine-tuning the model. The model's performance may be limited when applied to tasks significantly different from those it was trained on or tasks requiring understanding of languages other than Chinese. The model may exhibit biases present in its training data, particularly reflecting biases inherent in the Chinese language and culture. Users should be aware of potential biases and limitations and use the model responsibly and ethically, avoiding applications that could perpetuate or amplify harmful biases. Use the following code to get started with the Qwen2.5-Instruct-7B-COIG-P model: The model was trained on the COIG-P dataset (https://huggingface.co/datasets/m-a-p/COIG-P). This dataset consists of 101k Chinese preference pairs across six domains: Chat, Code, Math, Logic, Novel, and Role. 
- Checkpoint size: [More Information Needed] - Training time: [More Information Needed] The model's performance is evaluated using the Chinese Reward Benchmark (CRBench) and AlignBench. - Chinese Reward Benchmark (CRBench): https://huggingface.co/datasets/m-a-p/COIG-P-CRM - AlignBench: https://github.com/THUDM/AlignBench [Add metrics from paper, e.g., accuracy, precision, recall] [Add results from paper, including tables and figures if appropriate]
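The quick-start code referenced in the card above did not survive extraction. As a stand-in, here is a small sketch of organizing COIG-P-style preference pairs across the six domains the card lists (Chat, Code, Math, Logic, Novel, Role); the field names are illustrative assumptions, not the dataset's confirmed schema:

```python
# Hypothetical record layout for COIG-P-style preference pairs.
# Field names ("domain", "prompt", "chosen", "rejected") are illustrative
# assumptions; consult the COIG-P dataset card for the real schema.
DOMAINS = {"Chat", "Code", "Math", "Logic", "Novel", "Role"}

def split_by_domain(pairs):
    """Group preference pairs by domain, rejecting unknown domain tags."""
    grouped = {d: [] for d in DOMAINS}
    for pair in pairs:
        domain = pair["domain"]
        if domain not in DOMAINS:
            raise ValueError(f"unknown domain: {domain}")
        grouped[domain].append(pair)
    return grouped

sample = [
    {"domain": "Math", "prompt": "1+1=?", "chosen": "2", "rejected": "3"},
    {"domain": "Chat", "prompt": "你好", "chosen": "你好！", "rejected": "……"},
    {"domain": "Math", "prompt": "2*3=?", "chosen": "6", "rejected": "5"},
]
groups = split_by_domain(sample)
print(len(groups["Math"]), len(groups["Chat"]), len(groups["Code"]))  # 2 1 0
```

Grouping by domain like this is useful when evaluating an aligned model per domain, as the card's CRBench evaluation does.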

NaNK
license:cc-by-nc-4.0
79
0

Infinity-Instruct-3M-0625-Llama3-8B-COIG-P

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    LogitsProcessorList,
    MinLengthLogitsProcessor,
    TemperatureLogitsWarper,
)
import torch

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "m-a-p/Infinity-Instruct-3M-0625-Llama3-8B-COIG-P",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "m-a-p/Infinity-Instruct-3M-0625-Llama3-8B-COIG-P"
)

prompt = "Give me a short introduction to large language model."
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

logits_processor = LogitsProcessorList(
    [
        MinLengthLogitsProcessor(1, eos_token_id=tokenizer.eos_token_id),
        TemperatureLogitsWarper(0.7),
    ]
)

generated_ids = model.generate(
    model_inputs.input_ids,
    logits_processor=logits_processor,
    max_new_tokens=512,
)
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

```bibtex
@misc{pteam2025coigphighqualitylargescalechinese,
  title={COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values},
  author={P Team and Siwei Wu and Jincheng Ren and Xinrun Du and Shuyue Guo and Xingwei Qu and Yiming Liang and Jie Liu and Yunwen Li and Tianyu Zheng and Boyu Feng and Huaqing Yuan and Zenith Wang and Jiaheng Liu and Wenhao Huang and Chenglin Cai and Haoran Que and Jian Yang and Yuelin Bai and Zekun Moore Wang and Zhouliang Yu and Qunshu Lin and Ding Pan and Yuchen Jiang and Tiannan Wang and Wangchunshu Zhou and Shenzhi Wang and Xingyuan Bu and Minghao Liu and Guoyin Wang and Ge Zhang and Chenghua Lin},
  year={2025},
  eprint={2504.05535},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2504.05535},
}
```

NaNK
llama
79
0

CriticLeanGPT-Qwen3-14B-RL

NaNK
79
0

MuPT-v1-8192-550M

llama
78
2

Amber-Reproduce-901.78B

NaNK
78
0

Amber-Reproduce-1199.57B

NaNK
78
0

Amber-Reproduce-41.94B

NaNK
78
0

Amber-Reproduce-398.46B

NaNK
78
0

OpenLLaMA-Reproduce-1023.41B

NaNK
llama
78
0

OpenLLaMA-Reproduce-318.77B

NaNK
llama
78
0

OpenLLaMA-Reproduce-973.08B

NaNK
llama
78
0

OpenLLaMA-Reproduce-1728.05B

NaNK
llama
78
0

OpenLLaMA-Reproduce-1933.57B

NaNK
llama
78
0

Infinity-Instruct-3M-0625-Mistral-7B-COIG-P

This repository contains the Infinity-Instruct-3M-0625-Mistral-7B-COIG-P model of the paper COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values. This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated. - Developed by: [More Information Needed] - Funded by [optional]: [More Information Needed] - Shared by [optional]: [More Information Needed] - Model type: [More Information Needed] - Language(s) (NLP): [More Information Needed] - License: [More Information Needed] - Finetuned from model [optional]: [More Information Needed] - Repository: [More Information Needed] - Paper [optional]: [More Information Needed] - Demo [optional]: [More Information Needed] Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). - Hardware Type: [More Information Needed] - Hours used: [More Information Needed] - Cloud Provider: [More Information Needed] - Compute Region: [More Information Needed] - Carbon Emitted: [More Information Needed]

NaNK
78
0

Infinity-Instruct-3M-0625-Qwen2-7B-COIG-P

This repository contains the Infinity-Instruct-3M-0625-Qwen2-7B-COIG-P model of the paper COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values. This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated. - Developed by: [More Information Needed] - Funded by [optional]: [More Information Needed] - Shared by [optional]: [More Information Needed] - Model type: [More Information Needed] - Language(s) (NLP): [More Information Needed] - License: [More Information Needed] - Finetuned from model [optional]: [More Information Needed] - Repository: [More Information Needed] - Paper [optional]: [More Information Needed] - Demo [optional]: [More Information Needed] Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). - Hardware Type: [More Information Needed] - Hours used: [More Information Needed] - Cloud Provider: [More Information Needed] - Compute Region: [More Information Needed] - Carbon Emitted: [More Information Needed]

NaNK
78
0

CriticLeanGPT-Qwen2.5-32B-Instruct-SFT-RL

NaNK
78
0

OpenCodeInterpreter-DS-1.3B

NaNK
llama
77
25

neo_7b_instruct_v0.1

NaNK
llama
77
10

Kun-LabelModel

llama
77
6

MuPT-v1-8192-1.07B

NaNK
llama
77
5

OpenCodeInterpreter-SC2-15B

NaNK
license:apache-2.0
77
4

CriticLeanGPT-Qwen2.5-7B-Instruct-SFT

NaNK
llama-factory
77
1

Amber-Reproduce-201.33B

NaNK
77
0

Amber-Reproduce-50.33B

NaNK
77
0

Amber-Reproduce-700.45B

NaNK
77
0

OpenLLaMA-Reproduce-1610.61B

NaNK
llama
77
0

OpenLLaMA-Reproduce-503.32B

NaNK
llama
77
0

OpenLLaMA-Reproduce-536.87B

NaNK
llama
77
0

OpenLLaMA-Reproduce-654.31B

NaNK
llama
77
0

OpenLLaMA-Reproduce-1191.18B

NaNK
llama
77
0

MuPT-v1.1-8192-1.07B

NaNK
llama
77
0

MuPT-v0-4096-1.97B

NaNK
llama
76
1

CT-LLM-SFT

llama
76
1

Amber-Reproduce-79.69B

NaNK
76
0

Amber-Reproduce-499.12B

NaNK
76
0

Amber-Reproduce-998.24B

NaNK
76
0

OpenLLaMA-Reproduce-100.66B

NaNK
llama
76
0

OpenLLaMA-Reproduce-1828.72B

NaNK
llama
76
0

CriticLeanGPT-Qwen2.5-32B-Instruct-SFT

NaNK
llama-factory
76
0

TreePO-Qwen2.5-7B_GRPO-TreePO-Sampling

NaNK
76
0

Amber-Reproduce-1300.23B

NaNK
75
0

OpenLLaMA-Reproduce-2041.21B

NaNK
llama
75
0

OpenLLaMA-Reproduce-1073.74B

NaNK
llama
75
0

MuPT-v1.1-8192-1.97B

NaNK
llama
75
0

CriticLeanGPT-Qwen2.5-14B-Instruct-SFT-RL

NaNK
74
1

Amber-Reproduce-88.08B

NaNK
74
0

Amber-Reproduce-1098.91B

NaNK
74
0

OpenLLaMA-Reproduce-335.54B

NaNK
llama
74
0

OpenLLaMA-Reproduce-117.44B

NaNK
llama
74
0

TreePO-Qwen2.5-7B_fixed-div

NaNK
74
0

TreePO-Qwen2.5-7B_Low_Prob_Encourage

NaNK
74
0

OpenCodeInterpreter-SC2-3B

NaNK
license:apache-2.0
73
7

Amber-Reproduce-29.36B

NaNK
73
0

OpenLLaMA-Reproduce-872.42B

NaNK
llama
73
0

OpenLLaMA-Reproduce-1291.85B

NaNK
llama
73
0

Qwen2-Instruct-7B-COIG-P

This model, Qwen2-Instruct-7B-COIG-P, is a 7B parameter large language model fine-tuned for instruction following, particularly within the Chinese language domain. It's based on the Qwen-2 architecture and trained using the COIG-P dataset, focusing on aligning the model's output with human preferences. This repository contains the Qwen2-Instruct-7B-COIG-P model described in the paper COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values. This model excels at generating text responses in Chinese according to user instructions. - Developed by: [More Information Needed - Add developer/organization details] - Funded by [optional]: [More Information Needed - Add funding source information] - Shared by [optional]: m-a-p - Model type: Large Language Model (LLM) - Language(s) (NLP): Chinese (zh) - License: Apache 2.0 - Finetuned from model [optional]: [More Information Needed - Add base model details] - Repository: https://github.com/m-a-p/COIG-P - Paper: https://arxiv.org/abs/2504.05535 - Demo [optional]: [More Information Needed - Add demo link if available] This model can be used directly for text generation tasks in Chinese. Users can provide instructions or prompts, and the model will generate corresponding text outputs. This model can be fine-tuned for various downstream tasks such as question answering, text summarization, and translation, specifically within the Chinese language context. This model may not perform well on tasks requiring knowledge outside of the domain covered by the COIG-P dataset. Its performance in languages other than Chinese is also expected to be limited. [More Information Needed - Add information on biases, risks, and limitations. Consider potential biases in the training data and the model's potential for generating harmful or inappropriate content.] 
[More Information Needed - Add recommendations to mitigate biases, risks, and limitations] [More Information Needed - Add details about the training data, linking to the Hugging Face dataset card if applicable. The dataset used is COIG-P: https://huggingface.co/datasets/m-a-p/COIG-P] [More Information Needed - Add details about the training procedure, including pre-processing steps and hyperparameters.] [More Information Needed - Detail the evaluation setup, including datasets, factors, and metrics used.] [More Information Needed - Present the evaluation results.] [More Information Needed - Summarize the evaluation results.] [More Information Needed - Estimate and report the environmental impact of training this model.]

NaNK
license:apache-2.0
73
0

CT-LLM-SFT-DPO

llama
72
5

Amber-Reproduce-8.39B

NaNK
72
0

MuPT-v1.1-8192-4.23B

NaNK
llama
72
0

CriticLeanGPT-Qwen2.5-32B-RL

NaNK
72
0

TreePO-Qwen2.5-7B_Naive2Low_Scheduler

NaNK
72
0

OpenLLaMA-Reproduce-754.97B

NaNK
llama
71
0

CriticLeanGPT-Qwen2.5-14B-RL

NaNK
70
1

Amber-Reproduce-801.11B

NaNK
69
0

neo_7b_sft_v0.1

NaNK
llama
68
1

CriticLeanGPT-Qwen2.5-14B-Instruct-SFT

NaNK
llama-factory
68
1

Amber-Reproduce-58.72B

NaNK
68
0

OpenLLaMA-Reproduce-1509.95B

NaNK
llama
68
0

CriticLeanGPT-Qwen2.5-7B-RL

NaNK
67
1

OpenLLaMA-Reproduce-436.21B

NaNK
llama
67
0

MusiLingo-long-v1

license:cc-by-4.0
63
6

MusiLingo-musicqa-v1

license:cc-by-nc-4.0
61
2

MusiLingo-short-v1

license:cc-by-4.0
60
4

340M-20B-DeltaNet-pure

NaNK
53
0

MIO-7B-Instruct

NaNK
llama
40
3

340M-20B-RetNet-hybrid-3-1

NaNK
34
0

1.3B-100B-RetNet-hybrid-6-1

NaNK
34
0

340M-20B-RetNet-pure

NaNK
33
0

1.3B-100B-GLA-hybrid-6-1

NaNK
33
0

340M-20B-HGRN-hybrid-6-1

NaNK
33
0

340M-20B-HGRN2-hybrid-6-1

NaNK
32
0

340M-20B-GatedDeltaNet-New-hybrid-24-1

NaNK
32
0

transformer_1.3B_baseline

NaNK
32
0

340M-20B-GLA-hybrid-6-1

NaNK
31
0

340M-20B-DeltaNet-hybrid-6-1

NaNK
31
0

340M-20B-DeltaNet-hybrid-12-1

NaNK
31
0

340M-20B-DeltaNet-hybrid-24-1

NaNK
31
0

340M-20B-RetNet-hybrid-12-1

NaNK
31
0

CRM_llama3

llama
31
0

1.3B-100B-HGRN2-hybrid-6-1

NaNK
31
0

1.3B-100B-HGRN-hybrid-3-1

NaNK
31
0

340M-20B-HGRN-hybrid-12-1

NaNK
31
0

340M-20B-HGRN-pure-baseline

NaNK
31
0

1.3B-100B-RetNet-hybrid-3-1

NaNK
31
0

1.3B-100B-GatedDeltaNet-pure

NaNK
31
0

1.3B-100B-DeltaNet-hybrid-3-1

NaNK
31
0

1.3B-100B-DeltaNet-hybrid-24-1

NaNK
31
0

340M-20B-GatedDeltaNet-hybrid-3-1

NaNK
30
1

340M-20B-GatedDeltaNet-hybrid-6-1

NaNK
30
0

340M-20B-GLA-hybrid-12-1

NaNK
30
0

340M-20B-GLA-hybrid-24-1

NaNK
30
0

340M-20B-HGRN2-hybrid-3-1

NaNK
30
0

340M-20B-RetNet-hybrid-24-1

NaNK
30
0

340M-20B-RetNet-hybrid-6-1

NaNK
30
0

1.3B-100B-GatedDeltaNet-hybrid-3-1

NaNK
30
0

1.3B-100B-GatedDeltaNet-hybrid-6-1

NaNK
30
0

1.3B-100B-GatedDeltaNet-hybrid-12-1

NaNK
30
0

1.3B-100B-GatedDeltaNet-hybrid-24-1

NaNK
30
0

1.3B-100B-GLA-hybrid-3-1

NaNK
30
0

1.3B-100B-HGRN2-hybrid-3-1

NaNK
30
0

1.3B-100B-HGRN2-hybrid-12-1

NaNK
30
0

1.3B-100B-HGRN-hybrid-6-1

NaNK
30
0

1.3B-100B-HGRN-pure

NaNK
30
0

340M-20B-HGRN-hybrid-24-1

NaNK
30
0

1.3B-100B-RetNet-hybrid-12-1

NaNK
30
0

1.3B-100B-GLA-pure

NaNK
30
0

340M-20B-GLA-pure-baseline

NaNK
30
0

1.3B-100B-DeltaNet-pure

NaNK
30
0

transformer_340M_baseline

30
0

340M-20B-GLA-hybrid-3-1

NaNK
29
0

340M-20B-HGRN2-pure-baseline

NaNK
29
0

340M-20B-HGRN2-hybrid-12-1

NaNK
29
0

340M-20B-DeltaNet-hybrid-3-1

NaNK
29
0

1.3B-100B-GLA-hybrid-12-1

NaNK
29
0

1.3B-100B-HGRN2-hybrid-24-1

NaNK
29
0

1.3B-100B-HGRN-hybrid-12-1

NaNK
29
0

340M-20B-HGRN-hybrid-3-1

NaNK
29
0

1.3B-100B-RetNet-pure

NaNK
29
0

1.3B-100B-HGRN2-pure

NaNK
29
0

340M-20B-GatedDeltaNet-hybrid-12-1

NaNK
29
0

1.3B-100B-DeltaNet-hybrid-6-1

NaNK
29
0

1.3B-100B-DeltaNet-hybrid-12-1

NaNK
29
0

340M-20B-GatedDeltaNet-pure-baseline

NaNK
29
0

340M-20B-HGRN2-hybrid-24-1

NaNK
28
0

1.3B-100B-GLA-hybrid-24-1

NaNK
28
0

1.3B-100B-HGRN-hybrid-24-1

NaNK
28
0

1.3B-100B-RetNet-hybrid-24-1

NaNK
28
0

key_sota_20250618

3
1

xcodec

1
1

xcodec_mini_infer

license:apache-2.0
0
13

neo_2b_general

NaNK
license:apache-2.0
0
5

FineFineWeb-bert

license:apache-2.0
0
5

neo_7b_decay

NaNK
license:apache-2.0
0
4

neo_scalinglaw_250M

license:apache-2.0
0
2

neo_7b_intermediate

NaNK
license:apache-2.0
0
2

MuPT-intermediate-ckpts

license:apache-2.0
0
1

CT-LLM-intermediate-ckpts

0
1

neo_scalinglaw_460M

license:apache-2.0
0
1

neo_scalinglaw_980M

license:apache-2.0
0
1