m-a-p
MERT-v1-95M
MERT-v1-330M
The development log of our Music Audio Pre-training (m-a-p) model family:

- 02/06/2023: arXiv pre-print and training code released.
- 17/03/2023: we release two advanced music understanding models, MERT-v1-95M and MERT-v1-330M, trained with a new paradigm and dataset. They outperform the previous models and generalize better to more tasks.
- 14/03/2023: we retrained the MERT-v0 model with an open-source-only music dataset: MERT-v0-public.
- 29/12/2022: a music understanding model, MERT-v0, trained with the MLM paradigm, which performs better at downstream tasks.
- 29/10/2022: a pre-trained MIR model, music2vec, trained with the BYOL paradigm.

| Name | Pre-train Paradigm | Training Data (hours) | Pre-train Context (seconds) | Model Size | Transformer Layer-Dimension | Feature Rate | Sample Rate | Release Date |
| -------------- | ---- | ---- | -- | ---- | ------- | ----- | ------ | ---------- |
| MERT-v1-330M   | MLM  | 160K | 5  | 330M | 24-1024 | 75 Hz | 24 kHz | 17/03/2023 |
| MERT-v1-95M    | MLM  | 20K  | 5  | 95M  | 12-768  | 75 Hz | 24 kHz | 17/03/2023 |
| MERT-v0-public | MLM  | 900  | 5  | 95M  | 12-768  | 50 Hz | 16 kHz | 14/03/2023 |
| MERT-v0        | MLM  | 1000 | 5  | 95M  | 12-768  | 50 Hz | 16 kHz | 29/12/2022 |
| music2vec-v1   | BYOL | 1000 | 30 | 95M  | 12-768  | 50 Hz | 16 kHz | 30/10/2022 |

The m-a-p models share a similar architecture; the most significant difference between them is the pre-training paradigm. Beyond that, there are a few technical details worth knowing before use:

- Model Size: the number of parameters loaded into memory. Please select a size appropriate for your hardware.
- Transformer Layer-Dimension: the number of transformer layers and the corresponding feature dimensions the model can output. We call this out because features extracted from different layers can perform differently depending on the task.
- Feature Rate: the number of feature frames the model outputs for a 1-second audio input.
- Sample Rate: the audio sampling frequency the model was trained with.

Compared to MERT-v0, we introduce several new things in MERT-v1 pre-training:

- The pseudo labels are changed to the 8 codebooks from EnCodec, which are potentially higher quality and enable the model to support music generation.
- MLM prediction with in-batch noise mixture.
- Training at a higher audio sample rate (24 kHz).
- Training with more audio data (up to 160 thousand hours).
- More available model sizes: 95M and 330M.

More details will be provided in our upcoming paper.
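To illustrate the layer-wise features described above, here is a minimal extraction sketch with 🤗 transformers. It assumes the MERT-v1-95M checkpoint loads via `trust_remote_code` with a `Wav2Vec2FeatureExtractor`-style processor; check the model card for the exact API before relying on it:

```python
import torch
from transformers import AutoModel, Wav2Vec2FeatureExtractor

# assumed loading pattern for the custom MERT architecture
model = AutoModel.from_pretrained("m-a-p/MERT-v1-95M", trust_remote_code=True)
processor = Wav2Vec2FeatureExtractor.from_pretrained(
    "m-a-p/MERT-v1-95M", trust_remote_code=True
)

# MERT-v1 expects 24 kHz audio; resample your input to processor.sampling_rate first
sample_rate = processor.sampling_rate
audio = torch.randn(sample_rate * 5)  # placeholder: 5 seconds of audio

inputs = processor(audio.numpy(), sampling_rate=sample_rate, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# hidden_states holds one tensor per layer (embedding + 12 transformer layers
# for the 95M model), shape [batch, time, 768] each, at the 75 Hz feature rate
all_layers = torch.stack(outputs.hidden_states)  # [13, batch, time, 768]

# different layers suit different downstream tasks; a simple starting point
# is the time-averaged representation from each layer
layer_features = all_layers.mean(dim=2)  # [13, batch, 768]
```

Probing each layer's averaged features with a lightweight classifier is a common way to pick the best layer for a given downstream task.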
YuE-s1-7B-anneal-en-cot
YuE-s1-7B-anneal-en-cot 🤗 | YuE-s1-7B-anneal-en-icl 🤗 | YuE-s1-7B-anneal-jp-kr-cot 🤗 | YuE-s1-7B-anneal-jp-kr-icl 🤗 | YuE-s1-7B-anneal-zh-cot 🤗 | YuE-s1-7B-anneal-zh-icl 🤗 | YuE-s2-1B-general 🤗 | YuE-upsampler 🤗

---

Our model's name is YuE (乐). In Chinese, the word means "music" and "happiness." Some of you may find words that start with Yu hard to pronounce. If so, you can just call it "yeah." We wrote a song with our model's name.

YuE is a groundbreaking series of open-source foundation models designed for music generation, specifically for transforming lyrics into full songs (lyrics2song). It can generate a complete song, lasting several minutes, that includes both a catchy vocal track and an accompaniment track. YuE can model diverse genres, languages, and vocal techniques. Please visit the Demo Page for amazing vocal performances.

- 2025.03.12 🔥 Paper released 🎉: the YuE technical report is out! We discuss all the technical details, findings, and lessons learned. Enjoy, and feel free to cite us.
- 2025.03.11 🫶 YuE now supports incremental song generation! See YuE-UI by joeljuvel. YuE-UI is a Gradio-based interface supporting batch generation, output selection, and continuation. You can flexibly experiment with audio prompts and different model settings, visualize your progress on an interactive timeline, rewind actions, quickly preview stage-1 audio outputs before committing to refinement, and fully save/load your sessions (JSON format). It is optimized to run smoothly even on GPUs with just 8 GB of VRAM using quantized models.
- 2025.02.17 🫶 YuE now supports music continuation and Google Colab! See YuE-extend by Mozer.
- 2025.02.07 🎉 Get YuE for Windows on pinokio.
- 2025.01.30 🔥 Inference update: we now support dual-track ICL mode! You can prompt the model with a reference song, and it will generate a new song in a similar style (voice cloning demo by @abrakjamson, music style transfer demo by @cocktailpeanut, etc.). Try it out! P.S. Be sure to check out the demos first; they're truly impressive.
- 2025.01.30 🔥 Announcement, a new era under Apache 2.0: we are thrilled to announce that, in response to overwhelming requests from our community, YuE is now officially licensed under the Apache 2.0 license. We sincerely hope this marks a watershed moment for music generation and creative AI, akin to what Stable Diffusion and LLaMA achieved in their respective fields. 🎉
- 2025.01.29 🎉 We have updated the license description. We ENCOURAGE artists and content creators to sample and incorporate outputs generated by our model into their own works, and even monetize them. The only requirement is to credit our name: YuE by HKUST/M-A-P (alphabetic order).
- 2025.01.28 🫶 Thanks to Fahd for creating a tutorial on how to quickly get started with YuE. Here is his demonstration.
- 2025.01.26 🔥 We have released the YuE series.

License Agreement & Disclaimer

- The YuE model (including its weights) is released under the Apache License, Version 2.0. We do not make any profit from this model, and we hope it can be used for the betterment of human creativity.
- Use & Attribution:
  - We encourage artists and content creators to freely incorporate outputs generated by YuE into their own works, including commercial projects.
  - We encourage attribution to the model's name ("YuE by HKUST/M-A-P"), especially for public and commercial use.
- Originality & Plagiarism: it is the sole responsibility of creators to ensure that their works, derived from or inspired by YuE outputs, do not plagiarize or unlawfully reproduce existing material. We strongly urge users to perform their own due diligence to avoid copyright infringement or other legal violations.
- Recommended Labeling: when uploading works to streaming platforms or sharing them publicly, we recommend labels such as "AI-generated", "YuE-generated", "AI-assisted", or "AI-auxiliated". This helps maintain transparency about the creative process.
- Disclaimer of Liability:
  - We do not assume any responsibility for the misuse of this model, including (but not limited to) illegal, malicious, or unethical activities.
  - Users are solely responsible for any content generated using the YuE model and for any consequences arising from its use.
  - By using this model, you agree that you understand and comply with all applicable laws and regulations regarding your generated content.

Acknowledgements

The project is co-led by HKUST and M-A-P (alphabetic order). Thanks also to moonshot.ai, bytedance, 01.ai, and geely for supporting the project. A friendly link to the HKUST Audio group's Hugging Face space. We deeply appreciate all the support we received along the way. Long live open-source AI! If you find our paper and code useful in your research, please consider giving a star :star: and a citation :pencil: :)
YuE-s2-1B-general
MERT-v0-public
ChatMusician
OpenCodeInterpreter-DS-6.7B
OpenCodeInterpreter-CL-13B
OpenCodeInterpreter-CL-7B
YuE-s1-7B-anneal-en-icl
ChatMusician-Base
music2vec-v1
MERT-v0
YuE-s1-7B-anneal-zh-cot
YuE-s1-7B-anneal-jp-kr-cot
MuPT-v0-4096-190M
OpenCodeInterpreter-DS-33B
YuE-s1-7B-anneal-jp-kr-icl
YuE-s1-7B-anneal-zh-icl
MIO-7B-Base
Kun-PrimaryChatModel
MuPT-v1-8192-1.97B
Amber-Reproduce-599.79B
Amber-Reproduce-301.99B
Amber-Reproduce-20.97B
Amber-Reproduce-71.30B
CT-LLM-Base
MuPT-v1-8192-190M
TreePO-Qwen2.5-7B
We release the resources for the paper TreePO:

- Checkpoint with average weighted subgroup advantages + more diverse initial divergence (the final one). ← You are here.
- Checkpoint with average weighted subgroup advantages + fixed divergence.
- The training dataset consists of DeepScaleR and SimpleRL math-reasoning data.

More links:

- Huggingface Paper
- Project Page
- X/Twitter Thread
- Github Repo

If you find this work useful, please consider citing the paper:
OpenLLaMA-Reproduce-218.1B
CriticLeanGPT-Qwen3-8B-RL
CriticLeanGPT-Qwen3-32B-RL
YuE-upsampler
MuPT-v0-8192-1.97B
MuPT-v0-8192-550M
MuPT-v0-4096-550M
MuPT-v0-4096-1.07B
neo_7b
OpenCodeInterpreter-CL-70B
OpenCodeInterpreter-CL-34B
MuPT-v0-8192-190M
OpenLLaMA-Reproduce-1409.29B
OpenCodeInterpreter-SC2-7B
YuE-s1-0.5B
CriticLeanGPT-Qwen2.5-7B-Instruct-SFT-RL
MuPT-v0-8192-1.07B
Amber-Reproduce-100.66B
OpenLLaMA-Reproduce-2030.04B
Qwen2.5-Instruct-7B-COIG-P
This repository contains the Qwen2.5-Instruct-7B-COIG-P model, a 7B-parameter large language model fine-tuned for instruction following using the COIG-P dataset, as described in the paper COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values.

- Developed by: [More Information Needed]
- Funded by: [More Information Needed]
- Shared by: [More Information Needed]
- Model type: Large Language Model (LLM)
- Language(s) (NLP): Chinese (zh)
- License: cc-by-nc-4.0
- Finetuned from model: Qwen2
- Repository: [More Information Needed]
- Paper: COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values

This model is designed for text generation tasks and is particularly well suited to Chinese language processing. It can be used for generating creative text formats, translating languages, and answering questions. It can also be fine-tuned for various downstream tasks, including chatbots, code generation, summarization, and question answering. LLaMA-Factory can be used to fine-tune the model.

The model's performance may be limited on tasks significantly different from those it was trained on, or on tasks requiring understanding of languages other than Chinese. The model may exhibit biases present in its training data, particularly biases inherent in the Chinese language and culture. Users should be aware of these potential biases and limitations and use the model responsibly and ethically, avoiding applications that could perpetuate or amplify harmful biases.

Use the following code to get started with the Qwen2.5-Instruct-7B-COIG-P model:

The model was trained on the COIG-P dataset (https://huggingface.co/datasets/m-a-p/COIG-P), which consists of 101k Chinese preference pairs across six domains: Chat, Code, Math, Logic, Novel, and Role.

- Checkpoint size: [More Information Needed]
- Training time: [More Information Needed]

The model's performance is evaluated using the Chinese Reward Benchmark (CRBench) and AlignBench.

- Chinese Reward Benchmark (CRBench): https://huggingface.co/datasets/m-a-p/COIG-P-CRM
- AlignBench: https://github.com/THUDM/AlignBench

[Add metrics from paper, e.g., accuracy, precision, recall] [Add results from paper, including tables and figures if appropriate]
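The quick-start snippet this card mentions is not included above; here is a minimal sketch with 🤗 transformers, assuming the Hub id `m-a-p/Qwen2.5-Instruct-7B-COIG-P` (following the repository name) and the standard chat-template API:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "m-a-p/Qwen2.5-Instruct-7B-COIG-P"  # assumed Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# the model targets Chinese instruction following
messages = [{"role": "user", "content": "请简要介绍一下大语言模型。"}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated = model.generate(**inputs, max_new_tokens=256)
# strip the prompt tokens before decoding the response
response = tokenizer.batch_decode(
    generated[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)[0]
print(response)
```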
Infinity-Instruct-3M-0625-Llama3-8B-COIG-P
```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    LogitsProcessorList,
    MinLengthLogitsProcessor,
    TemperatureLogitsWarper,
)

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "m-a-p/Infinity-Instruct-3M-0625-Llama3-8B-COIG-P",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "m-a-p/Infinity-Instruct-3M-0625-Llama3-8B-COIG-P"
)

prompt = "Give me a short introduction to large language model."
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

logits_processor = LogitsProcessorList(
    [
        MinLengthLogitsProcessor(1, eos_token_id=tokenizer.eos_token_id),
        TemperatureLogitsWarper(0.7),
    ]
)

generated_ids = model.generate(
    model_inputs.input_ids,
    logits_processor=logits_processor,
    max_new_tokens=512,
)
# strip the prompt tokens before decoding the response
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

```bibtex
@misc{pteam2025coigphighqualitylargescalechinese,
      title={COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values},
      author={P Team and Siwei Wu and Jincheng Ren and Xinrun Du and Shuyue Guo and Xingwei Qu and Yiming Liang and Jie Liu and Yunwen Li and Tianyu Zheng and Boyu Feng and Huaqing Yuan and Zenith Wang and Jiaheng Liu and Wenhao Huang and Chenglin Cai and Haoran Que and Jian Yang and Yuelin Bai and Zekun Moore Wang and Zhouliang Yu and Qunshu Lin and Ding Pan and Yuchen Jiang and Tiannan Wang and Wangchunshu Zhou and Shenzhi Wang and Xingyuan Bu and Minghao Liu and Guoyin Wang and Ge Zhang and Chenghua Lin},
      year={2025},
      eprint={2504.05535},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.05535},
}
```
CriticLeanGPT-Qwen3-14B-RL
MuPT-v1-8192-550M
Amber-Reproduce-901.78B
Amber-Reproduce-1199.57B
Amber-Reproduce-41.94B
Amber-Reproduce-398.46B
OpenLLaMA-Reproduce-1023.41B
OpenLLaMA-Reproduce-318.77B
OpenLLaMA-Reproduce-973.08B
OpenLLaMA-Reproduce-1728.05B
OpenLLaMA-Reproduce-1933.57B
Infinity-Instruct-3M-0625-Mistral-7B-COIG-P
This repository contains the Infinity-Instruct-3M-0625-Mistral-7B-COIG-P model of the paper COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values. This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated. - Developed by: [More Information Needed] - Funded by [optional]: [More Information Needed] - Shared by [optional]: [More Information Needed] - Model type: [More Information Needed] - Language(s) (NLP): [More Information Needed] - License: [More Information Needed] - Finetuned from model [optional]: [More Information Needed] - Repository: [More Information Needed] - Paper [optional]: [More Information Needed] - Demo [optional]: [More Information Needed] Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). - Hardware Type: [More Information Needed] - Hours used: [More Information Needed] - Cloud Provider: [More Information Needed] - Compute Region: [More Information Needed] - Carbon Emitted: [More Information Needed]
Infinity-Instruct-3M-0625-Qwen2-7B-COIG-P
This repository contains the Infinity-Instruct-3M-0625-Qwen2-7B-COIG-P model of the paper COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values. This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated. - Developed by: [More Information Needed] - Funded by [optional]: [More Information Needed] - Shared by [optional]: [More Information Needed] - Model type: [More Information Needed] - Language(s) (NLP): [More Information Needed] - License: [More Information Needed] - Finetuned from model [optional]: [More Information Needed] - Repository: [More Information Needed] - Paper [optional]: [More Information Needed] - Demo [optional]: [More Information Needed] Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). - Hardware Type: [More Information Needed] - Hours used: [More Information Needed] - Cloud Provider: [More Information Needed] - Compute Region: [More Information Needed] - Carbon Emitted: [More Information Needed]
CriticLeanGPT-Qwen2.5-32B-Instruct-SFT-RL
OpenCodeInterpreter-DS-1.3B
neo_7b_instruct_v0.1
Kun-LabelModel
MuPT-v1-8192-1.07B
OpenCodeInterpreter-SC2-15B
CriticLeanGPT-Qwen2.5-7B-Instruct-SFT
Amber-Reproduce-201.33B
Amber-Reproduce-50.33B
Amber-Reproduce-700.45B
OpenLLaMA-Reproduce-1610.61B
OpenLLaMA-Reproduce-503.32B
OpenLLaMA-Reproduce-536.87B
OpenLLaMA-Reproduce-654.31B
OpenLLaMA-Reproduce-1191.18B
MuPT-v1.1-8192-1.07B
MuPT-v0-4096-1.97B
CT-LLM-SFT
Amber-Reproduce-79.69B
Amber-Reproduce-499.12B
Amber-Reproduce-998.24B
OpenLLaMA-Reproduce-100.66B
OpenLLaMA-Reproduce-1828.72B
CriticLeanGPT-Qwen2.5-32B-Instruct-SFT
TreePO-Qwen2.5-7B_GRPO-TreePO-Sampling
Amber-Reproduce-1300.23B
OpenLLaMA-Reproduce-2041.21B
OpenLLaMA-Reproduce-1073.74B
MuPT-v1.1-8192-1.97B
CriticLeanGPT-Qwen2.5-14B-Instruct-SFT-RL
Amber-Reproduce-88.08B
Amber-Reproduce-1098.91B
OpenLLaMA-Reproduce-335.54B
OpenLLaMA-Reproduce-117.44B
TreePO-Qwen2.5-7B_fixed-div
TreePO-Qwen2.5-7B_Low_Prob_Encourage
OpenCodeInterpreter-SC2-3B
Amber-Reproduce-29.36B
OpenLLaMA-Reproduce-872.42B
OpenLLaMA-Reproduce-1291.85B
Qwen2-Instruct-7B-COIG-P
This model, Qwen2-Instruct-7B-COIG-P, is a 7B parameter large language model fine-tuned for instruction following, particularly within the Chinese language domain. It's based on the Qwen-2 architecture and trained using the COIG-P dataset, focusing on aligning the model's output with human preferences. This repository contains the Qwen2-Instruct-7B-COIG-P model described in the paper COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values. This model excels at generating text responses in Chinese according to user instructions. - Developed by: [More Information Needed - Add developer/organization details] - Funded by [optional]: [More Information Needed - Add funding source information] - Shared by [optional]: m-a-p - Model type: Large Language Model (LLM) - Language(s) (NLP): Chinese (zh) - License: Apache 2.0 - Finetuned from model [optional]: [More Information Needed - Add base model details] - Repository: https://github.com/m-a-p/COIG-P - Paper: https://arxiv.org/abs/2504.05535 - Demo [optional]: [More Information Needed - Add demo link if available] This model can be used directly for text generation tasks in Chinese. Users can provide instructions or prompts, and the model will generate corresponding text outputs. This model can be fine-tuned for various downstream tasks such as question answering, text summarization, and translation, specifically within the Chinese language context. This model may not perform well on tasks requiring knowledge outside of the domain covered by the COIG-P dataset. Its performance in languages other than Chinese is also expected to be limited. [More Information Needed - Add information on biases, risks, and limitations. Consider potential biases in the training data and the model's potential for generating harmful or inappropriate content.] 
[More Information Needed - Add recommendations to mitigate biases, risks, and limitations] [More Information Needed - Add details about the training data, linking to the Hugging Face dataset card if applicable. The dataset used is COIG-P: https://huggingface.co/datasets/m-a-p/COIG-P] [More Information Needed - Add details about the training procedure, including pre-processing steps and hyperparameters.] [More Information Needed - Detail the evaluation setup, including datasets, factors, and metrics used.] [More Information Needed - Present the evaluation results.] [More Information Needed - Summarize the evaluation results.] [More Information Needed - Estimate and report the environmental impact of training this model.]