m-a-p

197 models

MERT-v1-95M

The development log of our Music Audio Pre-training (m-a-p) model family: - 02/06/2023: arXiv pre-print and training code released. - 17/03/2023: we released two advanced music understanding models...

license:cc-by-nc-4.0
91,332
38

MERT-v1-330M

The development log of our Music Audio Pre-training (m-a-p) model family:

- 02/06/2023: arXiv pre-print and training code released.
- 17/03/2023: we released two advanced music understanding models, MERT-v1-95M and MERT-v1-330M, trained with a new paradigm and dataset. They outperform the previous models and generalize better to more tasks.
- 14/03/2023: we retrained the MERT-v0 model with an open-source-only music dataset: MERT-v0-public.
- 29/12/2022: a music understanding model, MERT-v0, trained with the MLM paradigm, which performs better on downstream tasks.
- 29/10/2022: a pre-trained MIR model, music2vec, trained with the BYOL paradigm.

| Name | Pre-train Paradigm | Training Data (hour) | Pre-train Context (second) | Model Size | Transformer Layer-Dimension | Feature Rate | Sample Rate | Release Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MERT-v1-330M | MLM | 160K | 5 | 330M | 24-1024 | 75 Hz | 24K Hz | 17/03/2023 |
| MERT-v1-95M | MLM | 20K | 5 | 95M | 12-768 | 75 Hz | 24K Hz | 17/03/2023 |
| MERT-v0-public | MLM | 900 | 5 | 95M | 12-768 | 50 Hz | 16K Hz | 14/03/2023 |
| MERT-v0 | MLM | 1000 | 5 | 95M | 12-768 | 50 Hz | 16K Hz | 29/12/2022 |
| music2vec-v1 | BYOL | 1000 | 30 | 95M | 12-768 | 50 Hz | 16K Hz | 30/10/2022 |

The m-a-p models share a similar architecture; the most distinguishing difference is the pre-training paradigm. Beyond that, there are a few technical configurations to know before use:

- Model Size: the number of parameters loaded into memory. Please select a size appropriate for your hardware.
- Transformer Layer-Dimension: the number of transformer layers and the corresponding feature dimensions the model can output. This is called out because features extracted from different layers can perform differently depending on the task.
- Feature Rate: the number of features the model outputs for 1 second of audio input.
- Sample Rate: the audio sampling frequency the model was trained with.

Compared to MERT-v0, MERT-v1 pre-training introduces several changes:

- Pseudo labels changed to 8 codebooks from EnCodec, which potentially have higher quality and empower the model to support music generation.
- MLM prediction with in-batch noise mixture.
- Training at a higher audio frequency (24K Hz).
- Training with more audio data (up to 160 thousand hours).
- More available model sizes: 95M and 330M.

More details will be given in our coming-soon paper.
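The table above implies a simple relationship between clip length and model output: at MERT-v1's 75 Hz feature rate, each second of audio yields 75 feature vectors per transformer layer. A minimal sketch of that arithmetic (the helper name is ours, not part of the model API):

```python
def expected_feature_frames(duration_s: float, feature_rate_hz: int = 75) -> int:
    """Approximate number of feature vectors per layer for an audio clip.

    MERT-v1 models emit features at 75 Hz; MERT-v0, MERT-v0-public, and
    music2vec-v1 emit features at 50 Hz (see the table above).
    """
    return int(duration_s * feature_rate_hz)

# A 5-second pre-training context at 75 Hz gives 375 frames per layer;
# the same clip through a 50 Hz model gives 250 frames.
print(expected_feature_frames(5))      # 375
print(expected_feature_frames(5, 50))  # 250
```

This is only the frame count; the per-frame dimension is the second number in the Layer-Dimension column (1024 for MERT-v1-330M, 768 for the 95M models).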

license:cc-by-nc-4.0
51,415
76

YuE-s1-7B-anneal-en-cot

YuE-s1-7B-anneal-en-cot 🤗  |  YuE-s1-7B-anneal-en-icl 🤗  |  YuE-s1-7B-anneal-jp-kr-cot 🤗 YuE-s1-7B-anneal-jp-kr-icl 🤗  |  YuE-s1-7B-anneal-zh-cot 🤗  |  YuE-s1-7B-anneal-zh-icl 🤗 YuE-s2-1B-general 🤗  |  YuE-upsampler 🤗 --- Our model's name is YuE (乐). In Chinese, the word means "music" and "happiness." Some of you may find words that start with Yu hard to pronounce. If so, you can just call it "yeah." We wrote a song with our model's name. YuE is a groundbreaking series of open-source foundation models designed for music generation, specifically for transforming lyrics into full songs (lyrics2song). It can generate a complete song, lasting several minutes, that includes both a catchy vocal track and accompaniment track. YuE is capable of modeling diverse genres/languages/vocal techniques. Please visit the Demo Page for amazing vocal performance. 2025.03.12 🔥 Paper Released🎉: We now release YuE technical report!!! We discuss all the technical details, findings, and lessons learned. Enjoy, and feel free to cite us~ 2025.03.11 🫶 Now YuE supports incremental song generation!!! See YuE-UI by joeljuvel. YuE-UI is a Gradio-based interface supporting batch generation, output selection, and continuation. You can flexibly experiment with audio prompts and different model settings, visualize your progress on an interactive timeline, rewind actions, quickly preview audio outputs at stage 1 before committing to refinement, and fully save/load your sessions (JSON format). Optimized to run smoothly even on GPUs with just 8GB VRAM using quantized models. 2025.02.17 🫶 Now YuE supports music continuation and Google Colab! See YuE-extend by Mozer. 2025.02.07 🎉 Get YuE for Windows on pinokio. 2025.01.30 🔥 Inference Update: We now support dual-track ICL mode! You can prompt the model with a reference song, and it will generate a new song in a similar style (voice cloning demo by @abrakjamson, music style transfer demo by @cocktailpeanut, etc.). Try it out! 🔥🔥🔥 P.S. 
Be sure to check out the demos first—they're truly impressive. 2025.01.30 🔥 Announcement: A New Era Under Apache 2.0 🔥: We are thrilled to announce that, in response to overwhelming requests from our community, YuE is now officially licensed under the Apache 2.0 license. We sincerely hope this marks a watershed moment—akin to what Stable Diffusion and LLaMA have achieved in their respective fields—for music generation and creative AI. 🎉🎉🎉 2025.01.29 🎉: We have updated the license description. We ENCOURAGE artists and content creators to sample and incorporate outputs generated by our model into their own works, and even monetize them. The only requirement is to credit our name: YuE by HKUST/M-A-P (alphabetic order). 2025.01.28 🫶: Thanks to Fahd for creating a tutorial on how to quickly get started with YuE. Here is his demonstration. 2025.01.26 🔥: We have released the YuE series. License Agreement & Disclaimer - The YuE model (including its weights) is now released under the Apache License, Version 2.0. We do not make any profit from this model, and we hope it can be used for the betterment of human creativity. - Use & Attribution: - We encourage artists and content creators to freely incorporate outputs generated by YuE into their own works, including commercial projects. - We encourage attribution to the model's name ("YuE by HKUST/M-A-P"), especially for public and commercial use. - Originality & Plagiarism: It is the sole responsibility of creators to ensure that their works, derived from or inspired by YuE outputs, do not plagiarize or unlawfully reproduce existing material. We strongly urge users to perform their own due diligence to avoid copyright infringement or other legal violations. - Recommended Labeling: When uploading works to streaming platforms or sharing them publicly, we recommend labeling them with terms such as "AI-generated", "YuE-generated", "AI-assisted", or "AI-auxiliated". This helps maintain transparency about the creative process.
- Disclaimer of Liability: - We do not assume any responsibility for the misuse of this model, including (but not limited to) illegal, malicious, or unethical activities. - Users are solely responsible for any content generated using the YuE model and for any consequences arising from its use. - By using this model, you agree that you understand and comply with all applicable laws and regulations regarding your generated content. Acknowledgements The project is co-led by HKUST and M-A-P (alphabetic order). Thanks also to moonshot.ai, ByteDance, 01.ai, and Geely for supporting the project. A friendly link to the HKUST Audio group's Hugging Face space. We deeply appreciate all the support we received along the way. Long live open-source AI! If you find our paper and code useful in your research, please consider giving a star :star: and citation :pencil: :)

NaNK
llama
31,966
434

YuE-s2-1B-general

NaNK
llama
6,971
56

MERT-v0-public

license:cc-by-nc-4.0
3,757
5

ChatMusician

llama
1,086
149

OpenCodeInterpreter-DS-6.7B

NaNK
llama
907
135

OpenCodeInterpreter-CL-13B

NaNK
llama
846
9

OpenCodeInterpreter-CL-7B

NaNK
llama
810
11

YuE-s1-7B-anneal-en-icl

(Model card text identical to YuE-s1-7B-anneal-en-cot above.)

NaNK
llama
734
51

ChatMusician-Base

llama
486
14

music2vec-v1

license:cc-by-nc-4.0
484
42

MERT-v0

license:cc-by-nc-4.0
390
19

YuE-s1-7B-anneal-zh-cot

(Model card text identical to YuE-s1-7B-anneal-en-cot above.)

NaNK
llama
287
40

YuE-s1-7B-anneal-jp-kr-cot

NaNK
llama
279
21

MuPT-v0-4096-190M

llama
196
1

OpenCodeInterpreter-DS-33B

NaNK
llama
188
148

YuE-s1-7B-anneal-jp-kr-icl

NaNK
llama
152
11

YuE-s1-7B-anneal-zh-icl

NaNK
llama
143
16

MIO-7B-Base

NaNK
llama
136
0

Kun-PrimaryChatModel

133
0

MuPT-v1-8192-1.97B

NaNK
llama
119
12

Amber-Reproduce-599.79B

NaNK
94
0

Amber-Reproduce-301.99B

NaNK
93
0

Amber-Reproduce-20.97B

NaNK
91
0

Amber-Reproduce-71.30B

NaNK
90
0

CT-LLM-Base

llama
87
11

MuPT-v1-8192-190M

llama
87
4

TreePO-Qwen2.5-7B

We release the resources for the paper TreePO: - Checkpoint with average weighted subgroup advantages + more diverse initial divergence (the final one). ← You are here. - Checkpoint with average weighted subgroup advantages + fixed divergence. - The training dataset consists of DeepScaleR and SimpleRL math reasoning data. More links: - Hugging Face Paper - Project Page - X/Twitter Thread - GitHub Repo If you find this work useful, please consider citing the paper:

NaNK
86
2

OpenLLaMA-Reproduce-218.1B

NaNK
llama
85
0

CriticLeanGPT-Qwen3-8B-RL

NaNK
84
3

CriticLeanGPT-Qwen3-32B-RL

NaNK
84
0

YuE-upsampler

license:apache-2.0
83
24

MuPT-v0-8192-1.97B

NaNK
llama
82
19

MuPT-v0-8192-550M

llama
82
1

MuPT-v0-4096-550M

llama
82
0

MuPT-v0-4096-1.07B

NaNK
llama
81
0

neo_7b

NaNK
llama
80
56

OpenCodeInterpreter-CL-70B

NaNK
llama
80
24

OpenCodeInterpreter-CL-34B

NaNK
llama
80
14

MuPT-v0-8192-190M

llama
80
0

OpenLLaMA-Reproduce-1409.29B

NaNK
llama
80
0

OpenCodeInterpreter-SC2-7B

NaNK
license:apache-2.0
79
14

YuE-s1-0.5B

NaNK
llama
79
3

CriticLeanGPT-Qwen2.5-7B-Instruct-SFT-RL

NaNK
79
1

MuPT-v0-8192-1.07B

NaNK
llama
79
0

Amber-Reproduce-100.66B

NaNK
79
0

OpenLLaMA-Reproduce-2030.04B

NaNK
llama
79
0

Qwen2.5-Instruct-7B-COIG-P

This repository contains the Qwen2.5-Instruct-7B-COIG-P model, a 7B parameter Large Language Model fine-tuned for instruction following using the COIG-P dataset, as described in the paper COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values. - Developed by: [More Information Needed] - Funded by: [More Information Needed] - Shared by: [More Information Needed] - Model type: Large Language Model (LLM) - Language(s) (NLP): Chinese (zh) - License: cc-by-nc-4.0 - Finetuned from model: Qwen2 - Repository: [More Information Needed] - Paper: COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values This model is designed for text generation tasks and is particularly well-suited for Chinese language processing. It can be used for generating creative text formats, translating languages, and answering questions. The model can be fine-tuned for various downstream tasks, including chatbots, code generation, summarization, question answering, and other NLP tasks. The Llama-Factory can be used for fine-tuning the model. The model's performance may be limited when applied to tasks significantly different from those it was trained on or tasks requiring understanding of languages other than Chinese. The model may exhibit biases present in its training data, particularly reflecting biases inherent in the Chinese language and culture. Users should be aware of potential biases and limitations and use the model responsibly and ethically, avoiding applications that could perpetuate or amplify harmful biases. Use the following code to get started with the Qwen2.5-Instruct-7B-COIG-P model: The model was trained on the COIG-P dataset (https://huggingface.co/datasets/m-a-p/COIG-P). This dataset consists of 101k Chinese preference pairs across six domains: Chat, Code, Math, Logic, Novel, and Role. 
- Checkpoint size: [More Information Needed] - Training time: [More Information Needed] The model's performance is evaluated using the Chinese Reward Benchmark (CRBench) and AlignBench. - Chinese Reward Benchmark (CRBench): https://huggingface.co/datasets/m-a-p/COIG-P-CRM - AlignBench: https://github.com/THUDM/AlignBench [Add metrics from paper, e.g., accuracy, precision, recall] [Add results from paper, including tables and figures if appropriate]
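The quick-start code referenced in the card above did not survive extraction. As a stand-in, here is a small sketch of organizing COIG-P-style preference pairs across the six domains the card lists (Chat, Code, Math, Logic, Novel, Role); the field names are illustrative assumptions, not the dataset's confirmed schema:

```python
# Hypothetical record layout for COIG-P-style preference pairs.
# Field names ("domain", "prompt", "chosen", "rejected") are illustrative
# assumptions; consult the COIG-P dataset card for the real schema.
DOMAINS = {"Chat", "Code", "Math", "Logic", "Novel", "Role"}

def split_by_domain(pairs):
    """Group preference pairs by domain, rejecting unknown domain tags."""
    grouped = {d: [] for d in DOMAINS}
    for pair in pairs:
        domain = pair["domain"]
        if domain not in DOMAINS:
            raise ValueError(f"unknown domain: {domain}")
        grouped[domain].append(pair)
    return grouped

sample = [
    {"domain": "Math", "prompt": "1+1=?", "chosen": "2", "rejected": "3"},
    {"domain": "Chat", "prompt": "你好", "chosen": "你好！", "rejected": "……"},
    {"domain": "Math", "prompt": "2*3=?", "chosen": "6", "rejected": "5"},
]
groups = split_by_domain(sample)
print(len(groups["Math"]), len(groups["Chat"]), len(groups["Code"]))  # 2 1 0
```

Grouping by domain like this is useful when evaluating an aligned model per domain, as the card's CRBench evaluation does.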

NaNK
license:cc-by-nc-4.0
79
0

Infinity-Instruct-3M-0625-Llama3-8B-COIG-P

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    LogitsProcessorList,
    MinLengthLogitsProcessor,
    TemperatureLogitsWarper,
)
import torch

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "m-a-p/Infinity-Instruct-3M-0625-Llama3-8B-COIG-P",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "m-a-p/Infinity-Instruct-3M-0625-Llama3-8B-COIG-P"
)

prompt = "Give me a short introduction to large language model."
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

logits_processor = LogitsProcessorList(
    [
        MinLengthLogitsProcessor(1, eos_token_id=tokenizer.eos_token_id),
        TemperatureLogitsWarper(0.7),
    ]
)

generated_ids = model.generate(
    model_inputs.input_ids,
    logits_processor=logits_processor,
    max_new_tokens=512,
)
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

```bibtex
@misc{pteam2025coigphighqualitylargescalechinese,
  title={COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values},
  author={P Team and Siwei Wu and Jincheng Ren and Xinrun Du and Shuyue Guo and Xingwei Qu and Yiming Liang and Jie Liu and Yunwen Li and Tianyu Zheng and Boyu Feng and Huaqing Yuan and Zenith Wang and Jiaheng Liu and Wenhao Huang and Chenglin Cai and Haoran Que and Jian Yang and Yuelin Bai and Zekun Moore Wang and Zhouliang Yu and Qunshu Lin and Ding Pan and Yuchen Jiang and Tiannan Wang and Wangchunshu Zhou and Shenzhi Wang and Xingyuan Bu and Minghao Liu and Guoyin Wang and Ge Zhang and Chenghua Lin},
  year={2025},
  eprint={2504.05535},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2504.05535},
}
```

NaNK
llama
79
0

CriticLeanGPT-Qwen3-14B-RL

NaNK
79
0

MuPT-v1-8192-550M

llama
78
2

Amber-Reproduce-901.78B

NaNK
78
0

Amber-Reproduce-1199.57B

NaNK
78
0

Amber-Reproduce-41.94B

NaNK
78
0

Amber-Reproduce-398.46B

NaNK
78
0

OpenLLaMA-Reproduce-1023.41B

NaNK
llama
78
0

OpenLLaMA-Reproduce-318.77B

NaNK
llama
78
0

OpenLLaMA-Reproduce-973.08B

NaNK
llama
78
0

OpenLLaMA-Reproduce-1728.05B

NaNK
llama
78
0

OpenLLaMA-Reproduce-1933.57B

NaNK
llama
78
0

Infinity-Instruct-3M-0625-Mistral-7B-COIG-P

This repository contains the Infinity-Instruct-3M-0625-Mistral-7B-COIG-P model of the paper COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values. This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated. - Developed by: [More Information Needed] - Funded by [optional]: [More Information Needed] - Shared by [optional]: [More Information Needed] - Model type: [More Information Needed] - Language(s) (NLP): [More Information Needed] - License: [More Information Needed] - Finetuned from model [optional]: [More Information Needed] - Repository: [More Information Needed] - Paper [optional]: [More Information Needed] - Demo [optional]: [More Information Needed] Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). - Hardware Type: [More Information Needed] - Hours used: [More Information Needed] - Cloud Provider: [More Information Needed] - Compute Region: [More Information Needed] - Carbon Emitted: [More Information Needed]

NaNK
78
0

Infinity-Instruct-3M-0625-Qwen2-7B-COIG-P

This repository contains the Infinity-Instruct-3M-0625-Qwen2-7B-COIG-P model of the paper COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values. This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated. - Developed by: [More Information Needed] - Funded by [optional]: [More Information Needed] - Shared by [optional]: [More Information Needed] - Model type: [More Information Needed] - Language(s) (NLP): [More Information Needed] - License: [More Information Needed] - Finetuned from model [optional]: [More Information Needed] - Repository: [More Information Needed] - Paper [optional]: [More Information Needed] - Demo [optional]: [More Information Needed] Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). - Hardware Type: [More Information Needed] - Hours used: [More Information Needed] - Cloud Provider: [More Information Needed] - Compute Region: [More Information Needed] - Carbon Emitted: [More Information Needed]

NaNK
78
0

CriticLeanGPT-Qwen2.5-32B-Instruct-SFT-RL

NaNK
78
0

OpenCodeInterpreter-DS-1.3B

NaNK
llama
77
25

neo_7b_instruct_v0.1

NaNK
llama
77
10

Kun-LabelModel

llama
77
6

MuPT-v1-8192-1.07B

NaNK
llama
77
5

OpenCodeInterpreter-SC2-15B

NaNK
license:apache-2.0
77
4

CriticLeanGPT-Qwen2.5-7B-Instruct-SFT

NaNK
llama-factory
77
1

Amber-Reproduce-201.33B

NaNK
77
0

Amber-Reproduce-50.33B

NaNK
77
0

Amber-Reproduce-700.45B

NaNK
77
0

OpenLLaMA-Reproduce-1610.61B

NaNK
llama
77
0

OpenLLaMA-Reproduce-503.32B

NaNK
llama
77
0

OpenLLaMA-Reproduce-536.87B

NaNK
llama
77
0

OpenLLaMA-Reproduce-654.31B

NaNK
llama
77
0

OpenLLaMA-Reproduce-1191.18B

NaNK
llama
77
0

MuPT-v1.1-8192-1.07B

NaNK
llama
77
0

MuPT-v0-4096-1.97B

NaNK
llama
76
1

CT-LLM-SFT

llama
76
1

Amber-Reproduce-79.69B

NaNK
76
0

Amber-Reproduce-499.12B

NaNK
76
0

Amber-Reproduce-998.24B

NaNK
76
0

OpenLLaMA-Reproduce-100.66B

NaNK
llama
76
0

OpenLLaMA-Reproduce-1828.72B

NaNK
llama
76
0

CriticLeanGPT-Qwen2.5-32B-Instruct-SFT

NaNK
llama-factory
76
0

TreePO-Qwen2.5-7B_GRPO-TreePO-Sampling

NaNK
76
0

Amber-Reproduce-1300.23B

NaNK
75
0

OpenLLaMA-Reproduce-2041.21B

NaNK
llama
75
0

OpenLLaMA-Reproduce-1073.74B

NaNK
llama
75
0

MuPT-v1.1-8192-1.97B

NaNK
llama
75
0

CriticLeanGPT-Qwen2.5-14B-Instruct-SFT-RL

NaNK
74
1

Amber-Reproduce-88.08B

NaNK
74
0

Amber-Reproduce-1098.91B

NaNK
74
0

OpenLLaMA-Reproduce-335.54B

NaNK
llama
74
0

OpenLLaMA-Reproduce-117.44B

NaNK
llama
74
0

TreePO-Qwen2.5-7B_fixed-div

NaNK
74
0

TreePO-Qwen2.5-7B_Low_Prob_Encourage

NaNK
74
0

OpenCodeInterpreter-SC2-3B

NaNK
license:apache-2.0
73
7

Amber-Reproduce-29.36B

NaNK
73
0

OpenLLaMA-Reproduce-872.42B

NaNK
llama
73
0

OpenLLaMA-Reproduce-1291.85B

NaNK
llama
73
0

Qwen2-Instruct-7B-COIG-P

This model, Qwen2-Instruct-7B-COIG-P, is a 7B parameter large language model fine-tuned for instruction following, particularly within the Chinese language domain. It's based on the Qwen-2 architecture and trained using the COIG-P dataset, focusing on aligning the model's output with human preferences. This repository contains the Qwen2-Instruct-7B-COIG-P model described in the paper COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values. This model excels at generating text responses in Chinese according to user instructions. - Developed by: [More Information Needed - Add developer/organization details] - Funded by [optional]: [More Information Needed - Add funding source information] - Shared by [optional]: m-a-p - Model type: Large Language Model (LLM) - Language(s) (NLP): Chinese (zh) - License: Apache 2.0 - Finetuned from model [optional]: [More Information Needed - Add base model details] - Repository: https://github.com/m-a-p/COIG-P - Paper: https://arxiv.org/abs/2504.05535 - Demo [optional]: [More Information Needed - Add demo link if available] This model can be used directly for text generation tasks in Chinese. Users can provide instructions or prompts, and the model will generate corresponding text outputs. This model can be fine-tuned for various downstream tasks such as question answering, text summarization, and translation, specifically within the Chinese language context. This model may not perform well on tasks requiring knowledge outside of the domain covered by the COIG-P dataset. Its performance in languages other than Chinese is also expected to be limited. [More Information Needed - Add information on biases, risks, and limitations. Consider potential biases in the training data and the model's potential for generating harmful or inappropriate content.] 
[More Information Needed - Add recommendations to mitigate biases, risks, and limitations] [More Information Needed - Add details about the training data, linking to the Hugging Face dataset card if applicable. The dataset used is COIG-P: https://huggingface.co/datasets/m-a-p/COIG-P] [More Information Needed - Add details about the training procedure, including pre-processing steps and hyperparameters.] [More Information Needed - Detail the evaluation setup, including datasets, factors, and metrics used.] [More Information Needed - Present the evaluation results.] [More Information Needed - Summarize the evaluation results.] [More Information Needed - Estimate and report the environmental impact of training this model.]

NaNK
license:apache-2.0
73
0

CT-LLM-SFT-DPO

llama
72
5

Amber-Reproduce-8.39B

NaNK
72
0

MuPT-v1.1-8192-4.23B

NaNK
llama
72
0

CriticLeanGPT-Qwen2.5-32B-RL

NaNK
72
0

TreePO-Qwen2.5-7B_Naive2Low_Scheduler

NaNK
72
0

OpenLLaMA-Reproduce-754.97B

NaNK
llama
71
0

CriticLeanGPT-Qwen2.5-14B-RL

NaNK
70
1

Amber-Reproduce-801.11B

NaNK
69
0

neo_7b_sft_v0.1

NaNK
llama
68
1

CriticLeanGPT-Qwen2.5-14B-Instruct-SFT

NaNK
llama-factory
68
1

Amber-Reproduce-58.72B

NaNK
68
0

OpenLLaMA-Reproduce-1509.95B

NaNK
llama
68
0

CriticLeanGPT-Qwen2.5-7B-RL

NaNK
67
1

OpenLLaMA-Reproduce-436.21B

NaNK
llama
67
0

MusiLingo-long-v1

license:cc-by-4.0
63
6

MusiLingo-musicqa-v1

license:cc-by-nc-4.0
61
2

MusiLingo-short-v1

license:cc-by-4.0
60
4

340M-20B-DeltaNet-pure

NaNK
53
0

MIO-7B-Instruct

NaNK
llama
40
3

340M-20B-RetNet-hybrid-3-1

NaNK
34
0

1.3B-100B-RetNet-hybrid-6-1

NaNK
34
0

340M-20B-RetNet-pure

NaNK
33
0

1.3B-100B-GLA-hybrid-6-1

NaNK
33
0

340M-20B-HGRN-hybrid-6-1

NaNK
33
0

340M-20B-HGRN2-hybrid-6-1

NaNK
32
0

340M-20B-GatedDeltaNet-New-hybrid-24-1

NaNK
32
0

transformer_1.3B_baseline

NaNK
32
0

340M-20B-GLA-hybrid-6-1

NaNK
31
0

340M-20B-DeltaNet-hybrid-6-1

NaNK
31
0

340M-20B-DeltaNet-hybrid-12-1

NaNK
31
0

340M-20B-DeltaNet-hybrid-24-1

NaNK
31
0

340M-20B-RetNet-hybrid-12-1

NaNK
31
0

CRM_llama3

llama
31
0

1.3B-100B-HGRN2-hybrid-6-1

NaNK
31
0

1.3B-100B-HGRN-hybrid-3-1

NaNK
31
0

340M-20B-HGRN-hybrid-12-1

NaNK
31
0

340M-20B-HGRN-pure-baseline

NaNK
31
0

1.3B-100B-RetNet-hybrid-3-1

NaNK
31
0

1.3B-100B-GatedDeltaNet-pure

NaNK
31
0

1.3B-100B-DeltaNet-hybrid-3-1

NaNK
31
0

1.3B-100B-DeltaNet-hybrid-24-1

NaNK
31
0

340M-20B-GatedDeltaNet-hybrid-3-1

NaNK
30
1

340M-20B-GatedDeltaNet-hybrid-6-1

NaNK
30
0

340M-20B-GLA-hybrid-12-1

NaNK
30
0

340M-20B-GLA-hybrid-24-1

NaNK
30
0

340M-20B-HGRN2-hybrid-3-1

NaNK
30
0

340M-20B-RetNet-hybrid-24-1

NaNK
30
0

340M-20B-RetNet-hybrid-6-1

NaNK
30
0

1.3B-100B-GatedDeltaNet-hybrid-3-1

NaNK
30
0

1.3B-100B-GatedDeltaNet-hybrid-6-1

NaNK
30
0

1.3B-100B-GatedDeltaNet-hybrid-12-1

NaNK
30
0

1.3B-100B-GatedDeltaNet-hybrid-24-1

NaNK
30
0

1.3B-100B-GLA-hybrid-3-1

NaNK
30
0

1.3B-100B-HGRN2-hybrid-3-1

NaNK
30
0

1.3B-100B-HGRN2-hybrid-12-1

NaNK
30
0

1.3B-100B-HGRN-hybrid-6-1

NaNK
30
0

1.3B-100B-HGRN-pure

NaNK
30
0

340M-20B-HGRN-hybrid-24-1

NaNK
30
0

1.3B-100B-RetNet-hybrid-12-1

NaNK
30
0

1.3B-100B-GLA-pure

NaNK
30
0

340M-20B-GLA-pure-baseline

NaNK
30
0

1.3B-100B-DeltaNet-pure

NaNK
30
0

transformer_340M_baseline

30
0

340M-20B-GLA-hybrid-3-1

NaNK
29
0

340M-20B-HGRN2-pure-baseline

NaNK
29
0

340M-20B-HGRN2-hybrid-12-1

NaNK
29
0

340M-20B-DeltaNet-hybrid-3-1

NaNK
29
0

1.3B-100B-GLA-hybrid-12-1

NaNK
29
0

1.3B-100B-HGRN2-hybrid-24-1

NaNK
29
0

1.3B-100B-HGRN-hybrid-12-1

NaNK
29
0

340M-20B-HGRN-hybrid-3-1

NaNK
29
0

1.3B-100B-RetNet-pure

NaNK
29
0

1.3B-100B-HGRN2-pure

NaNK
29
0

340M-20B-GatedDeltaNet-hybrid-12-1

NaNK
29
0

1.3B-100B-DeltaNet-hybrid-6-1

NaNK
29
0

1.3B-100B-DeltaNet-hybrid-12-1

NaNK
29
0

340M-20B-GatedDeltaNet-pure-baseline

NaNK
29
0

340M-20B-HGRN2-hybrid-24-1

NaNK
28
0

1.3B-100B-GLA-hybrid-24-1

NaNK
28
0

1.3B-100B-HGRN-hybrid-24-1

NaNK
28
0

1.3B-100B-RetNet-hybrid-24-1

NaNK
28
0

key_sota_20250618

3
1

xcodec

1
1

xcodec_mini_infer

license:apache-2.0
0
13

neo_2b_general

NaNK
license:apache-2.0
0
5

FineFineWeb-bert

license:apache-2.0
0
5

neo_7b_decay

NaNK
license:apache-2.0
0
4

neo_scalinglaw_250M

license:apache-2.0
0
2

neo_7b_intermediate

NaNK
license:apache-2.0
0
2

MuPT-intermediate-ckpts

license:apache-2.0
0
1

CT-LLM-intermediate-ckpts

0
1

neo_scalinglaw_460M

license:apache-2.0
0
1

neo_scalinglaw_980M

license:apache-2.0
0
1