Cheng-1

License: MIT
Author: marcuscedricridia
Type: Language Model
Status: Early-stage · 2 downloads
Edge AI suitability: Mobile · Laptop · Server
Quick Summary

Model Overview: Cheng-1 is a Qwen2.5-7B-based language model created by merging several pre-existing, well-regarded fine-tuned models with mergekit, rather than by training from scratch.

Code Examples

**Merge Code (SCE merge → Yell-Qwen2.5-7B-1M):**

```yaml
merge_method: sce
models:
  - model: Qwen/Qwen2.5-7B-Instruct-1M
  - model: Qwen/Qwen2.5-7B
base_model: Qwen/Qwen2.5-7B-Instruct-1M
parameters:
  select_topk: 1
dtype: bfloat16
tokenizer_source: base
normalize: true
int8_mask: true
name: Yell-Qwen2.5-7B-1M
```
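Assuming mergekit is installed (`pip install mergekit`), a config like the one above is normally applied with the `mergekit-yaml` CLI. The sketch below just writes the config to disk and shows the invocation in a comment; the output directory name is illustrative:

```python
from pathlib import Path

# The SCE merge config from above, verbatim.
config = """\
merge_method: sce
models:
  - model: Qwen/Qwen2.5-7B-Instruct-1M
  - model: Qwen/Qwen2.5-7B
base_model: Qwen/Qwen2.5-7B-Instruct-1M
parameters:
  select_topk: 1
dtype: bfloat16
tokenizer_source: base
normalize: true
int8_mask: true
"""

Path("sce-config.yaml").write_text(config)

# Then, on a machine with enough memory for two 7B checkpoints:
#   mergekit-yaml sce-config.yaml ./Yell-Qwen2.5-7B-1M
```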
**Merge Code (DELLA merge → Cheng-1):**

```yaml
merge_method: della
base_model: marcuscedricridia/Yell-Qwen2.5-7B-1M
models:
  - model: TIGER-Lab/AceCoder-Qwen2.5-7B-Ins-Rule
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
  - model: Krystalan/DRT-7B
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
  - model: nvidia/AceMath-7B-Instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Cheng-1
```
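As a rough numeric intuition for the DELLA settings above (not mergekit's actual implementation): each fine-tuned model contributes a task vector (its weights minus the base), `density` controls how many delta entries are kept, `weight` scales each model's contribution, and `lambda` scales the combined delta before it is added back to the base. With `density: 1` and `weight: 1`, as configured here, nothing is dropped and the merge reduces to a lambda-scaled sum of task vectors; the toy numbers below are invented for illustration:

```python
# Toy 1-D "weights": a base model and three fine-tuned variants.
base = [0.10, 0.20, 0.30]
finetuned = [
    [0.12, 0.18, 0.30],   # e.g. a coder model
    [0.10, 0.22, 0.28],   # e.g. a translation model
    [0.14, 0.20, 0.34],   # e.g. a math model
]

lam = 0.9       # the `lambda: 0.9` from the config
weight = 1.0    # `weight: 1`; with `density: 1` no entries are dropped

# Task vectors: delta_i = finetuned_i - base.
deltas = [[f - b for f, b in zip(ft, base)] for ft in finetuned]

# Merge: base + lambda * sum(weight * delta_i) over all models.
merged = [
    b + lam * sum(weight * d[j] for d in deltas)
    for j, b in enumerate(base)
]
print([round(m, 3) for m in merged])
```

With `density` below 1, DELLA would additionally drop delta entries stochastically (favoring small magnitudes) and rescale the survivors, which is what distinguishes it from plain task arithmetic.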
**Merge Code (Model Stock merge → Cheng-1):**

```yaml
merge_method: model_stock
base_model: YOYO-AI/Qwen2.5-7B-it-restore
models:
  - model: marcuscedricridia/mergekit-della-wpunuct
  - model: marcuscedricridia/mergekit-della-phphmhr
  - model: marcuscedricridia/mergekit-della-qejrhsk
  - model: marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.2-1M
dtype: bfloat16
tokenizer_source: base
int8_mask: true
normalize: true
name: Cheng-1
```
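A simplified sketch of the Model Stock idea, for intuition only: the merged weights interpolate between the average of the fine-tuned models and the base, with a ratio derived from how geometrically close the fine-tuned task vectors are to one another (one common statement of the ratio is t = k·cos / ((k−1)·cos + 1) for k models). mergekit's implementation works per layer and differs in detail; the vectors below are invented toy numbers:

```python
import math

# Toy 1-D weights: a base model and four fine-tuned variants (as in the config).
base = [0.0, 0.0, 0.0]
finetuned = [
    [0.2, 0.1, 0.0],
    [0.1, 0.2, 0.1],
    [0.2, 0.0, 0.1],
    [0.1, 0.1, 0.2],
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Task vectors relative to the base.
deltas = [[f - b for f, b in zip(ft, base)] for ft in finetuned]

# Average pairwise cosine similarity between task vectors.
cosines = []
for i in range(len(deltas)):
    for j in range(i + 1, len(deltas)):
        a, b = deltas[i], deltas[j]
        cosines.append(dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))))
cos = sum(cosines) / len(cosines)

# Interpolation ratio: the closer the task vectors, the more the average is trusted.
k = len(deltas)
t = k * cos / ((k - 1) * cos + 1)

avg = [sum(col) / k for col in zip(*finetuned)]
merged = [t * a + (1 - t) * b for a, b in zip(avg, base)]
```

The merged point always lies between the base and the fine-tuned average, which is why Model Stock tends to preserve base-model behavior better than a plain average.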
**Benchmarks:**

```text
Model: marcuscedricridia/Cheng-1
Precision: torch.bfloat16
Revision: cd8c9dd37c67c2e1b7c683fdd5e72b7f08c074b9

Average: 36.06
IFEval: 77.89
BBH: 36.54
MATH: 48.94
GPQA: 6.15
MUSR: 9.62
MMLU-PRO: 37.21
```
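The reported Average is the arithmetic mean of the six individual benchmark scores, which can be checked directly:

```python
# Benchmark scores as listed above.
scores = {
    "IFEval": 77.89,
    "BBH": 36.54,
    "MATH": 48.94,
    "GPQA": 6.15,
    "MUSR": 9.62,
    "MMLU-PRO": 37.21,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # -> 36.06
```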
