DeepSeek-V3.1-int4-mixed-AutoRound
by Intel · Language Model
Quick Summary
A mixed INT4/INT8 weight-only quantization of deepseek-ai/DeepSeek-V3.1, produced with Intel's AutoRound (arXiv:2309.05516). Routed expert layers are quantized to 4 bits and all other linear layers except lm_head to 8 bits; see the generation recipe below.
Code Examples
How To Use (Python, transformers)

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

quantized_model_dir = "Intel/DeepSeek-V3.1-int4-mixed-AutoRound"

model = AutoModelForCausalLM.from_pretrained(
    quantized_model_dir,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, trust_remote_code=True)
prompts = [
    "9.11和9.8哪个数字大",
    "strawberry中有几个r?",
    "There is a girl who likes adventure,",
    "Please give a brief introduction of DeepSeek company.",
]
texts = []
for prompt in prompts:
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ]
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    texts.append(text)
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
outputs = model.generate(
    input_ids=inputs["input_ids"].to(model.device),
    attention_mask=inputs["attention_mask"].to(model.device),
    max_length=200,  # change this to align with the official usage
    num_return_sequences=1,
    do_sample=False,  # change this to align with the official usage
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs["input_ids"], outputs)
]
decoded_outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
for i, prompt in enumerate(prompts):
    print(f"Prompt: {prompt}")
    print(f"Generated: {decoded_outputs[i]}")
    print("-" * 50)
"""
Prompt: 9.11和9.8哪个数字大
Generated: 9.11 和 9.8 比较时,9.11 更大。
- 因为 9.11 相当于 9 + 0.11,而 9.8 相当于 9 + 0.8,但注意这里 0.11 实际上小于 0.8(0.11 < 0.8),所以 9.8 更大。
- 重新确认:9.11 是 9.11,9.8 是 9.80,因此 9.80 > 9.11。
**答案:9.8 更大。**
--------------------------------------------------
Prompt: strawberry中有几个r?
Generated: 在英文单词 "strawberry" 中,字母 "r" 出现了 **3 次**。
- 位置:第 3 个字母(s**t**r**a**w**b**e**r**r**y,注意:第 1 个 "r" 是第 3 字符,第 2 个 "r" 是第 6 字符,第 3 个 "r" 是第 7 字符)。
如果需要进一步解释或其他问题,请随时告知! 😊
--------------------------------------------------
Prompt: There is a girl who likes adventure,
Generated: Of course! A girl who likes adventure is a fantastic starting point for a story, a character, or a real-life inspiration. Here are a few ways to explore that idea:
### As a Character Profile:
**Name:** Let's call her **Elara**.
**Traits:**
* **Curious:** She asks "why" and "what if" more than anyone else. She sees a hidden path in the woods and has to know where it leads.
* **Resourceful:** She's the one with a multi-tool in her pocket, who knows how to read a map (and the stars), and can build a fire.
* **Brave, not fearless:** She feels the fear of climbing the tall cliff or exploring the dark cave, but her curiosity and determination are stronger.
* **Resilient:** She doesn't see a wrong turn
--------------------------------------------------
Prompt: Please give a brief introduction of DeepSeek company.
Generated: Of course. Here is a brief introduction to DeepSeek:
**DeepSeek** is a leading Chinese AI research company focused on developing powerful artificial general intelligence (AGI). The company is best known for creating state-of-the-art large language models (LLMs).
**Key Highlights:**
* **Core Product:** Their flagship product is the **DeepSeek-V2** language model, a powerful and efficient AI known for its strong performance in coding, mathematics, and general reasoning.
* **Open-Source Commitment:** DeepSeek has gained significant recognition for open-sourcing its earlier models (like DeepSeek-Coder and DeepSeek-LLM 67B), making them freely available for research and commercial use. This has helped foster innovation and build a strong developer community.
* **Specialization in Coding:** They are particularly renowned for their models' exceptional capabilities
--------------------------------------------------
"""Generate the modelpythontransformers
Generate the model (Python, transformers)

import torch
from transformers import AutoModelForCausalLM
from auto_round import AutoRound

model_name = "deepseek-ai/DeepSeek-V3.1"
# Load the original checkpoint so its Linear modules can be enumerated below
# (this step is implied by the loop over model.named_modules()).
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", trust_remote_code=True)
layer_config = {}
for n, m in model.named_modules():
    if isinstance(m, torch.nn.Linear):
        if "expert" in n and "shared_experts" not in n:
            # Routed expert projections: 4-bit
            layer_config[n] = {"bits": 4}
            print(n, 4)
        elif n != "lm_head":
            # All other Linear layers (lm_head stays unquantized): 8-bit
            layer_config[n] = {"bits": 8}
            print(n, 8)
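# Optional sanity check (not part of the original recipe): count how many
# Linear layers were assigned 4-bit vs. 8-bit before running quantization.
from collections import Counter
print(Counter(cfg["bits"] for cfg in layer_config.values()))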
autoround = AutoRound(model_name, iters=0, layer_config=layer_config)
autoround.quantize_and_save(format="auto_round", output_dir="tmp_autoround")
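In this recipe the routed expert projections, which hold the bulk of DeepSeek-V3.1's parameters, are quantized to 4 bits, while attention, dense MLP, and shared-expert layers stay at 8 bits and lm_head is left unquantized. As a quick check of what the export recorded, the sketch below dumps the quantization_config section of the saved config.json; the file path and key name follow the usual layout of quantized Hugging Face checkpoints and are assumptions rather than part of the original card.

import json

# Sketch only: print whatever quantization settings quantize_and_save wrote.
# The "quantization_config" key is an assumption about the exported layout.
with open("tmp_autoround/config.json") as f:
    cfg = json.load(f)
print(json.dumps(cfg.get("quantization_config", {}), indent=2))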