Magistral-Small-2506_gguf

by mistralai · 24B params · Q8 quantization (GGUF, for llama.cpp) · 24 languages · License: Other · 415 downloads
Quick Summary

> [!NOTE]
> At Mistral, we don't yet have too much experience with providing GGUF-quantized checkpoints to the community, but want to help improving the ecosystem...

Device Compatibility

| Device | Requirement |
| --- | --- |
| Mobile | 4-6GB RAM |
| Laptop | 16GB RAM |
| Server | GPU |
| Minimum Recommended | 23-34GB+ RAM |

Code Examples

Download the model

```bash
pip install -U "huggingface_hub[cli]"

huggingface-cli download \
"mistralai/Magistral-Small-2506_gguf" \
--local-dir "mistralai/Magistral-Small-2506_gguf/"
```
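
If you only need the Q8_0 weights rather than the full repository, the CLI's `--include` filter can narrow the download; a small sketch, with the file name taken from the run commands below:

```bash
# Fetch only the Q8_0 checkpoint (file name as used in the examples below)
huggingface-cli download \
"mistralai/Magistral-Small-2506_gguf" \
--include "Magistral-Small-2506_Q8_0.gguf" \
--local-dir "mistralai/Magistral-Small-2506_gguf/"
```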
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
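
On machines with a supported GPU, llama.cpp can offload model layers with the `-ngl` flag, which a checkpoint of this size generally needs for comfortable speeds; a minimal variation, assuming a GPU-enabled build:

```bash
# Offload all layers to the GPU (requires a GPU-enabled llama.cpp build)
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95 \
-ngl 99
```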
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
-sys "your_system_prompt" \bash
llama-cli --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960 \
--temp 0.7 \
--top_p 0.95
# -sys "your_system_prompt" \
Serve the model

```bash
llama-server --jinja \
-m mistralai/Magistral-Small-2506_gguf/Magistral-Small-2506_Q8_0.gguf \
--ctx-size 40960
```
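
Once running, llama-server exposes an OpenAI-compatible HTTP API (port 8080 by default); a quick smoke test with curl, assuming a local instance:

```bash
# Query the local llama-server instance (default port 8080 assumed)
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
  "messages": [{"role": "user", "content": "Hello!"}],
  "temperature": 0.7,
  "top_p": 0.95
}'
```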
Query the server from Python (OpenAI client)

```python
from huggingface_hub import hf_hub_download
import openai

client = openai.OpenAI(
    base_url="http://<your-url>:8080/v1",
    api_key="not-needed",
)

def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt

SYSTEM_PROMPT = load_system_prompt("mistralai/Magistral-Small-2506_gguf", "SYSTEM_PROMPT.txt")

completion = client.chat.completions.create(
    model="Magistral-Small-2506_Q8_0.gguf",
    messages=[
        # The following line is not required if you use the default system prompt.
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "How many 'r' are in strawberry?"},
    ],
    temperature=0.7,
    top_p=0.95,
    stream=True,
)

print("client: Start streaming chat completions...")
printed_content = False

for chunk in completion:
    content = None
    if hasattr(chunk.choices[0].delta, "content"):
        content = chunk.choices[0].delta.content

    if content is not None:
        if not printed_content:
            printed_content = True
            print("\ncontent:", end="", flush=True)
        # Extract and print the streamed content
        print(content, end="", flush=True)
```
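
The script above needs both client libraries installed; assuming a standard pip environment:

```bash
pip install -U openai huggingface_hub
```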
