Fijik-1b-DPO

by Pinkstack · Apache 2.0 · 1B parameters

Recommended hardware: Mobile 4–6GB RAM, Laptop 16GB RAM, or a GPU server (1GB+ RAM minimum).

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
This is a Fijik 1.0-series model: a dense, 56-layer transformer LLM with **1 billion** parameters, based on Qwen2.5. Specifically, it was built with Mergekit by self-merging Qwen2.5 0.5B into a model twice its size.

After merging, we fine-tuned it on a custom dataset mix built for this model to improve its performance further:
- **Step 1 (fine-tuning via Unsloth):** SFT on an estimated 5 million tokens.
- **Step 2 (fine-tuning via Unsloth):** DPO for 2 epochs, for even better instruction following.

After these two steps, we got a capable model that has fewer parameters than Llama 3.2 3B yet performs comparably, if not better. Note that unlike our other recent models it is not a thinking model, though it can still reason quite well. Our theory behind this model is that a smaller but deeper model can outperform others of its size.
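The exact merge recipe is not published; as an illustration only, a self-merge of this kind is typically expressed as a Mergekit passthrough config along these lines (the base-model id and layer ranges below are assumptions for demonstration, not the actual recipe):

```yaml
# Illustrative Mergekit passthrough (self-merge) config -- NOT the published recipe.
# Stacking the same model's layer range twice doubles the network's depth.
slices:
  - sources:
      - model: Qwen/Qwen2.5-0.5B-Instruct
        layer_range: [0, 24]
  - sources:
      - model: Qwen/Qwen2.5-0.5B-Instruct
        layer_range: [0, 24]
merge_method: passthrough
dtype: bfloat16
```

A config like this would be run with Mergekit's `mergekit-yaml` CLI to produce the doubled-depth checkpoint, which is then fine-tuned as described above.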

Alibaba's Qwen team states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.
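These limits can be enforced client-side before calling the model. A minimal sketch (the function name and interface here are our own, not part of any library):

```python
# Enforce the documented limits: 32768 input tokens, 8192 generated tokens.
MAX_INPUT_TOKENS = 32768
MAX_NEW_TOKENS = 8192

def clamp_generation_budget(input_token_count: int, requested_new_tokens: int) -> int:
    """Return a safe max_new_tokens value, raising if the prompt itself is too long."""
    if input_token_count > MAX_INPUT_TOKENS:
        raise ValueError(
            f"Prompt is {input_token_count} tokens; the model accepts at most {MAX_INPUT_TOKENS}."
        )
    return min(requested_new_tokens, MAX_NEW_TOKENS)

# e.g. clamp_generation_budget(1000, 10000) caps the request at 8192 new tokens.
```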

# What should Fijik be used for?
Fijik 1.0 1B is designed to be a production-ready, general-use, high-performance model that is also small enough to run at high token throughput with minimal quality loss.
- We made an effort to keep the model safe while keeping it usable. It is also sensitive to system prompts in a good way (it adheres to them well), so it is very customisable. We did not put any model-identity information into our fine-tuning data; it knows it is a Large Language Model (LLM), but it does not know it is Fijik unless you say so in the system prompt.
- Thanks to its large context window, it can be used for RAG, but like any other LLM, be aware that it *may* hallucinate.
- Our fine-tuning data includes quite a few creative-writing examples, so the model is fairly good at creative writing.
- Coding and math: in our SFT and DPO fine-tuning data we put effort into improving coding and step-by-step math performance. It is not perfect, but no LLM is.
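Because the model only learns its identity from the system prompt, that prompt has to be included in every request. Since Fijik descends from Qwen2.5, it presumably inherits Qwen's ChatML-style chat template; in practice you would use the tokenizer's `apply_chat_template`, but the format can be sketched by hand (this template is an assumption based on the Qwen2.5 lineage, not a published spec for Fijik):

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Format a single-turn ChatML prompt as used by Qwen2.5-family models."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Identity comes only from the system prompt, as the card notes:
prompt = build_chatml_prompt(
    system="You are Fijik, a helpful assistant made by Pinkstack.",
    user="Who are you?",
)
```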
# Examples

None yet.

# Limitations
This model is not uncensored, yet it may still produce erotic outputs. You are solely responsible for the outputs you generate with it.
Like any other LLM, users and hosts alike should be aware that AI language models can hallucinate and produce inaccurate, dangerous, or even completely nonsensical outputs. Everything the model says may seem accurate, but for important tasks, always double-check its responses against credible sources.

# Uploaded model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model:** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5-based model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is, by design, a production-ready, general-use, high-performance model that is also small enough to run at high token throughput while minimising performance loss.
- We made an effort to ensure the model is safe while keeping it usable. In addition, it is sensitive to system prompts (in a good way: it adheres to them well), so it is very customisable. We did not put any information about the model's identity into our fine-tuning data; it knows it is a Large Language Model (LLM), but it does not know it is Fijik unless you specify that in the system prompt.
- Due to the model's large context window, it can be used for RAG, but like any other LLM, you should be aware that it *may* hallucinate.
- We included quite a few creative-writing examples in our fine-tuning data, so the model is fairly good at creative writing.
- Coding and math: in our SFT and DPO fine-tuning data we put effort into improving coding and step-by-step math performance. It is not perfect, but no LLM is.
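
Since the model only learns its identity from the system prompt, that prompt is where you give it one. Qwen2.5-family models use a ChatML-style chat template; in practice you should call `tokenizer.apply_chat_template`, but this sketch shows what the formatted prompt looks like:

```python
# Hedged sketch of the ChatML-style layout Qwen2.5 models expect.
# Prefer tokenizer.apply_chat_template in real code; this is illustrative.
def build_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_prompt(
    "You are Fijik, a helpful assistant made by Pinkstack.",
    "Who are you?",
)
```

With an identity in the system prompt, the model will answer as Fijik; without one, it only knows it is an LLM.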
# Examples

None yet.

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs of the model.
Like any other LLM, users and hosts alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completely nonsensical outputs. All of the information the model provides may seem accurate, but for important tasks, always double-check responses against credible sources.

# Uploaded model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Fine-tuned from model:** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5-based model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

# Citations
Magpie:
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie:
What is ittext
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/SE_vmS54Qm3Heu6sozIo3.png)

# What is it
    This is a 1.0 Fijik series with **1 billion** parameters, dense 56 layer transformer LLM based on Qwen2.5, specifically, it was merged using Mergekit to be twice as large as Qwen2.5 0.5B.

After merging, we used a custom dataset mix meant for this model, to improve its performance even more.
- **Step 1 for fine-tuning via unsloth:** SFT on an estimated 5 million tokens. (more or less)
- **Step 2 for the fine-tuning via unsloth:** DPO for 2 epochs for even better instruction following.
After these two steps, we got a powerful model which has less parameters than llama 3.2 3B yet performs just as good if not better, Note that unlike our other recent models, it is not a thinking model, yet it can reason quite well. Our theory behind this model is that a smaller yet deeper model can outperform for it's size.

Alibaba qwen states that Qwen2.5 was pre-trained on up to 18 trillion high-quality tokens. This model supports up to **32768** input tokens and can generate up to **8192** tokens.

# What should Fijik be used for?
Fijik 1.0 1B is by design, meant to be a production-ready, general use, high-performance model, which is also small enough to be run at high token throughputs while minimising performance loss.
- We made some efforts at ensuring the model is safe while keeping it useable. In addition, it is sensitive to system prompts (in a good way, adheres to them well), so it is very customisable. We did not put in our fine-tuning data any information about the identity of the model; rather it just knows that it is a Large Language Model (LLM), but it does not know it is Fijik, unless you specify in the system prompt.
- Due to the large context of the model, It can be used for RAG, but like any other LLM out there, you should be aware that it *may* hallucinate.
- In our fine-tuning data we included quite a bit of creative writing examples, so the model is pretty good at it.
- Coding, Math: In our SFT, DPO fine-tuning data we have put an effort into improving coding and step-by-step math performance, while it is indeed not perfect, no LLM is.
# Examples

none yet

# Limitations
This model is not uncensored, yet it may produce erotic outputs. You are solely responsible for the outputs from the model.
Like any other LLM, users and hosters alike should be aware that AI language models may hallucinate and produce inaccurate, dangerous, or even completly nonsensical outputs, all the information the model provides may seem accurate, but please, for important tasks always double check responses with credible sources.

# Uploaded  model

- **Developed by:** Pinkstack
- **License:** Apache 2.0
- **Finetuned from model :** Pinkstack/Fijik-1b-v1-sft

This Qwen2.5 model was trained with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

# Citations
Magpie: