jpacifico

71 models

Chocolatine-3B-Instruct-DPO-Revised-Q4_K_M-GGUF

Quantized Q4_K_M GGUF version of the original model `Chocolatine-3B-Instruct-DPO-Revised`. It runs on CPU, is compatible with llama.cpp, and the architecture is now supported by LM Studio. Also ready for the Raspberry Pi 5 (8 GB). The Chocolatine model is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanism.
- Developed by: Jonathan Pacifico, 2024
- Model type: LLM
- Language(s) (NLP): French, English
- License: MIT

llama-cpp • 4,160 downloads • 6 likes

Chocolatine-2-4B-Instruct-DPO-v2.1

license:apache-2.0 • 1,959 downloads • 6 likes

Chocolatine-14B-Instruct-DPO-v1.2

license:mit • 1,428 downloads • 14 likes

Chocolatine-3B-Instruct-DPO-v1.2

Best version of Chocolatine-3B for French. The model supports a 128K context length. A DPO fine-tune of microsoft/Phi-3.5-mini-instruct (3.82B params) using the jpacifico/french-orca-dpo-pairs-revised RLHF dataset. Training in French also improves the model in English, surpassing the performance of its base model. Chocolatine-3B-Instruct-DPO-v1.2 outperforms Phi-3-medium-4k-instruct (14B) and its base model Phi-3.5-mini-instruct on MT-Bench-French, used with multilingual-mt-bench and GPT-4-Turbo as the LLM judge. A 4-bit quantized version is available here: jpacifico/Chocolatine-3B-Instruct-DPO-v1.2-Q4KM-GGUF. You can also run Chocolatine using the following code:

The Chocolatine model is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanism.
- Developed by: Jonathan Pacifico, 2024
- Model type: LLM
- Language(s) (NLP): French, English
- License: MIT
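The snippet itself did not survive the page extraction. As a stand-in, here is a minimal sketch assuming the standard Hugging Face transformers text-generation pipeline; the model ID comes from this card, while the prompt and generation settings are illustrative:

```python
# Minimal sketch; the heavy import is kept inside main() so the helper
# below can be inspected without downloading the model.
MODEL_ID = "jpacifico/Chocolatine-3B-Instruct-DPO-v1.2"

def build_messages(question: str) -> list[dict]:
    """Single-turn chat messages expected by Phi-3.5-style instruct models."""
    return [{"role": "user", "content": question}]

def main() -> None:
    from transformers import pipeline  # requires transformers + torch

    generator = pipeline("text-generation", model=MODEL_ID, device_map="auto")
    out = generator(build_messages("Qui es-tu ?"), max_new_tokens=128)
    # The pipeline returns the full chat; the last message is the reply.
    print(out[0]["generated_text"][-1]["content"])

if __name__ == "__main__":
    main()
```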

license:mit • 606 downloads • 10 likes

Aramis-2B-BitNet-b1.58-i2s-GGUF

license:mit • 68 downloads • 4 likes

Chocolatine-2-14B-Instruct-v2.0.3-Q8_0-GGUF

llama-cpp • 41 downloads • 2 likes

Chocolatine-3B-Instruct-DPO-Revised

license:mit • 35 downloads • 28 likes

Tercet-2B-bitnet-dpo-fr-i2_s-v0.1

31 downloads • 0 likes

Chocolatine-2-14B-Instruct-v2.0.3

license:apache-2.0 • 28 downloads • 14 likes

Chocolatine-14B-Instruct-DPO-v1.2-Q4_K_M-GGUF

llama-cpp • 27 downloads • 4 likes

French-Alpaca-Llama3-8B-Instruct-q8_0-v1.0-GGUF

25 downloads • 2 likes

Chocolatine-2-14B-Instruct-v2.0.3-Q4_K_M-GGUF

jpacifico/Chocolatine-2-14B-Instruct-v2.0.3-Q4KM-GGUF. Quantized Q4_K_M GGUF version of the original model `Chocolatine-2-14B-Instruct-v2.0.3`. It runs on CPU, is compatible with llama.cpp, and the architecture is supported by LM Studio. The Chocolatine-2 model series is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanism.
- Developed by: Jonathan Pacifico, 2025
- Model type: LLM
- Language(s) (NLP): French, English
- License: Apache-2.0

llama-cpp • 24 downloads • 1 like

French-Alpaca-Phi-3-mini-4k-instruct-v1.0-GGUF

license:mit • 23 downloads • 2 likes

Aramis-2B-BitNet-bf16

license:mit • 23 downloads • 2 likes

Chocolatine-2-14B-Instruct-v2.0

DPO fine-tuning of the merged model jpacifico/Chocolatine-2-14B-Merged-base-Phi-4 (Microsoft Phi-4 architecture, 14B params) using the jpacifico/french-orca-dpo-pairs-revised RLHF dataset. Training in French also improves the base model's overall capabilities. Ranked number 1 in all categories on the BAC FR benchmark of the French government's LLM leaderboard (Leaderboard LLM FR). Chocolatine-2 outperforms its previous versions and its base-architecture Phi-4 model on MT-Bench-French, used with multilingual-mt-bench and GPT-4-Turbo as the LLM judge. My goal was to match GPT-4o-mini's performance in French; according to this benchmark, this version equals the OpenAI model. You can also run Chocolatine-2 using the following code:

The Chocolatine model series is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanism.
- Developed by: Jonathan Pacifico, 2025
- Model type: LLM
- Language(s) (NLP): French, English
- License: MIT
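The code block was lost in extraction; the following is a sketch assuming the usual transformers chat-template workflow (model ID from this card; the prompt and generation settings are illustrative):

```python
# Minimal sketch; heavy imports stay inside main() so the helper is testable
# without downloading the 14B model.
MODEL_ID = "jpacifico/Chocolatine-2-14B-Instruct-v2.0"

def build_messages(question: str) -> list[dict]:
    """Single-turn chat messages for the instruct model."""
    return [{"role": "user", "content": question}]

def main() -> None:
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_messages("Présente-toi en une phrase."),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=128)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```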

llama • 18 downloads • 6 likes

Vigalpaca-French-7B-ties-GGUF

license:apache-2.0 • 18 downloads • 1 like

french-alpaca-llama3-8B-Q4-GGUF

license:apache-2.0 • 18 downloads • 1 like

French-Alpaca-Phi-3-beta-GGUF

license:mit • 17 downloads • 4 likes

French-Alpaca-Llama3-8B-Instruct-v1.0

llama • 16 downloads • 9 likes

French-Alpaca-7B-Instruct-beta-GGUF

license:apache-2.0 • 12 downloads • 3 likes

Chocolatine-14B-Instruct-4k-DPO

license:mit • 12 downloads • 1 like

Chocolatine-2-14B-Instruct-v2.0b3

DPO fine-tuning experiment on sometimesanotion/Lamarck-14B-v0.7 (14B params) using the jpacifico/french-orca-dpo-pairs-revised RLHF dataset. Training in French also improves the model in English. Long-context support: up to 128K tokens of context, with generation of up to 8K tokens. You can also run Chocolatine using the following code:

The Chocolatine model series is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanism.
- Developed by: Jonathan Pacifico, 2025
- Model type: LLM
- Language(s) (NLP): French, English
- License: Apache-2.0

license:apache-2.0 • 9 downloads • 2 likes

french-alpaca-instruct-Q4-GGUF

license:apache-2.0 • 9 downloads • 0 likes

Chocolatine-14B-Instruct-DPO-v1.3

license:mit • 7 downloads • 2 likes

bitnet-dpo-fr

6 downloads • 0 likes

bitnet-dpo-merged-modelstock4

6 downloads • 0 likes

Chocolatine-14B-Instruct-DPO-v1.1

license:mit • 4 downloads • 0 likes

bitnet-dpo-ties-retrained-mirror

This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
- Developed by: [More Information Needed]
- Funded by [optional]: [More Information Needed]
- Shared by [optional]: [More Information Needed]
- Model type: [More Information Needed]
- Language(s) (NLP): [More Information Needed]
- License: [More Information Needed]
- Finetuned from model [optional]: [More Information Needed]
- Repository: [More Information Needed]
- Paper [optional]: [More Information Needed]
- Demo [optional]: [More Information Needed]

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information is needed for further recommendations. Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]

4 downloads • 0 likes

bitnet-dpo-ties-retrained-mirror2


4 downloads • 0 likes

bitnet-dpo-merged-stock-atm

This is a merge of pre-trained language models created using mergekit, built with the Model Stock merge method using jpacifico/bitnet-dpo-merged-ties as the base. The following models were included in the merge:
- jpacifico/bitnet-dpo-merged-modelstock-retrain
- jpacifico/bitnet-dpo-merged-ties-retrained-5
- jpacifico/bitnet-dpo-merged-ties-retrained-4
- jpacifico/bitnet-dpo-ties-retrained-mirror2
- jpacifico/bitnet-dpo-merged-ties-retrained-mid

The following YAML configuration was used to produce this model:
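The original YAML was not captured by the extraction. A sketch of what a Model Stock mergekit configuration with these models typically looks like (the model names come from this card; every other field value is illustrative, not the original file):

```yaml
# Illustrative mergekit config (not the original file).
merge_method: model_stock
base_model: jpacifico/bitnet-dpo-merged-ties
models:
  - model: jpacifico/bitnet-dpo-merged-modelstock-retrain
  - model: jpacifico/bitnet-dpo-merged-ties-retrained-5
  - model: jpacifico/bitnet-dpo-merged-ties-retrained-4
  - model: jpacifico/bitnet-dpo-ties-retrained-mirror2
  - model: jpacifico/bitnet-dpo-merged-ties-retrained-mid
dtype: bfloat16
```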

4 downloads • 0 likes

bitnet-dpo-merged-modelstock6

4 downloads • 0 likes

French-Alpaca-7B-Instruct-beta

license:apache-2.0 • 3 downloads • 6 likes

Chocolatine-3B-Instruct-DPO-v1.0

Chocolatine v1.0, 3.82B params, 4K-token context window. This is a French DPO fine-tune of Microsoft's Phi-3-mini-4k-instruct, improving its global understanding performance, even in English. Fine-tuned with the 12k-pair DPO dataset Intel/orca_dpo_pairs translated into French: AIffl/french_orca_dpo_pairs. Chocolatine is a general model and can itself be fine-tuned to be specialized for specific use cases. More info & benchmarks coming soon ^^ Chocolatine is a quick demonstration that a base 3B model can be easily fine-tuned to specialize in a particular language. It does not have any moderation mechanisms.
- Developed by: Jonathan Pacifico, 2024
- Model type: LLM
- Language(s) (NLP): French, English
- License: MIT

license:apache-2.0 • 3 downloads • 3 likes

Distilucie-7B-Math-Instruct-DPO-v0.1

Post-training optimization of the model OpenLLM-France/Lucie-7B-Instruct-v1.1: DPO fine-tuning using the dataset argilla/distilabel-math-preference-dpo, with training set to 5 full epochs. Lucie-7B has a context size of 32K tokens. You can also run Distilucie using the following code:

This Distilucie model is a quick demonstration that the Lucie foundation model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanism.
- Developed by: Jonathan Pacifico, 2025
- Model type: LLM
- Language(s) (NLP): French, English
- License: Apache-2.0
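The snippet was lost in extraction; here is a minimal sketch assuming the standard transformers text-generation pipeline (model ID from this card; the math prompt and settings are illustrative):

```python
# Minimal sketch; the heavy import is kept inside main() so the helper
# below can be inspected without downloading the model.
MODEL_ID = "jpacifico/Distilucie-7B-Math-Instruct-DPO-v0.1"

def build_messages(question: str) -> list[dict]:
    """Single-turn chat messages for the instruct model."""
    return [{"role": "user", "content": question}]

def main() -> None:
    from transformers import pipeline  # requires transformers + torch

    generator = pipeline("text-generation", model=MODEL_ID, device_map="auto")
    out = generator(
        build_messages("Résous pas à pas : combien font 12 × 7 ?"),
        max_new_tokens=256,
    )
    # The pipeline returns the full chat; the last message is the reply.
    print(out[0]["generated_text"][-1]["content"])

if __name__ == "__main__":
    main()
```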

llama • 3 downloads • 1 like

bitnet-dpo-merged-ties

This is a merge of pre-trained language models created using mergekit, built with the TIES merge method using microsoft/bitnet-b1.58-2B-4T-bf16 as the base. The following models were included in the merge:
- jpacifico/bitnet-dpo-eng-v0.1
- jpacifico/bitnet-dpo-fr

The following YAML configuration was used to produce this model:
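The original YAML was not captured by the extraction. A sketch of what a TIES mergekit configuration with these models typically looks like (the model names come from this card; the density/weight values and every other field are illustrative, not the original file):

```yaml
# Illustrative mergekit config (not the original file).
merge_method: ties
base_model: microsoft/bitnet-b1.58-2B-4T-bf16
models:
  - model: jpacifico/bitnet-dpo-eng-v0.1
    parameters:
      density: 0.5
      weight: 0.5
  - model: jpacifico/bitnet-dpo-fr
    parameters:
      density: 0.5
      weight: 0.5
dtype: bfloat16
```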

3 downloads • 1 like

Chocolatine-3B-Instruct-DPO-q8_0-v1-GGUF

3 downloads • 0 likes

Chocolatine-78B-Instruct-DPO-v1.3

license:apache-2.0 • 3 downloads • 0 likes

bitnet-dpo-eng-v0.1


3 downloads • 0 likes

bitnet-dpo-ties-retrained-mirror3


3 downloads • 0 likes

bitnet-dpo-merged-ties2

3 downloads • 0 likes

bitnet-dpo-merged-modelstock

This is a merge of pre-trained language models created using mergekit, built with the Model Stock merge method using jpacifico/bitnet-dpo-merged-ties as the base. The following models were included in the merge:
- microsoft/bitnet-b1.58-2B-4T-bf16
- jpacifico/bitnet-dpo-fr
- jpacifico/bitnet-dpo-ties-retrained-mirror3
- jpacifico/bitnet-dpo-eng-v0.1
- jpacifico/bitnet-dpo-ties-retrained-mirror2

The following YAML configuration was used to produce this model:

3 downloads • 0 likes

bitnet-dpo-merged-modelstock-retrain

3 downloads • 0 likes

bitnet-dpo-merged-ties-retrained-4


3 downloads • 0 likes

bitnet-dpo-merged-ties-retrained-5


3 downloads • 0 likes

bitnet-dpo-merged-ties-retrained-mid


3 downloads • 0 likes

bitnet-dpo-merged-ties-base

3 downloads • 0 likes

bitnet-dpo-merged-ties-01

3 downloads • 0 likes

bitnet-dpo-merged-ties-02

3 downloads • 0 likes

tercet-bitnet-dpo-merged-ties2

3 downloads • 0 likes

tercet-bitnet-dpo-merged-ties3

3 downloads • 0 likes

bitnet-dpo-merged-modelstock5

This is a merge of pre-trained language models created using mergekit, built with the Model Stock merge method using jpacifico/tercet-bitnet-dpo-merged-ties2 as the base. The following models were included in the merge:
- microsoft/bitnet-b1.58-2B-4T-bf16
- jpacifico/bitnet-dpo-ties-retrained-mirror3
- jpacifico/bitnet-dpo-merged-modelstock2
- jpacifico/bitnet-dpo-eng-v0.1
- jpacifico/bitnet-dpo-fr
- jpacifico/bitnet-dpo-ties-retrained-mirror2

The following YAML configuration was used to produce this model:

3 downloads • 0 likes

bitnet-dpo-merged-modelstock7-retrained

3 downloads • 0 likes

Chocolatine-3B-Instruct-DPO-Revised-Q8_0-GGUF

llama-cpp • 2 downloads • 2 likes

Chocolatine-2-merged-qwen

2 downloads • 1 like

Qwen3-4B-Instruct-DPO-test-merged-ties2

license:apache-2.0 • 2 downloads • 0 likes

French-Alpaca-Phi-3-mini-4k-instruct-beta

license:mit • 2 downloads • 0 likes

Chocolatine-2-14B-Instruct-v2.0b2

DPO fine-tuning experiment on sometimesanotion/Lamarck-14B-v0.7 (14B params) using the jpacifico/french-orca-dpo-pairs-revised RLHF dataset. Training in French also improves the model in English. Long-context support: up to 128K tokens of context, with generation of up to 8K tokens. You can also run Chocolatine using the following code:

The Chocolatine model series is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanism.
- Developed by: Jonathan Pacifico, 2025
- Model type: LLM
- Language(s) (NLP): French, English
- License: MIT

license:mit • 1 download • 6 likes

French-Alpaca-Phi-3-mini-128k-instruct-beta

license:mit • 1 download • 3 likes

Chocolatine-3B-Instruct-DPO-v1.2-Q4_K_M-GGUF

llama-cpp • 1 download • 2 likes

Chocolatine-2-merged-qwen25arch

1 download • 2 likes

Chocolatine-Cook-3B-combined-SFT-DPO-v0.1

llama • 1 download • 1 like

Chocolatine-32B-Instruct-DPO-v1.2

license:apache-2.0 • 1 download • 1 like

Vigalpaca-French-7B-ties

license:apache-2.0 • 1 download • 0 likes

Lucie-Boosted-7B-Instruct

llama • 1 download • 0 likes

Chocolatine-2-14B-Merged-base-Phi-4

llama • 1 download • 0 likes

Lucie-7B-Instruct-DPO-v1.1

llama • 1 download • 0 likes

Lucie-7B-Instruct-Merged-Model_Stock-v1.0

This is a merge of pre-trained language models created using mergekit, built with the Model Stock merge method using OpenLLM-France/Lucie-7B-Instruct-human-data as the base. The following models were included in the merge:
- OpenLLM-France/Lucie-7B-Instruct-v1.1
- jpacifico/Lucie-7B-Instruct-DPO-v1.1

The following YAML configuration was used to produce this model:

llama • 1 download • 0 likes

bitnet-dpo-merged-modelstock2

1 download • 0 likes

tercet-bitnet-dpo-merged-ties4

1 download • 0 likes

Chocolatine-Admin-3B-SFT-v0.3b

Chocolatine-Admin-3B version specialized in French administrative language: supervised fine-tuning of jpacifico/Chocolatine-3B-Instruct-DPO-v1.2, itself based on microsoft/Phi-3.5-mini-instruct. Developed in collaboration with Microsoft. The dataset, based on the official lexicon published by the French DITP, gathers 2,362 administrative terms that form the basis for simulating prompt-answer pairs. The GPT-4o model deployed on Azure OpenAI was used to build the dataset in several phases:
- Extraction of the lexicon pages (previously converted to JPG format)
- Reformulation of the definitions to make them more readable and natural for use by an LLM, ensuring high-quality data
- Generation of questions from the terms and definitions
- Generation of answers in three successive rounds, taking previous generations into account to ensure variety

For this 0.3b version, the supervised fine-tuning (SFT) was performed over 11 epochs on an A100 GPU instance on Azure Machine Learning. You can run Chocolatine-Admin using the following code:

The Chocolatine model series is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance. It does not have any moderation mechanism.
- Developed by: Jonathan Pacifico at Cellenza, in collaboration with Microsoft (2024)
- License: MIT
- Finetuned from model: jpacifico/Chocolatine-3B-Instruct-DPO-v1.2
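The snippet was not captured by the extraction; here is a minimal sketch assuming the standard transformers text-generation pipeline (model ID from this card; the administrative-language prompt and settings are illustrative):

```python
# Minimal sketch; the heavy import is kept inside main() so the helper
# below can be inspected without downloading the model.
MODEL_ID = "jpacifico/Chocolatine-Admin-3B-SFT-v0.3b"

def build_messages(question: str) -> list[dict]:
    """Single-turn chat messages for the instruct model."""
    return [{"role": "user", "content": question}]

def main() -> None:
    from transformers import pipeline  # requires transformers + torch

    generator = pipeline("text-generation", model=MODEL_ID, device_map="auto")
    out = generator(
        build_messages("Qu'est-ce qu'un avis d'imposition ?"),
        max_new_tokens=256,
    )
    # The pipeline returns the full chat; the last message is the reply.
    print(out[0]["generated_text"][-1]["content"])

if __name__ == "__main__":
    main()
```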

license:mit • 0 downloads • 4 likes

Qwen3.5-DPO-4B

0 downloads • 1 like