ThomasTheMaker

100 models

k-1b-gguf

1,164
2

k-270m-gguf

380
0

k-1b

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-1b-it-unsloth-bnb-4bit. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
334
0

k-4b-gguf

163
0

k-4b

license:apache-2.0
47
0

gm3-270m-code-gguf

17
0

gm3-270m-tulu3-mix-gguf

16
0

gm3-270m-tinygsm-60000-Q8_0-GGUF

ThomasTheMaker/gm3-270m-tinygsm-60000-Q8_0-GGUF: this model was converted to GGUF format from `ThomasTheMaker/gm3-270m-tinygsm-60000` using llama.cpp, via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model. To use it with llama.cpp, install llama.cpp through brew (works on Mac and Linux), or build from source: move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g. `LLAMA_CUDA=1` for Nvidia GPUs on Linux). You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.
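The `Q8_0` suffix names llama.cpp's 8-bit block-quantized GGUF format. As a rough illustration of how such block quantization works (a simplified sketch, not llama.cpp's actual implementation or the exact Q8_0 layout):

```python
# Simplified Q8_0-style block quantization: split values into blocks,
# store one float scale per block plus one int8 per value.
def quantize_q8_0(values, block_size=32):
    blocks = []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        amax = max(abs(v) for v in block)
        scale = amax / 127.0 if amax else 1.0
        blocks.append((scale, [round(v / scale) for v in block]))
    return blocks

def dequantize_q8_0(blocks):
    # Reverse the mapping: each int8 value times its block's scale.
    return [q * scale for scale, qs in blocks for q in qs]

weights = [0.5, -1.27, 0.03, 1.0]
restored = dequantize_q8_0(quantize_q8_0(weights))
```

Each restored value differs from the original by at most half a quantization step, which is why Q8_0 conversions lose very little quality relative to smaller quant types.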

llama-cpp
15
0

qwen2.5-0.5B-simple-tool

13
0

Arc

llama
12
0

gm3-270m-tinygsm-gpt41

license:apache-2.0
12
0

gm3-270m-tinygsm-gpt41-no-example

license:apache-2.0
12
0

gm3-270m-tinygsm-o4mini-reasoning

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-270m-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
12
0

gm3-270m-tinygsm-llama33-70b-no-example

license:apache-2.0
12
0

gm3-270m-tinygsm-gpt41-mini

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-270m-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
12
0

gm3-270m-tinygsm-gpt41-mini-no-example

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-270m-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
12
0

gm3-270m-tinygsm-Mixtral-8x7B-no-example

license:apache-2.0
12
0

gm3-270m-TinyGSM-all

12
0

Smollm2-135M-Tulu-3-SFT-Personas-Instruction-Following-v1

Model Card for Smollm2-135M-Tulu-3-SFT-Personas-Instruction-Following This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M. It has been trained using TRL. - TRL: 0.22.2 - Transformers: 4.56.1 - Pytorch: 2.6.0+cu118 - Datasets: 4.0.0 - Tokenizers: 0.22.0

llama
11
0

Smollm2-135M-Tulu-3-SFT-Personas-Instruction-Following

Model Card for Smollm2-135M-Tulu-3-SFT-Personas-Instruction-Following This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M. It has been trained using TRL. - TRL: 0.22.2 - Transformers: 4.56.1 - Pytorch: 2.6.0+cu118 - Datasets: 4.0.0 - Tokenizers: 0.22.0

llama
11
0

gm3-270m-math

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-270m-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
11
0

gm3-270m-math-lora

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-270m-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
11
0

gm3-270m-code

license:apache-2.0
11
0

gm3-270m-code-lora

license:apache-2.0
11
0

gm3-270m-algebra

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-270m-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
11
0

gm3-270m-algebra-lora

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-270m-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
11
0

gm3-270m-algebra-code

This is a merge of pre-trained language models created using mergekit. This model was merged using the Linear merge method. The following models were included in the merge: ThomasTheMaker/gm3-270m-algebra ThomasTheMaker/gm3-270m-code The following YAML configuration was used to produce this model:
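mergekit's Linear merge method is, at its core, a weighted average of corresponding parameters across the input models. A minimal sketch under that reading (plain Python lists standing in for checkpoint tensors; not mergekit's actual code, and the model names are those from the entry above):

```python
# Linear merge: element-wise weighted average of matching parameters.
def linear_merge(state_dicts, weights):
    total = sum(weights)
    merged = {}
    for name in state_dicts[0]:
        merged[name] = [
            sum(w * sd[name][i] for sd, w in zip(state_dicts, weights)) / total
            for i in range(len(state_dicts[0][name]))
        ]
    return merged

# Hypothetical two-model merge with equal weights.
algebra = {"layer.weight": [1.0, 2.0]}  # stand-in for gm3-270m-algebra
code = {"layer.weight": [3.0, 4.0]}     # stand-in for gm3-270m-code
merged = linear_merge([algebra, code], weights=[0.5, 0.5])
# merged["layer.weight"] == [2.0, 3.0]
```

Equal weights reduce to a simple mean; unequal weights bias the merged model toward one parent.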

11
0

gm3-270m-tulu3-mix

license:apache-2.0
11
0

gm3-270m-tulu3-mix-lora

license:apache-2.0
11
0

gm3-270m-tinygsm

11
0

gm3-270m-tinygsm-Mixtral-8x7B

license:apache-2.0
11
0

gm3-270m-TinyGSM-no-reasoning

10
0

gm3-270m-TinyGSM-reasoning

This model is a fine-tuned version of unsloth/gemma-3-270m-it. It has been trained using TRL. - TRL: 0.22.2 - Transformers: 4.55.4 - Pytorch: 2.8.0 - Datasets: 3.6.0 - Tokenizers: 0.21.4

10
0

gemma-3-270m-it-gguf

8
0

old-bob4

license:apache-2.0
8
0

gm3-270m-math-gguf

7
0

gm3-270m-algebra-gguf

7
0

k-27b-gguf

7
0

Smollm2-360M-Instruct-RKLLM-1.2.1B

llama
6
0

pico-decoder-tiny

6
0

SmolLM2-135M-Tulu-SFT-Q8_0-GGUF

ThomasTheMaker/SmolLM2-135M-Tulu-SFT-Q8_0-GGUF: this model was converted to GGUF format from `ThomasTheMaker/SmolLM2-135M-Tulu-SFT` using llama.cpp, via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model. To use it with llama.cpp, install llama.cpp through brew (works on Mac and Linux), or build from source: move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g. `LLAMA_CUDA=1` for Nvidia GPUs on Linux). You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.

llama-cpp
6
0

Arch-Router-1.5B-rkllm

5
0

gm3-270m-tinygsm-Q8_0-GGUF

ThomasTheMaker/gm3-270m-tinygsm-Q8_0-GGUF: this model was converted to GGUF format from `ThomasTheMaker/gm3-270m-tinygsm` using llama.cpp, via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model. To use it with llama.cpp, install llama.cpp through brew (works on Mac and Linux), or build from source: move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g. `LLAMA_CUDA=1` for Nvidia GPUs on Linux). You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.

llama-cpp
5
0

SmolVLM-Base-cadquery-debug

This model is a fine-tuned version of HuggingFaceTB/SmolVLM-Base on an unknown dataset. The following hyperparameters were used during training: - learning_rate: 0.0001 - train_batch_size: 1 - eval_batch_size: 8 - seed: 42 - optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments - lr_scheduler_type: linear - num_epochs: 1 - PEFT 0.17.1 - Transformers 4.56.2 - Pytorch 2.8.0+cu128 - Datasets 4.1.1 - Tokenizers 0.22.1
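The linear LR scheduler listed above decays the learning rate linearly from the configured base value down to zero over training. A minimal sketch of that schedule (warmup omitted; the step counts are hypothetical):

```python
# Linear learning-rate decay: base_lr at step 0, zero at total_steps.
def linear_lr(step, total_steps, base_lr=1e-4):
    return base_lr * max(0.0, 1.0 - step / total_steps)

# Halfway through training, the learning rate is half of base_lr.
halfway = linear_lr(50, 100)  # 5e-05
```

In the real Trainer, the schedule also includes any configured warmup steps before the decay begins.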

license:apache-2.0
5
0

SmolVLM-Base-cadquery-debug100

This model is a fine-tuned version of HuggingFaceTB/SmolVLM-Base on an unknown dataset. The following hyperparameters were used during training: - learning_rate: 0.0001 - train_batch_size: 1 - eval_batch_size: 8 - seed: 42 - optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments - lr_scheduler_type: linear - num_epochs: 3 - PEFT 0.17.1 - Transformers 4.56.2 - Pytorch 2.8.0+cu128 - Datasets 4.1.1 - Tokenizers 0.22.1

license:apache-2.0
5
0

old-bob1-gguf

5
0

new-bob-1_1

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-270m-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
5
0

k-12b-gguf

5
0

Qwen3-1.7B-RKLLM-v1.2.0

license:apache-2.0
4
1

Llama-3.2.-1B-1.2.0-rkllm

llama
4
0

meta-llama_Llama-3.2-1B-Instruct_8_layers_3_11_Open-Orca_SlimOrca_8000_ReplaceMe_lstsq_1

llama
4
0

Ovis2-1B-RKLLM-1.2.0

license:apache-2.0
4
0

new-bob-2

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-1b-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
4
0

new-bob-3_1-gguf

4
0

k-1b-q4f16_1-MLC

4
0

k-app

3
1

Qwen3_0.6B_v.1.2.0

license:apache-2.0
3
0

Jan-nano-rkllm-1.2.0

Jan-Nano: a 4B MCP-optimized DeepResearch model (github.com/menloresearch/deep-research). Jan-Nano is a compact 4-billion-parameter language model specifically designed and trained for deep research tasks, optimized to work seamlessly with Model Context Protocol (MCP) servers for efficient integration with various research tools and data sources. Evaluation: Jan-Nano has been evaluated on the SimpleQA benchmark using an MCP-based methodology that assesses performance while leveraging the model's native MCP server integration, demonstrating strong results for its size. This methodology better reflects Jan-Nano's real-world performance as a tool-augmented research model, validating both its factual accuracy and its effectiveness in MCP-enabled environments. Jan-Nano is supported by Jan, an open-source ChatGPT alternative that runs entirely on your computer; Jan provides a user-friendly interface for running local AI models with full privacy and control.

license:apache-2.0
3
0

Falcon3-1B-Base-RKLLM

The Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. This repository contains Falcon3-1B-Base, which achieves strong results on reasoning, language understanding, instruction following, code, and mathematics tasks. Falcon3-1B-Base supports 4 languages (English, French, Spanish, Portuguese) and a context length of up to 4K. It was pruned in depth, width, number of heads, and embedding channels from a larger 3B Falcon model, and was efficiently trained on only 80 gigatokens using a knowledge-distillation objective. ⚠️ This is a raw, pretrained model, which should be further finetuned using SFT, RLHF, continued pretraining, etc. for most use cases.

Model Details:
- Transformer-based causal decoder-only architecture
- 18 decoder blocks
- Grouped Query Attention (GQA) for faster inference: 8 query heads and 4 key-value heads
- Wider head dimension: 256
- High RoPE value to support long-context understanding: 1000042
- Uses SwiGLU and RMSNorm
- 4K context length
- 131K vocab size
- Pruned and healed using larger Falcon models (3B and 7B respectively) on only 80 gigatokens of web, code, STEM, high-quality, and multilingual data, using 256 H100 GPU chips
- Supports EN, FR, ES, PT
- Developed by Technology Innovation Institute
- License: TII Falcon-LLM License 2.0
- Model Release Date: December 2024

Benchmarks (internal pipeline, lm-evaluation-harness, raw scores, same batch size across all models):

| Category | Benchmark | Llama-3.2-1B | Qwen2.5-1.5B | SmolLM2-1.7B | Falcon3-1B-Base |
| --- | --- | --- | --- | --- | --- |
| Reasoning | Arc Challenge (25-shot) | 40.2 | 54.8 | 54.1 | 48.1 |
| CommonSense Understanding | PIQA (0-shot) | 74.5 | 76.0 | 77.5 | 74.5 |

Useful links: see the release blogpost, and feel free to join the Discord server with questions or to interact with the researchers and developers. If the Falcon3 family of models was helpful to your work, feel free to give it a cite.
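Grouped Query Attention with 8 query heads and 4 key-value heads means pairs of query heads share one KV head, halving the KV cache relative to full multi-head attention. A small sketch of the head mapping (assuming consecutive grouping, the usual convention):

```python
# Map each query head to the KV head it shares under GQA.
def kv_head_for(query_head, n_q_heads=8, n_kv_heads=4):
    group_size = n_q_heads // n_kv_heads  # query heads per KV head
    return query_head // group_size

mapping = [kv_head_for(q) for q in range(8)]
# query heads 0-1 share KV head 0, 2-3 share KV head 1, and so on.
```

With `n_kv_heads=1` this reduces to multi-query attention; with `n_kv_heads == n_q_heads` it is ordinary multi-head attention.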

llama
3
0

MiniCPM-1B-sft-bf16-rkllm-1.2.0

3
0

Llama3.1-1B-Instruct-4-LayerReplaceMe-1.2.0-rkllm

llama
3
0

Qwen_Qwen3-1.7B_4_layers_7_11_Open-Orca_SlimOrca_8000_ReplaceMe_lstsq_1

3
0

smollm2-135m-soup1

This is a merge of pre-trained language models created using mergekit. This model was merged using the Linear merge method. The following models were included in the merge: HuggingFaceTB/SmolLM2-135M-Instruct mnoukhov/SmolLM2-135M-Instructtldr-sft HuggingFaceTB/SmolLM2-135M The following YAML configuration was used to produce this model:

llama
3
0

Smollm2-135M-concise-reasoning

This model is a fine-tuned version of HuggingFaceTB/SmolLM2-135M. It has been trained using TRL. - TRL: 0.22.2 - Transformers: 4.56.1 - Pytorch: 2.6.0+cu118 - Datasets: 4.0.0 - Tokenizers: 0.22.0

llama
3
0

SmolLM2-135M-Tulu-SFT

llama
3
0

gm3-270m-TinyGSM-llama31-8b

3
0

Falcon-E-MoT-9000

llama
3
0

Falcon-E-Capybara-Pure1Bit

llama
3
0

SmolVLM-Base-cadquery-debug10-merged

This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated. - Developed by: [More Information Needed] - Funded by [optional]: [More Information Needed] - Shared by [optional]: [More Information Needed] - Model type: [More Information Needed] - Language(s) (NLP): [More Information Needed] - License: [More Information Needed] - Finetuned from model [optional]: [More Information Needed] - Repository: [More Information Needed] - Paper [optional]: [More Information Needed] - Demo [optional]: [More Information Needed] Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). - Hardware Type: [More Information Needed] - Hours used: [More Information Needed] - Cloud Provider: [More Information Needed] - Compute Region: [More Information Needed] - Carbon Emitted: [More Information Needed]

3
0

SmolVLM-256M-Base-cadquery-debug10-merged


3
0

old-bob2-gguf

3
0

old-bob4-gguf

3
0

new-bob-1

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-270m-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
3
0

new-bob-2-gguf

3
0

new-bob-3

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-4b-it. This gemma3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
3
0

new-bob-3-gguf

3
0

new-bob-3_1

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-4b-it. This gemma3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
3
0

Qwen3-0.6B-RKLLM-1.2.1B

2
1

Qwen2.5-0.5B-Instruct-RKLLM-1.2.0

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2:
- Significantly more knowledge and greatly improved capabilities in coding and mathematics, thanks to specialized expert models in these domains.
- Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g., tables), and generating structured outputs, especially JSON. More resilient to diverse system prompts, enhancing role-play implementation and condition-setting for chatbots.
- Long-context support up to 128K tokens, with generation of up to 8K tokens.
- Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

This repo contains the instruction-tuned 0.5B Qwen2.5 model, which has the following features:
- Type: Causal Language Model
- Training Stage: Pretraining & Post-training
- Architecture: transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias, and tied word embeddings
- Number of Parameters: 0.49B
- Number of Parameters (Non-Embedding): 0.36B
- Number of Layers: 24
- Number of Attention Heads (GQA): 14 for Q and 2 for KV
- Context Length: full 32,768 tokens; generation up to 8,192 tokens

For more details, please refer to the blog, GitHub, and documentation. The Qwen2.5 code is in the latest Hugging Face `transformers`, and using the latest version of `transformers` is advised; with `transformers<4.37.0`, you will encounter an error. The original card provides a code snippet using `apply_chat_template` that shows how to load the tokenizer and model and generate content. Detailed evaluation results are reported in the 📑 blog. For requirements on GPU memory and the respective throughput, see the results linked there. If you find this work helpful, feel free to give it a cite.
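The `apply_chat_template` step mentioned in the card wraps the conversation in the model's chat format before generation. Qwen-family models use a ChatML-style layout; an illustrative sketch of the kind of string such formatting produces (simplified; the authoritative template ships with the tokenizer):

```python
# ChatML-style prompt formatting, roughly the shape of what
# apply_chat_template emits for Qwen-family chat models (simplified).
def format_chat(messages):
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")  # generation prompt
    return "\n".join(parts)

prompt = format_chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to LLMs."},
])
```

In practice one calls `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` rather than hand-rolling the string; the sketch only shows the shape of the result.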

license:apache-2.0
2
0

Falcon-E-MoT

This model is a fine-tuned version of tiiuae/Falcon-E-1B-Base. It has been trained using TRL. - TRL: 0.23.0 - Transformers: 4.56.2 - Pytorch: 2.8.0 - Datasets: 4.1.1 - Tokenizers: 0.22.1

llama
2
0

SmolVLM-256M-Base-cadquery-5000-merged

2
0

SmolVLM-256M-Base-cadquery-3000-merged


2
0

old-bob1

2
0

old-bob2

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-270m-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
2
0

new-bob-1-gguf

Gemma 3 270M + 1.41K rows of BlenderCAD, epoch 5; ~30 min of training (QLoRA + Unsloth) on an 8 GB 4060.

license:apache-2.0
2
0

new-bob-1_1-gguf

2
0

k-270m

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-270m-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
2
0

k-27b

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-27b-it. This gemma3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
2
0

Qwen3-4B-RKLLM-v1.2.0

license:apache-2.0
1
2

k-12b

license:apache-2.0
1
1

tiny-dolma10M

license:apache-2.0
1
0

gm3-270m-hard-coded-10x-16bit

license:apache-2.0
1
0

gm3-270m-algebra-r128-16bit

- Developed by: ThomasTheMaker - License: apache-2.0 - Finetuned from model: unsloth/gemma-3-270m-it. This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

license:apache-2.0
1
0

gm3-270m-tinygsm-60000

1
0

gm3-27m-TinyGSM-Llama33-70B

1
0

gm3-270m-TinyGSM-o4-mini

1
0

gm3-270m-TinyGSM-deepseek-r1

1
0

SenseVoiceSmall-RKNN2

SenseVoice is an audio foundation model with audio understanding capabilities, including Automatic Speech Recognition (ASR), Language Identification (LID), Speech Emotion Recognition (SER), and Acoustic Event Classification (AEC) or Acoustic Event Detection (AED). Currently, SenseVoice-small supports multilingual speech recognition, emotion recognition, and event detection for Chinese, Cantonese, English, Japanese, and Korean, with extremely low inference latency. - Inference speed (RKNN2): About 20x real-time on a single NPU core of RK3588 (processing 20 seconds of audio per second), approximately 6 times faster than the official whisper model provided in the rknn-model-zoo. - Memory usage (RKNN2): About 1.1GB

license:agpl-3.0
0
2

Thomas-learn-to-fine-tune-with-llama-factory

A collection of YAML files to quickly fine-tune and evaluate models.

license:apache-2.0
0
1

luna-os

0
1