ZeroAgency
Zero-Mistral-24B-gguf
Mistral-Small-3.2-24B-Instruct-2506-Text-Only
Modified Small 3.2:
- No vision encoder
- Standard "Mistral" architecture
- Based on anthracite-core/Mistral-Small-3.2-24B-Instruct-2506-Text-Only
- System prompt from the 3.2 version added as a default to the chat template
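Since the vision encoder is stripped and the checkpoint uses the standard "Mistral" architecture, it should load with a plain AutoModelForCausalLM. A minimal sketch, assuming the repo id below and a bf16-capable GPU setup:

```python
# Minimal loading sketch. The repo id and dtype are assumptions; a 24B
# model needs roughly 48 GB of GPU memory in bf16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZeroAgency/Mistral-Small-3.2-24B-Instruct-2506-Text-Only"  # assumed id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The chat template carries the 3.2 system prompt as its default,
# so a plain user turn is sufficient.
messages = [{"role": "user", "content": "Summarize the change from Small 3.1 to 3.2 in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```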
Zero-Mistral-24B
gpt-oss-20b-multilingual-reasoner-lora
This model is a fine-tuned version of openai/gpt-oss-20b. It has been trained using TRL.

Framework versions:
- PEFT 0.17.0
- TRL 0.21.0
- Transformers 4.55.0
- PyTorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4
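A sketch of attaching the adapter to the base model with PEFT; the adapter repo id is inferred from the model name above and is an assumption, not a confirmed path:

```python
# Hedged sketch: load the base model, then attach the LoRA adapter.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b", torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(
    base, "ZeroAgency/gpt-oss-20b-multilingual-reasoner-lora"  # assumed adapter id
)
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")

# Optionally fold the adapter into the base weights for faster inference:
# model = model.merge_and_unload()
```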
zero-mistral-beta62-e1
zero-mistral-beta62-e2
gpt-oss-20b-multilingual-reasoning
This model is a fine-tuned version of axolotl-ai-co/gpt-oss-20b-dequantized on the HuggingFaceH4/Multilingual-Thinking dataset.

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant_with_warmup
- training_steps: 8

Framework versions:
- Transformers 4.55.0
- PyTorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4
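For illustration, an approximate TRL reconstruction of this run. The model id, dataset, and listed hyperparameters come from the card; output_dir and the warmup length are assumptions (the card does not state warmup steps), so treat this as a sketch rather than the authors' exact script:

```python
# Approximate reconstruction of the training setup above with TRL's SFTTrainer.
# Adam betas and epsilon match the TrainingArguments defaults, so they are
# not set explicitly.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("HuggingFaceH4/Multilingual-Thinking", split="train")

config = SFTConfig(
    output_dir="gpt-oss-20b-multilingual-reasoning",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=4,   # x 8 GPUs = total train batch size 32
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch_fused",
    lr_scheduler_type="constant_with_warmup",
    warmup_steps=1,                  # warmup length not given on the card
    max_steps=8,
)
trainer = SFTTrainer(
    model="axolotl-ai-co/gpt-oss-20b-dequantized",
    args=config,
    train_dataset=dataset,
)
trainer.train()
```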
Zero Mistral Small 24B Instruct 2501
gpt-oss-120b-multilingual-reasoning
This model is a fine-tuned version of axolotl-ai-co/gpt-oss-120b-dequantized on the HuggingFaceH4/Multilingual-Thinking dataset, trained with the same hyperparameters and framework versions as the 20B variant above (learning_rate 2e-05, per-device batch size 4 across 8 GPUs for a total batch size of 32, ADAMW_TORCH_FUSED, constant_with_warmup schedule, 8 training steps; Transformers 4.55.0, PyTorch 2.8.0+cu128, Datasets 4.0.0, Tokenizers 0.21.4).
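A brief inference sketch, assuming the fine-tune is published as ZeroAgency/gpt-oss-120b-multilingual-reasoning; the 120B model shards across several GPUs via device_map="auto", and the generation settings are illustrative:

```python
# Hedged inference sketch; repo id and generation settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZeroAgency/gpt-oss-120b-multilingual-reasoning"  # assumed id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Explique brièvement la photosynthèse."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```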
Mistral-Small-3.2-24B-Instruct-2506
Mistral-Small-3.1-24B-Instruct-2503-hf
Zero-Mistral-Small-24B-Instruct-2501-lora
Zero-Mistral-Small-3.1-24B-Instruct-2503-beta
zero-judge
gemma-3-4b-it-think-untrained
o1_t-lite-it-1.0_lora
Zero-Mistral-Small-24B-Instruct-2501-Q4_K_M
zero-mistral-beta52-2
zero-mistral-beta57
zero-mistral-beta58
Zero-Gemma-12b-beta1
Zero-Mistral-Small-24B-Instruct-2501-BF16
Zero-Mistral-Small-3.1-24B-Instruct-2503-beta6-GGUF
zero-mistral-beta50-e1
zero-mistral-beta50-e2
zero-mistral-beta51-lora-e2
zero-mistral-beta51-lora-e3
Zero-Gemma-12b-beta2
Zero-Gemma-3-12b-brs
zero-mistral-beta54
zero-mistral-beta55
zero-mistral-beta60
zero-mistral-beta60-dpo7
Zero-Mistral-Small-24B-Instruct-2501-Q8_0
Zero-Mistral-Small-24B-Instruct-2501-F16
zero-mistral-beta50-e2.2
zero-mistral-beta52-2.5e-5-1ks
zero-mistral-beta53-e2
zero-mistral-beta57-e2
zero-mistral-beta61
zero-mistral-beta60-e2
zero-summary-v2-beta15
This model is a fine-tuned version of ZeroAgency/zero-llama-3.1-8b-beta6 on the bethrezen/thinking-summary-v2 dataset.

The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 8
- total_eval_batch_size: 8
- optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 15
- num_epochs: 2.0

Framework versions:
- Transformers 4.49.0
- PyTorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
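A minimal usage sketch, assuming the checkpoint is published under the ZeroAgency namespace and accepts a standard chat-formatted prompt; the prompt wording is illustrative, not the model's documented format:

```python
# Hedged usage sketch via the text-generation pipeline.
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="ZeroAgency/zero-summary-v2-beta15",  # assumed repo id
    torch_dtype="auto",
    device_map="auto",
)
messages = [{"role": "user", "content": "Summarize the following text:\n<document>"}]
result = pipe(messages, max_new_tokens=256)
print(result[0]["generated_text"][-1]["content"])  # assistant reply
```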
Zero-Mistral-Small-3.1-24B-Instruct-2503-beta6
Zero-Gemma-12b-bs16-e1
Zero-Gemma-12b-bs16-e2
Zero-Gemma-12b-beta2-e1
Zero-Gemma-3-12b-brs-e2
zero-mistral-beta52-2.5e-5-1ks-e2
zero-mistral-beta53
zero-mistral-beta56-e2
zero-mistral-beta56
zero-mistral-beta57-e1
zero-mistral-beta59
zero-mistral-beta60-dpo8
zero-llama-3.1-8b-beta6
zero-gemma-3-4b-it-beta4-e3