Emilio407

48 models

nllb-200-3.3B-8bit · license:cc-by-nc-4.0 · 125 · 2
nllb-200-3.3B-4bit · license:cc-by-nc-4.0 · 70 · 0
Qwen2-1.5B-Instruct-GGUF · 69 · 0
dolphin-2.9.2-qwen2-7b-GGUF · 67 · 0
madlad400-3b-mt-8bit · license:apache-2.0 · 62 · 0
Qwen2-7B-Instruct-GGUF · 57 · 0
dolphin-2.9.3-qwen2-1.5b-GGUF · 42 · 0
gemma-2-9b-it-abliterated-GGUF · 41 · 0
Qwen2-0.5B-Instruct-Abliterated-GGUF · 35 · 0
Qwen2-0.5B-Instruct-GGUF · 30 · 0
Qwen2-1.5B-Instruct-Abliterated-GGUF · 30 · 0
guarani-jopara-llama-3.1-8B-instruct-v1-GGUF · llama · 24 · 0
prostate-mri-T2w-v03 · 20 · 0
guarani-jopara-Qwen2-0.5B-Instruct-v1-GGUF · license:apache-2.0 · 19 · 0
dolphin-2.9.1-llama-3-8b-GGUF · 18 · 0
dolphin-2.9.3-qwen2-0.5b-GGUF · 18 · 0
guarani-jopara-gemma-2-2b-it-v1-GGUF · license:apache-2.0 · 18 · 0
Nllb 200 Distilled 600M 4bit · license:cc-by-nc-4.0 · 17 · 1
Dolphin3.0-Qwen2.5-0.5B-GRPO-V2 · license:apache-2.0 · 13 · 0
nllb-200-1.3B-8bit · license:cc-by-nc-4.0 · 13 · 0
Llama-3.2-3B-Instruct-Jopara-V2-GGUF · llama · 12 · 0

Dolphin3.0-Qwen2.5-0.5B-GRPO-V1-GGUF · license:apache-2.0 · 10 · 0

Developed by Emilio407. License: apache-2.0. Fine-tuned from cognitivecomputations/Dolphin3.0-Qwen2.5-0.5B. This qwen2 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
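A GRPO fine-tune like the one described above can be sketched with TRL's GRPOTrainer. Everything here beyond the trainer class itself — the reward function, the dataset, and the hyperparameters — is an illustrative assumption, since the page does not give the actual recipe:

```python
# Rough sketch of a GRPO fine-tune with TRL. The reward function and dataset
# are placeholders, not the author's actual training setup.

def format_reward(completions, **kwargs):
    # Toy reward: 1.0 for completions ending in terminal punctuation,
    # 0.0 otherwise. Real GRPO runs use task-specific rewards.
    return [1.0 if c.rstrip().endswith((".", "!", "?")) else 0.0 for c in completions]


def build_trainer(output_dir: str = "dolphin-grpo-v1"):
    # Imports kept local so the reward function above can be used without TRL.
    from datasets import load_dataset
    from trl import GRPOConfig, GRPOTrainer

    return GRPOTrainer(
        model="cognitivecomputations/Dolphin3.0-Qwen2.5-0.5B",
        reward_funcs=format_reward,
        args=GRPOConfig(output_dir=output_dir),
        train_dataset=load_dataset("trl-lib/tldr", split="train"),
    )

# build_trainer().train()  # requires a GPU and the trl stack installed
```

TRL's reward functions receive the sampled completions and return one score per completion; Unsloth's speedup layers on top of this same trainer loop.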

SmolLM2-135M-Instruct-Reasoner-V1-GGUF · llama · 10 · 0
nllb-200-distilled-600M-8bit · license:cc-by-nc-4.0 · 9 · 1
SmolLM2-360M-Instruct-Reasoner-V1-GGUF · llama · 9 · 0
Llama-3-8B-Instruct-Gradient-4194k-GGUF · 8 · 0
granite-3b-code-instruct-GGUF · 8 · 0
nllb-200-distilled-1.3B-8bit · license:cc-by-nc-4.0 · 5 · 1
stablelm-2-1_6b-chat-GGUF · 5 · 0
TinyLlama-1.1B-Chat-v1.0-GGUF · 4 · 0
Dolphin3.0-Qwen2.5-0.5B-GRPO-V1 · license:apache-2.0 · 2 · 0

madlad400-10b-mt-4bit

MADLAD-400-10B-MT is a multilingual machine translation model based on the T5 architecture. It was trained on 250 billion tokens covering over 450 languages using publicly available data, and is competitive with models that are significantly larger.

Disclaimer: Juarez Bochi, who was not involved in this research, converted the original weights and wrote the contents of this model card based on the original paper and Flan-T5.

- Model type: Language model
- Language(s) (NLP): Multilingual (400+ languages)
- License: Apache 2.0
- Related Models: All MADLAD-400 Checkpoints
- Original Checkpoints: All Original MADLAD-400 Checkpoints
- Resources for more information: Research paper, GitHub Repo, Hugging Face MADLAD-400 Docs (similar to T5), pending PR

To run the card's example scripts, first install the required Python packages: `pip install transformers accelerate sentencepiece`

> Primary intended uses: Machine Translation and multilingual NLP tasks on over 400 languages.
> Primary intended users: Research community.

> These models are trained on general-domain data and are therefore not meant to work on domain-specific tasks out of the box. Moreover, these research models have not been assessed for production use cases.

> We note that we evaluate on only 204 of the languages supported by these models, and on machine translation and few-shot machine translation tasks. Users must consider use of this model carefully for their own use case.

> We trained these models with MADLAD-400 and publicly available data to create baseline models that support NLP for over 400 languages, with a focus on languages underrepresented in large-scale corpora. Given that these models were trained with web-crawled datasets that may contain sensitive, offensive or otherwise low-quality content despite extensive preprocessing, it is still possible that these issues in the underlying training data may cause differences in model performance and toxic (or otherwise problematic) output for certain domains. Moreover, large models are dual-use technologies that have specific risks associated with their use and development. We point the reader to surveys such as those written by Weidinger et al. or Bommasani et al. for a more detailed discussion of these risks, and to Liebling et al. for a thorough discussion of the risks of machine translation systems.

> We train models of various sizes: a 3B, 32-layer model, a 7.2B, 48-layer model and a 10.7B, 32-layer model. We share all parameters of the model across language pairs, and use a SentencePiece model with 256k tokens shared on both the encoder and decoder side. Each input sentence has a token prepended to the source sentence to indicate the target language.

> For both the machine translation and language model, MADLAD-400 is used. For the machine translation model, a combination of parallel data sources covering 157 languages is also used. Further details are described in the paper.

> For evaluation, we used WMT, NTREX, Flores-200 and Gatones datasets as described in Section 4.3 of the paper. The translation quality of this model varies based on language, as seen in the paper, and likely varies by domain, though we have not assessed this.

license:apache-2.0 · 2 · 0
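The card's example scripts did not survive extraction, but the documented interface — a T5-style model whose input is prefixed with a `<2xx>` target-language token — suggests usage along these lines. The checkpoint name, helper names, and generation settings below are illustrative assumptions, not content from this page:

```python
# Sketch of MADLAD-400 usage. google/madlad400-10b-mt is the original
# checkpoint; the 4-bit/8-bit variants listed here are quantizations of it.

def make_prompt(target_lang: str, text: str) -> str:
    """Prepend the <2xx> token MADLAD-400 uses to select the target
    language, e.g. <2pt> to translate into Portuguese."""
    return f"<2{target_lang}> {text}"


def translate(text: str, target_lang: str,
              name: str = "google/madlad400-10b-mt") -> str:
    # Heavy imports kept inside the function so the prompt helper is
    # usable without downloading the model.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(name)
    model = T5ForConditionalGeneration.from_pretrained(name, device_map="auto")
    input_ids = tokenizer(make_prompt(target_lang, text),
                          return_tensors="pt").input_ids.to(model.device)
    outputs = model.generate(input_ids=input_ids, max_new_tokens=256)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Usage (downloads the checkpoint on first call):
# translate("I love pizza!", "es")
```

The `device_map="auto"` argument relies on the `accelerate` package the card asks you to install.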

Prostate158-PI-CAI-MRI-Tumor-T2W-ADC-HBV-DWI-v01 · 1 · 3
madlad400-3b-mt-4bit · license:apache-2.0 · 1 · 1
Qwen2-0.5B-Instruct-Abliterated · license:apache-2.0 · 1 · 0
guarani-jopara-llama-3.1-8B-instruct-v1 · llama · 1 · 0
guarani-jopara-gemma-2-2b-it-v1 · license:apache-2.0 · 1 · 0
guarani-jopara-Qwen2-0.5B-Instruct-v1 · license:apache-2.0 · 1 · 0
prostate-mri-T2w-v01 · 1 · 0
prostate-mri-T2w-v02 · 1 · 0
Llama-3.2-3B-Instruct-Jopara-V1-GGUF · llama · 1 · 0
SmolLM2-135M-Instruct-Reasoner-V2 · llama · 1 · 0
Llama-3.2-3B-Instruct-Jopara-V2 · llama · 1 · 0
madlad400-7b-mt-4bit · license:apache-2.0 · 1 · 0
madlad400-10b-mt-8bit · license:apache-2.0 · 1 · 0
Qwen2-1.5B-Instruct-Abliterated · license:apache-2.0 · 0 · 1
SmolLM2-360M-Instruct-Reasoner-V1 · llama · 0 · 1
SmolLM2-360M-Instruct-Reasoner-V1-LoRA · llama · 0 · 1