swap-uniba
LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
📣 NEW MODEL FAMILY❗ https://huggingface.co/m-polignano/ANITA-NEXT-24B-Magistral-2506-VISION-ITA ("Built with Meta Llama 3").

LLaMAntino-3-ANITA-8B-Inst-DPO-ITA is a model of the LLaMAntino Large Language Models family. The model is an instruction-tuned version of Meta-Llama-3-8b-instruct (a fine-tuned LLaMA 3 model). This model version aims to be a Multilingual Model 🏁 (EN 🇺🇸 + ITA 🇮🇹), suitable for further fine-tuning on specific tasks in Italian.

The 🌟ANITA project🌟 (Advanced Natural-based interaction for the ITAlian language) aims to provide Italian NLP researchers with an improved model for Italian-language 🇮🇹 use cases.

Live DEMO: https://chat.llamantino.it/ (it works only from an Italian connection).

| Model | HF | GGUF | EXL2 |
|-------|----|------|------|
| swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA | Link | Link | Link |

- Model developers: Ph.D. Marco Polignano, University of Bari Aldo Moro, Italy (SWAP Research Group)
- Variations: The model has been supervised fine-tuned (SFT) using QLoRA 4-bit on instruction-based datasets. A DPO pass over the mlabonne/orpo-dpo-mix-40k dataset is used to align the model with human preferences for helpfulness and safety.
- Input: models take text only as input.
- Language: Multilingual 🏁 + Italian 🇮🇹
- Output: models generate text and code only.
- Model Architecture: Llama 3 architecture.
- Context length: 8K (8,192 tokens).
- Library used: Unsloth

To use the model directly, there are many ways to get started; choose one of the following ways to experience it. For direct use with `transformers`, you can get started with the following steps:

- First, install transformers via `pip`.
- You can then start using the model directly.
- Additionally, you can load the model with 4-bit quantization to reduce the required resources.
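The chat formatting the steps above rely on can be sketched in pure Python. This is a minimal illustration of the standard Llama 3 instruct prompt layout that the tokenizer's `apply_chat_template` produces for this model family; in practice you would install `transformers` and call the tokenizer directly (and, for the 4-bit path, pass a `BitsAndBytesConfig(load_in_4bit=True)` to `from_pretrained`), so treat this as a sketch of the expected structure rather than the loading code itself.

```python
# Minimal sketch of the single-turn Llama 3 chat prompt layout used by
# this model family. The special-token strings are the standard Llama 3
# ones; with transformers installed you would instead call
# tokenizer.apply_chat_template(messages, add_generation_prompt=True).

def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 3 instruct prompt."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Example with an Italian system prompt, matching the model's target use case.
prompt = build_llama3_prompt(
    "Sei un assistente AI per la lingua Italiana.",
    "Chi ha scritto la Divina Commedia?",
)
```

The generated text is everything the model emits after the final assistant header, up to the next `<|eot_id|>` token.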
Evaluated with lm-evaluation-harness for the Open Italian LLMs Leaderboard:

| Metric      | Value  |
|-------------|--------|
| Avg.        | 0.6160 |
| ArcIT       | 0.5714 |
| HellaswagIT | 0.7093 |
| MMLUIT      | 0.5672 |

Unsloth is a great tool that helped us develop the model easily, at a lower cost than expected.

Acknowledgments

We acknowledge the support of the PNRR project FAIR - Future AI Research (PE00000013), Spoke 6 - Symbiotic AI (CUP H97G22000210007), under the NRRP MUR program funded by NextGenerationEU. Models are built on the Leonardo supercomputer with the support of the CINECA Italian Super Computing Resource Allocation, class C project IscrC_Pro_MRS (HP10CQO70G).

Open LLM Leaderboard Evaluation Results

Detailed results can be found here.

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 75.12 |
| AI2 Reasoning Challenge (25-Shot) | 74.57 |
| HellaSwag (10-Shot)               | 92.75 |
| MMLU (5-Shot)                     | 66.85 |
| TruthfulQA (0-shot)               | 75.93 |
| Winogrande (5-shot)               | 82.00 |
| GSM8k (5-shot)                    | 58.61 |
LLaMAntino-2-7b-hf-ITA
LLaMAntino-2-chat-13b-hf-UltraChat-ITA
LLaMAntino-2-70b-hf-UltraChat-ITA
LLaMAntino-3-ANITA-8B-Inst-DPO-ITA_GGUF
siglip2-large-patch16-256-VWSD-ft
Refer to the GitHub repository: https://github.com/swapUniba/VWSD-VLMs
LLM-wsd-FT-ALL
LLM-wsd-FT-ALL is a Large Language Model (LLM) instruction-tuned over meta-llama/Meta-Llama-3.1-8B-Instruct. This model has been trained for the WSD task over the entire training dataset, without machine translation. It is capable of providing the definition of a word in a given sentence. Specifically, it can answer both:

1) Open-ended questions, where the model generates the definition of the target word;
2) Closed-ended questions, where the model generates the identifier of the correct option out of a list of alternatives.

More details regarding the training procedure (e.g. hyperparameters, dataset construction, and so on) can be found in Section 4.2 of the paper.

- Developed by: Pierpaolo Basile, Lucia Siciliani, Elio Musacchio
- Model type: LLaMA 3.1 Instruct
- Language(s) (NLP): English, French, German, Italian and Spanish
- License: LLAMA 3.1 COMMUNITY LICENSE AGREEMENT
- Finetuned from model: meta-llama/Meta-Llama-3.1-8B-Instruct

The model has been trained using several instructions, depending on the language, the task (open-ended or closed-ended), and the number of occurrences of the target word in the sentence. In Instructions, we provide the instructions used for all cases. The following placeholder variables have to be replaced:

- {targetword}: the target word in the input to disambiguate;
- {options}: the options to provide to the model, for the closed-ended task only. The options should be newline-separated, and each option should be identified by a number. Refer to the closed-ended example for an example of options formatting;
- {occurrence}: the ordinal number of the {targetword} occurrence (e.g. "second"). This is required only when the input sentence contains multiple occurrences of {targetword}.

Please note that the complete prompt also has the following string after the instruction: where {sentence} is the input sentence containing the word to disambiguate.

Below you can find two examples of model usage, for open-ended and closed-ended generation respectively.
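The placeholder scheme above can be sketched in plain Python. The instruction wording below is hypothetical (the actual instructions ship in the Instructions section of the card); the sketch only shows how {targetword}, {options} and {occurrence} slot into an open-ended and a closed-ended prompt:

```python
# Hypothetical prompt templates illustrating the placeholder scheme.
# The exact instruction text is the one listed in the model card's
# Instructions section; only the placeholder mechanics are shown here.

def open_ended_prompt(target_word: str, sentence: str,
                      occurrence: str = "") -> str:
    # {occurrence} is only needed when the target word appears
    # more than once in the sentence.
    which = f" ({occurrence} occurrence)" if occurrence else ""
    instruction = (f'Give the definition of the word "{target_word}"{which} '
                   "in the following sentence.")
    return f"{instruction}\nSentence: {sentence}"

def closed_ended_prompt(target_word: str, sentence: str,
                        options: list) -> str:
    # Options are newline-separated, each identified by a number.
    numbered = "\n".join(f"{i}) {opt}" for i, opt in enumerate(options, start=1))
    instruction = (f'Choose the correct definition of the word "{target_word}" '
                   "in the following sentence. Answer with the option number.\n"
                   f"{numbered}")
    return f"{instruction}\nSentence: {sentence}"
```

Either prompt would then be wrapped in the Llama 3.1 chat template before generation.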
If you use this model in your research, please cite the following:
LLaVA-NDiNO_pt_long
LLaMAntino-2-chat-7b-hf-UltraChat-ITA
LLaMAntino-2-13b-hf-ITA
LLaMAntino-2-7b-hf-dolly-ITA
bloom-1b7-evalita-it
LLaMAntino-2-chat-13b-hf-ITA
LLaVA-NDiNO_pt
xVLM2Vec_image_loss
xVLM2Vec_image_loss is a Large Vision-Language Model (LVLM) aligned over TIGER-Lab/VLM2Vec-LoRA. This model has been trained for increased performance in multilingual retrieval tasks; specifically, it was trained on a machine-translated parallel corpus. It is capable of performing several multimodal retrieval tasks (e.g. Text-to-Image, Image-to-Text, VQA, Visual Grounding and Classification). It was trained with a different loss w.r.t. swap-uniba/xVLM2Vec; however, no significant performance differences were found. More details regarding the training procedure (e.g. hyperparameters, dataset construction, and so on) can be found in the paper.

- Developed by: Elio Musacchio, Lucia Siciliani, Pierpaolo Basile
- Model type: Phi-3.5-vision-instruct
- Language(s) (NLP): English, French, German, Italian and Spanish
- License: Apache 2.0
- Finetuned from model: TIGER-Lab/VLM2Vec-LoRA

Below you can find an example of model usage. To facilitate its usage, we recommend pulling from GitHub the version of the VLM2Vec source code we used for both training and inference. This is a use case where the model is being used to retrieve an image caption in Italian.

If you use this model in your research, please cite the following:
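Independently of the VLM2Vec source code (whose exact calls live in the GitHub repository referenced above), the retrieval step itself can be sketched in a few lines: once the model has produced an embedding for the query (e.g. an image) and for each candidate caption, retrieval reduces to ranking candidates by cosine similarity. This is a generic sketch, not the VLM2Vec API:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank_candidates(query_emb, candidate_embs):
    """Return candidate indices sorted by descending similarity
    to the query embedding; index 0 of the result is the best match."""
    scores = [cosine(query_emb, c) for c in candidate_embs]
    return sorted(range(len(scores)), key=lambda i: -scores[i])
```

For the Italian caption-retrieval use case, `query_emb` would be the image embedding and `candidate_embs` the embeddings of the candidate Italian captions, all produced by the model.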
llama-latin-wsd-binary
bloom-1b7-comoscio-it
LLaVA-NDiNO_short_it
llama-latin-wsd
LLaMAntino-2-13b-hf-dolly-ITA
LLM-wsd-TT-10000
Qwen2.5-VL-7B-Instruct-VWSD-ft
Refer to the GitHub repository: https://github.com/swapUniba/VWSD-VLMs
LLaMAntino-2-13b-hf-evalita-ITA
bloom-1b7-it
bloom-1b7-evalita
llama3-it-pa-100k-adapter
llama3-it-pa-300k-adapter
LLaVA-NDiNO_pt_short_it
LLM-wsd-FT-20000
LLM-wsd-FT-20000 is a Large Language Model (LLM) instruction-tuned over meta-llama/Meta-Llama-3.1-8B-Instruct. This model has been trained for the WSD task over a balanced training dataset (20000 instances per language), without machine translation. It is capable of providing the definition of a word in a given sentence. Specifically, it can answer both:

1) Open-ended questions, where the model generates the definition of the target word;
2) Closed-ended questions, where the model generates the identifier of the correct option out of a list of alternatives.

More details regarding the training procedure (e.g. hyperparameters, dataset construction, and so on) can be found in Section 4.2 of the paper.

- Developed by: Pierpaolo Basile, Lucia Siciliani, Elio Musacchio
- Model type: LLaMA 3.1 Instruct
- Language(s) (NLP): English, French, German, Italian and Spanish
- License: LLAMA 3.1 COMMUNITY LICENSE AGREEMENT
- Finetuned from model: meta-llama/Meta-Llama-3.1-8B-Instruct

The model has been trained using several instructions, depending on the language, the task (open-ended or closed-ended), and the number of occurrences of the target word in the sentence. In Instructions, we provide the instructions used for all cases. The following placeholder variables have to be replaced:

- {targetword}: the target word in the input to disambiguate;
- {options}: the options to provide to the model, for the closed-ended task only. The options should be newline-separated, and each option should be identified by a number. Refer to the closed-ended example for an example of options formatting;
- {occurrence}: the ordinal number of the {targetword} occurrence (e.g. "second"). This is required only when the input sentence contains multiple occurrences of {targetword}.

Please note that the complete prompt also has the following string after the instruction: where {sentence} is the input sentence containing the word to disambiguate.
Below you can find two examples of model usage, for open-ended and closed-ended generation respectively.

If you use this model in your research, please cite the following:
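For the closed-ended task, the model is expected to answer with the identifier of the correct option. A small hypothetical helper for pulling that identifier out of the generated text (not part of the released code) might look like this:

```python
import re

def parse_option_id(generated: str, n_options: int):
    """Extract the first option number the model emitted.

    Returns the 1-based option index, or None when no valid
    identifier (within 1..n_options) is found in the output.
    """
    match = re.search(r"\d+", generated)
    if match:
        idx = int(match.group())
        if 1 <= idx <= n_options:
            return idx
    return None
```

A well-behaved instruction-tuned model usually answers with the bare number, so the regex is only a fallback for slightly verbose outputs such as "The answer is 2.".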
LLM-wsd-FT-10000
LLM-wsd-FT-10000 is a Large Language Model (LLM) instruction-tuned over meta-llama/Meta-Llama-3.1-8B-Instruct. This model has been trained for the WSD task over a balanced training dataset (10000 instances per language), without machine translation. It is capable of providing the definition of a word in a given sentence. Specifically, it can answer both:

1) Open-ended questions, where the model generates the definition of the target word;
2) Closed-ended questions, where the model generates the identifier of the correct option out of a list of alternatives.

More details regarding the training procedure (e.g. hyperparameters, dataset construction, and so on) can be found in Section 4.2 of the paper.

- Developed by: Pierpaolo Basile, Lucia Siciliani, Elio Musacchio
- Model type: LLaMA 3.1 Instruct
- Language(s) (NLP): English, French, German, Italian and Spanish
- License: LLAMA 3.1 COMMUNITY LICENSE AGREEMENT
- Finetuned from model: meta-llama/Meta-Llama-3.1-8B-Instruct

The model has been trained using several instructions, depending on the language, the task (open-ended or closed-ended), and the number of occurrences of the target word in the sentence. In Instructions, we provide the instructions used for all cases. The following placeholder variables have to be replaced:

- {targetword}: the target word in the input to disambiguate;
- {options}: the options to provide to the model, for the closed-ended task only. The options should be newline-separated, and each option should be identified by a number. Refer to the closed-ended example for an example of options formatting;
- {occurrence}: the ordinal number of the {targetword} occurrence (e.g. "second"). This is required only when the input sentence contains multiple occurrences of {targetword}.

Please note that the complete prompt also has the following string after the instruction: where {sentence} is the input sentence containing the word to disambiguate.
Below you can find two examples of model usage, for open-ended and closed-ended generation respectively.

If you use this model in your research, please cite the following: