INSAIT-Institute
MamayLM-Gemma-3-12B-IT-v1.0
INSAIT introduces MamayLM-Gemma-3-12B-IT-v1.0, the best-performing Ukrainian language model, based on google/gemma-3-12b-pt and google/gemma-3-12b-it. MamayLM-Gemma-3-12B-IT-v1.0 is free to use and distributed under the Gemma Terms of Use. This model was created by `INSAIT`, part of Sofia University St. Kliment Ohridski, in Sofia, Bulgaria.

The model was built on top of Google's Gemma 3 12B open models. It was continuously pre-trained on a large pre-filtered dataset using a combination of data mixing and model merging, allowing the model to gain outstanding Ukrainian cultural and linguistic capabilities while retaining its English performance. During the pre-training stage, we use various datasets, including Ukrainian web crawl data (Kobza), freely available datasets such as Wikipedia, a range of specialized Ukrainian datasets, and machine translations of popular English datasets. The model was then instruction-fine-tuned on a newly constructed Ukrainian instruction dataset created using machine translations of the current best English datasets and specialized Ukrainian datasets prepared by the Ukrainian community. For more information, check our blog post (available in English and Ukrainian).

We evaluate our models on a set of standard English benchmarks, a translated version of them in Ukrainian, as well as Ukrainian-specific benchmarks we collected:
- Winogrande challenge: testing world knowledge and understanding
- Hellaswag: testing sentence completion
- ARC Easy/Challenge: testing logical reasoning
- TriviaQA: testing trivia knowledge
- GSM-8k: solving grade-school mathematics word problems
- MMLU: testing knowledge on a multitude of topics
- IFEval: testing instruction-following skills
- ZNO: testing knowledge of the Ukrainian high school curriculum in Ukrainian language & literature, history, mathematics and geography

These benchmarks test logical reasoning, mathematics, knowledge, language understanding and other skills of the models and are provided at https://github.com/insait-institute/lm-evaluation-harness-uk.

The graphs above show the performance of MamayLM 12B compared to other large open models. The results show the excellent abilities of MamayLM in Ukrainian, which allow it to outperform much larger models, including Alibaba's Qwen 2.5 72B and Meta's Llama 3.1 70B. Finally, our models retain the excellent English performance inherited from the original Google Gemma 3 models upon which they are based. MamayLM v1.0 12B also shows improved performance on visual benchmarks like MMMU and ZNO-Vision (MMZNO).

Use in 🤗 Transformers

First install the latest version of the transformers library. For optimal performance, we recommend the text-generation parameters we have extensively tested the model with; in principle, increasing the temperature should work adequately as well. In order to leverage instruction fine-tuning, your prompt should begin with a beginning-of-sequence token `<bos>` and be formatted in the Gemma 3 chat template. `<bos>` should only be the first token in a chat sequence. This format is also available as a chat template via the `apply_chat_template()` method.
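A minimal usage sketch along these lines; the multimodal `Gemma3ForConditionalGeneration`/`AutoProcessor` loading path is assumed for this checkpoint, and the sampling values are illustrative placeholders rather than the tested recommendations:

```python
# pip install -U transformers accelerate
import torch
from transformers import AutoProcessor, Gemma3ForConditionalGeneration

model_id = "INSAIT-Institute/MamayLM-Gemma-3-12B-IT-v1.0"

processor = AutoProcessor.from_pretrained(model_id)
model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
).eval()

# apply_chat_template() prepends <bos> and wraps the turns in the Gemma 3 chat format.
messages = [
    {"role": "user", "content": [{"type": "text", "text": "Хто написав «Кобзар»?"}]},
]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

# Illustrative sampling settings, not the officially recommended values.
with torch.inference_mode():
    out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.1)

print(processor.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```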
The model and instructions for usage in GGUF format are available at INSAIT-Institute/MamayLM-Gemma-3-12B-IT-v1.0-GGUF.

We welcome feedback from the community to help improve MamayLM. If you have suggestions, encounter any issues, or have ideas for improvements, please:
- Share your experience using the model through Hugging Face's community discussion feature, or
- Contact us at [email protected]

Your real-world usage and insights are valuable in helping us optimize the model's performance and behaviour for various use cases.

Summary
- Finetuned from: google/gemma-3-12b-it; google/gemma-3-12b-pt
- Model type: Causal decoder-only transformer language model
- Language: Ukrainian and English
- Contact: [email protected]
- License: MamayLM is distributed under the Gemma Terms of Use
MamayLM-Gemma-3-12B-IT-v1.0-GGUF
MamayLM-Gemma-3-4B-IT-v1.0
INSAIT introduces MamayLM-Gemma-3-4B-IT-v1.0, the best-performing Ukrainian language model, based on google/gemma-3-4b-pt and google/gemma-3-4b-it. MamayLM-Gemma-3-4B-IT-v1.0 is free to use and distributed under the Gemma Terms of Use. This model was created by `INSAIT`, part of Sofia University St. Kliment Ohridski, in Sofia, Bulgaria.

The model was built on top of Google's Gemma 3 4B open models. It was continuously pre-trained on a large pre-filtered dataset using a combination of data mixing and model merging, allowing the model to gain outstanding Ukrainian cultural and linguistic capabilities while retaining its English performance. During the pre-training stage, we use various datasets, including Ukrainian web crawl data (Kobza), freely available datasets such as Wikipedia, a range of specialized Ukrainian datasets, and machine translations of popular English datasets. The model was then instruction-fine-tuned on a newly constructed Ukrainian instruction dataset created using machine translations of the current best English datasets and specialized Ukrainian datasets prepared by the Ukrainian community. For more information, check our blog post (available in English and Ukrainian).

We evaluate our models on a set of standard English benchmarks, a translated version of them in Ukrainian, as well as Ukrainian-specific benchmarks we collected:
- Winogrande challenge: testing world knowledge and understanding
- Hellaswag: testing sentence completion
- ARC Easy/Challenge: testing logical reasoning
- TriviaQA: testing trivia knowledge
- GSM-8k: solving grade-school mathematics word problems
- MMLU: testing knowledge on a multitude of topics
- IFEval: testing instruction-following skills
- ZNO: testing knowledge of the Ukrainian high school curriculum in Ukrainian language & literature, history, mathematics and geography

These benchmarks test logical reasoning, mathematics, knowledge, language understanding and other skills of the models and are provided at https://github.com/insait-institute/lm-evaluation-harness-uk.

The graphs above show the performance of MamayLM 4B compared to other large open models. The results show the excellent abilities of MamayLM in Ukrainian, which allow it to outperform similarly sized models. Finally, our models retain the excellent English performance inherited from the original Google Gemma 3 models upon which they are based. MamayLM v1.0 4B also shows improved performance on visual benchmarks like MMMU and ZNO-Vision (MMZNO).

Use in 🤗 Transformers

First install the latest version of the transformers library. For optimal performance, we recommend the text-generation parameters we have extensively tested the model with; in principle, increasing the temperature should work adequately as well. In order to leverage instruction fine-tuning, your prompt should begin with a beginning-of-sequence token `<bos>` and be formatted in the Gemma 3 chat template. `<bos>` should only be the first token in a chat sequence. This format is also available as a chat template via the `apply_chat_template()` method.
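A shorter sketch using the high-level `pipeline` API; the `image-text-to-text` task name is assumed because the checkpoint is multimodal, and the generation settings are placeholders:

```python
import torch
from transformers import pipeline

# Assumes the checkpoint registers as an image-text-to-text (Gemma 3) model.
pipe = pipeline(
    "image-text-to-text",
    model="INSAIT-Institute/MamayLM-Gemma-3-4B-IT-v1.0",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": [{"type": "text", "text": "Поясни, що таке писанка."}]},
]

# Placeholder generation settings, not the officially recommended values.
out = pipe(text=messages, max_new_tokens=200)
print(out[0]["generated_text"][-1]["content"])
```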
The model and instructions for usage in GGUF format are available at INSAIT-Institute/MamayLM-Gemma-3-4B-IT-v1.0-GGUF.

We welcome feedback from the community to help improve MamayLM. If you have suggestions, encounter any issues, or have ideas for improvements, please:
- Share your experience using the model through Hugging Face's community discussion feature, or
- Contact us at [email protected]

Your real-world usage and insights are valuable in helping us optimize the model's performance and behaviour for various use cases.

Summary
- Finetuned from: google/gemma-3-4b-it; google/gemma-3-4b-pt
- Model type: Causal decoder-only transformer language model
- Language: Ukrainian and English
- Contact: [email protected]
- License: MamayLM is distributed under the Gemma Terms of Use
BgGPT-Gemma-2-9B-IT-v1.0-GGUF
MamayLM-Gemma-3-4B-IT-v1.0-GGUF
This repo contains the GGUF format model files for INSAIT-Institute/MamayLM-Gemma-3-4B-IT-v1.0.
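For example, the GGUF files can be run locally with `llama-cpp-python`; the quantization filename below is an assumption, so use whichever `.gguf` file is actually present in the repo:

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# The filename pattern is an assumption; pick an actual .gguf file from the repo.
llm = Llama.from_pretrained(
    repo_id="INSAIT-Institute/MamayLM-Gemma-3-4B-IT-v1.0-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Привіт! Розкажи коротко про Київ."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```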
BgGPT-Gemma-2-2.6B-IT-v1.0
INSAIT introduces BgGPT-Gemma-2-2.6B-IT-v1.0, a state-of-the-art Bulgarian language model based on google/gemma-2-2b and google/gemma-2-2b-it. BgGPT-Gemma-2-2.6B-IT-v1.0 is free to use and distributed under the Gemma Terms of Use. This model was created by `INSAIT`, part of Sofia University St. Kliment Ohridski, in Sofia, Bulgaria.

The model was built on top of Google's Gemma 2 2B open models. It was continuously pre-trained on around 100 billion tokens (85 billion in Bulgarian) using the Branch-and-Merge strategy INSAIT presented at EMNLP'24, allowing the model to gain outstanding Bulgarian cultural and linguistic capabilities while retaining its English performance. During the pre-training stage, we use various datasets, including Bulgarian web crawl data, freely available datasets such as Wikipedia, a range of specialized Bulgarian datasets sourced by the INSAIT Institute, and machine translations of popular English datasets. The model was then instruction-fine-tuned on a newly constructed Bulgarian instruction dataset created using real-world conversations. For more information, check our blog post.

We evaluate our models on a set of standard English benchmarks, a translated version of them in Bulgarian, as well as Bulgarian-specific benchmarks we collected:
- Winogrande challenge: testing world knowledge and understanding
- Hellaswag: testing sentence completion
- ARC Easy/Challenge: testing logical reasoning
- TriviaQA: testing trivia knowledge
- GSM-8k: solving grade-school mathematics word problems
- Exams: solving high school problems from natural and social sciences
- MON: contains exams across various subjects for grades 4 to 12

These benchmarks test logical reasoning, mathematics, knowledge, language understanding and other skills of the models and are provided at https://github.com/insait-institute/lm-evaluation-harness-bg.

The graphs above show the performance of BgGPT 2.6B compared to other small open language models such as Microsoft's Phi 3.5 and Alibaba's Qwen 2.5 3B. The BgGPT model not only surpasses them, but also retains the English performance inherited from the original Google Gemma 2 models upon which it is based.

Use in 🤗 Transformers

First install the latest version of the transformers library. For optimal performance, we recommend the text-generation parameters we have extensively tested the model with; in principle, increasing the temperature should work adequately as well. In order to leverage instruction fine-tuning, your prompt should begin with a beginning-of-sequence token `<bos>` and be formatted in the Gemma 2 chat template. `<bos>` should only be the first token in a chat sequence. This format is also available as a chat template via the `apply_chat_template()` method.

Important Note: Models based on Gemma 2, such as BgGPT-Gemma-2-2.6B-IT-v1.0, do not support flash attention. Using it results in degraded performance.
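A minimal loading sketch that follows this note by forcing eager attention; the Bulgarian example prompt and sampling values are illustrative placeholders, not the tested recommendations:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# attn_implementation="eager" avoids flash attention, which degrades Gemma 2 quality.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",
    device_map="auto",
)

messages = [{"role": "user", "content": "Кога е основан Софийският университет?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Placeholder sampling settings, not the officially recommended values.
out = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.1)
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```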
The model and instructions for usage in GGUF format are available at INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0-GGUF.

We welcome feedback from the community to help improve BgGPT. If you have suggestions, encounter any issues, or have ideas for improvements, please:
- Share your experience using the model through Hugging Face's community discussion feature, or
- Contact us at [email protected]

Your real-world usage and insights are valuable in helping us optimize the model's performance and behaviour for various use cases.

Summary
- Finetuned from: google/gemma-2-2b-it; google/gemma-2-2b
- Model type: Causal decoder-only transformer language model
- Language: Bulgarian and English
- Contact: [email protected]
- License: BgGPT is distributed under the Gemma Terms of Use
BgGPT-Gemma-2-27B-IT-v1.0-GGUF
MamayLM Gemma 2 9B IT V0.1
INSAIT introduces MamayLM-Gemma-2-9B-IT-v0.1, the best-performing Ukrainian language model, based on google/gemma-2-9b and google/gemma-2-9b-it. MamayLM-Gemma-2-9B-IT-v0.1 is free to use and distributed under the Gemma Terms of Use. This model was created by `INSAIT`, part of Sofia University St. Kliment Ohridski, in Sofia, Bulgaria.

The model was built on top of Google's Gemma 2 9B open models. It was continuously pre-trained on a large pre-filtered dataset (75B tokens of Ukrainian and English data in total) using a combination of data mixing and model merging, allowing the model to gain outstanding Ukrainian cultural and linguistic capabilities while retaining its English performance. During the pre-training stage, we use various datasets, including Ukrainian web crawl data (FineWeb2), freely available datasets such as Wikipedia, a range of specialized Ukrainian datasets, and machine translations of popular English datasets. The model was then instruction-fine-tuned on a newly constructed Ukrainian instruction dataset created using machine translations of the current best English datasets and specialized Ukrainian datasets prepared by the Ukrainian community. For more information, check our blog post (English, Ukrainian).

We evaluate our models on a set of standard English benchmarks, a translated version of them in Ukrainian, as well as Ukrainian-specific benchmarks we collected:
- Winogrande challenge: testing world knowledge and understanding
- Hellaswag: testing sentence completion
- ARC Easy/Challenge: testing logical reasoning
- TriviaQA: testing trivia knowledge
- GSM-8k: solving grade-school mathematics word problems
- MMLU: testing knowledge on a multitude of topics
- IFEval: testing instruction-following skills
- ZNO: testing knowledge of the Ukrainian high school curriculum in Ukrainian language & literature, history, mathematics and geography

These benchmarks test logical reasoning, mathematics, knowledge, language understanding and other skills of the models and are provided at https://github.com/insait-institute/lm-evaluation-harness-uk.

The graphs above show the performance of MamayLM 9B compared to other large open models. The results show the excellent abilities of MamayLM in Ukrainian, which allow it to outperform much larger models, including Alibaba's Qwen 2.5 72B and Meta's Llama 3.1 70B. Finally, our models retain the excellent English performance inherited from the original Google Gemma 2 models upon which they are based.

Use in 🤗 Transformers

First install the latest version of the transformers library. For optimal performance, we recommend the text-generation parameters we have extensively tested the model with; in principle, increasing the temperature should work adequately as well. In order to leverage instruction fine-tuning, your prompt should begin with a beginning-of-sequence token `<bos>` and be formatted in the Gemma 2 chat template. `<bos>` should only be the first token in a chat sequence. This format is also available as a chat template via the `apply_chat_template()` method.
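To illustrate the chat format, a sketch that renders the template as plain text; the sample string in the comment is approximate, and the exact prompt is whatever the tokenizer produces:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("INSAIT-Institute/MamayLM-Gemma-2-9B-IT-v0.1")

messages = [{"role": "user", "content": "Яка столиця України?"}]

# tokenize=False returns the formatted prompt string instead of token ids.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
# Roughly: <bos><start_of_turn>user\nЯка столиця України?<end_of_turn>\n<start_of_turn>model\n
```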
The model and instructions for usage in GGUF format are available at INSAIT-Institute/MamayLM-Gemma-2-9B-IT-v0.1-GGUF.

We welcome feedback from the community to help improve MamayLM. If you have suggestions, encounter any issues, or have ideas for improvements, please:
- Share your experience using the model through Hugging Face's community discussion feature, or
- Contact us at [email protected]

Your real-world usage and insights are valuable in helping us optimize the model's performance and behaviour for various use cases.

Summary
- Finetuned from: google/gemma-2-9b-it; google/gemma-2-9b
- Model type: Causal decoder-only transformer language model
- Language: Ukrainian and English
- Contact: [email protected]
- License: MamayLM is distributed under the Gemma Terms of Use
BgGPT-Gemma-2-2.6B-IT-v1.0-GGUF
This repo contains the GGUF format model files for INSAIT-Institute/BgGPT-Gemma-2-2.6B-IT-v1.0.
MamayLM Gemma 2 9B IT V0.1 GGUF
This repo contains the GGUF format model files for INSAIT-Institute/MamayLM-Gemma-2-9B-IT-v0.1.
BgGPT-7B-Instruct-v0.2-GGUF
BgGPT-7B-Instruct-v0.1-GGUF
BgGPT-7B-Instruct-v0.2
BgGPT-Gemma-2-9B-IT-v1.0
INSAIT introduces BgGPT-Gemma-2-9B-IT-v1.0, a state-of-the-art Bulgarian language model based on google/gemma-2-9b and google/gemma-2-9b-it. BgGPT-Gemma-2-9B-IT-v1.0 is free to use and distributed under the Gemma Terms of Use. This model was created by `INSAIT`, part of Sofia University St. Kliment Ohridski, in Sofia, Bulgaria.

The model was built on top of Google's Gemma 2 9B open models. It was continuously pre-trained on around 100 billion tokens (85 billion in Bulgarian) using the Branch-and-Merge strategy INSAIT presented at EMNLP'24, allowing the model to gain outstanding Bulgarian cultural and linguistic capabilities while retaining its English performance. During the pre-training stage, we use various datasets, including Bulgarian web crawl data, freely available datasets such as Wikipedia, a range of specialized Bulgarian datasets sourced by the INSAIT Institute, and machine translations of popular English datasets. The model was then instruction-fine-tuned on a newly constructed Bulgarian instruction dataset created using real-world conversations. For more information, check our blog post.

We evaluate our models on a set of standard English benchmarks, a translated version of them in Bulgarian, as well as Bulgarian-specific benchmarks we collected:
- Winogrande challenge: testing world knowledge and understanding
- Hellaswag: testing sentence completion
- ARC Easy/Challenge: testing logical reasoning
- TriviaQA: testing trivia knowledge
- GSM-8k: solving grade-school mathematics word problems
- Exams: solving high school problems from natural and social sciences
- MON: contains exams across various subjects for grades 4 to 12

These benchmarks test logical reasoning, mathematics, knowledge, language understanding and other skills of the models and are provided at https://github.com/insait-institute/lm-evaluation-harness-bg.

The graphs above show the performance of BgGPT 9B and BgGPT 27B compared to other large open models. The results show the excellent abilities of both the 9B and 27B models in Bulgarian, which allow them to outperform much larger models, including Alibaba's Qwen 2.5 72B and Meta's Llama 3.1 70B. Further, both BgGPT 9B and BgGPT 27B significantly improve upon the previous version of BgGPT based on Mistral-7B (BgGPT-7B-Instruct-v0.2, shown in grey in the figure). Finally, our models retain the excellent English performance inherited from the original Google Gemma 2 models upon which they are based.

Use in 🤗 Transformers

First install the latest version of the transformers library. For optimal performance, we recommend the text-generation parameters we have extensively tested the model with; in principle, increasing the temperature should work adequately as well. In order to leverage instruction fine-tuning, your prompt should begin with a beginning-of-sequence token `<bos>` and be formatted in the Gemma 2 chat template. `<bos>` should only be the first token in a chat sequence. This format is also available as a chat template via the `apply_chat_template()` method.

Important Note: Models based on Gemma 2, such as BgGPT-Gemma-2-9B-IT-v1.0, do not support flash attention. Using it results in degraded performance.
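A sketch of passing generation parameters via `GenerationConfig`; the concrete values below are placeholders rather than the recommended settings from our testing:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_id = "INSAIT-Institute/BgGPT-Gemma-2-9B-IT-v1.0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",  # per the flash-attention note above
    device_map="auto",
)

# Placeholder values; substitute the recommended settings for real use.
gen_config = GenerationConfig(
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
    repetition_penalty=1.1,
)

messages = [{"role": "user", "content": "Разкажи ми за Рилския манастир."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(input_ids, generation_config=gen_config)
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```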
The model and instructions for usage in GGUF format are available at INSAIT-Institute/BgGPT-Gemma-2-9B-IT-v1.0-GGUF.

We welcome feedback from the community to help improve BgGPT. If you have suggestions, encounter any issues, or have ideas for improvements, please:
- Share your experience using the model through Hugging Face's community discussion feature, or
- Contact us at [email protected]

Your real-world usage and insights are valuable in helping us optimize the model's performance and behaviour for various use cases.

Summary
- Finetuned from: google/gemma-2-9b-it; google/gemma-2-9b
- Model type: Causal decoder-only transformer language model
- Language: Bulgarian and English
- Contact: [email protected]
- License: BgGPT is distributed under the Gemma Terms of Use
Zephyr-7B-MixAT
This is a model adapter for HuggingFaceH4/zephyr-7b-beta, fine-tuned using the MixAT method. MixAT is a cutting-edge adversarial training approach designed to enhance model robustness against adversarial attacks, contributing to the development of more trustworthy and reliable Large Language Models (LLMs). For details, see our paper MixAT: Combining Continuous and Discrete Adversarial Training for LLMs. Training and evaluation code is available in the MixAT GitHub repository.

Use in 🤗 PEFT and Transformers (Quantized)

First, install the required libraries. Then, load the base model (4-bit quantized) using `transformers` and apply the adapter using `peft`.

Results

MixAT has been evaluated against a broad range of state-of-the-art adversarial attacks, introducing the At Least One Attack Success Rate (ALO-ASR) metric to assess worst-case model vulnerability. Our results show that MixAT achieves significantly improved robustness (ALO-ASR < 50%) while maintaining good utility scores and a runtime comparable to continuous relaxation-based methods.

- Repository: https://github.com/insait-institute/MixAT
- Paper: https://arxiv.org/abs/2505.16947
- Base model: HuggingFaceH4/zephyr-7b-beta
- Contact: [email protected] and [email protected]
- License: Distributed under the Apache License, Version 2.0
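A minimal sketch of the PEFT plus 4-bit loading flow described above; the adapter repo id and the `bitsandbytes` quantization settings are assumptions:

```python
# pip install -U transformers peft bitsandbytes accelerate
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "HuggingFaceH4/zephyr-7b-beta"
adapter_id = "INSAIT-Institute/Zephyr-7B-MixAT"  # assumed adapter repo id

# Assumed 4-bit quantization setup via bitsandbytes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)

# Apply the MixAT adapter on top of the quantized base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
```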
BrokenMath-Qwen3-4B
We introduce BrokenMath-Qwen3-4B, a model fine-tuned to mitigate sycophancy in mathematical reasoning. To address this, we developed the BrokenMath benchmark and dataset for measuring sycophantic behaviour and aligning against unwanted responses. `BrokenMath-Qwen3-4B` is fine-tuned on this dataset to learn to identify and reject false mathematical statements, while simultaneously improving its general mathematical problem-solving abilities. The model demonstrates a reduction in sycophantic behaviour and an increase in mathematical utility compared to its base model.

BrokenMath-Qwen3-4B is a fine-tuned version of `Qwen/Qwen3-4B-Thinking (25/07)`. It was trained on the `train` split of the BrokenMath dataset, which contains nearly 15,000 problems. This training data includes a balanced mix of standard and adversarially perturbed math problems, enabling the model to learn robust, non-sycophantic reasoning patterns while retaining its problem-solving capabilities.

You can run the model using the standard `transformers` library. The model is trained to identify flawed premises and state its refusal to proceed, as shown in the example below.

We evaluated `BrokenMath-Qwen3-4B` on the `benchmark` split of the BrokenMath dataset. The results show improvements in both reducing sycophancy and increasing mathematical problem-solving utility compared to the base model.

| Model | Sycophancy Rate (%) ↓ | Utility (Accuracy %) ↑ |
|---------------------------|:---------------------:|:----------------------:|
| Qwen3-4B-Thinking (25/07) | 55.6 | 33.4 |
| BrokenMath-Qwen3-4B | 51.0 | 37.9 |

Utility is measured as accuracy on the original, non-perturbed problem statements within the benchmark.

The model was trained on the BrokenMath dataset, which is publicly available for research into sycophantic behaviour in natural language theorem proving.

| Dataset | Download |
| :--------: | :------------: |
| BrokenMath | 🤗 HuggingFace |

`BrokenMath-Qwen3-4B` is released under the Apache 2.0 license.
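A minimal sketch of querying the model with a flawed-premise problem; the repo id `INSAIT-Institute/BrokenMath-Qwen3-4B` and the generation budget are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "INSAIT-Institute/BrokenMath-Qwen3-4B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A deliberately false statement: the model should flag the flawed premise
# instead of sycophantically "proving" it.
messages = [{"role": "user", "content": "Prove that every prime number greater than 2 is even."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(input_ids, max_new_tokens=2048)  # placeholder budget for long thinking traces
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```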
BgGPT-Gemma-2-27B-IT-v1.0
Gemma 2 is a model designed for natural language processing tasks, utilizing the transformers library.
Spear1 Franka
SPEAR-1 is a cutting-edge Vision-Language-Action (VLA) model capable of achieving performance superior to or on par with state-of-the-art models such as pi0-FAST and pi0.5 on multiple embodiments, while being trained on 20x less robot data. This model was developed by INSAIT, a special unit of Sofia University St. Kliment Ohridski, in Sofia, Bulgaria. Code and model weights for SPEAR-1 models are free to use under the Gemma license. This repo provides model weights fine-tuned for a Franka setup with one wrist and one external camera.

The key to SPEAR-1's data efficiency is SPEAR-VLM, a 3D-aware VLM. SPEAR-VLM extends PaliGemma with the MoGe depth encoder and is trained on 3D VQA tasks using primarily non-robot data sources, such as EgoExo-4D. SPEAR-1's architecture combines SPEAR-VLM with a DiT action expert. It is first pre-trained on a mixture of robot demonstration datasets from Open X-Embodiment and then fine-tuned for specific embodiments.

We provide a fully `AutoModel`-compatible implementation of SPEAR-1 that can be used via transformers. The current implementation requires the following additional dependencies: `roma`, `timm`, `flash-attn`. A working environment for inference can be set up via `uv`.

SPEAR-1 predicts action chunks of delta end-effector positions. Each step in the predicted action chunk is relative to the input state. Given the current end-effector pose `[R, t]` and a model prediction `A_rel = [[R_1, t_1], ..., [R_n, t_n]]`, absolute end-effector pose commands can be computed by composing each relative pose with the current pose.

We welcome feedback from the community to help improve SPEAR-1. If you have suggestions, encounter any issues, or have ideas for improvements, please contact us.

- Model type: Vision-Language-Action with flow-matching action decoding
- Contact: [email protected]
- License: Gemma Terms of Use
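A sketch of composing the predicted deltas into absolute pose commands, under an assumed convention (deltas expressed in the robot base frame, rotation composed on the left); the exact convention used by SPEAR-1 is defined in its codebase and may differ:

```python
import numpy as np

def to_absolute_commands(R, t, relative_chunk):
    """Compose an action chunk of deltas with the current end-effector pose.

    R: (3, 3) current end-effector rotation; t: (3,) current position.
    relative_chunk: list of (R_i, t_i) pairs, each relative to the *input* state.
    Assumed convention: deltas are expressed in the base frame.
    """
    commands = []
    for R_i, t_i in relative_chunk:
        R_abs = R_i @ R   # assumption: delta rotation applied in the base frame
        t_abs = t + t_i   # assumption: delta translation added in the base frame
        commands.append((R_abs, t_abs))
    return commands
```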
Qwen-32B-MixAT
This is a model adapter for Qwen/Qwen2.5-32B-Instruct, fine-tuned using the MixAT method. MixAT is a cutting-edge adversarial training approach designed to enhance model robustness against adversarial attacks, contributing to the development of more trustworthy and reliable Large Language Models (LLMs). For details, see our paper MixAT: Combining Continuous and Discrete Adversarial Training for LLMs. Training and evaluation code is available in the MixAT GitHub repository.

Use in 🤗 PEFT and Transformers (Quantized)

First, install the required libraries. Then, load the base model (4-bit quantized) using `transformers` and apply the adapter using `peft`.

Results

MixAT has been evaluated against a broad range of state-of-the-art adversarial attacks, introducing the At Least One Attack Success Rate (ALO-ASR) metric to assess worst-case model vulnerability. Our results show that MixAT achieves significantly improved robustness (ALO-ASR < 50%) while maintaining good utility scores and a runtime comparable to continuous relaxation-based methods.

- Repository: https://github.com/insait-institute/MixAT
- Paper: https://arxiv.org/abs/2505.16947
- Base model: Qwen/Qwen2.5-32B-Instruct
- Contact: [email protected] and [email protected]
- License: Distributed under the Apache License, Version 2.0
Llama3-8B-MixAT-GCG
This is a model adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned using the MixAT+GCG method. MixAT is a cutting-edge adversarial training approach designed to enhance model robustness against adversarial attacks, contributing to the development of more trustworthy and reliable Large Language Models (LLMs). For details, see our paper MixAT: Combining Continuous and Discrete Adversarial Training for LLMs. Training and evaluation code is available in the MixAT GitHub repository.

Use in 🤗 PEFT and Transformers (Quantized)

First, install the required libraries. Then, load the base model (4-bit quantized) using `transformers` and apply the adapter using `peft`.

Results

MixAT has been evaluated against a broad range of state-of-the-art adversarial attacks, introducing the At Least One Attack Success Rate (ALO-ASR) metric to assess worst-case model vulnerability. Our results show that MixAT achieves significantly improved robustness (ALO-ASR < 50%) while maintaining good utility scores and a runtime comparable to continuous relaxation-based methods.

- Repository: https://github.com/insait-institute/MixAT
- Paper: https://arxiv.org/abs/2505.16947
- Base model: meta-llama/Meta-Llama-3-8B-Instruct
- Contact: [email protected] and [email protected]
- License: Distributed under the Meta Llama 3 Community License Agreement
BgGPT-7B-Instruct-v0.1
OPC-R1-8B
Zephyr-7B-MixAT-GCG
This is a model adapter for HuggingFaceH4/zephyr-7b-beta, fine-tuned using the MixAT+GCG method. MixAT is a cutting-edge adversarial training approach designed to enhance model robustness against adversarial attacks, contributing to the development of more trustworthy and reliable Large Language Models (LLMs). For details, see our paper MixAT: Combining Continuous and Discrete Adversarial Training for LLMs. Training and evaluation code is available in the MixAT GitHub repository.

Use in 🤗 PEFT and Transformers (Quantized)

First, install the required libraries. Then, load the base model (4-bit quantized) using `transformers` and apply the adapter using `peft`.

Results

MixAT has been evaluated against a broad range of state-of-the-art adversarial attacks, introducing the At Least One Attack Success Rate (ALO-ASR) metric to assess worst-case model vulnerability. Our results show that MixAT achieves significantly improved robustness (ALO-ASR < 50%) while maintaining good utility scores and a runtime comparable to continuous relaxation-based methods.

- Repository: https://github.com/insait-institute/MixAT
- Paper: https://arxiv.org/abs/2505.16947
- Base model: HuggingFaceH4/zephyr-7b-beta
- Contact: [email protected] and [email protected]
- License: Distributed under the Apache License, Version 2.0
Llama3-8B-MixAT
This is a model adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned using the MixAT method. MixAT is a cutting-edge adversarial training approach designed to enhance model robustness against adversarial attacks, contributing to the development of more trustworthy and reliable Large Language Models (LLMs). For details, see our paper MixAT: Combining Continuous and Discrete Adversarial Training for LLMs. Training and evaluation code is available in the MixAT GitHub repository.

Use in 🤗 PEFT and Transformers (Quantized)

First, install the required libraries. Then, load the base model (4-bit quantized) using `transformers` and apply the adapter using `peft`.

Results

MixAT has been evaluated against a broad range of state-of-the-art adversarial attacks, introducing the At Least One Attack Success Rate (ALO-ASR) metric to assess worst-case model vulnerability. Our results show that MixAT achieves significantly improved robustness (ALO-ASR < 50%) while maintaining good utility scores and a runtime comparable to continuous relaxation-based methods.

- Repository: https://github.com/insait-institute/MixAT
- Paper: https://arxiv.org/abs/2505.16947
- Base model: meta-llama/Meta-Llama-3-8B-Instruct
- Contact: [email protected] and [email protected]
- License: Distributed under the Meta Llama 3 Community License Agreement
Qwen-14B-MixAT
This is a model adapter for Qwen/Qwen2.5-14B-Instruct, fine-tuned using the MixAT method. MixAT is a cutting-edge adversarial training approach designed to enhance model robustness against adversarial attacks, contributing to the development of more trustworthy and reliable Large Language Models (LLMs). For details, see our paper MixAT: Combining Continuous and Discrete Adversarial Training for LLMs. Training and evaluation code is available in the MixAT GitHub repository.

Use in 🤗 PEFT and Transformers (Quantized)

First, install the required libraries. Then, load the base model (4-bit quantized) using `transformers` and apply the adapter using `peft`.

Results

MixAT has been evaluated against a broad range of state-of-the-art adversarial attacks, introducing the At Least One Attack Success Rate (ALO-ASR) metric to assess worst-case model vulnerability. Our results show that MixAT achieves significantly improved robustness (ALO-ASR < 50%) while maintaining good utility scores and a runtime comparable to continuous relaxation-based methods.

- Repository: https://github.com/insait-institute/MixAT
- Paper: https://arxiv.org/abs/2505.16947
- Base model: Qwen/Qwen2.5-14B-Instruct
- Contact: [email protected] and [email protected]
- License: Distributed under the Apache License, Version 2.0
Qwen-14B-MixAT-GCG
This is a model adapter for Qwen/Qwen2.5-14B-Instruct, fine-tuned using the MixAT+GCG method. MixAT is a cutting-edge adversarial training approach designed to enhance model robustness against adversarial attacks, contributing to the development of more trustworthy and reliable Large Language Models (LLMs). For details, see our paper MixAT: Combining Continuous and Discrete Adversarial Training for LLMs. Training and evaluation code is available in the MixAT GitHub repository.

Use in 🤗 PEFT and Transformers (Quantized)

First, install the required libraries. Then, load the base model (4-bit quantized) using `transformers` and apply the adapter using `peft`.

Results

MixAT has been evaluated against a broad range of state-of-the-art adversarial attacks, introducing the At Least One Attack Success Rate (ALO-ASR) metric to assess worst-case model vulnerability. Our results show that MixAT achieves significantly improved robustness (ALO-ASR < 50%) while maintaining good utility scores and a runtime comparable to continuous relaxation-based methods.

- Repository: https://github.com/insait-institute/MixAT
- Paper: https://arxiv.org/abs/2505.16947
- Base model: Qwen/Qwen2.5-14B-Instruct
- Contact: [email protected] and [email protected]
- License: Distributed under the Apache License, Version 2.0
ReVLA-Bridge
Mistral-7B-MixAT
This is a model adapter for mistralai/Mistral-7B-Instruct-v0.1, fine-tuned using the MixAT method. MixAT is a cutting-edge adversarial training approach designed to enhance model robustness against adversarial attacks, contributing to the development of more trustworthy and reliable Large Language Models (LLMs). For details, see our paper MixAT: Combining Continuous and Discrete Adversarial Training for LLMs. Training and evaluation code is available in the MixAT GitHub repository.

Use in 🤗 PEFT and Transformers (Quantized)

First, install the required libraries. Then, load the base model (4-bit quantized) using `transformers` and apply the adapter using `peft`.

Results

MixAT has been evaluated against a broad range of state-of-the-art adversarial attacks, introducing the At Least One Attack Success Rate (ALO-ASR) metric to assess worst-case model vulnerability. Our results show that MixAT achieves significantly improved robustness (ALO-ASR < 50%) while maintaining good utility scores and a runtime comparable to continuous relaxation-based methods.

- Repository: https://github.com/insait-institute/MixAT
- Paper: https://arxiv.org/abs/2505.16947
- Base model: mistralai/Mistral-7B-Instruct-v0.1
- Contact: [email protected] and [email protected]
- License: Distributed under the Apache License, Version 2.0