Unbabel
Tower-Plus-9B
Tower+ 9B is built on top of Gemma 2 9B. The model goes through Continuous Pretraining (CPT), Instruction Tuning (IT), and Weighted Preference Optimization (WPO). During all stages we include parallel and multilingual data (covering 22 languages). This approach makes Tower+ 9B one of the best multilingual LLMs under 10B parameters.

- Developed by: Unbabel
- Model type: A 9B parameter model fine-tuned on a mix of translation-related tasks as well as general instruction-following datasets that include reasoning, code instructions, etc.
- Languages: German, Spanish, French, Italian, Korean, Dutch, Russian, English, Portuguese (Portugal), Portuguese (Brazilian), Spanish (Latin America), Chinese (Simplified), Chinese (Traditional), Czech, Ukrainian, Hindi, Icelandic, Japanese, Polish, Swedish, Hungarian, Romanian, Danish, Norwegian (Nynorsk), Norwegian (Bokmål), Finnish
- License: CC-BY-NC-4.0
- Context Size: 8192 tokens

Tower is intended for multilingual tasks and is especially strong at machine translation. Because Tower is also a strong multilingual model, you can use it for other multilingual tasks as well. Another use case where Tower works well is creating multilingual synthetic data (for the languages it covers). You can do this either by translating instructions and their respective answers, or by asking the model to create an instruction given a document as seed data. When using the model, make sure your prompt is formatted correctly! We also recommend using vLLM rather than Hugging Face Transformers; a sketch follows below.

Citation
If you use this model, please cite our paper.
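As a minimal sketch of the recommended vLLM usage, assuming the Hugging Face repo id `Unbabel/Tower-Plus-9B`; the translation prompt wording is our own example, not an official template:

```python
# Minimal sketch: serving Tower+ 9B with vLLM (offline inference).
# The prompt wording is an example, not an official template.
from vllm import LLM, SamplingParams

llm = LLM(model="Unbabel/Tower-Plus-9B")
sampling_params = SamplingParams(temperature=0, max_tokens=256)

messages = [{
    "role": "user",
    "content": (
        "Translate the following English source text to Portuguese (Portugal):\n"
        "English: A group of researchers has released a new model for translation-related tasks.\n"
        "Portuguese (Portugal):"
    ),
}]

# llm.chat applies the model's chat template before generation.
outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)
```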
Tower-Plus-72B
This repository contains the Tower+ 72B model, as presented in the paper "Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs".

Project Page: https://huggingface.co/collections/Unbabel/tower-plus-6846ca452a10c0905dc03c0f

Tower+ 72B is built on top of Qwen 2.5 72B. The model goes through Continuous Pretraining (CPT), Instruction Tuning (IT), and Weighted Preference Optimization (WPO). During all these stages we include parallel and multilingual data (covering 22 languages).

- Developed by: Unbabel
- Model type: A 72B parameter model fine-tuned on a mix of translation-related tasks as well as general instruction-following datasets that include reasoning, code instructions, etc.
- Languages: German, Spanish, French, Italian, Korean, Dutch, Russian, English, Portuguese (Portugal), Portuguese (Brazilian), Spanish (Latin America), Chinese (Simplified), Chinese (Traditional), Czech, Ukrainian, Hindi, Icelandic, Japanese, Polish, Swedish, Hungarian, Romanian, Danish, Norwegian (Nynorsk), Norwegian (Bokmål), Finnish
- License: CC-BY-NC-4.0
- Context Size: 131,072 tokens (recommended generation length: 8,192 tokens)

Tower is intended for multilingual tasks and is especially strong at translation-related tasks. Another use case where Tower works well is creating multilingual synthetic data (for the languages it covers). You can do this either by translating instructions and their respective answers, or by asking the model to create an instruction given a document as seed data. When using the model, make sure your prompt is formatted correctly! We also recommend using vLLM rather than Hugging Face Transformers; a sketch follows below.

Citation
If you use this model, please cite our paper.
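Since 72B weights will not fit on a single commodity GPU, a vLLM sketch would typically shard the model with tensor parallelism. The repo id `Unbabel/Tower-Plus-72B` is the Hugging Face name; the `tensor_parallel_size` value is an assumption to adjust to your hardware:

```python
# Minimal sketch: Tower+ 72B with vLLM, sharded across 4 GPUs.
# tensor_parallel_size=4 is an assumption; set it to your GPU count.
from vllm import LLM, SamplingParams

llm = LLM(model="Unbabel/Tower-Plus-72B", tensor_parallel_size=4)
# 8192 matches the recommended generation length from the model card.
sampling_params = SamplingParams(temperature=0, max_tokens=8192)

messages = [{
    "role": "user",
    "content": (
        "Translate the following German source text to English:\n"
        "German: Der Vortrag wurde wegen des Sturms abgesagt.\n"
        "English:"
    ),
}]

outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)
```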
Tower-Plus-2B
Tower+ 2B is built on top of Gemma 2 2B. The model goes through Continuous Pretraining (CPT), Instruction Tuning (IT), Weighted Preference Optimization (WPO), and GRPO with verifiable rewards. During all stages we include parallel and multilingual data (covering 22 languages). This approach makes Tower+ 2B one of the best multilingual LLMs under 3B parameters.

- Developed by: Unbabel
- Model type: A 2B parameter model fine-tuned on a mix of translation-related tasks as well as general instruction-following datasets that include reasoning, code instructions, etc.
- Languages: German, Spanish, French, Italian, Korean, Dutch, Russian, English, Portuguese (Portugal), Portuguese (Brazilian), Spanish (Latin America), Chinese (Simplified), Chinese (Traditional), Czech, Ukrainian, Hindi, Icelandic, Japanese, Polish, Swedish, Hungarian, Romanian, Danish, Norwegian (Nynorsk), Norwegian (Bokmål), Finnish
- License: CC-BY-NC-4.0
- Context Size: 8192 tokens

Tower is intended for multilingual tasks and is especially strong at machine translation. Because Tower is also a strong multilingual model, you can use it for other multilingual tasks as well. Another use case where Tower works well is creating multilingual synthetic data (for the languages it covers). You can do this either by translating instructions and their respective answers, or by asking the model to create an instruction given a document as seed data; a sketch of this use case follows below. When using the model, make sure your prompt is formatted correctly! We also recommend using vLLM rather than Hugging Face Transformers.

Citation
If you use this model, please cite our paper.
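To illustrate the synthetic-data use case described above, a minimal sketch that asks the model to write an instruction from a seed document. The repo id `Unbabel/Tower-Plus-2B` is the Hugging Face name; the seed document and prompt wording are our own:

```python
# Minimal sketch: using Tower+ 2B to create synthetic multilingual data
# from a seed document. The prompt wording is an example, not an official template.
from vllm import LLM, SamplingParams

llm = LLM(model="Unbabel/Tower-Plus-2B")
sampling_params = SamplingParams(temperature=0.7, max_tokens=256)

seed_document = (
    "A solar eclipse occurs when the Moon passes between the Earth and the Sun, "
    "partially or totally blocking the Sun's light."
)
messages = [{
    "role": "user",
    "content": "Write one question in German that the following document answers:\n\n" + seed_document,
}]

outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)  # synthetic instruction, to be paired with an answer
```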
M-Prometheus-7B
M-Prometheus is a suite of open LLM judges that can natively evaluate multilingual outputs. They were trained on 480k instances of multilingual direct assessment and pairwise comparison data with long-form feedback. Our models can be prompted in the same way as Prometheus-2; check out our paper for more details. For direct-assessment MT evaluation, we used a Prometheus-2-style prompt (see the paper for the exact template).
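As a rough illustration of the Prometheus-2 direct-assessment format, a judge call might look like the sketch below; the rubric wording and the MT example are placeholders, not the prompt used in the paper:

```python
# Rough illustration of a Prometheus-2-style direct-assessment call.
# The rubric and the translation example are placeholders, NOT the paper's prompt.
from transformers import pipeline

judge = pipeline("text-generation", model="Unbabel/M-Prometheus-7B", device_map="auto")

prompt = """###Task Description:
An instruction, a response to evaluate, and a score rubric are given.
1. Write detailed feedback assessing the response strictly based on the rubric.
2. Then write a score that is an integer between 1 and 5.
3. Output format: "Feedback: (feedback) [RESULT] (integer between 1 and 5)"

###The instruction to evaluate:
Translate the following German source text to English.
German: Der Vortrag wurde wegen des Sturms abgesagt.

###Response to evaluate:
The lecture was cancelled because of the storm.

###Score Rubrics:
[Is the translation accurate and fluent?]
Score 1: The output is unrelated to the source text.
Score 5: The translation is fully accurate and fluent.

###Feedback:"""

print(judge(prompt, max_new_tokens=512)[0]["generated_text"])
```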
TowerInstruct-7B-v0.2
TowerInstruct-7B is a language model that results from fine-tuning TowerBase on the TowerBlocks supervised fine-tuning dataset. TowerInstruct-7B-v0.2 is an updated release of the first model in the series. The model is trained to handle several translation-related tasks, such as general machine translation (e.g., sentence- and paragraph/document-level translation, terminology-aware translation, context-aware translation), automatic post-editing, named-entity recognition, grammatical error correction, and paraphrase generation. We will release more details in the upcoming technical report. For now, you can check results obtained with the model here.

- Developed by: Unbabel, Instituto Superior Técnico, CentraleSupélec University of Paris-Saclay
- Model type: A 7B parameter model fine-tuned on a mix of publicly available, synthetic datasets on translation-related tasks, as well as conversational datasets and code instructions.
- Language(s) (NLP): English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, Russian
- License: CC-BY-NC-4.0; Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
- Finetuned from model: TowerBase

Update: TowerInstruct-7B-v0.2 has more reliable document-level translation capabilities than TowerInstruct-7B-v0.1. The new version of TowerBlocks used to train v0.2 is also available in the Tower collection.

The model was initially fine-tuned on a filtered and preprocessed supervised fine-tuning dataset (TowerBlocks), which contains a diverse range of data sources:
- Translation (sentence- and paragraph-level)
- Automatic Post-Editing
- Machine Translation Evaluation
- Context-aware Translation
- Terminology-aware Translation
- Multi-reference Translation
- Named-entity Recognition
- Paraphrase Generation
- Synthetic Chat data
- Code instructions

You can find the dataset and all data sources of TowerBlocks here. You can run the model using the `pipeline()` function from 🤗 Transformers; a sketch follows at the end of this card.

The model is not guaranteed to perform well for languages other than the 10 languages it supports. Even though we trained the model on conversational data and code instructions, it is not intended to be used as a conversational chatbot or code assistant. We are currently working on improving quality and consistency of document-level translation; this model is not intended to be used as a document-level translator. TowerInstruct-v0.2 has not been aligned to human preferences, so the model may generate problematic outputs (e.g., hallucinations, harmful content, or false statements).

TowerInstruct-v0.2 was trained using the ChatML prompt templates without any system prompts; the sketch at the end of this card applies this template via `apply_chat_template`. The prompts for all supervised tasks can be found in TowerBlocks. We used multiple prompt templates for each task. While different prompts may offer different outputs, the difference in downstream performance should be minimal.

The following hyperparameters were used during training:
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
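A minimal sketch of the `pipeline()` usage referenced above, assuming the Hugging Face repo id `Unbabel/TowerInstruct-7B-v0.2`; the example source sentence is our own:

```python
# Minimal sketch: running TowerInstruct-7B-v0.2 with the 🤗 Transformers pipeline.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Unbabel/TowerInstruct-7B-v0.2",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{
    "role": "user",
    "content": (
        "Translate the following text from Portuguese into English.\n"
        "Portuguese: Um grupo de investigadores lançou um novo modelo "
        "para tarefas relacionadas com tradução.\nEnglish:"
    ),
}]

# apply_chat_template wraps the message in ChatML (no system prompt).
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=False)
print(outputs[0]["generated_text"])
```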
M-Prometheus-14B
M-Prometheus is a suite of open LLM judges that can natively evaluate multilingual outputs. They were trained on 480k instances of multilingual direct assessment and pairwise comparison data with long-form feedback. Our models can be prompted in the same way as Prometheus-2; check out our paper for more details. For direct-assessment MT evaluation, we used a Prometheus-2-style prompt (see the paper for the exact template).
TowerInstruct-13B-v0.1
TowerInstruct-13B is a language model that results from fine-tuning TowerBase on the TowerBlocks supervised fine-tuning dataset. TowerInstruct-13B-v0.1 is the first model in the series. The model is trained to handle several translation-related tasks, such as general machine translation (e.g., sentence- and paragraph/document-level translation, terminology-aware translation, context-aware translation), automatic post-editing, named-entity recognition, grammatical error correction, and paraphrase generation. We will release more details in the upcoming technical report. For now, you can check results obtained with the model here.

- Developed by: Unbabel, Instituto Superior Técnico, CentraleSupélec University of Paris-Saclay
- Model type: A 13B parameter model fine-tuned on a mix of publicly available, synthetic datasets on translation-related tasks, as well as conversational datasets and code instructions.
- Language(s) (NLP): English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, Russian
- License: CC-BY-NC-4.0; Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
- Finetuned from model: TowerBase

The model was initially fine-tuned on a filtered and preprocessed supervised fine-tuning dataset (TowerBlocks-v0.2), which contains a diverse range of data sources:
- Translation (sentence- and paragraph-level)
- Automatic Post-Editing
- Machine Translation Evaluation
- Context-aware Translation
- Terminology-aware Translation
- Multi-reference Translation
- Named-entity Recognition
- Paraphrase Generation
- Synthetic Chat data
- Code instructions

You can find the dataset and all data sources of TowerBlocks-v0.2 here. You can run the model using the `pipeline()` function from 🤗 Transformers; a sketch follows at the end of this card.

The model is not guaranteed to perform well for languages other than the 10 languages it supports. Even though we trained the model on conversational data and code instructions, it is not intended to be used as a conversational chatbot or code assistant. We are currently working on improving quality and consistency of document-level translation; this model is not intended to be used as a document-level translator. TowerInstruct-v0.1 has not been aligned to human preferences, so the model may generate problematic outputs (e.g., hallucinations, harmful content, or false statements).

TowerInstruct-v0.1 was trained using the ChatML prompt templates without any system prompts; the sketch at the end of this card applies this template via `apply_chat_template`. The prompts for all supervised tasks can be found in TowerBlocks-v0.2. We used multiple prompt templates for each task. While different prompts may offer different outputs, the difference in downstream performance should be minimal.

The following hyperparameters were used during training:
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
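A minimal sketch of the `pipeline()` usage referenced above, assuming the Hugging Face repo id `Unbabel/TowerInstruct-13B-v0.1`; the comment shows what the rendered ChatML prompt looks like, and the example source sentence is our own:

```python
# Minimal sketch: running TowerInstruct-13B-v0.1 with the 🤗 Transformers pipeline.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Unbabel/TowerInstruct-13B-v0.1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{
    "role": "user",
    "content": (
        "Translate the following text from Portuguese into English.\n"
        "Portuguese: Um grupo de investigadores lançou um novo modelo "
        "para tarefas relacionadas com tradução.\nEnglish:"
    ),
}]

# With ChatML and no system prompt, the rendered prompt looks like:
# <|im_start|>user
# {instruction}<|im_end|>
# <|im_start|>assistant
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=False)
print(outputs[0]["generated_text"])
```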
M-Prometheus-3B
M-Prometheus is a suite of open LLM judges that can natively evaluate multilingual outputs. They were trained on 480k instances of multilingual direct assessment and pairwise comparison data with long-form feedback. Our models can be prompted in the same way as Prometheus-2; check out our paper for more details. For direct-assessment MT evaluation, we used a Prometheus-2-style prompt (see the paper for the exact template).