MaLA-LM

44 models

emma-500-llama3.1-8b-bi

Massively Multilingual Adaptation of Large Language Models Using Bilingual Translation Data

EMMA-500 Llama 3.1 8B is a state-of-the-art multilingual language model designed to improve language representation, especially for low-resource languages, through continual pre-training on the Llama 3.1 8B architecture. Leveraging the MaLA Corpus, which spans over 500 languages and is augmented with books, code, instruction data, and papers, EMMA-500 excels in multilingual tasks like commonsense reasoning, machine translation, and text classification.

- Project Website: https://mala-lm.github.io/emma-500-gen2.html
- Paper: https://arxiv.org/abs/2506.00469
- Architecture: built on Llama 3.1 8B, with language adaptation through continual pre-training
- Languages: supports 546 languages, each with substantial training data (over 100k tokens)
- Data Mix: a diverse bilingual mix of text from domains such as code, books, instruction data, and papers
- Total Tokens: 671B

EMMA-500 series:
- 🤗MaLA-LM/emma-500-llama2-7b: CPT model trained on a monolingual data mix in 500+ languages
- 🤗MaLA-LM/emma-500-llama3-8b-mono: CPT model trained on a monolingual data mix in 500+ languages
- 🤗MaLA-LM/emma-500-llama3-8b-bi: CPT model trained on a monolingual data mix in 500+ languages plus bilingual translation data in 2,500+ language pairs
- 🤗MaLA-LM/emma-500-llama3.1-8b-mono: CPT model trained on a monolingual data mix in 500+ languages
- 🤗MaLA-LM/emma-500-llama3.1-8b-bi: CPT model trained on a monolingual data mix in 500+ languages plus bilingual translation data in 2,500+ language pairs

Training corpora:
- MaLA monolingual corpus: 🤗MaLA-LM/mala-monolingual-split
- MaLA bilingual translation corpus: 🤗MaLA-LM/mala-bilingual-translation-corpus
- MaLA code and reasoning corpus: 🤗MaLA-LM/mala-code-reasoning-v2

You can use EMMA-500 for multilingual text generation; see the example after this card.

Use Cases:
- Massively multilingual NLP tasks, e.g., machine translation

Limitations:
- Performance regression on some tasks and for some high-resource languages
- Not suitable for real-world use, especially in high-stakes domains

If you find this model useful, please cite the paper (https://arxiv.org/abs/2506.00469). See also the paper on the earlier EMMA-500 model trained on Llama 2 (🤗MaLA-LM/emma-500-llama2-7b).
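The card above promises a generation example, but the snippet was lost in extraction. Below is a minimal sketch of multilingual text generation with the standard Hugging Face transformers causal-LM API, using this card's repo id; the prompt, dtype, and generation settings are illustrative assumptions, not values prescribed by the authors.

```python
# Hedged sketch: standard transformers causal-LM generation.
# The repo id comes from the card; prompt and settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MaLA-LM/emma-500-llama3.1-8b-bi"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a GPU with bf16 support
    device_map="auto",           # requires the accelerate package
)

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```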

llama • 5,930 downloads • 0 likes

emma-500-llama2-7b

llama • 945 downloads • 15 likes

emma-500-llama3-8b-mono

Massively Multilingual Adaptation of Large Language Models Using Bilingual Translation Data

EMMA-500 Llama 3 8B is a state-of-the-art multilingual language model designed to improve language representation, especially for low-resource languages, through continual pre-training on the Llama 3 8B architecture. Leveraging the MaLA Corpus, which spans over 500 languages and is augmented with books, code, instruction data, and papers, EMMA-500 excels in multilingual tasks like commonsense reasoning, machine translation, and text classification.

- Project Website: https://mala-lm.github.io/emma-500-gen2.html
- Paper: https://arxiv.org/abs/2506.00469
- Architecture: built on Llama 3 8B, with language adaptation through continual pre-training
- Languages: supports 546 languages, each with substantial training data (over 100k tokens)
- Data Mix: a diverse monolingual mix of text from domains such as code, books, instruction data, and papers
- Total Tokens: 419B

EMMA-500 series:
- 🤗MaLA-LM/emma-500-llama2-7b: CPT model trained on a monolingual data mix in 500+ languages
- 🤗MaLA-LM/emma-500-llama3-8b-mono: CPT model trained on a monolingual data mix in 500+ languages
- 🤗MaLA-LM/emma-500-llama3-8b-bi: CPT model trained on a monolingual data mix in 500+ languages plus bilingual translation data in 2,500+ language pairs
- 🤗MaLA-LM/emma-500-llama3.1-8b-mono: CPT model trained on a monolingual data mix in 500+ languages
- 🤗MaLA-LM/emma-500-llama3.1-8b-bi: CPT model trained on a monolingual data mix in 500+ languages plus bilingual translation data in 2,500+ language pairs

Training corpora:
- MaLA monolingual corpus: 🤗MaLA-LM/mala-monolingual-split
- MaLA code and reasoning corpus: 🤗MaLA-LM/mala-code-reasoning-v2

You can use EMMA-500 for multilingual text generation; see the example after this card.

Use Cases:
- Massively multilingual NLP tasks, e.g., machine translation

Limitations:
- Performance regression on some tasks and for some high-resource languages
- Not suitable for real-world use, especially in high-stakes domains

If you find this model useful, please cite the paper (https://arxiv.org/abs/2506.00469). See also the paper on the earlier EMMA-500 model trained on Llama 2 (🤗MaLA-LM/emma-500-llama2-7b).
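As with the card above, the original generation example is missing; this is a minimal hedged sketch for the Llama 3 8B mono checkpoint (standard transformers causal-LM API; prompt and settings are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MaLA-LM/emma-500-llama3-8b-mono"  # repo id from the card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Illustrative prompt; generation settings are assumptions, not prescribed.
inputs = tokenizer("Once upon a time", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```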

llama • 176 downloads • 0 likes

emma-500-llama3.1-8b-mono

Massively Multilingual Adaptation of Large Language Models Using Bilingual Translation Data

EMMA-500 Llama 3.1 8B is a state-of-the-art multilingual language model designed to improve language representation, especially for low-resource languages, through continual pre-training on the Llama 3.1 8B architecture. Leveraging the MaLA Corpus, which spans over 500 languages and is augmented with books, code, instruction data, and papers, EMMA-500 excels in multilingual tasks like commonsense reasoning, machine translation, and text classification.

- Project Website: https://mala-lm.github.io/emma-500-gen2.html
- Paper: https://arxiv.org/abs/2506.00469
- Architecture: built on Llama 3.1 8B, with language adaptation through continual pre-training
- Languages: supports 546 languages, each with substantial training data (over 100k tokens)
- Data Mix: a diverse monolingual mix of text from domains such as code, books, instruction data, and papers
- Total Tokens: 419B

EMMA-500 series:
- 🤗MaLA-LM/emma-500-llama2-7b: CPT model trained on a monolingual data mix in 500+ languages
- 🤗MaLA-LM/emma-500-llama3-8b-mono: CPT model trained on a monolingual data mix in 500+ languages
- 🤗MaLA-LM/emma-500-llama3-8b-bi: CPT model trained on a monolingual data mix in 500+ languages plus bilingual translation data in 2,500+ language pairs
- 🤗MaLA-LM/emma-500-llama3.1-8b-mono: CPT model trained on a monolingual data mix in 500+ languages
- 🤗MaLA-LM/emma-500-llama3.1-8b-bi: CPT model trained on a monolingual data mix in 500+ languages plus bilingual translation data in 2,500+ language pairs

Training corpora:
- MaLA monolingual corpus: 🤗MaLA-LM/mala-monolingual-split
- MaLA code and reasoning corpus: 🤗MaLA-LM/mala-code-reasoning-v2

You can use EMMA-500 for multilingual text generation; see the example after this card.

Use Cases:
- Massively multilingual NLP tasks, e.g., machine translation

Limitations:
- Performance regression on some tasks and for some high-resource languages
- Not suitable for real-world use, especially in high-stakes domains

If you find this model useful, please cite the paper (https://arxiv.org/abs/2506.00469). See also the paper on the earlier EMMA-500 model trained on Llama 2 (🤗MaLA-LM/emma-500-llama2-7b).
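As above, the card's generation example is missing; a minimal hedged sketch for the Llama 3.1 8B mono checkpoint (standard transformers causal-LM API; prompt and settings are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MaLA-LM/emma-500-llama3.1-8b-mono"  # repo id from the card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Illustrative prompt; generation settings are assumptions, not prescribed.
inputs = tokenizer("Once upon a time", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```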

llama • 91 downloads • 0 likes

emma-500-llama3-8b-bi

Massively Multilingual Adaptation of Large Language Models Using Bilingual Translation Data

EMMA-500 Llama 3 8B is a state-of-the-art multilingual language model designed to improve language representation, especially for low-resource languages, through continual pre-training on the Llama 3 8B architecture. Leveraging the MaLA Corpus, which spans over 500 languages and is augmented with books, code, instruction data, and papers, EMMA-500 excels in multilingual tasks like commonsense reasoning, machine translation, and text classification.

- Project Website: https://mala-lm.github.io/emma-500-gen2.html
- Paper: https://arxiv.org/abs/2506.00469
- Architecture: built on Llama 3 8B, with language adaptation through continual pre-training
- Languages: supports 546 languages, each with substantial training data (over 100k tokens)
- Data Mix: a diverse bilingual mix of text from domains such as code, books, instruction data, and papers
- Total Tokens: 671B

EMMA-500 series:
- 🤗MaLA-LM/emma-500-llama2-7b: CPT model trained on a monolingual data mix in 500+ languages
- 🤗MaLA-LM/emma-500-llama3-8b-mono: CPT model trained on a monolingual data mix in 500+ languages
- 🤗MaLA-LM/emma-500-llama3-8b-bi: CPT model trained on a monolingual data mix in 500+ languages plus bilingual translation data in 2,500+ language pairs
- 🤗MaLA-LM/emma-500-llama3.1-8b-mono: CPT model trained on a monolingual data mix in 500+ languages
- 🤗MaLA-LM/emma-500-llama3.1-8b-bi: CPT model trained on a monolingual data mix in 500+ languages plus bilingual translation data in 2,500+ language pairs

Training corpora:
- MaLA monolingual corpus: 🤗MaLA-LM/mala-monolingual-split
- MaLA bilingual translation corpus: 🤗MaLA-LM/mala-bilingual-translation-corpus
- MaLA code and reasoning corpus: 🤗MaLA-LM/mala-code-reasoning-v2

You can use EMMA-500 for multilingual text generation; see the example after this card.

Use Cases:
- Massively multilingual NLP tasks, e.g., machine translation

Limitations:
- Performance regression on some tasks and for some high-resource languages
- Not suitable for real-world use, especially in high-stakes domains

If you find this model useful, please cite the paper (https://arxiv.org/abs/2506.00469). See also the paper on the earlier EMMA-500 model trained on Llama 2 (🤗MaLA-LM/emma-500-llama2-7b).
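As above, the card's generation example is missing; a minimal hedged sketch for the Llama 3 8B bi checkpoint (standard transformers causal-LM API; prompt and settings are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MaLA-LM/emma-500-llama3-8b-bi"  # repo id from the card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Illustrative prompt; generation settings are assumptions, not prescribed.
inputs = tokenizer("Once upon a time", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```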

llama • 77 downloads • 0 likes

lucky52-bloom-7b1-no-32

license:cc-by-nc-4.0 • 10 downloads • 0 likes

lucky52-bloom-7b1-no-14

license:cc-by-nc-4.0 • 8 downloads • 0 likes

lucky52-bloom-7b1-no-4

license:cc-by-nc-4.0 • 7 downloads • 0 likes

lucky52-bloom-7b1-no-46

license:cc-by-nc-4.0 • 6 downloads • 0 likes

lucky52-bloom-7b1-no-7

license:cc-by-nc-4.0 • 5 downloads • 0 likes

lucky52-bloom-7b1-no-31

license:cc-by-nc-4.0 • 4 downloads • 0 likes

lucky52-bloom-7b1-no-6

license:cc-by-nc-4.0 • 3 downloads • 0 likes

lucky52-bloom-7b1-no-8

license:cc-by-nc-4.0 • 3 downloads • 0 likes

lucky52-bloom-7b1-no-10

license:cc-by-nc-4.0 • 3 downloads • 0 likes

lucky52-bloom-7b1-no-18

license:cc-by-nc-4.0 • 3 downloads • 0 likes

lucky52-bloom-7b1-no-2

license:cc-by-nc-4.0 • 2 downloads • 0 likes

lucky52-bloom-7b1-no-5

license:cc-by-nc-4.0 • 2 downloads • 0 likes

lucky52-bloom-7b1-no-13

license:cc-by-nc-4.0 • 2 downloads • 0 likes

lucky52-bloom-7b1-no-17

license:cc-by-nc-4.0 • 2 downloads • 0 likes

lucky52-bloom-7b1-no-19

license:cc-by-nc-4.0 • 2 downloads • 0 likes

lucky52-bloom-7b1-no-27

license:cc-by-nc-4.0 • 2 downloads • 0 likes

lucky52-bloom-7b1-no-30

license:cc-by-nc-4.0 • 2 downloads • 0 likes

lucky52-bloom-7b1-no-36

license:cc-by-nc-4.0 • 2 downloads • 0 likes

lucky52-bloom-7b1-no-37

license:cc-by-nc-4.0 • 2 downloads • 0 likes

lucky52-bloom-7b1-no-39

license:cc-by-nc-4.0 • 2 downloads • 0 likes

lucky52-bloom-7b1-no-41

license:cc-by-nc-4.0 • 2 downloads • 0 likes

lucky52-bloom-7b1-no-1

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-3

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-11

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-12

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-16

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-21

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-26

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-34

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-38

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-42

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-44

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-45

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-49

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-50

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-51

license:cc-by-nc-4.0 • 1 download • 0 likes

lucky52-bloom-7b1-no-52

license:cc-by-nc-4.0 • 1 download • 0 likes

mala-500-10b-v1

base_model:meta-llama/Llama-2-7b-hf • 0 downloads • 59 likes

mala-500-10b-v2

base_model:meta-llama/Llama-2-7b-hf • 0 downloads • 6 likes