neulab
codebert-python
This is a `microsoft/codebert-base-mlm` model, trained for 1,000,000 steps (with `batch_size=32`) on **Python** code from the `codeparrot/github-code-clean` dataset, on the masked-language-modeling task. It is intended to be used in CodeBERTScore (https://github.com/neulab/code-bert-score), but can also be used for any other task.
codebert-c
This is a `microsoft/codebert-base-mlm` model, trained for 1,000,000 steps (with `batch_size=32`) on **C** code from the `codeparrot/github-code-clean` dataset, on the masked-language-modeling task. It is intended to be used in CodeBERTScore (https://github.com/neulab/code-bert-score), but can also be used for any other task.
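The cards don't include a snippet, but a minimal CodeBERTScore sketch might look like the following, assuming the `code-bert-score` package's `score` function and its `lang` argument (which selects the matching language-specific checkpoint); the candidate and reference strings are illustrative:

```python
import code_bert_score

# Candidate code produced by some model, and the reference it is scored against.
predictions = ["int add(int a, int b) { return a + b; }"]
references = ["int add(int x, int y) { return x + y; }"]

# lang="c" is assumed to pick the C checkpoint (neulab/codebert-c) internally.
# score returns precision, recall, F1, and F3 tensors, one entry per pair.
precision, recall, f1, f3 = code_bert_score.score(
    cands=predictions, refs=references, lang="c"
)
print(f1)
```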
codebert-javascript
This is a `microsoft/codebert-base-mlm` model, trained for 1,000,000 steps (with `batch_size=32`) on **JavaScript** code from the `codeparrot/github-code-clean` dataset, on the masked-language-modeling task. It is intended to be used in CodeBERTScore (https://github.com/neulab/code-bert-score), but can also be used for any other task.
codebert-java
This is a `microsoft/codebert-base-mlm` model, trained for 1,000,000 steps (with `batch_size=32`) on **Java** code from the `codeparrot/github-code-clean` dataset, on the masked-language-modeling task. It is intended to be used in CodeBERTScore (https://github.com/neulab/code-bert-score), but can also be used for any other task.
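Since these are masked-language-modeling checkpoints, any of them can be exercised directly with the standard `transformers` fill-mask pipeline; a minimal sketch (the model ID and masked snippet below are illustrative, not from the cards):

```python
from transformers import pipeline

# Load one of the language-specific checkpoints; swap in codebert-c,
# codebert-javascript, or codebert-java as needed.
fill_mask = pipeline("fill-mask", model="neulab/codebert-python")

# Ask the model to recover the masked token in a Python snippet.
# <mask> is the mask token of the underlying RoBERTa-style tokenizer.
predictions = fill_mask("def add(a, b):\n    return a <mask> b")
for p in predictions:
    print(p["token_str"], p["score"])
```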
Pangea-7B
Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages 🇪🇹 🇸🇦 🇧🇬 🇧🇩 🇨🇿 🇩🇪 🇬🇷 🇬🇧 🇺🇸 🇪🇸 🇮🇷 🇫🇷 🇮🇪 🇮🇳 🇮🇩 🇳🇬 🇮🇹 🇮🇱 🇯🇵 🇮🇩 🇰🇷 🇳🇱 🇲🇳 🇲🇾 🇳🇴 🇵🇱 🇵🇹 🇧🇷 🇷🇴 🇷🇺 🇱🇰 🇮🇩 🇰🇪 🇹🇿 🇱🇰 🇹🇭 🇹🇷 🇺🇦 🇵🇰 🇻🇳 🇨🇳 🇹🇼

🏠 Homepage | 🤖 Pangea-7B | 📊 PangeaIns | 🧪 PangeaBench | 💻 Github | 📄 Arxiv | 📕 PDF | 🖥️ Demo

- Model: Pangea is a fully open-source multilingual, multimodal, multicultural LLM.
- Date: Pangea-7B was trained in 2024.
- Training Dataset: 6M PangeaIns instructions.
- Architecture: Pangea-7B follows the architecture of LLaVA-NeXT, with a Qwen2-7B-Instruct backbone.

You can either (1) follow the same model-loading procedure as LLaVA-NeXT, as sketched in the Python code below, or (2) use the hf version of Pangea-7B: [Pangea-7B-hf](https://huggingface.co/neulab/Pangea-7B-hf).

Direct Use: First, clone and install LLaVA-NeXT. Then load Pangea-7B through its model builder, along with helper functions for preparing multimodal inputs. Note that this demonstrates multimodal usage; to use the model with text-only inputs, it needs to be reloaded with the corresponding text-only settings (see the sketch below).
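The original card's code blocks were not preserved here; the following is a minimal loading sketch, assuming LLaVA-NeXT's `load_pretrained_model` builder. The model-name string and the `multimodal` flag are assumptions based on LLaVA-NeXT conventions, not copied from the card:

```python
# Prerequisite (per the card): clone and install LLaVA-NeXT, e.g.
#   git clone https://github.com/LLaVA-VL/LLaVA-NeXT && pip install -e LLaVA-NeXT
from llava.model.builder import load_pretrained_model

model_path = "neulab/Pangea-7B"
model_name = "Pangea-7B-qwen"  # assumed: signals the Qwen2-7B-Instruct backbone

# The builder returns the tokenizer, the model, the image processor for
# vision inputs, and the maximum context length.
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path, None, model_name, multimodal=True  # assumed flag for vision support
)

# For text-only inputs the card says the model must be reloaded; presumably
# with the multimodal flag disabled:
# tokenizer, model, _, context_len = load_pretrained_model(
#     model_path, None, model_name, multimodal=False
# )
```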
gpt2-finetuned-wikitext103
Pangea-7B-hf
This is the Hugging Face Transformers version of Pangea-7B; the banner, links, and model details are the same as in the Pangea-7B entry above.

Uses: The hf version is intended so that you can use Pangea-7B with the Hugging Face `generate` function. If you want to use it with the LLaVA-NeXT codebase, please refer to the original checkpoint above.
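Since the card only states that this version works with `generate`, the following text-only sketch is an assumption: it loads the checkpoint through the generic `transformers` LLaVA-NeXT classes, which match the stated architecture, but the exact class and prompt format are not taken from the card:

```python
import torch
from transformers import AutoProcessor, LlavaNextForConditionalGeneration

model_id = "neulab/Pangea-7B-hf"

# Assumed classes: the card states a LLaVA-NeXT architecture, so the generic
# LLaVA-NeXT classes in transformers are used here.
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Text-only prompt; for multimodal use, also pass images=... to the processor.
inputs = processor(
    text="What does 'multilingual' mean?", return_tensors="pt"
).to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```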