neulab

36 models

codebert-python

This is a `microsoft/codebert-base-mlm` model, trained for 1,000,000 steps (with `batch_size=32`) on **Python** code from the `codeparrot/github-code-clean` dataset, on the masked-language-modeling task.

1,469,876
26
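These MLM checkpoints can be queried like any other masked-language model. Below is a minimal sketch using the transformers fill-mask pipeline; the RoBERTa-style `<mask>` token and the example snippet are illustrative assumptions, not part of the model card.

```python
# Minimal sketch: querying the Python MLM checkpoint with the standard
# transformers fill-mask pipeline. Assumes the RoBERTa-style <mask> token
# used by microsoft/codebert-base-mlm; the snippet itself is illustrative.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="neulab/codebert-python")

# Ask the model to fill in a masked Python token.
for pred in fill_mask("def add(a, b):\n    return a <mask> b"):
    print(f"{pred['token_str']!r}  score={pred['score']:.3f}")
```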

codebert-c

This is a `microsoft/codebert-base-mlm` model, trained for 1,000,000 steps (with `batch_size=32`) on **C** code from the `codeparrot/github-code-clean` dataset, on the masked-language-modeling task. It is intended to be used in CodeBERTScore (https://github.com/neulab/code-bert-score) but can be used for any other task; see that repository for more information.

145,731
6
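Since these checkpoints are primarily meant as backbones for CodeBERTScore, a rough usage sketch follows. The `code_bert_score.score(...)` call and its return values are taken from the code-bert-score README and should be checked against it.

```python
# Sketch of the intended downstream use: scoring generated C code against a
# reference with CodeBERTScore (https://github.com/neulab/code-bert-score).
# Assumes the `code-bert-score` package and its `score(cands, refs, lang=...)`
# entry point; verify the exact API against the repository README.
import code_bert_score

candidates = ["int add(int a, int b) { return a + b; }"]
references = ["int add(int x, int y) { return x + y; }"]

# lang="c" is expected to route scoring to the C checkpoint (neulab/codebert-c)
# and to return (precision, recall, F1, F3) tensors, one entry per candidate.
precision, recall, f1, f3 = code_bert_score.score(
    cands=candidates, refs=references, lang="c"
)
print(f"P={precision.item():.3f}  R={recall.item():.3f}  F1={f1.item():.3f}")
```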

codebert-javascript

This is a `microsoft/codebert-base-mlm` model, trained for 1,000,000 steps (with `batch_size=32`) on **JavaScript** code from the `codeparrot/github-code-clean` dataset, on the masked-language-modeling task. It is intended to be used in CodeBERTScore (https://github.com/neulab/code-bert-score) but can be used for any other task; see that repository for more information.

38,094
15

codebert-java

This is a `microsoft/codebert-base-mlm` model, trained for 1,000,000 steps (with `batch_size=32`) on **Java** code from the `codeparrot/github-code-clean` dataset, on the masked-language-modeling task.

4,050
13

Pangea-7B

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

🇪🇹 🇸🇦 🇧🇬 🇧🇩 🇨🇿 🇩🇪 🇬🇷 🇬🇧 🇺🇸 🇪🇸 🇮🇷 🇫🇷 🇮🇪 🇮🇳 🇮🇩 🇳🇬 🇮🇹 🇮🇱 🇯🇵 🇮🇩 🇰🇷 🇳🇱 🇲🇳 🇲🇾 🇳🇴 🇵🇱 🇵🇹 🇧🇷 🇷🇴 🇷🇺 🇱🇰 🇮🇩 🇰🇪 🇹🇿 🇱🇰 🇹🇭 🇹🇷 🇺🇦 🇵🇰 🇻🇳 🇨🇳 🇹🇼

🏠 Homepage | 🤖 Pangea-7B | 📊 PangeaIns | 🧪 PangeaBench | 💻 Github | 📄 Arxiv | 📕 PDF | 🖥️ Demo

- Model: Pangea is a fully open-source multilingual, multimodal, multicultural LLM.
- Date: Pangea-7B was trained in 2024.
- Training Dataset: 6M PangeaIns.
- Architecture: Pangea-7B follows the architecture of LLaVA-NeXT, with a Qwen2-7B-Instruct backbone.

You can either (1) follow the same model-loading procedure as LLaVA-NeXT to load Pangea-7B directly (see the loading sketch after this entry), or (2) use the Hugging Face version of the model, [Pangea-7B-hf](https://huggingface.co/neulab/Pangea-7B-hf).

Direct Use: first clone and install LLaVA-NeXT; Pangea-7B can then be loaded through that codebase, together with a few helper functions for running the model. Note that this covers multimodal usage; to use the model with text-only inputs, it needs to be reloaded accordingly.

license:apache-2.0
1,610
131
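The loading code referenced in the card is not reproduced in this listing. The following is a rough sketch of option (1), loading the checkpoint through the LLaVA-NeXT codebase; the loader call and the `model_name` string are assumptions, so defer to the snippet in the original model card.

```python
# Rough sketch of option (1): loading Pangea-7B through the LLaVA-NeXT codebase.
# Assumes LLaVA-NeXT has been cloned and installed; the model_name hint below is
# an assumption, so defer to the original model card's loading snippet.
from llava.model.builder import load_pretrained_model

tokenizer, model, image_processor, context_len = load_pretrained_model(
    "neulab/Pangea-7B",  # model_path: the checkpoint on the Hub
    None,                # model_base: not needed for a full (non-LoRA) checkpoint
    "Pangea-7B-qwen",    # model_name hint for the loader (assumed)
)
model.eval()
```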

gpt2-finetuned-wikitext103

1,091
2

Pangea-7B-hf

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

🇪🇹 🇸🇦 🇧🇬 🇧🇩 🇨🇿 🇩🇪 🇬🇷 🇬🇧 🇺🇸 🇪🇸 🇮🇷 🇫🇷 🇮🇪 🇮🇳 🇮🇩 🇳🇬 🇮🇹 🇮🇱 🇯🇵 🇮🇩 🇰🇷 🇳🇱 🇲🇳 🇲🇾 🇳🇴 🇵🇱 🇵🇹 🇧🇷 🇷🇴 🇷🇺 🇱🇰 🇮🇩 🇰🇪 🇹🇿 🇱🇰 🇹🇭 🇹🇷 🇺🇦 🇵🇰 🇻🇳 🇨🇳 🇹🇼

🏠 Homepage | 🤖 Pangea-7B | 📊 PangeaIns | 🧪 PangeaBench | 💻 Github | 📄 Arxiv | 📕 PDF | 🖥️ Demo

- Model: Pangea is a fully open-source multilingual, multimodal, multicultural LLM.
- Date: Pangea-7B was trained in 2024.
- Training Dataset: 6M PangeaIns.
- Architecture: Pangea-7B follows the architecture of LLaVA-NeXT, with a Qwen2-7B-Instruct backbone.

Uses: This hf version is intended so that Pangea-7B can be used with the Hugging Face generate function (see the sketch after this entry). To use the model with the LLaVA-NeXT codebase, refer to the original checkpoint.

license:apache-2.0
817
13
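A minimal sketch of running the hf checkpoint with transformers' generate, assuming it is compatible with the LLaVA-NeXT classes in transformers; the ChatML-style prompt below is an assumption, so check the model card for the exact template and preprocessing.

```python
# Minimal sketch: running neulab/Pangea-7B-hf with plain transformers.
# Assumes compatibility with the LLaVA-NeXT classes; the ChatML-style prompt
# is an assumption, so check the model card for the exact template.
import torch
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

model_id = "neulab/Pangea-7B-hf"
processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("example.jpg")
prompt = (
    "<|im_start|>user\n<image>\nWhat is shown in this picture?<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```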

codebert-cpp

449
11

CulturalPangea-7B

license:apache-2.0
228
2

gpt2-med-finetuned-wikitext103

46
0

distilgpt2-finetuned-wikitext103

43
1

SP3F-7B

license:mit
23
2

omnitab-large-finetuned-wtq

22
7

omnitab-large

10
2

omnitab-large-16shot-finetuned-wtq-16shot

5
1

reatt-large-nq

4
1

pangea_checkpoint_1216

3
0

UIX-Qwen2

2
22

UIX-Qwen2-Mind2Web

2
4

omnitab-large-128shot-finetuned-wtq-128shot

2
0

omnitab-large-128shot

1
0

omnitab-large-1024shot

1
0

omnitab-large-1024shot-finetuned-wtq-1024shot

1
0

reatt-large-nq-bioasq

1
0

docprompting-tldr-gpt-neo-125M

1
0

pangea_checkpoint_2432

1
0

pangea_checkpoint_3648

1
0

pangea_checkpoint_4864

1
0

pangea_checkpoint_6080

1
0

pangea_checkpoint_7296

1
0

pangea_checkpoint_8512

1
0

pangea_checkpoint_9728

1
0

docprompting-codet5-python-doc-retriever

0
3

omnitab-large-16shot

0
2

reatt-large-nq-fiqa

0
2

docprompting-tldr-gpt-neo-1.3B

0
1