yam-peleg
Hebrew-Mistral-7B
Hebrew-Mistral-7B is an open-source Large Language Model (LLM) with 7 billion parameters, pretrained on Hebrew and English text, based on Mistral-7B-v0.1 from Mistral AI. It has an extended Hebrew tokenizer with 64,000 tokens and is continuously pretrained from Mistral-7B on tokens in both English and Hebrew. The resulting model is a powerful general-purpose language model suitable for a wide range of natural language processing tasks, with a focus on Hebrew language understanding and generation.

Below are some code snippets to get started quickly with running the model. First make sure to `pip install -U transformers`, then copy the snippet from the section that is relevant for your use case.

Hebrew-Mistral-7B is a pretrained base model and therefore does not have any moderation mechanisms.

Authors
- Trained by Yam Peleg.
- In collaboration with Jonathan Rouach and Arjeo, inc.
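A minimal sketch of running the model with the `transformers` library; the repository id `yam-peleg/Hebrew-Mistral-7B` is assumed from the model name, and the Hebrew prompt is only an illustration:

```python
# Minimal sketch: load the base model and generate a continuation.
# Repository id assumed from the model name; adjust if it differs.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yam-peleg/Hebrew-Mistral-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" places the weights on available GPUs/CPU automatically.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# This is a base (non-instruct) model: give it a prefix and let it continue.
inputs = tokenizer("שלום! קוראים לי", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since there is no instruction tuning, phrase inputs as text to be continued rather than as questions or commands.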
Hebrew-Gemma-11B-V2
Hebrew-Gemma-11B-Instruct
Base Models:
- 07.03.2024: Hebrew-Gemma-11B
- 16.03.2024: Hebrew-Gemma-11B-V2

Instruct Models:
- 07.03.2024: Hebrew-Gemma-11B-Instruct

The Hebrew-Gemma-11B-Instruct Large Language Model (LLM) is an instruction fine-tuned version of the Hebrew-Gemma-11B generative text model, trained on a variety of conversation datasets. Hebrew-Gemma-11B is a continued pretraining of gemma-7b, extended to a larger scale and trained on 3B additional tokens of both English and Hebrew text data.

The chat format must be strictly respected, otherwise the model will generate sub-optimal outputs.
- The conversation starts with `<bos>`.
- Each turn is preceded by a `<start_of_turn>` delimiter and then the role of the entity (`user` or `model`).
- Turns finish with the `<end_of_turn>` token.
- The conversation finishes with the `<eos>` token.

You can follow this format to build the prompt manually if you need to do it without the tokenizer's chat template, or simply use `tokenizer.apply_chat_template`.

As an extension of Gemma-7B, this model is subject to the original license and terms of use by Google. Hebrew-Gemma-11B is a pretrained base model and therefore does not have any moderation mechanisms.

- Trained by Yam Peleg.
- In collaboration with Jonathan Rouach and Arjeo, inc.
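The turn format described above can be sketched as a small helper that builds the prompt string manually. The token names follow the standard Gemma convention (`<bos>`, `<start_of_turn>`, `<end_of_turn>`) assumed from the Gemma base model; the function name is hypothetical:

```python
# Hypothetical helper illustrating the Gemma chat format described above.
# Token names follow the standard Gemma convention.
def build_gemma_prompt(turns):
    """turns: list of (role, text) pairs, role in {"user", "model"}."""
    prompt = "<bos>"
    for role, text in turns:
        # Each turn: <start_of_turn> delimiter, role, text, <end_of_turn>.
        prompt += f"<start_of_turn>{role}\n{text}<end_of_turn>\n"
    # Leave an open model turn for the model to complete.
    prompt += "<start_of_turn>model\n"
    return prompt

print(build_gemma_prompt([("user", "שלום, מה שלומך?")]))
```

In practice, prefer `tokenizer.apply_chat_template(messages, add_generation_prompt=True)`, which emits the same structure and stays in sync with the tokenizer's configuration.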
Hebrew-Mistral-7B-200K
> Please note: There have been some issues reported about this model; updates are coming soon.

Hebrew-Mistral-7B-200K is an open-source Large Language Model (LLM) with 7 billion parameters and a 200K-token context length, pretrained on Hebrew and English text, based on Mistral-7B-v0.1 from Mistral AI. It has an extended Hebrew tokenizer with 64,000 tokens and is continuously pretrained from Mistral-7B on tokens in both English and Hebrew. The resulting model is a powerful general-purpose language model suitable for a wide range of natural language processing tasks, with a focus on Hebrew language understanding and generation.

Below are some code snippets to get started quickly with running the model. First make sure to `pip install -U transformers`, then copy the snippet from the section that is relevant for your use case.

Hebrew-Mistral-7B-200K is a pretrained base model and therefore does not have any moderation mechanisms.
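A minimal sketch for the long-context variant, mirroring the basic `transformers` usage; the repository id `yam-peleg/Hebrew-Mistral-7B-200K` is assumed from the model name, and the document variable is a placeholder for your own long input:

```python
# Minimal sketch for the 200K-context variant; repository id assumed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yam-peleg/Hebrew-Mistral-7B-200K"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Long contexts need substantial memory for the KV cache;
# device_map="auto" spreads the weights across available devices.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

long_hebrew_document = "טקסט ארוך בעברית..."  # placeholder for your document
inputs = tokenizer(long_hebrew_document, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
# Decode only the newly generated tokens, not the (long) input.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```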