YanweiLi

22 models • 2 total models in database

llama-vid-7b-full-224-long-video

llama
46
17

llama-vid-13b-full-224-video-fps-1

llama
29
2

llama-vid-7b-full-224-video-fps-1

llama
16
9

MGM-7B-HD

Model details
The framework supports a series of dense and MoE Large Language Models (LLMs) from 2B to 34B with HD image understanding, reasoning, and generation simultaneously.

Normal resolution setting: MGM-2B, MGM-7B, MGM-13B, MGM-8x7B, MGM-34B
High resolution setting: MGM-13B-HD, MGM-8x7B-HD, MGM-34B-HD

Model type: MGM is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It empowers existing frameworks to support HD image understanding, reasoning, and generation simultaneously.

Model version: MGM HD version with LLM Vicuna-7B-v1.5

License: Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.

Where to send questions or comments about the model: https://github.com/dvlab-research/MGM/issues

Intended use
Primary intended uses: The primary use is research on large multimodal models and chatbots.
Primary intended users: The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.

Training data
This model is trained on the MGM-Instruction dataset; please refer to the GitHub repository for more details.

Acknowledgement
This project is not affiliated with Google LLC.

llama
11
31

llama-vid-7b-pretrain-224-video-fps-1

llama
10
2

MGM-2B

Model details
The framework supports a series of dense and MoE Large Language Models (LLMs) from 2B to 34B with HD image understanding, reasoning, and generation simultaneously.

You can also try our other MGM series models:
Normal resolution setting: MGM-7B, MGM-13B, MGM-8x7B, MGM-34B
High resolution setting: MGM-7B-HD, MGM-13B-HD, MGM-8x7B-HD, MGM-34B-HD

Model type: MGM is an open-source chatbot trained by fine-tuning Gemma on GPT-generated multimodal instruction-following data. It empowers existing frameworks to support HD image understanding, reasoning, and generation simultaneously.

License: Gemma is licensed under the Gemma Terms of Use.

Where to send questions or comments about the model: https://github.com/dvlab-research/MGM/issues

Intended use
Primary intended uses: The primary use is research on large multimodal models and chatbots.
Primary intended users: The primary intended users of the model are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.

Training data
This model is trained on the MGM-Instruction dataset; please refer to the GitHub repository for more details.

Acknowledgement
This project is not affiliated with Google LLC.

9
21

MGM-8x7B-HD

8
9

MGM-34B

llama
6
9

MGM-7B

llama
5
8

MGM-8x7B

5
7

llama-vid-7b-full-224

llama
5
1

MGM-8B

llama
5
1

llama-vid-13b-full-336

llama
4
0

llama-vid-7b-pretrain-224

llama
3
0

llama-vid-13b-pretrain-224-video-fps-1

llama
3
0

MGM-13B-HD

llama
2
13

MGM-13B

llama
2
1

llama-vid-7b-pretrain-336

llama
2
0

llama-vid-7b-full-336

llama
2
0

MGM-34B-HD

llama
1
21

MGM-8B-HD

llama
1
6

MGM-Pretrain

0
1