mPLUG

21 models • 1 total models in database
TinyChart-3B-768-siglip

912
3

GUI-Owl-7B

GUI-Owl is a model series developed as part of the Mobile-Agent-V3 project. It achieves state-of-the-art performance across a range of GUI automation benchmarks, including ScreenSpot-V2, ScreenSpot-Pro, OSWorld-G, MMBench-GUI, Android Control, Android World, and OSWorld. It can also be instantiated as various specialized agents within the Mobile-Agent-V3 multi-agent framework to accomplish more complex tasks. GitHub repository: https://github.com/X-PLUG/MobileAgent (online demo coming soon). The accompanying script has been validated on an A100 with 96 GB of VRAM. To have GUI-Owl receive more than two images, increase `IMAGELIMITARGS` and reduce `maxpixels`. Citation: If you find our paper and model useful in your research, please cite us.

license:mit
909
47

DocOwl2

license:apache-2.0
606
113

TinyChart-3B-768

518
7

mPLUG-Owl3-7B-240728

license:apache-2.0
419
42

mPLUG-Owl3-2B-241014

license:apache-2.0
382
6

GUI-Owl-1.5-8B-Instruct

license:mit
277
3

mPLUG-Owl3-7B-241101

license:apache-2.0
265
10

GUI-Owl-1.5-32B-Instruct

license:mit
216
4

GUI-Owl-32B

GUI-Owl is a model series developed as part of the Mobile-Agent-V3 project. It achieves state-of-the-art performance across a range of GUI automation benchmarks, including ScreenSpot-V2, ScreenSpot-Pro, OSWorld-G, MMBench-GUI, Android Control, Android World, and OSWorld. It can also be instantiated as various specialized agents within the Mobile-Agent-V3 multi-agent framework to accomplish more complex tasks. GitHub repository: https://github.com/X-PLUG/MobileAgent (online demo coming soon). The accompanying script has been validated on an A100 with 96 GB of VRAM. If you serve GUI-Owl-32B on an H20-3e, you can set MPSIZE=1 for faster inference. To have GUI-Owl receive more than two images, increase `IMAGELIMITARGS` and reduce `maxpixels`. Citation: If you find our paper and model useful in your research, please cite us.

license:mit
208
22

GUI-Owl-1.5-8B-Think

license:mit
123
4

GUI-Owl-1.5-32B-Think

license:mit
122
2

GUI-Owl-1.5-4B-Instruct

license:mit
107
2

GUI-Owl-1.5-2B-Instruct

license:mit
85
4

mPLUG-Owl3-1B-241014

license:apache-2.0
83
2

GUI-Owl-7B-Desktop-RL

GUI-Owl is a model series developed as part of the Mobile-Agent-V3 project. It achieves state-of-the-art performance across a range of GUI automation benchmarks, including ScreenSpot-V2, ScreenSpot-Pro, OSWorld-G, MMBench-GUI, Android Control, Android World, and OSWorld. It can also be instantiated as various specialized agents within the Mobile-Agent-V3 multi-agent framework to accomplish more complex tasks. GitHub repository: https://github.com/X-PLUG/MobileAgent (online demo coming soon). This variant of GUI-Owl is RL-tuned specifically for desktop environments. You can deploy and invoke it using the same methods as GUI-Owl-7B. Limitation: this model is primarily intended for validating improvements and optimizations for environment-rl, so its performance in production scenarios cannot be guaranteed. The accompanying script has been validated on an A100 with 96 GB of VRAM. To have GUI-Owl receive more than two images, increase `IMAGELIMITARGS` and reduce `maxpixels`. Citation: If you find our paper and model useful in your research, please cite us.

license:mit
28
3

UI-S1-7B

license:apache-2.0
26
1

DocOwl1.5-Chat

license:apache-2.0
25
27

DocOwl1.5-Omni

license:apache-2.0
20
17

DocOwl1.5

license:apache-2.0
13
26

DocOwl1.5-stage1

license:apache-2.0
9
12