mPLUG
TinyChart-3B-768-siglip
GUI-Owl-7B
GUI-Owl is a model series developed as part of the Mobile-Agent-V3 project. It achieves state-of-the-art performance across a range of GUI automation benchmarks, including ScreenSpot-V2, ScreenSpot-Pro, OSWorld-G, MMBench-GUI, Android Control, Android World, and OSWorld. It can also be instantiated as various specialized agents within the Mobile-Agent-V3 multi-agent framework to accomplish more complex tasks.
Paper: Paper Link
GitHub Repository: https://github.com/X-PLUG/MobileAgent
Online Demo: Coming soon
This script has been validated on an A100 with 96 GB of VRAM. To have GUI-Owl receive more than two images, increase `IMAGELIMITARGS` and reduce `maxpixels`.
Citation: If you find our paper and model useful in your research, please consider citing us.
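The note above about accepting more than two images describes a trade-off: more images per request, fewer pixels per image. A minimal sketch of that arithmetic follows, assuming memory use scales roughly with the total number of pixels per request. The function `scaled_max_pixels`, its parameter names, and the starting values are hypothetical illustrations and are not taken from the GUI-Owl script.

```python
def scaled_max_pixels(current_max_pixels: int, current_image_limit: int,
                      new_image_limit: int) -> int:
    """Return a per-image pixel cap that keeps the total pixel budget constant.

    Assumption (not from the model card): the request budget is approximately
    current_max_pixels * current_image_limit, so raising the image limit
    means lowering the per-image cap proportionally.
    """
    if current_image_limit <= 0 or new_image_limit <= 0:
        raise ValueError("image limits must be positive")
    total_budget = current_max_pixels * current_image_limit
    return total_budget // new_image_limit


# Hypothetical starting point: 2 images at up to 1,000,000 pixels each.
new_cap = scaled_max_pixels(1_000_000, current_image_limit=2, new_image_limit=4)
print(new_cap)
```

Under these assumptions, doubling the image limit halves the per-image cap. The right values for a given GPU still need to be checked empirically.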
DocOwl2
TinyChart-3B-768
mPLUG-Owl3-7B-240728
mPLUG-Owl3-2B-241014
GUI-Owl-1.5-8B-Instruct
mPLUG-Owl3-7B-241101
GUI-Owl-1.5-32B-Instruct
GUI-Owl-32B
GUI-Owl is a model series developed as part of the Mobile-Agent-V3 project. It achieves state-of-the-art performance across a range of GUI automation benchmarks, including ScreenSpot-V2, ScreenSpot-Pro, OSWorld-G, MMBench-GUI, Android Control, Android World, and OSWorld. It can also be instantiated as various specialized agents within the Mobile-Agent-V3 multi-agent framework to accomplish more complex tasks.
Paper: Paper Link
GitHub Repository: https://github.com/X-PLUG/MobileAgent
Online Demo: Coming soon
This script has been validated on an A100 with 96 GB of VRAM. If you serve GUI-Owl-32B on an H20-3e, you can set MPSIZE=1 for faster inference. To have GUI-Owl receive more than two images, increase `IMAGELIMITARGS` and reduce `maxpixels`.
Citation: If you find our paper and model useful in your research, please consider citing us.
GUI-Owl-1.5-8B-Think
GUI-Owl-1.5-32B-Think
GUI-Owl-1.5-4B-Instruct
GUI-Owl-1.5-2B-Instruct
mPLUG-Owl3-1B-241014
GUI-Owl-7B-Desktop-RL
GUI-Owl is a model series developed as part of the Mobile-Agent-V3 project. It achieves state-of-the-art performance across a range of GUI automation benchmarks, including ScreenSpot-V2, ScreenSpot-Pro, OSWorld-G, MMBench-GUI, Android Control, Android World, and OSWorld. It can also be instantiated as various specialized agents within the Mobile-Agent-V3 multi-agent framework to accomplish more complex tasks.
Paper: Paper Link
GitHub Repository: https://github.com/X-PLUG/MobileAgent
Online Demo: Coming soon
This is a variant of GUI-Owl that is RL-tuned specifically for desktop environments. You can deploy and invoke the model using the same methods as GUI-Owl-7B.
Limitation: This model is primarily intended for validating improvements and optimizations for environment-rl, so its performance in production scenarios cannot be guaranteed.
This script has been validated on an A100 with 96 GB of VRAM. To have GUI-Owl receive more than two images, increase `IMAGELIMITARGS` and reduce `maxpixels`.
Citation: If you find our paper and model useful in your research, please consider citing us.