InstantX

16 models

InstantID

license:apache-2.0
36,815
835

FLUX.1-dev-Controlnet-Union

- [2024/08/26] 🔥 Release of FLUX.1-dev-ControlNet-Union-Pro. Please install from source before the next release. CN-Union and Multi-ControlNets are supported via this PR.

Training a union ControlNet requires a significant amount of computational power. The current release is a first beta checkpoint that may not be fully trained; the fully trained beta version is still in the training process. We have conducted ablation studies that demonstrate the validity of the code. The open-source release of this first beta version is solely to facilitate the rapid growth of the open-source community and the Flux ecosystem, so it is common to encounter bad cases (please accept our apologies). It is worth noting that even a fully trained union model may not perform as well as specialized models, such as pose control; however, as training progresses, the performance of the union model will continue to approach that of the specialized models.

| Control Mode | Description | Current Model Validity |
|:------------:|:-----------:|:----------------------:|
| 0 | canny | 🟢 high |
| 1 | tile  | 🟢 high |
| 2 | depth | 🟢 high |
| 3 | blur  | 🟢 high |
| 4 | pose  | 🟢 high |
| 5 | gray  | 🔴 low  |
| 6 | lq    | 🟢 high |

Resources
- InstantX/FLUX.1-dev-Controlnet-Canny
- InstantX/FLUX.1-dev-Controlnet-Union
- Shakker-Labs/FLUX.1-dev-ControlNet-Depth
- Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro

Acknowledgements
Thanks to zzzzzero for pointing out some bugs in the training.
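The union model selects its conditioning type by an integer index, as in the table above. A minimal lookup sketch (the mode names and indices come from the table; the helper function itself is illustrative and not part of the released code):

```python
# Control-mode indices for FLUX.1-dev-ControlNet-Union, per the table above.
# control_mode_index is an illustrative helper, not part of the release.
CONTROL_MODES = {
    0: "canny",
    1: "tile",
    2: "depth",
    3: "blur",
    4: "pose",
    5: "gray",  # currently low validity
    6: "lq",
}

def control_mode_index(name: str) -> int:
    """Return the integer control mode for a given conditioning type."""
    for idx, mode in CONTROL_MODES.items():
        if mode == name:
            return idx
    raise ValueError(f"unknown control mode: {name!r}")
```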

13,191
463

Qwen-Image-ControlNet-Union

This repository provides a unified ControlNet that supports 4 common control types (canny, soft edge, depth, pose) for Qwen-Image.

Model Cards
- This ControlNet consists of 5 double blocks copied from the pretrained transformer layers.
- We train the model from scratch for 50K steps using a dataset of 10M high-quality general and human images.
- We train at 1328x1328 resolution in BFloat16, batch size 64, learning rate 4e-5. We set the text drop ratio to 0.10.
- This model supports multiple control modes, including canny, soft edge, depth, and pose. You can use it just like a normal ControlNet.

Inference Setting
You can adjust the control strength via `controlnet_conditioning_scale`.
- Canny: use cv2.Canny; set `controlnet_conditioning_scale` in [0.8, 1.0]
- Soft Edge: use AnylineDetector; set `controlnet_conditioning_scale` in [0.8, 1.0]
- Depth: use depth-anything; set `controlnet_conditioning_scale` in [0.8, 1.0]
- Pose: use DWPose; set `controlnet_conditioning_scale` in [0.8, 1.0]

We strongly recommend using detailed prompts, especially when they include text elements. For example, use "a poster with the text 'InstantX Team' on the top" instead of "a poster". For multi-condition inference, please refer to the PR.

ComfyUI Support
ComfyUI offers native support for Qwen-Image-ControlNet-Union. Check the blog for more details.

Community Support
Liblib AI offers native support for Qwen-Image-ControlNet-Union. Visit it for online inference.

Limitations
We find that the model is unable to preserve some details, such as small-font text, without an explicit 'TEXT' element in the prompt.

Acknowledgements
This model is developed by the InstantX Team. All copyright reserved.
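The recommended `controlnet_conditioning_scale` ranges above can be captured in a small helper that clamps a user-supplied value into the card's window. This is a sketch under our own naming (the table of ranges mirrors the card; `clamp_conditioning_scale` is not part of the release):

```python
# Recommended controlnet_conditioning_scale ranges from the model card.
# clamp_conditioning_scale is an illustrative helper, not part of the release.
RECOMMENDED_SCALE = {
    "canny": (0.8, 1.0),      # preprocess with cv2.Canny
    "soft_edge": (0.8, 1.0),  # preprocess with AnylineDetector
    "depth": (0.8, 1.0),      # preprocess with depth-anything
    "pose": (0.8, 1.0),       # preprocess with DWPose
}

def clamp_conditioning_scale(mode: str, scale: float) -> float:
    """Clamp a conditioning scale into the card's recommended range."""
    lo, hi = RECOMMENDED_SCALE[mode]
    return min(max(scale, lo), hi)
```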

license:apache-2.0
8,392
86

SD3.5-Large-IP-Adapter

7,163
114

Qwen-Image-ControlNet-Inpainting

This repository provides a ControlNet that supports mask-based image inpainting and outpainting for Qwen-Image.

Model Cards
- This ControlNet consists of 6 double blocks copied from the pretrained transformer layers.
- We train the model from scratch for 65K steps using a dataset of 10M high-quality general and human images.
- We train at 1328x1328 resolution in BFloat16, batch size 128, learning rate 4e-5. We set the text drop ratio to 0.10.
- This model supports object replacement, text modification, background replacement, and outpainting.

Showcases
You can find more use cases in this blog.

ComfyUI Support
ComfyUI offers native support for Qwen-Image-ControlNet-Inpainting. The official workflow can be found here. Make sure your ComfyUI version is >= 0.3.59.

Community Support
Liblib AI offers native support for Qwen-Image-ControlNet-Inpainting. Visit it for online WebUI or ComfyUI inference.

Limitations
This model is slightly sensitive to user prompts. Using detailed prompts that describe the entire image (both the inpainted area and the background) is highly recommended. Please use descriptive prompts instead of instructive prompts.

Acknowledgements
This model is developed by the InstantX Team. All copyright reserved.
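For the outpainting case, a mask-based pipeline needs a binary mask marking the newly added border around the original image. A minimal NumPy sketch of that mask construction (the array layout and 1-means-generate convention are our assumptions, not the official preprocessing):

```python
import numpy as np

def make_outpaint_mask(h, w, pad_top, pad_bottom, pad_left, pad_right):
    """Binary outpainting mask: 1 where new content should be generated
    (the padded border), 0 over the original image region.
    Layout and polarity conventions are illustrative assumptions."""
    full_h = h + pad_top + pad_bottom
    full_w = w + pad_left + pad_right
    mask = np.ones((full_h, full_w), dtype=np.uint8)
    mask[pad_top:pad_top + h, pad_left:pad_left + w] = 0
    return mask
```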

license:apache-2.0
5,396
75

FLUX.1-dev-Controlnet-Canny

3,455
191

FLUX.1-dev-IP-Adapter

This repository contains an IP-Adapter for the FLUX.1-dev model, released by researchers from the InstantX Team. Image prompts work just like text prompts, so an image may be unresponsive or may interfere with other text; we nonetheless hope you enjoy this model. Have fun, and share your creative works with us on Twitter.

Model Card
This is a regular IP-Adapter, where the new layers are added into the 38 single and 19 double blocks. We use google/siglip-so400m-patch14-384 to encode images for its superior performance, and adopt a simple MLPProjModel of 2 linear layers for projection. The image token number is set to 128. The currently released model is trained on a 10M open-source dataset with a batch size of 128 for 80K training steps.

Showcases (LoRA)
We adopt Shakker-Labs/FLUX.1-dev-LoRA-collections as a character LoRA and use its default prompt.

Inference
The code has not been integrated into diffusers yet; please use our local files for the moment.

Online Inference
You can also enjoy this model at Shakker AI.

Limitations
This model supports image reference, but it is not designed for fine-grained style transfer or character consistency: there is a trade-off between content leakage and style transfer. We do not find in FLUX.1-dev (DiT-based) the properties observed in InstantStyle (UNet-based). It may take several attempts to get satisfying results. Furthermore, the currently released model may suffer from limited diversity and thus cannot cover some styles or concepts.

License
The model is released under the flux-1-dev-non-commercial-license. All copyright reserved.

Acknowledgements
This project is sponsored by HuggingFace, fal.ai and Shakker Labs.

Citation
If you find this project useful in your research, please cite us via
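The card describes the projector as a simple MLPProjModel of two linear layers mapping SigLIP image features to 128 image tokens. A NumPy sketch of that shape of projection (the hidden size, output dimension, and GELU nonlinearity are our assumptions; the actual module lives in the repository's local files):

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU (activation choice is an assumption)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def mlp_proj(image_embed, w1, b1, w2, b2, num_tokens=128):
    """Two-linear-layer projection: a SigLIP image embedding is mapped to
    num_tokens image tokens. All dimensions here are illustrative."""
    h = gelu(image_embed @ w1 + b1)      # (d_in,) -> (hidden,)
    out = h @ w2 + b2                    # (hidden,) -> (num_tokens * d_model,)
    return out.reshape(num_tokens, -1)   # (num_tokens, d_model)
```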

2,413
307

flux-dev-de-distill-diffusers

660
22

SD3-Controlnet-Canny

539
119

SD3-Controlnet-Pose

452
55

FLUX.1-dev-LoRA-Ghibli

379
17

SD3-Controlnet-Tile

337
51

CSGO

license:apache-2.0
170
37

FLUX.1-dev-LoRA-Makoto-Shinkai

148
15

SD3-Controlnet-Depth

license:apache-2.0
59
8

InstantIR

license:apache-2.0
2
180