zhengchong
CatVTON
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models

CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters), and 3) Simplified Inference.

For problems under Windows OS, please refer to issue#8.

> When you run the CatVTON workflow for the first time, the weight files will be downloaded automatically, which usually takes tens of minutes.

To deploy the Gradio App for CatVTON on your machine, run the following command; checkpoints will be downloaded automatically from HuggingFace. When using `bf16` precision, generating results at a resolution of `1024x768` requires only about `8G` VRAM.

Inference

1. Data Preparation

Before inference, you need to download the VITON-HD or DressCode dataset. Once the datasets are downloaded, the folder structures should look like these:

For the DressCode dataset, we provide a script to preprocess agnostic masks; run the following command:

2. Inference on VITON-HD/DressCode

To run inference on the DressCode or VITON-HD dataset, run the following command; checkpoints will be downloaded automatically from HuggingFace.

After obtaining the inference results, calculate the metrics using the following command:

- `--gt_folder` and `--pred_folder` should be folders that contain only images.
- To evaluate the results in a paired setting, use `--paired`; for an unpaired setting, simply omit it.
- `--batch_size` and `--num_workers` should be adjusted based on your machine.

Acknowledgement

Our code is modified based on Diffusers. We adopt Stable Diffusion v1.5 inpainting as the base model. We use SCHP and DensePose to automatically generate masks in our Gradio App and ComfyUI workflow. Thanks to all the contributors!

License

All the materials, including code, checkpoints, and demo, are made available under the Creative Commons BY-NC-SA 4.0 license.
You are free to copy, redistribute, remix, transform, and build upon the project for non-commercial purposes, as long as you give appropriate credit and distribute your contributions under the same license.
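
Returning to the evaluation step described earlier: because the ground-truth and prediction folders are expected to contain only images, a quick pre-check before computing metrics can catch stray files early. The following is a minimal sketch and not part of the CatVTON code base; the helper names `non_image_entries` and `check_eval_folders` and the extension list are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust it to the formats your results use.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def non_image_entries(folder):
    """Return the sorted names of entries in `folder` that are not image files.

    Subdirectories count as offending entries, since the folder is
    expected to hold images only.
    """
    return sorted(
        entry.name
        for entry in Path(folder).iterdir()
        if not (entry.is_file() and entry.suffix.lower() in IMAGE_EXTENSIONS)
    )


def check_eval_folders(gt_folder, pred_folder):
    """Map each folder path to its offending entries.

    An empty dict means both folders contain only image files.
    """
    problems = {}
    for folder in (gt_folder, pred_folder):
        bad = non_image_entries(folder)
        if bad:
            problems[str(folder)] = bad
    return problems
```

Calling `check_eval_folders(gt_dir, pred_dir)` with your own two paths and confirming that it returns an empty dict is a cheap sanity check before launching the metric computation.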
Human-Toolkit
CatV2TON
CatVTON-MaskFree
FastFit-SR-1024
FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models

FastFit is a diffusion-based framework optimized for high-speed, multi-reference virtual try-on. It enables simultaneous try-on of multiple fashion items (such as tops, bottoms, dresses, shoes, and bags) on a single person. The framework leverages reference KV caching during inference to significantly accelerate generation.

Updates

- `2025/08/29`: We release the arXiv paper of FastFit!
- `2025/08/06`: We release the code for inference and evaluation on the DressCode-MR, DressCode, and VITON-HD test datasets.
- `2025/08/05`: We release the ComfyUI workflow for FastFit!
- `2025/08/04`: Our Gradio demo is online with Chinese & English support! The code of the demo is also released in app.py.
- `2025/07/03`: We release the weights of the FastFit-MR and FastFit-SR models on Hugging Face!
- `2025/06/24`: We release the DressCode-MR dataset with 28K+ multi-reference virtual try-on samples on Hugging Face!

DressCode-MR is constructed based on the DressCode dataset and contains 28K+ multi-reference virtual try-on samples.

- Multi-reference Samples: Each sample comprises a person's image paired with a set of compatible clothing and accessory items: tops, bottoms, dresses, shoes, and bags.
- Large Scale: Contains a total of 28,179 high-quality multi-reference samples, with 25,779 for training and 2,400 for testing.

DressCode-MR is released under the exact same license as the original DressCode dataset. Therefore, before requesting access to the DressCode-MR dataset, you must complete the following steps:

1. Apply for and be granted a license to use the DressCode dataset.
2. Use your educational/academic email address (e.g., one ending in .edu, .ac, etc.) to request access to DressCode-MR on Hugging Face. (Any requests from non-academic email addresses will be rejected.)

1. Clone the FastFit repository into your `ComfyUI/custom_nodes/` directory.
4. Restart ComfyUI.
5. Drag and drop the `fastfit_workflow.json` file onto the ComfyUI web interface.

The model weights will be automatically downloaded from Hugging Face when you run the demo.

To perform inference on the DressCode-MR, DressCode, or VITON-HD test datasets, use the `infer_datasets.py` script, for example:

- `--dataset`: Specify the target dataset. Choose from `dresscode-mr`, `dresscode`, or `viton-hd`.
- `--data_dir`: The root directory path for the specified dataset.
- `--paired`: Include this flag to run inference in the paired setting. Omit this flag for the unpaired setting.

By default, inference results will be saved to the `results/` directory at the project root.

After inference, use the `eval.py` script to calculate the evaluation metrics:

- `--gt_folder`: The directory path containing the ground truth images.
- `--pred_folder`: The directory path containing the generated (predicted) images from the inference step.
- `--paired`: Include this flag to evaluate results from the paired setting. Omit this flag for the unpaired setting.

Acknowledgement

Our code is modified based on Diffusers. We adopt Stable Diffusion v1.5 inpainting as the base model. We use a modified AutoMasker to automatically generate masks in our Gradio App and ComfyUI workflow. Thanks to all the contributors!

All weights, parameters, and code related to FastFit are governed by the FastFit Non-Commercial License. For commercial collaboration, please contact LavieAI or LoomlyAI.
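
To make the idea of reference KV caching mentioned in the overview more concrete, here is a toy, pure-Python sketch. It is not FastFit's implementation. It assumes that the reference (garment) tokens do not depend on the denoising timestep, so their key/value projections can be computed once and reused at every step. The names `ReferenceKVCache` and `denoise_step`, the identity projection matrices, and the `0.1` update rule are all hypothetical stand-ins for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention(q, k, v):
    """Scaled dot-product attention over lists of row vectors."""
    scale = math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        scores = [sum(a * b for a, b in zip(q_row, k_row)) / scale for k_row in k]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v_row[j] for w, v_row in zip(weights, v)) / total
            for j in range(len(v[0]))
        ])
    return out


class ReferenceKVCache:
    """Projects reference tokens to keys/values once, then reuses them.

    Assumption: the reference tokens stay fixed across denoising steps,
    so the cache never needs to be invalidated.
    """

    def __init__(self, w_k, w_v):
        self.w_k, self.w_v = w_k, w_v
        self.projections = 0  # how many times the reference was projected
        self._kv = None

    def get(self, ref_tokens):
        if self._kv is None:
            self._kv = (matmul(ref_tokens, self.w_k), matmul(ref_tokens, self.w_v))
            self.projections += 1
        return self._kv


def denoise_step(person, ref_tokens, w_q, cache):
    """One toy step: person tokens attend to themselves plus cached reference K/V."""
    ref_k, ref_v = cache.get(ref_tokens)
    q = matmul(person, w_q)
    k = matmul(person, cache.w_k) + ref_k  # row-wise concatenation
    v = matmul(person, cache.w_v) + ref_v
    update = attention(q, k, v)
    return [[p + 0.1 * u for p, u in zip(p_row, u_row)]
            for p_row, u_row in zip(person, update)]


identity = [[1.0, 0.0], [0.0, 1.0]]
cache = ReferenceKVCache(identity, identity)
person = [[0.5, -0.2], [0.1, 0.3]]
reference = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
for _ in range(4):
    person = denoise_step(person, reference, identity, cache)
print(cache.projections)  # the reference is projected once across all 4 steps
```

In this sketch only the person tokens are re-projected at each step; the saving grows with the number of reference items and denoising steps, which is the intuition behind caching reference keys and values.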