ussoewwin
WAN2.2_14B_GGUF
- `wan2.2i2vhighnoise14Bfp16.gguf`: I2V high-noise model in FP16 format (not quantized)
- `wan2.2i2vlownoise14Bfp16.gguf`: I2V low-noise model in FP16 format (not quantized)
- `wan2.2t2vhighnoise14Bfp16.gguf`: T2V high-noise model in FP16 format (not quantized)
- `wan2.2t2vlownoise14Bfp16.gguf`: T2V low-noise model in FP16 format (not quantized)

Important: These are NOT quantized models but FP16 precision models in GGUF container format.

- Base model (I2V): Wan-AI/Wan2.2-I2V-A14B
- Base model (T2V): Wan-AI/Wan2.2-T2V-A14B
- Format: GGUF container with FP16 precision (unquantized)
- Original model size: ~27B parameters (14B active per step)
- File sizes:
  - High: 28.6 GB for FP16 (SHA256: 3a7d4e...)
  - Low: 28.6 GB (SHA256: 1b4e28...)

While GGUF is typically used for quantized models, the ComfyUI-GGUF extension supports:
- Loading FP16 models in GGUF container format
- This provides compatibility with ComfyUI workflows
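Since these files are plain GGUF containers, a quick header check can confirm that a (large) download is intact before loading it into ComfyUI. The sketch below follows the published GGUF container layout (magic `GGUF`, a uint32 version, then uint64 tensor and metadata counts, all little-endian); the function name is my own.

```python
import struct

def read_gguf_header(data: bytes):
    """Parse the fixed-size GGUF header: magic, version, tensor count, metadata KV count."""
    magic, version, n_tensors, n_kv = struct.unpack("<4sIQQ", data[:24])
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return version, n_tensors, n_kv

# Example with a synthetic header (version 3, 2 tensors, 5 metadata entries):
header = struct.pack("<4sIQQ", b"GGUF", 3, 2, 5)
print(read_gguf_header(header))  # → (3, 2, 5)
```

In practice you would read the first 24 bytes of the `.gguf` file and pass them to `read_gguf_header`.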
Wan2.2_T2V_A14B_VACE-test_fp16_GGUF
Wan2.2 T2V A14B VACE FP16 GGUF Models (High & Low Noise)

This GGUF conversion is based on lym00/Wan2.2T2VA14BVACE-test, which is explicitly labeled as "intended for experimental use only" by its creator. While the underlying Wan2.2 model is licensed under Apache 2.0 (permitting commercial use), this specific configuration has known limitations:

- Legal status: The Apache 2.0 license allows commercial use of the generated content
- Technical limitations: This is an experimental integration of Wan2.2 T2V A14B with VACE scopes
- Known issue: Color shifting problems may occur (as documented in the original model)
- Stability: Not recommended for production environments without thorough testing

Files:
- `Wan2.2T2VHighNoise14BVACEfp16.gguf` — High-noise model (used for initial denoising steps)
- `wan2.2t2vlownoise14Bfp16.gguf` — Low-noise model (used for detail refinement)

Installation:
1. Download both GGUF files and place them in `ComfyUI/models/unet/`
2. Install the ComfyUI-GGUF extension
3. Restart ComfyUI

Usage:
1. Load the workflow file included in this repository (drag and drop into ComfyUI)
2. The workflow will automatically use:
   - The high-noise model for initial denoising steps (first 2–4 steps)
   - The low-noise model for final detail refinement (remaining steps)

Important: These are NOT quantized models but FP16 precision models in GGUF container format.
Technical details:
- Base model: lym00/Wan2.2T2VA14BVACE-test
- Original model: Combination of Wan2.2 T2V A14B and VACE scopes
- Format: GGUF container with FP16 precision (unquantized)
- Model size: ~27B parameters (14B active per step)
- File sizes:
  - High: 34.7 GB
  - Low: 34.7 GB

While GGUF is typically used for quantized models, ComfyUI-GGUF supports:
- Loading FP16 models in GGUF format
- Full compatibility with ComfyUI workflows
- Twice the file size of quantized models, but maximum quality

Wan2.2 uses a Mixture-of-Experts (MoE) architecture:
- High-noise expert: Used for early denoising; focuses on layout and motion
- Low-noise expert: Used later for refining textures and details
- The transition point is determined by the signal-to-noise ratio (SNR)

This model incorporates VACE (Video Aesthetic Control Embedding):
- Enhances cinematic-level aesthetics
- Allows fine control over lighting, composition, contrast, and color tone
- Enables more controllable cinematic-style generation

Caveats:
1. Color shifting issue:
   - Same issue as in the original lym00 model
   - The VACE team is reportedly working on a fix (Banodoco Discord)
   - Avoid for applications requiring color accuracy
2. Experimental status:
   - Some features may not work as expected
   - Output quality can vary
3. Commercial use recommendations:
   - Allowed under Apache 2.0
   - Test thoroughly before commercial deployment
   - Consider the official Wan-AI/Wan2.2-T2V-A14B for production
4. Legal disclaimer:
   - You are fully responsible for compliance with laws and ethical use

Related models:
- Wan2.2 T2V A14B — Text-to-Video MoE model supporting 480p & 720p
- VACE — Video Aesthetic Control Embedding from Wan2.1

Features:
- Effective MoE separation of denoising steps
- Cinematic-level control over visuals
- High-definition motion generation at 720p@24fps on consumer GPUs

License: Same Apache 2.0 terms as the original model. Commercial use is allowed, but stability issues mean testing is strongly advised.
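The high-/low-noise split described above can be illustrated with a toy scheduler. This is my own simplification: the real workflow switches experts based on SNR, which is approximated here by a fixed boundary fraction of the step count.

```python
def expert_schedule(total_steps: int, boundary: float = 0.2) -> list:
    """Assign each denoising step to the high- or low-noise expert.

    The first `boundary` fraction of steps (the early, high-noise region)
    goes to the high-noise model; the rest goes to the low-noise model.
    """
    switch = max(1, round(total_steps * boundary))
    return ["high" if i < switch else "low" for i in range(total_steps)]

print(expert_schedule(10))
# → ['high', 'high', 'low', 'low', 'low', 'low', 'low', 'low', 'low', 'low']
```

With the default boundary, a 10-step run matches the "first 2–4 steps on the high-noise model" pattern used by the bundled workflow.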
FakeVace2.2_fp16_GGUF
Flash-Attention-2_for_Windows
Sage-Attention-for-Windows
Hybrid-Sensitivity-Weighted-Quantization-SDXL-fp8e4m3
Nunchaku-R128-SDXL-Series
HSWQ-Z-Image-fp8e4m3
nunchaku-build-on-cu130-windows
Naturally, PyTorch on the ComfyUI side also requires version 2.9.0+cu130.
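To confirm the installed PyTorch matches, the CUDA tag can be read from the version string. A minimal sketch (the function is my own; in practice you would pass `torch.__version__`):

```python
def cuda_tag(version: str) -> str:
    """Extract the local build tag (e.g. 'cu130') from a PyTorch version string."""
    return version.split("+", 1)[1] if "+" in version else ""

print(cuda_tag("2.9.0+cu130"))  # → cu130
```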
xformers-build-on-cu130
15 November 2025: updated with the newest wheel, built on 0.0.33.post1.
Onnxruntime Gpu 1.24.0
ONNX Runtime GPU 1.24.0 - CUDA 13.0 Build with Blackwell Support

Custom-built ONNX Runtime GPU 1.24.0 for Windows with full CUDA 13.0 and Blackwell architecture (sm120) support. This build addresses the `cudaErrorNoKernelImageForDevice` error that occurs with the RTX 5060 Ti and other Blackwell-generation GPUs when using official PyPI distributions.

Build environment:
- OS: Windows 10/11 x64
- CUDA Toolkit: 13.0
- cuDNN: 9.13 (CUDA 13.0 compatible)
- Visual Studio: 2022 (v17.x) with "Desktop development with C++"
- Python: 3.13
- CMake: 3.26+

Supported GPU architectures:
- sm89: Ada Lovelace (RTX 4060, 4070, 4090, etc.)
- sm90: Hopper (H100)
- sm120: Blackwell (RTX 5060 Ti, 5080, 5090)

Note: Flash Attention is disabled because ONNX Runtime 1.24.0's Flash Attention kernels are sm80-specific and incompatible with the sm90/sm120 architectures.

✅ Blackwell GPU support: Full compatibility with RTX 5060 Ti, 5080, 5090
✅ CUDA 13.0 optimized: Built with the latest CUDA toolkit for optimal performance
✅ Multi-architecture: A single build supports Ada Lovelace, Hopper, and Blackwell
✅ Stable for inference: Tested with WD14Tagger and Stable Diffusion pipelines
⚠️ Flash Attention disabled: Due to the sm80-only kernel implementation in ONNX Runtime 1.24.0, Flash Attention is not available. This has minimal impact on most inference workloads (e.g., WD14Tagger, image generation models).
⚠️ Windows only: This build is specifically for Windows x64. Linux users should build from source with a similar configuration.
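Under the environment above, a build invocation along the following lines reproduces this configuration. This is a sketch only: the cuDNN path is a placeholder, and the flags should be checked against the ONNX Runtime build documentation for 1.24.0 before use.

```shell
:: From the onnxruntime source checkout (tag v1.24.0), in a VS2022 x64 prompt.
:: CMAKE_CUDA_ARCHITECTURES targets sm89/sm90/sm120; Flash Attention is
:: disabled because its kernels are sm80-only in this release.
.\build.bat --config Release --build_wheel --parallel ^
  --use_cuda --cuda_version 13.0 ^
  --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0" ^
  --cudnn_home "C:\path\to\cudnn" ^
  --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES="89;90;120" ^
  --cmake_extra_defines onnxruntime_USE_FLASH_ATTENTION=OFF
```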
Performance compared to CPU-only execution:
- Image tagging (WD14Tagger): 10–50x faster
- Inference latency: Significant reduction on GPU-accelerated operations
- Memory: Efficiently utilizes the 16 GB of VRAM on an RTX 5060 Ti

Tested with:
- ComfyUI: WD14Tagger nodes
- Stable Diffusion Forge: ONNX-based models
- General ONNX model inference: Any ONNX model requiring CUDA acceleration

Background: Official ONNX Runtime GPU distributions (PyPI) are typically built for older CUDA versions (11.x/12.x) and do not include sm120 (Blackwell) architecture support. When running inference on Blackwell GPUs with official builds, users encounter the `cudaErrorNoKernelImageForDevice` error. This custom build resolves the issue by:
1. Compiling with CUDA 13.0
2. Explicitly targeting sm89, sm90, and sm120
3. Disabling incompatible Flash Attention kernels

ONNX Runtime's Flash Attention implementation currently only supports:
- sm80: Ampere (A100)
- Kernels are hardcoded with `sm80.cu` file naming

Future ONNX Runtime versions may add sm90/sm120 support, but as of 1.24.0 this remains unavailable.

Built by @ussoewwin for the community facing Blackwell GPU compatibility issues with ONNX Runtime.
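The architecture targeting above can be expressed as a small compatibility check: given a GPU's compute capability, decide whether this build ships a matching kernel image. The sm list is hard-coded from this README; the function name is my own.

```python
# Compute capabilities compiled into this build (sm89, sm90, sm120).
SUPPORTED_CCS = {(8, 9), (9, 0), (12, 0)}

def has_kernel_image(major: int, minor: int) -> bool:
    """True if this build contains a kernel image for the given compute capability."""
    return (major, minor) in SUPPORTED_CCS

print(has_kernel_image(12, 0))  # RTX 5060 Ti (Blackwell, sm120) → True
print(has_kernel_image(8, 0))   # A100 (Ampere, sm80)            → False
```

A GPU outside this set is exactly the case where the official PyPI builds raise `cudaErrorNoKernelImageForDevice` on Blackwell: no compiled kernel matches the device.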
Nunchaku-R32-SDXL-Series
mediapipe-0.10.21-Python3.13
This works on Python 3.13.1x... unfortunately, it can't work perfectly. It doesn't work with ComfyUI LayerStyle Advance, unfortunately; however, it works with the Inspire-Pack. I also uploaded another build, "uncomplete-mediapipe-v1-0.10.21-cp313-cp313-winamd64.whl", but it can't work perfectly either. However, it works with Reactor at least. I uploaded it only for mediapipe, but the newest Reactor for ComfyUI & A1111 can work without mediapipe.

📌 About the ProtectAI "LiteRT Model Contains Unknown Operators" Warning

This repository provides a custom-built Mediapipe wheel (Python 3.13, Windows), compiled directly from the official Mediapipe source code. Some security scanners, including ProtectAI Guardian, may display the warning above. This warning is not an actual security issue. It occurs because Mediapipe internally contains TensorFlow Lite (LiteRT) components such as:
- `.tflite` models
- FlatBuffer-based graph definitions
- Custom operators required by Mediapipe tasks

TensorFlow Lite uses several custom operations (Custom Ops) that are not part of the minimal LiteRT operator set. ProtectAI flags these custom ops as "unknown," even though they are:
- Official Mediapipe components
- Required for normal operation
- Safe and expected in any Mediapipe build

This is a false positive caused by how LiteRT models are analyzed. There is no malicious code, no dynamic execution payload, and no non-standard behavior in this wheel. The warning is normal for any Mediapipe build: custom TFLite operators are officially used by Mediapipe, and the wheel is built from the unmodified upstream source. There is no security risk, and the warning can be safely ignored. If additional verification is needed, users may inspect the wheel's contents or FlatBuffer schemas, but the presence of TFLite custom ops is expected by design.
Insightface_for_windows
15 September 2025: updated with a completed Python 3.13 version.