calcuis
wan2-gguf
gguf quantized version of wan2.2 models
- drag wan to > `./ComfyUI/models/diffusion_models`
- drag umt5xxl to > `./ComfyUI/models/text_encoders`
- drag pig to > `./ComfyUI/models/vae`
- (prefer scripting these drag-and-drop steps? see the sketch at the end of this card)

tip: the lite lora for s2v [1.23GB] can be applied to the animate model as well
tip: for the 5b model, use pig-wan2-vae [1.41GB]; for the 14b model, please use pig-wan-vae [254MB]

update
- upgrade your node (see the last item under reference) for new/full quant support
- get more umt5xxl gguf encoders either here or here

reference
- base model from wan-ai
- 4/8-step lightning lora from lightx2v
- comfyui from comfyanonymous
- gguf-node (pypi|repo|pack)
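a minimal placement sketch for the setup above; the folder layout follows this card, while every file name is a placeholder for whichever quants you actually downloaded:

```python
import shutil
from pathlib import Path

COMFY_MODELS = Path("./ComfyUI/models")

# placeholder file names -- substitute the quants you actually downloaded
placements = {
    "wan2.2-animate-q4_0.gguf": "diffusion_models",  # wan model
    "umt5xxl-q4_0.gguf": "text_encoders",            # umt5xxl encoder
    "pig-wan2-vae-f16.gguf": "vae",                  # 5b: pig-wan2-vae; 14b: pig-wan-vae
}

for filename, folder in placements.items():
    dest = COMFY_MODELS / folder
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if missing
    shutil.copy2(filename, dest / filename)
    print(f"placed {filename} -> {dest}")
```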
hunyuanimage-gguf
pig-vae
- pig architecture from connector
- 25-50% faster compared to the safetensors version
- saves up to 25-50% memory; good for old machines
- compatible with all models, no matter safetensors or gguf
- upgrade your node (pypi|repo) for pig 🐷 vae support, with the new gguf vae loader
- you could drag the picture/video below to your browser for an example workflow
- tips: make good use of the convertor zero for gguf model/encoder/vae 🐷 conversion (a quick metadata check sketch follows at the end of this card)

update
- get the new vae from: pig_flux2_vae_fp32-f16.gguf
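to double-check that a downloaded vae really carries the pig architecture before loading it, a quick sketch with the `gguf` python package (`pip install gguf`); the file name is a placeholder:

```python
from gguf import GGUFReader

reader = GGUFReader("pig_flux2_vae_fp32-f16.gguf")  # placeholder path
field = reader.fields["general.architecture"]
# string metadata is stored as raw bytes inside the field's parts
arch = field.parts[field.data[0]].tobytes().decode("utf-8")
print(arch)  # a pig vae should report: pig
```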
ltxv-gguf
wan-gguf
gguf quantized version of wan video
- drag gguf to > `./ComfyUI/models/diffusion_models`
- drag t5xxl-um to > `./ComfyUI/models/text_encoders`
- drag vae to > `./ComfyUI/models/vae`

workflow
- for the i2v model, drag clip-vision-h to > `./ComfyUI/models/clip_vision`
- run the .bat file in the main directory (assuming you are using the gguf pack below)
- if you opt to use the fp8 scaled umt5xxl encoder (this applies to any fp8 scaled t5 actually), please use cpu offload (switch from default to cpu under device in the gguf clip loader; won't affect speed); btw, it works fine for both gguf umt5xxl and gguf vae
- drag any demo video (below) to > your browser for workflow review
- `pig` is a lazy architecture for gguf-node; it applies to all model, encoder and vae gguf file(s); if you try to run it in the comfyui-gguf node, you might need to manually add `pig` to its IMG_ARCH_LIST (under loader.py; see the sketch at the end of this card); easier than editing the gguf file itself; btw, model architectures compatible with comfyui-gguf, including `wan`, should work in gguf-node
- 1.3b model: t2v and vace gguf are working fine; good for old or low-end machines

update
- wan2.1-v5-vace-1.3b: except block weights, all in `f32` status (avoids triggering the time/text embedding key error for inference usage)

reference
- base model from wan-ai
- comfyui from comfyanonymous
- pig architecture from connector
- gguf-connector (pypi)
- gguf-node (pypi|repo|pack)
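for reference, the manual patch mentioned above is a one-line edit; a sketch of what it might look like inside comfyui-gguf's loader.py (the existing set members vary by version, so check your copy):

```python
# ComfyUI-GGUF/loader.py -- sketch only; the existing members depend on your version
IMG_ARCH_LIST = {"flux", "sd1", "sdxl", "sd3", "aura", "ltxv", "hyvid", "wan"}

# add "pig" so pig-architecture gguf files pass the loader's architecture check
IMG_ARCH_LIST.add("pig")
```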
pig-encoder
- text encoder base model from google
- llama encoder base model from meta
- pig architecture from connector
- at least 50% faster compared to the safetensors version
- saves up to 50% memory as well; good for old machines
- compatible with all models, no matter safetensors or gguf
- tested on pig-1k/1k-aura/1k-turbo/cosmos, etc.; works fine
- upgrade your node for pig 🐷 encoder support
- you could drag the picture below to your browser for an example workflow
cosmos-predict2-gguf
hidream-gguf
qwen-image-gguf
gguf quantized version of qwen-image

run it straight with `gguf-connector`
>
>GGUF file(s) available. Select which one to use:
>
>1. qwen-image-iq2_s.gguf
>2. qwen-image-iq4_nl.gguf
>3. qwen-image-q4_0.gguf
>4. qwen-image-q8_0.gguf
>
>Enter your choice (1 to 4):
>

run it with gguf-node via comfyui
- drag qwen-image to > `./ComfyUI/models/diffusion_models`
- drag qwen2.5-vl-7b [4.43GB] to > `./ComfyUI/models/text_encoders`
- drag pig [254MB] to > `./ComfyUI/models/vae`

tip: the text encoder used for this model is qwen2.5-vl-7b; get more encoders either here (pig quant) or here (llama.cpp quant); the size is different from the one (qwen2.5-vl-3b) used in omnigen2
note: diffusers does not yet support t and i quants; opt for gguf-node via comfyui or run it straight with gguf-connector (a minimal diffusers sketch for the classic quants follows at the end of this card)

reference
- base model from qwen
- distilled model from modelscope
- lite model is a lora merge from lightx2v
- comfyui from comfyanonymous
- diffusers from huggingface
- gguf-node (pypi|repo|pack)
- gguf-connector (pypi)
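for the classic quants (q4_0/q8_0), a minimal diffusers sketch, assuming your diffusers build ships gguf single-file support for qwen-image; the local file name and prompt are placeholders:

```python
import torch
from diffusers import QwenImagePipeline, QwenImageTransformer2DModel, GGUFQuantizationConfig

# load the quantized transformer from a local gguf file (t/i quants won't work here yet)
transformer = QwenImageTransformer2DModel.from_single_file(
    "qwen-image-q4_0.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
pipe = QwenImagePipeline.from_pretrained(
    "Qwen/Qwen-Image", transformer=transformer, torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # helps on low-vram machines
image = pipe("a piggy bank on a banker's desk", num_inference_steps=20).images[0]
image.save("output.png")
```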
gguf-node
illustrious
gguf quantized and fp8 scaled versions of illustrious (test pack)

setup (in general)
- drag gguf file(s) to the diffusion_models folder (./ComfyUI/models/diffusion_models)
- drag clip or encoder(s), i.e., illustrious_g_clip and illustrious_l_clip, to the text_encoders folder (./ComfyUI/models/text_encoders)
- drag vae decoder(s), i.e., illustrious_vae, to the vae folder (./ComfyUI/models/vae)

run it straight (no installation needed way)
- get the comfy pack with the new gguf-node (pack)
- run the .bat file in the main directory

workflow
- drag any workflow json file to the activated browser; or
- drag any generated output file (i.e., picture, video, etc., which contains the workflow metadata) to the activated browser

review
- use tag/word(s) as input for more accurate results with these legacy models; not very convenient (compared to the recent models) at the very beginning
- credits should be given to those contributors from the civitai platform
- fast-illustrious gguf was quantized from the fp8 scaled safetensors while illustrious gguf was quantized from the original bf16 (this is just an attempt to test: is it true that the trimmed model with 50% fewer tensors really loads faster? please test it yourself; btw, some models might have a unique structure/feature affecting loader performance; never one size fits all)
- fp8 scaled files work fine with this model, including vae and clips
- good to run on old machines, i.e., 9xx series or before (legacy mode [--disable-cuda-malloc --lowvram] supported); compatible with the new gguf-node
- disclaimer: some models (original files) are provided by someone else and we might not easily spot the creator/contributor(s) behind them unless specified in the source; we would rather leave it blank than write anonymous/unnamed/unknown; if it is your work, do let us know and we will address it properly; thanks for everything

reference
- wai creator
- comfyui comfyanonymous
- gguf-node (pypi|repo|pack)
qwen-image-edit-plus-gguf
run it with `gguf-connector`; simply execute the command below in console/terminal
>
>GGUF file(s) available. Select which one to use:
>
>1. qwen-image-edit-plus-v2-iq3_s.gguf
>2. qwen-image-edit-plus-v2-iq4_nl.gguf
>3. qwen-image-edit-plus-v2-mxfp4_moe.gguf
>
>Enter your choice (1 to 3):
>
- opt a `gguf` file in your current directory to interact with; nothing else
- `ggc q8` accepts multiple image input (see picture above; two images as input)
- as the lite lora is auto-applied, it can generate output with merely 4/8 steps instead of the default 40 steps, saving up to 80% loading time
- up to 3 pictures plus a custom prompt as input (above is a 3-image input demo)
- though `ggc q8` accepts single image input as well (see above), you could opt for the legacy `ggc q7` (see below), similar to the image-edit model before

run it with gguf-node via comfyui
- drag qwen-image-edit-plus to > `./ComfyUI/models/diffusion_models`
- any one below, drag it to > `./ComfyUI/models/text_encoders`
  - option 1: just qwen2.5-vl-7b-test [5.03GB]
  - option 2: just qwen2.5-vl-7b-edit [7.95GB]
  - option 3: both qwen2.5-vl-7b [4.43GB] and mmproj-clip [608MB]
- drag pig [254MB] to > `./ComfyUI/models/vae`

run it with diffusers
- might need the most updated git version for `QwenImageEditPlusPipeline`, should be after this pr; for i quant support, should be after this commit; install the updated git version of diffusers
- simply replace `QwenImageEditPipeline` with `QwenImageEditPlusPipeline` in the qwen-image-edit inference example (see here; a minimal sketch follows at the end of this card)

run nunchaku safetensors straight with gguf-connector (experimental feature)
- run it with the new `q9` connector; simply execute the command below in console/terminal
>
>Safetensors available. Select which one to use:
>
>1. qwen-image-edit-lite-blackwell-fp4.safetensors
>2. qwen-image-edit-lite-int4.safetensors (for non-blackwell card)
>
>Enter your choice (1 to 2):
>
- opt a `safetensors` file in your current directory to interact with; nothing else
- note: able to generate output with 4/8 steps (see above); surprisingly fast even on a low-end device; compatible with the safetensors in the nunchaku repo (depends on your machine; opt for the right one)

run the lite model (experimental) with gguf-connector
>
>GGUF file(s) available. Select which one to use:
>
>1. qwen-image-edit-lite-iq4_nl.gguf
>2. qwen-image-edit-lite-q4_0.gguf
>3. qwen-image-edit-lite-q4_k_s.gguf
>
>Enter your choice (1 to 3):
>
- opt a `gguf` file in your current directory to interact with; nothing else
- note: a new lite lora is auto-applied to `q0` and `q9`; able to generate output with 4/8 steps; with more working layers, these versions should be more stable than `p0` (v2.0) below
- for lite v2.0, please use the `p0` connector (experimental)
>
>GGUF file(s) available. Select which one to use:
>
>1. qwen-image-edit-lite-v2.0-iq2_s.gguf
>2. qwen-image-edit-lite-v2.0-iq3_s.gguf
>3. qwen-image-edit-lite-v2.0-iq4_nl.gguf
>
>Enter your choice (1 to 3):
>
- opt a `gguf` file in your current directory to interact with; nothing else

run the new lite v2.1 (experimental) with gguf-connector
- for lite v2.1, please use the `p9` connector
>
>GGUF file(s) available. Select which one to use:
>
>1. qwen-image-edit-lite-v2.1-q4_0.gguf
>2. qwen-image-edit-lite-v2.1-mxfp4_moe.gguf
>
>Enter your choice (1 to 2):
>
- opt a `gguf` file in your current directory to interact with; nothing else
- note: `ggc p9` is able to generate a picture with 4/8 steps but needs higher guidance (i.e., 3.5); if too many elements are involved, you might consider increasing the steps (i.e., 15) for better output

reference
- gguf-node (pypi|repo|pack)
- gguf-connector (pypi)
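following the pipeline swap described above, a minimal sketch; the repo id and input files are assumptions, and you need the updated git version of diffusers:

```python
import torch
from diffusers import QwenImageEditPlusPipeline
from diffusers.utils import load_image

pipe = QwenImageEditPlusPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509",  # assumed repo id for the edit-plus weights
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # helps on low-vram machines

# edit-plus accepts a list of reference images plus a prompt (up to 3 pictures)
images = [load_image("subject.png"), load_image("scene.png")]
result = pipe(
    image=images,
    prompt="place the subject into the scene",
    num_inference_steps=40,  # drop to 4/8 only if you merge the lite lora
).images[0]
result.save("edited.png")
```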
kontext-gguf
wan-1.3b-gguf
pony
gguf quantized legacy models for anime (additional test pack for gguf-node)

setup (in general)
- drag gguf file(s) to the diffusion_models folder (./ComfyUI/models/diffusion_models)
- drag clip or encoder(s), i.e., g-clip and l-clip, to the text_encoders folder (./ComfyUI/models/text_encoders)
- drag vae decoder(s), i.e., legacy-vae, to the vae folder (./ComfyUI/models/vae)

run it straight (no installation needed way)
- get the comfy pack with the new gguf-node here
- run the .bat file in the main directory

workflow
- drag any workflow json file to the activated browser; or
- drag any generated output file (i.e., picture, video, etc., which contains the workflow metadata) to the activated browser
- example workflow json for the safetensors
- example workflow json for the gguf

review
- use tag/word(s) as input for more accurate results with these legacy models; not very convenient (compared to the recent models) at the very beginning
- credits should be given to those contributors from the civitai platform
- good to run on old machines, i.e., 9xx series or before (legacy mode [--disable-cuda-malloc --lowvram] supported); compatible with the new gguf-node
- disclaimer: some models (original files) are provided by someone else and we might not easily spot the creator/contributor(s) behind them unless specified in the source; we would rather leave it blank than write anonymous/unnamed/unknown; if it is your work, do let us know and we will address it properly; thanks for everything

reference
- comfyui comfyanonymous
- gguf-node (pypi|repo|pack)
pig
- tiny model but good quality output
- 🐷 pig is a lightweight architecture
- good performance on old machines
- upgrade your node for pig support

setup (once)
- drag gguf to > `./ComfyUI/models/diffusion_models`
- drag t5xxl to > `./ComfyUI/models/text_encoders`
- drag vae to > `./ComfyUI/models/vae`

run it straight (no installation needed way)
- run the .bat file in the main directory (assuming you are using the gguf-node with comfy pack)
- drag the workflow json file below or the demo picture/video (below) to > your browser

workflow
- example workflow (json) for pig-1k [t5xxl] or opt for gguf [t5xxl] (save memory choice)
- example workflow (json) for pig-video
- example workflow (json) for pig-mochi
- example workflow (json) for pig-1k-turbo
- example workflow (json) for pig-1k-aura [t5xl] or opt for gguf [t5xl] (save memory choice)
- example workflow (json) for pig-1k-lumina [gemma2b]
- example workflow (json) for pig-cosmos [t5xxl-old] or opt for gguf [t5xxl-old] (save memory)
- example workflow (json) for pig-t2w
- example workflow (json) for pig-v2w or -v2w-mini
- example workflow (json) for pig-i2v or -i2v-plus [clipvision]

run it with gguf-connector (optional; for demo recently)
- text2image generator

reference
- base model from connector (1k|1k-turbo|video|mochi|cosmos|t2w|v2w|v2w-mini|i2v|i2v-plus)
- comfyui comfyanonymous
- gguf-connector (pypi)
- gguf-node (pypi|repo|pack)
sd3.5-large-gguf
qwen-image-edit-gguf
hyvid
krea-gguf
cow-encoder
🐮 cow architecture gguf encoder
- no need to rebuild the tokenizer from metadata ⏳🔥
- no separate tokenizer file needed 🐱🔥
- no more oom issues (possibly) 🚫💻🔥

eligible model examples
- use cow-mistral3small 7.73GB for flux2-dev
- use cow-gemma2 2.33GB for lumina
- use cow-umt5base 451MB for ace-audio
- use cow-umt5xxl 3.67GB for wan-s2v or any wan video model

the example workflow above is from wan-s2v-gguf; the cow encoder is a specially designed clip; even the lowest q2 quant still works very well; upgrade your node for cow-encoder support 🔥🐮 and do drink more milk (a quick metadata peek sketch follows below)
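curious what the cow encoder actually embeds? a quick peek at its metadata with the `gguf` python package (the file name is a placeholder; the exact keys depend on the file):

```python
from gguf import GGUFReader  # pip install gguf

reader = GGUFReader("cow-umt5xxl-q4_0.gguf")  # placeholder path
# list whatever tokenizer-related metadata ships inside the encoder itself
for key in reader.fields:
    if "tokenizer" in key:
        print(key)
```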
ltxv0.9.6-gguf
chatterbox-gguf
ace-gguf
sd3.5-medium-gguf
GGUF quantized version of Stable Diffusion 3.5 Medium

Setup (once)
- drag sd3.5_medium-q5_0.gguf (2.02GB) to > ./ComfyUI/models/unet
- drag clip_g.safetensors (1.39GB) to > ./ComfyUI/models/clip
- drag clip_l.safetensors (246MB) to > ./ComfyUI/models/clip
- drag t5xxl_fp8_e4m3fn.safetensors (4.89GB) to > ./ComfyUI/models/clip
- drag diffusion_pytorch_model.safetensors (168MB) to > ./ComfyUI/models/vae

Run it straight (no installation needed way)
- run the .bat file in the main directory (assuming you are using the gguf-comfy pack below)
- drag the workflow json file (see below) to > your browser
- generate your first picture with sd3, awesome!

Workflows
- example workflow for gguf (if it doesn't work, upgrade your pack: ggc y) 💻
- example workflow for the original safetensors 👍
- prefer diffusers? a minimal gguf loading sketch follows at the end of this card

Bug reports (or brief review)
- tq1_0 and tq2_0: not working recently (invalid GGML quant type error)
- q2_k is super fast but not really usable; might be good for medical research or an abstract painter
- the q3 family is fast; the finger issue is easily detected but picture quality is interestingly good
- btw, the q3 family is usable; it just needs some effort on +/- prompt(s); good for old machine users
- notice that some quantized models share the same file size; they are still different (check the SHA256 hashes) even though they are exactly the same size; we keep them all here (as a full set); see who can deal with it
- q4 and above should be no problem for general-to-high quality production (the demo picture above was generated with just q4_0 - 1.74GB); and the sd team is pretty considerate: you might not find this model useful if you have good hardware, but imagine being able to run it on merely an ancient CPU; you should probably appreciate their great effort; good job folks, thumbs up 👍

Upper tier options
- sd3.5-large (recommended)
- sd3.5-turbo

References
- base model from stabilityai
- comfyui from comfyanonymous
- gguf node from city96
- gguf-comfy pack
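the diffusers sketch promised above, assuming a diffusers build with gguf support; the quant file name is an example (pick q4 or above per the review):

```python
import torch
from diffusers import StableDiffusion3Pipeline, SD3Transformer2DModel, GGUFQuantizationConfig

transformer = SD3Transformer2DModel.from_single_file(
    "sd3.5_medium-q4_0.gguf",  # any quant from this card at q4 or above
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-medium",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # legacy-machine friendly
image = pipe("a cat holding a sign that says hello", num_inference_steps=28).images[0]
image.save("sd3.5.png")
```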
omnigen2-gguf
gguf quantized version of omnigen2
- drag omnigen2 to > `./ComfyUI/models/diffusion_models`
- drag qwen2.5-vl-3b to > `./ComfyUI/models/text_encoders`
- drag pig to > `./ComfyUI/models/vae`
- no safetensors needed anymore; all gguf (model + encoder + vae)
- the full gguf set works on gguf-node (see the last item in the reference below)
- t2i is roughly 3x to 5x faster than i2i or image editing
- get more qwen2.5-vl-3b gguf encoders either here (pig quant) or here (llama.cpp quant)
- alternatively, you could get the fp8-e4m3fn safetensors encoder here, or make it with `TENSOR Cutter (Beta)` (the idea is sketched at the end of this card); it works pretty well too, and you don't even need to switch loaders (the gguf clip loader supports scaled fp8 safetensors)

reference
- base model from omnigen2
- comfyui from comfyanonymous
- gguf-node (pypi|repo|pack)
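the fp8 alternative above boils down to casting the encoder weights; a naive sketch of the idea (this is not the `TENSOR Cutter (Beta)` tool itself, and a plain cast rather than a calibrated scaled conversion; file names are placeholders):

```python
import torch
from safetensors.torch import load_file, save_file

state = load_file("qwen2.5-vl-3b-bf16.safetensors")  # placeholder input
converted = {}
for name, tensor in state.items():
    if tensor.is_floating_point():
        converted[name] = tensor.to(torch.float8_e4m3fn)  # halve the footprint vs bf16
    else:
        converted[name] = tensor  # leave integer tensors untouched
save_file(converted, "qwen2.5-vl-3b-fp8-e4m3fn.safetensors")
```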
sd3.5-lite-gguf
sd3.5-large-controlnet
wan-s2v-gguf
gguf quantized version of wan2.2-s2v (all gguf: incl. encoders + vae)
- drag wan to > `./ComfyUI/models/diffusion_models`
- any one below, drag it to > `./ComfyUI/models/text_encoders`
  - option 1: just cow-umt5xxl [3.67GB]
  - option 2: both cat-umt5xxl [3.66GB] and tokenizer [4.55MB]
  - option 3: just umt5xxl [3.47GB] (needs protobuf to rebuild the tokenizer)
- drag wav2vec2-v2 [632MB] to > `./ComfyUI/models/audio_encoders`
- drag pig [254MB] to > `./ComfyUI/models/vae`

note: the new `GGUF AudioEncoder Loader` is on test; it runs the gguf audio encoder `wav2vec2` without the ending error message you may hit with the fp16 safetensors (depends on how long your prompt/video is)

reference
- for the lite workflow (saves >70% loading time), get the `lite lora` for 4/8-step operation here
- or opt to use the scaled fp8 e4m3 safetensors `audio encoder` here and/or the fp8 e4m3 `vae` here and/or the scaled fp8 e4m3 safetensors `text encoder` here (you don't even need to switch to native loaders, as `GGUF AudioEncoder Loader`, `GGUF VAE Loader` and `GGUF CLIP Loader` support both gguf and fp8 scaled safetensors files; you can mix and match as well)
- gguf-node (pypi|repo|pack)
mochi
hy3d-gguf
lumina-gguf
flux1-gguf
humo-gguf
- drag humo to > `./ComfyUI/models/diffusion_models`
- drag cow-umt5xxl [3.67GB] to > `./ComfyUI/models/text_encoders`
- drag pig [254MB] to > `./ComfyUI/models/vae`

s2v workflow
- drag humo to > `./ComfyUI/models/diffusion_models`
- any one below, drag it to > `./ComfyUI/models/text_encoders`
  - option 1: just cow-umt5xxl [3.67GB]
  - option 2: just umt5xxl [3.47GB] (rebuild tokenizer process)
  - option 3: both cat-umt5xxl [3.66GB] and tokenizer [4.55MB]
- drag whisper3 [3.23GB] to > `./ComfyUI/models/audio_encoders`
- drag pig [254MB] to > `./ComfyUI/models/vae`

note: output seems different from wan; don't expect too much; get the `lite lora` for 4/8-step operation here; the lora works for the 17b only, but the 1.7b itself is fast enough
hunyuan-gguf
Setup (once)
- drag hunyuan-video-t2v-720p-q4_0.gguf (7.74GB) to > ./ComfyUI/models/unet
- drag clip_l.safetensors (246MB) to > ./ComfyUI/models/text_encoders
- drag llava_llama3_fp8_scaled.safetensors (9.09GB) to > ./ComfyUI/models/text_encoders
- drag hunyuan_video_vae_bf16.safetensors (493MB) to > ./ComfyUI/models/vae

Run it straight (no installation needed way)
- run the .bat file in the main directory (assuming you are using the gguf-comfy pack below)
- drag the workflow json file (below) to > your browser

Workflows
- example workflow for gguf (see demo above)
- example workflow for safetensors (repackaged by comfyui)
- a minimal diffusers sketch for the gguf file follows at the end of this card

References
- base model from tencent
- comfyui from comfyanonymous
- gguf node from city96
- gguf-comfy pack

prompt: "anime style anime girl with massive fennec ears and one big fluffy tail, she has blonde hair long hair blue eyes wearing a pink sweater and a long blue skirt walking in a beautiful outdoor scenery with snow mountains in the background"
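the diffusers sketch mentioned under Workflows, assuming a diffusers build with gguf support; the companion repo id is an assumption:

```python
import torch
from diffusers import HunyuanVideoPipeline, HunyuanVideoTransformer3DModel, GGUFQuantizationConfig
from diffusers.utils import export_to_video

transformer = HunyuanVideoTransformer3DModel.from_single_file(
    "hunyuan-video-t2v-720p-q4_0.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
pipe = HunyuanVideoPipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo",  # assumed diffusers-format repo
    transformer=transformer,
    torch_dtype=torch.float16,
)
pipe.vae.enable_tiling()         # keeps vae memory manageable
pipe.enable_model_cpu_offload()  # helps on low-vram machines
frames = pipe(
    prompt="anime girl with fennec ears walking in snowy mountains",
    num_frames=61,
    num_inference_steps=30,
).frames[0]
export_to_video(frames, "output.mp4", fps=15)
```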
sketch
- run it with `gguf-connector`; simply execute the command below in console/terminal
>
>GGUF file(s) available. Select which one to use:
>
>1. sketch-s8-q2_k.gguf
>2. sketch-s8-q2_k_s.gguf
>
>Enter your choice (1 to 2):
>
- bring your ideas 💡 on; nothing else
>
>Safetensors available. Select which one to use:
>
>1. sketch-s9-12b-fp4.safetensors
>2. sketch-s9-20b-fp4.safetensors
>3. sketch-s9-20b-int4.safetensors
>
>Enter your choice (1 to 3):
>
- dual mode is available: accepts two drawings, or mix them up with existing image(s)
pixart
openaudio-gguf
mochi-gguf
bagel-gguf
gguf quantized and fp8/16 scaled version of bagel
- base model from bytedance-seed
- multimodal trial model (i.e., t2i, image editing/recognition)

review/reference
- simply execute the command (`ggc b2`) above in console/terminal
- opt a `vae` then a `model` file in the current directory to interact with (see example below)
>
>Detecting GGUF/Safetensors...
>
>GGUF file(s) available. Select which one for VAE:
>1. pig_ae_fp32-f16.gguf
>2. pig_ae_fp32-f32.gguf
>
>Enter your choice (1 to 2): 1
>
>VAE file: pig_ae_fp32-f16.gguf is selected!
>
>Safetensors file(s) available. Select which one for MODEL:
>1. ema_bf16.safetensors
>2. ema_fp16.safetensors (for non-cuda user)
>3. ema_fp8_e4m3fn.safetensors (recommended)
>4. ema_fp8_e5m2.safetensors
>
>Enter your choice (1 to 4):
>
- note: with the latest update, only the tokenizer is pulled to the gguf-connector folder (cache) automatically during the first launch; you still need to prepare the bulky model and vae files yourself; it works like the vision connector right away; mix and match, more flexible
- run it entirely offline, i.e., from a local URL: http://127.0.0.1:7860 with the lazy webui
- required dependency: bagel2; `pip install bagel2`; for flash-attn and triton, you could opt to install them from pre-built wheels, i.e., here, unless you can build the wheels yourself successfully
- might need some optional dependencies; please refer to the checklist, as the connector won't force your machine to install any of those by default
- gguf-connector (pypi)
deepseek-r1
hyvid-i2v-gguf
aura
cat-encoder
ltxv0.9.7-gguf
phi4
higgs-gguf
phi3
koji
gguf
ltxv0.9.5-gguf
reference
- base model from lightricks
- comfyui from comfyanonymous
- pig architecture from connector
- gguf-node (pypi|repo|pack)
dia-gguf
olmo-gguf
vauxz3d
cosmos
sd3.5-large-turbo
openmath2
llava-gguf
code_mini
chat
medi_mini
phi2_mini
law_mini
lam2_mini
tiny
Studio
- run it with `gguf-connector`; simply execute the command below in console/terminal
>
>GGUF/Safetensors available. Please select:
>
>1. fastvlm-0.5b-f16.gguf (acts as recognizer, 1.52GB)
>2. sd3.5-2b-lite-mxfp4_moe.gguf (acts as generator, 2.86GB)
>3. sketch-s9-20b-fp4.safetensors (acts as transformer; for blackwell card, 11.9GB)
>4. sketch-s9-20b-int4.safetensors (acts as transformer; for non-blackwell card, 11.5GB)
>
>Enter your choice (1 to 4):
>
- all-in-one inference
- two pictures as input for comparison; similarity metrics as output
- recognize an image; customize the output, i.e., a description, or locate a specific subject/object