FLUX.1-dev-SDNQ-uint4-svd-r32
by Disty0
Image Model
OTHER
New
214 downloads
Early-stage
Quick Summary
4-bit (UINT4 with SVD rank 32) quantization of black-forest-labs/FLUX.
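To make "UINT4 with SVD rank 32" concrete, here is a minimal NumPy sketch of the general low-rank-plus-quantized-residual idea: keep a rank-32 approximation of a weight matrix in floating point and store only the remainder as 4-bit codes. This is an illustration under stated assumptions, not SDNQ's actual implementation. The function names, the per-row asymmetric scaling, and the random test matrix are all hypothetical choices made for the example.

```python
import numpy as np


def quantize_uint4(x):
    """Asymmetric per-row quantization to integer codes in [0, 15] (illustrative scheme)."""
    lo = x.min(axis=1, keepdims=True)
    hi = x.max(axis=1, keepdims=True)
    # One scale and one offset per row; the floor avoids division by zero on constant rows.
    scale = np.maximum(hi - lo, 1e-8) / 15.0
    codes = np.clip(np.round((x - lo) / scale), 0, 15).astype(np.uint8)
    return codes, scale, lo


def dequantize_uint4(codes, scale, lo):
    """Map 4-bit codes back to approximate floating-point values."""
    return codes.astype(np.float32) * scale + lo


def svd_uint4(weight, rank=32):
    """Split a weight into a rank-`rank` float part plus a uint4-quantized residual."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    up = u[:, :rank] * s[:rank]   # (out_features, rank)
    down = vt[:rank]              # (rank, in_features)
    residual = weight - up @ down
    codes, scale, lo = quantize_uint4(residual)
    return up, down, codes, scale, lo


def reconstruct(up, down, codes, scale, lo):
    """Rebuild the approximate weight from both parts."""
    return up @ down + dequantize_uint4(codes, scale, lo)


# Hypothetical stand-in for one linear-layer weight; real model weights are much larger.
rng = np.random.default_rng(0)
weight = rng.standard_normal((128, 256)).astype(np.float32)

parts = svd_uint4(weight, rank=32)
approx = reconstruct(*parts)
rel_err = float(np.linalg.norm(weight - approx) / np.linalg.norm(weight))
print(f"relative reconstruction error: {rel_err:.4f}")
```

The low-rank factors absorb the largest singular directions of the weight, so the residual passed to the 4-bit quantizer tends to span a narrower range than the original matrix. In this sketch the codes sit one per byte for readability; a real format would pack two 4-bit codes into each byte.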
Code Examples
pip install sdnq
Enable INT8 MatMul for AMD, Intel ARC and Nvidia GPUs:
import torch
import diffusers
from sdnq import SDNQConfig # import sdnq to register it into diffusers and transformers
from sdnq.common import use_torch_compile as triton_is_available
from sdnq.loader import apply_sdnq_options_to_model

pipe = diffusers.FluxPipeline.from_pretrained("Disty0/FLUX.1-dev-SDNQ-uint4-svd-r32", torch_dtype=torch.bfloat16)

# Enable INT8 MatMul for AMD, Intel ARC and Nvidia GPUs:
if triton_is_available and (torch.cuda.is_available() or torch.xpu.is_available()):
    pipe.transformer = apply_sdnq_options_to_model(pipe.transformer, use_quantized_matmul=True)
    pipe.text_encoder_2 = apply_sdnq_options_to_model(pipe.text_encoder_2, use_quantized_matmul=True)
    pipe.transformer = torch.compile(pipe.transformer) # optional for faster speeds

pipe.enable_model_cpu_offload()

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.manual_seed(0)
).images[0]
image.save("flux-dev-sdnq-uint4-svd-r32.png")