Qwen-Image-bnb-4bit
by mudaai
Image Model · 4B params · license: apache-2.0
7 downloads
Early-stage edge-AI model: runs on mobile, laptop, or server with 9GB+ RAM.
Quick Summary
A 4-bit (NF4) bitsandbytes-quantized build of Qwen-Image for text-to-image generation, packaged for memory-constrained deployment.
Device Compatibility
Mobile
4-6GB RAM
Laptop
16GB RAM
Server
GPU
Minimum Recommended
4GB+ RAM
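As a rough sanity check on these figures (back-of-envelope arithmetic, not a vendor measurement), NF4 weights occupy about half a byte per parameter, so a 4B-parameter model needs roughly 2 GiB for weights alone; the text encoder, VAE, activations, and framework overhead push the practical total well above that:

```python
def nf4_weight_gib(n_params: float, bytes_per_param: float = 0.5) -> float:
    """Approximate weight memory for 4-bit quantized parameters, in GiB."""
    return n_params * bytes_per_param / 1024**3

# Weights-only estimate for a 4B-parameter model; runtime overhead comes on top.
print(f"{nf4_weight_gib(4e9):.2f} GiB")  # ~1.86 GiB
```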
Code Examples
Running with `diffusers` and `transformers`

```python
import torch
from transformers import BitsAndBytesConfig as TransformersBitsAndBytesConfig
from transformers import Qwen2_5_VLForConditionalGeneration
from diffusers import (
    BitsAndBytesConfig as DiffusersBitsAndBytesConfig,
    QwenImagePipeline,
    QwenImageTransformer2DModel,
)

# Model configuration
model_id = "mudaai/Qwen-Image-bnb-4bit"
torch_dtype = torch.float32  # Use float32 for broader GPU compatibility
device = "cuda"

# =============================================================================
# Step 1: Load the Diffusion Transformer (DiT) with 4-bit quantization
# =============================================================================
dit_quantization_config = DiffusersBitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",  # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # Compute in bfloat16 for stability
    llm_int8_skip_modules=["transformer_blocks.0.img_mod"],  # Leave this module unquantized
)
transformer = QwenImageTransformer2DModel.from_pretrained(
    model_id,
    subfolder="transformer",
    quantization_config=dit_quantization_config,
    torch_dtype=torch_dtype,
)

# =============================================================================
# Step 2: Load the Text Encoder (Qwen2.5-VL) with 4-bit quantization
# =============================================================================
text_encoder_quantization_config = TransformersBitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,  # Match the master dtype for compatibility
)
text_encoder = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id,
    subfolder="text_encoder",
    quantization_config=text_encoder_quantization_config,
    torch_dtype=torch_dtype,
)

# =============================================================================
# Step 3: Assemble the Pipeline
# =============================================================================
pipe = QwenImagePipeline.from_pretrained(
    model_id,
    transformer=transformer,
    text_encoder=text_encoder,
    torch_dtype=torch_dtype,
)
pipe.to(device)
# On very low VRAM, pipe.enable_model_cpu_offload() can be used instead of pipe.to(device)

# =============================================================================
# Step 4: Enable Memory Optimizations
# =============================================================================
pipe.enable_attention_slicing("auto")  # Reduces peak memory usage
pipe.vae.enable_tiling()  # Enables generation of large images

# =============================================================================
# Step 5: Generate an Image
# =============================================================================
prompt = "A serene mountain landscape at sunset with a reflective lake"
negative_prompt = "blurry, low quality, distorted"
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=1024,
    width=1024,
    true_cfg_scale=4.0,  # Must be > 1.0 for the negative prompt to take effect
    num_inference_steps=20,
    generator=torch.Generator(device=device).manual_seed(42),
).images[0]

# Save the generated image
image.save("generated_image.png")
print("Image saved to generated_image.png")
```
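Why `enable_tiling` matters at 1024×1024: the transformer works on a latent grid that is the image resolution divided by the VAE's spatial downsampling factor. A minimal sketch, assuming the common 8× factor (an assumption; check the model's VAE config for the actual value):

```python
def latent_size(height: int, width: int, vae_scale_factor: int = 8) -> tuple:
    """Spatial size of the latent grid the diffusion transformer processes."""
    return (height // vae_scale_factor, width // vae_scale_factor)

# A 1024x1024 request becomes a 128x128 latent; larger requests grow the
# latent quadratically, which is why VAE tiling helps keep decoding in memory.
print(latent_size(1024, 1024))  # (128, 128)
```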