# End-to-End Tuned VAEs for Supercharging Text-to-Image Diffusion Transformers

🌐 Project Page &nbsp;·&nbsp; 🤗 Models &nbsp;·&nbsp; 📄 Paper &nbsp;·&nbsp; 📝 Blog Post

Xingjian Leng<sup>1,2</sup> · Jaskirat Singh<sup>1</sup> · Ryan Murdock<sup>2</sup> · Ethan Smith<sup>2</sup> · Rebecca Li<sup>2</sup> · Saining Xie<sup>3</sup> · Liang Zheng<sup>1</sup>

<sup>1</sup> Australian National University &nbsp; <sup>2</sup> Canva &nbsp; <sup>3</sup> New York University

*Done during internship at Canva.*

We present REPA-E for T2I, a family of end-to-end tuned VAEs designed to supercharge text-to-image generation training. These models consistently outperform Qwen-Image-VAE across all benchmarks (COCO-30K, DPG-Bench, GenAI-Bench, GenEval, and MJHQ-30K) without requiring any additional representation-alignment losses.

For training, we adopt the official REPA-E training code and optimize the Qwen-Image-VAE for 80 epochs with a batch size of 256 on the ImageNet-256 dataset. REPA-E training refines the VAE's latent-space structure and enables faster convergence in downstream text-to-image latent diffusion model training.

This repository provides `diffusers`-compatible weights for the end-to-end trained Qwen-Image-VAE. In addition, we release end-to-end trained variants of several other widely used VAEs to facilitate research and integration within text-to-image diffusion frameworks.

> Use `vae.encode(...)` / `vae.decode(...)` in your pipeline. (A full example is provided below.)

| Model | Hugging Face Link |
|-------|-------------------|
| E2E-FLUX-VAE | 🤗 REPA-E/e2e-flux-vae |
| E2E-SD-3.5-VAE | 🤗 REPA-E/e2e-sd3.5-vae |
| E2E-Qwen-Image-VAE | 🤗 REPA-E/e2e-qwenimage-vae |

## 📦 Requirements

The following packages are required to load and run the REPA-E VAEs with the `diffusers` library: at minimum, `torch` and `diffusers`.

## 🚀 Example Usage

Below is a minimal example showing how to load and use the REPA-E end-to-end trained Qwen-Image-VAE with `diffusers`:
# REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers

Xingjian Leng<sup>1</sup> · Jaskirat Singh<sup>1</sup> · Yunzhong Hou<sup>1</sup> · Zhenchang Xing<sup>2</sup> · Saining Xie<sup>3</sup> · Liang Zheng<sup>1</sup>

<sup>1</sup> Australian National University &nbsp; <sup>2</sup> Data61-CSIRO &nbsp; <sup>3</sup> New York University &nbsp; *Project Leads*

🌐 Project Page &nbsp;·&nbsp; 🤗 Models &nbsp;·&nbsp; 📄 Paper

We address a fundamental question: can latent diffusion models and their VAE tokenizer be trained end-to-end? While training both components jointly with the standard diffusion loss is observed to be ineffective, often degrading final performance, we show that this limitation can be overcome using a simple representation-alignment (REPA) loss. Our proposed method, REPA-E, enables stable and effective joint training of both the VAE and the diffusion model. REPA-E significantly accelerates training, achieving over 17× speedup compared to REPA and 45× over the vanilla training recipe. Interestingly, end-to-end tuning also improves the VAE itself: the resulting E2E-VAE provides better latent structure and serves as a drop-in replacement for existing VAEs (e.g., SD-VAE), improving convergence and generation quality across diverse LDM architectures. Our method achieves state-of-the-art FID scores on ImageNet 256×256: 1.12 with CFG and 1.69 without CFG.

> **New in this release:** We are releasing the REPA-E E2E-VAE as a fully Hugging Face `AutoencoderKL` checkpoint, ready to use with `diffusers` out of the box.

We previously released the REPA-E VAE checkpoint, which required loading through the model class in our REPA-E repository. This new version provides a Hugging Face-compatible `AutoencoderKL` checkpoint that can be loaded directly via the `diffusers` API, with no extra code or custom wrapper needed. It offers plug-and-play compatibility with diffusion pipelines and can be seamlessly used to build or train new diffusion models.

> Use `vae.encode(...)` / `vae.decode(...)` in your pipeline. (A full example is provided below.)
## 📦 Requirements

The following packages are required to load and run the REPA-E VAEs with the `diffusers` library: at minimum, `torch` and `diffusers`.

## 🚀 Example Usage

Below is a minimal example showing how to load and use the REPA-E end-to-end trained IN-VAE with `diffusers`: