lrzjason

24 models

Anything2Real

license:apache-2.0
57,711
68

Anything2Real_2601

license:apache-2.0
53,180
64

QwenEdit-Anything2Real_Alpha

license:apache-2.0
5,115
88

flux-kontext-nf4

146
3

qwen_image_edit_2509_14B_nf4

Re-hosted from https://huggingface.co/OPPOer/Qwen-Image-Edit-2509-Pruning. This repo is re-hosted for LoRA training with T2ITrainer: https://github.com/lrzjason/T2ITrainer

license:apache-2.0
37
0

qwen_image_edit_plus_nf4

Introduction

This September, we are pleased to introduce Qwen-Image-Edit-2509, the monthly iteration of Qwen-Image-Edit. To experience the latest model, please visit Qwen Chat and select the "Image Editing" feature.

Compared with Qwen-Image-Edit released in August, the main improvements of Qwen-Image-Edit-2509 include:

- Multi-image Editing Support: For multi-image inputs, Qwen-Image-Edit-2509 builds upon the Qwen-Image-Edit architecture and is further trained via image concatenation to enable multi-image editing. It supports various combinations such as "person + person," "person + product," and "person + scene." Optimal performance is currently achieved with 1 to 3 input images.
- Enhanced Single-image Consistency: For single-image inputs, Qwen-Image-Edit-2509 significantly improves editing consistency, specifically in the following areas:
  - Improved person editing consistency: better preservation of facial identity, supporting various portrait styles and pose transformations.
  - Improved product editing consistency: better preservation of product identity, supporting product poster editing.
  - Improved text editing consistency: in addition to modifying text content, it also supports editing text fonts, colors, and materials.
- Native Support for ControlNet: including depth maps, edge maps, keypoint maps, and more.

A usage sketch for Qwen-Image-Edit-2509 is included at the end of this card.

The primary update in Qwen-Image-Edit-2509 is support for multi-image inputs. Multi-image input also supports commonly used ControlNet keypoint maps, for example to change a person's pose, and the same approach extends to three input images.

Another major update is enhanced consistency. First, regarding person consistency, Qwen-Image-Edit-2509 shows significant improvement over Qwen-Image-Edit, whether generating various portrait styles or changing a person's pose while maintaining excellent identity consistency. Leveraging this improvement along with Qwen-Image's unique text rendering capability, Qwen-Image-Edit-2509 excels at creating meme images; even with longer text, it can still render it while preserving the person's identity. Person consistency is also evident in old photo restoration, and besides real people, generating cartoon characters and cultural creations is also possible.

Second, Qwen-Image-Edit-2509 specifically enhances product consistency: the model can naturally generate product posters from plain-background product images.

Third, Qwen-Image-Edit-2509 specifically enhances text consistency and supports editing font type, font color, and font material; the ability for precise text editing has also been significantly enhanced. It is worth noting that text editing can often be seamlessly integrated with image editing, for example in poster editing.

The final update in Qwen-Image-Edit-2509 is native support for commonly used ControlNet image conditions, such as keypoint control and sketches.

We kindly encourage citation of our work if you find it useful.
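As referenced above, here is a minimal usage sketch via diffusers. It assumes diffusers ships `QwenImageEditPlusPipeline` for the `Qwen/Qwen-Image-Edit-2509` checkpoint; the file names, prompt, and sampler settings are illustrative, not prescriptive.

```python
# Minimal sketch: multi-image editing with Qwen-Image-Edit-2509 via diffusers.
# Assumes diffusers provides QwenImageEditPlusPipeline for this checkpoint;
# input file names and parameter values are illustrative.
import torch
from PIL import Image
from diffusers import QwenImageEditPlusPipeline

pipeline = QwenImageEditPlusPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509", torch_dtype=torch.bfloat16
).to("cuda")

# The card notes that 1 to 3 input images give optimal results.
image1 = Image.open("person.png")
image2 = Image.open("product.png")

output = pipeline(
    image=[image1, image2],
    prompt="The person holds the product in a bright studio.",
    negative_prompt=" ",
    true_cfg_scale=4.0,
    num_inference_steps=40,
    generator=torch.manual_seed(0),
)
output.images[0].save("edited.png")
```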

license:apache-2.0
28
2

qwen_image_nf4

This repo only serves T2ITrainer training; it is not intended for image generation.

Introduction

We are thrilled to release Qwen-Image, an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing. Experiments show strong general capabilities in both image generation and editing, with exceptional performance in text rendering, especially for Chinese.

News

- 2025.08.04: We released the Technical Report of Qwen-Image!
- 2025.08.04: We released Qwen-Image weights! Check them on Hugging Face and ModelScope!
- 2025.08.04: We released Qwen-Image! Check our blog for more details!

A code snippet illustrating how to use the model to generate images from text prompts is included at the end of this card.

One of its standout capabilities is high-fidelity text rendering across diverse images. Whether it's alphabetic languages like English or logographic scripts like Chinese, Qwen-Image preserves typographic details, layout coherence, and contextual harmony with stunning accuracy. Text isn't just overlaid; it's seamlessly integrated into the visual fabric.

Beyond text, Qwen-Image excels at general image generation with support for a wide range of artistic styles. From photorealistic scenes to impressionist paintings, from anime aesthetics to minimalist design, the model adapts fluidly to creative prompts, making it a versatile tool for artists, designers, and storytellers.

When it comes to image editing, Qwen-Image goes far beyond simple adjustments. It enables advanced operations such as style transfer, object insertion or removal, detail enhancement, text editing within images, and even human pose manipulation, all with intuitive input and coherent output. This level of control brings professional-grade editing within reach of everyday users.

But Qwen-Image doesn't just create or edit; it understands. It supports a suite of image understanding tasks, including object detection, semantic segmentation, depth and edge (Canny) estimation, novel view synthesis, and super-resolution. These capabilities, while technically distinct, can all be seen as specialized forms of intelligent image editing, powered by deep visual comprehension.

Together, these features make Qwen-Image not just a tool for generating pretty pictures, but a comprehensive foundation model for intelligent visual creation and manipulation, where language, layout, and imagery converge.

We kindly encourage citation of our work if you find it useful.
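As referenced above, here is a minimal text-to-image sketch with diffusers. It loads the base `Qwen/Qwen-Image` checkpoint rather than this nf4 repo (which targets T2ITrainer training); the prompt and sampler settings are illustrative.

```python
# Minimal sketch: text-to-image with Qwen-Image via diffusers.
# Loads the base Qwen/Qwen-Image checkpoint (this nf4 repo is meant
# for T2ITrainer); prompt and sampler settings are illustrative.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    prompt='A coffee shop window with a sign that reads "Qwen Coffee".',
    negative_prompt=" ",
    width=1328,
    height=1328,
    num_inference_steps=50,
    true_cfg_scale=4.0,
    generator=torch.manual_seed(0),
).images[0]
image.save("qwen_image.png")
```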

license:apache-2.0
26
0

OpenKolors

license:apache-2.0
19
0

RealisticKolorsBeta

license:apache-2.0
16
0

FLUX.1-Krea-dev-nf4

13
0

RealisticKolors-alpha

license:apache-2.0
11
1

OpenKolors_v2_1

license:apache-2.0
11
0

flux-fill-nf4

`FLUX.1 Fill [dev]` is a 12 billion parameter rectified flow transformer capable of filling areas in existing images based on a text description. For more information, please read our blog post.

Key Features

1. Cutting-edge output quality, second only to our state-of-the-art model `FLUX.1 Fill [pro]`.
2. Blends impressive prompt following with completing the structure of your source image.
3. Trained using guidance distillation, making `FLUX.1 Fill [dev]` more efficient.
4. Open weights to drive new scientific research and empower artists to develop innovative workflows.
5. Generated outputs can be used for personal, scientific, and commercial purposes as described in the [`FLUX.1 [dev]` Non-Commercial License](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md).

Usage

We provide a reference implementation of `FLUX.1 Fill [dev]`, as well as sampling code, in a dedicated GitHub repository. Developers and creatives looking to build on top of `FLUX.1 Fill [dev]` are encouraged to use this as a starting point.

API Endpoints

The FLUX.1 models are also available in our API at bfl.ml.

To use `FLUX.1 Fill [dev]` with the 🧨 diffusers Python library, first install or upgrade diffusers. Then you can use `FluxFillPipeline` to run the model; a sketch is included at the end of this card. To learn more, check out the diffusers documentation.

Limitations

- This model is not intended or able to provide factual information.
- As a statistical model, this checkpoint might amplify existing societal biases.
- The model may fail to generate output that matches the prompts.
- Prompt following is heavily influenced by the prompting style.
- There may be slight color shifts in areas that are not filled in.
- Filling in complex textures may produce lines at the edges of the filled area.

Out-of-Scope Use

The model and its derivatives may not be used:

- In any way that violates any applicable national, federal, state, local, or international law or regulation.
- For the purpose of exploiting, harming, or attempting to exploit or harm minors in any way, including but not limited to the solicitation, creation, acquisition, or dissemination of child exploitative content.
- To generate or disseminate verifiably false information and/or content with the purpose of harming others.
- To generate or disseminate personal identifiable information that can be used to harm an individual.
- To harass, abuse, threaten, stalk, or bully individuals or groups of individuals.
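As referenced in the usage section above, here is a minimal inpainting sketch with diffusers' `FluxFillPipeline`. It loads the base `black-forest-labs/FLUX.1-Fill-dev` checkpoint rather than this nf4 variant; the file names and sampler settings are illustrative.

```python
# Minimal sketch: inpainting with FluxFillPipeline from diffusers.
# Loads the base FLUX.1-Fill-dev checkpoint (this repo hosts an nf4
# variant); file names and sampler settings are illustrative.
import torch
from diffusers import FluxFillPipeline
from diffusers.utils import load_image

pipe = FluxFillPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16
).to("cuda")

image = load_image("input.png")  # source image
mask = load_image("mask.png")    # white pixels mark the area to fill

result = pipe(
    prompt="a white paper cup",
    image=image,
    mask_image=mask,
    guidance_scale=30.0,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]
result.save("filled.png")
```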

4
0

qwen_image_edit_nf4

license:apache-2.0
4
0

openxl25

1
1

Consistance_Edit_Lora

license:apache-2.0
0
73

ObjectRemovalFluxFill

license:mit
0
67

QwenImage Rebalance

Model Overview

Rebalance is a high-fidelity image generation model trained on a curated dataset comprising thousands of cosplay photographs and handpicked, high-quality real-world images. All training data was sourced exclusively from publicly accessible internet content, and the dataset explicitly excludes any NSFW material. The primary goal of Rebalance is to produce photorealistic outputs that overcome common AI artifacts, such as an oily, plastic, or overly flat appearance, delivering images with natural texture, depth, and visual authenticity.

Training Strategy

Training was conducted in multiple stages, broadly divided into two phases:

- Cosplay Photo Training: focused on refining facial expressions, pose dynamics, and overall human-figure realism, particularly for female subjects.
- High-Quality Photograph Enhancement: aimed at elevating atmospheric depth, compositional balance, and aesthetic sophistication by leveraging professionally curated photographic references.

Captioning & Metadata

The model was trained using two complementary caption formats: plain text and structured JSON. Each data subset employed a tailored JSON schema to guide fine-grained control during generation.

> Note: Cosplayer names are anonymized (using placeholder IDs) solely to help the model associate multiple images of the same subject during training; no real identities are preserved.

For high-quality photographs, the JSON structure emphasizes scene composition (an illustrative sketch follows at the end of this card). In addition to structured JSON, all images were also trained with plain-text captions and with randomized caption dropout (i.e., some training steps used no caption or partial metadata). This dual approach enhances both controllability and generalization.

Inference Guidance

- For maximum aesthetic precision and stylistic control, use the full JSON format during inference.
- For broader generalization or simpler prompting, plain-text captions are recommended.

Technical Details

All training was performed using lrzjason/T2ITrainer, a customized extension of the Hugging Face Diffusers DreamBooth training script. The framework supports advanced text-to-image architectures, including Qwen and Qwen-Edit (2509).

Previous Work

This project builds upon several prior tools developed to enhance controllability and efficiency in diffusion-based image generation and editing:

- ComfyUI-QwenEditUtils: a collection of utility nodes for Qwen-based image editing in ComfyUI, enabling multi-reference image conditioning, flexible resizing, and precise prompt encoding for advanced editing workflows.
- ComfyUI-LoraUtils: a suite of nodes for advanced LoRA manipulation in ComfyUI, supporting fine-grained control over LoRA loading, layer-wise modification (via regex and index ranges), and selective application to diffusion or CLIP models.
- T2ITrainer: a lightweight, Diffusers-based training framework designed for efficient LoRA (and LoKr) training across multiple architectures, including Qwen Image, Qwen Edit, Flux, SD3.5, and Kolors, with support for single-image, paired, and multi-reference training paradigms.

These tools collectively establish a robust ecosystem for training, editing, and deploying personalized diffusion models with high precision and flexibility.

Feel free to reach out via any of the following channels:

- Twitter: @Lrzjason
- Email: [email protected]
- QQ Group: `866612947`
- WeChat ID: `fkdeai`
- CivitAI: xiaozhijason
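To make the scene-composition captions concrete, here is a purely illustrative sketch of what one structured caption record could look like, written as a Python dict. Every field name here is hypothetical: the card does not publish the actual JSON schema used for Rebalance training.

```python
# Purely illustrative sketch of a scene-composition caption record;
# every field name below is hypothetical (the card does not publish
# the actual schema used for Rebalance training).
import json

caption = {
    "subject": "woman in a red coat on a rainy street",
    "composition": "rule of thirds, subject left, shallow depth of field",
    "lighting": "overcast, soft diffuse light, neon reflections",
    "camera": {"focal_length_mm": 85, "aperture": "f/1.8"},
    "mood": "cinematic, melancholic",
}

# Serialized, this is the kind of structured caption the card describes.
print(json.dumps(caption, indent=2))
```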

license:apache-2.0
0
56

QwenEdit_Consistance_Edit

This LoRA and workflow aim to improve the Qwen Edit consistency issue: when QwenVL encodes the input image, the image structure often shifts randomly. To prevent this, we use a Kontext-like workflow that only sets a reference image and does not encode it, combined with the consistency LoRA, to achieve high-fidelity image editing. If your task requires more consistency, use a higher LoRA strength. If the original model can produce the modified image but the model with the LoRA cannot, try lowering the LoRA strength.

license:apache-2.0
0
10

playground-v2-1024px-aesthetic-fp16

0
5

QwenEdit2509 ObjectRemovalAlpha

This LoRA adjusts the pixel shift and color shift introduced by removal tasks. Due to training-set limitations, the alpha version may be weaker on some removal tasks than the original Qwen Edit 2509; adjust the LoRA strength if the original model performs better than with the LoRA applied. To help improve the LoRA, you can post the original and edited images in the gallery so I can download them and add them to the training set.

license:apache-2.0
0
4

QWEN-OOTD-Lora

This LoRA was trained with T2ITrainer (https://github.com/lrzjason/T2ITrainer) on 20 data pairs, with a clothing image as the reference and nano-banana-generated images as the targets, using a caption dropout of 0.5. The effect is quite strong and the model's appearance is blended into the LoRA, but it can be used without a prompt. Civitai: https://civitai.com/models/1894921?modelVersionId=2144977

license:apache-2.0
0
3

noise-classifier

license:apache-2.0
0
2

joy_caption_watermark_yolo

Re-hosted from https://huggingface.co/spaces/fancyfeast/joycaption-watermark-detection

license:mit
0
1