rockerBOO

10 models

flux.1-dev-SRPO

bf16 and Black Forest Labs (BFL) versions of SRPO from Tencent.

- BFL files are in the Black Forest Labs reference implementation format.
- Otherwise, files are in diffusers format.

Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference

Xiangwei Shen 1,2, Zhimin Li 1, Zhantao Yang 1, Shiyi Zhang 3, Yingfang Zhang 1, Donghao Li 1

2 School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen
3 Shenzhen International Graduate School, Tsinghua University

Abstract

Recent studies have demonstrated the effectiveness of directly aligning diffusion models with human preferences using differentiable reward. However, they exhibit two primary challenges: (1) they rely on multistep denoising with gradient computation for reward scoring, which is computationally expensive, thus restricting optimization to only a few diffusion steps; (2) they often need continuous offline adaptation of reward models in order to achieve desired aesthetic quality, such as photorealism or precise lighting effects.

To address the limitation of multistep denoising, we propose Direct-Align, a method that predefines a noise prior to effectively recover original images from any time steps via interpolation, leveraging the equation that diffusion states are interpolations between noise and target images, which effectively avoids over-optimization in late timesteps.

Furthermore, we introduce Semantic Relative Preference Optimization (SRPO), in which rewards are formulated as text-conditioned signals. This approach enables online adjustment of rewards in response to positive and negative prompt augmentation, thereby reducing the reliance on offline reward fine-tuning. By fine-tuning the FLUX.1.dev model with optimized denoising and online reward adjustment, we improve its human-evaluated realism and aesthetic quality by over 3x.
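The interpolation idea in the abstract can be illustrated with a small sketch. It assumes a linear (rectified-flow style) schedule, x_t = (1 - t) * x_0 + t * noise, which is an assumption made here for illustration; the paper's exact schedule and implementation may differ. The function names are hypothetical, and plain Python lists stand in for image tensors.

```python
import random


def interpolate(image, noise, t):
    """Forward blend: mix a clean image with noise at time t in [0, 1].

    ASSUMPTION: linear schedule x_t = (1 - t) * x_0 + t * noise.
    """
    return [(1.0 - t) * x + t * n for x, n in zip(image, noise)]


def recover_image(state, noise, t):
    """Invert the blend in a single step, given the same predefined noise."""
    if not 0.0 <= t < 1.0:
        # At t = 1 the state is pure noise and the image cannot be recovered.
        raise ValueError("t must be in [0, 1)")
    return [(s - t * n) / (1.0 - t) for s, n in zip(state, noise)]


# Toy "image" of three pixel values and a fixed, known noise sample.
rng = random.Random(0)
image = [0.2, 0.5, 0.9]
noise = [rng.gauss(0.0, 1.0) for _ in image]

state = interpolate(image, noise, t=0.7)
recovered = recover_image(state, noise, t=0.7)
```

Because the noise is known in advance, the clean image comes back from any intermediate state with one algebraic step rather than a multistep denoising loop, which is the property the abstract attributes to Direct-Align.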
Quick Start

Checkpoints: `diffusion_pytorch_model.safetensors` is the online version of SRPO based on FLUX.1 Dev, trained on the HPD dataset with HPSv2.

Inference: Replace the `diffusion_pytorch_model.safetensors` of FLUX with the SRPO checkpoint.

License: SRPO is licensed under the License Terms of SRPO. See `./License.txt` for more details.

Citation: If you use SRPO for your research, please cite our paper.

—

6,867

flux.1-dev-SRPO-LoRA

Image comparison table with columns Flux.1 Dev, SRPO, and SRPO LoRA (images not preserved).

—

346

Flux-Dev2Pro-BFL

Flux-Dev2Pro model converted to BFL (Black Forest Labs) format.

Flux-Dev2Pro finetunes the transformer of Flux-Dev to make LoRA training work better. As discussed in this blog post, https://medium.com/@zhiwangshi28/why-flux-lora-so-hard-to-train-and-how-to-overcome-it-a0c70bc59eaf, a LoRA trained on Flux-Dev often yields bad results because, without guidance distillation, LoRA training diverges from the original training process. Flux-Dev2Pro recovers Flux-pro from Flux-dev by finetuning the model for many steps; it was trained for two epochs on 3M high-quality images. A LoRA trained on Flux-Dev2Pro yields much better results when applied to Flux-dev, much like a LoRA trained on SDXL and then applied to SDXL-turbo/lightning.

“The FLUX.1 [dev] Model is licensed by Black Forest Labs. Inc. under the FLUX.1 [dev] Non-Commercial License. Copyright Black Forest Labs. Inc. IN NO EVENT SHALL BLACK FOREST LABS, INC. BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH USE OF THIS MODEL.”
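Training a LoRA against one checkpoint and applying it to a related one, as described for Flux-Dev2Pro and Flux-dev, amounts to adding the same low-rank update to a different base weight. A minimal pure-Python sketch follows, assuming the common LoRA formulation W' = W + strength * (alpha / rank) * up @ down. The function names and tiny matrices are illustrative only and are not taken from the Flux-Dev2Pro code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a_ik * b_kj for a_ik, b_kj in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


def apply_lora(base_weight, lora_down, lora_up, alpha, strength=1.0):
    """Return base_weight plus the scaled low-rank update.

    lora_down has shape (rank, in_features); lora_up has shape
    (out_features, rank). ASSUMPTION: standard alpha / rank scaling.
    """
    rank = len(lora_down)
    scale = strength * alpha / rank
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base_weight, delta)
    ]


# A rank-1 update, standing in for a LoRA trained against one checkpoint...
down = [[1.0, 2.0]]
up = [[1.0], [0.0]]

# ...applied to the weights of a different but closely related checkpoint.
other_base = [[1.0, 0.0], [0.0, 1.0]]
patched = apply_lora(other_base, down, up, alpha=1.0, strength=1.0)
```

The update itself does not depend on which base it is added to, so the transfer only works well when the two base models are close, which is the situation the description above relies on.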

—

171

Flux-Dev2Pro-fp8_e4m3fn

—

165

flux-bpo-po-lora

Note: Not indicative of performance, only an example of the current file. Preference optimization example (for testing) using BPO.

—

Flux-SCFM-Distilled-LoRA

Sourced from https://civitai.com/models/2064593/flux-scfm-distilled-lora?modelVersionId=2336250

Shortcutting Pretrained Flow Matching Diffusion Models. This model allows you to generate images within 3-8 steps by applying the weight as a LoRA on a FLUX series checkpoint. The work has been accepted for presentation at NeurIPS 2025. For further technical details, please refer to our paper and the associated project.

Recommended settings: LoRA strength 1.0-1.75, cfg >= 4.5. Lower step counts require higher strength.
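The recommended settings above can be captured in a small checker. This is a sketch: the ranges come from the description, while the helper names are hypothetical and the linear mapping in `suggest_strength` is purely an assumption illustrating "lower steps require higher strength", not a rule published by the model authors.

```python
STEPS_RANGE = (3, 8)            # "generate images within 3-8 steps"
STRENGTH_RANGE = (1.0, 1.75)    # recommended LoRA strength
MIN_CFG = 4.5                   # recommended cfg >= 4.5


def check_settings(steps, lora_strength, cfg):
    """Return a list of warnings for settings outside the recommended ranges."""
    warnings = []
    if not STEPS_RANGE[0] <= steps <= STEPS_RANGE[1]:
        warnings.append(f"steps={steps} is outside {STEPS_RANGE}")
    if not STRENGTH_RANGE[0] <= lora_strength <= STRENGTH_RANGE[1]:
        warnings.append(f"lora_strength={lora_strength} is outside {STRENGTH_RANGE}")
    if cfg < MIN_CFG:
        warnings.append(f"cfg={cfg} is below {MIN_CFG}")
    return warnings


def suggest_strength(steps):
    """ASSUMPTION: map 8 steps -> 1.0 and 3 steps -> 1.75 linearly."""
    lo_steps, hi_steps = STEPS_RANGE
    lo_strength, hi_strength = STRENGTH_RANGE
    steps = min(max(steps, lo_steps), hi_steps)  # clamp into the step range
    fraction = (hi_steps - steps) / (hi_steps - lo_steps)
    return lo_strength + fraction * (hi_strength - lo_strength)
```

For example, `check_settings(4, suggest_strength(4), 5.0)` returns an empty list, while a run with 20 steps, strength 0.5, and cfg 1.0 produces three warnings.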

—

Flex.1-alpha-fp8-e4m3fn

stablelm-tuned-alpha-3b-8bit

Flex.1-alpha-NF4

flux1-dev-krea-dare-merge