High-performance merged text-to-video model built on WAN 2.1 and fused with research-grade components for cinematic motion, detail, and speed. Optimized for ComfyUI and rapid iteration in as few as 6 steps.
Merged models for faster, richer motion and detail, with high performance even at just 8 steps.
> Important: To match the quality shown here, use the linked workflows or follow the recommended settings outlined below.
A powerful text-to-video model built on top of WAN 2.1 14B, merged with several research-grade models to boost:
- Motion quality
- Scene consistency
- Visual detail
Comparable with closed-source solutions, but open and optimized for ComfyUI workflows.
FusionX is built on top of Wan 2.1 14B 720p and merges the following components (FusionX would not be what it is without these models):
- CausVid: Causal motion modeling for better flow and dynamics
- AccVideo: Better temporal alignment and speed boost
- MoviiGen1.1: Cinematic smoothness and lighting
- MPS Reward LoRA: Tuned for motion and detail
- Custom LoRAs: For texture, clarity, and small detail enhancements (set at a very low level)

All merged models are provided for research and non-commercial use only. Some components are subject to licenses such as CC BY-NC-SA 4.0, and do not fall under permissive licenses like Apache 2.0 or MIT. Please refer to each model's original license for full usage terms.
We finally cooked up FusionX LoRAs!! This is huge: you can now plug FusionX into your favorite workflows as a LoRA on top of the Wan base models and SkyReels models! You can still stick with the base FusionX model if you already use it, but if you would rather have more control over the "FusionX" strength and a speed boost, then the LoRAs might be for you.

Oh, and there's a nice speed boost too! Example (RTX 5090):
- FusionX as a full base model: 8 steps = 160s
- FusionX as a LoRA on Wan 2.1 14B fp8 T2V: 8 steps = 120s

Bonus: You can bump up the FusionX LoRA strength and lower your steps for a huge speed boost while testing/drafting. Example: strength `2.00` with `3 steps` takes `72 seconds`. Or lower the strength to experiment with a less "FusionX" look.
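The timings quoted above can be turned into relative figures with a little arithmetic. The sketch below is only an illustration: the helper names `time_saved_pct` and `seconds_per_step` are made up for this example, and the three numbers are the single RTX 5090 measurements quoted in the text, not a benchmark suite.

```python
# Timings quoted in the text (RTX 5090), in seconds.
FULL_MODEL_8_STEPS = 160.0        # FusionX as a full base model, 8 steps
LORA_8_STEPS = 120.0              # FusionX LoRA on Wan 2.1 14B fp8 T2V, 8 steps
LORA_3_STEPS_STRENGTH_2 = 72.0    # FusionX LoRA at strength 2.00, 3 steps


def time_saved_pct(baseline_s: float, new_s: float) -> float:
    """Percentage of wall-clock time saved relative to the baseline run."""
    if baseline_s <= 0:
        raise ValueError("baseline_s must be positive")
    return (baseline_s - new_s) / baseline_s * 100.0


def seconds_per_step(total_s: float, steps: int) -> float:
    """Naive average: total run time divided by the number of steps."""
    if steps <= 0:
        raise ValueError("steps must be positive")
    return total_s / steps


if __name__ == "__main__":
    print(f"LoRA vs full model (8 steps): "
          f"{time_saved_pct(FULL_MODEL_8_STEPS, LORA_8_STEPS):.1f}% less time")
    print(f"Draft mode (3 steps) vs full model: "
          f"{time_saved_pct(FULL_MODEL_8_STEPS, LORA_3_STEPS_STRENGTH_2):.1f}% less time")
    print(f"Average per step, draft mode: "
          f"{seconds_per_step(LORA_3_STEPS_STRENGTH_2, 3):.1f}s")
```

The per-step average also absorbs fixed costs such as text encoding and VAE decoding, so it should not be read as pure sampling time per step.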
We've got:
- T2V (Text to Video): works perfectly with VACE
- I2V (Image to Video)
- A dedicated Phantom LoRA

The new LoRAs are HERE

Note: The LoRAs are not meant to be stacked on top of the FusionX main models. Use them with the Wan base models instead.

New workflows are HERE

After lots of testing, the video quality with the LoRA is just as good (and sometimes even better!). That's thanks to it being trained on the fp16 version of FusionX.
Preview Gallery

These are compressed GIF previews for quick viewing. Final video outputs are higher quality.
- ComfyUI workflows can be found here: Workflow Collection (WIP)
- Model files (T2V, I2V, Phantom, VACE): Main Hugging Face Repo

GGUF Variants:
- FusionX Image-to-Video (GGUF)
- FusionX Text-to-Video (GGUF)
- FusionX T2V VACE (for native)
- FusionX Phantom
Want to see what FusionX can do? Check out these real outputs generated using the latest workflows and settings:
- CFG: Must be set to `1`
- Shift:
  - `1024x576`: Start at `1`
  - `1080x720`: Start at `2`
  - For realism, use lower values
  - For stylized looks, test `3-9`
- Scheduler:
  - Recommended: `unipc`
  - Alternative: `flowmatch_causvid` (better for some details)
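The settings above can be captured in a small lookup so that the starting shift follows the resolution automatically. This is a minimal sketch, not part of any ComfyUI API: the `SamplerSettings` class and the `recommended_settings` helper are hypothetical names, and the default of 8 steps is taken from the step counts mentioned elsewhere in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplerSettings:
    cfg: float       # guidance scale (CFG); the text says it must be 1
    shift: float     # starting shift value for the chosen resolution
    scheduler: str
    steps: int


# Starting shift values quoted in the list above, keyed by (width, height).
START_SHIFT = {
    (1024, 576): 1.0,
    (1080, 720): 2.0,
}


def recommended_settings(width: int, height: int,
                         steps: int = 8,
                         scheduler: str = "unipc") -> SamplerSettings:
    """Return starting settings for a resolution listed in the text."""
    try:
        shift = START_SHIFT[(width, height)]
    except KeyError:
        raise ValueError(
            f"no starting shift listed for {width}x{height}; "
            "pick one manually (lower for realism, higher for stylized looks)"
        ) from None
    return SamplerSettings(cfg=1.0, shift=shift, scheduler=scheduler, steps=steps)


if __name__ == "__main__":
    print(recommended_settings(1024, 576))
    print(recommended_settings(1080, 720))
```

These are starting points only. The text suggests adjusting the shift from there depending on whether the target look is realistic or stylized.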
- CFG: `1`
- Shift: `2` works best in most cases
- Scheduler:
  - Recommended: `dpm++_sde/beta`
- To boost motion and reduce slow-mo effect:
  - Frame count: `121`
  - FPS: `24`
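To get a feel for what a frame count and FPS pair means in practice, the conversion to clip length is simple division. The sketch below uses hypothetical helper names. The `4n + 1` check reflects a constraint commonly applied to Wan-style frame counts and is an assumption here, not something this document states.

```python
def clip_seconds(frames: int, fps: float) -> float:
    """Playback length in seconds for a given frame count and frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames / fps


def is_4n_plus_1(frames: int) -> bool:
    """Assumed constraint: frame counts of the form 4n + 1 (e.g. 81, 121)."""
    return frames >= 1 and frames % 4 == 1


if __name__ == "__main__":
    for frames in (81, 121):
        print(f"{frames} frames @ 24 fps = {clip_seconds(frames, 24):.2f}s, "
              f"4n+1: {is_4n_plus_1(frames)}")
```

At 24 FPS, 121 frames plays back in roughly five seconds.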
- Works in as few as 6 steps
- Best quality at 8-10 steps
- Drop-in replacement for `Wan2.1-T2V-14B`
- Up to 50% faster rendering, especially with SageAttn
- Works natively and with the Kijai Wan Wrapper (Wrapper GitHub)
- Do not re-add merged LoRAs (CausVid, AccVideo, MPS)
- Feel free to add other LoRAs for style/variation
- Native WAN workflows also supported (slightly slower)
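Since the merged components should not be stacked again as separate LoRAs, a quick filename check before queuing a run can catch mistakes. The following is a rough heuristic sketch: the function name and the example filenames are hypothetical, and plain substring matching can produce false positives (any filename containing "mps", for instance).

```python
# Components the text says are already merged into FusionX.
MERGED_COMPONENTS = ("causvid", "accvideo", "mps")


def find_redundant_loras(lora_filenames):
    """Return filenames that look like components already merged into FusionX.

    Matching is case-insensitive substring search, so treat hits as
    warnings to review rather than hard errors.
    """
    return [
        name for name in lora_filenames
        if any(tag in name.lower() for tag in MERGED_COMPONENTS)
    ]


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    selected = [
        "Wan21_CausVid_14B_T2V_lora.safetensors",
        "my_film_grain_style.safetensors",
    ]
    for name in find_redundant_loras(selected):
        print(f"warning: {name} appears to duplicate a merged component")
```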
- RTX 5090: ~138 sec/video at 1024x576 / 81 frames
- If VRAM is limited:
  - Enable block swapping
  - Start with `5` blocks and adjust as needed
- Use SageAttn for ~30% speedup (wrapper only)
- Do not use `teacache`
- "Enhance a video" (tested): Adds vibrance (try values 2-4)
- "SLG" not tested, so feel free to explore
Want better cinematic prompts? Try the WAN Cinematic Video Prompt Generator GPT. It adds visual richness and makes a big difference in quality. Download Here
We're building a friendly space to chat, share outputs, and get help.
- Motion LoRAs coming soon
- Tips, updates, and support from other users

Some merged components use permissive licenses (Apache 2.0 / MIT), but others, such as those from research models like CausVid, may be released under non-commercial licenses (e.g., CC BY-NC-SA 4.0).
- You can use, modify, and redistribute under original license terms
- You must retain and respect the license of each component
- Commercial use is not permitted for models or components under non-commercial licenses
- Outputs are not automatically licensed, so do your own due diligence
This model is intended for research, education, and personal use only. For commercial use or monetization, please consult a legal advisor and verify all component licenses.
- WAN Team (base model)
- aejion (AccVideo)
- Tianwei Yin (CausVid)
- ZuluVision (MoviiGen)
- Alibaba PAI (MPS LoRA)
- Kijai (ComfyUI Wrapper)