Lightricks

27 models • 4 total models in database

LTX-2

This model card focuses on the LTX-2 model, as presented in the paper LTX-2: Efficient Joint Audio-Visual Foundation Model. The codebase is available here.

—

1,850,605

1,593

LTX-2.3

—

1,658,199

943

LTX-Video

---
tags:
- ltx-video
- image-to-video
pinned: true
language:
- en
license: other
library_name: diffusers
---

—

239,445

2,043

LTX-Video-ICLoRA-detailer-13b-0.9.8

This is a video detailer model built on top of `LTXV13B098DEV` and trained on custom data.

IC LoRA is a method that adds video context to the video generation process. It allows video-to-video control on top of the text-to-video model, giving more precise control over the generated content by conditioning the model on reference video frames during inference.

- ComfyUI-compatible model: ltxv-098-ic-lora-detailer-comfyui.safetensors
- Diffusers-compatible model: ltxv-098-ic-lora-detailer-diffusers.safetensors

For licensing information, please refer to the LTXV Open Weights License.

Model Details

- Base Model: `LTXV13B098DEV`
- Training Type: IC LoRA training
- Learning Rate: 0.0002

This model is designed to be used with the LTXV (Lightricks Text-to-Video) pipeline.

🔌 Using Trained LoRAs in ComfyUI

To use the trained LoRA in ComfyUI:

1. Copy the ComfyUI-format LoRA weights to the `models/loras` folder in your ComfyUI installation.
2. Use iclora/ic-lora.json from the official LTXV ComfyUI repository.

- Base model by Lightricks
- Training infrastructure: LTX-Video-Trainer
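Step 1 of the ComfyUI instructions above (copying the LoRA weights into `models/loras`) can also be scripted. The following is a minimal sketch, not part of the official tooling: the helper name `install_lora` and the example paths are hypothetical, and it assumes the weights file has already been downloaded to local disk.

```python
import shutil
from pathlib import Path


def install_lora(lora_file: Path, comfyui_root: Path) -> Path:
    """Copy a LoRA weights file into <comfyui_root>/models/loras.

    Hypothetical helper for illustration; returns the destination path.
    """
    loras_dir = comfyui_root / "models" / "loras"
    # Create the folder if this ComfyUI installation does not have it yet.
    loras_dir.mkdir(parents=True, exist_ok=True)
    destination = loras_dir / lora_file.name
    shutil.copy2(lora_file, destination)
    return destination


if __name__ == "__main__":
    # Both paths are assumptions; adjust them to your own setup.
    weights = Path("ltxv-098-ic-lora-detailer-comfyui.safetensors")
    comfyui = Path("ComfyUI")
    if weights.exists():
        print(install_lora(weights, comfyui))
```

After copying, the LoRA should be selectable from the workflow loaded in step 2.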

—

4,320

LTX-Video-0.9.8-13B-distilled

LTX-Video 0.9.8 13B Distilled Model Card

This model card focuses on the model associated with the LTX-Video model, codebase available here.

LTX-Video is the first DiT-based video generation model capable of generating high-quality videos in real-time. It produces 30 FPS videos at a 1216×704 resolution faster than they can be watched. Trained on a large-scale dataset of diverse videos, the model generates high-resolution videos with realistic and varied content.

| Name | Notes | inference.py config | ComfyUI workflow (Recommended) |
|------|-------|---------------------|--------------------------------|
| ltxv-13b-0.9.8-dev | Highest quality, requires more VRAM | ltxv-13b-0.9.8-dev.yaml | ltxv-13b-i2v-base.json |
| ltxv-13b-0.9.8-mix | Mix ltxv-13b-dev and ltxv-13b-distilled in the same multi-scale rendering workflow for balanced speed-quality | N/A | ltxv-13b-i2v-mixed-multiscale.json |
| ltxv-13b-0.9.8-distilled | Faster, less VRAM usage, slight quality reduction compared to 13b. Ideal for rapid iterations | ltxv-13b-0.9.8-distilled.yaml | ltxv-13b-dist-i2v-base.json |
| ltxv-2b-0.9.8-distilled | Smaller model, slight quality reduction compared to 13b distilled. Ideal for light VRAM usage | ltxv-2b-0.9.8-distilled.yaml | N/A |
| ltxv-13b-0.9.8-fp8 | Quantized version of ltxv-13b | ltxv-13b-0.9.8-dev-fp8.yaml | ltxv-13b-i2v-base-fp8.json |
| ltxv-13b-0.9.8-distilled-fp8 | Quantized version of ltxv-13b-distilled | ltxv-13b-0.9.8-distilled-fp8.yaml | ltxv-13b-dist-i2v-base-fp8.json |
| ltxv-2b-0.9.8-distilled-fp8 | Quantized version of ltxv-2b-distilled | ltxv-2b-0.9.8-distilled-fp8.yaml | N/A |
| ltxv-2b-0.9.6 | Good quality, lower VRAM requirement than ltxv-13b | ltxv-2b-0.9.6-dev.yaml | ltxvideo-i2v.json |
| ltxv-2b-0.9.6-distilled | 15× faster, real-time capable, fewer steps needed, no STG/CFG required | ltxv-2b-0.9.6-distilled.yaml | ltxvideo-i2v-distilled.json |

Model Details

- Developed by: Lightricks
- Model type: Diffusion-based image-to-video generation model
- Language(s): English

Direct use

You can use the model for purposes permitted under the license:

- 2B version 0.9 license
- 2B version 0.9.1 license
- 2B version 0.9.5 license
- 2B version 0.9.6-dev license
- 2B version 0.9.6-distilled license
- 13B version 0.9.7-dev license
- 13B version 0.9.7-dev-fp8 license
- 13B version 0.9.7-distilled license
- 13B version 0.9.7-distilled-fp8 license
- 13B version 0.9.7-distilled-lora128 license
- 13B version 0.9.7-ICLoRA Depth license
- 13B version 0.9.7-ICLoRA Pose license
- 13B version 0.9.7-ICLoRA Canny license
- Temporal upscaler version 0.9.7 license
- Spatial upscaler version 0.9.7 license
- 13B version 0.9.8-dev license
- 13B version 0.9.8-dev-fp8 license
- 13B version 0.9.8-distilled license
- 13B version 0.9.8-distilled-fp8 license
- 2B version 0.9.8-distilled license
- 2B version 0.9.8-distilled-fp8 license
- 13B version 0.9.8-ICLoRA detailer license
- Temporal upscaler version 0.9.8 license
- Spatial upscaler version 0.9.8 license

General tips:

- The model works on resolutions that are divisible by 32 and on frame counts of the form 8n + 1 (e.g. 257). If the resolution is not divisible by 32, or the frame count is not of the form 8n + 1, the input is padded with -1 and then cropped to the desired resolution and number of frames.
- The model works best on resolutions under 720 x 1280 and frame counts below 257.
- Prompts should be in English. The more elaborate, the better. A good prompt looks like `The turquoise waves crash against the dark, jagged rocks of the shore, sending white foam spraying into the air. The scene is dominated by the stark contrast between the bright blue water and the dark, almost black rocks. The water is a clear, turquoise color, and the waves are capped with white foam. The rocks are dark and jagged, and they are covered in patches of green moss. The shore is lined with lush green vegetation, including trees and bushes. In the background, there are rolling hills covered in dense forest. The sky is cloudy, and the light is dim.`

Online demo

The model is accessible right away via the following links:

- LTX-Studio image-to-video (13B-mix)
- LTX-Studio image-to-video (13B distilled)
- Fal.ai image-to-video (13B full)
- Fal.ai image-to-video (13B distilled)
- Replicate image-to-video

ComfyUI

To use our model with ComfyUI, please follow the instructions in the dedicated ComfyUI repo.

The codebase was tested with Python 3.10.5 and CUDA version 12.2, and supports PyTorch >= 2.1.2.

To use our model, please follow the inference code in inference.py.

You can now generate a video conditioned on a set of images and/or short video segments. Provide a list of paths to the images or video segments you want to condition on, along with their target frame numbers in the generated video. You can also specify the conditioning strength for each item (default: 1.0).

LTX Video is compatible with the Diffusers Python library for image-to-video generation. Make sure you install `diffusers` before running the examples. Note that the upsampling stage is optional but recommended. To learn more, check out the official documentation.

Diffusers also supports loading directly from the original LTX checkpoints using the `from_single_file()` method. Check out this section to learn more.

Limitations

- This model is not intended or able to provide factual information.
- As a statistical model, this checkpoint might amplify existing societal biases.
- The model may fail to generate videos that match the prompts perfectly.
- Prompt following is heavily influenced by the prompting style.
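The sizing rule from the general tips (resolutions divisible by 32, frame counts of the form 8n + 1) can be checked before launching a generation. The following is a minimal sketch, not the pipeline's own code: the helper names `round_up` and `valid_shape` are hypothetical, and it assumes a non-conforming size is rounded up to the next valid value, which is where padding would be applied.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the nearest multiple of `multiple`."""
    return ((value + multiple - 1) // multiple) * multiple


def valid_shape(height: int, width: int, num_frames: int) -> tuple[int, int, int]:
    """Return the smallest conforming (height, width, num_frames).

    Hypothetical helper: height and width become multiples of 32,
    and the frame count becomes 8n + 1 for some integer n >= 0.
    """
    padded_height = round_up(height, 32)
    padded_width = round_up(width, 32)
    # Subtract the extra frame, round up to a multiple of 8, add it back.
    padded_frames = round_up(num_frames - 1, 8) + 1
    return padded_height, padded_width, padded_frames


if __name__ == "__main__":
    print(valid_shape(704, 1216, 257))  # already valid: (704, 1216, 257)
    print(valid_shape(720, 1280, 100))  # (736, 1280, 105)
```

Requesting a size that already satisfies the rule avoids the pad-and-crop step entirely.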

—

2,377

LTX-Video-0.9.7-dev

—

1,445

ltxv-spatial-upscaler-0.9.7

—

1,179

LTX-Video-ICLoRA-depth-13b-0.9.7

—

912

LTX Video ICLoRA Pose 13b 0.9.7

This is a pose control model built on top of `LTXV13B097DEV` and trained on custom data.

IC LoRA is a method that adds video context to the video generation process. It allows video-to-video control on top of the text-to-video model, giving more precise control over the generated content by conditioning the model on reference video frames during inference.

- ComfyUI-compatible model: ltxv-097-ic-lora-pose-control-comfyui.safetensors
- Diffusers-compatible model: ltxv-097-ic-lora-pose-control-diffusers.safetensors

For licensing information, please refer to the LTXV Open Weights License.

Model Details

- Base Model: `LTXV13B097DEV`
- Training Type: IC LoRA training
- Learning Rate: 0.0002
- Rank: 24

This model is designed to be used with the LTXV (Lightricks Text-to-Video) pipeline.

🔌 Using Trained LoRAs in ComfyUI

To use the trained LoRA in ComfyUI:

1. Copy the ComfyUI-format LoRA weights to the `models/loras` folder in your ComfyUI installation.
2. Use iclora/ic-lora.json from the official LTXV ComfyUI repository.

- Base model by Lightricks
- Training infrastructure: LTX-Video-Trainer

—

844