Frequently Asked Questions

What is LTX-2 and how is it different from other AI video models?

LTX-2 is a DiT-based audio-video foundation model that generates synchronized video and audio in a single unified process. Unlike other models that generate video and audio separately, LTX-2 creates them together with natural timing and synchronization. It supports 4K resolution at 50 FPS and includes LoRA customization capabilities.

What GPU do I need to run LTX-2 locally?

Requirements depend on the resolution and clip length you want to generate:

  • Minimum: 8GB VRAM for 540p at reduced duration
  • Recommended: RTX 40 Series or newer with 16GB+ VRAM
  • Optimal: 24GB+ VRAM (RTX 3090/4090) for 720p 24fps 4-second clips

Alternatively, use the online playground, which requires no local GPU.
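
As a rough illustration, the VRAM tiers above can be expressed as a small lookup. This is a minimal sketch, not part of any LTX-2 package: the function name `suggest_tier` and the tier labels are hypothetical, and only the 8, 16, and 24 GB thresholds come from the list above.

```python
# Illustrative helper only; not part of any LTX-2 package.
# Thresholds mirror the VRAM tiers listed in this FAQ.
TIERS = [
    (24, "optimal"),      # 24GB+ VRAM
    (16, "recommended"),  # 16GB+ VRAM
    (8, "minimum"),       # 8GB VRAM
]


def suggest_tier(vram_gb: float) -> str:
    """Return the highest tier whose VRAM threshold is met."""
    for threshold_gb, name in TIERS:
        if vram_gb >= threshold_gb:
            return name
    return "below minimum"


if __name__ == "__main__":
    for vram in (6, 8, 16, 24):
        print(f"{vram} GB -> {suggest_tier(vram)}")
```

If PyTorch is installed, `torch.cuda.get_device_properties(0).total_memory` reports total VRAM in bytes. Reported totals are often slightly below the marketed figure, so round to the nearest GB before comparing.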

How long can generated videos be?

LTX-2 can generate videos up to 20 seconds long in a single generation. The LTX Platform currently supports 6-, 8-, or 10-second clips, with 15-second support coming soon. For longer content, you can chain multiple generations together in your editing workflow.
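
When chaining generations, it can help to plan the clip lengths up front. The sketch below is one possible approach, not an official tool: `plan_clips` is a hypothetical helper that fills a target duration with the longest supported clip and then picks the shortest supported clip that covers the remainder. The default lengths (6, 8, 10) are the ones the platform currently lists.

```python
# Illustrative planning helper; not part of the LTX Platform or its API.
def plan_clips(target_seconds, supported=(6, 8, 10)):
    """Return a list of clip lengths whose total covers target_seconds."""
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    lengths = sorted(supported)
    longest = lengths[-1]
    plan = []
    remaining = target_seconds
    # Use the longest clip while more than one clip is still needed.
    while remaining > longest:
        plan.append(longest)
        remaining -= longest
    # Finish with the shortest supported clip that covers what is left.
    plan.append(next(length for length in lengths if length >= remaining))
    return plan


if __name__ == "__main__":
    print(plan_clips(25))  # [10, 10, 6]
```

The plan may overshoot the target by a few seconds (26 seconds for a 25-second target here), so expect to trim the final clip in your editor.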

Can I use LTX-2 generated content commercially?

Yes, subject to our Terms of Service, you may use AI-generated content for personal and commercial purposes. However, please review the prohibited uses section and ensure your use case complies with all applicable laws and regulations.

How do I train custom LoRAs for LTX-2?

The LTX-2 base (dev) model is fully trainable. You can create custom LoRAs for style, motion, or identity in under an hour using the LTX-2 Trainer package. Visit our GitHub repository for detailed training instructions and example workflows.

Does LTX-2 support ComfyUI?

Yes! LTX-2 has built-in support for ComfyUI with native nodes available in ComfyUI Manager. NVIDIA provides a detailed quick-start guide for running LTX-2 in ComfyUI, including optimized workflows for different GPU configurations.

Why is my generated audio quality lower than expected?

Audio quality varies with the prompt. LTX-2 tends to produce better audio when the prompt includes speech or dialogue; non-speech audio (ambient sounds, music) may be lower in quality. For best results, describe the audio you want explicitly, for example "rain tapping on a tin roof with distant thunder" rather than "stormy weather".

How do I report bugs or request features?

Please use our GitHub Issues page to report bugs or request new features. Before submitting, check whether a similar issue already exists.

Still Need Help?

Can't find what you're looking for? Reach out through our community channels.