LTX 2.0 is Now Available on fal

We’re thrilled to announce that LTX 2.0 is now live on fal, with day-0 support.
LTX 2.0 brings next-level text-to-video and image-to-video generation, offering a blend of speed, fidelity, and cinematic control for AI creators.

LTX 2.0 is a state-of-the-art open-source model, giving developers full freedom to build, experiment, and customize. At fal, supporting and extending open-source generative models is core to how we ship. This is a very powerful model, and we’re actively cooking up some exciting projects with it.

The model lets you create cinematic videos with natively synchronized audio, advanced camera motion, and sequences of up to 10 seconds at up to 60 fps. The distilled version is blazing fast, generating videos in under 30 seconds without compromising quality.
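To make that concrete, here's a minimal sketch of a text-to-video request using fal's Python client. The endpoint ID and the duration, fps, and generate_audio parameter names are illustrative assumptions rather than the confirmed schema; check the API documentation for exact details.

```python
# Minimal sketch of an LTX-2 text-to-video request with fal's Python client.
# The endpoint ID and the duration/fps/generate_audio parameter names are
# illustrative assumptions -- consult the fal API docs for the real schema.
import fal_client

result = fal_client.subscribe(
    "fal-ai/ltx-2/text-to-video",  # hypothetical endpoint ID
    arguments={
        "prompt": (
            "A cowboy walking through a dusty town at high noon, "
            "cinematic depth, realistic lighting, western mood."
        ),
        "duration": 10,          # up to 10-second sequences
        "fps": 60,               # up to 60 fps
        "generate_audio": True,  # native synchronized audio
    },
)
print(result["video"]["url"])  # assumed response shape
```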

Key Strengths of LTX 2.0

1. Cinematic Visuals

LTX 2.0 excels at producing movie-like visuals, capturing cinematic tone, dynamic lighting, and smooth motion that give every scene a strong sense of atmosphere and storytelling.

Prompt: "A cowboy walking through a dusty town at high noon, camera following from behind, cinematic depth, realistic lighting, western mood, 4K film grain."

2. Precise Dialogue & Audio Control

One of the biggest leaps in LTX 2.0 is its native audio generation, tightly synchronized with the video output.

Voices feel grounded, expressive, and realistic, with dialogue that naturally matches lip movement, pacing, and emotional tone. The model demonstrates strong control over timing, pauses, breath, and delivery, making spoken scenes feel authentic rather than synthetic.

LTX 2.0 adapts voices to the speaker’s persona and context. Accents, vocal texture, and delivery adjust naturally to the scene and character.

Prompt: "Slow handheld close-up. The man breathes steadily, eyes scanning the horizon. Wind tugs at his jacket and beard, sea spray drifting through the frame as waves rise and fall behind him. The boat creaks softly beneath the wind and water. After a brief pause, he mutters under his breath, 'Weather’s turning.'"

3. Style Adaptation

LTX 2.0 intelligently adapts to artistic styles, maintaining subject coherence while translating prompts into expressive animation. Whether your reference image is photorealistic or stylized, the model interprets it seamlessly into motion. Also impressive is how the accent and tone of generated voices adapt to the specified style: in the owl cartoon example below, the voice fits the character well and matches the audio style of a 1990s cartoon.

Prompt: "Style: animated movie. An elderly owl with fluffy grey feathers and spectacles perched on his beak stands on a thick tree branch, facing a smaller, downy baby owl. The wise owl, positioned on the left, speaks in a raspy, deep voice, saying, 'Focus, close your eyes, and flap!' As he finishes speaking, the baby owl, startled, jumps off the branch, emitting a high-pitched scream of fear that echoes briefly through the forest. The baby owl quickly disappears from view, leaving the elder owl standing alone on the branch, his expression conveying disappointment as his head tilts slightly. The ambient soundscape includes the gentle rustling of leaves in the breeze and distant birdsong, creating a peaceful yet slightly melancholic atmosphere."

Prompt: "Handheld close-up in rain. He whips his head toward what’s ahead, holding a three-quarter profile as he lets out a gasp."

4. Advanced Camera Controls

Camera behavior in LTX 2.0 feels deliberate: smooth pans, steady zooms, and dynamic focus transitions. It gives creators the ability to experiment with professional-grade movement without post-processing.

Prompt: "Timelapse of a woman standing still amid a busy neon-lit street at night. The camera slowly dollies in toward her face as people blur past, their motion emphasizing her calm presence. City lights flicker and reflections shift across her denim jacket."

Full Model Endpoints:

LTX-2 19B | Text to Video | fal.ai
Generate video with audio from text using LTX-2
LTX-2 19B | Image to Video | fal.ai
Generate video with audio from images using LTX-2
LTX-2 19B | Video to Video | fal.ai
Extend video with audio using LTX-2

Distilled Endpoints:

LTX-2 19B Distilled | Text to Video | fal.ai
Generate video with audio from text using LTX-2 Distilled
LTX-2 19B Distilled | Image to Video | fal.ai
Generate video with audio from images using LTX-2 Distilled
LTX-2 19B Distilled | Video to Video | fal.ai
Extend videos with audio using LTX-2 Distilled
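Under the same assumptions as the sketch above, switching to a distilled endpoint for faster generation would amount to swapping the endpoint ID:

```python
# Hypothetical distilled endpoint ID -- same request shape, faster generation.
import fal_client

result = fal_client.subscribe(
    "fal-ai/ltx-2/distilled/text-to-video",  # assumed ID, check the docs
    arguments={"prompt": "A cowboy walking through a dusty town at high noon."},
)
```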

Full+LoRA Endpoints:

LTX-2 19B | Text to Video | fal.ai
Generate video with audio from text using LTX-2 and custom LoRA
LTX-2 19B | Image to Video | fal.ai
Generate video with audio from images using LTX-2 and custom LoRA
LTX-2 19B | Video to Video | fal.ai
Extend video with audio using LTX-2 and custom LoRA


Distilled+LoRA Endpoints:

LTX-2 19B Distilled | Text to Video | fal.ai
Generate video with audio from text using LTX-2 Distilled and custom LoRA
LTX-2 19B Distilled | Image to Video | fal.ai
Generate video with audio from images using LTX-2 Distilled and custom LoRA
LTX-2 19B Distilled | Video to Video | fal.ai
Extend videos with audio using LTX-2 Distilled and custom LoRA
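For the LoRA variants, the request additionally carries a reference to your custom LoRA weights. The sketch below assumes a loras list with path and scale fields, which is a common convention on fal endpoints rather than a confirmed LTX-2 schema, along with a hypothetical endpoint ID.

```python
# Hedged sketch: adding a custom LoRA to a text-to-video request.
# The "loras" argument shape (path + scale) and the endpoint ID are
# assumptions based on common fal conventions, not a confirmed schema.
import fal_client

result = fal_client.subscribe(
    "fal-ai/ltx-2/text-to-video/lora",  # hypothetical LoRA endpoint ID
    arguments={
        "prompt": "A cowboy walking through a dusty town at high noon.",
        "loras": [
            {
                "path": "https://example.com/my-style-lora.safetensors",  # your LoRA weights
                "scale": 0.8,  # how strongly the LoRA steers the output
            }
        ],
    },
)
print(result["video"]["url"])
```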

Getting Started with LTX 2.0

The easiest way to explore LTX 2.0's capabilities is through fal's Playground, where you can experiment with prompts and see results immediately. A detailed guide on integrating LTX 2.0 into your platform is available in our API documentation.
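When integrating into a platform, longer generations often fit better with fal's queue API than with a blocking call: submit the job, then fetch the result when it completes. Below is a minimal sketch with the Python client; the endpoint ID, argument names, and response shape are assumptions, so defer to the API documentation.

```python
# Sketch of queue-based integration: submit a job, then wait for the result.
# Endpoint ID, image_url argument, and response fields are illustrative
# assumptions -- see fal's API documentation for the exact schema.
import fal_client

handle = fal_client.submit(
    "fal-ai/ltx-2/image-to-video",  # hypothetical endpoint ID
    arguments={
        "prompt": "Slow handheld close-up; wind tugs at his jacket and beard.",
        "image_url": "https://example.com/reference-frame.jpg",
    },
)
result = handle.get()  # blocks until the job completes
print(result["video"]["url"])
```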


Stay tuned to our Reddit, blog, Twitter, or Discord for the latest updates on generative media and new model releases!