VEED Fabric 1.0 on fal: Turn Any Image Into a Talking Video

Image-to-video tools are everywhere right now. VEED's latest model, Fabric 1.0, lets you generate longer videos faster. It can animate any image, giving end-users the flexibility to go beyond preset avatars and create a wide variety of content.
What is VEED Fabric 1.0?
Fabric 1.0 is an Image + Speech Audio to Video model. It uses an image as the starting frame and generates the continuation of the scene based on the audio. The audio drives not only lip movements but also body, hand, and head movements. The avatar’s motion follows the rhythm of the spoken voice.
Why Fabric 1.0 over other video models?
Most talking head generators offer only a handful of stock avatars and limited creative control. Fabric takes a different approach. It can animate any input image—photos, illustrations, mascots, or characters—while preserving the original style. The model is suitable for both individual and enterprise use cases, generating ready-to-share videos quickly and at low cost, without the need for custom filming.
With Fabric, you can:
- Generate lip sync videos with real people, pets, and cartoon characters by simply adding an image and speech audio

- Generate videos from stylized or edited images (combine with an image editing model such as Nano-banana)

- Generate videos with synthetic voices (combine with a TTS service such as ElevenLabs)

What powers Fabric 1.0?
The core of the pipeline is a state-of-the-art Diffusion Transformer (DiT). We use the image for both first-frame and style conditioning, and we use the audio to drive the continuation of the frame sequence. The model was trained on diverse footage of people talking, enabling robust and precise lip sync with any character that can speak.
Here is a high-level breakdown of the specs:
- Inputs: Image (jpg/jpeg/png) and audio (mp3/wav/m4a/aac) under 10 MB
- Max video length: 1 minute
- FPS: 25 only
- Aspect ratio: all popular formats (including 16:9, 4:3, 1:1, 3:4, 9:16), based on the aspect ratio of the source image
- Resolution: 480p and 720p are supported for 16:9. For other aspect ratios, the resolution is scaled so that the total pixel count (height × width) stays the same. For example, 1:1 becomes 640×640 at 480p and 960×960 at 720p.
- Generation time:
  - 480p: ~1.5 minutes per 10 seconds of video
  - 720p: ~5 minutes per 10 seconds of video
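To make the resolution and timing figures above concrete, here is a minimal Python sketch. It assumes that the output keeps the pixel area of the 16:9 frame, that dimensions are rounded to the nearest integer, and that generation time scales linearly with clip length. None of these are confirmed by the spec list; they are simply consistent with the 1:1 examples and the per-10-second figures. The function names are hypothetical.

```python
import math

# Minutes of generation per 10 seconds of output, taken from the figures above.
# Assumption: generation time scales linearly with clip length.
MINUTES_PER_10S = {"480p": 1.5, "720p": 5.0}
MAX_VIDEO_SECONDS = 60  # max video length: 1 minute


def output_size(aspect_w, aspect_h, resolution="480p"):
    """Return (width, height) with the same pixel area as the 16:9 frame.

    Assumption: values are rounded to the nearest integer; the model's
    actual rounding rule is not stated.
    """
    if resolution not in MINUTES_PER_10S:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    base_height = int(resolution.rstrip("p"))
    area = base_height * base_height * 16 / 9  # pixel area of the 16:9 frame
    ratio = aspect_w / aspect_h
    width = round(math.sqrt(area * ratio))
    height = round(math.sqrt(area / ratio))
    return width, height


def estimated_minutes(video_seconds, resolution="480p"):
    """Rough generation-time estimate in minutes for a clip of the given length."""
    if resolution not in MINUTES_PER_10S:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if not 0 < video_seconds <= MAX_VIDEO_SECONDS:
        raise ValueError("video length must be between 0 and 60 seconds")
    return MINUTES_PER_10S[resolution] * video_seconds / 10


print(output_size(1, 1, "480p"))      # (640, 640)
print(output_size(1, 1, "720p"))      # (960, 960)
print(estimated_minutes(60, "720p"))  # 30.0
```

Under these assumptions, a full one-minute clip at 720p would take roughly half an hour, which is worth knowing before queuing a large batch.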
How Fabric 1.0 works
- Go to Fabric 1.0 on fal.
- Choose an image – Upload any character or product shot. Fabric accepts photos, illustrations, 3D renders and more.
- Provide audio or text – Drop in a voice recording or type a script. If you choose text, VEED’s AI voice generator can synthesize natural narration.
- Generate your talking video – Fabric matches the voice to your character and animates the mouth and face. Within minutes, you have a video file.
- Publish anywhere – Export in square, vertical or landscape formats and share to TikTok, Instagram, YouTube, ads or presentations.
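The same flow can also be scripted. The sketch below uses fal's Python client (`fal_client`, installed with `pip install fal-client` and authenticated through the `FAL_KEY` environment variable). The endpoint id `veed/fabric-1.0` and the argument names `image_url`, `audio_url`, and `resolution` are assumptions not stated in this post, so check the model's API page before relying on them. The local checks treat the 10 MB limit as applying to each file separately, which is also an assumption.

```python
import os

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".aac"}
MAX_BYTES = 10 * 1024 * 1024  # assumption: "under 10 MB" applies per file


def check_input(filename, size_bytes, allowed_exts):
    """Return a list of problems with an input file (empty if it looks fine)."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in allowed_exts:
        problems.append(f"unsupported extension {ext!r}")
    if size_bytes >= MAX_BYTES:
        problems.append("file is not under 10 MB")
    return problems


def build_arguments(image_url, audio_url, resolution="480p"):
    """Build the request payload. Key names are assumed, not confirmed."""
    if resolution not in ("480p", "720p"):
        raise ValueError(f"unsupported resolution: {resolution!r}")
    return {"image_url": image_url, "audio_url": audio_url, "resolution": resolution}


def generate(image_path, audio_path, resolution="480p"):
    """Validate local files, upload them, and run the model (needs network access)."""
    import fal_client  # imported lazily so the helpers above work without it

    for path, exts in ((image_path, IMAGE_EXTS), (audio_path, AUDIO_EXTS)):
        problems = check_input(path, os.path.getsize(path), exts)
        if problems:
            raise ValueError(f"{path}: {', '.join(problems)}")

    arguments = build_arguments(
        fal_client.upload_file(image_path),
        fal_client.upload_file(audio_path),
        resolution,
    )
    # "veed/fabric-1.0" is an assumed endpoint id.
    return fal_client.subscribe("veed/fabric-1.0", arguments=arguments)
```

Calling `generate("mascot.png", "intro.mp3")` would block until the job finishes and return the result as a dictionary. Validating inputs locally first avoids spending upload time on a file the model would reject.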
What you can create
For developers, the Fabric 1.0 API opens up programmatic generation—ideal for automating content production or integrating talking videos into your own products.
- Product‑plus‑avatar videos: Put your product in front of the camera—literally. Pair an image of your device or packaging with a friendly avatar who explains its features.
- Stylized character variations: Generate alternate versions of a person in claymation, anime or other art styles for eye‑catching campaigns.
- Lip‑synced clips from audio: Turn any voice note or podcast snippet into a talking video, keeping the speaker’s personality intact.
- Text‑to‑video: Type a script and let Fabric 1.0 handle the voiceover and animation. Ideal for quick tutorials and social posts when you’re on the go.
- Campaign A/B testing: Quickly produce multiple variations of the same message using different characters or looks.
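As an illustration of the A/B testing use case, the following sketch builds one request payload for every combination of character image and voice track. It only prepares the payloads and does not call any API. The payload keys (`image_url`, `audio_url`, `resolution`) are assumed names, and the URLs are placeholders.

```python
from itertools import product


def build_variants(images, audios, resolution="480p"):
    """Return one labeled payload per (image, audio) combination.

    `images` and `audios` map a short label to a hosted file URL.
    Payload key names are assumptions, not confirmed by this post.
    """
    variants = []
    for (img_label, img_url), (aud_label, aud_url) in product(
        images.items(), audios.items()
    ):
        variants.append(
            {
                "label": f"{img_label}+{aud_label}",
                "arguments": {
                    "image_url": img_url,
                    "audio_url": aud_url,
                    "resolution": resolution,
                },
            }
        )
    return variants


# Placeholder URLs for illustration only.
images = {
    "photo": "https://example.com/presenter.png",
    "anime": "https://example.com/presenter-anime.png",
}
audios = {
    "formal": "https://example.com/pitch-formal.mp3",
    "casual": "https://example.com/pitch-casual.mp3",
}

for variant in build_variants(images, audios):
    print(variant["label"])
# photo+formal
# photo+casual
# anime+formal
# anime+casual
```

Keeping a label with each payload makes it straightforward to match finished videos back to the variant that produced them when comparing campaign results.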
Benefits at a glance
- Studio-grade quality: Get consistent storytelling with expressive output, accurate lip sync, and vivid details.
- Customizable output: Fabric 1.0 doesn’t tie you down to preset avatars. It can animate any image or character.
- Limitless use cases: Your end-users have the flexibility to create a wide range of videos that are 7x longer, up to 1 minute each.
Getting started
You can start creating with Fabric today on fal. Go to the playground, upload your image, add your audio or text, and hit generate. Within minutes you’ll have a talking video ready for editing.
Try VEED Fabric 1.0 on fal now.