Generate videos

These models generate videos from text prompts, images, and reference materials. The field is advancing fast — most models now generate native audio alongside video.

Models we recommend

For cinematic realism and physical accuracy

Runway Gen-4.5 is ranked #1 on the Artificial Analysis text-to-video benchmark. It produces videos with realistic physics: objects have weight, liquids flow naturally, and fine details like hair and fabric stay coherent across frames. It suits polished, cinematic clips where visual fidelity matters most.

Google Veo 3.1 and Veo 3.1 Fast are strong alternatives with native audio generation. Veo 3.1 Fast is a good pick when you want high quality with quicker turnaround. Veo 3.1 Lite is a more affordable option for high-volume use.

For multi-shot storytelling with audio

Kling Video 3.0 generates cinematic videos up to 15 seconds with native audio — including lip-synced dialogue, sound effects, and ambient sound. Its multi-shot mode lets you define up to 6 connected scenes in a single generation, making it ideal for short narratives, product demos, and ads.

Kling Video 3.0 Omni adds reference-based generation and video editing on top. Upload reference images to keep character appearance consistent across scenes, or feed in a reference video for style and camera movement transfer.
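Before submitting a multi-shot job, it can help to check a shot list against the limits quoted above. The following is a minimal sketch, not Kling's API: the `Shot` class and `validate_shot_plan` function are hypothetical names, and it assumes the 15-second cap applies to the combined duration of all shots, which the text above does not state explicitly.

```python
from dataclasses import dataclass

MAX_SHOTS = 6            # "up to 6 connected scenes" per the text above
MAX_TOTAL_SECONDS = 15   # "up to 15 seconds"; assumed to cover all shots combined


@dataclass
class Shot:
    """One scene in a multi-shot plan (hypothetical structure, not an API type)."""
    prompt: str
    seconds: float


def validate_shot_plan(shots):
    """Return a list of problems with the plan; an empty list means it fits."""
    problems = []
    if not shots:
        problems.append("plan has no shots")
    if len(shots) > MAX_SHOTS:
        problems.append(f"{len(shots)} shots exceeds the limit of {MAX_SHOTS}")
    total = sum(s.seconds for s in shots)
    if total > MAX_TOTAL_SECONDS:
        problems.append(f"total duration {total:g}s exceeds {MAX_TOTAL_SECONDS}s")
    for i, s in enumerate(shots, start=1):
        if s.seconds <= 0:
            problems.append(f"shot {i} has a non-positive duration")
        if not s.prompt.strip():
            problems.append(f"shot {i} has an empty prompt")
    return problems
```

For example, three 5-second shots pass, while seven 3-second shots fail on both the shot count and the total duration.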

For multimodal reference inputs

Seedance 2.0 from ByteDance accepts up to 9 reference images, 3 video clips, and 3 audio files, all combinable in a single prompt. It supports text-to-video (T2V), image-to-video (I2V), video continuation, character consistency, motion transfer, and lip-synced dialogue with intelligent duration control. Seedance 2.0 Fast trades some quality for speed.
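The reference limits above are easy to exceed when assembling inputs from a larger asset folder. This sketch checks the counts before a request is built. The function name, argument names, and the `REFERENCE_LIMITS` table are illustrative assumptions, not part of the model's input schema.

```python
# Limits quoted in the text above: 9 images, 3 video clips, 3 audio files.
REFERENCE_LIMITS = {"images": 9, "videos": 3, "audio": 3}


def check_reference_counts(images=(), videos=(), audio=()):
    """Return {kind: count} for every reference kind that is over its limit."""
    counts = {"images": len(images), "videos": len(videos), "audio": len(audio)}
    return {kind: n for kind, n in counts.items() if n > REFERENCE_LIMITS[kind]}
```

An empty result means the set of references fits within the stated limits.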

Seedance 1.5 Pro offers cinema-quality output with multi-language lip-sync and cinematic camera movements.

For fast, audio-rich social content

Grok Imagine Video from xAI generates short video clips with synchronized audio in around 30 seconds. Multiple aspect ratios (16:9, 9:16, 1:1) make it a natural fit for TikTok, Reels, and Shorts.

For start/end frame control

Vidu Q3 Pro supports a start-end-to-video mode: provide the first and last frames and it generates a smooth transition between them. It outputs up to 16 seconds at 1080p with audio. Vidu Q3 Turbo is a faster, cheaper variant.

For balanced cost and quality

Hailuo 2.3 from Minimax supports both text-to-video and image-to-video with standard and pro quality tiers. Hailuo 2.3 Fast trades some quality for speed.

PixVerse v5.6 is another cost-effective choice with unit-based pricing.

For fast iteration with draft mode

PrunaAI p-video offers text-to-video, image-to-video, and audio-to-video in a single endpoint. Its draft mode generates previews 4× faster for quick iteration before final rendering. It outputs up to 1080p at 48 FPS.
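The draft-then-final workflow described above can be sketched as a small loop: preview every candidate prompt in draft mode, pick one, and render only that one at full quality. Everything here is an assumption for illustration. `generate` stands in for whatever call you use to run the model, and its `draft` keyword is a hypothetical flag rather than a documented parameter.

```python
def iterate_then_render(prompts, generate, choose):
    """Preview each prompt as a draft, then render only the chosen one in full.

    generate(prompt, draft=bool) -> result   # hypothetical model call
    choose(list of (prompt, draft_result)) -> the prompt to finalize
    """
    # Cheap pass: one draft preview per candidate prompt.
    drafts = [(prompt, generate(prompt, draft=True)) for prompt in prompts]
    # Selection is left to the caller (a human review step, a scorer, etc.).
    best_prompt = choose(drafts)
    # Expensive pass: a single full-quality render.
    return best_prompt, generate(best_prompt, draft=False)
```

With N candidate prompts this costs N draft renders plus one final render, instead of N final renders.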

For open source

The Wan video models are excellent open-source options, competitive with many proprietary models. Wan 2.7 T2V is the newest generation, with a 27-billion-parameter mixture-of-experts (MoE) architecture. Wan 2.5 T2V and the fast variants (Wan 2.5 T2V Fast, Wan 2.5 I2V Fast) are among the quickest options on Replicate.

Other rankings

Generative video is a rapidly advancing field. Check out the arena and leaderboard at Artificial Analysis to see what's popular today.


