30-second videos with synchronized audio, from LTX2.
Extending LTX2's base 5-second generation budget by ×6.
Additional 16-second samples with synchronized audio, from LTX2.
Extending LTX2's base 5-second generation budget by roughly ×3.
30-second videos from HunyuanVideo-1.5.
Extending HunyuanVideo-1.5's base 5-second generation budget by ×6.
30-second generation against bidirectional long-video baselines (RIFLEx, UltraVico) on the same prompts.
RIFLEx is built on CogVideoX (5B); UltraVico and FlowLong (Ours) both use Wan2.1 (1.3B).
30-second generation against autoregressive long-video baselines on 10 diverse prompts.
All baselines and FlowLong (Ours) are built on Wan2.1 (1.3B).
FlowLong's long-generation idea applied to text-to-3DGS: VIST3A runs out of frames mid-trajectory while FlowLong (Ours) continues seamlessly.