Ismayil Ismayilov

Ulysses Unbound: Experiments in Communication–Computation Overlap

As video diffusion models scale, sequence lengths get uncomfortably large. A practical answer is context parallelism, and within that family, Ulysses is the canonical approach. The core idea is simple: full sequence, sharded heads. During attention, each GPU sees the whole sequence but only its own subset of attention heads. It maps well to modern GPU clusters built for high-throughput all-to-all communication, while still letting you