alphaXiv

Cosmos 3: Omnimodal World Models for Physical AI

01 Jun 2026

Aditi

Niket Agarwal

Arslan Ali

NVIDIA introduces Cosmos 3, a family of omnimodal world models for Physical AI that jointly process and generate language, image, video, audio, and action sequences within a unified Mixture-of-Transformers architecture. The framework achieves competitive or state-of-the-art performance across 48 understanding benchmarks and leads open-source models on various image, video, and robot policy generation tasks, including a 39.7% success rate on RoboLab.
