Running NVIDIA Lyra 2.0 on RunPod H100

NVIDIA Lyra 2.0 is a state-of-the-art generative model for creating immersive 3D world videos from a single image. Given one photo and a text caption, it generates a zoom-in and zoom-out video that explores the scene — building a coherent 3D world around it. We tested it on a RunPod H100 pod using lyra3d, a lightweight orchestration framework that handles the full bootstrap, inference, and result download over SSH.

Results

Both tests used 81 frames per direction at 480×832 resolution on an H100 80GB.

Test 1 — Desert Military Outpost (Zoom In / Zoom Out)

Input image

Lyra 2.0 output — zoom in then zoom out

Caption used: “A cinematic aerial view of a massive futuristic desert military outpost with modular domed buildings, solar panel arrays, glass skyscrapers, pipelines, and military vehicles rising from golden sand dunes under a warm hazy sky.”

Test 2 — Urban Glass Skyscraper (Orbit Horizontal)

Input image

Lyra 2.0 output — horizontal orbit trajectory

Caption used: “A modern glass curtain wall skyscraper reflecting an ornate classical building, lush green trees in the foreground, urban city scene with bright blue sky.”

Setup Overview

The full setup runs on a fresh RunPod H100 pod with no pre-installed dependencies. The lyra3d bootstrap script handles everything automatically.

1. Launch a RunPod pod

Go to runpod.io and launch a pod:

GPU: H100 80GB (required — model uses ~75GB VRAM)
Template: RunPod PyTorch (Ubuntu 24.04, CUDA 12.8)
Add your SSH public key under Settings → SSH Public Keys
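
Once the pod is up, a quick check can confirm that the GPU has enough memory before starting the long bootstrap. The sketch below is not part of lyra3d. The helper name has_enough_vram and the 75 GiB threshold are assumptions taken from the ~75GB figure above. It reads total memory with nvidia-smi, which reports values in MiB.

```shell
# Minimum GPU memory in MiB. 75 GiB is an assumption based on the ~75GB
# usage quoted in this guide, not an official requirement.
MIN_VRAM_MIB=$((75 * 1024))

# Succeeds (exit 0) when the given memory size in MiB meets the minimum.
has_enough_vram() {
    local total_mib="$1"
    [ "$total_mib" -ge "$MIN_VRAM_MIB" ]
}

# On the pod: query the first GPU's total memory (reported in MiB).
if command -v nvidia-smi >/dev/null 2>&1; then
    total=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)
    if has_enough_vram "$total" 2>/dev/null; then
        echo "GPU memory OK: ${total} MiB"
    else
        echo "GPU memory too small or unreadable: '${total}' MiB" >&2
    fi
fi
```

On a machine without nvidia-smi the query is skipped and only the helper is defined.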

2. Clone lyra3d and configure

git clone https://github.com/tech-microcosm/lyra3d.git
cd lyra3d
cp .env.example .env
# Edit .env with the pod IP and SSH port from the RunPod dashboard

3. Upload and run the bootstrap

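# Create the target directory on the pod, then upload the bootstrap script
ssh -i ~/.ssh/id_ed25519 -p <PORT> root@<IP> mkdir -p /workspace/lyra3d
scp -i ~/.ssh/id_ed25519 -P <PORT> remote/bootstrap/lyra2_bootstrap.sh root@<IP>:/workspace/lyra3d/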

# SSH in and run it in a detached screen session (~38 min on a fresh pod)
ssh -i ~/.ssh/id_ed25519 -p <PORT> root@<IP>
screen -dmS lyra_boot bash /workspace/lyra3d/lyra2_bootstrap.sh
screen -r lyra_boot   # reattach to monitor; detach again with Ctrl-A then D
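
The ssh and scp commands in this guide repeat the same key, port, and IP each time. One optional convenience is an SSH config entry, sketched below. The function add_ssh_host and the alias lyra-pod are made-up names for illustration, and nothing here is required by lyra3d.

```shell
# Append a Host entry to an SSH config file so the key, port, and user
# do not have to be repeated on every ssh/scp call.
# Arguments: config file, alias, pod IP, SSH port, private key path.
add_ssh_host() {
    local config_file="$1"
    local host_alias="$2"
    local ip="$3"
    local port="$4"
    local key="$5"
    mkdir -p "$(dirname "$config_file")"
    cat >> "$config_file" <<EOF

Host ${host_alias}
    HostName ${ip}
    Port ${port}
    User root
    IdentityFile ${key}
EOF
}

# Example (fill in the values from the RunPod dashboard before running):
# add_ssh_host ~/.ssh/config lyra-pod <IP> <PORT> ~/.ssh/id_ed25519
# ssh lyra-pod
# scp my_image.png lyra-pod:/workspace/lyra3d/inputs/
```

RunPod may assign a new IP or port when a pod is restarted, so the entry may need updating after a restart.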

4. Place your input and run inference

# Upload your image
scp -P <PORT> my_image.png root@<IP>:/workspace/lyra3d/inputs/

# On the pod: set up sample dir and launch inference
LYRA=/workspace/lyra3d/Lyra/Lyra-2
mkdir -p ${LYRA}/assets/my_sample
cp /workspace/lyra3d/inputs/my_image.png ${LYRA}/assets/my_sample/00.png
echo "Your scene description here." > ${LYRA}/assets/my_sample/00.txt

# Run inference (zoom in + out, minimum valid frame count).
# Note: --checkpoint_dir below is a relative path, so it resolves against the current directory.
export NVTE_FUSED_ATTN=0
VENV=/workspace/lyra3d/venv
PYTHONPATH=${LYRA} ${VENV}/bin/python -m lyra_2._src.inference.lyra2_zoomgs_inference \
    --input_image_path ${LYRA}/assets/my_sample \
    --sample_id 0 \
    --experiment lyra2 \
    --checkpoint_dir checkpoints/model \
    --prompt_dir ${LYRA}/assets/my_sample \
    --output_path /workspace/lyra3d/outputs/my_run \
    --num_frames_zoom_in 81 \
    --num_frames_zoom_out 81
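
Because a run takes a long time, it can help to confirm that the image and caption pair exists before launching. The sketch below assumes the two-digit naming shown above (00.png and 00.txt for sample id 0). The helper name check_sample is hypothetical, and the real loader may accept other layouts.

```shell
# Check that a sample directory holds a non-empty image/caption pair.
# Assumes files are named with a zero-padded two-digit id (00.png, 00.txt),
# matching the layout used in this guide; other schemes are not handled.
check_sample() {
    local dir="$1"
    local id="$2"
    local stem
    local ext
    local status=0
    stem=$(printf '%02d' "$id")
    for ext in png txt; do
        if [ ! -s "${dir}/${stem}.${ext}" ]; then
            echo "missing or empty: ${dir}/${stem}.${ext}" >&2
            status=1
        fi
    done
    return "$status"
}

# Example on the pod:
# check_sample "${LYRA}/assets/my_sample" 0 && echo "sample looks complete"
```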

5. Download results

scp -P <PORT> -r root@<IP>:/workspace/lyra3d/outputs/my_run ./outputs/

Timing (H100 80GB)

Stage	Time
Bootstrap (fresh pod)	~38 min
Checkpoint loading	~13 min
Inference (81 + 81 frames)	~22 min
Total first run	~73 min

Subsequent runs on the same pod skip bootstrap and checkpoint loading, so each run takes only the ~22 minutes of inference.

Trajectory Options

Lyra supports several camera trajectory modes via --zoom_in_trajectory:

Mode	Effect
`horizontal_zoom`	Default zoom in/out with horizontal drift
`orbit_horizontal`	Horizontal orbital sweep around the scene
`orbit_vertical`	Vertical orbital arc
`spiral`	Inward spiral towards scene center
`spiral_outwards`	Outward spiral away from center
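
A mistyped mode name is cheap to catch before starting a run. The sketch below checks a name against the five modes listed in the table. The helper is_valid_trajectory is a hypothetical name, and the list is copied from this guide rather than read from Lyra itself, so it may fall out of date.

```shell
# Trajectory names copied from the table in this guide.
VALID_TRAJECTORIES="horizontal_zoom orbit_horizontal orbit_vertical spiral spiral_outwards"

# Succeeds (exit 0) when the given name is one of the listed modes.
is_valid_trajectory() {
    local name="$1"
    local t
    for t in $VALID_TRAJECTORIES; do
        if [ "$t" = "$name" ]; then
            return 0
        fi
    done
    return 1
}

# Example:
# is_valid_trajectory orbit_horizontal && echo "ok to pass to --zoom_in_trajectory"
```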

The frame count minus one must be divisible by 80, so the minimum valid value is 81, followed by 161, 241, and so on.
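
The frame count rule can be checked in the shell before launching. The two helpers below are illustrative and not part of lyra3d. One validates a count, and the other rounds a requested count up to the next valid value.

```shell
# Succeeds when n is a whole number >= 81 and (n - 1) is divisible by 80.
is_valid_frame_count() {
    local n="$1"
    case "$n" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$n" -ge 81 ] && [ $(( ($n - 1) % 80 )) -eq 0 ]
}

# Print the smallest valid frame count that is >= the requested value.
next_valid_frame_count() {
    local n="$1"
    if [ "$n" -le 81 ]; then
        echo 81
    else
        # Round (n - 1) up to a multiple of 80, then add 1 back.
        echo $(( (($n - 1 + 79) / 80) * 80 + 1 ))
    fi
}

# Example:
# is_valid_frame_count 161 && echo "161 is valid"
# next_valid_frame_count 100
```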

Key Notes

NVTE_FUSED_ATTN=0 is required to avoid a cuDNN error on H100 with transformer_engine
The model uses ~75GB VRAM — an H100 80GB is the minimum recommended GPU
Checkpoints are ~40GB and are downloaded once from HuggingFace (nvidia/Lyra-2.0)
Full source and bootstrap scripts: github.com/tech-microcosm/lyra3d