4D-Rotor Gaussian Splatting: Towards Efficient Novel View Synthesis for Dynamic Scenes

Yuanxing Duan*1     Fangyin Wei*3     Qiyu Dai1,2     Yuhang He1     Wenzheng Chen1,4     Baoquan Chen1,2
1Peking University    2SKL of General AI    3Princeton University    4NVIDIA
* Equal contribution

Abstract

We consider the problem of novel-view synthesis (NVS) for dynamic scenes. Recent neural approaches have accomplished exceptional NVS results for static 3D scenes, but extensions to 4D time-varying scenes remain non-trivial. Prior efforts often encode dynamics by learning a canonical space plus implicit or explicit deformation fields, which struggle with challenging scenarios such as sudden movements, and with producing high-fidelity renderings. In this paper, we introduce 4D-Rotor Gaussian Splatting (4DRotorGS), a novel method that represents dynamic scenes with anisotropic 4D XYZT Gaussians, inspired by the success of 3D Gaussian Splatting in static scenes [Kerbl et al. 2023]. We model dynamics at each timestamp by temporally slicing the 4D Gaussians, which naturally yields dynamic 3D Gaussians that can be seamlessly projected into images. As an explicit spatio-temporal representation, 4DRotorGS demonstrates powerful capabilities for modeling complicated dynamics and fine details, especially for scenes with abrupt motions. We further implement our temporal slicing and splatting techniques in a highly optimized CUDA acceleration framework, achieving real-time rendering speeds of up to 277 FPS on an RTX 3090 GPU and 583 FPS on an RTX 4090 GPU (offscreen rendering). Rigorous evaluations on scenes with diverse motions showcase the superior efficiency and effectiveness of 4DRotorGS, which consistently outperforms existing methods both quantitatively and qualitatively.

XYT Slicing

A Simplified 2D Illustration of the Proposed Temporal Slicing. (a) We model 2D dynamics with 3D XYT ellipsoids and slice them with different time queries. (b) The sliced 3D ellipsoids form 2D dynamic ellipses at each timestamp.


Method

Framework Overview: after initialization, we first temporally slice the 4D Gaussians, whose spatio-temporal movements are modeled with rotors. Dynamics such as the flickering flames naturally evolve through time, even within a short period of 0.27 seconds. The sliced 3D Gaussians are then projected into 2D images using differentiable rasterization. The gradients from the image loss are back-propagated and guide the adaptive density control of the 4D Gaussians.
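The temporal slicing step can be understood through standard Gaussian conditioning: fixing the time coordinate of a 4D XYZT Gaussian yields a 3D Gaussian whose mean drifts with time (motion) and whose opacity fades away from its temporal center (lifespan). The sketch below illustrates this math with numpy; the function name and the explicit covariance-block arithmetic are illustrative assumptions, not the paper's CUDA API, which parametrizes the covariance via rotors rather than a raw 4x4 matrix.

```python
import numpy as np

def slice_4d_gaussian(mu, cov, t):
    """Condition a 4D (x, y, z, t) Gaussian on a time query t.

    Illustrative sketch (not the paper's implementation): returns the
    3D mean, 3D covariance, and a temporal opacity factor in (0, 1].
    """
    mu_xyz, mu_t = mu[:3], mu[3]
    cov_xyz = cov[:3, :3]   # spatial block
    cov_xt = cov[:3, 3]     # space-time coupling: drives motion over time
    cov_tt = cov[3, 3]      # temporal variance: controls the lifespan

    # Conditional mean: the sliced Gaussian's center moves with t.
    mean_3d = mu_xyz + cov_xt * (t - mu_t) / cov_tt
    # Conditional covariance: Schur complement of the temporal block.
    cov_3d = cov_xyz - np.outer(cov_xt, cov_xt) / cov_tt
    # Marginal temporal density: fades opacity away from mu_t.
    opacity_scale = np.exp(-0.5 * (t - mu_t) ** 2 / cov_tt)
    return mean_3d, cov_3d, opacity_scale
```

With a space-time coupling term, the sliced center translates linearly in t, which is how a single anisotropic 4D Gaussian encodes motion without an explicit deformation field; a purely block-diagonal covariance reduces to a static 3D Gaussian that merely fades in and out.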


4DRotorGS vs. 3DGS

Our rotor-based representation supports both static 3D and dynamic 4D scene modeling. On static 3D scenes, our framework matches the results of 3DGS.


Real-Time Onscreen Rendering

4D-Rotor Gaussian Splatting achieves high-quality dynamic reconstruction with an interactive onscreen rendering speed of 144 FPS at 3840x2107 resolution and an offscreen rendering speed of 583 FPS at 1352x1014 resolution on a single NVIDIA RTX 4090.

BibTeX

@inproceedings{duan:2024:4drotorgs,
    author    = "Yuanxing Duan and Fangyin Wei and Qiyu Dai and Yuhang He and Wenzheng Chen and Baoquan Chen",
    title     = "4D-Rotor Gaussian Splatting: Towards Efficient Novel-View Synthesis for Dynamic Scenes",
    booktitle = "Proc. SIGGRAPH",
    year      = "2024",
    month     = jul
}