Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation
Preprint
¹GenAI, Meta ²University of Oxford
Flex3D comprises two stages:
(1) candidate view generation and selection, and
(2) 3D reconstruction using FlexRM.
In the first stage, an input image or textual prompt drives the generation of a diverse set of candidate views through fine-tuned multi-view and video diffusion models.
These views are subsequently filtered based on quality and consistency using a view selection mechanism.
The second stage feeds the selected high-quality views to FlexRM, which reconstructs the 3D object using a tri-plane representation decoded into 3D Gaussians.
Summary: Flex3D is a two-stage pipeline that generates high-quality 3D assets from single images or text prompts.
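The two-stage control flow described above can be sketched in a few lines of Python. This is only an illustrative outline, not the Flex3D implementation: `CandidateView`, `select_views`, `run_pipeline`, the per-view `quality` and `consistency` scores, and the threshold values are all hypothetical names and numbers introduced here. The page does not specify how views are scored, and the reconstruction step is passed in as a plain callable standing in for FlexRM.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass(frozen=True)
class CandidateView:
    """One generated candidate view with hypothetical scores in [0, 1]."""
    name: str
    quality: float      # assumed per-view image quality score
    consistency: float  # assumed agreement with the other views


def select_views(
    candidates: Sequence[CandidateView],
    min_quality: float = 0.5,      # assumed threshold, not from the source
    min_consistency: float = 0.5,  # assumed threshold, not from the source
) -> List[CandidateView]:
    """Stage 1 filter: keep views that pass both thresholds, in input order."""
    return [
        v for v in candidates
        if v.quality >= min_quality and v.consistency >= min_consistency
    ]


def run_pipeline(
    candidates: Sequence[CandidateView],
    reconstruct: Callable[[List[CandidateView]], Any],
) -> Any:
    """Stage 1 (selection) followed by stage 2 (reconstruction stand-in)."""
    selected = select_views(candidates)
    if not selected:
        raise ValueError("no candidate view passed selection")
    # The stand-in receives however many views survived the filter.
    return reconstruct(selected)


# Toy usage: the "reconstruction" simply reports which views it received.
views = [
    CandidateView("front", quality=0.9, consistency=0.95),
    CandidateView("back", quality=0.3, consistency=0.8),
    CandidateView("left", quality=0.8, consistency=0.4),
    CandidateView("right", quality=0.7, consistency=0.9),
]
used = run_pipeline(views, reconstruct=lambda vs: [v.name for v in vs])
print(used)
```

In this toy run, the "back" view fails the quality threshold and the "left" view fails the consistency threshold, so only the remaining views reach the reconstruction stand-in. The number of surviving views is not fixed in advance, which is why the stand-in takes a list of arbitrary length.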
Interactive Results
Explore generation results (Gaussian Splats) below.
Acknowledgements
Junlin Han is supported by Meta. We would like to thank Luke Melas-Kyriazi, Runjia Li, Yawar Siddiqui, Minghao Chen, David Novotny, and Natalia Neverova for the helpful discussions and support.