Magic3D: High-Resolution Text-to-3D Content Creation

Chen-Hsuan Lin*
Jun Gao*
Luming Tang*
Towaki Takikawa*
Xiaohui Zeng*
Xun Huang
Karsten Kreis
Sanja Fidler†
Ming-Yu Liu†
Tsung-Yi Lin
*† : equal contributions
NVIDIA Corporation
Paper (arXiv)

Magic3D is a new text-to-3D content creation tool that creates 3D mesh models with unprecedented quality. Together with image conditioning techniques and a prompt-based editing approach, we provide users with new ways to control 3D synthesis, opening up new avenues for various creative applications.


High-Resolution 3D Meshes

Magic3D can create high-quality 3D textured mesh models from input text prompts. It uses a coarse-to-fine strategy that leverages both low- and high-resolution diffusion priors to learn the 3D representation of the target content. Magic3D synthesizes 3D content with 8× higher-resolution supervision than DreamFusion while also being 2× faster.

[...] indicates helper captions added to improve quality, e.g. "A DSLR photo of".

A beautiful dress made out of garbage bags, on a mannequin. Studio lighting, high quality, high resolution.
A blue poison-dart frog sitting on a water lily.
[...] a car made out of sushi.
[...] a bagel filled with cream cheese and lox.
[...] an ice cream sundae.
[...] a peacock on a surfboard.
[...] a plate piled high with chocolate chip cookies.
[...] Neuschwanstein Castle, aerial view.
[...] the Imperial State Crown of England.
[...] the leaning tower of Pisa, aerial view.
A ripe strawberry.
A silver platter piled high with fruits.
[...] a silver candelabra sitting on a red velvet tablecloth, only one candle is lit.
[...] Sydney opera house, aerial view.
Michelangelo style statue of an astronaut.

Prompt-based Editing

Given a coarse model generated with a base text prompt, we can modify parts of the text in the prompt, and then fine-tune the NeRF and 3D mesh models to obtain an edited high-resolution 3D mesh.

A squirrel wearing a leather jacket riding a motorcycle.
A bunny riding a scooter.
A fairy riding a bike.
A steampunk squirrel riding a horse.
A baby bunny sitting on top of a stack of pancakes.
A lego bunny sitting on top of a stack of books.
A metal bunny sitting on top of a stack of broccoli.
A metal bunny sitting on top of a stack of chocolate cookies.

Other Editing Capabilities

Given input images for a subject instance, we can fine-tune the diffusion models with DreamBooth and optimize the 3D models with the given prompts. The identity of the subject can be well-preserved in the 3D models.

We can also condition the diffusion model (eDiff-I) on an input image to transfer its style to the output 3D model.


Method

We use a two-stage coarse-to-fine optimization framework for fast and high-quality text-to-3D content creation. In the first stage, we obtain a coarse model using a low-resolution diffusion prior, and we accelerate this stage with a hash grid and a sparse acceleration structure. In the second stage, we use a textured mesh model initialized from the coarse neural representation, which allows optimization with an efficient differentiable renderer interacting with a high-resolution latent diffusion model.
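
The control flow of the two stages can be illustrated with a deliberately simplified toy, which is not the Magic3D implementation. In this sketch a known 1D signal stands in for the supervision that the diffusion priors would provide, plain Python lists stand in for the coarse neural representation and the textured mesh, and gradient descent on a squared error stands in for the actual optimization. All function names and constants here are hypothetical and chosen only for illustration. What the sketch does show is the shape of the idea: fit a cheap low-resolution representation first, then use it to initialize the high-resolution one before refining.

```python
import math

# Toy illustration only: no diffusion model, NeRF, hash grid, or mesh is involved.
# A fixed 1D signal plays the role of the supervision in both stages.

def make_target(n):
    """Sample a smooth signal at n cell centres in [0, 1)."""
    return [math.sin(2 * math.pi * (i + 0.5) / n) for i in range(n)]

def downsample(signal, factor):
    """Average consecutive blocks of `factor` samples."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal), factor)]

def upsample(signal, factor):
    """Nearest-neighbour upsampling: repeat each sample `factor` times."""
    return [v for v in signal for _ in range(factor)]

def loss(params, target):
    """Summed squared error between parameters and supervision."""
    return sum((p - t) ** 2 for p, t in zip(params, target))

def optimize(params, target, steps, lr):
    """Gradient descent on `loss`; the gradient per entry is 2 * (p - t)."""
    params = list(params)
    for _ in range(steps):
        params = [p - lr * 2 * (p - t) for p, t in zip(params, target)]
    return params

FINE_RES, FACTOR = 64, 8  # hypothetical sizes; the 8x ratio is only illustrative
fine_target = make_target(FINE_RES)              # stand-in for high-res supervision
coarse_target = downsample(fine_target, FACTOR)  # stand-in for low-res supervision

# Stage 1: fit a coarse representation against low-resolution supervision.
coarse = optimize([0.0] * len(coarse_target), coarse_target, steps=50, lr=0.1)

# Stage 2: initialise the fine representation from the coarse result, then refine
# against high-resolution supervision.
fine_init = upsample(coarse, FACTOR)
fine = optimize(fine_init, fine_target, steps=50, lr=0.1)

loss_from_scratch = loss([0.0] * FINE_RES, fine_target)
loss_warm_start = loss(fine_init, fine_target)
loss_final = loss(fine, fine_target)
print(loss_from_scratch, loss_warm_start, loss_final)
```

In this toy, the fine stage starts from a much lower error when initialized from the coarse result than when started from zeros, which is the intuition behind initializing the second stage from the first. The real system differs in every component, so the numbers printed here say nothing about Magic3D's actual speed or quality.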


Citation
@article{lin2022magic3d,
  title={Magic3D: High-Resolution Text-to-3D Content Creation},
  author={Lin, Chen-Hsuan and Gao, Jun and Tang, Luming and Takikawa, Towaki and Zeng, Xiaohui and Huang, Xun and Kreis, Karsten and Fidler, Sanja and Liu, Ming-Yu and Lin, Tsung-Yi},
  journal={arXiv preprint arXiv:2211.10440},
  year={2022}
}
