Breakthrough AI could soon generate whole 3D worlds from 2D videos

A new tool by NVIDIA can create complex 3D objects from short videos, and could solve a lot of AI's problems.

Published: June 2, 2023 at 1:51 pm

In the latest of a seemingly endless run of artificial intelligence projects announced this year, tech giant NVIDIA has unveiled a program capable of creating full 3D replicas of objects based solely on 2D video footage.

Called Neuralangelo (a blend of neural and Michelangelo), the software can generate lifelike virtual replicas of buildings, sculptures, complicated structures, and a wide array of other intricate 3D models.

“The 3D reconstruction capabilities Neuralangelo offers will be a huge benefit to creators, helping them recreate the real world in the digital world,” said Ming-Yu Liu, senior director of research and co-author of the Neuralangelo paper.

“This tool will eventually enable developers to import detailed objects – whether small statues or massive buildings – into virtual environments for video games or industrial digital twins.”

NVIDIA isn’t the first company to create an AI model like this, but this is arguably the most advanced. While previous models have struggled to capture repetitive texture patterns or detailed colours, this is much less of a problem for Neuralangelo.

Working from 2D videos of an object, structure, or scene – all filmed from a variety of angles – the model selects certain frames that together capture the key angles needed for a complete view of the structure.

Once the camera position is decided for each frame, the program creates a rough 3D interpretation of the scene. The render is then optimised, sharpening details and producing a final 3D object that can be put into virtual reality, or used in a range of industries.
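As a rough illustration of the frame-selection step only, here is a minimal Python sketch that picks evenly spaced frames from a video. It is not NVIDIA's method, which the article does not detail; the function name `select_frame_indices` and the even-spacing rule are assumptions made purely for the example.

```python
def select_frame_indices(total_frames: int, num_keyframes: int) -> list[int]:
    """Pick evenly spaced frame indices from a video (hypothetical helper).

    Assumption: even spacing stands in for whatever selection rule a real
    reconstruction tool uses to cover an object from many angles.
    """
    # Nothing to select from an empty video or for a zero-frame request.
    if total_frames <= 0 or num_keyframes <= 0:
        return []
    # If more keyframes are requested than exist, use every frame.
    if num_keyframes >= total_frames:
        return list(range(total_frames))
    # A single keyframe: just take the first frame.
    if num_keyframes == 1:
        return [0]
    # Spread the keyframes from the first frame to the last.
    step = (total_frames - 1) / (num_keyframes - 1)
    return [round(i * step) for i in range(num_keyframes)]


if __name__ == "__main__":
    # Example: choose 4 frames from a 10-frame clip.
    print(select_frame_indices(10, 4))
```

In a real pipeline, each chosen frame would then need a camera position estimated for it before any 3D reconstruction could begin, as the passage above describes.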

While the capabilities of the program have been unveiled, the software itself is not yet available for public use.

NVIDIA is one of the many companies betting big on artificial intelligence this year. Adobe, Google, OpenAI, Microsoft, and a number of other leading companies have all poured billions into producing the model that will take over an industry.

So far, we’ve seen AI try to create music, write poetry and complicated code, and even craft award-winning artwork. However, 3D generation is one of the big nuts for AI to crack.

Because 3D forms are complicated and unpredictable, they have not been as easy to replicate as a 2D image or a piece of writing. OpenAI has attempted to tackle the problem with its Point-E project, but has admitted that such a system is complicated to build.

If NVIDIA, and other companies that follow suit, can finally create an AI 3D model generator, it will have far-reaching effects on the world of artificial intelligence. One of the biggest issues that AI art has faced so far is its inability to understand complex shapes.

Because they are trained on samples of 2D art, these generators struggle to understand hands and other intricate shapes. With the inclusion of 3D-generated models, they could better understand the objects they are trying to replicate.
