Microsoft’s AI generates 3D objects from 2D images

Companies such as Facebook and Nvidia have tried before to create AI that can generate 3D objects from 2D pictures. We haven’t heard much about those efforts because none produced a quantifiable breakthrough worth headlines. Now one has.

A team from Microsoft has released a research paper that describes the technology. Titled “Inverse Graphics GAN: Learning to Generate 3D Shapes from Unstructured 2D Data,” the paper details a framework that the researchers claim is the first “scalable” training technique for generating 3D models from 2D data.

The system uses only 2D inputs, such as pictures, to generate a 3D object. The researchers say it consistently produces better 3D objects than existing models when trained exclusively on 2D images.

The AI takes advantage of fully featured industrial renderers, which are software that produces images from display data. To that end, the researchers train a generative model for 3D shapes such that rendering the shapes generates images matching the distribution of a 2D data set. The generator takes in a random input vector (values representing the data set’s features) and generates a continuous voxel representation (values on a grid in 3D space) of the 3D object. It then feeds the voxels to a non-differentiable rendering process, which thresholds them to discrete values before they are rendered with an off-the-shelf renderer (Pyrender, which is built on top of OpenGL).
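As a rough illustration of the data flow described above, the Python sketch below walks one random vector through the three steps: generate continuous voxels, threshold them, render an image. Everything in it is an assumption made for illustration, not the paper’s code: generate_voxels is a hypothetical toy stand-in for the trained generator (it just draws a soft sphere), GRID is an arbitrary resolution, and render_silhouette is a simple orthographic projection used in place of Pyrender.

```python
import math
import random

GRID = 8  # hypothetical voxel resolution per axis


def generate_voxels(z, grid=GRID):
    """Toy stand-in for the generator: maps a latent vector to a
    continuous voxel grid with occupancy values between 0 and 1."""
    # The first latent value controls the sphere radius, kept in [0.2, 0.4].
    radius = 0.2 + 0.2 * (math.tanh(z[0]) + 1) / 2
    voxels = []
    for i in range(grid):
        plane = []
        for j in range(grid):
            row = []
            for k in range(grid):
                centre = ((i + 0.5) / grid, (j + 0.5) / grid, (k + 0.5) / grid)
                d = math.dist(centre, (0.5, 0.5, 0.5))
                # Soft sphere: occupancy falls smoothly from 1 to 0 at the radius.
                row.append(1.0 / (1.0 + math.exp((d - radius) * 20)))
            plane.append(row)
        voxels.append(plane)
    return voxels


def threshold(voxels, cutoff=0.5):
    """Non-differentiable step: snap continuous occupancy to 0 or 1."""
    return [[[1 if v >= cutoff else 0 for v in row] for row in plane]
            for plane in voxels]


def render_silhouette(binary):
    """Stand-in renderer: a pixel is lit if any voxel along its ray is occupied."""
    return [[max(row) for row in plane] for plane in binary]


random.seed(0)
z = [random.gauss(0, 1) for _ in range(4)]  # random input vector
image = render_silhouette(threshold(generate_voxels(z)))
for row in image:
    print("".join("#" if pixel else "." for pixel in row))
```

The threshold step is the reason the real pipeline cannot be trained end to end on its own: a tiny change in a voxel value usually leaves the thresholded output unchanged, so no useful gradient flows back to the generator.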

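Because the off-the-shelf renderer is non-differentiable, training signals cannot flow back through it to the generator. To bridge that gap, a novel proxy neural renderer directly renders the continuous voxel grid generated by the 3D generative model. As the researchers explain, it’s trained to match the rendering output of the off-the-shelf renderer given a 3D mesh input.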
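The idea of training a proxy to imitate a renderer can be sketched in a few lines of Python. This is a deliberately tiny stand-in under stated assumptions, not the paper’s method: the real proxy is a neural network that outputs full images, whereas the hypothetical proxy_render here is a two-parameter logistic model over a single ray of voxels, and hard_render is a threshold-and-project function standing in for the off-the-shelf renderer. The learning rate, ray length, and sample count are arbitrary choices.

```python
import math
import random


def hard_render(ray, cutoff=0.5):
    """Stand-in for the non-differentiable path: threshold, then project."""
    return 1.0 if max(ray) >= cutoff else 0.0


def proxy_render(ray, w, b):
    """Toy differentiable proxy: logistic function of the ray's peak occupancy."""
    return 1.0 / (1.0 + math.exp(-(w * max(ray) + b)))


def cross_entropy(rays, targets, w, b):
    """Mean cross-entropy between proxy output and hard-rendered pixels."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for ray, y in zip(rays, targets):
        p = proxy_render(ray, w, b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(rays)


random.seed(0)
# Each ray is a column of continuous voxel values along the viewing direction.
rays = [[random.uniform(0.0, 0.7) for _ in range(3)] for _ in range(200)]
targets = [hard_render(ray) for ray in rays]  # what the proxy must imitate

w, b = 0.0, 0.0
initial_loss = cross_entropy(rays, targets, w, b)
for _ in range(3000):  # plain full-batch gradient descent
    grad_w = grad_b = 0.0
    for ray, y in zip(rays, targets):
        err = proxy_render(ray, w, b) - y  # d(loss)/d(logit) for cross-entropy
        grad_w += err * max(ray)
        grad_b += err
    w -= 2.0 * grad_w / len(rays)
    b -= 2.0 * grad_b / len(rays)
final_loss = cross_entropy(rays, targets, w, b)

agreement = sum(
    (proxy_render(ray, w, b) >= 0.5) == (y == 1.0)
    for ray, y in zip(rays, targets)
) / len(rays)
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}, agreement {agreement:.0%}")
```

Once a proxy like this tracks the hard renderer closely, gradients can flow through the proxy to whatever produced the voxels, which is the role the article describes for the proxy neural renderer.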

In lay terms, the algorithm studies the lighting and shading cues in the source images to infer the third dimension of the object pictured. The AI is able to produce realistic samples when trained on data sets of natural images.

“Our approach … successfully detects the interior structure of concave objects using the differences in light exposures between surfaces,” wrote the paper’s co-authors, “enabling it to accurately capture concavities and hollow spaces.”

The technology looks set to particularly benefit video game developers, eCommerce businesses, and animation studios that lack the means or expertise to create 3D shapes from scratch. Companies will be able to create 3D models for their games automatically, just by feeding the system pictures of the objects they want to create.

There has been a lot of buzz lately about Facebook making 3D images from 2D ones. It’s not exactly the same technology, but it’s close. Facebook’s new tool uses machine learning to extrapolate the three-dimensional shape of objects in your image, then uses that shape to generate a convincing 3D effect. It can even be used on decades-old images, and it works on any mid-range phone running Android or iOS.

Facebook says the tool works best if you avoid images with narrow objects in the foreground, or with lots of reflections. It’s also a good idea to pick images with objects at various depths for the best effect.

Nvidia researchers introduced an artificial intelligence program called DIB-R around December last year that can turn 2D images into 3D models. It predicts what a 2D image would look like in three dimensions and creates a 3D model by taking lighting, texture, and depth into consideration.