Deep network in the browser converts to 3D any image of a person/cat/anime face

Category: IT Technology · Published: 5 years ago

Summary: We propose a method to learn weakly symmetric deformable 3D object categories from raw single-view images, without ground-truth 3D, multiple views, 2D/3D keypoints, prior shape models or any other supervision.

[ Paper · Project Page · Code ]

Method Overview

We propose a method to learn 3D deformable object categories from raw single-view images, without any manual or external supervision. The method is based on an autoencoder that factors each input image into depth, albedo, viewpoint and illumination. To disentangle these components without supervision, we use the fact that many object categories have, at least in principle, a symmetric structure. We show that reasoning about illumination allows us to exploit the underlying object symmetry even when the appearance is not symmetric due to shading. Furthermore, we model objects that are probably, but not certainly, symmetric by predicting a symmetry probability map, learned end-to-end with the other components of the model.

Photo-Geometric Autoencoding

Our method is based on an autoencoder that factors each input image into depth, albedo, viewpoint and lighting. These four components are combined to reconstruct the input image. The model is trained using only a reconstruction loss, without any external supervision.
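To make the data flow concrete, here is a minimal sketch of how the four factors could be recombined and scored. It is an illustration under stated assumptions, not the authors' implementation: the names `Factors`, `reconstruct` and `l1_loss`, the plain-list image type, and the `shade` and `reproject` callables are all hypothetical, and a real model would predict the factors with neural networks and use a differentiable renderer.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[float]]  # toy grayscale image, H rows of W values


@dataclass
class Factors:
    """The four quantities predicted from one input image (hypothetical layout)."""
    depth: Image        # per-pixel depth in the canonical view
    albedo: Image       # per-pixel intrinsic color in the canonical view
    view: List[float]   # viewpoint parameters (assumed: a small pose vector)
    light: List[float]  # lighting parameters (assumed: a light direction)


def l1_loss(a: Image, b: Image) -> float:
    """Mean absolute difference between two images of the same size."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def reconstruct(
    f: Factors,
    shade: Callable[[Image, List[float]], Image],
    reproject: Callable[[Image, Image, List[float]], Image],
) -> Image:
    """Recombine the factors: light the albedo, then move it to the input view."""
    shading = shade(f.depth, f.light)
    canonical = [
        [a * s for a, s in zip(row_a, row_s)]
        for row_a, row_s in zip(f.albedo, shading)
    ]
    return reproject(canonical, f.depth, f.view)


if __name__ == "__main__":
    # Trivial stand-ins so the sketch runs: uniform shading, identity viewpoint.
    uniform = lambda depth, light: [[1.0 for _ in row] for row in depth]
    identity = lambda canonical, depth, view: canonical
    f = Factors(depth=[[1.0, 1.0]], albedo=[[0.2, 0.8]],
                view=[0.0] * 6, light=[0.0, 0.0, 1.0])
    print(l1_loss(reconstruct(f, uniform, identity), [[0.2, 0.8]]))
```

Passing the shading and reprojection steps in as callables keeps the sketch honest about what it does not implement: only the composition order and the reconstruction loss are shown here.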

Exploiting Symmetry

To achieve this decomposition without supervision, we exploit the fact that many object categories have a bilateral symmetry. Assuming an object is perfectly symmetric, one can obtain a virtual second view of it by simply mirroring the image, and then perform 3D reconstruction using stereo geometry [1, 2].

Here, we leverage this symmetry assumption. We force the model to predict a symmetric view of the object by injecting a flipping operation, and obtain two reconstructions (with and without flipping) of the same input view through the predicted viewpoint transformation. Minimizing the two reconstruction losses at the same time essentially imposes a “two-view” constraint and provides sufficient signal for recovering accurate 3D shapes.
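The flipping step can be sketched as follows. This is a simplified illustration, not the authors' code: `hflip`, `two_view_loss`, the `render` callable and the `flip_weight` parameter are hypothetical names, and `render` stands in for the full lighting and viewpoint pipeline.

```python
from typing import Callable, List

Image = List[List[float]]  # toy grayscale image, H rows of W values


def hflip(img: Image) -> Image:
    """Mirror an image left to right."""
    return [row[::-1] for row in img]


def l1_loss(a: Image, b: Image) -> float:
    """Mean absolute difference between two images of the same size."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def two_view_loss(
    image: Image,
    depth: Image,
    albedo: Image,
    render: Callable[[Image, Image], Image],
    flip_weight: float = 0.5,  # assumed value, chosen only for illustration
) -> float:
    """Sum of the reconstruction losses with and without flipping."""
    recon = render(depth, albedo)                      # ordinary reconstruction
    recon_flip = render(hflip(depth), hflip(albedo))   # mirrored canonical factors
    return l1_loss(recon, image) + flip_weight * l1_loss(recon_flip, image)


if __name__ == "__main__":
    show_albedo = lambda depth, albedo: albedo  # trivial stand-in renderer
    flat = [[1.0, 1.0]]
    print(two_view_loss([[0.5, 0.5]], flat, [[0.5, 0.5]], show_albedo))  # symmetric
    print(two_view_loss([[0.2, 0.8]], flat, [[0.2, 0.8]], show_albedo))  # asymmetric
```

With the stand-in renderer, a symmetric albedo incurs no penalty from the flipped branch, while an asymmetric one does, which is the pressure that pushes the canonical depth and albedo toward symmetry.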

Note that even if an object has a symmetric intrinsic texture (i.e., albedo), its appearance may still be asymmetric due to asymmetric illumination. We handle this by predicting albedo and lighting separately, enforcing symmetry only on the albedo while allowing the shading to be asymmetric. We assume a simple Lambertian illumination model and compute a shading map from the predicted light direction and depth map.
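A Lambertian shading map of this kind can be sketched in a few lines. The details below are assumptions made for illustration: normals are estimated from the depth map with finite differences, the sign convention takes a flat surface to face the camera along +z, and the `ambient` and `diffuse` coefficients are hypothetical fixed values rather than predicted ones.

```python
import math
from typing import List, Sequence, Tuple

Image = List[List[float]]


def normals_from_depth(depth: Image) -> List[List[Tuple[float, float, float]]]:
    """Estimate unit surface normals from a depth map by finite differences."""
    h, w = len(depth), len(depth[0])
    normals = []
    for y in range(h):
        row = []
        for x in range(w):
            # Central differences, falling back to one-sided at the borders.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dzdx = (depth[y][x1] - depth[y][x0]) / max(x1 - x0, 1)
            dzdy = (depth[y1][x] - depth[y0][x]) / max(y1 - y0, 1)
            n = (-dzdx, -dzdy, 1.0)  # assumed convention: flat surface faces +z
            length = math.sqrt(sum(c * c for c in n))
            row.append((n[0] / length, n[1] / length, n[2] / length))
        normals.append(row)
    return normals


def lambertian_shading(
    depth: Image,
    light_dir: Sequence[float],
    ambient: float = 0.3,   # assumed coefficient
    diffuse: float = 0.7,   # assumed coefficient
) -> Image:
    """Per-pixel shading: ambient + diffuse * max(0, normal . light)."""
    length = math.sqrt(sum(c * c for c in light_dir))
    light = [c / length for c in light_dir]
    return [
        [ambient + diffuse * max(0.0, sum(nc * lc for nc, lc in zip(n, light)))
         for n in row]
        for row in normals_from_depth(depth)
    ]


if __name__ == "__main__":
    flat = [[1.0] * 3 for _ in range(3)]
    ramp = [[float(x) for x in range(3)] for _ in range(3)]
    print(lambertian_shading(flat, [0.0, 0.0, 1.0])[1][1])  # fully lit
    print(lambertian_shading(ramp, [0.0, 0.0, 1.0])[1][1])  # tilted, darker
```

Multiplying this shading map by the albedo gives the lit canonical image, so the albedo can stay symmetric while the shading absorbs any left-right difference in brightness.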

In fact, doing so not only allows the model to learn an accurate intrinsic image decomposition, but also provides strong regularization on the shape prediction (similar to shape from shading). Unnatural shapes are avoided since they result in unnatural shading and thus a higher reconstruction loss.

Probabilistic Modeling of Symmetry using Confidence Maps

Although symmetry provides a strong signal for recovering 3D shapes, specific object instances are in practice never fully symmetric. We account for potential asymmetry using uncertainty modeling [3]. Our model additionally predicts a pair of per-pixel confidence maps and is trained to minimize the two confidence-adjusted reconstruction losses at the same time, with asymmetric weights to allow for a dominant side.
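One way to write a confidence-adjusted loss in the spirit of [3] is as the negative log-likelihood of a Laplacian distribution whose per-pixel scale is the predicted uncertainty. The sketch below assumes that form; the function names and the `flip_weight` value are hypothetical and the exact loss used by the authors may differ.

```python
import math
from typing import List

Image = List[List[float]]


def confidence_loss(recon: Image, target: Image, sigma: Image) -> float:
    """Mean Laplacian negative log-likelihood with a per-pixel scale sigma > 0.

    Per pixel: sqrt(2) * |recon - target| / sigma + ln(sqrt(2) * sigma).
    A large sigma discounts the error at that pixel but pays a log penalty.
    """
    total, count = 0.0, 0
    for row_r, row_t, row_s in zip(recon, target, sigma):
        for r, t, s in zip(row_r, row_t, row_s):
            total += math.sqrt(2) * abs(r - t) / s + math.log(math.sqrt(2) * s)
            count += 1
    return total / count


def symmetric_confidence_loss(
    recon: Image, recon_flip: Image, target: Image,
    sigma: Image, sigma_flip: Image,
    flip_weight: float = 0.5,  # assumed: below 1 so the unflipped side dominates
) -> float:
    """Combine both confidence-adjusted losses with asymmetric weights."""
    return (confidence_loss(recon, target, sigma)
            + flip_weight * confidence_loss(recon_flip, target, sigma_flip))


if __name__ == "__main__":
    target, bad = [[1.0]], [[0.0]]
    # A badly reconstructed pixel is cheaper when marked as uncertain.
    print(confidence_loss(bad, target, [[0.5]]), confidence_loss(bad, target, [[2.0]]))
    # A well reconstructed pixel is cheaper when marked as confident.
    print(confidence_loss(target, target, [[0.5]]), confidence_loss(target, target, [[2.0]]))
```

Because the log term penalizes large uncertainty, the model cannot simply declare every pixel unreliable; it is rewarded for high uncertainty only where the symmetric reconstruction genuinely fails, such as asymmetric hair or facial markings.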

References

[1] Mirror Symmetry ⇒ 2-View Stereo Geometry. Alexandre R. J. François, Gérard G. Medioni, and Roman Waupotitsch. Image and Vision Computing, 2003.

[2] Detecting and Reconstructing 3D Mirror Symmetric Objects. Sudipta N. Sinha, Krishnan Ramnath, and Richard Szeliski. Proc. ECCV, 2012.

[3] What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? Alex Kendall and Yarin Gal. NeurIPS, 2017.

Authors' webpages: Shangzhe & Christian

