Multi-View Stereo Network With Gaussian Distribution Iteration

Bibliographic Details
Published in:	IEEE Access, Vol. 11, pp. 53359-53372
Main Authors:	Zhang, Xiaohan; Li, Shikun
Format:	Journal Article
Language:	English
Published:	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2023
Subjects:	3D reconstruction; Datasets; Deep learning; Feature extraction; Filtering; Gaussian distribution; Image reconstruction; Iterative methods; Multi-view stereo; Normal distribution; Probabilistic logic; Probabilistic methods; Probability; Statistical analysis; Temples; Three dimensional models

Description
Summary:	Multi-view stereo estimates the depth maps of multiple perspective images in a scene and then fuses them to generate a 3D point cloud of the scene, which is an essential technology of 3D reconstruction. In this paper, we propose a deep learning method GDINet, applying probabilistic methods to the pyramid framework, which can significantly improve reconstruction quality. In detail, we first establish a Gaussian distribution for each image's pixel and iterate it in the pyramid framework. The mean value is the estimated depth, and the variance represents the depth estimation error. In addition, we design a novel loss function with excellent convergence to train our network. Finally, we present an initialization module to generate the coarse Gaussian distribution, controlling the parameters in a reasonable range. Our results rank 2nd on both DTU and Tanks & Temples datasets, showing that our network has high accuracy, completeness, and robustness. We also make a visualization comparison on the BlendedMVS dataset (containing many aerial scene images) to demonstrate the generalization ability of our model.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2023.3280929