Object Pose Estimation Using Edge Images Synthesized from Shape Information

Bibliographic Details
Published in: Sensors (Basel, Switzerland) Vol. 22; no. 24; p. 9610
Main Authors: Moteki, Atsunori; Saito, Hideo
Format: Journal Article
Language: English
Published: Switzerland, MDPI AG, 08-12-2022
Description
Summary: This paper presents a method for estimating the six-degrees-of-freedom (6DoF) pose of texture-less objects from a monocular image by using edge information. Deep learning-based pose estimation requires a large dataset of images paired with the ground-truth poses of the objects. To reduce the cost of collecting such a dataset, we focus on methods that use a dataset generated by computer graphics (CG). This simulation-based approach prepares a thousand images by rendering the computer-aided design (CAD) data of the object and trains a deep learning model on them. At the inference stage, a monocular RGB image is fed into the model, and the object's pose is estimated. The representative simulation-based method, Pose Interpreter Networks, uses silhouette images as input, which allows a common feature (the contour) to be extracted from both RGB and CG images. However, its estimation of rotation parameters is less accurate. To overcome this problem, we propose using edge information extracted from the object's ridgelines to train the deep learning model. Because the edge distribution changes substantially with the pose, the estimation of rotation parameters becomes more robust. In an experiment with simulation data, we quantitatively demonstrated an accuracy improvement over the previous method (under a certain condition, the error rate decreased by 22.9% for translation and 43.4% for rotation). Moreover, in an experiment with physical data, we clarified the issues of this method and proposed an effective solution based on fine-tuning (under a certain condition, the error rate decreased by 20.1% for translation and 57.7% for rotation).
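The idea that the edge distribution changes with the pose can be illustrated with a minimal sketch. It is not the authors' rendering pipeline: the unit cube standing in for CAD data, the pinhole camera parameters (`focal`, `size`), and the helper names `rotation_z_y`, `project`, and `render_edge_image` are all assumptions made for illustration, and no hidden-line removal is performed.

```python
import math

# Hypothetical stand-in for CAD data: a unit cube centred at the origin.
CUBE_VERTICES = [(x, y, z) for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]
# Two vertices share a ridgeline when they differ in exactly one coordinate.
CUBE_EDGES = [(i, j) for i in range(8) for j in range(i + 1, 8)
              if sum(a != b for a, b in zip(CUBE_VERTICES[i], CUBE_VERTICES[j])) == 1]


def rotation_z_y(yaw, pitch):
    """Rotation matrix R = Rz(yaw) * Ry(pitch), as nested lists (angles in radians)."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    return [[cy * cp, -sy, cy * sp],
            [sy * cp, cy, sy * sp],
            [-sp, 0.0, cp]]


def project(point, rotation, translation, focal, size):
    """Apply the pose (R, t) to a model point, then project with a pinhole camera."""
    cam = [sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
           for r in range(3)]
    u = focal * cam[0] / cam[2] + size / 2
    v = focal * cam[1] / cam[2] + size / 2
    return u, v


def render_edge_image(rotation, translation, focal=80.0, size=64, samples=100):
    """Rasterize every ridgeline into a binary size x size image.

    Assumes the whole object lies in front of the camera (positive depth).
    All ridgelines are drawn, including ones a real renderer would hide.
    """
    image = [[0] * size for _ in range(size)]
    for i, j in CUBE_EDGES:
        u0, v0 = project(CUBE_VERTICES[i], rotation, translation, focal, size)
        u1, v1 = project(CUBE_VERTICES[j], rotation, translation, focal, size)
        # A 3D line segment projects to a 2D segment, so sample along it in 2D.
        for s in range(samples + 1):
            t = s / samples
            col = round(u0 + t * (u1 - u0))
            row = round(v0 + t * (v1 - v0))
            if 0 <= row < size and 0 <= col < size:
                image[row][col] = 1
    return image


# Same translation, two different rotations.
frontal = render_edge_image(rotation_z_y(0.0, 0.0), (0.0, 0.0, 3.0))
rotated = render_edge_image(rotation_z_y(0.6, 0.4), (0.0, 0.0, 3.0))
changed = sum(a != b for ra, rb in zip(frontal, rotated) for a, b in zip(ra, rb))
print("pixels that differ between the two poses:", changed)
```

With the translation held fixed, the two binary images differ only because of the rotation, which is the kind of pose-dependent signal the summary attributes to edge images. A silhouette of the same object would keep only the outer contour and discard the interior ridgelines.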
ISSN: 1424-8220
DOI: 10.3390/s22249610