CtRNet-X: Camera-to-Robot Pose Estimation in Real-world Conditions Using a Single Camera
Main Authors:
Format: Journal Article
Language: English
Published: 16-09-2024
Online Access: Get full text
Summary: Camera-to-robot calibration is crucial for vision-based robot control, and achieving accurate calibration takes effort. Recent advances in markerless pose estimation have eliminated the need for time-consuming physical setups for camera-to-robot calibration. While existing markerless pose estimation methods achieve impressive accuracy without cumbersome setups, they rely on the assumption that all robot joints are visible within the camera's field of view. In practice, however, robots move in and out of view, and part of the robot may remain out of frame for the entire manipulation task due to real-world constraints, leaving too few visual features and causing these approaches to fail. To address this challenge and broaden the applicability of vision-based robot control, we propose a novel framework capable of estimating the robot pose from partially visible robot manipulators. Our approach leverages Vision-Language Models for fine-grained robot component detection and integrates this detection into a keypoint-based pose estimation network, enabling more robust performance under varied operational conditions. The framework is evaluated on both public robot datasets and self-collected partial-view datasets to demonstrate its robustness and generalizability. As a result, this method is effective for robot pose estimation in a wider range of real-world manipulation scenarios.
DOI: 10.48550/arxiv.2409.10441
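The abstract describes a keypoint-based pipeline that uses only the robot components a detector flags as visible. Below is a minimal, hypothetical sketch of that partial-view idea in Python with OpenCV: 2D keypoints for the visible joints (assumed to come from a keypoint network) and their 3D base-frame positions (from forward kinematics) feed a standard PnP solve. The function name, masking logic, and minimum-point check are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of pose estimation from a partial set of robot keypoints.
# Assumption: a vision-language detector supplies the visibility mask.
import numpy as np
import cv2

def pose_from_visible_keypoints(points_3d, points_2d, visible, camera_matrix):
    """Estimate the robot base pose in the camera frame via PnP.

    points_3d:     (N, 3) joint positions in the robot base frame,
                   e.g. from forward kinematics at the current joint angles.
    points_2d:     (N, 2) predicted keypoint pixel coordinates.
    visible:       (N,) boolean mask, True where the component detector
                   found that part of the robot in frame.
    camera_matrix: (3, 3) pinhole intrinsics.
    Returns (R, t): rotation and translation of the base frame in the camera.
    """
    obj = points_3d[visible].astype(np.float64)
    img = points_2d[visible].astype(np.float64)
    if len(obj) < 4:
        raise ValueError("need at least 4 visible keypoints for a stable solve")
    ok, rvec, tvec = cv2.solvePnP(
        obj, img, camera_matrix, distCoeffs=None, flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        raise RuntimeError("PnP solve failed")
    R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    return R, tvec

# Toy usage: project synthetic base-frame joints with a known pose,
# then mark three of the seven joints as out of frame.
K = np.array([[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]])
pts3d = np.random.rand(7, 3)
rvec_gt = np.array([0.1, -0.2, 0.05])
tvec_gt = np.array([0.0, 0.0, 2.0])
proj, _ = cv2.projectPoints(pts3d, rvec_gt, tvec_gt, K, None)
pts2d = proj.reshape(-1, 2)
mask = np.array([True, True, True, True, False, False, False])
R, t = pose_from_visible_keypoints(pts3d, pts2d, mask, K)
```

In the actual system the visibility mask would come from the fine-grained VLM-based component detector and the keypoint network, rather than synthetic projections, would supply points_2d; the point of the sketch is only that the solve degrades gracefully to whatever subset of joints stays in frame.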