Encoding Pose Features to Images With Data Augmentation for 3-D Action Recognition

Bibliographic Details
Published in:	IEEE Transactions on Industrial Informatics, Vol. 16, no. 5, pp. 3100-3111
Main Authors:	Huynh-The, Thien; Hua, Cam-Hao; Kim, Dong-Seong
Format:	Journal Article
Language:	English
Published:	Piscataway: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01-05-2020
Subjects:	Artificial neural networks; Color; Color imagery; Data augmentation; deep convolutional neural networks (DCNNs); Feature extraction; Feature recognition; human action recognition; Image coding; Image color analysis; Image recognition; Joints (anatomy); Object recognition; pose feature to image (PoF2I) encoding technique; Skeleton; Training

Description
Summary:	Recently, numerous methods have been introduced for three-dimensional (3-D) action recognition using handcrafted feature descriptors coupled with traditional classifiers. However, they cannot exhaustively learn the high-level features of a whole skeleton sequence. In this paper, a novel encoding technique, namely pose feature to image (PoF2I), is introduced to transform the pose features of joint-joint distance and orientation into color pixels. By concatenating the features of all skeleton frames in a sequence, a color image is generated to depict the spatial joint correlations and temporal pose dynamics of an action appearance. The strategy of end-to-end fine-tuning of a pretrained deep convolutional neural network, which completely captures multiple high-level features at multiscale action representations, is implemented for learning recognition models. We further propose an efficient data augmentation mechanism for informative enrichment and overfitting prevention. The experimental results on six challenging 3-D action recognition datasets demonstrate that the proposed method outperforms state-of-the-art approaches.
ISSN:	1551-3203 1941-0050
DOI:	10.1109/TII.2019.2910876
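
Illustration:	The summary describes turning joint-joint distance and orientation features into color pixels and concatenating all frames of a sequence into one image. The sketch below shows one way such an encoding could look. It is not the authors' PoF2I implementation: the function names, the min-max scaling of distances, the use of unit direction vectors as the orientation feature, and the channel layout are all assumptions made for illustration, since the record does not specify them.

```python
import numpy as np


def pose_features(frame):
    """Pairwise distance and orientation for one skeleton frame.

    frame: (J, 3) array of 3-D joint coordinates.
    Returns (dist, orient) with shapes (P,) and (P, 3), P = J*(J-1)/2.
    """
    frame = np.asarray(frame, dtype=float)
    i, j = np.triu_indices(frame.shape[0], k=1)  # every unordered joint pair
    diff = frame[j] - frame[i]
    dist = np.linalg.norm(diff, axis=1)
    # Assumption: orientation is the unit direction vector between two joints.
    orient = diff / np.maximum(dist, 1e-8)[:, None]
    return dist, orient


def encode_sequence(seq):
    """Encode a skeleton sequence as one color image (hypothetical layout).

    seq: (T, J, 3) array. Returns a uint8 image of shape (2*P, T, 3):
    each column is one frame; the top P rows hold distances, the bottom
    P rows hold orientations.
    """
    feats = [pose_features(f) for f in np.asarray(seq, dtype=float)]
    dist = np.stack([d for d, _ in feats], axis=1)    # (P, T)
    orient = np.stack([o for _, o in feats], axis=1)  # (P, T, 3)

    # Assumption: distances are min-max scaled over the whole sequence
    # and replicated across the three color channels.
    lo, hi = dist.min(), dist.max()
    d_norm = (dist - lo) / max(hi - lo, 1e-8)
    dist_img = np.repeat((d_norm * 255).round()[:, :, None], 3, axis=2)

    # Assumption: direction components in [-1, 1] map to R, G, B in [0, 255].
    orient_img = ((orient + 1.0) * 0.5 * 255).round()

    return np.concatenate([dist_img, orient_img], axis=0).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.normal(size=(30, 20, 3))  # 30 frames, 20 joints (synthetic data)
    print(encode_sequence(demo).shape)
```

An image produced this way would typically be resized to the input resolution of a pretrained convolutional network before fine-tuning; that step, and the data augmentation mechanism mentioned in the summary, are not shown here.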