Human Action Recognition Based on a Two-stream Convolutional Network Classifier
Published in: | 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 774 - 778 |
---|---|
Main Authors: | , , |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE, 01-12-2017 |
Summary: | Currently, video capture devices are easier to operate, more portable, and cheaper. This has made it easy to store and transmit large amounts of media, such as videos, and has motivated automated analysis that does not depend on human assistance for evaluating and exhaustively searching video content. Virtual reality, robotics, telemedicine, human-machine interfaces and tele-surveillance are applications of these techniques. This paper describes a method for human action recognition in videos using two convolutional neural networks (CNNs): a Spatial Stream, trained on individual video frames, and a Temporal Stream, trained on stacks of Dense Optical Flow (DOF). Both streams were trained separately, and for each of them a classification histogram was generated based on the most frequent class assignment. For the final classification, these histograms were combined to produce a single output. The technique was tested on two public action video datasets: Weizmann and UCF Sports. The Spatial Stream achieved 84.44% accuracy on the Weizmann dataset and 78.46% on the UCF Sports dataset. On the Weizmann dataset, combining the two networks yielded 91.11%. |
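The histogram-based fusion described in the abstract can be sketched as follows. This is a minimal illustration that assumes the two streams' per-frame class predictions are accumulated into class-count histograms and combined by summation before taking the most frequent class; the paper's exact combination rule may differ, and the example predictions are hypothetical:

```python
import numpy as np

def stream_histogram(frame_predictions, num_classes):
    """Build a classification histogram: how often each class was
    predicted across all frames (or DOF stacks) of one video."""
    return np.bincount(frame_predictions, minlength=num_classes)

def fuse_and_classify(spatial_preds, temporal_preds, num_classes):
    """Sum the two streams' histograms and return the overall
    most frequent class as the video-level label."""
    h_spatial = stream_histogram(spatial_preds, num_classes)
    h_temporal = stream_histogram(temporal_preds, num_classes)
    return int(np.argmax(h_spatial + h_temporal))

# Hypothetical per-frame predictions for a 10-class problem
spatial = np.array([2, 2, 3, 2, 2])    # Spatial Stream: class per frame
temporal = np.array([3, 2, 2, 3, 2])   # Temporal Stream: class per DOF stack
print(fuse_and_classify(spatial, temporal, 10))  # prints 2
```

Summing the histograms gives both streams equal weight; a weighted sum would let one stream dominate when it is known to be more reliable on a given dataset.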
DOI: | 10.1109/ICMLA.2017.00-64 |