Visual dictionaries as intermediate features in the human brain

Bibliographic Details
Published in:Frontiers in computational neuroscience Vol. 8; p. 168
Main Authors: Ramakrishnan, Kandan, Scholte, H Steven, Groen, Iris I A, Smeulders, Arnold W M, Ghebreab, Sennay
Format: Journal Article
Language:English
Published: Switzerland Frontiers Research Foundation 15-01-2015
Frontiers Media S.A
Description
Summary:The human visual system is assumed to transform low-level visual features into object and scene representations via features of intermediate complexity. How the brain computationally represents intermediate features is still unclear. To further elucidate this, we compared the biologically plausible HMAX model with the Bag of Words (BoW) model from computer vision. Both computational models use visual dictionaries, candidate features of intermediate complexity, to represent visual scenes, and both have proven effective in automatic object and scene recognition. The models differ, however, in how they compute visual dictionaries and in their pooling techniques. We investigated where in the brain, and to what extent, human fMRI responses to a short video can be accounted for by multiple hierarchical levels of the HMAX and BoW models. Brain activity of 20 subjects, recorded while they viewed a short video clip, was analyzed voxel-wise using a distance-based variation partitioning method. Results revealed that both HMAX and BoW explain a significant amount of brain activity in early visual regions V1, V2, and V3. However, BoW accounts for brain activity more consistently across subjects than HMAX does. Furthermore, the visual dictionary representations of HMAX and BoW significantly explain some brain activity in higher areas that are believed to process intermediate features. Overall, our results indicate that, although both HMAX and BoW account for activity in the human visual system, BoW seems to represent neural responses in low- and intermediate-level visual areas of the brain more faithfully.
Edited by: Mazyar Fallah, York University, Canada
This article was submitted to the journal Frontiers in Computational Neuroscience.
Reviewed by: Marcel Van Gerven, Donders Institute for Brain, Cognition and Behaviour, Netherlands; Tianming Liu, UGA, USA; Daniel Leeds, Fordham University, USA
ISSN:1662-5188
DOI:10.3389/fncom.2014.00168