A Numerical Study of the Bottom-Up and Top-Down Inference Processes in And-Or Graphs

Bibliographic Details
Published in:	International Journal of Computer Vision, Vol. 93, no. 2, pp. 226-252
Main Authors:	Wu, Tianfu; Zhu, Song-Chun
Format:	Journal Article
Language:	English
Published:	Boston: Springer US, 01-06-2011 (Springer; Springer Nature B.V.)
Subjects:	Algorithms; Analysis; Artificial Intelligence; Case studies; Computer Imaging; Computer Science; Graphs; Human performance; Hypotheses; Image Processing and Computer Vision; Inference; Mathematical models; Pattern Recognition; Pattern Recognition and Graphics; Performance enhancement; Trains; Vision; And-Or graph; process; Object parsing; Bottom-up/Top-down inference; Information contribution; Hierarchical model

Description
Summary:	This paper presents a numerical study of the bottom-up and top-down inference processes in hierarchical models, using the And-Or graph as an example. Three inference processes are identified for each node A in a recursively defined And-Or graph in which a stochastic context-sensitive image grammar is embedded: the α(A) process detects node A directly from image features, the β(A) process computes node A by binding its child node(s) bottom-up, and the γ(A) process predicts node A top-down from its parent node(s). All three processes contribute to computing node A from images in complementary ways. The objective of our numerical study is to explore how much information each process contributes and how these processes should be integrated to improve performance. We study them in the task of object parsing using an And-Or graph formulated under the Bayesian framework. First, we isolate and train the α(A), β(A) and γ(A) processes separately by blocking the other two processes; the information contribution of each process is then evaluated individually based on its discriminative power and compared with the corresponding human performance. Second, we integrate the three processes explicitly for robust inference to improve performance and propose a greedy pursuit algorithm for object parsing. In experiments, we choose two hierarchical case studies: junctions and rectangles in low-to-middle-level vision, and human faces in high-level vision. We observe that (i) the effectiveness of the α(A), β(A) and γ(A) processes depends on the scale and occlusion conditions, (ii) the α(face) process is stronger than the α processes of facial components, while β(junctions) and β(rectangle) work much better than their α processes, and (iii) the integration of the three processes improves performance in ROC comparisons.
ISSN:	0920-5691 1573-1405
DOI:	10.1007/s11263-010-0346-6
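
Illustration:	The summary describes integrating the α, β and γ processes and parsing with a greedy pursuit algorithm, but gives no formulas. The sketch below is only one plausible reading, not the authors' implementation. It assumes each candidate for a node carries three log-likelihood-ratio style scores, that a blocked process contributes 0, that integration is a plain sum, and that conflicts between candidates are decided by box overlap. The names `Candidate`, `combined_score`, `iou` and `greedy_pursuit`, the thresholds, and the example numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothesis for a node A (hypothetical structure, not from the paper)."""
    label: str
    box: tuple           # (x0, y0, x1, y1)
    alpha: float = 0.0   # direct detection from image features
    beta: float = 0.0    # bottom-up binding of child nodes
    gamma: float = 0.0   # top-down prediction from parent nodes


def combined_score(c, use=("alpha", "beta", "gamma")):
    # Assumption: integration is a sum of the enabled scores.
    # Leaving a name out of `use` mimics blocking that process.
    return sum(getattr(c, name) for name in use)


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def greedy_pursuit(candidates, threshold=0.0, max_iou=0.5,
                   use=("alpha", "beta", "gamma")):
    """Accept candidates best-first, skipping those that overlap an accepted one.

    Simplification: scores are fixed up front. A fuller pursuit would
    re-score the remaining candidates after every acceptance.
    """
    accepted = []
    ranked = sorted(candidates, key=lambda c: combined_score(c, use),
                    reverse=True)
    for c in ranked:
        if combined_score(c, use) <= threshold:
            break  # every later candidate scores no higher
        if all(iou(c.box, a.box) <= max_iou for a in accepted):
            accepted.append(c)
    return accepted


# Made-up scores, for illustration only.
candidates = [
    Candidate("face", (0, 0, 10, 10), alpha=2.0, beta=0.5),
    Candidate("face", (1, 1, 11, 11), alpha=1.0, beta=0.2),
    Candidate("face", (20, 20, 30, 30), alpha=-1.0, beta=1.5, gamma=1.0),
    Candidate("face", (40, 40, 50, 50), alpha=-2.0, beta=0.5),
]

integrated = greedy_pursuit(candidates)
alpha_only = greedy_pursuit(candidates, use=("alpha",))
```

In this toy data the third candidate has a negative α score but is accepted once β and γ are added, while the α-only run keeps just the first candidate. That mirrors, in a very simplified way, the complementary roles the summary attributes to the three processes.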