A Numerical Study of the Bottom-Up and Top-Down Inference Processes in And-Or Graphs
Published in: | International Journal of Computer Vision, Vol. 93, no. 2, pp. 226-252 |
---|---|
Main Authors: | , |
Format: | Journal Article |
Language: | English |
Published: | Boston; Springer US; 01-06-2011; Springer; Springer Nature B.V |
Summary: | This paper presents a numerical study of the bottom-up and top-down inference processes in hierarchical models, using the And-Or graph as an example. Three inference processes are identified for each node A in a recursively defined And-Or graph in which a stochastic context-sensitive image grammar is embedded: the α(A) process detects node A directly from image features, the β(A) process computes node A by binding its child node(s) bottom-up, and the γ(A) process predicts node A top-down from its parent node(s). All three processes contribute to computing node A from images in complementary ways. The objective of our numerical study is to explore how much information each process contributes and how these processes should be integrated to improve performance. We study them in the task of object parsing using an And-Or graph formulated under the Bayesian framework. First, we isolate and train the α(A), β(A) and γ(A) processes separately by blocking the other two processes. The information contribution of each process is then evaluated individually based on its discriminative power and compared with the corresponding human performance. Second, we integrate the three processes explicitly for robust inference and propose a greedy pursuit algorithm for object parsing. In experiments, we choose two hierarchical case studies: junctions and rectangles in low-to-middle-level vision, and human faces in high-level vision. We observe that (i) the effectiveness of the α(A), β(A) and γ(A) processes depends on the scale and occlusion conditions, (ii) the α(face) process is stronger than the α processes of facial components, while β(junctions) and β(rectangle) work much better than their α processes, and (iii) the integration of the three processes improves performance in ROC comparisons. |
---|---|
ISSN: | 0920-5691; 1573-1405 |
DOI: | 10.1007/s11263-010-0346-6 |
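
Illustrative note: the summary describes integrating the α, β and γ processes and parsing with a greedy pursuit algorithm, but this record gives no formulas. The following is only a minimal sketch under stated assumptions, not the authors' method: each candidate carries three hypothetical log-likelihood-ratio scores (`alpha`, `beta`, `gamma`), integration is assumed to be a plain sum, and the greedy loop repeatedly accepts the highest-scoring candidate and discards rivals for the same image region. The names `Candidate`, `integrated_score`, `greedy_pursuit`, `region` and `threshold` are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical candidate for some node A (all fields are assumptions)."""
    name: str
    region: int    # id of the image region the candidate would explain
    alpha: float   # assumed score from direct detection on image features
    beta: float    # assumed score from binding child nodes bottom-up
    gamma: float   # assumed score from top-down prediction by parent nodes


def integrated_score(c, channels=("alpha", "beta", "gamma")):
    """Sum the selected process scores (additive integration is an assumption)."""
    return sum(getattr(c, ch) for ch in channels)


def greedy_pursuit(candidates, channels=("alpha", "beta", "gamma"), threshold=0.0):
    """Accept the best remaining candidate until none scores above threshold."""
    remaining = list(candidates)
    accepted = []
    while remaining:
        best = max(remaining, key=lambda c: integrated_score(c, channels))
        if integrated_score(best, channels) <= threshold:
            break  # no remaining candidate is supported strongly enough
        accepted.append(best)
        # Drop every candidate competing for the region just explained.
        remaining = [c for c in remaining if c.region != best.region]
    return accepted


# Made-up scores, chosen only to exercise the sketch.
candidates = [
    Candidate("face", region=0, alpha=2.0, beta=0.5, gamma=0.0),
    Candidate("face_alt", region=0, alpha=1.0, beta=0.2, gamma=0.1),
    Candidate("rectangle", region=1, alpha=-0.5, beta=1.5, gamma=0.3),
    Candidate("clutter", region=2, alpha=-1.0, beta=-0.5, gamma=0.2),
]
all_three = [c.name for c in greedy_pursuit(candidates)]
alpha_only = [c.name for c in greedy_pursuit(candidates, channels=("alpha",))]
```

In this toy data the rectangle candidate is accepted only when the β score is included, which mirrors the kind of comparison the summary reports; the numbers themselves carry no empirical meaning.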