A measure of difference between discrete sample sets

Bibliographic Details
Published in: 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), pp. 1908-1912
Main Authors: Chakraborty, D., Kovvali, N.
Format: Conference Proceeding
Language: English
Published: IEEE 01-11-2011
Description
Summary: Estimating the statistical distance between populations is important in many applications. Conventional methods often rely on a maximum-likelihood (ML) estimator, usually because of its analytical and computational simplicity. However, the ML point estimate provides no information about the uncertainty in the estimated parameters and distance, which grows as the amount of observed data shrinks. In this paper, a new measure is developed for the statistical difference between finite-sized sample sets of discrete observations. The measure is defined as the expected distance between probability mass functions (pmfs), with the expectation taken over Dirichlet posteriors on the pmfs given the observed samples. In contrast to conventional ML estimates of distance, this approach accounts by design for the uncertainty due to the finite size of the observation sets. In the limit of an infinite number of observation samples, the expected distance reduces to the ML estimate. For small, finite sample sets, the expected distance yields a more reliable measure of statistical difference.
ISBN: 9781467303217, 1467303216
ISSN: 1058-6393, 2576-2303
DOI:10.1109/ACSSC.2011.6190355
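
Illustration: the summary defines the measure as an expected distance between pmfs under Dirichlet posteriors, but this record does not state which distance, which prior, or which evaluation method the paper uses. The sketch below is therefore only one possible reading, under three labeled assumptions: total variation as the distance, a symmetric Dirichlet prior with parameter 1, and a plain Monte Carlo average in place of whatever derivation the authors give. All function names are hypothetical.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one pmf from Dirichlet(alphas) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def total_variation(p, q):
    """Total variation distance between two pmfs (assumed distance choice)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def ml_distance(counts_x, counts_y):
    """Distance between the ML (relative-frequency) estimates of the pmfs."""
    nx, ny = sum(counts_x), sum(counts_y)
    p = [c / nx for c in counts_x]
    q = [c / ny for c in counts_y]
    return total_variation(p, q)


def expected_distance(counts_x, counts_y, prior=1.0, n_draws=20000, seed=0):
    """Monte Carlo estimate of the posterior expected distance.

    Assumes a symmetric Dirichlet prior with parameter `prior`, so each
    posterior is Dirichlet(counts + prior). Both choices are assumptions.
    """
    rng = random.Random(seed)
    alpha_x = [c + prior for c in counts_x]
    alpha_y = [c + prior for c in counts_y]
    total = 0.0
    for _ in range(n_draws):
        p = sample_dirichlet(alpha_x, rng)
        q = sample_dirichlet(alpha_y, rng)
        total += total_variation(p, q)
    return total / n_draws


if __name__ == "__main__":
    # Two small sample sets with identical category counts.
    small_x, small_y = [3, 1], [3, 1]
    print("ML distance:      ", ml_distance(small_x, small_y))
    print("Expected distance:", expected_distance(small_x, small_y))
```

With identical small count vectors, the ML distance is exactly zero, while the posterior average stays positive because four observations per set leave the underlying pmfs uncertain. As the counts grow, the two values should approach each other, consistent with the limiting behavior the summary describes.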