Almost nonparametric and nonparametric estimation in mixture models

Bibliographic Details
Main Author: Cruz-Medina, Isidro Roberto
Format: Dissertation
Language: English
Published: ProQuest Dissertations & Theses 01-01-2001
Description
Summary: An almost nonparametric approach for the estimation of the mixing proportion in a mixture of two distributions, when we have a vector of observations on each subject, is to define a mixture of binomials. A mixture of binomials can be obtained if the vector of observations is mapped into a vector of zeroes and ones by selecting a cut point c. In this dissertation it is shown that the estimation of the cut point c, which minimizes the variance of the estimator of the mixing parameter, does not need to be very precise for some common distributions when the means of these distributions are more than two standard deviations apart. If more cut points are introduced, a multinomial distribution is obtained, and it is shown that the trinomial distribution is preferable to the binomial and the tetranomial is preferable to the trinomial. In general, we prove that the multinomial distribution with r + 1 classes is preferable to the multinomial distribution with r classes. Nevertheless, it seems that if we introduce more than two cut points (a multinomial distribution with more than three regions) the gain in efficiency is not significant. Nonparametric approaches are proposed for the estimation of the mixing parameter in a mixture of two continuous distributions with equal shapes and unimodal symmetric densities. In these approaches some cut points c_i are introduced in order to define a multinomial distribution: three cut points for a tetranomial distribution and five cut points for a sextinomial distribution. The assumed symmetry of the component distributions is exploited to obtain the probabilities for each class of the multinomial approach, and five methods of estimating the parameters of the multinomial mixture are studied. These methods basically measure the concordance between the observed frequencies and the expected frequencies.
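The cut-point idea in the summary can be illustrated with a minimal Python sketch. It is not the dissertation's estimator: it assumes, for simplicity, that the two components are normal with known parameters, so the class probabilities p1 = P(X > c) and p2 = P(X > c) are known, and it uses a simple moment estimator of the mixing proportion. All names (`simulate`, `estimate_mixing_proportion`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def normal_cdf(x, mean=0.0, sd=1.0):
    """Normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def simulate(n_subjects, m, pi, mu1, mu2, sd, rng):
    """Each subject belongs to component 1 with probability pi and
    contributes a vector of m observations from that component."""
    data = []
    for _ in range(n_subjects):
        mu = mu1 if rng.random() < pi else mu2
        data.append([rng.gauss(mu, sd) for _ in range(m)])
    return data


def estimate_mixing_proportion(data, c, p1, p2):
    """Moment estimator from dichotomized data (ASSUMPTION: p1, p2 known).

    Every observation is mapped to 1 if it exceeds the cut point c and to 0
    otherwise. The expected share of ones is pi*p1 + (1 - pi)*p2, which is
    solved for pi.
    """
    total = sum(len(v) for v in data)
    ones = sum(1 for v in data for x in v if x > c)
    p_bar = ones / total
    return (p_bar - p2) / (p1 - p2)


rng = random.Random(0)
pi_true, mu1, mu2, sd = 0.3, 3.0, 0.0, 1.0  # means three sd apart (assumed)
data = simulate(2000, 5, pi_true, mu1, mu2, sd, rng)

# Try several cut points around the midpoint of the two means.
estimates = {}
for c in (1.0, 1.5, 2.0):
    p1 = 1.0 - normal_cdf(c, mu1, sd)
    p2 = 1.0 - normal_cdf(c, mu2, sd)
    estimates[c] = estimate_mixing_proportion(data, c, p1, p2)

for c, pi_hat in estimates.items():
    print(f"cut point {c}: estimated mixing proportion {pi_hat:.3f}")
```

Under these assumptions the estimates at the three cut points should all be close to the simulated mixing proportion, which is consistent with the summary's remark that the cut point need not be located very precisely when the component means are well separated.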
We present Mathematica and S-PLUS program codes to obtain the estimates of the parameters in the multinomial mixture. A Monte Carlo study shows that for normal components, the estimators of the mixing proportion in the sextinomial approaches are comparable with the EM algorithm estimator if the means are 1.75 standard deviations apart, but the estimators of the sextinomial approaches have an efficiency of 50% with respect to the EM estimator when the distance between the means is 2.32 standard deviations. When the component distributions are not normal, the sextinomial approaches outperform the EM algorithm that assumes that the components are normal. These tetranomial and sextinomial approaches can be easily adapted for use with training samples, and three methods of sampling are considered. With training samples and normal components, the estimators from the sextinomial methods are comparable with the EM algorithm estimator. However, when component distributions are not normal, the sextinomial estimators outperform the EM algorithm estimator, which assumes that the component distributions are normal.
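For readers unfamiliar with the benchmark mentioned in the summary, the following is a minimal Python sketch of an EM algorithm for a mixture of two normal components. It is a generic textbook version, not the code used in the dissertation. It assumes a single observation per subject, a common standard deviation for both components, and a crude quartile-based starting point; the function name `em_two_normals` and the simulation settings are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Normal density."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_normals(xs, n_iter=100):
    """EM for pi*N(mu1, sd) + (1 - pi)*N(mu2, sd) with a common sd (assumed)."""
    n = len(xs)
    xs_sorted = sorted(xs)
    # Crude start: quartiles as means, overall sd, equal weights.
    mu1, mu2 = xs_sorted[n // 4], xs_sorted[(3 * n) // 4]
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    pi = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each point is from component 1.
        resp = []
        for x in xs:
            a = pi * normal_pdf(x, mu1, sd)
            b = (1.0 - pi) * normal_pdf(x, mu2, sd)
            resp.append(a / (a + b))
        # M-step: weighted updates of the proportion, means and common sd.
        w1 = sum(resp)
        w2 = n - w1
        pi = w1 / n
        mu1 = sum(r * x for r, x in zip(resp, xs)) / w1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / w2
        ss = sum(r * (x - mu1) ** 2 + (1.0 - r) * (x - mu2) ** 2
                 for r, x in zip(resp, xs))
        sd = math.sqrt(ss / n)
    return pi, mu1, mu2, sd


# Simulated data: 30% from N(0, 1), 70% from N(3, 1) (illustrative values).
rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) if rng.random() < 0.3 else rng.gauss(3.0, 1.0)
      for _ in range(2000)]

pi_hat, mu1_hat, mu2_hat, sd_hat = em_two_normals(xs)
print(f"pi = {pi_hat:.3f}, means = ({mu1_hat:.2f}, {mu2_hat:.2f}), sd = {sd_hat:.2f}")
```

Because this version builds normality into the likelihood, its estimate of the mixing proportion depends on that assumption holding, which is the situation the summary contrasts with the multinomial approaches.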
ISBN: 0493271805, 9780493271804