Joint Statistics / Biostatistics Seminar
Misspecification error in missing data models
Yun Ju Sung, Ph.D. Candidate, University of Minnesota
Joint Statistics / Biostatistics and Medical Informatics Assistant Professor Candidate
Wednesday, February 12, 2003, 4-5 p.m.
1221 Computer Science and Statistics Center, 1210 W. Dayton St.
When a statistical model is incorrect, the MLE is inconsistent, converging to the minimizer theta* of Kullback-Leibler information. Any difference between the density f_theta* and the true density g is error due to model misspecification. We propose a Monte Carlo method to find theta* when there are missing data and the observed-data likelihood does not have a closed form. The motivating example comes from models for mutation accumulation data in statistical genetics.
We prove consistency and asymptotic normality of the Monte Carlo estimate of theta*. The method involves generating two samples, the first for observed data from the true density and the second for missing data from an importance sampling density. The entire second sample is used with each member of the first sample. We show that this results in an asymptotic variance for the estimate smaller than that obtained by using the first sample only once.
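The two-sample scheme described above can be illustrated with a minimal sketch. Everything specific in it is an assumption chosen for illustration and is not taken from the talk: the toy model (missing x ~ N(theta, 1), observed y given x ~ N(x, 1)), the "true" density g (a Gamma(2, 1) distribution, which lies outside the model), the importance sampling density h (N(2, 3^2)), the grid search, and all function names. For this toy model the observed-data density is N(theta, 2), so theta* equals the mean of g, which gives a rough check on the output.

```python
import numpy as np


def log_norm_pdf(z, mean, sd):
    """Log density of N(mean, sd^2) evaluated at z."""
    return -0.5 * np.log(2 * np.pi) - np.log(sd) - 0.5 * ((z - mean) / sd) ** 2


def mc_objective(theta, y, x, log_h):
    """Monte Carlo approximation to the expected observed-data log likelihood.

    Every importance draw x_i is paired with every observed draw y_j, so the
    whole second sample is reused for each member of the first sample.
    """
    # log f_theta(x_i, y_j) for all pairs, shape (n, m); toy model (assumed)
    log_joint = (log_norm_pdf(x[None, :], theta, 1.0)
                 + log_norm_pdf(y[:, None], x[None, :], 1.0))
    log_w = log_joint - log_h[None, :]          # log importance weights
    # numerically stable log of the mean weight for each y_j
    top = log_w.max(axis=1, keepdims=True)
    log_lik = top[:, 0] + np.log(np.mean(np.exp(log_w - top), axis=1))
    return log_lik.mean()


def estimate_theta_star(n=300, m=2000, seed=0):
    """Return (Monte Carlo estimate of theta*, sample mean of y)."""
    rng = np.random.default_rng(seed)
    # First sample: observed data from the assumed true density g
    y = rng.gamma(shape=2.0, scale=1.0, size=n)
    # Second sample: missing data from the assumed importance density h
    x = rng.normal(2.0, 3.0, size=m)
    log_h = log_norm_pdf(x, 2.0, 3.0)
    # Crude grid search for the maximizer (an illustrative choice)
    grid = np.linspace(0.0, 4.0, 101)
    values = [mc_objective(t, y, x, log_h) for t in grid]
    return grid[int(np.argmax(values))], y.mean()


if __name__ == "__main__":
    theta_hat, y_bar = estimate_theta_star()
    print("Monte Carlo estimate of theta*:", theta_hat)
    print("sample mean of the first sample:", y_bar)
```

In this toy setting the exact maximizer of the objective is the sample mean of the first sample, so the two printed values should be close to each other and to the mean of g; the gap between them reflects the Monte Carlo variability of the second sample plus the grid resolution.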
If nature, instead of a computer, generates the first sample, then our estimate is a Monte Carlo approximation to the MLE. Now its asymptotic variance reflects sampling variability of the first sample and Monte Carlo variability of the second sample.