Seminars
General Departmental Seminar Series
A semiparametric two-component "compound" mixture model and its application to estimating malaria attributable fractions.
Jing Qin
Biostatistics Research Branch,
National Institute of Allergy and Infectious Diseases
April 1, 2005, 3265 MSC, 12:00-1:00pm
ABSTRACT
Malaria remains a major epidemiological problem in many developing countries. Malaria is defined as the presence of parasites and symptoms (usually fever) due to the parasites. In endemic areas, an
individual may have symptoms attributable either to malaria or to other causes. From a clinical viewpoint, it is important to correctly diagnose an individual who has developed symptoms so that the
appropriate treatments can be given. From an epidemiologic and economic viewpoint, it is important to determine the proportion of malaria-affected cases in individuals who have symptoms so that
policies on intervention programmes can be developed. Once symptoms have developed in an individual, the diagnosis of malaria can be based on analysis of the parasite levels in blood samples. However, even a blood test is not conclusive because, in endemic areas, many healthy individuals can have parasites in their blood slides. Therefore, data from this type of study can be viewed as coming from a mixture
distribution, with the components corresponding to malaria and nonmalaria cases. A unique feature in this type of data, however, is the fact that a proportion of the nonmalaria cases have zero parasite levels. Therefore, one of the component distributions is itself a mixture distribution. In this article, we propose a semiparametric likelihood approach for estimating the proportion of clinical malaria using parasite level data from a group of individuals with symptoms. Our approach assumes the density ratio for the parasite
levels in clinical malaria and nonclinical malaria cases can be modeled using a logistic model. We use empirical likelihood to combine the zero and nonzero data. The maximum semiparametric likelihood estimate is more efficient than existing nonparametric estimates that use only the frequencies of zero and nonzero data. At the same time, it is more robust than a fully parametric maximum likelihood estimate that assumes a parametric model for the nonzero data. Simulation results show that the performance of the proposed method is satisfactory. The proposed method is used to analyze data from a malaria survey carried out in Tanzania.
