In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
will defend her dissertation
Association Rule Mining for Risk Assessment in Epidemiology
In epidemiology, a risk assessment measures the association between exposures and a health outcome of interest. Risk characterization has traditionally been performed using statistical methods, but these have limitations, such as difficulty in handling highly correlated variables and in assessing synergistic effects between exposures.
To overcome these limitations, we propose: (i) a modified Apriori association rule mining method for identifying connections between exposures and an increase (or decrease) in risk, and (ii) a novel genetic algorithm (GA) designed to mine risk-based quantitative association rules. Both methods have been tested and validated on a group of synthetic datasets and on a real collection of data about pediatric asthma cases and pollution levels in Houston. The results on the synthetic datasets show the advantages of applying our methods as substitutes for, or complements to, traditional logistic regression. Details about the design and testing of the GA fitness function will also be presented. Tests of the methods on the clinical data in our possession suggest the existence of a correlation between asthma and outdoor air pollutants, both alone and as a mixture. The genetic algorithm further improves the results of the Apriori-based method by automatically recognizing what appear to be the most dangerous levels of exposure.
Date: Friday, July 8, 2016
Time: 11:00 AM
Place: HBS 350
Advisor: Prof. Ricardo Vilalta
Faculty, students, and the general public are invited.