In Partial Fulfillment of the Requirements for the Degree of
Master of Science
Will defend his thesis
Association rule mining is a popular data mining technique that discovers predictive patterns in a data set. Although this problem has largely been solved outside database systems, the continuous growth of databases and hardware acceleration have made the DBMS platform a better choice. In this thesis, we study how to search for association rules efficiently by pruning the search space with several constraints and by solving the problem with relational queries. We also study how to store itemsets and rules efficiently, and we introduce several optimizations. We experimentally compare association rules with rules induced by decision trees on a medical data set, showing that association rules are more abundant and exhibit higher prediction accuracy. We also study time performance and scalability. Our experiments show that constrained association rules can be mined efficiently inside a database system and achieve better accuracy than decision trees.
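To make the idea of mining rules with relational queries concrete, the following is a minimal sketch and not the method developed in the thesis. It assumes a hypothetical table T(tid, item) with one row per item per transaction, uses an in-memory SQLite database as a stand-in for a DBMS, and applies only two illustrative constraints (minimum support and minimum confidence) to rules with a single item on each side. The function name, table name, and thresholds are assumptions introduced here for illustration.

```python
import sqlite3


def mine_pair_rules(transactions, min_support, min_confidence):
    """Return rules x -> y as (x, y, support, confidence) tuples.

    Illustrative sketch only: single-item antecedent and consequent,
    pruned by minimum support and minimum confidence.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical layout: one row per (transaction id, item).
    conn.execute("CREATE TABLE T (tid INTEGER, item TEXT)")
    conn.executemany(
        "INSERT INTO T VALUES (?, ?)",
        [(tid, item)
         for tid, items in enumerate(transactions)
         for item in set(items)],  # set() drops duplicate items per transaction
    )
    query = """
        WITH f1 AS (              -- frequent single items
            SELECT item, COUNT(*) AS cnt
            FROM T
            GROUP BY item
            HAVING COUNT(*) >= :min_support * :n
        ),
        f2 AS (                   -- frequent ordered pairs built from f1 only
            SELECT a.item AS x, b.item AS y, COUNT(*) AS cnt
            FROM T a
            JOIN T b ON a.tid = b.tid AND a.item <> b.item
            JOIN f1 fa ON fa.item = a.item
            JOIN f1 fb ON fb.item = b.item
            GROUP BY a.item, b.item
            HAVING COUNT(*) >= :min_support * :n
        )
        SELECT f2.x, f2.y,
               1.0 * f2.cnt / :n      AS support,
               1.0 * f2.cnt / f1.cnt  AS confidence
        FROM f2
        JOIN f1 ON f1.item = f2.x
        WHERE 1.0 * f2.cnt / f1.cnt >= :min_confidence
        ORDER BY f2.x, f2.y
    """
    params = {
        "n": len(transactions),
        "min_support": min_support,
        "min_confidence": min_confidence,
    }
    rules = conn.execute(query, params).fetchall()
    conn.close()
    return rules


if __name__ == "__main__":
    demo = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
    for x, y, support, confidence in mine_pair_rules(demo, 0.5, 0.6):
        print(f"{x} -> {y}  support={support:.2f}  confidence={confidence:.2f}")
```

Joining the pair query against the frequent single items is one simple way to prune the search space: a pair cannot meet the support threshold unless both of its items do, so infrequent items never enter the self-join.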