In Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy
Will defend her dissertation
In traditional clustering, groupings of objects are obtained by running a clustering algorithm once on a single dataset. The main goal of this dissertation is to investigate frameworks, tools, techniques, and algorithms for cluster analysis that deviate significantly from this traditional paradigm. We propose techniques that cluster multiple datasets in order to mine interesting relationships between them. The first approach, called Correspondence Analysis by Interestingness Comparison, clusters the individual datasets separately and then obtains knowledge by analyzing the relationships between the different clusterings. A variant of this approach, called Correspondence Clustering, clusters one dataset by using the clusters of the other datasets as guidance. The last approach, called CMDJ (Clustering Multiple Datasets Jointly), clusters multiple datasets in a single run of a clustering algorithm, so that the resulting clusters contain objects that originate from different datasets. Post-analysis techniques are then used to extract knowledge from the obtained clusters. Moreover, we introduce a technique called Multi-Run Clustering that clusters a single dataset multiple times and obtains better clustering results by combining clusters that originate from different runs. The presented frameworks are evaluated in case studies centering on the progression of glaucoma, ozone pollution, and earthquake analysis.