Department of Computer Science at UH

University of Houston

Department of Computer Science

In Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy

Ruth Huang Miller

Will defend her dissertation

Framework and Algorithms for Extraction of Knowledge


Data is the basic building block of computing. Extracting knowledge from the abundance of data requires substantial processing. Annotation, mining and visualization are three transformational processes that convert this data into knowledge. Unstructured, semi-structured, and geo-spatial data have experienced unprecedented growth in volume and on-line availability with the explosion of the Internet. This growth makes it increasingly likely that the precise knowledge the user needs or wants is available somewhere, but it also makes retrieval, usage and understanding of this data much more challenging. This dissertation will look at three strategies for transforming data into knowledge. The first strategy is to collect and aggregate data from different sources into domain-specific data warehouse repositories that enable rapid knowledge retrieval and use. This strategy is best used when the specific purpose has not been established in advance or the retrieval of this knowledge is time critical. The second strategy is to annotate the retrieved data with XML according to predetermined domain-specific ontologies to facilitate querying of this knowledge. This strategy is best used for unstructured or semi-structured domain-specific documents. The third strategy centers on extracting knowledge from spatially annotated data. In this case, spatial context, particularly location, serves as the glue that ties together information originating from different knowledge sources. The main contributions of this dissertation research include: 1) development of a framework for finding geo-spatial hotspots, 2) development of a geo-feature pre-selection algorithm to automatically search for promising candidates, 3) development of ZIPS, an interestingness hotspot detection algorithm based on polygons, and 4) experimental evaluation of the proposed algorithms in case studies involving Internet advertising, housing vacancies, and unemployment.

Date: Monday, July 25, 2011
Time: 1:30 PM
Place: 550-PGH
Faculty, students, and the general public are invited.
Advisor: Prof. Christoph F. Eick