In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Seyyedeh Qazale Mirsharif
will defend her dissertation
A Computational Study of Visual Attention on Objects and Gestures During Infancy
Understanding the development of the visual system and the role of vision in learning during infancy has long been a focus of observational and cognitive studies. Head cameras have become popular in such studies because they provide a unique source of information about a child's momentary visual experiences from the child's own perspective. They are wearable by children and are frequently employed to estimate the child's visual field during object name learning experiments. The resulting videos open up new insights into children's learning and cognitive development and have led to the emergence of computer vision in cognitive studies. Humans cannot completely assess a child's visual experience by manually coding the videos frame by frame, so important information may be missing from their evaluations. Computer vision and machine learning methods are fast and accurate, and they allow cognitive scientists to further their research by revealing potential patterns in a child's developmental process that are impossible to measure by manual processing of videos.
Computer vision researchers have made great progress in developing automated methods aimed at understanding the development of visual focus of attention in children. In this study we specifically focus on developing methods that further studies of object and gesture perception in infants during object name learning experiments. We explore a learning environment in which parents perform several gestures on objects while naming them. Parents are trained to perform the actions naturally, guiding the child's visual attention toward the object being learned while synchronously providing verbal cues about the object's name.
We perform two experiments in this study. In the first, we develop a semi-automated method for segmentation and tracking of objects in the child's egocentric view and perform a statistical analysis to understand the distribution of objects in the child's view at progressive stages of development (infants at 6, 9, 12, 15, and 18 months). The method initially takes user annotations on object boundaries and then fits two Gaussian mixture models, one for the foreground and one for the background, using a graph cut segmentation algorithm. The object is then tracked through subsequent frames by estimating dense optical flow frame by frame, and an object mask is computed for each frame. The results are used to produce heat maps of objects in the child's egocentric view, identifying the regions of the visual field where the attended object frequently occurs.
In the second experiment, we explore gesture perception in children. Because the egocentric view is frequently occluded, and rapid view changes caused by large head movements can lose important information about the parent's gestures, we add videos recorded from third-person and top views using two static IP cameras. We apply a hierarchical unsupervised clustering approach to group the videos into clusters of motion and movement patterns, and we investigate object saliency in the child's view for each motion cluster to identify attention-directing movements during the learning experiment. An extensive computational analysis of the results suggests that children may recognize a different set of movements and actions as interesting and attention-directing than the common set of gestures adults use to guide a child's attention to objects.
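The clustering-and-saliency analysis above could be sketched as follows. This is an illustrative assumption, not the study's actual method: it uses SciPy's agglomerative (Ward) hierarchical clustering on hypothetical per-clip motion descriptors, then averages a hypothetical per-clip saliency score within each cluster to surface the most attention-directing motion pattern.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_motion_patterns(features, n_clusters):
    """Hierarchically cluster per-clip motion descriptors (one row per
    video clip) with Ward linkage; return a cluster label per clip."""
    Z = linkage(features, method="ward")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

def saliency_by_cluster(labels, saliency):
    """Mean object saliency in the child's view for each motion cluster;
    high-saliency clusters are candidate attention-directing movements."""
    return {c: float(saliency[labels == c].mean())
            for c in np.unique(labels)}

# Toy example with two clearly separated motion-descriptor groups.
rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(0.0, 0.1, (10, 4)),
                   rng.normal(5.0, 0.1, (10, 4))])
labels = cluster_motion_patterns(feats, n_clusters=2)
sal = np.concatenate([np.full(10, 0.2), np.full(10, 0.8)])
print(saliency_by_cluster(labels, sal))
```

The toy data stands in for real motion features (e.g. histograms of optical flow per clip); on the real videos, comparing mean saliency across clusters is one way to test which movement patterns actually draw the child's attention.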
Date: Tuesday, July 25, 2017
Time: 12:00 PM
Place: PGH 550
Advisor: Dr. Shishir Shah
Faculty, students, and the general public are invited.