In Partial Fulfillment of the Requirements for the Degree of Master of Science
will defend his thesis
Trend Analysis on Phishing Email Data using Natural Language Processing
Phishing email attacks are causing growing financial losses to organizations, and the frequency of such attacks has recently been observed to increase. This thesis attempts to find the patterns inherent to phishing attacks using natural language processing methodologies. Separating phishing emails from legitimate ones is a challenging task because the two sets of emails are highly similar in syntax. Fortunately, natural language processing offers many methodologies that have shown promising results in categorizing text based on its grammar. In this context, this thesis adapts three similarity measures, namely cosine similarity, the Jaccard coefficient, and Euclidean distance, to cluster the emails. The thesis also analyzes trends in the vocabulary and grammatical syntax of phishing emails. Moreover, using software provided by The Stanford Natural Language Processing Group, we attempt to find the most frequent linguistic patterns in phishing attacks, which can be used to further separate such emails from legitimate ones.
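As a rough illustration of the three similarity measures named in the abstract, the sketch below computes each of them on simple bag-of-words token counts. The tokenizer, the function names, and the two sample sentences are assumptions made for this example only; the thesis may represent emails differently and may apply the measures inside a clustering algorithm that is not shown here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (illustrative tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty documents
    return dot / (norm_a * norm_b)


def jaccard_coefficient(a, b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # convention chosen here for two empty documents
    return len(set_a & set_b) / len(union)


def euclidean_distance(a, b):
    """Straight-line distance between two term-count vectors (0 means identical)."""
    # Counter returns 0 for terms that are missing from one document.
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in a.keys() | b.keys()))


# Hypothetical sample sentences, not taken from the thesis data.
x = Counter(tokenize("verify your account now"))
y = Counter(tokenize("verify your password now"))

print(cosine_similarity(x, y))    # similarity: higher means more alike
print(jaccard_coefficient(x, y))  # similarity: higher means more alike
print(euclidean_distance(x, y))   # distance: lower means more alike
```

Note that cosine similarity and the Jaccard coefficient are similarities, while Euclidean distance is a distance, so a clustering step that mixes them has to treat the direction of "closeness" accordingly.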
Date: Thursday, August 2, 2018
Time: 12:00 PM
Place: PGH 550
Advisor: Dr. Rakesh Verma
Faculty, students, and the general public are invited.