COMMITTEE CHAIR: Dr. Sarhan Musa

TITLE: PERFORMANCE EVALUATION OF MACHINE LEARNING AND DEEP LEARNING TECHNIQUES FOR TEXT CLASSIFICATION AND CATEGORIZATION

ABSTRACT: Text classification is one of the most important tasks in Natural Language Processing (NLP). It takes the text of online documents as input and assigns each document to one of at least two categories, labeling it with pre-defined tags based on its content. It is used in real-life applications in fields such as engineering, science, and marketing, and it can be quite effective in addressing problems where labeled data is available. Machine Learning (ML) approaches, including supervised, unsupervised, and semi-supervised learning, offer algorithms that have proven useful for categorizing text data, such as Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Naïve Bayes, alongside Deep Learning (DL) methods such as Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN). In this research, the text of each document was reviewed and grouped into different sets using ML and DL techniques. The techniques were implemented as Python code in Jupyter Notebook to produce results as numerical values or percentages that could be compared with each other. The research then used the accuracy score of each method to determine which one classifies online texts most accurately. The method with the highest accuracy score can be used to classify online texts and to improve applications such as document management, chatbots, news preferences, and translation services in healthcare, the marketplace, and Artificial Intelligence (AI) systems that process natural language.

Room Location: Electrical Engineering Conference, Room 315D

&

Zoom Meeting:

https://pvpanther.zoom.us/j/96934065334?pwd=QXFLZHA4L050TmdkVEExV3kvVUhNUT09