Gary Yu Zhao

Date of Award

Spring 3-2024

Document Type


Degree Name

Doctor of Philosophy in Information Systems (PhDIS)

First Advisor

Omar El-Gayar

Second Advisor

Ronghua Shan

Third Advisor

Yenling Chang


In IT support, it is crucial to extract valuable information from support channels such as telephone, web chat, email, and social media. This extracted knowledge can help organizations prioritize customer-centric approaches and improve customer service. It also has broader applications, from decision support to product development and human resources policies. Traditionally, extracting knowledge from unstructured text was time-consuming and inefficient, but natural language processing (NLP) techniques have demonstrated potential for extracting valuable insights in a variety of application contexts. However, their efficacy and potential in the context of domain-specific IT support transcripts remain largely unexplored.

This research follows the Design Science Research Methodology (DSRM), encompassing problem identification, objective definition, artifact design, effectiveness demonstration, performance evaluation, and communication of findings. Moreover, the study uses the Attribute-Driven Design (ADD) method to create an approach for NLP-based knowledge extraction that integrates domain-specific knowledge and models. This research explores specific domain requirements and presents an innovative solution. The proposed approach comprises two main processes: domain knowledge extraction, involving off-topic content identification, domain stop-word identification, category extraction, and priority score determination; and transcript knowledge extraction, involving NLP preprocessing, adapted TF-IDF keyword extraction, and TranGCN topic categorization. Experimental results demonstrate the effectiveness of this hybrid algorithm, which combines rule-based, unsupervised, and supervised machine learning methods.

Notably, this study introduces the concept of category/keyword priority scores for keyword extraction and topic categorization and demonstrates their efficacy in experiments. The adaptable nature of this approach suggests that it could be applied to various IT support domains with customization and updates. Our solution is expected to advance keyword extraction and topic classification through a novel adaptation of the TF-IDF algorithm combined with an optimized Graph Convolutional Network (GCN) method.
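
To make the idea of priority-weighted keyword extraction concrete, the following is a minimal illustrative sketch and not the dissertation's actual algorithm. It assumes, purely for illustration, that a keyword's priority score multiplies its standard smoothed TF-IDF weight and that domain stop-words are simply filtered out beforehand. The function names (priority_tfidf, top_keywords), the default priority of 1.0, and the sample transcripts are all hypothetical.

```python
import math
from collections import Counter


def priority_tfidf(docs, priority, stop_words=frozenset(), default_priority=1.0):
    """Score each term in each document as tf * idf * priority.

    ASSUMPTION: multiplying by a priority score is an illustrative choice;
    the source does not specify how its adapted TF-IDF uses priorities.
    """
    # Whitespace tokenization with domain stop-words removed (simplified).
    tokenized = [
        [tok for tok in doc.lower().split() if tok not in stop_words]
        for doc in docs
    ]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    all_scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty docs
        doc_scores = {}
        for term, count in tf.items():
            # Smoothed inverse document frequency.
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            weight = priority.get(term, default_priority)
            doc_scores[term] = (count / total) * idf * weight
        all_scores.append(doc_scores)
    return all_scores


def top_keywords(doc_scores, k=3):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    ranked = sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical transcripts and priority scores, for illustration only.
    docs = [
        "vpn connection fails after password reset",
        "printer driver install fails",
        "please reset vpn password thanks",
    ]
    priority = {"vpn": 2.0, "password": 1.5}
    stop_words = {"please", "thanks", "after"}
    for scores in priority_tfidf(docs, priority, stop_words):
        print(top_keywords(scores, k=2))
```

In this sketch, a term that appears in several transcripts (and would therefore be down-weighted by plain TF-IDF) can still surface as a keyword when the domain assigns it a high priority, which is one plausible reading of why priority scores help in domain-specific support text.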