Date of Award


Document Type

Thesis (open access)


Computer Science

First Advisor

Matthew A. Hibbs

Second Advisor

Diana K. Young

Third Advisor

Paul J. Meyers


Since the rise of data science and analytics, the roles of data scientist and data analyst have lacked clear definitions. Perceptions of these positions vary across companies, which have been trying to gain a competitive advantage by using data in their business. However, the data science and analytics job markets suffer from high turnover rates and long times to fill job openings. This research applies hierarchical clustering and topic modeling to job descriptions to demonstrate how data scientist positions can be distinguished from data analyst positions by uncovering hidden correlations among the words used in those descriptions.