Date of Award


Document Type

Thesis campus only


Computer Science

First Advisor

Matthew Hibbs


The genetic cause of osteoporosis is poorly understood, but a wealth of functional genomic data exist from which osteoporosis-related pathways could be identified. A machine learning pipeline based on Support Vector Machines was created and applied twice: first using all available gene expression data as inputs, and then using only bone-related data. In both cases, models were trained on a manually curated training set of gene relationships known to support bone maintenance and development. Each model was used to predict novel pairwise gene relationships, and specific pathways were compared between models to identify relationships supported primarily by data collected in bone-related contexts rather than in other cellular contexts. Our results indicate that biologically motivated feature selection that considers mammalian cellular context produced more accurate results. They also reinforce the observation that two genes that are functionally associated in one context may not be functionally associated in all contexts, which necessitates careful consideration of the training sets and input data used in functional prediction methods.
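
As a rough illustration of the model comparison described in the abstract, the sketch below ranks gene pairs by how much more strongly a bone-specific model supports them than an all-data model. The function name, the score dictionaries, the placeholder gene names, and the example scores are all hypothetical and are not taken from the thesis, which may compare models in a different way.

```python
from typing import Dict, FrozenSet, List, Tuple

# An unordered gene pair; frozenset makes (A, B) and (B, A) the same key.
GenePair = FrozenSet[str]


def pair(gene_a: str, gene_b: str) -> GenePair:
    """Build an order-independent key for a pairwise gene relationship."""
    return frozenset((gene_a, gene_b))


def context_specific_pairs(
    bone_scores: Dict[GenePair, float],
    all_scores: Dict[GenePair, float],
    min_gain: float = 0.0,
) -> List[Tuple[GenePair, float]]:
    """Rank pairs by (bone-model score minus all-data-model score).

    Only pairs scored by both models are compared, and only pairs whose
    gain exceeds min_gain are kept. This is an assumed comparison rule,
    not the procedure used in the thesis.
    """
    shared = bone_scores.keys() & all_scores.keys()
    gains = [(p, bone_scores[p] - all_scores[p]) for p in shared]
    kept = [(p, gain) for p, gain in gains if gain > min_gain]
    # Largest gain first; ties broken by gene names so output is deterministic.
    return sorted(kept, key=lambda item: (-item[1], sorted(item[0])))


# Made-up prediction scores, for illustration only.
bone = {
    pair("GENE_A", "GENE_B"): 0.9,
    pair("GENE_A", "GENE_C"): 0.4,
    pair("GENE_B", "GENE_D"): 0.7,
}
everything = {
    pair("GENE_A", "GENE_B"): 0.3,
    pair("GENE_A", "GENE_C"): 0.5,
    pair("GENE_C", "GENE_D"): 0.8,
}

ranked = context_specific_pairs(bone, everything)
for genes, gain in ranked:
    print(sorted(genes), round(gain, 3))
```

Pairs that rise to the top of such a ranking would be candidates for relationships supported mainly by bone-related data. In practice the scores would come from the two trained models rather than from hand-written dictionaries.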