Synergy Between Embedding and Protein Functional Association Networks for Drug Label Prediction using Harmonic Function
Refereed Original Article
Semi-Supervised Learning (SSL) is an approach to machine learning that trains on a small amount of labeled data together with a larger amount of unlabeled data. Molecular biology and pharmacology offer a natural setting for this approach. For instance, when identifying drugs and their targets, only a few genes are known to be associated with a specific drug target, and these can be treated as labeled data. Labeling further genes requires laboratory verification and validation, which is usually very time-consuming and expensive. It is therefore useful to estimate the functional role of drugs from unlabeled data using computational methods. To develop such a model, we used openly available data resources to create two bipartite graphs: (i) drugs and genes, and (ii) genes and diseases. We constructed a genetic embedding graph from the two bipartite graphs using tensor factorization methods, and integrated this graph with publicly available genetic interaction graphs. Our results show that the integration is useful for predicting drug labels effectively.
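The harmonic function named in the title is a standard graph-based SSL technique: the scores of labeled nodes are held fixed, and each unlabeled node receives the weighted average of its neighbours' scores. The following is a minimal sketch of that idea under stated assumptions, not the article's implementation. The function names `combine_graphs` and `harmonic_function`, the mixing weight `alpha`, the convex-combination rule for integrating the two graphs, and the toy four-node graphs are all hypothetical; the abstract does not specify how the graphs were integrated.

```python
import numpy as np


def combine_graphs(W_embed, W_interact, alpha=0.5):
    """Assumed integration rule: a convex combination of two weight matrices
    defined over the same node set. The abstract does not state the rule."""
    return alpha * W_embed + (1.0 - alpha) * W_interact


def harmonic_function(W, labeled_idx, y_labeled):
    """Harmonic-function label propagation on a weighted undirected graph.

    W           : (n, n) symmetric, non-negative weight matrix
    labeled_idx : indices of nodes with known labels
    y_labeled   : known label scores (e.g. 1.0 / 0.0) for those nodes

    Returns the unlabeled node indices and their scores, obtained by solving
    L_uu f_u = W_ul f_l, where L = D - W is the graph Laplacian.
    Every unlabeled node must be connected (directly or indirectly) to at
    least one labeled node, otherwise L_uu is singular.
    """
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)

    laplacian = np.diag(W.sum(axis=1)) - W
    L_uu = laplacian[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]

    f_l = np.asarray(y_labeled, dtype=float)
    f_u = np.linalg.solve(L_uu, W_ul @ f_l)
    return unlabeled_idx, f_u


# Toy data (hypothetical): four nodes. The embedding graph has edges 0-1 and
# 2-3, the interaction graph has edge 1-2. Only the combined graph links all
# four nodes into a single path 0-1-2-3.
W_embed = np.zeros((4, 4))
W_embed[0, 1] = W_embed[1, 0] = 1.0
W_embed[2, 3] = W_embed[3, 2] = 1.0

W_interact = np.zeros((4, 4))
W_interact[1, 2] = W_interact[2, 1] = 1.0

W = combine_graphs(W_embed, W_interact, alpha=0.5)

# Node 0 is labeled positive (1.0), node 3 negative (0.0).
unlabeled_idx, f_u = harmonic_function(W, labeled_idx=[0, 3], y_labeled=[1.0, 0.0])
print(dict(zip(unlabeled_idx.tolist(), f_u.round(3).tolist())))
```

In this toy case neither graph alone connects both unlabeled nodes to both labeled nodes, while the combined graph does, which is one simple way to picture why integrating the two sources can help. Scores above a chosen threshold (for example 0.5) would be read as a positive label.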
Digital Object Identifier (DOI):
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
National University of Ireland, Galway (NUIG)
Open access repository: