Predicting the Impact of Missense Mutations via Graph Embeddings

Term: 
2018-2019 Summer
Faculty Department of Project Supervisor: 
Faculty of Engineering and Natural Sciences
Number of Students: 
2

Determining the impact of a missense mutation remains a challenge. In this project, students will aim to improve upon the state of the art using protein structural knowledge and graph representations of protein structures. Traditional machine learning approaches on graphs rely on summary statistics of nodes, such as node degree or the clustering coefficient, when representing graphs. Recent representation learning approaches offer an alternative. The idea behind these approaches is to learn a mapping that embeds nodes as points in a low-dimensional vector space. The goal is to learn a space in which nodes that play similar topological roles in the graph are embedded close together. Such vector representations have been shown to be useful for subsequent prediction tasks on graphs in other domains, such as social network analysis. In this project, the students will explore graph embedding representations for protein structures and then use these representations to build a machine learning method for predicting the impact of a missense variant.
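
As a rough illustration of the traditional, summary-statistic style of node representation mentioned above, the following minimal Python sketch computes the degree and local clustering coefficient of each node in a small graph. The edge list is a hypothetical toy example (it is not derived from any real protein structure), and the function names are assumptions made for this illustration only.

```python
from itertools import combinations

# Hypothetical toy graph: nodes stand in for residues, edges for contacts.
# This edge list is made up for illustration and is not real protein data.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]


def build_adjacency(edge_list):
    """Build an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbors; use 0 by convention
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def node_features(edge_list):
    """Return {node: (degree, clustering coefficient)} as hand-crafted features."""
    adj = build_adjacency(edge_list)
    return {n: (len(adj[n]), clustering_coefficient(adj, n)) for n in sorted(adj)}


features = node_features(edges)
for node, (deg, cc) in features.items():
    print(node, deg, round(cc, 3))
```

Each node here is described by only two hand-picked numbers. A learned embedding would instead assign each node a vector whose coordinates are fitted so that nodes with similar topological roles end up close together.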

Related Areas of Project: 
Computer Science and Engineering
Molecular Biology, Genetics and Bioengineering
Mathematics