Deniz Özlem Er
Enhancing the understanding of human brain pharmacome: predicting biological causal interactions using link prediction methods
Deniz Er provides insights into predicting biological causal relationships in Alzheimer's research through link prediction methods applied to the Human Brain Pharmacome (HBP), the focus of her Master's thesis.
Background
Alzheimer’s Disease (AD) is a devastating condition affecting millions worldwide, leading to memory loss and cognitive decline. According to the World Health Organization, more than 55 million people were living with dementia as of 2023, with nearly 10 million new cases each year, and AD is the most common form. Finding effective treatments is therefore critical. One promising strategy is drug repurposing – finding new uses for existing medications – which can speed up the discovery of therapies compared to developing new drugs from scratch [1].
In this context, knowledge graphs (KGs) play a crucial role by organizing complex biological data into structured networks of interconnected nodes (genes, proteins, drugs) and edges (relationships). These graphs allow researchers to uncover hidden patterns and identify potential therapeutic targets, accelerating the search for new treatments [2]. My research focuses on improving the Human Brain Pharmacome (HBP), a biomedical knowledge graph that maps the interactions between drugs and biological pathways in the brain. However, only 67.8% of its relationships are currently causal [3]. By using Knowledge Graph Embedding algorithms, I aim to transform the non-causal links into meaningful causal relationships, helping to better understand how drugs influence brain functions at a molecular level and potentially accelerating the discovery of treatments for AD.
Methods
The HBP knowledge graph comprises 136,838 nodes and 731,974 edges, making it a comprehensive map of entities like pathologies, genes, proteins, and approved active substances sourced from reputable databases such as PubMed, Reactome, and DrugBank. Additional annotations were incorporated using standard terminologies and ontologies [2].
To analyze and predict causal interactions within this network, Knowledge Graph embeddings were employed. KG embeddings are mathematical representations of entities (nodes) and their relationships (edges) within a knowledge graph, mapped into a continuous vector space. This enables complex, multi-dimensional biological data to be represented in a structured format, where similar or related entities are positioned closer together. By capturing these proximity relationships, KG embeddings allow computational models to perform link prediction, inferring hidden interactions and predicting new connections, such as causal relationships, which are vital for understanding drug actions in biological pathways [3].
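To make the idea of "entities as vectors" concrete, here is a minimal sketch in Python with NumPy. The triples, the embedding dimension, and the helper name nearest_entities are illustrative assumptions and are not taken from the HBP or the thesis code. The vectors are random here, so the neighbours only become meaningful once a model has been trained.

```python
import numpy as np

# Toy triples (head, relation, tail); illustrative only, not HBP data.
triples = [
    ("galantamine", "decreases", "AChE"),
    ("AChE", "association", "APP"),
    ("memantine", "association", "GRIN2B"),
]

# Map every entity and relation name to an integer index.
entities = sorted({e for h, _, t in triples for e in (h, t)})
relations = sorted({r for _, r, _ in triples})
ent_id = {name: i for i, name in enumerate(entities)}
rel_id = {name: i for i, name in enumerate(relations)}

# One vector per entity and per relation (random, i.e. untrained).
rng = np.random.default_rng(0)
dim = 8  # assumed embedding dimension
ent_emb = rng.normal(size=(len(entities), dim))
rel_emb = rng.normal(size=(len(relations), dim))


def nearest_entities(name, k=2):
    """Return the k entities closest to `name` by Euclidean distance."""
    query = ent_emb[ent_id[name]]
    dists = np.linalg.norm(ent_emb - query, axis=1)
    order = np.argsort(dists)
    return [entities[i] for i in order if entities[i] != name][:k]


print(nearest_entities("AChE"))
```

After training, a lookup like this is what "related entities are positioned closer together" means in practice.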
The KG embeddings in this study were generated using five distinct models – TransE, TransR, RotatE, ComplEx, and HolE. Each model captures relational patterns differently. For instance, TransE and TransR are translation-based, RotatE and ComplEx embed entities in complex vector spaces (RotatE models each relation as a rotation), and HolE composes entity embeddings through circular correlation. Hyperparameter optimization was conducted to identify the most effective configuration for each model.
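The difference between a translation and a rotation can be sketched with the standard scoring functions of TransE and RotatE. This is a simplified NumPy illustration with hand-picked vectors, not the implementation or the hyperparameters used in the thesis. Function and variable names are assumptions.

```python
import numpy as np


def transe_score(h, r, t):
    """TransE: a plausible triple satisfies h + r ≈ t (higher is better)."""
    return -float(np.linalg.norm(h + r - t))


def rotate_score(h, phase, t):
    """RotatE: the relation rotates h element-wise in the complex plane."""
    rotation = np.exp(1j * phase)  # unit-modulus complex numbers
    return -float(np.linalg.norm(h * rotation - t))


# TransE: the tail is the head shifted by the relation vector.
h = np.array([1.0, 0.0])
r = np.array([0.5, 1.0])
t = h + r
perfect_transe = transe_score(h, r, t)   # zero distance
reversed_transe = transe_score(t, r, h)  # swapping head and tail is penalised

# RotatE: a half-turn applied twice is the identity, so the same relation
# holds in both directions, i.e. it behaves as a symmetric relation.
hc = np.array([1 + 0j, 0 + 1j])
half_turn = np.array([np.pi, np.pi])
tc = hc * np.exp(1j * half_turn)
forward_rotate = rotate_score(hc, half_turn, tc)
reversed_rotate = rotate_score(tc, half_turn, hc)

print(perfect_transe, reversed_transe, forward_rotate, reversed_rotate)
```

The half-turn example shows one pattern that a rotation can express and a non-zero translation cannot: a relation that holds equally well from head to tail and from tail to head.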
The dataset was divided into 80% for training, 10% for validation, and 10% for testing, with an additional 10% of the total links reserved for future predictions. Model performance was evaluated using metrics such as precision, recall, F1 score, hits@k, and Mean Reciprocal Rank (MRR) to assess the accuracy of causal predictions.
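A random 80/10/10 split of triples can be sketched as follows. The function name, the seed, and the placeholder triples are assumptions, and the sketch covers only the train/validation/test partition, not the additional links reserved for future predictions.

```python
import random


def split_triples(triples, train=0.8, valid=0.1, seed=42):
    """Shuffle triples and cut them into train/validation/test parts."""
    shuffled = list(triples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * valid)
    train_set = shuffled[:n_train]
    valid_set = shuffled[n_train:n_train + n_valid]
    test_set = shuffled[n_train + n_valid:]  # remainder
    return train_set, valid_set, test_set


# Placeholder triples; real input would be (head, relation, tail) from the KG.
toy = [(f"head_{i}", "association", f"tail_{i}") for i in range(100)]
train_set, valid_set, test_set = split_triples(toy)
print(len(train_set), len(valid_set), len(test_set))
```

A purely random split like this can leave entities in the test set that never occur in training. Whether and how the thesis handled that case is not stated here.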
Results
After hyperparameter optimization, the performance of the KG embedding models was assessed. The test set, comprising 10% of the data, was used to calculate the percentage of true predictions at different ranks for each model. Among the models – TransE, RotatE, ComplEx, TransR, and HolE – RotatE achieved the highest accuracy across several metrics, including hits@1 (94.84%), hits@3 (98.76%), hits@10 (100.00%), and MRR (96.88%).
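For readers unfamiliar with rank-based metrics, the sketch below shows how hits@k and MRR are computed from the rank that a model assigns to the true answer of each test triple (rank 1 is best). The ranks are made-up toy values, not results from this study.

```python
def hits_at_k(ranks, k):
    """Fraction of test triples whose true answer is ranked in the top k."""
    return sum(rank <= k for rank in ranks) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Toy ranks for five test triples.
ranks = [1, 1, 2, 1, 5]

print(hits_at_k(ranks, 1))          # 3 of 5 answers at rank 1 -> 0.6
print(hits_at_k(ranks, 3))          # 4 of 5 answers in the top 3 -> 0.8
print(hits_at_k(ranks, 10))         # all answers in the top 10 -> 1.0
print(mean_reciprocal_rank(ranks))  # (1 + 1 + 0.5 + 1 + 0.2) / 5, about 0.74
```

A hits@10 of 100% therefore means that the true answer always appeared among the ten highest-scored candidates, while MRR also rewards how close to the top it was placed.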
Subsequently, predictions were validated against existing literature, concentrating on the drug action mechanisms of Galantamine, Rivastigmine, and Memantine in AD. For Galantamine, various pathways were analyzed, leading to predictions that transformed non-causal relationships (e.g., "Association") into causal ones (e.g., "Increases", "Decreases") based on evidence linking acetylcholinesterase (AChE), cholesterol, amyloid precursor protein (APP), and neuroprotective effects through α7 nicotinic acetylcholine receptor (CHRNA7) activation [4-7].
Rivastigmine demonstrated the capacity to lower amyloid-beta (Aβ) levels by increasing PRNP protein levels, which in turn decreased beta-secretase 1 (BACE1) activity and thus Aβ production. This supports the predicted "Decreases" causal relationship between PRNP and BACE1 [8] (see Figure 1).
For Memantine, pathways indicated a negative correlation between excitatory synapses and GRIN2B protein expression, aligning with compensatory mechanisms to mitigate excitotoxicity in AD [9-10]. Additionally, the predicted relationship between CREBBP and HEY1 shifted from an "Interaction" to a "Decreases" causal link due to CREBBP’s regulatory role in gene expression relevant to AD [11].
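The relabeling of a non-causal edge described for the three drugs can be pictured as scoring each candidate causal relation for the same head and tail and keeping the best one. The sketch below uses a TransE-style distance and hand-picked two-dimensional vectors purely for illustration. The function name, the candidate set, and the vectors are assumptions and do not reproduce the trained RotatE model.

```python
import numpy as np

CAUSAL = ("increases", "decreases")  # assumed candidate causal relations


def relabel_edge(head_vec, tail_vec, relation_vecs, candidates=CAUSAL):
    """Score each candidate relation for (head, tail); return the best."""
    scores = {
        rel: -float(np.linalg.norm(head_vec + relation_vecs[rel] - tail_vec))
        for rel in candidates
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hand-picked toy vectors (not learned embeddings).
head = np.array([0.0, 1.0])
tail = np.array([1.0, 0.0])
relation_vecs = {
    "increases": np.array([-1.0, 1.0]),
    "decreases": np.array([1.0, -1.0]),
    "association": np.array([0.0, 0.0]),  # the existing non-causal edge
}

best, scores = relabel_edge(head, tail, relation_vecs)
print(best, scores)  # "decreases" fits exactly, so it scores highest
```

In practice a proposed label would still need the kind of literature validation described above before it is accepted as causal.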
Overall, these findings highlight the effectiveness of the RotatE model in making accurate predictions, and the literature validation supports the biological relevance of the predicted pathways for drug repurposing in Alzheimer's treatment.
Conclusion
This study illustrates the potential of KG embeddings, particularly the RotatE model, in predicting causal relationships for drug repurposing in complex diseases like AD. By leveraging the HBP knowledge graph, my research successfully transformed non-causal relationships into plausible causal links, revealing novel interactions, such as Galantamine's relationship with AChE and other proteins.
The findings emphasize the importance of data quality and the ongoing challenge of distinguishing true causality from mere association, underscoring the need for continuous improvements in data coverage. While the focus of this study is on AD, the methodology is adaptable to various diseases, including cancer and metabolic disorders, making it a valuable tool for identifying new therapeutic uses for existing drugs. This study also highlights the promising potential of AI-driven approaches in drug discovery, paving the way for more efficient and sustainable research practices.
Citations
[1] World Health Organization. “Dementia.” World Health Organization, 2023, www.who.int/news-room/fact-sheets/detail/dementia.
[2] Lage-Rupprecht, V., Schultz, B., Dick, J., Namysl, M., Zaliani, A., Gebel, S., Pless, O., Reinshagen, J., Ellinger, B., Ebeling, C., Esser, A., Jacobs, M., Claussen, C. and Hofmann-Apitius, M. (2022). A hybrid approach unveils drug repurposing candidates targeting an Alzheimer pathophysiology mechanism. Patterns, 3(3), 100433. https://doi.org/10.1016/j.patter.2021.100433
[3] Hogan, A., Blomqvist, E., Cochez, M., D’amato, C., Melo, G.D., Gutierrez, C., Kirrane, S., Gayo, J.E.L., Navigli, R., Neumaier, S., Ngomo, A.-C.N., Polleres, A., Rashid, S.M., Rula, A., Schmelzeisen, L., Sequeda, J., Staab, S. and Zimmermann, A. (2022). Knowledge Graphs. ACM Computing Surveys, 54(4), pp.1–37. https://doi.org/10.1145/3447772
[4] Simons, M., Keller, P., Dichgans, J., & Schulz, J. B. (2001). Cholesterol and Alzheimer's disease: is there a link?. Neurology, 57(6), 1089–1093. https://doi.org/10.1212/wnl.57.6.1089
[5] Xue-Shan, Z., Juan, P., Qi, W., Zhong, R., Li-Hong, P., Zhi-Han, T., Zhi-Sheng, J., Gui-Xue, W., & Lu-Shan, L. (2016). Imbalanced cholesterol metabolism in Alzheimer's disease. Clinica chimica acta; international journal of clinical chemistry, 456, 107–114. https://doi.org/10.1016/j.cca.2016.02.024
[6] Zhang, W. B., Huang, Y., Guo, X. R., Zhang, M. Q., Yuan, X. S., & Zu, H. B. (2023). DHCR24 reverses Alzheimer's disease-related pathology and cognitive impairment via increasing hippocampal cholesterol levels in 5xFAD mice. Acta Neuropathologica Communications, 11, 102. https://doi.org/10.1186/s40478-023-01593-y
[7] Sinkus, M. L., Graw, S., Freedman, R., Ross, R. G., Lester, H. A., & Leonard, S. (2015). The human CHRNA7 and CHRFAM7A genes: A review of the genetics, regulation, and function. Neuropharmacology, 96(Pt B), 274–288. https://doi.org/10.1016/j.neuropharm.2015.02.006
[8] Whitehouse, I. J., Miners, J. S., Glennon, E. B., Kehoe, P. G., Love, S., Kellett, K. A., & Hooper, N. M. (2013). Prion protein is decreased in Alzheimer's brain and inversely correlates with BACE1 activity, amyloid-β levels and Braak stage. PloS one, 8(4), e59554. https://doi.org/10.1371/journal.pone.0059554
[9] Andreoli, V., De Marco, E. V., Trecroci, F., Cittadella, R., Di Palma, G., & Gambardella, A. (2014). Potential involvement of GRIN2B encoding the NMDA receptor subunit NR2B in the spectrum of Alzheimer's disease. Journal of neural transmission, 121(5), 533–542. https://doi.org/10.1007/s00702-013-1125-7
[10] Samojedny, S., Czechowska, E., Pańczyszyn-Trzewik, P., & Sowa-Kućma, M. (2022). Postsynaptic Proteins at Excitatory Synapses in the Brain-Relationship with Depressive Disorders. International journal of molecular sciences, 23(19), 11423. https://doi.org/10.3390/ijms231911423
[11] Huang, Y. H., Cai, K., Xu, P. P., Wang, L., Huang, C. X., Fang, Y., Cheng, S., Sun, X. J., Liu, F., Huang, J. Y., Ji, M. M., & Zhao, W. L. (2021). CREBBP/EP300 mutations promoted tumor progression in diffuse large B-cell lymphoma through altering tumor-associated macrophage polarization via FBXW7-NOTCH-CCL2/CSF1 axis. Signal transduction and targeted therapy, 6(1), 10. https://doi.org/10.1038/s41392-020-00437-8