Thomas Gerlach

Development of a graph neural network based approach for prediction of off-target drug effects

Thomas Gerlach discusses his recently submitted Master's thesis on using graph neural networks to predict unintended drug interactions and off-target effects, with a focus on antibody-based drugs.

Background

The drug development process is complex, time-consuming, and expensive, often taking years and requiring significant financial investment. A major challenge is identifying, early on, promising drug candidates that minimize the risk of adverse effects or contraindications [1]. Despite rigorous testing, drugs may still exhibit unintended interactions, leading to Adverse Drug Reactions (ADRs). This issue is particularly pressing with the rise of antibody-based drugs, which offer targeted treatments but whose off-target effects are hard to predict because little data on them is available.

To address these challenges, this work explores the use of Graph Neural Networks (GNNs) to improve the prediction of unexpected drug interactions. The research aims to adapt and extend MultiGML, a GNN-based model developed by Krix et al. [2], to better predict unintended interactions by incorporating new data on drug contraindications, side effects, and antibody-based therapies.

Methods

The approach includes enriching the training dataset with potential ADRs from PrimeKG [3] and the Secondary Pharmacology Database (SPD) [4], retraining the model, and validating its predictions using empirical data. The study also investigates the model's capability to handle antibody data from TheraSAbDab [5], aiming to predict new applications for antibody-based drugs despite limited data availability.
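As a rough illustration of what enriching a knowledge graph with additional edge sources can look like, the following sketch merges several lists of (head, relation, tail) triples and groups them by relation type. It is not the thesis pipeline: the function names, the identifiers such as `drug:A`, and the `drug-target` relation label are hypothetical placeholders, and no real records from PrimeKG, SPD, or TheraSAbDab are shown.

```python
from collections import defaultdict


def merge_triples(*sources):
    """Union several (head, relation, tail) edge lists.

    Exact duplicates are dropped; first-seen order is preserved.
    """
    seen = set()
    merged = []
    for triples in sources:
        for triple in triples:
            if triple not in seen:
                seen.add(triple)
                merged.append(triple)
    return merged


def group_by_relation(triples):
    """Map each relation type to its list of (head, tail) pairs."""
    by_relation = defaultdict(list)
    for head, relation, tail in triples:
        by_relation[relation].append((head, tail))
    return dict(by_relation)


# Placeholder edges (hypothetical identifiers, not database records).
base_kg = [("drug:A", "drug-target", "protein:P1")]
adr_edges = [
    ("drug:A", "contraindication", "condition:C1"),
    ("drug:A", "drug-target", "protein:P1"),  # duplicate of a base edge
]
antibody_edges = [
    ("antibody:X", "antibody-target", "protein:P1"),
    ("antibody:X", "antibody-condition", "condition:C1"),
]

kg = merge_triples(base_kg, adr_edges, antibody_edges)
edges_by_relation = group_by_relation(kg)
```

Grouping edges by relation type is convenient for relational GNNs, which typically learn separate parameters per relation.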

The research uses Relational Graph Convolutional Networks (RGCNs) to model complex biological interactions. The data is integrated into a knowledge graph (KG) that includes nodes for drugs, proteins, conditions, and antibodies, and relationships such as antibody-target, antibody-condition, and contraindications. Features such as antibody sequences are transformed into fixed-length embeddings. These multimodal embeddings are normalized, fused, and fed into the RGCN architecture. The model generates its output through a modified bilinear decoder that computes the probability of a connection between two nodes in the KG and maps these probabilities to binary classifications that predict the presence or absence of a connection. Its performance is then evaluated using key metrics such as the Area Under the Receiver Operating Characteristic Curve (AUROC) and the Area Under the Precision-Recall Curve (AUPR).
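The decoding and evaluation steps described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the thesis implementation: fusion is assumed to be concatenation of L2-normalized modality vectors, the decoder is a plain bilinear form sigmoid(h_src^T R h_dst) rather than the modified decoder (whose details are not given here), the 0.5 threshold is an assumption, and all function names are hypothetical. The RGCN encoder itself is omitted; the vectors passed to the decoder stand in for its output.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are copied unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def fuse(*modalities):
    """Normalize each modality vector, then concatenate (assumed fusion scheme)."""
    fused = []
    for modality in modalities:
        fused.extend(l2_normalize(modality))
    return fused


def bilinear_probability(h_src, rel_matrix, h_dst):
    """Link probability sigmoid(h_src^T R h_dst) for one relation type."""
    score = sum(
        h_src[i] * rel_matrix[i][j] * h_dst[j]
        for i in range(len(h_src))
        for j in range(len(h_dst))
    )
    return 1.0 / (1.0 + math.exp(-score))


def predict_link(probability, threshold=0.5):
    """Map a probability to a binary decision (threshold is an assumption)."""
    return probability >= threshold


def auroc(labels, scores):
    """AUROC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

In practice, a library routine such as scikit-learn's `roc_auc_score` and `average_precision_score` would usually replace the hand-written metric; the pairwise version here only makes the definition of AUROC explicit.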

This project incorporated a new dataset for ADRs, added antibody data from TheraSAbDab to enhance antibody-target and antibody-condition predictions, retrained the model on the expanded KG, and validated its predictions through a comprehensive literature search.

Results and Conclusion

In benchmarking, the proposed model outperformed traditional models in predicting unintended drug interactions. The findings indicate that GNNs, specifically RGCNs, can effectively model complex biological interactions, making them a powerful tool for predicting drug behavior.

The model was also used to make real-world predictions, which were then evaluated against existing literature. This validation supports the model's ability to detect connections and highlights its practical applicability in identifying potential drug interactions.

A key highlight of this proposed model is its performance in drug repurposing, specifically antibody repurposing, which enhances the efficiency of identifying new uses for existing drugs. Through this work, the thesis contributes to the ongoing evolution of drug discovery processes, emphasizing the integration of Artificial Intelligence (AI) to tackle contemporary challenges in pharmaceutical research and development.

This research underscores the potential of AI in drug development, especially for antibody therapies, paving the way for safer and more effective treatments. The approach demonstrates promise in improving the efficiency and accuracy of predicting off-target drug effects, potentially accelerating the drug development process and reducing the risk of adverse reactions in clinical use.

Fig. 1: Schematic of the knowledge graph, with antibodies as the newly added node type, connected to drugs and proteins by grey arrows. Antibodies are annotated with sequence embeddings. Other nodes include drugs, proteins, and phenotypes, each annotated with their respective descriptive features.

Citations

[1] Lavecchia, Antonio. “Advancing Drug Discovery with Deep Attention Neural Networks.” Drug Discovery Today 29, no. 8 (August 2024): 104067. https://doi.org/10.1016/j.drudis.2024.104067

[2] Krix, Sophia, Lauren Nicole DeLong, Sumit Madan, Daniel Domingo-Fernández, Ashar Ahmad, Sheraz Gul, Andrea Zaliani, and Holger Fröhlich. “MultiGML: Multimodal Graph Machine Learning for Prediction of Adverse Drug Events.” Heliyon 9, no. 9 (September 2023). https://doi.org/10.1016/j.heliyon.2023.e19441

[3] Chandak, Payal, Kexin Huang, and Marinka Zitnik. “Building a Knowledge Graph to Enable Precision Medicine.” Scientific Data 10, no. 1 (February 2, 2023). https://doi.org/10.1038/s41597-023-01960-3

[4] Sutherland, Jeffrey J., Dimitar Yonchev, Alexander Fekete, and Laszlo Urban. “A Preclinical Secondary Pharmacology Resource Illuminates Target-Adverse Drug Reaction Associations of Marketed Drugs.” Nature Communications 14, no. 1 (July 19, 2023). https://doi.org/10.1038/s41467-023-40064-9

[5] Raybould, Matthew I, Claire Marks, Alan P Lewis, Jiye Shi, Alexander Bujotzek, Bruck Taddese, and Charlotte M Deane. “Thera-SAbDab: The Therapeutic Structural Antibody Database.” Nucleic Acids Research 48, no. D1 (September 26, 2019). https://doi.org/10.1093/nar/gkz827