Kiril Klein

Automating drug side effect detection using transformer models for enhanced pharmacovigilance

Kiril Klein introduces the PHAIR project, which aims to improve drug safety surveillance by integrating health data and artificial intelligence. By using transformer models, the project seeks to make the detection of drug side effects more efficient and accurate.

Background

New drugs undergo rigorous testing in randomized controlled trials (RCTs) before being approved for the market. However, these trials often involve selected populations, which can limit the applicability of the results to a broader patient base [1]. This issue is even more pronounced with drug repurposing, where the safety profile may be less established.

To manage these risks, healthcare systems use spontaneous reporting systems to detect potential side effects, supplemented by observational studies. Unfortunately, spontaneous reporting is often underutilized by both patients and practitioners, leading to underreporting of side effects [2]. Observational studies, while valuable, are labor-intensive, time-consuming, and require expert input, which limits their scalability and delays the discovery of new side effects.

Automating side-effect detection

In the PHAIR (Pharmacovigilance by AI Real-time Analyses) project, we aim to address these challenges by leveraging recent advances in electronic health record (EHR) modeling. Specifically, transformer models can learn meaningful representations of patient data [3-6] that can be used for risk estimation in causal inference settings [7].

A crucial aspect of our approach is using transformers to estimate propensity scores – the probability that a patient would be prescribed a specific drug based on their characteristics. Traditionally estimated with simpler models on limited features, propensity scores could greatly benefit from the transformers' ability to handle complex, high-dimensional data [8]. This could lead to more accurate causal estimates, essential for drug safety assessment.
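To make the role of the propensity score concrete, the following is a minimal sketch on synthetic data, not the PHAIR implementation. A plain logistic regression stands in for the transformer, and the two covariates, the simulated cohort, and all function names (fit_logistic, propensity, ipw_risk_difference, simulate) are assumptions made for illustration. The estimated scores are used for inverse probability weighting, one common way to adjust a treated-versus-untreated comparison for confounding.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, t, lr=1.0, epochs=200):
    """Fit logistic regression by batch gradient descent (bias is w[0])."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, ti in zip(X, t):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
            err = p - ti
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def propensity(w, x):
    """Estimated probability of receiving the drug given covariates x."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def ipw_risk_difference(t, y, e, clip=0.01):
    """Normalized inverse-probability-weighted risk difference."""
    num1 = den1 = num0 = den0 = 0.0
    for ti, yi, ei in zip(t, y, e):
        ei = min(max(ei, clip), 1.0 - clip)  # clip extreme scores
        if ti == 1:
            num1 += yi / ei
            den1 += 1.0 / ei
        else:
            num0 += yi / (1.0 - ei)
            den0 += 1.0 / (1.0 - ei)
    return num1 / den1 - num0 / den0


def simulate(n=3000, seed=0):
    """Synthetic cohort (assumption): the drug has no true effect on the
    outcome, but age and comorbidity drive both prescribing and outcome."""
    rng = random.Random(seed)
    X, t, y = [], [], []
    for _ in range(n):
        age = rng.gauss(0.0, 1.0)  # standardized age
        comorbid = 1.0 if rng.random() < 0.3 else 0.0
        p_treat = sigmoid(-0.5 + 1.0 * age + 1.0 * comorbid)
        p_out = sigmoid(-2.0 + 1.0 * age + 1.0 * comorbid)
        X.append([age, comorbid])
        t.append(1 if rng.random() < p_treat else 0)
        y.append(1 if rng.random() < p_out else 0)
    return X, t, y


X, t, y = simulate()
w = fit_logistic(X, t)
e = [propensity(w, x) for x in X]

n_treated = sum(t)
risk_treated = sum(yi for ti, yi in zip(t, y) if ti == 1) / n_treated
risk_control = sum(yi for ti, yi in zip(t, y) if ti == 0) / (len(t) - n_treated)
naive = risk_treated - risk_control
adjusted = ipw_risk_difference(t, y, e)

print(f"naive risk difference:    {naive:.3f}")
print(f"IPW-adjusted difference:  {adjusted:.3f}")
```

Because the simulated drug has no effect, any gap between treated and untreated patients in the naive comparison is confounding, and weighting by the estimated propensity score should pull the estimate toward zero. In a setting like the one described above, the hand-picked covariates would be replaced by learned patient representations, which is where a more expressive model could matter.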

By automating and scaling the detection of drug and side effect pairs, our approach marks a significant departure from traditional, expert-driven methods in pharmacovigilance. Transformers could not only accelerate this process but also enhance the precision of findings, ultimately improving patient safety.

Supported by the Innovation Fund Denmark and in collaboration with the Danish Medicines Agency, our goal is to equip epidemiologists with advanced tools that make drug safety monitoring more efficient and responsive to emerging risks.

Read more about the PHAIR project here: https://di.ku.dk/english/news/2022/faster-knowledge-of-side-effects-via-artificial-intelligence/

Fig. 1: Patient data from electronic health records is used as input to the representation learning transformer. These representations are used to predict propensity scores and outcomes, which are critical for causal inference. Compared to traditional methods, more information can be retained while labor-intensive steps such as confounder and propensity score model selection are automated.

Citations

[1] Kostis, John B., and Jeanne M. Dobrzynski. “Limitations of randomized clinical trials.” The American Journal of Cardiology 129 (2020): 109-115.

[2] Hazell, Lorna, and Saad AW Shakir. “Under-reporting of adverse drug reactions.” Drug Safety 29.5 (2006): 385-396.

[3] Odgaard, Klein et al. “CORE-BEHRT: A Carefully Optimized and Rigorously Evaluated BEHRT.” Machine Learning for Healthcare Conference 2024.

[4] Li, Yikuan, et al. “BEHRT: transformer for electronic health records.” Scientific Reports 10.1 (2020): 7155.

[5] Rasmy, Laila, et al. “Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction.” NPJ Digital Medicine 4.1 (2021): 86.

[6] Lentzen, Manuel, et al. “A transformer-based model trained on large scale claims data for prediction of severe COVID-19 disease progression.” IEEE Journal of Biomedical and Health Informatics 27.9 (2023): 4548-4558.

[7] Rao, Shishir, et al. “Targeted-BEHRT: deep learning for observational causal inference on longitudinal electronic health records.” IEEE Transactions on Neural Networks and Learning Systems 35.4 (2022): 5027-5038.

[8] Schneeweiss, Sebastian, et al. “High-dimensional propensity score adjustment in studies of treatment effects using health care claims data.” Epidemiology 20.4 (2009): 512-522.