© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).
OPEN ACCESS
Polypharmacy, exacerbated by aging populations and global health crises, underscores the need for accurate drug-drug interaction (DDI) prediction. This review article offers a thorough analysis of the latest advancements in machine learning (ML) models for predicting DDIs. The review zeroes in on the progress made since 2020, a notable period characterized by a significant increase in both the volume of drug-related datasets and the sophistication of deep learning (DL) techniques. We meticulously examine various dataset sources pivotal in the development of these models and delve into the methodologies employed for featurizing molecular structures and biological data. The article further explores a range of DL models and graph neural networks, assessing their efficacy in the accurate prediction of DDIs. Through a comparative analysis, we elucidate the strengths, limitations, and potential challenges faced by these models. Crucially, the review underscores the necessity of incorporating comprehensive clinical and biochemical factors to augment the real-world applicability and accuracy of DDI predictions. This comprehensive overview not only sheds light on the current state of DDI predictive modeling but also paves the way for future research directions, emphasizing the need for more advanced, adaptable models in the dynamic landscape of polypharmacy and drug interactions.
polypharmacy, drug-drug interactions (DDIs), machine learning in pharmacology, deep learning models, graph neural networks for DDIs, clinical data integration in drug interaction prediction
Drugs are chemical substances capable of inducing biological effects when introduced into a living organism, traditionally derived from medicinal plants and increasingly synthesized through organic chemistry [1, 2]. Advances in pharmaceutical science have led to the creation of comprehensive databases supporting research into drug mechanisms and interactions [3, 4]. A prime example is DrugBank [3], which lists 15,451 drug entries, including approved small molecule drugs, biologics, nutraceuticals, and experimental drugs. In clinical treatment, managing diseases often requires multiple drugs [5], as combination therapies generally yield higher success rates than monotherapies [6]. For instance, drug cocktails like doxorubicin, cyclophosphamide, vincristine, and prednisone are standard in cancer chemotherapy [7], while co-administration of anti-tuberculosis drugs enhances efficacy and delays resistance [8].
Nevertheless, the risk of harmful drug-drug interactions (DDIs) rises with the number of medications given to a patient [9]. For example, one study [10] reported that more than one-third of elderly Americans regularly take five or more medications or supplements, and that 15% are at high risk of serious DDIs. Beyond interactions between therapeutic medications, DDIs also include interactions with food, metabolites, endogenous chemicals, and diagnostic agents [11]. They can change the nature, intensity, duration, side effects, and toxicity of drugs by either amplifying (synergistic action) or decreasing (antagonistic action) their efficacy [12]. Reactions resulting from DDIs can be advantageous, negligible, or detrimental [13]; in drug safety, the detrimental interactions are the main concern.
Understanding the pharmacological actions of each drug in a combination therapy is crucial for optimizing therapeutic efficacy while minimizing adverse reactions [14]. DDIs become even more important in aging populations, because elderly patients often take several medications at once. A 2019 investigation by the Health Insurance Review and Assessment Service (HIRA), based on Korean National Health Insurance (NHI) data, found that 41.8% of outpatients aged 65 and older were prescribed five or more drugs and 14.4% were prescribed 10 or more [15]. The COVID-19 pandemic further underscored the importance of DDI assessment, particularly for infected individuals with underlying comorbidities who are routinely on multiple drugs. Because DDIs are difficult to identify, and because the number of possible pairings among thousands of drugs is very large, computational approaches have flourished in recent years, with renewed interest in ML methods. The model that garnered the most attention was DeepDDI, the first of its kind to use DL techniques to predict the interactions that may arise when any two drugs are combined [16]. This innovation opened a new era in DDI prediction and led to many more complex ML models that draw on larger and more varied data sources and analyze them with deeper and more sophisticated learning techniques.
The rapid progress in this field makes this a fitting time for a comprehensive survey of modern DL models. First, the review examines the dataset sources used by these models and explains how the data are varied and detailed, covering more than a single type of DDI, which helps make predictions more accurate. Second, it considers the featurization of model input data, showing how molecular structures, graphs encoding DDIs, and biological knowledge are processed and integrated into the models. Third, it assesses the ML methodologies used to predict DDIs and their effectiveness, as illustrated in Figure 1.
This review also outlines future research prospects in DDI prediction and identifies gaps that researchers can address to advance the field. In particular, it highlights the areas that must be considered to improve the accuracy, reliability, and practical convenience of DDI predictions in pharmaceutical and clinical settings. Taken together, the purpose of this review is to guide future research and development in DDI prediction.
Figure 1. Overview pipeline of DDI prediction based on drug features and deep learning models
DDIs are shaped by a vast array of influences, including but not limited to chemical substructures, biological targets, and enzymatic actions, and are not merely binary relationships. Existing datasets in this field are rich repositories of drug-related data, encompassing diverse aspects such as the drugs’ mechanisms of action, intricate protein structures, and the wide-ranging effects in pharmacogenomics. With this volume of data, researchers can predict not only whether an interaction is likely but also how complex it is, which in turn helps explain why the interaction occurs. We examine several important chemical databases as well as some bioinformatics databases. Table 1 lists these databases and summarizes the properties that are relevant to DDI prediction. As the field of DDI prediction evolves rapidly, the table serves as a reference to the characteristics of each database and allows researchers to select the resources that suit their specific research needs.
Table 1. Comprehensive overview of key databases utilized in drug-drug interaction prediction [17]
Database | Pub. Year | Num. of Drugs | Num. of Drug-Related Pairs
KEGG [18] | 1995 | 11,147 | 324,183 DDIs
DrugBank [3] | 2006 | 1,706 | 191,808 DDIs
SIDER [19] | 2008 | 1,430 | 139,756 drug-side effect pairs
TWOSIDES [20] | 2012 | 645 | 4,649,441 DDIs
OFFSIDES [20] | 2012 | 1,332 | 18,842 drug-event associations
BIOSNAP [21] | 2018 | 1,322 | 41,520 DDIs
2.1 KEGG
KEGG is an essential resource for studying the functions and utilities of biological systems from large-scale molecular datasets generated by genome sequencing and other high-throughput techniques. It comprises 16 individual databases, categorized into systems, genomic, chemical, and health information, and includes notable components such as KEGG DRUG and KEGG PATHWAY. This section focuses on KEGG DRUG, a collection of broad data on approved drugs, drugs in development, and similar substances, organized by chemical structure. Each drug entry is catalogued under a distinct drug number and enriched with KEGG’s own annotations, including information on drug metabolism. The result is a thorough compilation of 1,925 approved medications and their extensive network of interactions, encompassing a total of 324,183 interactions and 11,147 medications.
2.2 DrugBank
DrugBank, an openly accessible online resource, aggregates a wealth of information on drugs, their targets, mechanisms, and interactions. First released as version 1.0 in 2006, the database had reached version 5.1.9 as of 2022. That version catalogs 14,944 drug entries, including 2,729 approved small molecule drugs, 1,564 approved biologics (such as allergenics and proteins), and more than 6,713 experimental drugs, including those in the discovery phase. In DDI research, DrugBank is mainly used as a benchmark for predicting the type of interaction between two drugs, identified by their SMILES sequences, under binary, multi-class, and multi-label classification settings. Version 5.1.4, a popular choice for comparative research, contains 1,706 drugs and 191,808 drug pairs categorized into 86 distinct DDI types.
2.3 SIDER
The Side Effect Resource (SIDER) compiles extensive information on the side effects of marketed drugs, giving a comprehensive view of their effects and possible adverse reactions. Its data support the prediction of potential adverse effects from the chemical compositions, binding fingerprints, and other pertinent chemical properties of drug candidates. It also links side effect information with other chemical biology resources, which benefits pharmacological and medical research. The most recent version, 4.1, includes 1,430 drugs, 5,868 distinct side effects, and 139,756 drug-side effect pairs.
2.4 TWOSIDES
TWOSIDES is an extensive database created to track polypharmacy side effects arising from drug pairs or more complex drug combinations. It contains 868,221 associations between 59,220 drug pairs and 1,301 unique adverse events. It is particularly notable for its 3,782,910 significant associations, in which the drug combination shows a higher side-effect association score than either drug alone. This score is calculated using the proportional reporting ratio (PRR). TWOSIDES also provides information on 645 drugs and the side effects resulting from 63,473 unique drug combinations. Given two drugs, identified by their SMILES sequences, the task on this database is to predict all of their possible side effects, a multi-label classification problem.
2.5 OFFSIDES
OFFSIDES is an extensive collection of 438,801 off-label side effects involving 1,332 drugs and 10,097 adverse events. Here, ‘off-label’ refers to side effects not listed on the official drug label approved by the US Food and Drug Administration (FDA), as opposed to ‘on-label’ effects, which are listed. On average, an FDA drug label lists 69 on-label adverse events, whereas OFFSIDES records an average of 329 high-confidence off-label adverse events per drug. OFFSIDES also recovers, from adverse event reports, 38.8% (18,842) of the drug-event associations recorded in the SIDER database.
2.6 BIOSNAP
The BIOSNAP dataset is an expansive collection that aggregates a wide array of interactions among FDA-approved drugs, formulated through the construction of a biological network. In this network, each node symbolizes a distinct drug, while the edges depict the interactions between these drugs. Specifically, this dataset includes 41,520 labeled DDIs and encompasses 1,322 FDA-approved drugs. These DDIs are meticulously extracted and compiled from a variety of sources, including detailed drug labels and a range of scientific publications, providing a rich and comprehensive resource for understanding the complex interplay between different pharmaceutical compounds.
Molecular representation is critical in many drug-related tasks, and particularly in DDI prediction. Consider tranylcypromine, a substance known as an inhibitor of the monoamine oxidase enzyme [22]. It is a nonselective, irreversible agent used as an antidepressant and anxiolytic in the treatment of mood and anxiety disorders.
A drug molecule can be represented in many ways, one of which is SMILES notation [23]. Experimental and computational studies have produced systematic schemes for laying out molecular structure, which scientists use to view and analyze a compound at a glance, as shown in Figure 2. The choice of molecular representation matters because it determines how computational techniques, especially DL models, operate when predicting DDIs, whether the input is SMILES notation or another form. An accurate and informative representation improves the precision of DDI predictions and yields more insight into the pharmacological effects behind a given DDI.
Figure 2. Different representations of a drug molecule including fingerprint, graph, SMILES, InChI, and 3D electrostatic potential
3.1 SMILES sequence
SMILES, the Simplified Molecular Input Line Entry System, is one of the most widely used tools in chemical computing. A SMILES string is a sequence of ASCII characters that can describe complex chemical structures. Atoms are written as their element symbols, and dedicated symbols encode stereochemistry, chemical bonds, and branching patterns.
The strength of SMILES is that it reduces an intricate chemical structure to a simple, easily read line of text. The molecular graph is treated as a tree and traversed depth-first, and the visited atoms and bonds are written out as a sequence of characters. DL models for sequential data offer a well-rounded way to handle such input [24, 25], interpreting SMILES strings much as humans read a string of text.
Sequence-based representations offer several advantages: they are compact, memory-efficient, and easy to search. They are also well suited to encoding molecular structures because they can be processed like words in a sentence. Techniques such as Mol2Vec and FCS extract chemical context from SMILES sequences by capturing the chemical relationships within a molecule, much as NLP methods capture the context of words [26, 27]. In summary, the compact and highly adaptable SMILES representation provides a natural entry point for applying DL models to DDI prediction and many other chemical tasks.
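To make the sequence view concrete, the following minimal sketch splits a SMILES string into tokens and maps them to integer ids, the form in which a sequence model would typically consume them. The regular expression, the function names, and the non-stereo SMILES string used for tranylcypromine are our own illustrative assumptions; the sketch covers only a common subset of SMILES syntax and does not reproduce the tokenization of Mol2Vec, FCS, or any reviewed model.

```python
import re

# Illustrative pattern (an assumption, not a full SMILES grammar): bracket atoms,
# two-letter halogens, two-digit ring closures, organic-subset atoms, aromatic
# atoms, bond/branch symbols, and single-digit ring closures.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOPSFI]|[bcnops]|[=#\-+\\/:.~@*()]|\d"
)


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens, failing loudly on unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    # findall silently skips unmatched characters, so verify nothing was lost.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


def encode(tokens, vocab):
    """Map tokens to integer ids, adding unseen tokens to the vocabulary."""
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]


tranylcypromine = "NC1CC1c1ccccc1"  # non-stereo SMILES, used for illustration
tokens = tokenize_smiles(tranylcypromine)
vocab = {}
ids = encode(tokens, vocab)
print(tokens)
print(ids)
```

Matching multi-character tokens such as `Cl` and bracket atoms before single characters keeps chemically meaningful units together, which is the main reason to prefer a token-level split over a character-level one.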
3.2 2D graph
Graph-based molecular representations describe a drug as a two-dimensional molecular graph in which atoms are nodes and bonds are edges. RDKit is a software toolkit that can convert a SMILES string into its 2D structure.
Each node is then assigned a set of atomic features determined by its atom type. Every node in a molecule’s graph starts with a 78-dimensional feature vector built from atomic attributes such as the atomic symbol, implicit valence, aromaticity, number of neighboring atoms, and number of attached hydrogens. The resulting molecular graph representation, such as the one for tranylcypromine, consists of atom indices, atomic features, and edge features, and makes it possible to extract important structural information. 2D GNN models often rely on message passing neural networks (MPNNs), a classic approach for encoding graph-structured data. Because 2D graphs are commonly stored as adjacency matrices, 2D GNNs can efficiently and accurately combine the properties of adjacent atoms or chemical bonds, and the weights used in message passing are optimized during training. Compared with sequence-based methods, graph-based representations have the advantage of extracting structural information through graph convolution operations, in which bond weights within the message-passing network are updated and optimized, enhancing their utility in various computational chemistry tasks.
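The message-passing idea described above can be sketched in a few lines. The example below performs one round of sum aggregation over an adjacency matrix; the function name, the two scalar weights, and the two-dimensional toy features are assumptions made for illustration. In an actual MPNN the scalar weights would be learned matrices followed by a nonlinearity, and the node features would be the 78-dimensional vectors described in the text.

```python
def message_passing_step(adjacency, features, self_weight=1.0, neighbor_weight=1.0):
    """One round of sum-aggregation message passing on an undirected graph.

    adjacency: square 0/1 matrix as a list of lists.
    features:  one feature vector (list of floats) per node.
    """
    n = len(adjacency)
    dim = len(features[0])
    updated = []
    for i in range(n):
        # Aggregate: sum the feature vectors of all neighbors of node i.
        message = [0.0] * dim
        for j in range(n):
            if adjacency[i][j]:
                for k in range(dim):
                    message[k] += features[j][k]
        # Update: combine the node's own state with the aggregated message.
        updated.append([
            self_weight * features[i][k] + neighbor_weight * message[k]
            for k in range(dim)
        ])
    return updated


# Toy three-atom chain C-C-O with hypothetical features [is_carbon, is_oxygen].
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
hidden = message_passing_step(adjacency, features)
print(hidden)  # [[2.0, 0.0], [2.0, 1.0], [1.0, 1.0]]
```

After one step the middle carbon already encodes the presence of the neighboring oxygen; stacking further steps would propagate information over longer paths in the molecule.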
3.3 3D graph
While 2D molecular graphs effectively capture structural connectivity among atoms, they fall short in representing spatial configurations crucial for understanding real-world drug interactions. The 3D graph representation addresses this limitation by incorporating the spatial coordinates of atoms within a molecule, enabling a more precise depiction of its geometry and conformation. This is especially important for modeling inter-molecular interactions such as ligand-receptor binding, where three-dimensional shape, orientation, and atomic distances determine binding affinity and biological activity. Applications of 3D graphs include the generation of conformer ensembles and the accurate prediction of molecular properties like binding energy, reactivity, or selectivity [28, 29]. By including features such as atomic coordinates, bond angles, and torsional geometry, 3D graph-based models provide richer information and enable deep learning architectures—such as 3D graph neural networks—to learn spatial dependencies that are critical in drug–target or drug–drug interaction prediction.
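As a rough illustration of how spatial coordinates enter a 3D graph, the sketch below connects atoms whose Euclidean distance falls within a cutoff and stores the distance as an edge feature. The coordinates, the cutoff value, and the function name are hypothetical; published 3D GNNs typically add bond angles and torsions and obtain coordinates from conformer generation, none of which is reproduced here.

```python
import math


def build_3d_graph(coords, cutoff):
    """Connect atoms whose Euclidean distance is at most `cutoff`.

    Returns a list of (i, j, distance) edges with i < j.
    """
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                edges.append((i, j, d))
    return edges


# Hypothetical atomic coordinates in angstroms (not taken from a real conformer).
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 2.0, 0.0), (9.0, 9.0, 9.0)]
edges = build_3d_graph(coords, cutoff=3.0)
for i, j, d in edges:
    print(i, j, round(d, 2))
```

Unlike a 2D bond graph, a distance-based graph can link atoms that are close in space but not bonded, which is the kind of spatial dependency the text refers to.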
3.4 Drug-drug interaction network
DDI research is multifaceted, combining biological, chemical, and other drug-related information to estimate the likelihood that two drugs interact. One way to untangle the complex web of DDIs is to build a DDI network. Such a network maps a set of drug molecules and the known or possible interactions between them, giving a clearer picture of how the drugs interact.
In DDI prediction, the task is often cast as a missing link prediction problem in which drugs are nodes and established interactions are the edges connecting them. To build a predictive model, each drug is converted into a feature vector derived from its interaction profile, that is, its known interactions with other drugs. This representation supports models that identify potential DDIs, offering a clearer view of the complex interplay between drugs and supporting better-informed decisions about their use.
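The link-prediction framing can be illustrated with a small sketch. It builds a binary interaction-profile vector for each drug from a list of known DDIs and scores an unobserved pair by its number of shared interaction partners. The common-neighbors score is a classic link-prediction baseline that we substitute here for illustration; it is not the method of any model reviewed in this article, and the drug identifiers are placeholders.

```python
def interaction_profiles(drugs, known_ddis):
    """Build a binary interaction-profile vector for each drug."""
    index = {d: i for i, d in enumerate(drugs)}
    profiles = {d: [0] * len(drugs) for d in drugs}
    for a, b in known_ddis:
        # Interactions are treated as undirected, so mark both directions.
        profiles[a][index[b]] = 1
        profiles[b][index[a]] = 1
    return profiles


def common_neighbor_score(profiles, a, b):
    """Count drugs known to interact with both a and b."""
    return sum(x & y for x, y in zip(profiles[a], profiles[b]))


drugs = ["A", "B", "C", "D"]  # placeholder drug identifiers
known_ddis = [("A", "C"), ("B", "C"), ("A", "D")]
profiles = interaction_profiles(drugs, known_ddis)

# Score candidate (unobserved) pairs; a higher score suggests a likelier link.
score_ab = common_neighbor_score(profiles, "A", "B")
score_cd = common_neighbor_score(profiles, "C", "D")
print(profiles["A"], score_ab, score_cd)
```

In a learned model the profile vectors would be passed to a classifier or replaced by trained embeddings, but the input and output of the task have the same shape as in this sketch.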
3.5 Heterogeneous graphs
A heterogeneous graph (HetG) is a rich information repository that contains structured relations among diverse types of nodes together with unstructured content related to each node [30]. HetGs have shown merit in predicting DDIs, and they have also been reported to help characterize the function of cancer-related gene pairs.
In the context of DDIs, a HetG is usually denoted as G = (V, E, O_V, R_E), where V is the set of nodes, E is the set of links, O_V is the set of object types, and R_E is the set of relation types. Moreover, each node within this HetG carries heterogeneous content, such as attributes and properties. The graph encodes relationships between several types of entity pairs, such as a drug and the protein that it targets, a drug and the side effects that it elicits, and a drug and the diseases that it treats. To illustrate, consider a biological heterogeneous graph centered on a drug such as Fulvestrant. Nodes drawn in different colors represent different entity types and arrows indicate the direction of each relation, so the graph itself conveys more information than a graph of drugs alone.
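The four components of a heterogeneous graph, nodes, links, object types, and relation types, map directly onto a small data structure. The class below is a minimal sketch with names of our own choosing; the entries around Fulvestrant are illustrative placeholders and are not taken from a figure or dataset in this review.

```python
class HeteroGraph:
    """Minimal typed graph holding nodes, links, object types, relation types."""

    def __init__(self):
        self.node_type = {}  # node name -> object type
        self.edges = {}      # relation type -> set of (source, target)

    def add_node(self, name, obj_type):
        self.node_type[name] = obj_type

    def add_edge(self, src, relation, dst):
        if src not in self.node_type or dst not in self.node_type:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.setdefault(relation, set()).add((src, dst))

    def object_types(self):
        return set(self.node_type.values())

    def relation_types(self):
        return set(self.edges)

    def neighbors(self, src, relation):
        """Targets reachable from `src` through edges of one relation type."""
        return sorted(dst for s, dst in self.edges.get(relation, ()) if s == src)


g = HeteroGraph()
# Illustrative entries only; real graphs are built from curated databases.
g.add_node("Fulvestrant", "drug")
g.add_node("ESR1", "protein")
g.add_node("breast cancer", "disease")
g.add_node("nausea", "side_effect")
g.add_edge("Fulvestrant", "targets", "ESR1")
g.add_edge("Fulvestrant", "treats", "breast cancer")
g.add_edge("Fulvestrant", "causes", "nausea")

print(sorted(g.object_types()))
print(g.neighbors("Fulvestrant", "targets"))
```

Keeping edges grouped by relation type makes it cheap to run a separate aggregation per relation, which is the pattern relational GNNs rely on.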
3.6 Knowledge graphs
Knowledge graphs (KGs) have emerged as a valuable resource in the realm of drug discovery, garnering attention from both the academic community and various sectors within the field [31]. KGs offer a structured representation of human knowledge, and their application has proven beneficial in the drug discovery domain. The extraction of high-order semantic characteristics that enhance the estimate of DDIs is made possible by these KGs, which allow the smooth integration of various entity kinds and association interactions among biological entities.
A knowledge graph typically takes the form G = (E, R, F), where E represents the set of entities, R denotes the set of relations, and F encompasses the set of facts. Facts within the KG are expressed as triples (h, r, t) ∈ F, where h and t are entities connected by relation r. Entities are depicted as nodes, each marked by a distinct color and letter, and represent real-world biological objects such as drugs, targets, and side effects. Relationships (edges) illustrate the connections between entities, and they incorporate semantic descriptions, encompassing types and properties with well-defined meanings, including associations like Drug-Disease, Drug-Target Gene, and Drug-Brite.
As an illustration of their practical application, KGs have played a pivotal role in addressing challenges posed by the COVID-19 pandemic [32, 33]. Notably, there exist several knowledge graphs tailored to various facets of the drug discovery process, such as Clinical Knowledge Graph, DRKG, OpenBioLink, PharmKG, BioKG, and Hetionet. A comprehensive review of these KGs is beyond the scope of this work, and interested readers are encouraged to consult dedicated reviews on the subject [34].
Utilizing a feature vector that captures the molecular makeup of a drug has proven effective in predicting DDIs with considerable accuracy. However, integrating additional biological information could further refine these predictions and enhance their interpretability. For instance, in models forecasting DDIs, various biological elements have been employed alongside molecular characteristics. These include drug target proteins in DDIMDL [35] and MDF-SA-DDI [36], DDI networks in DPDDI [37] and deepMDDI [38], and biological/drug knowledge graphs in MUFFIN [39], SumGNN [40], and BioDKG-DDI [41]. Other aspects, such as gene ontology terms in Lee et al.’s model [42] and gene expression signatures in DeSIDE-DDI [59], have also been considered. Key proteins like cytochromes P450 or drug target proteins are crucial in influencing DDIs. The drug target information for a medication is represented as a binary vector, where ‘1’ signifies a drug target protein and ‘0’ a non-target. Comparing these binary vectors for two drugs can hint at potential interactions: similar vectors suggest a higher likelihood of a DDI, because both drugs may interact with the same protein target and influence each other’s drug-target interaction.
To enhance DDI prediction, graph neural networks (GNNs) have also been employed in a manner that differs from the molecular structure representation discussed earlier (where the nodes and edges of a graph represent the atoms and bonds of a drug). For biological data, the nodes and edges signify drugs and their interactions, respectively, turning the GNN framework into a DDI network. A prime example is DPDDI [37], which uses a graph convolutional network (GCN) to aggregate and update information from connected neighboring drugs. Applying a GCN to multiple drugs creates a latent feature vector for each drug (node), which serves as input for an ML model to predict DDIs. Incorporating additional biological data like genetic, protein, and/or chemical interactions, along with gene ontology [43], can reveal more varied DDI impacts.
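The binary target encoding and the vector comparison described above can be sketched as follows. The protein list, the target assignments of the two placeholder drugs, and the choice of the Jaccard index as the similarity measure are all assumptions made for illustration; the reviewed models may use other similarity measures or learn the comparison end to end.

```python
def target_vector(drug_targets, all_proteins):
    """Encode a drug as a binary vector: 1 for a target protein, 0 otherwise."""
    return [1 if protein in drug_targets else 0 for protein in all_proteins]


def jaccard(u, v):
    """Jaccard similarity between two binary vectors of equal length."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 0.0


# Hypothetical protein universe and target sets, not taken from any database.
proteins = ["CYP3A4", "CYP2D6", "ESR1", "MAOA"]
drug_x = target_vector({"CYP3A4", "MAOA"}, proteins)
drug_y = target_vector({"CYP3A4", "CYP2D6"}, proteins)

similarity = jaccard(drug_x, drug_y)
print(drug_x, drug_y, round(similarity, 3))
```

A shared entry such as the first position signals a common protein, which is the situation the text associates with a higher likelihood of interaction.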
Integrating both molecular and biological characteristics has notably enhanced DDI prediction performance compared to using only molecular structures. DDIMDL [35], a DDI prediction model, uses four different drug features: drug targets, relevant enzymes (mainly cytochromes P450), pathways (including those with drug targets), and molecular structures. To determine the impact of each feature on DDI prediction, various model versions were created, each utilizing different feature combinations. The version relying solely on molecular structure surpassed those based on drug targets, enzymes, or pathways in terms of accuracy. However, models that combined target and enzyme features with molecular data showed even better prediction accuracy. The advantages of using biological features were also confirmed by Lee et al. [42]. Their model, which merged three similarity profiles for two drugs, encompassing molecular structures, target genes, and gene ontology terms, yielded higher classification accuracy than those using only structural similarity profiles.
Beyond GNNs, various models for embedding knowledge graphs have been applied to distill essential semantic features from these graphs. A knowledge graph typically consists of facts formatted as triplets (h, r, t), where ‘h’ and ‘t’ denote the head and tail entities, respectively, and ‘r’ indicates their interconnecting relationship. These relationships can vary widely within a single knowledge graph. For instance, DRKG is a widely used knowledge graph in developing DDI prediction models, offering insights into the connections between diverse entities such as drugs, diseases, biological processes, and side effects. To leverage all this drug-related information for DDI prediction, it is crucial first to translate the entities and their multifaceted semantic relationships in DRKG into a more manageable, low-dimensional format using suitable knowledge graph embedding models.
One such model is 3WDDI [44], which utilizes embedding vectors derived from DRKG. This process employs ComplEx [45], a model known for its semantic similarity-based approach to knowledge graph embedding. The embeddings are then fed into a downstream model, which uses them to predict the probability that a pair of drugs will have a DDI.
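For readers unfamiliar with ComplEx, its scoring function takes complex-valued embeddings and returns the real part of the sum over dimensions of h_k · r_k · conj(t_k). The sketch below computes this score for hand-picked two-dimensional embeddings and passes it through a sigmoid. The embedding values are invented for illustration rather than trained, and the sigmoid is our assumption for turning a score into a probability; it does not describe the downstream model of 3WDDI.

```python
import math


def complex_score(head, relation, tail):
    """ComplEx triple score: Re(sum_k h_k * r_k * conj(t_k))."""
    total = sum(h * r * t.conjugate() for h, r, t in zip(head, relation, tail))
    return total.real


def sigmoid(x):
    """Logistic function mapping a real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hand-picked complex embeddings (illustrative, not learned from DRKG).
drug_a = [1 + 1j, 0.5 - 0.5j]
interacts_with = [1 + 0j, 0 + 1j]
drug_b = [1 + 1j, 0.5 + 0.5j]

forward = complex_score(drug_a, interacts_with, drug_b)
backward = complex_score(drug_b, interacts_with, drug_a)
print(forward, backward, round(sigmoid(forward), 3))
```

The forward and backward scores differ here because the relation embedding has a nonzero imaginary part, which is how ComplEx represents asymmetric relations; a purely real relation embedding would give a symmetric score.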
Despite these advances, limitations remain in biological information integration, particularly concerning the scarcity and imbalance of biological data—especially for rare drug targets or newly approved compounds. Incomplete annotations for certain proteins or pathways can hinder generalization across datasets and limit the applicability of models in real-world settings. Furthermore, biological data is often heterogeneous and noisy, posing challenges for direct incorporation. To mitigate these limitations, recent studies have proposed hybrid fusion approaches that combine biological and chemical features in a unified framework. For instance, models like Bio-JOIE and DeepDDS explore joint embeddings of molecular graphs and biological knowledge graphs to enhance robustness. Other works have introduced attention-based fusion strategies and graph-level contrastive learning to reconcile discrepancies between heterogeneous sources. These hybrid methodologies represent a promising direction for future research, aiming to balance informativeness, interpretability, and data availability across multiple biological domains.
This rapidly advancing field has produced a large number of techniques, as the timeline in Table 2 shows. The table lists recently developed deep and graph learning techniques in chronological order of publication. For each model, the columns give the model name, the type of input, the architectural framework, the representation method, and the classification task.
Beginning in 2021 with SumGNN, the path was paved for a variety of new techniques, including GNNs with knowledge graph/subgraph representations and attention mechanisms, and GCNs on molecular graphs with contrastive learning. Methods such as MIRACLE and SSI-DDI perform binary classification from SMILES input, operating on molecular graphs and molecular substructures, respectively, while others such as AAEs apply adversarial autoencoders to knowledge graphs. This trend reflects a shift toward more complex and diverse data representations and more intricate, multifaceted architectural strategies. Entering 2022, we observe a growing diversification of models and approaches. GNNs on molecular graphs, such as GNN-DDI, and MFFGNN with multi-type features, continue the trend. The increasing complexity of these models is illustrated by DeepDrug, which introduces an RGCN architecture. Furthermore, this year shows the widening scope of the field, with models employing drug features, biomedical networks, and directed graphs.
In 2023, the trend towards more intricate models continues with DSN-DDI and DGNN-DDI, among others, embracing dual-view encoders and directed MPNNs. This not only reflects the ongoing refinement of existing methodologies but also the introduction of novel approaches to tackle the complexity of DDI prediction. Each model’s contribution to the field is further delineated by their unique approaches to data representation and processing architecture, be it through attention mechanisms, contrastive learning, or capsule networks.
In our analysis, we compared a range of recent deep and graph learning models, focusing on their capabilities in binary, multi-class, and multi-label classification tasks for DDI prediction. This comparison, presented in Tables 3 and 4, evaluates the models on the binary classification task using two benchmark datasets: DrugBank and TWOSIDES. The evaluation metrics are the Area Under the Precision-Recall Curve (AUPRC), Accuracy (ACC), Area Under the Receiver Operating Characteristic Curve (AUROC), and F1-score, calculated with 5-fold cross-validation to ensure robustness and reliability. Higher values of these metrics indicate better predictive performance. Although different models may divide training and test data differently, the comparison remains meaningful because each model’s performance is appraised under the same criteria.
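To clarify what three of these metrics measure, the sketch below computes accuracy, F1-score, and AUROC for a toy set of binary DDI predictions. The labels, scores, and the 0.5 decision threshold are invented for illustration; AUROC is computed from its pairwise-ranking definition, and the sketch omits AUPRC and cross-validation. In practice an established library would normally be used instead of hand-written functions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = interacting pair) and model scores, invented for illustration.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = auroc(y_true, scores)
print(round(acc, 3), round(f1, 3), round(auc, 3))
```

Accuracy and F1 depend on the chosen threshold, whereas AUROC depends only on how the scores rank positives against negatives, which is one reason the metrics are reported together.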
Table 2. Summary of DDI prediction models
Model | Input | Architecture | Representation | Classification
SumGNN (2021) [40] | SMILES/Drug ID | GNN + attention | Knowledge graph/subgraph | Multi-class/multi-label
MIRACLE (2021) [46] | SMILES | GCN + Contrastive learning | Molecular graph | Binary
SSI-DDI (2021) [47] | SMILES | GAT + attention | Substructure | Binary
AAEs (2021) [48] | Drug ID | Adversarial autoencoders | Knowledge graph | Binary
GNN-DDI (2022) [49] | SMILES | GAT | Molecular graph | Binary
MFFGNN (2022) [50] | SMILES + molecular graph | GNN + BiGRU | Multi-type feature | Binary
GCNMK (2022) [51] | Drug ID | GCN + Linear transformation | DDI graph + drug features | Binary
DeepDrug (2022) [52] | SMILES | RGCN | Molecular graph | Binary/multi-class/multi-label
LR-GNN (2022) [53] | Drug ID | GCN | Biomedical network | Binary
DANN-DDI (2022) [54] | Drug ID | SDNE + attention | Biomedical network | Binary
DGAT-DDI (2022) [55] | Directed graph | GAT | Source/target encoding | Binary
GMPNN (2022) [56] | SMILES | Gated MPNN | Molecular graph | Binary
STNN-DDI (2022) [57] | SMILES | Encoder + decoder | Substructure | Binary
DeepMDDI (2022) [38] | Drug ID | RGCN Encoder + decoder | Sub-networks | Multi-label
RANEDDI (2022) [58] | Drug ID | RotatE + network embedding | DDI network | Binary/multi-class
DeSIDE-DDI (2022) [59] | Fingerprints | DNN | Gene expressions | Multi-class
SA-DDI (2022) [60] | SMILES | D-MPNN | Substructure | Binary
MSAN (2022) [61] | SMILES | Transformer-like framework | Substructure | Binary
LaGAT (2022) [62] | Drug ID | Link-aware GAT | Knowledge graph/subgraph | Binary/multi-class
Molormer (2022) [63] | 2D structures | Attention + Siamese network | Molecular graph spatial structure | Binary
MDDI-SCL (2022) [36] | Drug ID | Attention + Contrastive learning | Drug features | Multi-class
R2-DDI (2022) [64] | SMILES | DeeperGCN + Feature refinement | Molecular graph | Binary
BioDKG-DDI (2022) [41] | SMILES | Attention + DNN | Multiple drug features | Binary
AMDE (2022) [65] | SMILES | MPAN + Transformer | Sequence + atomic graph | Binary
DDKG (2022) [66] | SMILES/Drug ID | Encoder-decoder GCN + | Knowledge graph | Binary
3DGT-DDI (2022) [67] | 3D structures | 3D GNN + text attention | Molecular graph + position information | Binary/multi-class
DSN-DDI (2023) [29] | Molecular graph | Dual-view encoder + decoder | Substructure | Binary
DGNN-DDI (2023) [68] | SMILES | Directed MPNN + substructure attention | Molecular graph + substructure | Multi-class
KG2ECapsule (2023) [70] | Drug ID | GCN + Capsule | Knowledge graph | Multi-label
Table 3. Performance metrics for DrugBank dataset (in %)
Method and Year | AUPRC (%) | ACC (%) | AUROC (%) | F1-score (%)
DANN-DDI [54] 2022 | 97.09 | 99.62 | 97.63 | 96.92
DGAT-DDI [55] 2022 | 94.3 | 88.6 | 95.1 | 88.4
RANEDDI [58] 2022 | 98.94 | – | 98.98 | 95.62
AMDE [65] 2022 | – | 97.63 | 99.01 | 97.60
SSI-DDI [47] 2021 | 98.14 | 94.47 | 98.38 | –
MFFGNN [50] 2022 | 96.81 | – | 95.39 | 92.54
DeepDrug [52] 2022 | 98.0 | – | – | 94.0
GMPNN [56] 2022 | – | 95.30 | 98.46 | –
SA-DDI [60] 2022 | – | 96.23 | 98.80 | 96.29
MSAN [61] 2022 | – | 97.00 | 99.27 | 97.04
R2-DDI [64] 2022 | – | 98.15 | 99.70 | 98.16
3DGT-DDI [67] 2022 | – | – | 97.0 | –
DSN-DDI [29] 2023 | – | 96.94 | 99.47 | 96.93
MIRACLE [46] 2021 | 92.34 | – | 95.51 | 83.60
BioDKG-DDI [41] 2022 | – | 93.70 | 98.30 | 93.90
Table 4. Performance metrics for TWOSIDES dataset (in %)
Method and Year | ACC (%) | AUROC (%) | F1-score (%)
SSI-DDI [47] 2021 | 78.20 | 85.85 | 79.81
DeepDrug [52] 2022 | – | – | 84.0
GMPNN [56] 2022 | 82.83 | 90.07 | 84.08
SA-DDI [60] 2022 | 87.45 | 93.17 | 88.35
R2-DDI [64] 2022 | 86.15 | 91.49 | 87.31
DSN-DDI [29] 2023 | 98.83 | 99.90 | 98.83
In our observations, particularly on the DrugBank dataset, we noted standout performances by RANEDDI (AUPRC = 98.94%) and KGNN (AUPRC = 98.92%), both of which are network-based methods. These models achieved the highest and second-highest AUPRC, respectively, surpassing chemical structure-based and hybrid methods. The success of RANEDDI and KGNN can be attributed to their ability to use the multi-relational information inherent in DDI networks or knowledge graphs. In contrast, graph embedding methods such as DeepDDI, GraRep, and DeepWalk, and substructure-based methods such as CASTER, primarily leverage chemical structural similarity or drug characteristics.
It is also worth noting that R2-DDI achieved the highest AUROC (99.70%) and F1-score (98.16%), while DANN-DDI had the highest ACC (99.62%), surpassing all other models.
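The per-metric leaders on DrugBank can be checked directly against the values in Table 3. The sketch below transcribes that table, with `None` standing for a metric a paper did not report, and picks the best reported value per metric. The dictionary layout and the helper names `best_model` and `coverage` are illustrative assumptions.

```python
# Values transcribed from Table 3 as (AUPRC, ACC, AUROC, F1), in percent.
# None marks a metric that was not reported for that model.
METRICS = ("AUPRC", "ACC", "AUROC", "F1")
TABLE3 = {
    "DANN-DDI":   (97.09, 99.62, 97.63, 96.92),
    "DGAT-DDI":   (94.3, 88.6, 95.1, 88.4),
    "RANEDDI":    (98.94, None, 98.98, 95.62),
    "AMDE":       (None, 97.63, 99.01, 97.60),
    "SSI-DDI":    (98.14, 94.47, 98.38, None),
    "MFFGNN":     (96.81, None, 95.39, 92.54),
    "DeepDrug":   (98.0, None, None, 94.0),
    "GMPNN":      (None, 95.30, 98.46, None),
    "SA-DDI":     (None, 96.23, 98.80, 96.29),
    "MSAN":       (None, 97.00, 99.27, 97.04),
    "R2-DDI":     (None, 98.15, 99.70, 98.16),
    "3DGT-DDI":   (None, None, 97.0, None),
    "DSN-DDI":    (None, 96.94, 99.47, 96.93),
    "MIRACLE":    (92.34, None, 95.51, 83.60),
    "BioDKG-DDI": (None, 93.70, 98.30, 93.90),
}


def best_model(metric):
    """Return (model, value) with the highest reported value for a metric."""
    col = METRICS.index(metric)
    reported = {m: v[col] for m, v in TABLE3.items() if v[col] is not None}
    winner = max(reported, key=reported.get)
    return winner, reported[winner]


def coverage(metric):
    """Count how many models report a given metric."""
    col = METRICS.index(metric)
    return sum(v[col] is not None for v in TABLE3.values())


for metric in METRICS:
    model, value = best_model(metric)
    print(f"{metric}: {model} ({value}%), reported by {coverage(metric)} models")
```

The coverage counts matter when reading the ranking: AUPRC is reported for fewer than half of the listed models, so the leader on that metric is a leader only among the models that published it.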
On the TWOSIDES dataset, the recently published chemical structure-based model DSN-DDI showed remarkable results, outperforming the other baseline models across all evaluation metrics. Specifically, it achieved an ACC of 98.83%, an AUROC of 99.90%, and an F1-score of 98.83%, indicating its robustness and accuracy in DDI prediction. Another insight from the comparative studies is that network-based approaches, such as DANN-DDI and RANEDDI, perform comparably to chemical structure-based methods (e.g., R2-DDI). Meanwhile, for the binary classification task, the hybrid methods still perform steadily on DrugBank. The range of well-performing methods is promising and suggests that a suitable model can be selected according to the dataset and the task at hand.
To evaluate multi-class performance, we also compared various DL and graph learning models on the DrugBank and TWOSIDES benchmark datasets, as shown in Tables 5 and 6.
These tables compare the effectiveness of the models across multiple key evaluation metrics. The experiments lead us to conclude that KG2ECapsule and SumGNN are the two best-performing models, with consistently strong results regardless of the metric evaluated. They demonstrate a strong capability for tackling complicated classification tasks.
KG2ECapsule shows significant improvements over the best baseline models on the DrugBank dataset in particular, with gains of 2.71% in PR-AUC, 1.03% in ACC, 2.8% in ROC–AUC, and 2% in F1 score. This gain is attributed to KG2ECapsule's ability to model triplets accurately and to incorporate the connection information carried by the edges into the embedding algorithm. This demonstrates a more discriminating and effective method for integrating and employing KG information.
On the TWOSIDES dataset, however, SumGNN improves on the other methods by at least 2.45% in PR-AUC and 2.82% in ROC-AUC. This suggests that the subgraphs SumGNN extracts are well chosen, and that its ability to exploit this external information gives it an edge over many other models. Moreover, compared with other KG-based approaches such as KG-DDI and KGNN, SumGNN and KG2ECapsule are both superior: although each is compared in only a single setting, both consistently score higher than the other methods across the two datasets. This observation suggests that KG embeddings and neighborhood sampling alone may not be sufficient to leverage KG information for DDI prediction. It implies the need for more sophisticated techniques that handle the complex structures and relationships inherent in KGs more comprehensively.
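The idea of restricting a knowledge graph to a local subgraph around a drug pair can be illustrated with a short sketch. It assumes one common definition of an enclosing subgraph (entities within k hops of both drugs, plus the pair itself) and is not claimed to be SumGNN's exact extraction procedure. The triples, entity names, relation names, and function names are all hypothetical placeholders.

```python
from collections import defaultdict, deque


def build_adjacency(triples):
    """Undirected adjacency over entities from (head, relation, tail) triples."""
    adj = defaultdict(set)
    for head, _, tail in triples:
        adj[head].add(tail)
        adj[tail].add(head)
    return adj


def k_hop_nodes(adj, start, k):
    """Breadth-first search returning every entity within k hops of start."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand past the hop limit
        for neighbor in adj[node]:
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return set(depth)


def enclosing_subgraph(triples, drug_u, drug_v, k=2):
    """Keep triples whose endpoints lie within k hops of both drugs."""
    adj = build_adjacency(triples)
    shared = k_hop_nodes(adj, drug_u, k) & k_hop_nodes(adj, drug_v, k)
    nodes = shared | {drug_u, drug_v}
    return [(h, r, t) for h, r, t in triples if h in nodes and t in nodes]


# Hypothetical toy knowledge graph.
triples = [
    ("drugA", "targets", "geneX"),
    ("drugB", "targets", "geneX"),
    ("drugB", "treats", "diseaseY"),
    ("drugC", "treats", "diseaseY"),
    ("geneX", "in_pathway", "pathwayZ"),
    ("drugC", "targets", "geneW"),
]

one_hop = enclosing_subgraph(triples, "drugA", "drugB", k=1)
two_hop = enclosing_subgraph(triples, "drugA", "drugB", k=2)
print(one_hop)
print(two_hop)
```

The subgraph keeps the shared target and its pathway but drops entities reachable from only one of the two drugs. A subgraph-based model would then run its GNN on this reduced graph rather than on sampled neighborhoods of each drug in isolation.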
In addition, network-based techniques, in all three of the approaches considered, performed better on the multi-class problem. This trend suggests that the architectural and computational strategies of certain network-based models are well suited to multi-class DDI prediction, a complex task that draws on many kinds of information. It points to a promising research direction in which network-based models are extended to improve both the accuracy and the speed of DDI prediction.
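The macro-averaged metrics used for the multi-class task weight every interaction type equally, regardless of how many drug pairs it contains. The sketch below shows the computation on made-up labels. The function name `macro_metrics` is an assumption, and plain accuracy is included only for contrast, since this section does not define how "Mean Accuracy" was computed.

```python
def macro_metrics(y_true, y_pred):
    """Per-class precision, recall, and F1, averaged with equal class weight."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return {
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
    }


# Toy labels: interaction type 2 is rare and is never predicted correctly.
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 0]

results = macro_metrics(y_true, y_pred)
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

In this toy case accuracy stays above 0.7 while macro F1 falls to about 0.52, because the missed rare class counts as much as the frequent ones. DDI event types are typically imbalanced, so the same effect can separate the accuracy and macro columns for a given model.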
Table 5. Multi-class performance metrics for DrugBank dataset (in %)
Method and Year | Mean Accuracy (%) | Macro Precision (%) | Macro Recall (%) | Macro F1 (%)
SSI-DDI [47] 2021 | 89.65 | 87.63 | 93.21 | 89.93
GMPNN [56] 2022 | 94.85 | 93.46 | 97.25 | 94.95
SA-DDI [60] 2022 | 95.65 | 94.72 | 97.46 | 95.73
Molormer [63] 2022 | 96.67 | 94.19 | 92.70 | 93.11
MDDI-SCL [36] 2022 | 93.78 | 88.04 | 87.67 | 87.55
DGNN-DDI [68] 2023 | 96.09 | 94.72 | 97.88 | 96.16
MUFFIN [39] 2021 | – | 96.48 | 94.95 | –
KGNN [69] 2020 | 85.87 | 79.47 | 86.02 | 79.45
KG2ECapsule [70] 2023 | 88.58 | 80.50 | 88.82 | 81.45
Table 6. Multi-class performance metrics for TWOSIDES dataset (in %)
Method and Year | PR-AUC (%) | ROC–AUC (%)
KGNN [69] 2020 | 65.84 | 89.48
SkipGNN [71] 2020 | 90.90 | 92.04
SumGNN [40] 2021 | 93.35 | 94.86
MUFFIN [39] 2021 | 70.33 | 91.60
In reviewing the progression of deep learning and graph-based models for drug–drug interaction (DDI) prediction from 2021 onward, several key trends and performance patterns emerge. Firstly, network-based models, particularly those leveraging knowledge graphs and biomedical networks—such as RANEDDI, DANN-DDI, and KGNN—demonstrate superior performance in metrics like AUPRC and AUROC. These models excel due to their ability to capture complex, multi-relational patterns inherent in drug interaction networks, as opposed to relying solely on chemical structure or sequence-level data. For example, RANEDDI achieved the highest AUPRC on the DrugBank dataset (98.94%), while DANN-DDI recorded the highest accuracy (99.62%), showcasing the power of embedding techniques and attention mechanisms when applied to relational data. In contrast, structure-based methods using SMILES strings and molecular graphs—such as GMPNN, DeepDrug, and DSN-DDI—have also shown competitive performance, particularly in binary and multi-class classification, thanks to the incorporation of message-passing neural networks (MPNNs), graph convolutional networks (GCNs), and attention modules that enable more expressive molecular feature learning. Hybrid approaches like MFFGNN and DGNN-DDI, which integrate multi-type features from both chemical and relational domains, offer balanced effectiveness across metrics and task types. Another trend is the increasing use of attention mechanisms, contrastive learning, and transformer-like architectures, which enhance the interpretability and generalization of models. Moreover, the recent inclusion of 3D structural information in models such as 3DGT-DDI highlights a growing recognition of spatial configuration’s role in accurate DDI prediction. 
Overall, the diversity in input types (SMILES, drug IDs, graphs), architectural complexity (from GCNs to capsule networks), and task orientation (binary, multi-class, multi-label) reflects a maturing field, where model choice is often dictated by the specific DDI task, dataset characteristics, and desired performance trade-offs. Tables 2 to 6 offer a detailed breakdown and comparison of these models across different metrics, datasets, and classification types.
This review has comprehensively examined the evolving landscape of drug-drug interaction (DDI) prediction from the pre-2020 era through to 2023, focusing on four core aspects: data sources, molecular representation, biological information, and deep learning (DL) models. The analysis of data sources highlights the increasing availability and diversity of datasets used in DDI prediction, reflecting a broader and richer basis for model development. In the domain of molecular representation, we observe persistent challenges in accurately capturing molecular structures, which can negatively impact the performance of predictive models when foundational inputs are insufficient or misrepresented.
The integration of biological information has notably enriched predictive capabilities by providing deeper insights into underlying pharmacological and biochemical processes—many of which remain only partially understood. This biological context is essential for enhancing model precision and clinical relevance. Deep learning models, particularly the application of advanced architectures such as GNNs and knowledge graph embeddings, have demonstrated strong potential in modeling complex DDI mechanisms and continue to drive innovation in the field. Recent trends point toward using DL to derive more expressive and informative molecular and biological features.
Looking forward, two key directions deserve focused attention. First, improving the interpretability of predictive models is crucial for clinical adoption, as black-box models may hinder trust and decision-making in healthcare settings. Incorporating explainable AI techniques and visual analytics can bridge this gap. Second, there is a pressing need to establish standardized benchmarking protocols and datasets for DDI prediction. Such benchmarks would facilitate consistent evaluation, reproducibility, and fair comparisons among models, thereby accelerating progress and fostering collaboration across research communities.
[1] Atanasov, A.G., Waltenberger, B., Pferschy-Wenzig, E.M., Linder, T., Wawrosch, C., Uhrin, P., Temml, V., Wang, L., Schwaiger, S., Heiss, E.H. (2015). Discovery and resupply of pharmacologically active plant-derived natural products: A review. Biotechnology Advances, 33(8): 1582-1614. https://doi.org/10.1016/j.biotechadv.2015.08.001
[2] Pereira, R.D.S. (1998). The use of baker's yeast in the generation of asymmetric centers to produce chiral drugs and other compounds. Critical Reviews in Biotechnology, 18(1): 25-64. https://doi.org/10.1080/0738-859891224211
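[3] Wishart, D.S., Feunang, Y.D., Guo, A.C., Lo, E.J., Marcu, A., Grant, J.R., Sajed, T., Johnson, D., Li, C., Sayeeda, Z., et al. (2018). DrugBank 5.0: A major update to the DrugBank database for 2018. Nucleic Acids Research, 46(D1): D1074-D1082. https://doi.org/10.1093/nar/gkx1037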
[4] Zheng, S., Aldahdooh, J., Shadbahr, T., Wang, Y., Aldahdooh, D., Bao, J., Wang, W., Tang, J. (2021). DrugComb update: A more comprehensive drug sensitivity data repository and analysis portal. Nucleic Acids Research, 49(W1): W174-W184. https://doi.org/10.1093/nar/gkab438
[5] Espinal, M.A., Kim, S.J., Suarez, P.G., Kam, K.M., Khomenko, A.G., Migliori, G.B., Baéz, J., Kochi, A., Dye, C., Raviglione, M.C. (2000). Standard short-course chemotherapy for drug-resistant tuberculosis: Treatment outcomes in 6 countries. The Journal of the American Medical Association, 283(19): 2537-2545. https://doi.org/10.1001/jama.283.19.2537
[6] Walkup, J.T., Albano, A.M., Piacentini, J., Birmaher, B., Compton, S.N., Sherrill, J.T., Ginsburg, G.S., Rynn, M.A., McCracken, J., Waslick, B. (2008). Cognitive behavioral therapy, sertraline, or a combination in childhood anxiety. New England Journal of Medicine, 359(26): 2753-2766. https://doi.org/10.1056/NEJMoa0804633
[7] Keith, C.T., Borisy, A.A., Stockwell, B.R. (2005). Multicomponent therapeutics for networked systems. Nature Reviews Drug Discovery, 4: 71-78. https://doi.org/10.1038/nrd1609
[8] Genina, N., Boetker, J.P., Colombo, S., Harmankaya, N., Rantanen, J., Bohr, A. (2017). Anti-tuberculosis drug combination for controlled oral delivery using 3D printed compartmental dosage forms: From drug product design to in vivo testing. Journal of Controlled Release, 268: 40-48. https://doi.org/10.1016/j.jconrel.2017.10.003
[9] Huang, J., Niu, C., Green, C.D., Yang, L., Mei, H., Han, J.D.J. (2013). Systematic prediction of pharmacodynamic drug-drug interactions through protein-protein-interaction network. PLoS Computational Biology, 9(3): e1002998. https://doi.org/10.1371/journal.pcbi.1002998
[10] Qato, D.M., Wilder, J., Schumm, L.P., Gillet, V., Alexander, G.C. (2016). Changes in prescription and over-the-counter medication and dietary supplement use among older adults in the United States, 2005 vs 2011. JAMA Internal Medicine, 176(4): 473-482. https://doi.org/10.1001/jamainternmed.2015.8581
[11] Wienkers, L.C., Heath, T.G. (2005). Predicting in vivo drug interactions from in vitro drug discovery data. Nature Reviews Drug Discovery, 4: 825-833. https://doi.org/10.1038/nrd1851
[12] Juurlink, D.N., Mamdani, M., Kopp, A., Laupacis, A., Redelmeier, D.A. (2003). Drug-drug interactions among elderly patients hospitalized for drug toxicity. The Journal of the American Medical Association, 289(13): 1652-1658. https://doi.org/10.1001/jama.289.13.1652
[13] Yeh, P., Tschumi, A.I., Kishony, R. (2006). Functional classification of drugs by properties of their pairwise interactions. Nature Genetics, 38: 489-494. https://doi.org/10.1038/ng1755
[14] Zhao, X.M., Iskar, M., Zeller, G., Kuhn, M., Van Noort, V., Bork, P. (2011). Prediction of drug combinations by integrating molecular and pharmacological data. PLoS Computational Biology, 7(12): e1002323. https://doi.org/10.1371/journal.pcbi.1002323
[15] Cho, H.J., Chae, J., Yoon, S.H., Kim, D.S. (2022). Aging and the prevalence of polypharmacy and hyper-polypharmacy among older adults in South Korea: A national retrospective study during 2010–2019. Frontiers in Pharmacology, 13: 866318. https://doi.org/10.3389/fphar.2022.866318
[16] Ryu, J.Y., Kim, H.U., Lee, S.Y. (2018). Deep learning improves prediction of drug–drug and drug–food interactions. Proceedings of the National Academy of Sciences, 115(18): E4304-E4311. https://doi.org/10.1073/pnas.1803294115
[17] Lin, X., Dai, L., Zhou, Y., Yu, Z.G., et al. (2023). Comprehensive evaluation of deep and graph learning on drug–drug interactions prediction. Briefings in Bioinformatics, 24(4): bbad235. https://doi.org/10.1093/bib/bbad235
[18] Kanehisa, M., Furumichi, M., Sato, Y., Kawashima, M., Ishiguro-Watanabe, M. (2023). KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Research, 51(D1): D587-D592. https://doi.org/10.1093/nar/gkac963
[19] Kuhn, M., Letunic, I., Jensen, L.J., Bork, P. (2016). The SIDER database of drugs and side effects. Nucleic Acids Research, 44(D1): D1075-D1079. https://doi.org/10.1093/nar/gkv1075
[20] Tatonetti, N.P., Ye, P.P., Daneshjou, R., Altman, R.B. (2012). Data-driven prediction of drug effects and interactions. Science Translational Medicine, 4(125): 125ra131. https://doi.org/10.1126/scitranslmed.3003377
[21] Zitnik, M., Sosic, R., Leskovec, J. (2018). BioSNAP Datasets: Stanford biomedical network dataset collection. http://snap.stanford.edu/biodata.
[22] Baldessarini, R.J. (2006). Drug therapy of depression and anxiety disorders. In Goodman and Gilman’s: The Pharmacological Basis of Therapeutics. New York, McGraw-Hill, pp. 429-460.
[23] Weininger, D. (1988). SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. Journal of Chemical Information and Computer Sciences, 28(1): 31-36. https://doi.org/10.1021/ci00057a005
[24] Xu, Z., Wang, S., Zhu, F., Huang, J. (2017). Seq2seq fingerprint: An unsupervised deep molecular embedding for drug discovery. In Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, Boston, Massachusetts, USA.
[25] Zhang, X., Wang, S., Zhu, F., Xu, Z., Wang, Y., Huang, J. (2018). Seq3seq fingerprint: Towards end-to-end semi-supervised deep drug discovery. In Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, Washington, DC, USA.
[26] Huang, K., Xiao, C., Glass, L.M., Sun, J. (2021). MolTrans: Molecular interaction transformer for drug–target interaction prediction. Bioinformatics, 37(6): 830-836. https://doi.org/10.1093/bioinformatics/btaa880
[27] Jaeger, S., Fulle, S., Turk, S. (2018). Mol2vec: Unsupervised machine learning approach with chemical intuition. Journal of Chemical Information and Modeling, 58(1): 27-35. https://doi.org/10.1021/acs.jcim.7b00616
[28] Ganea, O., Pattanaik, L., Coley, C., Barzilay, R., Jensen, K., Green, W., Jaakkola, T. (2021). GeoMol: Torsional geometric generation of molecular 3D conformer ensembles. Advances in Neural Information Processing Systems.
[29] Li, Z., Zhu, S., Shao, B., Zeng, X., Wang, T., Liu, T.Y. (2023). DSN-DDI: An accurate and generalized framework for drug–drug interaction prediction by dual-view representation learning. Briefings in Bioinformatics, 24(1): bbac597. https://doi.org/10.1093/bib/bbac597
[30] Zhang, C., Song, D., Huang, C., Swami, A., Chawla, N.V. (2019). Heterogeneous graph neural network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, pp. 793-803.
[31] Ji, S., Pan, S., Cambria, E. (2021). A survey on knowledge graphs: Representation, acquisition, and applications. IEEE Transactions on Neural Networks and Learning Systems, 33(2): 494-514. https://doi.org/10.1109/TNNLS.2021.3070843
[32] Domingo-Fernández, D., Baksi, S., Schultz, B., Gadiya, Y., Karki, R., Raschka, T., Ebeling, C., Hofmann-Apitius, M., Kodamullil, A.T. (2021). COVID-19 Knowledge Graph: A computable, multi-modal, cause-and-effect knowledge model of COVID-19 pathophysiology. Bioinformatics, 37(9): 1332-1334. https://doi.org/10.1093/bioinformatics/btaa834
[33] Reese, J.T., Unni, D., Callahan, T.J., Cappelletti, L., Ravanmehr, V., Carbon, S., Shefchek, K.A., Good, B.M., Balhoff, J.P., Fontana, T. (2021). KG-COVID-19: A framework to produce customized knowledge graphs for COVID-19 response. Patterns, 2(1): 100155. https://doi.org/10.1016/j.patter.2020.100155
[34] Bonner, S., Barrett, I.P., Ye, C., Swiers, R., Engkvist, O., Bender, A., Hoyt, C.T., Hamilton, W.L. (2022). A review of biomedical datasets relating to drug discovery: A knowledge graph perspective. Briefings in Bioinformatics, 23(6): bbac404. https://doi.org/10.1093/bib/bbac404
[35] Deng, Y., Xu, X., Qiu, Y., Xia, J., Zhang, W., Liu, S. (2020). A multimodal deep learning framework for predicting drug–drug interaction events. Bioinformatics, 36(15): 4316-4322. https://doi.org/10.1093/bioinformatics/btaa501
[36] Lin, S., Chen, W., Chen, G., Zhou, S., Wei, D.Q., Xiong, Y. (2022). MDDI-SCL: Predicting multi-type drug-drug interactions via supervised contrastive learning. Journal of Cheminformatics, 14: 81. https://doi.org/10.1186/s13321-022-00659-8
[37] Feng, Y.H., Zhang, S.W., Shi, J.Y. (2020). DPDDI: A deep predictor for drug-drug interactions. BMC Bioinformatics, 21: 419. https://doi.org/10.1186/s12859-020-03724-x
[38] Feng, Y.H., Zhang, S.W., Zhang, Q.Q., Zhang, C.H., Shi, J.Y. (2022). DeepMDDI: A deep graph convolutional network framework for multi-label prediction of drug-drug interactions. Analytical Biochemistry, 646: 114631. https://doi.org/10.1016/j.ab.2022.114631
[39] Chen, Y., Ma, T., Yang, X., Wang, J., Song, B., Zeng, X. (2021). MUFFIN: Multi-scale feature fusion for drug–drug interaction prediction. Bioinformatics, 37(17): 2651-2658. https://doi.org/10.1093/bioinformatics/btab169
[40] Yu, Y., Huang, K., Zhang, C., Glass, L.M., Sun, J., Xiao, C. (2021). SumGNN: Multi-typed drug interaction prediction via efficient knowledge graph summarization. Bioinformatics, 37(18): 2988-2995. https://doi.org/10.1093/bioinformatics/btab207
[41] Ren, Z.H., Yu, C.Q., Li, L.P., You, Z.H., Guan, Y.J., Wang, X.F., Pan, J. (2022). BioDKG–DDI: Predicting drug–drug interactions based on drug knowledge graph fusing biochemical information. Briefings in Functional Genomics, 21(3): 216-229. https://doi.org/10.1093/bfgp/elac004
[42] Lee, G., Park, C., Ahn, J. (2019). Novel deep learning model for more accurate prediction of drug-drug interaction effects. BMC Bioinformatics, 20: 415. https://doi.org/10.1186/s12859-019-3013-0
[43] Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., et al. (2000). Gene ontology: Tool for the unification of biology. Nature Genetics, 25: 25-29. https://doi.org/10.1038/75556
[44] Hao, X., Chen, Q., Pan, H., Qiu, J., Zhang, Y., Yu, Q., Han, Z., Du, X. (2023). Enhancing drug–drug interaction prediction by three-way decision and knowledge graph embedding. Granular Computing, 8: 67-76. https://doi.org/10.1007/s41066-022-00315-4
[45] Trouillon, T., Welbl, J., Riedel, S., Gaussier, É., Bouchard, G. (2016). Complex embeddings for simple link prediction. In Proceedings of the 33rd International Conference on Machine Learning, pp. 2071-2080.
[46] Wang, Y., Min, Y., Chen, X., Wu, J. (2021). Multi-view graph contrastive representation learning for drug-drug interaction prediction. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, pp. 2921-2933. https://doi.org/10.1145/3442381.3449786
[47] Nyamabo, A.K., Yu, H., Shi, J.Y. (2021). SSI–DDI: Substructure–substructure interactions for drug–drug interaction prediction. Briefings in Bioinformatics, 22(6): bbab133. https://doi.org/10.1093/bib/bbab133
[48] Dai, Y., Guo, C., Guo, W., Eickhoff, C. (2021). Drug–drug interaction prediction with Wasserstein Adversarial Autoencoder-based knowledge graph embeddings. Briefings in Bioinformatics, 22(4): bbaa256. https://doi.org/10.1093/bib/bbaa256
[49] Feng, Y.H., Zhang, S.W. (2022). Prediction of drug-drug interaction using an attention-based graph neural network on drug molecular graphs. Molecules, 27(9): 3004. https://doi.org/10.3390/molecules27093004
[50] He, C., Liu, Y., Li, H., Zhang, H., Mao, Y., Qin, X., Liu, L., Zhang, X. (2022). Multi-type feature fusion based on graph neural network for drug-drug interaction prediction. BMC Bioinformatics, 23: 224. https://doi.org/10.1186/s12859-022-04763-2
[51] Wang, F., Lei, X., Liao, B., Wu, F.X. (2022). Predicting drug–drug interactions by graph convolutional network with multi-kernel. Briefings in Bioinformatics, 23(1): bbab511. https://doi.org/10.1093/bib/bbab511
[52] Yin, Q., Cao, X., Fan, R., Liu, Q., Jiang, R., Zeng, W. (2020). DeepDrug: A general graph-based deep learning framework for drug-drug interactions and drug-target interactions prediction. Quantitative Biology, 11(3): 260-274. https://doi.org/10.15302/J-QB-022-0320
[53] Kang, C., Zhang, H., Liu, Z., Huang, S., Yin, Y. (2022). LR-GNN: A graph neural network based on link representation for predicting molecular associations. Briefings in Bioinformatics, 23(1): bbab513. https://doi.org/10.1093/bib/bbab513
[54] Liu, S., Zhang, Y., Cui, Y., Qiu, Y., Deng, Y., Zhang, Z., Zhang, W. (2022). Enhancing drug-drug interaction prediction using deep attention neural networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 20(2): 976-985. https://doi.org/10.1109/TCBB.2022.3172421
[55] Feng, Y.Y., Yu, H., Feng, Y.H., Shi, J.Y. (2022). Directed graph attention networks for predicting asymmetric drug–drug interactions. Briefings in Bioinformatics, 23(3): bbac151. https://doi.org/10.1093/bib/bbac151
[56] Nyamabo, A.K., Yu, H., Liu, Z., Shi, J.Y. (2022). Drug–drug interaction prediction with learnable size-adaptive molecular substructures. Briefings in Bioinformatics, 23(1): bbab441. https://doi.org/10.1093/bib/bbab441
[57] Yu, H., Zhao, S., Shi, J. (2022). STNN-DDI: A substructure-aware tensor neural network to predict drug–drug interactions. Briefings in Bioinformatics, 23(4): bbac209. https://doi.org/10.1093/bib/bbac209
[58] Yu, H., Dong, W., Shi, J. (2022). RANEDDI: Relation-aware network embedding for drug-drug interaction prediction. Information Sciences, 582: 167-180. https://doi.org/10.1016/j.ins.2021.09.008
[59] Kim, E., Nam, H. (2022). DeSIDE-DDI: Interpretable prediction of drug-drug interactions using drug-induced gene expressions. Journal of Cheminformatics, 14: 9. https://doi.org/10.1186/s13321-022-00589-5
[60] Yang, Z., Zhong, W., Lv, Q., Chen, C.Y.C. (2022). Learning size-adaptive molecular substructures for explainable drug–drug interaction prediction by substructure-aware graph neural network. Chemical Science, 13: 8693-8703. https://doi.org/10.1039/D2SC02023H
[61] Zhu, X., Shen, Y., Lu, W. (2022). Molecular substructure-aware network for drug-drug interaction prediction. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, Atlanta, GA, USA, pp. 4757-4761. https://doi.org/10.1145/3511808.3557648
[62] Hong, Y., Luo, P., Jin, S., Liu, X. (2022). LaGAT: Link-aware graph attention network for drug–drug interaction prediction. Bioinformatics, 38(24): 5406-5412. https://doi.org/10.1093/bioinformatics/btac682
[63] Zhang, X., Wang, G., Meng, X., Wang, S., Zhang, Y., Rodriguez-Paton, A., Wang, J., Wang, X. (2022). Molormer: A lightweight self-attention-based method focused on spatial structure of molecular graph for drug–drug interactions prediction. Briefings in Bioinformatics, 23(5): bbac296. https://doi.org/10.1093/bib/bbac296
[64] Lin, J., Wu, L., Zhu, J., Liang, X., Xia, Y., Xie, S., Qin, T., Liu, T.Y. (2023). R2-DDI: Relation-aware feature refinement for drug–drug interaction prediction. Briefings in Bioinformatics, 24(1): bbac576. https://doi.org/10.1093/bib/bbac576
[65] Pang, S., Zhang, Y., Song, T., Zhang, X., Wang, X., Rodriguez-Patón, A. (2022). AMDE: A novel attention-mechanism-based multidimensional feature encoder for drug–drug interaction prediction. Briefings in Bioinformatics, 23(1): bbab545. https://doi.org/10.1093/bib/bbab545
[66] Su, X., Hu, L., You, Z., Hu, P., Zhao, B. (2022). Attention-based knowledge graph representation learning for predicting drug-drug interactions. Briefings in Bioinformatics, 23(3): bbac140. https://doi.org/10.1093/bib/bbac140
[67] He, H., Chen, G., Chen, C.Y.C. (2022). 3DGT-DDI: 3D graph and text based neural network for drug–drug interaction prediction. Briefings in Bioinformatics, 23(3): bbac134. https://doi.org/10.1093/bib/bbac134
[68] Ma, M., Lei, X. (2023). A dual graph neural network for drug–drug interactions prediction based on molecular structure and interactions. PLoS Computational Biology, 19(1): e1010812. https://doi.org/10.1371/journal.pcbi.1010812
[69] Lin, X., Quan, Z., Wang, Z.J., Ma, T., Zeng, X. (2020). KGNN: Knowledge graph neural network for drug-drug interaction prediction. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20), Yokohama, Japan, pp. 2739-2745.
[70] Su, X., You, Z., Huang, D., Wang, L., Wong, L., Ji, B., Zhao, B. (2022). Biomedical knowledge graph embedding with capsule network for multi-label drug-drug interaction prediction. IEEE Transactions on Knowledge and Data Engineering, 35(6): 5640-5651. https://doi.org/10.1109/TKDE.2022.3154792
[71] Huang, K., Xiao, C., Glass, L.M., Zitnik, M., Sun, J. (2020). SkipGNN: Predicting molecular interactions with skip-graph networks. Scientific Reports, 10: 21092. https://doi.org/10.1038/s41598-020-77766-9