TxGNN uses a knowledge graph combined with graph neural networks for prediction:
Knowledge Graph Construction
17,080 nodes (drugs, diseases, genes, etc.)
Complex inter-node relationships
Graph Neural Network Training
Learns latent associations between nodes
Predicts new drug-disease relationships
Prediction Output
Prediction score for each drug-disease pair
Higher scores indicate higher confidence
Prediction Methods
Method
Description
Predictions
Knowledge Graph (KG)
Network topology-based associations
718
Deep Learning (DL)
Neural network score prediction
31,650
Total
Merged and deduplicated
32,368
Step 2: Drug Mapping
EMA to DrugBank Mapping
EMA medicines are mapped to DrugBank identifiers for TxGNN compatibility:
Step
Description
1. Ingredient Extraction
Extract active substances from EMA data
2. Name Normalization
Standardize drug names (remove salts, etc.)
3. DrugBank Lookup
Match against DrugBank vocabulary
4. Manual Review
Verify ambiguous mappings
Mapping Statistics
Metric
Value
Total EMA Drugs
1,543
Mapped to DrugBank
638
Mapping Rate
41.3%
Step 3: Disease Mapping
Indication Extraction
Therapeutic indications from EMA are mapped to standardized disease terms:
MeSH Terms: EMA provides MeSH therapeutic areas
Text Extraction: Parse indication text for disease mentions
Vocabulary Matching: Match against TxGNN disease vocabulary
Mapping Statistics
Metric
Value
Unique Indications
4,570
Mapped Diseases
~2,000
Step 4: Evidence Levels
Classification System
Level
Definition
Clinical Significance
L1
Multiple Phase 3 RCTs / Systematic Reviews
Strong support
L2
Single RCT or multiple Phase 2 trials
Moderate support
L3
Observational studies / Large case series
Preliminary evidence
L4
Preclinical / Mechanistic studies
Reasonable mechanism
L5
AI prediction only
Research hypothesis
Data Sources
Source
Usage
EMA Medicines Database
EU approved drugs
Article 57 Database
Additional drug data
TxGNN Knowledge Graph
Drug-disease predictions
DrugBank
Drug identifiers
ClinicalTrials.gov
Clinical evidence
PubMed
Literature evidence
Limitations
Predictions require clinical validation
Limited to drugs with DrugBank mappings
Based on TxGNN model trained primarily on US data
Evidence levels require manual curation
Citation
@article{huang2023txgnn,title={A foundation model for clinician-centered drug repurposing},author={Huang, Kexin and others},journal={Nature Medicine},year={2023}}