Graphs
Working with network data? Test graph learning models on node classification, link prediction, and molecular property tasks.
Graph neural networks process relational and structured data, from social networks to molecular structures. The Open Graph Benchmark (OGB) standardized evaluation, and GNNs are now production-ready for drug discovery, recommendation systems, and fraud detection.
State of the Field (2025)
- Message-passing GNNs (GIN, GAT, GraphSAGE) remain dominant for node and graph classification, with graph transformers (GPS, Exphormer) closing the gap on long-range dependency tasks
- OGB leaderboards show saturation on smaller datasets (ogbg-molhiv AUC >0.82) while large-scale challenges (ogbn-papers100M, ogbl-citation2) remain actively competitive
- Molecular property prediction drives commercial adoption: GNNs power virtual screening pipelines at major pharma companies, with 3D-aware models (SchNet, DimeNet++) outperforming 2D approaches on quantum chemistry tasks
- Graph foundation models emerge: pre-trained GNNs on massive molecular and knowledge graph datasets enable few-shot transfer to downstream tasks, reducing labeled data requirements by 5-10x
Quick Recommendations
Node classification on citation/social graphs
GraphSAGE or GAT with neighbor sampling
Scalable to million-node graphs via mini-batch training. GAT attention weights provide interpretability. Well-supported in PyG and DGL frameworks.
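The core trick behind mini-batch scalability is fixed-size neighbor sampling. A minimal pure-NumPy sketch of GraphSAGE's mean aggregator for one node (function and variable names here are illustrative; in practice you would use PyG's or DGL's built-in samplers):

```python
import random
import numpy as np

def sage_mean_aggregate(features, neighbors, node, num_samples, rng):
    """One GraphSAGE-style step for a single node: sample a fixed-size
    neighbor set, then mean-pool the sampled neighbors' features
    together with the node's own feature vector."""
    sampled = rng.sample(neighbors[node], min(num_samples, len(neighbors[node])))
    stacked = np.stack([features[node]] + [features[n] for n in sampled])
    return stacked.mean(axis=0)

# Toy 4-node graph with 2-dim features.
features = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
neighbors = {0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [0, 2]}
h0 = sage_mean_aggregate(features, neighbors, 0, num_samples=2,
                         rng=random.Random(0))
```

Because each node touches at most `num_samples` neighbors per layer, the per-batch cost is bounded regardless of total graph size.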
Molecular property prediction
DimeNet++ or SphereNet for 3D, GIN for 2D fingerprints
3D models capture geometric information critical for binding affinity and quantum properties. GIN provides strong baseline when only SMILES/2D structures are available.
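The GIN baseline is simple to state: sum-aggregate neighbors (sum, unlike mean or max, is injective on feature multisets) and reweight the node's own feature. A minimal NumPy sketch, with an identity function standing in for the learned MLP:

```python
import numpy as np

def gin_layer(adj, h, eps, mlp):
    """One GIN update: sum neighbor features via the adjacency matrix,
    weight the node's own feature by (1 + eps), then apply an MLP
    (any callable here; a learned network in practice)."""
    return mlp((1.0 + eps) * h + adj @ h)

# Tiny example: 3-node path graph, 2-dim features, identity "MLP".
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
h = np.eye(3, 2)
out = gin_layer(adj, h, eps=0.1, mlp=lambda x: x)
```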
Link prediction on knowledge graphs
CompGCN or NBFNet
CompGCN jointly embeds entities and relations with composition operators. NBFNet generalizes path-based reasoning with strong inductive capability on unseen entities.
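The "composition operator" idea can be sketched in a few lines. Below, a TransE-inspired subtraction composition (one of the operators CompGCN supports) turns each incoming (head, relation) pair into a message; the helper names and toy embeddings are illustrative only:

```python
import numpy as np

def compose_sub(entity, relation):
    """Subtraction composition: a neighbor's message is its entity
    embedding composed with the embedding of the connecting relation."""
    return entity - relation

def compgcn_aggregate(entity_embs, relation_embs, triples, target):
    """Mean-aggregate composed messages over all triples (h, r, t)
    that point at the target node."""
    msgs = [compose_sub(entity_embs[h], relation_embs[r])
            for (h, r, t) in triples if t == target]
    return np.mean(msgs, axis=0)

entity_embs = {"paris": np.array([1.0, 2.0]), "france": np.array([0.0, 0.0])}
relation_embs = {"capital_of": np.array([1.0, 1.0])}
triples = [("paris", "capital_of", "france")]
msg = compgcn_aggregate(entity_embs, relation_embs, triples, "france")
```

Because relations get their own embeddings inside the message function, the model handles multi-relational graphs without one weight matrix per relation type.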
Large-scale graph learning (100M+ nodes)
GraphSAGE with ClusterGCN or SIGN
Subgraph sampling and pre-computation strategies make training tractable. SIGN eliminates message passing at inference, enabling real-time serving.
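SIGN's pre-computation step is the key to message-passing-free serving: powers of the normalized adjacency times the feature matrix are computed once offline, and everything downstream is an ordinary MLP. A small NumPy sketch (row normalization used here for brevity; the paper explores several operators):

```python
import numpy as np

def sign_precompute(adj, feats, num_hops):
    """SIGN's offline step: normalize the adjacency, then concatenate
    [X, AX, A^2 X, ...]. Training and serving afterwards need only an
    MLP over these fixed features, with no graph traversal."""
    a_norm = adj / np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    ops, cur = [feats], feats
    for _ in range(num_hops):
        cur = a_norm @ cur
        ops.append(cur)
    return np.concatenate(ops, axis=1)

adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
x = sign_precompute(adj, np.eye(3), num_hops=2)
```

Each node's row in `x` stacks its own features with 1-hop and 2-hop aggregates, so inference is a single feed-forward pass per node.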
Tasks & Benchmarks
Node Classification
Node classification — assigning labels to vertices in a graph using both node features and neighborhood structure — is the flagship task for graph neural networks. GCN (Kipf & Welling, 2017) established the Cora/Citeseer/PubMed benchmark trinity, but these datasets are tiny by modern standards and results have saturated well above 85% accuracy. The field has since moved to large-scale graphs (ogbn-arxiv and ogbn-products from OGB) and to a still-unsettled debate over how much message passing actually contributes: SGC and SIGN ablations show that simple models over precomputed neighborhood features can match full GNNs on many benchmarks.
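The GCN propagation rule from Kipf & Welling is compact enough to write out directly: add self-loops, symmetrically normalize, then apply a linear transform and nonlinearity. A minimal NumPy sketch with random toy inputs:

```python
import numpy as np

def gcn_layer(adj, h, w):
    """One GCN step: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W),
    where A is the adjacency matrix and D the degree matrix of A + I."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ h @ w, 0.0)

rng = np.random.default_rng(0)
adj = np.array([[0., 1., 1.],
                [1., 0., 0.],
                [1., 0., 0.]])
out = gcn_layer(adj, rng.normal(size=(3, 4)), rng.normal(size=(4, 2)))
```

Stacking k such layers mixes information from k-hop neighborhoods, which is exactly what the SGC/SIGN ablations approximate with precomputed features.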
Link Prediction
Link prediction — inferring missing or future edges in a graph — underpins knowledge graph completion, drug-target discovery, and social network recommendation. TransE (2013) launched the knowledge graph embedding era, and the field matured through DistMult, RotatE, and CompGCN, benchmarked on FB15k-237 and WN18RR. The current frontier is inductive link prediction (generalizing to unseen entities), where GNN-based methods like NBFNet and foundation models like ULTRA (2024) show that a single model can transfer across entirely different knowledge graphs without retraining.
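TransE's scoring function illustrates what "knowledge graph embedding" means concretely: relations are translations in embedding space, so a true triple (h, r, t) should satisfy h + r ≈ t. A tiny sketch with hand-picked toy embeddings:

```python
import numpy as np

def transe_score(head, rel, tail):
    """TransE plausibility score: negative L2 distance between the
    translated head (h + r) and the tail; higher means more plausible."""
    return -np.linalg.norm(head + rel - tail)

h = np.array([1.0, 0.0])        # head entity embedding
r = np.array([0.0, 1.0])        # relation embedding
t_true = np.array([1.0, 1.0])   # tail satisfying h + r = t exactly
t_corrupt = np.array([0.0, 0.0])
s_true = transe_score(h, r, t_true)
s_corrupt = transe_score(h, r, t_corrupt)
```

Training pushes true triples toward zero distance and corrupted ones away; later models (DistMult, RotatE) swap the translation for multiplicative or rotational interactions.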
Molecular Property Prediction
Molecular property prediction — estimating toxicity, solubility, binding affinity, or other properties from molecular structure — is the workhorse task of AI-driven drug discovery. GNNs operate on molecular graphs while transformer approaches (ChemBERTa, Uni-Mol) use SMILES strings or 3D coordinates. MoleculeNet (2018) and the Therapeutic Data Commons (TDC) provide standardized benchmarks, but the real bottleneck is distribution shift: models trained on known chemical space struggle with novel scaffolds, and the gap between leaderboard accuracy and actual wet-lab utility remains the field's central challenge.
Graph Classification
Graph classification — predicting a label for an entire graph, not individual nodes — matters for molecular screening, social network analysis, and program verification. GIN (Xu et al., 2019) formalized the connection between GNN expressiveness and the Weisfeiler-Leman graph isomorphism test, and the TU datasets became standard benchmarks. Recent work on graph transformers (GPS, Exphormer) and higher-order GNNs pushes beyond WL limits, while OGB's ogbg-molhiv and ogbg-molpcba provide more rigorous large-scale evaluation than the classic small-graph benchmarks.
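The Weisfeiler-Leman test that bounds GNN expressiveness is itself a short algorithm: iteratively recolor each node by hashing its color with the multiset of its neighbors' colors. A pure-Python sketch, distinguishing a triangle from a 3-node path:

```python
from collections import Counter

def wl_refine(adj_list, colors, rounds):
    """1-WL color refinement: each round, a node's new color hashes its
    current color together with the sorted multiset of neighbor colors.
    Graphs whose color histograms ever differ are non-isomorphic; the
    converse fails, which is the limit GIN-style GNNs inherit."""
    for _ in range(rounds):
        colors = {v: hash((colors[v], tuple(sorted(colors[u] for u in nbrs))))
                  for v, nbrs in adj_list.items()}
    return colors

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
c_tri = Counter(wl_refine(triangle, {v: 0 for v in triangle}, 1).values())
c_path = Counter(wl_refine(path, {v: 0 for v in path}, 1).values())
```

After one round the triangle's nodes all share a color while the path splits into endpoint and midpoint colors, so the histograms already differ; the higher-order GNNs mentioned above target graph pairs that 1-WL cannot split this way.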
Honest Takes
Graph transformers are overhyped for most tasks
GPS and Exphormer show gains on long-range benchmarks (LRGB) but rarely beat well-tuned GIN or GCN on standard molecular and social graph tasks. The quadratic attention cost makes them impractical at scale. Stick with message-passing unless you have a proven long-range dependency problem.
OGB leaderboards are becoming a hyperparameter competition
Top entries on ogbg-molhiv differ by 0.1-0.3% AUC and rely on extensive ensembling and augmentation tricks. The architectural innovations that matter for real applications are harder to see through the leaderboard noise.
GNNs for drug discovery are real, not hype
Unlike many ML-for-science applications, GNN-based virtual screening is deployed at scale in pharma. Companies report 2-5x hit-rate improvements over traditional docking. The combination of molecular graphs with 3D geometry is a genuine success story.