Graphs
Working with network data? Test graph learning models on node classification, link prediction, and molecular property tasks.
Graph neural networks process relational and structured data, from social networks to molecular structures. The Open Graph Benchmark (OGB) standardized evaluation, and GNNs are now production-ready for drug discovery, recommendation systems, and fraud detection.
State of the Field (2025)
- Message-passing GNNs (GIN, GAT, GraphSAGE) remain dominant for node and graph classification, with graph transformers (GPS, Exphormer) closing the gap on long-range dependency tasks
- OGB leaderboards show saturation on smaller datasets (ogbg-molhiv AUC >0.82) while large-scale challenges (ogbn-papers100M, ogbl-citation2) remain actively competitive
- Molecular property prediction drives commercial adoption: GNNs power virtual screening pipelines at major pharma companies, with 3D-aware models (SchNet, DimeNet++) outperforming 2D approaches on quantum chemistry tasks
- Graph foundation models emerge: pre-trained GNNs on massive molecular and knowledge graph datasets enable few-shot transfer to downstream tasks, reducing labeled data requirements by 5-10x
Quick Recommendations
Node classification on citation/social graphs
GraphSAGE or GAT with neighbor sampling
Scalable to million-node graphs via mini-batch training. GAT attention weights provide interpretability. Well-supported in PyG and DGL frameworks.
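The core trick behind mini-batch scalability is fixed-size neighbor sampling. A minimal pure-NumPy sketch of GraphSAGE's mean aggregator for one node (function and variable names here are illustrative; in practice you would use PyG's or DGL's built-in samplers):

```python
import random
import numpy as np

def sage_mean_aggregate(features, neighbors, node, num_samples, rng):
    """One GraphSAGE-style step for a single node: sample a fixed-size
    neighbor set, then mean-pool the sampled neighbors' features
    together with the node's own feature vector."""
    sampled = rng.sample(neighbors[node], min(num_samples, len(neighbors[node])))
    stacked = np.stack([features[node]] + [features[n] for n in sampled])
    return stacked.mean(axis=0)

# Toy 4-node graph with 2-dim features.
features = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
neighbors = {0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [0, 2]}
h0 = sage_mean_aggregate(features, neighbors, 0, num_samples=2,
                         rng=random.Random(0))
```

Because each node touches at most `num_samples` neighbors per layer, the per-batch cost is bounded regardless of total graph size.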
Molecular property prediction
DimeNet++ or SphereNet for 3D, GIN for 2D fingerprints
3D models capture geometric information critical for binding affinity and quantum properties. GIN provides strong baseline when only SMILES/2D structures are available.
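The GIN baseline is simple to state: sum-aggregate neighbors (sum, unlike mean or max, is injective on feature multisets) and reweight the node's own feature. A minimal NumPy sketch, with an identity function standing in for the learned MLP:

```python
import numpy as np

def gin_layer(adj, h, eps, mlp):
    """One GIN update: sum neighbor features via the adjacency matrix,
    weight the node's own feature by (1 + eps), then apply an MLP
    (any callable here; a learned network in practice)."""
    return mlp((1.0 + eps) * h + adj @ h)

# Tiny example: 3-node path graph, 2-dim features, identity "MLP".
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
h = np.eye(3, 2)
out = gin_layer(adj, h, eps=0.1, mlp=lambda x: x)
```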
Link prediction on knowledge graphs
CompGCN or NBFNet
CompGCN jointly embeds entities and relations with composition operators. NBFNet generalizes path-based reasoning with strong inductive capability on unseen entities.
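The "composition operator" idea can be sketched in a few lines. Below, a TransE-inspired subtraction composition (one of the operators CompGCN supports) turns each incoming (head, relation) pair into a message; the helper names and toy embeddings are illustrative only:

```python
import numpy as np

def compose_sub(entity, relation):
    """Subtraction composition: a neighbor's message is its entity
    embedding composed with the embedding of the connecting relation."""
    return entity - relation

def compgcn_aggregate(entity_embs, relation_embs, triples, target):
    """Mean-aggregate composed messages over all triples (h, r, t)
    that point at the target node."""
    msgs = [compose_sub(entity_embs[h], relation_embs[r])
            for (h, r, t) in triples if t == target]
    return np.mean(msgs, axis=0)

entity_embs = {"paris": np.array([1.0, 2.0]), "france": np.array([0.0, 0.0])}
relation_embs = {"capital_of": np.array([1.0, 1.0])}
triples = [("paris", "capital_of", "france")]
msg = compgcn_aggregate(entity_embs, relation_embs, triples, "france")
```

Because relations get their own embeddings inside the message function, the model handles multi-relational graphs without one weight matrix per relation type.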
Large-scale graph learning (100M+ nodes)
GraphSAGE with ClusterGCN or SIGN
Subgraph sampling and pre-computation strategies make training tractable. SIGN eliminates message passing at inference, enabling real-time serving.
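SIGN's pre-computation step is the key to message-passing-free serving: powers of the normalized adjacency times the feature matrix are computed once offline, and everything downstream is an ordinary MLP. A small NumPy sketch (row normalization used here for brevity; the paper explores several operators):

```python
import numpy as np

def sign_precompute(adj, feats, num_hops):
    """SIGN's offline step: normalize the adjacency, then concatenate
    [X, AX, A^2 X, ...]. Training and serving afterwards need only an
    MLP over these fixed features, with no graph traversal."""
    a_norm = adj / np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    ops, cur = [feats], feats
    for _ in range(num_hops):
        cur = a_norm @ cur
        ops.append(cur)
    return np.concatenate(ops, axis=1)

adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
x = sign_precompute(adj, np.eye(3), num_hops=2)
```

Each node's row in `x` stacks its own features with 1-hop and 2-hop aggregates, so inference is a single feed-forward pass per node.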
Tasks & Benchmarks
Node Classification
Node classification — assigning labels to vertices in a graph using both node features and neighborhood structure — is the flagship task for graph neural networks. GCN (Kipf & Welling, 2017) established the Cora/Citeseer/PubMed benchmark trinity, but these datasets are tiny by modern standards and results have saturated well above 85% accuracy. The field has since moved to large-scale graphs (ogbn-arxiv and ogbn-products from OGB) and to a still-unsettled debate over how much message passing actually contributes: SGC and SIGN ablations show that simple models over precomputed neighborhood features can match full GNNs on many benchmarks.
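The GCN propagation rule from Kipf & Welling is compact enough to write out directly: add self-loops, symmetrically normalize, then apply a linear transform and nonlinearity. A minimal NumPy sketch with random toy inputs:

```python
import numpy as np

def gcn_layer(adj, h, w):
    """One GCN step: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W),
    where A is the adjacency matrix and D the degree matrix of A + I."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ h @ w, 0.0)

rng = np.random.default_rng(0)
adj = np.array([[0., 1., 1.],
                [1., 0., 0.],
                [1., 0., 0.]])
out = gcn_layer(adj, rng.normal(size=(3, 4)), rng.normal(size=(4, 2)))
```

Stacking k such layers mixes information from k-hop neighborhoods, which is exactly what the SGC/SIGN ablations approximate with precomputed features.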
Link Prediction
Link prediction — inferring missing or future edges in a graph — underpins knowledge graph completion, drug-target discovery, and social network recommendation. TransE (2013) launched the knowledge graph embedding era, and the field matured through DistMult, RotatE, and CompGCN, benchmarked on FB15k-237 and WN18RR. The current frontier is inductive link prediction (generalizing to unseen entities), where GNN-based methods like NBFNet and foundation models like ULTRA (2024) show that a single model can transfer across entirely different knowledge graphs without retraining.
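TransE's scoring function illustrates what "knowledge graph embedding" means concretely: relations are translations in embedding space, so a true triple (h, r, t) should satisfy h + r ≈ t. A tiny sketch with hand-picked toy embeddings:

```python
import numpy as np

def transe_score(head, rel, tail):
    """TransE plausibility score: negative L2 distance between the
    translated head (h + r) and the tail; higher means more plausible."""
    return -np.linalg.norm(head + rel - tail)

h = np.array([1.0, 0.0])        # head entity embedding
r = np.array([0.0, 1.0])        # relation embedding
t_true = np.array([1.0, 1.0])   # tail satisfying h + r = t exactly
t_corrupt = np.array([0.0, 0.0])
s_true = transe_score(h, r, t_true)
s_corrupt = transe_score(h, r, t_corrupt)
```

Training pushes true triples toward zero distance and corrupted ones away; later models (DistMult, RotatE) swap the translation for multiplicative or rotational interactions.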
Molecular Property Prediction
Molecular property prediction — estimating toxicity, solubility, binding affinity, or other properties from molecular structure — is the workhorse task of AI-driven drug discovery. GNNs operate on molecular graphs while transformer approaches (ChemBERTa, Uni-Mol) use SMILES strings or 3D coordinates. MoleculeNet (2018) and the Therapeutic Data Commons (TDC) provide standardized benchmarks, but the real bottleneck is distribution shift: models trained on known chemical space struggle with novel scaffolds, and the gap between leaderboard accuracy and actual wet-lab utility remains the field's central challenge.
Graph Classification
Graph classification — predicting a label for an entire graph, not individual nodes — matters for molecular screening, social network analysis, and program verification. GIN (Xu et al., 2019) formalized the connection between GNN expressiveness and the Weisfeiler-Leman graph isomorphism test, and the TU datasets became standard benchmarks. Recent work on graph transformers (GPS, Exphormer) and higher-order GNNs pushes beyond WL limits, while OGB's ogbg-molhiv and ogbg-molpcba provide more rigorous large-scale evaluation than the classic small-graph benchmarks.
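The Weisfeiler-Leman test that bounds GNN expressiveness is itself a short algorithm: iteratively recolor each node by hashing its color with the multiset of its neighbors' colors. A pure-Python sketch, distinguishing a triangle from a 3-node path:

```python
from collections import Counter

def wl_refine(adj_list, colors, rounds):
    """1-WL color refinement: each round, a node's new color hashes its
    current color together with the sorted multiset of neighbor colors.
    Graphs whose color histograms ever differ are non-isomorphic; the
    converse fails, which is the limit GIN-style GNNs inherit."""
    for _ in range(rounds):
        colors = {v: hash((colors[v], tuple(sorted(colors[u] for u in nbrs))))
                  for v, nbrs in adj_list.items()}
    return colors

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
c_tri = Counter(wl_refine(triangle, {v: 0 for v in triangle}, 1).values())
c_path = Counter(wl_refine(path, {v: 0 for v in path}, 1).values())
```

After one round the triangle's nodes all share a color while the path splits into endpoint and midpoint colors, so the histograms already differ; the higher-order GNNs mentioned above target graph pairs that 1-WL cannot split this way.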
Honest Takes
Graph transformers are overhyped for most tasks
GPS and Exphormer show gains on long-range benchmarks (LRGB) but rarely beat well-tuned GIN or GCN on standard molecular and social graph tasks. The quadratic attention cost makes them impractical at scale. Stick with message-passing unless you have a proven long-range dependency problem.
OGB leaderboards are becoming a hyperparameter competition
Top entries on ogbg-molhiv differ by 0.1-0.3% AUC and rely on extensive ensembling and augmentation tricks. The architectural innovations that matter for real applications are harder to see through the leaderboard noise.
GNNs for drug discovery are real, not hype
Unlike many ML-for-science applications, GNN-based virtual screening is deployed at scale in pharma. Companies report 2-5x hit-rate improvements over traditional docking. The combination of molecular graphs with 3D geometry is a genuine success story.