Learn by
Reproducing SOTA
Not another ML course. Each lesson targets a real benchmark — you reproduce a published result, then improve on it. Your improvement goes on the leaderboard.
The Method
Understand
Learn the theory with real math, working code, and interactive visualizations. No hand-waving — actual matrix operations, attention formulas, loss functions.
Reproduce
Run the exact evaluation pipeline for a published result. If the paper reports 51.68, you should get 51.68 ± 0.5. Matching the number proves you understand the full pipeline.
Improve
Beat the baseline. Fine-tune, change the architecture, try a different pooling strategy. If your approach works, it's a real benchmark contribution on CodeSOTA.
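As one example of the kind of change that can move a score, here is a minimal numpy sketch of two common pooling strategies, mean pooling versus CLS-style pooling, over toy token embeddings (the values are illustrative, not from any real model):

```python
import numpy as np

# Toy token embeddings for one 4-token sentence, dimension 6.
# Deterministic stand-ins for real transformer outputs.
token_embeddings = np.arange(24, dtype=float).reshape(4, 6)
attention_mask = np.array([1, 1, 1, 0])  # last position is padding

# Mean pooling: average only the real (non-padding) token vectors.
mask = attention_mask[:, None].astype(float)  # shape (4, 1)
mean_pooled = (token_embeddings * mask).sum(axis=0) / mask.sum()

# CLS-style pooling: take the first token's vector as the sentence vector.
cls_pooled = token_embeddings[0]

print(mean_pooled)  # averages rows 0-2: [6., 7., 8., 9., 10., 11.]
```

Swapping one for the other is a one-line change in most pipelines, which is what makes pooling a cheap first experiment.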
Lessons
0-1: What is an Embedding?
How neural networks convert text into numbers. Reproduce a published MTEB score and improve on it.
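At its core, "converting text into numbers" starts with a lookup: each token ID selects one row of an embedding matrix. A minimal sketch with a made-up three-word vocabulary (real models use ~30k+ tokens and dimensions like 384 or 768):

```python
import numpy as np

# Tiny illustrative vocabulary and a random embedding matrix:
# one learned row (vector) per token.
vocab = {"the": 0, "cat": 1, "sat": 2}
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(len(vocab), 4))  # 3 tokens x 4 dims

# Embedding a sentence is just a row lookup per token.
sentence = ["the", "cat", "sat"]
token_ids = [vocab[w] for w in sentence]
vectors = embedding_matrix[token_ids]  # shape (3, 4)
print(vectors.shape)
```

In a trained model these rows are learned parameters rather than random numbers, but the lookup mechanics are exactly this.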
1-1: Text Embeddings Deep Dive
Sentence-BERT, contrastive learning, and how modern embedding models are trained.
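The contrastive objective behind many modern embedding models can be sketched in a few lines: score each anchor against every positive in the batch, and push the matched pair's similarity above the in-batch negatives. A simplified numpy version of an InfoNCE-style loss (not any specific library's implementation):

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.05):
    """Simplified InfoNCE-style contrastive loss with in-batch negatives."""
    # L2-normalize so dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    sims = (a @ p.T) / temperature  # (batch, batch) similarity matrix

    # Each anchor's true positive sits on the diagonal; every other
    # entry in its row acts as an in-batch negative.
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 16))
# Positives close to their anchors should give a low loss.
positives = anchors + 0.01 * rng.normal(size=(8, 16))
print(info_nce_loss(anchors, positives))
```

Mismatching the pairs (e.g. shifting the positives by one row) makes the loss jump, which is exactly the signal training exploits.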
1-4: Text Classification
Use embeddings for zero-shot and few-shot text classification without fine-tuning.
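The zero-shot idea reduces to nearest-label search: embed each label description, embed the input text, and pick the label with the highest cosine similarity. A toy sketch where hand-made 3-d vectors stand in for real model embeddings:

```python
import numpy as np

def cosine(a, b):
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy stand-ins for embedded label descriptions.
label_embeddings = {
    "sports":  np.array([0.9, 0.1, 0.0]),
    "finance": np.array([0.1, 0.9, 0.2]),
}
# Pretend embedding of "the match went to penalties".
text_embedding = np.array([0.8, 0.2, 0.1])

# Zero-shot prediction: nearest label by cosine similarity.
scores = {label: cosine(text_embedding, vec)
          for label, vec in label_embeddings.items()}
prediction = max(scores, key=scores.get)
print(prediction)  # sports
```

No classifier is trained here; adding a new class is just adding one more label embedding to the dictionary.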
Deep Dives
Editorial research pages. Actual math, numerical walkthroughs, and working code — not summaries.
Why Neural Networks Need Embeddings: The Matrix Operations Problem
How Transformers Work: From Attention to Embeddings
Why 768? The Science Behind Embedding Dimensions
Looking for the free version?
The /learn section covers the same topics without the reproduce/improve workflow.