Model card
ST-MoE-32B
Google Brain · open source · unknown parameter count · Sparse Mixture-of-Experts Transformer
A stable and transferable sparse expert model; reached state of the art on SuperGLUE in 2022.
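The defining technique behind ST-MoE is sparse expert routing stabilized by the router z-loss the paper introduces. Below is a minimal NumPy sketch of a top-2-routed MoE layer with that z-loss; the shapes, function names, and the `experts` callables are hypothetical simplifications for illustration, not the released implementation.

```python
import numpy as np

def moe_layer(x, w_router, experts, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x: [tokens, d_model] activations; w_router: [d_model, n_experts];
    experts: list of callables mapping a d_model vector to a d_model vector.
    Names and shapes are illustrative, not from the released code.
    """
    logits = x @ w_router                              # [tokens, n_experts]
    # Router z-loss (ST-MoE): mean squared log-sum-exp of the router logits,
    # which discourages large logits and stabilizes training.
    m = logits.max(axis=-1, keepdims=True)
    lse = m.squeeze(-1) + np.log(np.exp(logits - m).sum(axis=-1))
    z_loss = np.mean(lse ** 2)
    gates = np.exp(logits - m)
    gates /= gates.sum(axis=-1, keepdims=True)         # softmax gate
    top_k = np.argsort(-gates, axis=-1)[:, :k]         # top-k expert ids per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for e in top_k[t]:
            out[t] += gates[t, e] * experts[e](x[t])   # gate-weighted expert mix
    return out, z_loss
```

In the paper the z-loss is added to the total loss with a small coefficient (0.001), alongside the usual load-balancing auxiliary loss.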
§ 01 · Benchmarks
Every benchmark ST-MoE-32B has a recorded score for.
| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | SuperGLUE | Natural Language Processing · Text Classification | average score | 91.2% | #2 | 2022-02-17 | source ↗ |
| 02 | SuperGLUE | Natural Language Processing · Text Classification | average score | 91.2% | #2 | 2022-02-01 | source ↗ |
The Rank column shows this model's position among all models scored on the same benchmark and metric; #1 marks the current SOTA. Rows are sorted by rank, then by newest result.
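As a hypothetical illustration of how a rank like #2 is derived (the scores below are placeholders, not real leaderboard entries): models evaluated on the same benchmark and metric are sorted by score, and the 1-based position is reported.

```python
def rank_of(model: str, scores: dict[str, float]) -> int:
    """1-based rank of `model` among all scores (higher is better)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(model) + 1

# Placeholder scores, purely illustrative:
print(rank_of("ST-MoE-32B", {"ST-MoE-32B": 91.2, "model-A": 91.4, "model-B": 90.1}))  # 2
```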
§ 02 · Strengths by area
Where ST-MoE-32B performs, by area.
- Natural Language Processing · Text Classification — 2 recorded results, best rank #2 (SuperGLUE, 91.2%).
§ 03 · Papers
1 paper with results for ST-MoE-32B (2 results).
- 2022-02-17 · Natural Language Processing · 2 results
  ST-MoE: Designing Stable and Transferable Sparse Expert Models
§ 04 · Sources & freshness
Where these numbers come from.
arxiv · 2 results · 2 of 2 rows marked verified · first result 2022-02-01, latest 2022-02-17.