The tasks and counterfactuals from the Mechanistic Interpretability Benchmark.
AI & ML interests
Principled evaluation of mechanistic interpretability methods.
Datasets (7)
mib-bench/ravel
Viewer • Updated • 117k • 545
mib-bench/arithmetic_subtraction
Viewer • Updated • 20.9k • 246
mib-bench/arithmetic_addition
Viewer • Updated • 40.4k • 377
mib-bench/ioi
Viewer • Updated • 21k • 4.58k
mib-bench/arc_easy
Viewer • Updated • 4.01k • 974
mib-bench/arc_challenge
Viewer • Updated • 2k • 479
mib-bench/copycolors_mcqa
Viewer • Updated • 1.89k • 2.66k
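
The repositories listed above can be addressed programmatically by their IDs. The following is a minimal sketch, not an official loader: `MIB_DATASETS`, `hub_url`, and `load_mib_dataset` are hypothetical names introduced here for illustration. It assumes the third-party `datasets` library is installed and that each repository loads through `load_dataset`. Some repositories may need a configuration name or a `split` argument, which this listing does not specify.

```python
from typing import Any

# Repository IDs copied from the listing above.
MIB_DATASETS = [
    "mib-bench/ravel",
    "mib-bench/arithmetic_subtraction",
    "mib-bench/arithmetic_addition",
    "mib-bench/ioi",
    "mib-bench/arc_easy",
    "mib-bench/arc_challenge",
    "mib-bench/copycolors_mcqa",
]


def hub_url(repo_id: str) -> str:
    """Return the conventional Hub page URL for a dataset repository."""
    return f"https://huggingface.co/datasets/{repo_id}"


def load_mib_dataset(repo_id: str, **kwargs: Any):
    """Hypothetical helper: load one of the listed datasets.

    Extra keyword arguments (for example a config name or `split`) are
    passed through to `datasets.load_dataset`; which ones a given
    repository needs is an assumption to verify on its dataset card.
    """
    if repo_id not in MIB_DATASETS:
        raise ValueError(f"unknown MIB dataset: {repo_id}")
    # Imported lazily so the rest of the module works without the library.
    from datasets import load_dataset

    return load_dataset(repo_id, **kwargs)


if __name__ == "__main__":
    # Print the Hub page for every listed repository (no network access).
    for repo_id in MIB_DATASETS:
        print(hub_url(repo_id))
```

Calling `load_mib_dataset("mib-bench/ioi")` would download that repository on first use. Checking the ID against the list first catches typos before any network request is made.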