We’re building a research coalition on evaluating evaluations (EvalEval)! Hosted by Hugging Face, the University of Edinburgh, and EleutherAI.
Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting
Standardized evaluation cards for AI models and benchmarks
Receive and process benchmark data via webhook