About & Disclosures

Conflict of Interest Statement

This benchmark receives no funding from any AI platform, and there is no advertising on this site. The scoring judge model is Claude Haiku 4.5 (Anthropic), and Anthropic's own models are among those rated by this benchmark. To mitigate potential bias, all raw data is published, and a secondary validation using a non-Anthropic judge is run on 25% of prompts.

Advisory Board

Advisory board recruitment in progress.