Models
Every model card carries its evidence limits.
A model's version is currently treated as unknown unless the source artifacts document it independently.
| Model | Provider | Run | Completion | Parse | Evidence | Caveat |
|---|---|---|---|---|---|---|
| Claude Haiku 4.5 | anthropic | j97fccxfgwzwrtsqcfc2xsw0ex85fpg4 | 100% | 100% | Level 3 | model version unknown |
| Claude Opus 4.6 | anthropic | j978sz65j5185m8djs923vtzcd85hgk2 | 100% | 100% | Level 3 | model version unknown |
| Claude Opus 4.7 | anthropic | j974xckwt2hgw6vq005x4eqhm185ffm9 | 100% | 100% | Level 3 | model version unknown |
| Claude Sonnet 4.6 | anthropic | j972k3c86rfn80hmekwhvppmh185es59 | 100% | 100% | Level 3 | model version unknown |
| Command A | cohere | j97emhr96a7px410xaqhvb4xfh85e6vk | 100% | 100% | Level 3 | model version unknown |
| Cydonia 24B V4.1 | thedrummer | j972f5vmnqzb67t3q6ersqjg1s85h60s | 100% | 100% | Level 3 | model version unknown |
| DeepSeek R1 | deepseek | j976q86kqd2r9w321trar7137985g7tb | 100% | 100% | Level 3 | model version unknown |
| DeepSeek V3.1 | deepseek | j97d0rz5th1hs8dt33pnk6eyvn85hr0a | 100% | 100% | Level 3 | model version unknown |
| DeepSeek V3.1 Terminus | deepseek | j975wet2gnykcvg6hbwakq6dz985hqa8 | 100% | 100% | Level 3 | model version unknown |
| DeepSeek V3.2 | deepseek | j97926c87q3kq18h8k3wps5frh85eevg | 100% | 100% | Level 3 | model version unknown |
| DeepSeek V3.2 Exp | deepseek | j97bxyg56bwpk6shf2avdwr3r985g86y | 100% | 100% | Level 3 | model version unknown |
| DeepSeek V4 Flash | deepseek | j97558j95yjknfeayjqww1mjpx85e6f6 | 100% | 100% | Level 3 | model version unknown |
| DeepSeek V4 Pro | deepseek | j9751m5tnrattr50rkevqc83zn85fqkb | 100% | 100% | Level 3 | model version unknown |
| Gemini 2.5 Flash | google | j97326aq5cgd6g307zqwg0vnn185kzk0 | 100% | 100% | Level 3 | model version unknown |
| Gemini 2.5 Flash Lite | google | j97e691sdmdkq7aapkxmgg3cc585jxs7 | 100% | 100% | Level 3 | model version unknown |
| Gemini 3 Flash Preview | google | j9703v87ezvqrq8spna4g3j9en85fyd1 | 100% | 100% | Level 3 | model version unknown |
| Gemini 3.1 Flash Lite Preview | google | j9712jf329acx64c34fq8gsg5h85esx6 | 100% | 100% | Level 3 | model version unknown |
| Gemini 3.1 Pro Preview | google | j974bynzat3xpze5hb0zqp130185fg7r | 100% | 100% | Level 3 | model version unknown |
| Gemma 4 26B A4B | google | j97569yb32w3gsj6hqjd6dybes85jfq7 | 100% | 100% | Level 3 | model version unknown |
| Gemma 4 31B | google | j979ttg2qxqzmtn14pr5vhhr3x85j520 | 100% | 100% | Level 3 | model version unknown |
| GLM 4.7 Flash | z-ai | j97adzpw9epv1m1gm2npz5p86n85kzqm | 100% | 100% | Level 3 | model version unknown |
| GLM 5 | z-ai | j97dxxae37b10j854sbjsejzwd85h6td | 100% | 100% | Level 3 | model version unknown |
| GLM 5.1 | z-ai | j975jp9v89wvjq5f3jr8nn2ewd85ekgk | 100% | 100% | Level 3 | model version unknown |
| Goliath 120B | alpindale | j972zxe0bxxkwqszyfpyckrzh185hqks | 100% | 100% | Level 3 | model version unknown |
| GPT OSS 120B | openai | j975mtsnr4s2jsxwthrxes5y5n85jbfm | 100% | 100% | Level 3 | model version unknown |
| GPT OSS 20B | openai | j97arx37czhpjcwaasmjx68p0x85j8z0 | 100% | 100% | Level 3 | model version unknown |
| GPT-4.1 Mini | openai | j97cdjx35w4j8f7gw163tpvyjh85j6xt | 100% | 100% | Level 3 | model version unknown |
| GPT-5.2 | openai | j9738hkxta6qate5rm4h90nchh85e75a | 100% | 100% | Level 3 | model version unknown |
| GPT-5.3 Chat | openai | j97030dbj0smf8jgfss3cwn4p185f73k | 100% | 100% | Level 3 | model version unknown |
| GPT-5.3 Codex | openai | j97fj32eyva9ectzrn9y2fz0dh85emf3 | 100% | 100% | Level 3 | model version unknown |
| GPT-5.4 | openai | j976yztmsq9f35crkvwgr6wcdn85ef9d | 100% | 100% | Level 3 | model version unknown |
| GPT-5.4 Mini | openai | j97487ck53tjz2t8zkf2fkey2n85epvs | 100% | 100% | Level 3 | model version unknown |
| GPT-5.4 Nano | openai | j97ev857pwr89j0b1c1rvbm28h85g28v | 100% | 100% | Level 3 | model version unknown |
| GPT-5.5 | openai | j97drygvx6x2mcmnygsqn2zyjs85ek57 | 100% | 100% | Level 3 | model version unknown |
| Grok 4 Fast | x-ai | j976d8qr13tt14n7z0j4sf00xs85evs8 | 100% | 100% | Level 3 | model version unknown |
| Grok 4.1 Fast | x-ai | j979szf5scyfhftyytsj0v9cv585eq5y | 100% | 100% | Level 3 | model version unknown |
| Grok 4.20 | x-ai | j97e8dpd84ta0958wrdey84ax185fm7p | 100% | 100% | Level 3 | model version unknown |
| Hermes 4 405B | nousresearch | j97ejpjfe0hs83r9cxy7wfatmn85hxj1 | 100% | 100% | Level 3 | model version unknown |
| Hermes 4 70B | nousresearch | j973srp02sjd306vt2dyc6mfd185gvkv | 100% | 100% | Level 3 | model version unknown |
| Jamba Large 1.7 | ai21 | j975f44yspeb5es849tex4bgy185h01r | 100% | 100% | Level 3 | model version unknown |
| Kimi K2.5 | moonshotai | j970pka97k8h3fmamjx2tqspsd85eyeb | 100% | 100% | Level 3 | model version unknown |
| Kimi K2.6 | moonshotai | j972rpbb4zj92abj1wzfyszvxh85ej4k | 100% | 100% | Level 3 | model version unknown |
| LFM 2.5 1.2B Instruct | liquid | j97cyfrkh9bds0hxtk1q4y0qvh85f5xm | 100% | 100% | Level 3 | model version unknown |
| LFM 2.5 1.2B Thinking | liquid | j9779tqd3776s4xf81ns9ghzm985ejew | 100% | 100% | Level 3 | model version unknown |
| LFM2 24B A2B | liquid | j972z0ab4awj4dm83gehryxxrs85h4e3 | 100% | 100% | Level 3 | model version unknown |
| Ling 2.6 1T | inclusionai | j976n5byp7xkgxwkswqhwdvkqs85exhg | 100% | 100% | Level 3 | model version unknown |
| Ling 2.6 Flash | inclusionai | j97dwh68yka3fahp3ykqz8gte585fgpr | 100% | 100% | Level 3 | model version unknown |
| Llama 3.3 Nemotron Super 49B V1.5 | nvidia | j970yx8r25xjv5cpqchbkzd2pd85hw6v | 100% | 100% | Level 3 | model version unknown |
| Llama 4 Maverick | meta-llama | j979bzkv7nt0652dfcc8a492bh85em8b | 100% | 100% | Level 3 | model version unknown |
| Llama 4 Scout | meta-llama | j97046nzsgev30rvz34023r09s85fe2r | 100% | 100% | Level 3 | model version unknown |
| Mercury 2 | inception | j974q6jv567bwj7averm73v29185gxvq | 100% | 100% | Level 3 | model version unknown |
| MiMo V2 Pro | xiaomi | j971zvzjjhvdmsja2yhwrjrdmx85f81b | 100% | 100% | Level 3 | model version unknown |
| MiMo V2.5 | xiaomi | j9798nyatbek9q6d90b3np17v985gekp | 100% | 100% | Level 3 | model version unknown |
| MiMo V2.5 Pro | xiaomi | j977zckgs0n16jkjhw18fyr15585g9yn | 100% | 100% | Level 3 | model version unknown |
| Ministral 3 14B 2512 | mistralai | j97evkvga01cbryq8k0rnqdcw585jzyf | 100% | 100% | Level 3 | model version unknown |
| Ministral 3 3B 2512 | mistralai | j974d5w0nnfmp415t12phqa3f585k252 | 100% | 100% | Level 3 | model version unknown |
| Ministral 3 8B 2512 | mistralai | j9775m5heecrtm0b58p70d7f4185jya2 | 100% | 100% | Level 3 | model version unknown |
| Mistral Large 3 2512 | mistralai | j97arjecv51yhx2arm6pxptpyx85fwkp | 100% | 100% | Level 3 | model version unknown |
| Mistral Saba | mistralai | j970bzqn5e2bk5bzn4pmyv8qds85g4y1 | 100% | 100% | Level 3 | model version unknown |
| Mistral Small 4 | mistralai | j976b2a43zzaypq24xz28zc68d85fm62 | 100% | 100% | Level 3 | model version unknown |
| MythoMax 13B | gryphe | j978d78pg1f5d9gkqkbgdns2nh85hv4p | 100% | 100% | Level 3 | model version unknown |
| Nemotron 3 Nano 30B A3B | nvidia | j97a2fb4waf743hw5f8v8dxhzs85f6rc | 100% | 100% | Level 3 | model version unknown |
| Nemotron 3 Super | nvidia | j97bpjxwb44wy95jf0ftxghj5185fcfx | 100% | 100% | Level 3 | model version unknown |
| Nemotron 3 Super 120B | nvidia | j976fzsxb60vx0vshb059v097585f2hd | 100% | 100% | Level 3 | model version unknown |
| Nemotron Nano 9B V2 | nvidia | j97d41m0hcy7zwqe15p3ha2sjs85jjr5 | 100% | 100% | Level 3 | model version unknown |
| Nova 2 Lite | amazon | j97f3q6jm1m0xept1vqw1arxxd85ggb7 | 100% | 100% | Level 3 | model version unknown |
| Nova Premier 1.0 | amazon | j97cfqvrcej9cr2eab6gt3n6cn85edd7 | 100% | 100% | Level 3 | model version unknown |
| OLMo 3.1 32B Instruct | allenai | j9769xggfdhbzr2bz8kjgz4k9h85gqeg | 100% | 100% | Level 3 | model version unknown |
| OpenAI OSS 120B | openai | j9715xceb2fdrpdqjg380qp8nn85ftxc | 100% | 100% | Level 3 | model version unknown |
| OpenAI OSS 20B | openai | j97b9evejnbbxnvkx3eng1fg5s85f5wm | 100% | 100% | Level 3 | model version unknown |
| Phi 4 | microsoft | j973s6nx8vv8g4mvmscc0z4v7x85h828 | 100% | 100% | Level 3 | model version unknown |
| Qwen3 Max Thinking | qwen | j97br8p7zfpvmanpw37gqd136x85exmc | 100% | 100% | Level 3 | model version unknown |
| Qwen3 Next 80B A3B Instruct | qwen | j971md79pynf3gevqas44y8f6h85g5ry | 100% | 100% | Level 3 | model version unknown |
| Qwen3.5 27B | qwen | j97be7gfths29n1wvwbrv0dzmh85gkee | 100% | 100% | Level 3 | model version unknown |
| Qwen3.5 35B A3B | qwen | j97adjehvznawfpb4p70ndqrxs85h8eh | 100% | 100% | Level 3 | model version unknown |
| Qwen3.5 397B A17B | qwen | j974jm92bywb9etdebymgxg8h985fnfx | 100% | 100% | Level 3 | model version unknown |
| Qwen3.5 Flash | qwen | j97evxnvs1v51qepm5yknbxv1985gcva | 100% | 100% | Level 3 | model version unknown |
| Qwen3.5 Plus | qwen | j97dh1vsp50ppzkymkx18d94zd85ffk4 | 100% | 100% | Level 3 | model version unknown |
| Qwen3.6 Plus | qwen | j97da4s45y2565za5q74j76qph85f1ep | 100% | 100% | Level 3 | model version unknown |
| Rocinante 12B | thedrummer | j9719vvw14qd8pem010344rff185gjr9 | 100% | 100% | Level 3 | model version unknown |
| Seed 2.0 Lite | bytedance-seed | j979jzwqhtqms8rw95c5rmmj2s85jxxj | 100% | 100% | Level 3 | model version unknown |
| Seed 2.0 Mini | bytedance-seed | j972cxx7yrkgm4kywwtcnpa0zn85gzjv | 100% | 100% | Level 3 | model version unknown |
| Solar Pro 3 | upstage | j97177wat2y8sep0zs1wh5rv2n85g9qj | 100% | 100% | Level 3 | model version unknown |
| Trinity Large Preview | arcee-ai | j974r07tcewfvap24mt4mffwj985hg7s | 100% | 100% | Level 3 | model version unknown |
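
The rows above share a fixed seven-column layout, so they can be loaded into records for filtering or sanity checks. The following is a minimal sketch, assuming the table is available as Markdown text. The function name `parse_model_rows` and the use of the header labels as dictionary keys are illustrative assumptions, not part of the source artifacts.

```python
from typing import Dict, List

# Column names copied from the table header.
COLUMNS = ["Model", "Provider", "Run", "Completion", "Parse", "Evidence", "Caveat"]


def parse_model_rows(markdown: str) -> List[Dict[str, str]]:
    """Parse pipe-delimited table rows into dicts keyed by column name.

    Hypothetical helper: skips the header and separator rows and raises
    ValueError on any row whose cell count differs from the header.
    """
    records = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table row
        cells = [c.strip() for c in line.strip("|").split("|")]
        if cells == COLUMNS or all(set(c) <= set("-:") for c in cells):
            continue  # header row or |---| separator row
        if len(cells) != len(COLUMNS):
            raise ValueError(
                f"expected {len(COLUMNS)} cells, got {len(cells)}: {line}"
            )
        records.append(dict(zip(COLUMNS, cells)))
    return records


# Two rows copied from the table, used as sample input.
sample = """\
| Model | Provider | Run | Completion | Parse | Evidence | Caveat |
|---|---|---|---|---|---|---|
| Claude Haiku 4.5 | anthropic | j97fccxfgwzwrtsqcfc2xsw0ex85fpg4 | 100% | 100% | Level 3 | model version unknown |
| Command A | cohere | j97emhr96a7px410xaqhvb4xfh85e6vk | 100% | 100% | Level 3 | model version unknown |
"""

rows = parse_model_rows(sample)
# Group model names by provider slug.
by_provider = {r["Provider"]: r["Model"] for r in rows}
print(by_provider)
```

The cell-count check only catches rows with too many or too few cells. A row whose values are shifted into the wrong columns but still has seven cells would pass, so a stricter check (for example, that `Completion` ends in `%`) may be worth adding.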