Downloads

Release files are boring on purpose.

The paper release contains the canonical sample, exclusions, duplicate-resolution decisions, version metadata, and empty schema files (0 rows) for the human and external evidence that is still missing.

Files

File Purpose Rows SHA-256
canonical_sample.csv canonicalSample 84 afc3bddc9d4480a5c49d0f3c0d6941faa5a74e7cf8333a033041414c535996c5
canonical_responses.csv canonicalResponses 22680 9e902ee604844482bbdb728040def04d9f4d4c318989c8a852f54ad762b9e6ce
axis_scores_recomputed.csv axisScoresRecomputed 756 f20df6554b42fff9e71fa71c09def6d35f6179c8f49f708c9b62d3acb3e40d7f
axis_intervals.csv axisIntervals 756 71841a6eb0590ffc936c3fe6ab3c88092a5baa4ba65c8c7cf7972b6ddac40b49
response_style_controls.csv responseStyleControls 84 f36234c504de7944a8c30f8626c324e5bcda6a6bb93b87cab3c2e75d21df01c3
item_diagnostics.csv itemDiagnostics 270 e153955252cffb1747edfcc1ed05b6fff10bcf9dc94dd477f394323cee83c365
duplicate_resolution.csv duplicateResolution 24 09737a66e9c225852afb870e3f932419a3d564ebb31afe6a0e5eaeeff4924266
exclusions.csv exclusions 63 635bccf966bb1e13c502c1c53342abef6f10ae23e3cea61ac4e672260d54cde8
schema_manifest.csv schemaManifest 7 2b401f0034087c8f45cf0dcdc3d4961f547425a9035e1266221e043815dd87d2
release_validation.json releaseValidation Unknown d9da7f6f59cbdfe7887c8cb0ece0b5d1323346cd25cb8b2a825618d955c33db1
validation_manifest.json validationManifest Unknown bb4d69f2a808ef507f9fdcf8e0ef334de7cd4d75d3e06e80b88c593d8f5423cc
scoring_config.json scoringConfig Unknown f4b558c903bcc8c0957fde6c0dd11df2f8b52cf30a27cc606b8b0a31bc2307f3
benchmark_version.json benchmarkVersion Unknown c2dcffaab0096e3955348ee367a90536cab950f897c21c2b3e73b8b76b069a99
human_expert_coding.csv humanExpertCoding 0 9e2a843e528300150c9eef8f979ad3510ae7c77701e7503b05a8bf6977a6fa81
external_anchors.csv externalAnchors 0 71172b617481954cba819a14c609ae8ab2f0f9b0d885417239da267bf0da6f17
open_ended_responses.csv openEndedResponses 0 71695289bed8d8602fe097b1b4eb0493d7fc64ee0c3ac3ee7f317dd1f61a9b21
paper_release.json releaseSummary Unknown 83116db54cd72d348d31804e6a8c06b1fcfe8654ea3b00658514ab0176750429
runs.csv runs 147 225b2c784d97660c490b65a8075e287402971be60962e159891fc2c4bc27ab66
responses.csv responses 34875 71e1b3e8807b067e3ced97e4cc36a5b6ce3dd7c04c1d93a9517bde0efbebdead
questions.csv questions 522 f774a33f5894bddb8cb444c3f7b99c13660c860b6aea1756fb1734b835ac01e8
axis_scores.csv axisScores 1009 32c7c1581a59ace148a34afbb36df193d5a9e0d1508b0ec5024ee7150b881851
axis_stats.csv axisStats 1314 f12c19d44790ec4e501fb717cf56b1f83f2283c215a38a0ad963e668d59463a6
axis_definitions.csv axisDefinitions 9 07df22fed304479cac658efe5b04f66b3086c5f4410742d64aea362ef3d21ade
model_catalog.csv modelCatalog 127 f99cd3db01c90192c3fd84d2743c6483d6506735eb1019ece16df8e9fb7e3ecc
artifact_packs.csv artifactPacks 50 7bcc4d28c1da87ddf5cf1745fbfa8779fdefee50a0dc9f0789d626aa0ec9bdbf
pack_runs.csv packRuns 158 8d36b3ab362db4bdd36626a812c5e67bf69133918c107a5ccd07ffd36f30532f
data_dictionary.csv dataDictionary 65 40dfd18813a7d7ad32f75b425c6edbca201a1fc3851f661411b2cc8539dedd0c
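
The checksums and row counts listed above can be checked locally after download. The following is a minimal sketch, not part of the release tooling: the helper names (`sha256_of`, `data_row_count`, `verify`) and the local file path are hypothetical, and it assumes that the Rows column counts CSV records excluding a single header line.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def data_row_count(path):
    """Count CSV records after the header (assumes exactly one header row)."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header line, if present
        return sum(1 for _ in reader)


def verify(path, expected_sha256, expected_rows=None):
    """Return a list of problems found; an empty list means the file checks out."""
    problems = []
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        problems.append(f"{path}: SHA-256 mismatch (got {actual})")
    if expected_rows is not None:
        rows = data_row_count(path)
        if rows != expected_rows:
            problems.append(f"{path}: expected {expected_rows} rows, found {rows}")
    return problems


if __name__ == "__main__":
    # Hash and row count copied from the table; the local path is an assumption.
    target = Path("axis_definitions.csv")
    if not target.exists():
        print(f"{target}: not found, download it first")
    else:
        issues = verify(
            target,
            "07df22fed304479cac658efe5b04f66b3086c5f4410742d64aea362ef3d21ade",
            expected_rows=9,
        )
        print("\n".join(issues) or "ok")
```

For the JSON files, whose row count is listed as Unknown, call `verify` without `expected_rows` so that only the checksum is compared.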