Skip to content

fix(eval): exclude booleans from parsed benchmark metrics#30

Merged
sacredvoid merged 1 commit intomainfrom
fix/parse-results-bool-filter
Mar 26, 2026
Merged

fix(eval): exclude booleans from parsed benchmark metrics#30
sacredvoid merged 1 commit intomainfrom
fix/parse-results-bool-filter

Conversation

@sacredvoid
Copy link
Owner

Summary

  • Adds and not isinstance(v, bool) guard to the metric filter in parse_results
  • bool is a subclass of int in Python, so isinstance(True, (int, float)) returns True
  • Boolean metadata from lm-eval (e.g. config flags) was leaking into benchmarks as 1/0

Fixes #29

Test plan

  • New test: test_filters_booleans verifies True/False are excluded
  • All 153 tests pass

bool is a subclass of int in Python, so isinstance(True, (int, float))
returns True. Boolean metadata from lm-eval results was leaking into
benchmark metrics as 1/0. Added explicit bool exclusion.

Fixes #29
@sacredvoid sacredvoid merged commit 2fba9ba into main Mar 26, 2026
@sacredvoid sacredvoid deleted the fix/parse-results-bool-filter branch March 26, 2026 00:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: parse_results includes booleans as numeric metrics (bool is subclass of int)

1 participant