Skip to content

Handle missing forecasts before summarisation#1156

Open
seabbs-bot wants to merge 31 commits intomainfrom
issue-1122-missing-scores
Open

Handle missing forecasts before summarisation#1156
seabbs-bot wants to merge 31 commits intomainfrom
issue-1122-missing-scores

Conversation

@seabbs-bot
Copy link
Copy Markdown
Collaborator

Summary

Closes #1122

  • Adds filter_missing_scores() with strategy pattern for dropping incomplete targets
  • Adds impute_missing_scores() with strategy pattern for filling missing score rows
  • Filter strategies: filter_to_intersection(min_coverage, models)
  • Impute strategies: impute_worst_score(), impute_mean_score(), impute_na_score(), impute_model_score(model)
  • Shared internal build_missing_grid() detects missing model × target combinations
  • .imputed column added to protected columns for tracking

Test plan

  • 21 filter tests pass (coverage thresholds, model subsetting, class preservation)
  • 35 impute tests pass (all strategies, custom functions, integration with summarise_scores())
  • Full test suite (788 tests) passes with no regressions
  • Integration pipeline: filter_missing_scores(strategy = filter_to_intersection(models = "baseline")) |> impute_missing_scores(strategy = impute_model_score("baseline"))

This was opened by a bot. Please ping @seabbs for any questions. # checks-passed

@seabbs seabbs self-requested a review March 30, 2026 20:51
@seabbs-bot seabbs-bot force-pushed the issue-1122-missing-scores branch from a2b0160 to 1c2996e Compare March 30, 2026 20:53
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.15%. Comparing base (4a222da) to head (1cf4beb).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1156      +/-   ##
==========================================
+ Coverage   97.98%   98.15%   +0.16%     
==========================================
  Files          37       40       +3     
  Lines        1984     2163     +179     
==========================================
+ Hits         1944     2123     +179     
  Misses         40       40              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@seabbs-bot
Copy link
Copy Markdown
Collaborator Author

Review

Design feedback (from @seabbs)

Should this even be filter_missing_scores or might people want to filter for other reasons, with the intersection being a special case? Not sure the filter even needs to know the missing grid.

Agreed. Renamed filter_missing_scores()filter_scores() — a general-purpose filtering function where filter_to_intersection() is just one strategy. Removed the redundant build_missing_grid() call from the filter path entirely. The strategy handles everything independently.

Does the filter always need to do the expand grid thing?

No. filter_to_intersection() works purely by counting coverage per target combination. It never calls build_missing_grid(). The expand-grid detection is only needed on the impute side.

Code review findings (addressed)

  1. compare hardcoded as "model" in all impute strategiesimpute_worst_score(), impute_mean_score(), impute_model_score() all did setdiff(fu, "model") instead of using the actual compare value. The strategy signature didn't receive compare at all. Fixed: strategy signature changed to function(scores, missing_rows, metrics, compare), all hardcoded "model" references replaced.

  2. build_missing_grid() called redundantly in filter — result stored but never passed to strategy. Fixed: removed entirely from filter path.

  3. No validation that compare column exists — both functions now have assert_subset(compare, names(scores)).

  4. Inconsistent @keywords — all functions now use @keywords handle-metrics.

  5. Weak test for impute_worst_score() — was checking global max, now verifies per-target-combination maximum.

This was opened by a bot. Please ping @seabbs for any questions.

@nikosbosse
Copy link
Copy Markdown
Collaborator

I'll wait until you tag me for review, right?

Copy link
Copy Markdown
Contributor

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this all works as expected. I was a bit uncertain that compare was the correct name but I think its the best option overall.

@seabbs seabbs requested review from nikosbosse and sbfnk March 31, 2026 17:13
Copy link
Copy Markdown
Contributor

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Snapshots look like a ggplot2 change

@seabbs-bot seabbs-bot force-pushed the issue-1122-missing-scores branch from bc98a89 to 8d0c866 Compare April 1, 2026 09:44
seabbs-bot and others added 20 commits April 1, 2026 11:01
…actories

Tests for impute_missing_scores(), impute_worst_score(), impute_mean_score(),
impute_na_score(), impute_model_score(), and integration with summarise_scores.
…_to_intersection

Add shared internal build_missing_grid() for detecting missing
model-target combinations. Add filter_missing_scores() with
strategy pattern and filter_to_intersection() strategy factory.
Add .imputed to protected columns and globalVariables.
Update test for protected columns. All tests green.

Ref #1122
Add impute_missing_scores() with .imputed column tracking, plus four
strategy factories: impute_worst_score(), impute_mean_score(),
impute_na_score(), and impute_model_score(). Fix data.table scoping
issues in closures by using explicit namespacing and avoiding variable
name collisions with column names.
Generalise the function as a strategy-based score filter that
delegates all logic to the strategy function. Remove the redundant
build_missing_grid() call and early-return check. Add validation
that the compare column exists in scores. Rename source and test
files accordingly.
- Add compare to strategy function signature (4th argument)
- Replace hardcoded "model" with compare in all strategies
- Add assert_subset validation for compare column
- Change @Keywords from scoring to handle-metrics
- Strengthen impute_worst_score test to verify per-target max
- Rename filter_to_intersection(models=) to include= for genericity
- Add validation for unknown values in include argument
- Fix cli_abort messages with embedded whitespace
- Add non-default compare tests for both filter and impute
- Pass compare through to all impute strategies (was already done,
  now tested)
Restructure the handling-missing-forecasts vignette to introduce
concepts incrementally with summarise_scores() shown after each
approach. Add cli_inform() messages to impute_missing_scores() to
match the pattern used by filter_scores().

Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
Add tests for edge cases in imputation strategies:
- impute_worst_score/impute_mean_score skip metrics not in columns
- impute_model_score errors for nonexistent reference model

Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
seabbs-bot and others added 11 commits April 1, 2026 11:01
Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
Moved to a separate issue (#1159).

Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
- Rename "Relaxing with min_coverage" to "Requiring partial coverage"
  with clearer explanation of why 0.75 keeps all targets here
- Mention compare argument early in the Scoring section
- Reorder imputation strategies by severity: NA (diagnostic), worst
  (heavy penalty), reference model (moderate), mean (least severe)
- Show imputed row counts by model and target_type instead of raw rows
- Add context to each strategy about when to use it
- Fix combined workflow text

Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
- filter_to_intersection(include) with scores_quantile verifies
  EpiNow2's targets are correctly selected
- filter_to_intersection(min_coverage) boundary: 0.75 keeps cases,
  1.0 drops them
- filter then summarise gives equal target counts per model
- impute then summarise gives equal row counts per model
- impute_model_score values match reference model scores exactly

Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
Vignette:
- Rename "Relaxing with min_coverage" to "Requiring partial coverage"
  with explanation of why 0.75 keeps all targets here
- Mention compare argument early in the Scoring section
- Reorder imputation: NA, worst, reference model, mean (severity order)
- Show imputed row counts by model/target_type not raw rows
- Add context to each strategy about when to use it
- Soften absolute claims

Tests:
- Wrap filter_scores/impute_missing_scores calls in suppressMessages
  to reduce test output noise, matching existing test conventions

Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
Move filter_scores, filter_to_intersection, and impute_* functions
from handle-metrics to a new postprocess-scores keyword and pkgdown
section, separating them from metric selection functions.

Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
…r messages

- impute_mean_score now verifies imputed values match per-target mean
- imputation preserves original score values (compared as numeric
  to handle logical-to-integer coercion from rbindlist)
- filter_scores reports correct "Filtered out N rows" message

Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
Co-authored-by: Sam Abbott <contact@samabbott.co.uk>
@seabbs-bot seabbs-bot force-pushed the issue-1122-missing-scores branch from 8d0c866 to 1cf4beb Compare April 1, 2026 10:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Handle missing forecasts before summarisation

3 participants