Rework get_duplicate_forecasts() to S3 generic by seabbs-bot · Pull Request #1148 · epiforecasts/scoringutils

seabbs-bot · 2026-03-30T14:37:05Z

Summary

Adds exported S3 generic get_forecast_type_ids() that returns the type-specific columns (beyond the forecast unit) identifying a unique row. Only types with actual ID columns need a method (quantile, sample, multivariate_sample, nominal, ordinal); the default returns character(0).
get_duplicate_forecasts() now calls get_forecast_type_ids() instead of hard-coding c("sample_id", "quantile_level", "predicted_label")
Adds a type argument to get_duplicate_forecasts() for use on raw data.frames (e.g. type = "quantile"). Internally constructs a temporary forecast object via new_forecast() to dispatch correctly.
Calling get_duplicate_forecasts() on a plain data.frame without type emits a deprecation warning
check_duplicates() error message now includes the detected type in the suggested get_duplicate_forecasts() call
New forecast types only need to define a get_forecast_type_ids method

Closes #888

Test plan

All 742 tests pass
New tests for get_forecast_type_ids() per forecast type
New tests for get_duplicate_forecasts() with type on raw data
Test for deprecation warning on raw data without type
Test for type hint in check_duplicates() error message
Linter passes on all changed files

This was opened by a bot. Please ping @seabbs for any questions.

Replace hard-coded column detection in get_duplicate_forecasts() with S3 dispatch. Each forecast type now defines get_duplicate_columns() returning the type-specific columns needed for duplicate detection. Plain data.frames retain the previous column-detection behaviour via the default method. Co-authored-by: Sam Abbott <contact@samabbott.co.uk>

codecov · 2026-03-30T14:41:36Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.96%. Comparing base (4a222da) to head (b1d82c5).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1148      +/-   ##
==========================================
- Coverage   97.98%   97.96%   -0.02%     
==========================================
  Files          37       37              
  Lines        1984     2016      +32     
==========================================
+ Hits         1944     1975      +31     
- Misses         40       41       +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Keep get_duplicate_forecasts() as a plain function instead of S3. Rename get_duplicate_columns() to get_forecast_type_ids() and make the default return character(0) so only types with actual ID columns need a method. Remove redundant methods for binary, point, and multivariate_point. Update tests to use forecast objects. Co-authored-by: Sam Abbott <contact@samabbott.co.uk>

Make get_forecast_type_ids() public so custom forecast types can define methods. Fix get_duplicate_forecasts() description that still referenced the old column-detection default. Co-authored-by: Sam Abbott <contact@samabbott.co.uk>

Add a `type` parameter so users can check for duplicates on raw data.frames by specifying the forecast type (e.g. "quantile", "sample"). Internally constructs a temporary forecast object to dispatch via get_forecast_type_ids(). Calling on a plain data.frame without type emits a deprecation warning. Also update check_duplicates() to include the detected type in the suggested get_duplicate_forecasts() call in the error message. Co-authored-by: Sam Abbott <contact@samabbott.co.uk>

@param

Keep old column-detection behaviour when calling get_duplicate_forecasts() on a plain data.frame without type, emitting a deprecation warning. Fix keyword_quote_linter by using backtick syntax for cli message names. Improve @param type docs to explain the class suffix convention. Co-authored-by: Sam Abbott <contact@samabbott.co.uk>

@Keywords

Add @Keywords as_forecast so pkgdown can find the topic in the reference index. Co-authored-by: Sam Abbott <contact@samabbott.co.uk>

Add lifecycle as a dependency and use lifecycle::deprecate_warn() for the deprecated plain data.frame path in get_duplicate_forecasts() instead of a manual cli_warn(). Co-authored-by: Sam Abbott <contact@samabbott.co.uk>

The error message now shows a copy-pasteable get_duplicate_forecasts() call with both forecast_unit and type arguments so users can diagnose duplicates on their raw data without guessing parameters. Co-authored-by: Sam Abbott <contact@samabbott.co.uk>

seabbs

This looks good to me. Its a shame to need the type but I don't think there is another pattern that really makes sense unless we make internal checks more verbose again

Co-authored-by: Sam Abbott <contact@samabbott.co.uk>

nikosbosse

Looks good to me - only thing I'd do is consider adding the class instead of converting into a forecast object, but 🤷

nikosbosse · 2026-04-01T16:16:25Z

R/get-duplicate-forecasts.R

+  if (inherits(data, "forecast")) {
+    type_cols <- get_forecast_type_ids(data)
+  } else if (!is.null(type)) {
+    tmp <- new_forecast(data, paste0("forecast_", type))


When exactly does the new forecast object get validated? Since the point of the function is to diagnose issues when forecast validation fails, it's of course important that this doesn't fail imediately.

Thinking about it again. I think it should all work. I just have a vague tingling sense around data.table things and printing and validation.
Maybe something to monitor. Maybe cleaner to just add the class instead of actually making this a forecast object?

doesn't the new_forecast basically just add the class vs doing anything else? I didn't want it to be too fake of a process.

The initial design did only work for validated objects and then I was like well this is pointless! Its a bit of a circular problem

Happy to resolve this and merge or any thoughts on things to do here? Could also make an issue?

seabbs requested review from nikosbosse, sbfnk and seabbs March 30, 2026 15:09

seabbs removed request for nikosbosse and sbfnk March 30, 2026 15:21

This comment was marked as outdated.

Sign in to view

seabbs and others added 4 commits March 30, 2026 17:23

Merge branch 'main' into feat-issue-5

fb81149

fix: add pkgdown keyword for get_forecast_type_ids

6a17981

Add @Keywords as_forecast so pkgdown can find the topic in the reference index. Co-authored-by: Sam Abbott <contact@samabbott.co.uk>

seabbs-bot mentioned this pull request Mar 30, 2026

Rework get_duplicate_forecasts() to S3 to avoid hard-coding columns and types #888

Open

seabbs and others added 3 commits March 30, 2026 19:24

Merge branch 'main' into feat-issue-5

ed202ab

refactor: use lifecycle package for deprecation warning

090d9c4

Add lifecycle as a dependency and use lifecycle::deprecate_warn() for the deprecated plain data.frame path in get_duplicate_forecasts() instead of a manual cli_warn(). Co-authored-by: Sam Abbott <contact@samabbott.co.uk>

seabbs approved these changes Mar 30, 2026

View reviewed changes

seabbs requested review from nikosbosse and sbfnk March 30, 2026 18:34

fix: use double quotes to pass quotes_linter

ca59905

Co-authored-by: Sam Abbott <contact@samabbott.co.uk>

seabbs approved these changes Mar 30, 2026

View reviewed changes

seabbs enabled auto-merge March 30, 2026 18:40

seabbs added 2 commits April 1, 2026 10:52

Merge branch 'main' into feat-issue-5

50376da

Merge branch 'main' into feat-issue-5

3459aac

nikosbosse approved these changes Apr 1, 2026

View reviewed changes

Merge branch 'main' into feat-issue-5

b1d82c5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rework get_duplicate_forecasts() to S3 generic#1148

Rework get_duplicate_forecasts() to S3 generic#1148
seabbs-bot wants to merge 14 commits intomainfrom
feat-issue-5

seabbs-bot commented Mar 30, 2026 •

edited

Loading

Uh oh!

codecov bot commented Mar 30, 2026 •

edited

Loading

Uh oh!

This comment was marked as outdated.

seabbs left a comment

Uh oh!

nikosbosse left a comment

Uh oh!

nikosbosse Apr 1, 2026

Uh oh!

nikosbosse Apr 1, 2026

Uh oh!

seabbs Apr 1, 2026

Uh oh!

seabbs Apr 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

seabbs-bot commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

codecov bot commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

This comment was marked as outdated.

seabbs left a comment

Choose a reason for hiding this comment

Uh oh!

nikosbosse left a comment

Choose a reason for hiding this comment

Uh oh!

nikosbosse Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

nikosbosse Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

seabbs Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

seabbs Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

seabbs-bot commented Mar 30, 2026 •

edited

Loading

codecov bot commented Mar 30, 2026 •

edited

Loading