Re-enable EvaluateOverhead to subtract method-call overhead from benchmark results#5142
lewing wants to merge 1 commit into dotnet:main
Conversation
…hmark results

BenchmarkDotNet PR dotnet/BenchmarkDotNet#3007 (merged Feb 16, 2026) changed the EvaluateOverhead default from true to false. This means BDN no longer runs idle/overhead iterations to measure and subtract the cost of the benchmark method call itself. On native JIT platforms the call overhead is <1ns and imperceptible, but on interpreted WASM it is ~8-10ns and on AOT WASM ~1-3ns. Combined with a 2.5-month WASM perf data gap (Dec 5, 2025 – Feb 26, 2026), this caused the auto-filer to report 2300+ false regressions:

- dotnet/perf-autofiling-issues#69444: 1416 interpreted WASM regressions
- dotnet/perf-autofiling-issues#69430: 864 AOT WASM regressions

The regression ratio is inversely correlated with baseline duration, the signature of a constant additive overhead rather than a real runtime change. Re-enabling EvaluateOverhead restores the previous measurement methodology for all platforms.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
That was the exact discussion we had before the change was made. cc @tannergooding
Yeah, from the experience of triaging, we often run into spurious regressions where we end up at 0ns time spent, as well as what is mentioned in the linked issue about fast tests being super noisy. Now that being said, if we have larger, variable overheads in the WASM tests we should try and turn them on for just that scenario. I don't have a lot of experience looking at the WASM tests so I trust your judgement @lewing. I imagine that should be fairly easy, and am happy to add to the PR when I get back from a walk.
Right. While the intuitive thing would be that measuring and subtracting overhead is "better", it's often actually quite the opposite in practice due to the precision of the hardware timers and other factors. Here it sounds rather like the WASM interpreter is simply slow enough, in contrast to other scenarios, that no longer subtracting the overhead is showing up as measurable.

What would be interesting to know is whether the WASM results are overall more stable now that the overhead isn't being subtracted. If they are more stable, then while there is a "spike" we would be at a better and probably more representative baseline for what user code actually sees. This would also help reduce noise in future triage. However, if they stay at the same overall stability levels, then it doesn't really matter and we can toggle subtraction back on for WASM and revisit that in the future if the overhead is ever reduced.
closing in favor of #5143 |
Summary
Re-enable `EvaluateOverhead` in the default benchmark job configuration. BenchmarkDotNet PR #3007 (merged Feb 16, 2026) changed the `EvaluateOverhead` default from `true` to `false`, which means BDN no longer runs idle/overhead iterations to measure and subtract the cost of the benchmark method call itself from workload measurements.

Problem
This one-line default change, combined with a 2.5-month WASM perf data gap (Dec 5, 2025 – Feb 26, 2026), caused the auto-filer to report 2,300+ false regressions:

- dotnet/perf-autofiling-issues#69444: 1,416 interpreted WASM regressions
- dotnet/perf-autofiling-issues#69430: 864 AOT WASM regressions
On native JIT platforms the method-call overhead is <1ns (imperceptible), but on the WASM interpreter it's ~8-10ns and on AOT WASM ~1-3ns. Without overhead subtraction, every WASM microbenchmark reports raw time including this call overhead, creating an additive bias that makes short-baseline benchmarks appear dramatically regressed.
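The additive bias is easy to see numerically. In the sketch below, the ~8ns overhead is the interpreted-WASM figure quoted above, while the baseline durations are purely hypothetical examples:

```python
# Illustration of how a constant additive overhead inflates short benchmarks
# far more than long ones. The ~8 ns overhead is the interpreted-WASM figure
# quoted above; the baseline durations are hypothetical.

def apparent_regression(baseline_ns: float, overhead_ns: float) -> float:
    """Ratio the perf pipeline sees when call overhead is no longer subtracted."""
    return (baseline_ns + overhead_ns) / baseline_ns

OVERHEAD_NS = 8.0  # ~8 ns call overhead on interpreted WASM

for baseline in (2.0, 10.0, 100.0, 1000.0):
    ratio = apparent_regression(baseline, OVERHEAD_NS)
    print(f"{baseline:7.1f} ns baseline -> {ratio:.2f}x apparent regression")
```

A 2ns benchmark appears to regress 5x while a 1000ns benchmark barely moves, which is exactly the inverse correlation between regression ratio and baseline duration described here.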
Evidence
The regression distribution across all 2,300+ benchmarks perfectly matches an additive constant, not a multiplicative factor. Time-series data from the perf portal confirms a step function coinciding exactly with the BDN methodology change — no gradual drift, and no runtime code changes in the window. See the detailed analysis in the issue comments.
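The additive-versus-multiplicative distinction can be checked mechanically: under an additive shift the absolute deltas are constant across benchmarks while the ratios vary with baseline duration, whereas a real multiplicative regression shows the opposite. The numbers below are synthetic, not taken from the actual perf data:

```python
# Synthetic check distinguishing an additive shift (new = old + c) from a
# multiplicative one (new = k * old). All durations here are hypothetical.

baselines_ns = [2.0, 10.0, 100.0, 1000.0]
OVERHEAD_NS = 8.0  # constant additive overhead, e.g. unsubtracted call cost

measured = [b + OVERHEAD_NS for b in baselines_ns]

deltas = [m - b for m, b in zip(measured, baselines_ns)]
ratios = [m / b for m, b in zip(measured, baselines_ns)]

print("deltas:", deltas)  # identical for every benchmark: additive signature
print("ratios:", ratios)  # shrink as the baseline grows, as observed here
```

If the 2,300+ reported regressions had instead come from a real runtime slowdown, the ratios, not the deltas, would be roughly constant across benchmarks.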
Fix
One-line change in `RecommendedConfig.cs` to explicitly set `.WithEvaluateOverhead(true)` in the default Job configuration, restoring the previous measurement methodology for all platforms.

This is arguably the right default for a performance lab anyway: overhead subtraction gives more accurate results for the actual workload being measured, especially for sub-microsecond benchmarks.
/cc @AaronRobinsonMSFT