perf: batch individual file inputs for parallel formatting #1818

Open
ansemb wants to merge 1 commit into belav:main from ansemb:main

ansemb commented Feb 16, 2026

Problem

When passing 200+ individual file paths to CSharpier (e.g. with pre-commit hooks), formatting is significantly slower than passing a directory containing the same files.

Note: these timings were taken on an M1 Mac. The millisecond figures are only meant to indicate how large the gap is between passing a folder and passing multiple individual files.

Directory:

dotnet csharpier format path/to/project/dir
Formatted 567 files in 928ms.

Files:

dotnet csharpier format \
  path/to/project/dir/file1 \
  path/to/project/dir/sub/file2 \
  ...

Before (293 files, individual paths): Formatted 293 files in 40886ms.
After (293 files, individual paths): Formatted 293 files in 859ms.

Root Cause

In CommandLineFormatter.FormatPhysicalFiles, each individual file path was processed inside the input loop with:

  1. A new OptionsProvider per file (expensive — reads .csharpierrc, .editorconfig, .gitignore from disk and walks up directories)
  2. A new FormattingCache per file (reads/writes a separate cache JSON each time)
  3. Sequential formatting (no parallelism)

In contrast, directory inputs already used a single OptionsProvider, a single FormattingCache, and parallel formatting via Task.WhenAll.
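
The difference between the two code paths can be sketched as follows. This is an illustration only, not CSharpier's code: `SketchOptionsProvider` and `BatchingSketch` are hypothetical names, the constructor counter stands in for the disk reads listed above, and the formatting work is a placeholder.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for an options provider; not CSharpier's real class.
public sealed class SketchOptionsProvider
{
    // Counts how many times the simulated expensive setup ran.
    public static int SetupCount;

    private readonly ConcurrentDictionary<string, string> configByDirectory = new();

    public SketchOptionsProvider()
    {
        // Stands in for reading config and ignore files from disk.
        Interlocked.Increment(ref SetupCount);
    }

    public string GetConfigFor(string filePath)
    {
        var directory = Path.GetDirectoryName(filePath) ?? string.Empty;
        // Each directory is resolved once, then served from the cache.
        return this.configByDirectory.GetOrAdd(directory, d => "config for " + d);
    }
}

public static class BatchingSketch
{
    // Per-file shape: one provider per file, one file at a time.
    public static async Task<int> FormatSequentially(IEnumerable<string> files)
    {
        var formatted = 0;
        foreach (var file in files)
        {
            var provider = new SketchOptionsProvider();
            await FormatOne(file, provider);
            formatted++;
        }
        return formatted;
    }

    // Directory-style shape: one shared provider, all files scheduled together.
    public static async Task<int> FormatBatched(IReadOnlyCollection<string> files)
    {
        var provider = new SketchOptionsProvider();
        var tasks = files.Select(file => FormatOne(file, provider));
        await Task.WhenAll(tasks);
        return files.Count;
    }

    private static Task FormatOne(string file, SketchOptionsProvider provider)
    {
        // Placeholder for the real formatting work.
        return Task.Run(() => provider.GetConfigFor(file));
    }
}
```

With three input files, the per-file shape constructs three providers while the batched shape constructs one, which is the setup cost that grows with the number of paths.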

Solution

Individual file paths are now collected into a pending list during the input loop. After all inputs are processed:

  1. A common ancestor directory is computed from all file paths using a new GetCommonAncestor helper
  2. A single OptionsProvider is created, pre-seeded at the common ancestor. Config resolution still walks up from each file's own directory via its internal ConcurrentDictionary caches, so per-directory configs (e.g. different .csharpierrc in different subdirectories) are respected correctly.
  3. A single FormattingCache is shared across all files
  4. All files are formatted in parallel via Task.WhenAll, matching the directory code path

Directory inputs continue to work exactly as before.
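
For readers unfamiliar with the idea, a common-ancestor computation could look roughly like the sketch below. It is not the `GetCommonAncestor` helper from this PR, whose signature and edge-case handling are not shown here. The class name `CommonAncestorSketch` is hypothetical, and the sketch assumes paths already normalized to forward slashes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; assumes paths use '/' as the separator.
public static class CommonAncestorSketch
{
    // Returns the deepest directory shared by every file path,
    // or an empty string when there is no input or no shared directory.
    public static string GetCommonAncestor(IReadOnlyList<string> filePaths)
    {
        if (filePaths.Count == 0)
        {
            return string.Empty;
        }

        // The first file's directory segments seed the candidate prefix.
        var common = DirectorySegments(filePaths[0]);
        foreach (var path in filePaths.Skip(1))
        {
            var segments = DirectorySegments(path);
            var shared = 0;
            while (
                shared < common.Length
                && shared < segments.Length
                && string.Equals(common[shared], segments[shared], StringComparison.Ordinal)
            )
            {
                shared++;
            }

            // Keep only the leading segments both paths agree on.
            common = common.Take(shared).ToArray();
        }

        var joined = string.Join("/", common);
        // A lone leading empty segment means only the root "/" is shared.
        return joined.Length == 0 && common.Length > 0 ? "/" : joined;
    }

    private static string[] DirectorySegments(string filePath)
    {
        var segments = filePath.Split('/');
        // Drop the final segment, which is the file name.
        return segments.Take(segments.Length - 1).ToArray();
    }
}
```

For example, `/repo/src/a.cs` and `/repo/tests/b.cs` share `/repo`, while relative paths with no shared directory fall back to the empty string.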

Tests Added

  • Multiple_Files_Should_Be_Formatted — verifies files from different subdirectories are formatted
  • Multiple_Files_Should_Respect_Per_Directory_Config — verifies per-directory .csharpierrc is respected when using a shared OptionsProvider
  • Mixed_Files_And_Directory_Should_Format_All — verifies mixed directory + individual file inputs
  • GetCommonAncestor_Returns_Common_Path — 5 parameterized cases covering same dir, nested, divergent, root-only, and empty-string fallback
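
The property checked by the per-directory config test, that one shared provider still resolves each file's nearest config, can be pictured with the sketch below. `NearestConfigResolver` is a hypothetical class, not part of CSharpier. An in-memory dictionary stands in for config files on disk, paths are assumed to use forward slashes, and an empty string means no config was found.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical resolver; the dictionary stands in for config files on disk.
public sealed class NearestConfigResolver
{
    private readonly IReadOnlyDictionary<string, string> configsByDirectory;
    private readonly ConcurrentDictionary<string, string> cache = new();

    public NearestConfigResolver(IReadOnlyDictionary<string, string> configsByDirectory)
    {
        this.configsByDirectory = configsByDirectory;
    }

    // Resolves the config for a file, caching the answer per directory so a
    // single shared instance can serve files from many directories.
    public string Resolve(string filePath)
    {
        var directory = ParentOf(filePath);
        return this.cache.GetOrAdd(directory, this.FindFrom);
    }

    // Walks up from the given directory until a directory with a config is found.
    private string FindFrom(string directory)
    {
        for (var current = directory; current.Length > 0; current = ParentOf(current))
        {
            if (this.configsByDirectory.TryGetValue(current, out var config))
            {
                return config;
            }
        }

        // No config anywhere above this directory (the root itself is not checked).
        return string.Empty;
    }

    private static string ParentOf(string path)
    {
        var lastSlash = path.LastIndexOf('/');
        return lastSlash <= 0 ? string.Empty : path.Substring(0, lastSlash);
    }
}
```

With configs registered for `/repo` and `/repo/src/a`, a file under `/repo/src/a` resolves to the nested config and a file under `/repo/src/b` resolves to the one at `/repo`, even though both lookups go through the same resolver instance.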

When multiple individual file paths are passed to CSharpier (e.g. from
pre-commit hooks), each file was previously processed sequentially with
its own OptionsProvider and FormattingCache. This caused severe
performance degradation (~15x slower than directory input).

Refactored FormatPhysicalFiles to collect individual file paths, compute
their common ancestor directory, and create a single shared
OptionsProvider and FormattingCache. Files are then formatted in
parallel using Task.WhenAll, matching the directory input code path.

Added tests for multi-file formatting, per-directory config resolution,
mixed file/directory input, and the GetCommonAncestor helper method.
ansemb marked this pull request as ready for review February 17, 2026 12:27
ansemb changed the title from "perf: Batch individual file inputs for parallel formatting" to "perf: batch individual file inputs for parallel formatting" Feb 17, 2026