Skip to content

original implementation of spectronaut converter updates#122

Merged
tonywu1999 merged 14 commits intodevelfrom
feat-turnover
Apr 2, 2026
Merged

original implementation of spectronaut converter updates#122
tonywu1999 merged 14 commits intodevelfrom
feat-turnover

Conversation

@tonywu1999
Copy link
Copy Markdown
Contributor

@tonywu1999 tonywu1999 commented Apr 1, 2026

Motivation & Context

The Spectronaut converter lacked flexibility needed for protein-turnover workflows: peptide-sequence sources were fixed, isotope-label classification could not be driven by user-supplied heavy-label patterns, and intensity selection relied on a narrow enum-style mapping. This made it difficult to parse bracketed heavy-label annotations in modified peptide sequences (e.g., K[Lys6]) and to run conversions when Spectronaut column names or turnover-focused MS1 intensity columns were used.

Solution Summary

Introduce configurable peptide-sequence and isotope-label handling plus more robust intensity-column resolution:

  • Add a peptideSequenceColumn parameter (default "EG.ModifiedSequence") so callers can choose the peptide source column.
  • Add a heavyLabels parameter (default NULL) that, when provided, parses peptide sequences for bracketed labels and assigns IsotopeLabelType ("H"/"L"/NA) per peptide; when NULL, behavior defaults to treating all peptides as light.
  • Replace fragile enum-based intensity mapping with a resolver that accepts legacy enum names or arbitrary Spectronaut column names and fails if the resolved column is absent.
  • Improve input validation to require minimal necessary columns (FGCharge plus at least one protein identity column) and verify the resolved peptide column exists.
  • Dynamically select and rename protein/peptide source columns to standardized names used downstream.

Detailed Changes

  • R/clean_Spectronaut.R

    • Expanded signature:
      • .cleanRawSpectronaut(msstats_object, intensity, calculateAnomalyScores, anomalyModelFeatures, peptideSequenceColumn = "EG.ModifiedSequence", heavyLabels = NULL)
    • Validation:
      • .validateSpectronautInput(spec_input, peptideSequenceColumn) now requires FGCharge and at least one protein identity column (PGProteinGroups or PGProteinAccessions) and validates the standardized peptide sequence column.
    • Intensity resolution:
      • Replaced previous intensity mapping with .resolveSpectronautIntensityColumn(intensity, colnames(spec_input)), which:
        • Maps legacy enum names ("PeakArea","NormalizedPeakArea") to standardized Spectronaut columns
        • Accepts arbitrary Spectronaut column-name strings (after internal standardization)
        • Errors if resolved column not present
    • Column handling:
      • Added .addSpectronautColumnsIfMissing(spec_input) to synthesize missing but derivable columns prior to filtering
      • Dynamically choose protein column (PGProteinGroups preferred; fallback to PGProteinAccessions) and peptide column from peptideSequenceColumn; rename to ProteinName and PeptideSequence
    • Isotope labeling:
      • Added .assignSpectronautIsotopeLabelType(spec_input, heavyLabels) to populate IsotopeLabelType via regex parsing of PeptideSequence when heavyLabels provided; leaves data unchanged when heavyLabels is NULL
    • Control-flow changes:
      • Validate peptideSequenceColumn early, then synthesize missing Spectronaut columns, then filter (e.g., FFrgLossType == "noloss")
  • R/converters_SpectronauttoMSstatsFormat.R

    • Expanded public interface:
      • Added parameters peptideSequenceColumn = "EG.ModifiedSequence" and heavyLabels = NULL
      • Changed intensity default from c('PeakArea','NormalizedPeakArea','MS1Quantity') to 'PeakArea' (interface now accepts legacy enum names or raw column-name strings)
    • Forwarding:
      • New parameters forwarded into MSstatsClean/MSstatsConvert pipeline
    • Preprocessing behavior:
      • MSstatsPreprocess() conditional fill for IsotopeLabelType: only auto-fills to "L" when heavyLabels is NULL; when heavyLabels is provided, avoid forced overwrite
  • New/updated helper functions (R/clean_Spectronaut.R)

    • .addSpectronautColumnsIfMissing(spec_input)
    • .resolveSpectronautIntensityColumn(intensity, available_cols)
    • .assignSpectronautIsotopeLabelType(spec_input, heavyLabels)
  • Documentation updates

    • man/SpectronauttoMSstatsFormat.Rd: updated intensity docs and added peptideSequenceColumn and heavyLabels argument docs; default/intensity guidance updated (recommends FG.MS1Quantity for turnover)
    • man/MSstatsClean.Rd and man/dot-cleanRawSpectronaut.Rd: updated S4/method and internal docs to include new parameters and describe peptide-column standardization and heavyLabels behavior
    • man/DIANNtoMSstatsFormat.Rd: corrected runOrder description to reference DIA-NN
    • man/dot-cleanRawDIANN.Rd: added calculateAnomalyScores and anomalyModelFeatures documentation entries
  • Tests

    • inst/tinytest/test_clean_Spectronaut.R:
      • Adjusted successful-case intensity argument from "MS1Quantity" to "FG.MS1Quantity" to match resolved input column name
      • Adjusted failing-case expectation to check presence-based error message ("not found in input data") rather than legacy argument-validation regex
    • inst/tinytest/test_converters_SpectronauttoMSstatsFormat.R:
      • Enforced baseline invariant that spectronaut_raw$IsotopeLabelType == "L" in baseline input
      • Relaxed missing-column tests to expect silent execution for optional columns (F.ExcludedFromQuantification, F.FrgLossType) when NULL
      • Updated protein-column fallback tests: silent when PG.ProteinGroups is NULL (falls back to PG.ProteinAccessions); error only when both are absent
      • Added "Heavy Label Testing" block:
        • Runs converter on boxcar_protein_turnover_input.csv with heavyLabel = c("K[Lys6]") and asserts:
          • Output schema present and non-empty
          • ProteinName non-missing
          • Expected Condition values (0d, 8d, 32d)
          • Presence of both IsotopeLabelType values ("H" and "L")
          • Heavy-specific peptide sequences contain Lys6 only in H rows; L rows do not contain Lys6; NA rows (unlabeled) do not contain K
        • Repeats with heavyLabel = c("L[Leu6]") asserting that "H" does not appear when expecting only L-labeled peptides

Unit Tests Added/Modified

  • Modified tests to align with new intensity column naming and to check behavior when heavyLabels is provided vs NULL.
  • New heavy-label integration tests verify:
    • IsotopeLabelType assignment (H/L/NA) based on peptide-sequence bracketed labels
    • That heavy-label patterns are recognized per-peptide and that resulting IsotopeLabelType values appear in converted output
  • Tests also verify optional-column tolerance and protein-column fallback behavior.

Coding Guidelines / Violations

  • No explicit coding-guideline violations were detected in the provided diffs:
    • Parameter naming follows existing conventions
    • Internal helpers use dot-prefix naming consistent with package conventions
    • Column renaming uses data.table::setnames() appropriately
    • New functionality is covered by unit tests
  • Notes for reviewers:
    • The public default for the intensity parameter changed shape (from an enum vector to a single string). Reviewers should confirm callers expecting the previous default-enum behavior are updated accordingly.
    • Ensure exported function documentation and NAMESPACE align with the new parameters and signatures (S4 method docs were updated in Rd files; verify corresponding Rd/S4 registration and NAMESPACE entries are consistent).

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Extended Spectronaut-to-MSstats conversion: added peptideSequenceColumn and heavyLabels parameters, introduced dynamic intensity-column resolution and isotope-label assignment, adjusted input validation and renamed/standardized peptide/protein columns before downstream processing.

Changes

Cohort / File(s) Summary
Core Spectronaut Cleaning Logic
R/clean_Spectronaut.R
Added peptideSequenceColumn and heavyLabels params to .cleanRawSpectronaut(). Added helpers .addSpectronautColumnsIfMissing(), .resolveSpectronautIntensityColumn(), .assignSpectronautIsotopeLabelType(). Updated .validateSpectronautInput() and switched intensity mapping to dynamic resolution and standardized peptide/protein renaming.
Public Converter Interface
R/converters_SpectronauttoMSstatsFormat.R
Extended SpectronauttoMSstatsFormat() signature with peptideSequenceColumn and heavyLabels; changed intensity default to single "PeakArea" and adjusted preprocessing so IsotopeLabelType is not forced when heavyLabels is provided.
Tests
inst/tinytest/test_clean_Spectronaut.R, inst/tinytest/test_converters_SpectronauttoMSstatsFormat.R
Updated intensity-column reference in tests and error-message expectations. Added comprehensive heavy-label tests asserting isotope classification, presence/absence of labels in sequences, and expected condition values. Relaxed some missing-column error expectations to silent behavior where appropriate.
Documentation
man/MSstatsClean.Rd, man/SpectronauttoMSstatsFormat.Rd, man/dot-cleanRawSpectronaut.Rd, man/dot-cleanRawDIANN.Rd, man/DIANNtoMSstatsFormat.Rd
Documented new peptideSequenceColumn and heavyLabels params; reframed intensity docs to allow legacy enum or arbitrary Spectronaut column names (with standardization); minor DIANN runOrder wording tweak.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Converter as SpectronauttoMSstatsFormat()
    participant Cleaner as .cleanRawSpectronaut()
    participant Validator as .validateSpectronautInput()
    participant ColumnStd as .addSpectronautColumnsIfMissing()
    participant IntResolver as .resolveSpectronautIntensityColumn()
    participant IsoAssigner as .assignSpectronautIsotopeLabelType()
    participant Input as Spectronaut Input

    User->>Converter: call(intensity, peptideSequenceColumn, heavyLabels)
    Converter->>Cleaner: forward params + data
    Cleaner->>Validator: check required cols & peptideSequenceColumn
    Validator-->>Cleaner: validation result
    Cleaner->>ColumnStd: synthesize missing Spectronaut cols
    ColumnStd-->>Cleaner: augmented data
    Cleaner->>IntResolver: map/resolve intensity column
    IntResolver-->>Cleaner: resolved intensity column name
    Cleaner->>IsoAssigner: assign IsotopeLabelType (if heavyLabels)
    IsoAssigner-->>Cleaner: labeled data
    Cleaner-->>Converter: cleaned data
    Converter-->>User: MSstats-formatted output
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • mstaniak

Poem

🐰
I hopped through columns, dots and names,
Found heavy tags in bracketed frames.
I nudged intensity into line,
Matched peptides, labels — neat design.
A tiny hop for cleaner data, fine!

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is a blank template with no substantive content; all required sections (Motivation, Changes, Testing) lack implementation details. Complete all template sections: provide motivation/context and solution summary, detail the specific changes made, describe unit tests added/modified, and check off the completed checklist items.
Title check ❓ Inconclusive The title 'original implementation of spectronaut converter updates' is vague and generic, failing to convey the specific nature of the changes. Revise the title to be more specific and descriptive of the primary changes, such as 'Add peptide sequence and heavy label support to Spectronaut converter' or similar.
✅ Passed checks (1 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat-turnover

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
inst/tinytest/test_spectronaut_protein_turnover.R (1)

110-116: This check doesn't prove raw-column resolution picked the same intensity.

Comparing only nrow() will still pass if "FG.MS1Quantity" resolves to the wrong numeric column but preserves filtering. Compare the aligned Intensity values against output_ms1 instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@inst/tinytest/test_spectronaut_protein_turnover.R` around lines 110 - 116,
The test currently only checks row counts, which doesn't verify that the
"FG.MS1Quantity" column produced the same numeric intensities as the alias;
update the assertion to compare the Intensity values between output_raw_col and
output_ms1 instead of nrow(). Specifically, align/order both data.frames by the
same key columns (e.g. Run/Precursor/Charge/Peptide or the keys used in
SpectronauttoMSstatsFormat) or perform a join on those unique identifiers, then
compare the resulting Intensity vectors (from output_raw_col$Intensity and
output_ms1$Intensity) with expect_equal (or expect_equal with a small tolerance
if numeric rounding is possible). Ensure you reference the
SpectronauttoMSstatsFormat call that created output_raw_col and the expected
output_ms1 in the updated assertion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@R/clean_Spectronaut.R`:
- Around line 46-54: The subset of spec_input currently builds cols then
intersects with colnames(spec_input), which unintentionally drops the
standardized labelColumn and PEPStrippedSequence when heavyLabel is enabled;
update the cols construction used before spec_input = spec_input[, cols, with =
FALSE] (and the similar projections at the other affected sites) to always
include labelColumn and "PEPStrippedSequence" (in addition to
anomalyModelFeatures when calculateAnomalyScores is TRUE) so the heavy-label
classifier functions correctly, and ensure downstream classification logic uses
labelColumn rather than PeptideSequence when heavyLabel is TRUE.
- Around line 238-256: The current validation in clean_Spectronaut.R relaxes
required columns too far; update the checks so that required_columns includes
core fields RFileName and RCondition (in addition to FGCharge) and add a
separate check that at least one of EGModifiedSequence or FGLabeledSequence is
present; do not remove or bypass validation for fields other than the four that
.addMissingSpectronautColumns synthesizes (FFrgLossType,
FExcludedFromQuantification, FFrgIon, FCharge). Locate the
required_columns/missing_columns logic and the protein-name check (symbols
required_columns, missing_columns, setdiff, getOption("MSstatsLog")) and modify
it to enforce RFileName and RCondition and the
EGModifiedSequence|FGLabeledSequence existence before proceeding.

In `@R/converters_SpectronauttoMSstatsFormat.R`:
- Around line 75-78: The current branch silently accepts a vector for the
intensity argument; instead change the logic in the SpectronauttoMSstatsFormat
converter to treat an omitted argument separately (use missing(intensity) to
apply the default) and reject explicit multi-value inputs: if missing(intensity)
keep the current default-selection behavior, otherwise require length(intensity)
== 1 and call stop() with a clear error if length > 1 so callers cannot
accidentally pass multiple channels.

---

Nitpick comments:
In `@inst/tinytest/test_spectronaut_protein_turnover.R`:
- Around line 110-116: The test currently only checks row counts, which doesn't
verify that the "FG.MS1Quantity" column produced the same numeric intensities as
the alias; update the assertion to compare the Intensity values between
output_raw_col and output_ms1 instead of nrow(). Specifically, align/order both
data.frames by the same key columns (e.g. Run/Precursor/Charge/Peptide or the
keys used in SpectronauttoMSstatsFormat) or perform a join on those unique
identifiers, then compare the resulting Intensity vectors (from
output_raw_col$Intensity and output_ms1$Intensity) with expect_equal (or
expect_equal with a small tolerance if numeric rounding is possible). Ensure you
reference the SpectronauttoMSstatsFormat call that created output_raw_col and
the expected output_ms1 in the updated assertion.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1ec6a1bb-a0f5-4f05-9f1e-acc710829075

📥 Commits

Reviewing files that changed from the base of the PR and between 5360aaf and cabe7fa.

⛔ Files ignored due to path filters (2)
  • inst/tinytest/raw_data/Spectronaut/boxcar_protein_turnover_input.csv is excluded by !**/*.csv
  • inst/tinytest/raw_data/Spectronaut/run_order.csv is excluded by !**/*.csv
📒 Files selected for processing (5)
  • R/clean_Spectronaut.R
  • R/converters_SpectronauttoMSstatsFormat.R
  • inst/tinytest/test_clean_Spectronaut.R
  • inst/tinytest/test_converters_SpectronauttoMSstatsFormat.R
  • inst/tinytest/test_spectronaut_protein_turnover.R

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
R/clean_Spectronaut.R (1)

62-64: ⚠️ Potential issue | 🟠 Major

Keep RFileName in the required-column set.

RFileName is still renamed to Run on Line 43, so dropping it from required_columns just lets malformed inputs pass boundary validation and fail later during annotation/preprocess with a less actionable error.

Suggested fix
-    required_columns = c("FGCharge")
+    required_columns = c("FGCharge", "RFileName")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/clean_Spectronaut.R` around lines 62 - 64, The required-columns check
currently only lists "FGCharge" so malformed inputs missing the original
filename column slip through; add "RFileName" to the required_columns vector
(the same name that is later renamed to "Run" when processing spec_input) so
missing RFileName is caught early by the missing_columns check and reported
before downstream annotation/preprocess (update the required_columns definition
used when computing missing_columns on spec_input).
R/converters_SpectronauttoMSstatsFormat.R (1)

56-57: ⚠️ Potential issue | 🔴 Critical

Default intensity still reaches cleaning as a length-2 vector.

Omitting intensity leaves c("PeakArea", "NormalizedPeakArea") intact here, and the commented-out guard means that vector is forwarded unchanged into .resolveSpectronautIntensityColumn(), which expects a scalar. The default/example call therefore errors before cleaning starts.

Suggested fix
 SpectronauttoMSstatsFormat = function(
         input, annotation = NULL,
-        intensity = c('PeakArea', 'NormalizedPeakArea'),
+        intensity = "PeakArea",
         peptideSequenceColumn = "EG.ModifiedSequence",
         heavyLabels = NULL,
@@
 ) {
-    # # Standardize the intensity value when it is a raw column name so that the
-    # # legacy match.arg() inside .cleanRawSpectronaut sees it correctly.
-    # known_aliases = c('PeakArea', 'NormalizedPeakArea')
-    # if (length(intensity) > 1) {
-    #     # No value supplied: use first (default)
-    #     intensity = intensity[1]
-    # }
-    # if (!(intensity %in% known_aliases)) {
-    #     # Treat as a raw column name and standardize it
-    #     intensity = .standardizeColnames(intensity)
-    # }
+    if (length(intensity) != 1L) {
+        stop("'intensity' must be a single value.")
+    }

Also applies to: 71-81, 119-123

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@R/clean_Spectronaut.R`:
- Around line 49-50: The call to .assignSpectronautIsotopeLabelType is passing
msstats_object (fourth arg) but the function signature accepts only three
params, causing an error and preventing the early return when heavyLabels is
NULL; remove the extra msstats_object argument from the calls that pass
(spec_input, heavyLabels, peptideSequenceColumn) and add or ensure an early
return inside .assignSpectronautIsotopeLabelType when heavyLabels is NULL so it
exits before pattern logic runs; also fix the fcase usage inside
.assignSpectronautIsotopeLabelType by either defining heavy_pattern_escaped and
bare_pattern from heavyLabels (escaping regex) before they are referenced or by
rewriting the fcase branches to not reference those undefined symbols; apply the
same fixes to the duplicate block around the other occurrence (lines ~189-207)
where .assignSpectronautIsotopeLabelType is invoked or similar pattern logic
exists.
- Around line 79-83: The presence check for the peptide column is inverted: in
the block using .standardizeColnames(peptideSequenceColumn) and
colnames(spec_input) you currently stop when the column exists; change the
predicate to trigger only when the standardized peptideSequenceColumn is NOT in
colnames(spec_input). Update the error path (getOption("MSstatsLog")("ERROR",
msg) and stop(msg)) to run under that negated condition and keep the existing
message (optionally fix grammar to "column is missing" or list missing columns)
so valid Spectronaut inputs are not rejected by clean_Spectronaut.R.

---

Duplicate comments:
In `@R/clean_Spectronaut.R`:
- Around line 62-64: The required-columns check currently only lists "FGCharge"
so malformed inputs missing the original filename column slip through; add
"RFileName" to the required_columns vector (the same name that is later renamed
to "Run" when processing spec_input) so missing RFileName is caught early by the
missing_columns check and reported before downstream annotation/preprocess
(update the required_columns definition used when computing missing_columns on
spec_input).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 18507982-2d6a-43a6-b051-230568d6ff63

📥 Commits

Reviewing files that changed from the base of the PR and between cabe7fa and 2708b0a.

📒 Files selected for processing (7)
  • R/clean_Spectronaut.R
  • R/converters_SpectronauttoMSstatsFormat.R
  • man/DIANNtoMSstatsFormat.Rd
  • man/MSstatsClean.Rd
  • man/SpectronauttoMSstatsFormat.Rd
  • man/dot-cleanRawDIANN.Rd
  • man/dot-cleanRawSpectronaut.Rd
✅ Files skipped from review due to trivial changes (1)
  • man/DIANNtoMSstatsFormat.Rd

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
R/clean_Spectronaut.R (1)

49-51: ⚠️ Potential issue | 🔴 Critical

Function call passes 4 arguments but .assignSpectronautIsotopeLabelType only accepts 3.

The call includes msstats_object as the 4th argument, but the function signature at lines 189-190 only defines 3 parameters (spec_input, heavyLabels, peptideSequenceColumn). This will cause an "unused argument" error at runtime when heavyLabels is non-NULL.

Remove the extra msstats_object argument:

  spec_input = .assignSpectronautIsotopeLabelType(
-    spec_input, heavyLabels, peptideSequenceColumn, msstats_object)
+    spec_input, heavyLabels, peptideSequenceColumn)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/clean_Spectronaut.R` around lines 49 - 51, The call to
.assignSpectronautIsotopeLabelType passes a fourth argument msstats_object while
the function .assignSpectronautIsotopeLabelType only accepts three parameters
(spec_input, heavyLabels, peptideSequenceColumn); remove the extra
msstats_object argument from the call where spec_input is assigned (the
invocation at spec_input = .assignSpectronautIsotopeLabelType(...)) so the call
matches the function signature and avoids the "unused argument" runtime error.
🧹 Nitpick comments (1)
R/clean_Spectronaut.R (1)

152-156: Unnecessary collapse argument in paste0.

The collapse parameter only affects vectors being collapsed into a single string. Here it's applied to a single concatenated string, making it dead code.

  if (!(resolved %in% available_cols)) {
    stop(paste0(
-      "Intensity column '", intensity, "' not found in input data. ", collapse = ", ")
+      "Intensity column '", intensity, "' not found in input data.")
    )
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/clean_Spectronaut.R` around lines 152 - 156, The stop() call inside the
check for resolved in available_cols is using paste0(..., collapse = ",") which
is unnecessary; remove the collapse argument from the paste0 invocation (or
switch to paste if you intended to collapse a vector) so the stop message is
created simply via paste0("Intensity column '", intensity, "' not found in input
data. "); update the call around the variables resolved, available_cols, and
intensity accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@R/clean_Spectronaut.R`:
- Around line 79-84: The error message built when peptideSequenceColumn is
missing uses incorrect grammar; update the string in the check that uses
.standardizeColnames(peptideSequenceColumn) %in% colnames(spec_input) so that
msg reads "The following column is missing from the input data: " (singular "is
missing") instead of "are missing"; ensure the same corrected msg is passed to
getOption("MSstatsLog")("ERROR", msg) and stop(msg).
- Around line 179-190: The function .assignSpectronautIsotopeLabelType currently
documents a 4th parameter msstats_object and accepts an unused parameter
peptideSequenceColumn while the body uses the renamed PeptideSequence column
directly; remove the unused argument peptideSequenceColumn from the signature
and from any calls, update the roxygen block to remove msstats_object so the
params match the function, and ensure the function still references
PeptideSequence and sets/updates IsotopeLabelType as before; also update any
call sites that passed peptideSequenceColumn (e.g., the callers that invoked
.assignSpectronautIsotopeLabelType) to call the new 2-parameter signature
(spec_input, heavyLabels).
- Around line 196-208: The light/heavy detection fails because bare_amino_acids
is taken from heavyLabels without removing digits or brackets; change the
extraction to strip brackets and trailing digits so bare_amino_acids is just the
alphabetic residue name (e.g. replace sub("\\[.*\\]", "", heavyLabels) with
something like sub("\\d+$", "", gsub("^\\[|\\]$", "", heavyLabels))). Keep
heavy_brackets_escaped_pattern built from the original heavyLabels (escaped for
[]), build bare_amino_acids_pattern from the cleaned alphabetic names, and leave
the spec_input[, IsotopeLabelType := data.table::fcase(...)] logic intact so
grepl(heavy_brackets_escaped_pattern, PeptideSequence, perl=TRUE) finds
bracketed labels and grepl(bare_amino_acids_pattern, PeptideSequence, perl=TRUE)
finds unlabeled residues.

---

Duplicate comments:
In `@R/clean_Spectronaut.R`:
- Around line 49-51: The call to .assignSpectronautIsotopeLabelType passes a
fourth argument msstats_object while the function
.assignSpectronautIsotopeLabelType only accepts three parameters (spec_input,
heavyLabels, peptideSequenceColumn); remove the extra msstats_object argument
from the call where spec_input is assigned (the invocation at spec_input =
.assignSpectronautIsotopeLabelType(...)) so the call matches the function
signature and avoids the "unused argument" runtime error.

---

Nitpick comments:
In `@R/clean_Spectronaut.R`:
- Around line 152-156: The stop() call inside the check for resolved in
available_cols is using paste0(..., collapse = ",") which is unnecessary; remove
the collapse argument from the paste0 invocation (or switch to paste if you
intended to collapse a vector) so the stop message is created simply via
paste0("Intensity column '", intensity, "' not found in input data. "); update
the call around the variables resolved, available_cols, and intensity
accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 10170c54-3dbe-489b-9b3c-b224e81d885c

📥 Commits

Reviewing files that changed from the base of the PR and between 2708b0a and c8c52dc.

📒 Files selected for processing (3)
  • R/clean_Spectronaut.R
  • inst/tinytest/test_clean_Spectronaut.R
  • inst/tinytest/test_converters_SpectronauttoMSstatsFormat.R
✅ Files skipped from review due to trivial changes (1)
  • inst/tinytest/test_clean_Spectronaut.R
🚧 Files skipped from review as they are similar to previous changes (1)
  • inst/tinytest/test_converters_SpectronauttoMSstatsFormat.R

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (3)
R/clean_Spectronaut.R (2)

161-177: ⚠️ Potential issue | 🟠 Major

Documentation and implementation mismatch on heavyLabels format.

The roxygen documentation (lines 168-174, 180-181) describes heavyLabels as the label name alone (e.g., "Lys6"), but the implementation at line 194 uses sub("\\[.*\\]", "", heavyLabels) which expects a format like "K[Lys6]" (amino acid letter + bracketed label). The test file confirms the bracketed format is expected (heavyLabel = c("K[Lys6]")).

With input "Lys6" (as documented), bare_amino_acids becomes "Lys6" (no change), and the light classification logic breaks because it looks for literal "Lys6" in sequences instead of "K".

Update the documentation to match the actual expected format:

📝 Proposed documentation fix
-#' When \code{heavyLabel} is provided, each row is classified as heavy
+#' When \code{heavyLabels} is provided, each row is classified as heavy
 #' (\code{"H"}), light (\code{"L"}), or unlabeled (\code{NA}) by inspecting
 #' the labeled sequence column for the presence of the label tag.
 #'
 #' In Spectronaut protein turnover reports, heavy peptides appear in
 #' \code{FG.LabeledSequence} with a bracketed modification, e.g.
-#' \code{_PEPTIDEK[Lys6]_}.  Any sequence that contains
-#' \code{[<heavyLabel>]} is classified as heavy; all others are light.
-#' Sequences that do not have amino acids that can carry the label
-#' are classified as \code{NA}.  For example, if \code{heavyLabels} is 
-#' \code{"Lys6"}, then \code{PEPTIDEZ} is classified as NA since it
-#' has no lysine residues that could be labeled.
+#' \code{_PEPTIDEK[Lys6]_}.  The \code{heavyLabels} parameter should be
+#' specified as the amino acid letter followed by the bracketed label,
+#' e.g. \code{"K[Lys6]"}.  Sequences containing the bracketed label are
+#' classified as heavy; sequences containing the bare amino acid (e.g. "K")
+#' without the label are light; sequences lacking the amino acid entirely
+#' are classified as \code{NA}.

At lines 180-181:

-#' `@param` heavyLabels Character scalar heavy label name (e.g. \code{"Lys6"}),
+#' `@param` heavyLabels Character vector of heavy labels in format amino acid + bracketed
+#'   modification (e.g. \code{"K[Lys6]"}, \code{"R[Arg10]"}),
 #'   or \code{NULL}.

Also applies to: 194-206

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/clean_Spectronaut.R` around lines 161 - 177, The docs and implementation
disagree about heavyLabels format: the code in clean_Spectronaut.R (logic around
heavyLabels, sub("\\[.*\\]", "", heavyLabels), and FG.LabeledSequence
classification) expects labels like "K[Lys6]" (amino-acid plus bracketed label)
but the roxygen text claims just "Lys6"; update the documentation to specify
that heavyLabels must include the residue letter with the bracketed modification
(e.g., "K[Lys6]") and note that the function strips the bracketed part to
determine the residue (used by the IsotopeLabelType assignment and light/heavy
classification), or alternatively change the implementation to accept bare
labels by mapping residue letters from a provided label-to-residue mapping if
you want to keep "Lys6" syntax—adjust the roxygen examples and parameter
description for heavyLabels and any examples/tests to match the chosen format.

79-84: ⚠️ Potential issue | 🟡 Minor

Minor grammar issue in error message.

"The following column are missing" should be "The following column is missing" for proper subject-verb agreement.

✏️ Proposed fix
     if (!(.standardizeColnames(peptideSequenceColumn) %in% colnames(spec_input))) {
-        msg = paste("The following column are missing from the input data:",
+        msg = paste("The following column is missing from the input data:",
                     peptideSequenceColumn)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/clean_Spectronaut.R` around lines 79 - 84, Update the error message string
in the check that verifies peptideSequenceColumn exists in spec_input: the code
block using .standardizeColnames(peptideSequenceColumn) and
getOption("MSstatsLog") currently builds msg with "The following column are
missing from the input data:" — change this to "The following column is missing
from the input data:" (or pluralize based on context if you intentionally expect
multiple columns) so the subject-verb agreement is correct when stop(msg) is
called.
R/converters_SpectronauttoMSstatsFormat.R (1)

54-58: ⚠️ Potential issue | 🟡 Minor

Multi-value intensity input still silently uses first element.

While the default is now a single value, if a caller accidentally passes intensity = c("PeakArea", "NormalizedPeakArea"), R's vectorized behavior will silently proceed with only the first element. The past review suggested using missing(intensity) for the omitted-argument path and explicitly requiring a scalar.

🛡️ Proposed validation in function body after line 70
 ) {
+    if (length(intensity) != 1L) {
+        stop("'intensity' must be a single value.")
+    }
 
     validation_config = list(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/converters_SpectronauttoMSstatsFormat.R` around lines 54 - 58, The function
SpectronauttoMSstatsFormat currently accepts a vector for the intensity argument
and will silently use only the first element; add an explicit validation near
the start of the function (after the initial argument parsing, e.g., after line
~70) that uses missing(intensity) to detect omitted argument and otherwise
checks that length(intensity) == 1 and is.character/introspect the type, and if
not, stop() with a clear error message instructing callers to pass a single
intensity name (e.g., "PeakArea"); reference the parameter intensity and
function SpectronauttoMSstatsFormat when implementing this guard.
🧹 Nitpick comments (1)
R/clean_Spectronaut.R (1)

152-156: Unnecessary collapse argument in paste0().

The collapse parameter in paste0() at line 154 is ineffective here since there's only a single string being constructed. It doesn't cause an error but is misleading.

✏️ Proposed fix
   if (!(resolved %in% available_cols)) {
     stop(paste0(
-      "Intensity column '", intensity, "' not found in input data. ", collapse = ", ")
-    )
+      "Intensity column '", intensity, "' not found in input data."))
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/clean_Spectronaut.R` around lines 152 - 156, The stop() call that
constructs the error message using paste0 when checking if resolved is in
available_cols includes an unnecessary collapse argument; remove the collapse =
", " from the paste0(...) in the conditional that references resolved,
available_cols and intensity so the message is built with a single paste0 call
(or simply concatenate the strings) rather than using the ineffective collapse
parameter.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@inst/tinytest/test_converters_SpectronauttoMSstatsFormat.R`:
- Around line 392-398: The test calls to SpectronauttoMSstatsFormat use the
wrong argument name `heavyLabel`; update those calls to use the correct
parameter name `heavyLabels` so the function receives the intended vector (e.g.,
change `heavyLabel = c("K[Lys6]")` to `heavyLabels = c("K[Lys6]")` in both test
invocations), then re-run the tests to ensure the heavy-label classification
logic in SpectronauttoMSstatsFormat is actually exercised.
- Line 19: The test is asserting against the wrong object; change the assertion
to check the converted result's column instead of the raw input. Replace the use
of spectronaut_raw$IsotopeLabelType with output$IsotopeLabelType in the
expect_true(all(...)) call (and optionally ensure output$IsotopeLabelType
exists/non-NULL before asserting) so the test validates the conversion that adds
IsotopeLabelType.

In `@man/SpectronauttoMSstatsFormat.Rd`:
- Around line 53-63: The docs for the heavyLabels parameter are misleading:
update the roxygen param documentation in
R/converters_SpectronauttoMSstatsFormat.R (the source for
man/SpectronauttoMSstatsFormat.Rd) to show the actual expected format (e.g. use
examples like c("K[Lys6]") or c("K[Lys6]", "R[Arg10]") and state that labels
must match the bracketed substring in the peptide sequence), preserving the
explanation of how peptides are classified (IsotopeLabelType "H"/"L"/NA) and
that NULL defaults to all "L"; run roxygenize to regenerate the .Rd.

In `@R/converters_SpectronauttoMSstatsFormat.R`:
- Around line 15-25: The documentation for heavyLabels is misleading: the
implementation in .assignSpectronautIsotopeLabelType expects labels with the
amino-acid prefix plus the bracketed isotope (e.g. "K[Lys6]" or
c("K[Lys6]","R[Arg10]")) rather than bare names like "Lys6"; update the
parameter description for heavyLabels to state the required format (amino-acid
letter immediately followed by the bracketed label), provide corrected examples
matching the test (e.g. heavyLabels = c("K[Lys6]","R[Arg10]")), and keep the
existing behavior notes about classification to "H"/"L"/NA and the NULL default.

---

Duplicate comments:
In `@R/clean_Spectronaut.R`:
- Around line 161-177: The docs and implementation disagree about heavyLabels
format: the code in clean_Spectronaut.R (logic around heavyLabels,
sub("\\[.*\\]", "", heavyLabels), and FG.LabeledSequence classification) expects
labels like "K[Lys6]" (amino-acid plus bracketed label) but the roxygen text
claims just "Lys6"; update the documentation to specify that heavyLabels must
include the residue letter with the bracketed modification (e.g., "K[Lys6]") and
note that the function strips the bracketed part to determine the residue (used
by the IsotopeLabelType assignment and light/heavy classification), or
alternatively change the implementation to accept bare labels by mapping residue
letters from a provided label-to-residue mapping if you want to keep "Lys6"
syntax—adjust the roxygen examples and parameter description for heavyLabels and
any examples/tests to match the chosen format.
- Around line 79-84: Update the error message string in the check that verifies
peptideSequenceColumn exists in spec_input: the code block using
.standardizeColnames(peptideSequenceColumn) and getOption("MSstatsLog")
currently builds msg with "The following column are missing from the input
data:" — change this to "The following column is missing from the input data:"
(or pluralize based on context if you intentionally expect multiple columns) so
the subject-verb agreement is correct when stop(msg) is called.

In `@R/converters_SpectronauttoMSstatsFormat.R`:
- Around line 54-58: The function SpectronauttoMSstatsFormat currently accepts a
vector for the intensity argument and will silently use only the first element;
add an explicit validation near the start of the function (after the initial
argument parsing, e.g., after line ~70) that uses missing(intensity) to detect
omitted argument and otherwise checks that length(intensity) == 1 and
is.character/introspect the type, and if not, stop() with a clear error message
instructing callers to pass a single intensity name (e.g., "PeakArea");
reference the parameter intensity and function SpectronauttoMSstatsFormat when
implementing this guard.

---

Nitpick comments:
In `@R/clean_Spectronaut.R`:
- Around line 152-156: The stop() call that constructs the error message using
paste0 when checking if resolved is in available_cols includes an unnecessary
collapse argument; remove the collapse = ", " from the paste0(...) in the
conditional that references resolved, available_cols and intensity so the
message is built with a single paste0 call (or simply concatenate the strings)
rather than using the ineffective collapse parameter.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bccb970a-19d9-491f-9205-f00e12fa0456

📥 Commits

Reviewing files that changed from the base of the PR and between c8c52dc and a36b04c.

⛔ Files ignored due to path filters (1)
  • inst/tinytest/raw_data/Spectronaut/boxcar_protein_turnover_input.csv is excluded by !**/*.csv
📒 Files selected for processing (4)
  • R/clean_Spectronaut.R
  • R/converters_SpectronauttoMSstatsFormat.R
  • inst/tinytest/test_converters_SpectronauttoMSstatsFormat.R
  • man/SpectronauttoMSstatsFormat.Rd

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (3)
R/clean_Spectronaut.R (3)

79-84: ⚠️ Potential issue | 🟡 Minor

Grammar issue: "are" should be "is".

Since this error message refers to a single column, use singular verb agreement.

     if (!(.standardizeColnames(peptideSequenceColumn) %in% colnames(spec_input))) {
-        msg = paste("The following column are missing from the input data:",
+        msg = paste("The following column is missing from the input data:",
                    peptideSequenceColumn)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/clean_Spectronaut.R` around lines 79 - 84, The error message uses plural
verb "are" but refers to a single column; update the message built in the block
that checks .standardizeColnames(peptideSequenceColumn) against
colnames(spec_input) to use singular agreement (e.g., "The following column is
missing from the input data:") and ensure this updated string is passed to
getOption("MSstatsLog")("ERROR", ...) and stop(...), referencing the
peptideSequenceColumn variable so the logged/stopped text remains informative.

191-203: ⚠️ Potential issue | 🟠 Major

Heavy-label pattern logic is still broken.

When heavyLabels = "Lys6" (without brackets), sub("\\[.*\\]", "", heavyLabels) does nothing, so bare_amino_acids = "Lys6". Similarly, heavy_brackets_escaped_pattern also resolves to "Lys6" since there are no brackets to escape. Both patterns match the same substring, and neither correctly identifies bracketed labels like [Lys6] vs. bare amino acids like K.

Per the docstring, the function should:

  1. Detect [Lys6] (bracketed) → Heavy
  2. Detect bare K (lysine residue without label) → Light
  3. No lysine → NA

Fix:

-.assignSpectronautIsotopeLabelType = function(spec_input, heavyLabels) {
-    IsotopeLabelType = PeptideSequence = NULL
-    if (is.null(heavyLabels)) {
-        return(spec_input)
-    }
-    
-    bare_amino_acids = sub("\\[.*\\]", "", heavyLabels)
-    bare_amino_acids_pattern = paste0(bare_amino_acids, collapse = "|")
-    heavy_pattern = paste0(heavyLabels, collapse = "|")
-    heavy_brackets_escaped_pattern = paste(
-        gsub("([\\[\\]])", "\\\\\\1", heavy_pattern, perl = TRUE),
-        collapse = "|"
-    )
+.assignSpectronautIsotopeLabelType = function(spec_input, heavyLabels) {
+    IsotopeLabelType = PeptideSequence = NULL
+    if (is.null(heavyLabels)) {
+        return(spec_input)
+    }
+    
+    # Map 3-letter amino acid codes to single-letter codes
+    aa_map = c(Lys = "K", Arg = "R", Leu = "L", Ile = "I", 
+               Val = "V", Phe = "F", Tyr = "Y", Trp = "W")
+    # Extract amino acid name (e.g., "Lys" from "Lys6") and map to single letter
+    aa_names = sub("[0-9]+$", "", heavyLabels)
+    bare_amino_acids = aa_map[aa_names]
+    bare_amino_acids_pattern = paste0(bare_amino_acids, collapse = "|")
+    # Build pattern to match bracketed labels: [Lys6]
+    heavy_brackets_escaped_pattern = paste0(
+        "\\[", heavyLabels, "\\]", collapse = "|"
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/clean_Spectronaut.R` around lines 191 - 203, The current logic treats
"Lys6" and "[Lys6]" the same; fix by (1) deriving the one-letter residue(s) from
heavyLabels (e.g., convert "Lys6" -> "K") into bare_amino_acids (remove numeric
suffix and map full AA name to one-letter), (2) build
heavy_brackets_escaped_pattern to match bracketed labels explicitly by wrapping
each heavy label in escaped brackets (e.g., paste0("\\[",
gsub("([\\[\\]])","\\\\\\\\\\1", heavyLabel), "\\]")), and (3) make
bare_amino_acids_pattern match the residue letter only when not inside brackets
(use a negative lookbehind like (?<!\\[) or an equivalent regex) so the fcase on
spec_input[, IsotopeLabelType := ...] correctly sets "H" for bracketed labels,
"L" for bare residue letters, and NA otherwise; update references: heavyLabels,
bare_amino_acids, bare_amino_acids_pattern, heavy_brackets_escaped_pattern, and
the spec_input[, IsotopeLabelType := data.table::fcase(...)].

61-85: ⚠️ Potential issue | 🟠 Major

Missing validation for RFileName and RCondition.

These columns are used in line 31 and renamed in line 44. If absent, setnames() fails with a cryptic error. Add them to required_columns so callers receive a clear diagnostic.

 .validateSpectronautInput = function(spec_input, peptideSequenceColumn) {
-    # Only FGCharge is truly required
-    required_columns = c("FGCharge")
+    required_columns = c("FGCharge", "RFileName", "RCondition")
     missing_columns = setdiff(required_columns, colnames(spec_input))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/clean_Spectronaut.R` around lines 61 - 85, The validation in
.validateSpectronautInput currently only requires "FGCharge" and misses
RFileName and RCondition which are later used/renamed (causing setnames() to
fail); update the required_columns vector in .validateSpectronautInput to
include "RFileName" and "RCondition", adjust the missing_columns/error messages
accordingly so callers get a clear diagnostic before setnames() is called.
🧹 Nitpick comments (1)
R/clean_Spectronaut.R (1)

152-156: Unused collapse argument and unhelpful error message.

paste0() with a single string doesn't use collapse. Consider listing available intensity-like columns to help users diagnose the issue.

   if (!(resolved %in% available_cols)) {
-    stop(paste0(
-      "Intensity column '", intensity, "' not found in input data. ", collapse = ", ")
-    )
+    stop(paste0(
+      "Intensity column '", intensity, "' (resolved to '", resolved, 
+      "') not found in input data."))
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/clean_Spectronaut.R` around lines 152 - 156, The stop message in the block
that checks if (resolved %in% available_cols) is unhelpful and incorrectly uses
collapse in paste0; update the error to show the requested intensity name
(intensity or resolved) and list available intensity-like columns from
available_cols so users can diagnose the mismatch. Replace the current
paste0(...) call with a construction that concatenates a clear message (e.g.,
"Intensity column '... ' not found") and then appends the vector of
available_cols collapsed with ", " (using paste(..., collapse = ", ")), ensuring
you remove the unused collapse argument and reference resolved/intensity and
available_cols in the message.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@R/clean_Spectronaut.R`:
- Around line 79-84: The error message uses plural verb "are" but refers to a
single column; update the message built in the block that checks
.standardizeColnames(peptideSequenceColumn) against colnames(spec_input) to use
singular agreement (e.g., "The following column is missing from the input
data:") and ensure this updated string is passed to
getOption("MSstatsLog")("ERROR", ...) and stop(...), referencing the
peptideSequenceColumn variable so the logged/stopped text remains informative.
- Around line 191-203: The current logic treats "Lys6" and "[Lys6]" the same;
fix by (1) deriving the one-letter residue(s) from heavyLabels (e.g., convert
"Lys6" -> "K") into bare_amino_acids (remove numeric suffix and map full AA name
to one-letter), (2) build heavy_brackets_escaped_pattern to match bracketed
labels explicitly by wrapping each heavy label in escaped brackets (e.g.,
paste0("\\[", gsub("([\\[\\]])","\\\\\\\\\\1", heavyLabel), "\\]")), and (3)
make bare_amino_acids_pattern match the residue letter only when not inside
brackets (use a negative lookbehind like (?<!\\[) or an equivalent regex) so the
fcase on spec_input[, IsotopeLabelType := ...] correctly sets "H" for bracketed
labels, "L" for bare residue letters, and NA otherwise; update references:
heavyLabels, bare_amino_acids, bare_amino_acids_pattern,
heavy_brackets_escaped_pattern, and the spec_input[, IsotopeLabelType :=
data.table::fcase(...)].
- Around line 61-85: The validation in .validateSpectronautInput currently only
requires "FGCharge" and misses RFileName and RCondition which are later
used/renamed (causing setnames() to fail); update the required_columns vector in
.validateSpectronautInput to include "RFileName" and "RCondition", adjust the
missing_columns/error messages accordingly so callers get a clear diagnostic
before setnames() is called.

---

Nitpick comments:
In `@R/clean_Spectronaut.R`:
- Around line 152-156: The stop message in the block that checks if (resolved
%in% available_cols) is unhelpful and incorrectly uses collapse in paste0;
update the error to show the requested intensity name (intensity or resolved)
and list available intensity-like columns from available_cols so users can
diagnose the mismatch. Replace the current paste0(...) call with a construction
that concatenates a clear message (e.g., "Intensity column '... ' not found")
and then appends the vector of available_cols collapsed with ", " (using
paste(..., collapse = ", ")), ensuring you remove the unused collapse argument
and reference resolved/intensity and available_cols in the message.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d0307056-91f8-4375-9f52-abb56bf0040f

📥 Commits

Reviewing files that changed from the base of the PR and between a36b04c and 0a6ada2.

📒 Files selected for processing (2)
  • R/clean_Spectronaut.R
  • inst/tinytest/test_converters_SpectronauttoMSstatsFormat.R
🚧 Files skipped from review as they are similar to previous changes (1)
  • inst/tinytest/test_converters_SpectronauttoMSstatsFormat.R

@tonywu1999 tonywu1999 merged commit e4b78fe into devel Apr 2, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant