Feat turnover by tonywu1999 · Pull Request #186 · Vitek-Lab/MSstats

tonywu1999 · 2026-04-02T21:43:09Z

Motivation and Context

Please include relevant motivation and context of the problem along with a short summary of the solution.

Changes

Please provide a detailed bullet point list of your changes.

Testing

Please describe any unit tests you added or modified to verify your changes.

Checklist Before Requesting a Review

I have read the MSstats contributing guidelines
My changes generate no new warnings
Any dependent changes have been merged and published in downstream modules
I have run the devtools::document() command after my changes and committed the added files

… compatibility

coderabbitai · 2026-04-02T21:43:17Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c83fe273-13c4-48d8-b378-433ef847c358

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat-turnover

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-04-02T21:44:15Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Possible Issue `.getNonMissingFilter()` can return an undefined `nonmissing_filter` for some paths (e.g., `impute=TRUE` with `censored_symbol=NULL`, or `censored_symbol` values other than `0`/`NA`). Consider initializing a default filter and handling unexpected `censored_symbol` values explicitly to avoid runtime errors and inconsistent missingness logic. .getNonMissingFilter = function(input, impute, censored_symbol) { if (impute) { if (!is.null(censored_symbol)) { if (censored_symbol == "0") { nonmissing_filter = !is.na(input$newABUNDANCE) & input$newABUNDANCE != 0 } else if (censored_symbol == "NA") { nonmissing_filter = !is.na(input$newABUNDANCE) } } } else { nonmissing_filter = !is.na(input$newABUNDANCE) & input$newABUNDANCE != 0 } nonmissing_filter Behavior Change Missing/censoring logic was expanded from Light-only to all labels (e.g., quantile cutoff calculation and `input$censored` assignment). This can cause Heavy (or other) label rows to be censored/imputed and also affects cutoff estimation by mixing label distributions. Validate that downstream summarization/imputation and expected turnover semantics still hold, and consider reintroducing label scoping or making it configurable if needed. quantiles = input[!is.na(INTENSITY) & INTENSITY > 1, quantile(ABUNDANCE, prob = c(0.01, 0.25, 0.5, 0.75, censored_cutoff), na.rm = TRUE)] iqr = quantiles[4] - quantiles[2] multiplier = (quantiles[5] - quantiles[4]) / iqr cutoff_lower = (quantiles[2] - multiplier * iqr) input$censored = !is.na(input$INTENSITY) & input$ABUNDANCE < cutoff_lower if (cutoff_lower <= 0 & !is.null(missing_symbol) & missing_symbol == "0") { zero_one_filter = !is.na(input$ABUNDANCE) & input$ABUNDANCE <= 0 input$censored = ifelse(zero_one_filter, TRUE, input$censored) } if (!is.null(missing_symbol) & missing_symbol == "NA") { input$censored = ifelse(is.na(input$INTENSITY), TRUE, input$censored) } msg = paste(' Log2 intensities under cutoff =', format(cutoff_lower, digits = 5), ' were considered as censored missing values.') msg_2 = paste(" Log2 intensities =", missing_symbol, "were considered as censored missing values.") getOption("MSstatsMsg")("INFO", msg) getOption("MSstatsMsg")("INFO", msg_2) getOption("MSstatsLog")("INFO", msg) getOption("MSstatsLog")("INFO", msg_2) } else { if (missing_symbol == '0') { input$censored = !is.na(input$INTENSITY) & (input$INTENSITY == 1 \| input$ABUNDANCE <= 0) } else if (missing_symbol == 'NA') { input$censored = is.na(input$ABUNDANCE) } Runtime Risk The `nameStandards='unlabeled'` branch uses `input[is.na(input$LABEL), "PeptideSequence", drop = TRUE]`, which is not idiomatic for `data.table` and may not return the expected vector (or may fail if `PeptideSequence` is not present in `input` at that point). Consider using `input[is.na(LABEL), unique(PeptideSequence)]` (or similar) and/or verifying `PeptideSequence` exists before indexing. if (length(standards) == 1 && standards == "unlabeled") { standards = unique(input[is.na(input$LABEL), "PeptideSequence", drop = TRUE]) if (length(standards) == 0) { msg = "nameStandards = 'unlabeled' but no unlabeled peptides found in data." getOption("MSstatsLog")("ERROR", msg) stop(msg) } } input = .normalizeGlobalStandards(input, peptides_dict, standards)

github-actions · 2026-04-02T21:45:31Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Fix unlabeled standards extraction `data.table` subsetting does not support `drop = TRUE`, and selecting `"PeptideSequence"` from `input` is unreliable since peptide sequence typically comes from `peptides_dict`. Derive unlabeled standards via `PEPTIDE` (and map to `PeptideSequence` when available) using data.table-safe expressions to avoid runtime errors. R/utils_normalize.R [37-47] if (normalization_method == "GLOBALSTANDARDS") { - if (length(standards) == 1 && standards == "unlabeled") { - standards = unique(input[is.na(input$LABEL), "PeptideSequence", drop = TRUE]) + if (length(standards) == 1 && standards == "unlabeled") { + unlabeled_peptides = unique(input[is.na(LABEL), PEPTIDE]) + if (!is.null(peptides_dict) && length(unlabeled_peptides) > 0) { + standards = unique(peptides_dict[PEPTIDE %in% unlabeled_peptides, PeptideSequence]) + } else { + standards = unlabeled_peptides + } + standards = standards[!is.na(standards)] + if (length(standards) == 0) { msg = "nameStandards = 'unlabeled' but no unlabeled peptides found in data." getOption("MSstatsLog")("ERROR", msg) stop(msg) } } input = .normalizeGlobalStandards(input, peptides_dict, standards) } Suggestion importance[1-10]: 8 __ Why: `unique(input[is.na(input$LABEL), "PeptideSequence", drop = TRUE])` is not valid `data.table` syntax (`drop` is an unused argument) and will likely error at runtime; additionally `input` may not contain `PeptideSequence` prior to merging with `peptides_dict`. The suggested approach avoids the `data.table` subsetting bug and uses columns that are more reliably present (e.g., `PEPTIDE`), improving functional correctness.	Medium
	Prevent NA in group detection The `%in% c("0", NA)` check can propagate `NA` (so `heavy_has_real_group_prof` becomes `NA`) and it is also brittle if `GROUP` is numeric `0` instead of `"0"`. Make the logic explicitly handle `NA` and normalize `GROUP` to character (or compare against both `0` and `"0"`), so the subsequent `if (...)` never receives an `NA`. R/dataProcessPlots.R [264-269] has_heavy_prof = any(processed$LABEL == "H") -heavy_has_real_group_prof = has_heavy_prof && !all(processed[LABEL == "H", GROUP] %in% c("0", NA)) +heavy_groups_prof = processed[LABEL == "H", GROUP] +heavy_has_real_group_prof = + has_heavy_prof && + any(!is.na(heavy_groups_prof) & !(as.character(heavy_groups_prof) %in% c("0"))) + tempGroupName = unique(processed[, c("GROUP", "RUN")]) if (length(unique(processed$LABEL)) == 2 && !heavy_has_real_group_prof) { - tempGroupName = tempGroupName[GROUP != '0'] + tempGroupName = tempGroupName[as.character(GROUP) != "0"] } Suggestion importance[1-10]: 7 __ Why: The current `%in% c("0", NA)` can yield `NA`, which can propagate through `all()`/`!` and make the `if (...)` condition `NA`, causing `missing value where TRUE/FALSE needed`. The proposed rewrite makes the heavy-group detection robust to `NA` and to `GROUP` being numeric vs character, improving correctness in edge cases.	Medium
	Stabilize factor label mapping `factor(LABEL, labels=...)` depends on the current (often alphabetical) ordering of factor levels, which can silently swap "Heavy/Light" or "Reference/Endogenous". Specify `levels=` explicitly (e.g., `c("H","L")`) to guarantee correct label mapping regardless of existing factor levels. R/dataProcessPlots.R [294-303] if (length(unique(processed$LABEL)) == 2) { if (heavy_has_real_group_prof) { - processed[, LABEL := factor(LABEL, labels = c("Heavy", "Light"))] + processed[, LABEL := factor(LABEL, levels = c("H", "L"), labels = c("Heavy", "Light"))] } else { - processed[, LABEL := factor(LABEL, labels = c("Reference", "Endogenous"))] + processed[, LABEL := factor(LABEL, levels = c("H", "L"), labels = c("Reference", "Endogenous"))] } } else { if (unique(processed$LABEL) == "L") { - processed[, LABEL := factor(LABEL, labels = c("Endogenous"))] + processed[, LABEL := factor(LABEL, levels = c("L"), labels = c("Endogenous"))] } else { Suggestion importance[1-10]: 6 __ Why: Using `factor(LABEL, labels = ...)` without specifying `levels=` can mis-map labels if `LABEL` is already a factor with unexpected level ordering. Explicit `levels = c("H","L")` (and `c("L")` for single-label) makes the mapping deterministic and prevents subtle plot-label swaps.	Low

tonywu1999 added 3 commits April 2, 2026 16:48

Make initial changes for protein turnover while keeping SRM backwards…

1316b6a

… compatibility

adjust normalization

9afabd4

update normalization code

da6aae0

github-actions bot added the Review effort 4/5 label Apr 2, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feat turnover#186

Feat turnover#186
tonywu1999 wants to merge 3 commits intodevelfrom
feat-turnover

tonywu1999 commented Apr 2, 2026

Uh oh!

coderabbitai bot commented Apr 2, 2026

Review skipped

Uh oh!

github-actions bot commented Apr 2, 2026

Uh oh!

github-actions bot commented Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tonywu1999 commented Apr 2, 2026

Motivation and Context

Changes

Testing

Checklist Before Requesting a Review

Uh oh!

coderabbitai bot commented Apr 2, 2026

Review skipped

Uh oh!

github-actions bot commented Apr 2, 2026

PR Reviewer Guide 🔍

Uh oh!

github-actions bot commented Apr 2, 2026

PR Code Suggestions ✨

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant