Closed
Pull request overview
Adds a new fixMADC() feature to the BIGr R package to convert raw DArT MADC files to the fixed allele-ID format (standardizing CloneID to Chr_Pos and appending numbered allele suffixes), along with package metadata/docs updates.
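To make the Chr_Pos target format concrete, here is a minimal base-R sketch of the ID construction. The helper name `make_chr_pos` is hypothetical and not part of BIGr; the 9-digit zero padding follows the `str_pad(..., width = 9)` call quoted in the review snippets, and `formatC` is swapped in for `stringr::str_pad` only to keep the sketch free of dependencies.

```r
# Hypothetical helper (not part of BIGr): build a Chr_Pos marker ID
# with a zero-padded position, assuming positions are whole numbers.
make_chr_pos <- function(chr, pos, width = 9) {
  # as.character() first so factor columns are converted by label, not by code
  pos_int <- as.integer(as.character(pos))
  padded <- formatC(pos_int, width = width, format = "d", flag = "0")
  paste0(chr, "_", padded)
}

ids <- make_chr_pos(c("Chr01", "Chr02"), c("12345", "678"))
print(ids)
```

Fixed-width padding keeps IDs sortable as plain text, which is presumably why the PR pads positions before joining them to the chromosome name.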
Changes:
- Introduces R/fixMADC.R (new exported function) and corresponding Rd documentation.
- Adds a comprehensive testthat suite for fixMADC() and supporting fixture builders.
- Updates package metadata (version bump to 0.7.0, NEWS entry, NAMESPACE imports/exports) and adds an example marker file to inst/.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| R/fixMADC.R | Implements the new fixMADC() function for remapping/standardizing MADC IDs. |
| tests/testthat/test-fixMADC.R | Adds extensive tests covering mapping, validation, warnings/errors, and file output behavior. |
| man/fixMADC.Rd | New generated documentation for fixMADC(). |
| NAMESPACE | Exports fixMADC and imports tidyr::replace_na. |
| DESCRIPTION | Bumps version to 0.7.0 and updates RoxygenNote. |
| NEWS.md | Adds a 0.7.0 entry describing fixMADC. |
| R/utils.R | Adds .suffix_type to globalVariables to avoid R CMD check notes. |
| inst/iris_MADC_marker_file.csv | Adds a packaged example marker file for fixMADC usage. |
| cran-comments.md | Removes outdated update notes. |
Comment on lines +82 to +83

    #Remove the first n.summary.columns columns
    if (n.summary.columns > 0) {
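The `n.summary.columns > 0` guard in this snippet is worth illustrating, because negative indexing with an empty index behaves differently from what one might expect in R. The sketch below is an assumption about how such a step could look, not the PR's actual code; `drop_leading_columns` and the example column names are hypothetical.

```r
# Hypothetical helper (not part of BIGr): drop the first n columns of a data frame.
drop_leading_columns <- function(df, n) {
  stopifnot(is.numeric(n), length(n) == 1, n >= 0)
  if (n > 0) {
    # The guard matters: -seq_len(0) is integer(0), which would select
    # zero columns instead of leaving the data frame untouched.
    df <- df[, -seq_len(n), drop = FALSE]
  }
  df
}

# Made-up example data: two leading summary columns, then two sample columns
madc <- data.frame(Summary1 = 0.5, Summary2 = 0.9, S1 = 10L, S2 = 3L)
kept_after_drop <- names(drop_leading_columns(madc, 2))
kept_no_drop <- names(drop_leading_columns(madc, 0))
```

Using `drop = FALSE` keeps the result a data frame even when only one column remains.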
Comment on lines +108 to +136

    if (any(grepl("[*#_!.\\-]", marker_file[,3]))) {
      stop("Special characters (*#_-!.) detected in the position column (column 3). Please review the marker file.")
    }

    if (!all(grepl("^[0-9]+$", marker_file[,3]))) {
      stop("The position column (column 3) must be numeric. Please review the marker file.")
    }

    #Make marker IDs column and pad 0's for the position
    marker_file$new_ID <- paste0(
      marker_file[,2],
      "_",
      str_pad(marker_file[,3], width = 9, side = "left", pad = "0")
    )

    #Verify there are no duplicate IDs in the marker file
    if (length(unique(marker_file[,1])) != length(marker_file[,1])) {
      stop("There are duplicate marker IDs in the first column. Please review marker file.")
    }

    #Verify there are no duplicate position information
    if (length(unique(marker_file$new_ID)) != length(marker_file$new_ID)) {
      stop("There are duplicate Chr and Pos information where more than one marker has the same Chr_Pos. Please review the marker file.")
    }

    #Verify chromosome column (col 2) contains no special characters
    if (any(grepl("[*#_!.\\-]", marker_file[,2]))) {
      stop("Special characters (*#_-!.) detected in the chromosome column (column 2). Please review the marker file.")
    }
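The validation steps in this snippet can be condensed into a single helper. The following is a minimal sketch under stated assumptions, not the PR's implementation: `check_marker_file` is a hypothetical name, the column layout (1 = marker ID, 2 = chromosome, 3 = position) is inferred from the quoted error messages, and `anyDuplicated()` replaces the `length(unique(...))` comparisons. Because a digits-only test already rejects every special character, the sketch omits a separate special-character check for the position column.

```r
# Hypothetical helper (not part of BIGr) mirroring the quoted checks.
# Assumed layout: column 1 = marker ID, column 2 = chromosome, column 3 = position.
check_marker_file <- function(marker_file) {
  # Hyphen is placed last in the class so it is read as a literal character
  if (any(grepl("[*#_!.-]", marker_file[[2]]))) {
    stop("Special characters (*#_-!.) detected in the chromosome column (column 2).")
  }
  # Digits only; this also rules out signs, decimals, and separators
  if (!all(grepl("^[0-9]+$", marker_file[[3]]))) {
    stop("The position column (column 3) must contain only digits.")
  }
  if (anyDuplicated(marker_file[[1]]) > 0) {
    stop("There are duplicate marker IDs in the first column.")
  }
  # Compare padded IDs so "12" and "012" count as the same position
  pos_int <- as.integer(as.character(marker_file[[3]]))
  chr_pos <- sprintf("%s_%09d", as.character(marker_file[[2]]), pos_int)
  if (anyDuplicated(chr_pos) > 0) {
    stop("More than one marker has the same Chr_Pos.")
  }
  invisible(TRUE)
}

# Made-up marker file for illustration
markers <- data.frame(
  id  = c("m1", "m2"),
  chr = c("Chr01", "Chr01"),
  pos = c(100, 200)
)
check_marker_file(markers)
```

Running the chromosome check before any ID is built means a malformed chromosome name is reported directly rather than surfacing later as a confusing duplicate.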

    #Save the output to disk if file name provided
    if (!is.null(output.file)) {
      message("Saving fixed MADC data to file")
      write.csv(filtered_df, paste0(output.file,"_fixedID.csv"), row.names = FALSE)
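As quoted, the output name is formed by appending `_fixedID.csv` to `output.file`, so a caller who passes a name that already ends in `.csv` would get `name.csv_fixedID.csv`. The sketch below shows one possible way to normalize the path; it is an illustration only, and `fixed_id_path` is a hypothetical helper that does not appear in the PR.

```r
# Hypothetical helper (not part of BIGr): derive the output path for the fixed file.
fixed_id_path <- function(output.file) {
  # Strip a trailing extension so "run1.csv" becomes "run1_fixedID.csv"
  # rather than "run1.csv_fixedID.csv".
  paste0(tools::file_path_sans_ext(output.file), "_fixedID.csv")
}

path_without_ext <- fixed_id_path("results/run1")
path_with_ext <- fixed_id_path("results/run1.csv")
```

The trade-off is that `tools::file_path_sans_ext()` treats any trailing alphanumeric suffix after a dot as an extension, so a prefix such as `run.v2` would lose `.v2`; keeping the PR's plain `paste0()` and documenting that `output.file` is a prefix is an equally reasonable choice.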
Comment on lines +69 to +75

    #Read the madc file
    filtered_df <- read.csv(madc.file, sep = ',', check.names = FALSE)

    #Remove extra text after Ref and Alt (_001 or _002)
    #filtered_df$AlleleID <- sub("\\|Ref_.*", "|Ref", filtered_df$AlleleID)
    #filtered_df$AlleleID <- sub("\\|Alt_.*", "|Alt", filtered_df$AlleleID)
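The `check.names = FALSE` argument in the quoted `read.csv()` call matters for sample columns, since R otherwise rewrites headers into syntactically valid names. A small self-contained illustration follows; the header values are made up for the example and are not taken from a real MADC file.

```r
# Made-up two-line CSV with headers that are not syntactically valid R names
csv_text <- "AlleleID,CloneID,Sample-1,2024 plate\nm1|Ref,m1,10,4\n"

# Headers preserved exactly as written in the file
raw_names <- names(read.csv(text = csv_text, check.names = FALSE))

# Headers passed through make.names(), which alters the sample names
fixed_names <- names(read.csv(text = csv_text, check.names = TRUE))

print(raw_names)
print(fixed_names)
```

Preserving the original headers keeps sample IDs in the output file identical to those in the input, which is presumably the intent of the quoted call.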
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Codecov Report
❌ Patch coverage is
Additional details and impacted files

    @@               Coverage Diff               @@
    ##           development      #51      +/-   ##
    ===============================================
    + Coverage        83.05%   83.94%   +0.88%
    ===============================================
      Files               19       20       +1
      Lines             1369     1451      +82
    ===============================================
    + Hits              1137     1218      +81
    - Misses             232      233       +1
This pull request introduces BIGr version 0.7.0. The primary change is a new function, fixMADC, which processes and standardizes raw MADC files using user-supplied chromosome and position information. The update also includes the documentation, namespace, and metadata changes needed to support this feature.

New Functionality:
- Added fixMADC to format raw MADC files by updating allele and clone IDs to a standardized Chr_Pos format using a user-provided marker file. This function supports downstream compatibility with madc2vcf and BIGapp functions. (R/fixMADC.R, NAMESPACE, man/fixMADC.Rd, NEWS.md)

Documentation and Metadata Updates:
- Bumped the package version to 0.7.0 and added a NEWS entry for the fixMADC function. (DESCRIPTION, NEWS.md)
- Added generated documentation for fixMADC. (man/fixMADC.Rd)
- Updated the RoxygenNote version to 7.3.3 in DESCRIPTION.
- Added .suffix_type to the list of global variables to avoid R CMD check warnings. (R/utils.R)

Imports and Namespace:
- Added fixMADC to the list of exported functions and imported replace_na from tidyr for use in the new function. (NAMESPACE)

Other:
- Cleaned up cran-comments.md by removing outdated update notes.