Remove quick-fingerprinting analysis (moved to bioinfo research)#259
Remove quick-fingerprinting analysis (moved to bioinfo research)#259
Conversation
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Autofix Details
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: Regression:
call_variantsnow misuses previously-ignoredregionparametercall_variantsnow treats BED-likeregionvalues as-Rtargets, preserving compatibility with callers that pass BED paths in theregionargument.
- ✅ Fixed: Ground truth filtered by BED only, not intersection
- When both
regionandregions_bedare set, ground-truth filtering now uses a precomputed BED intersection so denominator and called variants are restricted to the same locus set.
- When both
Or push these changes by commenting:
@cursor push 5882cc3db9
Preview (5882cc3db9)
diff --git a/ugvc/comparison/quick_fingerprinter.py b/ugvc/comparison/quick_fingerprinter.py
--- a/ugvc/comparison/quick_fingerprinter.py
+++ b/ugvc/comparison/quick_fingerprinter.py
@@ -46,8 +46,16 @@
def prepare_ground_truth(self):
ground_truths_to_check = {}
- if self.regions_bed is None:
- self.sp.print_and_run(f"echo {self.region} | sed 's/:/\t/' | sed 's/-/\t/' > {self.out_dir}/region.bed")
+ region_bed = f"{self.out_dir}/region.bed"
+ regions_bed_in_region = None
+ if self.region != "":
+ self.sp.print_and_run(f"echo {self.region} | sed 's/:/\t/' | sed 's/-/\t/' > {region_bed}")
+ if self.regions_bed is not None and self.region != "":
+ regions_bed_in_region = f"{self.out_dir}/regions_bed_in_region.bed"
+ self.sp.print_and_run(
+ f"bedtools intersect -a {self.regions_bed} -b {region_bed} | "
+ f"sort -k 1,1 -k 2,2n > {regions_bed_in_region}"
+ )
for sample_id in self.ground_truth_vcfs:
ground_truth_vcf = optional_cloud_sync(self.ground_truth_vcfs[sample_id], self.out_dir)
@@ -62,8 +70,9 @@
)
self.vpu.index_vcf(ground_truth_in_hcr)
if self.regions_bed is not None:
+ regions_to_use = regions_bed_in_region if regions_bed_in_region is not None else self.regions_bed
self.sp.print_and_run(
- f"bcftools view {ground_truth_in_hcr} -R {self.regions_bed} -Oz -o {ground_truth_to_check_vcf}"
+ f"bcftools view {ground_truth_in_hcr} -R {regions_to_use} -Oz -o {ground_truth_to_check_vcf}"
)
else:
self.sp.print_and_run(
@@ -78,7 +87,7 @@
)
elif self.region != "":
self.sp.print_and_run(
- f"bedtools intersect -a {hcr} -b {self.out_dir}/region.bed | "
+ f"bedtools intersect -a {hcr} -b {region_bed} | "
f"sort -k 1,1 -k 2,2n > {hcr_in_region}"
)
else:
diff --git a/ugvc/comparison/variant_hit_fraction_caller.py b/ugvc/comparison/variant_hit_fraction_caller.py
--- a/ugvc/comparison/variant_hit_fraction_caller.py
+++ b/ugvc/comparison/variant_hit_fraction_caller.py
@@ -21,7 +21,13 @@
self.region = region
def call_variants(self, cram: str, output_vcf: str, region: str, min_af: float, regions_bed: str | None = None) -> None:
- regions_arg = f"-R {regions_bed}" if regions_bed is not None else (f"-r {region}" if region else "")
+ if regions_bed is not None:
+ regions_arg = f"-R {regions_bed}"
+ elif region and (region.endswith(".bed") or region.endswith(".bed.gz")):
+ # Backward compatibility: some callers pass a BED path in `region`.
+ regions_arg = f"-R {region}"
+ else:
+ regions_arg = f"-r {region}" if region else ""
self.sp.print_and_run(
f"bcftools mpileup {cram} -f {self.ref} -a ad,format/dp --skip-indels -d 500 {regions_arg} "
+ f"| bcftools view -i 'AD[0:1] / format/DP >= {min_af}' -Oz -o {output_vcf}"This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.
Applied via @cursor push command
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
| if self.regions_bed is not None: | ||
| self.sp.print_and_run( | ||
| f"bedtools intersect -a {hcr} -b {self.out_dir}/region.bed | " | ||
| f"bedtools intersect -a {hcr} -b {self.regions_bed} | " |
There was a problem hiding this comment.
HCR intersected with full bed, not region-restricted bed
Low Severity
When both regions_bed and region_str are provided, the ground truth VCF is correctly filtered using regions_bed_in_region (the intersection of regions_bed with the region), but the HCR bed is intersected with the full self.regions_bed instead of self.regions_bed_in_region. This inconsistency means hcr_in_region may contain regions outside the specified region_str, producing an overly broad output file.



Remove quick-figerprinting analyssis from repo
Note
Medium Risk
Moderate risk because it deletes an end-to-end fingerprinting CLI/test surface and changes how
bcftools mpileupregion restriction is constructed (including new-RBED handling), which could alter variant-calling scope in downstream workflows.Overview
Removes the quick-fingerprinting feature and its system test fixtures: deletes the
quick_fingerprintingpipeline entrypoint, theQuickFingerprinterimplementation, the example JSON config, and the associated system test/resources.Updates hit-fraction variant calling and report dependencies: extends
VariantHitFractionCaller.call_variantsto accept an optionalregions_bed(and treats.bed/.bed.gzpassed viaregionas-Rfor backward compatibility), refactors major-alt extraction to use columnapply, and switchesrun_no_gt_report.pyto importugbio_comparison.vcf_comparison_utilsinstead ofcomparison_utils.Written by Cursor Bugbot for commit 4a25f59. This will update automatically on new commits. Configure here.