Conversation

@LucaMarconato (Member) commented Feb 6, 2026

  • Big performance boost: avoid calling get_element_instances() when the info is present in the metadata.
  • Fix a spuriously failing assert caused by a mismatched index dtype (range index vs string) in some Xenium versions, plus some minor code cleanup.
  • Remove an old deprecation warning (and finally change the default argument as anticipated!)

@marcovarrone I noticed that even after #337 the memory usage was not great. This was due to remaining calls to compute() (invoked twice) via get_element_instances(). For some Xenium versions (prototype versions released right after 2.0.0) it may be more difficult to avoid such calls, but for newer Xenium versions there is a massive performance boost. I will run systematic benchmarks soon, but I anticipate a ~95% decrease in peak memory and a modest time improvement as well.
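Peak-memory improvements like the one claimed above can be sanity-checked without a full benchmark harness; here is a minimal sketch using only the standard library (note the caveat that tracemalloc only sees Python-level allocations, so buffers held by native dask/zarr code are not counted):

```python
import tracemalloc


def peak_memory_mb(fn, *args, **kwargs):
    """Run fn and return the peak Python-allocated memory in MB.

    Caveat: tracemalloc tracks only Python allocations; memory held by
    native libraries (e.g. compiled dask/zarr buffers) is not counted,
    so a process-level tool is needed for the real benchmark.
    """
    tracemalloc.start()
    try:
        fn(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / 1e6


# Example: materializing a million ints peaks well above the baseline.
print(f"{peak_memory_mb(lambda: list(range(1_000_000))):.1f} MB")
```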

codecov-commenter commented Feb 6, 2026

Codecov Report

❌ Patch coverage is 59.09091% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.85%. Comparing base (28eacc9) to head (7143d8e).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/spatialdata_io/readers/xenium.py | 59.09% | 9 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #369      +/-   ##
==========================================
- Coverage   63.08%   62.85%   -0.23%     
==========================================
  Files          27       27              
  Lines        3153     3158       +5     
==========================================
- Hits         1989     1985       -4     
- Misses       1164     1173       +9     
| Files with missing lines | Coverage Δ |
|---|---|
| src/spatialdata_io/readers/xenium.py | 71.10% <59.09%> (-2.18%) ⬇️ |

@LucaMarconato LucaMarconato marked this pull request as draft February 7, 2026 11:14
@LucaMarconato LucaMarconato changed the title Xenium fix assert shapes id Xenium avoid calling get_element_instances(); fix deprecation warning Feb 7, 2026
@marcovarrone (Contributor)

Good catch @LucaMarconato!
Since get_element_instances() runs compute() only to find the unique elements, I assume it does not actually load the full image into RAM. The peak memory should then be the memory occupied by the chunks being processed in parallel at any given time, right?

What do you think about adding a num_workers parameter that could wrap the get_element_instances() call in with dask.config.set(num_workers=num_workers):? That would allow users to reduce peak memory at the cost of running time.
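The suggestion above can be sketched roughly as follows. This is illustrative only: count_unique_labels and the num_workers plumbing are hypothetical and not part of the spatialdata-io API; dask.config.set(num_workers=...) caps the worker-pool size of the default local scheduler, so fewer chunks are resident at once:

```python
import dask
import dask.array as da
import numpy as np


def count_unique_labels(labels, num_workers=None):
    # Hypothetical wrapper: limit the number of dask workers so fewer
    # chunks are processed (and held in memory) concurrently, trading
    # running time for a lower memory peak.
    with dask.config.set(num_workers=num_workers):
        return da.unique(labels).compute()


# Usage: a small chunked label image.
labels = da.from_array(np.array([[1, 1, 2], [2, 3, 3]]), chunks=(1, 3))
print(count_unique_labels(labels, num_workers=2))  # prints [1 2 3]
```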

@LucaMarconato (Member, Author)

Yes, I would expect behavior like what you describe. I did some experiments and the memory indeed decreases, but there is still quite a lot of peak usage, probably because of some rechunking operations performed beforehand.

I could optimize this further and expose the parameters, but since get_element_instances() is now triggered only for some rare Xenium versions (before 2.0.0 the unique labels are found in a zarr file, from 2.0.0 in a parquet file; the info is missing in some prototype 2.0.0 Xenium files, like the xenium_2.0.0_io dataset from spatialdata-sandbox), I prefer not to optimize this further in this PR.

If for some reason this data turns out to be less rare than anticipated, or if the fallback code is needed for other reasons, I'd be open to having the behavior you described implemented.
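The version-dependent lookup described above can be sketched roughly as follows; the function name and the exact cutoff handling are hypothetical, and only the zarr-before-2.0.0 / parquet-from-2.0.0 split comes from the comment:

```python
def unique_labels_source(version: str) -> str:
    """Where the unique cell labels live for a given Xenium analyzer version.

    Hypothetical helper: < 2.0.0 stores them in a zarr file, >= 2.0.0 in a
    parquet file. Some prototype 2.0.0 files lack the info entirely, in
    which case the reader must fall back to get_element_instances(),
    which triggers an eager dask compute().
    """
    parts = tuple(int(p) for p in version.split("."))
    parts += (0,) * (3 - len(parts))  # pad e.g. "2.0" -> (2, 0, 0)
    return "zarr" if parts < (2, 0, 0) else "parquet"


print(unique_labels_source("1.9.1"))  # zarr
print(unique_labels_source("2.1.0"))  # parquet
```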

@LucaMarconato (Member, Author) commented Feb 9, 2026

Unrelated failure: scverse/spatialdata#1065. It will be fixed later today with a pre-release.

@LucaMarconato (Member, Author)

Turns out the offending compute was somewhere else! Here is the culprit: https://github.com/spatial-image/multiscale-spatial-image/blob/6217a331d3e0897dcb1d857086f68a6ab2a50f72/multiscale_spatial_image/to_multiscale/_dask_image.py#L196

The latest multiscale-spatial-image seems to have changed that, but it now depends on ngff-zarr (https://github.com/fideus-labs/ngff-zarr/blob/main/py/ngff_zarr/methods/_dask_image.py#L108). I will think about what's best to do: use a different downsampling strategy, such as https://github.com/spatial-image/multiscale-spatial-image/blob/main/multiscale_spatial_image/to_multiscale/_xarray.py#L4 (and then change the way we read the data back, by loading each scale directly from disk), or start using ngff-zarr as a backend.
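For reference, the xarray-based alternative linked above downsamples by coarsening, i.e. averaging non-overlapping blocks. A minimal numpy sketch of that strategy (function name illustrative, 2D single-scale only; when applied chunkwise, e.g. via xarray's coarsen().mean() on dask-backed data, it needs no global compute() of the full-resolution image):

```python
import numpy as np


def block_mean_downsample(image: np.ndarray, factor: int = 2) -> np.ndarray:
    # Coarsen-style downsampling: average each factor x factor block.
    h, w = image.shape
    h2, w2 = h - h % factor, w - w % factor  # trim ragged edges
    trimmed = image[:h2, :w2]
    # Reshape to (row_block, row_in_block, col_block, col_in_block) and
    # average over the in-block axes.
    blocks = trimmed.reshape(h2 // factor, factor, w2 // factor, factor)
    return blocks.mean(axis=(1, 3))


img = np.arange(16, dtype=float).reshape(4, 4)
print(block_mean_downsample(img))  # 2x2 array of block means
```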

@LucaMarconato (Member, Author)

Considerations about the ngff-zarr dependency: fideus-labs/ngff-zarr#340

@LucaMarconato (Member, Author)

This PR scverse/spatialdata#1068 fixes the above and leads to a massive performance improvement.
