
Fix ONNX Runtime CUDA fallback by preloading shared library dependencies#276

Open
Misty-Star wants to merge 2 commits into nomadkaraoke:main from Misty-Star:feat/preload-onnxruntime-cuda-deps

Conversation

Misty-Star commented Mar 25, 2026

Summary

This PR improves GPU initialization for ONNX Runtime in pip-based CUDA environments by preloading
shared library dependencies before provider setup.

It also adds a clearer warning when an ONNX Runtime session cannot activate the requested
execution provider, and defers CLI imports so audio-separator help/usage paths do not eagerly
trigger heavy runtime initialization.

Problem

In some Linux environments where CUDA and cuDNN are installed via pip wheels, onnxruntime-gpu
may report CUDAExecutionProvider as available, but actual ONNX sessions can still fall back to
CPU because the required shared libraries are not visible to the dynamic loader at session
creation time.

In this case, users may see errors similar to:

  • Failed to load library ... libonnxruntime_providers_cuda.so
  • libcudnn.so.9: cannot open shared object file
  • Failed to create CUDAExecutionProvider

This can be confusing because PyTorch may still detect and use CUDA successfully, while ONNX
Runtime silently falls back to CPU for ONNX models.
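To make the fallback concrete, a small helper can compare the provider that was requested against what the created session actually reports. This is an illustrative sketch only (the helper name is ours, not part of the PR); in a real environment the second argument would come from `session.get_providers()` on an `onnxruntime.InferenceSession`:

```python
from typing import List, Tuple


def diagnose_provider_fallback(requested: str, session_providers: List[str]) -> Tuple[bool, str]:
    """Return (fell_back, message) comparing a requested execution provider
    against the providers an InferenceSession reports via get_providers().

    ONNX Runtime does not raise when a requested provider cannot be loaded;
    it silently drops to the next provider in the list, so this check is the
    only reliable way to detect the fallback after session creation.
    """
    if requested in session_providers:
        return False, f"{requested} is active"
    return True, (
        f"{requested} was requested but the session activated "
        f"{session_providers}; ONNX Runtime fell back silently"
    )


# In a real environment this would be driven by:
#   session = onnxruntime.InferenceSession(model_path, providers=[requested])
#   diagnose_provider_fallback(requested, session.get_providers())
```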

Root Cause

The CUDA/cuDNN runtime libraries provided by pip-installed NVIDIA wheels are not always discovered
automatically by ONNX Runtime before the first CUDA execution provider session is created.

Changes

  • Call onnxruntime.preload_dlls() during accelerated device setup when the installed ONNX
    Runtime version supports it.
  • Add a warning in the MDX ONNX loading path when the requested execution provider is not actually
    activated by the created session.
  • Defer importing Separator in the CLI until it is actually needed, so no-argument/help flows do
    not eagerly trigger ONNX Runtime initialization.
  • Add unit coverage for the ONNX Runtime dependency preload path and for the CLI no-argument
    behavior.
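The preload step described above can be sketched roughly as follows. This is a standalone, best-effort sketch of the approach, not the PR's exact code: the real change lives on the Separator class, and the guard/logging details here are assumptions. `onnxruntime.preload_dlls()` only exists in recent ONNX Runtime releases, hence the `hasattr` check:

```python
import logging

logger = logging.getLogger(__name__)


def preload_onnxruntime_dependencies() -> bool:
    """Best-effort preload of CUDA/cuDNN shared libraries from pip wheels.

    Returns True only when preload_dlls() was found and ran without error.
    Failures are logged but never fatal, so CPU-only environments are
    unaffected.
    """
    try:
        import onnxruntime as ort
    except ImportError:
        logger.debug("onnxruntime not installed; skipping dependency preload")
        return False
    if not hasattr(ort, "preload_dlls"):
        logger.debug("Installed onnxruntime has no preload_dlls(); skipping")
        return False
    try:
        ort.preload_dlls()
        logger.info("Preloaded ONNX Runtime shared library dependencies")
        return True
    except Exception as exc:
        logger.warning("ONNX Runtime dependency preload failed: %s", exc)
        return False
```

Calling this before the first `InferenceSession` is created gives the dynamic loader a chance to resolve the pip-installed CUDA/cuDNN libraries.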

Why this helps

This makes pip-installed CUDA/cuDNN runtimes visible to ONNX Runtime earlier in the startup flow,
which avoids a common failure mode where ONNX Runtime advertises CUDA support but then creates
CPU-only sessions in practice.

It also makes provider fallback much more obvious in logs, which should make future GPU
troubleshooting easier.

Validation

Validated locally with a pip-based GPU environment on Linux using:

  • Python 3.12
  • PyTorch 2.11.0+cu130
  • onnxruntime-gpu 1.24.4

Before this change:

  • PyTorch detected CUDA successfully.
  • ONNX Runtime exposed CUDAExecutionProvider in provider discovery.
  • A minimal ONNX InferenceSession(..., providers=["CUDAExecutionProvider"]) failed to load CUDA
    dependencies and fell back to CPUExecutionProvider.

After this change:

  • The same environment successfully created an ONNX Runtime session using CUDAExecutionProvider
    without requiring a manual LD_LIBRARY_PATH workaround.

Notes

This PR is focused on CUDA dependency loading and provider activation.

It does not attempt to address unrelated ONNX Runtime device discovery warnings such as Linux
DRM probing messages (for example, /sys/class/drm/card0/device/vendor on systems where card0
is a framebuffer device rather than the NVIDIA device).

Summary by CodeRabbit

  • Improvements

    • Better ONNX Runtime provider diagnostics with warnings when requested acceleration providers are unavailable.
    • Added ONNX Runtime dependency preloading to improve GPU acceleration reliability.
    • CLI imports and error handling improved to surface missing-dependency/help messaging reliably.
  • Tests

    • Added unit tests covering GPU runtime setup, dependency preloading failure handling, and CLI behavior.


coderabbitai bot commented Mar 25, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3515b783-5355-4e67-8a11-2b255880a77c

📥 Commits

Reviewing files that changed from the base of the PR and between 86ce059 and 51f5d6a.

📒 Files selected for processing (1)
  • tests/unit/test_gpu_runtime_setup.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unit/test_gpu_runtime_setup.py

Walkthrough

Adds ONNX Runtime visibility and optional shared-library preloading, defers Separator imports in the CLI, and adds unit tests for the GPU/ORT setup and CLI no-args behavior.

Changes

  • ONNX Provider Logging — audio_separator/separator/architectures/mdx_separator.py
    When loading an ONNX inference session (the segment_size == dim_t path), retrieves the session's providers via get_providers() and compares them against the requested provider; logs debug info, or emits a warning if the requested provider was not activated.
  • Dependency Preloading & Device Setup — audio_separator/separator/separator.py
    Adds Separator.preload_onnxruntime_dependencies() and a call site in setup_accelerated_inferencing_device() that conditionally calls ort.preload_dlls, logging success or a warning on exception before configuring devices.
  • CLI Lazy Imports — audio_separator/utils/cli.py
    Removes the top-level Separator import; Separator is now imported lazily inside each CLI branch and before the main separation workflow, deferring module loading.
  • CLI Test Update — tests/unit/test_cli.py
    Replaces a skipped test with a simulated no-args CLI run that patches sys.modules['audio_separator.separator'] = None, expects SystemExit(1), and asserts that help text is printed.
  • GPU/ORT Setup Tests — tests/unit/test_gpu_runtime_setup.py
    Adds two tests for Separator.setup_accelerated_inferencing_device(): one asserting normal preload and device setup with a mocked ort.preload_dlls, the other asserting that device setup still runs and a warning is logged when ort.preload_dlls raises RuntimeError.
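The lazy-import pattern from the CLI change can be sketched like this. The argument names and overall structure are assumptions for illustration; the essential part is that the heavy `Separator` import happens inside the function, after the help/usage path has already exited:

```python
import argparse
import sys


def main(argv=None):
    parser = argparse.ArgumentParser(prog="audio-separator")
    parser.add_argument("audio_file", nargs="?")
    args = parser.parse_args(argv)

    if args.audio_file is None:
        # Help/usage path: print usage and exit before any heavy runtime
        # import (and thus before any ONNX Runtime initialization) happens.
        parser.print_help()
        sys.exit(1)

    # Imported only when separation is actually requested, so merely loading
    # the CLI module never triggers ONNX Runtime initialization.
    from audio_separator.separator import Separator

    separator = Separator()
    ...
```

With this layout, `audio-separator` and `audio-separator --help` stay fast even in environments where importing ONNX Runtime is slow or broken.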

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I dug a tunnel through the code so neat,

Preloaded DLLs and providers we now greet,
Lazy imports tiptoe in at call-time's chime,
Tests hop along to make the run sublime. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning — Docstring coverage is 62.50%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The title clearly and concisely summarizes the main change: fixing ONNX Runtime CUDA fallback by preloading shared library dependencies, which aligns with the primary objective and the code changes across multiple files.



coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/test_gpu_runtime_setup.py`:
- Line 12: The test patches "audio_separator.separator.separator.ort.preload_dlls" and assumes that attribute exists. Add create=True to the patch call that creates mock_preload — i.e. patch("audio_separator.separator.separator.ort.preload_dlls", create=True) — so the test tolerates environments where ort.preload_dlls is absent. Keep the rest of the test and the mock_preload variable unchanged.
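The effect of create=True can be demonstrated against a stand-in object. Here `fake_ort` is hypothetical and merely mimics an onnxruntime build that lacks preload_dlls; without create=True, patching a nonexistent attribute raises AttributeError:

```python
import types
from unittest.mock import patch

# Stand-in for an older onnxruntime module that has no preload_dlls.
fake_ort = types.SimpleNamespace()

# create=True installs the attribute for the duration of the context even
# though it does not exist on the target, so the test does not depend on the
# installed onnxruntime version.
with patch.object(fake_ort, "preload_dlls", create=True) as mock_preload:
    fake_ort.preload_dlls()
    mock_preload.assert_called_once()

# Because patch created the attribute, it is removed again on exit.
assert not hasattr(fake_ort, "preload_dlls")
```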

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8146c522-64a5-4451-a9fc-5b22aadd4e81

📥 Commits

Reviewing files that changed from the base of the PR and between 153b2e4 and 86ce059.

📒 Files selected for processing (5)
  • audio_separator/separator/architectures/mdx_separator.py
  • audio_separator/separator/separator.py
  • audio_separator/utils/cli.py
  • tests/unit/test_cli.py
  • tests/unit/test_gpu_runtime_setup.py
