
Block vLLM/SGLang serve on non-Linux with clear error #966

Closed

alfredoclarifai wants to merge 2 commits into cli-improvement from cli-improvement-vllm-platform-check

Conversation

@alfredoclarifai
Contributor

Summary

  • Add early platform check when serving models that use vLLM or SGLang toolkits
  • On macOS/Windows, these engines crash deep in C extensions with opaque AttributeError tracebacks
  • Now fails fast with a clear message: what's wrong and what to do instead (cloud deploy or Ollama)
  • Applied to both serve paths (API-connected and --grpc)

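The guard described above can be sketched roughly as follows. This is a minimal, self-contained sketch, not the PR's exact code: `check_serve_platform` is a hypothetical helper name, and `dependencies`/`UserError` stand in for the CLI's own parsed requirements and error type.

```python
import platform


class UserError(Exception):
    """Stand-in for the CLI's user-facing error type."""


def check_serve_platform(dependencies, toolkit_provider):
    """Fail fast when a vLLM/SGLang model is served on a non-Linux host."""
    if platform.system() != "Linux":
        for engine in ("vllm", "sglang"):
            if engine in dependencies or toolkit_provider == engine:
                raise UserError(
                    f"{engine} is not supported on {platform.system()}. "
                    "It requires a Linux environment with GPU access.\n"
                    "  Use 'clarifai model deploy .' to run on cloud GPU, "
                    "or switch to the Ollama or LM Studio toolkit for local serving."
                )
```

Because the check only reads `platform.system()` and the already-parsed dependency set, it is cheap enough to run unconditionally at the top of both serve paths.
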
Test plan

  • On macOS, run clarifai model serve . in a vLLM model directory — should get a clear error instead of a C-extension traceback
  • On macOS, run clarifai model serve --grpc in an SGLang model directory — same clear error
  • On Linux, verify serve still works normally (platform check passes)

🤖 Generated with Claude Code

alfredoclarifai and others added 2 commits February 26, 2026 10:06
vLLM and SGLang only support Linux with GPU access. On macOS/Windows,
they crash deep in C extensions with opaque errors. This adds an early
platform check in both serve paths (API-connected and --grpc) to fail
fast with an actionable message suggesting cloud deploy or Ollama.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI left a comment


Pull request overview

Adds an early OS/platform guard in the CLI model serving paths to fail fast (with a clearer UserError) when attempting to serve vLLM/SGLang-based models on non-Linux platforms.

Changes:

  • Add non-Linux detection for vLLM/SGLang during clarifai model serve (API-connected) validation.
  • Add similar detection during clarifai model serve --grpc validation.
  • Reuse toolkit.provider (when present) alongside requirements.txt inspection to detect the engine.
Comments suppressed due to low confidence (1)

clarifai/cli/model.py:1404

  • serve_cmd introduces toolkit_provider = config.get('toolkit', {}).get('provider'), but later in the same validation block the LM Studio branch still re-reads config.get('toolkit', {}).get('provider') instead of using toolkit_provider. Using the cached variable consistently would avoid duplicate lookups, keep the checks uniform, and reduce the chance of future drift between branches.
    toolkit_provider = config.get('toolkit', {}).get('provider')
    if _platform.system() != "Linux":
        for engine in ('vllm', 'sglang'):
            if engine in dependencies or toolkit_provider == engine:
                raise UserError(
                    f"{engine} is not supported on {_platform.system()}. It requires a Linux environment with GPU access.\n"
                    "  Use 'clarifai model deploy .' to run on cloud GPU, or switch to the Ollama or LM Studio toolkit for local serving."
                )

    if "ollama" in dependencies or toolkit_provider == 'ollama':

Comment on lines +1396 to +1402
if _platform.system() != "Linux":
    for engine in ('vllm', 'sglang'):
        if engine in dependencies or toolkit_provider == engine:
            raise UserError(
                f"{engine} is not supported on {_platform.system()}. It requires a Linux environment with GPU access.\n"
                "  Use 'clarifai model deploy .' to run on cloud GPU, or switch to the Ollama or LM Studio toolkit for local serving."
            )

Copilot AI Feb 27, 2026


This adds new platform-specific behavior (raising UserError on non-Linux when vLLM/SGLang is detected), but the existing CLI tests for clarifai model serve don’t appear to cover these branches. Adding unit tests that mock platform.system() (and set up requirements.txt/toolkit.provider for vllm/sglang) would prevent regressions and should cover both API-connected serve and the --grpc path.
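Such a regression test could look roughly like this. It is a sketch only: `validate_serve` is a hypothetical stand-in for the validation block under test, since the real suite would exercise the actual clarifai CLI entry points with platform.system() patched.

```python
import platform
from unittest import mock


class UserError(Exception):
    """Stand-in for the CLI's user-facing error type."""


def validate_serve(dependencies, toolkit_provider):
    # Hypothetical stand-in for the serve-path validation block.
    if platform.system() != "Linux":
        for engine in ("vllm", "sglang"):
            if engine in dependencies or toolkit_provider == engine:
                raise UserError(f"{engine} is not supported on {platform.system()}.")


def test_vllm_blocked_on_macos():
    # Simulate macOS without needing a real Mac.
    with mock.patch("platform.system", return_value="Darwin"):
        try:
            validate_serve({"vllm"}, None)
            assert False, "expected UserError"
        except UserError:
            pass


def test_serve_allowed_on_linux():
    # On Linux the guard must be a no-op.
    with mock.patch("platform.system", return_value="Linux"):
        validate_serve({"vllm"}, None)  # should not raise
```

Mirrored tests with toolkit_provider set to "sglang" (and for the --grpc path) would cover the remaining branches.
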

Comment on lines +1116 to +1125
import platform as _platform

toolkit_provider = config.get('toolkit', {}).get('provider')
if _platform.system() != "Linux":
    for engine in ('vllm', 'sglang'):
        if engine in dependencies or toolkit_provider == engine:
            raise UserError(
                f"{engine} is not supported on {_platform.system()}. It requires a Linux environment with GPU access.\n"
                "  Use 'clarifai model deploy .' to run on cloud GPU, or switch to the Ollama or LM Studio toolkit for local serving."
            )

Copilot AI Feb 27, 2026


In _run_local_grpc, the non-Linux guard for vLLM/SGLang only runs inside if mode not in ("container", "env"). That means clarifai model serve --grpc --mode env|container on macOS/Windows will skip this early check and can still hit the same deep C-extension failures this PR is trying to avoid. Consider moving the platform/toolkit check outside the mode gate (using toolkit_provider and/or parsing requirements regardless of mode) so the behavior is consistent across all --grpc modes.
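One way to restructure this along the lines the comment suggests is to hoist the guard ahead of the mode branch. The sketch below is a simplification, not the real _run_local_grpc: the function body, the `run_local_grpc` name, and the config accessors are stand-ins assumed for illustration.

```python
import platform as _platform


class UserError(Exception):
    """Stand-in for the CLI's user-facing error type."""


def _check_engine_platform(dependencies, toolkit_provider):
    """Non-Linux guard for vLLM/SGLang, shared by all --grpc modes."""
    if _platform.system() != "Linux":
        for engine in ("vllm", "sglang"):
            if engine in dependencies or toolkit_provider == engine:
                raise UserError(f"{engine} is not supported on {_platform.system()}.")


def run_local_grpc(config, dependencies, mode):
    # Run the guard before branching on mode, so --mode env/container
    # also fail fast on macOS/Windows instead of hitting C-extension errors.
    _check_engine_platform(dependencies, config.get("toolkit", {}).get("provider"))
    if mode in ("container", "env"):
        return f"prepare {mode} serving"
    return "serve in-process"
```

Placing the check before the `mode` branch keeps the behavior identical for the default mode while extending the fail-fast path to env and container modes.
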

@luv-bansal
Contributor

@alfredoclarifai I have included these improvements in my branch, so closing this PR

@luv-bansal luv-bansal closed this Mar 1, 2026