Deep-dive into the deployment of an on-premise low-privilege...#2043
Open
carlospolop wants to merge 1 commit into `master` from
Conversation
🔗 Additional Context

Original Blog Post: https://www.synacktiv.com/en/publications/deep-dive-into-the-deployment-of-an-on-premise-low-privileged-llm-server.html

Content Categories: Based on the analysis, this content was categorized under "🤖 AI Security (LLM serving hardening / LLM infrastructure attack surface) and/or 🐧 Linux Privilege Escalation -> Docker Security (rootless Podman, --network=none + UNIX sockets, AppArmor profiles); plus a note under 🕸️ Pentesting Web (reverse-proxy endpoint allowlisting to prevent unintended debug/info-leak routes)".

Repository Maintenance:

Review Notes:

Bot Version: HackTricks News Bot v1.0
🤖 Automated Content Update
This PR was automatically generated by the HackTricks News Bot based on a technical blog post.
📝 Source Information
🎯 Content Summary
Title/Context
The post (Synacktiv, published 20/03/2026) documents a security-first, low-privilege deployment of an on-premise LLM inference server intended for confidential business data. The core goals are: air-gapped instances (no data exfiltration over the network), strong isolation between teams/projects (one GPU per project), and a minimal host attack surface. The stack used throughout the article is:
• OS: Debian 13 (hardened)
• Inf...
🔧 Technical Details
Reduce inference-server data exposure by allowlisting API endpoints: Treat LLM inference servers as sensitive multi-user services. Debug/monitoring routes can expose internal state (e.g., a historical llama.cpp `/slots` endpoint that leaked full prompt contents when slot debugging was enabled). Place a reverse proxy in front of the server and enforce a strict deny-by-default allowlist (e.g., an nginx `map` returning `403` for non-audited routes). Additionally, disable server-side slot monitoring (`--no-slots`) so prompt/slot inspection endpoints cannot be used to exfiltrate user prompts and confidential inputs.

Run LLM servers in rootless containers without networking using UNIX sockets: If the inference server can listen on a UNIX socket, run containers with
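A deny-by-default allowlist of the kind described could be sketched in nginx as follows (the route names, socket path, and listen port are illustrative assumptions, not taken from the post):

```nginx
# Hypothetical nginx fragment: only explicitly audited routes are served;
# everything else (e.g. /slots, /metrics, /props) returns 403.
map $uri $allowed_route {
    default                 0;   # deny by default
    "/health"               1;
    "/v1/completions"       1;
    "/v1/chat/completions"  1;
}

server {
    listen 8080;

    location / {
        if ($allowed_route = 0) {
            return 403;          # non-audited route: block before proxying
        }
        # Forward allowed requests to the inference server over a UNIX socket
        proxy_pass http://unix:/run/llm/llama.sock;
    }
}
```

Because the `map` defaults to `0`, any endpoint added by a future server upgrade stays blocked until it is explicitly audited and added to the list.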
`--network=none` to eliminate TCP/IP exposure and remove the need for user-mode networking daemons (e.g., `pasta`). Control access with POSIX per...

🤖 Agent Actions
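A rootless Podman invocation matching this pattern might look like the following sketch (image name, socket directory, and the server's UNIX-socket flag are assumptions; llama.cpp's exact socket option may differ):

```bash
# No TCP/IP stack inside the container (no pasta/slirp4netns daemon needed);
# the only I/O channel is a bind-mounted UNIX-socket directory whose POSIX
# permissions restrict which host users (e.g. the reverse proxy) can connect.
mkdir -p /run/llm && chmod 0750 /run/llm

podman run --rm \
  --network=none \
  --userns=keep-id \
  -v /run/llm:/sock \
  localhost/llama-server:latest \
  --unix-socket /sock/llama.sock
```

With `--network=none` the container has only a loopback interface, so even a fully compromised inference process has no route to exfiltrate data over the network.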
Updated AI-Risk-Frameworks.md with a new AI-specific section on self-hosted LLM inference hardening. The added content focuses on the non-duplicative parts of the Synacktiv post: prompt leakage through debug endpoints like `/slots`, deny-by-default reverse-proxy allowlisting, rootless Podman with `--network none` over UNIX sockets, GPU device-node minimization (including the `nvidia-uvm` risk), and AppArmor/LSM confinement. I kept it in a single file and updated the references at the end of AI-Risk-Frameworks.md.

Verification: checked the final diff and only this file changed.
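The GPU device-node minimization and AppArmor confinement mentioned above could be combined in a sketch like this (which device nodes are strictly required depends on the driver and workload, and the AppArmor profile name is hypothetical):

```bash
# Pass through only the device nodes one GPU needs for this project;
# the post flags nvidia-uvm as a risk, so it is deliberately omitted here
# and should only be added if the workload actually requires it.
podman run --rm \
  --network=none \
  --device /dev/nvidia0 \
  --device /dev/nvidiactl \
  --security-opt apparmor=llm-server-profile \
  localhost/llama-server:latest
```

Limiting each container to a single `/dev/nvidiaN` node also enforces the post's one-GPU-per-project isolation boundary at the device level.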
This PR was automatically created by the HackTricks Feed Bot. Please review the changes carefully before merging.