
Security: salmad3/nousdev

Security Architecture

Trust Model

Nous processes user-authored documentation through a pipeline of parsers, transformers, and renderers. Each stage crosses a trust boundary where untrusted input must be validated before it can influence output.

Data Flow

User-authored source files (untrusted)
    │
    ├── Parsers (parser-md, parser-mdx, parser-adoc, parser-rst, parser-kd, parser-kdx)
    │   └── Validate: frontmatter schemas, annotation keys, container names
    │   └── Output: NDM nodes (structured, typed)
    │
    ├── Transformers (plugin-shiki, core pipeline)
    │   └── highlightedHtml: produced by Shiki, sanitized before emission
    │   └── Output: enriched NDM nodes
    │
    ├── Renderer (renderer-html)
    │   └── Entity escaping for all inline/attribute content
    │   └── DOMPurify sanitization for htmlBlock and highlightedHtml
    │   └── SafeHtml branded type prevents raw string emission
    │   └── Output: sanitized HTML string
    │
    ├── Metadata Emitters (agent-metadata)
    │   └── MarkdownBuilder for all user-content interpolation
    │   └── Context-specific escaping (link labels, URLs, code fence langs)
    │   └── Output: Markdown, JSON-LD, agents.json, OpenAPI specs
    │
    └── Search Client (search)
        └── href validation rejects protocol-relative URLs
        └── escapeHtml for all rendered search results
        └── Output: client-side search UI

Trust Boundaries

| Boundary | Location | Mechanism |
| --- | --- | --- |
| Source → NDM | Parser packages | Schema validation, annotation key allowlist, container name validation |
| NDM → HTML | renderer-html/render-nodes.ts | escapeHtml() for text/attributes, DOMPurify.sanitize() for htmlBlock and highlightedHtml |
| NDM → Markdown | agent-metadata/emitters/ | MarkdownBuilder with context-specific escaping |
| Search index → DOM | search/client.ts | escapeHtml(), href validation, escapeRegExp() |
| Annotation keys → Object properties | parser-kd/preprocessor.ts | Object.create(null) + VALID_ANNOTATION_KEY allowlist |
| Container names → Node types | parser-kd/preprocessor.ts | VALID_CONTAINER_NAME regex (/^[a-zA-Z][a-zA-Z0-9-]{0,63}$/) |
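
As an illustration of the search-client boundary, the following is a minimal sketch of how href validation and escaping might fit together. Only escapeHtml and escapeRegExp are names from this document; isSafeHref, highlight, and the policy of accepting only site-relative paths are assumptions made for the example, not the actual contents of search/client.ts.

```typescript
// Entity-escape text before it is placed into HTML.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Escape regex metacharacters so a user query is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical policy: accept only site-relative paths such as "/docs/intro".
function isSafeHref(href: string): boolean {
  if (href.charAt(0) !== "/") return false; // rejects absolute URLs and javascript:
  const second = href.charAt(1);
  if (second === "/" || second === "\\") return false; // protocol-relative: //host or /\host
  return !/[\u0000-\u001F\u007F]/.test(href); // control characters are stripped by URL parsers
}

// Hypothetical helper: wrap literal query matches in <mark>, escaping every piece.
function highlight(text: string, query: string): string {
  if (query === "") return escapeHtml(text);
  // Splitting on a capture group puts the matches at odd indices.
  const parts = text.split(new RegExp("(" + escapeRegExp(query) + ")", "gi"));
  return parts
    .map((part, i) => (i % 2 === 1 ? "<mark>" + escapeHtml(part) + "</mark>" : escapeHtml(part)))
    .join("");
}
```

Splitting the raw text first and escaping each piece afterwards keeps the query from matching inside an entity such as `&amp;`.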

Defense Layers

Layer 1: Input Validation (Parser Boundary)

Annotation keys are validated against an allowlist pattern (/^@?[a-zA-Z][a-zA-Z0-9_.-]{0,63}$/). Keys that don't match, such as __proto__, are silently dropped. Identifier-shaped names such as constructor do satisfy the pattern as written, so for those the protection comes from the second measure: annotation objects are created with Object.create(null), which has no prototype chain for a key to reach or shadow.

Container names are validated against /^[a-zA-Z][a-zA-Z0-9-]{0,63}$/. Invalid names are treated as plain text.
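
The two checks above can be sketched as follows. The patterns are copied from this section; collectAnnotations, isValidContainerName, and the Annotations type are hypothetical names chosen for the example and may not match parser-kd/preprocessor.ts.

```typescript
// Patterns as given in the text.
const VALID_ANNOTATION_KEY = /^@?[a-zA-Z][a-zA-Z0-9_.-]{0,63}$/;
const VALID_CONTAINER_NAME = /^[a-zA-Z][a-zA-Z0-9-]{0,63}$/;

type Annotations = Record<string, string>;

// Hypothetical helper: copy only allowlisted keys into a prototype-less object.
function collectAnnotations(pairs: Array<[string, string]>): Annotations {
  const out: Annotations = Object.create(null); // no Object.prototype behind it
  for (const [key, value] of pairs) {
    if (!VALID_ANNOTATION_KEY.test(key)) continue; // silently drop invalid keys
    out[key] = value;
  }
  return out;
}

// Hypothetical helper: callers treat a failing name as plain text.
function isValidContainerName(name: string): boolean {
  return VALID_CONTAINER_NAME.test(name);
}

const annotations = collectAnnotations([
  ["@audience", "developer"],
  ["__proto__", "polluted"], // starts with "_", so the pattern rejects it
  ["on click", "x"],         // contains a space, so the pattern rejects it
]);
```

Because the result has a null prototype, a lookup such as `"toString" in annotations` is false: there are no inherited properties to collide with.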

Layer 2: Sanitization (Render Boundary)

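HTML output is sanitized with DOMPurify (isomorphic-dompurify), which parses input into a real DOM (jsdom on the server) using the HTML parsing algorithm that browsers implement, rather than filtering with regular expressions. This narrows the parser differentials between sanitizer and browser that mutation XSS (mXSS) attacks exploit. It does not rule them out entirely, so the mXSS vectors listed under Adversarial Test Coverage are still tested explicitly.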

Two DOMPurify configurations are maintained:

  • DOMPURIFY_CONFIG: permissive allowlist for user-authored htmlBlock content (structural/semantic HTML elements)
  • HIGHLIGHTED_HTML_CONFIG: restrictive allowlist for syntax-highlighted code (only pre, code, span, div with class/style)

Pipeline-internal HTML (highlightedHtml from Shiki) passes through the restrictive sanitizer as well, so no HTML reaches the output merely because a pipeline stage is assumed to be trustworthy.
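
A rough sketch of what the two configurations might look like is shown below. ALLOWED_TAGS and ALLOWED_ATTR are real DOMPurify options, and the restrictive list follows the description above. The contents of the permissive list are an assumption for illustration only, as is the SanitizerConfig interface.

```typescript
// Subset of DOMPurify's options used in this sketch.
interface SanitizerConfig {
  ALLOWED_TAGS: string[];
  ALLOWED_ATTR: string[];
}

// Assumed contents: structural/semantic elements for user-authored htmlBlock.
// The project's real allowlist may differ.
const DOMPURIFY_CONFIG: SanitizerConfig = {
  ALLOWED_TAGS: [
    "div", "span", "p", "a", "ul", "ol", "li", "em", "strong",
    "table", "thead", "tbody", "tr", "th", "td",
    "details", "summary", "figure", "figcaption", "img", "pre", "code",
  ],
  ALLOWED_ATTR: ["class", "id", "href", "src", "alt", "title"],
};

// Restrictive allowlist for Shiki output, per the description above.
const HIGHLIGHTED_HTML_CONFIG: SanitizerConfig = {
  ALLOWED_TAGS: ["pre", "code", "span", "div"],
  ALLOWED_ATTR: ["class", "style"],
};

// With isomorphic-dompurify installed, usage would look roughly like:
//   import DOMPurify from "isomorphic-dompurify";
//   const clean = DOMPurify.sanitize(highlightedHtml, HIGHLIGHTED_HTML_CONFIG);
```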

Markdown output uses MarkdownBuilder, which separates structure from content at the type level. User-controlled strings pass through context-specific escaping functions (escapeLinkLabel, escapeLinkUrl, sanitizeLang, singleLine) that also strip Unicode bidirectional override characters.
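
To make the idea of context-specific escaping concrete, here is a minimal sketch. The function names escapeLinkLabel, escapeLinkUrl, sanitizeLang, and singleLine come from this document, but their bodies below are assumptions, as are the helpers stripBidi, percentEncode, and link. The real MarkdownBuilder may escape a different character set.

```typescript
// Bidirectional controls: embeddings/overrides (U+202A-U+202E)
// and isolates (U+2066-U+2069).
const BIDI_CONTROLS = /[\u202A-\u202E\u2066-\u2069]/g;

function stripBidi(s: string): string {
  return s.replace(BIDI_CONTROLS, "");
}

// Collapse line breaks so a value cannot start a new Markdown block.
function singleLine(s: string): string {
  return stripBidi(s).replace(/\s*[\r\n]+\s*/g, " ").trim();
}

// Backslash-escape characters that could close the label early.
function escapeLinkLabel(s: string): string {
  return singleLine(s).replace(/[\\\[\]]/g, "\\$&");
}

// Percent-encode one ASCII character, e.g. ")" -> "%29".
function percentEncode(c: string): string {
  const hex = c.charCodeAt(0).toString(16).toUpperCase();
  return "%" + (hex.length < 2 ? "0" + hex : hex);
}

// Encode ASCII characters that could terminate a (destination).
function escapeLinkUrl(s: string): string {
  return singleLine(s).replace(/[()<>\\\x00-\x20\x7F]/g, percentEncode);
}

// Keep only characters plausible in a code fence info string.
function sanitizeLang(s: string): string {
  return s.replace(/[^A-Za-z0-9_+#.-]/g, "").slice(0, 32);
}

// Hypothetical builder method: structure is fixed, content is escaped.
function link(label: string, url: string): string {
  return "[" + escapeLinkLabel(label) + "](" + escapeLinkUrl(url) + ")";
}
```

In this sketch the brackets and parentheses that form the link are written only by link(), so user-controlled text cannot contribute structural characters of its own.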

Layer 3: Type System (Compile-Time)

The SafeHtml branded type (nominal typing via unique symbol) makes raw string injection a compile-time error. SafeHtml values can only be created through:

  • createSafeHtml(): wraps DOMPurify output (caller must guarantee sanitization)
  • escapeToSafeHtml(): entity-escapes plain text
  • concatSafeHtml(): joins existing SafeHtml values

The unwrapSafeHtml() function extracts the raw string for final emission. Call sites using this function are trust extraction points and should be reviewed with the same scrutiny as SQL query construction.
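
The branded-type pattern can be sketched as below. The four function names are those listed above; the brand symbol name, the function bodies, and the emit helper are assumptions for illustration rather than the project's actual code.

```typescript
// The brand exists only at the type level; nothing is added at runtime.
declare const safeHtmlBrand: unique symbol;
type SafeHtml = string & { readonly [safeHtmlBrand]: true };

// Caller must guarantee that `sanitized` is sanitizer output.
function createSafeHtml(sanitized: string): SafeHtml {
  return sanitized as SafeHtml;
}

// Entity-escape plain text so it is inert in element and attribute context.
function escapeToSafeHtml(text: string): SafeHtml {
  const escaped = text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
  return escaped as SafeHtml;
}

// Joining safe fragments yields a safe fragment.
function concatSafeHtml(...parts: SafeHtml[]): SafeHtml {
  return parts.join("") as SafeHtml;
}

// Trust extraction point: the brand is dropped here.
function unwrapSafeHtml(html: SafeHtml): string {
  return html;
}

// Hypothetical sink that accepts only branded values.
function emit(html: SafeHtml): string {
  return unwrapSafeHtml(html);
}

const page = concatSafeHtml(
  createSafeHtml("<p>"),
  escapeToSafeHtml("<script>alert(1)</script>"),
  createSafeHtml("</p>"),
);

// emit("<p>raw</p>"); // compile-time error: string is not assignable to SafeHtml
```

The `as SafeHtml` casts are confined to the constructor functions, which keeps the set of places that can mint a SafeHtml value small and easy to audit.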

Layer 4: Content Security Policy (Browser)

The renderer emits a CSP meta tag via getCSPMetaTag():

default-src 'self';
script-src 'self';
style-src 'self' 'unsafe-inline';
img-src 'self' data: https:;
object-src 'none';
base-uri 'self';
form-action 'self';
frame-ancestors 'none'

Key properties:

  • script-src 'self' blocks inline scripts even if sanitization is bypassed
  • object-src 'none' blocks plugin-based attacks entirely
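  • frame-ancestors 'none' prevents clickjacking, but only when the policy is sent as an HTTP response header; browsers ignore frame-ancestors in a meta tag, so deployments need to set it at the server or CDN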
  • base-uri 'self' prevents base tag hijacking

The CSP can be customized via getCSPDirectives(overrides) for sites that require additional script sources (analytics, etc.).
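
A minimal sketch of how the override mechanism might work follows. getCSPDirectives and getCSPMetaTag are named in this document; the CSPDirectives type, the serialize helper, the signatures, and the choice that an override replaces a directive's source list wholesale are assumptions.

```typescript
type CSPDirectives = Record<string, string[]>;

// Defaults as listed above.
const DEFAULT_CSP: CSPDirectives = {
  "default-src": ["'self'"],
  "script-src": ["'self'"],
  "style-src": ["'self'", "'unsafe-inline'"],
  "img-src": ["'self'", "data:", "https:"],
  "object-src": ["'none'"],
  "base-uri": ["'self'"],
  "form-action": ["'self'"],
  "frame-ancestors": ["'none'"],
};

// Assumed semantics: an override replaces the whole source list for a directive.
function getCSPDirectives(overrides: CSPDirectives = {}): CSPDirectives {
  return { ...DEFAULT_CSP, ...overrides };
}

// Hypothetical helper: "name src1 src2; name src1 ..."
function serialize(directives: CSPDirectives): string {
  return Object.keys(directives)
    .map((name) => name + " " + directives[name].join(" "))
    .join("; ");
}

// Assumed signature: the policy string is attribute-escaped before emission.
function getCSPMetaTag(overrides: CSPDirectives = {}): string {
  const content = serialize(getCSPDirectives(overrides))
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;");
  return '<meta http-equiv="Content-Security-Policy" content="' + content + '">';
}

// Example: allow one additional script origin (placeholder host).
const withAnalytics = getCSPMetaTag({
  "script-src": ["'self'", "https://analytics.example"],
});
```

Replacing the list wholesale, rather than appending, means a site has to restate 'self' explicitly, which makes the resulting script-src easy to read in review.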

Adversarial Test Coverage

packages/renderer-html/src/security.test.ts contains adversarial tests covering:

  • OWASP top XSS payloads (script, iframe, object, embed, form, meta)
  • Event handler injection (onclick, onmouseover, onerror, onload, onfocus)
  • javascript:/vbscript:/data: URI schemes with case obfuscation and whitespace padding
  • Encoding attacks (mixed case, null bytes, HTML entity encoding, double encoding)
  • Mutation XSS vectors (nested tag confusion, SVG foreignObject, math/mtext namespace, noscript, template, style exfiltration)
  • highlightedHtml injection (script, event handlers, iframe, disallowed tags)
  • Inline content entity escaping (text nodes, link URLs, image alt, inline code)
  • CSP meta tag emission and override mechanics
  • Component props serialization and attribute breakout prevention

Reporting

Security issues should be reported via GitHub Security Advisories, not through public issues.
