fix(cm-client): preserve broken symlinks when zipping repositories#1468
fix(cm-client): preserve broken symlinks when zipping repositories#1468
Conversation
Replace zip-lib archiveFolder with direct yazl usage in zipRepository to handle broken (dangling) symlinks. zip-lib calls fs.stat() on symlink targets which throws ENOENT for broken symlinks, failing the entire code import. The new approach uses lstat to preserve symlinks as-is in the zip. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
solaris007
left a comment
There was a problem hiding this comment.
Hey @rpapani,
Good fix - broken symlinks crashing the entire code import pipeline is a real production pain point, and dropping down from zip-lib to yazl with lstat is the right approach.
Strengths
- Correct root cause fix (
src/index.js:393-394):zip-lib'sarchiveFoldercallsfs.stat()on symlink targets, which throws ENOENT for dangling symlinks. Usingyazldirectly withlstatSyncsolves this at the right layer. - Security boundary preserved (
src/index.js:326-335):#validateSymlinksstill runs before any zipping and still throws on symlinks escaping the repo root. The security gate is unchanged. - Clean separation of concerns (
src/index.js:349-371):#addDirToZipdoes exactly one thing and delegates validation to#validateSymlinks. The two-pass approach keeps the security boundary clear. - Stream error handling (
src/index.js:392-397): Both the yazl output stream and the write stream have error handlers wired to the rejection path. No silent failures. - Thorough test coverage: The broken-symlink test sets up a realistic nested directory structure and asserts on buffer content, metadata path, mode bits, and mtime. Four focused tests replacing two generic ones is a net positive.
- Low supply chain risk:
yazl3.3.1 is a mature, widely-used library (~2M weekly npm downloads) with a single transitive dependency (buffer-crc32). No known vulnerabilities.
Issues
Important (Should Fix)
-
Symlinks are stored as regular file content, not as ZIP symlink entries (
src/index.js:360)addBuffer(Buffer.from(linkTarget), metadataPath, { mode: stat.mode })writes the symlink target path as the file's content and sets mode to0o120777. The ZIP format has no native symlink type - this relies on the extractor interpreting Unix external attributes to reconstruct symlinks. Standard ZIP extractors (includingzip-lib'sextractused inunzipRepository) will extract this as a regular file containing the target path string, not as an actual symlink.This is not a security issue (the content is just a short path string, and the boundary check prevents escaping), but the PR claims "preserving them exactly as-is" - that claim depends entirely on the downstream consumer. If consumers expect working symlinks after extraction (e.g., Apache Dispatcher
enabled_farms -> available_farmspatterns), they will get regular files instead.Suggestion: Confirm with the downstream team (Cloud Manager pipeline) that this encoding is handled correctly on extraction. At minimum, document this behavior in the
zipRepositoryJSDoc so future engineers know the contract. -
Error test may be passing for the wrong reason (
test/cloud-manager-client.test.js-throws when yazl zip fails)The
throws when yazl zip fails and cleans up temp dirtest configures the mock pipe to emit'write failed', but it never configuresreaddirSyncStubfor the test'sclonePath. AfterbeforeEachresets the stub,readdirSync()returnsundefined. When#validateSymlinksiterates overundefinedwithfor...of, it throws aTypeErrorbefore yazl is ever used - so the test may be passing because of a TypeError rather than the intended stream error.Fix: Add
readdirSyncStub.withArgs(clonePath, { withFileTypes: true }).returns([]);at the top of this test case so the validation and walk phases complete cleanly and the test actually exercises the stream error path.
Minor (Nice to Have)
-
Empty directories are silently dropped (
src/index.js:352-370)#addDirToZiprecurses into directories but never adds directory entries themselves. If a directory is empty, it will be absent from the zip. In practice this is unlikely to matter (git doesn't track empty directories), but since the stated goal is "exact copy of the cloned ref," consider callingzip.addEmptyDirectory()when a directory has zero children, or document that empty directories are intentionally omitted. -
Missing test for writable stream error path (
src/index.js:393)The promise has two reject wires -
output.on('error')andzip.outputStream.on('error'). Only the outputStream error path is tested. A disk-full or permission-denied scenario would trigger the output error handler. Consider adding a test for that path. -
zip-libis now used only forextract(package.json:41)Not blocking, but the package now carries two ZIP libraries. Consider migrating
unzipRepositorytoyauzl(yazl's companion) in a follow-up to dropzip-libentirely.
Recommendations
- Document the symlink encoding contract in the JSDoc: "Symlinks are stored as regular entries whose content is the link target and whose mode preserves
S_IFLNK. Consumers must check mode bits to restore symlinks." - Consider combining
#validateSymlinksand#addDirToZipinto a single pass in a follow-up - this would eliminate duplicatereadlinkSync/readdirSynccalls and close the (theoretical) TOCTOU gap between validation and archiving. - Consider consolidating to yazl/yauzl and dropping
zip-libas a follow-up.
Assessment
Ready to merge? Yes, with minor fixes.
The core fix is correct and well-tested. The security boundary is maintained - symlink escape validation runs before zipping, lstat prevents symlink following, and stored content for symlinks is just the target path string (not file contents). The yazl dependency is low-risk. The main items to address are: (1) document or confirm the symlink-as-regular-file behavior with downstream consumers, and (2) fix the error test configuration so it exercises the intended code path. Neither is a merge blocker if you confirm the downstream behavior is acceptable.
ramboz
left a comment
There was a problem hiding this comment.
Looks good to me. I'd address the detailed PR feedback first though
Summary
archiveFolderwith directyazlusage inzipRepositoryto handle broken symlinks gracefully. zip-lib internally callsfs.stat()on symlink targets to determine file type, which throws ENOENT for broken (dangling) symlinks — causing the entire code import to fail.lstat(neverstat) to read symlink metadata and stores symlinks viayazl.addBuffer()with their original mode bits, preserving them exactly as-is in the zip regardless of whether the target exists.Context
Some customer repos may have broken symlinks on certain branches, which blocks code zipping and fails the entire code import. This fix allows zipping to proceed with broken symlinks as long as there are no security concerns (symlinks stay within the repo root).
Test plan
🤖 Generated with Claude Code