Fix DuckLake transaction conflict and SSL connection loss handling#389
Merged
fuziontech merged 3 commits intomainfrom Apr 3, 2026
Merged
Fix DuckLake transaction conflict and SSL connection loss handling#389fuziontech merged 3 commits intomainfrom
fuziontech merged 3 commits intomainfrom
Conversation
Two distinct failure modes were causing elevated commit failures after the DuckDB 1.5.1/DuckLake 0.4 upgrade: 1. SSL connection loss (35 occurrences): isTransientDuckLakeError checked for "server closed the connection unexpectedly" but the actual error says "SSL connection has been closed unexpectedly" — the SSL variant was never detected, so retryOnTransient never fired. 2. Transaction conflicts: concurrent DuckLake model builds conflict on commit due to global snapshot IDs. Add retryOnConflict with exponential backoff + jitter (50ms initial, 5 retries, 2s cap) for autocommit queries in both control-plane and standalone modes. Also maps transaction conflicts to SQLSTATE 40001 (serialization_failure) so PG-aware clients know to retry, and adds Prometheus counters for conflict tracking. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix indentation in executeQueryDirect retryOnConflict closure - Preserve last DuckLake error in retryOnConflict exhaustion message (wraps with %w so callers see the actual conflict details) - Add conflict retry to executeSelectQuery for symmetry with Flight SQL handler's GetFlightInfoStatement (SELECT can conflict during schema probing) - Add comments explaining transient→conflict retry chaining behavior - Fix misleading test names and comments for non-conflict error case - Update exhaustion tests for the extra final fn() call Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
retryOnConflict now captures the last error from within the retry loop rather than making a hidden 6th attempt after exhaustion. The conflictMaxRetries constant now accurately reflects the actual number of retry attempts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
isTransientDuckLakeErrormissing SSL variant: the function checked for"server closed the connection unexpectedly"but production logs show"SSL connection has been closed unexpectedly"(35 occurrences). Also adds"Current transaction is aborted"for cascading PG failures.retryOnConflictfor DuckLake transaction conflicts: exponential backoff with jitter (50ms initial, 5 retries, 2s cap) for autocommit queries in both control-plane (Flight SQL) and standalone modes. User-managed transactions propagate the error.40001(serialization_failure) so PG-aware clients like SQLMesh know to retry.duckgres_{worker_,}ducklake_conflict_{total,retries_total,retry_successes_total,retries_exhausted_total}for monitoring conflict rates and retry effectiveness.Test plan
go build ./server/... ./duckdbservice/... ./controlplane/...go test ./duckdbservice/... -count=1— all passgo test ./server/... -count=1— all passgo vet ./server/... ./duckdbservice/...— cleanjust test-ducklake-concurrency— verify reduced client-visible conflicts🤖 Generated with Claude Code