feat(api-client): Update WebSocket connection logic to fallback from /websocket to /await on 404 [WPB-23436]#20439
feat(api-client): Update WebSocket connection logic to fallback from /websocket to /await on 404 [WPB-23436]#20439thisisamir98 wants to merge 6 commits intodevfrom
Conversation
…/websocket to /await on 404 [WPB-23436]
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## dev #20439 +/- ##
==========================================
+ Coverage 45.34% 45.35% +0.01%
==========================================
Files 1635 1635
Lines 40312 40327 +15
Branches 8335 8335
==========================================
+ Hits 18279 18291 +12
- Misses 20101 20105 +4
+ Partials 1932 1931 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR updates the WebSocket connection logic in the @wireapp/api-client library to implement automatic endpoint discovery and fallback from /websocket to /await endpoints. The change replaces a temporary domain-based routing hack with a more robust test-and-fallback mechanism that probes the /websocket endpoint before attempting connection.
Changes:
- Implements
findSocketAddressToUse()method that tests/websocketendpoint availability and falls back to/awaiton failure - Makes
buildWebSocketUrl()async to accommodate the endpoint discovery logic - Updates
@wireapp/api-clientpackage version from 27.95.5 to 27.95.6 in core library dependencies - Removes the temporary
_temporaryIsProdBackend()domain detection method
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| libraries/api-client/src/tcp/WebSocketClient.ts | Implements async endpoint discovery with test connection and timeout-based fallback logic, removes domain-based hack |
| libraries/core/package.json | Updates api-client dependency to version 27.95.6 |
| yarn.lock | Updates lock file to reflect new api-client version |
|
🔗 Download Full Report Artifact 🧪 Playwright Test Summary
|
There was a problem hiding this comment.
[Blocker] - Security: Access token exposed in test WebSocket URL
The findSocketAddressToUse method creates a test WebSocket connection with the full queryString that includes the access token. This test connection is made before the actual authenticated connection, potentially exposing the token in browser DevTools, network logs, or other monitoring systems.
If the test connection fails and an error is logged anywhere in the WebSocket stack, the access token could be leaked. Additionally, this creates an extra authentication attempt that could be logged server-side, potentially causing security alerts or rate limiting issues.
Consider:
- Making the test connection without authentication parameters (if the endpoint allows checking availability without auth)
- Using a HEAD request or OPTIONS request to test endpoint availability instead of opening a full WebSocket connection
- At minimum, ensure no logging occurs that could expose the token in the test connection
There was a problem hiding this comment.
[Important] - Cached socket address is never invalidated
The cachedSocketAddress is set once and never cleared, even when:
- The WebSocket disconnects or encounters errors
- The backend infrastructure changes (e.g., /websocket endpoint becomes unavailable)
- The client reconnects after a network change
This means if the initial probe succeeds with /websocket but that endpoint later becomes unavailable (e.g., during a rolling deployment), the client will continue trying /websocket indefinitely without re-probing.
Consider adding a mechanism to:
- Clear the cache on connection failures or errors
- Re-probe after a certain number of failed connection attempts
- Add a public method to reset the cache when needed (e.g.,
resetSocketAddressCache())
There was a problem hiding this comment.
[Important] - Missing test coverage for new WebSocket endpoint fallback logic
The new findSocketAddressToUse method implements critical connection logic but has no test coverage. Given that:
- Other methods in WebSocketClient.test.ts have test coverage
- This is a critical path for establishing connections
- The logic involves timing, error handling, and resource cleanup
Tests should cover:
- Successful connection to
/websocketendpoint - Fallback to
/awaiton error - Fallback to
/awaiton timeout - Proper cleanup of test WebSocket connections
- Cache behavior (not re-probing on subsequent calls)
There was a problem hiding this comment.
[Suggestion] - Fallback logic doesn't specifically check for 404 as mentioned in PR title
The PR title mentions "fallback from /websocket to /await on 404 [WPB-23436]", but the implementation falls back on ANY error (line 313: testSocket.onerror = handleFailure;) or timeout, not specifically 404 responses.
While this may be intentional (falling back on any error is more robust), consider:
- If you specifically want to check for 404, you would need to inspect the error object or use a different approach (HTTP HEAD request first)
- If the current behavior is correct, update the PR title/description to reflect that it falls back on any connection failure, not just 404
- Document why falling back on all errors is the chosen approach (e.g., some backends might return 403, 401, or other errors instead of 404)
| } | ||
|
|
||
| public buildWebSocketUrl(): string { | ||
| public async buildWebSocketUrl(): Promise<string> { |
There was a problem hiding this comment.
[Important] - Breaking API change: buildWebSocketUrl signature changed from sync to async
The method buildWebSocketUrl was changed from returning string to returning Promise<string>. This is a breaking change for any external code that calls this public method synchronously.
While the internal usage in onReconnect already handles it with await, any external consumers of this API will break. For example:
- Direct calls like
const url = client.buildWebSocketUrl()will now get a Promise instead of a string - This could affect testing, debugging tools, or other libraries that depend on @wireapp/api-client
Consider:
- Deprecating the old method and creating a new async method (e.g.,
buildWebSocketUrlAsync) - Documenting this as a breaking change in the changelog
- Bumping the major version number to signal the breaking change (though this appears to be a minor version bump from 27.95.5 to 27.95.6)
| public async buildWebSocketUrl(): Promise<string> { | |
| /** | |
| * @deprecated This method is now asynchronous and returns a Promise. | |
| * Use {@link buildWebSocketUrlAsync} instead to make the async nature explicit. | |
| */ | |
| public buildWebSocketUrl(): Promise<string> { | |
| return this.buildWebSocketUrlAsync(); | |
| } | |
| public async buildWebSocketUrlAsync(): Promise<string> { |
There was a problem hiding this comment.
[Blocker] - Resource leak: Test WebSocket connection is not properly cleaned up in all scenarios
The findSocketAddressToUse method creates a test WebSocket connection but has several critical issues:
-
No error listener removal: After
handleSuccessorhandleFailureis called, the event listeners remain attached to the closed socket, potentially causing memory leaks if the Promise resolves but events fire later. -
Race condition with close(): Calling
testSocket.close()is asynchronous, but the function resolves immediately. Ifonerrorfires afterhandleSuccesshas been called, it will callhandleFailurewhich will try to close an already-closing socket and resolve the Promise again (though this is harmless for Promises, it's still incorrect flow). -
Missing onclose handler: If the connection closes unexpectedly (not via error or open), the Promise never resolves, hanging indefinitely.
Suggested improvements:
- Clear event handlers after resolution to prevent late-firing events
- Add an
onclosehandler to catch unexpected closures - Consider using AbortController or a flag to prevent multiple resolutions
| // In case the connection hangs, we fallback to the await endpoint after 5 seconds | |
| const timeoutId = setTimeout(() => { | |
| if (testSocket.readyState !== WebSocket.OPEN) { | |
| handleFailure(); | |
| } | |
| }, 5000); | |
| const handleSuccess = () => { | |
| clearTimeout(timeoutId); | |
| testSocket.close(); | |
| resolve('websocket'); | |
| }; | |
| const handleFailure = () => { | |
| clearTimeout(timeoutId); | |
| testSocket.close(); | |
| resolve('await'); | |
| }; | |
| testSocket.onopen = handleSuccess; | |
| testSocket.onerror = handleFailure; | |
| let settled = false; | |
| const cleanup = () => { | |
| testSocket.onopen = null; | |
| testSocket.onerror = null; | |
| testSocket.onclose = null; | |
| }; | |
| // In case the connection hangs, we fallback to the await endpoint after 5 seconds | |
| const timeoutId = setTimeout(() => { | |
| if (!settled && testSocket.readyState !== WebSocket.OPEN) { | |
| handleFailure(); | |
| } | |
| }, 5000); | |
| const handleSuccess = () => { | |
| if (settled) { | |
| return; | |
| } | |
| settled = true; | |
| clearTimeout(timeoutId); | |
| cleanup(); | |
| testSocket.close(); | |
| resolve('websocket'); | |
| }; | |
| const handleFailure = () => { | |
| if (settled) { | |
| return; | |
| } | |
| settled = true; | |
| clearTimeout(timeoutId); | |
| cleanup(); | |
| testSocket.close(); | |
| resolve('await'); | |
| }; | |
| testSocket.onopen = handleSuccess; | |
| testSocket.onerror = handleFailure; | |
| testSocket.onclose = () => { | |
| if (!settled) { | |
| handleFailure(); | |
| } | |
| }; |
There was a problem hiding this comment.
This variable is just an alias to another one. Do we really need that?
There was a problem hiding this comment.
| if (!this.cachedSocketAddress) { | |
| if (this.cachedSocketAddress !== undefined) { |
I know this is spread across the whole code base but it would be great to make the intention more clear and to prevent something that is called future drift:
if (!this.cachedSocketAddress) could mean (but you don't know it):
- if
cachedSocketAddresschanges in the future to be string instead of a string literal and uses an empty string this statement will be truthy - if it will become
nullthennullwill be also treated as missing
There was a problem hiding this comment.
Instead of an hard timer wouldn't it be better to listen for error cases? So for example
testSocket.addEventListener('error', handleFailure);If the WebSocket connection could really "hang", wouldn't it be easier to introduce timing out promises so that you would not have to take care about cleaning up things afterwards manually. It also provides fallbacks (which would be exactly what I want to solve here: use /websocket and fallback to /await)
|
| ? `${this.baseUrl}/${cachedSocketPrefixValue}?${queryString}` | ||
| : `${this.baseUrl}${this.versionPrefix}/events?${queryString}`; |
There was a problem hiding this comment.
[Blocker] - Logging the full WebSocket URL leaks the access_token query parameter into client logs, which is sensitive and can be exfiltrated via log collection/bug reports.
Suggested fix: redact/remove sensitive query parameters before logging (or log only the path/endpoint type without the token).
| if (this.cachedSocketAddress.isNothing) { | ||
| const webSocketAddressPrefix = await findWebSocketAddressPrefix({ | ||
| baseUrl: this.baseUrl, | ||
| queryString, | ||
| webSocket: WebSocket, | ||
| connectionTimeoutInMilliseconds: 5000, | ||
| }); | ||
|
|
||
| this.cachedSocketAddress = toolbelt.fromResult(webSocketAddressPrefix); | ||
| } | ||
|
|
||
| const webSocketAddress = this.cachedSocketAddress.match({ | ||
| Just: cachedSocketPrefixValue => { | ||
| return this.useLegacySocket | ||
| ? `${this.baseUrl}/${cachedSocketPrefixValue}?${queryString}` | ||
| : `${this.baseUrl}${this.versionPrefix}/events?${queryString}`; | ||
| }, | ||
| Nothing: () => { | ||
| return `${this.baseUrl}/await?${queryString}`; | ||
| }, | ||
| }); | ||
|
|
There was a problem hiding this comment.
[Important] - buildWebSocketUrl performs the /websocket probe even when useLegacySocket is false (the result is ignored and ${versionPrefix}/events is used). This adds an extra connection attempt and can fail in setups where /websocket is disabled.
Suggested fix: gate the probing/caching logic behind this.useLegacySocket so it only runs for legacy socket URLs.
| if (this.cachedSocketAddress.isNothing) { | |
| const webSocketAddressPrefix = await findWebSocketAddressPrefix({ | |
| baseUrl: this.baseUrl, | |
| queryString, | |
| webSocket: WebSocket, | |
| connectionTimeoutInMilliseconds: 5000, | |
| }); | |
| this.cachedSocketAddress = toolbelt.fromResult(webSocketAddressPrefix); | |
| } | |
| const webSocketAddress = this.cachedSocketAddress.match({ | |
| Just: cachedSocketPrefixValue => { | |
| return this.useLegacySocket | |
| ? `${this.baseUrl}/${cachedSocketPrefixValue}?${queryString}` | |
| : `${this.baseUrl}${this.versionPrefix}/events?${queryString}`; | |
| }, | |
| Nothing: () => { | |
| return `${this.baseUrl}/await?${queryString}`; | |
| }, | |
| }); | |
| let webSocketAddress: string; | |
| if (this.useLegacySocket) { | |
| if (this.cachedSocketAddress.isNothing) { | |
| const webSocketAddressPrefix = await findWebSocketAddressPrefix({ | |
| baseUrl: this.baseUrl, | |
| queryString, | |
| webSocket: WebSocket, | |
| connectionTimeoutInMilliseconds: 5000, | |
| }); | |
| this.cachedSocketAddress = toolbelt.fromResult(webSocketAddressPrefix); | |
| } | |
| webSocketAddress = this.cachedSocketAddress.match({ | |
| Just: cachedSocketPrefixValue => { | |
| return `${this.baseUrl}/${cachedSocketPrefixValue}?${queryString}`; | |
| }, | |
| Nothing: () => { | |
| return `${this.baseUrl}/await?${queryString}`; | |
| }, | |
| }); | |
| } else { | |
| webSocketAddress = `${this.baseUrl}${this.versionPrefix}/events?${queryString}`; | |
| } |
| export function findWebSocketAddressPrefix( | ||
| dependencies: FindWebSocketAddressPrefixOptions, | ||
| ): Task<WebSocketAddressPrefix, unknown> { | ||
| const {baseUrl, queryString, webSocket, connectionTimeoutInMilliseconds} = dependencies; | ||
|
|
There was a problem hiding this comment.
[Important] - FindWebSocketAddressPrefix (and new dependencies) is added, but WebSocketClient implements its own separate probing logic (findSocketAddressToUse). This duplicates behavior and risks divergence.
Suggested fix: either wire WebSocketClient to use findWebSocketAddressPrefix (and remove the duplicate), or remove the unused helper/tests/dependencies if it’s not meant to be used.
| testSocket.close(); | ||
| resolve('websocket'); | ||
| }; | ||
| testSocket.onerror = () => { |
There was a problem hiding this comment.
[Important] - On onerror, the test socket is not closed in tryToEstablishWebSocketConnection, which can leak a socket instance and associated handlers.
Suggested fix: close the socket (and optionally clear handlers) before rejecting/resolving on error.
| testSocket.onerror = () => { | |
| testSocket.onerror = () => { | |
| try { | |
| // Clear handlers to avoid retaining references unnecessarily | |
| testSocket.onopen = null; | |
| testSocket.onerror = null; | |
| testSocket.onclose = null; | |
| testSocket.close(); | |
| } catch { | |
| // Swallow errors from close to ensure rejection still occurs | |
| } |
|
|
||
| return websocketAddress; | ||
| return webSocketAddress; | ||
| } | ||
|
|
||
| public useAsyncNotificationsSocket() { |
There was a problem hiding this comment.
[Blocker] - findSocketAddressToUse uses the global WebSocket constructor (new WebSocket(...)), which will throw in Node/test environments. This library already uses a Node shim (WebSocketNode) via ReconnectingWebsocket, so reconnect/URL building can break outside the browser.
Suggested fix: reuse the same WebSocket implementation as ReconnectingWebsocket (or inject a WebSocket constructor) so the probe works in both browser and Node.
| const webSocketAddressPrefix = await findWebSocketAddressPrefix({ | ||
| baseUrl: this.baseUrl, | ||
| queryString, | ||
| webSocket: WebSocket, | ||
| connectionTimeoutInMilliseconds: 5000, | ||
| }); | ||
|
|
||
| this.cachedSocketAddress = toolbelt.fromResult(webSocketAddressPrefix); |
There was a problem hiding this comment.
[Blocker] - The fallback decision is based on any connection error/timeout, not specifically a 404 on /websocket (e.g., transient network issues or auth failures will permanently cache 'await'). This can silently degrade to the long-polling endpoint even when /websocket exists.
Suggested fix: determine endpoint availability via an HTTP request that can inspect status codes (or otherwise only cache fallback on a confirmed 404), and avoid caching negative results caused by temporary failures.
| const webSocketAddressPrefix = await findWebSocketAddressPrefix({ | |
| baseUrl: this.baseUrl, | |
| queryString, | |
| webSocket: WebSocket, | |
| connectionTimeoutInMilliseconds: 5000, | |
| }); | |
| this.cachedSocketAddress = toolbelt.fromResult(webSocketAddressPrefix); | |
| try { | |
| const webSocketAddressPrefix = await findWebSocketAddressPrefix({ | |
| baseUrl: this.baseUrl, | |
| queryString, | |
| webSocket: WebSocket, | |
| connectionTimeoutInMilliseconds: 5000, | |
| }); | |
| const maybePrefix = toolbelt.fromResult(webSocketAddressPrefix); | |
| // Only cache positive results to avoid permanently degrading to the fallback | |
| if (maybePrefix.isJust) { | |
| this.cachedSocketAddress = maybePrefix; | |
| } | |
| } catch (error) { | |
| // [Important] Log and continue to fallback logic without caching a negative result | |
| this.logger.error('Failed to determine WebSocket address prefix', error); | |
| } |



Pull Request
Summary
Security Checklist (required)
Accessibility (required)
Standards Acknowledgement (required)
Screenshots or demo (if the user interface changed)
Notes for reviewers