fix(lapis): health check: consider LAPIS "UP" even when SILO is down by fengelniederhammer · Pull Request #1594 · GenSpectrum/LAPIS

fengelniederhammer · 2026-03-12T10:48:53Z

This change ensures that Kubernetes will not restart LAPIS just because SILO is down (in which case restarting LAPIS is pointless).

PR Checklist

~~- [ ] All necessary documentation has been adapted.~~
~~- [ ] All necessary changes are explained in the llms.txt.~~
~~- [ ] The implemented feature is covered by an appropriate test.~~

vercel · 2026-03-12T10:48:58Z

Project	Deployment	Actions	Updated (UTC)
lapis	Ready	Preview, Comment	Mar 17, 2026 10:21am

Copilot

Pull request overview

Adjusts LAPIS’ Spring Boot Actuator health reporting so that the service remains reported as UP even if the dependent SILO service is unavailable, while still surfacing SILO-related diagnostics in health details.

Changes:

Always returns Health.up() from SiloHealthIndicator.
Adds SILO diagnostic details (versions on success; siloStatus/error info on failure) instead of marking overall health as DOWN.

lapis/src/main/kotlin/org/genspectrum/lapis/health/SiloHealthIndicator.kt

Copilot · 2026-03-12T10:53:10Z

lapis/src/main/kotlin/org/genspectrum/lapis/health/SiloHealthIndicator.kt

+        Health
+            .up() // LAPIS should always be "up", independent of SILO.
+            .let {
+                try {
+                    val info = cachedSiloClient.callInfo()
+                    it
+                        .withDetail("dataVersion", info.dataVersion)
+                        .withDetail("siloVersion", info.siloVersion ?: "unknown")
+                } catch (e: SiloNotReachableException) {
+                    it
+                        .withDetail("siloStatus", "DOWN")
+                        .withDetail("error", "SILO not reachable")
+                        .withDetail("message", e.message)
+                } catch (e: SiloUnavailableException) {
+                    it
+                        .withDetail("siloStatus", "DOWN")
+                        .withDetail("error", "SILO unavailable (HTTP 503)")
+                        .withDetail("message", e.message)
+                        .withDetail("retryAfter", e.retryAfter)
+                } catch (e: Exception) {
+                    it
+                        .withDetail("siloStatus", "DOWN")
+                        .withDetail("error", "Unexpected error checking SILO")
+                        .withDetail("message", e.message)
+                }
+            }
+            .build()


SiloHealthIndicator now always reports Spring Actuator status UP (even when SILO is down) and only encodes SILO failure via details. This is easy to misinterpret because /actuator/health components will still show this indicator as UP; if the intent is to keep overall LAPIS health UP, consider moving SILO checks into a separate health group/contributor (e.g., readiness) or otherwise returning a real DOWN status for SILO while excluding it from the liveness group.

Suggested change

-        Health
-            .up() // LAPIS should always be "up", independent of SILO.
-            .let {
-                try {
-                    val info = cachedSiloClient.callInfo()
-                    it
-                        .withDetail("dataVersion", info.dataVersion)
-                        .withDetail("siloVersion", info.siloVersion ?: "unknown")
-                } catch (e: SiloNotReachableException) {
-                    it
-                        .withDetail("siloStatus", "DOWN")
-                        .withDetail("error", "SILO not reachable")
-                        .withDetail("message", e.message)
-                } catch (e: SiloUnavailableException) {
-                    it
-                        .withDetail("siloStatus", "DOWN")
-                        .withDetail("error", "SILO unavailable (HTTP 503)")
-                        .withDetail("message", e.message)
-                        .withDetail("retryAfter", e.retryAfter)
-                } catch (e: Exception) {
-                    it
-                        .withDetail("siloStatus", "DOWN")
-                        .withDetail("error", "Unexpected error checking SILO")
-                        .withDetail("message", e.message)
-                }
-            }
-            .build()
+        try {
+            val info = cachedSiloClient.callInfo()
+            Health.up()
+                .withDetail("dataVersion", info.dataVersion)
+                .withDetail("siloVersion", info.siloVersion ?: "unknown")
+                .build()
+        } catch (e: SiloNotReachableException) {
+            Health.down()
+                .withDetail("siloStatus", "DOWN")
+                .withDetail("error", "SILO not reachable")
+                .withDetail("message", e.message)
+                .build()
+        } catch (e: SiloUnavailableException) {
+            Health.down()
+                .withDetail("siloStatus", "DOWN")
+                .withDetail("error", "SILO unavailable (HTTP 503)")
+                .withDetail("message", e.message)
+                .withDetail("retryAfter", e.retryAfter)
+                .build()
+        } catch (e: Exception) {
+            Health.down()
+                .withDetail("siloStatus", "DOWN")
+                .withDetail("error", "Unexpected error checking SILO")
+                .withDetail("message", e.message)
+                .build()
+        }
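
Copilot's idea of reporting a real DOWN for SILO while keeping it out of the liveness group can be illustrated with a small model. This is a minimal sketch in plain Kotlin, not Spring Boot Actuator's API: `ComponentStatus`, `groupStatus`, and the component names `lapis` and `silo` are assumptions made purely for illustration.

```kotlin
// Hypothetical model of health groups; none of these names come from LAPIS or Spring.
enum class ComponentStatus { UP, DOWN }

// A group is UP only if every component it includes is known and UP.
fun groupStatus(
    components: Map<String, ComponentStatus>,
    include: Set<String>,
): ComponentStatus =
    if (include.all { components[it] == ComponentStatus.UP }) {
        ComponentStatus.UP
    } else {
        ComponentStatus.DOWN
    }

fun main() {
    // SILO is down, the LAPIS process itself is fine.
    val components = mapOf(
        "lapis" to ComponentStatus.UP,
        "silo" to ComponentStatus.DOWN,
    )

    // Liveness ignores SILO, readiness includes it.
    val liveness = groupStatus(components, setOf("lapis"))
    val readiness = groupStatus(components, setOf("lapis", "silo"))

    println("liveness=$liveness readiness=$readiness")
}
```

Under these assumptions the liveness group stays UP while SILO is down, so a liveness probe would not trigger a restart, and the readiness group still reports DOWN.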

This sounds valid; we're in the 'SiloHealthIndicator', shouldn't there be a LapisHealthIndicator somewhere maybe?

Edit: This was in response to what ChatGPT said above.

The LAPIS health is indicated automatically by Spring.

If we don't want to have SILO in here at all, then we should maybe completely remove the SiloHealthIndicator? I tried to make it "info only" without affecting the LAPIS health.

lapis/src/main/kotlin/org/genspectrum/lapis/health/SiloHealthIndicator.kt

fhennig

What do you think about having UP, DOWN and DEGRADED?
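
One way to picture the three-state proposal is sketched below. It is a hedged illustration in plain Kotlin, not what the PR implements: `ServiceStatus`, `lapisStatus`, and `probeHttpCode` are hypothetical names, and the choice to answer 200 for DEGRADED is an assumption made so that a probe would only fail when LAPIS itself is unhealthy.

```kotlin
// Hypothetical three-state model; names are illustrative, not taken from LAPIS or Spring.
enum class ServiceStatus { UP, DEGRADED, DOWN }

// LAPIS itself failing is DOWN; LAPIS fine but SILO unreachable is DEGRADED.
fun lapisStatus(lapisHealthy: Boolean, siloReachable: Boolean): ServiceStatus = when {
    !lapisHealthy -> ServiceStatus.DOWN
    !siloReachable -> ServiceStatus.DEGRADED
    else -> ServiceStatus.UP
}

// Assumption: only DOWN becomes a failing probe response, so DEGRADED never causes a restart.
fun probeHttpCode(status: ServiceStatus): Int =
    if (status == ServiceStatus.DOWN) 503 else 200

fun main() {
    val status = lapisStatus(lapisHealthy = true, siloReachable = false)
    println("$status -> HTTP ${probeHttpCode(status)}")
}
```

The design question this raises is who consumes the extra state: a probe only distinguishes success from failure, so DEGRADED mainly helps humans and dashboards reading the health details.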

fhennig · 2026-03-12T14:39:10Z

lapis/src/main/kotlin/org/genspectrum/lapis/health/SiloHealthIndicator.kt

This sounds valid; we're in the 'SiloHealthIndicator', shouldn't there be a LapisHealthIndicator somewhere maybe?

Edit: This was in response to what ChatGPT said above.

🤖 I have created a release *beep* *boop*

---

## [0.8.0](v0.7.2...v0.8.0) (2026-03-19)

### ⚠ BREAKING CHANGES

* **lapis:** Backslash is now an escape character inside quoted strings. In regex queries that use `\` as a regex escape, you'll need to pass `\\` now, e.g. before: `division.regex=Basel\{1,2\}` -> after: `division.regex=Basel\\{1,2\\}`

### Features

* **lapis:** support escaping single quotes in advanced query string values ([#1599](#1599)) ([88d59e9](88d59e9))

### Bug Fixes

* **lapis:** health check: consider LAPIS "UP" even when SILO is down ([#1594](#1594)) ([b9abdf9](b9abdf9))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
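
The breaking change in the release notes means a client has to double each backslash in a regex value before sending it. A minimal sketch of that step, assuming the value is prepared client-side in Kotlin; `escapeBackslashes` is a hypothetical helper, not part of LAPIS:

```kotlin
// Hypothetical helper: doubles every backslash so that a regex escape
// survives the string-level escaping described in the release notes.
fun escapeBackslashes(value: String): String = value.replace("\\", "\\\\")

fun main() {
    // Raw string, so the backslashes below are literal characters.
    val regex = """Basel\{1,2\}"""
    println("division.regex=" + escapeBackslashes(regex))
    // prints: division.regex=Basel\\{1,2\\}
}
```

This reproduces the before/after pair quoted in the release notes. It does not cover the single-quote escaping mentioned under Features.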

fix(lapis): health check: consider LAPIS "UP" even when SILO is down

02e9678

fengelniederhammer requested review from Copilot and fhennig March 12, 2026 10:48

fengelniederhammer self-assigned this Mar 12, 2026

Copilot started reviewing on behalf of fengelniederhammer March 12, 2026 10:49 View session

Copilot AI reviewed Mar 12, 2026

View reviewed changes

fhennig reviewed Mar 12, 2026

View reviewed changes

fengelniederhammer added 2 commits March 17, 2026 11:19

refactor

164f64f

refactor

b06523d

vercel bot deployed to Preview March 17, 2026 10:21 View deployment

fengelniederhammer requested a review from fhennig March 17, 2026 10:23

fhennig approved these changes Mar 18, 2026

View reviewed changes

fengelniederhammer merged commit b9abdf9 into main Mar 18, 2026
9 checks passed

fengelniederhammer deleted the upStatus branch March 18, 2026 09:58

github-actions bot mentioned this pull request Mar 18, 2026

chore(main): release 0.8.0 #1604

Merged

fengelniederhammer merged 3 commits into main from upStatus
