hmon: add heartbeat monitor #67

Merged
arkjedrz merged 5 commits into eclipse-score:main from qorix-group:arkjedrz_heartbeat-monitor on Mar 6, 2026

Contributor

@arkjedrz arkjedrz commented Feb 9, 2026

Add heartbeat monitor HMON.

Resolves #68

@arkjedrz arkjedrz requested a review from pawelrutkaq February 9, 2026 15:43
@arkjedrz arkjedrz requested a review from Copilot February 9, 2026 15:43
github-actions bot commented Feb 9, 2026

License Check Results

🚀 The license check job ran with the Bazel command:

bazel run --lockfile_mode=error //:license-check

Status: ⚠️ Needs Review

[License Check Output]
Extracting Bazel installation...
Starting local Bazel server (8.4.2) and connecting to it...
INFO: Invocation ID: d74593c4-ca89-43be-8999-df68d28af178
Computing main repo mapping: 
WARNING: For repository 'score_rust_policies', the root module requires module version score_rust_policies@0.0.3, but got score_rust_policies@0.0.5 in the resolved dependency graph. Please update the version in your MODULE.bazel or set --check_direct_dependencies=off
Loading: 
Loading: 0 packages loaded
Analyzing: target //:license-check (1 packages loaded)
Analyzing: target //:license-check (1 packages loaded, 0 targets configured)
Analyzing: target //:license-check (1 packages loaded, 0 targets configured)

Analyzing: target //:license-check (64 packages loaded, 9 targets configured)

Analyzing: target //:license-check (89 packages loaded, 9 targets configured)

Analyzing: target //:license-check (145 packages loaded, 2462 targets configured)

Analyzing: target //:license-check (154 packages loaded, 3485 targets configured)

Analyzing: target //:license-check (155 packages loaded, 7715 targets configured)

Analyzing: target //:license-check (165 packages loaded, 7899 targets configured)

Analyzing: target //:license-check (165 packages loaded, 7901 targets configured)

Analyzing: target //:license-check (165 packages loaded, 7901 targets configured)

Analyzing: target //:license-check (167 packages loaded, 8025 targets configured)

Analyzing: target //:license-check (169 packages loaded, 9913 targets configured)

INFO: Analyzed target //:license-check (170 packages loaded, 10039 targets configured).
[12 / 16] JavaToolchainCompileClasses external/rules_java+/toolchains/platformclasspath_classes; 0s disk-cache, processwrapper-sandbox ... (2 actions running)
[14 / 16] JavaToolchainCompileBootClasspath external/rules_java+/toolchains/platformclasspath.jar; 0s disk-cache, processwrapper-sandbox
[15 / 16] Building license.check.license_check.jar (); 0s disk-cache, multiplex-worker
INFO: Found 1 target...
Target //:license.check.license_check up-to-date:
  bazel-bin/license.check.license_check
  bazel-bin/license.check.license_check.jar
INFO: Elapsed time: 26.873s, Critical Path: 2.67s
INFO: 16 processes: 12 internal, 3 processwrapper-sandbox, 1 worker.
INFO: Build completed successfully, 16 total actions
INFO: Running command line: bazel-bin/license.check.license_check ./formatted.txt <args omitted>
usage: org.eclipse.dash.licenses.cli.Main [-batch <int>] [-cd <url>]
       [-confidence <int>] [-ef <url>] [-excludeSources <sources>] [-help] [-lic
       <url>] [-project <shortname>] [-repo <url>] [-review] [-summary <file>]
       [-timeout <seconds>] [-token <token>]

github-actions bot commented Feb 9, 2026

The created documentation from the pull request is available at: docu-html

Copilot AI left a comment

Pull request overview

Adds a new heartbeat monitor (HMON) to the Rust health monitoring library and integrates it into the existing monitoring worker/supervisor notification flow.

Changes:

  • Introduces heartbeat module (monitor + atomic state) and integrates heartbeat monitors into HealthMonitorBuilder/HealthMonitor.
  • Updates the monitor evaluation interface to accept a shared hmon_starting_point and wires it through the monitoring worker thread.
  • Refactors SupervisorAPIClient into a dedicated module with selectable implementations via Cargo features.
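
As a rough illustration of the second bullet, the sketch below shows one way a shared starting instant could be turned into a compact time offset for every monitor. It is not taken from the PR: `duration_to_u32_ms` and `offset_since` are hypothetical names, and millisecond resolution is an assumption.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: convert a duration to whole milliseconds stored in a
/// `u32`, returning `None` when the value does not fit.
fn duration_to_u32_ms(d: Duration) -> Option<u32> {
    u32::try_from(d.as_millis()).ok()
}

/// Hypothetical helper: offset of `now` relative to the shared starting point.
/// Returns `None` if `now` is earlier than `start` or the offset overflows `u32`.
fn offset_since(start: Instant, now: Instant) -> Option<u32> {
    duration_to_u32_ms(now.checked_duration_since(start)?)
}
```

With one `Instant` captured when monitoring starts, every monitor can describe "when" as a small integer offset from that instant, which is easier to store in an atomic than an `Instant` itself.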

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| src/health_monitoring_lib/rust/worker.rs | Passes a shared HMON start instant into monitor evaluations; moves supervisor client trait out. |
| src/health_monitoring_lib/rust/supervisor_api_client/mod.rs | Adds feature-selected SupervisorAPIClient + implementation alias. |
| src/health_monitoring_lib/rust/supervisor_api_client/stub_supervisor_api_client.rs | New stub client implementation. |
| src/health_monitoring_lib/rust/supervisor_api_client/score_supervisor_api_client.rs | New SCORE client implementation. |
| src/health_monitoring_lib/rust/lib.rs | Adds heartbeat monitors to builder + start flow; uses new supervisor client impl selector. |
| src/health_monitoring_lib/rust/heartbeat/mod.rs | Exposes heartbeat monitor API. |
| src/health_monitoring_lib/rust/heartbeat/heartbeat_state.rs | Adds atomic packed heartbeat state and tests. |
| src/health_monitoring_lib/rust/heartbeat/heartbeat_monitor.rs | Implements heartbeat monitor logic + tests (incl. loom). |
| src/health_monitoring_lib/rust/deadline/deadline_monitor.rs | Adapts deadline evaluation to new evaluator signature and shared start instant. |
| src/health_monitoring_lib/rust/common.rs | Extends evaluation error types; adds duration_to_u32; updates evaluator trait signature. |
| src/health_monitoring_lib/Cargo.toml | Adds optional monitor_rs, loom target dep, and feature defaults. |
| src/health_monitoring_lib/BUILD | Enables score_supervisor_api_client feature in Bazel builds. |
| Cargo.toml | Updates workspace defaults and adds cfg(loom) lint configuration. |

@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from f07c0db to e65f6a6 Compare February 10, 2026 11:43
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 10, 2026 11:43 — with GitHub Actions Inactive
@arkjedrz arkjedrz requested a review from Copilot February 10, 2026 11:44
Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.

@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from e65f6a6 to bd438c3 Compare February 10, 2026 12:57
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 10, 2026 12:57 — with GitHub Actions Inactive
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from bd438c3 to c5f4b84 Compare February 10, 2026 13:14
@arkjedrz arkjedrz requested a review from Copilot February 10, 2026 13:15
Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.

@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from c5f4b84 to b012f37 Compare February 10, 2026 13:19
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 10, 2026 13:19 — with GitHub Actions Inactive
@arkjedrz arkjedrz requested a review from Copilot February 10, 2026 13:19
Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 4 comments.

@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from b012f37 to 862da21 Compare February 11, 2026 12:14
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 11, 2026 12:14 — with GitHub Actions Inactive
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from 862da21 to 59c92ee Compare February 13, 2026 14:48
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 13, 2026 14:48 — with GitHub Actions Inactive
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 13, 2026 14:48 — with GitHub Actions Inactive
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from 59c92ee to cf14efb Compare February 17, 2026 12:12
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 17, 2026 12:12 — with GitHub Actions Inactive
@arkjedrz arkjedrz temporarily deployed to workflow-approval February 17, 2026 12:12 — with GitHub Actions Inactive
@arkjedrz arkjedrz self-assigned this Feb 17, 2026
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from cf14efb to 67fe6cc Compare February 25, 2026 09:30
@arkjedrz arkjedrz temporarily deployed to workflow-approval March 3, 2026 10:16 — with GitHub Actions Inactive
@arkjedrz arkjedrz temporarily deployed to workflow-approval March 3, 2026 10:16 — with GitHub Actions Inactive
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from a8b11fc to ef78463 Compare March 3, 2026 12:39
@arkjedrz arkjedrz temporarily deployed to workflow-approval March 3, 2026 12:39 — with GitHub Actions Inactive
@arkjedrz arkjedrz temporarily deployed to workflow-approval March 3, 2026 12:39 — with GitHub Actions Inactive
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from ef78463 to 3816630 Compare March 4, 2026 09:35
@arkjedrz arkjedrz requested a deployment to workflow-approval March 4, 2026 09:36 — with GitHub Actions Waiting
@arkjedrz arkjedrz requested a deployment to workflow-approval March 4, 2026 09:36 — with GitHub Actions Waiting
@pawelrutkaq
Contributor

FYI: We are still dropping a refactor on this code today, as we discussed after review with @arkjedrz

@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from 3816630 to 27766ce Compare March 4, 2026 11:32
@arkjedrz arkjedrz requested a deployment to workflow-approval March 4, 2026 11:33 — with GitHub Actions Waiting
@arkjedrz arkjedrz requested a deployment to workflow-approval March 4, 2026 11:33 — with GitHub Actions Waiting
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from 27766ce to 77da130 Compare March 4, 2026 11:34
@arkjedrz arkjedrz requested a deployment to workflow-approval March 4, 2026 11:34 — with GitHub Actions Waiting
@arkjedrz arkjedrz requested a deployment to workflow-approval March 4, 2026 11:34 — with GitHub Actions Waiting
@arkjedrz arkjedrz force-pushed the arkjedrz_heartbeat-monitor branch from 77da130 to 125905c Compare March 4, 2026 13:54
@arkjedrz arkjedrz temporarily deployed to workflow-approval March 4, 2026 13:54 — with GitHub Actions Inactive
@arkjedrz arkjedrz temporarily deployed to workflow-approval March 4, 2026 13:54 — with GitHub Actions Inactive
arkjedrz added 3 commits March 6, 2026 09:32
Add heartbeat monitor HMON.
Rework heartbeat monitor state into two atomics.
- Add new constructor to `TimeRange`.
- Rework `time_offset`.
- Update state using `compare_exchange`.
- Other code, docs, comments fixes.
Use swap to read and reset state in single operation.
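
The idea behind reading and resetting state with a single swap can be sketched as follows. This is a minimal illustration and not the PR's actual state type: `HeartbeatCounter`, `heartbeat`, and `take` are hypothetical names, and a plain `AtomicU32` counter stands in for whatever the monitor really stores.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical heartbeat counter shared between a monitored task and the
/// monitoring worker. Not the PR's real type.
struct HeartbeatCounter {
    count: AtomicU32,
}

impl HeartbeatCounter {
    fn new() -> Self {
        Self { count: AtomicU32::new(0) }
    }

    /// Called by the monitored task to report that it is alive.
    /// Note: `fetch_add` wraps around on overflow.
    fn heartbeat(&self) {
        self.count.fetch_add(1, Ordering::AcqRel);
    }

    /// Called by the monitoring worker: returns the number of heartbeats
    /// seen since the previous call and resets the counter to zero in one
    /// atomic operation.
    fn take(&self) -> u32 {
        self.count.swap(0, Ordering::AcqRel)
    }
}
```

Compared with a separate `load` followed by a `store(0)`, a single `swap` leaves no window in which a heartbeat arriving between the two operations could be silently discarded.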
// Check current counter state.
let counter = snapshot.counter();
// Disallow multiple heartbeats in same heartbeat cycle.
if counter > 1 {
Contributor

@NicolasFussberger NicolasFussberger Mar 6, 2026

Do I understand correctly that there are some hard constraints on the internal_processing_cycle time for this to work? It must be chosen so that evaluate() is always called at the right time, such that it never spans across two alive intervals.

Image

In case multiple heartbeat monitors are configured, I wonder if you can find a suitable internal_processing_cycle such that this works?

I guess it works when you choose a very frequent internal_processing_cycle. Is this understanding correct?

Contributor Author

Not sure if I understand your diagram correctly:

  • If it represents a single monitor, then a correct heartbeat is the beginning of a new processing cycle, so everything after the first heartbeat is irrelevant in the first line.
  • If it represents multiple monitors, then heartbeats land at different monitors and their timelines are separate.

The internal processing cycle frequency must be at least twice as large as the largest heartbeat frequency required by any monitor. Not sure if that's a Nyquist frequency, but the rough idea remains the same.
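
Read literally, the rule in this comment gives an upper bound on the internal processing cycle: half of the shortest heartbeat interval across all monitors. The sketch below only illustrates that reading. `max_internal_cycle` is a hypothetical helper, not part of the library, and the exact bound the implementation needs is an assumption.

```rust
use std::time::Duration;

/// Hypothetical helper: the longest internal processing cycle that still
/// runs at twice the frequency of the fastest heartbeat, i.e. half of the
/// shortest heartbeat interval. Returns `None` when no monitors are given.
fn max_internal_cycle(min_heartbeat_intervals: &[Duration]) -> Option<Duration> {
    min_heartbeat_intervals
        .iter()
        .min()
        .map(|shortest| *shortest / 2)
}
```

For example, with monitors whose shortest allowed intervals are 10 ms and 100 ms, this yields a cycle of at most 5 ms, because the fastest monitor dictates the cycle for all of them.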

Contributor

Yes, it has to be < min(mon1.min, mon2.min, ...)*2. This is what I think we discussed at the beginning: the internal cycle is separate from the supervisor API notification cycle so that it can be tuned, because it makes no sense to have, e.g., a heartbeat every 10 ms and detect a failure only every 100 ms. Should we change that idea, a more sophisticated algorithm is needed, since you need to store up to N samples (and then decide how N is defined, etc.).

Contributor

OK, understood, thanks for the clarification. This means that use cases with no clear lower bound for the heartbeat interval are not supported by the heartbeat monitor, and in such cases you might use a deadline monitor instead.

Contributor

No problem. We can always decide later that we need extensions here, but this will not affect the user API.

- Change `from_interval` params.
- Add `new_internal` for later FFI usage.
- Change message in `duration_to_int`.
Development

Successfully merging this pull request may close these issues.

[HmLib] Rust Heartbeat Monitor API

5 participants