Conversation
Signed-off-by: gniadeck <77535280+gniadeck@users.noreply.github.com>
Signed-off-by: gniadeck <77535280+gniadeck@users.noreply.github.com>
Signed-off-by: gniadeck <77535280+gniadeck@users.noreply.github.com>
Signed-off-by: gniadeck <77535280+gniadeck@users.noreply.github.com>
Signed-off-by: gniadeck <77535280+gniadeck@users.noreply.github.com>
Signed-off-by: gniadeck <77535280+gniadeck@users.noreply.github.com>
Signed-off-by: gniadeck <77535280+gniadeck@users.noreply.github.com>
Signed-off-by: gniadeck <77535280+gniadeck@users.noreply.github.com>
…OptIn metric property Signed-off-by: gniadeck <77535280+gniadeck@users.noreply.github.com>
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
…o list Replace the boolean property with a comma-separated list of OTel metric names, giving users fine-grained control over which metrics use OpenTelemetry Semantic Conventions. Use "*" to enable all metrics. Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
| public class JvmGarbageCollectorMetrics { | ||
|
|
||
| private static final String JVM_GC_COLLECTION_SECONDS = "jvm_gc_collection_seconds"; | ||
| private static final String JVM_GC_DURATION = "jvm.gc.duration"; |
There was a problem hiding this comment.
should we be consistent with the way the other metric is defined?
| private static final String JVM_GC_DURATION = "jvm.gc.duration"; | |
| private static final String JVM_GC_DURATION = "jvm_gc_duration"; |
| if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION.equals( | ||
| notification.getType())) { | ||
| return; | ||
| } | ||
|
|
||
| GarbageCollectionNotificationInfo info = | ||
| GarbageCollectionNotificationInfo.from( | ||
| (CompositeData) notification.getUserData()); | ||
|
|
||
| observe(gcDurationHistogram, info); |
There was a problem hiding this comment.
try/catch here to avoid crashing the JVM is the notification listener throws something
| if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION.equals( | |
| notification.getType())) { | |
| return; | |
| } | |
| GarbageCollectionNotificationInfo info = | |
| GarbageCollectionNotificationInfo.from( | |
| (CompositeData) notification.getUserData()); | |
| observe(gcDurationHistogram, info); | |
| try { | |
| if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION.equals( | |
| notification.getType())) { | |
| return; | |
| } | |
| GarbageCollectionNotificationInfo info = | |
| GarbageCollectionNotificationInfo.from( | |
| (CompositeData) notification.getUserData()); | |
| observe(gcDurationHistogram, info); | |
| } catch (Exception e) { | |
| logger.warning( | |
| "Exception while processing garbage collection notification: " + e.getMessage()); | |
| } |
| .name(JVM_GC_DURATION) | ||
| .unit(Unit.SECONDS) | ||
| .help("Duration of JVM garbage collection actions.") | ||
| .labelNames("jvm.gc.action", "jvm.gc.name", "jvm.gc.cause") |
There was a problem hiding this comment.
jvm.gc.cause is still experimental, should we hold off on that one until it stabilizes?
| registerNotificationListener(gcDurationHistogram); | ||
| } | ||
|
|
||
| private void registerNotificationListener(Histogram gcDurationHistogram) { |
There was a problem hiding this comment.
taking a look at the OTel implementation, they also track the listeners and do a cleanup, not sure if it makes sense for us to implement something like that, by making JvmMetrics an AutoClosable. Looks like it ends up requiring a lot of changes, not sure if its worth it
Another thing they do is check to see if the class exists first , which I think would be good to do here too
| } | ||
|
|
||
| @Test | ||
| public void testNonOtelMetricsAbsentWhenUseOtelEnabled() { |
There was a problem hiding this comment.
| public void testNonOtelMetricsAbsentWhenUseOtelEnabled() { | |
| void testNonOtelMetricsAbsentWhenUseOtelEnabled() { |
|
|
||
| @Test | ||
| @SuppressWarnings("rawtypes") | ||
| public void testGCDurationHistogramLabels() throws Exception { |
There was a problem hiding this comment.
| public void testGCDurationHistogramLabels() throws Exception { | |
| void testGCDurationHistogramLabels() throws Exception { |
Supercedes #1738
introduces a Prometheus histogram metric
jvm_gc_duration_secondsto record JVM garbage collection pause durations. existing metrics provide limited visibility into long GC pauses, making it difficult to detect latency spikes. this change leverages the already registered GC notifications (as used inJvmMemoryPoolAllocationMetrics) to capture pause durations without additional instrumentation.the histogram uses
0.01, 0.1, 1, 10buckets and includes labels forgc name,action, andcause, enabling detailed monitoring of both short and long GC pauses. this addresses the lack of visibility highlighted in community discussions such as this one. buckets are also defined according to the opentelemetry semantic conventions specExample result: