Skip to content

Slow invocation of read callbacks#622

Open
RGleb wants to merge 3 commits intodatastax:masterfrom
RGleb:master
Open

Slow invocation of read callbacks#622
RGleb wants to merge 3 commits intodatastax:masterfrom
RGleb:master

Conversation

@RGleb
Copy link

@RGleb RGleb commented Feb 14, 2026

Issue

Here, the driver invokes read callbacks sequentially. When the driver invokes a callback, the thread implicitly invokes user code outside the driver. If you need to invoke multiple callbacks and the first one is slow, the others will wait. This may cause performance issues with the user application.

Possible solution

In the TcsMetricsRequestResultHandler we can create a TaskCompletionSource with the TaskCreationOptions.RunContinuationsAsynchronously option. This will cause the user code to run on a different thread. I fixed the issue in this PR, but the TaskCompletionSource is still used in other places without the TaskCreationOptions.RunContinuationsAsynchronously option.

Should this issue be fixed in the driver or is it a user application issue?

…Asynchronously in TcsMetricsRequestResultHandler for netstandard
@joao-r-reis
Copy link
Collaborator

joao-r-reis commented Feb 16, 2026

This will only be an issue if the application has a SynchronizationContext and it is not using await.ConfigureAwait(false).

If your app is a ASP NET Core app for example then by default there is no SynchronizationContext. Even if the app has a context as long as ConfigureAwait(false) is used then there won't be an issue either.

This PR would make it so that the driver is more "defensive" to handle cases where users don't correctly use ConfigureAwait(false) but it would bring a minor performance hit to those who do especially if it's a high throughput application.

At this time I don't see a strong case to accept this change.

@RGleb
Copy link
Author

RGleb commented Feb 17, 2026

This will only be an issue if the application has a SynchronizationContext and it is not using await.ConfigureAwait(false)

Look at the test (especially Thread.Sleep after ExecuteAsync), it uses await.ConfigureAwait(false) everywhere, so continuations should be executed in the thread pool, but the thread pool has no reason to execute continuations in another thread, because the code is already executing in the thread that called the _taskCompletionSource.SetResult. For example, in ASP.NET Core, we can force the thread pool to continue after ExecuteAsync on another thread if we call await Task.Run(() => Thread.Sleep(3000)).ConfigureAwait(false); or await Task.Yield(); before Thread.Sleep, but it would be better if the driver tried to avoid executing continuations outside itself at least in the case when you need to call several callbacks in this cycle.

You can run the test from this pull request with the net8 target framework and see that with the fix it passes in 9 seconds, but without the fix it often fails after about 30 seconds.

but it would bring a minor performance hit to those who do especially if it's a high throughput application

In case of possible throughput issues, you can add a setting that will allow you to change the driver's behavior.
But at the moment, even if you use await.ConfigureAwait(false) everywhere, you will still have an application throughput issue if slow synchronous application code is executed after awaiting for an asynchronous driver method call.

Copy link
Collaborator

@joao-r-reis joao-r-reis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm using synchronous blocking code on the test is a classic example of bad application behaviour but it does show that if the application just does some amount of CPU bound work after executing a request it will cause issues. I still have some concern that this might be detrimental for applications that don't do this but there is definitely a stronger case to merge this.

Unfortunately this fix will not be applied for applications that use any version of .NET Framework because the net452 target will be chosen in those scenarios but I don't want to change or add a new target framework in this PR.

I'm not sure when I will be able to get out a C# driver release but I'll merge this after CI passes.

@RGleb
Copy link
Author

RGleb commented Feb 18, 2026

Unfortunately this fix will not be applied for applications that use any version of .NET Framework

There is an option here to invoke callbacks this way

await Task.WhenAll(operationCallbacks.Select(cb => cb(stream, timestamp))).ConfigureAwait(false);

but this requires a refactoring related to reading data from MemoryStream, for example, you can parse the responses before invoking callbacks.

I'm not sure when I will be able to get out a C# driver release but I'll merge this after CI passes.

Ok, thank you!

@joao-r-reis
Copy link
Collaborator

There is an option here to invoke callbacks this way

When I have a bit more time to do some larger development on this driver I think we will just bump the targets to net8;netstandard2.0;net462 so this will no longer be an issue anyway

@joao-r-reis
Copy link
Collaborator

The test is failing in CI, the test needs to be a bit more tolerant to slower machines

Failed SessionExecuteAsyncCQLQueriesParallelAndSlowResponseProcessingInUserThread [6 s]
[2026-02-18T14:45:47.182Z]   Error Message:
[2026-02-18T14:45:47.182Z]      Expected: less than 5000
[2026-02-18T14:45:47.182Z]   But was:  6563
[2026-02-18T14:45:47.182Z] 
[2026-02-18T14:45:47.182Z]   Stack Trace:
[2026-02-18T14:45:47.182Z]      at Cassandra.IntegrationTests.Core.SessionExecuteAsyncTests.SessionExecuteAsyncCQLQueriesParallelAndSlowResponseProcessingInUserThread() in /home/jenkins/workspace/drivers_csharp_oss_PR-622/src/Cassandra.IntegrationTests/Core/SessionExecuteAsyncTests.cs:line 120
[2026-02-18T14:45:47.182Z]    at NUnit.Framework.Internal.TaskAwaitAdapter.GenericAdapter`1.BlockUntilCompleted()
[2026-02-18T14:45:47.182Z]    at NUnit.Framework.Internal.MessagePumpStrategy.NoMessagePumpStrategy.WaitForCompletion(AwaitAdapter awaiter)
[2026-02-18T14:45:47.182Z]    at NUnit.Framework.Internal.AsyncToSyncAdapter.Await(Func`1 invoke)
[2026-02-18T14:45:47.182Z]    at NUnit.Framework.Internal.Commands.TestMethodCommand.RunTestMethod(TestExecutionContext context)
[2026-02-18T14:45:47.182Z]    at NUnit.Framework.Internal.Commands.TestMethodCommand.Execute(TestExecutionContext context)
[2026-02-18T14:45:47.182Z]    at NUnit.Framework.Internal.Commands.BeforeAndAfterTestCommand.<>c__DisplayClass1_0.<Execute>b__0()
[2026-02-18T14:45:47.182Z]    at NUnit.Framework.Internal.Commands.DelegatingTestCommand.RunTestMethodInThreadAbortSafeZone(TestExecutionContext context, Action action)
[2026-02-18T14:45:47.182Z]

@joao-r-reis
Copy link
Collaborator

Can you also add a 3.23.0 section in CHANGELOG.md with a TBD date and add this PR under there? Just so we know there's an unreleased improvement in master

@RGleb
Copy link
Author

RGleb commented Feb 19, 2026

we will just bump the targets to net8;netstandard2.0;net462 so this will no longer be an issue anyway

It would be nice.

we will just bump the targets to net8;netstandard2.0;net462

Support for .NET 8 will end on November 10, 2026. Therefore, if netstandard2.0 does not have some features, it is better to focus on .NET 10.

The test is failing in CI

Fixed - the test requires threadpool configuration.

Can you also add a 3.23.0 section in CHANGELOG.md with a TBD date and add this PR under there?

Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants