This is an automated email from the ASF dual-hosted git repository.
curth pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-adbc.git
The following commit(s) were added to refs/heads/main by this push:
new d7fc274f5 fix(csharp/src/Client): clear cached record batch in Read()
to prevent stale data on exception (#4133)
d7fc274f5 is described below
commit d7fc274f54d1873975d70ed166d166faa37338bb
Author: eric-wang-1990 <[email protected]>
AuthorDate: Tue Mar 24 07:03:26 2026 -0700
fix(csharp/src/Client): clear cached record batch in Read() to prevent
stale data on exception (#4133)
## Summary
Fix an infinite loop in `AdbcDataReader.Read()` that occurs when the
underlying stream throws mid-read. Callers (e.g. Power BI) that retry
`Read()` after an exception get stuck re-reading stale rows forever
instead of seeing the error.
## Problem
`AdbcDataReader.Read()` has two paths:
```csharp
public override bool Read()
{
// Path A: serve rows from cached batch
if (this.recordBatch != null && currentRowInRecordBatch <
recordBatch.Length - 1)
{
currentRowInRecordBatch++;
return true;
}
// Path B: fetch next batch
this.recordBatch = ReadNextRecordBatchAsync().Result;
return this.recordBatch != null;
}
```
`ReadNextRecordBatchAsync()` resets `currentRowInRecordBatch = 0`
**before** calling the stream:
```csharp
private ValueTask<RecordBatch?> ReadNextRecordBatchAsync(...)
{
this.currentRowInRecordBatch = 0; // ← side-effect persists
RecordBatch? recordBatch =
this.adbcQueryResult.Stream?
.ReadNextRecordBatchAsync(...).Result; // ← THROWS (server
died)
return new ValueTask<RecordBatch?>(recordBatch); // never reached
}
```
When `.Result` throws, the assignment back in `Read()` never completes —
`this.recordBatch` retains the **last successful batch**. The stream
doesn't advance either: the error condition is permanent (e.g. server is
gone), so every subsequent call throws the same exception from the same
state.
This creates an infinite cycle when the caller retries after the
exception:
```
State after throw: recordBatch = batch[4096 rows], currentRowInRecordBatch
= 0
→ Caller retries Read()
→ Path A: recordBatch != null, 0 < 4095 → true → serve stale row 0
→ Path A: 1 < 4095 → true → serve stale row 1
→ ... 4094 more stale rows ...
→ Path A: 4095 < 4095 → false → skip to Path B
→ Path B: ReadNextRecordBatchAsync() → resets index to 0 → THROWS
→ recordBatch still = old batch, currentRowInRecordBatch = 0
→ Caller retries Read()
→ Back to stale row 0 ... forever
```
Observed with Power BI Desktop: PBI retries `Read()` after exceptions,
causing it to show a spinner forever instead of surfacing the error.
## Fix
Set `this.recordBatch = null` **before** calling `.Result`. If the fetch
throws, the batch is already cleared, so retry calls hit Path B
immediately (which re-throws the error) instead of re-serving stale rows
from Path A.
## Test plan
- [x] Verified with Power BI Desktop: error dialog now surfaces
correctly instead of infinite hang
- [x] All existing C# unit tests pass on all platforms (ubuntu, windows,
macos-intel, macos-arm)
- [x] Manual test: stop Databricks SQL warehouse mid-query → PBI shows
error instead of spinner
---
csharp/src/Client/AdbcDataReader.cs | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/csharp/src/Client/AdbcDataReader.cs
b/csharp/src/Client/AdbcDataReader.cs
index 03d4f564a..7b85875bb 100644
--- a/csharp/src/Client/AdbcDataReader.cs
+++ b/csharp/src/Client/AdbcDataReader.cs
@@ -335,6 +335,11 @@ namespace Apache.Arrow.Adbc.Client
return true;
}
+ // Clear the previous batch before fetching the next one.
+ // If ReadNextRecordBatchAsync throws (e.g. server error
mid-stream),
+ // callers that retry Read() must not re-read stale rows from the
+ // old batch — they must see the exception again immediately.
+ this.recordBatch = null;
this.recordBatch = ReadNextRecordBatchAsync().Result;
return this.recordBatch != null;