(arrow-adbc) branch main updated: fix(csharp/src/Client): clear cached record batch in Read() to prevent stale data on exception (#4133)

curth Tue, 24 Mar 2026 07:03:37 -0700

This is an automated email from the ASF dual-hosted git repository.

curth pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-adbc.git



The following commit(s) were added to refs/heads/main by this push:
     new d7fc274f5 fix(csharp/src/Client): clear cached record batch in Read() to prevent stale data on exception (#4133)
d7fc274f5 is described below

commit d7fc274f54d1873975d70ed166d166faa37338bb
Author: eric-wang-1990 <[email protected]>
AuthorDate: Tue Mar 24 07:03:26 2026 -0700

    fix(csharp/src/Client): clear cached record batch in Read() to prevent stale data on exception (#4133)
    
    ## Summary
    
    Fix an infinite loop in `AdbcDataReader.Read()` that occurs when the
    underlying stream throws mid-read. Callers (e.g. Power BI) that retry
    `Read()` after an exception get stuck re-reading stale rows forever
    instead of seeing the error.
    
    ## Problem
    
    `AdbcDataReader.Read()` has two paths:
    
    ```csharp
    public override bool Read()
    {
        // Path A: serve rows from cached batch
        if (this.recordBatch != null && currentRowInRecordBatch < recordBatch.Length - 1)
        {
            currentRowInRecordBatch++;
            return true;
        }
    
        // Path B: fetch next batch
        this.recordBatch = ReadNextRecordBatchAsync().Result;
        return this.recordBatch != null;
    }
    ```
    
    `ReadNextRecordBatchAsync()` resets `currentRowInRecordBatch = 0`
    **before** calling the stream:
    
    ```csharp
    private ValueTask<RecordBatch?> ReadNextRecordBatchAsync(...)
    {
        this.currentRowInRecordBatch = 0;              // ← side-effect persists
        RecordBatch? recordBatch =
            this.adbcQueryResult.Stream?
                .ReadNextRecordBatchAsync(...).Result;  // ← THROWS (server died)
        return new ValueTask<RecordBatch?>(recordBatch); // never reached
    }
    ```
    
    When `.Result` throws, the assignment back in `Read()` never completes —
    `this.recordBatch` retains the **last successful batch**. The stream
    doesn't advance either: the error condition is permanent (e.g. server is
    gone), so every subsequent call throws the same exception from the same
    state.
    
    This creates an infinite cycle when the caller retries after the
    exception:
    
    ```
    State after throw: recordBatch = batch[4096 rows], currentRowInRecordBatch = 0
    
    → Caller retries Read()
    → Path A: recordBatch != null, 0 < 4095 → true → index = 1 → serve stale row 1
    → Path A: 1 < 4095 → true → index = 2 → serve stale row 2
    → ... 4093 more stale rows ...
    → Path A: 4095 < 4095 → false → skip to Path B
    → Path B: ReadNextRecordBatchAsync() → resets index to 0 → THROWS
    → recordBatch still = old batch, currentRowInRecordBatch = 0
    → Caller retries Read()
    → Back to stale row 1 ... forever
    ```
    
    Observed with Power BI Desktop: PBI retries `Read()` after exceptions,
    causing it to show a spinner forever instead of surfacing the error.
    
    ## Fix
    
    In `Read()`, set `this.recordBatch = null` **before** calling
    `ReadNextRecordBatchAsync().Result`. If the fetch throws, the batch is
    already cleared, so every retry skips Path A, goes straight to Path B,
    and re-throws the error instead of re-serving stale rows.
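    
    The effect can be reproduced without a server. The sketch below is a
    simplified model, not the real `AdbcDataReader`: `ModelReader`,
    `StaleBatchDemo`, and the `clearBeforeFetch` flag are hypothetical names
    introduced only for illustration, batches are plain `int[]` arrays, and
    a drained queue stands in for a permanently failed stream. It assumes a
    caller that swallows the exception and keeps calling `Read()`.
    
    ```csharp
    #nullable enable
    using System;
    using System.Collections.Generic;
    
    // Hypothetical stand-in for AdbcDataReader; all names are illustrative.
    public sealed class ModelReader
    {
        private readonly Queue<int[]> batches;   // batches the "server" can still deliver
        private readonly bool clearBeforeFetch;  // true models the behaviour after this fix
        private int[]? batch;                    // plays the role of this.recordBatch
        private int row;                         // plays the role of currentRowInRecordBatch
    
        public ModelReader(IEnumerable<int[]> source, bool clearBeforeFetch)
        {
            this.batches = new Queue<int[]>(source);
            this.clearBeforeFetch = clearBeforeFetch;
        }
    
        public bool Read()
        {
            // Path A: serve the next row from the cached batch.
            if (batch != null && row < batch.Length - 1)
            {
                row++;
                return true;
            }
    
            // Path B: fetch the next batch, optionally clearing the old one first.
            if (clearBeforeFetch) batch = null;
            batch = Fetch();
            return batch != null;
        }
    
        private int[] Fetch()
        {
            row = 0; // the reset happens before the fetch, so it survives a throw
            if (batches.Count == 0)
                throw new InvalidOperationException("server died");
            return batches.Dequeue();
        }
    }
    
    public static class StaleBatchDemo
    {
        // Calls Read() maxCalls times, swallowing exceptions like a retrying
        // caller, and counts the rows reported after the first failure.
        public static int RowsServedAfterFailure(bool clearBeforeFetch, int maxCalls)
        {
            var reader = new ModelReader(new[] { new[] { 10, 20, 30 } }, clearBeforeFetch);
            bool failed = false;
            int staleRows = 0;
            for (int i = 0; i < maxCalls; i++)
            {
                try
                {
                    if (reader.Read() && failed) staleRows++;
                }
                catch (InvalidOperationException)
                {
                    failed = true;
                }
            }
            return staleRows;
        }
    }
    ```
    
    With a single three-row batch and ten calls,
    `StaleBatchDemo.RowsServedAfterFailure(false, 10)` returns 4 (stale rows
    keep reappearing between failures), while
    `StaleBatchDemo.RowsServedAfterFailure(true, 10)` returns 0 because every
    call after the first failure throws again.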
    
    ## Test plan
    
    - [x] Verified with Power BI Desktop: error dialog now surfaces
    correctly instead of infinite hang
    - [x] All existing C# unit tests pass on all platforms (ubuntu, windows,
    macos-intel, macos-arm)
    - [x] Manual test: stop Databricks SQL warehouse mid-query → PBI shows
    error instead of spinner
---
 csharp/src/Client/AdbcDataReader.cs | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/csharp/src/Client/AdbcDataReader.cs b/csharp/src/Client/AdbcDataReader.cs
index 03d4f564a..7b85875bb 100644
--- a/csharp/src/Client/AdbcDataReader.cs
+++ b/csharp/src/Client/AdbcDataReader.cs
@@ -335,6 +335,11 @@ namespace Apache.Arrow.Adbc.Client
                 return true;
             }
 
+            // Clear the previous batch before fetching the next one.
+            // If ReadNextRecordBatchAsync throws (e.g. server error mid-stream),
+            // callers that retry Read() must not re-read stale rows from the
+            // old batch — they must see the exception again immediately.
+            this.recordBatch = null;
             this.recordBatch = ReadNextRecordBatchAsync().Result;
 
             return this.recordBatch != null;
