kosiew commented on code in PR #20922:
URL: https://github.com/apache/datafusion/pull/20922#discussion_r2964937756


##########
datafusion/physical-plan/src/sorts/builder.rs:
##########
@@ -200,3 +229,143 @@ pub(crate) fn try_grow_reservation_to_at_least(
     }
     Ok(())
 }
+
+/// Returns true if the error is an Arrow offset overflow.
+fn is_offset_overflow(e: &DataFusionError) -> bool {
+    matches!(
+        e,
+        DataFusionError::ArrowError(boxed, _)
+            if matches!(boxed.as_ref(), ArrowError::OffsetOverflowError(_))
+    )
+}
+
+fn offset_overflow_error() -> DataFusionError {
+    DataFusionError::ArrowError(Box::new(ArrowError::OffsetOverflowError(0)), 
None)
+}
+
+fn recover_offset_overflow_from_panic<T, F>(f: F) -> Result<T>

Review Comment:
   The retry behavior looks good, but right now it seems like it’s only covered 
through synthetic helper failures.
   
   Since the production path depends on matching Arrow’s panic payload pretty 
closely, I think it’d be great to add one higher-level regression test closer 
to `BatchBuilder::build_record_batch()` or `SortPreservingMergeStream` that 
exercises the retry/drain flow end-to-end through an injectable interleave hook.
   
   That would make it a lot easier to catch future Arrow-side panic-message 
changes - or refactors in this file before they slip through.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to