alamb commented on code in PR #11399:
URL: https://github.com/apache/datafusion/pull/11399#discussion_r1672835991
##########
datafusion/core/src/datasource/file_format/parquet.rs:
##########
@@ -893,12 +893,9 @@ async fn send_arrays_to_col_writers(
let mut next_channel = 0;
for (array, field) in rb.columns().iter().zip(schema.fields()) {
for c in compute_leaves(field, array)? {
- col_array_channels[next_channel]
- .send(c)
- .await
- .map_err(|_| {
- DataFusionError::Internal("Unable to send array to writer!".into())
- })?;
+ // Do not surface error from closed channel.
+ let _ = col_array_channels[next_channel].send(c).await;
+
Review Comment:
I agree: if we just ignore the error here, I think this code will continue to
furiously encode data only to throw it away. If it hits an error, I think it
should return early.
In general, I don't think there is any useful error message to be made if the
channel errors on write. That happens when the other end of the channel has "hung
up" (i.e. there is no receiver that will ever receive the message).
So, in other words, I think this function needs to stop and return control if
it encounters an error sending.
Something like:
```rust
// Do not surface error from closed channel (means something
// else hit an error, and the plan is shutting down).
if col_array_channels[next_channel].send(c).await.is_err() {
return Ok(());
}
```
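To illustrate the same "stop on a closed channel" pattern in isolation, here is a minimal sketch. It swaps in the synchronous `std::sync::mpsc` channel in place of the async channels used in the PR so that it needs no external crates, and `send_until_closed` is a hypothetical helper, not a function in DataFusion.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical stand-in for the send loop: sends items until the
/// receiver is gone, then stops quietly instead of reporting an error.
/// Returns how many items were actually sent.
fn send_until_closed(tx: &Sender<u32>, items: &[u32]) -> usize {
    let mut sent = 0;
    for &item in items {
        // `send` fails only when the receiving end has been dropped.
        if tx.send(item).is_err() {
            // Nothing will ever read further items, so stop producing
            // them and hand control back to the caller.
            return sent;
        }
        sent += 1;
    }
    sent
}
```

With a live receiver every item is sent. Once the receiver is dropped, the first failed `send` ends the loop, so no further work is done for data that would only be thrown away.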
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]