djanderson commented on PR #14286:
URL: https://github.com/apache/datafusion/pull/14286#issuecomment-2661181898

   Alright, so I was able to strip down a reproducer for the JoinError panic, but 
it's a little unsatisfying. I tried to start very simple, but
   - I was unable to trigger the issue with the `MockStore`; the simplest setup 
that triggers it is MinIO running in a local Docker container
   - I was unable to trigger the issue with one or a small number of small 
record batches. The issue only crops up with longer streams of data.
   
   I imagine it could be slightly simpler but at least for me this is the 
easiest thing I could come up with that consistently triggers the issue.
   
   https://github.com/djanderson/parquet-sink-dedicated-exec-repro
   
   Would anyone be able to check whether this reproduces the error for you? I 
haven't given up on running the issue down, but because of the nature of what 
does and doesn't trigger the issue, help from more knowledgeable folks would be 
hugely appreciated.
   
   This is what a small number of record batches looks like, which is a 
**_failure to reproduce the issue_**. Important things to note in this case 
are:
   - the file is successfully written to MinIO
   - the client gets a successful response
   
   <img width="1727" alt="image" 
src="https://github.com/user-attachments/assets/23e84298-fe74-4849-8c38-401015a93022"
 />
   
   This is what a larger number of record batches looks like, which 
demonstrates the tokio panic on the server. Important things to note in this 
case are:
   - the file fails to be written to MinIO
   - the client gets an error response
   
   <img width="1727" alt="image" 
src="https://github.com/user-attachments/assets/808e87c0-1e08-4cfb-bb17-40d3f60741f2"
 />


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

