timsaucer commented on PR #1311:
URL: 
https://github.com/apache/datafusion-python/pull/1311#issuecomment-3592543053

   I believe I have found the root cause of the change in behavior of the 
interrupt unit test: https://github.com/apache/datafusion/pull/17452
   
   You can test yourself if you go to `main` in `datafusion-python` and try 
these two patches. These will set datafusion to right before and right after 
https://github.com/apache/datafusion/pull/17452 
   
   
[failing_test.txt](https://github.com/user-attachments/files/23839600/failing_test.txt)
   
[passing_test.txt](https://github.com/user-attachments/files/23839601/passing_test.txt)
   
   I think this note is relevant from the PR:
   
   ```rust
       /// This method is async and uses a [`tokio::sync::Barrier`] to wait for 
all partitions
       /// to report their bounds. Once that occurs, the method will resolve 
for all callers and the
       /// dynamic filter will be updated exactly once.
       ///
       /// # Note
       ///
       /// As barriers are reusable, it is likely an error to call this method 
more times than the
       /// total number of partitions - as it can lead to pending futures that 
never resolve. We rely
       /// on correct usage from the caller rather than imposing additional 
checks here. If this is a concern,
       /// consider making the resulting future shared so the ready result can 
be reused.
   ```
   
   I haven't looked much at tokio barriers, but it seems like this is somehow 
interfering with how we are trying to interrupt the task.
   
   I see you have a fix that updates the unit test. That is good, and probably 
unblocks this PR but I do want to get to the very root of this problem and make 
sure we have a robust solution.
   
   Tagging @rkrishn7 and @adriangb just so they know we're talking about the 
work they put in upstream.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to