ozankabak closed issue #14036: Un-cancellable Query when hitting many large
files.
URL: https://github.com/apache/datafusion/issues/14036
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific
alamb commented on issue #14036:
URL: https://github.com/apache/datafusion/issues/14036#issuecomment-2921953997
There is a new proposed fix in
- https://github.com/apache/datafusion/pull/16196
alamb commented on issue #14036:
URL: https://github.com/apache/datafusion/issues/14036#issuecomment-2685750238
@carols10cents has made a benchmark here:
- https://github.com/apache/datafusion/pull/14818
jeffreyssmith2nd commented on issue #14036:
URL: https://github.com/apache/datafusion/issues/14036#issuecomment-2580216590
@berkaysynnada I'll take another stab at getting a reproducer for this that
doesn't require customer data
berkaysynnada commented on issue #14036:
URL: https://github.com/apache/datafusion/issues/14036#issuecomment-2579313671
We also have checkpoint tests that drop the stream after some amount
of time, and after the failure, FileStream offsets stop incrementing.
I think the same
alamb commented on issue #14036:
URL: https://github.com/apache/datafusion/issues/14036#issuecomment-2578322453
I tried a bit today to re-create this but was not able to.
What I tried was to create a highly compressed parquet file (48MB that has
1B rows with all repeated strings) and
jeffreyssmith2nd commented on issue #14036:
URL: https://github.com/apache/datafusion/issues/14036#issuecomment-2577862225
> How do you cancel the query? You mean terminating the next()'s on the
> stream, or dropping the stream.
The queries are running in the context of a gRPC request,
berkaysynnada commented on issue #14036:
URL: https://github.com/apache/datafusion/issues/14036#issuecomment-2577045685
How do you cancel the query? Do you mean terminating the next() calls on the
stream, or dropping the stream? If it is the former, the issue might be related
to the RepartitionE
jeffreyssmith2nd opened a new issue, #14036:
URL: https://github.com/apache/datafusion/issues/14036
### Describe the bug
**TL;DR: Reading many large Parquet files can prevent a query from being
cancelled.**
We have a customer that is running a query similar to the following