lalaorya opened a new issue, #15665: URL: https://github.com/apache/datafusion/issues/15665
### Describe the bug

When using the LIMIT clause, simple `LIMIT N` syntax (such as `LIMIT 10`) works normally, but when using the syntax with an offset (such as `LIMIT 10,20`), it fails and results in an internal error.

1. Create a table `tbl` with data (using a custom union provider that includes MemTable and ListingTable).
2. Execute the following query (works normally):
   ```sql
   SELECT * FROM tbl WHERE appId = 'xxx' LIMIT 0,10
   ```
3. Execute the following query (fails):
   ```sql
   SELECT * FROM tbl WHERE appId = 'xxx' LIMIT 10,20
   ```

### Error Information

```
Internal error: GlobalLimitExec requires a single input partition.
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
```

### Execution Plan Comparison

#### Successful Query Execution Plan

```
[grpc-flight] physical plan after generate
CoalesceBatchesExec: target_batch_size=8192, fetch=20
  FilterExec: appId@1 = 6eb17b3d80344184bb4c5592 AND _l2_timestamp@0 >= 1744199279329000 AND _l2_timestamp@0 < 1744199879329000
    UnionExec
      SortExec: expr=[_l2_timestamp@0 DESC NULLS LAST], preserve_partitioning=[false]
        MemoryExec: partitions=1, partition_sizes=[6]
      ParquetExec: file_groups={1 group: [[113bf01e-4ec6-42b8-9fbb-f8846bc51664/schema=schema_key/$$/dev/app/appId=6eb17b3d80344184bb4c5592/1744199703383977.1744199797977616.16.7315697162543697288.parquet, 113bf01e-4ec6-42b8-9fbb-f8846bc51664/schema=schema_key/$$/dev/app/appId=6eb17b3d80344184bb4c5592/1744065150679723.1744199685314003.205345.7315697162543697221.parquet]]}, projection=[_l2_timestamp, appId, message, namespace], output_ordering=[_l2_timestamp@0 DESC NULLS LAST], predicate=appId@1 = 6eb17b3d80344184bb4c5592 AND _l2_timestamp@0 >= 1744199279329000 AND _l2_timestamp@0 < 1744199879329000, pruning_predicate=appId_null_count@2 != appId_row_count@3 AND appId_min@0 <= 6eb17b3d80344184bb4c5592 AND 6eb17b3d80344184bb4c5592 <= appId_max@1 AND _l2_timestamp_null_count@5 != _l2_timestamp_row_count@6 AND _l2_timestamp_max@4 >= 1744199279329000 AND _l2_timestamp_null_count@5 != _l2_timestamp_row_count@6 AND _l2_timestamp_min@7 < 1744199879329000, required_guarantees=[appId in (6eb17b3d80344184bb4c5592)]
```

#### Failed Query Execution Plan

```
[grpc-flight] physical plan after generate
GlobalLimitExec: skip=10, fetch=20
  CoalesceBatchesExec: target_batch_size=8192, fetch=30
    FilterExec: appId@1 = 6eb17b3d80344184bb4c5592 AND _l2_timestamp@0 >= 1744199369830000 AND _l2_timestamp@0 < 1744199969830000
      UnionExec
        SortExec: expr=[_l2_timestamp@0 DESC NULLS LAST], preserve_partitioning=[false]
          MemoryExec: partitions=1, partition_sizes=[3]
        ParquetExec: file_groups={1 group: [[b5107ed9-933b-40e1-b87b-81d3da109d5f/schema=schema_key/$$/dev/app/appId=6eb17b3d80344184bb4c5592/1744199879085286.1744199936023233.10.7315697162543697422.parquet, b5107ed9-933b-40e1-b87b-81d3da109d5f/schema=schema_key/$$/dev/app/appId=6eb17b3d80344184bb4c5592/1744199818030386.1744199863928697.8.7315697162543697371.parquet, b5107ed9-933b-40e1-b87b-81d3da109d5f/schema=schema_key/$$/dev/app/appId=6eb17b3d80344184bb4c5592/1744065150679723.1744199797977616.205361.7315697162543697335.parquet]]}, projection=[_l2_timestamp, appId, message, namespace], output_ordering=[_l2_timestamp@0 DESC NULLS LAST], predicate=appId@1 = 6eb17b3d80344184bb4c5592 AND _l2_timestamp@0 >= 1744199369830000 AND _l2_timestamp@0 < 1744199969830000, pruning_predicate=appId_null_count@2 != appId_row_count@3 AND appId_min@0 <= 6eb17b3d80344184bb4c5592 AND 6eb17b3d80344184bb4c5592 <= appId_max@1 AND _l2_timestamp_null_count@5 != _l2_timestamp_row_count@6 AND _l2_timestamp_max@4 >= 1744199369830000 AND _l2_timestamp_null_count@5 != _l2_timestamp_row_count@6 AND _l2_timestamp_min@7 < 1744199969830000, required_guarantees=[appId in (6eb17b3d80344184bb4c5592)]
```

#### Questions and Clarifications

From the execution plan, I can see that the offset LIMIT query uses `GlobalLimitExec: skip=10, fetch=20`, but fails with an error saying "GlobalLimitExec requires a single input partition." I'm not sure if this is a bug or a limitation in my usage pattern. Could someone explain:

1. What does "GlobalLimitExec requires a single input partition" mean in this context?
2. Is there a specific pattern I should follow when using LIMIT with an offset in DataFusion?
3. Are there any configuration settings or query modifications that could help resolve this issue?

Any insights would be appreciated.

### To Reproduce

_No response_

### Expected behavior

_No response_

### Additional context

- DataFusion version: 45.0.0
- Operating System: linux x86_64
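To illustrate what the numbers in the failed plan seem to describe, here is a minimal Python sketch. It is not DataFusion code: `global_limit`, `pushed_down_fetch`, and the sample partitions are hypothetical names and data made up for illustration. It assumes that `LIMIT 10,20` means "skip 10 rows, then return 20" (matching `skip=10, fetch=20`), and that the `fetch=30` on `CoalesceBatchesExec` is skip plus fetch. It also shows why applying an offset separately to each partition gives a different answer than applying it once to a single merged stream.

```python
from itertools import chain, islice


def global_limit(rows, skip, fetch):
    """Drop the first `skip` rows of one stream, then keep at most `fetch` rows."""
    return list(islice(rows, skip, skip + fetch))


def pushed_down_fetch(skip, fetch):
    """Rows an upstream operator must still produce so that `skip` rows can be
    discarded and `fetch` rows remain (assumed reading of fetch=30 in the plan)."""
    return skip + fetch


# Hypothetical data: two partitions, standing in for the two UnionExec inputs.
partitions = [list(range(0, 25)), list(range(100, 125))]

# Offset applied once, over a single merged stream.
single_stream = global_limit(chain.from_iterable(partitions), skip=10, fetch=20)

# Offset applied to each partition separately: rows are skipped in both partitions.
per_partition = [row for part in partitions for row in global_limit(part, 10, 20)]

print(pushed_down_fetch(10, 20))
print(len(single_stream), len(per_partition))
```

The two results differ in both length and content, which is one way to read the requirement that an operator applying an offset sees exactly one input partition. Whether the planner should have merged the partitions under `GlobalLimitExec` in this case is the open question of this issue.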