lalaorya opened a new issue, #15665:
URL: https://github.com/apache/datafusion/issues/15665

   ### Describe the bug
   
   A simple `LIMIT N` clause (such as `LIMIT 10`) works normally, but the 
offset form (such as `LIMIT 10,20`) fails with an internal error.
   
   1. Create a table `tbl` with data, using a custom union provider that 
includes a MemTable and a ListingTable.
   2. Execute the following query (works normally):
      ```sql
      SELECT * FROM tbl WHERE appId = 'xxx' LIMIT 0,10
      ```
   3. Execute the following query (fails):
      ```sql
      SELECT * FROM tbl WHERE appId = 'xxx' LIMIT 10,20
      ```
   ### Error Information
   
   ```
   Internal error: GlobalLimitExec requires a single input partition.
   This was likely caused by a bug in DataFusion's code and we would welcome 
that you file an bug report in our issue tracker
   ```
   ### Execution Plan Comparison
   #### Successful Query Execution Plan
   ```
   [grpc-flight] physical plan after generate
   CoalesceBatchesExec: target_batch_size=8192, fetch=20
     FilterExec: appId@1 = 6eb17b3d80344184bb4c5592 AND _l2_timestamp@0 >= 
1744199279329000 AND _l2_timestamp@0 < 1744199879329000
       UnionExec
         SortExec: expr=[_l2_timestamp@0 DESC NULLS LAST], 
preserve_partitioning=[false]
           MemoryExec: partitions=1, partition_sizes=[6]
         ParquetExec: file_groups={1 group: 
[[113bf01e-4ec6-42b8-9fbb-f8846bc51664/schema=schema_key/$$/dev/app/appId=6eb17b3d80344184bb4c5592/1744199703383977.1744199797977616.16.7315697162543697288.parquet,
 
113bf01e-4ec6-42b8-9fbb-f8846bc51664/schema=schema_key/$$/dev/app/appId=6eb17b3d80344184bb4c5592/1744065150679723.1744199685314003.205345.7315697162543697221.parquet]]},
 projection=[_l2_timestamp, appId, message, namespace], 
output_ordering=[_l2_timestamp@0 DESC NULLS LAST], predicate=appId@1 = 
6eb17b3d80344184bb4c5592 AND _l2_timestamp@0 >= 1744199279329000 AND 
_l2_timestamp@0 < 1744199879329000, pruning_predicate=appId_null_count@2 != 
appId_row_count@3 AND appId_min@0 <= 6eb17b3d80344184bb4c5592 AND 
6eb17b3d80344184bb4c5592 <= appId_max@1 AND _l2_timestamp_null_count@5 != 
_l2_timestamp_row_count@6 AND _l2_timestamp_max@4 >= 1744199279329000 AND 
_l2_timestamp_null_count@5 != _l2_timestamp_row_count@6 AND _l2_timestamp_min@7 
< 1744199879329000, required_guarantees=[appId in (6eb17b3d80344184bb4c5592)]
   ```
   
   #### Failed Query Execution Plan
   
   ```
   [grpc-flight] physical plan after generate
   GlobalLimitExec: skip=10, fetch=20
     CoalesceBatchesExec: target_batch_size=8192, fetch=30
       FilterExec: appId@1 = 6eb17b3d80344184bb4c5592 AND _l2_timestamp@0 >= 
1744199369830000 AND _l2_timestamp@0 < 1744199969830000
         UnionExec
           SortExec: expr=[_l2_timestamp@0 DESC NULLS LAST], 
preserve_partitioning=[false]
             MemoryExec: partitions=1, partition_sizes=[3]
           ParquetExec: file_groups={1 group: 
[[b5107ed9-933b-40e1-b87b-81d3da109d5f/schema=schema_key/$$/dev/app/appId=6eb17b3d80344184bb4c5592/1744199879085286.1744199936023233.10.7315697162543697422.parquet,
 
b5107ed9-933b-40e1-b87b-81d3da109d5f/schema=schema_key/$$/dev/app/appId=6eb17b3d80344184bb4c5592/1744199818030386.1744199863928697.8.7315697162543697371.parquet,
 
b5107ed9-933b-40e1-b87b-81d3da109d5f/schema=schema_key/$$/dev/app/appId=6eb17b3d80344184bb4c5592/1744065150679723.1744199797977616.205361.7315697162543697335.parquet]]},
 projection=[_l2_timestamp, appId, message, namespace], 
output_ordering=[_l2_timestamp@0 DESC NULLS LAST], predicate=appId@1 = 
6eb17b3d80344184bb4c5592 AND _l2_timestamp@0 >= 1744199369830000 AND 
_l2_timestamp@0 < 1744199969830000, pruning_predicate=appId_null_count@2 != 
appId_row_count@3 AND appId_min@0 <= 6eb17b3d80344184bb4c5592 AND 
6eb17b3d80344184bb4c5592 <= appId_max@1 AND _l2_timestamp_null_count@5 != 
_l2_timestamp_row_count@6 AND _l2_timestamp_max@4 >= 1744199369830000 AND 
_l2_timestamp_null_count@5 != 
_l2_timestamp_row_count@6 AND _l2_timestamp_min@7 < 1744199969830000, 
required_guarantees=[appId in (6eb17b3d80344184bb4c5592)]
   ```
   #### Questions and Clarifications
   
   From the execution plan, I can see that the offset LIMIT query uses 
`GlobalLimitExec: skip=10, fetch=20`, but fails with an error saying 
"GlobalLimitExec requires a single input partition."
   
   I'm not sure if this is a bug or a limitation in my usage pattern. Could 
someone explain:
   1. What "GlobalLimitExec requires a single input partition" means in this 
context?
   2. Is there a specific pattern I should follow when using LIMIT with an 
offset in DataFusion?
   3. Are there any configuration settings or query modifications that could 
help resolve this issue?
   
   Any insights would be appreciated.
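
To make the questions above concrete, here is a rough mental model of the failed plan, written as a toy Rust sketch. It is not DataFusion code: `Partition`, `local_limit`, `coalesce`, and `global_limit` are hypothetical names, and rows are plain integers. It assumes that `UnionExec` exposes one output partition per child partition (two here), that the `fetch=30` on `CoalesceBatchesExec` is `skip + fetch` applied per partition, and that the skip has to be counted over one merged stream, which would explain why the limit refuses more than one input partition.

```rust
/// Toy stand-in for one partition's output; rows are plain integers.
type Partition = Vec<u64>;

/// Models a per-partition fetch pushed below the limit: each partition only
/// needs to yield its first `skip + fetch` rows, because no later row from
/// that partition can fall inside the final window.
fn local_limit(partition: &[u64], skip: usize, fetch: usize) -> Partition {
    partition.iter().copied().take(skip + fetch).collect()
}

/// Models merging several partitions into a single stream. This is plain
/// concatenation and gives no ordering guarantee across partitions.
fn coalesce(partitions: &[Partition]) -> Partition {
    partitions.iter().flatten().copied().collect()
}

/// Models a global limit: the skip is counted across the whole result, so
/// this toy version rejects anything other than exactly one input partition.
fn global_limit(
    partitions: &[Partition],
    skip: usize,
    fetch: usize,
) -> Result<Partition, String> {
    if partitions.len() != 1 {
        return Err(format!(
            "global limit needs 1 input partition, got {}",
            partitions.len()
        ));
    }
    Ok(partitions[0].iter().copied().skip(skip).take(fetch).collect())
}
```

In this model, calling `global_limit` directly on the two limited partitions returns an error, while calling it on `vec![coalesce(&limited)]` succeeds. Whether the real plan is missing an equivalent merge step (for example a `CoalescePartitionsExec` below `GlobalLimitExec`) because of the custom provider is only a guess on my part.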
   
   ### To Reproduce
   
   _No response_
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   - DataFusion version: 45.0.0
   - Operating System: linux x86_64
