Re: Limit Query Performance Suggestion

2017-01-17 Thread Liang-Chi Hsieh
Hi Sujith, I saw your updated post. It makes sense to me now. If you use a very large limit, the shuffle before `GlobalLimit` will of course be a performance bottleneck, even if it can eventually shuffle enough data to the single partition. Unlike `CollectLimit`, actually I thin

Re: Limit Query Performance Suggestion

2017-01-17 Thread sujith71955
Dear Liang, Thanks for your valuable feedback. There was a mistake in my previous post, which I have corrected. As you mentioned, in `GlobalLimit` we only take the required number of rows from the input iterator, which actually pulls data from local and remote blocks. But if the limit value is

Re: Limit Query Performance Suggestion

2017-01-15 Thread Liang-Chi Hsieh
Hi Sujith, Thanks for the suggestion. The code you quoted is from `CollectLimitExec`, which will be in the plan if a logical `Limit` is the final operator in a logical plan. But in the physical plan you showed, there are `GlobalLimit` and `LocalLimit` for the logical `Limit` operation, so the `doE
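
To make the `LocalLimit` / `GlobalLimit` discussion in this thread concrete, here is a rough sketch in plain Python (not Spark code). It assumes a simplified model in which each partition first keeps at most `n` rows, all surviving rows are then moved to a single partition, and the final limit is applied there. The names `local_limit` and `global_limit` and the 4 x 100 row layout are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def local_limit(partition: List[int], n: int) -> List[int]:
    # Simplified model of a per-partition limit: keep at most n rows.
    return partition[:n]


def global_limit(partitions: List[List[int]], n: int) -> Tuple[List[int], int]:
    # Apply the per-partition limit first.
    limited = [local_limit(p, n) for p in partitions]
    # Model the shuffle to a single partition as a flat concatenation.
    shuffled = [row for p in limited for row in p]
    # The final limit runs on that single partition.
    return shuffled[:n], len(shuffled)


# Hypothetical data: 4 partitions with 100 rows each.
partitions = [list(range(i * 100, (i + 1) * 100)) for i in range(4)]

rows, shuffled_count = global_limit(partitions, 10)
big_rows, big_shuffled = global_limit(partitions, 1000)

print(len(rows), shuffled_count)      # small limit
print(len(big_rows), big_shuffled)    # limit larger than the data set
```

In this model, a small limit moves only a few rows per partition, while a limit larger than the partition sizes moves every row to the single partition, which is the bottleneck described in the replies above.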