Thanks, I'll take a closer look at the latest changes. I had only looked at that specific function in trunk, and it seemed unchanged from 0.13.
On Thu, May 7, 2015 at 7:50 PM, Ashutosh Chauhan <hashut...@apache.org> wrote:

> Harish has done some good work for popular use-case of windowing on
> https://issues.apache.org/jira/browse/HIVE-7062 which are available from
> 0.14 onwards. Will that be useful in your scenario? Or, are you targeting
> non-windowing PTFs?
>
> Thanks,
> Ashutosh
>
> On Thu, May 7, 2015 at 6:43 AM, Sivaramakrishnan Narayanan <
> tarb...@gmail.com> wrote:
>
> > Hi,
> >
> > I was reading through the PTFOperator and related code and was wondering
> > if there is an opportunity to optimize this function in
> > WindowingTableFunction.java
> >
> >   public void execute(PTFPartitionIterator<Object> pItr, PTFPartition outP) throws HiveException {
> >
> > This guy iterates over the input partition once to compute outputColumns.
> > This causes a full read of input partition.
> >
> > It then iterates over input partition again to append newly computed
> > values. This causes another read of input partition and a write to output
> > partition.
> >
> > I was wondering if it may be more efficient to append to the output
> > partition as soon as window expressions have been computed. This will
> > avoid one scan of the input partition.
> >
> > FYI - I've been looking at hive 0.13 code mostly but a glance at trunk
> > suggests this logic is the same there.
> >
> > Thanks,
> >
> > Siva
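
For illustration only, here is a minimal Java sketch of the two shapes discussed in the quoted message: compute the output column in one scan and append in a second, versus appending each row as soon as its value is known. It assumes a running sum as the window expression, i.e. a frame that only looks at preceding rows. The class and method names are hypothetical and nothing here uses the Hive API; frames that look at following rows would still need some buffering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the partition processing discussed in the thread.
// Plain lists play the role of the input and output partitions.
class OnePassSketch {

    // Two scans: the first computes the window column, the second appends rows.
    static List<List<Long>> twoPass(List<Long> input) {
        List<Long> computed = new ArrayList<>();
        long sum = 0;
        for (long v : input) {            // scan 1: compute the output column
            sum += v;
            computed.add(sum);
        }
        List<List<Long>> out = new ArrayList<>();
        for (int i = 0; i < input.size(); i++) {   // scan 2: append to output
            out.add(Arrays.asList(input.get(i), computed.get(i)));
        }
        return out;
    }

    // One scan: append each output row as soon as its window value is known.
    static List<List<Long>> onePass(List<Long> input) {
        List<List<Long>> out = new ArrayList<>();
        long sum = 0;
        for (long v : input) {
            sum += v;
            out.add(Arrays.asList(v, sum));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> partition = Arrays.asList(1L, 2L, 3L);
        System.out.println(onePass(partition));
        System.out.println(twoPass(partition).equals(onePass(partition)));
    }
}
```

Both methods produce the same rows; the second simply avoids re-reading the input, which is the saving the original question is about.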