Hi Jing,

Thanks for driving this, LGTM.

Best,
Godfrey

On Fri, Jul 15, 2022 at 11:38, Jingsong Li <jingsongl...@gmail.com> wrote:
>
> Thanks for starting this discussion.
>
> Have we considered introducing a listPartitionWithStats() in Catalog?
>
> Best,
> Jingsong
>
> On Fri, Jul 15, 2022 at 10:08 AM Jark Wu <imj...@gmail.com> wrote:
> >
> > Hi Jing,
> >
> > Thanks for starting this discussion. The bulk fetch is a great improvement
> > for the optimizer.
> > The FLIP looks good to me.
> >
> > Best,
> > Jark
> >
> > On Fri, 8 Jul 2022 at 17:36, Jing Ge <j...@ververica.com> wrote:
> >
> > > Hi devs,
> > >
> > > After multiple discussions with Jark and Godfrey, I'd like to start
> > > a discussion on the mailing list w.r.t. FLIP-247[1], which will
> > > significantly improve the performance by providing the bulk fetch
> > > capability for table and column statistics.
> > >
> > > Currently, statistics for a table can only be fetched from the catalog
> > > one partition at a time. Since fetching statistics from a catalog is a
> > > very heavy operation, we should provide a way to fetch the statistics
> > > of a table for all given partitions in one shot, in order to improve
> > > query performance.
> > >
> > > Based on a manual performance test with 2000 partitions, the cost
> > > drops from 10s to 2s, i.e. 5x faster (an 80% reduction).
> > >
> > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-247%3A+Bulk+fetch+of+table+and+column+statistics+for+given+partitions
> > >
> > > Best regards,
> > > Jing
> > >
