Hi Wenlong,

I'm fine with the config options.

Best,
Godfrey

On Wed, Nov 17, 2021 at 3:13 PM, wenlong.lwl <wenlong88....@gmail.com> wrote:
>
> Hi Chesnay and Konstantin,
> thanks for your feedback. I have added a section to the doc about how we
> support setting the description in the DataStream API.
>
> Bests,
> Wenlong
>
> On Tue, 16 Nov 2021 at 21:05, Konstantin Knauf <kna...@apache.org> wrote:
>
> > Hi everyone,
> >
> > Thanks for starting this discussion. I am in favor of solving this for
> > DataStream and Table API at the same time, using the same configuration
> > keys. IMO we shouldn't introduce any additional fragmentation if we can
> > avoid it.
> >
> > Cheers,
> >
> > Konstantin
> >
> > On Tue, Nov 16, 2021 at 1:50 PM wenlong.lwl <wenlong88....@gmail.com> wrote:
> >
> > > Hi Chesnay, we focus on SQL first because the operators and topology of
> > > SQL jobs are generated by the engine, which raises most of the naming
> > > problems, not only because the names are long but also because the
> > > topology can be more complex than in DataStream.
> > >
> > > The situation in DataStream is much better: most of the names in the
> > > DataStream API are quite concise, except for the windowing you mentioned,
> > > and the topology is usually simpler. What's more, we can easily expose
> > > this to the DataStream API as a second step once the foundational
> > > implementation is done. If necessary, we can also cover the changes to
> > > the DataStream API now, maybe taking windowing as a first example?
> > >
> > > Best,
> > > Wenlong
> > >
> > > On Tue, 16 Nov 2021 at 19:14, Chesnay Schepler <ches...@apache.org> wrote:
> > >
> > > > Why should this be specific to the Table API? The DataStream API has
> > > > similar issues with long operator names (like windowing).
> > > >
> > > > On 16/11/2021 11:22, wenlong.lwl wrote:
> > > > > Thanks Godfrey for the suggestion.
> > > > > Regarding 1, how about table.optimizer.simplify-operator-name-enabled,
> > > > > which means that we would simplify the name of the operator and keep
> > > > > the details in the description only? I think
> > > > > "table.optimizer.operator-name.description-enabled" does not describe
> > > > > what the option does.
> > > > > Regarding 2, I agree that it is better to use an enum instead of a
> > > > > boolean. For the key, I think you mean
> > > > > "pipeline.vertex-description-pattern" instead of
> > > > > "pipeline.vertex-name-pattern", and I would like to choose
> > > > > DEFAULT/TREE for the values.
> > > > >
> > > > > Best,
> > > > > Wenlong
> > > > >
> > > > > On Tue, 16 Nov 2021 at 17:28, godfrey he <godfre...@gmail.com> wrote:
> > > > >
> > > > >> Thanks for creating this FLIP, Wenlong.
> > > > >>
> > > > >> The FLIP already looks pretty solid. I think the config options can
> > > > >> be improved a little:
> > > > >> 1) About table.optimizer.separate-name-and-description: I think
> > > > >> "operator-name" should be considered in the option. How about
> > > > >> table.optimizer.operator-name.description-enabled?
> > > > >> 2) About pipeline.tree-mode-vertex-description: I think we can make
> > > > >> the mode accept a string value, which is more flexible. How about
> > > > >> pipeline.vertex-name-pattern, where the default value is "TREE" and
> > > > >> the other option is "CASCADE" (or "DEFAULT", which is simpler)?
> > > > >>
> > > > >> What do you think?
> > > > >>
> > > > >> Best,
> > > > >> Godfrey
> > > > >>
> > > > >> On Mon, Nov 15, 2021 at 6:36 PM, wenlong.lwl <wenlong88....@gmail.com> wrote:
> > > > >>
> > > > >>> Hi all, FYI the FLIP doc has been created:
> > > > >>>
> > > > >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-195%3A+Improve+the+name+and+structure+of+vertex+and+operator+name+for+sql+job
> > > > >>>
> > > > >>> Best,
> > > > >>> Wenlong
> > > > >>>
> > > > >>> On Mon, 15 Nov 2021 at 11:41, wenlong.lwl <wenlong88....@gmail.com> wrote:
> > > > >>>
> > > > >>>> Hi all,
> > > > >>>> Thanks for the feedback. It seems that the proposal is accepted by
> > > > >>>> all of you. I will prepare a formal FLIP document and then move on
> > > > >>>> to the vote stage.
> > > > >>>> If anyone has any other comments or suggestions, please let me
> > > > >>>> know, thanks.
> > > > >>>>
> > > > >>>> Best,
> > > > >>>> Wenlong
> > > > >>>>
> > > > >>>> On Fri, 12 Nov 2021 at 05:54, Neng Lu <nl...@apache.org> wrote:
> > > > >>>>
> > > > >>>>> +1 (non-binding)
> > > > >>>>> This change will really help to ease developer life.
> > > > >>>>>
> > > > >>>>> On Thu, Nov 11, 2021 at 6:33 AM Guowei Ma <guowei....@gmail.com> wrote:
> > > > >>>>>> +1
> > > > >>>>>> This would be very helpful for debugging our online jobs.
> > > > >>>>>>
> > > > >>>>>> Best,
> > > > >>>>>> Guowei
> > > > >>>>>>
> > > > >>>>>> On Thu, Nov 11, 2021 at 8:03 PM Yuepeng Pan <flin...@126.com> wrote:
> > > > >>>>>>
> > > > >>>>>>> +1. It's useful for understanding the job topology.
> > > > >>>>>>> Looking forward to this feature.
> > > > >>>>>>> Best,
> > > > >>>>>>> Yuepeng Pan.
> > > > >>>>>>>
> > > > >>>>>>> At 2021-11-11 19:44:44, "Yangze Guo" <karma...@gmail.com> wrote:
> > > > >>>>>>>> +1. That's gonna help a lot for debugging.
> > > > >>>>>>>>
> > > > >>>>>>>> Best,
> > > > >>>>>>>> Yangze Guo
> > > > >>>>>>>>
> > > > >>>>>>>> On Thu, Nov 11, 2021 at 7:37 PM Till Rohrmann <trohrm...@apache.org> wrote:
> > > > >>>>>>>>> This improvement looks like it makes the life of our users a
> > > > >>>>>>>>> lot easier when it comes to understanding logs and reading
> > > > >>>>>>>>> the UI. Hence +1.
> > > > >>>>>>>>> Cheers,
> > > > >>>>>>>>> Till
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Thu, Nov 11, 2021 at 11:59 AM JING ZHANG <beyond1...@gmail.com> wrote:
> > > > >>>>>>>>>> Big +1.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> This is a problem we frequently encounter on our production
> > > > >>>>>>>>>> platform. Looking forward to this improvement.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Best,
> > > > >>>>>>>>>> Jing Zhang
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> On Thu, Nov 11, 2021 at 6:26 PM, Martijn Visser <mart...@ververica.com> wrote:
> > > > >>>>>>>>>>> +1. Looks much better now
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> On Thu, 11 Nov 2021 at 11:07, godfrey he <godfre...@gmail.com> wrote:
> > > > >>>>>>>>>>>> Thanks for driving this. This improvement solves a
> > > > >>>>>>>>>>>> long-standing complaint, +1
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> Best,
> > > > >>>>>>>>>>>> Godfrey
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> On Thu, Nov 11, 2021 at 5:40 PM, Jark Wu <imj...@gmail.com> wrote:
> > > > >>>>>>>>>>>>> +1 for this. It looks much clearer and more structured.
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> Best,
> > > > >>>>>>>>>>>>> Jark
> > > > >>>>>>>>>>>>>
> > > > >>>>>>>>>>>>> On Thu, 11 Nov 2021 at 17:23, Chesnay Schepler <ches...@apache.org> wrote:
> > > > >>>>>>>>>>>>>> I'm generally in favor of it, and there are already
> > > > >>>>>>>>>>>>>> tickets that proposed a dedicated operator/vertex
> > > > >>>>>>>>>>>>>> description:
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-20388
> > > > >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-21858
> > > > >>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>> On 11/11/2021 10:02, wenlong.lwl wrote:
> > > > >>>>>>>>>>>>>>> Hi all, I would like to start a discussion about an
> > > > >>>>>>>>>>>>>>> improvement to the name and structure of the job vertex
> > > > >>>>>>>>>>>>>>> name, mainly to improve the experience of debugging and
> > > > >>>>>>>>>>>>>>> analyzing SQL jobs at runtime.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> The main proposed changes include:
> > > > >>>>>>>>>>>>>>> 1. separate the description and the name of an operator,
> > > > >>>>>>>>>>>>>>> so that we can have detailed info in the description and
> > > > >>>>>>>>>>>>>>> a shorter name, which is friendlier to external systems
> > > > >>>>>>>>>>>>>>> like logging/metrics without losing useful information.
> > > > >>>>>>>>>>>>>>> 2. introduce a tree-mode vertex description, which makes
> > > > >>>>>>>>>>>>>>> the description more readable and easier to understand
> > > > >>>>>>>>>>>>>>> 3. clean up and improve the description for SQL operators
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Here is an example with the changes for a SQL job:
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> vertex name:
> > > > >>>>>>>>>>>>>>> GlobalGroupAggregate[52] -> (Calc[53] -> NotNullEnforcer[54] -> Sink: tb_ads_dwi_pub_hbd_spm_dtr_002_003[54], Calc[55] -> NotNullEnforcer[56] -> Sink: tb_ads_dwi_pub_hbd_spm_dtr_002_004[56])
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> vertex description:
> > > > >>>>>>>>>>>>>>> [52]:GlobalGroupAggregate(groupBy=[stat_date, spm_url_ab, client], select=[stat_date, spm_url_ab, client, COUNT(count1$0) AS clk_cnt_app_mtr_001, COUNT(distinct$0 count$1) AS clk_uv_app_mtr_001, COUNT(count1$2) AS clk_cnt_app_mtr_002, COUNT(distinct$0 count$3) AS clk_uv_app_mtr_002, COUNT(count1$4) AS clk_cnt_app_mtr_003, COUNT(distinct$0 count$5) AS clk_uv_app_mtr_003])
> > > > >>>>>>>>>>>>>>> :- [53]:Calc(select=[CASE((client <> ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '12345')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '12345:app', CONCAT(client, ':client'), CONCAT('ddd:', stat_date)), null:VARCHAR(2147483647)) AS rowkey, clk_cnt_app_mtr_001 AS clk_cnt_app_dtr_001, clk_uv_app_mtr_001 AS clk_uv_app_dtr_001, clk_cnt_app_mtr_002 AS clk_cnt_app_dtr_002, clk_uv_app_mtr_002 AS clk_uv_app_dtr_002, clk_cnt_app_mtr_003 AS clk_cnt_app_dtr_003, clk_uv_app_mtr_003 AS clk_uv_app_dtr_003])
> > > > >>>>>>>>>>>>>>> :  +- [54]:NotNullEnforcer(fields=[rowkey])
> > > > >>>>>>>>>>>>>>> :     +- [54]:Sink(table=[default_catalog.default_database.tb_ads_dwi_pub_hbd_spm_dtr_002_003], fields=[rowkey, clk_cnt_app_dtr_001, clk_uv_app_dtr_001, clk_cnt_app_dtr_002, clk_uv_app_dtr_002, clk_cnt_app_dtr_003, clk_uv_app_dtr_003])
> > > > >>>>>>>>>>>>>>> +- [55]:Calc(select=[CASE((client <> ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '12345')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '12345:app', CONCAT('ddd:', stat_date), CONCAT(client, ':client')), (client = ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '92459')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '92459:app', CONCAT('ddd:', stat_date)), null:VARCHAR(2147483647)) AS rowkey, clk_cnt_app_mtr_001 AS clk_cnt_app_dtr_001, clk_uv_app_mtr_001 AS clk_uv_app_dtr_001, clk_cnt_app_mtr_002 AS clk_cnt_app_dtr_002, clk_uv_app_mtr_002 AS clk_uv_app_dtr_002, clk_cnt_app_mtr_003 AS clk_cnt_app_dtr_003, clk_uv_app_mtr_003 AS clk_uv_app_dtr_003])
> > > > >>>>>>>>>>>>>>>    +- [56]:NotNullEnforcer(fields=[rowkey])
> > > > >>>>>>>>>>>>>>>       +- [56]:Sink(table=[default_catalog.default_database.tb_ads_dwi_pub_hbd_spm_dtr_002_004], fields=[rowkey, clk_cnt_app_dtr_001, clk_uv_app_dtr_001, clk_cnt_app_dtr_002, clk_uv_app_dtr_002, clk_cnt_app_dtr_003, clk_uv_app_dtr_003])
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> For more detail on the proposal:
> > > > >>>>>>>>>>>>>>> https://docs.google.com/document/d/1VUVJeHY_We09GY53-K2lETP3HUNZG9wMKyecFWk_Wxk
> > > > >>>>>>>>>>>>>>> <https://docs.google.com/document/d/1VUVJeHY_We09GY53-K2lETP3HUNZG9wMKyecFWk_Wxk/edit#>
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Looking forward to your feedback, thanks.
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Bests
> > > > >>>>>>>>>>>>>>>
> > > > >>>>>>>>>>>>>>> Wenlong Lyu
> >
> > --
> > Konstantin Knauf
> > https://twitter.com/snntrable
> > https://github.com/knaufk
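
For readers following the thread, the first two proposed changes (a short vertex name kept separate from a detailed description, and a tree-mode layout for that description) can be illustrated with a small sketch. This is not Flink's implementation: the Op class and the vertex_name and tree_description functions are hypothetical names made up for this illustration, and only the output layout is modeled on the example quoted in the thread.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """Hypothetical stand-in for a chained operator; not a Flink class."""
    op_id: int
    name: str            # short operator name, e.g. "Calc"
    detail: str = ""     # long detail text, e.g. "select=[a, b]"
    children: List["Op"] = field(default_factory=list)


def vertex_name(op: Op) -> str:
    """Build the short name: operator names and ids only, chained with '->'."""
    head = f"{op.name}[{op.op_id}]"
    if not op.children:
        return head
    parts = [vertex_name(child) for child in op.children]
    # A single downstream operator is chained directly; several are grouped.
    tail = parts[0] if len(parts) == 1 else "(" + ", ".join(parts) + ")"
    return f"{head} -> {tail}"


def tree_description(op: Op) -> str:
    """Build the tree-mode description: one operator per line, with details."""
    lines = [f"[{op.op_id}]:{op.name}({op.detail})"]

    def walk(node: Op, indent: str) -> None:
        for i, child in enumerate(node.children):
            last = i == len(node.children) - 1
            branch = "+- " if last else ":- "
            lines.append(
                f"{indent}{branch}[{child.op_id}]:{child.name}({child.detail})"
            )
            # Keep a ':' rail under non-last children so siblings stay aligned.
            walk(child, indent + ("   " if last else ":  "))

    walk(op, "")
    return "\n".join(lines)


if __name__ == "__main__":
    job = Op(52, "GlobalGroupAggregate", "groupBy=[stat_date]", [
        Op(53, "Calc", "select=[rowkey]",
           [Op(54, "NotNullEnforcer", "fields=[rowkey]")]),
        Op(55, "Calc", "select=[rowkey]",
           [Op(56, "NotNullEnforcer", "fields=[rowkey]")]),
    ])
    print(vertex_name(job))
    print(tree_description(job))
```

The short name stays compact enough for logs and metrics, while the tree layout carries the full details one operator per line, which is the trade-off the proposal describes.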