+1 for this. It looks much clearer and better structured.

Best,
Jark

On Thu, 11 Nov 2021 at 17:23, Chesnay Schepler <ches...@apache.org> wrote:

> I'm generally in favor of it, and there are already tickets that
> proposed a dedicated operator/vertex description:
>
> https://issues.apache.org/jira/browse/FLINK-20388
> https://issues.apache.org/jira/browse/FLINK-21858
>
> On 11/11/2021 10:02, wenlong.lwl wrote:
> > Hi all, I would like to start a discussion about improving the name
> > and structure of job vertex names, mainly to improve the experience of
> > debugging and analyzing SQL jobs at runtime.
> >
> > The main proposed changes include:
> > 1. Separate the description and the name of an operator, so that we can keep
> > detailed info in the description and use a shorter name, which is friendlier
> > to external systems like logging/metrics, without losing useful information.
> > 2. Introduce a tree-mode vertex description, which makes the description
> > more readable and easier to understand.
> > 3. Clean up and improve the descriptions of SQL operators.
> >
> > Here is an example with the changes for a SQL job:
> >
> > vertex name:
> > GlobalGroupAggregate[52] -> (Calc[53] -> NotNullEnforcer[54] -> Sink: tb_ads_dwi_pub_hbd_spm_dtr_002_003[54], Calc[55] -> NotNullEnforcer[56] -> Sink: tb_ads_dwi_pub_hbd_spm_dtr_002_004[56])
> > vertex description:
> > [52]:GlobalGroupAggregate(groupBy=[stat_date, spm_url_ab, client], select=[stat_date, spm_url_ab, client, COUNT(count1$0) AS clk_cnt_app_mtr_001, COUNT(distinct$0 count$1) AS clk_uv_app_mtr_001, COUNT(count1$2) AS clk_cnt_app_mtr_002, COUNT(distinct$0 count$3) AS clk_uv_app_mtr_002, COUNT(count1$4) AS clk_cnt_app_mtr_003, COUNT(distinct$0 count$5) AS clk_uv_app_mtr_003])
> > :- [53]:Calc(select=[CASE((client <> ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '12345')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '12345:app', CONCAT(client, ':client'), CONCAT('ddd:', stat_date)), null:VARCHAR(2147483647)) AS rowkey, clk_cnt_app_mtr_001 AS clk_cnt_app_dtr_001, clk_uv_app_mtr_001 AS clk_uv_app_dtr_001, clk_cnt_app_mtr_002 AS clk_cnt_app_dtr_002, clk_uv_app_mtr_002 AS clk_uv_app_dtr_002, clk_cnt_app_mtr_003 AS clk_cnt_app_dtr_003, clk_uv_app_mtr_003 AS clk_uv_app_dtr_003])
> > :  +- [54]:NotNullEnforcer(fields=[rowkey])
> > :     +- [54]:Sink(table=[default_catalog.default_database.tb_ads_dwi_pub_hbd_spm_dtr_002_003], fields=[rowkey, clk_cnt_app_dtr_001, clk_uv_app_dtr_001, clk_cnt_app_dtr_002, clk_uv_app_dtr_002, clk_cnt_app_dtr_003, clk_uv_app_dtr_003])
> > +- [55]:Calc(select=[CASE((client <> ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '12345')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '12345:app', CONCAT('ddd:', stat_date), CONCAT(client, ':client')), (client = ''), CONCAT_WS('\u0004', CONCAT(SUBSTRING(MD5(CONCAT(spm_url_ab, '92459')), 1, 4), ':md5'), CONCAT(spm_url_ab, ':spmab'), '92459:app', CONCAT('ddd:', stat_date)), null:VARCHAR(2147483647)) AS rowkey, clk_cnt_app_mtr_001 AS clk_cnt_app_dtr_001, clk_uv_app_mtr_001 AS clk_uv_app_dtr_001, clk_cnt_app_mtr_002 AS clk_cnt_app_dtr_002, clk_uv_app_mtr_002 AS clk_uv_app_dtr_002, clk_cnt_app_mtr_003 AS clk_cnt_app_dtr_003, clk_uv_app_mtr_003 AS clk_uv_app_dtr_003])
> >    +- [56]:NotNullEnforcer(fields=[rowkey])
> >       +- [56]:Sink(table=[default_catalog.default_database.tb_ads_dwi_pub_hbd_spm_dtr_002_004], fields=[rowkey, clk_cnt_app_dtr_001, clk_uv_app_dtr_001, clk_cnt_app_dtr_002, clk_uv_app_dtr_002, clk_cnt_app_dtr_003, clk_uv_app_dtr_003])
> >
> > For more detail on the proposal:
> > https://docs.google.com/document/d/1VUVJeHY_We09GY53-K2lETP3HUNZG9wMKyecFWk_Wxk
> >
> > Looking forward to your feedback, thanks.
> >
> > Bests
> >
> > Wenlong Lyu
> >
>
>
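
For readers skimming the thread, the split between a short vertex name and a
tree-mode description proposed above can be sketched roughly as follows. This
is only an illustration in Python, not Flink code: the `Op` class, its fields,
the two helper functions, and the shortened table names are all hypothetical,
and the `:-` / `+-` connectors are an assumption based on the flattened example
in the quoted mail.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """Hypothetical operator node (not a Flink class)."""
    op_id: int
    name: str          # short name, meant for logging/metrics
    description: str   # detailed text, meant for the description only
    outputs: List["Op"] = field(default_factory=list)


def vertex_name(op: Op) -> str:
    """Short chained name, e.g. 'A[1] -> (B[2], C[3])'."""
    head = f"{op.name}[{op.op_id}]"
    if not op.outputs:
        return head
    if len(op.outputs) == 1:
        return f"{head} -> {vertex_name(op.outputs[0])}"
    branches = ", ".join(vertex_name(o) for o in op.outputs)
    return f"{head} -> ({branches})"


def tree_lines(op: Op, connector: str = "", child_prefix: str = "") -> List[str]:
    """One line per operator; ':-' marks a non-last branch, '+-' the last one."""
    lines = [f"{connector}[{op.op_id}]:{op.description}"]
    for i, child in enumerate(op.outputs):
        last = i == len(op.outputs) - 1
        lines += tree_lines(
            child,
            connector=child_prefix + ("+- " if last else ":- "),
            child_prefix=child_prefix + ("   " if last else ":  "),
        )
    return lines


def example_tree() -> Op:
    """Same shape as the job in the quoted mail, with shortened details."""
    sink_a = Op(54, "Sink: t_003", "Sink(table=[t_003])")
    sink_b = Op(56, "Sink: t_004", "Sink(table=[t_004])")
    nne_a = Op(54, "NotNullEnforcer", "NotNullEnforcer(fields=[rowkey])", [sink_a])
    nne_b = Op(56, "NotNullEnforcer", "NotNullEnforcer(fields=[rowkey])", [sink_b])
    calc_a = Op(53, "Calc", "Calc(select=[...])", [nne_a])
    calc_b = Op(55, "Calc", "Calc(select=[...])", [nne_b])
    return Op(52, "GlobalGroupAggregate",
              "GlobalGroupAggregate(groupBy=[...])", [calc_a, calc_b])


if __name__ == "__main__":
    root = example_tree()
    print(vertex_name(root))
    print("\n".join(tree_lines(root)))
```

The point of the sketch is only that the name stays short enough for
logging/metrics while the full operator details live in the description, one
operator per line.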
