Re: Allow setting job name when using StatementSet

2021-06-07 Thread JING ZHANG
I agree with Nico, I just add the link of pipeline.name here. Nicolaus Weidner 于2021年6月7日周一 下午11:46写道: > Hi Yuval, > > I am not familiar with the Table API, but in the fragment you posted, the > generated job

Re: State migration for sql job

2021-06-07 Thread JING ZHANG
Hi aitozi, This is a popular demand that many users mentioned, which appears in user mail list for several times. Unfortunately, it is not supported by Flink SQL yet, maybe would be solved in the future. BTW, a few company try to solve the problem in some specified user cases on their internal Flin

Re: Re: Add control mode for flink

2021-06-07 Thread Steven Wu
I can see the benefits of control flow. E.g., it might help the old (and inactive) FLIP-17 side input. I would suggest that we add more details of some of the potential use cases. Here is one mismatch with using control flow for dynamic config. Dynamic config is typically targeted/loaded by one sp

Re: Re: Add control mode for flink

2021-06-07 Thread Xintong Song
+1 on separating the effort into two steps: 1. Introduce a common control flow framework, with flexible interfaces for generating / reacting to control messages for various purposes. 2. Features that leverating the control flow can be worked on concurrently Meantime, keeping collectin

State migration for sql job

2021-06-07 Thread aitozi
When use flink sql, we encounter a big problem to deal with sql state compatibility.Think we have a group agg sql like ```sqlselect sum(`a`) from source_t group by `uid But if i want to add a new agg column to ```sqlselect sum(`a`), max(`a`) from source_t group by `uidThen sql state will no

Re: Re: Add control mode for flink

2021-06-07 Thread Yun Gao
Very thanks Jiangang for bringing this up and very thanks for the discussion! I also agree with the summarization by Xintong and Jing that control flow seems to be a common buidling block for many functionalities and dynamic configuration framework is a representative application that frequentl

Re: How to configure column width in Flink SQL client?

2021-06-07 Thread Ingo Bürk
Hi Svend, unfortunately the column width in the SQL client cannot currently be configured. Regards Ingo On Mon, Jun 7, 2021 at 4:19 PM Svend wrote: > > Hi everyone, > > When using the Flink SQL client and displaying results interactively, it > seems the values of any column wider than 24 char

Re: DataStream API in Batch Execution mode

2021-06-07 Thread Guowei Ma
Hi, Macro I think you could try the `FileSource` and you could find an example from [1]. The `FileSource` would scan the file under the given directory recursively. Would you mind opening an issue for lacking the document? [1] https://github.com/apache/flink/blob/master/flink-connectors/flink-con

Re: Add control mode for flink

2021-06-07 Thread kai wang
I'm big +1 for this feature. 1. Limit the input qps. 2. Change log level for debug. in my team, the two examples above are needed JING ZHANG 于2021年6月8日周二 上午11:18写道: > Thanks Jiangang for bringing this up. > As mentioned in Jiangang's email, `dynamic configuration framework` > provides ma

Re: Add control mode for flink

2021-06-07 Thread JING ZHANG
Thanks Jiangang for bringing this up. As mentioned in Jiangang's email, `dynamic configuration framework` provides many useful functions in Kuaishou, because it could update job behavior without relaunching the job. The functions are very popular in Kuaishou, we also see similar demands in maillist

Re: sql multisink explain result is more expensive than expected

2021-06-07 Thread Luan Cooper
should I post in Dev user list ? On Mon, 7 Jun 2021 at 18:56 Luan Cooper wrote: > Hi > > We're using multi sink in sql with view, the TestCase is > > """java > @Test > def testJoinTemporalTableWithViewWithFilterPushDown(): Unit = { > createLookupTable("LookupTableAsync1", new AsyncTableF

Re: Add control mode for flink

2021-06-07 Thread 刘建刚
Thanks Xintong Song for the detailed supplement. Since flink is long-running, it is similar to many services. So interacting with it or controlling it is a common desire. This was our initial thought when implementing the feature. In our inner flink, many configs used in yaml can be adjusted by dyn

Re: Re: Re: Failed to cancel a job using the STOP rest API

2021-06-07 Thread Thomas Wang
This is actually a very simple job that reads from Kafka and writes to S3 using the StreamingFileSink w/ Parquet format. I'm all using Flink's API and nothing custom. Thomas On Sun, Jun 6, 2021 at 6:43 PM Yun Gao wrote: > Hi Thoms, > > Very thanks for reporting the exceptions, and it seems to b

DataStream API in Batch Execution mode

2021-06-07 Thread Marco Villalobos
How do I use a hierarchical directory structure as a file source in S3 when using the DataStream API in Batch Execution mode? I have been trying to find out if the API supports that, because currently our data is organized by years, halves, quarters, months, and but before I launch the job, I flat

Re: Flink app performance test framework

2021-06-07 Thread luck li
Thanks Yangze. Nextmark is useful to me. Best regards > On Jun 6, 2021, at 8:08 PM, Yangze Guo wrote: > > Hi, Luck, > > I may not fully understand your requirements. If you just want to test > the performance of typical streaming jobs with the Flink, you can > refer to the nexmark[1]. If you j

Re: Allow setting job name when using StatementSet

2021-06-07 Thread Nicolaus Weidner
Hi Yuval, I am not familiar with the Table API, but in the fragment you posted, the generated job name is only used as default if configuration option pipeline.name is not set. Can't you just set that to the name you want to have? Best wishes, Nico On Mon, Jun 7, 2021 at 10:09 AM Yuval Itzchakov

Re: Stream processing into single sink to multiple DB Schemas

2021-06-07 Thread Maciej Obuchowski
Hey, We had similar problem, but with 1000s of tables. I've created issue [1] and PR with internally used solution [2], but unfortunately, there seems to be no interest in upstreaming this feature. Thanks, Maciej [1] https://issues.apache.org/jira/browse/FLINK-21643 [2] https://github.com/apache/

Re: Stream processing into single sink to multiple DB Schemas

2021-06-07 Thread Nicolaus Weidner
Hi Tamir, I assume you want to use the Jdbc connector? You can use three filters on your input stream to separate it into three separate streams, then add a sink to each of those (see e.g. [1]). Then you can have a different SQL statement for each of the three sinks. If you specify the driver name

How to configure column width in Flink SQL client?

2021-06-07 Thread Svend
Hi everyone, When using the Flink SQL client and displaying results interactively, it seems the values of any column wider than 24 characters is truncated, which is indicated by a '~' character, e.g. the "member_user_id" below: ``` SELECT metadata.true_as_of_timestamp_millis, member_user_i

Re: Is it possible to customize avro schema name when using SQL

2021-06-07 Thread Nicolaus Weidner
Hi Tao, This is currently not possible using Table API, though this will likely change in a future version. Currently, you would have to do that using the Datastream API [1] and then switch to the Table API. Best wishes, Nico [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/

Re: Multiple Exceptions during Load Test in State Access APIs with RocksDB

2021-06-07 Thread Chirag Dewan
Hi, I think I got my issue. Would help if someone can confirm it :) I am using a NFS filesystem for storing my checkpoints and my Flink cluster is running on a K8 with 2 TMs and 2 JMs.  All my pods share the NFS PVC with state.checkpoint.dir and we also missed setting the RocksDB local dir. Does

Re: recover from svaepoint

2021-06-07 Thread Piotr Nowojski
Hi, Thanks Tianxin and 周瑞' for reporting and tracking down the problem. Indeed that could be the reason behind it. Have either of you already created a JIRA ticket for this bug? > Concerning the required changing of the UID of an operator Piotr, is this a known issue and documented somewhere? I f

Re: Add control mode for flink

2021-06-07 Thread Jark Wu
Thanks Xintong for the summary, I'm big +1 for this feature. Xintong's summary for Table/SQL's needs is correct. The "custom (broadcast) event" feature is important to us and even blocks further awesome features and optimizations in Table/SQL. I also discussed offline with @Yun Gao several times

sql multisink explain result is more expensive than expected

2021-06-07 Thread Luan Cooper
Hi We're using multi sink in sql with view, the TestCase is """java @Test def testJoinTemporalTableWithViewWithFilterPushDown(): Unit = { createLookupTable("LookupTableAsync1", new AsyncTableFunction1) util.addTable( """ |CREATE TEMPORARY VIEW v_vvv AS |SELECT *

Re: After configuration checkpoint strategy, Flink Job cannot restart when job failed

2021-06-07 Thread Chesnay Schepler
The default number of restart attempts is 1. You need to explicitly configure it to allow more failures. On 6/7/2021 11:53 AM, 1095193...@qq.com wrote: Hi community, I h

Re: Corrupted unaligned checkpoints in Flink 1.11.1

2021-06-07 Thread Piotr Nowojski
Hi Alex, A quick question. Are you using incremental checkpoints? Best, Piotrek sob., 5 cze 2021 o 21:23 napisał(a): > Small correction, in T4 and T5 I mean Job2, not Job 1 (as job 1 was save > pointed). > > Thank you, > Alex > > On Jun 4, 2021, at 3:07 PM, Alexander Filipchik > wrote: > > 

Allow setting job name when using StatementSet

2021-06-07 Thread Yuval Itzchakov
Hi, Currently when using StatementSet, the name of the job is autogenerated by the runtime: [image: image.png] Is there any reason why there shouldn't be an overload that allows the user to explicitly specify the name of the running job? -- Best Regards, Yuval Itzchakov.