Re: Question about processing a 3-level List data type in parquet

2020-11-06 Thread Peter Huang
Hi Naehee, Thanks for reporting the issue. Yes, it is a bug in the ParquetInputFormat. Would you please create a jira ticket and assign to me. I will try to fix it by the end of this weekend. My Jira account name Zhenqiu Huang. Thanks Best Regards Peter Huang On Wed, Nov 4, 2020 at 11:57 PM Na

Re: union stream vs multiple operators

2020-11-06 Thread Alexey Trenikhun
Ok, thank you. From: Chesnay Schepler Sent: Thursday, November 5, 2020 3:15:28 PM To: Alexey Trenikhun ; Flink User Mail List Subject: Re: union stream vs multiple operators I don't think the first option has any benefit. On 11/5/2020 1:19 AM, Alexey Trenikhun

Join Bottleneck

2020-11-06 Thread Rex Fenley
Hello, I have a Job that's a series of Joins, GroupBys, and Aggs and it's bottlenecked in one of the joins. The join's cardinality is ~300 million rows on the left and ~200 million rows on the right all with unique keys. I'm seeing this in the plan for that bottlenecked Join. Join(joinType=[Inner

Re: Rules of Thumb for Setting Parallelism

2020-11-06 Thread Rex Fenley
Great, thanks! So just to confirm, configure # of task slots to # of core nodes x # of vCPUs? I'm not sure what you mean by "distribute them across both jobs (so that the total adds up to 32)". Is it configurable how many task slots a job can receive, so in this case I'd provide ~30/36 * 32 task

Re: Rules of Thumb for Setting Parallelism

2020-11-06 Thread Till Rohrmann
Hi Rex, as a rule of thumb I recommend configuring your TMs with as many slots as they have cores. So in your case your cluster would have 32 slots. Then depending on the workload of your jobs you should distribute them across both jobs (so that the total adds up to 32). A high number of operators

Re: How to use properly the function: withTimestampAssigner((event, timestamp) ->..

2020-11-06 Thread Till Rohrmann
Hi Simone, The problem is that the Java 1.8 compiler cannot do type inference when chaining methods [1]. The solution would be WatermarkStrategy wmStrategy = WatermarkStrategy .forMonotonousTimestamps() .withTimestampAssigner((event

Re: cannot pull statefun docker image

2020-11-06 Thread Tzu-Li (Gordon) Tai
Hi, The Dockerfiles in the examples in the flink-statefun repo currently work against images built from snapshot development branches. Ververica has been hosting StateFun base images for released versions: https://hub.docker.com/r/ververica/flink-statefun You can change `FROM flink-statefun:*` to

cannot pull statefun docker image

2020-11-06 Thread Lian Jiang
Hi, I tried to build statefun-greeter-example docker image with "docker build ." but cannot pull the base statefun docker image due to access denied. Any idea? Thanks. $ docker login Authenticating with existing credentials... Login Succeeded lianj:~/repo/flink-statefun/statefun-examples/statefun

How to use properly the function: withTimestampAssigner((event, timestamp) ->..

2020-11-06 Thread Simone Cavallarin
Hi, I'm taking the timestamp from the event payload that I'm receiving from Kafka. I'm struggling to get the time and I'm confused on how I should use the function ".withTimestampAssigner()". I'm receiving an error on event.getTime() that is telling me: "cannot resolve method "Get Time" in "Obj

Re: Native kubernetes setup

2020-11-06 Thread Yang Wang
Actually, in our document, we have provided a command[1] to create the service account. It is similar to your yaml file. $ kubectl create clusterrolebinding flink-role-binding-default --clusterrole=edit --serviceaccount=default:default Unfortunately, we could not support mounting a PVC. We plan

Re: [Discuss] Add JobListener (hook) in flink job lifecycle

2020-11-06 Thread Flavio Pompermaier
I think it's ok.. I suggest also to add JobStatus to onJobExecuted() so you can immediately know if the job finished successfully or if it is was failed or canceled. Thanks for the help, Flavio On Fri, Nov 6, 2020 at 10:41 AM Kostas Kloudas wrote: > Hi Flavio, > > Coould this https://issues.apa

Re: [Discuss] Add JobListener (hook) in flink job lifecycle

2020-11-06 Thread Kostas Kloudas
Hi Flavio, Coould this https://issues.apache.org/jira/browse/FLINK-20020 help? Cheers, Kostas On Thu, Nov 5, 2020 at 9:39 PM Flavio Pompermaier wrote: > > Hi everybody, > I was trying to use the JobListener in my job but onJobExecuted() on Flink > 1.11.0 but I can't understand if the job succe

Re: Failure to execute streaming SQL query

2020-11-06 Thread Danny Chan
Hi, Satyam ~ What version of Flink release did you use? I tested your first SQL statements in local and they both works great. Your second SQL statement fails because currently we does not support stream-stream join on time attributes because the join would breaks the semantic of time attribute (

Re: Flink TLS in K8s

2020-11-06 Thread Patrick Eifler
Hi Chesney, Thanks for the hint. I have mounted my certs in both job and taskmanager volume mounts. When the containers bootup I get the log that the ssl store is successfully loaded. Note: I use the same keystore setup to connect to secured Kafka Cluster and this works. How would you suggest t