Re: How does setMaxParallelism work

2018-03-27 Thread Jörn Franke
What was the input format, the size and the program that you tried to execute > On 28. Mar 2018, at 08:18, Data Engineer wrote: > > I went through the explanation on MaxParallelism in the official docs here: > https://ci.apache.org/projects/flink/flink-docs-master/ops/production_ready.html#set-m

Re: SSL config on Kubernetes - Dynamic IP

2018-03-27 Thread Sampath Bhat
Hi Edward, You can use this parameter in flink-conf.yaml to supress the hostname checking in certificates. If it suits your purpose. security.ssl.verify-hostname: false Secondly even I'm running flink 1.4 on K8s, I used to get the same error stack trace as you mentioned, while the blob client was

How does setMaxParallelism work

2018-03-27 Thread Data Engineer
I went through the explanation on MaxParallelism in the official docs here: https://ci.apache.org/projects/flink/flink-docs-master/ops/production_ready.html#set-maximum-parallelism-for-operators-explicitly However, I am not able to figure out how Flink decides the parallelism value. For instance,

Unable to load AWS credentials: Flink 1.2.1 + S3 + Kubernetes

2018-03-27 Thread Bajaj, Abhinav
Hi, I am trying to use Flink 1.2.1 with RockDB as statebackend and S3 for checkpoints. I am using Flink 1.2.1 docker images and running them in Kubernetes cluster. I have followed the steps documented in the Flink documentation - https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/

[ANNOUNCE] Weekly community update #13

2018-03-27 Thread Till Rohrmann
Dear squirrels, here is the next weekly community update thread. Please post any news and updates you want to share with the community here. # Release 1.5 The community is still focused on hardening the Flink 1.5 release and adding more automated end-to-end tests. If you want to help with the ef

Re: SSL config on Kubernetes - Dynamic IP

2018-03-27 Thread Till Rohrmann
Hi Edward, could you please file a JIRA issue for this problem. It might be as simple as that the TaskManager's network stack uses the IP instead of the hostname as you suggested. But we have to look into this to be sure. Also the logs of the JobManager as well as the TaskManagers could be helpful

Re: SSL config on Kubernetes - Dynamic IP

2018-03-27 Thread Christophe Jolif
I suspect this relates to: https://issues.apache.org/jira/browse/FLINK-5030 For which there was a PR at some point but nothing has been done so far. It seems the current code explicitly uses the IP vs Hostname for Netty SSL configuration. Without that I'm really wondering how people are reasonabl

timeWindow emits records before window ends?

2018-03-27 Thread NEKRASSOV, ALEXEI
Hello, With time characteristic set to IngestionTime I expected "timeWindow(Time.minutes(3))" to NOT produce any records in the first 3 minutes of running the job, and yet it does emit the record before 3 minutes elapse. Am I doing something wrong? Or my understanding of timeWindow is incorrect?

Re: Table/SQL Kafka Sink Question

2018-03-27 Thread Timo Walther
Hi Alexandru, the KafkaTableSink does not expose all features of the underlying DataStream API. Either you convert your table program to the DataStream API for the sink operation or you just extend a class like Kafka010JsonTableSink and customize it. Regards, Timo Am 27.03.18 um 11:59 schr

Re: regarding the use of colocation groups

2018-03-27 Thread Chesnay Schepler
Hello, your first use-case should be achievable by using a custom partitioner , probably with a KeySelector that returns the word. As for the second use-case, typically this would be achieved b

SSL config on Kubernetes - Dynamic IP

2018-03-27 Thread Edward Alexander Rojas Clavijo
Hi all, Currently I have a Flink 1.4 cluster running on kubernetes and with SSL configuration based on https://ci.apache.org/projects/flink/flink-docs- master/ops/security-ssl.html. However, as the IP of the nodes are dynamic (from the nature of kubernetes), we are using only the DNS which we can

regarding the use of colocation groups

2018-03-27 Thread Konstantinos Barmpis
Hello, I was wondering how to properly use colocation groups (if applicable) to achieve the required functionality in the following two simple contrived use-cases (focusing on the essence of the problem), both of which aim to be executed on a multi-node cluster (2 or more slaves and a master), wit

Re: Queryable State

2018-03-27 Thread Vishal Santoshi
Thank you for the clarification. On Wed, Mar 21, 2018, 4:28 AM Kostas Kloudas wrote: > Hi Vishal, > > As Fabian said, queryable state is just a feature that exposes the state > kept within Flink, and it is not made to > replace functionality that would otherwise be made by a sink. In the > futur

Re: java.lang.IllegalArgumentException: The implementation of the provided ElasticsearchSinkFunction is not serializable. The object probably contains or references non-serializable fields.

2018-03-27 Thread chandresh pancholi
Hi, Thank you for the response. I have made the suggested changes But now I am getting "Caused by: java.lang.NoClassDefFoundError: scala/Product$class" I am running my application on SpringBoot 2.0 version. Is there better platform to run Flink Code? Caused by: java.lang.NoClassDefFoundError: sca

Re: java.lang.IllegalArgumentException: The implementation of the provided ElasticsearchSinkFunction is not serializable. The object probably contains or references non-serializable fields.

2018-03-27 Thread Chesnay Schepler
Your anonymous ElasticsearchSinkFunction accesses the client variable that is defined outside of the function. For the function to be serializable, said Client must be as well. I suggest to turn your function into a named class with a constructor that accepts the indexName. On 27.03.2018 12:1

java.lang.IllegalArgumentException: The implementation of the provided ElasticsearchSinkFunction is not serializable. The object probably contains or references non-serializable fields.

2018-03-27 Thread chandresh pancholi
Flow Producer -> Kafka(Avro) -> Flink Connector with Avro deseriser -> FLink -> ES Kafka - Latest version Flink : 1.4.2 ES: 5.5.2 @Service public class FlinkStream { @Autowired private ClientService clientService; @Autowired private AppConfig appConfig; @PostConstruct pu

Re: Programmatic creation of YARN sessions and deployment (running) Flink jobs on it.

2018-03-27 Thread Chesnay Schepler
Hello, I think the flink-conf.yaml should only be required on the node on which you call yarn-session.sh. For starting the session cluster programmatically you would have to look into the YarnClusterDescriptor (for starting the session cluster) and the YarnClusterClient for submitting jobs (

Re: Error running on Hadoop 2.7

2018-03-27 Thread Stephan Ewen
Thanks, in that case it sounds like it is more related to Hadoop classpath mixups, rather than class loading. On Mon, Mar 26, 2018 at 3:03 PM, ashish pok wrote: > Stephan, we are in 1.4.2. > > Thanks, > > -- Ashish > > On Mon, Mar 26, 2018 at 7:38 AM, Stephan Ewen > wrote: > If you are on Flink

Re: Table/SQL Kafka Sink Question

2018-03-27 Thread Alexandru Gutan
That's what I concluded as well after checking the docs and source code. I'm thinking to add another job using the Stream API (where it is possible), that will ingest the data resulted from by Table/SQL API job, and that will add the message key into Kafka. On 27 March 2018 at 12:55, Chesnay Sche

Re: Table/SQL Kafka Sink Question

2018-03-27 Thread Chesnay Schepler
Hello, as far as i can this is not possible. I'm including Timo, maybe he can explain why this isn't supported. On 26.03.2018 21:56, Pavel Ciorba wrote: Hi everyone! Can I specify a *message key* using the Kafka sink in the Table/SQL API ? The goal is to sink each row as JSON along side with

Re: Keyby connect for a one to many relationship - DataStream API - Ride Enrichment (CoProcessFunction)

2018-03-27 Thread Chesnay Schepler
You can still connect the streams but it will be more complex than the reference solution. You will have to store the events from B in a ListState instead. If an A arrives, store it in the value state, emit a tuple (A, B_x) for every stored B, and clear B. From that point on, emit a new tuple (