Lower Parallelism derives better latency

2018-01-02 Thread Netzer, Liron
Hi group, We have a standalone Flink cluster that is running on a UNIX host with 40 CPUs and 256GB RAM. There is one task manager and 24 slots were defined. When we decrease the parallelism of the Stream graph operators(each operator has the same parallelism), we see a consistent change in the l

JobManager not receiving resource offers from Mesos

2018-01-02 Thread 김동원
Hi, I try to launch a Flink cluster on top of dc/os but TaskManagers are not launched at all. What I do to launch a Flink cluster is as follows: - Click "flink" from "Catalog" on the left panel of dc/os GUI. - Click "Run service" without any modification on configuration for the purpose of test

scala 2.12 support/cross-compile

2018-01-02 Thread Hao Sun
Hi team, I am wondering if there is a schedule to support scala 2.12? If I need flink 1.3+ with scala 2.12, do I just have to cross compile myself? Is there anything blocking us from using scala 2.12? Thanks

Re: BackPressure handling

2018-01-02 Thread Vishal Santoshi
Also note that if I were to start 2 pipelines 1. Working off the head of the topic and thus not prone to the pathological case described above 2. Doing a replay and thus prone to the pathological case described above Than the 2nd pipe will stall the 1st pipeline. This seems to to point to -

Re: BackPressure handling

2018-01-02 Thread Vishal Santoshi
Thank you. On Tue, Jan 2, 2018 at 1:31 PM, Nico Kruber wrote: > Hi Vishal, > let me already point you towards the JIRA issue for the credit-based > flow control: https://issues.apache.org/jira/browse/FLINK-7282 > > I'll have a look at the rest of this email thread tomorrow... > > > Regards, > Ni

Re: BackPressure handling

2018-01-02 Thread Nico Kruber
Hi Vishal, let me already point you towards the JIRA issue for the credit-based flow control: https://issues.apache.org/jira/browse/FLINK-7282 I'll have a look at the rest of this email thread tomorrow... Regards, Nico On 02/01/18 17:52, Vishal Santoshi wrote: > Could you please point me to any

Re: BackPressure handling

2018-01-02 Thread Vishal Santoshi
Could you please point me to any documentation on the "credit-based flow control" approach On Tue, Jan 2, 2018 at 10:35 AM, Timo Walther wrote: > Hi Vishal, > > your assumptions sound reasonable to me. The community is currently > working on a more fine-grained back pressuring with credit-b

Re: keyby() issue

2018-01-02 Thread Timo Walther
Hi Jinhua, did you check the key group assignments? What is the distribution of "MathUtils.murmurHash(keyHash) % maxParallelism" on a sample of your data? This also depends on the hashCode on the output of your KeySelector. keyBy should handle high traffic well, but it is designed for key spa

Re: About Kafka08Fetcher and Kafka010Fetcher

2018-01-02 Thread Timo Walther
Maybe Gordon (in CC) can answer your question. Am 1/1/18 um 3:36 PM schrieb Jaxon Hu: In Kafka08Fetcher, it use  Map to manage multi-threads. But I notice in Kafka09Fetcher or Kafka010Fetcher, it's gone. So how Kafka09Fetcher implements multi-threads read partitions from kafka?

Re: How to stop FlinkKafkaConsumer and make job finished?

2018-01-02 Thread Timo Walther
Hi Arnaud, thanks for letting us know your workaround. I agree that this is a frequently asked topic and important in certain use cases. I'm sure that it will be solved in the near future depending on the priorities. My 2 cents: Flink is an open source project maybe somebody is willing to wo

Re: about the checkpoint and state backend

2018-01-02 Thread Timo Walther
Hi Jinhua, I will try to answer your questions: Flink checkpoints the state of each operator. For a Kafka consumer operator this is only the offset. For other operators (such as Windows or a ProcessFunction) the values/list/maps stored in the state are checkpointed. If you are interested in t

Re: about the checkpoint and state backend

2018-01-02 Thread Stefan Richter
Hi, > I have two questions: > > a) does the records/elements themselves would be checkpointed? or just > record offset checkpointed? That is, what data included in the > checkpoint except for states? No, just offsets (or something similar, depending on the source), which are part of the state o

RE: How to stop FlinkKafkaConsumer and make job finished?

2018-01-02 Thread LINZ, Arnaud
Hi, My 2 cents: not being able to programmatically nicely stop a Flink stream is what lacks most to the framework IMHO. It's a very common use case: each time you want to update the application or change its configuration you need to nicely stop & restart it, without triggering alerts, data lo

Re: Flink Kafka Consumer stops fetching records

2018-01-02 Thread Timo Walther
Hi Teena, could you tell us a bit more about your job. Are you using event-time semantics? Regards, Timo Am 1/2/18 um 6:14 AM schrieb Teena K: Hi, I am using Flink 1.4 along with Kafka 0.11. My stream job has 4 Kafka consumers each subscribing to 4 different topics. The stream from each c

Re: BackPressure handling

2018-01-02 Thread Timo Walther
Hi Vishal, your assumptions sound reasonable to me. The community is currently working on a more fine-grained back pressuring with credit-based flow control. It is on the roamap for 1.5 [1]/[2]. I will loop in Nico that might tell you more about the details. Until then I guess you have to imp

BackPressure handling

2018-01-02 Thread Vishal Santoshi
I did a simulation on session windows ( in 2 modes ) and let it rip for about 12 hours 1. Replay where a kafka topic with retention of 7 days was the source ( earliest ) 2. Start the pipe with kafka source ( latest ) I saw results that differed dramatically. On replay the pipeline stalled after

Re: S3 Access in eu-central-1

2018-01-02 Thread Nico Kruber
Sorry for the late response, but I finally got around adding this workaround to our "common issues" section with PR https://github.com/apache/flink/pull/5231 Nico On 29/11/17 09:31, Ufuk Celebi wrote: > Hey Dominik, > > yes, we should definitely add this to the docs. > > @Nico: You recently upd