Re: kafka partitions, data locality

2019-04-30 Thread Fabian Hueske
efan Richter [mailto:s.rich...@ververica.com] > *Sent:* Friday, April 26, 2019 11:15 AM > *To:* Smirnov Sergey Vladimirovich (39833) > *Cc:* Dawid Wysakowicz ; Ken Krugler < > kkrugler_li...@transpac.com>; user@flink.apache.org; d...@flink.apache.org > *Subject:* Re: kafk

RE: kafka partitions, data locality

2019-04-29 Thread Smirnov Sergey Vladimirovich (39833)
rugler mailto:kkrugler_li...@transpac.com>> Cc: user@flink.apache.org<mailto:user@flink.apache.org>; d...@flink.apache.org<mailto:d...@flink.apache.org> Subject: Re: kafka partitions, data locality Hi Smirnov, Actually there is a way to tell Flink that data is already partitioned. You can

Re: kafka partitions, data locality

2019-04-26 Thread Stefan Richter
> Cc: user@flink.apache.org; d...@flink.apache.org > Subject: Re: kafka partitions, data locality > > Hi Smirnov, > > Actually there is a way to tell Flink that data is already partitioned. You > can try the reinterpretAsKeyedStream[1] method. I must warn you though this &g

RE: kafka partitions, data locality

2019-04-26 Thread Smirnov Sergey Vladimirovich (39833)
: kafka partitions, data locality Hi Smirnov, Actually there is a way to tell Flink that data is already partitioned. You can try the reinterpretAsKeyedStream[1] method. I must warn you though this is an experimental feature. Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release

Re: kafka partitions, data locality

2019-04-25 Thread Dawid Wysakowicz
; With best regards, > > Sergey > > *From:*Ken Krugler [mailto:kkrugler_li...@transpac.com] > *Sent:* Wednesday, April 17, 2019 9:23 PM > *To:* Smirnov Sergey Vladimirovich (39833) > *Subject:* Re: kafka partitions, data locality > >   > > Hi Sergey, > >   >

RE: kafka partitions, data locality

2019-04-19 Thread Smirnov Sergey Vladimirovich (39833)
: kafka partitions, data locality Hi Sergey, As you surmised, once you do a keyBy/max on the Kafka topic, to group by clientId and find the max, then the topology will have a partition/shuffle to it. This is because Flink doesn’t know that client ids don’t span Kafka partitions. I don’t know of

kafka partitions, data locality

2019-04-17 Thread Smirnov Sergey Vladimirovich (39833)
Hello, We planning to use apache flink as a core component of our new streaming system for internal processes (finance, banking business) based on apache kafka. So we starting some research with apache flink and one of the question, arises during that work, is how flink handle with data locality