Thank you very much, got it.

On Tue, Nov 27, 2018 at 12:53 PM Fabian Hueske <fhue...@gmail.com> wrote:

> Hi Avi,
>
> I'd definitely go for approach #1.
> Flink will hash-partition the records across all nodes. This is basically
> the same as a distributed key-value store sharding its keys.
> I would not try to fine-tune the partitioning. You should try to use as
> many keys as possible to ensure an even distribution of keys. This will
> also allow you to scale your application to more tasks later.
>
> Best, Fabian
>
> On Tue, Nov 27, 2018 at 05:21, yinhua.dai <yinhua.2...@outlook.com> wrote:
>
>> In general, approach #1 is OK, but you may have to use a hash-based key
>> selector if you have heavy data skew.
>>
>> --
>> Sent from:
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
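A minimal sketch of both suggestions, assuming Flink's Java DataStream API. The (userId, amount) event shape, the class name KeyBySketch, and SALT_BUCKETS are hypothetical and not from the thread; the second pipeline shows one way to realize a "hash-based key selector" for skew by salting hot keys, pre-aggregating per (key, salt), and re-aggregating per original key.

// Minimal sketch, assuming Flink's Java DataStream API. Names and the
// (userId, amount) event shape are hypothetical, not from the thread.
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class KeyBySketch {

    // Hypothetical fan-out used to spread a hot key over several sub-keys.
    private static final int SALT_BUCKETS = 8;

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical input: (userId, amount) events.
        DataStream<Tuple2<String, Long>> events = env.fromElements(
                Tuple2.of("user-1", 10L),
                Tuple2.of("user-2", 20L),
                Tuple2.of("user-1", 5L));

        // Approach #1: keyBy on a high-cardinality field. Flink hash-partitions
        // the keyed stream across all parallel tasks, much like a distributed
        // key-value store sharding its keys.
        events
                .keyBy(e -> e.f0)
                .sum(1)
                .print();

        // One way to implement a "hash-based key selector" for heavy skew:
        // derive a deterministic salt from the record, pre-aggregate per
        // (key, salt) window, then combine the partial sums per original key.
        DataStream<Tuple3<String, Integer, Long>> salted = events
                .map(e -> Tuple3.of(e.f0, (e.hashCode() & Integer.MAX_VALUE) % SALT_BUCKETS, e.f1))
                .returns(Types.TUPLE(Types.STRING, Types.INT, Types.LONG));

        salted
                .keyBy(e -> e.f0 + "#" + e.f1)                               // spread the hot key over sub-keys
                .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                .sum(2)                                                      // partial sums per (key, salt)
                .keyBy(e -> e.f0)                                            // re-key by the original key
                .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                .sum(2)                                                      // final sum per key
                .print();

        env.execute("keyBy partitioning sketch");
    }
}

The salt is derived from the record itself rather than from a random number, so the key selector stays deterministic, which Flink requires for keyed state; how many salt buckets are worthwhile depends on the actual skew and parallelism.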