Thanks, things are clear so far.
Hi,
First, regarding your use-case questions:
1. If you do a keyBy(..) on the "word", then all occurrences of the same
word will end up on the same machine.
2. This depends on the StateBackend that you use. For example, there is the
FsStateBackend, which keeps state in memory and writes checkpoints to a file
system an
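
To illustrate point 1, here is a minimal sketch in plain Python (not Flink's API) of how partitioning by key sends every occurrence of a word to the same parallel worker. The helper names, the CRC32 hash and the parallelism value are assumptions made for illustration only; Flink's actual key-to-task assignment differs in detail.

```python
import zlib
from collections import Counter, defaultdict


def worker_for(word: str, parallelism: int) -> int:
    """Hypothetical helper: map a key to a worker index via a stable hash."""
    return zlib.crc32(word.encode("utf-8")) % parallelism


def keyed_word_count(words, parallelism=4):
    """Route each word to exactly one worker by its key, then count per worker."""
    per_worker = defaultdict(Counter)
    for word in words:
        per_worker[worker_for(word, parallelism)][word] += 1
    return per_worker


if __name__ == "__main__":
    counts = keyed_word_count("to be or not to be".split())
    for worker, counter in sorted(counts.items()):
        # Each word appears under a single worker, with its full count.
        print(worker, dict(counter))
```

Because the worker index depends only on the word, the count for a given word is held in one place, so no cross-machine merge is needed to get its total.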
By mentioning 100TB "in my context" I mean something like saving the current
state at some point in time to "backup" or "direct access" storage, and then
continuing with the next 100TB/hours/days of streamed data.
So no, it is not about a finite data set.
On Mon, May 23, 2016 at 11:13 AM, Matthias J. Sax wrote:
Are you talking about a streaming or a batch job?
You mention a "text stream" but also say you want to stream 100TB,
which suggests you have a finite data set and are using the DataSet API.
-Matthias
On 05/22/2016 09:50 PM, Xtra Coder wrote:
Hello,
A newbie question about how Flink's WordCount will actually work at
scale.
I've read and watched quite a few high-level presentations but have not
found reasonably clear answers to the following …
Use-case:
--
there is a huge text stream with a highly variable set of words – let's say
1BLN