Re: Flink streaming with 1+ TB of managed state

Robert Metzger Sat, 19 Nov 2016 04:16:43 -0800

Hi Steven,

According to this presentation, King.com is using Flink with terabytes of
state:
http://flink-forward.org/wp-content/uploads/2016/07/Gyulo-Fo%CC%81ra-RBEA-Scalable-Real-Time-Analytics-at-King.compressed.pdf
(see Page 4 specifically)


For the 90GB experiment, what is the expected time for transferring 90 GB
of data in your environment?

Regards,
Robert


On Sat, Nov 19, 2016 at 1:41 AM, Steven Ruppert <ste...@fullcontact.com>
wrote:

> Hi,
>
> Is anybody currently running flink streaming with north of a terabyte
> (TB) of managed state? If you are, can you share your experiences wrt
> hardware, tuning, recovery situations, etc?
>
> I'm evaluating flink for a use case I estimate will take around 5TB of
> state in total, but looking at the actual implementation of the
> rocksDB state and current lack of incremental checkpointing or
> recovery, it doesn't seem feasible.
>
> I have successfully tested flink up to roughly 90GB of managed state
> in rocksDB, but that's taking 5 minutes to checkpoint or recover (on a
> pretty beefy YARN cluster).
>
> For most cases, my state updates are idempotent and can be moved to
> something external. However, it'd be nice to know of any current of
> future plans for running flink at the terabyte scale.
>
> --Steven
>
> --
> *CONFIDENTIALITY NOTICE: This email message, and any documents, files or
> previous e-mail messages attached to it is for the sole use of the intended
> recipient(s) and may contain confidential and privileged information. Any
> unauthorized review, use, disclosure or distribution is prohibited. If you
> are not the intended recipient, please contact the sender by reply email
> and destroy all copies of the original message.*
>

Re: Flink streaming with 1+ TB of managed state

Reply via email to