Hi Steven, According to this presentation, King.com is using Flink with terabytes of state: http://flink-forward.org/wp-content/uploads/2016/07/Gyulo-Fo%CC%81ra-RBEA-Scalable-Real-Time-Analytics-at-King.compressed.pdf (see Page 4 specifically)
For the 90GB experiment, what is the expected time for transferring 90 GB of data in your environment? Regards, Robert On Sat, Nov 19, 2016 at 1:41 AM, Steven Ruppert <ste...@fullcontact.com> wrote: > Hi, > > Is anybody currently running flink streaming with north of a terabyte > (TB) of managed state? If you are, can you share your experiences wrt > hardware, tuning, recovery situations, etc? > > I'm evaluating flink for a use case I estimate will take around 5TB of > state in total, but looking at the actual implementation of the > rocksDB state and current lack of incremental checkpointing or > recovery, it doesn't seem feasible. > > I have successfully tested flink up to roughly 90GB of managed state > in rocksDB, but that's taking 5 minutes to checkpoint or recover (on a > pretty beefy YARN cluster). > > For most cases, my state updates are idempotent and can be moved to > something external. However, it'd be nice to know of any current of > future plans for running flink at the terabyte scale. > > --Steven > > -- > *CONFIDENTIALITY NOTICE: This email message, and any documents, files or > previous e-mail messages attached to it is for the sole use of the intended > recipient(s) and may contain confidential and privileged information. Any > unauthorized review, use, disclosure or distribution is prohibited. If you > are not the intended recipient, please contact the sender by reply email > and destroy all copies of the original message.* >