Re: About stateful transformations

2016-10-27 Thread Juan Rodríguez Hortalá
Hi Aljoscha, Thanks for your answer. At least by keeping only the latest one we don't have retention problems with the state backend, and for now I guess we could use manually triggered savepoints if we needed to store the history of the state. Thanks, Juan On Tue, Oct 25, 2016 at 6:58 AM, Aljo

Re: About stateful transformations

2016-10-25 Thread Aljoscha Krettek
Hi, there is already a mechanism for that. Currently, Flink will only keep the most recent, successful checkpoint. We are currently working on making that configurable so that, for example, the last n successful checkpoints can be kept. Cheers, Aljoscha On Tue, 25 Oct 2016 at 06:47 Juan Rodríguez

Re: About stateful transformations

2016-10-24 Thread Juan Rodríguez Hortalá
Hi Gyula, Thanks a lot for your response, it was very clear. I understand that there is no problem of small files due to checkpointing not being incremental. I also understand that each worker will interpret a file:// URL as local to its own file system, which works OK if all workers have a remote

Re: About stateful transformations

2016-10-24 Thread Gyula Fóra
Hi Juan, Let me try to answer some of your questions :) We have been running Flink Streaming at King for quite some time now with multiple jobs having several hundred gigabytes of KV state stored in RocksDB. I would say the RocksDB state backend is definitely the best choice at the moment for large d

About stateful transformations

2016-10-23 Thread Juan Rodríguez Hortalá
Hi all, I don't have much experience with Flink, so please forgive me if I ask some obvious questions. I was taking a look at the documentation on stateful transformations in Flink at https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/state.html. I'm mostly interested in Flink for str