Hi,

the Flink docs have a section on the RocksDB state backend:
https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/state_backends.html#the-rocksdbstatebackend

and there is also the Javadoc for RocksDBStateBackend:
<https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/contrib/streaming/state/RocksDBStateBackend.html>
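
On the original question: as I understand the keyBy suggestion from John quoted below, it amounts to keeping one small piece of state per guid (the id of the last version seen), not a window per guid. Here is a rough plain-Java sketch of that idea, with a HashMap standing in for keyed state. It uses no Flink classes; the names VersionLinker, Linked and link are made up for illustration. It also links versions purely in arrival order, so late or out-of-order versions would need extra handling.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for per-key state. Hypothetical names, not Flink API.
class VersionLinker {

    // A new version id paired with the id of the version seen before it for the same guid.
    static final class Linked {
        public final String guid;
        public final String versionId;
        public final String previousVersionId; // null for the first version of a guid

        Linked(String guid, String versionId, String previousVersionId) {
            this.guid = guid;
            this.versionId = versionId;
            this.previousVersionId = previousVersionId;
        }
    }

    // One entry per guid: the id of the last version seen for that guid.
    private final Map<String, String> lastVersionByGuid = new HashMap<>();

    // Records versionId as the latest version for guid and returns the link
    // to whatever was recorded before it (arrival order, no reordering).
    public Linked link(String guid, String versionId) {
        String previous = lastVersionByGuid.put(guid, versionId);
        return new Linked(guid, versionId, previous);
    }

    // Number of guids that currently have state.
    public int trackedGuids() {
        return lastVersionByGuid.size();
    }

    public static void main(String[] args) {
        VersionLinker linker = new VersionLinker();
        linker.link("guid-a", "v1");
        Linked second = linker.link("guid-a", "v2");
        System.out.println(second.versionId + " follows " + second.previousVersionId);
    }
}
```

The state here grows with the number of distinct guids, one entry each, which is the part a disk-backed state backend such as RocksDB is meant to take off the heap.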

Cheers,
Aljoscha

On Sun, 24 Apr 2016 at 21:58 Chen Bekor <chen.be...@gmail.com> wrote:

> cool - can you point me to some docs about how to configure Rocks DB? I
> searched the online docs and found nothing substantial. Also - If I'm using
> HDFS (S3backed ) cluster, how would that effect RocksDB? can I configure it
> to run on optimized SSD etc?
>
> any help is appreciated.
>
>
> On Sun, Apr 24, 2016 at 7:57 AM, John Sherwood <j...@vt.edu> wrote:
>
>> This sounds like you have some per-key state to keep track of, so the
>> 'correct' way to do it would be to keyBy the guid. I believe that if you
>> run your environment using the Rocks DB state backend you will not OOM
>> regardless of the number of GUIDs that are eventually tracked. Whether
>> flink/stream processing is the most effective way to achieve your goal, I
>> can't say, but I am fairly confident that this particular aspect is not a
>> problem.
>>
>> On Sat, Apr 23, 2016 at 1:13 AM, Chen Bekor <chen.be...@gmail.com> wrote:
>>
>>> hi all,
>>>
>>> I have a stream of incoming object versions (objects change over time)
>>> and a requirement to fetch from a datastore the last known object version
>>> in order to link it with the id of the new version, so that I end up with
>>> a linked list of object versions.
>>>
>>> all object versions contain the same guid, so I was thinking about using
>>> flink streaming in order to assure ordering and avoid concurrency / race
>>> conditions in the linkage process (object version might arrive unordered or
>>> may arrive at spikes)
>>>
>>> if I use the object guid as a key for a keyed stream I am concerned I
>>> will end up with millions of windowed streams hence causing OOM.
>>>
>>> what do you think should be the right approach? do you think flink is
>>> the right technology for this task?
>>>
>>
>>
>