Re: sharded state, 2-step operation

2016-08-24 Thread Stephan Ewen
Hi! The "feedback loop" sounds like a solution, yes. Actually, that works well with the CoMap / CoFlatMap - one input to the CoMap would be the original value, the other input the feedback value. https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/index.html#datastream-transform
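
To make the feedback-loop idea concrete, here is a minimal sketch in plain Python. It is not Flink code and does not use any Flink API: the class `TwoInputKeyedOperator`, the function `run_with_feedback`, and the subtract/add semantics are hypothetical stand-ins chosen for illustration. The only thing it is meant to show is the shape described above: one operator with two inputs that share the same keyed state, where the first input carries the original value and the second input carries the value fed back from the first step.

```python
from collections import defaultdict, deque


class TwoInputKeyedOperator:
    """Stand-in for a keyed two-input operator (CoFlatMap-like).

    Both inputs read and write the same per-key state. The state
    semantics here (subtract from sender, add to receiver) are an
    assumption made only for this example.
    """

    def __init__(self):
        self.state = defaultdict(int)

    def flat_map1(self, record):
        # Input 1: the original value, keyed on the first field.
        sender, receiver, amount = record
        self.state[sender] -= amount
        # Emit a feedback record that will be keyed on the second field.
        yield (receiver, amount)

    def flat_map2(self, record):
        # Input 2: the feedback value, keyed on what was the second field.
        receiver, amount = record
        self.state[receiver] += amount
        yield (receiver, self.state[receiver])


def run_with_feedback(records):
    """Drive the operator, routing flat_map1 output back into flat_map2."""
    op = TwoInputKeyedOperator()
    feedback = deque()
    results = []
    for record in records:
        feedback.extend(op.flat_map1(record))
        # Drain the feedback queue before taking the next original record.
        while feedback:
            results.extend(op.flat_map2(feedback.popleft()))
    return op.state, results


if __name__ == "__main__":
    state, results = run_with_feedback([("Alice", "Bob", 1), ("Bob", "Carol", 2)])
    print(dict(state))
    print(results)
```

In a real job the feedback edge is asynchronous, so feedback records may interleave with later original records; the sketch drains the queue eagerly only to keep the example deterministic.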

Re: sharded state, 2-step operation

2016-08-23 Thread Michael Warnock
Another approach I'm considering, which feels pretty kludgy, but I think should be acceptable for my current use: Only one stateful op, keyed on the same field, but with a flag field indicating the actual operation to be performed. The results of this op are output to a Kafka (or whatever) queue,

Re: sharded state, 2-step operation

2016-08-23 Thread Michael Warnock
Thanks for the quick response! I've been wondering about Connected streams and CoFlatMap, but either I don't see all the ways they can be used, or they don't solve my problem. Do you know of any examples outside of the documentation? My searches for "flink comap example" and similar haven't turne

Re: sharded state, 2-step operation

2016-08-23 Thread Stephan Ewen
Hi! This is a tricky one. State access and changes are not shared across operators in Flink. We chose that design because it makes it possible to work on "local" state in each operator - state automatically shards with the computation - no locking / concurrency implications - asynchronous pe
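
As a rough illustration of what "state automatically shards with the computation" means, the following plain-Python sketch partitions keyed state across a fixed number of shards. It is an assumption-laden toy, not Flink's actual partitioning scheme: the class `ShardedState`, its methods, and the use of CRC32 for key assignment are all hypothetical. The point is only that each key is owned by exactly one shard, so an update touches one local dictionary and needs no locking.

```python
import zlib


class ShardedState:
    """Toy model of key-partitioned state (hypothetical, not Flink's scheme)."""

    def __init__(self, parallelism):
        # One independent state map per parallel instance.
        self.shards = [dict() for _ in range(parallelism)]

    def shard_for(self, key):
        # Deterministic key-to-shard assignment; CRC32 is an arbitrary choice.
        return zlib.crc32(key.encode("utf-8")) % len(self.shards)

    def update(self, key, delta):
        # Only the owning shard is read or written for this key.
        shard = self.shards[self.shard_for(key)]
        shard[key] = shard.get(key, 0) + delta
        return shard[key]


if __name__ == "__main__":
    state = ShardedState(parallelism=4)
    state.update("Alice", 1)
    print(state.update("Alice", 2))
```

The flip side, as the message says, is that an operator keyed on one field cannot reach into the state held for a different key, which is why a two-step update needs a second keyed step or a feedback edge.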

sharded state, 2-step operation

2016-08-23 Thread Michael Warnock
I'm trying to do something that seems like it should be possible, but my implementation doesn't behave as expected, and I'm not sure how else to express it. Let's say the stream is composed of tuples like this: (Alice, Bob, 1) and I want to keyBy(1), flatMap with state associated with Alice, then