Hey guys! I've been thinking about this one today:
Say you have a stream of data in the form of (id, value). This will evidently be a DataStream of Tuple2. I need to cache this data in some sort of static stream (perhaps even a DataSet). Then, if I see an id in the input stream that was previously stored, I should update its value with the most recent entry.

For example, given the input:

1, 3
2, 5
6, 7
1, 5

the value cached for id 1 should be 5.

How would you recommend caching the data? And what would be used for the update? A join function?

As far as I can see, you cannot really combine DataSets with DataStreams, although a DataSet is, in essence, just a finite stream. If this can indeed be done, some pseudocode would be nice :)

Thanks!
Andra
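
P.S. To make the expected behaviour concrete, here is a minimal sketch of the "latest value per id" semantics in plain Java. It deliberately uses no Flink API: `LatestValueCache` and `LatestValueCacheDemo` are hypothetical names made up for illustration, and the in-memory map only stands in for whatever per-key cache or state the streaming job would actually use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a Flink class): remembers the most recent value per id.
class LatestValueCache {
    private final Map<Integer, Integer> latest = new LinkedHashMap<>();

    // Stores the value for this id, overwriting any earlier entry.
    // Returns the previously cached value, or null if the id is new.
    Integer update(int id, int value) {
        return latest.put(id, value);
    }

    // Returns the cached value for this id, or null if it was never seen.
    Integer get(int id) {
        return latest.get(id);
    }
}

class LatestValueCacheDemo {
    public static void main(String[] args) {
        // The (id, value) records from the example, in arrival order.
        int[][] stream = { {1, 3}, {2, 5}, {6, 7}, {1, 5} };

        LatestValueCache cache = new LatestValueCache();
        for (int[] record : stream) {
            cache.update(record[0], record[1]);
        }

        System.out.println(cache.get(1)); // prints 5
    }
}
```

What I am unsure about is how to express this update step on a DataStream, and where the cached map should live.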