Hi,

Cassandra could be used as a distributed cache.

Lohith
---- Aljoscha Krettek wrote ----

Hi Srikanth,

that's an interesting use case. It's not possible to do something like this out of the box, but I'm actually working on an API for such cases. In the meantime, I programmed a short example that shows how something like this can be done using the API that is currently available. It requires writing a custom operator, but it is still somewhat succinct:

https://gist.github.com/aljoscha/c657b98b4017282693a67f1238c88906

Please let me know if you have any questions.

Cheers,
Aljoscha

On Thu, 21 Apr 2016 at 03:06 Srikanth <srikanth...@gmail.com> wrote:

Hello,

I have a fairly typical streaming use case but am not able to figure out how best to implement it in Flink. I want to join records read from a Kafka stream with one (or more) dimension tables which are saved as flat files. As per this JIRA <https://issues.apache.org/jira/browse/FLINK-2320>, it's not possible to join a DataStream with a DataSet. These tables are too big to do a collect() and join.

It would be good to read these files during startup, do a partitionByHash, and keep the result cached. On the DataStream, maybe do a keyBy and join. Is something like this possible?

Srikanth
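
The idea described above (load the dimension files once at startup, hash-partition the rows by key, then route each stream record by the same key and look it up locally) can be sketched in plain Java. This is only an illustration of the partitioning logic, not the code from the gist and not Flink API: the class `PartitionedLookupJoin`, its methods, and the sample keys are all hypothetical, and each in-memory map stands in for the cached state that one parallel task would hold.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: each map plays the role of one parallel task's cached
// slice of the dimension table. No Flink classes are used here.
public class PartitionedLookupJoin {
    private final List<Map<String, String>> partitions = new ArrayList<>();

    public PartitionedLookupJoin(int parallelism) {
        for (int i = 0; i < parallelism; i++) {
            partitions.add(new HashMap<>());
        }
    }

    // The same hash function is used for loading and for lookups, so a
    // stream record always lands on the partition that holds its key.
    private int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Called once per dimension row at startup (the "partitionByHash and cache" step).
    public void loadDimensionRow(String key, String value) {
        partitions.get(partitionFor(key)).put(key, value);
    }

    // Called once per stream record (the "keyBy and join" step).
    // Returns null when the key has no matching dimension row.
    public String join(String key, String event) {
        String dimension = partitions.get(partitionFor(key)).get(key);
        return dimension == null ? null : key + "," + event + "," + dimension;
    }

    public static void main(String[] args) {
        PartitionedLookupJoin join = new PartitionedLookupJoin(4);
        join.loadDimensionRow("u1", "Berlin");
        join.loadDimensionRow("u2", "Pune");
        System.out.println(join.join("u1", "click")); // key present in the cache
        System.out.println(join.join("u3", "view"));  // key absent from the cache
    }
}
```

Because no partition ever holds more than its own share of the keys, the full table never has to fit on a single node, which is the reason a collect() and join is ruled out in the question.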