Hi,
Cassandra could be used as a distributed cache.

Lohith.



---- Aljoscha Krettek wrote ----

Hi Srikanth,
that's an interesting use case. It's not possible to do something like this
out of the box, but I'm actually working on an API for such cases.

In the meantime, I put together a short example that shows how something like
this can be implemented using the API that is currently available. It requires
writing a custom operator, but it is still fairly succinct:
https://gist.github.com/aljoscha/c657b98b4017282693a67f1238c88906
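
For a rough idea of the shape this can take without a custom operator, here is a
minimal sketch (this is not the code from the gist; it assumes each dimension row
is a simple "key,attribute" CSV line, that the whole table fits in the heap of
every parallel task, and all class and field names below are placeholders):

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Enriches (key, payload) records with an attribute looked up from a flat file.
// The file is loaded once per parallel task instance in open().
public class DimensionEnricher
        extends RichFlatMapFunction<Tuple2<String, String>, Tuple2<String, String>> {

    private final String dimFilePath;           // must be readable from every TaskManager
    private transient Map<String, String> dim;  // join key -> dimension attribute

    public DimensionEnricher(String dimFilePath) {
        this.dimFilePath = dimFilePath;
    }

    @Override
    public void open(Configuration parameters) throws Exception {
        dim = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(dimFilePath))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",", 2);  // assumes "key,attribute" lines
                if (parts.length == 2) {
                    dim.put(parts[0], parts[1]);
                }
            }
        }
    }

    @Override
    public void flatMap(Tuple2<String, String> event, Collector<Tuple2<String, String>> out) {
        String attribute = dim.get(event.f0);
        if (attribute != null) {  // inner-join semantics: records without a match are dropped
            out.collect(Tuple2.of(event.f0, event.f1 + "|" + attribute));
        }
    }
}

You would then apply it with something like
kafkaStream.flatMap(new DimensionEnricher("/path/to/dim.csv")), where the path
and tuple types are again just placeholders.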

Please let me know if you have any questions.

Cheers,
Aljoscha

On Thu, 21 Apr 2016 at 03:06 Srikanth <srikanth...@gmail.com> wrote:
Hello,

I have a fairly typical streaming use case but am not able to figure out how
best to implement it in Flink.
I want to join records read from a Kafka stream with one (or more) dimension
tables which are saved as flat files.

As per this jira <https://issues.apache.org/jira/browse/FLINK-2320>, it's not
possible to join a DataStream with a DataSet.
These tables are too big to do a collect() and join.

It would be good to read these files during startup, do a partitionByHash, and
keep them cached.
On the DataStream, maybe do a keyBy and join.
Is something like this possible?
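
To make it concrete, something along these lines is what I was picturing (just a
sketch with placeholder tuple types, not code I have working):

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

public class StreamDimensionJoin {

    // events:    (joinKey, payload) parsed from the Kafka stream
    // dimension: (joinKey, attribute) parsed from the flat file, e.g. via env.readTextFile()
    public static DataStream<Tuple2<String, String>> join(
            DataStream<Tuple2<String, String>> events,
            DataStream<Tuple2<String, String>> dimension) {

        return events
            .keyBy(0)                       // partition both inputs by the join key
            .connect(dimension.keyBy(0))
            .flatMap(new CoFlatMapFunction<Tuple2<String, String>,
                                           Tuple2<String, String>,
                                           Tuple2<String, String>>() {

                private final Map<String, String> cache = new HashMap<>();

                @Override
                public void flatMap1(Tuple2<String, String> event,
                                     Collector<Tuple2<String, String>> out) {
                    String attribute = cache.get(event.f0);
                    if (attribute != null) {
                        out.collect(Tuple2.of(event.f0, event.f1 + "|" + attribute));
                    }
                    // events arriving before their dimension row is cached are dropped here;
                    // they would need to be buffered in a real implementation
                }

                @Override
                public void flatMap2(Tuple2<String, String> dimRow,
                                     Collector<Tuple2<String, String>> out) {
                    cache.put(dimRow.f0, dimRow.f1);  // cache the dimension row per key
                }
            });
    }
}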

Srikanth