Hi Ali,

I'm excited to hear that EMC is looking into Apache Flink. I think the
solution to this problem depends on one question: What is the size of the
data in the CSV file compared to the memory you have available in the
cluster?
Would the mapping table from the file fit into the memory of all nodes
running Flink?
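
If it does fit, one option is to load the file once per parallel operator
instance (for example in the open() method of a rich function) and look up
each packet's IMSI there. Below is a minimal sketch of just the lookup logic
in plain Java, with no Flink classes. The class name GroupEnricher, the
"imsi,group" line layout and the "UNKNOWN" fallback are assumptions for
illustration, not something taken from your setup:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: plain-Java lookup logic, independent of Flink. */
class GroupEnricher {

    /** Reads lines of the assumed form "imsi,group" into an in-memory map. */
    static Map<String, String> loadMapping(Reader source) {
        Map<String, String> mapping = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Split on the first comma only; skip lines without one.
                String[] fields = line.split(",", 2);
                if (fields.length == 2) {
                    mapping.put(fields[0].trim(), fields[1].trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return mapping;
    }

    /** Returns the group for an IMSI, or a placeholder if it is not mapped. */
    static String groupFor(Map<String, String> mapping, String imsi) {
        return mapping.getOrDefault(imsi, "UNKNOWN");
    }
}
```

Inside a map operator, loadMapping would be called once at startup and
groupFor once per packet. This only works if every node can hold the whole
table in memory.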

Regards,
Robert

PS: Did you subscribe to the mailing list? I've CCed you in case you're not
subscribed yet.

On Wed, Nov 4, 2015 at 4:54 PM, Kashmar, Ali <ali.kash...@emc.com> wrote:

> Hi there,
>
> I’m trying to design and implement a use case in Flink where I’m receiving
> protocol packets over a socket. Each packet has the subscriber IMSI in it
> and a bunch of other data. At the same time, I have a CSV file with a
> mapping from IMSI -> subscriber group. I need to inject the group into
> the packet and then send it to the sink.
>
> I’ve tried loading the CSV into a memory map and then accessing the map
> from within the Flink operators but that only works when the CSV is very
> small (a few hundred subscribers). I’ve tried creating another stream for
> the CSV and connecting the streams but that doesn’t yield anything as I
> can’t have access to objects from both streams at the same time.
>
> How would you guys approach this?
>
> Thanks,
> Ali
>
