Re: Maximum Size of Reference Look Up Table in Spark

2016-07-15 Thread Jacek Laskowski
Hi, I have never worked on a project that would require it. Jacek On 15 Jul 2016 5:31 p.m., "Saravanan Subramanian" wrote: > Hello Jacek, > > Have you seen any practical limitation or performance degradation issues > while using more than 10GB of broadcast cache? > > Thanks, > Saravanan S. > > > On

Re: Maximum Size of Reference Look Up Table in Spark

2016-07-15 Thread Saravanan Subramanian
Hello Jacek, Have you seen any practical limitation or performance degradation issues while using more than 10GB of broadcast cache? Thanks, Saravanan S. On Thursday, 14 July 2016 8:06 PM, Jacek Laskowski wrote: Hi, My understanding is that the maximum size of a broadcast is Long.M

Re: Maximum Size of Reference Look Up Table in Spark

2016-07-14 Thread Jacek Laskowski
Hi, My understanding is that the maximum size of a broadcast is Long.MAX_VALUE (plus some more, since the data is going to be encoded to save space, esp. for Catalyst-driven datasets). Ad 2. Before the tasks can access the broadcast variable, it has to be sent across the network, which may be too slo

Maximum Size of Reference Look Up Table in Spark

2016-07-14 Thread Saravanan Subramanian
Hello All, I am in the middle of designing real-time data enhancement services using Spark Streaming. As part of this, I have to look up some reference data while processing the incoming stream. I have the following questions: 1) What is the maximum size of a lookup table / variable that can be stored as Bro