I think this pattern may be common, so some tooling for sharing such a table
across multiple tasks may make sense.
It would be nice to add a handler that you give an "initializer", which reads
the data and builds the shared lookup map. The first task to acquire the handler
actually initializes the data set (re
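
To make the idea concrete, here is a rough sketch of such a handler in plain
Java. The class name SharedLookup and the method acquire() are made up for
illustration and are not part of any Flink API. The sketch assumes the handle
is reachable by all tasks in the same JVM (for example through a static
field), which may not hold in every deployment.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper, not part of Flink: a handle that builds the lookup
// map at most once and hands the same read-only instance to every caller.
final class SharedLookup<K, V> {
    private final Supplier<Map<K, V>> initializer; // reads the data, builds the map
    private volatile Map<K, V> table;              // null until the first acquire()

    SharedLookup(Supplier<Map<K, V>> initializer) {
        this.initializer = initializer;
    }

    // The first caller runs the initializer; later callers reuse its result.
    Map<K, V> acquire() {
        Map<K, V> local = table;
        if (local == null) {
            synchronized (this) {
                local = table;
                if (local == null) {
                    // Wrap the map so no task can modify the shared table.
                    local = Collections.unmodifiableMap(initializer.get());
                    table = local;
                }
            }
        }
        return local;
    }
}
```

Each task would call acquire() when it opens and keep the returned map for
its lookups. The double-checked locking on a volatile field keeps the common
path lock-free once the table has been built.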
Hi Arnaud,
I'm happy that you were able to resolve the issue. If you are still
interested in the first approach, you could try some things, for example
using only one slot per task manager (the slots share the heap of the TM).
Regards,
Robert
On Fri, Nov 13, 2015 at 9:18 AM, LINZ, Arnaud wrote:
Hello,
I’ve worked around my problem by not using the HiveServer2 JDBC driver to read
the reference table. Apparently, despite all the right options passed to the
Statement object, it handles memory poorly, since converting the table to text
format and reading it directly from HDFS works without any problem.