Re: Question on accessing LLAP as data cache from external containers

2018-02-02 Thread Gopal Vijayaraghavan
> For example, a Hive job may start Tez containers, which then retrieve data from LLAP running concurrently. In the current implementation, this is unrealistic
That is how LLAP was built - to push work from Tez to LLAP vertex by vertex, instead of an all-or-nothing implementation. Here ar

Re: Question on accessing LLAP as data cache from external containers

2018-01-31 Thread Sungwoo Park
Thanks for the link. My question was how to access the LLAP daemon from containers to retrieve data for Hive jobs. For example, a Hive job may start Tez containers, which then retrieve data from LLAP running concurrently. In the current implementation, this is unrealistic (because every task can be ju

Re: Question on accessing LLAP as data cache from external containers

2018-01-29 Thread Jörn Franke
Are you looking for something like this: https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html To answer your original question: why not implement the whole job in Hive? Or use Oozie to orchestrate some parts in MapReduce and some in Hive. > On 30. Jan 2018, at

Question on accessing LLAP as data cache from external containers

2018-01-29 Thread Sungwoo Park
Hello all, I wonder if an external YARN container can send requests to an LLAP daemon to read data from its in-memory cache. For example, YARN containers owned by a typical MapReduce job (e.g., TeraSort) could fetch data directly from LLAP instead of contacting HDFS. In this scenario, LLAP daemon ju