What is the benefit of Hive on Spark if you cannot pre-load data into memory that you know will be queried?
On Mon, Aug 31, 2015 at 4:25 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:

> What you described isn't part of the functionality of Hive on Spark.
> Rather, Spark is used here as a general-purpose engine similar to MR, but
> without intermediate stages. It's batch-oriented.
>
> Keeping 100T of data in memory is hardly beneficial unless you know that
> dataset is going to be used in subsequent queries.
>
> For loading data into memory and providing near real-time responses, you
> might want to look at some memory-based DBs.
>
> Thanks,
> Xuefu
>
> On Thu, Aug 27, 2015 at 9:11 AM, Patrick McAnneny <
> patrick.mcann...@leadkarma.com> wrote:
>
>> Once I get "hive.execution.engine=spark" working, how would I go about
>> loading portions of my data into memory? Let's say I have a 100TB database
>> and want to load all of last week's data into Spark memory. Is this
>> possible or even beneficial? Or am I thinking about Hive on Spark in the
>> wrong way?
>>
>> I also assume Hive on Spark could get me near-real-time capabilities
>> for large queries. Is this true?
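
For anyone following the thread: the setting quoted above is just a per-session switch, not a caching mechanism. A minimal sketch of what it does and does not give you (the table, partition column, and dates below are made up for illustration):

    -- Run the queries in this session on Spark instead of MapReduce
    -- (requires a Hive build with Spark support and a configured Spark cluster).
    SET hive.execution.engine=spark;

    -- This is compiled into a Spark job; shuffle data flows within that job
    -- instead of being written out between separate MR stages.
    SELECT account_id, count(*)
    FROM events                      -- hypothetical table
    WHERE ds >= '2015-08-24'         -- e.g. last week's partitions
    GROUP BY account_id;

    -- Per Xuefu's point above, there is no Hive statement here that pins those
    -- partitions in executor memory for later queries; each new query reads
    -- the data from storage again.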