What is the benefit of Hive on Spark if you cannot pre-load data into
memory that you know will be queried?
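
For concreteness, the kind of pre-loading I have in mind looks roughly like
the sketch below, done directly against Spark SQL rather than through Hive on
Spark (the events table, the event_date column, and the cutoff date are all
made up):

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.sql.hive.HiveContext

  // In spark-shell sc and sqlContext already exist; this is a standalone sketch.
  val sc = new SparkContext(new SparkConf().setAppName("cache-last-week"))
  val hiveContext = new HiveContext(sc)

  // Hypothetical table/column: pull only last week's rows and pin them in memory.
  val lastWeek = hiveContext.sql(
    "SELECT * FROM events WHERE event_date >= '2015-08-24'")
  lastWeek.registerTempTable("events_last_week")
  hiveContext.cacheTable("events_last_week")

  // Later queries hit the in-memory copy instead of rescanning the 100TB table.
  hiveContext.sql("SELECT count(*) FROM events_last_week").show()

If nothing along those lines is exposed through Hive on Spark, I'm trying to
understand where its advantage over MR comes from beyond dropping the
intermediate stages.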

On Mon, Aug 31, 2015 at 4:25 PM, Xuefu Zhang <xzh...@cloudera.com> wrote:

> What you described isn't part of the functionality of Hive on Spark.
> Rather, Spark is used here as a general-purpose engine similar to MR, but
> without intermediate stages. It's batch-oriented.
>
> Keeping 100TB of data in memory is hardly beneficial unless you know that
> the dataset is going to be used in subsequent queries.
>
> For loading data into memory and providing near-real-time responses, you
> might want to look at in-memory databases.
>
> Thanks,
> Xuefu
>
> On Thu, Aug 27, 2015 at 9:11 AM, Patrick McAnneny <
> patrick.mcann...@leadkarma.com> wrote:
>
>> Once I get "hive.execution.engine=spark" working, how would I go about
>> loading portions of my data into memory? Let's say I have a 100TB database
>> and want to load all of last week's data into Spark memory. Is this
>> possible or even beneficial? Or am I thinking about Hive on Spark in the
>> wrong way?
>>
>> I also assume Hive on Spark could get me near-real-time capabilities
>> for large queries. Is this true?
>>
>
>
