Hive puts somewhat more emphasis on cases where the data being queried is 
much bigger than available memory, where you need to query many different 
small subsets of the data, or, more recently, where you run interactive 
queries (LLAP etc.). 

Spark is aimed more at machine learning, working iteratively over the same 
whole dataset in memory. It also has streaming and graph-processing 
capabilities that can be used together. 

Beyond this, other ecosystem components may be relevant depending on your 
needs. For instance, both are weak at looking up single entries in a dataset, 
and neither is well suited to text analytics.

That said, both are developing rapidly, so this may change. Both also have 
alternatives, such as Flink for Spark.

> On 25 May 2016, at 18:11, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> 
> Can you be a bit more specific about how you are going to use Spark? For 
> example, as a powerful query tool, for analytics, or for data migration.
> 
> Spark SQL and Spark-shell provide a subset of Hive SQL (depending on which 
> version of Hive and Spark you have in mind).
> 
> As a query tool Spark is very powerful: it uses a DAG execution model and 
> in-memory computation, and offers Scala (and other languages) as the 
> programming language. You can build your own uber JAR file for distribution, 
> etc.
> You can of course also use Spark as the execution engine for Hive, instead 
> of MapReduce, to take advantage of Spark processing.
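> 
> As a rough sketch of that last point, assuming a Hive installation that was 
> built and configured with Spark support (the table name below is purely 
> hypothetical), the engine can be switched per session through the 
> hive.execution.engine property:
> 
> ```sql
> -- run the following queries in this session on Spark instead of MapReduce
> set hive.execution.engine=spark;
> -- my_table is a hypothetical table, used for illustration only
> select count(*) from my_table;
> ```
> 
> Setting the property back to mr returns the session to MapReduce.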
> 
> etc etc
> 
> HTH
> 
> 
> Dr Mich Talebzadeh
>  
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>  
> http://talebzadehmich.wordpress.com
>  
> 
>> On 25 May 2016 at 16:34, Aakash Basu <raj2coo...@gmail.com> wrote:
>> Hi,
>> 
>> I’m new to the Spark ecosystem and need to understand the pros and cons of 
>> fetching data using SparkSQL vs Hive in Spark vs the Spark API.
>> 
>> PLEASE HELP!
>> 
>> Thanks,
>> Aakash Basu.
>> 
> 