Hey,
I was trying out Spark SQL using the HiveContext and running a select on a 
partitioned table with a lot of partitions (16,000+). It took over 6 minutes 
before the job even started. It looks like it was querying the Hive 
metastore and getting a good chunk of data back, which I'm guessing is info 
on the partitions. Running the same query with Hive takes 45 seconds for the 
entire job.
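Roughly, the kind of thing I'm running looks like the sketch below (the table 
name, partition column, and date are just placeholders, not the real ones):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object PartitionedTableQuery {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("hive-partitioned-select"))
        // HiveContext pulls table and partition metadata from the Hive metastore.
        val hiveContext = new HiveContext(sc)

        // Simple select against a table with 16,000+ partitions; the long pause
        // happens before any tasks launch, while partition metadata is fetched.
        val result = hiveContext.sql(
          "SELECT * FROM events WHERE ds = '2015-01-01' LIMIT 10")
        result.collect().foreach(println)
      }
    }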
I know Spark SQL doesn't support all of the Hive optimizations. Is this a 
known limitation currently?
Thanks,
Tom
