Hey,

I was trying out Spark SQL using the HiveContext and doing a select on a partitioned table with lots of partitions (16,000+). It took over 6 minutes before it even started the job. It looks like it was querying the Hive metastore and got a good chunk of data back, which I'm guessing is info on the partitions. Running the same query using Hive takes 45 seconds for the entire job. I know Spark SQL doesn't support all the Hive optimizations. Is this a known limitation currently?

Thanks,
Tom
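For reference, here is a minimal sketch of the kind of query I mean, using HiveContext in Scala. The table and partition column names are just placeholders, not my actual schema:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object PartitionedSelect {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("PartitionedSelect"))
    val hiveContext = new HiveContext(sc)

    // Simple select against a heavily partitioned Hive table.
    // The long delay happens before the job starts, while Spark SQL
    // pulls partition metadata from the Hive metastore.
    // "my_partitioned_table" and "dt" are hypothetical names.
    val result = hiveContext.sql(
      "SELECT count(*) FROM my_partitioned_table WHERE dt = '2014-08-01'")

    result.collect().foreach(println)
  }
}
```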