Re: Doubts about SparkSQL

2015-05-24 Thread Renato Marroquín Mogrovejo
Hi all, Many thanks for all the responses, but I think I pressed enter too quickly without explaining correctly what I meant. What I mean is whether the optimizer is able to optimize the processing if an inner query contains a blocking operator like an aggregation and if it knows the partitioning sch

Re: Doubts about SparkSQL

2015-05-23 Thread Ram Sriharsha
Yes, it does ... you can try out the following example (using the People dataset that comes with Spark). There is an inner query that filters on age and an outer query that filters on name. The physical plan applies a single composite filter on name and age, as you can see below: sqlContext.sql("select * f

Doubts about SparkSQL

2015-05-23 Thread Renato Marroquín Mogrovejo
Hi all, I have some doubts about the latest SparkSQL. 1. The SparkSQL paper states that "The physical planner also performs rule-based physical optimizations, such as pipelining projections or filters into one Spark map operation. ..." If dealing with a query of the form: s