Hi all,
Many thanks for all the responses. I think I pressed Enter too quickly,
without properly explaining what I meant.
What I meant to ask is whether the optimizer can optimize the processing
when an inner query contains a blocking operator such as an aggregation, and if it
knows the partitioning sch
Yes, it does. You can try out the following example (using the People dataset
that comes with Spark). There is an inner query that filters on age and an
outer query that filters on name.
The physical plan applies a single composite filter on name and age, as you
can see below:
sqlContext.sql("select * f
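
For anyone who wants to see the idea without a Spark installation, here is a minimal toy sketch in plain Python of what "merging two stacked filters into one composite filter" means. This is not Spark's code: the `Scan` and `Filter` classes, the `combine_filters` function, the sample rows, and the thresholds (`age > 20`, `name == "Andy"`) are all hypothetical and only illustrate the rewrite described above.

```python
from dataclasses import dataclass
from typing import Callable, Union

Row = dict
Predicate = Callable[[Row], bool]


@dataclass
class Scan:
    """Leaf node: yields rows from an in-memory list (stand-in for a table)."""
    rows: list


@dataclass
class Filter:
    """Keeps only the rows of `child` for which `predicate` is true."""
    predicate: Predicate
    child: Union["Scan", "Filter"]


def combine_filters(plan):
    """Rewrite Filter(p, Filter(q, child)) into a single Filter(q and p, child)."""
    if isinstance(plan, Filter):
        child = combine_filters(plan.child)
        if isinstance(child, Filter):
            outer, inner = plan.predicate, child.predicate
            # Evaluate the inner predicate first, as the nested query would.
            return Filter(lambda r: inner(r) and outer(r), child.child)
        return Filter(plan.predicate, child)
    return plan


def execute(plan):
    """Naive interpreter: one pass over the rows per Filter node."""
    if isinstance(plan, Scan):
        return list(plan.rows)
    return [r for r in execute(plan.child) if plan.predicate(r)]


def depth(plan):
    """Number of Filter nodes stacked above the scan."""
    return 0 if isinstance(plan, Scan) else 1 + depth(plan.child)


# Illustrative rows only (assumed, not read from any Spark file).
people = Scan([
    {"name": "Michael", "age": 29},
    {"name": "Andy", "age": 30},
    {"name": "Justin", "age": 19},
])

# Inner query filters on age, outer query filters on name.
nested = Filter(lambda r: r["name"] == "Andy",
                Filter(lambda r: r["age"] > 20, people))
optimized = combine_filters(nested)

print(depth(nested), depth(optimized))
print(execute(optimized))
```

The rewritten plan has one filter node instead of two and returns the same rows, which is the same shape of result you should see in the physical plan printed by Spark.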
Hi all,
I have some questions about the latest Spark SQL.
1. The Spark SQL paper states that "The physical
planner also performs rule-based physical optimizations, such as pipelining
projections or filters into one Spark map operation. ..."
If dealing with a query of the form:
s