Thank you Takeshi.
After executing df3.explain(true) I realised that the optimiser batches are
being performed, and so is the predicate push-down.
I think that only the analyser batches are executed when creating the data
frame via context.sql(query). It seems that the optimiser batches are
executed only later, when the plan is evaluated (for example by explain or
by an action).
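For what it's worth, this two-phase behaviour can be modelled in a few lines of plain Scala. This is a toy sketch, not Spark's actual classes: in Spark 1.6 the DataFrame constructor forces analysis (QueryExecution.assertAnalyzed), while QueryExecution.optimizedPlan is a lazy val, so the optimiser batches only run once something evaluates the plan, such as explain(true) or an action.

```scala
import scala.collection.mutable.ListBuffer

// Toy model of Spark's QueryExecution phases (names are illustrative).
// Analysis runs eagerly when the "DataFrame" is created; the optimiser
// batches sit behind a lazy val and only run when the plan is evaluated.
class ToyQueryExecution(sql: String) {
  val trace = ListBuffer.empty[String]

  // Runs at construction time, like assertAnalyzed on DataFrame creation.
  val analyzed: String = { trace += "analyzer batches"; s"Analyzed($sql)" }

  // Deferred until first access, like QueryExecution.optimizedPlan.
  lazy val optimizedPlan: String = { trace += "optimizer batches"; s"Optimized($analyzed)" }
}

object ToyQueryExecution {
  def main(args: Array[String]): Unit = {
    val qe = new ToyQueryExecution("select * from t")
    println(qe.trace.toList) // only the analyzer batches have run so far
    qe.optimizedPlan         // like calling df3.explain(true)
    println(qe.trace.toList) // now the optimizer batches have run too
  }
}
```

So a freshly created data frame only shows the analyzed plan; the optimised plan appears once the lazy phase is forced.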
Hi,
What's the result of `df3.explain(true)`?
// maropu
On Thu, May 12, 2016 at 10:04 AM, Telmo Rodrigues <
telmo.galante.rodrig...@gmail.com> wrote:
I'm building spark from branch-1.6 source with mvn -DskipTests package and
I'm running the following code with spark shell.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
val df = sqlContext.read.json("persons.json")
val df2 = sqlContext.r
>
>
> logical plan after optimizer execution:
>
> Project [id#0L,id#1L]
> !+- Filter (id#0L = cast(1 as bigint))
> !   +- Join Inner, Some((id#0L = id#1L))
> !      :- Subquery t
> !      :  +- Relation[id#0L] JSONRelation
> !      +- Subquery u
> !         +- Relation[id#1L] JSONRelation
>
Will try with a JSON relation, but with Spark's temp tables (Spark version
1.6) I get an optimized plan as you have mentioned. It should not be much
different, though.
Query : "select t1.col2, t1.col3 from t1, t2 where t1.col1=t2.col1 and
t1.col3=7"
Plan :
Project [COL2#1,COL3#2]
+- Join Inner, Some(
In this case, isn't it better to perform the filter as early as possible even
if there could be unhandled predicates?
Telmo Rodrigues
On 11/05/2016, at 09:49, Rishi Mishra wrote:
It does push the predicate. But as relations are generic and might or
might not handle some of the predicates, it needs to apply a filter for the
unhandled predicates.
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra
On Wed, May 11, 2016
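Rishi's point can be sketched with a toy model (illustrative types only, not Spark's; the real hook in Spark 1.6 is BaseRelation.unhandledFilters): the planner pushes the filters down to the source, the source reports back which of them it cannot evaluate, and exactly those come back as a residual Filter node on top of the scan.

```scala
// Toy sketch of residual filters over a generic data source.
// Pred is an illustrative predicate type, not Spark's sources.Filter.
case class Pred(desc: String)

// A generic relation: it may or may not be able to evaluate a given
// predicate itself (cf. BaseRelation.unhandledFilters in Spark 1.6).
trait ToyRelation {
  def unhandledFilters(filters: Seq[Pred]): Seq[Pred]
}

object ToyPlanner {
  // Push everything to the scan, but keep a residual Filter node for
  // whatever the source says it cannot handle.
  def plan(rel: ToyRelation, filters: Seq[Pred]): (Seq[Pred], Seq[Pred]) =
    (filters, rel.unhandledFilters(filters))
}

object ToyPlannerDemo {
  // Assumption for the example: a source that can only evaluate
  // simple equality predicates.
  val source = new ToyRelation {
    def unhandledFilters(filters: Seq[Pred]): Seq[Pred] =
      filters.filterNot(_.desc.contains("="))
  }

  def main(args: Array[String]): Unit = {
    val (pushed, residual) =
      ToyPlanner.plan(source, Seq(Pred("id = 1"), Pred("name LIKE 'a%'")))
    println(pushed)   // both predicates reach the scan
    println(residual) // the LIKE stays as a Filter node above the scan
  }
}
```

This is why pushing the predicate early and keeping a filter for the unhandled part are not in conflict: correctness comes from the residual Filter, while the source is still free to use whatever it can.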
Hello,
I have a question about the Catalyst optimizer in Spark 1.6.
initial logical plan:
!'Project [unresolvedalias(*)]
!+- 'Filter ('t.id = 1)
!   +- 'Join Inner, Some(('t.id = 'u.id))
!      :- 'UnresolvedRelation `t`, None
!      +- 'UnresolvedRelation `u`, None
logical plan after optimize
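The transformation being asked about can be sketched as a tiny plan algebra (toy case classes, not Catalyst's; the corresponding optimiser rule in Spark is PushPredicateThroughJoin): a filter that only references the left join key is moved below the inner join, onto the left side.

```scala
// Toy plan nodes (illustrative, not Catalyst's classes).
sealed trait ToyPlan
case class Rel(name: String) extends ToyPlan
case class Flt(attr: String, value: Int, child: ToyPlan) extends ToyPlan
case class Jn(leftKey: String, rightKey: String,
              left: ToyPlan, right: ToyPlan) extends ToyPlan

object PushDownSketch {
  // If a filter above an inner join only references the left join key,
  // push it down onto the left side (cf. PushPredicateThroughJoin).
  def push(plan: ToyPlan): ToyPlan = plan match {
    case Flt(a, v, Jn(lk, rk, l, r)) if a == lk =>
      Jn(lk, rk, Flt(a, v, l), r)
    case other => other
  }

  def main(args: Array[String]): Unit = {
    // Filter ('t.id = 1) over Join ('t.id = 'u.id), as in the initial plan.
    val before = Flt("t.id", 1, Jn("t.id", "u.id", Rel("t"), Rel("u")))
    println(push(before)) // the filter now sits directly over relation t
  }
}
```

After the rewrite, t is filtered before the join, which is the shape the optimised plan is expected to have when pushdown fires.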