Re: Performance problems on SQL JOIN

2014-06-21 Thread Michael Armbrust
> cacheTable("rooms3");
>
> sql("SELECT * FROM rooms2 LEFT JOIN rooms3 ON rooms2.hotelId =
> rooms3.hotelId AND rooms2.toDate = rooms3.toDate").count();
>
> Are we doing something wrong here?
> Thanks!
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Performance-problems-on-SQL-JOIN-tp8001.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
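For reference, the quoted query is a left outer join on the composite key (hotelId, toDate). Below is a minimal sketch of what that join computes, written with plain Scala collections rather than Spark, so it says nothing about how Spark SQL actually executes the query. The `Booking` case class, its `price` field, and the sample rows are hypothetical; only `hotelId` and `toDate` come from the thread.

```scala
// Hypothetical row type: only hotelId and toDate appear in the quoted query;
// the price field is invented for illustration.
final case class Booking(hotelId: Int, toDate: String, price: Double)

object LeftJoinSketch {
  // Left outer join on the composite key (hotelId, toDate):
  // build a hash index over the right side once, then probe it per left row.
  def leftJoin(left: Seq[Booking], right: Seq[Booking]): Seq[(Booking, Option[Booking])] = {
    val index: Map[(Int, String), Seq[Booking]] =
      right.groupBy(b => (b.hotelId, b.toDate))
    left.flatMap { l =>
      index.get((l.hotelId, l.toDate)) match {
        // One output row per matching right-side row.
        case Some(matches) => matches.map(r => (l, Option(r)))
        // Unmatched left rows are kept, with an empty right side.
        case None          => Seq((l, Option.empty[Booking]))
      }
    }
  }

  // Made-up sample data standing in for the rooms2 and rooms3 tables.
  val rooms2 = Seq(Booking(1, "2014-06-20", 100.0), Booking(2, "2014-06-21", 80.0))
  val rooms3 = Seq(Booking(1, "2014-06-20", 95.0), Booking(1, "2014-06-20", 90.0))
  val joined = leftJoin(rooms2, rooms3)
}
```

With these sample rows the first left row matches two right rows and the second matches none, so `joined` holds three rows, one of them with an empty right side. Indexing one side by key keeps the work roughly linear in the input sizes, whereas comparing every pair of rows grows with the product of the two table sizes.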

Re: Performance problems on SQL JOIN

2014-06-20 Thread mathias
educe at joins.scala:219' take up the majority of the time. Is this due to bad partitioning or caching? Or is there a problem with the JOIN operator?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Performance-problems-on-SQL-JOIN-tp8001p8016.html

Re: Performance problems on SQL JOIN

2014-06-20 Thread Evan R. Sparks
).map(x => BookingInfo(x(0),
> x(1),
> > ... , x(9))); // 30k rows
> >
> > rooms2.registerAsTable("rooms2");
> > cacheTable("rooms2");
> > rooms3.registerAsTable("rooms3");
> > cacheTable("rooms3");
> >
> > s

Re: Performance problems on SQL JOIN

2014-06-20 Thread Xiangrui Meng
ows
>
> rooms2.registerAsTable("rooms2");
> cacheTable("rooms2");
> rooms3.registerAsTable("rooms3");
> cacheTable("rooms3");
>
> sql("SELECT * FROM rooms2 LEFT JOIN rooms3 ON rooms2.hotelId =
> rooms3.hotelId AND rooms2.toDate = ro

Performance problems on SQL JOIN

2014-06-20 Thread mathias