subject:"spark-sql with join terribly slow."

Re: spark-sql with join terribly slow.

2014-12-21 Thread Cheng Lian

... RDD2 (50 rows): state, amount1. Join by state. New RDD3 (30,000 rows): state, city, amount, amount1. Do a select(amount-amount1) from New RDD3. ...

Re: spark-sql with join terribly slow.

2014-12-17 Thread nitin

... IN key "id" and could prevent the shuffle by passing the partition information to in-memory caching. See https://issues.apache.org/jira/browse/SPARK-4849. Thanks, Nitin ...

Re: spark-sql with join terribly slow.

2014-12-17 Thread Cheng Lian

... RDD2 (50 rows): state, amount1. Join by state. New RDD3 (30,000 rows): state, city, amount, amount1. Do a select(amount-amount1) from New RDD3. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-sql-with-join-terribly-slow-tp20751.html ...

spark-sql with join terribly slow.

2014-12-17 Thread harirajaram

... View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-sql-with-join-terribly-slow-tp20751.html ...
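The operation quoted in the snippets above (a table of about 50 rows with state, amount1 joined by state to a larger table with state, city, amount, followed by a select of amount - amount1) can be illustrated with a minimal plain-Python sketch. This is not Spark code and is not what any poster in the thread ran. The row types, the function name join_and_subtract, and the sample values in the usage note are hypothetical. It assumes one row per state in the small table and does an inner join through a dictionary lookup on the small side, so no shuffle is modeled here.

```python
from typing import Dict, List, NamedTuple


class CityRow(NamedTuple):
    """Hypothetical row of the larger table: state, city, amount."""
    state: str
    city: str
    amount: float


class StateRow(NamedTuple):
    """Hypothetical row of the small table: state, amount1."""
    state: str
    amount1: float


def join_and_subtract(large: List[CityRow], small: List[StateRow]) -> List[float]:
    """Inner-join on state, then return amount - amount1 for each joined row."""
    # Build a lookup from the small side. Assumption: one row per state;
    # with duplicate states, the last row for a state would win.
    amount1_by_state: Dict[str, float] = {row.state: row.amount1 for row in small}

    # Single pass over the large side. Rows whose state has no match
    # in the small table are dropped, as in an inner join.
    return [
        row.amount - amount1_by_state[row.state]
        for row in large
        if row.state in amount1_by_state
    ]
```

For example, with hypothetical rows CityRow("CA", "LA", 100.0) and StateRow("CA", 30.0), the function returns a one-element list containing 100.0 - 30.0.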