Re: Spark SQL table Join, one task is taking long

2014-12-04 Thread Veeranagouda Mukkanagoudar
ins for huge table(s) are costly. Fact and Dimension concepts from star schema don't translate well to Big Data (Hadoop, Spark). It may be better to de-normalize and store huge tables to avoid Joins. Joins seem to be evil. (Have tried de-normalizing when using Cassandra, but

Re: Spark SQL table Join, one task is taking long

2014-12-04 Thread Venkat Subramanian
, but that has its own problem of resulting in a full table scan when running ad-hoc queries where the keys are not known) Regards, Venkat -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-table-Join-one-task-is-taking-long-tp20124p20389.html

Re: Spark SQL table Join, one task is taking long

2014-12-03 Thread Cheng Lian
so I can (about 10 MB). Also, I am not sure if it is normal in such a table join for one task to take most of the time. Let me know if you have any suggestions. Regards, Venkat -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-table-Join-one-ta
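
For illustration only (this is not taken from the thread): one common reason a single task dominates a join is key skew. When rows are hash-partitioned by join key and one key is far more frequent than the rest, a single partition, and so a single task, receives most of the rows. If one side of the join is small enough to hold in memory, a frequently suggested alternative is a map-side join that looks each row up in an in-memory copy of the small table instead of shuffling by key. The sketch below uses plain Python rather than Spark, and the data, the key distribution, the partition count, and the function names are all hypothetical.

```python
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count rows per partition when rows are hash-partitioned by join key."""
    sizes = Counter(hash(k) % num_partitions for k in keys)
    return [sizes.get(p, 0) for p in range(num_partitions)]


def map_side_join(big_rows, small_table):
    """Join (key, value) rows against an in-memory dict, with no shuffle by key."""
    return [(k, v, small_table[k]) for k, v in big_rows if k in small_table]


# Hypothetical data: key 0 is "hot" and accounts for 90% of the rows.
big_rows = [(0, i) for i in range(9000)] + [(k, k) for k in range(1, 1001)]
# Hypothetical small lookup table covering every key.
small_table = {k: "dim-%d" % k for k in range(0, 1001)}

sizes = partition_sizes([k for k, _ in big_rows], num_partitions=8)
joined = map_side_join(big_rows, small_table)

print("rows per partition:", sizes)
print("largest partition share:", max(sizes) / len(big_rows))
print("joined rows:", len(joined))
```

In this toy data the partition holding the hot key ends up with the large majority of the rows, which is the same shape as one long-running task in a shuffle join. The map-side variant processes rows wherever they already are, so the work per task depends on how the large table was split rather than on the key distribution.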

Re: Spark SQL table Join, one task is taking long

2014-12-02 Thread Venkat Subramanian
Bump up. Michael Armbrust, anybody from Spark SQL team? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-table-Join-one-task-is-taking-long-tp20124p20218.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Spark SQL table Join, one task is taking long

2014-12-01 Thread Venkat Subramanian
t me know if you have any suggestions. Regards, Venkat -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-table-Join-one-task-is-taking-long-tp20124.html