Joins for huge table(s) are costly. Fact and Dimension concepts from
star schema don't translate well to Big Data (Hadoop, Spark). It may be
better to de-normalize and store huge tables to avoid joins; joins seem to
be evil. (I have tried de-normalizing when using Cassandra, but that has its
own problem of resulting in full table scans when running ad-hoc queries where
the keys are not known.)
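One middle ground is to keep the star schema but broadcast the small dimension tables, so the huge fact table is joined map-side and is never shuffled. A minimal sketch (not from this thread; the table names fact_sales/dim_product and paths are made up, and it assumes a SparkSession-era Spark API):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object BroadcastJoinSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("broadcast-join-sketch").getOrCreate()

    val fact = spark.read.parquet("/data/fact_sales")   // huge fact table
    val dim  = spark.read.parquet("/data/dim_product")  // small dimension

    // broadcast() ships the small dimension to every executor, so the huge
    // fact table is joined in place instead of being shuffled across the network.
    val joined = fact.join(broadcast(dim), Seq("product_id"))

    joined.write.parquet("/data/sales_enriched")
    spark.stop()
  }
}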
Regards,
Venkat
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-table-Join-one-task-is-taking-long-tp20124p20389.html
so I can (about 10 Mb). Also, I'm not sure if it is normal in such a table
join for one task to take most of the time. Let me know if you have any
suggestions.
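If the smaller side of the join really is only about 10 Mb, one thing to try is raising spark.sql.autoBroadcastJoinThreshold so the optimizer chooses a broadcast join instead of the shuffle join where a single skewed task ends up with most of the keys. A sketch only, assuming a reasonably recent Spark; the table names big_table/small_table and the 50 MB threshold are placeholders:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("skewed-join-sketch").getOrCreate()

// Default threshold is 10 MB; raise it so the ~10 MB table is auto-broadcast.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", 50L * 1024 * 1024)

// Hypothetical registered tables; "small_table" is the ~10 MB side.
val big   = spark.table("big_table")
val small = spark.table("small_table")

// The plan should now show a BroadcastHashJoin instead of a shuffle join.
big.join(small, Seq("join_key")).explain()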
Regards,
Venkat
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-table-Join-one-ta
Bump up.
Michael Armbrust, or anybody from the Spark SQL team?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-table-Join-one-task-is-taking-long-tp20124p20218.html
Let me know if you have any suggestions.
Regards,
Venkat
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-table-Join-one-task-is-taking-long-tp20124.html