What kinds of tables underlie the SchemaRDDs? Could you please
provide the DDL of the tables and the query you executed?
On 12/18/14 6:15 AM, harirajaram wrote:
Guys,
I'm trying to join 2-3 SchemaRDDs of approximately 30,000 rows and it is terribly
slow. I do get the results, but it takes 8s to do the join and return
them.
I'm running standalone Spark on my machine, which has 8 cores and 12 GB RAM, with
4 workers.
I'm not sure why it is taking so long; any input is appreciated.
This is just an example of what I mean.
RDD1 (30,000 rows)
state,city,amount
RDD2 (50 rows)
state,amount1
join by state
New RDD3 (30,000 rows)
state,city,amount,amount1
Do a select(amount-amount1) from New RDD3.
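For reference, the join and select described above can be sketched as a single SQL query. This is only a minimal illustration that uses Python's built-in sqlite3 module as a stand-in for Spark SQL, so it says nothing about Spark's performance. The table names rdd1 and rdd2, the function name join_and_subtract, the column types, and the sample rows are all made up for the example.

```python
import sqlite3


def join_and_subtract(rdd1_rows, rdd2_rows):
    """Join the two row sets on state and return (state, city, amount - amount1)."""
    # In-memory SQLite database; a hypothetical stand-in for the Spark SQL context.
    conn = sqlite3.connect(":memory:")
    try:
        # Column types are assumptions; the original post gives only column names.
        conn.execute("CREATE TABLE rdd1 (state TEXT, city TEXT, amount REAL)")
        conn.execute("CREATE TABLE rdd2 (state TEXT, amount1 REAL)")
        conn.executemany("INSERT INTO rdd1 VALUES (?, ?, ?)", rdd1_rows)
        conn.executemany("INSERT INTO rdd2 VALUES (?, ?)", rdd2_rows)
        # Inner join by state, then compute amount - amount1 per joined row.
        query = """
            SELECT r1.state, r1.city, r1.amount - r2.amount1 AS diff
            FROM rdd1 AS r1
            JOIN rdd2 AS r2 ON r1.state = r2.state
            ORDER BY r1.state, r1.city
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    rdd1 = [("CA", "Fresno", 100.0), ("CA", "Oakland", 250.0), ("NY", "Albany", 80.0)]
    rdd2 = [("CA", 40.0), ("NY", 30.0)]
    for row in join_and_subtract(rdd1, rdd2):
        print(row)
```

Because this is an inner join, rows in RDD1 whose state has no match in RDD2 are dropped from the result.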
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/spark-sql-with-join-terribly-slow-tp20751.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]