Thanks, Chen.
To: u...@spark.incubator.apache.org
Subject: RE: Joining by timestamp.
Hi Chen,
Thank you very much for your reply. I don't think I understand how I can do the join using the Spark API. If you have time, could you please write some code?
Thanks again,
D.
Actually, it's just a pseudo-algorithm I described; you can implement it with the Spark API. Hope the algorithm is helpful.
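A minimal sketch of that pseudo-algorithm with the plain RDD API. The dataset names, the (id, epoch-second timestamp) schema, and the 240-second window below are assumptions for illustration, not Chen's actual code:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("JoinByTimestamp").setMaster("local[*]"))

// Hypothetical (id, epoch-second timestamp) records.
val events1 = sc.parallelize(Seq(("a", 1000L), ("b", 1300L)))
val events2 = sc.parallelize(Seq(("x", 1100L), ("y", 5000L)))

val window = 240L // bucket width in seconds, mirroring cast(ds as bigint) / 240

// Key dataset1 rows by their bucket. Key dataset2 rows by their own bucket
// and the previous one, so a pair whose window straddles a bucket boundary
// still meets on a common key (assuming the semantics t2 in [t1, t1 + window)).
val byBucket1 = events1.keyBy { case (_, ts) => ts / window }
val byBucket2 = events2.flatMap { case rec @ (_, ts) =>
  Seq((ts / window, rec), (ts / window - 1, rec))
}

// Cheap equi-join on the bucket key, then drop pairs outside the real window.
val joined = byBucket1.join(byBucket2).values
  .filter { case ((_, ts1), (_, ts2)) => ts2 >= ts1 && ts2 < ts1 + window }

joined.collect().foreach(println) // ((a,1000),(x,1100)) with this sample data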
-----Original Message-----
From: durga [mailto:durgak...@gmail.com]
Sent: Tuesday, July 22, 2014 11:56 AM
To: u...@spark.incubator.apache.org
Subject: RE: Joining by timestamp.
Hi Chen,
I am new to Spark as well as SparkSQL. Could you please explain how I would create a table and run a query on top of it? That would be super helpful.
Thanks,
D.
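For later readers, a minimal sketch of creating a table and querying it. This uses the current SparkSession/DataFrame API (in 2014 this went through SQLContext instead), and the table and column names are made up:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("TableDemo").master("local[*]").getOrCreate()
import spark.implicits._

// Made-up sample rows: an id plus a timestamp in epoch seconds.
val df = Seq(("a", 1000L), ("b", 1300L)).toDF("id", "ds")

// Register the DataFrame under a name SQL queries can refer to.
df.createOrReplaceTempView("dataset1")

// Run a query on top of the registered table.
spark.sql("SELECT id, ds FROM dataset1 WHERE ds >= 1200").show()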
This is a very interesting problem. SparkSQL supports the non-equi join, but it is very inefficient with large tables.
One possible solution is to partition both tables, with partition keys of (cast(ds as bigint) / 240); then, for each partition in dataset1, you probably can write the join against just the matching partitions in dataset2.
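To make the two options concrete, a sketch of both in Spark SQL using the current API: first the plain non-equi join that is slow on large tables, then a bucketed variant along the lines of cast(ds as bigint) / 240. The sample data and the exact window semantics (t2 in [t1, t1 + 240)) are assumptions for illustration:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("BucketedJoin").master("local[*]").getOrCreate()
import spark.implicits._

Seq(("a", 1000L), ("b", 1300L)).toDF("id", "ds").createOrReplaceTempView("dataset1")
Seq(("x", 1100L), ("y", 5000L)).toDF("id", "ds").createOrReplaceTempView("dataset2")

// The straightforward non-equi join: with no equality predicate the planner
// has nothing to hash or sort on, so every row pair gets compared.
spark.sql("""
  SELECT d1.id, d2.id
  FROM dataset1 d1 JOIN dataset2 d2
    ON d2.ds >= d1.ds AND d2.ds < d1.ds + 240
""").show()

// The bucketed variant: the added equality on the 240-second bucket gives the
// planner an equi-join key, so only rows sharing a bucket are compared.
// A window can straddle a bucket boundary, so dataset2 is duplicated into its
// own bucket and the previous one (an assumption about the intended handling).
spark.sql("""
  SELECT d1.id, d2.id
  FROM dataset1 d1
  JOIN (
    SELECT id, ds, FLOOR(CAST(ds AS BIGINT) / 240) AS bucket FROM dataset2
    UNION ALL
    SELECT id, ds, FLOOR(CAST(ds AS BIGINT) / 240) - 1 AS bucket FROM dataset2
  ) d2
    ON FLOOR(CAST(d1.ds AS BIGINT) / 240) = d2.bucket
   AND d2.ds >= d1.ds AND d2.ds < d1.ds + 240
""").show()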