Re: SparkSQL extensions

2014-07-27 Thread Michael Armbrust
Ah, I understand now. That sounds pretty useful and is something we would currently plan very inefficiently.
On Sun, Jul 27, 2014 at 1:07 AM, Christos Kozanitis wrote:
> Thanks Michael for the recommendations. Actually the region-join (or I could name it range-join or interval-join) that I w

Re: SparkSQL extensions

2014-07-27 Thread Christos Kozanitis
Thanks, Michael, for the recommendations. Actually, the region-join (or I could name it range-join or interval-join) that I was thinking of should join the entries of two tables with inequality predicates. For example, if table A(col1 int, col2 int) contains entries (1,4) and (10,12) and table b(c1 int, c
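To make the idea of a join on inequality predicates concrete, here is a minimal sketch in plain Python (not Spark code). The rows of table A are the ones given in the message; the contents of the second table and the exact predicate are assumptions, because the archived message is cut off before it states them. The sketch guesses that a value from the second table should match every row of A whose [col1, col2] range contains it. The name `region_join` is hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]


def region_join(intervals: List[Interval], points: List[int]) -> List[Tuple[Interval, int]]:
    """Naive nested-loop join: pair each interval with every point it contains.

    The predicate lo <= p <= hi is an assumed reading of "inequality
    predicates"; the original message does not spell it out.
    """
    return [
        ((lo, hi), p)
        for (lo, hi) in intervals   # every row of the first table
        for p in points             # is compared against every row of the second
        if lo <= p <= hi            # inequality predicates instead of an equality key
    ]


table_a = [(1, 4), (10, 12)]  # rows of table A, taken from the message
table_b = [2, 5, 11]          # hypothetical values for the second table

print(region_join(table_a, table_b))
# prints [((1, 4), 2), ((10, 12), 11)]
```

Because there is no equality key to hash or sort on, this naive form compares every pair of rows, which is presumably the cost a dedicated region-join operator would aim to avoid.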

Re: SparkSQL extensions

2014-07-26 Thread Michael Armbrust
A very simple example of adding a new operator to Spark SQL: https://github.com/apache/spark/pull/1366
An example of adding a new type of join to Spark SQL: https://github.com/apache/spark/pull/837
Basically, you will need to add a new physical operator that inherits from SparkPlan and a Strategy

SparkSQL extensions

2014-07-26 Thread Christos Kozanitis
Hello, I was wondering whether you could point me to the modules I would need to update in order to add extra functionality to Spark SQL. I was thinking of implementing a region-join operator, and I guess I should add the implementation details under joins.scala, but what else do I need to modify?