TD - this might not be the best forum, but regarding (1): a batch left
outer join with a stream is always feasible under reasonable constraints,
for example a window constraint on the stream.
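
To make that concrete, here is a minimal sketch in plain Python (not the Spark API) of what I mean by a window constraint. It assumes tumbling windows over an event-time field and joins the full batch side against each window of the stream, so only one window of stream data has to be kept at a time. The function names, the `ts` and `id` fields and the window size are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def left_outer_join(left_rows, right_rows, key):
    """Plain left outer join of two finite lists of dicts on `key`."""
    right_by_key = defaultdict(list)
    for r in right_rows:
        right_by_key[r[key]].append(r)
    out = []
    for l in left_rows:
        matches = right_by_key.get(l[key])
        if matches:
            out.extend((l, r) for r in matches)
        else:
            out.append((l, None))  # no match in this window: right side is null
    return out


def batch_left_outer_stream_windowed(batch_rows, stream_events, window_size, key="id"):
    """Join the static batch (left side) against each tumbling window of the stream.

    Assumption: every event carries an integer timestamp `ts`, and the join is
    evaluated per window, so past windows never need to be revisited.
    """
    windows = defaultdict(list)
    for ev in stream_events:
        windows[ev["ts"] // window_size].append(ev)
    return {w: left_outer_join(batch_rows, rows, key)
            for w, rows in sorted(windows.items())}


# Hypothetical data: two batch rows, three stream events, windows of 10 time units.
batch = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
stream = [
    {"id": 1, "ts": 0, "v": 10},
    {"id": 1, "ts": 3, "v": 11},
    {"id": 2, "ts": 12, "v": 20},
]
result = batch_left_outer_stream_windowed(batch, stream, window_size=10)
```

In this sketch a batch row with no match inside a window is emitted with a null right side for that window only, which is why the result stays bounded. Without the window, a non-matching batch row could never be finalized, because a matching stream event might still arrive later.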

I think it would be super useful to have a central place in the 2.0 docs
that spells out exactly what is included, what is targeted for 2.1 and
what will likely come after 2.1.
I think that so far it is not well-communicated (and we are a couple of
weeks after the preview release) - as a user and potential early adopter I
have to constantly dig into the source code and pull requests trying to
decipher if I could use 2.0 APIs for my use case.

Ofir Manor

Co-Founder & CTO | Equalum

Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io

On Tue, Jun 7, 2016 at 12:36 PM, Tathagata Das <tathagata.das1...@gmail.com>
wrote:

> 1.  Not all types of joins are supported. Here is the list.
> - Right outer joins - stream-batch not allowed, batch-stream allowed
> - Left outer joins - batch-stream not allowed, stream-batch allowed
>  (reverse of Right outer join)
> - Stream-stream joins are not allowed
>
> In the cases of outer joins, the not-allowed-cases are fundamentally hard
> because to do them correctly, every time there is new data in the stream,
> all the past data in the stream needs to be processed. Since we cannot
> store an ever-increasing amount of data in memory, this is not feasible.
>
> 2. For the update mode, the timeline is Spark 2.1.
>
>
> TD
>
> On Mon, Jun 6, 2016 at 6:54 AM, raaggarw <raagg...@adobe.com> wrote:
>
>> Thanks
>> So,
>>
>> 1) For joins (stream-batch) - are all types of joins supported (I mean
>> inner, left outer, etc.) or only specific ones?
>> Also, what is the timeline for complete support, meaning stream-stream
>> joins?
>>
>> 2) So now outputMode is exposed via DataFrameWriter but will work in
>> specific cases as you mentioned? We were looking for delta & append output
>> modes for aggregation/groupBy. What is the timeline for that?
>>
>> Thanks
>> Ravi
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Timeline-for-supporting-basic-operations-like-groupBy-joins-etc-on-Streaming-DataFrames-tp27091p27093.html
>>
>
