Re: Lateness for Spark

2018-09-12 Thread Amit Sela
event time triggers (via watermarks) should be supported as well. On Sun, Sep 9, 2018 at 11:57 PM Vishwas Bm wrote: > Hi, > > Thanks for the reply. As per the beam capability matrix only > Processing-time triggers is supported by spark runner. > As this page is not updated, what other triggers

Re: Lateness for Spark

2018-09-09 Thread Vishwas Bm
Hi, Thanks for the reply. As per the beam capability matrix only Processing-time triggers is supported by spark runner. As this page is not updated, what other triggers are supported in beam spark runner. *Thanks & Regards,* *Vishwas *

Re: Lateness for Spark

2018-09-09 Thread Amit Sela
I don't think the capability matrix is updated, the Spark runner uses LateDataUtils to handle late elements - https://github.com/apache/beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L300 On Fri, Sep 7, 2018 at 6:43 PM Rag

Re: Lateness for Spark

2018-09-07 Thread Raghu Angadi
I see. Hopefully someone with more familiarity with Spark runner will chime in. On Fri, Sep 7, 2018 at 1:41 PM Vishwas Bm wrote: > Hi, > > In our use case the watermark is the processing time. > > As per beam capability matrix ( > https://beam.apache.org/documentation/runners/capability-matrix/)

Re: Lateness for Spark

2018-09-07 Thread Vishwas Bm
Hi, In our use case the watermark is the processing time. As per beam capability matrix ( https://beam.apache.org/documentation/runners/capability-matrix/) lateness is not supported by spark runner. But as per the output in our use case we are able to see late data getting emitted. So we wante

Re: Lateness for Spark

2018-09-07 Thread Raghu Angadi
Lateness depends on watermark. How did you configure your KafkaIO reader? Did you set custom timestamp function? By default watermark in KafkaIO is set to same as processing time, in which case, your watermark could be close to 13-38-37 (processing time). Note that this is in general true across a

Lateness for Spark

2018-09-07 Thread rahul patwari
Hi, We are running a Beam program on Spark. We are using 2.5.0 Beam and SparkRunner versions. We are seeing Late data in the output emitted by Spark. As per the capability Matrix, Lateness is not supported in Spark. Is it supported now? or Are we missing something? Steps: Read from Kafka, Apply