What Robert said is correct. However, that behaviour depends on the
Trigger. You can write your own Trigger that behaves differently when
late data arrives, that is, you could write a trigger that never fires
for late data. In that case, you can also simply set the allowed
lateness to zero, however. You could also write a trigger that waits for
a certain number of late elements to arrive and then triggers a firing.


Best,

Aljoscha





On Fri, Mar 10, 2017, at 20:14, Robert Metzger wrote:

> Hi Ethan,

> 

> how late elements (elements with event time after the watermark) are
> handled depends on the operator. Flink's window operators will trigger
> a single event window when they fall into the "allowed lateness"
> timeframe. Otherwise, they are dropped.
> 

> On Thu, Mar 9, 2017 at 5:30 PM, ext.eformichella
> <ext.eformiche...@riotgames.com> wrote:
>> Thanks for the suggestion, we can definitely try that out.

>> 

>> My one concern there is that events technically can lag for days or
>> even months in some cases, but we only care about including the
>> events that lag for 30 minutes or so, and would like the further
>> lagging events to be ignored - I just want to make sure that doesn't
>> require special handling.
>> 

>> I also just want to make sure I'm understanding the maximum lateness
>> watermark correctly. Suppose a watermark gets generated, and then an
>> element with an older timestamp is found. My understanding was that
>> that element should be ignored, but from our results it looks like
>> the late element actually overwrites the aggregate of the on-time
>> elements. Is this expected behavior?
>> 

>> Thank you for your help!

>> -Ethan

>> 

>> On Tue, Mar 7, 2017 at 6:01 PM, Dawid Wysakowicz [via Apache Flink
>> User Mailing List archive.] <[hidden email][1]> wrote:
>>> 

>>> Hi Ethan,

>>> 

>>> I believe then it is because the Watermark and Timestamps in your
>>> implementation are uncorrelated. What Watermark really is a marker
>>> that says there will be no elements with timestamp smaller than the
>>> value of this watermark. For more info on the concept see [1][2].
>>> 

>>> In your case as you say that events can "lag" for 30 minutes, you
>>> should try the BoundedOutOfOrdernessTimestampExtractor. It is
>>> designed exactly for a case like yours.
>>> 

>>> Regards,

>>> Dawid

>>> 

>>> 

>>> 2017-03-07 22:33 GMT+01:00 ext.eformichella <[hidden email][3]>:

>>>> Hi Dawid, I'm working with Max on the project Our code for the
>>>> TimestampAndWatermarkAssigner is: ``` class
>>>> TimestampAndWatermarkAssigner(val maxLateness: Long) extends
>>>> AssignerWithPeriodicWatermarks[Row] {
>>>>
>>>>    override def extractTimestamp(element: Row,
>>>>    previousElementTimestamp: Long): Long = {  element.minTime }
>>>>
>>>>    override def getCurrentWatermark(): Watermark = {  new
>>>>    Watermark(System.currentTimeMillis() - maxLateness) } } ```
>>>>
>>>>  Where Row is a class representing the incoming JSON object coming
>>>>  from Kafka, which includes the timestamp
>>>>
>>>>  Thanks, -Ethan
>>>>
>>>>
>>>>
>>>>  --
>>>>  View this message in context:
>>>>  
>>>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Issues-with-Event-Time-and-Kafka-tp12061p12090.html
>>>> Sent from the Apache Flink User Mailing List archive. mailing list
>>>> archive at Nabble.com.
>>>> 

>>> 

>>> 
>>> 

>>> 

>>> If you reply to this email, your message will be added to the
>>> discussion below:
>>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Issues-with-Event-Time-and-Kafka-tp12061p12092.html
>>> 

>>> To unsubscribe from Issues with Event Time and Kafka, click here.
>>> NAML[4]
>>> 

>>
>> View this message in context:Re: Issues with Event Time and Kafka[5]
>> Sent from the Apache Flink User Mailing List archive. mailing list
>> archive[6] at Nabble.com.



Links:

  1. http:///user/SendEmail.jtp?type=node&node=12139&i=0
  2. 
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/event_time.html#event-time-and-watermarks
  3. http:///user/SendEmail.jtp?type=node&node=12092&i=0
  4. 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml
  5. 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Issues-with-Event-Time-and-Kafka-tp12061p12139.html
  6. http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Reply via email to