Hi Max, How do you assign timestamps to your events (in event-time case)? Could you post whole code for your TimestampAndWatermarkAssigner?
Regards, Dawid 2017-03-07 20:59 GMT+01:00 ext.mwalker <ext.mwal...@riotgames.com>: > Hi Stephan, > > The right number of events seem to leave the source and enter the windows, > but it shows that 0 exit the windows. > > Also I have tried 30 minutes and not setting the watermark interval, I am > not sure what I am supposed to put there the docs seem vague about that. > > Best, > > Max > > On Tue, Mar 7, 2017 at 1:54 PM, Stephan Ewen [via Apache Flink User > Mailing List archive.] <[hidden email] > <http:///user/SendEmail.jtp?type=node&node=12087&i=0>> wrote: > >> Hi! >> >> At a first glance, your code looks correct to assign the Watermarks. What >> is your watermark interval in the config? >> >> Can you check with the Flink metrics (if you are using 1.2) to see how >> many rows leave the source, how many enter/leave the window operators, etc? >> >> That should help figuring out why there are so few result rows... >> >> Stephan >> >> >> On Mon, Mar 6, 2017 at 8:57 PM, ext.mwalker <[hidden email] >> <http:///user/SendEmail.jtp?type=node&node=12084&i=0>> wrote: >> >>> Hi Folks, >>> >>> We are working on a Flink job to proccess a large amount of data coming >>> in >>> from a Kafka stream. >>> >>> We selected Flink because the data is sometimes out of order or late, >>> and we >>> need to roll up the data into 30-minutes event time windows, after which >>> we >>> are writing it back out to an s3 bucket. >>> >>> We have hit a couple issues: >>> >>> 1) The job works fine using processing time, but when we switch to event >>> time (almost) nothing seems to be written out. >>> Our watermark code looks like this: >>> ``` >>> override def getCurrentWatermark(): Watermark = { >>> new Watermark(System.currentTimeMillis() - maxLateness); >>> } >>> ``` >>> And we are doing this: >>> ``` >>> val env = StreamExecutionEnvironment.getExecutionEnvironment >>> env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime) >>> ``` >>> and this: >>> ``` >>> .assignTimestampsAndWatermarks(new >>> TimestampAndWatermarkAssigner(Time.minutes(30).toMilliseconds)) >>> ``` >>> >>> However even though we get millions of records per hour (the vast >>> majority >>> of which are no more that 30 minutes late) we get like 2 - 10 records per >>> hour written out to the s3 bucket. >>> We are using a custom BucketingFileSink Bucketer if folks believe that is >>> the issue I would be happy to provide that code here as well. >>> >>> 2) On top of all this, we would really prefer to write the records >>> directly >>> to Aurora in RDS rather than to an intermediate s3 bucket, but it seems >>> that >>> the JDBC sink connector is unsupported / doesn't exist. >>> If this is not the case we would love to know. >>> >>> Thanks in advance for all the help / insight on this, >>> >>> Max Walker >>> >>> >>> >>> -- >>> View this message in context: http://apache-flink-user-maili >>> ng-list-archive.2336050.n4.nabble.com/Issues-with-Event-Time >>> -and-Kafka-tp12061.html >>> Sent from the Apache Flink User Mailing List archive. mailing list >>> archive at Nabble.com. >>> >> >> >> >> ------------------------------ >> If you reply to this email, your message will be added to the discussion >> below: >> http://apache-flink-user-mailing-list-archive.2336050.n4. >> nabble.com/Issues-with-Event-Time-and-Kafka-tp12061p12084.html >> To unsubscribe from Issues with Event Time and Kafka, click here. >> NAML >> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> >> > > > ------------------------------ > View this message in context: Re: Issues with Event Time and Kafka > <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Issues-with-Event-Time-and-Kafka-tp12061p12087.html> > > Sent from the Apache Flink User Mailing List archive. mailing list archive > <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/> at > Nabble.com. >