Hello everyone,
I've been struggling for a while to perform a session-window join using the
pyflink / SQL API. From my understanding, the Table API requires an equality
predicate on the window_start and window_end columns. Unfortunately, in the
case of a session join (with a static gap)
--
Best!
Xuyang
At 2024-07-17 17:22:01, "liu ze" wrote:
Hi,
Currently, Flink's windows are based on time (or a fixed number of elements). I
want to trigger window computation based on specific events (marked within the
data). In the DataStream API this can be achieved using GlobalWindow and custom
triggers, but how can it be done in Flink SQL?
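For reference, the DataStream approach being described is typically a
GlobalWindows assigner plus a custom Trigger that fires when a marker record
arrives. A minimal sketch (the Event POJO and its isEndMarker() flag are
assumptions for illustration):

import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;

// Fires and purges the window whenever a record marked as an end-of-batch
// event arrives; all other records simply accumulate in the window.
public class MarkerTrigger extends Trigger<Event, GlobalWindow> {
    @Override
    public TriggerResult onElement(Event element, long timestamp,
                                   GlobalWindow window, TriggerContext ctx) {
        return element.isEndMarker()
                ? TriggerResult.FIRE_AND_PURGE
                : TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, GlobalWindow window,
                                          TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onEventTime(long time, GlobalWindow window,
                                     TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }

    @Override
    public void clear(GlobalWindow window, TriggerContext ctx) {
        // no timers or custom state to clean up in this sketch
    }
}

Usage would be along the lines of
events.keyBy(Event::getKey).window(GlobalWindows.create()).trigger(new MarkerTrigger()).reduce(...);
whether the same can be expressed in SQL is exactly the open question above.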
Hi Sachin,
`performing incremental aggregation using stateful processing` is the same as
`windows with agg`, but the former is more flexible. If Flink windows cannot
satisfy your performance needs, and your business logic has some features that
can be customized for optimization, you can choose the
= reducedPlayerStatsData
    .keyBy(new KeySelector())
    .window(TumblingEventTimeWindows.of(Time.seconds(secs)))
    .aggregate(new DataAggregator())
    .name("aggregate");
In this case the data being aggregated is of a different type than the input,
so I had to use
- Using
multi-level time window granularity for pre-aggregation can
significantly improve performance and reduce computation latency.
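As a rough sketch of the multi-level idea (names and the 10-second/1-minute
levels are illustrative assumptions, not code from this thread): pre-aggregate
into small windows first, then roll the partial aggregates up into the final
window size. The output of an event-time window carries the window's end
timestamp, so it can be re-windowed downstream.

// Level 1: pre-aggregate raw events into 10-second partial counts per key.
DataStream<Stat> partial = events
        .keyBy(Stat::getKey)
        .window(TumblingEventTimeWindows.of(Time.seconds(10)))
        .reduce((a, b) -> new Stat(a.getKey(), a.getCount() + b.getCount()));

// Level 2: roll the partials up into the 1-minute result windows.
DataStream<Stat> result = partial
        .keyBy(Stat::getKey)
        .window(TumblingEventTimeWindows.of(Time.minutes(1)))
        .reduce((a, b) -> new Stat(a.getKey(), a.getCount() + b.getCount()));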
Best,
Zhongqiang Gong
Sachin Mittal wrote on Fri, May 17, 2024 at 03:48:
Hi,
My pipeline step is something like this:
SingleOutputStreamOperator reducedData =
    data
        .keyBy(new KeySelector())
        .window(TumblingEventTimeWindows.of(Time.seconds(secs)))
        .reduce(new DataReducer())
        .name("reduce");
This works fine
On Apr 8, 2024 at 11:55 AM wrote:
Hi Sachin,
What exactly does the MyReducer do? Can you provide us with some code?
Just a wild guess from my side: did you check the watermarking? If the
watermarks aren't progressing, there's no way for Flink to know when to emit a
window, and therefore you won't see any outgoing records.
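One way to make that check concrete is to spell out the watermark strategy at
the source (a sketch; the getEventTime() field and the 5-second bound are
assumptions):

DataStream<MyEvent> withTimestamps = stream.assignTimestampsAndWatermarks(
        WatermarkStrategy
                .<MyEvent>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                // extract the event-time field the windows are based on
                .withTimestampAssigner((event, ts) -> event.getEventTime())
                // guard against one idle source partition stalling the
                // watermark, which would keep windows from ever firing
                .withIdleness(Duration.ofMinutes(1)));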
Hi,
I have the following windowing step in my pipeline:
inputData
    .keyBy(new MyKeySelector())
    .window(TumblingEventTimeWindows.of(Time.seconds(60)))
    .reduce(new MyReducer())
    .name("MyReducer");
The same step, when I look at it in the Flink UI, shows as:
Window(TumblingEventT
Hi team,
I'm a newcomer to Flink's window functions, specifically utilizing
TumblingProcessingTimeWindows with a configured window duration of 20
minutes. However, I've noticed an anomaly where the window output occurs
within 16 to 18 minutes. This has left me uncertain about whether
On Mon, Dec 4, 2023 at 8:03 PM arjun s wrote:
Hello team,
I'm relatively new to Flink's window functions, and I've configured a
tumbling window with a 10-minute duration. I'm wondering about the scenario
where the Flink job is restarted or the Flink application goes down. Is
there a mechanism to persist the aggregated window state?
I am reading stats from Kinesis, deserializing them into a stat POJO and
then doing something like this, using an aggregating window with no defined
ProcessWindowFunction:
timestampedStats
    .keyBy(v -> v.groupKey())
    .window(TumblingEventTimeWindows.of(Time.seconds(
Are you using Flink SQL? If using Flink SQL, the window is triggered when, and
only when, the special data (with the expected timestamp past the watermark)
arrives. It is not possible to trigger the window without changing the
window-start and window-end columns.
--
Best!
Xuyang
At
Is it possible to trigger a window without changing the window-start and
window-end dates?
I have a lot of jobs running with a tumble window (3h), and when they are all
triggered at the same time it causes performance problems. If somehow I
could delay some of them by 10-15 minutes, without changing the
Hi,
Is it possible to delay a window trigger without changing window-end and
window-start times?
Thanks
semantics on the second table as well.
* The logic is this:
* Your join operator only generates output windows once the event time
passes by the end of the time window
* The event time/watermark time of your join operator is the minimum
watermark time of all inputs
* Because your second table does not emit watermarks, its watermark time
remains at Long.MinValue, hence the operator time also stays there
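Concretely, the fix Matthias is pointing at is to declare an event-time
attribute and watermark on the second table too, so the join's combined
watermark can advance. A sketch in DDL form (table name, columns, and connector
options are made up for illustration; the thread's actual tables are
upsert-style Kafka):

// The WATERMARK clause gives the second table an event-time attribute, so
// the join operator's watermark (the minimum over all inputs) can advance.
tableEnv.executeSql(
        "CREATE TABLE second_table ("
        + "  id STRING,"
        + "  event_time TIMESTAMP(3),"
        + "  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND"
        + ") WITH ("
        + "  'connector' = 'kafka',"
        + "  'topic' = 'second-topic',"
        + "  'properties.bootstrap.servers' = 'localhost:9092',"
        + "  'format' = 'json'"
        + ")");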
Hi Matthias,
No, the second table doesn't have an event time and a watermark specified. For
the window to work, do I need a watermark on the second table as well?
Thanks
Eugenio
> On Sep 21, 2023 at 13:45, Schwalbe Matthias wrote:
>
> Hi Eugenio,
>
you think?
Kind regards
Thias
From: Eugenio Marotti
Sent: Thursday, September 21, 2023 8:56 AM
To: user@flink.apache.org
Subject: Window aggregation on two joined table
Hi,
I'm trying to execute a window aggregation on two joined tables from two Kafka
topics (upsert fashion), but I get no output. Here's the code I'm using:
This is the first table from Kafka, with an event-time watermark on the
'data_fine' attribute:
final TableDescriptor
I am trying to run a window aggregation SQL query (on Flink 1.16) with a
windowing TVF for a TUMBLE window with a size of 5 milliseconds. It seems
Flink does not let a window size use a time unit smaller than seconds. Is
that correct?
(The documentation
<https://nightlies.apache.org/flink/flink-d
Hi Eugenio,
I think it is due to window completion: the window will only complete once the
watermarked field on the events advances by the 15-day interval since the first
received event time.
Please also check the default trigger behavior here:
https://nightlies.apache.org/flink/flink-docs-release-1.16/docs
Hi everyone,
I’m trying to calculate an average with a sliding window. Here’s the code I’m
using. First of all I receive a series of events from a Kafka topic. I declared
a watermark on the ‘data_fine’ attribute.
final TableDescriptor filteredPhasesDurationsTableDescriptor
y using TumblingWindow window
> assigner implementation is that the state sizes are growing unconditionally
> even when we have tuned RocksDB options and provided a good chunk of
> managed memory.
>
> We are able to read more than 15 records within a period of 4 mins,
> which
Hi Jiadong,
From the context you described, I think ProcessingTimeWindow may not be a
good solution.
If I understand correctly, you'd like to use the same SQL for streaming and
batch jobs in your platform. How about creating partitioned sink tables for
streaming jobs instead of using a window?
r scenario.
Thank you for your time and I look forward to hearing from you soon.
Best,
Jiadong Lu
On 2023/4/26 18:13, Shammon FY wrote:
Hi Jiadong,
Using the processing-time window in batch jobs seems a little strange to
me. I would prefer to partition the data at the day level, and then have the
batch job read data from different partitions instead of using a window.
Best,
Shammon FY
On Wed, Apr 26, 2023 at 12:03 PM Jiadong Lu
Hi Shammon,
Thank you for your reply.
Yes, the window configured with `Time.days(1)` has no special meaning;
it is just used to group all data into the same global window.
I tried using `GlobalWindow` for this scenario, but `GlobalWindow` also
needs a `Trigger` like
Hi Jiadong,
I think it depends on the specific role of the window here for you. If this
window has no specific business meaning and is only used for performance
optimization, maybe you can consider using a join directly.
Best,
Shammon FY
On Tue, Apr 25, 2023 at 5:42 PM Jiadong Lu wrote:
Hello everyone,
I am confused about the window of the join/coGroup operator in batch mode.
Here is my demo code, and it works fine for me at present. I wonder if
this approach of using a processing-time window in batch mode is
appropriate, and does this approach have any problems? I want to use
this
RocksDB options and provided a good chunk of
managed memory.
We are able to read more than 15 records within a period of 4 mins,
which is our time window, set based on our requirements.
Now, one optimization which I see suggested in the Flink docs in
order to reduce the state size is to use
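For what it's worth, the docs' suggestion is presumably incremental
aggregation: give the window a ReduceFunction or AggregateFunction so RocksDB
keeps one small accumulator per key and window instead of buffering every
record. A sketch with illustrative types:

import org.apache.flink.api.common.functions.AggregateFunction;
import org.apache.flink.api.java.tuple.Tuple2;

// Keeps a single (sum, count) accumulator per key/window; getResult() turns
// it into the average once the window fires.
public class AverageAggregate
        implements AggregateFunction<Long, Tuple2<Long, Long>, Double> {

    @Override
    public Tuple2<Long, Long> createAccumulator() {
        return Tuple2.of(0L, 0L);
    }

    @Override
    public Tuple2<Long, Long> add(Long value, Tuple2<Long, Long> acc) {
        return Tuple2.of(acc.f0 + value, acc.f1 + 1);
    }

    @Override
    public Double getResult(Tuple2<Long, Long> acc) {
        return acc.f1 == 0 ? 0.0 : ((double) acc.f0) / acc.f1;
    }

    @Override
    public Tuple2<Long, Long> merge(Tuple2<Long, Long> a, Tuple2<Long, Long> b) {
        return Tuple2.of(a.f0 + b.f0, a.f1 + b.f1);
    }
}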
Hi,
I think if you can give more detail, such as an example, it can help us trace
the cause. Thanks.
Best,
Shammon
On Tue, Mar 7, 2023 at 5:31 PM Saurabh Singh via user
wrote:
Hi Community,
We have the below use case:
- We have to use event time for windowing (tumbling window) and
watermarking.
- We use *TumblingEventTimeWindows* for this.
- We have to continuously emit the results for the window every 1 minute.
- We are planning to use
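If the plan for the 1-minute emission is a custom trigger, the built-in
ContinuousEventTimeTrigger is the usual fit; a sketch (window size, types, and
the aggregate function are assumptions):

DataStream<Double> earlyResults = orders
        .keyBy(Order::getKey)
        .window(TumblingEventTimeWindows.of(Time.minutes(10)))
        // fire the still-open window every minute of event time, emitting
        // updated partial results until the window finally closes
        .trigger(ContinuousEventTimeTrigger.of(Time.minutes(1)))
        .aggregate(new MyAggregate()); // MyAggregate: an assumed AggregateFunction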
Hi dear engineers,
One question, as in the title: do Flink SQL window operations support "Allow
Lateness and SideOutput",
just as supported in the DataStream API (allowedLateness and sideOutputLateData),
like:
SingleOutputStreamOperator<>sumStream = dataStream.ke
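To complete the DataStream picture being referenced, a sketch (the Event type,
its key/value fields, and the 1-minute/5-minute choices are illustrative):

final OutputTag<Event> lateTag = new OutputTag<Event>("late-events") {};

SingleOutputStreamOperator<Event> summed = events
        .keyBy(Event::getKey)
        .window(TumblingEventTimeWindows.of(Time.minutes(1)))
        .allowedLateness(Time.minutes(5))   // keep window state 5 extra minutes
        .sideOutputLateData(lateTag)        // anything later still goes here
        .sum("value");

DataStream<Event> lateEvents = summed.getSideOutput(lateTag);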
Hi Yuxia, Thanks for your reply! I expected the program to do a reduce by key:
it should count the number of records having the same username field. The
program threw no exception, but it did not produce the expected output. Best
regards, Lucas
Hi, Lucas.
What do you mean by saying "unable to do event time window aggregation with
watermarkedStream"?
What exception does it throw?
Best regards,
Yuxia
From: "wei_yuze"
To: "User"
Sent: Tuesday, February 7, 2023 1:43:59 PM
Subject: Unable to do event time
Hello!
I was unable to do event time window aggregation with a Kafka source, but had
no problem with a "fromElements" source. The code is attached below. The code
has two data sources, named "streamSource" and "kafkaSource" respectively. The
program works wel
I'm trying to sink two window streams to the same Kinesis sink. When
I do this, no results are making it to the sink (code below). If I
remove one of the windows from the job, results do get published.
Adding another stream to the sink seems to void both.
How can I have results from both window streams delivered to the sink?
That’s my understanding as well, thanks for your confirmation.
From: Yanfei Lei
Sent: 04 November 2022 16:03
To: Qing Lim
Cc: User
Subject: Re: Does reduce function on keyed window gives any guarantee on the
order of elements?
Hi Qing,
Best,
Yanfei
Qing Lim wrote on Thu, Nov 3, 2022 at 16:17:
Hi Yanfei,
Thanks for the explanation.
If I use reduce in the context of a keyed stream with a window, am I right to
think that there will be 1 reduce function per key, and they will never
overlap? Each reduce function instance will only receive elements from the
same key, in order.
From: Yanfei Lei
Hi Qing,
> Does it guarantee that it will be called in the same order of elements in
the stream, where value2 is always 1 element after value1?
Order is maintained within each parallel stream partition. If the reduce
operator only has one sending sub-task, the answer is YES, but if the reduce
operato
Hi, Flink User Group,
I am trying to use the Reduce function, and I wonder: does it guarantee order
when it's called?
The signature is as follows:
T reduce(T value1, T value2) throws Exception;
Does it guarantee that it will be called in the same order of elements in the
stream, where value2 is always 1
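For context, a minimal ReduceFunction of that shape looks like this (a sketch
with an assumed Event type); value1 is the previously reduced value for the
key and value2 is the newly arrived element:

// Running sum per key: Flink passes the prior aggregate as value1 and the
// incoming element as value2.
ReduceFunction<Event> sum = (value1, value2) ->
        new Event(value1.getKey(), value1.getValue() + value2.getValue());

DataStream<Event> totals = events
        .keyBy(Event::getKey)
        .window(TumblingEventTimeWindows.of(Time.minutes(5)))
        .reduce(sum);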
Thanks for the confirmation :)
Regards,
Alexis.
On Sun, 9 Oct 2022, 10:37 Hangxiang Yu, wrote:
Hi, Alexis.
I think you are right. It also applies to a global window with a custom
trigger.
If you apply a ReduceFunction or AggregateFunction, the window state size
is usually smaller than when applying a ProcessWindowFunction, due to the
aggregated value. This also works for global windows.
Of course
Hello,
I found an SO thread that clarifies some details of window state size [1].
I would just like to confirm that this also applies when using a global
window with a custom trigger.
The reason I ask is that the TriggerResult API is meant to cover all
supported scenarios, so FIRE vs FIRE_AND_PURGE
/datastream/EndOfStreamWindows.java
The usage would be like:
.keyBy(new MyKeySelector())
.window(EndOfStreamWindows.get())
.reduce(new MyReduceFunction())
Best,
Yunfeng Zhou
On Thu, Sep 29, 2022 at 9:36 PM Vararu, Vadim
wrote:
Hi all,
I need to configure a keyed global window that would trigger a reduce function
for all the events in each key group before the processing finishes and the
job closes.
I have something similar for the realtime (streaming) version of the job,
configured with a processing-time gap
Hi everyone,
I know the low-level details of this are likely internal, but at a high
level we can say that operators usually have some state associated with
them. Particularly for error handling and job restarts, I imagine windows
must persist state, and operators in general probably persist network
Sorry for the misleading answer.
After some investigation, it seems UDTAGG can only be used in the Flink Table API.
Best regards,
Yuxia
- Original Message -
From: "yuxia"
To: "user-zh"
Cc: "User"
Sent: Thursday, August 18, 2022 10:21:12 AM
Subject: Re: get state from window
Hi, there are two methods on the Context object that a process() invocation
receives that allow access to the two types of state:
- globalState(), which allows access to keyed state that is not scoped
to a window
- windowState(), which allows access to keyed state that is also scoped
to the window
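A sketch of both in a ProcessWindowFunction (state names and types are
illustrative):

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

public class CountingWindowFunction
        extends ProcessWindowFunction<Event, String, String, TimeWindow> {

    private final ValueStateDescriptor<Long> totalDesc =
            new ValueStateDescriptor<>("total-per-key", Long.class);

    @Override
    public void process(String key, Context ctx, Iterable<Event> elements,
                        Collector<String> out) throws Exception {
        long inWindow = 0;
        for (Event ignored : elements) {
            inWindow++;
        }

        // globalState(): keyed state shared across all windows of this key;
        // windowState() would instead scope the state to this one window.
        ValueState<Long> total = ctx.globalState().getState(totalDesc);
        long newTotal = (total.value() == null ? 0L : total.value()) + inWindow;
        total.update(newTotal);

        out.collect(key + ": " + inWindow + " in this window, "
                + newTotal + " across all windows");
    }
}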
> does flink sql support UDTAGG?
Yes, Flink SQL supports UDTAGG.
Best regards,
Yuxia
- Original Message -
From: "曲洋"
To: "user-zh", "User"
Sent: Thursday, August 18, 2022 10:03:24 AM
Subject: get state from window
Hi dear engineers,
I have one question: does flink stre
Is it possible in Flink SQL to tumble a window by row count instead of time?
Let's say that I want a window for every 1 rows, for example, using the Flink
SQL API.
Is that possible?
I can't find any documentation on how to do that, and I don't know if it is
supported.
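As far as I know the SQL windowing TVFs are time-based only, but the
DataStream API does have count windows; a sketch (the Event type and the
1000-row size are illustrative assumptions):

// Tumbling count window: emits one reduced record per key for every
// 1000 elements of that key, regardless of time.
DataStream<Event> perThousand = events
        .keyBy(Event::getKey)
        .countWindow(1000)
        .reduce((a, b) -> new Event(a.getKey(), a.getValue() + b.getValue()));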
AS pgcnt FROM TABLE(TUMBLE(TABLE myTable, DESCRIPTOR(rowtime), INTERVAL '5' SECONDS))
> WHERE (p_userid <> 'User_6') GROUP BY window_start, window_end, userid
>
> --
> Best!
> Xuyang
>
> At 2022-05-19 08:53:03, "Pouria Pi
I am running a Flink application in Java that performs window aggregation.
The query runs successfully on Flink 1.14.4. However, after upgrading to
Flink 1.15.0 and switching the code to use a windowing TVF, it fails with a
runtime error, as the planner cannot compile and instantiate the window Aggs
Handler.
Hi Juntao,
Thanks a lot for taking a look at this.
Sorry for making the previous mail private.
My response is reposted here:
"
After a little inspection, I found that elements (window state) are stored
in namespace TimeWindow{start=1,end=11}, in your case, and the trigger count
(trigger state) is stored in namespace TimeWindow{start=1,end=15}
Hi Shilpa,
There is no need to have artificial messages in the input Kafka topic (and I
don't see where Andrew suggests this 😊).
However, your use case is not 100% clear as to which keys you want to emit
0-count window results for, either:
* A) For all keys your job has ever seen (that's
Thanks Andrew.
We did consider this solution too. Unfortunately we do not have permission
to generate artificial Kafka events in our ecosystem.
Dario,
Thanks for your inputs. We will give your design a try. Due to the number of
events being processed per window, we are using incremental aggregation
It depends on the use case; in Shilpa's use case it is about users, so
the user ids are probably known beforehand.
https://dpaste.org/cRe3G <= This is an example without a window, but
essentially, Shilpa, you would be re-registering the timers every time they
fire.
You would also have t
ingesting into Hive, we filter out the canary events. The ingestion
code has work to do and can mark an hour as complete, but still end up
writing no events to it.
Perhaps you could do the same? Always emit artificial events, and filter
them out in your windowing code? The window should still fire
a
solution to do the same. The options we considered are below; please let us
know if there are other ideas we haven't looked into.
[1] Queryable State: save the keys in each of the ProcessWindowFunctions,
query the state from an external application, and alert when a key is
missing afte
Can anyone help me with this?
Thanks in advance,
On Tue, Apr 19, 2022 at 4:28 PM Dongwon Kim wrote:
Hi,
I'm using Flink 1.14.4 and failed to load, in WindowReaderFunction, the state
of a stateful trigger attached to a session window.
I found that the following data becomes available in WindowReaderFunction:
- the state defined in the ProcessWindowFunction
- the registered timers of the stateful trigger
I dug into this further, and I no longer suspect the window described
previously, as it does not leverage a MergingWindowAssigner. However, I did
identify four in our code that do. They all
leverage ProcessingTimeSessionWindows.withGap; 3 of them use a 500ms gap,
while one uses a 100ms gap. Based on
Here's our custom trigger. We thought about switching to
ProcessingTimeoutTrigger.of(CountTrigger.of(100), Duration.ofMinutes(1)). But
I'm not sure that'll trigger properly when the window closes.
Thanks.
Jai
import org.apache.flink.streaming.api.windowing.triggers.Count
We are encountering the following error when running our Flink job. We
have several processing windows, but it appears to be related to a
TumblingProcessingTimeWindow. Checkpoints are failing to complete midway.
The code block for the window is:
.keyBy(order -> getKey(order))
.win
ompleteness,
> and I did a cut and paste lol.
>
> Ok, I'll let you know.
>
> On Mon, 31 Jan 2022 at 17:18, Dario Heinisch
> wrote:
>
Then you should be using a processing-time based window, in your case:
TumblingProcessingTimeWindows
See
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/windows/
for more info.
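i.e. something along these lines (a sketch; mapping to (key, 1) pairs for the
count is an assumption based on John's description):

// Counts whatever arrived in the last 5 minutes of wall-clock time,
// independent of event-time timestamps or watermarks.
DataStream<Tuple2<String, Long>> counts = events
        .map(e -> Tuple2.of(e.getKey(), 1L))
        .returns(Types.TUPLE(Types.STRING, Types.LONG))
        .keyBy(t -> t.f0)
        .window(TumblingProcessingTimeWindows.of(Time.minutes(5)))
        .reduce((a, b) -> Tuple2.of(a.f0, a.f1 + b.f1));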
On 31.01.22 23:13, John Smith wrote:
Hi Dario, I don't care about event time, I just want to do a tumbling window
over the "processing time", i.e. count whatever I have in the last 5 minutes.
On Mon, 31 Jan 2022 at 17:09, Dario Heinisch
wrote:
> Hi John
>
> This is because you are using event time (TumblingEve
"Kafka Source")
.uid(kafkaTopic).name(kafkaTopic)
.setParallelism(1)
.flatMap(new MapToMyEvent("my-event", "message")) // <--- This works
.uid("map-json-logs").name("map-json-logs"); slStream.keyBy(n
ToMyEvent("my-event", "message")) // <--- This works
.uid("map-json-logs").name("map-json-logs");
slStream.keyBy(new MinutesKeySelector(windowSizeMins)) // <---
This prints twice
.window(TumblingEventTimeWindows.of(Time.min
ource
detects this, it sends out a record with a very large watermark to cut off
the session.
Lars Skjærven wrote on Fri, Jan 21, 2022 at 20:01:
We're doing a stream.keyBy().window().aggregate() to aggregate customer
feedback into sessions. Every now and then we have to update the job, e.g.
change the key, so that we can't easily continue from the previous state.
Cancelling the job (without restarting from the last savepoint) will
Hi Jing,
Please try this way:
create only one sink for the final output, write the window aggregate and
Top-N in one query, and write the result of the Top-N into the final sink.
Best,
Jing Zhang
Jing wrote on Fri, Dec 24, 2021 at 03:13:
> Hi Jing Zhang,
>
> Thanks for the reply! My current implementation is
TRING, channel_id STRING, window_end
BIGINT, num_select BIGINT) "
)
This also doesn't work.
Thanks,
Jing
On Thu, Dec 23, 2021 at 1:20 AM Jing Zhang wrote:
Hi Jing,
In fact, I agree with you on using TopN [2] instead of Window TopN [1], by
normalizing time into a 5-minute unit and adding it as one of the partition
keys.
Please note two points when using TopN:
1. the result is an update stream instead of an append stream, which means the
result sent might be
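For reference, the plain Top-N pattern Jing Zhang describes looks roughly like
this (a sketch; the table aggregated_counts and its columns are assumed from
the DDL fragment earlier in the thread):

// Top-N per time bucket: the planner recognizes the ROW_NUMBER() + filter
// pattern and maintains it incrementally as an update stream.
tableEnv.executeSql(
        "SELECT channel_id, window_end, num_select FROM ("
        + "  SELECT *, ROW_NUMBER() OVER ("
        + "    PARTITION BY window_end ORDER BY num_select DESC) AS rn"
        + "  FROM aggregated_counts"
        + ") WHERE rn <= 10");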
Sorry I mean 16:00:05, but it should be similar.
-- Original Mail --
Sender: Yun Gao
Send Date: Thu Dec 23 17:05:33 2021
Recipients: cy
CC: 'user@flink.apache.org'
Subject: Re: Re:Re: Window Aggregation and Window Join ability not work properly
Hi Caiyi,
I think if the image shows all the records, then after the change we should
only have the watermark at 16:05, which is still not able to trigger the
window of 5 minutes?
Best,
Yun
-- Original Mail --
Sender: cy
Send Date: Thu Dec 23 15:44:23 2021
Hi Jing,
I'm afraid it is not possible to do Window TopN in SQL on 1.12,
because Window TopN was introduced in 1.13.
Hi, Flink community,
Is there any existing code I can use to get the window top N with Flink
1.12? I saw one possibility: create a table and insert the
aggregated data into the table, then do top N like [1]. However, I cannot
make this approach work because I need to specify the connector
I changed it to
watermark for `datatime` as `datatime` - interval '1' second
or
watermark for `datatime` as `datatime`
but it still does not work.
At 2021-12-23 15:16:20, "Yun Gao" wrote:
Hi Caiyi,
The window needs to be finalized with a watermark [1]. I noticed that the
watermark defined is `datatime` - INTERVAL '5' MINUTE, which means the
watermark emitted would be the maximum observed timestamp so far minus 5
minutes [1]. Therefore, if we want to trigger the window of
Hi Roman,
Thanks for the update and testing locally. This is very informative.
-Guoqin
On Mon, Nov 22, 2021 at 10:55 AM Roman Khachatryan wrote:
Hi Guoqin,
I was able to reproduce the problem locally. I can see that at the
time of window firing the services are already closed because of the
emitted MAX_WATERMARK.
Previously, there were some discussions around waiting for all timers
to complete [1], but AFAIK there was not much demand to
Hi Guoqin,
Thanks for the clarification.
Processing-time windows actually don't need watermarks: they fire when
the window end time comes.
But the job will likely finish earlier because of the bounded input.
Handling of this case was improved in 1.14 as part of FLIP-147, as
well as in pre
Hi Roman,
Thanks for your quick response!
Yes, it does seem to be the window-closing problem.
If I change the tumble window to event time, on the column 'requestTime',
it works fine. I guess the EOF of the test data file kicks in a watermark
of Long.MAX_VALUE.
But my application
do a local test of my PyFlink SQL application. The SQL
query does a very simple job: read from source, window aggregation, and write
the result to a sink.
In production, the source and sink will be Kinesis streams. But since I
need to do a local test, I am mocking out the source and sink with the
local filesystem.
The problem is the local job always runs successfully, but the results were
new WatermarkStrategy[Int] {
  override def createWatermarkGenerator(
      context: WatermarkGeneratorSupplier.Context): WatermarkGenerator[Int] =
    new AscendingTimestampsWatermarks[Int]

  override def createTimestampAssigner(
      context: TimestampAssignerSupplier.Context):