I have a co-process function which goes through a loop and performs the
same task against the input event using different parameters in the state.
So, on the one hand we have the basic input events, and on the other the
parameters.
Would it be a good idea to parallelize this task by means of
On Oct 13, 2023 at 9:31 PM Yashoda Krishna T wrote:
> Is it possible to use the Table API inside a ProcessAllWindowFunction?
> Say the use case is that the process function should enrich each element
> by running SQL queries over the entire set of elements in the window
> using the Table API. Is this case supported in Flink? If not, what is
> the suggested way to handle this case?
For that, we would like to call an interface from the use case that
effectively sends the event ultimately via out.collect.
The problem is that for instantiating the use case we need to inject the
collector as a dependency, and we don't have access to the collector at the
process function class level, only at the processElement method level.
Is there any way to access the collector from the process function class,
e.g. in the open() method?
Regards,
Oscar
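Flink only hands the Collector to processElement, so one common workaround is to instantiate the use case lazily on the first element, when a collector is finally in hand. The sketch below uses hand-rolled stand-ins for Flink's Collector and a hypothetical EnrichUseCase (neither is a real Flink type), just to illustrate the wiring:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Minimal stand-in for Flink's Collector, for illustration only.
interface Collector<T> {
    void collect(T record);
}

// Hypothetical "use case" that ultimately emits via an injected emitter.
class EnrichUseCase {
    private final Consumer<String> emitter;
    EnrichUseCase(Consumer<String> emitter) { this.emitter = emitter; }
    void handle(String event) { emitter.accept(event.toUpperCase()); }
}

// Sketch: create the use case lazily, on the first processElement call,
// once the collector is available (it is not accessible in open()).
class LazyInjectProcessFunction {
    private EnrichUseCase useCase; // created lazily, not in open()

    public void processElement(String value, Collector<String> out) {
        if (useCase == null) {
            // Capture the collector once; in practice Flink reuses the
            // same collector instance, but that is an implementation
            // detail, which is why this is capture-on-first-use.
            useCase = new EnrichUseCase(out::collect);
        }
        useCase.handle(value);
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        LazyInjectProcessFunction fn = new LazyInjectProcessFunction();
        fn.processElement("hello", out::add);
        System.out.println(out); // [HELLO]
    }
}
```

The alternative is to keep the use case collector-free and have it return results that processElement then forwards via out.collect itself.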
4. When using a Flink event-time window, the process function is not
triggered if no events arrive in that window.
For example, for a product named A which has two inputs with times 9:25:12
and 9:28:23, how can I output zero between 9:25 and 9:28 with an event-time
window?
Best regards,
Xlf
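Event-time windows only fire for windows that contain at least one element, so zeros for the empty windows have to be produced by the application, typically from a KeyedProcessFunction that tracks the last seen window per key and registers timers. The gap computation itself can be sketched in plain Java (no Flink dependencies; a one-minute window size is assumed here):

```java
import java.util.ArrayList;
import java.util.List;

class EmptyWindows {
    static final long WINDOW_MS = 60_000L; // assumed 1-minute windows

    // Start timestamps of all windows strictly between the windows
    // containing t1 and t2 — the windows for which a zero would have to
    // be emitted manually, since Flink will never fire them.
    static List<Long> emptyWindowsBetween(long t1, long t2) {
        long w1 = t1 - (t1 % WINDOW_MS); // window start containing t1
        long w2 = t2 - (t2 % WINDOW_MS); // window start containing t2
        List<Long> gaps = new ArrayList<>();
        for (long w = w1 + WINDOW_MS; w < w2; w += WINDOW_MS) {
            gaps.add(w);
        }
        return gaps;
    }

    public static void main(String[] args) {
        // 9:25:12 and 9:28:23 expressed as milliseconds-of-day
        long e1 = ((9 * 60 + 25) * 60 + 12) * 1000L;
        long e2 = ((9 * 60 + 28) * 60 + 23) * 1000L;
        // The 9:26 and 9:27 windows are empty for product A
        for (long w : emptyWindowsBetween(e1, e2)) {
            System.out.println("emit zero for window starting at minute " + w / 60_000);
        }
    }
}
```

In a real job, the loop body would be replaced by registering one event-time timer per missing window (or emitting the zero records directly once the watermark passes).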
It seems that WindowOperator consists of some InternalStates, where the
window is the namespace or key, if I understand correctly. But internal
states are not available to Flink users.
So my question is: is there an efficient way to simulate watermark
buffering using a process function, for Flink users?
Thanks.
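Watermark buffering can be approximated in a keyed process function by keeping elements in state keyed by timestamp and registering an event-time timer per timestamp. The core bookkeeping, stripped of Flink APIs (this is a toy model, not Flink's internal state layout), looks like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of watermark buffering: hold elements until the watermark
// passes their timestamp, then release them in timestamp order.
class WatermarkBuffer<T> {
    private final TreeMap<Long, List<T>> buffer = new TreeMap<>();

    // In a Flink ProcessFunction this would be processElement: append the
    // element to MapState and register an event-time timer for its timestamp.
    void add(long timestamp, T element) {
        buffer.computeIfAbsent(timestamp, k -> new ArrayList<>()).add(element);
    }

    // In a Flink ProcessFunction this is what firing timers in onTimer
    // would release: every buffered element with timestamp <= watermark.
    List<T> advanceWatermark(long watermark) {
        List<T> released = new ArrayList<>();
        Map<Long, List<T>> ready = buffer.headMap(watermark, true);
        ready.values().forEach(released::addAll);
        ready.clear(); // headMap is a view, so this removes from the buffer
        return released;
    }
}
```

This is less efficient than WindowOperator's namespaced internal state (the TreeMap lives per key, and in Flink the map would be a MapState<Long, List<T>>), but it is expressible entirely with the public API.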
The first branch produces a (String, Unit) type while the second one
produces a (String, String) type, so the whole if expression produces a
(String, Any) type. However, your parseJson should return Either[String,
String], thus causing this issue.

Siddhesh Kalgaonkar wrote on Wednesday, January 5, 2022 at 19:04:
> I have written a process function where I am parsing the JSON, and if it
> is not in the expected format it is passed as a Failure to the process
> function, and I print those records, which works fine. Now I was trying
> to print the message and the record in the cases of Success and Failure.
*Watermark progression:*
The code you pointed out in [1] seems to indicate that watermarks are a
logical side effect that travel alongside events in the stream but can also
progress on their own? This continues to puzzle me. Here's a contrived
example to clarify: a process function receives data but never emits
anything (neither in processElement nor based on a timer), i.e., the
process function is just a black hole for event records passing through it.
Do watermarks still make it through?

Hi Ahmad,
The ProcessFunction is simply forwarding the watermark [1]. So I don't have
any explanation as to why it would not advance anymore as soon as you emit
data. My assumption was that emitting in the process function causes
backpressure and thus halts the advancement of the watermark.
I replaced the Async I/O with a simple process function and print
statements in the body of the process function. The process function simply
emits what it received. I also removed the custom sink (that has an
external dependency) and replaced it with a simple lambda that occasionally
prints
Flink 1.11
I have a simple Flink application that reads from Kafka, uses event
timestamps, assigns timestamps and watermarks, and then keys by a field and
uses a KeyedProcessFunction.
The keyed process function outputs events from within the processElement
method using out.collect. No timers are used to collect or output any
elements (or do anything for that matter).
(the last modification time) matches the previous timestamp count.
Is there a reference about checking the previous count? Am I understanding
correctly? Help me understand this part.
(ii) Can the process function be used to look back at the previous
key/count?
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/operators/process_function/
Thank you
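Yes — per the ProcessFunction docs linked above, a keyed process function can hold the previous count for each key in keyed state (a ValueState in Flink). The bookkeeping is modeled here in plain Java, with a map standing in for per-key ValueState (not real Flink API):

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for per-key ValueState<Long>: remembers the previous
// count for each key so the current event can be compared against it.
class PreviousCountTracker {
    private final Map<String, Long> state = new HashMap<>();

    // Returns the previous count for this key (null the first time the key
    // is seen), then stores the new count — the equivalent of calling
    // state.value() followed by state.update(newCount) in processElement.
    Long previousThenUpdate(String key, long newCount) {
        return state.put(key, newCount);
    }
}
```

In an actual KeyedProcessFunction the map disappears: Flink scopes the ValueState to the current key automatically, so processElement just reads value(), compares, and calls update().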
I saw one potential issue. Your timestamp assigner returns timestamps in
second resolution while Flink requires millisecond resolution.

Best,
Kezhu Wang

On February 24, 2021 at 11:49:59, sagar (sagarban...@gmail.com) wrote:
I have a simple Flink stream program where I am using a socket as my
continuous source.
I have a window size of 2 seconds.
Somehow my window process function is not triggering, and even if I pass
events in any order, Flink is not ignoring them.
I can see the output only when I kill my socket. Please find the code
snippet below:
final StreamExecutionEnvironment env
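Kezhu's point about resolution is easy to illustrate: Flink timestamps and watermarks are epoch milliseconds, so an assigner that returns epoch seconds places every event decades in the past, and event-time windows never fire as expected. A sketch of the conversion (pure Java, no Flink types):

```java
import java.time.Instant;

class TimestampFix {
    // Wrong: treats epoch seconds as if they were milliseconds.
    static long broken(long epochSeconds) { return epochSeconds; }

    // Right: scale to the millisecond resolution Flink expects.
    static long fixed(long epochSeconds) { return epochSeconds * 1000L; }

    public static void main(String[] args) {
        long epochSeconds = 1_614_137_399L; // ~Feb 24, 2021
        // The "broken" value lands in January 1970.
        System.out.println(Instant.ofEpochMilli(broken(epochSeconds)));
        // The "fixed" value lands where expected, in February 2021.
        System.out.println(Instant.ofEpochMilli(fixed(epochSeconds)));
    }
}
```

The same `* 1000L` belongs in the timestamp assigner passed to the watermark strategy; everything downstream (windows, timers) then lines up.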
Hi Marco,
sorry for the late reply. Have you looked into user-defined aggregate
functions for SQL? I think your requirements can be easily implemented
there. You can declare multiple aggregate functions per window. There is
also the built-in function LISTAGG that might help for your use case.
Alright, maybe my example needs to be more concrete. How about this:
In this example, I don't want to create two windows just to re-combine
what was just aggregated in SQL. Is there a way to transform the aggregate
results into one datastream object so that I don't have to aggregate again?
// aggregation flow:
WINDOW --> aggregated table (the results of a table with 20 records in it)
--> PROCESS FUNCTION (aggregated table)
flink-state-backends/flink-statebackend-rocksdb/src/test/java/org/apache/flink/contrib/streaming/state/RocksIncrementalCheckpointRescalingTest.java
Best,
Guowei

On Fri, Nov 13, 2020 at 10:36 AM Marco Villalobos
<mvillalo...@kineteque.com> wrote:
> Hi,
> I would like to add keyed state to the test harness before calling the
> process function.
> I am using the OneInputStreamOperatorTestHarness.
> I can't find any examples online on how to do that, and I am struggling
> to figure this out.
> Can somebody please provide guidance? My test case has
the cluster.evenly-spread-out-slots configuration [1].
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/config.html#cluster-evenly-spread-out-slots
On Mon, Sep 14, 2020 at 5:33 PM Arti Pande wrote:
Hi,
Here is a question related to the parallelism of a keyed process function
that is applied to a KeyedStream. For some code that looks like this:
myStream.keyBy(...)
    .process(new MyKeyedProcessFunction())
    .setParallelism(10)
On a Flink cluster with 5 TM nodes, each with 10 task
Hi User,
We are using a keyed process function with event time for a Flink
streaming application.
We register an event-time timer in the processElement function, and noticed
that the onTimer function had a different timestamp than the one registered
in processElement
Hi,
the configuration parameter is just legacy API. You can simply pass any
serializable object to the constructor of your process function.
Regards,
Timo

On 29.03.18 at 20:38, Main Frame wrote:
Hi guys! I am a newbie in Flink and I have a probably silly question about
the streaming API.
So for instance:
I am trying to apply SomeProcessFunction to stream1:
…
DataStream stream2 = stream1.process(new
MyProcessFunction()).name("Ingest data");
…
I have created a package-private class with MyProcess
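Timo's suggestion works because Flink serializes the function instance itself (it must be Serializable) and ships it to the task managers, so any serializable field set in the constructor travels with it. A sketch with a hypothetical function class and a plain Java serialization round-trip standing in for Flink's deployment (no Flink imports):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical process-function-like class: configuration injected via the
// constructor survives serialization of the function instance.
class MyProcessFunction implements Serializable {
    private final int threshold; // constructor-injected configuration

    MyProcessFunction(int threshold) { this.threshold = threshold; }

    boolean accepts(long value) { return value >= threshold; }
}

class ShipDemo {
    // Round-trip through Java serialization, as Flink effectively does
    // when distributing the function to task managers.
    static MyProcessFunction ship(MyProcessFunction fn) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(fn);
        oos.flush();
        ObjectInputStream in =
            new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
        return (MyProcessFunction) in.readObject();
    }

    public static void main(String[] args) throws Exception {
        MyProcessFunction shipped = ship(new MyProcessFunction(42));
        System.out.println(shipped.accepts(100)); // true
        System.out.println(shipped.accepts(7));   // false
    }
}
```

Non-serializable resources (connections, clients) should instead be created in open(), typically held in transient fields.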
Thanks Chesnay,
So I think that to support a multi-input and multi-output model like the
Dataflow paper indicates, Flink needs to get credit-based scheduling as
well as side-input support, plus a new set of DataStream APIs that isn't
constrained by backwards-compatibility issues. Only then can
I've opened https://issues.apache.org/jira/browse/FLINK-8437
Unfortunately I doubt we can fix this properly. The proposed solution
will not work if we ever allow arbitrary functions to use side outputs.
On 16.01.2018 08:59, Juho Autio wrote:
Could someone with knowledge of the right terms create this in JIRA,
please? I guess I could also create it if needed.
On Mon, Jan 15, 2018 at 3:15 PM, Chesnay Schepler wrote:
> Yes, I meant that process() returns the special operator. This would
> definitely deserve a JIRA issue.
On 15.01.2018 14:09, Juho Autio wrote:
Thanks for the explanation. Did you mean that process() would return a
SingleOutputWithSideOutputOperator?
Anyway, that should be enough to avoid the problem that I hit (and it also
seems like the best & only way).
Maybe the name should be something more generic though, like
ProcessedSingleOutp
It would mean that process() would return a
SingleOutputWithSideOutputOperator which extends SingleOutputOperator,
offering getSideOutput(). Other transformations would still return a
SingleOutputOperator.
With this, the following code wouldn't compile:
stream
  .process(...)
  .filter(...)
> sideoutput might deserve a separate class which inherits from
> singleoutput. It might prevent a lot of confusion.
Thanks, but how could that be done? Do you mean that if one calls
.process(), then the stream would change to another class which would only
allow calls like .getMainOutput() or .getSideOutput()?
Hi Juho,
I think sideoutput might deserve a separate class which inherits from
singleoutput. It might prevent a lot of confusion. A more generic question
is whether the DataStream API can have multiple ins and multiple outs
natively. It's more like a scheduling problem when you come from a
single-process system
Maybe I could express it in a slightly different way: if adding the
.filter() after .process() causes the side output to be somehow totally
"lost", then I believe .getSideOutput() could be aware that there is no
such side output to be listened to from upstream, and throw an exception.
I mean,
Hi Juho,
> Now that I think of it, this seems like a bug to me: why does the call to
> getSideOutput succeed if it doesn't provide _any_ input?
With the way side outputs work, I don't think this is possible (or would
make sense). An operator does not know whether or not it will ever emit
some elements
When I run the code below (Flink 1.4.0 or 1.3.1), only "a" is printed. If
I switch the positions of .process() & .filter() (i.e., filter first, then
process), both "a" & "b" are printed, as expected.
I guess it's a bit hard to say what the side output should include in this
case: the stream before filtering
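The behavior discussed in this thread can be seen in a toy model of the API shape: the operator returned by process() carries side outputs, but a subsequent filter() returns a plain stream that knows nothing about them, so the side output becomes silently unreachable. These are illustrative classes only, not Flink's real DataStream types:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Plain stream: carries a main output only, no side outputs.
class ToyStream {
    final List<String> mainOutput = new ArrayList<>();

    // filter() returns a plain ToyStream — any side outputs the upstream
    // operator had do not survive, which is the "lost side output" effect.
    ToyStream filter(Predicate<String> p) {
        ToyStream out = new ToyStream();
        for (String s : mainOutput) {
            if (p.test(s)) out.mainOutput.add(s);
        }
        return out;
    }
}

// What process() returns in the proposed design: a stream subtype that
// additionally exposes its side output.
class ToySideOutputStream extends ToyStream {
    final List<String> sideOutput = new ArrayList<>();

    List<String> getSideOutput() { return sideOutput; }
}
```

Under the type-based proposal in the thread, calling getSideOutput() after filter() becomes a compile error rather than a silent empty stream, because filter() returns the plain type.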
Keep a lastModified variable as a ValueState, then compare it to the
timestamp provided by the timer to see if the current key should be
evicted.
Check out the example on the ProcessFunction page:
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/stream/process_function.html
Best regards,
Kien

On 9/5/2017 11:49 AM, Navneeth Krishnan wrote:
Hi All,
I have a streaming pipeline which is keyed by userid and then goes to a
flatmap function. I need to clear the state after some time, and I was
looking at the process function for it.
Inside the processElement function, if I register a timer, wouldn't it
create a timer for each incoming message?
Hi Navneeth,
Currently, I don't think there is any built-in functionality to trigger
onTimer periodically.
As for the second part of your question, do you mean that you want to
query which key the fired timer was registered for? I think this also
isn't possible right now.
I'm looping in Aljoscha
How are you determining that your data is stale? Also, if you want to know
the key, why don't you store the key in your state as well?
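Kien's lastModified pattern, modeled without Flink: record the last activity time per key, and when a timer fires, evict the key only if no newer event arrived after the timer was set. (On Navneeth's worry about one timer per message: Flink deduplicates timers per key and timestamp, so re-registering the same timestamp does not pile up timers.) A plain-Java sketch with an illustrative 60-second TTL:

```java
import java.util.HashMap;
import java.util.Map;

// Per-key state eviction: evict only keys that have been idle for TTL_MS.
class StateEvictor {
    static final long TTL_MS = 60_000L; // assumed idle timeout

    // Stand-in for per-key ValueState<Long> holding lastModified.
    private final Map<String, Long> lastModified = new HashMap<>();

    // processElement: record activity; conceptually also register a timer
    // for now + TTL_MS via ctx.timerService().
    void onEvent(String key, long now) {
        lastModified.put(key, now);
    }

    // onTimer: the timer was set at lastSeen + TTL_MS; evict only if the
    // key saw no newer event in the meantime, otherwise keep the state
    // and let the newer event's timer handle it.
    boolean onTimer(String key, long timerTimestamp) {
        Long seen = lastModified.get(key);
        if (seen != null && seen + TTL_MS <= timerTimestamp) {
            lastModified.remove(key); // state.clear() in Flink
            return true;              // evicted
        }
        return false; // a newer event arrived; keep the state
    }
}
```

This is exactly the shape of the ProcessFunction documentation example Kien links to: the stale check compares the stored lastModified against the timer's timestamp instead of trying to cancel timers.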