filters/maps/process -> sink1 -> if OK sink2 -> if OK sink3
                             \-> if NOK sink4 \-> if NOK sink4

In order to achieve it, we would like to have some kind of side output coming
out from our sinks, but side outputs are not available in SinkFunctions. The
only way we have come up
have provided the type information.

The second constructor was introduced after the document and the first
constructor, and I think the document might have become outdated and no
longer match OutputTag's current behavior. A ticket and PR could be added
to fix the document. What do you think?

Best,
Yunfeng
On Fri, Sep 22, 2023 at 4:55 PM Alexis Sarda-Espinosa wrote:

Hello,

very quick question, the documentation for side outputs states that an
OutputTag "needs to be an anonymous inner class, so that we can analyze the
type" (this is written in a comment in the example). Is this really true?
I've seen many examples where it's a static element
source (right input):

[image: Screen Shot 2021-05-20 at 3.12.22 PM.png]

these join functions have a time window/duration or interval associated
with them to define the duration of join state and inference window. this
is set per operator to allow for in-order and out-of-order join thresholds
for id-based joins, and this window acts as the scope for inference when a
right event that is an inference candidate
I found the problem. I tried to assign timestamps to the operator (I don't
know why), and when I did that, because I used the Flink API fluently, I
was no longer referencing the operator that contained the side outputs.
Disregard my question.

On Sat, May 22, 2021 at 9:28 PM Marco Villa wrote:
I have been struggling for two days with an issue using the DataStream API
in Batch Execution mode.
It seems as though my side-output has no elements available to downstream
operators.
However, I am certain that the downstream operator received events.
I logged the side-output element just before
(missing foreign key id) is
about to be evicted from state.

problem:
i have things coded up with side outputs for duplicate, late and dropped
events. the dropped events case is the one i am focusing on since events
that go unmatched are dropped when they are evicted from state. only rhs
events are the o
Python API? This already works pretty neatly in the DataStream API but I
couldn't find any communication on adding this to PyFlink.

In the meantime, what do you suggest for a workaround on side outputs?
Intuitively, I would copy a stream and add a filter for each side output,
but this seems a bit inefficient. In that setup, each side output will need
to go over the complete stream. Any ideas?

Thanks in advance!
Regards,
Wouter
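Wouter's copy-and-filter workaround can be sketched outside Flink in plain Python (this is a language-neutral illustration of the dataflow, not the PyFlink API; the record shapes and predicates are invented for the example). It also shows why the setup is inefficient: every side output makes its own full pass over the stream.

```python
# Plain-Python sketch of "copy the stream, add a filter per side output".
# Each side output scans the complete stream independently, so N side
# outputs mean N full passes over the same records.

records = [
    {"kind": "ok", "value": 1},
    {"kind": "late", "value": 2},
    {"kind": "error", "value": 3},
    {"kind": "ok", "value": 4},
]

def side_output(stream, predicate):
    """Simulate one copied-and-filtered stream: a full pass per output."""
    return [r for r in stream if predicate(r)]

main_output = side_output(records, lambda r: r["kind"] == "ok")
late_output = side_output(records, lambda r: r["kind"] == "late")
error_output = side_output(records, lambda r: r["kind"] == "error")

print(len(main_output), len(late_output), len(error_output))  # 2 1 1
```

A true side output, by contrast, routes each record once during a single pass, which is exactly what the filter-based workaround cannot do.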
Hi Alexey,
side outputs should be counted in numRecordsOutPerSecond. However, there is
a bug where this does not happen for side outputs in the middle of the
chain [1].
[1] https://issues.apache.org/jira/browse/FLINK-18808
On Tue, Dec 22, 2020 at 1:14 AM Alexey Trenikhun wrote:
Hello,
Does the numRecordsOutPerSecond metric take into account the number of
records sent to the side output, or does it provide the rate only for the
main output?
Thanks,
Alexey
suffix streams, which are allocated on boot based on configuration.

At the moment I'm just duplicating the input records, and filtering out
the wrong records in each suffix stream, but it's not super efficient...
Unfortunately, from what I can see, using side outputs isn't an option
because each output tag has a single type parameter, and the output record
is dispatched based on its runtime type.

Is there a better way to do this?

Thanks!
-0xe1a
Hi Patrick,
at the moment it is not possible to disconnect side outputs from other
streaming operators. I guess what you would like to have is an operator
which consumes on a best-effort basis but which can also lose some data
while it is being restarted. This is currently not supported by Flink
Hi Till,
Thanks for your reply.
Is there any option to disconnect the side outputs from the pipelined data
exchanges of the main stream?
The benefit of side outputs is very high regarding performance and usability,
plus it fits the use case here very nicely. Though this pipelined connection
side outputs, I think they are connected via
pipelined data exchanges with the main stream and, hence, are part of the
same failover region as the main stream.
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/task_failure_recovery.html#restart-pipelined-region-failover-strategy
Cheers
Hi all,
We are trying to set up regions to enable Flink to only stop failing tasks
based on region instead of failing the entire stream.
We are using one main stream that is reading from a Kafka topic and a bunch
of side outputs for processing each event from that topic differently.
For the
No, I think David answered the specific question that I asked i.e. is it
okay (or not) for operators other than sinks and side outputs to do I/O.
Purging DLQ entries is something we'll need to be able to do anyway (for
some scenarios - aside from successful checkpoint retries) and I
specifi
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/stream/operators/asyncio.html

Best,
David

On Sun, Jul 26, 2020 at 11:08 AM Tom Fennelly wrote:

Hi.

What are the negative side effects of (for example) a filter function
occasionally making a call out to a DB? Is this a big no-no, and should all
outputs be done through sinks and side outputs, no exceptions?

Regards,
Tom.
Hi Ivneet,
Q1) you can read about the deprecation of split in FLINK-11084 [1]. In
general, side outputs subsume the functionality and allow some advanced
cases (like emitting the same record into two outputs).
Q2) It's simply a matter of API design. The basic idea is to keep most
interfac
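The advanced case mentioned above — emitting the same record into two outputs, which the deprecated split() could not do — can be illustrated with a small routing sketch (plain Python, not the Flink API; the tag names and routing rules are invented for the example):

```python
# A record may be routed to several output tags at once, something the
# deprecated split() API did not allow. Tags and rules are illustrative.
from collections import defaultdict

def route(record):
    """Return ALL tags a record belongs to (possibly more than one)."""
    tags = []
    if record % 2 == 0:
        tags.append("even")
    if record > 10:
        tags.append("large")
    return tags

outputs = defaultdict(list)
for rec in [3, 4, 12, 15]:
    for tag in route(rec):
        outputs[tag].append(rec)

print(dict(outputs))  # {'even': [4, 12], 'large': [12, 15]}
```

Note that 12 lands in both outputs, while 3 matches no tag and is silently dropped — both behaviors that split()'s single-selector model could not express.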
Hi folks,
I want to split my stream for some invalid message handling, and need help
understanding a few things.
Question 1: Why is the *split* operator deprecated?
Question 2: Why are side outputs only supported for ProcessFunction,
KeyedProcessFunction, etc.?
The doc on side outputs says: "*Yo
I don't think this is possible.
At the very least you should be able to work around this by having your
AsyncFunction return an Either, and having a subsequent
ProcessFunction do the side-output business.
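The Either workaround can be sketched outside Flink as follows (plain Python standing in for the AsyncFunction/ProcessFunction pair; the Left/Right class names and the REST stand-in are assumptions for illustration, not Flink's API):

```python
# Sketch of the workaround: the async stage returns an Either-like value,
# and a downstream "process" step sends successes to the main output and
# failures to a side output. All names here are illustrative.
from dataclasses import dataclass
from typing import Union

@dataclass
class Left:       # error case, destined for the side output
    error: str

@dataclass
class Right:      # success case, destined for the main output
    value: int

Either = Union[Left, Right]

def async_call(x: int) -> Either:
    """Stand-in for an AsyncFunction calling a REST service."""
    if x < 0:
        return Left(f"bad input: {x}")
    return Right(x * 2)

main_output, side_output = [], []
for item in [1, -2, 3]:
    result = async_call(item)
    if isinstance(result, Right):   # ProcessFunction: main output
        main_output.append(result.value)
    else:                           # ProcessFunction: side output
        side_output.append(result.error)

print(main_output, side_output)  # [2, 6] ['bad input: -2']
```

The design point is that the async stage stays a pure one-in/one-out function, and the branching into a side output is deferred to an operator that actually supports it.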
On 19/02/2020 22:25, KristoffSC wrote:
Hi,
any thoughts about this one?
Regards,
Krzysztof
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi all,
Is there a way to emit a side output from a RichAsyncFunction operator like
it is possible with ProcessFunctions via ctx.output(outputTag, value)? At
first glance I don't see a way to do it.
In my use case RichAsyncFunction is used to call REST services and I would
like to handle REST error c
Hi flink,
I would like to know if you think that providing access to the side output
from Async I/O could be a good idea.
Thanks in advance,
Romain
Hi Julio,
thanks for this great example. I could reproduce it on my machine and I
could find the problem.
You need to store the newly created branch of your pipeline in some
variable like `val test = pipeline.process()` in order to access the
side outputs via `test.getSideOutput
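The pitfall Timo describes (and that Marco hit earlier in this digest) is that getSideOutput must be called on the exact stream returned by the operator that emits the side output; each further fluent call returns a new stream that no longer carries it. A minimal plain-Python model of that behavior (the class and method names are invented, not the Flink API):

```python
# Model: each operator call returns a NEW stream object; side outputs
# live only on the stream returned by the operator that produced them.

class Stream:
    def __init__(self, side_outputs=None):
        self._side = side_outputs or {}

    def process(self):
        # This operator emits a side output under the tag "late".
        return Stream(side_outputs={"late": ["late-event"]})

    def map(self):
        # Further chaining returns a fresh stream WITHOUT side outputs.
        return Stream()

    def get_side_output(self, tag):
        return self._side.get(tag, [])

pipeline = Stream()

# Correct: keep a reference to the stream returned by the operator.
processed = pipeline.process()
assert processed.get_side_output("late") == ["late-event"]

# Bug: fluent chaining loses the reference to that operator.
chained = pipeline.process().map()
assert chained.get_side_output("late") == []
```

Storing the intermediate stream in a variable, as Timo suggests, is exactly what makes the side output reachable again.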
/SideoutputSample.scala
The thing to notice is that we do the split to side outputs _after_ the
window functions -- because we want to split the results just before the
sinks (we had a split there instead, but the job would, sometimes, crash
because "splits can't be used with side outputs", or somethi
Timo

On 02.04.18 at 21:53, Julio Biason wrote:
Hey guys,
I have a pipeline that generates two different types of data (but both use
the same trait) and I need to save each on a different sink.
So far, things were working with splits, but it seems using splits with
side outputs (for the late data, which we'll plug a late arrival han
Hey guys,
I got a weird problem with my pipeline.
The pipeline processes lines from our logs and generates different metrics
based on them (I mean, quite the standard procedure). It uses side outputs
for dead letter queues, in case it finds something wrong with the logs and
a metric can't be generated. At the end, b
Hi Flavio,
As far as I am aware no one is currently working on adding side outputs for
the DataSet API. The workaround is to output one common type from a function
and have several parallel filters after that for filtering out the elements
of the correct type for the respective stream.
Best,
Aljoscha
Hi to all,
will side outputs [FLINK-4460
<https://issues.apache.org/jira/browse/FLINK-4460>] eventually be available
also for the batch API?
Best,
Flavio
Thanks,
Chen
On Wed, Feb 8, 2017 at 9:04 AM, Newport, Billy wrote:

I've implemented side outputs right now using an enum approach as
recommended by others. Basically I have a mapper which wants to generate 4
outputs (DATA, INSERT, UPDATES, DELETE).
It emits a Tuple2 right now and I use 4 following filters to write each
'stream' to a differe
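Billy's enum approach can be sketched as follows (plain Python standing in for the Flink DataSet API; the tagging mapper and the four downstream filters mirror his description, while the record shapes are invented for illustration):

```python
# One mapper tags every output record with an enum; four downstream
# filters each keep one tag, giving four logical "streams" from a single
# function without side-output support in the DataSet API.
from enum import Enum

class Kind(Enum):
    DATA = 1
    INSERT = 2
    UPDATES = 3
    DELETE = 4

def mapper(record):
    """Stand-in for the mapper: emit (Kind, payload) pairs (a Tuple2)."""
    return (Kind[record["op"]], record["payload"])

tagged = [mapper(r) for r in [
    {"op": "DATA", "payload": "a"},
    {"op": "INSERT", "payload": "b"},
    {"op": "DATA", "payload": "c"},
    {"op": "DELETE", "payload": "d"},
]]

# Four following filters, one per enum value, as in the email.
streams = {k: [p for (kind, p) in tagged if kind is k] for k in Kind}

print(streams[Kind.DATA])    # ['a', 'c']
print(streams[Kind.DELETE])  # ['d']
```

The cost of this pattern is the same one discussed elsewhere in this digest: every filter re-scans the full tagged output, whereas true side outputs would route each record once.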