Hello, Thank you very much for the reply

I was thinking on branching because I have some heavy processes that I
would like to distribute to other workers, and scale independently of the
other less heavier processes

Does that make sense?

On Wed, Aug 9, 2023 at 12:16 PM John Casey via user <user@beam.apache.org>
wrote:

> Depending on the specifics of your processing, it may be simpler to just
> do both transforms within a single pardo.
>
> i.e.
>
> pipeline.apply(kafka.read())
> .apply(ParDo.of(new UserTransform());
>
> public static class UserTransform extends DoFn<KafkaRecord, Object>{
>
> @ProcessElement
> public void processElement(@Element KafkaRecord record,
> OutputReciever<Object> receiver) {
>       Type1 part1 = something(record);
>       Type2 part2 = somethingElse(record;
>       MergedType merged = merge(part1, part2);
>      receiver.output(merged)
> }
>
> }
>
>
>
> On Wed, Jul 26, 2023 at 11:43 PM Ruben Vargas <ruben.var...@metova.com>
> wrote:
>
>>
>> Hello?
>>
>> Any advice on how to do what I described? I can only found examples of
>> bounded data. Not for streaming.
>>
>>
>>
>> Aldo can I get invited to slack?
>>
>> Thank you very much!
>>
>> El El vie, 21 de julio de 2023 a la(s) 9:34, Ruben Vargas <
>> ruben.var...@metova.com> escribió:
>>
>>> Hello,
>>>
>>> I'm starting using Beam and I would like to know if there is any
>>> recommended pattern for doing the following:
>>>
>>> I have a message coming from Kafka and then I would like to apply two
>>> different transformations and merge them in a single result at the end. I
>>> attached an image that describes the pipeline.
>>>
>>> Each message has its own unique key,
>>>
>>> What I'm doing is using a Session Window with a trigger
>>> elementCountAtLeast with the number equal to the number of process I
>>> expected to generate results (in the case of the diagram will be 2)
>>>
>>> This is the code fragment I used for construct the window:
>>>
>>>         Window<KV<String, OUTPUT>> joinWindow = Window.<KV<String,
>>> OUTPUT>>into(Sessions.withGapDuration(Duration.standardSeconds(60))).triggering(
>>>
>>> Repeatedly.forever(AfterPane.elementCountAtLeast(nProcessWait))
>>>         ).discardingFiredPanes().withAllowedLateness(Duration.ZERO);
>>>
>>>
>>> and then a CoGroupKey to join all of the results. Is this a
>>> recommended approach? Or  is there a recommended way? What happens if at
>>> some points I have a lot of windows "open"?
>>>
>>>
>>> Thank you very much!
>>>
>>>

Reply via email to