On Thu, Aug 20, 2020 at 12:54 PM Talat Uyarer
wrote:
> Hi Lucas,
>
>> Not really. It is more about pipeline complexity, logging, debugging,
>> monitoring which become more complex.
>
> Should I use a different consumer group or should I use the same consumer
> group ?
>
I don't know what you're a
Hi Lucas,
> Not really. It is more about pipeline complexity, logging, debugging,
> monitoring which become more complex.
Should I use a different consumer group or should I use the same consumer
group ? And also How Autoscaling will decide worker count ?
What do you mean by it's not working pr
Do you mean I can put my simple pipeline multiple times for all topics in
one dataflow job ?
Yes
Is there any side effect having multiple independent DAG on one DF job ?
Not really. It is more about pipeline complexity, logging, debugging,
monitoring which become more complex.
And also why the Tu
Filter step is an independent step. We can think it is an etl step or
something else. MessageExtractor step writes messages on TupleTags based on
the kafka header. Yes, MessageExtractor is literally a multi-output DoFn
already. MessageExtractor is processing 48kps but branches are processing
their
Hi Robert,
I calculated process speed based on worker count. When I have
separate jobs. topic1 job used 5 workers, topic2 job used 7 workers. Based
on KafkaIO message count. they had 4kps processing speed per worker. After
I combine them in one df job. That DF job started using ~18 workers, not 12
Is this 2kps coming out of Filter1 + 2kps coming out of Filter2 (which
would be 4kps total), or only 2kps coming out of KafkaIO and
MessageExtractor?
Though it /shouldn't/ matter, due to sibling fusion, there's a chance
things are getting fused poorly and you could write Filter1 and
Filter2 instea
Hi,
I have a very simple DAG on my dataflow job. (KafkaIO->Filter->WriteGCS).
When I submit this Dataflow job per topic it has 4kps per instance
processing speed. However I want to consume two different topics in one DF
job. I used TupleTag. I created TupleTags per message type. Each topic has
dif