Hello,
I have a use case where I read Avro files from Google Cloud Storage through
a bounded source. I am also using a group-by transform. The amount of
data is huge, and I need to process the data in sequential order.
But as Bounded source reads everything it seemed to be a good idea fixed
pear in the source files?
> Do you have a lot of keys or very few?
>
> If you want to process all the data across all the files in sequential
> order with no parallelism then Apache Beam may not provide much value since
> its basis is all about parallel data processing.
>
>
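The question about keys matters because ordering is often only needed within each key, which still leaves room for parallelism across keys. Below is a minimal plain-Java sketch of that idea, not Beam code: it groups records by key and then sorts each group by timestamp. Beam's GroupByKey does not guarantee the order of values within a group, so a per-group sort like this would have to run after the grouping step. The `Event` type, its fields, and `groupAndSort` are hypothetical names used only for illustration, and the sketch assumes each key's values fit in memory.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PerKeyOrdering {

    /** Hypothetical record type: a key, an event timestamp, and a payload. */
    static final class Event {
        final String key;
        final long timestampMillis;
        final String payload;

        Event(String key, long timestampMillis, String payload) {
            this.key = key;
            this.timestampMillis = timestampMillis;
            this.payload = payload;
        }
    }

    /** Groups events by key, then sorts each group by ascending timestamp. */
    static Map<String, List<Event>> groupAndSort(List<Event> events) {
        Map<String, List<Event>> grouped = new TreeMap<>();
        for (Event e : events) {
            grouped.computeIfAbsent(e.key, k -> new ArrayList<>()).add(e);
        }
        // Order is only established inside each group, never across groups.
        for (List<Event> group : grouped.values()) {
            group.sort(Comparator.comparingLong((Event e) -> e.timestampMillis));
        }
        return grouped;
    }
}
```

With many keys, each group stays small and the groups can be handled independently. With very few keys, each group approaches the size of the whole data set and little parallelism remains.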
timestamp ->
>>> Processing
>>>
>>>
>>> Regards,
>>> Neha
>>>
>>>
>>> On Thu, Apr 16, 2020, 6:57 PM Luke Cwik wrote:
>>>
>>>> What do you mean by in sequential order, order across files, keys, ...?
>>>
Hi Team,
We have a Beam pipeline written with the Java SDK. Since the Apache kafka-io
does not support topic regexes, we have forked kafka-io and added regex
support.
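For readers following along, regex topic selection amounts to resolving a pattern against the known topic names before reading. The plain-Java sketch below illustrates only that matching step. It is not the code from the fork, and `TopicRegexFilter`, `matchingTopics`, and the topic names in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class TopicRegexFilter {

    /** Returns the topics whose full name matches the given regex, in input order. */
    static List<String> matchingTopics(List<String> allTopics, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matched = new ArrayList<>();
        for (String topic : allTopics) {
            // matches() requires the whole topic name to match, not just a substring.
            if (pattern.matcher(topic).matches()) {
                matched.add(topic);
            }
        }
        return matched;
    }
}
```

For example, calling `matchingTopics` with the topics `orders-eu`, `orders-us`, and `payments` and the pattern `orders-.*` selects the two `orders-` topics and leaves out `payments`.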
However, when we run the pipeline using our custom kafka-io, we do not
see the metrics exposed by kafka-io, such as element