How to set up staging, pre-prod, and production envs for Dataflow jobs?

2022-07-08 Thread Shivam Singhal
Hi Community, What flow do you follow for setting up staging, pre-prod, and prod environments for your Dataflow jobs? Stack Overflow question *here*

RedisIO Apache Beam JAVA Connector

2022-07-18 Thread Shivam Singhal
Hi everyone, I see that from version 2.20.0 onwards, the org.apache.beam.sdk.io.redis connector is marked experimental. I tried to see the changelog

Re: RedisIO Apache Beam JAVA Connector

2022-07-19 Thread Shivam Singhal
will :) Thanks, Shivam Singhal On Mon, 18 Jul 2022 at 22:22, Alexey Romanenko wrote: > Hi Shivam, > > RedisIO is already for quite a long time in Beam, so we may consider it’s > rather stable. I guess it was marked @Experimental since its user API was > changing at that momen

[JAVA] Handling repeated elements when merging two pcollections

2022-08-10 Thread Shivam Singhal
element from CollectionB. Does anyone know a simple method to do this? Thanks, Shivam Singhal

[JAVA] Batch elements from a PCollection

2022-08-10 Thread Shivam Singhal
t batches based on the keys, which is not my use case. It would be great if someone could help with this. Thanks, Shivam Singhal

Re: [JAVA] Batch elements from a PCollection

2022-08-10 Thread Shivam Singhal
Is there no other way than https://stackoverflow.com/a/44956702 ? On Thu, 11 Aug 2022 at 1:00 AM, Shivam Singhal wrote: > I have a PCollection of type KV where each key in those > KVs is unique. > > I would like to split all those KV pairs into batches. This new > PCollection
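For illustration only, the batching asked about above (splitting a set of uniquely keyed pairs into fixed-size groups) can be sketched in plain Java, outside of Beam. This is not the approach from the linked Stack Overflow answer and uses no Beam API; the class name `BatchSketch`, the method `toBatches`, and the batch size are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Beam: shows the intended batching semantics only.
class BatchSketch {
    /** Splits the input into consecutive batches of at most batchSize pairs. */
    static <K, V> List<List<Map.Entry<K, V>>> toBatches(
            List<Map.Entry<K, V>> pairs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<Map.Entry<K, V>>> batches = new ArrayList<>();
        for (int start = 0; start < pairs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, pairs.size());
            // Copy the sublist so each batch is independent of the input list.
            batches.add(new ArrayList<>(pairs.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("a", 1), Map.entry("b", 2), Map.entry("c", 3),
                Map.entry("d", 4), Map.entry("e", 5));
        // Five pairs with a batch size of 2 give batches of sizes 2, 2 and 1.
        System.out.println(toBatches(pairs, 2));
    }
}
```

In a distributed pipeline the elements are not available as one in-memory list, so this only shows the shape of the desired output, not how a runner would produce it.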

Re: [JAVA] Handling repeated elements when merging two pcollections

2022-08-10 Thread Shivam Singhal
en one value while keys > without duplicates will have iterables containing exactly one value. > > On Wed, Aug 10, 2022 at 12:25 PM Shivam Singhal < > shivamsinghal5...@gmail.com> wrote: > >> I have two PCollections, CollectionA & CollectionB of type KV> Byte[]&
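The grouping described in the quoted reply (keys with duplicates end up with more than one value, all other keys with exactly one) can be sketched in plain Java. This is a minimal illustration of the semantics, not Beam code; `MergeSketch` and `groupByKey` are hypothetical names, and how to pick a winner among duplicate values is left to the caller.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Beam: mimics "merge both collections, then group by key".
class MergeSketch {
    /** Groups the values of both collections by key, collection A's values first. */
    static <K, V> Map<K, List<V>> groupByKey(
            List<Map.Entry<K, V>> collectionA, List<Map.Entry<K, V>> collectionB) {
        Map<K, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : collectionA) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        for (Map.Entry<K, V> e : collectionB) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> a = List.of(Map.entry("k1", "a1"), Map.entry("k2", "a2"));
        List<Map.Entry<String, String>> b = List.of(Map.entry("k1", "b1"), Map.entry("k3", "b3"));
        // "k1" appears in both inputs, so its list holds two values; "k2" and "k3" hold one each.
        System.out.println(groupByKey(a, b));
    }
}
```

Once the values are grouped, a key whose list has more than one entry is a duplicate, and a per-key rule can decide which value to keep.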

Automating the e2e testing of flows involving batch beam pipelines

2022-10-13 Thread Shivam Singhal
Hey folks, I have a backend side flow which involves running a batch beam pipeline. We have an automation test which: 1. Writes some mock data to BQ 2. Invokes Dataflow API to run a batch job which reads from BQ and writes the results to BigTable 3. Asserts on the results written by

Memory Leak in streaming pipelines

2022-10-14 Thread Shivam Singhal
that there have been some memory leak fixes in version 28.1 <https://github.com/google/guava/releases/tag/v28.1> and version 30.0 <https://github.com/google/guava/releases/tag/v30.0> Is it possible that Beam is affected by this and needs to be upgraded to use a newer version of

Re: Go + Apache Beam GCP Dataflow: Could not find the sink for pubsub, Check that the sink library specifies alwayslink = 1

2023-02-06 Thread Shivam Singhal
Hey Ashok KS, Is this a batch pipeline? On Mon, 6 Feb 2023 at 09:27, Ashok KS wrote: > Hi All, > > Just sending a reminder in case anyone could help. I haven't received any > response to my issue. > > Regards, > Ashok > > On Fri, Feb 3, 2023 at 12:23 AM Ashok KS wrote: > >> Hi All, >> >> I'm n

Re: Go + Apache Beam GCP Dataflow: Could not find the sink for pubsub, Check that the sink library specifies alwayslink = 1

2023-02-06 Thread Shivam Singhal
> > Regards, > Ashok > > On Mon, 6 Feb 2023 at 10:52 pm, Shivam Singhal < > shivamsinghal5...@gmail.com> wrote: > >> Hey Ashok KS, >> >> Is this a batch pipeline? >> >> On Mon, 6 Feb 2023 at 09:27, Ashok KS wrote: >> >>> Hi All,

Re: Go + Apache Beam GCP Dataflow: Could not find the sink for pubsub, Check that the sink library specifies alwayslink = 1

2023-02-06 Thread Shivam Singhal
in Go. > > Any pointers would be appreciated. > > Regards, > Ashok > > On Mon, 6 Feb 2023 at 10:59 pm, Shivam Singhal < > shivamsinghal5...@gmail.com> wrote: > >> The issue is not yet verified by the maintainers but I think the pubsubio >> connector

Re: Go + Apache Beam GCP Dataflow: Could not find the sink for pubsub, Check that the sink library specifies alwayslink = 1

2023-02-06 Thread Shivam Singhal
le to force it as a streaming application by passing --streaming=True > But for my project, they want it in Go so I had to rewrite the complete > logic in Go. > I did the same, but stuck at the last point of publishing it to PubSub. > > On Mon, Feb 6, 2023 at 11:07 PM Shivam Singhal

Re: Go + Apache Beam GCP Dataflow: Could not find the sink for pubsub, Check that the sink library specifies alwayslink = 1

2023-02-06 Thread Shivam Singhal
I will be picking the issue up once the maintainers have triaged the issue. On Mon, 6 Feb 2023 at 17:43, Shivam Singhal wrote: > Not sure if there is any solution other than fixing the Go pubsubio > package. > > On Mon, 6 Feb 2023 at 17:41, Ashok KS wrote: > >> Yes, that

Re: Launch Dataflow Flex Templates from Go

2023-02-14 Thread Shivam Singhal
Hey Ashok, If you already have a flex template file and the docker image built, you can use the Dataflow API to run the template. https://cloud.google.com/dataflow/docs/reference/rest On Wed, 15 Feb 2023 at 04:49, Ashok KS wrote: > Hello Beam Community, > > I have written a Dataflow pipeline

Re: Launch Dataflow Flex Templates from Go

2023-02-14 Thread Shivam Singhal
There shouldn’t be much change in the API request irrespective of the SDK language On Wed, 15 Feb 2023 at 10:50, Shivam Singhal wrote: > Hey Ashok, > > If you already have a flex template file and the docker image built, you > can use the Dataflow API to run the templat
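As a rough sketch of what such an API request could look like, the Java snippet below builds (but does not send) a call to the Dataflow REST API's `projects.locations.flexTemplates.launch` method. The class and method names, the project, region, job name, bucket path, and the way the access token is obtained are all assumptions for illustration; check the REST reference linked in the thread for the full request schema. The same URL and JSON body apply regardless of the language the pipeline or the caller is written in.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds the HTTP request only; sending it and auth are out of scope.
class FlexLaunchSketch {
    /** URL of the flexTemplates.launch method for a given project and region. */
    static String launchUrl(String project, String region) {
        return "https://dataflow.googleapis.com/v1b3/projects/" + project
                + "/locations/" + region + "/flexTemplates:launch";
    }

    /** Minimal JSON body. Assumes the values contain no characters that need JSON escaping. */
    static String launchBody(String jobName, String containerSpecGcsPath) {
        return "{\"launchParameter\":{"
                + "\"jobName\":\"" + jobName + "\","
                + "\"containerSpecGcsPath\":\"" + containerSpecGcsPath + "\"}}";
    }

    /** Assembles the POST request; accessToken is an OAuth 2.0 token obtained elsewhere. */
    static HttpRequest buildRequest(String project, String region, String jobName,
                                    String containerSpecGcsPath, String accessToken) {
        return HttpRequest.newBuilder(URI.create(launchUrl(project, region)))
                .header("Authorization", "Bearer " + accessToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(launchBody(jobName, containerSpecGcsPath)))
                .build();
    }

    public static void main(String[] args) {
        // All values below are placeholders.
        HttpRequest request = buildRequest("my-project", "us-central1", "my-job",
                "gs://my-bucket/templates/spec.json", "PLACEHOLDER_TOKEN");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(launchBody("my-job", "gs://my-bucket/templates/spec.json"));
    }
}
```

Pipeline options would go in a `parameters` map inside `launchParameter`; they are omitted here to keep the sketch short.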

How to handle errors in GO SDK in a custom PTransform

2023-04-05 Thread Shivam Singhal
Hi folks, In the Java SDK, we have robust error handling using tagged outputs and PCollectionTuples. Do we have something similar in the Go SDK? I have been unable to locate it in the reference docs. *A general usecase for error handling wh

Re: How to handle errors in GO SDK in a custom PTransform

2023-04-06 Thread Shivam Singhal
) > > See https://beam.apache.org/documentation/programming-guide/#output-tags > for more details. > > Thanks, > Danny > > On Wed, Apr 5, 2023 at 7:31 AM Shivam Singhal > wrote: > >> Hi folks, >> >> In Java SDK, we have robust error handling using tagged out