Connectors are written using ParDos. A connector (source) may use a source
framework (Splittable DoFn is currently the recommended one) or may be
written using regular ParDos. The main advantage of a source framework is
the set of features it provides (progress reporting, dynamic work
rebalancing, checkpointing, backlog reporting, etc.). If your HTTP endpoint
can be used to implement such features, it makes sense to use the source
framework. Otherwise I would simply use a regular ParDo.
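
For the regular-ParDo route, the per-element logic could look roughly like
the sketch below. It is only an illustration under assumptions: the class
name FetchFromHttp, the base_url layout, and the JSON-list response shape
are all made up, and the class is kept free of the Beam dependency so it
runs on its own. In a real pipeline it would subclass apache_beam.DoFn and
be applied with beam.ParDo.

```python
import json
import urllib.request


class FetchFromHttp:
    """Per-element HTTP fetch, shaped like a Beam DoFn (hypothetical name).

    In a pipeline this would subclass apache_beam.DoFn; setup() and
    process() mirror the DoFn lifecycle methods of the same names.
    """

    def __init__(self, base_url, opener=None):
        self.base_url = base_url  # assumed endpoint layout: <base_url>/<element>
        self._opener = opener     # injectable so the sketch can run offline

    def setup(self):
        # Called once per instance in Beam; a natural place to create clients.
        if self._opener is None:
            self._opener = lambda url: urllib.request.urlopen(url, timeout=10).read()

    def process(self, element):
        # One request per input element; emit each record of the response.
        # Assumes the endpoint returns a JSON list.
        body = self._opener(f"{self.base_url}/{element}")
        for record in json.loads(body):
            yield record
```

This gives none of the source-framework features listed above; each
element is simply fetched when the runner processes it.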

When it comes to sinks, the most important feature is writing data
idempotently, so that worker failures and the retries that follow do not
write duplicate data to the HTTP endpoint. I'm not sure this can be done
efficiently without knowing more details about the nature of the endpoint.
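
One possible approach, assuming the endpoint can deduplicate on a
client-supplied key (that is an assumption about the endpoint, not
something Beam provides), is to derive the key deterministically from the
record so that a retried write resends the same key. A minimal sketch,
with FakeEndpoint standing in for the real service and all names
hypothetical:

```python
import hashlib
import json


def idempotency_key(record):
    # The same record always yields the same key, so a retry after a
    # worker failure resends an identical key instead of a fresh one.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class FakeEndpoint:
    """In-memory stand-in for an endpoint that deduplicates by key."""

    def __init__(self):
        self.stored = {}

    def put(self, key, record):
        # First write wins; a replay with the same key is a no-op.
        self.stored.setdefault(key, record)


def write_all(endpoint, records):
    # Writing the same batch twice leaves the endpoint unchanged.
    for record in records:
        endpoint.put(idempotency_key(record), record)
```

Note that a content-derived key also collapses records that are
legitimately identical, so this only fits data where each record is
unique or carries its own ID.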

Thanks,
Cham

On Fri, Jun 23, 2023 at 4:48 PM Juan Romero <jsrf...@gmail.com> wrote:

> Hi guys. I would like to know whether it makes sense to create an HTTP
> connector in Apache Beam, or whether I can simply create a ParDo function
> that makes the HTTP request. I want to know which advantages I would have
> creating an IO HTTP connector.
>
