Doubts about kafka connector

2023-04-13 Thread Juan Romero
Hi Apache Beam team. I have been working on a POC for the company I'm working for, and I'm using the Apache Beam Kafka connector to read from a Kafka topic and write into another Kafka topic. The source and target topics have 3 partitions, and it is compulsory to keep ordering by certain message keys. Regarding it I

Re: Doubts about kafka connector

2023-04-13 Thread Juan Romero
> thin a partition without any extra work by you. > For 2, you can use .commitOffsetsInFinalize to only commit back to the source topic once the pipeline has persisted the message, at which point it may not be fully processed, but it is guaranteed that it will be processed.
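For reference, the setting mentioned in the reply looks roughly like this in the Beam Python SDK. This is a sketch only: the broker address, group id, and topic name are placeholders, and a running Kafka cluster plus the Java expansion service are assumed, so it is shown as a configuration fragment rather than a runnable pipeline.

```python
# Sketch only: assumes a reachable Kafka cluster and the Kafka IO
# expansion service; broker, group id, and topic are placeholders.
import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka

with beam.Pipeline() as p:
    messages = p | ReadFromKafka(
        consumer_config={
            "bootstrap.servers": "localhost:9092",
            "group.id": "my-group",
        },
        topics=["source-topic"],
        # Commit offsets back to Kafka only after Beam has durably
        # checkpointed the elements: persisted, though possibly not yet
        # fully processed downstream, as the reply explains.
        commit_offset_in_finalize=True,
    )
```

Note that `commit_offset_in_finalize=True` typically requires a consumer `group.id` and Kafka auto-commit left disabled, so the offsets are only ever advanced by the finalize step.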

Avoid using docker when I use a external transformation

2023-04-18 Thread Juan Romero
Hi. I have an issue when I try to run a Kafka IO pipeline in Python on my local machine, because it is not possible to install Docker there. It seems that Beam tries to use Docker to pull and start the Beam Java SDK in order to start the expansion service. I tried to start manually the exp
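The workaround being attempted here can be sketched as follows: start the Java expansion service yourself and point the transform at it, so the Python SDK never needs Docker. This is an assumption-laden configuration fragment, not a runnable pipeline; the jar name, port, and topic are placeholders.

```python
# Sketch: assumes the Kafka IO expansion service was started manually, e.g.
#   java -jar beam-sdks-java-io-expansion-service-<version>.jar 8097
# so that the Python SDK does not need Docker to launch it.
from apache_beam.io.kafka import ReadFromKafka

read = ReadFromKafka(
    consumer_config={"bootstrap.servers": "localhost:9092"},
    topics=["source-topic"],
    # Address of the manually started expansion service, instead of the
    # default Docker-based one.
    expansion_service="localhost:8097",
)
```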

Can I batch data when I use the JDBC write operation?

2023-04-19 Thread Juan Romero
Hi community. On this occasion I have a doubt regarding how to read a stream from Kafka and write batches of data with the JDBC connector. The idea is to overwrite a specific row if the row we want to insert has the same id and a greater load_date_time. The conceptual pipeline loo

Re: Can I batch data when I use the JDBC write operation?

2023-04-20 Thread Juan Romero
Hi. Can someone help me with this? On Wed, 19 Apr 2023 at 15:08, Juan Romero () wrote: > Hi community. > On this occasion I have a doubt regarding how to read a stream from kafka and write batches of data with the jdbc connector. The idea is to override a specific row
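The conditional overwrite this thread describes maps naturally onto an upsert statement. Below is an illustrative sketch using SQLite's `ON CONFLICT` clause as a stand-in for Postgres (the syntax is analogous); in Beam such a statement would be handed to the JDBC connector's `statement` parameter, but here it is run directly to show the row-versioning behaviour. Table and column names are taken from the thread.

```python
# Upsert that only overwrites a row when the incoming load_date_time is
# newer. SQLite stands in for Postgres; the ON CONFLICT syntax is analogous.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER PRIMARY KEY, payload TEXT, load_date_time TEXT)"
)

UPSERT = """
    INSERT INTO test (id, payload, load_date_time) VALUES (?, ?, ?)
    ON CONFLICT (id) DO UPDATE
    SET payload = excluded.payload, load_date_time = excluded.load_date_time
    WHERE excluded.load_date_time > load_date_time
"""

conn.execute(UPSERT, (1, "old", "2023-04-19T10:00:00"))
conn.execute(UPSERT, (1, "new", "2023-04-19T12:00:00"))    # newer: overwrites
conn.execute(UPSERT, (1, "stale", "2023-04-19T09:00:00"))  # older: ignored

row = conn.execute("SELECT payload, load_date_time FROM test WHERE id = 1").fetchone()
print(row)  # ('new', '2023-04-19T12:00:00')
```

Using ISO-8601 timestamps keeps the lexicographic comparison in the `WHERE` clause consistent with chronological order.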

Create IO connector for HTTP or ParDo

2023-06-23 Thread Juan Romero
Hi guys. I have a doubt about whether it makes sense to create an HTTP connector in Apache Beam, or whether I can simply create a ParDo function that makes the HTTP request. I want to know what advantages I would have in creating an HTTP IO connector.

Re: Create IO connector for HTTP or ParDo

2023-06-26 Thread Juan Romero
Splittable DoFn didn't exist at that time. So, if we want to move forward on HTTP/REST, we have to list the features and expected behavior. Regards, JB. On Sat, Jun 24, 2023 at 1:47 AM Juan Romero wrote: > Hi guys. I have a doubt rela
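The "just use a ParDo" option from the question can be sketched in plain Python. This shows only the per-element logic of a hypothetical HTTP DoFn, with the HTTP client injected as a callable so the sketch stays runnable without a network; in a real pipeline this body would live in a `beam.DoFn.process()` applied with `beam.ParDo`, and the URL is a made-up placeholder.

```python
# Per-element logic of a hypothetical HTTP ParDo; the client is injected
# so the sketch runs without a network or a Beam runner.
from typing import Callable, Iterable


def call_endpoint(element: dict, fetch: Callable[[str, dict], dict]) -> Iterable[dict]:
    """Send one element to an HTTP endpoint and emit the enriched result."""
    response = fetch("https://api.example.com/enrich", element)  # placeholder URL
    yield {**element, "status": response["status"]}


# Stub standing in for something like requests.post(...).json().
def fake_fetch(url: str, payload: dict) -> dict:
    return {"status": "ok"}


out = list(call_endpoint({"id": 1}, fake_fetch))
print(out)  # [{'id': 1, 'status': 'ok'}]
```

For simple request/response calls a ParDo like this is usually enough; a dedicated IO connector mainly buys you things like built-in batching, retries, throttling, and splitting, which is the feature list JB's reply asks to pin down.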

Disaster recovery in dataflow with kafka connector

2023-07-14 Thread Juan Romero
Hello, I'm trying to identify the workaround for different error scenarios when I'm reading from Kafka in Apache Beam on Google Dataflow. 1) [1] says "Dataflow also does have in-place pipeline update that restores the persisted checkpoints from one pipeline to another" --> that m
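The in-place pipeline update quoted here is triggered by relaunching the pipeline with the `--update` flag and the same job name. A minimal sketch of the options, shown as a configuration fragment with placeholder project, region, and job name:

```python
# Sketch only: launching a replacement Dataflow job with --update restores
# persisted state/checkpoints from the running job with the same job_name.
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-project",
    "--region=us-central1",
    "--job_name=kafka-to-kafka",
    "--update",  # in-place update: the new job takes over the old job's state
])
```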

Streaming management exception in the sink target.

2023-12-04 Thread Juan Romero
Hi guys. I want to ask you how to deal with the scenario where the target sink (e.g. JDBC, Kafka, BigQuery, Pub/Sub, etc.) fails for any reason, and I don't want to lose the message or create a bottleneck with many errors due to a hypothetical target sink problem, and I want to use with_exception_h

Re: Streaming management exception in the sink target.

2023-12-06 Thread Juan Romero
radshaw wrote: > Currently error handling is implemented on sinks in an ad-hoc basis (if at all) but John (cc'd) is looking at improving things here. On Mon, Dec 4, 2023 at 10:25 AM Juan Romero wrote: >> Hi guys. I wa

Re: Streaming management exception in the sink target.

2023-12-06 Thread Juan Romero
o do something other than simply retrying failures. On Wed, Dec 6, 2023 at 10:44 AM Juan Romero wrote: > But, is it not possible to get the message that can't reach the target sink and put it in another target (e.g. a Kafka error topic where we can ver

Re: Streaming management exception in the sink target.

2023-12-06 Thread Juan Romero
at 11:23, John Casey () wrote: > For the moment, yes. On Wed, Dec 6, 2023 at 11:21 AM Juan Romero wrote: >> Thanks John. Is it the same case if I want to write to a Postgres table with the SQL connector? >> On Wed, 6 Dec 2023 at 11:05, John Cas
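The "error topic" idea discussed in this thread is the dead-letter pattern. A minimal, framework-free sketch: attempt the sink write per element and route failures, with their error, to a separate output instead of retrying forever. In Beam this is typically done with a DoFn emitting to a `TaggedOutput` (or a sink's error-handling hook, where one exists); the flaky sink below is a stand-in.

```python
# Framework-free sketch of the dead-letter pattern: failed writes are
# captured with their error instead of blocking the pipeline.
def route(elements, write_fn):
    ok, dead_letter = [], []
    for element in elements:
        try:
            write_fn(element)  # the real sink write (JDBC, Kafka, ...)
            ok.append(element)
        except Exception as err:
            # Keep the payload plus the failure reason for the error topic.
            dead_letter.append({"element": element, "error": str(err)})
    return ok, dead_letter


def flaky_sink(element):
    """Stand-in for a sink that rejects some elements."""
    if element == "bad":
        raise ValueError("sink rejected element")


ok, dlq = route(["good", "bad"], flaky_sink)
print(ok)   # ['good']
print(dlq)  # [{'element': 'bad', 'error': 'sink rejected element'}]
```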

Update statement with jdbc connector

2024-01-17 Thread Juan Romero
Hi guys. I'm looking for how to configure a custom query (the statement parameter in the connector) only to update a record. Can you help me with that?

Re: Update statement with jdbc connector

2024-01-17 Thread Juan Romero
password='postgres', table_name='test', connection_properties="stringtype=unspecified", statement='INSERT INTO test \ VALUES(?,?,?) \ ON CONFLICT (id) \ DO U

How to specify the jsonb type on RowCoder for the jdbc connector

2024-02-06 Thread Juan Romero
Hello. Currently I'm trying to make inserts into a Postgres table with this schema: id: int, message: jsonb, source: str. The issue is that I can't find a way to specify the jsonb type on the RowCoder, so I tried specifying dict and json on the NamedTuple but it doesn't work. How should I create the R
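One common workaround, sketched below under the assumption that the driver coerces strings into the jsonb column (the `stringtype=unspecified` connection property seen elsewhere in these threads helps with that in Postgres): declare the column as `str` in the NamedTuple that backs the RowCoder, and serialize the dict to a JSON string yourself before the write. The row type and field names mirror the schema in the question.

```python
# Workaround sketch: RowCoder has no jsonb field type, so carry the value
# as a JSON string and let the database cast it into the jsonb column.
import json
from typing import NamedTuple


class TestRow(NamedTuple):
    id: int
    message: str  # JSON text destined for the jsonb column
    source: str


payload = {"event": "created", "attempt": 2}
row = TestRow(id=1, message=json.dumps(payload), source="poc")

# The string round-trips back to the original dict.
print(json.loads(row.message) == payload)  # True
```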

Problem in jdbc connector with autoincrement value

2024-02-19 Thread Juan Romero
Hi guys. I have a table (populated from Apache Beam) that has an auto-increment id with a sequence: CREATE SEQUENCE sap_tm_customer_id_seq; CREATE TABLE IF NOT EXISTS test ( id bigint DEFAULT nextval('sap_tm_customer_id_seq'::regclass), name VARCHAR(10), load_date_time TIMESTAMP ); And I have the fol

Problem with pickler

2024-02-23 Thread Juan Romero
Hi guys. Currently I have the class SaptmCustomerUpsertPipeline, which inherits from the GenericPipeline class. Locally the pipeline works fine, but when I try to run it in Dataflow I get this error: severity: "ERROR" textPayload: "Error message from worker: generic::unknown: Traceback (most recen

Re: Problem with pickler

2024-02-24 Thread Juan Romero
main-session. Alternatively, structure your pipeline as a package: https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/#nonpython On Fri, Feb 23, 2024 at 6:14 PM Juan Romero wrote: > Hi guys. Currently i have the class Sap
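The first fix the reply points at (the truncated "...main-session" link) is pickling the main session so classes defined in the main module travel to the Dataflow workers. A minimal configuration sketch:

```python
# Sketch only: the usual first fix for "can't pickle ..." errors on Dataflow
# when classes are defined in the main module.
from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions

options = PipelineOptions()
options.view_as(SetupOptions).save_main_session = True
```

The alternative in the reply, packaging the pipeline, avoids the main-session pickling problem entirely and is generally the more robust option for class hierarchies like GenericPipeline.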

Re: Problem in jdbc connector with autoincrement value

2024-02-24 Thread Juan Romero
VALUES(DEFAULT, ?, ?::timestamp)" DEFAULT fills in the id using sap_tm_customer_id_seq. I hope this is what you are looking for. On Mon, Feb 19, 2024 at 5:57 PM Juan Romero wrote: >> Hi guys. I have a table in apache beam that has an auto increment id with
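The fix in this reply can be demonstrated with SQLite standing in for Postgres: let the auto-increment mechanism fill in the id (Postgres via `DEFAULT` and the sequence, SQLite via its rowid) and bind only the remaining columns. The schema mirrors the thread; this is an illustration, not the Beam JDBC call itself.

```python
# Illustrative: omit the id so the auto-increment fills it in, binding only
# the remaining columns. SQLite's AUTOINCREMENT stands in for Postgres's
# nextval('sap_tm_customer_id_seq').
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE test (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name VARCHAR(10),
        load_date_time TIMESTAMP
    )
""")

# Equivalent in spirit to: INSERT INTO test VALUES(DEFAULT, ?, ?::timestamp)
stmt = "INSERT INTO test (name, load_date_time) VALUES (?, ?)"
conn.execute(stmt, ("alice", "2024-02-19 10:00:00"))
conn.execute(stmt, ("bob", "2024-02-19 11:00:00"))

rows = conn.execute("SELECT id, name FROM test ORDER BY id").fetchall()
print(rows)  # [(1, 'alice'), (2, 'bob')]
```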

How the snowflake connector works internally

2024-03-11 Thread Juan Romero
Hi guys, I have a question about how the Snowflake connector works under the hood and how it inserts streaming data. Is it using Snowflake streaming? I'm looking forward to your answers.

Hot update in dataflow without losing messages

2024-04-15 Thread Juan Romero
Hi guys, good morning. I have done some tests in Apache Beam on Dataflow to see if I can do a hot update or hot swap while the pipeline is processing a bunch of messages that fall in a 10-minute time window. What I saw is that when I do a hot update over the pipeline and cur

Re: Hot update in dataflow without losing messages

2024-04-15 Thread Juan Romero
cessed by either one or the other. [1] https://cloud.google.com/dataflow/docs/guides/stopping-a-pipeline#drain On Mon, Apr 15, 2024 at 9:33 AM Juan Romero wrote: >> Hi guys. Good morning. >> I haven't done some test in apache beam over data