Hi Apache Beam team. I have been working on a POC for the company I'm
working for, and I'm using the Apache Beam Kafka connector to read from one
Kafka topic and write into another Kafka topic. The source and target topics
have 3 partitions, and it is mandatory to keep ordering by certain message keys.
Regarding it I
thin a partition
> without any extra work by you.
>
> For 2, you can use .commitOffsetsInFinalize to only commit back to the
> source topic once the pipeline has persisted the message, at which point it
> may not be fully processed, but it is guaranteed that it will be processed.
>
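As a concrete illustration of the second point, a minimal Python sketch assuming the cross-language KafkaIO transform; broker, group and topic names are placeholders:

import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions

# Offsets are committed back to the consumer group only after the pipeline
# has durably persisted (finalized) the messages that were read.
with beam.Pipeline(options=PipelineOptions(streaming=True)) as p:
    _ = (
        p
        | ReadFromKafka(
            consumer_config={
                'bootstrap.servers': 'localhost:9092',  # placeholder broker
                'group.id': 'my-consumer-group',        # required for offset commits
            },
            topics=['source-topic'],                    # placeholder topic
            commit_offset_in_finalize=True)
        | beam.Map(print))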
Hi.
I have an issue when I try to run a Kafka IO pipeline in Python on my local
machine, because it is not possible to install Docker on that machine.
It seems that Beam tries to use Docker to pull and start the Beam Java SDK in
order to start the expansion service. I tried to manually start the
exp
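In case it helps the thread, this is the workaround I would try, assuming you can run a JVM locally: start the Java expansion service yourself (jar name, version and port below are only examples) and point ReadFromKafka at it, so the Python SDK does not try to launch it through Docker. A rough, untested sketch:

# Started separately, outside the pipeline, e.g.:
#   java -jar beam-sdks-java-io-expansion-service-2.XX.0.jar 8097
import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka

with beam.Pipeline() as p:
    _ = (
        p
        | ReadFromKafka(
            consumer_config={'bootstrap.servers': 'localhost:9092'},  # placeholder
            topics=['source-topic'],                                  # placeholder
            expansion_service='localhost:8097')  # the manually started service
        | beam.Map(print))

Note that, depending on the runner, executing the expanded Java transform may still require a Java SDK environment at run time, so this only removes Docker from the expansion step.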
Hi community.
On this occasion I have a question about how to read a stream from Kafka
and write batches of data with the JDBC connector. The idea is to overwrite
a specific row if the row we want to insert has the same id and a greater
load_date_time. The conceptual pipeline loo
Hi. Can someone help me with this?
On Wed, Apr 19, 2023 at 15:08, Juan Romero () wrote:
> Hi community.
>
> On this occasion I have a question about how to read a stream from Kafka
> and write batches of data with the JDBC connector. The idea is to overwrite
> a specific row
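For what it's worth, here is a rough sketch of one way this is often wired up; all names and the SQL below are my own assumptions, not a tested pipeline. The conditional overwrite is pushed into the statement with ON CONFLICT ... DO UPDATE ... WHERE, so Postgres only replaces a row when the incoming load_date_time is greater:

import typing
import apache_beam as beam
from apache_beam import coders
from apache_beam.io.jdbc import WriteToJdbc

# Hypothetical row type matching the target table; the timestamp travels as
# text and is cast inside the statement.
class TestRow(typing.NamedTuple):
    id: int
    name: str
    load_date_time: str

coders.registry.register_coder(TestRow, coders.RowCoder)

# Postgres keeps the existing row unless the incoming load_date_time is newer.
UPSERT_SQL = (
    "INSERT INTO test VALUES(?, ?, ?::timestamp) "
    "ON CONFLICT (id) DO UPDATE "
    "SET name = EXCLUDED.name, load_date_time = EXCLUDED.load_date_time "
    "WHERE EXCLUDED.load_date_time > test.load_date_time")

with beam.Pipeline() as p:
    _ = (
        p
        # In the real pipeline this would be the parsed Kafka stream,
        # e.g. ReadFromKafka(...) followed by a Map into TestRow.
        | beam.Create(
            [TestRow(1, 'a', '2023-04-19 15:08:00')]).with_output_types(TestRow)
        | WriteToJdbc(
            table_name='test',
            driver_class_name='org.postgresql.Driver',
            jdbc_url='jdbc:postgresql://localhost:5432/postgres',  # placeholder
            username='postgres',
            password='postgres',
            connection_properties='stringtype=unspecified',
            statement=UPSERT_SQL))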
Hi guys. I have a question about whether it makes sense to create an HTTP
connector in Apache Beam, or whether I can simply create a ParDo function
that makes the HTTP request. I want to know what advantages I would get
from creating an HTTP IO connector.
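For the simple case, the ParDo route is only a few lines. A sketch, assuming the requests library; the endpoint and payload handling are made up:

import apache_beam as beam
import requests

class CallHttpApi(beam.DoFn):
    """Calls an external HTTP endpoint for each element (illustrative only)."""

    def process(self, element):
        # Hypothetical endpoint; production code would add timeouts, retries,
        # batching and error handling, which is what a dedicated IO connector
        # would package up for you.
        response = requests.post('https://example.com/api', json=element, timeout=10)
        response.raise_for_status()
        yield response.json()

# Usage inside a pipeline:
#   enriched = records | beam.ParDo(CallHttpApi())

The main advantages of a proper IO connector would be reusable handling of batching, throttling, retries, authentication and splitting, which a plain ParDo leaves to each pipeline author.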
Splittable DoFn didn't exist at that time.
>
> So, if we want to move forward on HTTP/REST, we have to list the
> features and expected behavior.
>
> Regards
> JB
>
> On Sat, Jun 24, 2023 at 1:47 AM Juan Romero wrote:
> >
> > Hi guys. I have a doubt rela
Hello, I'm trying to identify the workaround for different error
scenarios when I'm reading from Kafka in Apache Beam on Google Dataflow.
1) From [1] it says "Dataflow also does have in-place pipeline update that
restores the persisted checkpoints from one pipeline to another" --> that
m
Hi guys. I want to ask how to deal with the scenario where the target sink
(e.g. JDBC, Kafka, BigQuery, Pub/Sub, etc.) fails for any reason, and I don't
want to lose the message or create a bottleneck with many errors due to a
hypothetical target sink problem, and I want to use
with_exception_h
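For the part of this that happens in your own transforms, a minimal sketch of the dead-letter pattern with with_exception_handling; the broker, topic and failing function are made up, and this does not cover errors raised inside a cross-language sink like WriteToJdbc:

import json
import apache_beam as beam
from apache_beam.io.kafka import WriteToKafka

def write_to_sink(element):
    # Placeholder for the call that can fail (JDBC insert, HTTP call, ...).
    if element.get('id') is None:
        raise ValueError('missing id')
    return element

with beam.Pipeline() as p:
    records = p | beam.Create([{'id': 1}, {'name': 'no id'}])

    # good: elements that processed fine; bad: failed inputs paired with
    # information about the exception that was raised.
    good, bad = records | beam.Map(write_to_sink).with_exception_handling()

    _ = (
        bad
        | beam.Map(lambda failed: (b'', json.dumps(
            {'element': repr(failed[0]), 'error': repr(failed[1])}).encode('utf-8')))
        | WriteToKafka(
            producer_config={'bootstrap.servers': 'localhost:9092'},  # placeholder
            topic='error-topic'))                                     # placeholder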
radshaw
> wrote:
>
>> Currently error handling is implemented on sinks in an ad-hoc basis
>> (if at all) but John (cc'd) is looking at improving things here.
>>
>> On Mon, Dec 4, 2023 at 10:25 AM Juan Romero wrote:
>> >
>> > Hi guys. I wa
o do something other than simply retrying
> failures
>
> On Wed, Dec 6, 2023 at 10:44 AM Juan Romero wrote:
>
>> But, is it not possible to get the message that can't reach the target
>> sink and put it in another target (e.g. a Kafka error topic where we can
>> ver
at 11:23, John Casey () wrote:
> For the moment, yes.
>
> On Wed, Dec 6, 2023 at 11:21 AM Juan Romero wrote:
>
>> Thanks John. Is it the same case if I want to write to a Postgres table
>> with the SQL connector?
>>
>> On Wed, Dec 6, 2023 at 11:05, John Cas
Hi guys. I'm looking for how to configure a custom query (the statement
parameter in the connector) just for updating a record. Can you help me with that?
password='postgres',
table_name= 'test',
connection_properties="stringtype=unspecified",
statement='INSERT INTO test \
VALUES(?,?,?) \
ON CONFLICT (id)\
DO U
Hello,
Currently I'm trying to make inserts into a Postgres table with this schema:
id: int,
message: jsonb,
source: str
The issue is that I can't find a way to specify the jsonb type in the
RowCoder, so I tried specifying dict and json in the NamedTuple, but it
doesn't work. How should I create the R
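For what it's worth, one workaround that I believe works (not necessarily the officially supported way): carry the jsonb column as str in the NamedTuple, since RowCoder has no jsonb field type, and let Postgres do the conversion, either via a cast in a custom statement or via the stringtype=unspecified connection property. A small sketch with made-up names:

import json
import typing
from apache_beam import coders

# The jsonb column travels as plain text; Postgres converts it on insert.
class EventRow(typing.NamedTuple):
    id: int
    message: str   # JSON serialized to a string, not dict/json
    source: str

coders.registry.register_coder(EventRow, coders.RowCoder)

row = EventRow(id=1, message=json.dumps({'a': 1}), source='kafka')

# Hypothetical statement casting the text parameter to jsonb:
STATEMENT = 'INSERT INTO my_table VALUES(?, ?::jsonb, ?)'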
Hi guys. I have a Postgres table, written to from Apache Beam, that has an
auto-increment id backed by a sequence:
CREATE SEQUENCE sap_tm_customer_id_seq;
CREATE TABLE IF NOT EXISTS test (
  id bigint DEFAULT nextval('sap_tm_customer_id_seq'::regclass),
  name VARCHAR(10),
  load_date_time TIMESTAMP
);
And I have the fol
Hi guys.
Currently I have the class SaptmCustomerUpsertPipeline, which inherits from
the GenericPipeline class. Locally the pipeline works fine, but when I try
to run it on Dataflow I get this error:
severity: "ERROR"
textPayload: "Error message from worker: generic::unknown: Traceback (most
recen
main-session
>
> Alternatively, structure your pipeline as a package:
> https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/#nonpython
>
>
> On Fri, Feb 23, 2024 at 6:14 PM Juan Romero wrote:
>
>> Hi guys.
>>
>> Currently I have the class Sap
VALUES(DEFAULT, ?,?::timestamp)"
>
> DEFAULT fills in the id using sap_tm_customer_id_seq.
>
> I hope this is what you are looking for.
>
>
> On Mon, Feb 19, 2024 at 5:57 PM Juan Romero wrote:
>
>> Hi guys. I have a Postgres table, written to from Apache Beam, that has an auto-increment id with
>>
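Putting the suggestion together, a small sketch of how that statement could be plugged into WriteToJdbc; it reuses names from the earlier mails and is untested:

import typing
from apache_beam import coders
from apache_beam.io.jdbc import WriteToJdbc

# Only the non-generated columns travel through the pipeline; the id is
# filled in by Postgres from sap_tm_customer_id_seq via DEFAULT.
class CustomerRow(typing.NamedTuple):
    name: str
    load_date_time: str

coders.registry.register_coder(CustomerRow, coders.RowCoder)

write = WriteToJdbc(
    table_name='test',
    driver_class_name='org.postgresql.Driver',
    jdbc_url='jdbc:postgresql://localhost:5432/postgres',  # placeholder
    username='postgres',
    password='postgres',
    connection_properties='stringtype=unspecified',
    statement='INSERT INTO test VALUES(DEFAULT, ?, ?::timestamp)')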
Hi guys, I have a question about how the Snowflake connector works under the
hood and how it inserts streaming data. Is it using Snowflake streaming? I'm
looking forward to your answers.
Hi guys. Good morning.
I have done some tests in Apache Beam on Dataflow to see whether I can do a
hot update or hot swap while the pipeline is processing a bunch of messages
that fall in a 10-minute time window. What I saw is that when I do a hot
update on the pipeline and cur
cessed by
> either one or the other.
>
> [1]
> https://cloud.google.com/dataflow/docs/guides/stopping-a-pipeline#drain
>
> On Mon, Apr 15, 2024 at 9:33 AM Juan Romero wrote:
>
>> Hi guys. Good morning.
>>
>> I have done some tests in Apache Beam on Data