Request to join slack channel

2024-02-20 Thread Lydian Lee
Hi, can I get an invitation to join the Slack channel? The ASF Slack seems to require an invitation to join. Thanks

how to enable debugging mode for python worker harness

2024-03-17 Thread Lydian Lee
Hi, I have an issue when setting up a POC of Python SDK with Flink runner to run in docker-compose. The python worker harness was not returning any error but: ``` python-worker-harness-1 | 2024/03/17 07:10:17 Executing: python -m apache_beam.runners.worker.sdk_worker_main python-worker-harness-

Re: how to enable debugging mode for python worker harness

2024-03-17 Thread Lydian Lee
the flink task manager. Otherwise, it > won't be able to launch a Docker container for reading kafka messages. > > Cheers, > Jaehyeon > > > On Sun, 17 Mar 2024 at 18:21, Lydian Lee wrote: > >> Hi, >> >> I have an issue when setting up a POC of Pytho

Re: how to enable debugging mode for python worker harness

2024-03-17 Thread Lydian Lee
also runs well, > > Maybe this is related to your Mac setting? > > On Sun, Mar 17, 2024 at 8:34 AM Lydian Lee > wrote: > >> >> Hi, >> >> Just FYI, the similar things works on a different image with the one I >> built using my company’s image as base im

Creating custom metrics with labels (python SDK + Flink Runner)

2024-03-25 Thread Lydian Lee
Hi, I am using the Beam Python SDK with the Flink runner, and I am trying to add custom labels to the metrics. It seems like the provided function (link) doesn't allow me to add labels. ``` @staticmethod def co

Re: how to enable debugging mode for python worker harness

2024-03-31 Thread Lydian Lee
one. Thanks! On Mon, Mar 18, 2024 at 6:24 PM XQ Hu via user wrote: > I did not do anything special but ran `docker-compose -f > docker-compose.yaml up` from your repo. > > On Sun, Mar 17, 2024 at 11:38 PM Lydian Lee > wrote: > >> Hi XQ, >> >> The code is sim

Beam dropping events from Kafka after reshuffle ?

2024-09-17 Thread Lydian Lee
Hi, We are using the Beam Python SDK with the Flink runner; the Beam version is 2.41.0 and the Flink version is 1.15.4. We have a pipeline that has 2 stages: 1. read from Kafka and apply a fixed window of 1 minute 2. aggregate the data for the past 1 minute and reshuffle so that we have less partition cou
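For readers unfamiliar with the windowing step mentioned in this message, here is a minimal sketch in plain Python of what "fixed window of 1 minute, then aggregate" means. It is an illustration only, not Beam code and not the poster's pipeline: the names `window_start` and `aggregate_per_window`, the `(timestamp, key)` event shape, and the choice of counting as the aggregation are all assumptions made for the example.

```python
from collections import defaultdict

WINDOW_SIZE_SECONDS = 60  # the 1-minute fixed window from the message


def window_start(event_time: float, size: int = WINDOW_SIZE_SECONDS) -> int:
    """Return the start (in seconds) of the fixed window containing event_time."""
    return int(event_time // size) * size


def aggregate_per_window(events):
    """Count events per (window start, key).

    `events` is an iterable of (timestamp_seconds, key) pairs; this shape is
    hypothetical and stands in for records read from Kafka.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        counts[(window_start(timestamp), key)] += 1
    return dict(counts)


# Four sample events: two fall in [0, 60), one in [60, 120), one in [120, 180).
events = [(0.0, "a"), (59.9, "a"), (60.0, "a"), (125.0, "b")]
result = aggregate_per_window(events)
print(result)
```

In a real streaming runner the assignment depends on each element's event timestamp, so any step that changes timestamps can move an element into a different window than expected.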

Re: Beam dropping events from Kafka after reshuffle ?

2024-09-18 Thread Lydian Lee
/runners/core/LateDataDroppingDoFnRunner.java#L132 > On 9/18/24 08:53, Lydian Lee wrote: > > I would love to, but there are some limitations on our ends that the > version bump won’t be happened soon. Thus I need to figure out what might > be the root cause though. > > > On Tue,

Re: Beam dropping events from Kafka after reshuffle ?

2024-09-18 Thread Lydian Lee
he events into S3 with parquet format ) Thanks! On Wed, Sep 18, 2024 at 3:16 PM Reuven Lax via user wrote: > How are you doing this aggregation? > > On Wed, Sep 18, 2024 at 3:11 PM Lydian Lee > wrote: > >> Hi Jan, >> >> Thanks for the recommendation. In our c

Re: Beam dropping events from Kafka after reshuffle ?

2024-09-23 Thread Lydian Lee
should use the default, > which should be processing time, so this should not be the issue. The one > potentially left transform is the sink transform, as Reuven mentioned. Can > you share details of the implementation? > > Jan > On 9/19/24 23:10, Lydian Lee wrote: > > Hi, Jan &

Re: Beam dropping events from Kafka after reshuffle ?

2024-09-23 Thread Lydian Lee
times. The duplicates might > also appear different if the Iterables are slightly different on retries, > especially in the case when Flink restarts a checkpoint. > > Reuven > > On Mon, Sep 23, 2024 at 1:40 PM Lydian Lee > wrote: > >> Hi Jan, >> >> Thanks so

Re: Beam dropping events from Kafka after reshuffle ?

2024-09-19 Thread Lydian Lee
"Adding 'trigger_processing_time' timestamp" >> beam.Map(lambda event: > window.TimestampedValue(event, time.time())) > > This does not change the assigned timestamp of an element, but creates a > new element which contains processing time. It will not be used for

Re: Beam dropping events from Kafka after reshuffle ?

2024-09-17 Thread Lydian Lee
t version to see if this > issue is still present? There were lots of changes between 2.41.0 and > 2.59.0. > > Jan > On 9/17/24 17:49, Lydian Lee wrote: > > Hi, > > We are using Beam Python SDK with Flink Runner, the Beam version is 2.41.0 > and the Flink version is 1.

Re: Beam dropping events from Kafka after reshuffle ?

2024-09-25 Thread Lydian Lee
flink. (parallelism 1), and happen really rarely in production. >> >> So it means we can't control the timestamp of an item even with >> `window.TimestampedValue(event, time.time()))`? >> >> Best, >> >> Marc >> >> >> On Tue, Se

Unable to run in python-worker-harness after bumpping from 2.41.0 to 2.60.0

2025-01-26 Thread Lydian Lee
Hi, We are trying to bump an old pipeline that uses the Flink runner and the Beam Python SDK. The version changes are: - Flink: 1.15.4 -> 1.18.0 - Beam: 2.41.0 -> 2.60.0 - Python: 3.9 (no change) However, after bumping the versions, we noticed that the python-worker-harness is unable to process function prop

Re: Unable to run in python-worker-harness after bumpping from 2.41.0 to 2.60.0

2025-01-27 Thread Lydian Lee
Hi, It seems like others have hit a similar issue: https://github.com/apache/beam/issues/29683 For me, here's how I launch the job, with the following args: --runner=portableRunner --streaming --environment_type=EXTERNAL --environment_config=localhost:5 --experiments=disable_logging_su

Re: Unable to run in python-worker-harness after bumpping from 2.41.0 to 2.60.0

2025-01-29 Thread Lydian Lee
apache/beam/blob/master/sdks/python/container/boot.go#L253 > . > > It may be due to failure of the Python harness process. > > On Tue, 28 Jan 2025 at 06:22, Lydian Lee wrote: > >> Hi, >> >> It seems like there's also similar issue from others: >> htt