[QUESTION] Question abount MqttIO checkpointing

2024-08-17 Thread LDesire
Hi everyone, Is MqttIO's checkpointing logic intentional? When MqttIO receives a large number of messages, for example when using a topic with a regular expression like 'data/#'. It will add these messages to the `MqttCheckpointMark` instance as a list. Then, each time the `MqttCheckpointMark#fi

[QUESTION] about CsvIO

2024-10-09 Thread LDesire
In the CsvIO.parseRows method, does it matter if the number of CSV headers is not the same as the number of fields in the Schema? I'm looking at that method and I don't see any logic anywhere that validates this. I've looked for related tests, but they don't seem to be validated properly. ```

Re: Subject: [RESULT] [VOTE] Release 2.60.0, release candidate #2

2024-10-18 Thread LDesire
Thank you Yi hu!2024. 10. 18. 오후 11:40, Yi Hu via dev 작성:Beam 2.60.0 release has been finalized. Thanks again everyone!On Thu, Oct 17, 2024 at 2:52 PM Yi Hu wrote:Hi everyone,I'm happy to announce that we have unanimously approved this release.There are 7 approving votes, 4 of

Request for Reviewers for Spark Runner Improvements

2024-10-02 Thread LDesire
Hi Beam Dev Community, I've made some updates to Spark Runner in Apache Beam. The updates are in these PRs: - #32546 : optimized to skip filter operation when there is only one output - #32610 : Change to use partitioner in GroupNonMergingWindowsFunctions#groupByKeyInGlobalWindow Please review

[DISCUSS] Initial Implementation of MailIO - Bounded Read with SDF

2024-11-12 Thread LDesire
m / Google Cloud Dataflow github.com Looking forward to your feedback and suggestions. Best regards, LDesire

Re: Joining Apache Beam Slack

2024-11-09 Thread LDesire
Hi. Can I also join the Apache Beam Slack channel? my email is two_som...@icloud.com Thanks. LDesire > 2024. 6. 4. 오전 2:38, Ahmet Altay via dev 작성: > > Sent an invite. Welcome to the Beam community! > > On Fri, May 31, 2024 at 6:00 PM Ben Warren <mailto:b...@sno

[DISCUSS] Potential Use Cases for MailIO Connector

2024-11-12 Thread LDesire
Hello, I am currently working on developing a MailIO connector for Apache Beam. While I have made progress implementing bounded read functionality, I'm somewhat uncertain about the practical use cases where users would need the MailIO connector. The use cases I've considered are: - Bounded Re

Question abount Spark Runner's Filter in parDo

2024-09-22 Thread LDesire
Hello Beam community. I'm currently trying out Spark Runner and while going through the code, I noticed that when evaluating a ParDo operation, it applies too many filter operations (from line 467 in TransformTranslator.java). The original intent of this code seems to be to apply filters becau

[DISCUSS] Implementing Stateful Streaming in Spark Runner without Timer Support

2024-11-25 Thread LDesire
Hello, I have implemented stateful streaming functionality in the Spark Runner and am currently in the testing phase. Since implementing timer functionality seems quite challenging at this point, I am considering creating a PR with state management implementation only. I have added logic to re

[QUESTION] about empty iterator emission in Spark Runner's GroupByKey

2024-11-30 Thread LDesire
Hi, I’m testing a Spark Runner Streaming Pipeline that combines stateful operations with GroupByKey operation (using org.apache.beam.runners.spark.stateful.SparkGroupAlsoByWindowViaWindowSet). During testing, I’ve observed that GroupByKey occasionally emits empty iterators. I’m wondering if th

Re: Integrating LLMs as PTransform in beam pipeline

2024-12-01 Thread LDesire
Hi Ganesh, I'm very excited to see this new PTransform for Beam's Java SDK. Thank you for sharing this valuable contribution. Best regards > 2024. 12. 1. 오후 11:46, Ganesh 작성: > > Hi All, > > I've been working on a custom Ptransform that integrates large language > models as a PTransform

Request for Reviewers for Spark Runner Improvements

2024-12-04 Thread LDesire
Hi Beam Dev Community, I've made some updates to Spark Runner in Apache Beam. The updates are in these PRs: - #33212 : Added OrderedListState support for SparkRunner - #33267 : Added support for SparkRunner s

Re: [ANNOUNCE] New PMC Member: Danny McCormick

2024-12-20 Thread LDesire
Congratulations Danny! 😀

[Improvement Proposal] Enhancing Custom Metrics Visibility in Flink Runner

2025-03-04 Thread LDesire
Hello Apache Beam Community, While working with the Flink Runner, I've noticed that custom metrics are not easily visible in the Flink UI. (They can be checked in the Metrics tab of the Flink UI.) Typical users would expect to see these metrics in the Accumulators tab as well. However, in the

Re: [ANNOUNCE] New Committer: Vitaly Terentev

2025-03-24 Thread LDesire
Congraturation Vitaly!2025. 3. 25. 오전 7:42, Rakesh Kumar via dev 작성:Congrats Vitaly!!!On Mon, Mar 24, 2025 at 2:37 PM Yi Hu via dev wrote:Congratulations Vitaly!On Mon, Mar 24, 2025 at 3:28 PM Robert Burke wrote:Congratulations Vitaly!On Mon, Mar 24, 202