Re: [ANNOUNCE] New Committer: Shunping Huang

2025-06-07 Thread LDesire
Congratulations Shunping! 🥳🥳🥳 > On Jun 7, 2025, at 10:24 PM, Kenneth Knowles wrote: > > Hi all, > > Please join me and the rest of the Beam PMC in welcoming a new committer: > Shunping Huang (shunp...@apache.org). > > Shunping has been contributing to Beam since 2023. He s

Re: [ANNOUNCE] New Committer: Vitaly Terentev

2025-03-24 Thread LDesire
Congratulations Vitaly! On Mar 25, 2025, at 7:42 AM, Rakesh Kumar via dev wrote: Congrats Vitaly!!! On Mon, Mar 24, 2025 at 2:37 PM Yi Hu via dev wrote: Congratulations Vitaly! On Mon, Mar 24, 2025 at 3:28 PM Robert Burke wrote: Congratulations Vitaly! On Mon, Mar 24, 202

[Improvement Proposal] Enhancing Custom Metrics Visibility in Flink Runner

2025-03-04 Thread LDesire
Hello Apache Beam Community, While working with the Flink Runner, I've noticed that custom metrics are not easily visible in the Flink UI. (They can be checked in the Metrics tab of the Flink UI.) Typical users would expect to see these metrics in the Accumulators tab as well. However, in the

Re: [ANNOUNCE] New PMC Member: Danny McCormick

2024-12-20 Thread LDesire
Congratulations Danny! 😀

Request for Reviewers for Spark Runner Improvements

2024-12-04 Thread LDesire
Hi Beam Dev Community, I've made some updates to Spark Runner in Apache Beam. The updates are in these PRs: - #33212 : Added OrderedListState support for SparkRunner - #33267 : Added support for SparkRunner s

Re: Integrating LLMs as PTransform in beam pipeline

2024-12-01 Thread LDesire
Hi Ganesh, I'm very excited to see this new PTransform for Beam's Java SDK. Thank you for sharing this valuable contribution. Best regards > On Dec 1, 2024, at 11:46 PM, Ganesh wrote: > > Hi All, > > I've been working on a custom PTransform that integrates large language > models as a PTransform

[QUESTION] about empty iterator emission in Spark Runner's GroupByKey

2024-11-30 Thread LDesire
Hi, I’m testing a Spark Runner Streaming Pipeline that combines stateful operations with a GroupByKey operation (using org.apache.beam.runners.spark.stateful.SparkGroupAlsoByWindowViaWindowSet). During testing, I’ve observed that GroupByKey occasionally emits empty iterators. I’m wondering if th

[DISCUSS] Implementing Stateful Streaming in Spark Runner without Timer Support

2024-11-25 Thread LDesire
Hello, I have implemented stateful streaming functionality in the Spark Runner and am currently in the testing phase. Since implementing timer functionality seems quite challenging at this point, I am considering creating a PR with state management implementation only. I have added logic to re

[DISCUSS] Initial Implementation of MailIO - Bounded Read with SDF

2024-11-12 Thread LDesire
Looking forward to your feedback and suggestions. Best regards, LDesire

[DISCUSS] Potential Use Cases for MailIO Connector

2024-11-12 Thread LDesire
Hello, I am currently working on developing a MailIO connector for Apache Beam. While I have made progress implementing bounded read functionality, I'm somewhat uncertain about the practical use cases where users would need the MailIO connector. The use cases I've considered are: - Bounded Re

Re: Joining Apache Beam Slack

2024-11-09 Thread LDesire
Hi. Can I also join the Apache Beam Slack channel? My email is two_som...@icloud.com Thanks. LDesire > On Jun 4, 2024, at 2:38 AM, Ahmet Altay via dev wrote: > > Sent an invite. Welcome to the Beam community! > > On Fri, May 31, 2024 at 6:00 PM Ben Warren <mailto:b...@sno

Re: Subject: [RESULT] [VOTE] Release 2.60.0, release candidate #2

2024-10-18 Thread LDesire
Thank you Yi Hu! On Oct 18, 2024, at 11:40 PM, Yi Hu via dev wrote: Beam 2.60.0 release has been finalized. Thanks again everyone! On Thu, Oct 17, 2024 at 2:52 PM Yi Hu wrote: Hi everyone, I'm happy to announce that we have unanimously approved this release. There are 7 approving votes, 4 of

[QUESTION] about CsvIO

2024-10-09 Thread LDesire
In the CsvIO.parseRows method, does it matter if the number of CSV headers is not the same as the number of fields in the Schema? I'm looking at that method and I don't see any logic anywhere that validates this. I've looked for related tests, but they don't seem to be validated properly. ```

Request for Reviewers for Spark Runner Improvements

2024-10-02 Thread LDesire
Hi Beam Dev Community, I've made some updates to Spark Runner in Apache Beam. The updates are in these PRs: - #32546 : optimized to skip filter operation when there is only one output - #32610 : Change to use partitioner in GroupNonMergingWindowsFunctions#groupByKeyInGlobalWindow Please review

Question about Spark Runner's Filter in ParDo

2024-09-22 Thread LDesire
Hello Beam community. I'm currently trying out Spark Runner and while going through the code, I noticed that when evaluating a ParDo operation, it applies too many filter operations (from line 467 in TransformTranslator.java). The original intent of this code seems to be to apply filters becau
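
The filtering pattern described in this message can be illustrated with a small plain-Java sketch. It does not use Beam or Spark; `TaggedOutputSplitter`, `split`, and the tag names are hypothetical and only model the idea that each output of a multi-output ParDo is derived by filtering a tagged collection, and that the filter pass can be skipped when there is only one output.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for splitting tagged ParDo output; not Beam or Spark code.
class TaggedOutputSplitter {

  // Each element is a (tag, value) pair; returns one list of values per output tag.
  static Map<String, List<String>> split(
      List<Map.Entry<String, String>> tagged, List<String> tags) {
    Map<String, List<String>> outputs = new LinkedHashMap<>();

    // Single output: every element is assumed to carry that tag, so no filter is applied.
    if (tags.size() == 1) {
      outputs.put(
          tags.get(0),
          tagged.stream().map(Map.Entry::getValue).collect(Collectors.toList()));
      return outputs;
    }

    // Several outputs: one filter pass over the whole collection per tag.
    for (String tag : tags) {
      outputs.put(
          tag,
          tagged.stream()
              .filter(e -> e.getKey().equals(tag))
              .map(Map.Entry::getValue)
              .collect(Collectors.toList()));
    }
    return outputs;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, String>> tagged =
        List.of(Map.entry("main", "a"), Map.entry("side", "b"), Map.entry("main", "c"));
    System.out.println(split(tagged, List.of("main", "side")));
  }
}
```

With N output tags the loop scans the full collection N times, which is why skipping the pass for the single-output case saves work in this model.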

[QUESTION] Question about MqttIO checkpointing

2024-08-17 Thread LDesire
Hi everyone, Is MqttIO's checkpointing logic intentional? When MqttIO receives a large number of messages, for example when subscribing to a wildcard topic such as 'data/#', it adds these messages to the `MqttCheckpointMark` instance as a list. Then, each time the `MqttCheckpointMark#fi
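
The buffering behaviour raised in this question can be modelled with a minimal plain-Java sketch. `BufferingCheckpointMark` is a hypothetical class, not Beam's `MqttCheckpointMark`. What happens at finalization is cut off in the excerpt above, so the sketch assumes that finalizing acknowledges every buffered message and then clears the list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a checkpoint mark that buffers messages; not the Beam class.
class BufferingCheckpointMark {
  private final List<String> pending = new ArrayList<>();
  private int acked = 0;

  // Every received message is appended, so the list grows until the next finalize.
  void add(String message) {
    pending.add(message);
  }

  int pendingCount() {
    return pending.size();
  }

  int ackedCount() {
    return acked;
  }

  // Assumption: finalization acknowledges all buffered messages, then clears the buffer.
  void finalizeCheckpoint() {
    acked += pending.size();
    pending.clear();
  }

  public static void main(String[] args) {
    BufferingCheckpointMark mark = new BufferingCheckpointMark();
    for (int i = 0; i < 10_000; i++) {
      mark.add("data/sensor/" + i);
    }
    System.out.println("buffered before finalize: " + mark.pendingCount());
    mark.finalizeCheckpoint();
    System.out.println("buffered after finalize: " + mark.pendingCount());
  }
}
```

In this model the memory held by the mark is proportional to the number of messages received between two finalizations, which is the cost a high-volume wildcard subscription would make visible.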