Re: Beam 2.60.0 Release

2024-10-02 Thread Yi Hu via dev
Hi everyone, This reply is to note that the 2.60.0 release cut has been done. Currently there are 3 open issues/PRs marked with the 2.60.0 milestone [1]. RC1 will be built once the milestone is cleared. Thanks for helping with the release process. [1] https://github.com/apache/beam/milestone/24 On Wed,

Re: Query Regarding Customizing Apache Beam for Sequence-Based Workload Processing

2024-10-02 Thread Kenneth Knowles
Ah. That makes sense, since in batch all you have to do is sort by timestamp when you shuffle (which Dataflow has always done anyhow, to optimize windowing) whereas in streaming you need an OrderedListState-like slack buffer and there's latency of approximately the full allowed lateness. It does s

Re: Request for Reviewers for Spark Runner Improvements

2024-10-02 Thread XQ Hu via dev
Thank you for your contributions! I have added some people from Google to review them. Others from the community are also welcome to do so. On Wed, Oct 2, 2024 at 4:58 AM LDesire wrote: > Hi Beam Dev Community, > > I've made some updates to Spark Runner in Apache Beam. The updates are in > thes

Beam High Priority Issue Report (49)

2024-10-02 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/32603 The PreCommit YAML

Request for Reviewers for Spark Runner Improvements

2024-10-02 Thread LDesire
Hi Beam Dev Community, I've made some updates to the Spark Runner in Apache Beam. The updates are in these PRs:
- #32546: skip the filter operation when there is only one output
- #32610: use a partitioner in GroupNonMergingWindowsFunctions#groupByKeyInGlobalWindow
Please review