Re: GSoC 2025 Wrap-Up: New Tools for GCP Automation & Security

2025-09-12 Thread Robert Burke
This is phenomenal work! Thanks for the hard work, and improving Beam's infra automation! Thank you too Pablo for mentoring! Robert Burke On Fri, Sep 12, 2025, 9:58 PM Enrique Calderon wrote: > Hi everyone, > > As my Google Summer of Code 2025 project with Apache Beam comes to

Re: Beam Infrastructure: Health Status Report for August 2025

2025-09-11 Thread Robert Burke
Phenomenal! On Thu, Sep 11, 2025, 6:33 AM XQ Hu via dev wrote: > Great job! Thanks for sharing! > > On Thu, Sep 11, 2025 at 9:28 AM Vitaly Terentyev via dev < > dev@beam.apache.org> wrote: > >> Dear Community, >> >> August showed steady performance for Beam Infrastructure & Health, with >> overa

Re: [Announce] New Committer: Joey Tran

2025-08-20 Thread Robert Burke
Congratulations Joey and Well-deserved! On Wed, Aug 20, 2025, 12:39 PM Ahmed Abualsaud via dev wrote: > Congrats Joey! Thank you for the contributions and discussions > > On Wed, Aug 20, 2025 at 3:28 PM Ahmet Altay via dev > wrote: > >> Congratulations, Joey! >> >> On Wed, Aug 20, 2025 at 12:04

Re: Beam 2.66.0 Release

2025-07-01 Thread Robert Burke
Congratulations and good work! On Tue, Jul 1, 2025, 12:02 PM Shunping Huang wrote: > Thank you Vitaly! > > On Tue, Jul 1, 2025 at 2:52 PM XQ Hu via dev wrote: > >> Great job, Vitaly! >> >> On Tue, Jul 1, 2025 at 2:12 PM Vitaly Terentyev via dev < >> dev@beam.apache.org> wrote: >> >>> Hi, >>> >>

Re: MultimapState missing from Python SDK

2025-06-17 Thread Robert Burke
:52 PM Robert Bradshaw > wrote: > >> On Tue, Jun 17, 2025 at 3:57 PM Robert Burke wrote: >> > >> > +1 >> > >> > You should be able to use the Prism runner to implement this locally. >> > >> > Prism passes the full suite of java Multi

Re: MultimapState missing from Python SDK

2025-06-17 Thread Robert Burke
+1 You should be able to use the Prism runner to implement this locally. Prism passes the full suite of java MultimapState tests, and will ensure the implementation works on runners like Google Cloud Dataflow. I do not recommend spending time implementing multimap state in the python direct runn

Re: [ANNOUNCE] New Committer: Shunping Huang

2025-06-07 Thread Robert Burke
Congratulations Shunping! On Sat, Jun 7, 2025, 7:02 AM XQ Hu via dev wrote: > Congratulations Shunping! Thanks a lot for your contributions! > > On Sat, Jun 7, 2025 at 9:29 AM LDesire wrote: > >> Congratulations Shuping! 🥳🥳🥳 >> >> >> >> On Jun 7, 2025, at 10:24 PM, Kenneth Knowles wrote: >> >> Hi all, >

Re: Bug Found when running Beam YAML JOB in Prism and Flink Runner

2025-05-01 Thread Robert Burke
At least for Prism this may be the issue being chased down WRT Prism not periodically checkpointing an Unbounded SDF. The Deadline Exceeded is a bit odd though. When I look at the logs it seems to be from the State channel. This doesn't eliminate my hypothesis though. On Thu, May 1, 2025, 6:22 A

Re: [ANNOUNCE] New Committer: Vitaly Terentev

2025-03-24 Thread Robert Burke
Congratulations Vitaly! On Mon, Mar 24, 2025, 12:26 PM Vitalii Terentev wrote: > Thanks everyone! Glad to be part of the team! >

Inappropriate cache usage on PR validation suites?

2025-02-28 Thread Robert Burke
I just sent out a PR for Prism, and the java suite executed way too fast: https://github.com/apache/beam/actions/runs/13597430640/attempts/1?pr=34132 And when I looked it said: > Task :runners:prism:java:prismLoopbackValidatesRunnerTests FROM-CACHE which it probably shouldn't be doing for most

Re: Beam High Priority Issue Report (31)

2025-02-06 Thread Robert Burke
The content does strike me as a dashboard snapshot more than something for the list. Really, it feels like something we could have as a badge link on the main repo page, to see the latest results, instead of sending an email. The trick there is visibility. If we have a way to kick the results gen

Re: FlumeJava - what happened to PObjects

2025-01-31 Thread Robert Burke
Really the main difference that makes it a little more complicated (but not terribly so) is Flume isn't windowed (along with the metadata). Beam would need to make that up front and available for use. On Fri, Jan 31, 2025, 9:50 AM Robert Bradshaw via dev wrote: > On Fri, Jan 31, 2025 at 8:19 AM

Re: FlumeJava - what happened to PObjects

2025-01-31 Thread Robert Burke
The main thing stopping a low impedance version of collection materialization is the lack of a robust notion of common storage across the submission program and different runners. We almost have this with runner specific staging directories, but we don't make them a canonical facet of Beam executi

Re: [python] flatten unzipping

2025-01-27 Thread Robert Burke
Chiming in for the Prism Runner implementation. I mostly didn't implement this in Prism because I had reached my limit in dealing with the Graph at the time. The Python SDK approach to the optimizations is very set theory based, which isn't as natural (to me at least) for handling the flatten unz

Re: Using resource hints or annotations for transform expansion

2025-01-14 Thread Robert Burke
+1 to Danny's comments. Technically the place to document these on a broader runner perspective should be the Runner Capability matrix. A similar hint would be "FanoutRatio" which can mark transforms that have a high fanout per element and lead the runner to make different optimization decisions.

A Beam Development podcast, via AI!

2024-12-20 Thread Robert Burke
notes for the episode even link to the PRs, copied below. I think it's a pretty neat way of getting a summary of what's going on. Cheers, Robert Burke Performance Benchmarking Framework for Dataflow: [#33297]( https://github.com/apache/beam/pull/33297) - YAML Transform RunInference with Ve

Re: [ANNOUNCE] New PMC Member: Danny McCormick

2024-12-20 Thread Robert Burke
Congratulations Danny! On Fri, Dec 20, 2024, 1:18 PM Jan Lukavský wrote: > Congrats Danny! Well deserved! > On 12/20/24 21:22, Danny McCormick via dev wrote: > > Thanks everyone! I'm excited and honored to join! > > On Fri, Dec 20, 2024 at 3:08 PM Ravi Magham > wrote: > >> Congrats Danny ! >> >

Re: Beam 2.62.0 Release

2024-12-12 Thread Robert Burke
Seems reasonable to me! Thanks for volunteering Kenn! On Thu, Dec 12, 2024, 1:31 PM Kenneth Knowles wrote: > Hi all, > > I'd like to volunteer to manage the next Beam release, which is 2.62.0. > > The branch cut is scheduled for December 25! > > That doesn't seem like a great time to cut a relea

Re: Distroless container image naming convention

2024-11-26 Thread Robert Burke
While the Go SDK is already distroless by default, we probably should add an alias with _distroless for the Go SDK images, to avoid questions like "why does Go not have a distroless image?" and to avoid incorrect assumptions by users who want to ensure the property. As for whether the distroless ver

Re: [VOTE][vendor-grpc] Vendored Dependencies Release

2024-10-22 Thread Robert Burke
+1 (binding) Robert Burke On 2024/10/17 18:43:39 Yi Hu via dev wrote: > Hi everyone, > Please review and vote on the release candidate #1 for > beam-vendor-grpc-1_60_1 version 0.3, as follows: > [ ] +1, Approve the release > [ ] -1, Do not approve the release (please provide sp

Re: [VOTE] Release 2.60.0, release candidate #2

2024-10-16 Thread Robert Burke
+1 (binding) I have validated the prism release artifacts for linux-amd64, and that they work with the Python and Java SDK starters. I've also looked over and found a typo in the 2.60.0 blog post. Thanks Yi! Robert Burke On 2024/10/16 17:47:51 Yi Hu via dev wrote: > Thanks for catchi

Re: [VOTE] Release 2.60.0, release candidate #2

2024-10-16 Thread Robert Burke
-1 (binding) Unable to validate Prism artifacts as the Github Release wasn't published or updated with the latest RC tag. https://github.com/apache/beam/blob/master/contributor-docs/release-guide.md#update-the-github-release-with-the-blog-post-content It looks like the RC1 github release page ex

Re: Heartbeat for worker?

2024-10-04 Thread Robert Burke
ProcessBundleProgressRequest/ProcessBundleProgressResponse are usable for this purpose, and are recommended for gauging bundle progress anyway. https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto#L154 Same for the

Re: Building dev docker image

2024-09-19 Thread Robert Burke
step. I'd be happy to review any PRs to help fix it though, if you get it working for yourself. (Tag lostluck in GitHub). I think if you search the dev list, you might find previous discussions on the topic of the dev container. Robert Burke On Thu, Sep 19, 2024, 2:17 PM Joey Tran wro

[ANNOUNCE] Beam 2.59.0 Released

2024-09-11 Thread Robert Burke
Github release page https://github.com/apache/beam/releases/tag/v2.59.0 Thanks to everyone who contributed to this release, and we hope you enjoy using Beam 2.59.0. -- Robert Burke, on behalf of the Apache Beam Team.

[RESULT] [VOTE] Release 2.59.0, release candidate #1

2024-09-11 Thread Robert Burke
I'm happy to announce that we have unanimously approved this release. There are 7 approving votes, 4 of which are binding: * Robert Burke * Chamikara Jayalath * Jan Lukavský * Valentyn Tymofieiev There are no disapproving votes. Thanks everyone! Robert Burke, Beam 2.59.0 Release Manager

Re: [VOTE] Release 2.59.0, release candidate #1

2024-09-11 Thread Robert Burke
on pipelines. > > > > Thanks, > > Cham > > > > On Mon, Sep 9, 2024 at 9:47 AM Robert Burke wrote: > > > >> Hello Beam Community! > >> > >> I'd like to remind y'all that this vote remains open. There have yet to > >>

Re: beam-starter-typescript is broken | long module issue | ttypescript module issue

2024-09-07 Thread Robert Burke
Hello Robert, Robert. Since we don't have a great deal of experience with Typescript any assistance in documenting (or pointing to standard typescript documentation) on how to hook up a development module to a separate project that uses the module would be valuable. Aside, I should also do the sa

Re: [VOTE] Release 2.59.0, release candidate #1

2024-08-29 Thread Robert Burke
> >> > >>> +1 (non-binding). Tested this with the simple Dataflow ML pipeline ( > >>> https://github.com/google/dataflow-ml-starter/actions/runs/10540551699/job/29205343623 > >>> ) > >>> > >>> On Sat, Aug 24, 2024 at 1:16 PM Robert B

Re: [VOTE] Release 2.59.0, release candidate #1

2024-08-29 Thread Robert Burke
today. The vote will remain open in the meantime. Should there be release blockers, a new RC won't be built until the later of Monday, September 9th, and the blockers being resolved and cherry-picked. Robert Burke 2.59.0 Release Manager On 2024/08/29 19:46:31 Robert Burke wrote: > +1

Re: [VOTE] Release 2.59.0, release candidate #1

2024-08-29 Thread Robert Burke
> > >> +1 (non-binding). Tested this with the simple Dataflow ML pipeline ( > >> https://github.com/google/dataflow-ml-starter/actions/runs/10540551699/job/29205343623 > >> ) > >> > >> On Sat, Aug 24, 2024 at 1:16 PM Robert Burke wrote: >

Re: Sunsetting Beam Python 3.8 Support

2024-08-26 Thread Robert Burke
rom github actions?) > that's another reason to do so. > > > [image: Skärmavbild 2024-08-26 kl. 10.08.09 fm.png] > > > > On Mon, Aug 26, 2024 at 9:42 AM Robert Burke wrote: > > > I'd take care only relying on the most recent release (as much as it

Re: Sunsetting Beam Python 3.8 Support

2024-08-26 Thread Robert Burke
"beam_python310_sdk",40,97,0,13 > "beam_python3.9_sdk",18,388,0,14 > "beam_python3.8_sdk",36,97,0,2 > > So it was <10% of pulls (including our automation as Rebo pointed out) > > I'll join Jack, Kenn, and Rebo and agree dropping support is t

Re: [DISCUSS] Beam 3.0: Paving the Path to the Next Generation Data Processing Framework

2024-08-26 Thread Robert Burke
I'm all aboard for an improved local single machine experience, via my work on Prism. Having a consistent, simple-to-start-up single program for iterating on SDK or transform development will help all Beam SDKs provide a consistent experience, vs each one having various levels of "Direct Runner" supp

Re: Sunsetting Beam Python 3.8 Support

2024-08-26 Thread Robert Burke
As an approximation we can use the docker container pulls at least. Py version : Pulls last week 3.8: 7476 3.9: 1,259 3.10: 6169 3.11: 2999 3.12: 241 3.7: 395 3.6: 241 3.4: 156 2.7: 188 But note that any of our automation for 3.8 that pulls containers would impact these results too. I will n
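
As a rough illustration of the approximation above, the quoted pull counts can be turned into per-version shares. This is only a sketch: the helper name `pull_shares` is hypothetical, "1,259" is read as 1259, and the counts still include any automation pulls, as the message cautions.

```python
# Weekly container pull counts, copied from the message above.
# Note: these include pulls made by project automation.
pulls = {
    "3.8": 7476, "3.9": 1259, "3.10": 6169, "3.11": 2999, "3.12": 241,
    "3.7": 395, "3.6": 241, "3.4": 156, "2.7": 188,
}


def pull_shares(counts):
    """Return each version's share of the total pulls, as a percentage."""
    total = sum(counts.values())
    return {version: 100.0 * n / total for version, n in counts.items()}


shares = pull_shares(pulls)

# Print versions from most-pulled to least-pulled.
for version, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"Python {version}: {pulls[version]:>5} pulls ({pct:.1f}%)")
```

On these numbers alone, 3.8 accounts for roughly 39% of the listed pulls, which is why the automation caveat matters when reading them.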

Re: Beam Patch Releases

2024-08-26 Thread Robert Burke
t that we >> can't choose the base branch to start from? >> >> On Fri, Aug 23, 2024 at 10:40 PM Robert Burke wrote: >> >>> LGTM with the addendum that if we approve of the patch process, we >>> automate the patch PR process via an action like we do for

[VOTE] Release 2.59.0, release candidate #1

2024-08-24 Thread Robert Burke
Hi everyone, Please review and vote on the release candidate #1 for the version 2.59.0, as follows: [ ] +1, Approve the release [ ] -1, Do not approve the release (please provide specific comments) Reviewers are encouraged to test their own use cases with the release candidate, and vote +1 if no

Re: Beam 2.59.0 Release

2024-08-23 Thread Robert Burke
may not be terminated until Tuesday, to allow validation and voting.) Robert Burke Your friendly neighbourhood release manager. [1]: https://github.com/apache/beam/pull/32284 On 2024/08/22 00:26:47 Robert Burke wrote: > The 2.59.0 branch has been cut. > > https://github.com/apache/beam/tree/re

Re: Beam Patch Releases

2024-08-23 Thread Robert Burke
LGTM with the addendum that if we approve of the patch process, we automate the patch PR process via an action like we do for a regular cut. We've only been able to make our releases faster through this automation; there's no sense in dropping that when the criteria of a patch require a quick, ti

Re: Beam 2.59.0 Release

2024-08-21 Thread Robert Burke
/milestone/23 Fixes and cherrypick for same would be much appreciated. Cheers, Robert Burke On 2024/08/07 20:55:08 Robert Burke wrote: > Hey everyone, > > The next release (2.59.0) branch cut is scheduled for August 21, > 2024, 2 weeks from today, according to the release calendar [

Re: [VOTE] Release 2.58.1, release candidate #1

2024-08-16 Thread Robert Burke
+1 (binding) Validated the linux-amd64 prism binary with a few pipelines. On 2024/08/16 00:25:58 Danny McCormick via dev wrote: > Hi everyone, > Please review and vote on the patch release candidate #1 for the version > 2.58.1, as follows: > > [ ] +1, Approve the release > [ ] -1, Do not approve

Re: BatchElements Overview Documentation

2024-08-15 Thread Robert Burke
I like it! My vote is to put it on the Beam site, likely linked from here: https://beam.apache.org/documentation/ml/about-ml/ But also as a sibling to that page, since it's in the Python-specific section at least. If it's done in the next few days, before the cut, it would be worth including as a callout

Beam 2.59.0 Release

2024-08-07 Thread Robert Burke
release blockers. Let me know if you have any comments/objections/questions. Thanks, Robert Burke [1] https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com [2] https://github.com/apache/beam/milestone/23 [3] https://beam.apache.org/contribute/release-blocking/

Re: [VOTE] Release 2.58.0, release candidate #2

2024-08-05 Thread Robert Burke
+1 (Binding) Once again validated the linux-amd prism binary against the java and python validates runner tests. Aside: it is nice to see that the state at HEAD has moved forward from the cut! Next release! :3 On Mon, Aug 5, 2024, 7:40 AM Jack McCluskey via dev wrote: > Hey everyone, > > We nee

Re: [VOTE] Release 2.58.0, release candidate #1

2024-07-24 Thread Robert Burke
+1 (binding) Checked that the Go Prism linux amd64 artifact worked with a Go pipeline. On 2024/07/24 00:50:36 Valentyn Tymofieiev via dev wrote: > +1 (binding), checked that Dataflow containers have been released, checked > release notes, spot-checked some Python test suites and ran a pipeline on

Re: [ANNOUNCE] New Committer: XQ Hu

2024-06-24 Thread Robert Burke
Congratulations XQ! On Mon, Jun 24, 2024, 1:43 PM Svetak Sundhar via dev wrote: > Congrats XQ! > > > Svetak Sundhar > > Data Engineer > s vetaksund...@google.com > > > > On Mon, Jun 24, 2024 at 4:42 PM Byron Ellis via dev > wrote: > >> Congrats XQ! >> >> On Mon, Jun 24, 2024 at 1:40 PM Kennet

Re: design docs that get deleted, etc

2024-05-29 Thread Robert Burke
Honestly, it's less about how we propose things and more to do with actually "completing" the documentation work by moving it to some permanent place. That can be as simple as "migrate the final implemented design to a markdown file in the GitHub repo at *foo* or published on the beam website at *bar*.

Re: Patch release proposal

2024-03-27 Thread Robert Burke
+1 to a targeted patch release. We did the same for the Go SDK a little while back. It would be good to see what's different for a different SDK. On Wed, Mar 27, 2024, 4:01 PM Robert Bradshaw via dev wrote: > Given the severity of the breakage, and the simplicity of the workaround, > I'm in fav

Re: container dev environment: go get issue

2024-03-22 Thread Robert Burke
ldest trick in the book "just > delete the problematic line" > > Thanks for the quick response. I am unblocked now :) > > On Fri, Mar 22, 2024 at 8:47 AM Robert Burke wrote: > >> It's not clear to me why that's even requesting that package at all. I >

Re: container dev environment: go get issue

2024-03-22 Thread Robert Burke
It's not clear to me why that's even requesting that package at all. I would remove that 'go get' line. There's a different issue at play here too, since it was written with pre-module Go in mind. I'm unfamiliar with that script though. I'll take a proper look in a few hours. On Fri, Mar 22, 2024,

Re: [DISCUSS] Processing time timers in "batch" (faster-than-wall-time [re]processing)

2024-02-28 Thread Robert Burke
uild into Beam a mechanism / paired primitive where such a Cross Worker Communication Pair (the processor/server + DoFn client) could be built, but not purely be limited to Rate limiting/Throttling. Possibly mumble mumble StatePipe? But that feels like a harder problem for the time being. Rober

Re: [DISCUSS] Processing time timers in "batch" (faster-than-wall-time [re]processing)

2024-02-27 Thread Robert Burke
>> Is there anything missing in such definition that would still require >> splitting the timers into two distinct features? >> >> Jan >> On 2/26/24 21:22, Kenneth Knowles wrote: >> >> Yea I like DelayTimer, or SleepTimer, or WaitTimer or some such. >> >

Re: [DISCUSS] Processing time timers in "batch" (faster-than-wall-time [re]processing)

2024-02-26 Thread Robert Burke
rect > to not wait at least that long > > The main reason we created timers: to take action in the absence of data. > The archetypal use case for processing time timers was/is "flush data from > state if it has been sitting there too long". For this use case, the right >

Re: [DISCUSS] Processing time timers in "batch" (faster-than-wall-time [re]processing)

2024-02-23 Thread Robert Burke
While I'm currently on the other side of the fence, I would not be against changing/requiring the semantics of ProcessingTime constructs to be "must wait and execute", as such a solution enables the proposed "batch" process continuation throttling mechanism to work as hypothesized for both "

Re: [DISCUSS] Processing time timers in "batch" (faster-than-wall-time [re]processing)

2024-02-22 Thread Robert Burke
ve gotten away from the core topic. My opinion is "ProcessingTime Timers Shouldn't Block Execution" and "We should figure out the best central primitive to manage this class of concept". Robert Burke Beam Go Busybody [0] https://github.com/apache/beam/blob/11f9bce485

Re: Throttle PTransform

2024-02-21 Thread Robert Burke
> >> > annotation and, though mentioned years ago as a question, I do not >> >> > remember what arguments were used against enforcing sorting inputs by >> >> > timestamp in the batch stateful DoFn as a requirement in the model. >> That >> >> > would enabl

Re: Throttle PTransform

2024-02-20 Thread Robert Burke
https://s.apache.org/beam-design-docs if not already. Robert Burke Beam Go Busybody On 2024/02/20 14:00:00 Damon Douglas wrote: > Hello Everyone, > > The following describes a Throttle PTransform that holds element throughput > to minimize downstream API overusage. Thank you for

Re: [API PROPOSAL] PTransform.getURN, toProto, etc, for Java

2024-02-15 Thread Robert Burke
perimental design for that specific language's affordances. It's definitely a big plus to be able to see all the bits for a single transform in one file, instead of trying to find the 5-8 different places one must add a registration for it. More so in Java where such handler registrations can be d

Re: [RESULT] [VOTE] Vendored Dependencies Release beam-vendor-grpc-1-60-1:0.2

2024-02-15 Thread Robert Burke
> > * chamikara@ Chamikara Madhusanka Jayalath > > * kenn@ Kenneth Knowles > > * lostluck@ Robert Burke > > * tvalentyn@ Valentyn Tymofieiev > > * yhu@ Yi Hu (non-binding) > > There are no disapproving votes. > > Thanks everyone! >

[ANNOUNCE] Beam 2.54.0 Released

2024-02-14 Thread Robert Burke
Github release page https://github.com/apache/beam/releases/tag/v2.54.0 Thanks to everyone who contributed to this release, and we hope you enjoy using Beam 2.54.0. -- Robert Burke, on behalf of the Apache Beam Team.

Re: [RESULT] [VOTE] Release 2.54.0, release candidate #2

2024-02-14 Thread Robert Burke
The release is now complete https://beam.apache.org/blog/beam-2.54.0/ Please share and promote. I'll be working on the last odds and ends, but the release is now out. Robert Burke Beam 2.54.0 Release Manager On 2024/02/14 17:36:31 Robert Burke wrote: > I'm happy to announce

[RESULT] [VOTE] Release 2.54.0, release candidate #2

2024-02-14 Thread Robert Burke
I'm happy to announce that we have unanimously approved this release. There are 9 approving votes, 5 of which are binding: * Jan Lukavský * Chamikara Jayalath * Valentyn Tymofieiev * Robert Bradshaw * Robert Burke There are no disapproving votes. Thanks everyone! Robert Burke Beam 2

Re: [VOTE] Release 2.54.0, release candidate #2

2024-02-14 Thread Robert Burke
And with that, we have sufficient conditions to declare that RC2 has met community approval. The vote is now closed. Robert Burke Beam 2.54.0 Release Manager On 2024/02/14 17:20:46 Robert Bradshaw via dev wrote: > +1 (binding) > > We've done the validation we can for now, let

Re: [VOTE] Vendored Dependencies Release

2024-02-14 Thread Robert Burke
+1 (binding) On Wed, Feb 14, 2024, 7:35 AM Yi Hu via dev wrote: > +1 (non-binding) > > checked artifact packages not leaking namespace (or under > org.apache.beam.vendor.grpc.v1p60p1) and the tests in > https://github.com/apache/beam/pull/30212 > > > > > On Tue, Feb 13, 2024 at 4:29 AM Sam Whitt

Re: [VOTE] Release 2.54.0, release candidate #2

2024-02-09 Thread Robert Burke
ew of the quickstarts from the validation sheets and updated my own Beam Go code. Other than a non-blocking update to the Go wordcount quickstart, I didn't run into any issues. Robert Burke Beam 2.54.0 Release Manager On 2024/02/10 01:41:12 Robert Bradshaw via dev wrote: > I validated that

Re: Playground: File Explorer?

2024-02-08 Thread Robert Burke
Joey Tran wrote: > Ah that makes sense. Does the new version of Playground get staged for > release validation? > > On Thu, Feb 8, 2024 at 12:08 PM Robert Burke wrote: > >> We redeploy the playground along with the release, so once 2.54.0 RC2 has >> been validated and

Re: [DESIGN PROPOSAL] Reshuffle Allowing Duplicates

2024-02-08 Thread Robert Burke
tion added accordingly if they need that property per previous discussions. On Thu, Feb 8, 2024, 10:31 AM Kenneth Knowles wrote: > > > On Wed, Feb 7, 2024 at 5:15 PM Robert Burke wrote: > >> OK, so my stance is a configurable Reshuffle might be interesting, so my >> vot

Re: Playground: File Explorer?

2024-02-08 Thread Robert Burke
We redeploy the playground along with the release, so once 2.54.0 RC2 has been validated and voted on, I'll be redeploying it with 2.54.0. On Thu, Feb 8, 2024, 7:18 AM Joey Tran wrote: > Here's two: > > https://play.beam.apache.org/?path=SDK_PYTHON_MultipleOutputPardo&sdk=python > https://play.b

Re: [DESIGN PROPOSAL] Reshuffle Allowing Duplicates

2024-02-07 Thread Robert Burke
that has the same elements as the input PCollection". It remains an open question about what that means for checkpointing/durability behavior, but that's largely been runner dependent anyway. I admit the above definition is biased by the uses of Reshuffle I'm aware of, which largely

[VOTE] Release 2.54.0, release candidate #2

2024-02-06 Thread Robert Burke
on how to try the release in your projects, check out our RC testing guide [13]. Thanks, Robert Burke Beam 2.54.0 Release Manager [1] https://github.com/apache/beam/milestone/18?closed=1 [2] https://dist.apache.org/repos/dist/dev/beam/2.54.0/ [3] https://dist.apache.org/repos/dist/release/be

[VOTE] Release 2.54.0, release candidate #2

2024-02-06 Thread Robert Burke via dev
on how to try the release in your projects, check out our RC testing guide [13]. Thanks, Robert Burke Beam 2.54.0 Release Manager [1] https://github.com/apache/beam/milestone/18?closed=1 [2] https://dist.apache.org/repos/dist/dev/beam/2.54.0/ [3] https://dist.apache.org/repos/dist/release/be

Re: [VOTE] Release 2.54.0, release candidate #1

2024-02-06 Thread Robert Burke
It'd be nice to have a working website when we publish the release blog, and likely digging into the current GoUsingJava suite issues. Absolute last call for cherry picks for RC2. I will be doing my final cleanup and then start the RC build in an hour or two. Robert Burke Be

Re: [VOTE] Release 2.54.0, release candidate #1

2024-02-05 Thread Robert Burke
s/30203 > > On Fri, Feb 2, 2024 at 6:01 PM XQ Hu via dev wrote: > > > +1 validated by running the simple RunInference ML pipeline: > > https://github.com/google/dataflow-ml-starter/actions/runs/7761835540/job/21171080332 > > > > On Fri, Feb 2, 2024 at 4:10 PM Robert Bu

[VOTE] Release 2.54.0, release candidate #1

2024-02-02 Thread Robert Burke
hours. It is adopted by majority approval, with at least 3 PMC affirmative votes. For guidelines on how to try the release in your projects, check out our RC testing guide [13]. Thanks, Robert Burke Beam 2.54.0 Release Manager [1] https://github.com/apache/beam/milestone/18?closed=1
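
The adoption rule quoted above ("majority approval, with at least 3 PMC affirmative votes") can be sketched as a small check. This is a minimal illustration, not the project's tooling: the `Vote` type, the `release_passes` helper, and the voter names are hypothetical, and it assumes that "majority" is measured over binding (PMC) votes only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1 to approve, -1 to disapprove
    binding: bool  # True when cast by a PMC member


# "at least 3 PMC affirmative votes", per the message above.
MIN_BINDING_APPROVALS = 3


def release_passes(votes):
    """Return True if the votes satisfy the majority-approval rule.

    Assumption: only binding votes count toward the majority.
    """
    binding_for = sum(1 for v in votes if v.binding and v.value > 0)
    binding_against = sum(1 for v in votes if v.binding and v.value < 0)
    return binding_for >= MIN_BINDING_APPROVALS and binding_for > binding_against


# Hypothetical tally: 4 binding and 3 non-binding approvals, no disapprovals.
tally = [Vote(f"pmc-{i}", +1, True) for i in range(4)]
tally += [Vote(f"contributor-{i}", +1, False) for i in range(3)]
print(release_passes(tally))
```

Non-binding votes are still recorded in the tally, but under the stated assumption they do not change the outcome.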

Re: [DESIGN PROPOSAL] Reshuffle Allowing Duplicates

2024-01-30 Thread Robert Burke
it's news to me (just now, because I looked) that the Java reshuffle promises GBK-like side effects. But that's a long deprecated transform without a satisfying replacement for its usage, so it may be moot. Robert Burke On Tue, Jan 30, 2024, 1:34 PM Kenneth Knowles wrote: > Hi

Re: [Release 2.54.0] Release Branch has been Cut!

2024-01-29 Thread Robert Burke
can't pass until the Google internal containers have been generated (the ones that do pass at this stage tend to be validating the container, and build their own.) Thank you for your patience and cooperation. Robert Burke Beam 2.54.0 Release Manager On 2024/01/24 22:55:32 Robert Burke wrot

[Release 2.54.0] Release Branch has been Cut!

2024-01-24 Thread Robert Burke
to be resolved before we can cut an RC1. If so, I'll be adding them to this thread. Thank you very much for your cooperation and support. Robert Burke Your friendly neighbourhood Beam 2.54.0 release manager [0] https://github.com/apache/beam/tree/release-2.54.0 [1] https://github.com/a

Re: Google Artifact Registry detects critical vuln CVE-2023-45853 in beam dataflow

2024-01-24 Thread Robert Burke
Thanks for the shout out XQ! And thanks for bringing this up. Moving to a Distroless base for Go SDK images should reduce the vulnerability surface to whichever version of glibc we have packaged in . I do have some concerns around if a user would like to extend the image (not having shells or pac

Re: @RequiresTimeSortedInput adoption by runners

2024-01-20 Thread Robert Burke
necessary information for > pre-portability runners to still work, which is the same prereqs as > pre-portable "Override" implementations to still work. > > TBH I'm 50/50 on the idea. If something is going to be implemented more > slowly or less scalably as a fall

Re: @RequiresTimeSortedInput adoption by runners

2024-01-19 Thread Robert Burke
. It's entirely possible I've over simplified the "fallback" protocol described above, so this thread is still useful for my Prism work, especially if I see any similar situations once I start on the Java Validates Runner suite. Robert Burke Beam Go Busybody On Fri, J

Re: @RequiresTimeSortedInput adoption by runners

2024-01-18 Thread Robert Burke
s indicate we didn't drive the feature to completion and enable user adoption beyond "This Exists, and we can tell you about it if you ask.". AFAICT this is just one of those features we built, but then proceeded not to use within Beam, and evangelize. This is a point we c

Re: Beam 2.54.0 Release

2024-01-10 Thread Robert Burke
Not sure why newlines were eaten. Hopefully reflowed inline below. On 2024/01/10 17:53:56 Robert Burke wrote: > Hey everyone, Happy New Year! > > The next release (2.54.0) branch cut is scheduled for Jan 24, 2024, 2 weeks > from today, according to the release calendar [1]. I'

Re: ByteBuddy DoFnInvokers Write Up

2024-01-10 Thread Robert Burke
That's neat! Thanks for writing that up! On Wed, Jan 10, 2024, 11:12 AM John Casey via dev wrote: > The team at Google recently held an internal hackathon, and my hack > involved modifying how our ByteBuddy DoFnInvokers work. My hack didn't end > up going anywhere, but I learned a lot about how

Beam 2.54.0 Release

2024-01-10 Thread Robert Burke
There are currently 8 release blockers. Let me know if you have any comments/objections/questions. Thanks, Robert Burke [1] https://calendar.google.com/calendar/embed?src=0p73sl034k80oob7seouanigd0%40group.calendar.google.com [2] https://github.com/apache/beam/milestone/18 [3] https://beam.apache.org/contribute/release-blocking/

Re: [RESULT] [VOTE] Release 2.53.0, release candidate #2

2024-01-05 Thread Robert Burke
Done! On Fri, Jan 5, 2024, 11:30 AM Robert Burke wrote: > Going to try to get this done. Will report back when completed (or I get > pulled elsewhere). > > On Thu, Jan 4, 2024, 11:23 AM Jack McCluskey via dev > wrote: > >> Hey everyone, >> >> Following u

Re: [RESULT] [VOTE] Release 2.53.0, release candidate #2

2024-01-05 Thread Robert Burke
ously approved this release. >> >> There are nine approving votes, three of which are binding: >> * Jan Lukavský (binding) >> * Chamikara Jayalath (binding) >> * Robert Burke (binding) >> * XQ Hu >> * Danny McCormick >> * Bruno Volpato >> * Sv

Re: [VOTE] Release 2.53.0, release candidate #2

2024-01-03 Thread Robert Burke
+1 (binding) Validated the Go SDK against my own pipelines. Robert Burke On Wed, Jan 3, 2024, 7:52 AM Chamikara Jayalath via dev wrote: > +1 (binding) > > Validated Java/Python x-lang jobs. > > - Cham > > On Tue, Jan 2, 2024 at 7:35 AM Jack McCluskey via dev > wr

Re: How do side inputs relate to stage fusion?

2023-12-15 Thread Robert Burke
venting the fusion I was expecting. I'm looking into how to > make these hints mergeable now. > > On Thu, Dec 14, 2023 at 7:46 PM Robert Burke wrote: > >> Building on what Robert Bradshaw has said, basically, if these fusion >> breaks don't exist,

Re: How do side inputs relate to stage fusion?

2023-12-14 Thread Robert Burke
Building on what Robert Bradshaw has said, basically, if these fusion breaks don't exist, the pipeline can live lock, because the side input is unable to finish computing for a given input element's window. I have recently added fusion to the Go Prism runner based on the python side input semantic

Re: Beam 2.53.0 Release

2023-11-29 Thread Robert Burke
Thanks Jack! On Wed, Nov 29, 2023, 10:01 AM Jack McCluskey via dev wrote: > Hey everyone, the next release (2.53.0) branch cut is scheduled for Dec 13, > 2023, 2 weeks from today, according to the release calendar [1]. I'd like > to perform this release; I will cut the branch on that date, and c

Re: [YAML] Aggregations

2023-10-29 Thread Robert Burke
I came across Edge DB, and it has a novel syntax moving away from SQL with their EdgeQL. https://www.edgedb.com/ E.g., here are two equivalent "nested" queries. # EdgeQL select Movie { title, actors: { name }, rating := math::mean(.reviews.score) } filter "Zendaya" in .actors.name;

Re: Streaming update compatibility

2023-10-27 Thread Robert Burke
On Fri, Oct 27, 2023, 9:09 AM Robert Bradshaw via dev wrote: > On Fri, Oct 27, 2023 at 7:50 AM Kellen Dye via dev > wrote: > > > > > Auto is hard, because it would involve > > > querying the runner before pipeline construction, and we may not even > > > know what the runner is at this point > >

Re: Streaming update compatibility

2023-10-27 Thread Robert Burke
You raise a very good point: https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/org/apache/beam/model/job_management/v1/beam_job_api.proto#L54 The job management API does allow for the pipeline proto to be returned. So one could get the live job, so the SDK could make

Re: Streaming update compatibility

2023-10-26 Thread Robert Burke
Regarding 3. I suspect Go wasn't changed because the PR is centering around implementations of the Expansion Service server, not client callers. The Go SDK doesn't yet have an expansion service. On Thu, Oct 26, 2023, 3:59 AM Johanna Öjeling via dev wrote: > Hi, > > I like this idea of making it

Re: [YAML] Aggregations

2023-10-18 Thread Robert Burke
MongoDB has its own concept of aggregation pipelines as well. https://www.mongodb.com/docs/manual/core/aggregation-pipeline/#std-label-aggregation-pipeline On Wed, Oct 18, 2023, 6:07 PM Robert Bradshaw via dev wrote: > On Wed, Oct 18, 2023 at 5:06 PM Byron Ellis wrote: > > > > Is it worth tak

Re: [Question] Bundle finalization callback

2023-10-15 Thread Robert Burke
ne of e.g. 30s and a continuous stream of messages that > keeps the ProcessElement active. Then I will want to interrupt processing > of new messages and self-checkpoint before those 30s have passed, if the > runner hasn't initiated it within that time frame. > > Johanna > >

Re: [Question] Bundle finalization callback

2023-10-15 Thread Robert Burke
Hi! Answers inline. On Sun, Oct 15, 2023, 11:48 AM Johanna Öjeling via dev wrote: > Hi, > > I'm working on a native streaming IO connector for the Go SDK to enable > reads and writes from/to NATS (#29000 > ) and would like to better > understand how bu

Re: [PROPOSAL] [Nice-to-have] CI job names and commands that match

2023-10-10 Thread Robert Burke
of writing > ":runners:google-cloud-dataflow-java:validatesRunnerV2Streaming". > > Last thing I'll add - this is true for you and probably many contributors, > but is less friendly for new folks who are less familiar with the project > IMO (especially when the filepath/comma

Re: [PROPOSAL] [Nice-to-have] CI job names and commands that match

2023-10-10 Thread Robert Burke
+1 to the general proposal. I'm not bothered if something says a Gradle command and, in execution, that task ends up running multiple different commands. Arguably, if we're running a Gradle task manually to prepare for a subsequent task, that latter task should be adding the former to its dependenci

Re: [YAML] Fileio sink parameterization (streaming, sharding, and naming)

2023-10-09 Thread Robert Burke
lexity you mention.) But it's just as likely I missed a document somewhere. It has been a while since I last searched for this, let alone have time to do the deep dives required to produce it. Robert Burke Beam Go Busybody On Mon, Oct 9, 2023, 12:37 PM Robert Bradshaw via dev wrote: > C
