Beam High Priority Issue Report (38)
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities.

Unassigned P1 Issues:

https://github.com/apache/beam/issues/33425 [Bug]: beam_Publish_Beam_SDK_Snapshots is extremely flaky due to failing to build wheels
https://github.com/apache/beam/issues/33407 [Bug]: tfrecordio does not work with snappy >= 0.7
https://github.com/apache/beam/issues/33393 The PerformanceTests HadoopFormat job is flaky
https://github.com/apache/beam/issues/33253 The PostCommit Python Xlang IO Dataflow job is flaky
https://github.com/apache/beam/issues/33065 The Python ValidatesContainer Dataflow ARM job is flaky
https://github.com/apache/beam/issues/32997 [Bug]: Non Retained Messages missing after MqttIO.Read checkpoint restore
https://github.com/apache/beam/issues/32832 [Failing Test]: PreCommit Yaml Xlang Direct is broken
https://github.com/apache/beam/issues/32509 [Bug]: Unable to Restart Google Spanner Change Streams Consumer due to tableExists(table_name) bug
https://github.com/apache/beam/issues/32161 The Publish Beam SDK Snapshots job is flaky
https://github.com/apache/beam/issues/32144 The PerformanceTests WordCountIT PythonVersions job is flaky
https://github.com/apache/beam/issues/31882 The Build python source distribution and wheels job is flaky
https://github.com/apache/beam/issues/31846 The Clean Up GCP Resources job is flaky
https://github.com/apache/beam/issues/31254 [Failing Test]: Onnx inference unit tests are failing.
https://github.com/apache/beam/issues/30593 The PostCommit XVR PythonUsingJavaSQL Dataflow job is flaky
https://github.com/apache/beam/issues/30526 The PerformanceTests xlang KafkaIO Python job is flaky
https://github.com/apache/beam/issues/30525 The PostCommit Python ValidatesContainer Dataflow With RC job is flaky
https://github.com/apache/beam/issues/30521 The LoadTests Go Combine Flink Batch job is flaky
https://github.com/apache/beam/issues/30520 The LoadTests Python Combine Flink Streaming job is flaky
https://github.com/apache/beam/issues/30519 The PostCommit XVR GoUsingJava Dataflow job is flaky
https://github.com/apache/beam/issues/30517 The PostCommit XVR Direct job is flaky
https://github.com/apache/beam/issues/29971 [Bug]: FixedWindows not working for large Kafka topic
https://github.com/apache/beam/issues/29515 [Bug]: WriteToFiles in python leave few records in temp directory when writing to large number (100+) of files
https://github.com/apache/beam/issues/29099 [Bug]: FnAPI Java SDK Harness doesn't update user counters in OnTimer callback functions
https://github.com/apache/beam/issues/28760 [Bug]: EFO Kinesis IO reader provided by apache beam does not pick the event time for watermarking
https://github.com/apache/beam/issues/26329 [Bug]: BigQuerySourceBase does not propagate a Coder to AvroSource
https://github.com/apache/beam/issues/26041 [Bug]: Unable to create exactly-once Flink pipeline with stream source and file sink
https://github.com/apache/beam/issues/25946 [Task]: Support more Beam portable schema types as Python types
https://github.com/apache/beam/issues/24776 [Bug]: Race condition in Python SDK Harness ProcessBundleProgress
https://github.com/apache/beam/issues/23525 [Bug]: Default PubsubMessage coder will drop message id and orderingKey
https://github.com/apache/beam/issues/22605 [Bug]: Beam Python failure for dataflow_exercise_metrics_pipeline_test.ExerciseMetricsPipelineTest.test_metrics_it
https://github.com/apache/beam/issues/21643 FnRunnerTest with non-trivial (order 1000 elements) numpy input flakes in non-cython environment
https://github.com/apache/beam/issues/21476 WriteToBigQuery Dynamic table destinations returns wrong tableId
https://github.com/apache/beam/issues/21260 Python DirectRunner does not emit data at GC time
https://github.com/apache/beam/issues/20108 Python direct runner doesn't emit empty pane when it should

P1 Issues with no update in the last week:

https://github.com/apache/beam/issues/30606 The PostCommit Java Nexmark Dataflow job is flaky
https://github.com/apache/beam/issues/30507 The LoadTests Go GBK Flink Batch job is flaky
https://github.com/apache/beam/issues/30502 The LoadTests Go CoGBK Flink Batch job is flaky
https://github.com/apache/beam/issues/25975 [Bug]: KinesisIO processing-time watermarking can cause data loss
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Congratulations Danny! 😀
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Congratulations Danny!

On Fri, Dec 20, 2024, 1:18 PM Jan Lukavský wrote:

> Congrats Danny! Well deserved!
>
> On 12/20/24 21:22, Danny McCormick via dev wrote:
>
>> Thanks everyone! I'm excited and honored to join!
>>
>> On Fri, Dec 20, 2024 at 3:08 PM Ravi Magham wrote:
>>
>>> Congrats Danny !
>>>
>>> On Fri, Dec 20, 2024 at 12:03 PM Valentyn Tymofieiev via dev <dev@beam.apache.org> wrote:
>>>
>>>> So well deserved!!
>>>>
>>>> Congratulations, Danny!
>>>>
>>>> On Fri, Dec 20, 2024, 19:56 Robert Bradshaw via dev wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Please join me and the rest of the Beam PMC in welcoming Danny
>>>>> McCormick as the newest member of the PMC.
>>>>>
>>>>> Danny has been contributing to Beam for several years now, most
>>>>> notably in the important and growing ML components of Beam. During
>>>>> this time he has been one of the most prolific developers and
>>>>> reviewers across the entire project. In addition, he has initiated and
>>>>> carried out numerous improvements to the beam project as a whole (e.g.
>>>>> improving our testing infrastructure, simplifying our release process)
>>>>> and plays a key role in strengthening our community, both online and
>>>>> via in person events like Beam Summits.
>>>>>
>>>>> Congratulations Danny and thanks for being a part of Apache Beam!
>>>>>
>>>>> Robert, on behalf of the Beam PMC (which now includes Danny)
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Congratulations Danny!! Well deserved!

On Fri, Dec 20, 2024 at 6:36 PM Robert Burke wrote:

> Congratulations Danny!
>
> On Fri, Dec 20, 2024, 1:18 PM Jan Lukavský wrote:
>
>> Congrats Danny! Well deserved!
>>
>> On 12/20/24 21:22, Danny McCormick via dev wrote:
>>
>>> Thanks everyone! I'm excited and honored to join!
>>>
>>> On Fri, Dec 20, 2024 at 3:08 PM Ravi Magham wrote:
>>>
>>>> Congrats Danny !
>>>>
>>>> On Fri, Dec 20, 2024 at 12:03 PM Valentyn Tymofieiev via dev <dev@beam.apache.org> wrote:
>>>>
>>>>> So well deserved!!
>>>>>
>>>>> Congratulations, Danny!
>>>>>
>>>>> On Fri, Dec 20, 2024, 19:56 Robert Bradshaw via dev <dev@beam.apache.org> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Please join me and the rest of the Beam PMC in welcoming Danny
>>>>>> McCormick as the newest member of the PMC.
>>>>>>
>>>>>> Danny has been contributing to Beam for several years now, most
>>>>>> notably in the important and growing ML components of Beam. During
>>>>>> this time he has been one of the most prolific developers and
>>>>>> reviewers across the entire project. In addition, he has initiated and
>>>>>> carried out numerous improvements to the beam project as a whole (e.g.
>>>>>> improving our testing infrastructure, simplifying our release process)
>>>>>> and plays a key role in strengthening our community, both online and
>>>>>> via in person events like Beam Summits.
>>>>>>
>>>>>> Congratulations Danny and thanks for being a part of Apache Beam!
>>>>>>
>>>>>> Robert, on behalf of the Beam PMC (which now includes Danny)
A Beam Development podcast, via AI!
I learned about pi.dev from one of my Go podcasts and decided to point it to the Beam repo: https://pi.dev/github.com/apache/beam

If you put that URL into a podcast app, you'll be able to find an AI-voiced and AI-analyzed ~5 minute summary of changes to Beam from the 18th to the 20th! The notes for the episode even link to the PRs, copied below.

I think it's a pretty neat way of getting a summary of what's going on.

Cheers,
Robert Burke

- Performance Benchmarking Framework for Dataflow: [#33297](https://github.com/apache/beam/pull/33297)
- YAML Transform RunInference with VertexAI: [#33406](https://github.com/apache/beam/pull/33406)
- RAG Pipelines with Chunking and Embedding: [#33364](https://github.com/apache/beam/pull/33364)
- BigQueryIO BigLake Table Creation: [#33125](https://github.com/apache/beam/pull/33125)
- Spanner Change Stream IO Connector Fix: [#32474](https://github.com/apache/beam/pull/32474)
- SolaceIO.read Connector Client Closure Fix: [#32962](https://github.com/apache/beam/pull/32962)
- Related Issue to SolaceIO.read: [#32964](https://github.com/apache/beam/pull/32964)
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Congrats Danny!

On Fri, Dec 20, 2024 at 11:46 AM Riju Kallivalappil via dev <dev@beam.apache.org> wrote:

> Congratulations Danny!
>
> On Fri, Dec 20, 2024 at 11:33 AM Ketan Maydeo via dev wrote:
>
>> Congratulations Danny!
>>
>> On Fri, Dec 20, 2024 at 2:27 PM XQ Hu via dev wrote:
>>
>>> Congrats Danny!
>>>
>>> On Fri, Dec 20, 2024 at 2:12 PM Byron Ellis via dev wrote:
>>>
>>>> Congrats Danny!
>>>>
>>>> On Fri, Dec 20, 2024 at 11:02 AM Svetak Sundhar via dev <dev@beam.apache.org> wrote:
>>>>
>>>>> Congrats Danny! Really well deserved, and excited to see all that you
>>>>> do to continue to take the project forward.
>>>>>
>>>>> Svetak Sundhar
>>>>> Data Engineer
>>>>> s vetaksund...@google.com
>>>>>
>>>>> On Fri, Dec 20, 2024 at 1:56 PM Robert Bradshaw via dev <dev@beam.apache.org> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Please join me and the rest of the Beam PMC in welcoming Danny
>>>>>> McCormick as the newest member of the PMC.
>>>>>>
>>>>>> Danny has been contributing to Beam for several years now, most
>>>>>> notably in the important and growing ML components of Beam. During
>>>>>> this time he has been one of the most prolific developers and
>>>>>> reviewers across the entire project. In addition, he has initiated and
>>>>>> carried out numerous improvements to the beam project as a whole (e.g.
>>>>>> improving our testing infrastructure, simplifying our release process)
>>>>>> and plays a key role in strengthening our community, both online and
>>>>>> via in person events like Beam Summits.
>>>>>>
>>>>>> Congratulations Danny and thanks for being a part of Apache Beam!
>>>>>>
>>>>>> Robert, on behalf of the Beam PMC (which now includes Danny)
Re: [ANNOUNCE] New PMC Member: Danny McCormick
So well deserved!!

Congratulations, Danny!

On Fri, Dec 20, 2024, 19:56 Robert Bradshaw via dev wrote:

> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming Danny
> McCormick as the newest member of the PMC.
>
> Danny has been contributing to Beam for several years now, most
> notably in the important and growing ML components of Beam. During
> this time he has been one of the most prolific developers and
> reviewers across the entire project. In addition, he has initiated and
> carried out numerous improvements to the beam project as a whole (e.g.
> improving our testing infrastructure, simplifying our release process)
> and plays a key role in strengthening our community, both online and
> via in person events like Beam Summits.
>
> Congratulations Danny and thanks for being a part of Apache Beam!
>
> Robert, on behalf of the Beam PMC (which now includes Danny)
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Congrats Danny !

On Fri, Dec 20, 2024 at 12:03 PM Valentyn Tymofieiev via dev <dev@beam.apache.org> wrote:

> So well deserved!!
>
> Congratulations, Danny!
>
> On Fri, Dec 20, 2024, 19:56 Robert Bradshaw via dev wrote:
>
>> Hi all,
>>
>> Please join me and the rest of the Beam PMC in welcoming Danny
>> McCormick as the newest member of the PMC.
>>
>> Danny has been contributing to Beam for several years now, most
>> notably in the important and growing ML components of Beam. During
>> this time he has been one of the most prolific developers and
>> reviewers across the entire project. In addition, he has initiated and
>> carried out numerous improvements to the beam project as a whole (e.g.
>> improving our testing infrastructure, simplifying our release process)
>> and plays a key role in strengthening our community, both online and
>> via in person events like Beam Summits.
>>
>> Congratulations Danny and thanks for being a part of Apache Beam!
>>
>> Robert, on behalf of the Beam PMC (which now includes Danny)
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Big congrats, Danny!

On Fri, Dec 20, 2024 at 1:56 PM Robert Bradshaw via dev wrote:

> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming Danny
> McCormick as the newest member of the PMC.
>
> Danny has been contributing to Beam for several years now, most
> notably in the important and growing ML components of Beam. During
> this time he has been one of the most prolific developers and
> reviewers across the entire project. In addition, he has initiated and
> carried out numerous improvements to the beam project as a whole (e.g.
> improving our testing infrastructure, simplifying our release process)
> and plays a key role in strengthening our community, both online and
> via in person events like Beam Summits.
>
> Congratulations Danny and thanks for being a part of Apache Beam!
>
> Robert, on behalf of the Beam PMC (which now includes Danny)
Re: Remove Deprecated v1 AWS IOs
Thanks! I will go ahead and move forward with this over the next few weeks. I created https://github.com/apache/beam/issues/33430 to track the work, and I will follow up with a user@ thread as well (thanks for the suggestion Alexey).

Thanks,
Danny

On Wed, Dec 18, 2024 at 8:01 AM Alexey Romanenko wrote:

> +1
> Yes, long-awaited thing!
>
> Makes sense to me, since 2+ years should be quite enough to move to
> the AWS v2 IO connectors. Though, I'd recommend announcing it on user@
> as well in advance.
>
> ---
> Alexey
>
> On 2024/12/12 20:25:20 Danny McCormick via dev wrote:
> > Hey everyone, I've been working on upgrading our Java version of
> > protobuf to protobuf 4 (also needed to keep many other dependencies
> > up to date). As part of this, I've found that the AWS v1 KinesisIO [1]
> > is incompatible with protobuf 4 (on upgrade, tests now hang [2]).
> > Other v1 libraries likely are incompatible as well.
> >
> > These IOs have been deprecated since Beam 2.41.0 (July 2022), with the
> > message "You are using a deprecated IO for DynamoDB. Please migrate to
> > module 'org.apache.beam:beam-sdks-java-io-amazon-web-services2'." [3],
> > and the new libraries have been recommended for longer than that. The
> > underlying libraries are in maintenance mode, with EOL scheduled for
> > the end of next year [4]. Rather than trying to find a workaround to
> > patch these libraries, I'd like to remove them in favor of the
> > non-deprecated libraries right after next week's release cut. Are
> > there any objections to this approach? If I don't hear any objections,
> > I will proceed with this approach next week (draft PR [5]).
> >
> > Thanks,
> > Danny
> >
> > [1] https://github.com/apache/beam/tree/master/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis
> > [2] https://github.com/apache/beam/actions/runs/12263616473/job/34215568385?pr=33192
> > [3] https://github.com/apache/beam/commit/f5435c0575870062f39575271c0f483117908403
> > [4] https://aws.amazon.com/blogs/developer/announcing-end-of-support-for-aws-sdk-for-java-v1-x-on-december-31-2025/
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Congrats Danny!

On Fri, Dec 20, 2024 at 11:02 AM Svetak Sundhar via dev wrote:

> Congrats Danny! Really well deserved, and excited to see all that you do
> to continue to take the project forward.
>
> Svetak Sundhar
> Data Engineer
> s vetaksund...@google.com
>
> On Fri, Dec 20, 2024 at 1:56 PM Robert Bradshaw via dev <dev@beam.apache.org> wrote:
>
>> Hi all,
>>
>> Please join me and the rest of the Beam PMC in welcoming Danny
>> McCormick as the newest member of the PMC.
>>
>> Danny has been contributing to Beam for several years now, most
>> notably in the important and growing ML components of Beam. During
>> this time he has been one of the most prolific developers and
>> reviewers across the entire project. In addition, he has initiated and
>> carried out numerous improvements to the beam project as a whole (e.g.
>> improving our testing infrastructure, simplifying our release process)
>> and plays a key role in strengthening our community, both online and
>> via in person events like Beam Summits.
>>
>> Congratulations Danny and thanks for being a part of Apache Beam!
>>
>> Robert, on behalf of the Beam PMC (which now includes Danny)
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Congrats Danny! Well deserved!

On 12/20/24 21:22, Danny McCormick via dev wrote:

> Thanks everyone! I'm excited and honored to join!
>
> On Fri, Dec 20, 2024 at 3:08 PM Ravi Magham wrote:
>
>> Congrats Danny !
>>
>> On Fri, Dec 20, 2024 at 12:03 PM Valentyn Tymofieiev via dev wrote:
>>
>>> So well deserved!!
>>>
>>> Congratulations, Danny!
>>>
>>> On Fri, Dec 20, 2024, 19:56 Robert Bradshaw via dev wrote:
>>>
>>>> Hi all,
>>>>
>>>> Please join me and the rest of the Beam PMC in welcoming Danny
>>>> McCormick as the newest member of the PMC.
>>>>
>>>> Danny has been contributing to Beam for several years now, most
>>>> notably in the important and growing ML components of Beam. During
>>>> this time he has been one of the most prolific developers and
>>>> reviewers across the entire project. In addition, he has initiated and
>>>> carried out numerous improvements to the beam project as a whole (e.g.
>>>> improving our testing infrastructure, simplifying our release process)
>>>> and plays a key role in strengthening our community, both online and
>>>> via in person events like Beam Summits.
>>>>
>>>> Congratulations Danny and thanks for being a part of Apache Beam!
>>>>
>>>> Robert, on behalf of the Beam PMC (which now includes Danny)
[ANNOUNCE] New PMC Member: Danny McCormick
Hi all,

Please join me and the rest of the Beam PMC in welcoming Danny McCormick as the newest member of the PMC.

Danny has been contributing to Beam for several years now, most notably in the important and growing ML components of Beam. During this time he has been one of the most prolific developers and reviewers across the entire project. In addition, he has initiated and carried out numerous improvements to the beam project as a whole (e.g. improving our testing infrastructure, simplifying our release process) and plays a key role in strengthening our community, both online and via in person events like Beam Summits.

Congratulations Danny and thanks for being a part of Apache Beam!

Robert, on behalf of the Beam PMC (which now includes Danny)
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Congrats Danny! Really well deserved, and excited to see all that you do to continue to take the project forward.

Svetak Sundhar
Data Engineer
s vetaksund...@google.com

On Fri, Dec 20, 2024 at 1:56 PM Robert Bradshaw via dev wrote:

> Hi all,
>
> Please join me and the rest of the Beam PMC in welcoming Danny
> McCormick as the newest member of the PMC.
>
> Danny has been contributing to Beam for several years now, most
> notably in the important and growing ML components of Beam. During
> this time he has been one of the most prolific developers and
> reviewers across the entire project. In addition, he has initiated and
> carried out numerous improvements to the beam project as a whole (e.g.
> improving our testing infrastructure, simplifying our release process)
> and plays a key role in strengthening our community, both online and
> via in person events like Beam Summits.
>
> Congratulations Danny and thanks for being a part of Apache Beam!
>
> Robert, on behalf of the Beam PMC (which now includes Danny)
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Congrats Danny!

On Fri, Dec 20, 2024 at 2:12 PM Byron Ellis via dev wrote:

> Congrats Danny!
>
> On Fri, Dec 20, 2024 at 11:02 AM Svetak Sundhar via dev <dev@beam.apache.org> wrote:
>
>> Congrats Danny! Really well deserved, and excited to see all that you do
>> to continue to take the project forward.
>>
>> Svetak Sundhar
>> Data Engineer
>> s vetaksund...@google.com
>>
>> On Fri, Dec 20, 2024 at 1:56 PM Robert Bradshaw via dev <dev@beam.apache.org> wrote:
>>
>>> Hi all,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming Danny
>>> McCormick as the newest member of the PMC.
>>>
>>> Danny has been contributing to Beam for several years now, most
>>> notably in the important and growing ML components of Beam. During
>>> this time he has been one of the most prolific developers and
>>> reviewers across the entire project. In addition, he has initiated and
>>> carried out numerous improvements to the beam project as a whole (e.g.
>>> improving our testing infrastructure, simplifying our release process)
>>> and plays a key role in strengthening our community, both online and
>>> via in person events like Beam Summits.
>>>
>>> Congratulations Danny and thanks for being a part of Apache Beam!
>>>
>>> Robert, on behalf of the Beam PMC (which now includes Danny)
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Congratulations Danny!

On Fri, Dec 20, 2024 at 2:27 PM XQ Hu via dev wrote:

> Congrats Danny!
>
> On Fri, Dec 20, 2024 at 2:12 PM Byron Ellis via dev wrote:
>
>> Congrats Danny!
>>
>> On Fri, Dec 20, 2024 at 11:02 AM Svetak Sundhar via dev <dev@beam.apache.org> wrote:
>>
>>> Congrats Danny! Really well deserved, and excited to see all that you do
>>> to continue to take the project forward.
>>>
>>> Svetak Sundhar
>>> Data Engineer
>>> s vetaksund...@google.com
>>>
>>> On Fri, Dec 20, 2024 at 1:56 PM Robert Bradshaw via dev <dev@beam.apache.org> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Please join me and the rest of the Beam PMC in welcoming Danny
>>>> McCormick as the newest member of the PMC.
>>>>
>>>> Danny has been contributing to Beam for several years now, most
>>>> notably in the important and growing ML components of Beam. During
>>>> this time he has been one of the most prolific developers and
>>>> reviewers across the entire project. In addition, he has initiated and
>>>> carried out numerous improvements to the beam project as a whole (e.g.
>>>> improving our testing infrastructure, simplifying our release process)
>>>> and plays a key role in strengthening our community, both online and
>>>> via in person events like Beam Summits.
>>>>
>>>> Congratulations Danny and thanks for being a part of Apache Beam!
>>>>
>>>> Robert, on behalf of the Beam PMC (which now includes Danny)
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Congratulations Danny!

On Fri, Dec 20, 2024 at 11:33 AM Ketan Maydeo via dev wrote:

> Congratulations Danny!
>
> On Fri, Dec 20, 2024 at 2:27 PM XQ Hu via dev wrote:
>
>> Congrats Danny!
>>
>> On Fri, Dec 20, 2024 at 2:12 PM Byron Ellis via dev wrote:
>>
>>> Congrats Danny!
>>>
>>> On Fri, Dec 20, 2024 at 11:02 AM Svetak Sundhar via dev <dev@beam.apache.org> wrote:
>>>
>>>> Congrats Danny! Really well deserved, and excited to see all that you
>>>> do to continue to take the project forward.
>>>>
>>>> Svetak Sundhar
>>>> Data Engineer
>>>> s vetaksund...@google.com
>>>>
>>>> On Fri, Dec 20, 2024 at 1:56 PM Robert Bradshaw via dev <dev@beam.apache.org> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Please join me and the rest of the Beam PMC in welcoming Danny
>>>>> McCormick as the newest member of the PMC.
>>>>>
>>>>> Danny has been contributing to Beam for several years now, most
>>>>> notably in the important and growing ML components of Beam. During
>>>>> this time he has been one of the most prolific developers and
>>>>> reviewers across the entire project. In addition, he has initiated and
>>>>> carried out numerous improvements to the beam project as a whole (e.g.
>>>>> improving our testing infrastructure, simplifying our release process)
>>>>> and plays a key role in strengthening our community, both online and
>>>>> via in person events like Beam Summits.
>>>>>
>>>>> Congratulations Danny and thanks for being a part of Apache Beam!
>>>>>
>>>>> Robert, on behalf of the Beam PMC (which now includes Danny)
Re: [ANNOUNCE] New PMC Member: Danny McCormick
Thanks everyone! I'm excited and honored to join!

On Fri, Dec 20, 2024 at 3:08 PM Ravi Magham wrote:

> Congrats Danny !
>
> On Fri, Dec 20, 2024 at 12:03 PM Valentyn Tymofieiev via dev <dev@beam.apache.org> wrote:
>
>> So well deserved!!
>>
>> Congratulations, Danny!
>>
>> On Fri, Dec 20, 2024, 19:56 Robert Bradshaw via dev wrote:
>>
>>> Hi all,
>>>
>>> Please join me and the rest of the Beam PMC in welcoming Danny
>>> McCormick as the newest member of the PMC.
>>>
>>> Danny has been contributing to Beam for several years now, most
>>> notably in the important and growing ML components of Beam. During
>>> this time he has been one of the most prolific developers and
>>> reviewers across the entire project. In addition, he has initiated and
>>> carried out numerous improvements to the beam project as a whole (e.g.
>>> improving our testing infrastructure, simplifying our release process)
>>> and plays a key role in strengthening our community, both online and
>>> via in person events like Beam Summits.
>>>
>>> Congratulations Danny and thanks for being a part of Apache Beam!
>>>
>>> Robert, on behalf of the Beam PMC (which now includes Danny)