Thank you Vitaly and Beam Infrastructure Team!

On Fri, Apr 11, 2025 at 10:25 AM Jan Lukavský <je...@seznam.cz> wrote:
> Great job, thanks for your work!
>
> On 4/11/25 16:01, XQ Hu via dev wrote:
> > Great work, Vitaly and your team! Thanks a lot!
> >
> > On Fri, Apr 11, 2025 at 9:48 AM Vitaly Terentyev via dev <dev@beam.apache.org> wrote:
> >
>> Dear Community,
>>
>> March was a dynamic month for Beam Infrastructure & Health. We began and
>> ended the month with a solid health level of 98.38%, but encountered two
>> temporary dips due to a combination of emerging issues and system-level
>> changes.
>>
>> Health Trends and Incident Analysis:
>>
>> - The first drop was linked to scattered failures across multiple areas
>>   of the codebase, including Python, Java, Go, and both Flink and Spark
>>   runners. These issues were quickly triaged and mitigated.
>> - The second drop occurred due to a method signature change that
>>   required an update to the Dataflow Java container version, alongside a
>>   group of failing XVR workflows caused by an integer overflow during
>>   varint32 encoding. These were promptly resolved.
>>
>> Thanks to rapid resolution efforts, the system health recovered to 98.38%
>> by the end of March. Please see the attached chart for March's Health
>> Status trends.
>>
>> Key Improvements:
>>
>> - Flaky Test Fixes:
>>   - PostCommit and PreCommit jobs across Java, Python, SQL, and Go.
>>   - XVR workflows and other runner-related jobs.
>>   - You can find the full list of the 21 closed or fixed issues here
>>     <https://github.com/apache/beam/issues?q=is%3Aissue%20state%3Aclosed%20label%3Aflaky_test%20closed%3A%3E2025-03-01%20%20closed%3A%3C2025-03-31%20(involves%3AAmar3tto%20OR%20involves%3Aakashorabek)%20>.
>> - Performance Metrics Update:
>>   - Added Performance Metrics for Python ML pipelines.
>>   - Updated Performance Metrics graphs on the Beam website
>>     <https://beam.apache.org/performance/> using Looker-generated
>>     images up to Beam 2.64.0.
>>
>> Currently failing workflows:
>>
>> - Core Infrastructure (1)
>>   - Publish Beam SDK Snapshots
>>     <https://github.com/apache/beam/issues/32161>
>> - Important Signals (2)
>>   - PostCommit Python Arm <https://github.com/apache/beam/issues/30760>
>>   - PostCommit Python <https://github.com/apache/beam/issues/30513>
>> - Dataflow Java Tests (1)
>>   - PostCommit XVR GoUsingJava Dataflow
>>     <https://github.com/apache/beam/issues/30519>
>> - Python Runners Tests (1)
>>   - Python ValidatesContainer Dataflow ARM
>>     <https://github.com/apache/beam/issues/33065>
>> - Misc Tests (2)
>>   - IcebergIO Integration Tests
>>     <https://github.com/apache/beam/issues/31931>
>>   - PostCommit XVR Flink <https://github.com/apache/beam/issues/31418>
>>
>> Ongoing and Future Work:
>>
>> - Continue stabilizing newly emerging issues, with particular attention
>>   to Python-related workflows.
>> - Investigate and fix instability in the IcebergIO Integration Tests
>>   workflow.
>> - Maintain high visibility of flaky and infra issues via our Health
>>   Dashboard.
>>
>> As always, if you notice infrastructure-related issues, feel free to open
>> a GitHub issue with the label “infra
>> <https://github.com/apache/beam/issues?q=is%3Aissue%20state%3Aopen%20label%3Ainfra>”,
>> and our team will triage and handle it.
>>
>> Your engagement makes a big difference and is always welcome.
>>
>> Best regards,
>> Vitaly Terentyev
>> Akvelon Inc.
>> Apache Beam Infrastructure Team