Beam High Priority Issue Report (43)

2023-10-12 Thread beamactions
This is your daily summary of Beam's current high priority issues that may need attention. See https://beam.apache.org/contribute/issue-priorities for the meaning and expectations around issue priorities. Unassigned P1 Issues: https://github.com/apache/beam/issues/28909 [Stuck Test]: GitHu

Re: Proposal for pyproject.toml Support in Apache Beam Python

2023-10-12 Thread Anand Inguva via dev
The PR https://github.com/apache/beam/pull/28385 is merged today. If there are any observed failures, please comment on the PR and I will follow up with a forward fix. Thanks. On Fri, Sep 1, 2023 at 2:30 PM Anand Inguva wrote: > Since there is positive feedback from the dev community, I am going

Re: Proposal for pyproject.toml Support in Apache Beam Python

2023-10-12 Thread Robert Bradshaw via dev
Does this change any development practices? E.g. if I clone the repo, I'm assuming I couldn't run "setup.py test" anymore. What about the generated files (like protos, or the yaml definitions copied from other parts of the repo)? On Thu, Oct 12, 2023 at 12:27 PM Anand Inguva via dev wrote: > The

Re: Proposal for pyproject.toml Support in Apache Beam Python

2023-10-12 Thread Anand Inguva via dev
I am in the process of updating the documentation at https://cwiki.apache.org/confluence/display/BEAM/Python+Tips related to setup.py/pyproject.toml changes, but yes, you can't call setup.py directly because it might fail if Beam Python's build-time dependencies are not installed. With re

Re: Proposal for pyproject.toml Support in Apache Beam Python

2023-10-12 Thread Robert Bradshaw via dev
On Thu, Oct 12, 2023 at 2:04 PM Anand Inguva wrote: > I am in the process of updating the documentation at > https://cwiki.apache.org/confluence/display/BEAM/Python+Tips related to > setup.py/pyproject.toml changes, but yes you can't call setup.py directly > because it might fail due to the lack

Re: Proposal for pyproject.toml Support in Apache Beam Python

2023-10-12 Thread Anand Inguva via dev
gen_protos.py is called when building an sdist, a wheel, or an editable installation. We run pytest through the tox package, and during the tox build process gen_protos.py is called as part of either wheel or sdist creation. For building an sdist, the process has now changed from `python setup.py sdis

Apache Beam 2.50.0 - org.apache.beam.sdk.options.MemoryMonitorOptions ClassDef Not found

2023-10-12 Thread Deliwala, Jaymik H. via dev
Hello Team Greetings!! As part of upgrading our Dataflow - Apache beam version from 2.46.0 to 2.49.0/2.50.0, we are able to compile the mvn package successfully. However, while running the compile exec command, we are getting an error as below - org.apache.beam.sdk.options.MemoryMonitorOptions

Re: [Question] Read Parquet Schema from S3 Directory

2023-10-12 Thread Robert Bradshaw via dev
You'll probably need to resolve "s3a:///*.parquet" out into a concrete non-glob filepattern to inspect it this way. Presumably any individual shard will do. match and open from https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/FileSystems.html may be useful. On Wed, Oct 11, 2
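
As a rough illustration of the suggestion above (expand the glob, then inspect any single shard), here is a minimal sketch. It uses Python's standard `glob` module on a local temporary directory as a stand-in; it is not Beam's `FileSystems.match`, and the helper name `first_matching_file` and the shard file names are hypothetical.

```python
import glob
import os
import tempfile


def first_matching_file(pattern):
    """Resolve a glob pattern to one concrete file path (hypothetical helper)."""
    # Sort so the choice of shard is deterministic.
    matches = sorted(glob.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"no files match {pattern!r}")
    return matches[0]


def demo():
    """Create a few empty stand-in shards and pick one of them."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("part-00000.parquet", "part-00001.parquet", "_SUCCESS"):
            open(os.path.join(tmp, name), "w").close()
        chosen = first_matching_file(os.path.join(tmp, "*.parquet"))
        return os.path.basename(chosen)
```

Any single match is enough if every shard shares one schema, which is the assumption in the reply above.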

Re: [YAML] Fileio sink parameterization (streaming, sharding, and naming)

2023-10-12 Thread Robert Bradshaw via dev
OK, so how about this for a concrete proposal:

    sink:
      type: WriteToParquet
      config:
        path: "/beam/filesytem/{record.my_col}-{timestamp.year}{timestamp.month}{timestamp.day}"
        suffix: ".parquet"

The eventual path would be . The suffix would be optional, and there could
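
To make the placeholder syntax in the proposed `path` concrete, here is a minimal sketch of how such a template could be expanded for one record. It relies only on Python's built-in `str.format` attribute lookup; it is not how Beam YAML implements this, the helper name `expand_path` and the sample record are hypothetical, and any sharding information is ignored. The template string is copied verbatim from the proposal.

```python
from datetime import datetime
from types import SimpleNamespace


def expand_path(template, record, timestamp, suffix=""):
    """Fill {record.<field>} and {timestamp.<attr>} placeholders (hypothetical helper)."""
    # str.format resolves dotted names as attribute lookups on the arguments.
    return template.format(record=record, timestamp=timestamp) + suffix


TEMPLATE = "/beam/filesytem/{record.my_col}-{timestamp.year}{timestamp.month}{timestamp.day}"

example = expand_path(
    TEMPLATE,
    record=SimpleNamespace(my_col="a"),  # made-up record with one field
    timestamp=datetime(2023, 10, 12),
    suffix=".parquet",
)
print(example)  # /beam/filesytem/a-20231012.parquet
```

Note that plain attribute formatting does not zero-pad: a timestamp in March would render the month as `3`, so a real implementation would likely need an explicit format for dates.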