Hi Damon,

I fear Beam's current release / versioning strategy doesn't lend itself 
well to such breaking changes. Alexey and I have spent quite some time 
discussing how to proceed with the problematic Avro dependency in core (and, 
correspondingly, AvroIO).
Such changes almost always require duplicating code so that a deprecated 
legacy code path keeps working and users' code does not break. This comes at a 
very high price: until the deprecated code path can finally be removed, the 
code must be maintained in two places.
Unfortunately, removing deprecated code is rather problematic without a 
major version release, as it would break semantic versioning and people's 
expectations. As a result, deprecations carry the inherent risk of 
unintentionally degrading quality rather than improving it.
I'd therefore recommend against such efforts unless there are very strong 
reasons to do so.

Best, Moritz

On 07.12.22, 18:05, "Damon Douglas via dev" <dev@beam.apache.org> wrote:

Hello Everyone,

If you consider yourself to be on the Beam learning journey, even if this is 
your first day, please see yourself as a welcome participant in this 
conversation and consider reviewing the bottom portion of this email for guidance.

The Short Version (For those with Java Beam SDK knowledge):

Should we migrate FileIO / TextIO and related classes from :sdks:java:core to 
:sdks:java:io:file?  If so, should we target such a migration to a future Beam 
version with repeated announcements?  Does the Beam repository have any example 
of a similar change in the past?  What lessons from that change could be 
applied to this one?

The Long Version (For those on the learning path):

This email is more about our repository organization than about Beam itself.  
The proposal is to move two heavily used classes in our Java SDK, FileIO [1] 
and TextIO [2], along with anything related.  The Beam GitHub repository uses a 
tool called Gradle [3] to automate routine code tasks such as build and 
test.  Gradle projects, such as Beam, organize code in what are called modules 
[4].  The three main ingredients that make a module are 1) a unique directory 
path, 2) a file called build.gradle (or build.gradle.kts) in this directory, and 
3) a reference to the module in a settings.gradle (or settings.gradle.kts) 
file at the root of the repository.
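
To make the three ingredients concrete, here is a minimal sketch of what 
registering a hypothetical :sdks:java:io:file module could look like in 
Gradle's Kotlin DSL.  It is generic Gradle, not Beam's actual build 
configuration: the java-library plugin and the dependency on :sdks:java:core 
are assumptions for illustration only, and Beam's real build files may follow 
different conventions.

```kotlin
// --- settings.gradle.kts (repository root) ---
// Ingredient 3: reference the module so Gradle knows it exists.
// By default the project path :sdks:java:io:file maps to the
// directory sdks/java/io/file (ingredient 1).
include(":sdks:java:io:file")

// --- sdks/java/io/file/build.gradle.kts ---
// Ingredient 2: the module's own build file.
plugins {
    `java-library` // assumption: a plain Java library module
}

dependencies {
    // Assumption: the moved classes would still depend on the core SDK.
    api(project(":sdks:java:core"))
}
```

With a layout like this, a task can be scoped to the one module, for example 
./gradlew :sdks:java:io:file:build.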

The Gradle documentation discusses why such organization might matter and how 
to achieve it in large projects [5].  Essentially, modules allow us to have 
mini-projects inside our large project and to scope related automation to that 
one portion of the larger repository.  In Beam, we have the module 
:sdks:java:core [6] with all things related to the core of Beam, whereas we 
have separate modules related to reading from and writing to various resources 
within :sdks:java:io [7].

The proposal suggests moving the aforementioned file reading and writing 
classes, FileIO and TextIO, and anything related, to their own 
:sdks:java:io:file module.  This would mean creating a new sdks/java/io/file 
directory and moving these classes into 
sdks/java/io/file/src/main/java/org/apache/beam/sdk/io/file.

Definitions / References:

1. FileIO - general-purpose transforms for working with files: listing files 
(matching), reading, and writing.  See 
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/FileIO.html

2. TextIO - Similar to FileIO but focused on text files.  See 
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/TextIO.html

3. Gradle - a build automation tool used by the Apache Beam repository to 
automate code-related tasks.  See 
https://docs.gradle.org/current/userguide/what_is_gradle.html

4. Gradle Module - a subsection of your larger repository.  See 
https://docs.gradle.org/current/userguide/dependency_management_terminology.html#sub:terminology_module

5. Structuring Large Projects with Gradle - 
https://docs.gradle.org/current/userguide/structuring_software_products.html

6. sdks:java:core - Corresponds to the sdks/java/core repository directory. See 
https://github.com/apache/beam/tree/master/sdks/java/core

7. sdks:java:io - Corresponds to the sdks/java/io repository directory.  See 
https://github.com/apache/beam/tree/master/sdks/java/io

Best,

Damon

