Hello all,

The proposed I/O Standards are now ready as a new page for the Apache Beam
website. Please review this PR:
https://github.com/apache/beam/pull/24962

Thanks!

Herman Mak |  Customer Engineer, Hong Kong, Google Cloud |
herman...@google.com |  +852-3923-5417





On Wed, Dec 21, 2022 at 6:02 PM Herman Mak <herman...@google.com> wrote:

> Hello all,
>
> I've addressed the commented areas with updated explanations and
> responses where necessary.
>
> Please do have a quick read if you have time.
> I shall follow up with this content as Markdown changes to the Beam site
> in a couple of days for feedback.
>
> Thanks!
>
> Herman Mak |  Customer Engineer, Hong Kong, Google Cloud |
> herman...@google.com |  +852-3923-5417
>
>
>
>
>
> On Sat, Dec 17, 2022 at 2:13 AM Andrew Pilloud <apill...@google.com>
> wrote:
>
>> By "Relational" I mean things like: Column Pruning, Filter Pushdown,
>> Table Statistics, Partition Metadata, Metastore. We have a bunch of one-off
>> implementations in various IOs (mostly BigQueryIO) and have been waiting
>> for IO standards to push them out to all IOs. This was section "F5 -
>> Relational" from https://s.apache.org/beam-io-api-standard-documentation
>>
>> On Thu, Dec 15, 2022 at 6:50 PM Herman Mak <herman...@google.com> wrote:
>>
>>> Hey all,
>>>
>>> Firstly, apologies for the confusion.
>>>
>>> The scope of this effort is to *finalize and have this added to the
>>> Beam public documentation* to be used as a PR reference once we have
>>> resolved the comments.
>>> Yes, this document is a continuation of the docs below, with some
>>> additional components such as testing!
>>>
>>> The idea is to convert this to a Markdown file and add a page under
>>> "Developing new I/O connectors", with some small cleanup work around
>>> this area in other pages.
>>>
>>>
>>>
>>>
>>> Docs that this is a continuation of:
>>> https://s.apache.org/beam-io-api-standard-documentation
>>> https://s.apache.org/beam-io-api-standard
>>>
>>>
>>> @Andrew Pilloud <apill...@google.com> We are definitely not intending to
>>> start from the beginning here. By "relational", do you mean having this
>>> hosted in the Beam Confluence?
>>>
>>> Thanks all, and keep the feedback on the docs coming.
>>>
>>> Herman Mak |  Customer Engineer, Hong Kong, Google Cloud |
>>> herman...@google.com |  +852-3923-5417
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Dec 16, 2022 at 1:36 AM Chamikara Jayalath <chamik...@google.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Dec 15, 2022, 8:33 AM Alexey Romanenko <
>>>> aromanenko....@gmail.com> wrote:
>>>>
>>>>> Cham, do you remember what the reason was for not finalising that doc?
>>>>>
>>>>
>>>> I think this is a continuation of those docs (so we are trying to
>>>> finalize), but Herman can probably explain better.
>>>>
>>>>
>>>>> Personally, I find having such standards very useful (provided they
>>>>> remain flexible over time, of course), especially for new developers
>>>>> and PR reviewers, and it'd be great to finally have such a doc as part
>>>>> of the contribution guide.
>>>>>
>>>>
>>>> +1
>>>>
>>>> Thanks,
>>>> Cham
>>>>
>>>>>
>>>>> —
>>>>> Alexey
>>>>>
>>>>> On 13 Dec 2022, at 04:32, Chamikara Jayalath via dev <
>>>>> dev@beam.apache.org> wrote:
>>>>>
>>>>> Yeah, I don't think we either finalized or documented (on the website)
>>>>> the previous iteration. This doc seems to contain details from the
>>>>> documents shared in the previous iteration.
>>>>>
>>>>> Thanks,
>>>>> Cham
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Dec 12, 2022 at 6:49 PM Robert Burke <rob...@frantil.com>
>>>>> wrote:
>>>>>
>>>>>> I think ultimately: until the docs are clearly available on the Beam
>>>>>> site itself, it's not documentation. See also: design docs, previous
>>>>>> emails, and similar.
>>>>>>
>>>>>> On Mon, Dec 12, 2022, 6:07 PM Andrew Pilloud via dev <
>>>>>> dev@beam.apache.org> wrote:
>>>>>>
>>>>>>> I believe the previous iteration was here:
>>>>>>> https://lists.apache.org/thread/3o8glwkn70kqjrf6wm4dyf8bt27s52hk
>>>>>>>
>>>>>>> The associated docs are:
>>>>>>> https://s.apache.org/beam-io-api-standard-documentation
>>>>>>> https://s.apache.org/beam-io-api-standard
>>>>>>>
>>>>>>> This is missing all the relational stuff that was in those docs.
>>>>>>> Is this another attempt, starting from the beginning?
>>>>>>>
>>>>>>> Andrew
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Dec 12, 2022 at 9:57 AM Alexey Romanenko <
>>>>>>> aromanenko....@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks for writing this!
>>>>>>>>
>>>>>>>> IIRC, a similar design doc was sent for review here a while ago.
>>>>>>>> Is this just an updated version or a new one?
>>>>>>>>
>>>>>>>> —
>>>>>>>> Alexey
>>>>>>>>
>>>>>>>> On 11 Dec 2022, at 15:16, Herman Mak via dev <dev@beam.apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hello Everyone,
>>>>>>>>
>>>>>>>> *TLDR*
>>>>>>>>
>>>>>>>> Should we adopt a set of standards that Connector I/Os should
>>>>>>>> adhere to?
>>>>>>>> Attached is a first version of a Beam I/O Standards guideline that
>>>>>>>> includes opinionated best practices across important components of a
>>>>>>>> Connector I/O, namely Documentation, Development and Testing.
>>>>>>>>
>>>>>>>> *The Long Version*
>>>>>>>>
>>>>>>>> Apache Beam is a unified open-source programming model for both
>>>>>>>> batch and streaming. It runs on multiple platform runners and
>>>>>>>> integrates with over 50 services using individually developed
>>>>>>>> I/O Connectors
>>>>>>>> <https://beam.apache.org/documentation/io/connectors/>.
>>>>>>>>
>>>>>>>> Given that Apache Beam connectors are written by many different
>>>>>>>> developers and at varying points in time, they vary in syntax style,
>>>>>>>> documentation completeness and extent of testing. For a new adopter
>>>>>>>> of Apache Beam, that can definitely cause some uncertainty.
>>>>>>>>
>>>>>>>> So should we adopt a set of standards that Connector I/Os should
>>>>>>>> adhere to?
>>>>>>>> Attached is a first version, in Doc format, of a Beam I/O Standards
>>>>>>>> guideline that includes opinionated best practices across important
>>>>>>>> components of a Connector I/O, namely Documentation, Development and
>>>>>>>> Testing. The aim is to incorporate this into the documentation and
>>>>>>>> to have it referenced as the standard for new Connector I/Os (and
>>>>>>>> ideally have existing Connectors upgraded over time). If it looks
>>>>>>>> helpful, the immediate next step is to convert it into a .md file
>>>>>>>> as a PR into the Beam repo!
>>>>>>>>
>>>>>>>> Thanks, and looking forward to feedback and discussion,
>>>>>>>>
>>>>>>>>  [PUBLIC] Beam I/O Standards
>>>>>>>> <https://docs.google.com/document/d/1BCTpSZDUjK90hYZjcn8aAnPd9vuRfj8YU1j3mpSgRwI/edit?usp=drive_web>
>>>>>>>>
>>>>>>>> Herman Mak |  Customer Engineer, Hong Kong, Google Cloud |
>>>>>>>> herman...@google.com |  +852-3923-5417
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>
