Neat.

Nothing like writing an SDK to actually understand how the FnAPI works :).
I like the use of groupBy. I have to admit I'm a bit mystified by the
syntax for parDo (I don't know Swift at all which is probably tripping me
up). The addition of external (cross-language) transforms could let you
steal everything (e.g. IOs) pretty quickly from other SDKs.

On Fri, Aug 18, 2023 at 7:55 AM Byron Ellis via user <user@beam.apache.org>
wrote:

> For everyone who is interested, here's the draft PR:
>
> https://github.com/apache/beam/pull/28062
>
> I haven't had a chance to test it on my M1 machine yet, though. There's a
> good chance a few places still need to handle endianness properly,
> specifically timestamps in windowed values and lengths in iterable coders,
> since both use explicitly big-endian representations.
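
A minimal sketch of the endianness point above, in Python rather than Swift
so it is easy to run anywhere. It is not taken from the PR: the function
names are hypothetical, and the 8-byte signed timestamp and 4-byte signed
length widths are assumptions made for illustration. The idea is to always
serialize with an explicit big-endian byte order instead of relying on the
host's native layout, so the bytes are the same on every machine.

```python
import struct

# Hypothetical helpers; names and field widths are assumptions for this sketch.

def encode_int64_be(value: int) -> bytes:
    """Pack a signed 64-bit integer as 8 big-endian bytes."""
    # ">" forces big-endian regardless of the host's native byte order.
    return struct.pack(">q", value)

def decode_int64_be(data: bytes) -> int:
    """Unpack 8 big-endian bytes into a signed 64-bit integer."""
    (value,) = struct.unpack(">q", data)
    return value

def encode_int32_be(value: int) -> bytes:
    """Pack a signed 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">i", value)

def decode_int32_be(data: bytes) -> int:
    """Unpack 4 big-endian bytes into a signed 32-bit integer."""
    (value,) = struct.unpack(">i", data)
    return value
```

Round-tripping a value through the encode and decode pair should return the
original on any host, which is a cheap check to run on both Intel and ARM
machines.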
>
>
> On Thu, Aug 17, 2023 at 8:57 PM Byron Ellis <byronel...@google.com> wrote:
>
>> Thanks Cham,
>>
>> Definitely happy to open a draft PR so folks can comment. There's not as
>> much code as it looks like, since most of the LOC is just generated
>> protobuf. As for the support, I definitely want to add external transforms
>> and may actually add that support before adding the ability to make
>> composites in the language itself. With the way the SDK is laid out, adding
>> composites to the pipeline graph is a separate operation from defining a
>> composite.
>>
>> On Thu, Aug 17, 2023 at 4:28 PM Chamikara Jayalath <chamik...@google.com>
>> wrote:
>>
>>> Thanks Byron. This sounds great. I wonder if there is interest in a Swift
>>> SDK from folks currently subscribed to the +user <user@beam.apache.org>
>>>  list.
>>>
>>> On Wed, Aug 16, 2023 at 6:53 PM Byron Ellis via dev <d...@beam.apache.org>
>>> wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> A couple of months ago I decided that I wanted to really understand how
>>>> the Beam FnApi works and how it interacts with the Portable Runner. For me
>>>> at least, that usually means writing some code so I can see things
>>>> happening in a debugger. To really prove to myself that I understood what
>>>> was going on, I decided not to use an existing SDK language, since there
>>>> would be the temptation to read some code and convince myself that I
>>>> actually understood it.
>>>>
>>>> One thing led to another and it turns out that to get a minimal FnApi
>>>> integration going you end up writing a fair bit of an SDK. So I decided to
>>>> take things to a point where I had an SDK that could execute a word count
>>>> example via a portable runner backend. I've now reached that point and
>>>> would like to submit my prototype SDK to the list for feedback.
>>>>
>>>> It's currently living in a branch on my fork here:
>>>>
>>>> https://github.com/byronellis/beam/tree/swift-sdk/sdks/swift
>>>>
>>>> At the moment it runs via the most recent Xcode beta using Swift 5.9 on
>>>> Intel Macs, but should also work using beta builds of 5.9 for Linux running
>>>> on Intel hardware. I haven't had a chance to try it on ARM hardware and
>>>> make sure all of the endian checks are complete. The
>>>> "IntegrationTests.swift" file contains a word count example that reads some
>>>> local files (as well as a missing file to exercise DLQ functionality) and
>>>> outputs counts through two separate group-by operations to get it past the
>>>> "map reduce" size of pipeline. I've tested it against the Python Portable
>>>> Runner. Since my goal was to learn FnApi, there is no Direct Runner at this
>>>> time.
>>>>
>>>> I've shown it to a couple of folks and incorporated some of their
>>>> feedback already (for example, pardo was originally called dofn when
>>>> defining pipelines). In general I've tried to make the API as "Swift-y" as
>>>> possible, hence the heavy reliance on closures. While there aren't yet
>>>> composite PTransforms, there's the beginnings of what would be needed for a
>>>> SwiftUI-like declarative API for creating them.
>>>>
>>>> There are of course a ton of missing bits still to be implemented, like
>>>> counters, metrics, windowing, state, timers, etc.
>>>>
>>>
>>> This should be fine and we can get the code documented without these
>>> features. I think support for composites and adding an external transform
>>> (see Java
>>> <https://github.com/apache/beam/blob/master/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/External.java>,
>>> Python
>>> <https://github.com/apache/beam/blob/c7b7921185686da573f76ce7320817c32375c7d0/sdks/python/apache_beam/transforms/external.py#L556>,
>>> Go
>>> <https://github.com/apache/beam/blob/c7b7921185686da573f76ce7320817c32375c7d0/sdks/go/pkg/beam/xlang.go#L155>,
>>> TypeScript
>>> <https://github.com/apache/beam/blob/master/sdks/typescript/src/apache_beam/transforms/external.ts>)
>>> to add support for multi-lang will bring in a lot of features (for example,
>>> I/O connectors) for free.
>>>
>>>
>>>>
>>>> Any and all feedback welcome and happy to submit a PR if folks are
>>>> interested, though the "Swift Way" would be to have it in its own repo so
>>>> that it can easily be used from the Swift Package Manager.
>>>>
>>>
>>> +1 for creating a PR (maybe as a draft initially). Also it'll be easier
>>> to comment on a PR :)
>>>
>>> - Cham
>>>
>>>
>>>
>>>>
>>>> Best,
>>>> B
>>>>
>>>>
>>>>
