Thanks Ahmet. I created https://issues.apache.org/jira/browse/BEAM-1630 for this.
Any further comments are welcome :).

Implementation-wise, I think we should add Splittable DoFn to the streaming
direct runner first, after [1] is finalized, and then follow it up with
support for bounded sources and other runners (currently Dataflow runner and
Dataflow/other runners through Fn API [2]).

Thanks,
Cham

[1] https://issues.apache.org/jira/browse/BEAM-1265
[2] https://s.apache.org/beam-fn-api

On Fri, Mar 3, 2017 at 5:34 PM Ahmet Altay <[email protected]> wrote:

> +1 Thank you, this is a great and clean API proposal.
>
> Ahmet
>
> On Fri, Mar 3, 2017 at 5:16 PM, Chamikara Jayalath <[email protected]>
> wrote:
>
> > Hi All,
> >
> > I've put together a document that proposes a Splittable DoFn API for
> > Python SDK.
> >
> > https://docs.google.com/document/d/1h_zprJrOilivK2xfvl4L42vaX4DMYGfH1YDmi-s_ozM/edit?usp=sharing
> >
> > Splittable DoFn framework [1] is currently being implemented for Java SDK
> > [2] and will unlock many use-cases that are not possible with the current
> > BoundedSource framework [3] (see [1] for details). So, I believe, it will
> > be good to add a similar framework to Python SDK as well.
> >
> > Please let me know what you think.
> >
> > Thanks,
> > Cham
> >
> > [1] http://s.apache.org/splittable-do-fn
> > [2] https://issues.apache.org/jira/browse/BEAM-65
> > [3] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/iobase.py
