On Mon, Jan 6, 2020 at 1:39 PM Chamikara Jayalath <[email protected]> wrote:
> Regarding cross-language transforms, we need to add better documentation,
> but for now you'll have to go with existing examples and tests. For example,
>
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/external/gcp/pubsub.py
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/external/kafka.py
>
> Note that cross-language transforms feature is currently only available
> for Flink Runner. Dataflow support is in development.
>

I think it works with all non-Dataflow runners, with the exception of the
Java and Go Direct runners. (It does work with the Python direct runner.)

> I'm fine with developing this natively for Python as well. AFAIK Java JDBC
> IO connector is not a super-complicated connector and it should be fine to
> make relatively easy to maintain and widely usable connectors available in
> multiple SDKs.
>

Yes, a case can certainly be made for having native connectors for
particular common/simple sources. (We certainly don't call cross-language
to read text files for example.)

>
> Thanks,
> Cham
>
>
> On Mon, Jan 6, 2020 at 10:56 AM Luke Cwik <[email protected]> wrote:
>
>> +Chamikara Jayalath <[email protected]> +Heejong Lee
>> <[email protected]>
>>
>> On Mon, Jan 6, 2020 at 10:20 AM <[email protected]> wrote:
>>
>>> How do I go about doing that? From the docs, it appears cross language
>>> transforms are currently undocumented.
>>> https://beam.apache.org/roadmap/connectors-multi-sdk/
>>> On Jan 6, 2020, at 12:55 PM, Luke Cwik <[email protected]> wrote:
>>>
>>> What about using a cross language transform between Python and the
>>> already existing Java JdbcIO transform?
>>>
>>> On Sun, Jan 5, 2020 at 5:18 AM Peter Dannemann <[email protected]> wrote:
>>>
>>>> I’d like to develop the Python SDK’s SQL IO connector. I was thinking
>>>> it would be easiest to use sqlalchemy to achieve maximum database engine
>>>> support, but I suppose I could also create an ABC for databases that follow
>>>> the DB API and create subclasses for each database engine that override a
>>>> connect method. What are your thoughts on the best way to do this?
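
For reference, the "ABC plus a per-engine connect method" idea from the
original message could look roughly like the sketch below. This is only an
illustration under assumptions: DbApiSource and SqliteSource are hypothetical
names, not Beam APIs, and the stdlib sqlite3 module stands in for any
PEP 249 (DB-API 2.0) driver. It does not show how such a reader would be
wrapped in a Beam source or transform.

```python
import abc
import sqlite3
from typing import Any, Iterator, Sequence, Tuple


class DbApiSource(abc.ABC):
    """Hypothetical base class: engine subclasses only override connect()."""

    @abc.abstractmethod
    def connect(self) -> Any:
        """Return a PEP 249 (DB-API 2.0) connection object."""

    def read(
        self, query: str, params: Sequence[Any] = (), batch_size: int = 1000
    ) -> Iterator[Tuple[Any, ...]]:
        """Run the query and yield rows, fetching batch_size rows at a time."""
        conn = self.connect()
        try:
            cursor = conn.cursor()
            # Placeholder syntax (paramstyle) differs between drivers.
            cursor.execute(query, params)
            while True:
                rows = cursor.fetchmany(batch_size)
                if not rows:
                    break
                yield from rows
        finally:
            conn.close()


class SqliteSource(DbApiSource):
    """Example engine subclass backed by the stdlib sqlite3 driver."""

    def __init__(self, path: str) -> None:
        self._path = path

    def connect(self) -> sqlite3.Connection:
        return sqlite3.connect(self._path)
```

A subclass for another engine would differ only in connect(), for example by
calling that driver's own connect function. The trade-off against sqlalchemy
is that each engine needs its own small subclass, and query placeholders are
not normalized across drivers.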
