Re: [Newbie Contributing Question] How to add a dependency?

2020-03-10 Thread Tomo Suzuki
Hi Jacob, You'll need to modify BeamModulePlugin.groovy, which defines a big map of Maven artifacts. The map is referenced by each module, such as google-cloud-platform/build.gradle. Example PR that touches dependencies: https://github.com/apache/beam/pull/11063/files On Tue, Mar 10, 2020 at 11:22
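
A minimal sketch of the pattern Tomo describes, with a purely hypothetical map entry and coordinates (the real entries live in BeamModulePlugin.groovy and the modules' build.gradle files):

```groovy
// Illustrative sketch only; the entry name and coordinates below are hypothetical.
// 1) BeamModulePlugin.groovy declares the artifact once in a shared map:
project.ext.library = [
  java: [
    example_client: "com.example:example-client:1.2.3"
  ]
]

// 2) A module's build.gradle (e.g. google-cloud-platform) then references the
//    entry by key instead of repeating the Maven coordinates:
dependencies {
  compile library.java.example_client
}
```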

Selecting a 2020 GSoC Project

2020-03-10 Thread Badrul Chowdhury
Hi, I am looking to submit a proposal for one of Beam's listed projects. Which is a higher priority for the committee? Or are they of equal priority? I am trying to understand the proposal selection criteria because I am super excited about both of these. Implement an Azure blobstore filesystem f

[Newbie Contributing Question] How to add a dependency?

2020-03-10 Thread Jacob Ferriero
Hi beam dev list, Hoping to find some pointers on how best to add a dependency for a new GCP IO connector. Specifically, I want to add a dependency on the Cloud Healthcare API, or the equivalent of this section of a Maven pom.xml: ```xml v1alpha2-rev20190901-1.30.1 1.30.1
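
Only the version strings of that pom.xml section survive in the preview above, so the following is a hedged sketch of the equivalent Gradle declaration; the group and artifact IDs are assumptions inferred from the Cloud Healthcare API client library, not something stated in the message:

```groovy
// Hedged guess only: the coordinates below are assumptions inferred from the
// Cloud Healthcare API client library; the original message preserves just the
// version strings v1alpha2-rev20190901-1.30.1 and 1.30.1.
dependencies {
  compile "com.google.apis:google-api-services-healthcare:v1alpha2-rev20190901-1.30.1"
  compile "com.google.api-client:google-api-client:1.30.1"
}
```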

[Progress Update] Improvements to the Apache Beam website - Phase 1

2020-03-10 Thread Aizhamal Nurmamat kyzy
Hello everyone, As per our previous discussion [1], we are ready to move forward with the 1st phase of our project plan and migrate the Beam website to Docsy [2]. Please review the doc and discussion to see the background for the change. Also feel free to ask any questions and share any feedback y

Re: Snowflake connector

2020-03-10 Thread Chamikara Jayalath
On Tue, Mar 10, 2020 at 1:18 PM Tyler Akidau wrote: > On Tue, Mar 10, 2020 at 1:27 AM Elias Djurfeldt < > elias.djurfe...@mirado.com> wrote: > >> From what I can tell, the only difference is that the Python connector is >> a pure Python implementation and doesn't rely on ODBC or JDBC (it's just a

Re: Snowflake connector

2020-03-10 Thread Tyler Akidau
On Tue, Mar 10, 2020 at 1:27 AM Elias Djurfeldt wrote: > From what I can tell, the only difference is that the Python connector is > a pure Python implementation and doesn't rely on ODBC or JDBC (it's just a > pip installable). Whereas the Java version needs JDBC. But that seems to be > the only

Re: Contributing Twister2 runner to Apache Beam

2020-03-10 Thread Luke Cwik
I have to disagree. Allowing runners within the Apache Beam repo and SDKs to reach into each other's implementation details creates usability, feature-development, maintenance, and complexity problems. The usability issue comes from our public core-facing APIs exposing methods that runners "

Re: Jenkins jobs not running for my PR 10438

2020-03-10 Thread Ahmet Altay
Done. On Tue, Mar 10, 2020 at 12:21 PM Tomo Suzuki wrote: > Hi Beam committers, > > Would you trigger precommit checks for > https://github.com/apache/beam/pull/11095 with the following 6 commands? > Run Java PostCommit > Run Java HadoopFormatIO Performance Test > Run BigQueryIO Streaming Perfo

Re: Jenkins jobs not running for my PR 10438

2020-03-10 Thread Tomo Suzuki
Hi Beam committers, Would you trigger precommit checks for https://github.com/apache/beam/pull/11095 with the following 6 commands? Run Java PostCommit Run Java HadoopFormatIO Performance Test Run BigQueryIO Streaming Performance Test Java Run Dataflow ValidatesRunner Run Spark ValidatesRunner Ru

Re: Contributing Twister2 runner to Apache Beam

2020-03-10 Thread Kenneth Knowles
I do support all the efforts to get Dataflow, Flink, and Spark to (3) (the Fn API). But I disagree with it as a requirement; the whole point of ptransforms with URNs is that if the runner can figure out how to execute it according to semantics, then it is fine. A runner meets (1) and (2) but can only run

Re: Contributing Twister2 runner to Apache Beam

2020-03-10 Thread Luke Cwik
I would like to move away from having runners access APIs that are related to pipeline construction and other internal SDK APIs, and I would like SDKs not to inspect internal runner APIs. This would enable the community to improve each independently without needing to fix the world all the time

Re: Contributing Twister2 runner to Apache Beam

2020-03-10 Thread Kenneth Knowles
There are a lot of different meanings to "portable runner". Here are some: (1) A runner that accepts a pipeline proto and either runs it or says it cannot run it (2) A runner that accepts jobs via the job management APIs (3) A runner that executes UDFs via the Fn API (4) A runner that can execute

Re: Contributing Twister2 runner to Apache Beam

2020-03-10 Thread Luke Cwik
+1 On Tue, Mar 10, 2020 at 12:59 AM Alex Van Boxel wrote: > One last thing, for any runner after this one... wouldn't it be a good > acceptance criterion to only accept portable implementations from now on? > > _/ > _/ Alex Van Boxel > > > On Mon, Mar 9, 2020 at 10:42 PM Ismaël Mejía wrote: > >> Go

Re: Snowflake connector

2020-03-10 Thread Elias Djurfeldt
From what I can tell, the only difference is that the Python connector is a pure Python implementation and doesn't rely on ODBC or JDBC (it's just pip installable), whereas the Java version needs JDBC. But that seems to be the only difference. I don't know enough about the Java side of Beam (or
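
To make the JDBC point concrete, here is a hedged sketch (not anything proposed in the thread) of what pulling Snowflake into a Java pipeline via the generic JDBC route could look like; the artifact versions are assumptions:

```groovy
// Hedged sketch only: one way the Java SDK can reach Snowflake today is the
// generic JdbcIO plus Snowflake's JDBC driver. Versions below are assumptions.
dependencies {
  compile "org.apache.beam:beam-sdks-java-io-jdbc:2.19.0"
  compile "net.snowflake:snowflake-jdbc:3.12.0"
}
```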

Re: Contributing Twister2 runner to Apache Beam

2020-03-10 Thread Alex Van Boxel
One last thing, for any runner after this one... wouldn't it be a good acceptance criteria to only accept portable implementations anymore? _/ _/ Alex Van Boxel On Mon, Mar 9, 2020 at 10:42 PM Ismaël Mejía wrote: > Good points Kenn. I think we mostly agree on what has been discussed in > this