Re: Community Examples Repository

2018-08-01 Thread Charles Chen
I would also prefer that examples be linked to releases so that we can build and test them during development; i.e. if your commit breaks wordcount, we want to know right away so we can revert. Perhaps we can keep these in the repo but more clearly modularize the artifacts we release? For the Pyt

Re: Cleanup resources on pipeline cancelation

2018-08-01 Thread Romain Manni-Bucau
I agree, Reuven. But leaking in a source doesn't give any guarantee regarding the execution, since it will depend on the runner, and the current API will not provide you that feature. Using a reference-counting state can work better but would require an SDF migration (and will hit runner support issues :().

Re: Community Examples Repository

2018-08-01 Thread jb
Hi, I have no problem with moving the examples to a dedicated repository. However, IMHO, we have to: 1. Keep a build of examples linked to the latest core release/SNAPSHOT 2. Include the examples in the distribution (convenient for the users) On another topic, I think it would be better to avoid us

Re: Cleanup resources on pipeline cancelation

2018-08-01 Thread Reuven Lax
Hi Romain, Andrew's example actually wouldn't work for that. With Google Cloud Pub/Sub (the example source he referenced), if there is no subscription to a topic, all publishes to that topic are dropped on the floor; if you don't want to lose data, you are expected to keep the subscription around

Re: Community Examples Repository

2018-08-01 Thread Davor Bonaci
> > it makes sense to modularize It certainly does, but somebody just had another proposal to move the website into the main repository ;-). That proposal was also good for ~everyone. Fun times... (I have my opinions, of course, but I'm fine with any approach.) On Wed, Aug 1, 2018 at 4:37 PM, A

Re: CODEOWNERS for apache/beam repo

2018-08-01 Thread Udi Meiri
Hi, so I saw mention bot working this week. How was the quality of suggestions? Holden, I would like to start testing Prow starting next week if that's possible. I'll be opening a ticket to INFRA to give my Github bot account read a

Re: Community Examples Repository

2018-08-01 Thread Ahmet Altay
Thank you for this initiative. How about keeping a set of core examples in the main repository as a way of 1) convenient testing at a PR level 2) Testing with end to end tests against Beam head rather than a released Beam version 3) I think there is some educational value in having wordcount as a

Re: Community Examples Repository

2018-08-01 Thread Andrew Pilloud
Looks good to me. This also sounds like a great way to ensure we aren't making breaking changes to the API surface of Beam across releases. I've been told this is a painful thing we frequently do. Hopefully we can run tests on these external examples against various versions of Beam. Andrew On We

Re: [VOTE] Apache Beam, version 2.6.0, release candidate #1

2018-08-01 Thread Boyuan Zhang
+1 Tested Dataflow related items in: https://s.apache.org/beam-release-validation On Wed, Aug 1, 2018 at 11:40 AM Yifan Zou wrote: > +1 > Tested Python quickstarts and mobile gaming examples against tar and wheel > versions. > https://builds.apache.org/job/beam_PostRelease_Python_Candidate/123/

Re: Community Examples Repository

2018-08-01 Thread Jesse Anderson
The examples have to be separate from the main Beam repository. This way, they serve as an example of how to use them in your code instead of how to do it as part of Beam. It would also allow you to show the dependencies in sbt or Maven. On Wed, Aug 1, 2018, 3:16 PM Charles Chen wrote: > The examples

Re: Community Examples Repository

2018-08-01 Thread David Cavazos
For visibility, we can have a link on both the beam.apache.org website and in the core repository's README file. Testing *could* be a little trickier. Any unit test should continue to live in the core repository, and the examples from the examples repository could serve as end-to-end tes

Re: Community Examples Repository

2018-08-01 Thread Charles Chen
The examples we have right now serve both as examples to users and, along with their unit tests, as tests of functionality. If we move the examples out, what is a good way to make sure that we continue to have visibility and test coverage? Can we address this in a section of the doc? On Wed, Aug

Community Examples Repository

2018-08-01 Thread David Cavazos
Hi everyone! We wanted to migrate the examples from the core repository to a new Beam community examples repository. As the number of examples grows, it makes sense to modularize and decouple the core functionality from the examples. We will also create some guidelines with the best practices for

Re: [VOTE] Apache Beam, version 2.6.0, release candidate #1

2018-08-01 Thread Yifan Zou
+1 Tested Python quickstarts and mobile gaming examples against tar and wheel versions. https://builds.apache.org/job/beam_PostRelease_Python_Candidate/123/ On Wed, Aug 1, 2018 at 8:27 AM Andrew Pilloud wrote: > +1 tested the Beam SQL jar from the Maven Central repo, it worked. > > On Wed, Aug 1

Re: Build failed in Jenkins: beam_Release_Gradle_NightlySnapshot #127

2018-08-01 Thread Chamikara Jayalath
Created https://issues.apache.org/jira/browse/BEAM-5057 On Wed, Aug 1, 2018 at 1:20 AM Apache Jenkins Server < jenk...@builds.apache.org> wrote: > See < > https://builds.apache.org/job/beam_Release_Gradle_NightlySnapshot/127/display/redirect?page=changes > > > > Changes: > > [github] [BEAM-4852]

Jenkins build is back to normal : beam_SeedJob #2346

2018-08-01 Thread Apache Jenkins Server
See

Build failed in Jenkins: beam_SeedJob #2345

2018-08-01 Thread Apache Jenkins Server
See -- GitHub pull request #4943 of commit 73965f43d84ed30cba50c6802783d10df6fef9d4, no merge conflicts. Setting status of 73965f43d84ed30cba50c6802783d10df6fef9d4 to PENDING with url https

Build failed in Jenkins: beam_SeedJob #2344

2018-08-01 Thread Apache Jenkins Server
See -- GitHub pull request #4943 of commit 901cdcf0d8d4264035c5da668cec9a39743317cf, no merge conflicts. Setting status of 901cdcf0d8d4264035c5da668cec9a39743317cf to PENDING with url https

Re: Parallelizing test runs

2018-08-01 Thread Pablo Estrada
It feels to me like a peak of 60 jobs per minute is pretty high. If I understand correctly, we run up to 20 Dataflow jobs in parallel per test suite? Or what's the number here? It is also true that most of our tests are simple NeedsRunner tests that test a couple of elements, so the whole pipeline over

Re: Parallelizing test runs

2018-08-01 Thread Andrew Pilloud
I like 1 and 2. How do credentials get into Jenkins? Could we create a user per Jenkins host? On Tue, Jul 31, 2018 at 4:33 PM Reuven Lax wrote: > There was also a proposal to lump multiple tests into a single Dataflow > job instead of spinning up a separate Dataflow job for each test. > > On Tue

Re: [VOTE] Apache Beam, version 2.6.0, release candidate #1

2018-08-01 Thread Andrew Pilloud
+1 tested the Beam SQL jar from the Maven Central repo, it worked. On Wed, Aug 1, 2018 at 7:37 AM Romain Manni-Bucau wrote: > Hi Pablo, > > +1, tested on my apps and libs and works after some fixes due to some > breaking changes in ArgProvider - but guess it is not "public" to need to > be repor

Runner agnostic Metrics

2018-08-01 Thread Jozef Vilcek
Hello, I would like to ask about Beam's runner-agnostic metrics, which I found in the 2.5.0 release notes. I am considering abandoning the Flink-specific reporter in favour of this one, but when looking at the code, it seems the feature is not fully integrated. Here: https://github.com/apache/beam/blob/279a05604b83a54

Re: [VOTE] Apache Beam, version 2.6.0, release candidate #1

2018-08-01 Thread Romain Manni-Bucau
Hi Pablo, +1, tested on my apps and libs, and it works after some fixes due to some breaking changes in ArgProvider, but I guess it is not "public" enough to need to be reported.

Re: Cleanup resources on pipeline cancelation

2018-08-01 Thread Chamikara Jayalath
Hi Andrew, Beam currently does not have a generalized cleanup story, so the answer has usually been ad hoc. For a bounded source we can (1) clean up any resources created for splitting after splitting (2) clean up resources created for a given reader when the reader exits (last advance() call). I'm not

Re: pipeline with parquet and sql

2018-08-01 Thread Chamikara Jayalath
On Wed, Aug 1, 2018 at 1:12 AM Akanksha Sharma B < akanksha.b.sha...@ericsson.com> wrote: > Hi, > > > Thanks. I understood the Parquet point. I will wait for couple of days on > this topic. Even if this scenario cannot be achieved now, any design > document or future plans towards this direction w

Build failed in Jenkins: beam_Release_Gradle_NightlySnapshot #127

2018-08-01 Thread Apache Jenkins Server
See Changes: [github] [BEAM-4852] Only read symbol table when required. [github] Update symbols.go [github] Don't rely on order of elements in a PCollection after GBK in [devinduan] Spelling