Re: Release 0.6.0

2017-03-02 Thread Jean-Baptiste Onofré
Hi Kenn, Fair enough. +1 Regards JB On 03/03/2017 12:28 AM, Kenneth Knowles wrote: Hi all, I've just filed https://issues.apache.org/jira/browse/BEAM-1611. It is technically not a bug in Beam but the easiest quick fix is to work around it in the DataflowRunner, so I'd like to block the release on

Re: Apache Beam (virtual) contributor meeting @ Tue Mar 7, 2017

2017-03-02 Thread Davor Bonaci
I'd prefer not to record the video; just to keep things informal. We'll, however, keep the notes and share anything that may be relevant. On Thu, Mar 2, 2017 at 2:24 PM, Amit Sela wrote: > I'll be there! > > On Thu, Mar 2, 2017 at 1:06 PM Aljoscha Krettek > wrote: > > > Shoot, I can't because I

Re: Release 0.6.0

2017-03-02 Thread Kenneth Knowles
Hi all, I've just filed https://issues.apache.org/jira/browse/BEAM-1611. It is technically not a bug in Beam but the easiest quick fix is to work around it in the DataflowRunner, so I'd like to block the release on it. It should be available ahead of the release's existing schedule, and can easily be

Re: Performance Testing Next Steps

2017-03-02 Thread Amit Sela
Looks great, and I'll be sure to follow this. Ping me if I can assist in any way! On Fri, Mar 3, 2017 at 12:09 AM Ahmet Altay wrote: > Sounds great, thank you! > > On Thu, Mar 2, 2017 at 1:41 PM, Jason Kuster wrote: > > > D'oh, my bad Ahmet. I've opened BEAM-1610, which handles sup

Re: Pipeline termination in the unified Beam model

2017-03-02 Thread Amit Sela
+1 on Eugene's words - this shows how batch is conceptually a subset of a streaming problem. I also believe that Stas has a very good point on education - we have to try and understand developers' current perspective and try to make the transition to the Beam model as natural as possible for new us

Re: Apache Beam (virtual) contributor meeting @ Tue Mar 7, 2017

2017-03-02 Thread Amit Sela
I'll be there! On Thu, Mar 2, 2017 at 1:06 PM Aljoscha Krettek wrote: > Shoot, I can't because I already have another meeting scheduled. Don't mind > me, though. Will you also maybe produce a video of the meeting? > > On Wed, 1 Mar 2017 at 21:50 Davor Bonaci wrote: > > > Hi everyone, > > Based

Re: Performance Testing Next Steps

2017-03-02 Thread Ahmet Altay
Sounds great, thank you! On Thu, Mar 2, 2017 at 1:41 PM, Jason Kuster wrote: > D'oh, my bad Ahmet. I've opened BEAM-1610, which handles support for Python > in PKB against the Dataflow runner. Once the Fn API progresses some more we > can add some work items for the other runners too. Let's chat

Re: Performance Testing Next Steps

2017-03-02 Thread Jason Kuster
D'oh, my bad Ahmet. I've opened BEAM-1610, which handles support for Python in PKB against the Dataflow runner. Once the Fn API progresses some more we can add some work items for the other runners too. Let's chat about this more, maybe next week? On Thu, Mar 2, 2017 at 1:31 PM, Ahmet Altay wrote

Re: Performance Testing Next Steps

2017-03-02 Thread Ahmet Altay
Thank you Jason, this is great. Which of these issues fall into the land of sdk-py? Ahmet On Thu, Mar 2, 2017 at 12:34 PM, Jason Kuster < jasonkus...@google.com.invalid> wrote: > Glad to hear the excitement. :) > > Filed BEAM-1595 - 1609 to track work items. Some of these fall under runner

Re: Performance Testing Next Steps

2017-03-02 Thread Jason Kuster
Glad to hear the excitement. :) Filed BEAM-1595 - 1609 to track work items. Some of these fall under runner components, please feel free to reach out to me if you have any questions about how to accomplish these. Best, Jason On Wed, Mar 1, 2017 at 5:50 AM, Aljoscha Krettek wrote: > Thanks for

Re: Vacation for a few weeks

2017-03-02 Thread Jean-Baptiste Onofré
ENJOY ! You deserve great vacations ! Say hi to the Maoris for me ;) Regards JB On 03/02/2017 08:22 PM, Dan Halperin wrote: Hey folks, I wanted to give you a heads-up that I'll be offline starting tomorrow through 20th March. I think I've handled most of the questions and pull requests and J

Vacation for a few weeks

2017-03-02 Thread Dan Halperin
Hey folks, I wanted to give you a heads-up that I'll be offline starting tomorrow through 20th March. I think I've handled most of the questions and pull requests and JIRA issues you've sent me, but I know the community will be happy to help with urgent issues in the rest. (I also will not be ab

Re: Pipeline termination in the unified Beam model

2017-03-02 Thread Eugene Kirpichov
OK, I'm glad everybody is in agreement on this. I raised this point because we've been discussing implementing this behavior in the Dataflow streaming runner, and I wanted to make sure that people are okay with it from a conceptual point of view before proceeding. On Thu, Mar 2, 2017 at 10:27 AM K

Re: Merge HadoopInputFormatIO and HDFSIO in a single module

2017-03-02 Thread Stephen Sisk
Thanks for your time thinking about this! Sounds like we all like #1. That's great. I see that Ismael commented on the PR to suggest the specific changes, so I think we should be good to go. To answer JB's question about the later merging: > when BEAM-59 is done, hadoop IO will only contain

Re: Pipeline termination in the unified Beam model

2017-03-02 Thread Kenneth Knowles
Isn't this already the case? I think semantically it is an unavoidable conclusion, so certainly +1 to that. The DirectRunner and TestDataflowRunner both have this behavior already. I've always considered that a streaming job running forever is just [very] suboptimal shutdown latency :-) Some bits

Re: Pipeline termination in the unified Beam model

2017-03-02 Thread Dan Halperin
Note that even "unbounded pipeline in a streaming runner".waitUntilFinish() can return, e.g., if you cancel it or terminate it. It's totally reasonable for users to want to understand and handle these cases. +1 Dan On Thu, Mar 2, 2017 at 2:53 AM, Jean-Baptiste Onofré wrote: > +1 > > Good idea

Re: Merge HadoopInputFormatIO and HDFSIO in a single module

2017-03-02 Thread Ismaël Mejía
Hello, I answer since I have been leading the refactor to hadoop-common. My criteria to move a class into hadoop-common is that it is used by more than one other module or IO; this is the reason it is not big, but it can grow if needed. +1 for option #1 because of the visibility reasons yo

Re: Merge HadoopInputFormatIO and HDFSIO in a single module

2017-03-02 Thread Jean-Baptiste Onofré
By the way Stephen, when BEAM-59 is done, hadoop IO will only contain the hadoop format support (no HdfsFileSource or HdfsSink required as it will use the "regular" FileIO). Agree ? Regards JB On 03/02/2017 03:27 PM, Jean-Baptiste Onofré wrote: Hi Stephen, I agree to use the following

Re: Merge HadoopInputFormatIO and HDFSIO in a single module

2017-03-02 Thread Jean-Baptiste Onofré
Hi Stephen, I agree to use the following structure (and it's basically what I proposed in a comment of the PR): io/hadoop io/hadoop-common io/hbase I would be more than happy to help on the "merge" of HdfsIO and HadoopFormat. Regards JB On 03/01/2017 08:00 PM, Stephen Sisk wrote: I wante

Re: Apache Beam (virtual) contributor meeting @ Tue Mar 7, 2017

2017-03-02 Thread Aljoscha Krettek
Shoot, I can't because I already have another meeting scheduled. Don't mind me, though. Will you also maybe produce a video of the meeting? On Wed, 1 Mar 2017 at 21:50 Davor Bonaci wrote: > Hi everyone, > Based on the high demand [1], let's try to organize a virtual contributor > meeting on Tues

Re: First stable release: version designation?

2017-03-02 Thread Aljoscha Krettek
I prefer 2.0.0 for the first stable release. It totally makes sense for people coming from Dataflow 1.x and I can already envision the confusion between Beam 1.5 and Dataflow 1.5. On Thu, 2 Mar 2017 at 07:42 Jean-Baptiste Onofré wrote: > Hi Davor, > > > For a Beam community perspective, 1.0.0 wo

Re: Pipeline termination in the unified Beam model

2017-03-02 Thread Jean-Baptiste Onofré
+1 Good idea !! Regards JB On 03/02/2017 02:54 AM, Eugene Kirpichov wrote: Raising this onto the mailing list from https://issues.apache.org/jira/browse/BEAM-849 The issue came up: what does it mean for a pipeline to finish, in the Beam model? Note that I am deliberately not talking about "b

Re: Pipeline termination in the unified Beam model

2017-03-02 Thread Stas Levin
+1! I think it's a very cool way to abstract away the batch vs. streaming dissonance from the Beam model. It does require that practitioners are *educated* to think this way as well. I believe that nowadays the terms "batch" and "streaming" are so deeply rooted, that they play a key role in the u