Re: Expected behavior of PipelineResult#waitUntilFinish()

2017-03-29 Thread Eugene Kirpichov
Correct. On Wed, Mar 29, 2017, 5:35 PM Shen Li wrote: > Hi, thanks for the prompt reply. I read the discussion. So, the > waitUntilFinish() should *block* until all watermarks reach infinity > regardless of how long (1 second, 1 year, 100 years) it might take, right? > > Shen > > On Wed, Mar 29,

Re: Expected behavior of PipelineResult#waitUntilFinish()

2017-03-29 Thread Shen Li
Hi, thanks for the prompt reply. I read the discussion. So, the waitUntilFinish() should *block* until all watermarks reach infinity regardless of how long (1 second, 1 year, 100 years) it might take, right? Shen On Wed, Mar 29, 2017 at 8:00 PM, Eugene Kirpichov < kirpic...@google.com.invalid> wr

Re: Expected behavior of PipelineResult#waitUntilFinish()

2017-03-29 Thread Eugene Kirpichov
Hi! Please see discussion on https://issues.apache.org/jira/browse/BEAM-849 . A pipeline terminates when all watermarks reach infinity - regardless of boundedness. On Wed, Mar 29, 2017 at 4:54 PM Shen Li wrote: > Hi, > > For pipelines with unbounded sources, should waitUntilFinish() block > fore

Expected behavior of PipelineResult#waitUntilFinish()

2017-03-29 Thread Shen Li
Hi, For pipelines with unbounded sources, should waitUntilFinish() block forever, or should it just block until the job is submitted to the cluster? Thanks, Shen

Re: [PROPOSAL] @OnWindowExpiration

2017-03-29 Thread Aljoscha Krettek
+1 I had also already commented on the issue a while back ;-) On Wed, Mar 29, 2017, at 21:23, Kenneth Knowles wrote: > I had totally forgotten that this was filed as > https://issues.apache.org/jira/browse/BEAM-1589 already, which I have now > assigned to myself. > > And, of course, there have be

Re: Spec cleanup for Finalize Checkpoint

2017-03-29 Thread Thomas Groh
(Short URL: https://s.apache.org/FIWQ) On Wed, Mar 29, 2017 at 1:15 PM, Thomas Groh wrote: > Hey everyone, > > We've had a few bugs recently in the DirectRunner based around finalizing > checkpoints, as well as a bit of confusion on what should be permitted from > within a checkpoint. Those caus

Spec cleanup for Finalize Checkpoint

2017-03-29 Thread Thomas Groh
Hey everyone, We've had a few bugs recently in the DirectRunner based around finalizing checkpoints, as well as a bit of confusion on what should be permitted from within a checkpoint. Those caused some revisiting of the checkpoint spec, both to make sure we have written down what a runner is mean

Re: [PROPOSAL] @OnWindowExpiration

2017-03-29 Thread Kenneth Knowles
I had totally forgotten that this was filed as https://issues.apache.org/jira/browse/BEAM-1589 already, which I have now assigned to myself. And, of course, there have been many discussions that mentioned the feature, so my initial phrasing as though it was a new idea probably seemed a bit odd. I

Re: [DISCUSS] Change "RunnableOnService" To A More Intuitive Name

2017-03-29 Thread Kenneth Knowles
To have something to focus on, I have opened https://github.com/apache/beam/pull/2359 to remove all of either tags from anything outside of the core libraries. I believe there is rough consensus around this and I just wanted to make it extremely concrete. Comments appreciated, especially on those

Re: [PROPOSAL] @OnWindowExpiration

2017-03-29 Thread Thomas Groh
+1 The fact that we have this ability already (including all of the required information), just in a roundabout way by manually dredging in the allowed lateness, means that this isn't a huge burden to implement on an SDK or runner side; meanwhile, this much more strongly communicates what a user i

Re: [DISCUSSION] Consistent use of loggers

2017-03-29 Thread Davor Bonaci
+1 on consistency across Beam modules on the logging facade +1 on enforcing consistency +1 on clearly documenting how to do logging Mixed feelings: * Logging backend could be runner-specific, particularly if it needs to integrate into some other experience * java.util.logging could be a good choic

Re: [PROPOSAL] @OnWindowExpiration

2017-03-29 Thread Kenneth Knowles
On Wed, Mar 29, 2017 at 12:16 AM, JingsongLee wrote: > If user have a WordCount StatefulDoFn, the result of > counts is always changing before the expiration of window. > Maybe the user want a signal to know the count is the final value > and then archive the value to the timing database or somew

Re: IO IT Patterns: Simplifying data loading

2017-03-29 Thread Stephen Sisk
Hey Cham, Debugging is harder I definitely agree. As I said (and I think you still generally agree), I think the tradeoff is worth it. Looking at the data store in question can quickly narrow it down to one vs the other for a particular failure. Eventually consistent data stores

Re: Beam spark 2.x runner status

2017-03-29 Thread Jean-Baptiste Onofré
Cool for the PR merge, I will rebase my branch on it. Thanks ! Regards JB On 03/29/2017 01:58 PM, Amit Sela wrote: @Ted definitely makes sense. @JB I'm merging https://github.com/apache/beam/pull/2354 soon so any deprecated Spark API issues should be resolved. On Wed, Mar 29, 2017 at 2:46 PM T

Re: Beam spark 2.x runner status

2017-03-29 Thread Amit Sela
@Ted definitely makes sense. @JB I'm merging https://github.com/apache/beam/pull/2354 soon so any deprecated Spark API issues should be resolved. On Wed, Mar 29, 2017 at 2:46 PM Ted Yu wrote: > This is what I did over HBASE-16179: > > -f.call((asJavaIterator(it), conn)).iterator() > +

Re: Beam spark 2.x runner status

2017-03-29 Thread Ted Yu
This is what I did over HBASE-16179: -f.call((asJavaIterator(it), conn)).iterator() +// the return type is different in spark 1.x & 2.x, we handle both cases +f.call(asJavaIterator(it), conn) match { + // spark 1.x + case iterable: Iterable[R] => iterable.

Re: Beam spark 2.x runner status

2017-03-29 Thread Amit Sela
You will need replace the return value of the callback to iterator On Wed, Mar 29, 2017, 12:19 Jean-Baptiste Onofré wrote: > I tested a workaround with reflection and it seems to work (at least it > compiles > ;)). > > I will share the PR asap. > > Regards > JB > > On 03/29/2017 10:47 AM, Amit S

Re: Beam spark 2.x runner status

2017-03-29 Thread Jean-Baptiste Onofré
I tested a workaround with reflection and it seems to work (at least it compiles ;)). I will share the PR asap. Regards JB On 03/29/2017 10:47 AM, Amit Sela wrote: Just tried to replace dependencies and see what happens: Most required changes are about the runner using deprecated Spark APIs,

Re: Beam spark 2.x runner status

2017-03-29 Thread Amit Sela
Just tried to replace dependencies and see what happens: Most required changes are about the runner using deprecated Spark APIs, and after fixing them the only real issue is with the Java API for Pair/FlatMapFunction that changed return value to Iterator (in 1.6 its Iterable). So I'm not sure tha

Re: [PROPOSAL] @OnWindowExpiration

2017-03-29 Thread JingsongLee
If user have a WordCount StatefulDoFn, the result of counts is always changing before the expiration of window. Maybe the user want a signal to know the count is the final value and then  archive the value to the timing database or somewhere else. best, JingsongLee