Re: Beam Meetups Feb 2019

2019-02-11 Thread Teja MVSR
Hi, Can you please provide any video recordings if they are available? Thanks, Teja On Mon, Feb 11, 2019, 4:51 PM Austin Bennett The slides from Tyler's presentation found: > http://s.apache.org/beam-intro-feb-2019 > > I'll also send out links to videos once I get my hands on them (@Mark > Grov

Re: Thoughts on a reference runner to invest in?

2019-02-11 Thread Kenneth Knowles
Interesting silence here. You've got it right that the reason we initially chose Java was the cross-runner sharing. The reference runner could be the first target runner for any new feature and then its work could directly (or indirectly via copy/paste/modify if it works better) be us

Re: Thoughts on a reference runner to invest in?

2019-02-11 Thread Daniel Oliveira
Yeah, the FnApiRunner is what I'm leaning towards too. I wasn't sure how much demand there was for an actual reference implementation in Java though, so I was hoping there were runner authors that would want to chime in. On the other hand, the Flink runner could serve as a reference implementation

Re: JIRA priorities explanation

2019-02-11 Thread Daniel Oliveira
Ah, sorry, I missed that Alex was just quoting from our Jira installation (didn't read his email closely enough). Also I wasn't aware of those pages on our website. Seeing as we do have definitions for our priorities, I guess my main request would be that they be made more discoverable somehow.

Re: Add exception handling to MapElements

2019-02-11 Thread Sam Rohde
Sure, I was thinking of treating the apply as a promise (making use of your CodableException idea as well): ``` PCollection<...> result = words.apply(new SomeUserDoFn()) .then(new SomeOtherDoFn()) .then(new OtherDoFn(), // Error Handler (CodableException<...> e) -> { logger.info(e.getM

Re: Beam Meetups Feb 2019

2019-02-11 Thread Austin Bennett
The slides from Tyler's presentation found: http://s.apache.org/beam-intro-feb-2019 I'll also send out links to videos once I get my hands on them (@Mark Grover ). On Mon, Feb 11, 2019 at 9:48 AM Thomas Weise wrote: > Here are slides for 2 of the presentations from the Lyft meetup: > > Python/

Re: Add exception handling to MapElements

2019-02-11 Thread Jeff Klukas
Vallery Lancey's post is definitely one of the viewpoints incorporated into this approach. I neglected to include that link in this iteration, but it was discussed in the previous thread. Can you explain more about "another option that adds A+ Promise spec into the apply method"? I'm failing to pa

Re: Correct way to implement ProcessBundleProgressResponse in the Java SDK

2019-02-11 Thread Sam Rohde
Yeah, take a look at the ProcessRemoteBundleOperation.java class. This is the class that is in charge of handling

Re: [PROPOSAL] Prepare Beam 2.11.0 release

2019-02-11 Thread Sam Rohde
Thanks Ahmet! The 2.11.0 release will also be using the revised release process from PR-7529 that I authored. Let me know if you have any questions or if I can help in any way. I would love feedback on how to improve on the modifications I made and the rel

Re: Add exception handling to MapElements

2019-02-11 Thread Sam Rohde
Interesting ideas! I think you're really honing in on what the Apache Beam API is missing: error handling for bad data and runtime errors. I like your method because it coalesces all the errors into a single collection to be looked at later. Also easy to add a PAssert on the errors collection. Loo
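The idea of coalescing all errors into a single collection can be illustrated with a minimal plain-Java sketch. This is not Beam code and not the API proposed in the thread: `ErrorCollectingMap`, `Result`, `Failure`, and `mapCollecting` are hypothetical names chosen for illustration, and ordinary lists stand in for PCollections.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Plain-Java sketch, not Beam API: every name in this file is hypothetical.
public class ErrorCollectingMap {

    /** One failed input element together with the message of the exception it caused. */
    static final class Failure<T> {
        final T element;
        final String message;

        Failure(T element, String message) {
            this.element = element;
            this.message = message;
        }
    }

    /** Successful outputs and failures, kept in two separate collections. */
    static final class Result<I, O> {
        final List<O> outputs = new ArrayList<>();
        final List<Failure<I>> failures = new ArrayList<>();
    }

    /** Applies fn to each input; a runtime exception is recorded instead of aborting the loop. */
    static <I, O> Result<I, O> mapCollecting(List<I> inputs, Function<I, O> fn) {
        Result<I, O> result = new Result<>();
        for (I input : inputs) {
            try {
                result.outputs.add(fn.apply(input));
            } catch (RuntimeException e) {
                result.failures.add(new Failure<>(input, e.getMessage()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result<String, Integer> r =
            mapCollecting(Arrays.asList("1", "oops", "3"), Integer::parseInt);
        System.out.println("outputs: " + r.outputs);
        System.out.println("failed elements: " + r.failures.size());
    }
}
```

Because the failures end up in one collection, a test can assert on that collection directly, which is roughly the role a PAssert on the errors collection would play in a real pipeline.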

Re: Another another new contributor! :)

2019-02-11 Thread Kyle Weaver
Forgot to get added to the Jira. My username is ibzib. Can someone add me please? Kyle Weaver | Software Engineer | kcwea...@google.com | +1650203 On Fri, Feb 8, 2019 at 1:23 AM Matthias Baetens wrote: > Welcome, good to have you Kyle! > > On Fri, 8 Feb 2019 at 00:43, Joana Filipa Berna

Re: Thoughts on a reference runner to invest in?

2019-02-11 Thread Sam Rohde
Thanks for starting this thread. If I had to guess, I would say there is more of a demand for Python as it's more widely used for data science/analytics. Being pragmatic, the FnApiRunner already has more feature work than the Java one so we should go with that. -Sam On Fri, Feb 8, 2019 at 10:07 A

Re: FYI: beam11 bad worker

2019-02-11 Thread Yifan Zou
The beam11 is temporarily offline. Jobs won't be assigned to it at this time. We'll reconfigure the VM and bring it back later. On Mon, Feb 11, 2019 at 10:05 AM Mikhail Gryzykhin wrote: > Hi everyone, > > Small update: > We have a bad jenkins executor that fails all builds. You might experience

Re: Empty projects defined in settings.xml

2019-02-11 Thread Kenneth Knowles
Looks like an accidental revert of changes in https://github.com/apache/beam/commit/097be25fa77a0f1cff1883112fa8a863ac17b895. The modules that were deleted go way back to before settings.gradle. On Mon, Feb 11, 2019 at 9:12 AM Alexey Romanenko wrote: > Hmm, I’m confused since this is me, who is

Re: pipeline steps

2019-02-11 Thread Reuven Lax
On Mon, Feb 11, 2019 at 8:53 AM Kenneth Knowles wrote: > In use cases that actually need the filename / topic name / etc, it is > mandatory information. It isn't overhead or a performance hit. > I think many other systems track records as offsets in a source. So despite the fact that they provide

FYI: beam11 bad worker

2019-02-11 Thread Mikhail Gryzykhin
Hi everyone, Small update: We have a bad Jenkins executor that fails all builds. You might experience pre/post commit failures. Yifan is following up on this. Regards, --Mikhail

Re: Beam Meetups Feb 2019

2019-02-11 Thread Thomas Weise
Here are slides for 2 of the presentations from the Lyft meetup: Python/Flink/Streaming: http://go.lyft.com/python-flink-beam-meetup-2019 Use Case: https://www.slideshare.net/AmarPai2/dynamic-pricing-of-lyft-rides-using-streaming +Tyler Akidau do you have pointers for the others by chance? On

[RESULT] [VOTE] Release 2.10.0, release candidate #3

2019-02-11 Thread Kenneth Knowles
Thank you everyone for voting. The vote has passed with 9 supportive +1 votes, 6 of which are binding PMC votes: * Ahmet Altay * Robert Bradshaw * Etienne Chauchot * Kenneth Knowles * Reuven Lax * Maximilian Michels I will proceed with release finalization steps. Kenn On Mon, Feb 11, 2019 at 9

Re: [VOTE] Release 2.10.0, release candidate #3

2019-02-11 Thread Kenneth Knowles
+1 On Fri, Feb 8, 2019 at 12:37 PM Chamikara Jayalath wrote: > +1. Verified that leaderboard passes with Dataflow streaming engine (which > was broken for 2.9.0). > > Thanks, > Cham > > On Fri, Feb 8, 2019 at 9:58 AM Ahmet Altay wrote: > >> +1. I verified python quick start examples. >> >> On F

Re: JIRA priorities explanation

2019-02-11 Thread Scott Wegner
Thanks for driving this discussion. I also was not aware of these existing definitions. Once we agree on the terms, let's add them to our Contributor Guide and start using them. +1 in general; I like both Alex and Kenn's definitions; Additional wordsmithing could be moved to a Pull Request. Can we

Re: Empty projects defined in settings.xml

2019-02-11 Thread Alexey Romanenko
Hmm, I'm confused, since I am the author of this commit [1], but these 2 project definitions are absolutely not related to HadoopFormatIO.Write. Do we need to drop them? [1] https://github.com/apache/beam/commit/757b71e749ab8a9f0a08e3669596ce69920acbac

Re: pipeline steps

2019-02-11 Thread Kenneth Knowles
In use cases that actually need the filename / topic name / etc, it is mandatory information. It isn't overhead or a performance hit. Before SDF, FileIO was somewhat of a special case because it read globs and directories. Most other IOs knew the names of their data source statically anyhow so reifyi

Beam Dependency Check Report (2019-02-11)

2019-02-11 Thread Apache Jenkins Server
High Priority Dependency Updates Of Beam Python SDK: Dependency Name Current Version Latest Version Release Date Of the Current Used Version Release Date Of The Latest Release JIRA Issue future 0.16.0 0.17.1 2016-10-27

Re: pipeline steps

2019-02-11 Thread Alexey Romanenko
Talking about KafkaIO, it's already possible to have this since "apply(KafkaIO.read())" returns "PCollection<KafkaRecord<K, V>>", where KafkaRecord contains message metadata (topic, partition, etc). Though, it works _only_ if "withoutMetadata()" was not used before; if it was, the result is a simple KV. In th
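The trade-off described here (metadata is available only as long as it has not been stripped) can be modelled in a few lines of plain Java. This is a sketch of the idea only, not the Beam KafkaIO API: `RecordMetadataSketch` and `RecordWithMetadata` are hypothetical names, and a `Map.Entry` stands in for Beam's KV.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Plain-Java model of the idea only; this is NOT the Beam KafkaIO API.
// RecordMetadataSketch and RecordWithMetadata are hypothetical names.
public class RecordMetadataSketch {

    /** A key/value pair that also carries where it came from. */
    static final class RecordWithMetadata<K, V> {
        final String topic;
        final int partition;
        final long offset;
        final K key;
        final V value;

        RecordWithMetadata(String topic, int partition, long offset, K key, V value) {
            this.topic = topic;
            this.partition = partition;
            this.offset = offset;
            this.key = key;
            this.value = value;
        }

        /** Drops the source information and keeps only the key/value pair. */
        Map.Entry<K, V> withoutMetadata() {
            return new SimpleEntry<>(key, value);
        }
    }

    public static void main(String[] args) {
        RecordWithMetadata<String, String> record =
            new RecordWithMetadata<>("orders", 0, 42L, "user-1", "created");
        // While the metadata is kept, downstream code can still see the source.
        System.out.println(record.topic + "/" + record.partition + "@" + record.offset);
        // After this call only the pair remains; the topic is no longer recoverable.
        Map.Entry<String, String> kv = record.withoutMetadata();
        System.out.println(kv.getKey() + " -> " + kv.getValue());
    }
}
```

The point of the sketch is that stripping metadata is a one-way step: any stage that needs the topic or partition has to run before the conversion to a plain pair.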

Re: Empty projects defined in settings.xml

2019-02-11 Thread Michael Luckey
Thanks, Kenn. So this is something 'yet to come'? As the definitions are pointing to folders that do not exist? michel On Mon, Feb 11, 2019 at 5:49 AM Kenneth Knowles wrote: > I think :beam-runners-gcp-gcsproxy would be an implementation of the > artifact API [1] on top of GCS. Something fitti

Re: pipeline steps

2019-02-11 Thread Robert Bradshaw
In terms of performance, it would likely be minimal overhead if (as is likely) the step consuming the filename gets fused with the read. There's still overhead constructing this composite object, etc. but that's (again likely) smaller than the cost of doing the read itself. On Sun, Feb 10, 2019 a