Community over Code EU 2024: The countdown has started!

2024-05-14 Thread Ryan Skraba
[Note: You're receiving this email because you are subscribed to one or more project dev@ mailing lists at the Apache Software Foundation.] We are very close to Community Over Code EU -- check out the amazing program and the special discounts that we have for you. Special discounts You still hav

Community over Code EU 2024: Start planning your trip!

2024-04-03 Thread Ryan Skraba
Dear community, We hope you are doing great, are you ready for Community Over Code EU? Check out the featured sessions, get your tickets with special discoun

Meet our keynote speakers and register to Community Over Code EU!

2023-12-22 Thread Ryan Skraba
* Merge with the ASF EUniverse! The registration for Community

Call for Presentations now open: Community over Code EU 2024

2023-10-30 Thread Ryan Skraba
excellent platform for showcasing high-level projects and incubator initiatives in a visually engaging manner. We believe this will foster lively discussions and facilitate networking opportunities among participants. All my best, and thanks so much for your participation, Ryan Skraba (on be

Re: [RESULT] [VOTE] Release 2.33.0, release candidate #1

2021-09-29 Thread Ryan Skraba
Just to follow up -- the cherry-pick of https://github.com/apache/beam/pull/15616 changes the default value of a configuration option that appears for the first time in 2.33.0. I think it's a strong argument for making the change now, before unwary developers start using the wrong default. I unde

Re: Avro String decoding changes in Beam 2.30.0

2021-07-27 Thread Ryan Skraba
ew coder >> makes more sense to me. >> >> That said, people who might have an opinion: /cc @Ismaël Mejía @Kenneth >> Knowles @Lukasz Cwik +Vitaly >> >>> >>> >>> Thanks, >>> Claire >>> >>> On Tue, Jul 20, 2021 at

Re: Avro String decoding changes in Beam 2.30.0

2021-07-16 Thread Ryan Skraba
Hello! Good catch, I'm taking a look, but it looks like you're entirely correct and there isn't any obvious workaround. I guess you could regenerate every SpecificRecord class in order to add the "java-class" or "avro.java.string" annotation, but that shouldn't be necessary. From the Avro persp

Re: [ANNOUNCE] New Committer: Kamil Wasilewski

2020-03-02 Thread Ryan Skraba
Congratulations Kamil! On Mon, Mar 2, 2020 at 8:06 AM Michał Walenia wrote: > Congratulations! > > On Sun, Mar 1, 2020 at 2:55 AM Reza Rokni wrote: > >> Congratulations Kamil >> >> On Sat, 29 Feb 2020, 06:18 Udi Meiri, wrote: >> >>> Welcome Kamil! >>> >>> On Fri, Feb 28, 2020 at 12:53 PM Mark L

Re: [ANNOUNCE] New committer: Alex Van Boxel

2020-02-19 Thread Ryan Skraba
Congratulations Alex! On Wed, Feb 19, 2020 at 9:52 AM Katarzyna Kucharczyk < ka.kucharc...@gmail.com> wrote: > Great news! Congratulations, Alex! 🎉 > > On Wed, Feb 19, 2020 at 9:14 AM Reza Rokni wrote: > >> Fantastic news! Congratulations :-) >> >> On Wed, 19 Feb 2020 at 07:54, jincheng sun >>

Re: [ANNOUNCE] New committer: Michał Walenia

2020-01-28 Thread Ryan Skraba
Congratulations! On Tue, Jan 28, 2020 at 11:26 AM Jan Lukavský wrote: > Congrats Michał! > On 1/28/20 11:16 AM, Katarzyna Kucharczyk wrote: > > Congratulations Michał! 😸 🎉 > > On Tue, Jan 28, 2020 at 9:29 AM Alexey Romanenko > wrote: > >> Congrats, Michał! >> >> On 28 Jan 2020, at 09:20, Ismaël

Re: [VOTE] Beam Mascot animal choice: vote for as many as you want

2019-11-20 Thread Ryan Skraba
*** Vote for as many as you like, using this checklist as a template [X] Beaver [ ] Hedgehog [ ] Lemur [ ] Owl [ ] Salmon [X] Trout [ ] Robot dinosaur [ ] Firefly [ ] Cuttlefish [ ] Dumbo Octopus [ ] Angler fish

Re: Library to Parse Thrift Files for ThriftIO

2019-11-19 Thread Ryan Skraba
For info: https://github.com/airlift/drift has forked and maintained the code over the last few years. On Fri, Nov 15, 2019 at 7:23 PM Reuven Lax wrote: > > At a quick glance, the license is Apache which is fine (though we'd have to > check dependencies as well). I do notice that git repro is no

Re: [ANNOUNCE] New committer: Brian Hulette

2019-11-15 Thread Ryan Skraba
Congratulations! On Fri, Nov 15, 2019 at 10:12 AM Jan Lukavský wrote: > > Congrats Brian! > > On 11/15/19 9:58 AM, Reza Rokni wrote: > > Great news! > > On Fri, 15 Nov 2019 at 15:09, Gleb Kanterov wrote: >> >> Congratulations! >> >> On Fri, Nov 15, 2019 at 5:44 AM Valentyn Tymofieiev >> wrote:

Re: [spark structured streaming runner] merge to master?

2019-11-07 Thread Ryan Skraba
Just a personal opinion! I would prefer ONE jar including all spark runners, and I think the new Spark runner should be present in the release artifacts even if it's in an incomplete state. I have no objection to putting the experimental runner alongside the stable, mature runner. We have some p

Re: JdbcIO read needs to fit in memory

2019-10-29 Thread Ryan Skraba
ith 4GB heap to do the main heavy lifting - JDBC is the main data >>> set, just metadata. >>> >>> I just did run the same JdbcIO read code on Spark and Flink runner. Flink >>> did not blow up on memory. So it seems like this is a limitation of >>

Re: JdbcIO read needs to fit in memory

2019-10-25 Thread Ryan Skraba
One more thing to try -- depending on your pipeline, you can disable the "auto-reshuffle" of JdbcIO.Read by setting withOutputParallelization(false). This is particularly useful if (1) you do aggressive and cheap filtering immediately after the read or (2) you do your own repartitioning action like

Re: JdbcIO read needs to fit in memory

2019-10-24 Thread Ryan Skraba
Hello! If I remember correctly -- the JdbcIO will use *one* DoFn instance to read all of the rows, but that instance is not required to hold all of the rows in memory. The fetch size will, however, read 50K rows at a time by default and those will all be held in memory in that single worker until
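
The point about fetch size can be illustrated outside of Beam. The following is a minimal sketch, not JdbcIO code: it assumes Python's standard sqlite3 module as a stand-in for a JDBC source, and the names read_in_batches and make_demo_connection are hypothetical helpers introduced only for this illustration. A single reader walks the whole result set, but only one batch of fetch_size rows is materialized at a time.

```python
import sqlite3
from typing import Iterator, Tuple


def read_in_batches(
    conn: sqlite3.Connection, query: str, fetch_size: int
) -> Iterator[Tuple]:
    """Yield every row of `query`, fetching `fetch_size` rows per round trip.

    Hypothetical helper: a single reader consumes the whole result set,
    but only the current batch is held in memory.
    """
    cursor = conn.cursor()
    cursor.execute(query)
    while True:
        # fetchmany returns at most fetch_size rows, or an empty list when done.
        batch = cursor.fetchmany(fetch_size)
        if not batch:
            break
        yield from batch


def make_demo_connection(row_count: int) -> sqlite3.Connection:
    """Build an in-memory table with `row_count` rows (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO events (id, payload) VALUES (?, ?)",
        ((i, f"row-{i}") for i in range(row_count)),
    )
    conn.commit()
    return conn
```

With a large fetch size and wide rows, the per-batch memory is what matters, so lowering the fetch size is one knob to try when a single reader runs out of heap.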

Re: Question related to running unit tests in IDE

2019-10-23 Thread Ryan Skraba
Just for info -- I managed to get a pretty good state using IntelliJ 2019.2.3 (Fedora) and a plain gradle import! There's a slack channel at https://s.apache.org/beam-slack-channel (see https://beam.apache.org/community/contact-us/) called #beam-intellij. It's pretty low-traffic, but you might be

Re: DoFn and Source sequence diagrams

2019-10-17 Thread Ryan Skraba
All is well, PlantUML has an Apache Licensed distribution as well, AND the diagrams are explicitly not covered by a license: http://plantuml.com/faq The UML diagrams in Beam Fn API doc are almost certainly PlantUML! On Thu, Oct 17, 2019 at 4:07 PM Ismaël Mejía wrote: > > In previous documents t

Re: [spark structured streaming runner] merge to master?

2019-10-10 Thread Ryan Skraba
Merging to master sounds like a really good idea, even if it is not feature-complete yet. It's already a pretty big accomplishment getting it to the current state (great job all!). Merging it into master would give it a pretty good boost for visibility and encouraging some discussion about where

Re: [ANNOUNCE] New committer: Jan Lukavský

2019-07-31 Thread Ryan Skraba
Congratulations Jan! On Wed, Jul 31, 2019 at 10:10 AM Ismaël Mejía wrote: > > Hi, > > Please join me and the rest of the Beam PMC in welcoming a new > committer: Jan Lukavský. > > Jan has been contributing to Beam for a while, he was part of the team > that contributed the Euphoria DSL extension,

Re: [DISCUSS] Moving FakeBigQueryServices to main/ rather than test/

2019-07-31 Thread Ryan Skraba
Hello! No objection to the move :/ But what do you think about publishing the test jar created in google-cloud-platform to be reused without moving the code to the main artifact jar? I admit that I'm familiar with this technique with maven, and not at all with gradle, but it's described here: ht

Re: Choosing a coder for a class that contains a Row?

2019-07-24 Thread Ryan Skraba
guring this > out. But I reckon I'll have to come back to this... > > Best > -P. > > On Tue, Jul 23, 2019 at 1:07 AM Ryan Skraba wrote: >> >> Hello Pablo! Just to clarify -- the Row schemas aren't known at >> pipeline construction time, but can be d

Re: [Python] Read Hadoop Sequence File?

2019-07-17 Thread Ryan Skraba
Hello! I dug a bit into this (not a FileIO expert), and it looks like LocalFileSystem only matches globs in file names (not directories): https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/LocalFileSystem.java#L251 Perhaps related: https://issues.apache
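
The behaviour described above (a glob that is honoured in the file name but not in directory components) can be sketched as follows. This is an illustration of the observation in the message, not a port of Beam's LocalFileSystem: the function name match_file_name_glob is hypothetical, and the logic assumes that the parent directory of the pattern is taken literally and only the last path component is treated as a glob.

```python
import fnmatch
import os
from typing import List


def match_file_name_glob(spec: str) -> List[str]:
    """Match `spec` against files, expanding wildcards in the file name only.

    Hypothetical helper: the parent directory is used as a literal path,
    so a wildcard in a directory component matches nothing.
    """
    parent, pattern = os.path.split(spec)
    parent = parent or "."
    if not os.path.isdir(parent):
        # e.g. "/data/part-*/x.seq": there is no directory literally named "part-*".
        return []
    return sorted(
        os.path.join(parent, name)
        for name in os.listdir(parent)
        if fnmatch.fnmatch(name, pattern)
        and os.path.isfile(os.path.join(parent, name))
    )
```

Under these assumptions, a pattern such as dir/*.seq finds the files in dir, while a pattern such as part-*/x.seq returns an empty list even when matching directories exist.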

Re: pubsub -> IO

2019-07-17 Thread Ryan Skraba
Hello! To clarify, you want to do something like this? PubSubIO.read() -> extract mongodb collection and range -> MongoDbIO.read(collection, range) -> ... If I'm not mistaken, it isn't possible with the implementation of MongoDbIO (based on BoundedSource interface, requiring the collection to be

Re: Wiki access?

2019-07-03 Thread Ryan Skraba
Oof, sorry: ryanskraba Thanks in advance! There's a lot of great info in there. On Wed, Jul 3, 2019 at 5:03 PM Lukasz Cwik wrote: > Can you share your login id for cwiki.apache.org? > > On Wed, Jul 3, 2019 at 7:21 AM Ryan Skraba wrote: > >> Hello -- I've been r

Wiki access?

2019-07-03 Thread Ryan Skraba
Hello -- I've been reading through a lot of Beam documentation recently, and noting minor typos here and there... Is it possible to get Wiki access to make fixes on the spot? Best regards, Ryan