Re: Does writeDynamic() support writing different element groups to different output paths?

2021-03-03 Thread Kobe Feng
…html?org/apache/beam/sdk/io/FileIO.html

> Seems like writeDynamic() only supports specifying a different naming
> strategy.
>
> How can I specify different hourly-based output paths for hourly data with
> Beam writeDynamic? Please advise. Thanks!

-- Yours Sincerely Kobe Feng
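For context on the question above: in Beam's Java SDK, FileIO.writeDynamic() routes each element to a destination via `.by()`, and `.withNaming()` takes a per-destination FileNaming, so an hourly path can be encoded in the destination key and reflected in the generated file paths. A minimal, Beam-free sketch of the hour-bucket key derivation only (the function name `hourly_prefix` is illustrative, not a Beam API):

```python
from datetime import datetime, timezone

def hourly_prefix(ts: datetime) -> str:
    # Derive an hourly output sub-path (e.g. "2021/03/03/14") from an
    # event timestamp. In writeDynamic() this would be the destination
    # key computed in .by(), consumed by a per-destination FileNaming.
    return ts.strftime("%Y/%m/%d/%H")

print(hourly_prefix(datetime(2021, 3, 3, 14, 30, tzinfo=timezone.utc)))
```

The per-destination FileNaming can then prepend this key to the generated file names, giving hourly output paths under a single `.to()` base directory.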

Re: Quick question regarding ParquetIO

2021-01-08 Thread Kobe Feng
> Is there a way to avoid specifying the Avro schema when reading Parquet
> files? The reason is that we may not know the Parquet schema until we read
> the files. In comparison, the Spark Parquet reader
> <https://spark.apache.org/docs/latest/sql-data-sources-parquet.html>
> does not require such a schema specification.
>
> Please advise. Thanks a lot!

-- Yours Sincerely Kobe Feng
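Spark can skip the schema argument because Parquet is self-describing: the schema lives in the file footer metadata. A common workaround in Beam at the time was to read one file's footer up front (e.g. with pyarrow) and convert it to an Avro schema before constructing ParquetIO.read(). A stdlib-only sketch of the footer layout that such inference relies on (per the Parquet format spec: a 4-byte little-endian footer length followed by the `PAR1` magic at end of file; the tail bytes below are fabricated for illustration):

```python
import struct

def parquet_footer_length(tail: bytes) -> int:
    # The last 8 bytes of a Parquet file are: a 4-byte little-endian
    # length of the footer (which holds the schema) + the magic b"PAR1".
    if tail[-4:] != b"PAR1":
        raise ValueError("not a Parquet file")
    return struct.unpack("<I", tail[-8:-4])[0]

# Fabricated file tail claiming a 1234-byte footer.
print(parquet_footer_length(struct.pack("<I", 1234) + b"PAR1"))
```

Reading that many bytes before the magic yields the footer, from which the schema can be decoded before the pipeline is built.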

Re: About Beam SQL Schema Changes and Code generation

2020-12-08 Thread Kobe Feng
Talat, my bad; first things first: to resolve the issue, your proposal would definitely be a helpful starting point for researching schema evolution in a Beam pipeline, and I could comment there if needed. Andrew's first reply is clear about the intention and scope for Apache Beam: a static graph for maximum opti…

Re: About Beam SQL Schema Changes and Code generation

2020-12-08 Thread Kobe Feng
…data with a schema ID for schema-based transforms, etc. I've been kind of away from Apache Beam for a while; sorry if Beam already has such native support or I misunderstood. Thanks! Kobe Feng

On Tue, Dec 8, 2020 at 3:15 PM Reuven Lax wrote:
> Talat, are you interested in writing a proposal…
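The "schema ID" idea mentioned above follows the schema-registry pattern: records carry a small ID, and the full schema is resolved at processing time through a registry lookup, so the pipeline graph need not be rebuilt for every schema change. A minimal sketch of that lookup (the registry contents and field names are fabricated for illustration):

```python
# Minimal schema-registry lookup: records carry only an ID; the full
# schema is resolved at processing time (schemas fabricated here).
REGISTRY = {
    1: {"fields": ["id", "name"]},
    2: {"fields": ["id", "name", "email"]},  # evolved schema: new field
}

def resolve(record: dict) -> dict:
    # A schema-aware transform would use the resolved schema to decode
    # the record payload, tolerating additive schema evolution.
    return REGISTRY[record["schema_id"]]

print(resolve({"schema_id": 2, "payload": b"..."})["fields"])
```

This is the general pattern behind registries like Confluent's; whether Beam's schema-based transforms can consume it natively was exactly the open question in this thread.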

Re: Quick question regarding production readiness of ParquetIO

2020-12-01 Thread Kobe Feng
> https://beam.apache.org/releases/javadoc/2.25.0/org/apache/beam/sdk/io/parquet/ParquetIO.html
>
> Does it mean it's not yet ready for prod usage? If that's the case, when
> will it be ready?
>
> Also, is there any known performance/scalability/reliability issue with
> ParquetIO?
>
> Thanks a lot!

-- Yours Sincerely Kobe Feng

Re: Upload third party runtime dependencies for expanding transform like KafkaIO.Read in Python Portable Runner

2020-10-02 Thread Kobe Feng
…manually start the expansion service.) Exactly how to specify at a top level a set of extra dependencies to be applied to a particular subset of other-language transforms is still an open problem. Alternatively we could try to make expansion services themselves…
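The concrete workaround this reply alludes to is starting the expansion service manually, pointing it at a jar that already bundles the extra runtime dependencies, then handing its address to the Python transform via the `expansion_service` argument of `ReadFromKafka`. A sketch of assembling the launch command (the jar name and port are illustrative; Beam ships a standard `beam-sdks-java-io-expansion-service` jar that takes a port as its argument):

```python
import shlex

def expansion_service_cmd(uber_jar: str, port: int) -> str:
    # Launch command for a manually started Java expansion service.
    # The uber jar is assumed to bundle the third-party dependencies
    # that the default KafkaIO environment would otherwise lack.
    return f"java -jar {shlex.quote(uber_jar)} {port}"

print(expansion_service_cmd("kafka-io-with-extra-deps.jar", 8097))
```

On the Python side, one would then pass something like `expansion_service='localhost:8097'` when constructing `ReadFromKafka`, so expansion happens against the service that has the extra dependencies on its classpath.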

Re: Upload third party runtime dependencies for expanding transform like KafkaIO.Read in Python Portable Runner

2020-10-02 Thread Kobe Feng
…ng to do? When using KafkaIO, the provided jar should have all the necessary dependencies to construct and execute the Kafka read/write. Is there some reason you need to inject additional dependencies into the environment provided by Kafka?

On Fri, Oct 2, 2020 at 3:20 PM Kobe…

Re: Upload third party runtime dependencies for expanding transform like KafkaIO.Read in Python Portable Runner

2020-10-02 Thread Kobe Feng
…loaded (still the default). I guess only the Dataflow runner supports it, from glancing at the code, but I believe it's the correct way; I just need to dig into the code here when I get back, and then I will update this thread too. Kobe

On Wed, Sep 30, 2020 at 1:26 PM Kobe Feng wrote:
> Hi everyone, …

Upload third party runtime dependencies for expanding transform like KafkaIO.Read in Python Portable Runner

2020-09-30 Thread Kobe Feng
…pipeline options.

-- Yours Sincerely Kobe Feng