ParquetIO has been in Beam since the 2.5.0 release, so it can be considered quite stable and mature. I’m not aware of any major open issues, and you can check its performance here [1][2].
On the other hand, you are right - it’s annotated with @Experimental, like many other Beam Java IOs and components, which confuses people. There is a long history behind this in Beam, and we have had several related discussions (the latest one is [3]) on how to reduce the number of these "experimental" annotations.

[1] http://metrics.beam.apache.org/d/bnlHKP3Wz/java-io-it-tests-dataflow?panelId=16&fullscreen&orgId=1
[2] http://metrics.beam.apache.org/d/bnlHKP3Wz/java-io-it-tests-dataflow?panelId=17&fullscreen&orgId=1
[3] https://lists.apache.org/thread.html/0f769736be1cf2fc5227f7a25dd3fdbb9296afe8a071761cb91f588a%40%3Cdev.beam.apache.org%3E

> On 30 Nov 2020, at 22:13, Tao Li <[email protected]> wrote:
>
> Hi Beam community,
>
> According to this link the ParquetIO is still considered experimental:
> https://beam.apache.org/releases/javadoc/2.25.0/org/apache/beam/sdk/io/parquet/ParquetIO.html
>
> Does it mean it’s not yet ready for prod usage? If that’s the case, when will
> it be ready?
>
> Also, is there any known performance/scalability/reliability issue with
> ParquetIO?
>
> Thanks a lot!
