Chendi.Xue created ARROW-6720:
---------------------------------

             Summary: [JAVA][C++]Support Parquet Read and Write in Java
                 Key: ARROW-6720
                 URL: https://issues.apache.org/jira/browse/ARROW-6720
             Project: Apache Arrow
          Issue Type: New Feature
          Components: C++, Java
    Affects Versions: 0.15.0
            Reporter: Chendi.Xue
             Fix For: 0.15.0


We added a new Java interface to support reading and writing Parquet files 
from HDFS or the local file system.

The motivation is that when loading and dumping Parquet data in Java, we can 
currently only use row-based put and get methods. Since Arrow already has a 
C++ implementation for loading and dumping Parquet, we wrapped that code as 
Java APIs.
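
A minimal sketch of the wrapping pattern described above: a Java reader that 
holds an opaque native handle and returns whole column batches instead of 
single rows. Every name here (NativeParquetBackend, FakeBackend, 
ColumnBatchReader, Demo) is hypothetical and is not the API of the 
contributed code. In a real JNI binding the backend methods would be declared 
`native` and implemented in C++; an in-memory fake stands in so the sketch 
runs on its own.

```java
// Hypothetical stand-in for the native side. In a real JNI binding these
// would be `private static native` methods implemented in C++.
interface NativeParquetBackend {
    long open(String path);              // returns an opaque handle
    long[] readNextBatch(long handle);   // one column batch, or null at end of data
    void close(long handle);             // releases the native resources
}

// In-memory fake so the sketch runs without any native library (assumption:
// a single long column, pre-split into batches).
final class FakeBackend implements NativeParquetBackend {
    private final long[][] batches;
    private int cursor = 0;
    boolean closed = false;

    FakeBackend(long[][] batches) { this.batches = batches; }

    @Override
    public long open(String path) { return 1L; }

    @Override
    public long[] readNextBatch(long handle) {
        return cursor < batches.length ? batches[cursor++] : null;
    }

    @Override
    public void close(long handle) { closed = true; }
}

// Java-facing reader: owns the handle and releases it in close().
final class ColumnBatchReader implements AutoCloseable {
    private final NativeParquetBackend backend;
    private final long handle;

    ColumnBatchReader(NativeParquetBackend backend, String path) {
        this.backend = backend;
        this.handle = backend.open(path);
    }

    // Returns the next batch of column values, or null when exhausted.
    long[] nextBatch() { return backend.readNextBatch(handle); }

    @Override
    public void close() { backend.close(handle); }
}

final class Demo {
    // Sums every value by iterating batch by batch rather than row by row.
    static long sumAll(NativeParquetBackend backend, String path) {
        long total = 0;
        try (ColumnBatchReader reader = new ColumnBatchReader(backend, path)) {
            long[] batch;
            while ((batch = reader.nextBatch()) != null) {
                for (long v : batch) {
                    total += v;
                }
            }
        }
        return total;
    }
}
```

Keeping the handle inside an AutoCloseable means try-with-resources frees the 
native side even if reading throws, which is the usual concern when Java code 
owns C++ objects.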

In testing on our workload, performance improved by more than 2x compared 
with row-based load and dump, so we would like to contribute the code to Arrow.

Since this is a fully independent change, no existing Arrow code is modified. 
We added two folders: java/adapter/parquet and cpp/src/jni/parquet.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
