Thanks a lot, Wes.

On Fri, Aug 24, 2018 at 12:17 AM Wes McKinney <wesmck...@gmail.com> wrote:
> hi Wenjian -- I am not an expert in the Java library. Perhaps Bryan,
> Li, Jacques, or Sidd can point you in the right direction. You can
> take a look at the Dremio codebase to see more examples of Arrow in
> action
>
> https://github.com/dremio/dremio-oss
>
> - Wes
>
> On Tue, Aug 14, 2018 at 10:08 PM, Xu,Wenjian <zero...@gmail.com> wrote:
> > Hi Wes,
> >
> > Thank you for your kind help.
> >
> > Actually, I am working on a Java UDF that iterates over an
> > *array<string>* in SQL.
> >
> > I understand that, in order to represent *array<string>* in Arrow
> > format, I could use ListVector with VarCharVector as the inner list.
> > My question is, how can I efficiently access all the elements (i.e.,
> > each byte[] as a string)?
> >
> > By checking the test code:
> >
> > https://github.com/apache/arrow/blob/master/java/vector/src/test/java/org/apache/arrow/vector/TestListVector.java
> >
> > one option is to use ListVector.getObject(int index) to get each
> > ArrayList<Text>, and then iterate over each element in the
> > ArrayList<Text>. But this method is expensive because:
> >
> > 1) it calls VarCharVector.get(int index), which involves a memory copy
> > 2) it calls Text.set(byte[]), which assembles the Text from the byte
> > array.
> >
> > My goal is just to retrieve each byte[] and do some filtering. Is
> > there any other, less expensive method to achieve my goal? For example,
> > VarCharVector.get(int index, NullableVarCharHolder holder) seems to be
> > a less expensive operation. But how do I use this method in my case?
> >
> > Thanks again.
> >
> > Best regards,
> > Wenjian
> >
> > On Wed, Aug 15, 2018 at 3:19 AM Wes McKinney <wesmck...@gmail.com> wrote:
> >>
> >> hi Wenjian,
> >>
> >> In C++ you can use ListBuilder together with UInt8Builder. There are
> >> examples of using ListBuilder you can look at in
> >> src/arrow/array-test.cc.
> >>
> >> For Java you might want to have a look at how Spark SQL converts its
> >> Array<T> types into Arrow (there should be other examples in the Java
> >> unit test suite, too):
> >>
> >> https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowWriter.scala
> >>
> >> - Wes
> >>
> >> On Mon, Aug 13, 2018 at 6:00 AM, Xu,Wenjian <zero...@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > If I want to create a list<list<byte>> structure (as shown in
> >> > https://arrow.apache.org/docs/memory_layout.html), what class(es)
> >> > do I need to use in the Java API and the C++ API?
> >> >
> >> > Any suggestion would be appreciated. Thanks.
> >> >
> >> > Best regards,
> >> > Wenjian
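
For anyone reading this thread later, here is a rough sketch of the idea behind the question above: in the list<string> layout, each list row is an offset range into the child elements, and each string element is an offset range into one shared data buffer, so a filter can compare bytes in place instead of materializing a Text or byte[] per element. This is a plain-Java model of that layout and does not use the Arrow Java classes; the class name ListOfStringsLayout and the method countWithPrefix are made up for illustration. The holder-based VarCharVector.get mentioned above appears to hand back the same kind of start/end range, but that is an assumption to check against the Arrow Javadoc for your version.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical plain-Java model of a list<utf8> column; NOT the Arrow API. */
final class ListOfStringsLayout {
    // List level: row i owns child elements [listOffsets[i], listOffsets[i + 1]).
    private final int[] listOffsets;
    // String level: element j owns data bytes [valueOffsets[j], valueOffsets[j + 1]).
    private final int[] valueOffsets;
    // All string bytes, stored back to back.
    private final ByteBuffer data;

    ListOfStringsLayout(String[][] rows) {
        int elementCount = 0;
        for (String[] row : rows) {
            elementCount += row.length;
        }
        listOffsets = new int[rows.length + 1];
        valueOffsets = new int[elementCount + 1];
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int element = 0;
        for (int i = 0; i < rows.length; i++) {
            for (String s : rows[i]) {
                byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
                bytes.write(utf8, 0, utf8.length);
                valueOffsets[++element] = bytes.size();
            }
            listOffsets[i + 1] = element;
        }
        data = ByteBuffer.wrap(bytes.toByteArray());
    }

    /** Counts the elements of one row that start with prefix, reading bytes in place. */
    int countWithPrefix(int row, byte[] prefix) {
        int count = 0;
        for (int j = listOffsets[row]; j < listOffsets[row + 1]; j++) {
            if (startsWith(valueOffsets[j], valueOffsets[j + 1], prefix)) {
                count++;
            }
        }
        return count;
    }

    // Compares prefix against data[start, end) without allocating a copy.
    private boolean startsWith(int start, int end, byte[] prefix) {
        if (end - start < prefix.length) {
            return false;
        }
        for (int k = 0; k < prefix.length; k++) {
            if (data.get(start + k) != prefix[k]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ListOfStringsLayout column = new ListOfStringsLayout(new String[][] {
            {"apple", "apricot", "banana"}, {}, {"avocado"}});
        byte[] prefix = "ap".getBytes(StandardCharsets.UTF_8);
        for (int row = 0; row < 3; row++) {
            System.out.println("row " + row + ": " + column.countWithPrefix(row, prefix));
        }
    }
}
```

The sketch ignores validity (null) bitmaps, which a real list or varchar vector also carries, so a real UDF would need a null check before reading each range.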