Hello,

I've been using the Java API to send and receive data in Arrow format.  I
am having trouble in cases where the schema has map fields, and wanted to
see if this is a usage problem before filing an issue (almost everything
else is working as expected so far).

I have attached an example that creates a simple schema with one integer
field and one map field and writes one record batch.  This part works
correctly.  However, when I try to read the record batch back, it always
fails with an exception (see stack trace below).  This only happens when I
use map types -- I've done a similar thing with a list and it works
perfectly.

The error makes me think that more is being written to the record batch
than should be, so the reader finds unexpected buffers when the batch is
read back.  I have to admit I am a little confused by the MapWriter API, so
I may be doing something wrong here -- but again, everything is fine with
lists and ListWriter.
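
As a rough illustration of what the error message seems to describe, here is a toy model of a loader that walks the schema, takes one node and a fixed number of buffers per field, and complains if anything in the batch is left over. It is only a sketch: `LoaderSketch`, `Field`, `consume` and `load` are hypothetical names and not part of the Arrow API, the per-field buffer counts are assumptions, and the scenario in `main` (a map field declared without children while the batch still carries child data) is just one hypothetical way leftovers could arise, not a diagnosis of this failure.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model only: these types are hypothetical and are NOT Arrow classes.
final class LoaderSketch {

    /** A schema field: the number of buffers it owns itself, plus child fields. */
    static final class Field {
        final String name;
        final int bufferCount;          // assumed count, for illustration only
        final List<Field> children;

        Field(String name, int bufferCount, Field... children) {
            this.name = name;
            this.bufferCount = bufferCount;
            this.children = Arrays.asList(children);
        }
    }

    /** Take one node and bufferCount buffers for this field, then recurse into children. */
    static void consume(Field field, Deque<String> nodes, Deque<String> buffers) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("no node left for field " + field.name);
        }
        nodes.removeFirst();
        for (int i = 0; i < field.bufferCount; i++) {
            if (buffers.isEmpty()) {
                throw new IllegalArgumentException("no buffer left for field " + field.name);
            }
            buffers.removeFirst();
        }
        for (Field child : field.children) {
            consume(child, nodes, buffers);
        }
    }

    /** Load a batch against a schema; anything left unconsumed is reported as an error. */
    static void load(List<Field> schema, List<String> nodes, List<String> buffers) {
        Deque<String> remainingNodes = new ArrayDeque<>(nodes);
        Deque<String> remainingBuffers = new ArrayDeque<>(buffers);
        for (Field field : schema) {
            consume(field, remainingNodes, remainingBuffers);
        }
        if (!remainingNodes.isEmpty() || !remainingBuffers.isEmpty()) {
            throw new IllegalArgumentException(
                "not all nodes and buffers were consumed. nodes: " + remainingNodes
                    + " buffers: " + remainingBuffers);
        }
    }

    public static void main(String[] args) {
        // Schema as the reader sees it: the map field has no children declared.
        List<Field> schema = Arrays.asList(new Field("id", 2), new Field("attrs", 1));
        // Batch as written: it also carries nodes and buffers for two map children.
        List<String> nodes = Arrays.asList("id", "attrs", "attrs.key", "attrs.value");
        List<String> buffers = Arrays.asList(
            "id.validity", "id.data", "attrs.validity",
            "attrs.key.validity", "attrs.key.data",
            "attrs.value.validity", "attrs.value.data");
        try {
            load(schema, nodes, buffers);
            System.out.println("loaded cleanly");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the load only succeeds when the schema describes exactly the fields, children included, that the batch contains, which is why a mismatch between what the writer emitted and what the reader's schema expects shows up as leftover nodes and buffers.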

I've attached the source example I used.  If it's helpful, I could rewrite
this as a unit test and submit a PR to include it, and I can log an issue
in GitHub if this turns out to be a real error.

Thanks!
Derek



Exception in thread "main" java.lang.IllegalArgumentException: not all
nodes and buffers were consumed. nodes: [ArrowFieldNode [length=5,
nullCount=0], ArrowFieldNode [length=5, nullCount=0], ArrowFieldNode
[length=5, nullCount=0]] buffers: [ArrowBuf[30], udle: [21 104..105],
ArrowBuf[31], udle: [21 112..113], ArrowBuf[32], udle: [21 120..144],
ArrowBuf[33], udle: [21 144..189], ArrowBuf[34], udle: [21 192..193],
ArrowBuf[35], udle: [21 200..220]]
    at org.apache.arrow.vector.VectorLoader.load(VectorLoader.java:66)
    at org.apache.arrow.vector.file.ArrowReader$1.visit(ArrowReader.java:109)
    at org.apache.arrow.vector.file.ArrowReader$1.visit(ArrowReader.java:95)
    at org.apache.arrow.vector.schema.ArrowRecordBatch.accepts(ArrowRecordBatch.java:128)
    at org.apache.arrow.vector.file.ArrowReader.loadNextBatch(ArrowReader.java:121)
    at com.stitchfix.algorithms.arrowjava.MapReadWrite.run(MapReadWrite.java:136)



*Derek Bennett*
*Software Engineer - Data Platform* | Stitch Fix

One Montgomery Tower Suite 1500
San Francisco, CA 94104
