Hi all, I am looking for a quick way to look up the total row count of a data set stored in Arrow’s random access file format using the Java API. Basically, a quicker way to do this:

    // reader is an instance of ArrowFileReader
    VectorSchemaRoot root = reader.getVectorSchemaRoot();
    List<ArrowBlock> blocks = reader.getRecordBlocks();
    int nRows = 0;
    for (ArrowBlock block : blocks) {
        // root is repopulated with the contents of each loaded batch
        reader.loadRecordBatch(block);
        nRows += root.getRowCount();
    }

My understanding is that the snippet above loads the entire data set instead of just the block headers.

To give you some context, I am looking into using Arrow for IPC between a JVM and a Python interpreter, using a custom data format on the JVM side and PyArrow/Pandas on the Python side. While the streaming API might be a better tool for this job, I started out with files to keep things simple.

Any help would be greatly appreciated; maybe I just missed the right bit of documentation.

Thanks,
Michael