iemejia commented on code in PR #3565:
URL: https://github.com/apache/parquet-java/pull/3565#discussion_r3343039837


##########
parquet-column/src/main/java/org/apache/parquet/column/values/plain/PlainValuesReader.java:
##########
@@ -19,120 +19,91 @@
 package org.apache.parquet.column.values.plain;
 
 import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
 import org.apache.parquet.bytes.ByteBufferInputStream;
-import org.apache.parquet.bytes.LittleEndianDataInputStream;
 import org.apache.parquet.column.values.ValuesReader;
-import org.apache.parquet.io.ParquetDecodingException;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 /**
- * Plain encoding for float, double, int, long
+ * Plain encoding for float, double, int, long.
+ *
+ * <p>Reads directly from a {@link ByteBuffer} with {@link 
ByteOrder#LITTLE_ENDIAN} byte order,
+ * bypassing the deprecated {@code LittleEndianDataInputStream} wrapper to 
avoid per-value virtual
+ * dispatch overhead. The underlying page data is obtained as a single 
contiguous {@link ByteBuffer}
+ * via {@link ByteBufferInputStream#slice(int)}.
  */
 public abstract class PlainValuesReader extends ValuesReader {
   private static final Logger LOG = 
LoggerFactory.getLogger(PlainValuesReader.class);
 
-  protected LittleEndianDataInputStream in;

Review Comment:
   Good point. The old `protected LittleEndianDataInputStream in` field is now 
`protected ByteBuffer buffer` — the type change is binary-incompatible 
regardless, so a deprecation cycle wouldn't help external subclasses (they'd 
get a compile error either way). I searched the project and only internal 
subclasses (the 4 inner classes) access this field. I think this is acceptable 
given this class was never annotated `@Public` and the field type change makes 
deprecation impractical. WDYT?



##########
parquet-common/src/main/java/org/apache/parquet/bytes/BytesInput.java:
##########
@@ -643,6 +673,20 @@ public ByteBuffer toByteBuffer() throws IOException {
       return java.nio.ByteBuffer.wrap(in, offset, length);
     }
 
+    /**
+     * Zero-copy override: returns the backing array directly when fully used,
+     * skipping the base-class BAOS allocation + copy on every decompressor 
call.
+     * Returning the mutable array is safe — the base class already exposes a
+     * mutable {@code BAOS.getBuf()}.
+     */
+    @Override
+    public byte[] toByteArray() {

Review Comment:
   Addressed in f0bdac633. The base-class `toInputStream()` now tries 
`getInternalByteBuffer()` first (zero-copy fast path) before falling back to 
the deprecated `toByteBuffer()`. Also added `getInternalByteBuffer()` and 
`toInputStream()` overrides to `ByteArrayBytesInput` so the byte-array-backed 
path is zero-copy too. Added `@SuppressWarnings("deprecation")` on the 
intentional overrides.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to