ArnavBalyan commented on code in PR #3269:
URL: https://github.com/apache/parquet-java/pull/3269#discussion_r2304981961


##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetFileWriter.java:
##########
@@ -1804,14 +1804,27 @@ private static void copy(SeekableInputStream from, PositionOutputStream to, long
    * @throws IOException if there is an error while writing
    */
   public void end(Map<String, String> extraMetaData) throws IOException {
+    final long footerStart = out.getPos();
+
+    // Build the footer metadata) in memory using the helper stream
+    InMemoryPositionOutputStream buffer = new InMemoryPositionOutputStream(footerStart);

Review Comment:
   Thanks for the reviews @Fokko @wgtmac. I'm happy to abandon this PR; I agree
that it would increase memory pressure. Would you suggest an alternative way to
mitigate this? One possibility I see is to write the file to a temporary
directory and then do an atomic rename, at the expense of more intermediate
storage (possibly 2x in the worst case). Has anything along these lines been
discussed previously? Thanks!
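
To make the temporary-file-plus-rename idea concrete, here is a minimal sketch. It assumes a local filesystem and plain `java.nio.file`, not the Hadoop `FileSystem` / `PositionOutputStream` abstractions that `ParquetFileWriter` actually uses. The class `AtomicWriteSketch`, the `BodyWriter` callback, and `writeAtomically` are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public final class AtomicWriteSketch {

  /** Hypothetical callback that writes the complete file body (data plus footer). */
  @FunctionalInterface
  public interface BodyWriter {
    void writeTo(OutputStream out) throws IOException;
  }

  /**
   * Writes to a temporary file next to the target, then renames it into place.
   * Readers see either no file or a complete one, never a partial write.
   */
  public static void writeAtomically(Path target, BodyWriter body) throws IOException {
    // Keep the temp file in the target's directory so the rename stays on one filesystem.
    Path dir = target.toAbsolutePath().getParent();
    Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
    try {
      try (OutputStream out = Files.newOutputStream(tmp)) {
        body.writeTo(out);
      }
      // Throws AtomicMoveNotSupportedException if the filesystem cannot rename atomically.
      // Whether an existing target is replaced is implementation specific.
      Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    } finally {
      // No-op after a successful move; removes the partial temp file after a failure.
      Files.deleteIfExists(tmp);
    }
  }
}
```

The extra storage mentioned above shows up here when a file is being rewritten: the temp file and the existing target coexist until the rename. Object stores that implement rename as copy-then-delete would not get the same atomicity, so this sketch would not carry over to them unchanged.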



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

