Hi everyone!

We were just tracking down a possible file handle leak in some Iceberg test
code and I came across what I think is a possible race condition / file
handle leak. Looking at the init method:

https://github.com/apache/avro/blob/408099a9f82c542caf397b96f4279a827774a1cb/lang/java/avro/src/main/java/org/apache/avro/file/DataFileWriter.java#L237-L249

We open the file handle before setting the "isOpen" flag to true. This
means there is a window of time where a concurrent close call, or a failure
in the init method, will result in an orphaned output stream.

This is because the close method will only attempt to close the output
streams if "isOpen" is true, i.e. if the init method has completed
successfully before "close" is called:

https://github.com/apache/avro/blob/408099a9f82c542caf397b96f4279a827774a1cb/lang/java/avro/src/main/java/org/apache/avro/file/DataFileWriter.java#L456-L464


So I believe there are two cases which could cause a leak here:

1) Race Condition: A concurrent call to "close" while "init" is still
running, so that "close" executes while isOpen is false but after the
output stream has been opened.
2) Exception Handling: An exception is thrown in init after the output
stream has been opened. This would break a caller who attempts to clean up
with
    try { doSomething(writer); } finally { writer.close(); }
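
To make case 2 concrete, here is a minimal sketch of the pattern as I
understand it. This is not the Avro code: SketchWriter, TrackingStream and
the failAfterOpen switch are hypothetical names I made up to mimic the
"take the stream first, set isOpen last" ordering and the "isOpen" guard in
close.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;

public class InitLeakSketch {

    /** Stand-in for a file stream; records whether close() was ever called. */
    static class TrackingStream extends ByteArrayOutputStream {
        boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Hypothetical writer that mimics the ordering described above. */
    static class SketchWriter implements Closeable {
        private OutputStream out;
        private boolean isOpen;

        void init(OutputStream stream, boolean failAfterOpen) throws IOException {
            this.out = stream; // the writer holds the stream from here on
            if (failAfterOpen) {
                // Assumption: something in init throws after the stream is held.
                throw new IOException("simulated failure inside init");
            }
            this.isOpen = true; // the flag is only set at the very end
        }

        @Override
        public void close() throws IOException {
            if (isOpen) { // guard skips cleanup when init never finished
                out.close();
                isOpen = false;
            }
        }
    }

    /** Returns true if the stream is still unclosed after a failed init plus close(). */
    static boolean leaksOnFailedInit() {
        TrackingStream stream = new TrackingStream();
        SketchWriter writer = new SketchWriter();
        try {
            writer.init(stream, true);
        } catch (IOException expected) {
            // init failed after the stream was handed over
        } finally {
            try {
                writer.close(); // the caller's cleanup attempt
            } catch (IOException ignored) {
                // not reached in this sketch
            }
        }
        return !stream.closed;
    }

    public static void main(String[] args) {
        System.out.println("leaked after failed init: " + leaksOnFailedInit());
    }
}
```

In this sketch the caller's close() is a no-op because isOpen was never
set, so the stream stays open. Whether anything in the real init can
actually throw at that point is the part I'm unsure about.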

I'm not sure whether either of these cases is very likely, since I'm not
familiar with the classes used in the "init" method, but I wanted to raise
the issue with the Avro devs in case you had more insight into it.

Thanks for taking the time to read this,
Russ
