Hi, According to my testing, ALL versions since 3-something did close InputStream when you use XSSFWorkbook or OPCPackage, so this has been the established behaviour for quite some time in Apache POI.
The change in ZipPackage in 66584 seems to have caused a close() even if no exception is thrown, which is causing the problem reported in #67579. So I think it best for now to revert both changes so we are back at what 5.2.3 did, not sure what 66584 actually fixed anyway. Dominik. On Thu, Oct 26, 2023 at 5:18 PM PJ Fanning <fannin...@yahoo.com.invalid> wrote: > HSSFWorkbook(InputStream) closes its InputStream. So this is really a > mess. So I guess the best is for XSSFWorkbook to also close > its InputStream. We can document this and warn users that this is the > behaviour and that they can work around it by using something like > NoCloseInputStream. > > > https://github.com/apache/poi/blob/f64524916d449d37350e0763e4f86edc239bbd34/poi-ooxml/src/main/java/org/apache/poi/openxml4j/opc/internal/NoCloseInputStream.java > > > > > > > > > > On Thursday 26 October 2023 at 15:44:03 IST, PJ Fanning < > fannin...@yahoo.com> wrote: > > > > > > Can I clarify? > > You want https://bz.apache.org/bugzilla/show_bug.cgi?id=67579 reverted? > > If I interpret the poi-reproduce results correctly, there is a regression > from 5.2.3 to 5.2.4 in the stream case. > > The behaviour might have changed in 5.2.2 to 5.2.3 migration but it still > means that there is a regression in 5.2.3 to 5.2.4. > > Surely, we should not be closing input streams - they should be closed by > the provider. We should only close input streams that our code creates > internally. > > What if we call the prospective release 5.3.0 and keep the 67579 fix? > > > > On Thursday 26 October 2023 at 15:16:48 IST, Dominik Stadler < > dominik.stad...@gmx.at> wrote: > > > > > > Hi, > > ah, sorry, seems I did not take a close enough look myself until now. > > I wanted to be sure what the state was before and how it should be. See the > "reproducer" project at https://github.com/centic9/poi-reproduce and the > resulting summary: > > > 3.17 4.0.0 4.0.1 4.1.0 4.1.1 4.1.2 5.0.0 5.1.0 5.2.0 5.2.1 5.2.2 5.2.3 > 5.2.4 > trunk-2023-10-26 Goal > Workbook-Stream closed closed leaked closed closed closed closed closed > closed closed closed closed closed leaked closed > Workbook-File fails, closed closed closed closed closed closed closed > closed > closed closed closed closed closed closed closed > Workbook-String fails, closed closed closed closed closed closed closed > closed closed closed closed closed closed closed closed > OPCPacakge-Stream closed closed leaked closed closed closed closed closed > closed closed closed closed closed leaked closed > OPCPackage-File closed closed closed closed closed closed closed closed > closed closed closed closed closed closed closed > OPCPackage-String closed closed closed closed closed closed closed closed > closed closed closed closed closed closed closed > Broken-file - Stream leaked closed closed leaked leaked leaked leaked > leaked > leaked leaked leaked leaked closed leaked closed > Broken-file - File leaked closed closed leaked leaked leaked leaked leaked > leaked leaked leaked leaked leaked leaked closed > Broken-file - String leaked closed closed leaked leaked leaked leaked > leaked > leaked leaked leaked leaked leaked leaked closed > > > This indicates the following to me: > > * The bug-report about a "regression with closing streams in 5.2.4" seems > to be incorrect. We closed provided InputStreams since a long time! > * => So I do not think we should change close-behavior, we rather should > revert the recent changes in trunk to restore the state of 5.2.4 for now > * We can look at the special case of a file that is not a proper Zip-File, > but this is a minor thing not relevant to the release > > > Thanks... Dominik. > > > On Wed, Oct 25, 2023 at 6:49 PM PJ Fanning <fannin...@yahoo.com.invalid> > wrote: > > > I cannot reproduce the leaked file handle. This test case does not leak > > files. I tested with `lsof -f`. I had a file handle until I closed the > > xssfWorkbook but not after. > > > > File file = ... > > XSSFWorkbook xssfWorkbook = new XSSFWorkbook(file); > > // some Thread.sleep(...) here - use `lsof- f` during this sleep > > xssfWorkbook.close(); > > // some Thread.sleep(...) here - use `lsof- f` during this 2nd sleep > > > > Tested on a Mac with latest POI code in svn trunk. > > > > > > > > > > > > On Wednesday 25 October 2023 at 12:40:12 IST, PJ Fanning > > <fannin...@yahoo.com.invalid> wrote: > > > > > > > > > > > > I will try to set up some debugging of the File path. I am not convinced > > that the ZipPackage InputStream changes affect the behaviour when you > > provide a File instead. The code handles Files and InputStreams > differently. > > > > > > > > > > > > On Wednesday 25 October 2023 at 07:24:52 IST, Dominik Stadler < > > dominik.stad...@gmx.at> wrote: > > > > > > > > > > > > Hi, > > > > We now leave open file-handles whenever the constructor for "File" is > used > > in ZipPackage. Not sure if the background Java support for closing stale > > file-handles will capture those or if we will see "too many open files" > in > > large environments. > > > > Can we only wrap the stream in NoCloseStream when it is actually opened > via > > an incoming InputStream and not when it is opened from a File? > > > > Thanks... Dominik. > > > > On Wed, Oct 25, 2023 at 12:38 AM PJ Fanning <fannin...@apache.org> > wrote: > > > > > Are we in a position to proceed with a 5.2.5 RC1? > > > > > > > > > On 2023/10/16 10:23:42 Tim Allison wrote: > > > > This just bit us on Tika: > > > > https://bz.apache.org/bugzilla/show_bug.cgi?id=67767 > > > > > > > > The fix is easy. I can patch it today. It would be great to get it > > into > > > > 5.2.5. I'm sorry that I didn't catch it during the earlier regression > > > > tests...my fault. > > > > > > > > On Sun, Oct 15, 2023 at 4:34 PM Dominik Stadler < > > dominik.stad...@gmx.at> > > > > wrote: > > > > > > > > > Hi, > > > > > > > > > > I use file-leak-detector when running tests locally, this shows any > > > > > left-over file-handles when running tests via the IDE. > > > > > > > > > > See https://github.com/jenkinsci/lib-file-leak-detector > > > > > > > > > > If you clone the latest version and build the tool locally via > Maven, > > > you > > > > > can add something like the following in the run-config in the IDE: > > > > > > > > > > > > > > > > > > > > > -javaagent:/opt/file-leak-detector/target/file-leak-detector-1.17-SNAPSHOT-jar-with-dependencies.jar=http=0,strong,excludes=/opt/poi/file-leak-detector.exclude,dumpatshutdown > > > > > > > > > > Then you will get a dump of all leftover file-handles at the end. > > > > > > > > > > In this case, it only happens if we enter the special catch-block > in > > > > > ZipPackage#ZipPackage(File, PackageAccess). > > > > > > > > > > One of the broken files from fuzzing seems to trigger this, > > > constructing > > > > > the package succeeds then, however with a resulting unclosed > > > file-handle. > > > > > > > > > > So it seems to be less wide-spread than I originally thought and > may > > > > > actually have been the case before as well, not sure any more > now... > > > > > > > > > > Dominik. > > > > > > > > > > On Sun, Oct 15, 2023 at 9:34 PM PJ Fanning > > <fannin...@yahoo.com.invalid > > > > > > > > > wrote: > > > > > > > > > > > Thanks Dominik for the feedback. The probable breaking change > was: > > > > > > > > > > > > > > > > > > > > > > > https://svn.apache.org/viewvc/poi/trunk/poi-ooxml/src/main/java/org/apache/poi/openxml4j/opc/ZipPackage.java?r1=1909467&r2=1909466&pathrev=1909467 > > > > > > > > > > > > > > > > > > There have been some more changes, eg using java.nio.files.File > to > > > create > > > > > > InputStreams instead of using FileInputStream. > > > > > > > > > > > > For this reason, I would prefer to use a test based approach to > > > ensuring > > > > > > the 'close' is called. Ideally, I would like to leave behind a > > > regression > > > > > > test. > > > > > > > > > > > > The ZipPackage change (r1909467) only affects the use of > > > InputStreams and > > > > > > does not affect the use of File inputs. The ZipPackage code is > very > > > > > > different on whether you start by providing an InputStream as > > > opposed to > > > > > a > > > > > > java.io.File. In the latter case, we use a ZipFile class instance > > to > > > do > > > > > > random access to the wrapped files. > > > > > > > > > > > > Could you provide me with some example of what needs debugging? > For > > > > > > instance, would something like this not work ok? > > > > > > > > > > > > File file = .... > > > > > > try (XSSFWorkbook wb = new XSSFWorkbook(file)) { > > > > > > ... use wb > > > > > > } > > > > > > > > > > > > Is the sort of usage you are suggesting where we would fail to > > close > > > an > > > > > > InputStream? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sunday 15 October 2023 at 19:37:33 IST, Dominik Stadler < > > > > > > dominik.stad...@gmx.at> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > FYI, the current fix for not closing streams also does not close > > any > > > > > stream > > > > > > opened via a "File" parameter, so we are introducing many > unclosed > > > > > > resources via this change at the moment. > > > > > > > > > > > > Can we solve this somehow differently? Maybe cleanly revert the > > > change > > > > > > which introduced it? > > > > > > > > > > > > Thanks... Dominik. > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Oct 9, 2023 at 1:47 PM PJ Fanning <fannin...@apache.org> > > > wrote: > > > > > > > > > > > > > HI everyone, > > > > > > > > > > > > > > I think the regression in issue 67579 [1] is serious to > warrant a > > > new > > > > > POI > > > > > > > release. With that in mind, could we ramp down on changes while > > we > > > > > decide > > > > > > > if we want to release soon. > > > > > > > > > > > > > > I can do this one again unless someone else wants to do it. > > > > > > > > > > > > > > Regards, > > > > > > > PJ > > > > > > > > > > > > > > [1] https://bz.apache.org/bugzilla/show_bug.cgi?id=67579 > > > > > > > > > > > > > > > > > --------------------------------------------------------------------- > > > > > > > To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org > > > > > > > For additional commands, e-mail: dev-h...@poi.apache.org > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > --------------------------------------------------------------------- > > > > > > To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org > > > > > > For additional commands, e-mail: dev-h...@poi.apache.org > > > > > > > > > > > > > > > > > > > > > > > > > > > --------------------------------------------------------------------- > > > To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org > > > For additional commands, e-mail: dev-h...@poi.apache.org > > > > > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org > > For additional commands, e-mail: dev-h...@poi.apache.org > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org > > For additional commands, e-mail: dev-h...@poi.apache.org > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org > For additional commands, e-mail: dev-h...@poi.apache.org > >