Here are the results of a regression run on 1258881 documents. Overall it looks fairly good; I see no big problems with either the replaced zip classes or XmlBeans 3.0.
There are 1332 suspicious new errors with the following breakdown:

* 664 times java.lang.NoClassDefFoundError: org/openxmlformats/schemas/presentationml/x2006/main/CTHeaderFooter, which indicates a missing unit test for this CT-class and should be easy to fix (actually already in the works in my local repo)
* 192 strange NullPointerExceptions in my code which I cannot fully explain
* a number of "Element styles@http://schemas.openxmlformats.org/wordprocessingml/2006/main is not a valid styleSheet@http://schemas.openxmlformats.org/spreadsheetml/2006/main document or a valid substitution.", which I think we saw before as well, so probably just a flakiness where we had TIMEOUT or OOM before
* 26 java.lang.ArrayIndexOutOfBoundsException in XSLF handling somewhere, see report for details
* and finally a number of new ZipBombExceptions, which sounds like the limit for reporting was lowered by the zip-handling-changes?

Full list of errors is at
http://people.apache.org/~centic/poi_regression/reportsAll/index317to400SNAPSHOT.html

Comparison of 3.17 vs. 4.0.0-SNAPSHOT is at
http://people.apache.org/~centic/poi_regression/reports/index317to400SNAPSHOT.html

Thanks... Dominik.

On Tue, Jun 26, 2018 at 9:30 PM, Dominik Stadler <[email protected]> wrote:

> Hi,
>
> Unfortunately I have to restart the run, commons-compress was missing
> which caused 70k failures out of 1.2mio documents. Thus XmlBeans was not
> properly tested and I need to re-trigger.
>
> However I already noticed the following failures that look new (however
> these are all very rare, not related to XmlBeans at all, mostly in
> HSSF/HSLF!):
>
> 4
> ERROR
> java.lang.ClassCastException: o.a.p.hssf.record.StyleRecord cannot be cast to o.a.p.hssf.record.ExtendedFormatRecord
>
> java.lang.ClassCastException: o.a.p.hssf.record.StyleRecord cannot be cast to o.a.p.hssf.record.ExtendedFormatRecord
> at o.a.p.hssf.model.InternalWorkbook.getExFormatAt(InternalWorkbook.java:870)
> at o.a.p.hssf.usermodel.HSSFCell.getCellStyle(HSSFCell.java:943)
> at o.a.p.hssf.usermodel.HSSFCell.getCellStyle(HSSFCell.java:71)
> at o.a.p.stress.HSSFFileHandler.handleFile(HSSFFileHandler.java:63)
> at org.dstadler.commoncrawl.FileHandlingRunnable.run(FileHandlingRunnable.java:64)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>
> Download
> <http://people.apache.org/~centic/poi_regression/reports/download.oldindex/com.adapchain.www_files_MMP_Based_Sequence.xls>
>
> 1
> ERROR
> java.lang.NullPointerException
>
> java.lang.NullPointerException
> at o.a.p.hslf.model.textproperties.HSLFTabStopPropCollection.writeTabStops(HSLFTabStopPropCollection.java:83)
> at o.a.p.hslf.record.TextRulerAtom.writeIf(TextRulerAtom.java:150)
> at o.a.p.hslf.record.TextRulerAtom.writeOut(TextRulerAtom.java:126)
> at o.a.p.hslf.record.EscherTextboxWrapper.writeOut(EscherTextboxWrapper.java:90)
> at o.a.p.hslf.usermodel.HSLFTextParagraph.refreshRecords(HSLFTextParagraph.java:1158)
> at o.a.p.hslf.usermodel.HSLFTextParagraph.storeText(HSLFTextParagraph.java:969)
> at o.a.p.hslf.usermodel.HSLFTextShape.storeText(HSLFTextShape.java:851)
> at o.a.p.hslf.usermodel.HSLFSlideShow.writeDirtyParagraphs(HSLFSlideShow.java:485)
> at o.a.p.hslf.usermodel.HSLFSlideShow.writeDirtyParagraphs(HSLFSlideShow.java:477)
> at o.a.p.hslf.usermodel.HSLFSlideShow.write(HSLFSlideShow.java:451)
> at o.a.p.stress.SlideShowHandler.writeToArray(SlideShowHandler.java:63)
> at o.a.p.stress.SlideShowHandler.handleSlideShow(SlideShowHandler.java:49)
> at o.a.p.stress.HSLFFileHandler.handleFile(HSLFFileHandler.java:49)
> at org.dstadler.commoncrawl.FileHandlingRunnable.run(FileHandlingRunnable.java:64)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>
> Download
> <http://people.apache.org/~centic/poi_regression/reports/download.oldindex/ca.casid-acedi.www_sites_default_files_GDS_20save_20the_20date_20Nov_204.ppt>
>
> 1
> ERROR
> java.lang.ArrayIndexOutOfBoundsException: *
>
> java.lang.ArrayIndexOutOfBoundsException: *
> at o.a.p.hssf.usermodel.HSSFOptimiser.optimiseCellStyles(HSSFOptimiser.java:229)
> at o.a.p.stress.HSSFFileHandler.handleFile(HSSFFileHandler.java:59)
> at org.dstadler.commoncrawl.FileHandlingRunnable.run(FileHandlingRunnable.java:64)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>
> Download
> <http://people.apache.org/~centic/poi_regression/reports/download.oldindex/com.9thtee.www_tivoconv3release.xls>
>
> Dominik.
>
> On Mon, Jun 25, 2018 at 7:30 AM, Dominik Stadler <[email protected]>
> wrote:
>
>> A regression run on the > 1mio commoncrawl documents is underway...
>>
>> Dominik
>>
>> On Sun, Jun 24, 2018, 22:50 Andreas Beeker <[email protected]> wrote:
>>
>>> I'm +1 too ... we can refactor the code in the next release.
>>>
>>> A govdocs run would be nice, to see if we have further OOM problems.
>>>
>>> Andi
