Hi, the second run of the regression tests has now finished. The results look much better; only a few failures are left (56 failures in 12 stack traces):
1) o.a.p.ooxml.POIXMLException: error: The document is not a xml@urn:schemas-poi-apache-org:vmldrawing: document element namespace mismatch expected "urn:schemas-poi-apache-org:vmldrawing" got " http://schemas.openxmlformats.org/spreadsheetml/2006/main"
   => Seems to have been introduced by #64773 - Visual signatures for .xlsx/.docx, Subversion Revision 1882394
2) A few failures related to drawing slideshows, likely introduced by supporting much more functionality there; not sure if we need to fix those
3) java.lang.RuntimeException: CountryRecord or SSTRecord not found: This is just a change in an error message, which needs to be caught differently in the integration tests
4) Some documents try to allocate very large arrays, which I would ignore, as a user can easily increase the allowed maximum memory allocation
5) "java.lang.IllegalArgumentException: Invalid char (*) found at index (*) in sheet name *"
   => now happens because we fixed another issue, so not an actual regression

Full reports are at
http://people.apache.org/~centic/poi_regression/reports/index412RC3to500RC1.html
and
http://people.apache.org/~centic/poi_regression/reportsAll/index412RC3to500RC1.html

I think we only need to take a look at 1) and 2) before releasing.

Thanks... Dominik.

On Sun, Jan 3, 2021 at 1:08 PM Dominik Stadler <dominik.stad...@gmx.at> wrote:

> Hi,
>
> Thanks for the fixes and the "stress" documents, I added a few more and
> added a test for the normal unit-tests to trigger those documents,
> otherwise the ooxml-schema-lite does not contain them as far as I saw.
>
> Next regression-run is underway...
>
> Dominik.
>
> On Wed, Dec 30, 2020 at 8:25 PM Andreas Beeker <kiwiwi...@apache.org>
> wrote:
>
>> Hi,
>>
>> I've mentioned it in our private slack group *) - there's also an ant
>> error, which ignores quite a few *$Factory.class-es in packing the lite jar.
>> I'm currently trying to figure out how I can workaround this.
>>
>> > Another potential approach: ...
>> This was my first approach class -> xsb, but it was not reliable
>> therefore I've spent some time to find out (the few lines) of byte-buddy
>> code.
>> So those .xsb are the ones we use in our test. if we do b) those should
>> be picked up.
>>
>> Andi
>>
>> *) this is just a participation reminder for the rest - I'm happy to
>> invite you if you tell me your asf slack id ;)
>>
>> On 30.12.20 20:04, Dominik Stadler wrote:
>> > Hi,
>> >
>> > I'd go for b), hopefully not too many are necessary, it seems a simple test
>> > which reads in the document triggers the necessary parts in most of the
>> > cases.
>> >
>> > c) would mean anybody out there with such a file would now get
>> > regression-errors unless he switches to the full file.
>> >
>> > Another potential approach: I don't know much about how you do all this
>> > agent-stuff nowadays, but is there a way to match the classes to the xsb to
>> > find those missing ones as we seem to cover the classes themselves already
>> > as they are only included when used in tests.
>> >
>> > Dominik.
>> >
>> > On Wed, Dec 30, 2020 at 7:09 PM Andreas Beeker <kiwiwi...@apache.org> wrote:
>> >
>> >> Hi Dominik,
>> >>
>> >> thank you for running the regression test.
>> >>
>> >>> * Most of these are because the "lite" ooxml-schema jar is still missing
>> >>> some stuff, not sure if the new way of building the lite-jar is the cause
>> >>> or if we now use more parts in the regression tests
>> >> The lite jar used to contain all *.xsb files and now it will only contain
>> >> the ones used in the tests, which decreased its size by around 40%.
>> >>
>> >> Should we ... ?
>> >> a) rollback the change and include all *.xsbs - the class files might be
>> >> still missing
>> >> b) provide unit tests for the failing files - we might need a few
>> >> roundtrips to fix those cases, i.e.
>> >> best would be a reduced file list of
>> >> those failures
>> >> c) use the full schema for the regression tests
>> >>
>> >> Andi
>> >>
>> >>
>> >> On 30.12.20 17:37, Dominik Stadler wrote:
>> >>> Hi,
>> >>>
>> >>> In order to get the release-preparations rolling a bit, I have finished a
>> >>> first run of the "mass regression test" exercise.
>> >>>
>> >>> As usual it brings up cases where documents fail now, but did work fine
>> >>> previously, i.e. regressions that we may have introduced since the previous
>> >>> release.
>> >>>
>> >>> I now process 3,356,984 documents (460k of those are skipped because they
>> >>> are duplicates), currently there are around 3800 documents which show a
>> >>> regression:
>> >>> * Most of these are because the "lite" ooxml-schema jar is still missing
>> >>> some stuff, not sure if the new way of building the lite-jar is the cause
>> >>> or if we now use more parts in the regression tests
>> >>> * some exceptions/NPEs probably related to more support for
>> >>> drawing/rendering PPT(X) and so some may in fact be simply new "expected"
>> >>> exceptions for broken documents
>> >>> * Note: The ones with TIMEOUT or OLDFORMAT are not regressions
>> >>>
>> >>> 5.0.0 vs. 4.1.2:
>> >>> http://people.apache.org/~centic/poi_regression/reports/index412RC3to500RC1.html
>> >>> 5.0.0 overall errors:
>> >>> http://people.apache.org/~centic/poi_regression/reportsAll/index412RC3to500RC1.html
>> >>> I can fairly easily re-run this as soon as we have fixes for some of the
>> >>> things.
>> >>>
>> >>> Thanks... Dominik.
>> >>>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org
>> >> For additional commands, e-mail: dev-h...@poi.apache.org
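
A footnote on item 3) of the summary at the top of this thread: one way an integration test could treat a changed error message as an "expected" failure is to match on a message fragment anywhere in the cause chain. The following is only a rough sketch under that assumption. The class ExpectedFailures and its method isExpected are hypothetical names and not part of POI or its integration tests; only the message text is taken from the report.

```java
import java.util.List;

/** Hypothetical helper (not part of POI): classifies a failure as expected or not. */
public final class ExpectedFailures {
    // Message fragments treated as "expected". The single entry is the message
    // quoted in item 3); a real list would be maintained alongside the tests.
    private static final List<String> EXPECTED_FRAGMENTS = List.of(
        "CountryRecord or SSTRecord not found"
    );

    private ExpectedFailures() {}

    /** Returns true if the throwable or any of its causes carries an expected fragment. */
    public static boolean isExpected(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            String msg = cur.getMessage();
            if (msg == null) {
                continue; // exceptions without a message cannot match
            }
            for (String fragment : EXPECTED_FRAGMENTS) {
                if (msg.contains(fragment)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The expected message wrapped in another exception, as often happens in handlers.
        Throwable wrapped = new IllegalStateException("while opening workbook",
            new RuntimeException("CountryRecord or SSTRecord not found"));
        System.out.println(isExpected(wrapped));
        System.out.println(isExpected(new RuntimeException("some other failure")));
    }
}
```

Matching on a fragment rather than on the full message or the exception type keeps the check stable when the wrapping exception changes, at the cost of possibly hiding unrelated failures that happen to share the text.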