Hi Tim,

The cause is an optimization of the .xsb files included in the lite jar,
done in revisions 1884139 and 1884142 [1].
The .xsb files are selected based on the classes that are used, but a few
of them are loaded without a factory class.
Once I have finished my junit5-migration-purgatory, I'll have a look at whether
I can tweak the lite agent by weaving some logging into the call to
SchemaTypeLoaderImpl::typeSystemForComponent via ByteBuddy [2].

Until then, we can simply add oleobjectelement.xsb to the build.xml and wait for
the build to generate a new poi-ooxml-lite.jar.
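
To check whether a given poi-ooxml-lite.jar actually contains the schema, something like the following could be run with that jar on the classpath. This is only a rough sketch: the class name XsbResourceCheck and its locate method are made up for illustration, and the resource path is copied from the exception in Tim's stack trace below.

```java
import java.net.URL;

public class XsbResourceCheck {

    // Resource path copied verbatim from the SchemaTypeLoaderException message.
    static final String OLE_OBJECT_XSB =
        "org/apache/poi/schemas/ooxml/system/ooxml/oleobjectelement.xsb";

    /** Returns the URL the resource would be loaded from, or null if it is not on the classpath. */
    static URL locate(String resourcePath) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            // Fall back to the loader of this (hypothetical) helper class.
            cl = XsbResourceCheck.class.getClassLoader();
        }
        return cl.getResource(resourcePath);
    }

    public static void main(String[] args) {
        // An alternative resource path can be passed as the first argument.
        String path = args.length > 0 ? args[0] : OLE_OBJECT_XSB;
        URL url = locate(path);
        System.out.println(url != null ? "found: " + url : "MISSING: " + path);
    }
}
```

If the resource is found, the printed URL also shows which jar it came from, which helps when both the full and the lite schema jars are around.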

Andi.


[1] https://svn.apache.org/viewvc?view=revision&revision=1884139
[2] https://www.infoq.com/articles/Easily-Create-Java-Agents-with-ByteBuddy/

On 21.12.20 17:26, Tim Allison wrote:
Andi,
   Thank you for all of your work on this!  This is probably user error, but
I'm getting a failed test when I integrate poi trunk with Tika.  Is this
something I can fix at the Tika level?

org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@785a4557

at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:293)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
at org.apache.tika.extractor.ParserContainerExtractor.extract(ParserContainerExtractor.java:82)
at org.apache.tika.parser.microsoft.AbstractPOIContainerExtractionTest.process(AbstractPOIContainerExtractionTest.java:68)
at org.apache.tika.parser.microsoft.POIContainerExtractionTest.testEmbeddedOfficeFilesXML(POIContainerExtractionTest.java:335)
...
Caused by: org.apache.xmlbeans.SchemaTypeLoaderException: XML-BEANS compiled schema: Could not locate compiled schema resource org/apache/poi/schemas/ooxml/system/ooxml/oleobjectelement.xsb (org.apache.poi.schemas.ooxml.system.ooxml.oleobjectelement) - code 0
at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl$XsbReader.<init>(SchemaTypeSystemImpl.java:1315)
at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl.resolveHandle(SchemaTypeSystemImpl.java:3138)
at org.apache.xmlbeans.SchemaComponent$Ref.getComponent(SchemaComponent.java:113)
at org.apache.xmlbeans.SchemaGlobalElement$Ref.get(SchemaGlobalElement.java:76)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.findElement(SchemaTypeLoaderBase.java:103)
at org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createElementType(SchemaTypeImpl.java:988)
at org.apache.xmlbeans.impl.values.XmlObjectBase.create_element_user(XmlObjectBase.java:913)
at org.apache.xmlbeans.impl.store.Xobj.getUser(Xobj.java:1597)
at org.apache.xmlbeans.impl.store.Cur.getUser(Cur.java:2571)
at org.apache.xmlbeans.impl.store.Cur.getObject(Cur.java:2565)
at org.apache.xmlbeans.impl.store.Cursor._getObject(Cursor.java:819)
at org.apache.xmlbeans.impl.store.Cursor.syncWrapHelper(Cursor.java:2522)
at org.apache.xmlbeans.impl.store.Cursor.syncWrap(Cursor.java:2453)
at org.apache.xmlbeans.impl.store.Cursor.getObject(Cursor.java:2080)
at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractParagraph(XWPFWordExtractorDecorator.java:236)
at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractIBodyText(XWPFWordExtractorDecorator.java:161)
at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.buildXHTML(XWPFWordExtractorDecorator.java:124)
at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:136)
at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:213)
at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:113)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)

On Sat, Dec 19, 2020 at 8:47 AM Tim Allison <[email protected]> wrote:

If anyone else on this list has the time and interest, POI 5.0.0 is on the
way! Please help test!

---------- Forwarded message ---------
From: Tim Allison <[email protected]>
Date: Sat, Dec 19, 2020 at 8:45 AM
Subject: Re: Plea - test the POI 5.0.0 snapshot
To: POI Users List <[email protected]>


Will integrate with Tika on Monday and test it out. Thank you!!!

On Sat, Dec 19, 2020 at 7:52 AM Andreas Beeker <[email protected]>
wrote:

Dear POI users,

We are about to release POI 5.0.0, and there have been some
breaking changes [1], notably the JPMS/Jigsaw migration and the upgrade
of the ECMA-376 schemas to the 5th edition.

Please download the snapshot [2] and give it a try. Especially with the
new schemas, I'm interested in whether documents created by POI can still be
opened without errors in various office applications.

Thank you for your support.

Andi


[1] http://poi.apache.org/changes.html

[2]
https://ci-builds.apache.org/job/POI/job/POI-DSL-1.8/lastSuccessfulBuild/artifact/build/dist/


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



