Hi Tim,

I was able to build 1.24.1 without any problems.

I then tried “mvn clean package” on 1.25-rc1, and this time it made it past 
the parsers, but failed on OSGi.

But what’s weird is there wasn’t any error reported for that part of the build:

[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Tika OSGi bundle 1.25
[INFO] ------------------------------------------------------------------------
Downloading: 
https://repo.maven.apache.org/maven2/org/osgi/org.osgi.service.cm/1.6.0/org.osgi.service.cm-1.6.0.pom
Downloaded: 
https://repo.maven.apache.org/maven2/org/osgi/org.osgi.service.cm/1.6.0/org.osgi.service.cm-1.6.0.pom
 (1.4 kB at 7.1 kB/s)
Downloading: 
https://repo.maven.apache.org/maven2/org/osgi/org.osgi.service.cm/1.6.0/org.osgi.service.cm-1.6.0.jar
Downloaded: 
https://repo.maven.apache.org/maven2/org/osgi/org.osgi.service.cm/1.6.0/org.osgi.service.cm-1.6.0.jar
 (55 kB at 951 kB/s)
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ tika-bundle ---
[INFO] Deleting /Users/kenkrugler/git/tika/tika-bundle/target
[INFO] 
[INFO] --- ossindex-maven-plugin:3.1.0:audit (audit-dependencies) @ tika-bundle 
---
[INFO] Checking for vulnerabilities; 153 artifacts
[INFO] Exclude coordinates: [junit:junit:4.13.1]
[INFO] Exclude vulnerability identifiers: []
[INFO] CVSS-score threshold: 0.0
[WARNING] Excluding coordinates: junit:junit:4.13.1
[INFO] 
[INFO] --- maven-enforcer-plugin:3.0.0-M3:enforce (enforce) @ tika-bundle ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ tika-bundle ---
[INFO] 
[INFO] --- maven-resources-plugin:2.7:resources (default-resources) @ 
tika-bundle ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.8.0:compile (default-compile) @ tika-bundle 
---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-resources-plugin:2.7:testResources (default-testResources) @ 
tika-bundle ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 4 resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.8.0:testCompile (default-testCompile) @ 
tika-bundle ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 1 source file to 
/Users/kenkrugler/git/tika/tika-bundle/target/test-classes
[INFO] 
/Users/kenkrugler/git/tika/tika-bundle/src/test/java/org/apache/tika/bundle/BundleIT.java:
 
/Users/kenkrugler/git/tika/tika-bundle/src/test/java/org/apache/tika/bundle/BundleIT.java
 uses or overrides a deprecated API.
[INFO] 
/Users/kenkrugler/git/tika/tika-bundle/src/test/java/org/apache/tika/bundle/BundleIT.java:
 Recompile with -Xlint:deprecation for details.
[INFO] 
[INFO] --- maven-surefire-plugin:3.0.0-M4:test (default-test) @ tika-bundle ---
[INFO] 
[INFO] --- maven-dependency-plugin:2.8:unpack-dependencies (default) @ 
tika-bundle ---
[INFO] 
[INFO] --- maven-bundle-plugin:5.1.1:bundle (default-bundle) @ tika-bundle ---
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Tika parent ................................. SUCCESS [  1.703 s]
[INFO] Apache Tika core ................................... SUCCESS [ 35.949 s]
[INFO] Apache Tika parsers ................................ SUCCESS [04:33 min]
[INFO] Apache Tika OSGi bundle ............................ FAILURE [12:51 min]

It hangs for about 10 minutes after printing out the “[INFO] --- 
maven-bundle-plugin:5.1.1:bundle (default-bundle) @ tika-bundle ---” line.

So, any ideas about this? I’d seen some recent discussion about Maven versioning 
being an issue. I’ve got:

Apache Maven 3.5.0 (ff8f5e7444045639af65f6095c62210b5713f426; 
2017-04-03T12:39:06-07:00)
Maven home: /Users/kenkrugler/Tools/apache-maven
Java version: 1.8.0_131, vendor: Oracle Corporation
Java home: /Library/Java/JavaVirtualMachines/jdk1.8.0_131.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.12.6", arch: "x86_64", family: "mac"

Thanks,

— Ken


> On Nov 23, 2020, at 10:46 AM, Tim Allison <talli...@apache.org> wrote:
> 
> Ken,
>  Thank you for finding this and sharing it.  I haven't seen this on my mac
> or ubuntu...not denying what you are seeing!
> 
>  Are you able to build 1.24.1 with no problem?  I wonder if your system is
> using a different SAXParser which is not handled correctly in
> XMLReaderUtils?  What OS, what version of java?
> 
>  Thank you, again.
> 
>      Best,
> 
>              Tim
> 
> On Mon, Nov 23, 2020 at 1:40 PM Ken Krugler <kkrugler_li...@transpac.com>
> wrote:
> 
>> Hi all,
>> 
>> I got past the JCE issue, but now some tests are failing with timeouts.
>> 
>> For this test:
>> 
>> [INFO] Running org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest
>> 
>> I get 100s of these warnings:
>> 
>> Nov 21, 2020 10:28:38 PM org.apache.tika.utils.XMLReaderUtils
>> acquireSAXParser
>> WARNING: Contention waiting for a SAXParser. Consider increasing the
>> XMLReaderUtils.POOL_SIZE
>> 
>> And then:
>> 
>> [ERROR] Tests run: 87, Failures: 0, Errors: 1, Skipped: 3, Time elapsed:
>> 318.512 s <<< FAILURE! - in
>> org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest
>> [ERROR]
>> org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest.testUnsupportedPowerPoint
>> Time elapsed: 308.223 s  <<< ERROR!
>> org.apache.tika.exception.TikaException: TIKA-237: Illegal SAXException
>> from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@e30d60
>>        at
>> org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest.testUnsupportedPowerPoint(OOXMLParserTest.java:341)
>> Caused by: org.xml.sax.SAXException: Waited more than 5 minutes for a
>> SAXParser; This could indicate that a parser has not correctly released its
>> SAXParser. Please report this to the Tika team: dev@tika.apache.org
>>        at
>> org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest.testUnsupportedPowerPoint(OOXMLParserTest.java:341)
>> Caused by: org.apache.tika.exception.TikaException: Waited more than 5
>> minutes for a SAXParser; This could indicate that a parser has not
>> correctly released its SAXParser. Please report this to the Tika team:
>> dev@tika.apache.org
>>        at
>> org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest.testUnsupportedPowerPoint(OOXMLParserTest.java:341)
>> 
>> Similarly, for:
>> 
>> [INFO] Running org.apache.tika.parser.microsoft.ooxml.SXSLFExtractorTest
>> 
>> Many of these:
>> 
>> Nov 21, 2020 10:33:55 PM org.apache.tika.utils.XMLReaderUtils
>> acquireSAXParser
>> WARNING: Contention waiting for a SAXParser. Consider increasing the
>> XMLReaderUtils.POOL_SIZE
>> 
>> And then similarly:
>> 
>> [ERROR] Tests run: 24, Failures: 0, Errors: 1, Skipped: 3, Time elapsed:
>> 309.375 s <<< FAILURE! - in
>> org.apache.tika.parser.microsoft.ooxml.SXSLFExtractorTest
>> [ERROR]
>> org.apache.tika.parser.microsoft.ooxml.SXSLFExtractorTest.testUnsupportedPowerPoint
>> Time elapsed: 307.9 s  <<< ERROR!
>> org.apache.tika.exception.TikaException: TIKA-237: Illegal SAXException
>> from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@e30d60
>>        at
>> org.apache.tika.parser.microsoft.ooxml.SXSLFExtractorTest.testUnsupportedPowerPoint(SXSLFExtractorTest.java:281)
>> Caused by: org.xml.sax.SAXException: Waited more than 5 minutes for a
>> SAXParser; This could indicate that a parser has not correctly released its
>> SAXParser. Please report this to the Tika team: dev@tika.apache.org
>>        at
>> org.apache.tika.parser.microsoft.ooxml.SXSLFExtractorTest.testUnsupportedPowerPoint(SXSLFExtractorTest.java:281)
>> Caused by: org.apache.tika.exception.TikaException: Waited more than 5
>> minutes for a SAXParser; This could indicate that a parser has not
>> correctly released its SAXParser. Please report this to the Tika team:
>> dev@tika.apache.org
>>        at
>> org.apache.tika.parser.microsoft.ooxml.SXSLFExtractorTest.testUnsupportedPowerPoint(SXSLFExtractorTest.java:281)
>> 
>> And now:
>> 
>> [INFO] Running org.apache.tika.parser.microsoft.ooxml.SXWPFExtractorTest
>> [INFO] Tests run: 36, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
>> 0.832 s - in org.apache.tika.parser.microsoft.ooxml.SXWPFExtractorTest
>> [INFO] Running org.apache.tika.parser.microsoft.ooxml.TruncatedOOXMLTest
>> [WARNING] Tests run: 5, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
>> 0.053 s - in org.apache.tika.parser.microsoft.ooxml.TruncatedOOXMLTest
>> [INFO] Running org.apache.tika.parser.microsoft.ooxml.xps.XPSParserTest
>> Nov 21, 2020 10:39:05 PM org.apache.tika.utils.XMLReaderUtils
>> acquireSAXParser
>> WARNING: Contention waiting for a SAXParser. Consider increasing the
>> XMLReaderUtils.POOL_SIZE
>> Nov 21, 2020 10:39:06 PM org.apache.tika.utils.XMLReaderUtils
>> acquireSAXParser
>> WARNING: Contention waiting for a SAXParser. Consider increasing the
>> XMLReaderUtils.POOL_SIZE
>> Nov 21, 2020 10:39:07 PM org.apache.tika.utils.XMLReaderUtils
>> acquireSAXParser
>> WARNING: Contention waiting for a SAXParser. Consider increasing the
>> XMLReaderUtils.POOL_SIZE
>> … and so on…
>> 
>> Any suggestions?
>> 
>> Thanks!
>> 
>> — Ken
>> 
>> --------------------------
>> Ken Krugler
>> http://www.scaleunlimited.com
>> custom big data solutions & training
>> Hadoop, Cascading, Cassandra & Solr
>> 
>> 

--------------------------
Ken Krugler
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr
