[ 
https://issues.apache.org/jira/browse/TIKA-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17955906#comment-17955906
 ] 

Hudson commented on TIKA-4427:
------------------------------

FAILURE: Integrated in Jenkins build Tika » tika-branch_3x-jdk11 #2052 (See [https://ci-builds.apache.org/job/Tika/job/tika-branch_3x-jdk11/2052/])
TIKA-4427 -- allow pool size to be zero, and set a configurable max reuse value (#2239) (tallison: [https://github.com/apache/tika/commit/f7abd982ec148d78f6c0c2378b294231d43ff713])
* (add) tika-core/src/test/resources/org/apache/tika/config/TIKA-4427-max-num-reuses.xml
* (edit) tika-core/src/main/java/org/apache/tika/utils/XMLReaderUtils.java
* (edit) tika-core/src/test/java/org/apache/tika/config/TikaConfigTest.java
* (add) tika-core/src/test/resources/org/apache/tika/config/TIKA-4427-no-sax-pool.xml
* (edit) tika-core/src/main/java/org/apache/tika/config/TikaConfig.java
* (edit) tika-core/src/test/java/org/apache/tika/utils/XMLReaderUtilsTest.java


> Memory Leak when parsing a large (110K+)  number of documents 
> --------------------------------------------------------------
>
>                 Key: TIKA-4427
>                 URL: https://issues.apache.org/jira/browse/TIKA-4427
>             Project: Tika
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 3.2.0
>            Reporter: Tim Barrett
>            Priority: Major
>             Fix For: 4.0.0, 3.2.1
>
>         Attachments: Screenshot 2025-05-30 at 17.22.38.png, Screenshot 2025-05-30 at 18.31.01.png, Screenshot 2025-05-30 at 18.31.47.png
>
>
> When parsing a very large number of documents, including many eml files, we 
> see that the static field XMLReaderUtils.SAX_PARSERS is holding a massive 
> amount of memory: 3.28 GB. This is a static pool of cached SAXParser 
> instances, each of which is holding onto a substantial amount of memory, 
> apparently in the fDocumentHandler field.
> This is a big data test we run regularly; the memory issues did not occur in 
> Tika version 2.x.
>  
> I have attached JVM monitor screenshots.
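
For readers following the issue, the idea named in the commit message (a pool size that may be zero, plus a cap on parser reuse) can be illustrated with a minimal sketch. This is not the code in XMLReaderUtils: the class ReuseLimitedSaxPool, its nested Pooled holder, and the parameters poolSize and maxUses are hypothetical names chosen for illustration, and only standard JDK classes are used. A parser handed out maxUses times is discarded on release rather than cached, so whatever state it has accumulated becomes eligible for garbage collection.

```java
import java.util.concurrent.ArrayBlockingQueue;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

/** Hypothetical illustration only; this is not Tika's XMLReaderUtils. */
final class ReuseLimitedSaxPool {

    /** Pairs a parser with the number of times it has been handed out. */
    static final class Pooled {
        final SAXParser parser;
        int uses;

        Pooled(SAXParser parser) {
            this.parser = parser;
        }
    }

    private final SAXParserFactory factory = SAXParserFactory.newInstance();
    private final ArrayBlockingQueue<Pooled> idle; // null means pooling is disabled
    private final int maxUses;

    ReuseLimitedSaxPool(int poolSize, int maxUses) {
        // ArrayBlockingQueue rejects a capacity below 1, so size 0 is modelled as "no queue".
        this.idle = poolSize > 0 ? new ArrayBlockingQueue<>(poolSize) : null;
        this.maxUses = maxUses;
    }

    /** Returns an idle parser if one is cached, otherwise builds a new one. */
    Pooled acquire() {
        Pooled p = (idle == null) ? null : idle.poll();
        if (p == null) {
            p = new Pooled(newParser());
        }
        p.uses++;
        return p;
    }

    /** Caches the parser again unless pooling is off or it has reached its use limit. */
    void release(Pooled p) {
        if (idle == null || p.uses >= maxUses) {
            return; // drop the reference so the parser can be garbage collected
        }
        try {
            p.parser.reset(); // restore the parser's original configuration
        } catch (UnsupportedOperationException e) {
            return; // this implementation cannot be reset, so do not reuse it
        }
        idle.offer(p); // silently dropped if the pool is already full
    }

    /** Number of parsers currently cached. */
    int idleCount() {
        return idle == null ? 0 : idle.size();
    }

    // SAXParserFactory is not guaranteed to be thread-safe, hence the lock.
    private synchronized SAXParser newParser() {
        try {
            return factory.newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("could not create SAXParser", e);
        }
    }
}
```

With poolSize set to 0 every call to acquire() builds a fresh parser and release() keeps nothing, which trades parser construction cost for a smaller retained heap.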



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
