[ 
https://issues.apache.org/jira/browse/PDFBOX-4668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17945976#comment-17945976
 ] 

ASF subversion and git services commented on PDFBOX-4668:
---------------------------------------------------------

Commit 1925173 from le...@apache.org in branch 'pdfbox/branches/3.0'
[ https://svn.apache.org/r1925173 ]

PDFBOX-4668: use font name when comparing fonts

> Add ResourceCacheFactory as global setting
> ------------------------------------------
>
>                 Key: PDFBOX-4668
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4668
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: Rendering
>    Affects Versions: 3.0.4 PDFBox, 4.0.0
>            Reporter: Ben Manes
>            Assignee: Andreas Lehmkühler
>            Priority: Major
>             Fix For: 3.0.5 PDFBox, 4.0.0
>
>         Attachments: Screenshot 2023-03-20 at 18.57.40.png, memory.png, 
> threads.png
>
>
> Image rendering is cached by {{DefaultResourceCache}} per-document using soft 
> references. As described in the [FAQ|https://pdfbox.apache.org/2.0/faq.html], 
> this can lead to an {{OutOfMemoryError}} when processing, e.g. many documents 
> in parallel. The configuration of this cache is per-document and it is 
> initialized with the default.
> {code}
> // document-wide cached resources
> private ResourceCache resourceCache = new DefaultResourceCache();
> {code}
> Disabling it therefore requires modifying every call site, some of which may 
> be in 3rd party code. The ask is to add a static factory that configures the 
> default globally and returns a new {{DefaultResourceCache}} when called. This 
> would let a user specify a new static factory, e.g. one that returns a custom 
> cache or {{null}} if disabled.
> Soft references are a problematic caching scheme that degrades poorly. It is 
> very likely that the many large image fragments get promoted 
> (eden=>survivor=>old), so only a full GC can collect them. Under memory/cpu 
> pressure, the GC can devolve into a death spiral of collecting the minimal 
> heap space to match its pause time constraints, leading to repeated GCs due 
> to soft reference pollution and an eventual OOME. If caching is enabled, it 
> might be preferable for it to be size-based (by rough byte-size) and perhaps 
> tied into the {{MemoryUsageSetting}} main memory configuration.
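
The static factory requested in the description could look roughly like the sketch below. It is only an illustration under stated assumptions: `ResourceCacheFactory`, `setDefault` and `create` are hypothetical names, and `ResourceCache` / `DefaultResourceCache` are empty stand-ins so the snippet compiles on its own. It is not the API that was committed for this issue.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Stand-ins for the PDFBox types, declared here only so the sketch is self-contained.
interface ResourceCache {}
class DefaultResourceCache implements ResourceCache {}

// Hypothetical holder for the process-wide default; not an existing PDFBox class.
final class ResourceCacheFactory {
    // volatile so a replacement set by one thread is visible to all others
    private static volatile Supplier<ResourceCache> factory = DefaultResourceCache::new;

    private ResourceCacheFactory() {}

    // Replaces the global default, e.g. with () -> null to disable caching.
    static void setDefault(Supplier<ResourceCache> newFactory) {
        factory = Objects.requireNonNull(newFactory, "newFactory");
    }

    // Would be called once per document; returns null when caching is disabled.
    static ResourceCache create() {
        return factory.get();
    }
}
```

With something like this in place, the document field would be initialized from `ResourceCacheFactory.create()` instead of `new DefaultResourceCache()`, and an application could call `ResourceCacheFactory.setDefault(() -> null)` once at startup without touching 3rd party call sites.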

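The size-based alternative mentioned at the end of the description can be sketched as an LRU map with a rough byte budget. This is an assumption-laden illustration, not PDFBox code: `ByteBoundedCache` and its weigher callback are hypothetical, and any tie-in to `MemoryUsageSetting` is left out.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical cache bounded by an estimated byte size; not part of PDFBox.
final class ByteBoundedCache<K, V> {
    // accessOrder = true keeps the least recently used entry first
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);
    private final ToLongFunction<V> weigher; // rough byte size of a value
    private final long maxBytes;
    private long currentBytes;

    ByteBoundedCache(long maxBytes, ToLongFunction<V> weigher) {
        this.maxBytes = maxBytes;
        this.weigher = weigher;
    }

    synchronized V get(K key) {
        return entries.get(key);
    }

    synchronized void put(K key, V value) {
        // Drop any previous value for this key and release its weight.
        V previous = entries.remove(key);
        if (previous != null) {
            currentBytes -= weigher.applyAsLong(previous);
        }
        long weight = weigher.applyAsLong(value);
        if (weight > maxBytes) {
            return; // larger than the whole budget, so never cached
        }
        entries.put(key, value);
        currentBytes += weight;
        // Evict least recently used entries until the budget is met again.
        Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
        while (currentBytes > maxBytes && it.hasNext()) {
            currentBytes -= weigher.applyAsLong(it.next().getValue());
            it.remove();
        }
    }

    synchronized long currentBytes() {
        return currentBytes;
    }
}
```

Unlike soft references, eviction here happens on insert and is driven by the configured budget rather than by GC pressure, so memory use stays predictable at the cost of an approximate size estimate per entry.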


--
This message was sent by Atlassian Jira
(v8.20.10#820010)

