Axel Howind created PDFBOX-5997:
-----------------------------------

             Summary: avoid creation of temporary objects when parsing hex 
values
                 Key: PDFBOX-5997
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5997
             Project: PDFBox
          Issue Type: Improvement
            Reporter: Axel Howind
         Attachments: 
avoid_the_creation_of_temporary_string_instances_when_parsing_hex_values_version1.patch

PDFBox currently parses hex numbers in two places: the Hex and COSString 
classes. The current implementation instantiates several temporary objects 
for each conversion:

1. trim() is called on the String, creating a copy if the String is not 
already trimmed.
2. A StringBuilder is created containing the String and possibly a padding 0. 
This has to copy the whole character array every time.
3. For each pair of hex digits, substring() is called, creating a new String 
instance (or looking it up in the String pool).
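
For illustration, the three steps above correspond roughly to code of the 
following shape. This is a hypothetical reconstruction from the description 
only, not the actual PDFBox source; the class name AllocatingHexSketch and 
the method name decode are invented for this sketch.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical reconstruction of the allocation pattern described above.
// Not the actual PDFBox code; names are made up for illustration.
final class AllocatingHexSketch {
    static byte[] decode(String hex) {
        String trimmed = hex.trim();                    // (1) copies if untrimmed
        StringBuilder sb = new StringBuilder(trimmed);  // (2) copies all characters
        if (sb.length() % 2 != 0) {
            sb.append('0');                             // pad odd-length input
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < sb.length(); i += 2) {
            String pair = sb.substring(i, i + 2);       // (3) new String per byte
            // parseInt throws NumberFormatException on invalid hex digits
            out.write(Integer.parseInt(pair, 16));
        }
        return out.toByteArray();
    }
}
```

In this shape, every decoded byte costs at least one short-lived String, on 
top of the two whole-input copies made before the loop starts.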

I have created two different patches for this: one that also replaces the 
Integer.parseInt() call, and one that uses an overload of the method. Both 
should be considerably faster and reduce GC activity. You might want to run a 
benchmark to decide which one to use.

Version 1 also does not rely on exception handling, which is inherently slow, 
to handle incorrect hex data. Version 2 still uses exception handling, but 
should nevertheless improve performance and reduce GC activity.
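
As a rough idea of what an allocation-free, exception-free approach can look 
like, here is a minimal sketch. It is not the attached patch, whose contents 
are not shown in this message. The class name HexSketch and the convention of 
returning null for invalid input are assumptions made for this example.

```java
// A minimal sketch, not the attached patch: reads hex digits directly from the
// input without trim(), StringBuilder, or substring(), and signals invalid
// input through the return value instead of an exception.
final class HexSketch {
    /** Returns the decoded bytes, or null if a non-hex character is found. */
    static byte[] decode(CharSequence hex) {
        int start = 0;
        int end = hex.length();
        // Skip surrounding whitespace by moving indices instead of copying.
        while (start < end && hex.charAt(start) <= ' ') {
            start++;
        }
        while (end > start && hex.charAt(end - 1) <= ' ') {
            end--;
        }

        int len = end - start;
        // For odd lengths, the missing last digit is treated as a padding 0.
        byte[] out = new byte[(len + 1) / 2];
        for (int i = 0; i < out.length; i++) {
            int pos = start + 2 * i;
            int hi = Character.digit(hex.charAt(pos), 16);
            int lo = pos + 1 < end ? Character.digit(hex.charAt(pos + 1), 16) : 0;
            if (hi < 0 || lo < 0) {
                return null; // invalid digit; no exception is thrown
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```

The only allocation here is the result array itself. Whether this is faster 
in practice than an Integer.parseInt() overload is exactly the kind of 
question the suggested benchmark would have to answer.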



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
