Axel Howind created PDFBOX-5997:
-----------------------------------

             Summary: avoid creation of temporary objects when parsing hex values
                 Key: PDFBOX-5997
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5997
             Project: PDFBox
          Issue Type: Improvement
            Reporter: Axel Howind
         Attachments: avoid_the_creation_of_temporary_string_instances_when_parsing_hex_values_version1.patch

There are currently two places where hex numbers are parsed in PDFBox: the Hex and COSString classes. The current implementation instantiates several temporary objects for each conversion:

1. trim() is called on the String, creating a copy if the String is not yet trimmed.
2. A StringBuilder is created containing the String and possibly a padding 0. This has to copy the whole character array every time.
3. For each pair of hex digits, substring() is called, creating a new String instance (or looking it up in the String pool).

I have created two different patches for this: one that also replaces the Integer.parseInt() call, and one that uses an overload of the method. Both should be much more performant and reduce GC activity. You might want to run a benchmark to decide which one to use.

Version 1 also does not rely on exception handling, which is inherently slow, to handle incorrect hex data. Version 2 still uses exception handling, but should nevertheless improve performance and reduce GC activity.
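
To illustrate the idea behind the first variant, here is a minimal sketch of index-based hex decoding that avoids trim(), StringBuilder and substring(). It is not the attached patch: the class name HexSketch, the method names, and the choice to return null on invalid input are assumptions made for this example. It also assumes that an odd number of digits is padded with a trailing 0, as the PDF specification prescribes for hex strings.

```java
final class HexSketch {
    private HexSketch() {
    }

    // Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
    static int hexValue(char c) {
        if (c >= '0' && c <= '9') {
            return c - '0';
        }
        if (c >= 'a' && c <= 'f') {
            return c - 'a' + 10;
        }
        if (c >= 'A' && c <= 'F') {
            return c - 'A' + 10;
        }
        return -1;
    }

    // Decodes hex digits into bytes; returns null on a non-hex character
    // (an assumption of this sketch, chosen to avoid exception handling).
    static byte[] decode(String hex) {
        int start = 0;
        int end = hex.length();
        // Skip surrounding whitespace by moving indices instead of calling trim().
        while (start < end && hex.charAt(start) <= ' ') {
            start++;
        }
        while (end > start && hex.charAt(end - 1) <= ' ') {
            end--;
        }

        int digits = end - start;
        byte[] out = new byte[(digits + 1) / 2];
        for (int i = 0; i < out.length; i++) {
            int pos = start + 2 * i;
            int hi = hexValue(hex.charAt(pos));
            // A missing final digit is treated as 0, so no padded copy is needed.
            int lo = pos + 1 < end ? hexValue(hex.charAt(pos + 1)) : 0;
            if (hi < 0 || lo < 0) {
                return null;
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```

The only allocation per call is the result array. Whether invalid input should yield null, a fallback value, or an exception depends on the calling code in Hex and COSString, so that part would need to follow the existing behaviour.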