[
https://issues.apache.org/jira/browse/AVRO-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846520#comment-13846520
]
Doug Cutting commented on AVRO-1411:
------------------------------------
Please contribute a patch with this change. Also please provide benchmark
results. Ideally these would use the existing performance suite
(lang/java/ipc/src/test/java/org/apache/avro/io/Perf.java). Once we can
validate the performance improvement then we can probably get the change
committed.
> org.apache.avro.util.Utf8 performance improvement by remove private Charset
> in class
> ------------------------------------------------------------------------------------
>
> Key: AVRO-1411
> URL: https://issues.apache.org/jira/browse/AVRO-1411
> Project: Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.7.5
> Reporter: Tie Liu
> Priority: Minor
>
> Inside the org.apache.avro.util.Utf8 class, there is a private static field
> defined as:
>   private static final Charset UTF8 = Charset.forName("UTF-8");
> and it is used as:
>   public static final byte[] getBytesFor(String str) {
>     return str.getBytes(UTF8);
>   }
> I guess the intention of creating this object is to avoid repeated object
> creation, but when we dive into the String.getBytes code, we see that when
> it is called with a Charset, it actually creates a new StringEncoder in
> java.lang.StringCoding on every call:
>   static byte[] encode(Charset cs, char[] ca, int off, int len) {
>     StringEncoder se = new StringEncoder(cs, cs.name());
>     char[] c = Arrays.copyOf(ca, ca.length);
>     return se.encode(c, off, len);
>   }
> If instead we call it with the string literal "UTF-8", it reuses the
> thread-local StringEncoder.
> We tried overwriting this class with a version that passes the string
> literal and confirmed that those short-lived StringEncoder objects are no
> longer created. We would like Apache to fix this so that we no longer need
> to overwrite the class.
>
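For reference, the two encoding paths discussed in the description can be sketched side by side. This is only an illustration, not the Avro source: the class name Utf8EncodingPaths and the method names viaCharset and viaName are hypothetical, and the allocation behaviour described above is specific to the JDK version the reporter inspected, so it may differ on other JDKs. The sketch only shows that both calls yield the same bytes; it says nothing about performance.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical illustration class; not part of Avro.
final class Utf8EncodingPaths {
  private static final Charset UTF8 = Charset.forName("UTF-8");

  // Approach described in the issue: encode via a cached Charset object.
  public static byte[] viaCharset(String str) {
    return str.getBytes(UTF8);
  }

  // Approach proposed in the issue: encode via the charset name.
  public static byte[] viaName(String str) {
    try {
      return str.getBytes("UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is one of the charsets every Java platform must support,
      // so this branch is not expected to be reached.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String sample = "h\u00e9llo";
    // Both paths produce identical UTF-8 bytes; prints "true".
    System.out.println(Arrays.equals(viaCharset(sample), viaName(sample)));
  }
}
```

Note that the name-based overload declares a checked UnsupportedEncodingException, which any patch would need to handle in getBytesFor.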