[
https://issues.apache.org/jira/browse/AVRO-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tie Liu resolved AVRO-1411.
---------------------------
Resolution: Duplicate
Duplicate of https://issues.apache.org/jira/browse/AVRO-1348
> org.apache.avro.util.Utf8 performance improvement by remove private Charset
> in class
> ------------------------------------------------------------------------------------
>
> Key: AVRO-1411
> URL: https://issues.apache.org/jira/browse/AVRO-1411
> Project: Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.7.5
> Reporter: Tie Liu
> Priority: Minor
>
> The org.apache.avro.util.Utf8 class defines a private static field:
>     private static final Charset UTF8 = Charset.forName("UTF-8");
> which is used as follows:
>     public static final byte[] getBytesFor(String str) {
>       return str.getBytes(UTF8);
>     }
> I guess the intention of creating this object is to avoid repeated object
> creation, but if we dive into the String.getBytes code, when it is called with
> a Charset it actually creates a new StringEncoder in java.lang.StringCoding:
>     static byte[] encode(Charset cs, char[] ca, int off, int len) {
>       StringEncoder se = new StringEncoder(cs, cs.name());
>       char[] c = Arrays.copyOf(ca, ca.length);
>       return se.encode(c, off, len);
>     }
> If we instead call it with the string literal "UTF-8", it reuses the
> thread-local StringEncoder.
> We tried replacing this class with a version that passes the string literal
> and confirmed that those short-lived StringEncoder objects are no longer
> created. We would like Apache to fix this so that we no longer need to
> replace the class.
>
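For readers comparing the two call paths described in the report, here is a minimal sketch. The class name Utf8BytesSketch and its method names are hypothetical and not part of Avro. It only shows that both variants produce the same bytes; whether the Charset overload allocates an extra encoder per call depends on the JDK version, and the claim about StringEncoder is the reporter's observation for the JDK in use at the time, which this sketch does not measure.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical demo class for illustration; not part of Avro.
class Utf8BytesSketch {
  private static final Charset UTF8 = Charset.forName("UTF-8");

  // Variant the report attributes to Utf8.getBytesFor: pass a Charset object.
  static byte[] viaCharset(String str) {
    return str.getBytes(UTF8);
  }

  // Variant the reporter proposes: pass the charset name as a string.
  static byte[] viaName(String str) {
    try {
      return str.getBytes("UTF-8");
    } catch (UnsupportedEncodingException e) {
      // Every Java platform is required to support UTF-8.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String s = "h\u00e9llo";
    // Both variants encode to the same UTF-8 byte sequence.
    System.out.println(Arrays.equals(viaCharset(s), viaName(s))); // prints "true"
  }
}
```

The name-based overload declares a checked UnsupportedEncodingException, so the sketch wraps it in an unchecked exception. Any allocation difference between the two variants would need to be confirmed with a profiler on the target JDK.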
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)