[ https://issues.apache.org/jira/browse/HIVE-24688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated HIVE-24688:
--------------------------------
    Description:
It's not necessarily copyToStandardObject that should be optimized, but we need to consider some optimization on the attached code path.
In a customer case, 3 reducer tasks run forever (handling skewed keys) and most of the time is spent on this code path, utilizing GC heavily. At the moment I'm open to any kind of optimization:
1. Do we need to copy Text? Can't we get a reference back?

!Screen Shot 2021-01-27 at 9.52.32 AM.png|width=652,height=280!

{code}
  public Object copyObject(Object o) {
...
    if (o instanceof Text) {
      String str = ((Text)o).toString();
      HiveVarcharWritable hcw = new HiveVarcharWritable();
      hcw.set(str, ((VarcharTypeInfo)typeInfo).getLength());
      return hcw;
    }
{code}

  was:
It's not necessarily copyToStandardObject that should be optimized, but we need to consider some optimization on the attached code path.
In a customer case, 3 reducer tasks run forever (handling skewed keys) and most of the time is spent on this code path, utilizing GC heavily. At the moment I'm open to any kind of optimization:
1. Do we need to copy Text? Can't we get a reference back?

!Screen Shot 2021-01-27 at 9.52.32 AM.png|width=652,height=280!


> Optimise ObjectInspectorUtils.copyToStandardObject
> --------------------------------------------------
>
>                 Key: HIVE-24688
>                 URL: https://issues.apache.org/jira/browse/HIVE-24688
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>         Attachments: Screen Shot 2021-01-27 at 9.52.32 AM.png
>
>
> It's not necessarily copyToStandardObject that should be optimized, but we
> need to consider some optimization on the attached code path.
> In a customer case, 3 reducer tasks run forever (handling skewed keys) and
> most of the time is spent on this code path, utilizing GC heavily. At the
> moment I'm open to any kind of optimization:
> 1. Do we need to copy Text? Can't we get a reference back?
> !Screen Shot 2021-01-27 at 9.52.32 AM.png|width=652,height=280!
> {code}
>   public Object copyObject(Object o) {
> ...
>     if (o instanceof Text) {
>       String str = ((Text)o).toString();
>       HiveVarcharWritable hcw = new HiveVarcharWritable();
>       hcw.set(str, ((VarcharTypeInfo)typeInfo).getLength());
>       return hcw;
>     }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
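
The copyObject snippet in the description decodes the Text into a String and then builds a new HiveVarcharWritable from it, so each call allocates an intermediate String on top of the result object. One possible direction, sketched here only as an illustration and not as a proposed patch, is to enforce the varchar length directly on the UTF-8 bytes and copy just those. The sketch is plain Java and deliberately uses no Hive or Hadoop classes: `VarcharCopySketch`, `truncatedByteLength`, `copyViaString` and `copyBytesOnly` are hypothetical names, the `byte[]` stands in for the contents of a Text, and it assumes that the varchar length is counted in Unicode code points and that the input is valid UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-alone sketch; none of these names exist in Hive.
public class VarcharCopySketch {

    // Returns how many leading bytes of utf8[0..length) hold at most maxChars
    // code points. Assumes valid UTF-8, so the lead byte gives the sequence length.
    static int truncatedByteLength(byte[] utf8, int length, int maxChars) {
        int chars = 0;
        int i = 0;
        while (i < length && chars < maxChars) {
            int b = utf8[i] & 0xFF;
            if (b < 0x80) {
                i += 1;          // 0xxxxxxx: ASCII
            } else if (b < 0xE0) {
                i += 2;          // 110xxxxx: two-byte sequence
            } else if (b < 0xF0) {
                i += 3;          // 1110xxxx: three-byte sequence
            } else {
                i += 4;          // 11110xxx: four-byte sequence
            }
            chars++;
        }
        return Math.min(i, length);
    }

    // Baseline that mirrors the shape of the snippet: decode, truncate, re-encode.
    static byte[] copyViaString(byte[] utf8, int length, int maxChars) {
        String s = new String(utf8, 0, length, StandardCharsets.UTF_8);
        if (s.codePointCount(0, s.length()) > maxChars) {
            s = s.substring(0, s.offsetByCodePoints(0, maxChars));
        }
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Alternative: one array copy, no intermediate String.
    static byte[] copyBytesOnly(byte[] utf8, int length, int maxChars) {
        return Arrays.copyOf(utf8, truncatedByteLength(utf8, length, maxChars));
    }

    public static void main(String[] args) {
        byte[] in = "h\u00e9llo w\u00f6rld".getBytes(StandardCharsets.UTF_8);
        byte[] viaString = copyViaString(in, in.length, 5);
        byte[] bytesOnly = copyBytesOnly(in, in.length, 5);
        // Both paths should produce the same truncated bytes.
        System.out.println(Arrays.equals(viaString, bytesOnly));
    }
}
```

Whether something like this is applicable depends on what HiveVarcharWritable actually accepts and on whether callers may keep a reference to the source Text, which is the open question in item 1 of the description.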