[ https://issues.apache.org/jira/browse/HIVE-7144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gopal V updated HIVE-7144:
--------------------------
    Attachment: orc-string-write.png

> GC pressure during ORC StringDictionary writes
> -----------------------------------------------
>
>                 Key: HIVE-7144
>                 URL: https://issues.apache.org/jira/browse/HIVE-7144
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 0.14.0
>         Environment: ORC Table ~ 12 string columns
>            Reporter: Gopal V
>            Assignee: Gopal V
>              Labels: ORC, Performance
>         Attachments: orc-string-write.png
>
>
> When the ORC string dictionary writes data out, it suffers from bad GC
> performance due to a few allocations inside the write loop.
> !orc-string-write.png!
> The conversions are as follows:
> StringTreeWriter::getStringValue() causes 2 conversions
>   LazyString -> Text (LazyString::getWritableObject)
>   Text -> String (LazyStringObjectInspector::getPrimitiveJavaObject)
> Then StringRedBlackTree::add() does one conversion
>   String -> Text
> This causes some GC pressure from unnecessary String and byte[] array
> allocations.
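
To illustrate the allocation pattern described in the issue, here is a minimal sketch using only the JDK. It is not Hive code: the class and method names (ConversionChainSketch, toJavaString, toUtf8, viaString, sameBytes) are hypothetical stand-ins, and plain byte[] takes the place of Text. It shows that decoding bytes to a String and re-encoding them allocates new objects for every row, whereas comparing against the original bytes allocates nothing. Whether the actual change for HIVE-7144 takes this shape is an assumption.

```java
import java.nio.charset.StandardCharsets;

/** Plain-JDK stand-in for the conversion chain in the issue; not Hive code. */
final class ConversionChainSketch {

    /** Hypothetical stand-in for Text -> String: decodes UTF-8 bytes into a new String. */
    static String toJavaString(byte[] utf8, int offset, int length) {
        // Allocates a String plus its backing storage.
        return new String(utf8, offset, length, StandardCharsets.UTF_8);
    }

    /** Hypothetical stand-in for String -> Text: re-encodes into a new byte[]. */
    static byte[] toUtf8(String s) {
        // Allocates a fresh byte[] on every call.
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** The round trip performed once per row in the reported write path. */
    static byte[] viaString(byte[] utf8, int offset, int length) {
        return toUtf8(toJavaString(utf8, offset, length));
    }

    /** Allocation-free alternative: compare a stored key with the row's bytes in place. */
    static boolean sameBytes(byte[] key, byte[] utf8, int offset, int length) {
        if (key.length != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (key[i] != utf8[offset + i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] row = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] copy = viaString(row, 0, row.length);
        // The round trip yields a distinct array with the same contents.
        System.out.println("new array allocated: " + (copy != row));
        System.out.println("contents equal: " + sameBytes(copy, row, 0, row.length));
    }
}
```

With roughly 12 string columns per row, as in the reported environment, each row written would pay this round trip about 12 times, which is consistent with the GC pressure the report describes.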