[ https://issues.apache.org/jira/browse/HADOOP-17141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran resolved HADOOP-17141.
-------------------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

> Add Capability To Get Text Length
> ---------------------------------
>
>                 Key: HADOOP-17141
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17141
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Minor
>             Fix For: 3.4.0
>
>
> The Hadoop {{Text}} class contains a byte array that holds a UTF-8
> encoded string. However, there is no way to quickly get the length of that
> string. One can get the number of bytes in the byte array, but to find
> the length of the String, the bytes need to be decoded first. In this simple
> example, which sorts {{Text}} objects by String length, the String has to
> be decoded from the byte array repeatedly. This was brought to my attention
> by [HIVE-23870].
> {code:java}
>   public static void main(String[] args) {
>     List<Text> list = Arrays.asList(new Text("1"), new Text("22"), new Text("333"));
>     list.sort((Text t1, Text t2) -> t1.toString().length() - t2.toString().length());
>   }
> {code}
> It would also be helpful when checking the last letter in a {{Text}} object
> repeatedly:
> {code:java}
> Text t = new Text("4444");
> System.out.println(t.charAt(t.toString().length() - 1));
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
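
The issue description notes that the String length of a UTF-8 byte array is only available after decoding. As a rough illustration of why decoding is not strictly required, the sketch below counts UTF-16 code units directly from the bytes. It is not the change that resolved this issue: the class name Utf8Length and the method utf16Length are hypothetical, plain byte[] stands in for {{Text}} so the example has no Hadoop dependency, and the input is assumed to be well-formed UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class Utf8Length {

    /**
     * Hypothetical helper: returns the number of UTF-16 code units (the value
     * String.length() would report) encoded in bytes[0..len), without
     * building a String. Assumes the bytes are well-formed UTF-8.
     */
    public static int utf16Length(byte[] bytes, int len) {
        int units = 0;
        for (int i = 0; i < len; i++) {
            int b = bytes[i] & 0xFF;
            // Continuation bytes look like 10xxxxxx and start no new character.
            if ((b & 0xC0) != 0x80) {
                units++;
                // A 4-byte sequence (lead byte 11110xxx) encodes a code point
                // above U+FFFF, which takes two UTF-16 code units.
                if ((b & 0xF8) == 0xF0) {
                    units++;
                }
            }
        }
        return units;
    }

    public static void main(String[] args) {
        // Same data as the sorting example in the issue, as raw UTF-8 bytes.
        List<byte[]> list = Arrays.asList(
            "333".getBytes(StandardCharsets.UTF_8),
            "1".getBytes(StandardCharsets.UTF_8),
            "22".getBytes(StandardCharsets.UTF_8));

        // Sort by character length; no String is created during comparison.
        list.sort(Comparator.comparingInt(b -> utf16Length(b, b.length)));

        for (byte[] b : list) {
            System.out.println(new String(b, StandardCharsets.UTF_8));
        }
    }
}
```

Each comparison still scans the bytes once, so a caller that compares the same value many times may prefer to cache the result.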