[ https://issues.apache.org/jira/browse/HIVE-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029893#comment-13029893 ]
Geoff Howard commented on HIVE-1262:
------------------------------------

There is a bug in the evaluate method of GenericUDFSha. In the for loop that converts the hashed bytes back to their string representation, Integer.toHexString(0xFF & digested[i]) drops the leading zero for byte values less than 0x10. You can see this in the udf_sha.q.out file in the patch. The correct SHA-1 hash of "hive rules!" is:

    e0b2715219b30234f0aef56786f81046a366699f

but the output of this function is:

    e0b2715219b3234f0aef56786f81046a366699f

The seventh byte is 0x02, but it is output as the string "2".

The typical fix is to force the padding with code as follows:

    Integer.toString((0xFF & digested[i]) + 0x100, 16).substring(1)

but that creates an extra String object, and I prefer the following:

    int j = 0xFF & digested[i];
    if (j < 0x10) hexString.append('0');
    hexString.append(Integer.toHexString(j));

I can upload a new patch but don't currently have the source code checked out, so I'm hoping someone beats me to it... ;)

> Add security/checksum UDFs sha,crc32,md5,aes_encrypt, and aes_decrypt
> ---------------------------------------------------------------------
>
>                 Key: HIVE-1262
>                 URL: https://issues.apache.org/jira/browse/HIVE-1262
>             Project: Hive
>          Issue Type: New Feature
>          Components: UDF
>    Affects Versions: 0.6.0
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>         Attachments: hive-1262-1.patch.txt
>
>
> Add security/checksum UDFs sha,crc32,md5,aes_encrypt, and aes_decrypt

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
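
For anyone who wants to reproduce the padding problem described in the comment without checking out Hive, the following is a minimal standalone sketch. The class name HexPaddingSketch, the method names, and the three-byte sample are hypothetical and exist only for illustration; this is not the GenericUDFSha code from the patch. The sample reuses the bytes b3 02 34 from the digest quoted in the comment.

```java
class HexPaddingSketch {

    // Mirrors the reported bug: Integer.toHexString drops the leading
    // zero for byte values below 0x10.
    public static String toHexUnpadded(byte[] digested) {
        StringBuilder hexString = new StringBuilder();
        for (int i = 0; i < digested.length; i++) {
            hexString.append(Integer.toHexString(0xFF & digested[i]));
        }
        return hexString.toString();
    }

    // Follows the approach preferred in the comment: append an explicit
    // '0' before any value below 0x10.
    public static String toHexPadded(byte[] digested) {
        StringBuilder hexString = new StringBuilder(digested.length * 2);
        for (int i = 0; i < digested.length; i++) {
            int j = 0xFF & digested[i];
            if (j < 0x10) hexString.append('0');
            hexString.append(Integer.toHexString(j));
        }
        return hexString.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample: three bytes taken from the quoted digest.
        byte[] sample = {(byte) 0xb3, 0x02, 0x34};
        System.out.println(toHexUnpadded(sample)); // b3234
        System.out.println(toHexPadded(sample));   // b30234
    }
}
```

The padded variant always emits two characters per byte, so a 20-byte SHA-1 digest yields a 40-character string regardless of its contents.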