[ https://issues.apache.org/jira/browse/HIVE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726475#comment-13726475 ]
Edward Capriolo commented on HIVE-2482:
---------------------------------------

[~thejas] Thank you for your comment. I am going to both agree and disagree with you, from my perspective on this issue.

* I use hive_test to test my UDFs: https://github.com/edwardcapriolo/hive_test
* At one point we added a plugin developer kit to Hive which allowed annotation-based testing of UDFs.

At some point this was removed. There were reports that it was flaky, and I was not paying much attention at that time, but I probably would have advocated that it not be removed.

Now, I do agree with you that we can get better coverage of some things outside end-to-end tests, but believe it or not, functions are not one of them. Why do I say this? A few reasons:

* Most functions are not functional.
* They actually have state: conf at initialization, and reusable objects shared between calls to evaluate.
* UDAFs have entire aggregation buffer systems.

To your specific points:

1) Welcome to my life, I have been complaining about our test infrastructure for years. Honestly, now that we have a build system we can test UDFs fairly fast, and there is not a huge volume of them anyway.
2) That can be true. Again, I use hive_test, and I am not against having unit tests + end-to-end tests.
3) I agree with this to an extent, but even in a real unit test one still has to write Assert.assertEquals( something, somethingElse ), so you still eyeball something. From a review standpoint it's easier to eyeball the .out than tens or hundreds of asserts.

Again, I am not against having more traditional unit tests and writing code in a functional style that is easier to document and reason about, but I think that to cover all the corner cases of exceptions and of cleaning out private state properly, the unit tests will be uglier than the q tests.

I am talking on hive-dev about the project split-up. This is one of the things I want to do: move all the end-to-end tests to a final project and really step up the unit-style testing.
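
To make the earlier point about state concrete, here is a minimal sketch of a function object that reuses one output buffer between calls to evaluate. The class names StatefulHexSketch and HexUdf are hypothetical, and the code deliberately does not use Hive's UDF API; it only assumes the reuse pattern described above.

```java
import java.nio.charset.StandardCharsets;

public class StatefulHexSketch {

    /** Hypothetical stand-in for a UDF; this is NOT Hive's actual UDF API. */
    static final class HexUdf {
        private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();

        // Reused across calls to evaluate(): this is the shared state.
        private final StringBuilder result = new StringBuilder();

        /** Hex-encodes the input into the shared buffer and returns that buffer. */
        StringBuilder evaluate(byte[] input) {
            if (input == null) {
                return null;
            }
            result.setLength(0); // clear what the previous call left behind
            for (byte b : input) {
                result.append(DIGITS[(b >> 4) & 0x0F]).append(DIGITS[b & 0x0F]);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        HexUdf udf = new HexUdf();
        StringBuilder first = udf.evaluate("Hi".getBytes(StandardCharsets.US_ASCII));
        String firstCopy = first.toString(); // copy before the next call overwrites it
        StringBuilder second = udf.evaluate("ok".getBytes(StandardCharsets.US_ASCII));

        System.out.println(firstCopy);       // 4869
        System.out.println(second);          // 6F6B
        System.out.println(first == second); // true: both refer to the same buffer
    }
}
```

A unit test for a function written this way has to cover the order of calls and the reset of the buffer, not only a single input/output pair.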

There are lots of things we can do to make the tests faster:

* move all the UDFs into 1 big test :) to save the overhead of launching multiple tests
* optimize 'select udf(column) from table limit 1' <-- we should be able to make that test scream

Anyway, unlike the past, where stuff like this sat on the queue forever, we now have a build bot and I am dedicated to seeing patches reviewed and committed fast (especially ones like these).

BTW, at minimum there is show_functions.q, so every time you add a function you at least have to touch that test.

> Convenience UDFs for binary data type
> -------------------------------------
>
>                 Key: HIVE-2482
>                 URL: https://issues.apache.org/jira/browse/HIVE-2482
>             Project: Hive
>          Issue Type: New Feature
>   Affects Versions: 0.9.0
>            Reporter: Ashutosh Chauhan
>            Assignee: Mark Wagner
>         Attachments: HIVE-2482.1.patch
>
>
> HIVE-2380 introduced the binary data type in Hive. It would be good to have the
> following UDFs to make it more useful:
> * UDFs to convert to/from hex string
> * UDFs to convert to/from string using a specific encoding
> * UDFs to convert to/from base64 string
> * UDFs to convert to/from non-string types using a particular serde