I was reading the Hortonworks blog and found an interesting article: http://hortonworks.com/blog/stinger-phase-2-the-journey-to-100x-faster-hive/#comment-160753
It includes a very interesting graphic that attempts to show lines of code contributed to the 0.12 release: http://hortonworks.com/wp-content/uploads/2013/09/hive4.png

I do not know how the numbers were calculated. They are probably counting code generated by test output, but even setting that aside, they are wrong.

One claim is that Cloudera contributed 4,244 lines of code. To debunk that claim: in https://issues.apache.org/jira/browse/HIVE-4675 Brock Noland from Cloudera created the ptest2 testing framework. He did all the work for ptest2 in Hive 0.12, and it is clearly more than 4,244 lines.

It consists of 84 Java files:

[edward@desksandra ptest2]$ find . -name "*.java" | wc -l
84

and by itself is 8001 lines of code:

[edward@desksandra ptest2]$ find . -name "*.java" | xargs cat | wc -l
8001

The patch alone is 7902 lines:

[edward@desksandra hive-trunk]$ wc -l HIVE-4675.patch
7902 HIVE-4675.patch

This is not the only feature from Cloudera in Hive 0.12.

There is also a section of the article that talks of a "ROAD MAP" for Hive features. I did not know we (Hive) had a road map. I have advocated switching to feature-based releases and having a road map before, but it was suggested that this might keep people from scratching their own itches.
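
Going back to the line counts above: `wc -l` on a patch also counts diff headers, context lines, and removed lines, so it is only a rough figure. A minimal sketch of a slightly tighter count follows, assuming the patch is a unified diff. The function names `count_java_lines` and `count_added_lines` are made up for illustration, and this is not necessarily how the numbers in the graphic were produced.

```shell
#!/bin/sh

# Count lines across all .java files under a directory.
# Blank lines and comments are included, same as the find | xargs cat | wc -l above.
count_java_lines() {
    find "$1" -type f -name '*.java' -exec cat {} + | wc -l | tr -d ' '
}

# Count the lines a unified diff adds: lines starting with "+",
# minus the "+++ <file>" header lines.
# Caveat: an added line whose own text starts with "++ " would be miscounted.
count_added_lines() {
    grep '^[+]' "$1" | grep -vc '^[+][+][+] '
}
```

Usage would look like `count_java_lines ptest2` and `count_added_lines HIVE-4675.patch`. The `-exec cat {} +` form avoids the trouble `xargs` has with file names containing spaces.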