[jira] [Commented] (HIVE-4732) Reduce or eliminate the expensive Schema equals() check for AvroSerde

Mohammad Kamrul Islam (JIRA) Mon, 16 Sep 2013 18:06:42 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769049#comment-13769049
 ]


Mohammad Kamrul Islam commented on HIVE-4732:
---------------------------------------------

[~appodictic]: I see your point, and the link is indeed very informative.
As the link mentions, the probability of an ID collision is extremely low.
Quoted from Wikipedia:
"To put these numbers into perspective, the annual risk of someone being hit by 
a meteorite is estimated to be one chance in 17 billion,[38] which means the 
probability is about 0.00000000006 (6 × 10^-11), equivalent to the odds of 
creating a few tens of trillions of UUIDs in a year and having one duplicate. 
In other words, only after generating 1 billion UUIDs every second for the next 
100 years, the probability of creating just one duplicate would be about 50%. 
The probability of one duplicate would be about 50% if every person on earth 
owns 600 million UUIDs."

With a probability this low, is it really necessary to make things more 
complex? Moreover, only a few of these IDs are typically created in one Hive 
session.
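
To make the scale concrete, here is a rough sketch of the birthday-bound 
estimate behind the quoted figures. It assumes random (version 4) UUIDs with 
122 random bits, which is my reading of the Wikipedia passage and not something 
stated in this issue; the class and method names are made up for illustration.

```java
public class UuidCollisionEstimate {
    // Assumption: version-4 UUIDs, which carry 122 random bits.
    static final int RANDOM_BITS = 122;

    /**
     * Birthday-bound approximation of the chance that at least two of
     * n random IDs collide: p ~ 1 - exp(-n^2 / (2 * 2^bits)).
     */
    static double collisionProbability(double n) {
        double space = Math.pow(2.0, RANDOM_BITS);
        double exponent = (n * n) / (2.0 * space);
        // -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
        return -Math.expm1(-exponent);
    }

    public static void main(String[] args) {
        // A handful of IDs per session versus the ~50% point from the quote.
        double[] counts = {10, 1_000, 1_000_000, 2.71e18};
        for (double n : counts) {
            System.out.printf("n = %.3g -> p ~ %.3g%n", n, collisionProbability(n));
        }
    }
}
```

For the handful of IDs created in one session, the estimate is many orders of 
magnitude below the meteorite figure quoted above.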

> Reduce or eliminate the expensive Schema equals() check for AvroSerde
> ---------------------------------------------------------------------
>
>                 Key: HIVE-4732
>                 URL: https://issues.apache.org/jira/browse/HIVE-4732
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>            Reporter: Mark Wagner
>            Assignee: Mohammad Kamrul Islam
>         Attachments: HIVE-4732.1.patch, HIVE-4732.4.patch, 
> HIVE-4732.v1.patch, HIVE-4732.v4.patch
>
>
> The AvroSerde spends a significant amount of time checking schema equality. 
> Changing to compare hashcodes (which can be computed once then reused) will 
> improve performance.
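
The hashcode idea in the description above could look roughly like the sketch 
below. This is not the code from the attached patches: HashCached is a 
hypothetical wrapper, and unlike a pure hashcode comparison it falls back to 
equals() when the hashes match, so a hash collision cannot produce a false 
"equal". It also assumes the wrapped value is not mutated after wrapping, since 
the hash is computed only once.

```java
import java.util.Objects;

/** Hypothetical wrapper that caches hashCode() of an expensive-to-compare value. */
final class HashCached<T> {
    private final T value;
    private final int hash; // computed once at construction, then reused

    HashCached(T value) {
        this.value = Objects.requireNonNull(value);
        this.hash = value.hashCode();
    }

    /** Cheap checks first; the full equals() runs only when the hashes match. */
    boolean sameAs(HashCached<T> other) {
        if (this == other || value == other.value) {
            return true;                   // same instance, nothing to compare
        }
        if (hash != other.hash) {
            return false;                  // different hashes: cannot be equal
        }
        return value.equals(other.value);  // hashes match: confirm with equals()
    }
}
```

In the common case where the two schemas differ, or are the same instance, the 
expensive comparison is skipped entirely.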

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
