Doubles cannot represent most decimal fractions exactly. Because of rounding errors, the same set of doubles added in different orders can produce different results (e.g., a+b+c != b+c+a).
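To make the ordering effect concrete, here is a minimal Java sketch. The class name `SummationOrder` and the helper `sum` are made up for illustration; this is plain Java, not Hive code, and it only assumes standard IEEE 754 double arithmetic.

```java
// Hypothetical illustration: the same three doubles summed in two orders.
class SummationOrder {
    // Adds the values strictly left to right, starting from 0.0.
    public static double sum(double... values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        double a = 0.1, b = 0.2, c = 0.3;
        double abc = sum(a, b, c); // (a + b) + c
        double bca = sum(b, c, a); // (b + c) + a
        System.out.println(abc);        // prints 0.6000000000000001
        System.out.println(bca);        // prints 0.6
        System.out.println(abc == bca); // prints false
    }
}
```

The two totals differ in the last bit only, but a long chain of such operations (or a job that combines partial sums in a different order) can let differences like this accumulate.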
Because of these rounding differences, if your computation happens in a different order locally than on the Hive server, you might end up with different results. I don't think Hive supports a native decimal type, unfortunately, so it's difficult to verify this.

--Tom

On Monday, December 17, 2012, Johnny Zhang wrote:

> Hi, Periya:
> Can you take a look at the patch for
> https://issues.apache.org/jira/browse/HIVE-3715 and see if you can apply
> a similar change to make sin/cos more accurate for your use case? Feel
> free to comment on the JIRA as well. Thanks.
>
> Johnny
>
> On Sat, Dec 8, 2012 at 11:23 AM, Periya.Data <periya.d...@gmail.com> wrote:
>
>> Hi Lauren and Zhang,
>> The book "Programming Hive" suggests using double (instead of float)
>> and also casting any literal value to double. I am already using double for
>> all my computations (both in the Hive table schema and in my UDF).
>> Furthermore, I am not comparing two floats/doubles. I am doing some
>> computations involving doubles...and those minor differences are adding up.
>>
>> It looks like what Mark Grover was describing: the mapping between Java
>> data types and Hive data types. I have yet to look at that portion of the
>> source code.
>>
>> Thanks and will keep you posted,
>> /PD
>>
>> On Fri, Dec 7, 2012 at 2:12 PM, Lauren Yang <lauren.y...@microsoft.com> wrote:
>>
>> This sounds like https://issues.apache.org/jira/browse/HIVE-2586 ,
>> where comparing float/doubles will not work because of the way floating
>> point numbers are represented.
>>
>> Perhaps there is a comparison between a float and double type because of
>> some internal representation in the Java library, or the UDF.
>>
>> Ed Capriolo’s book has a good section about workarounds and caveats for
>> working with floats/doubles in Hive.
>>
>> Thanks,
>> Lauren
>>
>> *From:* Periya.Data [mailto:periya.d...@gmail.com]
>> *Sent:* Friday, December 07, 2012 1:28 PM
>> *To:* user@hive.apache.org; cdh-u...@cloudera.org
>> *Subject:* Hive double-precision question
>>
>> Hi Hive Users,
>> I recently noticed an interesting behavior with Hive and I am unable
>> to find the reason for it. Your insights into this are much appreciated.
>>
>> I am trying to compute the distance between two zip codes. I have the
>> distances computed on various 'platforms': SAS, R, Linux+Java, Hive UDF
>> and Hive's built-in functions. There are some discrepancies from the
>> 3rd decimal place onward when I look at the output from the Hive UDF and
>> Hive's built-in functions. Here is an example:
>>
>> zip1 zip 2 Hadoop Built-in function SAS R Linux + Java
>> 00501 11720 4.49493083698542000 4.49508858 4.49508858054005
>>
>> --