[Python-Dev] Adventures with Decimal
Several people have pointed me at this interesting thread, and
both Tim and Raymond have sent me summaries of their arguments.
Thank you all! I see various things I have written have caused
some confusion, for which I apologise.
The 'right' answer might, in fact, depend somewhat on the
programming language, as I'll try and explain below, but let me
first try and summarize the background of the decimal specification
which is on my website at:
http://www2.hursley.ibm.com/decimal/#arithmetic
Rexx
Back in 1979/80, I was writing the Rexx programming language,
which has always had (only) decimal arithmetic. In 1980, it was
used within IBM in over 40 countries, and had evolved a decimal
arithmetic which worked quite well, but had some rather quirky
arithmetic and rounding rules -- in particular, the result of an
operation had a number of decimal places equal to the larger of
the number of decimal places of its operands.
Hence 1.23 + 1.27 gave 2.50 and 1.23 + 1.27 gave 2.50.
This had some consequences that were quite predictable, but
were unexpected by most people. For example, 1.2 x 1.2 gave 1.4,
and you had to suffix a 0 to one of the operands (easy to do in
Rexx) to get an exact result: 1.2 x 1.20 => 1.44.
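For contrast, Python's decimal module (the subject of this thread) keeps the exact product without any padding of the operands:

```python
from decimal import Decimal

# Coefficient arithmetic is exact; no trailing-zero suffix is needed
# to get the full product, unlike the early Rexx rule described above.
print(Decimal('1.2') * Decimal('1.2'))  # 1.44
```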
By 1981, much of the e-mail and feedback I was getting was related
to various arithmetic quirks like this. My design strategy for
the language was more-or-less to 'minimise e-mail' (I was getting
350+ every day, as there were no newsgroups or forums then) --
and it was clear that the way to minimise e-mail was to make the
language work the way people expected (not just in arithmetic).
I therefore 'did the research' on arithmetic to find out what it
was that people expected (and it varies in some cases, around the
world), and then changed the arithmetic to match that. The result
was that e-mail on the subject dropped to almost nothing, and
arithmetic in Rexx became a non-issue: it just did what people
expected.
Its strongest feature is, I think, that "what you see is what
you've got" -- there are no hidden digits, for example. Indeed,
in at least one Rexx interpreter the numbers are, literally,
character strings, and arithmetic is done directly on those
character strings (with no conversions or alternative internal
representation).
I therefore feel, quite strongly, that the value of a literal is,
and must be, exactly what appears on the paper. And, in a
language with no constructors (such as Rexx), and unlimited
precision, this is straightforward. The assignment
a = 1.1001
is just that; there's no operation involved, and I would argue
that anyone reading that and knowing the syntax of a Rexx
assignment would expect the variable a to have the exact value of
the literal (that is, "say a" would then display 1.1001).
The Rexx arithmetic does have the concept of 'context', which
mirrors the way people do calculations on paper -- there are some
implied rules (how many digits to work to, etc.) beyond the sum
that is written down. This context, in Rexx, "is used to change
the way in which arithmetic operations are carried out", and does
not affect other operations (such as assignment).
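Python's decimal module draws the same line: the context governs how operations are carried out, but it does not affect construction from a string. A minimal sketch:

```python
from decimal import Decimal, getcontext

getcontext().prec = 3        # context: work to 3 significant digits
a = Decimal('1.1001')        # construction is exact -- no rounding applied
print(a)    # 1.1001
print(+a)   # 1.10  (unary plus is an operation, so the context rounds it)
```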
Java
So what should one do in an object-oriented language, where
numbers are objects? Java is perhaps a good model, here. The
Java BigDecimal class originally had only unlimited precision
arithmetic (the results of multiplies just got longer and longer)
and only division had a mechanism to limit (round) the result in
some way, as it must.
By 1997, it became obvious that the original BigDecimal, though
elegant in its simplicity, was hard to use. We (IBM) proposed
various improvements and built a prototype:
http://www2.hursley.ibm.com/decimalj/
and this eventually became a formal Java Specification Request:
http://jcp.org/aboutJava/communityprocess/review/jsr013/index.html
which led to the extensive enhancements in BigDecimal that were
shipped last year in Java 5:
http://java.sun.com/j2se/1.5.0/docs/api/java/math/BigDecimal.html
In summary, for each operation (such as a.add(b)) a new method was
added which takes a context: a.add(b, context). The context
supplies the rounding precision and rounding mode. Since the
arguments to an operation can be of any length (precision), the
rounding rule is simple: the operation is carried out as though to
infinite precision and is then rounded (if necessary). This rule
avoids double-rounding.
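Python's decimal module follows the same exact-then-round rule, as a quick sketch shows:

```python
from decimal import Decimal, getcontext

getcontext().prec = 5
# The sum below is formed exactly (3.9999999) and only then rounded
# once to the context precision -- there is no double rounding.
print(Decimal('1.2345678') + Decimal('2.7654321'))  # 4.0000
```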
Constructors were not a point of debate. The constructors in the
original BigDecimal always gave an exact result (even when
constructing from a binary double) so those were not going to
change. We did, however, almost as an afterthought, add versions
of the constructors that took a context argument.
The model, therefore, is essentially the same as the Rexx one:
what you see is what you get. In Java, the assignment:
BigDecimal a = new BigDecimal("1.1001");
ends up with a referring to an object with the value you see in the
string.
[Python-Dev] Decimal FAQ
Some of the private email I've received indicates a need for a decimal
FAQ that would shorten the module's learning curve.
A discussion draft follows.
Raymond
---
Q. It is cumbersome to type decimal.Decimal('1234.5'). Is there a way
to minimize typing when using the interactive interpreter?
A. Some users prefer to abbreviate the constructor to just a single
letter:
>>> D = decimal.Decimal
>>> D('1.23') + D('3.45')
Decimal("4.68")
Q. I'm writing a fixed-point application to two decimal places.
Some inputs have many places and need to be rounded. Others
are not supposed to have excess digits and need to be validated.
What methods should I use?
A. The quantize() method rounds to a fixed number of decimal
places. If the Inexact trap is set, it is also useful for
validation:
>>> TWOPLACES = Decimal(10) ** -2
>>> # Round to two places
>>> Decimal("3.214").quantize(TWOPLACES)
Decimal("3.21")
>>> # Validate that a number does not exceed two places
>>> Decimal("3.21").quantize(TWOPLACES, context=Context(traps=[Inexact]))
Decimal("3.21")
Q. Once I have valid two place inputs, how do I maintain that invariant
throughout an application?
A. Some operations like addition and subtraction automatically
preserve fixed point. Others, like multiplication and division,
change the number of decimal places and need to be followed-up with
a quantize() step.
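A short illustration of the follow-up quantize() step:

```python
from decimal import Decimal

TWOPLACES = Decimal('0.01')
a, b = Decimal('1.30'), Decimal('1.20')
print(a + b)                        # 2.50   -- addition keeps two places
print(a * b)                        # 1.5600 -- multiplication does not
print((a * b).quantize(TWOPLACES))  # 1.56   -- restore the invariant
```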
Q. There are many ways to express the same value. The numbers
200, 200.000, 2E2, and .02E+4 all have the same value at various
precisions. Is there a way to transform these to a single
recognizable canonical value?
A. The normalize() method maps all equivalent values to a single
representative:
>>> values = map(Decimal, '200 200.000 2E2 .02E+4'.split())
>>> [v.normalize() for v in values]
[Decimal("2E+2"), Decimal("2E+2"), Decimal("2E+2"), Decimal("2E+2")]
Q. Is there a way to convert a regular float to a Decimal?
A. Yes, all binary floating point numbers can be exactly expressed as a
Decimal. An exact conversion may take more precision than intuition
would suggest, so trapping Inexact will signal a need for more
precision:
def floatToDecimal(f):
    "Convert a floating point number to a Decimal with no loss of information"
    # Transform (exactly) a float to a mantissa (0.5 <= abs(m) < 1.0)
    # and an exponent.  Double the mantissa until it is an integer.
    # Use the integer mantissa and exponent to compute an equivalent
    # Decimal.  If this cannot be done exactly, then retry with more
    # precision.
    mantissa, exponent = math.frexp(f)
    while mantissa != int(mantissa):
        mantissa *= 2
        exponent -= 1
    mantissa = int(mantissa)
    oldcontext = getcontext()
    setcontext(Context(traps=[Inexact]))
    try:
        while True:
            try:
                return mantissa * Decimal(2) ** exponent
            except Inexact:
                getcontext().prec += 1
    finally:
        setcontext(oldcontext)
Q. Why isn't the floatToDecimal() routine included in the module?
A. There is some question about whether it is advisable to mix binary
and decimal floating point. Also, its use requires some care to avoid
the representation issues associated with binary floating point:
>>> floatToDecimal(1.1)
Decimal("1.100000000000000088817841970012523233890533447265625")
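For what it's worth, later versions of the module (Python 2.7/3.1 and up) provide this same exact conversion as a classmethod, so the hand-rolled routine above is no longer needed there:

```python
from decimal import Decimal

# Decimal.from_float() performs the exact binary-to-decimal conversion
# regardless of the current context precision.
print(Decimal.from_float(1.1))
# 1.100000000000000088817841970012523233890533447265625
```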
Q. I have a complex calculation. How can I make sure that I haven't
gotten a spurious result because of insufficient precision or rounding
anomalies?
A. The decimal module makes it easy to test results. A best practice
is to re-run calculations using greater precision and with various
rounding modes. Widely differing results indicate insufficient
precision, rounding mode issues, ill-conditioned inputs, or a
numerically unstable algorithm.
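One way to apply this practice is to re-run a calculation under several contexts and compare the rounded results (the calculation here is just a toy chosen to be sensitive to the context):

```python
from decimal import Decimal, getcontext, ROUND_DOWN, ROUND_UP

def calc():
    # A calculation whose rounded result depends on the context.
    return Decimal(1) / Decimal(7) * 7

results = set()
for prec in (8, 28):
    for mode in (ROUND_DOWN, ROUND_UP):
        getcontext().prec = prec
        getcontext().rounding = mode
        results.add(str(calc()))

# More than one distinct result flags sensitivity to precision
# and rounding mode.
print(sorted(results))
```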
Q. I noticed that context precision is applied to the results of
operations but not to the inputs. Is there anything I should watch
out for when mixing values of different precisions?
A. Yes. The principle is that all values are considered to be exact
and so is the arithmetic on those values. Only the results are
rounded. The advantage for inputs is that "what you type is what you
get". A disadvantage is that the results can look odd if you forget
that the inputs haven't been rounded:
>>> getcontext().prec = 3
>>> Decimal('3.104') + Decimal('2.104')
Decimal("5.21")
>>> Decimal('3.104') + Decimal('0.000') + Decimal('2.104')
Decimal("5.20")
The solution is either to increase precision or to force rounding of
inputs using the unary plus operation:
>>> getcontext().prec = 3
>>> +Decimal('1.23456789')
Decimal("1.23")
Alternatively, inputs can be rounded upon creation using the
Context.create_decimal() method:
>>> Context(prec=5, rounding=ROUND_DOWN).create_decimal('1.2345678')
Decimal("1.2345")
Q. I'm writing an application that tracks measurement units along
with numeric values (for example 1.1 meters and 2.3 grams). Is a
Decimal subclass the best approach?
Re: [Python-Dev] Decimal FAQ
> Q. I'm writing a fixed-point application to two decimal places.
> Some inputs have many places and needed to be rounded. Others
> are not supposed to have excess digits and need to be validated.
> What methods should I use?
>
> A. The quantize() method rounds to a fixed number of decimal
> places. If the Inexact trap is set, it is also useful for
> validation:
>
> >>> TWOPLACES = Decimal(10) ** -2
> >>> # Round to two places
> >>> Decimal("3.214").quantize(TWOPLACES)
> Decimal("3.21")
> >>> # Validate that a number does not exceed two places
> >>> Decimal("3.21").quantize(TWOPLACES, context=Context(traps=[Inexact]))
> Decimal("3.21")
I think an example of what happens when it does exceed two places would make
this example clearer. For example, adding this to the end of that:
>>> Decimal("3.214").quantize(TWOPLACES, context=Context(traps=[Inexact]))
Traceback (most recent call last):
[...]
Inexact: Changed in rounding
=Tony.Meyer
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Adventures with Decimal
Raymond Hettinger wrote:

> IMO, user input (or the full numeric strings in a text data file) is
> sacred and presumably done for a reason -- the explicitly requested
> digits should not be thrown away without good reason.

I still don't understand what's so special about the input phase that
it should be treated sacredly, while happily desecrating the result of
any *other* operation.

To my mind, if you were really serious about treating precision as
sacred, the result of every operation would be the greater of the
precisions of the inputs. That's what happens in C or Fortran - you
add two floats and you get a float; you add a float and a double and
you get a double; etc.

> Truncating/rounding a literal at creation time doesn't work well
> when you are going to be using those values several times, each
> with a different precision.

This won't be a problem if you recreate the values from strings each
time. You're going to have to be careful anyway, e.g. if you calculate
some constants, such as degreesToRadians = pi/180, you'll have to make
sure that you recalculate them with the desired precision before
rerunning the algorithm.

> Remember, the design documents for the spec state a general
> principle: the digits of a decimal value are *not* significands,
> rather they are exact and all arithmetic on them is exact with the
> *result* being subject to optional rounding.

I don't see how this is relevant, because digits in a character string
are not "digits of a decimal value" according to what we are meaning
by "decimal value" (i.e. an instance of Decimal). In other words, this
principle only applies *after* we have constructed a Decimal instance.

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
| A citizen of NewZealandCorp, a wholly-owned subsidiary of USA Inc. |
[EMAIL PROTECTED]
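Greg's degreesToRadians example can be sketched as follows (the PI literal here is just an illustrative truncation of pi, not part of the original mail):

```python
from decimal import Decimal, getcontext

# A high-precision literal; construction is exact, so the full string
# is retained regardless of the current context.
PI = Decimal('3.14159265358979323846264338327950288')

def degrees_to_radians():
    # Recompute the constant on each run so the division is rounded
    # to the currently active precision.
    return PI / 180

getcontext().prec = 8
low = degrees_to_radians()
getcontext().prec = 20
high = degrees_to_radians()
print(low)   # 0.017453293
print(high)  # the same constant, carried to 20 digits
```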
