Re: [sage-devel] On scientific computing, Python and Julia

Erik Massop Sat, 09 Aug 2014 19:35:27 -0700

On Fri, 8 Aug 2014 18:57:21 -0700 (PDT), rjf <fate...@gmail.com> wrote:

> On Thursday, August 7, 2014 10:55:37 PM UTC-7, Robert Bradshaw wrote:
>
> > On Thu, Aug 7, 2014 at 9:02 AM, rjf <fate...@gmail.com> wrote:
> >
> > > On Wednesday, August 6, 2014 8:11:21 PM UTC-7, Robert Bradshaw
> > > wrote:
...
> Just that for a mathematician to call something a field when it isn't
> would, I think, be a serious mistake.  You seem to think that it is
> ok for a computer system to do so.    I certainly agree that there are
> more important issues with floating point computation than the
> fact that these numbers do not constitute a real closed field.

In my experience computer scientists are often more precise than
mathematicians.

It seems quite common to me to (intentionally) confuse mathematical
objects with their imperfect realizations. For instance when I use my
compasses to draw something that looks a lot like a circle, I would call
what I drew a circle. Yet I am sure that the points that I drew do not
have equal distance to the center.

Similarly I would call RealField(prec) a field, because it represents
the field of real numbers (e.g. Dedekind cuts). At the same time I am
sure that some mathematical properties that I know from the mathematical
field of real numbers will not hold in RealField(prec). Luckily Sage
readily admits that RealField(prec) is not an exact representation of
the mathematical field of real numbers:
  sage: F = RealField(100)
  sage: F.is_exact()
  False

I agree that it is imprecise that when using F.is_field(), we are
asking whether the mathematical object of real numbers is a field, while
when using F.is_exact() we are asking whether the representation is
exact.

By the way, there also is AA or AlgebraicRealField() in Sage, which is
an exact representation of the real closed field of algebraic real
numbers.

...
> The domain of arbitrary-precision integers is an excellent model of
> the ring of integers.  It is true that one can specify a computation
> that would fill up the memory of all the computers in existence. or
> even all the atoms in the (known?) universe.  Presumably a
> well-constructed support system will give an error message on much
> smaller examples.   I assume that your Real Field  operation of
> division would give an error if the result is inexact.

RealField(prec) is an inexact approximation of a field (as witnessed by
is_exact), so I never expect the division to be exact. In fact I don't
expect addition, negation, multiplication, or subtraction to be exact
either. Indeed, they are not exact:
  sage: a = 1e-58
  sage: a
  1.00000000000000e-58
  sage: (1+a)-1
  0.000000000000000

> > > > > Does Sage have other um, approximations, in its nomenclature?
> > > >
> > > > Sure. RealField(123)[x]. Power series rings. P-adics.
> > >
> > > These approximations are approximations by their nature.  If you
> > > are computing with a power series, the concept inherently includes
> > > an error term which you are aware of.  Real Field is (so far as I
> > > know) a concept that should have the properties of a field.  The
> > > version in Sage does not. It's like saying someone isn't pregnant.
> > > well only a little pregnant.
> >
> > They're no more approximate by nature than the real numbers.
>
> Huh?  How so? I was not aware that real numbers (at least the ones
> that you can construct) are approximate by nature.  But maybe you
> can direct me to some reference on this.

What do you mean by "can construct"? The computable reals perhaps? Those
would indeed form an actual field. I am very unsure about how practical
they are though.

Might we then not also have a ring of computable power series? That is,
those power series given by an algorithm that given n returns the n'th
term in finite time?

...
> > Power series (to make things concrete, say the power series in one
> > variable over the integers) form a ring. For any choice of
> > representation some of them can be represented exactly on a
> > computer, most can't. When doing computations with power series one
> > is typically chooses a precision (e.g. how many terms, not unlike a
> > choice of number of bits) to use.
>
> Truncated power series with coefficients that are typically exact
> rational numbers  (but could be, you say , elements of a finite field)
> form a computation structure that has exact operations.   Just because
> you associate some other concept that is outside the computer with
> that TPS  doesn't mean the TPS is approximate.

I would argue that a ring of truncated power series with rational
coefficients is an inexact representation of the ring of (untruncated)
power series with rational coefficients, just as RealField(prec) is an
inexact representation of the field of real numbers.

It is important to realize that my use of "inexact" here refers to
TPS/RealField(prec) as a representation of the ring of (untruncated)
power series with rational coefficients/the field of real numbers, and
not to TPS/RealField(prec) itself.

The object PowerSeriesRing(QQ, 'x') in Sage is not a ring of truncated
power series QQ[x]/x^n. In it not a ring at all. In fact == is not even
transitive:
  sage: R.<x> = PowerSeriesRing(QQ, 'x')
  sage: a = O(x^20)
  sage: a in R
  True
  sage: (0 == x^30 + a) and (x^30 + a == x^30)
  True
  sage: 0 == x^30
  False

> > Real numbers form a field. For any choice of representation some of 
> > them can be represented exactly on a computer, most can't. When
> > doing computations with real numbers...
>
> So you would say that 3  is approximate, because maybe it is pi, but pi
> cannot be represented as an integer.  This point of view seems to me
> to be peculiar.

Indeed, I would argue that 3 in RealField(100) is approximate. For
instance the (computable) real number 3.00...001, where the ellipsis is
1000 zeroes, gets mapped to 3 in RealField(100), but is not equal to 3
as an actual real number.

When speaking of exact/inexact I think it is important, to keep track of
what is to be represented. Hence the element 3 of RealField(100) is an
inexact approximation of a real number since there are many reals that
become 3 in RealField(100). The element 3 of IntegerRing() is an exact
approximation of an integer, since there is only one integer that
becomes 3 in IntegerRing().

> > > > > > It is more conservative to convert operands to the domain
> > > > > > with less precision.
> > > > >
> > > > > Why do you say that?  You can always exactly convert a float
> > > > > number in radix b to an equal number of higher precision in
> > > > > radix b by appending zeros.
> > > > > So it is more conserving (of values) to do so, rather than
> > > > > clipping off bits from the other.
> > > >
> > > > Clipping bits (or digits) is exactly how one is taught to deal
> > > > with significant figures in grade school, and follows the
> > > > principle of least surprise (though floating point numbers like
> > > > to play surprises on you no matter what). It's also what
> > > > floating point arithmetic does when the exponent is different.
> > >
> > > It is of course also taught in physics and chemistry labs, and I
> > > used this myself in the days when slide-rules were used and you
> > > could read only 3 or so significant figures.  That doesn't make it
> > > suitable for a computer system.  There are many things you learn
> > > along the way that are simplified versions of the more fully
> > > elaborated systems of higher math.
> > > What did you know about the branch cuts in the complex logarithm
> > > or  log(-1)  when you were first introduced to log?
> >
> > Only being able to store 53 significant bits is completely analogous
> > to only being able to read 3 significant (decimal) figures.
>
> Actually this analogy is false.  The 3 digits (sometimes 4) from a
> slide rule are the best that can be read out because of the inherent
> uncertainty in the rulings and construction of the slide rule, the
> human eye reading the lines, etc.   So if I read my slide rule and say
> 0.25  it is because I think it is closer to 0.25  than 0.24 or 0.26
> There is uncertainty there. If a floating point number is computed as
> 0.25, there is no uncertainty in the representation per se.  It is
> 1/4, exactly a binary fraction, etc. Now you could use this
> representation in various ways, e.g. 0.25+-0.01    storing 2 numbers
> representing a center and a "radius" or an interval or ..../   But the
> floating point number itself is simply a computer representation of a
> particular rational number    aaa x 2^bbb  Nothing more, nothing less.
> And in particular it does NOT mean that bits 54,55,56... are
> uncertain.  Those bits do not exist in the representation and are
> irrelevant for ascertaining the value of the number aaa x 2^bbb.
>
> So the analogy is false.

I think it is again important to remember what is to be represented. If
you use floats to express numbers of the form aaa x 2^bbb (with
appropriate conditions on aaa and bbb), then sure, floats are an exact
representation. However there are many real numbers that are not of
this form. So, if you use floats for real numbers, then they are an
inexact representation. In that case the bits 54, 55, 56, ... of the
real number are lost in the 53 bit floating point representation. Hence
given the 53 bit floating point representation, the bits 54, 55,
56, ... of the real number that it is supposed to represent, are indeed
uncertain. (This is simplified since the bit-numbering depends on the
exponent.)

If you happen to know that the real number that is represented is of the
form aaa x 2^bbb (with appropriate conditions on aaa and bbb), then you
can reconstruct all those extra bits, and the representation is arguably
exact. But how should Sage know that the represented real number is of
this form?

...
> > Or are you seriously proposing when adding 3.14159 and 1e-100 it
> > makes more sense, by default, to pad the left hand side with zeros
> > (whether in binary or decimal) and return 3.1415900000...0001 as the
> > result?
>
> If you did so, you would preserve the  identity  (a+b)-a   =  b

How would the real number pi be represented in this system? Do you have
to explicitly say that you want the first n digits? Or is the padding
scheme smart enough to add pi's digits? In the latter case, how do you
keep checking (in)equality efficient?

How does (1/3)*3 give? Does it compare equal to 1? I mean, I would
imagine 1/3 to be 0.333..., with more digits 3 as I need them, coming
from larger and larger 0 paddings of 1 and 3. Supposing this, (1/3)*3
would be 0.999..., with more digits 9 as I need them. Will comparing
with 1 ever finish?

...
> Now if the user specified the kind of arithmetic explicitly, or even
> implicitly by saying "use IEEE754 binary floating point arithmetic
> everywhere" then I could go along with that.

I think you can specify this to some extend. At least
  sage: RealField?
gives mentions opportunities to tweak precision and rounding. You can
use AA if you want exact arithmetic (of real algebraic numbers). I don't
think there is a field of computable reals in Sage.

...
> > > > > You seem to think that a number of low precision has some
> > > > > inaccuracy or uncertainty. Which it doesn't.   0.5 is the same
> > > > > number as 0.500000. Unless you believe that 0.5  is  the
> > > > > interval [0.45000000000....01, 0.549999..........9] which you
> > > > > COULD believe  -- some people do believe this.
> > > > > But you probably don't want to ask them about scientific
> > > > > computing.

Again, I think you should keep track of ambient sets here. For sure, the
real numbers 0.5 and 0.5000000 are the same, and real numbers have no
uncertainty (when representing real numbers). However, users do not type
real numbers. They type finite strings of characters that they probably
mean to represent some real number. The _strings_ 0.5 and 0.5000000 are
not the same. Hence it might be reasonable to assign different meaning
to these strings, as long as the user also has a way to precisely
express what is meant. Yet, I'm no expert on scientific computing, so I
will leave this be.

...
> > > > There's 0.5 the real number equal to one divided by two. There's
> > > > also 0.5 the IEEE floating point number, which is a
> > > > representative for an infinite number of real numbers in a small
> > > > interval.
> > >
> > > Can you cite a source for that last statement?  While I suppose
> > > you can decide anything you wish about your computer, the
> > > (canonical?) explanation is that the IEEE ordinary floats
> > > correspond to a subset of the EXACT rational numbers equal to
> > > <some integer> X 2^<some integer> which is not an interval at all.
> > > (there are also inf and nan things which are not ordinary in that
> > > sense)
> > >
> > > So, citation?  (And I don't mean citing a physicist, or someone
> > > who learned his arithmetic in 4th grade and hasn't re-evaluated it
> > > since. A legitimate numerical analyst.)

I am not a numerical analyst, so perhaps the following is naive, but I
would really like know where it is flawed or why my assumptions are
unreasonable.

I call a real number exactly representable when it occurs as an exact
value of a float with some bounded number of bits for its exponent.
Then when using floats to represent real numbers, you have to decide
for each real number x by which exactly representable real number f(x)
it should be represented. I can think of two desirable properties:
 * exactly representable real numbers should be represented by
   themselves,
 * the relation <= should be preserved.
These properties yield that for each exactly representable real number y
the set {x real : f(x) = y} is an interval around y. Moreover this
interval is contained in every interval (a, b) where a and b are exactly
representable real numbers with a < y and y < b.

Okay, we need some condition on the mantissa and exponent, for if the
mantissa and exponent are allowed to be arbitrary integers, a function
with these properties fails to exists: Let f be a partial function with
the required properties, then the intervals {x real : f(x) = y}, being
contained in the intervals (a,b) with a and b exactly representable
with a < y < b, necessarily contain only y. Since there are only
countable many exactly representable real numbers, each with preimage
containing only itself, the domain of definition of f is countable. In
particular f is not a total function.

...
> What this means is that the numbers in the system A and b often come
> from the real world and you don't know their true values, just the
> values that you measured.
> When you enter the data into the computer you put the values for A and
> b that are the closest numbers to what you believe.  A major question
> in computational linear algebra is how to measure the sensitivity of
> the computed solution to slight changes in the initial data.  The
> numbers used in solving the system are exactly the represented
> floating-point numbers.
>
> > They're representatives for the actual (either unknown or impossible
> > to represent) elements of your data. And, yes, they're also exact
> > rational numbers--once again they can be both--but the point is that
> > the input and output floating point numbers are viewed as
> > perturbations of the (actual) real field elements you care about.
>
> That does not give you or anyone else permission to say  "oh, this
> data might not be right so instead of solving the system as best as
> I can, I will use some off-hand rule to clip bits off, since it's just
> a perturbation of something".   No, you compute the best you can
> with the data you have.

You cannot just assume either that the data is exact.

Also it makes more sense to me not to do the best you can, but the best
you can do efficiently. Trying to do the best you can is of no use if it
means running out of memory or taking a year. If doing things
efficiently involves some clipping, then that's okay with me,
especially if the clipping is configurable.

Regards,

Erik Massop

-- 
You received this message because you are subscribed to the Google Groups 
"sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sage-devel+unsubscr...@googlegroups.com.
To post to this group, send email to sage-devel@googlegroups.com.
Visit this group at http://groups.google.com/group/sage-devel.
For more options, visit https://groups.google.com/d/optout.

Re: [sage-devel] On scientific computing, Python and Julia

Reply via email to