Numeric literals, take 3

Angel Faus Wed, 27 Nov 2002 13:03:18 -0800

Hi, 

This is an updated version of the numeric literals document. Hopefully 
it is consistent with Michael's summary, and with discussions on the 
list.


The portions that were wrong (complex numbers, etc.) have been 
removed. Other parts (NaN, etc.) are still there, but I think that 
they should later be moved elsewhere. 

I am just not sure that I got the Ints part right; see the 
separate post about it.

Michael: feel free to edit/add/remove as you need for the integrated 
number document. 

-angel

-------------------------
=section * Numeric Literals

=section ** Integer and Decimal literals

There are many ways to specify literal numeric values in
Perl, but the default is to use base 10 for both input and 
output.

  14
  -123.5

=section ** Scientific (or exponential) notation

Perl allows you to use standard scientific notation
to represent floating-point literals. For example:

  7.823e6

This notation is designed to let you write very large or
very small numbers efficiently. The portion to the left of the
C<e> is the coefficient, and the portion to the right is the
exponent, so a number of the form C<C.CCCeXX> is actually
interpreted as C<C.CCC * 10**XX>.

You may use either upper or lowercase 'e' as the exponent 
marker, so, for example, 3.6e3 is the same as 3.6E3.

Some examples:

  7.828e6        # equivalent to 7828000
  7.828E6        # the same
  3.45e-12       # 0.00000000000345 
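
The coefficient/exponent rule can be checked with a small sketch. It is written in Python rather than Perl, purely as an illustration; the helper name C<expand_scientific> is hypothetical and not part of any Perl API.

```python
from decimal import Decimal

def expand_scientific(literal: str) -> Decimal:
    """Return coefficient * 10**exponent for a literal such as '7.828e6'."""
    # Either 'e' or 'E' may mark the exponent; a missing exponent means 0.
    coefficient, _, exponent = literal.lower().partition("e")
    return Decimal(coefficient) * Decimal(10) ** int(exponent or 0)

print(expand_scientific("7.828e6") == Decimal(7828000))              # True
print(expand_scientific("7.828E6") == expand_scientific("7.828e6"))  # True
```

Decimal arithmetic is used here so that the comparison is exact and not subject to binary floating-point rounding.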

=section ** Radix Notation

Perl lets you write numbers in any base just by prepending 
the radix and a C<#> to the literal.

For example:

  2#101110                # binary
  2#0.1                   # fractional binary
  3#1210112               # ternary
  8#1270                  # octal

Printing these would give 46, 0.5, 1310, and 696
respectively. 

This is because once the number has been read by Perl
it becomes just a magnitude. That is, it loses all trace of
the way it was originally represented and is just a number. 
Perl always uses decimal for printing numbers, regardless of 
what base you used to enter them.
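
How a radix literal collapses into a plain magnitude can be sketched as follows. This is an illustrative Python model, not Perl; C<radix_value> is a hypothetical helper, and it skips the digit-range checks a real parser would need.

```python
from fractions import Fraction

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def radix_value(literal: str) -> Fraction:
    """Evaluate 'radix#digits', with an optional fractional part, exactly."""
    radix_text, _, body = literal.partition("#")
    radix = int(radix_text)
    whole, _, frac = body.lower().partition(".")
    value = Fraction(0)
    # Integer part: shift left by one digit, then add the next digit.
    for ch in whole:
        value = value * radix + DIGITS.index(ch)
    # Fractional part: digits are worth radix**-1, radix**-2, and so on.
    for position, ch in enumerate(frac, start=1):
        value += Fraction(DIGITS.index(ch), radix ** position)
    return value

for text in ("2#101110", "2#0.1", "3#1210112", "8#1270"):
    print(text, "->", float(radix_value(text)))
```

Once evaluated, the result carries no record of the base it was written in, which is why printing always falls back to decimal.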

(See the section "Number to String Conversion" for information
about how to customize the string representation of numbers.)

When the base is greater than 10, there is a need to 
represent digits that are greater than 9. This can be done in
two ways:

=over

=item *

Alphanumeric digits: Following common practice,
Perl interprets the letter A as the digit 10, the letter B
as the digit 11, and so on. Alphanumeric digits are case 
insensitive:

  16#1E3A7              # base 16
  16#1e3a7              # the same 

This won't work for bases greater than 36, so there is
also:

=item *

Colon Form: You can also write each digit in its own
decimal representation, and separate digits using the C<:>
character.

 my $m = 256#255:255:255:0;    # base 256

=back

For example, the integer 30 can be written in base 16
in two equivalent ways:

   my $x = 16#1E;
   my $x = 16#1:14;

The two forms cannot be mixed within a single literal, so
writing something like C<16#D:13> will generate a compile-time
error.

Also note that a compile-time error will be generated
if you specify a "digit" that is larger than your radix
can support. For instance,

  my $x = 3#23; # error: can't use digit '3' in base 3
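
The two digit forms and the two error cases can be modelled with a short sketch. It is Python, for illustration only; C<digits_value> is a hypothetical helper, and raising C<ValueError> stands in for the compile-time error described above.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def digits_value(radix: int, body: str) -> int:
    """Evaluate the digit part of a radix literal in either digit form."""
    if ":" in body:
        # Colon form: every digit is written in decimal, so letters
        # (as in 'D:13') mean the two forms were mixed.
        parts = body.split(":")
        if not all(part.isascii() and part.isdigit() for part in parts):
            raise ValueError(f"cannot mix digit forms in {body!r}")
        digits = [int(part) for part in parts]
    else:
        # Alphanumeric form: 0-9, then a-z (case-insensitive) for 10-35.
        digits = []
        for ch in body.lower():
            if ch not in DIGITS:
                raise ValueError(f"invalid digit {ch!r}")
            digits.append(DIGITS.index(ch))
    value = 0
    for digit in digits:
        if digit >= radix:
            raise ValueError(f"can't use digit {digit} in base {radix}")
        value = value * radix + digit
    return value

print(digits_value(16, "1E") == digits_value(16, "1:14") == 30)  # True
```
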

Finally, when writing negative integers, be aware that 
the minus sign (C<->) must always be the leftmost 
character in the literal, which means it will come to 
the left of the radix when using radix notation.

  my $z = -256#234:254;                  # negative number
  my $e = 256#-234:254;                  # error
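
The sign rule amounts to a grammar in which the minus sign may only appear before the radix. A minimal Python sketch of that grammar follows; the regular expression and the name C<split_radix_literal> are assumptions made for illustration, not the actual Perl grammar.

```python
import re

# Assumed shape: optional leading '-', decimal radix, '#', then the digits.
RADIX_LITERAL = re.compile(
    r"(?P<sign>-?)(?P<radix>[0-9]+)#(?P<body>[0-9A-Za-z:]+)"
)

def split_radix_literal(text: str):
    """Return (sign, radix, digit body), or raise if the literal is malformed."""
    match = RADIX_LITERAL.fullmatch(text)
    if match is None:
        raise ValueError(f"malformed radix literal: {text!r}")
    sign = -1 if match["sign"] else 1
    return sign, int(match["radix"]), match["body"]

print(split_radix_literal("-256#234:254"))  # (-1, 256, '234:254')
```

A minus sign after the C<#> never matches the digit body, so C<256#-234:254> is rejected.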


=section *** Bin/Hex/Oct shorthands

Since writing in binary, hexadecimal and octal is a 
moderately common operation, Perl offers a shorthand
for them:

 0b0110   # bin
 0c0123   # oct
 0x00ff   # hex
 0x00fF   # hex, == 0x00ff
 0x00FF   # hex, == 0x00ff
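
Each shorthand is just a fixed radix chosen by the prefix. The mapping can be sketched in Python (illustrative only; C<shorthand_value> is a hypothetical helper, and C<0c> is the octal prefix used in this draft):

```python
# Prefix -> radix, as listed above.
SHORTHAND_RADIX = {"0b": 2, "0c": 8, "0x": 16}

def shorthand_value(literal: str) -> int:
    """Evaluate a 0b / 0c / 0x literal by dispatching on its prefix."""
    prefix, digits = literal[:2].lower(), literal[2:]
    if prefix not in SHORTHAND_RADIX:
        raise ValueError(f"unknown shorthand prefix in {literal!r}")
    # int() with an explicit base treats hex digits case-insensitively.
    return int(digits, SHORTHAND_RADIX[prefix])

print(shorthand_value("0b0110"), shorthand_value("0c0123"), shorthand_value("0x00ff"))
# 6 83 255
```
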
 
=section ** Underscore character as a separator

Perl allows the underscore character, C<_>, to be used as
a separator between the digits of any literal number. You
can use this to break up a long number into a more readable
form.

The only rule is that you may only use an underscore between 
digits, not next to any other character that is allowed 
to appear in a number.

 123_456_000.000     # decimal
 2#0110_1000         # binary
 16#FF_88_EE         # hexadecimal
 1312512.25          # decimal 
 1_312_512.25        # the same one
 
 _2_3_4____5___6     # error
 1.434_e_45          # error: this 'e' is not a digit.
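
The separator rule can be sketched as a check over the literal's text. This Python model is illustrative only: C<strip_underscores> is a hypothetical helper, and it assumes the strict reading that each underscore must have a digit directly on both sides (the text above does not say whether runs of underscores are allowed).

```python
def strip_underscores(literal: str, digits: str = "0123456789") -> str:
    """Remove separators, insisting each '_' sits directly between two digits."""
    for i, ch in enumerate(literal):
        if ch != "_":
            continue
        before = literal[i - 1] if i > 0 else ""
        after = literal[i + 1] if i + 1 < len(literal) else ""
        # An empty neighbour means the underscore is at the edge of the literal.
        if not before or not after or before not in digits or after not in digits:
            raise ValueError(f"misplaced underscore in {literal!r}")
    return literal.replace("_", "")

print(strip_underscores("1_312_512.25"))                 # 1312512.25
print(strip_underscores("FF_88_EE", "0123456789ABCDEF")) # FF88EE
```

The set of valid digits is passed in because it depends on the radix: C<E> is a digit in C<16#FF_88_EE> but not in C<1.434_e_45>.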
 
=section ** Pseudo-Numbers

=section *** NaN

The value C<NaN> ("Not a Number") may be returned by some
functions or operations to signal that the result of a
calculation (for example, division by zero) cannot be
represented by a numeric value.

Perl follows the IEEE-754 standard in operations between
C<NaN> and other numbers: the result of any arithmetic
operation is always C<NaN>, and C<NaN> compares unequal to
everything, including itself.

 print 0 * NaN;    # NaN 
 print NaN / NaN;  # NaN
 print NaN == NaN; # false
 print NaN != NaN; # true
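
NaN propagation through arithmetic can be observed in any language whose floats are IEEE-754 doubles. The sketch below uses Python for illustration, assuming the usual IEEE-754 double representation for its C<float> type; it covers arithmetic only and does not model Perl's C<NaN> keyword.

```python
import math

nan = math.nan

# Any arithmetic operation involving NaN yields NaN again.
results = [0 * nan, nan / nan, nan + 1, nan - nan]
print(all(math.isnan(r) for r in results))  # True
```
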

In boolean context (see "Boolean Context") C<NaN> always
evaluates to false.

=section *** Inf

The terms C<Inf> and C<-Inf> represent positive and
negative infinity; you may sometimes use these to create
infinite lists.

Perl also handles C<Inf> according to the IEEE-754
standard, which means that in general it will do 
what you expect from it. For example:

 (In the following table $X represents a finite 
 number, that is, neither Inf, -Inf, nor NaN.)
 
  Operation      Result      When
 -----------    --------    ------

  Inf + $X       Inf        
  Inf + Inf      Inf   
  Inf - $X       Inf
  Inf - Inf      NaN

                -Inf       $X < 0
  Inf * $X       NaN       $X == 0
                 Inf       $X > 0
                 
  Inf * Inf      Inf 
  
                -Inf       $X < 0
  Inf / $X       NaN       $X == 0  # +-Inf actually
                +Inf       $X > 0
                
  Inf / Inf      NaN
  
                 NaN       $X < 0
  $X ** Inf      0         0 <= $X < 1
                 1         $X == 1
                 Inf       $X > 1
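
Several rows of the table can be checked against an IEEE-754 implementation. The sketch below uses Python floats for illustration, assuming they are IEEE-754 doubles; it covers only nonzero finite values of $X, because Python raises an exception on division by zero and does not return an IEEE-754 result there. The value chosen for C<x> is arbitrary.

```python
import math

inf = math.inf
x = 2.5  # an arbitrary finite positive number

print(inf + x, inf + inf, inf - x)         # inf inf inf
print(math.isnan(inf - inf))               # True
print(inf * -x, inf * x, inf * inf)        # -inf inf inf
print(inf / -x, inf / x)                   # -inf inf
print(math.isnan(inf / inf))               # True
print(0.5 ** inf, 1.0 ** inf, 2.0 ** inf)  # 0.0 1.0 inf
```
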

NOTE: are we going to have +-Inf too?

=section * Caveats when using typed variables

All literal numbers are interpreted at compile-time,
before there is any information available about the type
of the variable that will store them.

This can produce undesired effects when working with
custom types of numbers, such as variables typed
as C<Int>.

For example:

 my Int $i is bigint = 777_666_555_444_333_222_111;
 print $i;  # prints 777666555444333000000

This happens because the C<777_666_555_444_333_222_111>
literal is interpreted as an untyped number, and since
it is too big to be stored in the native integer type,
it is automatically promoted to the floating point number
C<7.77666555444333e+20>. By the time this number is 
stored in an Int variable it has already lost its last 
digits.

If you need to create typed C<Int> numbers without risking 
a loss of precision, you should write them as string literals:

 my Int $i is bigint = '777_666_555_444_333_222_111'; 
 print $i; # prints 777666555444333222111

In that case the conversion to a number type happens
at run-time, and is controlled by the type of the C<$i>
variable, so everything goes well.
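
The same loss of precision can be reproduced in any language by routing a large integer through a 64-bit float. The Python sketch below is only an analogy for the behaviour described above: Python's own integer literals are arbitrary-precision, so the float detour is made explicit.

```python
literal = "777_666_555_444_333_222_111"

# Parsing the text directly as an integer keeps every digit.
exact = int(literal.replace("_", ""))

# Going through a 64-bit float first keeps only about 15-17 significant digits.
via_float = int(float(exact))

print(exact)               # 777666555444333222111
print(via_float == exact)  # False
```
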
