Re: Numeric Semantics

2007-01-23 Thread TSa

HaloO

Darren Duncan wrote:
Up front, I will say that, all this stuff about 1 vs 1.0 won't matter at 
all if the Int type is an actual subset of the Num type (but whose 
implementation is system-recognized and optimized), meaning that Int and 
Num are not disjoint, as "most folks" usually expect to be the case, 
such that, eg, 1 === 1.0 returns true.


I agree with that except for the last statement. I think that 1 === 1.0
should be False because the involved types are different. This e.g.
also applies to 1.0 === Complex(1.0,0.0) which should be False. In
both cases we should have numeric equality, i.e. 1 == 1.0 and
1.0 == Complex(1.0,0.0) are True. And of course we have the subtyping
chain Int <: Num <: Complex.
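
The distinction can be modelled outside Perl 6. What follows is a minimal Python sketch, not Perl 6 semantics: the helper identical() is a hypothetical stand-in for ===, comparing type first and value second, while Python's own == plays the role of numeric equality.

```python
def identical(a, b):
    """Hypothetical stand-in for ===: same type and same value."""
    return type(a) is type(b) and a == b

# Numeric equality ignores the type of the operands.
numeric_eq = (1 == 1.0, 1.0 == complex(1.0, 0.0))

# Identity in the sense above does not.
identity = (identical(1, 1.0), identical(1.0, complex(1.0, 0.0)))

print(numeric_eq)  # (True, True)
print(identity)    # (False, False)
```

Under this reading the two operators only disagree when the operand types differ, which is exactly the 1 versus 1.0 case.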

The Gaussian integers are a subtype of Complex and a supertype of
Int but not of Num. So in the end we have the type lattice

     Complex
     /     \
   Num   Gaussian
     \     /
       Int

It's interesting how this Gaussian type might be fitted in after the
other three. The link from Int to Gaussian needs a supertyping
construct. Something like 'role Gaussian does Complex superdoes Int'.
So consider this as an addendum to the supertyping thread.
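
As a rough illustration of fitting a supertype in after the fact, here is a Python sketch using abstract base classes. The mapping is an assumption made for the example: int stands for Int, float and numbers.Real for Num, numbers.Complex for Complex. The class Gaussian is hypothetical, and register() is only an approximation of what a 'superdoes' construct would have to do.

```python
import numbers

class Gaussian(numbers.Complex):
    """Hypothetical marker type for Gaussian integers (a + bi, a and b integral)."""

# Retroactively declare int a (virtual) subtype of Gaussian, without
# touching the definition of int itself.
Gaussian.register(int)

lattice = {
    "Int <: Num": issubclass(int, numbers.Real),                  # True
    "Int <: Gaussian": issubclass(int, Gaussian),                 # True
    "Num <: Complex": issubclass(float, numbers.Complex),         # True
    "Gaussian <: Complex": issubclass(Gaussian, numbers.Complex), # True
    "Num <: Gaussian": issubclass(float, Gaussian),               # False
    "Gaussian <: Num": issubclass(Gaussian, numbers.Real),        # False
}

for relation, holds in lattice.items():
    print(relation, holds)
```

The last two entries are the point of the diagram: Num and Gaussian sit side by side, with neither one a subtype of the other.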


Regards, TSa.
--


Re: Numeric Semantics

2007-01-23 Thread Larry Wall
On Mon, Jan 22, 2007 at 08:47:22PM -0800, Darren Duncan wrote:
: At 5:56 PM -0800 1/22/07, Larry Wall wrote:
: >Whether a Num that happens to be an integer prints out with .0 is a
: >separate issue.  My bias is that a Num should pretend to be an integer when
: >it can.  I think most folks (including mathematicians) think that
: >the integer 1 and the distance from 0 to 1 on the real number line
: >happen to be the same number most of the time.  And Perl is not about
: >forcing a type-theoretical viewpoint on the user...
: 
: Up front, I will say that, all this stuff about 1 vs 1.0 won't matter 
: at all if the Int type is an actual subset of the Num type (but whose 
: implementation is system-recognized and optimized), meaning that Int 
: and Num are not disjoint, as "most folks" usually expect to be the 
: case, such that, eg, 1 === 1.0 returns true.
: 
: Of course if we did that, then dealing with Int will have a number of 
: the same implementation issues as with dealing with "subset"-declared 
: types in general.
: 
: I don't know yet whether it was decided one way or the other.

For various practical reasons I don't think we can treat Int as a
subset of Num, especially if Num is representing any of several
approximating types that may or may not have the "headroom" for
arbitrary integer math, or that lose low bits in the processing of
gaining high bits.  It would be possible to make Int a subset of Rat
(assuming Rat is implemented as Int/Int), but I don't think Rats are
very practical either for most applications.  It is unlikely that the
universe uses Rats to calculate QM interactions.
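
The "headroom" problem is easy to reproduce. The following Python sketch assumes that a Python float is an IEEE 754 double (true on mainstream platforms) and uses fractions.Fraction as a stand-in for a Rat implemented as Int/Int. It is an illustration only and says nothing about how Perl 6 would implement either type.

```python
from fractions import Fraction

big = 2**53 + 1             # exact as an arbitrary-precision integer

as_float = float(big)       # nearest double; the low bit is lost
as_rat = Fraction(big, 1)   # integer/integer representation keeps every bit

print(int(as_float) == big)              # False
print(as_rat == big)                     # True
print(Fraction(1, 3) + Fraction(1, 6))   # 1/2
```

So every integer embeds exactly in the rational type, but not in the approximating one, which matches the subset argument made above.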

: Whereas, if Int is not an actual subset of Num, and so their values 
: are disjoint, then ...

Then 1.0 == 1 !=== 1.0.  I'm fine with that.

: FYI, my comment about a stringified Num having a .0 for 
: round-tripping was meant to concern Perl code generation in 
: particular, such as what .perl() does, but it was brought up in a 
: more generic way, to stringification in general, for an attempt at 
: some consistency.

It seems like an unnecessary consistency to me.  The .perl method
is intended to provide a human-readable, round-trippable form of
serialization, and as a form of serialization it is required to
capture all the information so that the original data structure
can be recreated exactly.  This policy is something the type should
have little say in.  At most it should have a say in *which* way to
canonicalize all the data.

The purpose of stringification, on the other hand, is whatever the type
wants it to be, and it is specifically allowed to lose information,
as long as the remaining string is suggestive to a human reader how
to reconstruct the information in question (or that the information is
none of their business).  A document object could decide to stringify
to a URI, for instance.  An imported Perl 5 code reference could
decide to stringify to CODE(0xdeadbeef).  :)

: Whether or not that is an issue depends really on whether we consider 
: the literal 1.0 in Perl code to be an Int or a Num.  If, when we 
: parse Perl, we decide that 1.0 is a Num rather than an Int such as 1 
: would be, then a .perl() invoked on a Num of value 1.0 should return 
: 1.0 also, so that executing that code once again produces the Num we 
: started with rather than an Int.

I think the intent of 1.0 in Perl code is clearly more Numish than
Intish, so I'm with you there.  So I'm fine with Num(1).perl coming
out simply as "1.0" without further type annotation.  But ~1.0 is
allowed to say "1" if that's how Num likes to stringify.
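
The split between the two operations can be sketched in Python, where repr() and str() play loosely similar roles. The functions perl_form() and stringify() are hypothetical names invented for this example, and the choice to drop a trailing ".0" is just one policy a Num type might pick.

```python
import ast

def perl_form(x):
    """Hypothetical analogue of .perl: text that re-parses to the same type and value."""
    return repr(x)

def stringify(x):
    """Hypothetical analogue of prefix ~: may drop the '.0' of an integral float."""
    if isinstance(x, float) and x.is_integer():
        return str(int(x))
    return str(x)

num = 1.0
round_tripped = ast.literal_eval(perl_form(num))
lossy = ast.literal_eval(stringify(num))

print(perl_form(num), type(round_tripped).__name__)  # 1.0 float
print(stringify(num), type(lossy).__name__)          # 1 int
```

Re-parsing the serialized form gives back a float, while re-parsing the stringified form gives an int: the information loss that stringification is allowed and serialization is not.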

: On the other hand, if .perl() produces some long-hand like "Int(1)" 
: or "Num(1)", then it won't matter whether it is "Num(1)" or 
: "Num(1.0)" etc.

Well, mostly, unless we consider that Num(1.0) might have to wait
till run time to know what conversion to Num actually means, if Num
is sufficiently delegational.  But I think the compiler can probably
require a tighter definition of basic types for optimization purposes,
at least by default.

Larry


Re: Numeric Semantics

2007-01-23 Thread Jonathan Scott Duff

On 1/22/07, Doug McNutt <[EMAIL PROTECTED]> wrote:


At 00:32 + 1/23/07, Smylers wrote:
>  % perl -wle 'print 99 / 2'
>  49.5

I would expect the line to return 49 because you surely meant integer
division. Perl 5 just doesn't have a user-available type integer.



That doesn't mean that I surely meant integer division. Being used to how
Perl 5 (and many other languages) do things, I would expect floating point
division (though if it's not floating point beneath the covers that's fine
with me as long as I can still get 49.5 out).

% perl -wle 'print 99.0 / 2.0'   OR

% perl -wle 'print 99.0 / 2'

would return 49.5 because a coercion was required and float is the default
for such things.

But that may be the mathematician in me.



I don't see why the mathematician in you doesn't expect "regular"
mathematical behavior from Perl.  Perhaps it's that you've been using
computers too long and have become used to the limitations of digital media.
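
For comparison only: Python 3 spells the two readings of 99 / 2 with separate operators, and a rational type gives the same 49.5 with no floating point underneath. This is a sketch of the options under discussion, not a claim about what Perl 5 or Perl 6 does.

```python
from fractions import Fraction

true_quotient = 99 / 2             # floating-point division
floor_quotient = 99 // 2           # explicit integer (floor) division
exact_quotient = Fraction(99, 2)   # exact rational, no floating point involved

print(true_quotient)    # 49.5
print(floor_quotient)   # 49
print(exact_quotient)   # 99/2
print(float(exact_quotient) == true_quotient)  # True
```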

-Scott
--
Jonathan Scott Duff
[EMAIL PROTECTED]


Fwd: Numeric Semantics

2007-01-23 Thread Jonathan Scott Duff

I accidentally sent this just to Darren ...

-Scott

-- Forwarded message --
From: Jonathan Scott Duff <[EMAIL PROTECTED]>
Date: Jan 22, 2007 6:23 PM
Subject: Re: Numeric Semantics
To: Darren Duncan <[EMAIL PROTECTED]>



On 1/22/07, Darren Duncan <[EMAIL PROTECTED]> wrote:


I think that 1 should be an Int and 1.0 should be a Num.  That makes
things very predictable for users, as well as easy to parse ... the
visible radix point indicates that you are usually measuring to
fractions of an integer, even if you aren't in that exact case.  Also
importantly, it makes it easy for users to choose what they want.

For round-trip consistency, a generic non-formatted
num-to-char-string operation should include a .0 as appropriate if it
is converting from a Num, whereas when converting from an Int it
would not.

Furthermore, my preference is for Int and Num to be completely
disjoint types, meaning that "1 === 1.0" will return False.  However,
every Int value can be mapped to a Num value, and so "1 == 1.0" will
return True as expected, because == casts both sides as Num.



While I'm in general agreement with everything you've said it makes me a
tad  nervous to hinge so much on the difference of one character.  Can you
imagine trying to track down the bug where

   if ($alpha === $beta) { ... }

really should have been

   if ($alpha == $beta) { ... }

Anyway, it's not like this problem wasn't already there, it's just that your
email made it stand out to me.

-Scott

--
Jonathan Scott Duff
[EMAIL PROTECTED]


Re: Fwd: Numeric Semantics

2007-01-23 Thread Paul Seamons
> While I'm in general agreement with everything you've said it makes me a
> tad  nervous to hinge so much on the difference of one character.  Can you
> imagine trying to track down the bug where
>
> if ($alpha === $beta) { ... }
>
> really should have been
>
> if ($alpha == $beta) { ... }
>
> Anyway, it's not like this problem wasn't already there, it's just that
> your email made it stand out to me.

I'm not adding support to either side of the issue.  I just wanted to point 
out that with Perl 5 and other current languages I occasionally have to 
search for that bug right now.  Except it is spelled a little different with 

  if ($alpha = $beta) { ... }

When I really meant:

  if ($alpha == $beta) { ... }

It is rare though.  I think the == vs === will be rare also.

Paul


Re: Fwd: Numeric Semantics

2007-01-23 Thread Jonathan Scott Duff

On 1/23/07, Paul Seamons <[EMAIL PROTECTED]> wrote:


> While I'm in general agreement with everything you've said it makes me a
> tad  nervous to hinge so much on the difference of one character.  Can you
> imagine trying to track down the bug where
>
> if ($alpha === $beta) { ... }
>
> really should have been
>
> if ($alpha == $beta) { ... }
>
> Anyway, it's not like this problem wasn't already there, it's just that
> your email made it stand out to me.

I'm not adding support to either side of the issue.  I just wanted to point
out that with Perl 5 and other current languages I occasionally have to
search for that bug right now.  Except it is spelled a little different with

  if ($alpha = $beta) { ... }

When I really meant:

  if ($alpha == $beta) { ... }

It is rare though.  I think the == vs === will be rare also.



Perhaps.

To me, finding the = vs. == bug is a bit easier due to the large conceptual
difference between the operators.  (or maybe I'm just used to looking for it
after 20+ years of coding in languages that have = and ==)  But for == vs.
===, they are both comparators and that tends to muddy the waters a bit when
it comes to your brain helping you find the bug.  (at least it does for me)

-Scott
--
Jonathan Scott Duff
[EMAIL PROTECTED]


Smooth or Chunky?

2007-01-23 Thread Larry Wall
I've been struggling lately with a missing generalization, and I'm not
sure how it's going to play out, so I thought I'd ask for advice, or
at least think out loud a bit.

Perl has always had functions and listops that take a flat list and
do something with each element.  Perl has also had various functions
that return flat lists, and these naturally feed into the listops.
For instance, the map function has always produced a flat list.  A
split with captures flattens the captures along with the split values.

Recently I started redefining C<map> to return multislices such that

map { $_, $_ * 10 }, 1..3

seems to return 1,10,2,20,3,30 by default, but in a multidimensional
context:

@@multislice := map { $_, $_ * 10 }, 1..3

it would have the value [1,10], [2,20], [3,30].

Likewise the values returned by loop iterations or by C have been
turned into such multislices.

But then these all seem like special casing something more general.
When I look at the functions we've defined so far, I see that

zip(1,2,3; 4,5,6)

produces

[1,4],[2,5],[3,6]

while

each(1,2,3; 4,5,6)

produces

1,4,2,5,3,6

and then I have to ask myself, "Why in this case do we have two separate
functions that essentially do the same thing?"  It's a design smell.

Which leads me to think that zip should really return

1,4; 2,5; 3,6

and let the context either flatten or not.  Basically, a list return
is a Capture, so a higher order function that calls a list operator
repeatedly is really returning a Capture of Captures, or a List of Captures,
and that's probably what a "multislice" is really.
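
The idea that one chunked result can serve both needs can be sketched in Python, whose built-in zip happens to be chunky by default. The helper smooth() is a hypothetical name; it flattens exactly one level, which is the view "each" gives in the examples above.

```python
from itertools import chain

def smooth(chunks):
    """Flatten one level: the 'each'-style view of the same data."""
    return list(chain.from_iterable(chunks))

chunky_result = [list(pair) for pair in zip([1, 2, 3], [4, 5, 6])]
smooth_result = smooth(chunky_result)

print(chunky_result)  # [[1, 4], [2, 5], [3, 6]]
print(smooth_result)  # [1, 4, 2, 5, 3, 6]
```

Going from chunky to smooth is a one-liner, while the reverse needs the chunk lengths, which is why a single function returning the structured form could replace the pair.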

However, currently the only way to get "chunky" behavior is to bind
the CoC to a multislice array:

@smooth  := zip(1,2,3; 4,5,6)
@@chunky := zip(1,2,3; 4,5,6)

If the default is "smooth", then we need better rvalue syntax for explicitly
turning a CoC into LoA, such that you could say

chunky zip(1,2,3; 4,5,6)

and get the list value

[1,4],[2,5],[3,6]

And indeed, it's easy to define the function in current notation, something
like:

sub chunky (*@@chunky) { return @chunky }

Basically, this is the inverse of [;], which turns LoA into a CoC.

[;] chunky mumble()

But "chunky" is clunky, and I'm wondering what syntactic relief we can
give ourselves here.  I think people would get tired of writing

.map({...}).chunky
chunky split
chunky for 1..100 { $_, $_*10 }

I think most other languages would probably just default to returning
a structured value and force the user to flatten explicitly.  That doesn't
seem much like the Perl Way though...

Distinguish via unary operators maybe?

|(1,2; 3,4) # smooth: 1,2,3,4
=(1,2; 3,4) # chunky: [1,2],[3,4]

Doesn't seem to work well with method forms though...

(1,2; 3,4)."|"
(1,2; 3,4)."="

We could have a special .| form if the default were .=, but .= is
taken already so .| can't be the default there.  Maybe something else
is better than = there.

Or maybe we need a naming convention that distinguishes smooth/chunky
variants of all the named functions/methods.  Then we have the
noncommittal form:

map

the smooth form

Xmap
mapX

and the chunky form

Ymap
mapY

for some value of X and Y.  But that approach starts to get a bit
obnoxious when you want to start adding other similar modifiers, like
whether map is allowed to be parallel or must be executed serially.
It also doesn't work well with operator forms like ¥ and such.

(That almost suggests it should be another metaoperator.  Let's all
shudder together now...but not rule out the possibility.)

Adverbs would be another approach.  Those could conceivably work on
operators, though a bit oddly insofar as applying to all previous
list-associative siblings:

for 1,2 ¥ 3,4 ¥ 5,6 :smooth -> $a, $b, $c {...}
for 1,2 ¥ 3,4 ¥ 5,6 :chunky -> [$a, $b, $c] {...}

And how do you force the return value of the "for" to be smooth or
chunky?  Course, a "chunky" listop would probably do for that.

I should also mention I did (briefly) consider the "null" reduce
operator:

[] zip(1,2;3,4)

to mean "slap [] around each element", but it runs into ambiguity with
the existing [] form indicating an empty list.

Or maybe a multislice array is a special type, so it's really a type cast:

@@(zip(1,2;3,4))

But then people will try to write @@zip and wonder why it doesn't work...

The possibilities are endless, and I don't doubt that you can think of
a few more...

Larry


Re: Smooth or Chunky?

2007-01-23 Thread Darren Duncan

At 6:22 PM -0800 1/23/07, Larry Wall wrote:

Recently I started redefining C<map> to return multislices such that

map { $_, $_ * 10 }, 1..3

seems to return 1,10,2,20,3,30 by default, but in a multidimensional
context:

@@multislice := map { $_, $_ * 10 }, 1..3

it would have the value [1,10], [2,20], [3,30].


Maybe I'm missing something important that would affect various new 
Perl 6 features, but if not then we could simply do this like it is 
done in Perl 5, which is to use the "smooth" approach all the time, 
and if people want chunks, they do something that would cause 
explicit chunking.


Eg, smooth:

  map { $_, $_ * 10 }, 1..3

Vs chunky:

  map { [$_, $_ * 10] }, 1..3
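
The same pair written as a Python sketch, for readers who want to run it. The variable names are invented for the example. The only difference between the two forms is whether the block wraps its values in a list.

```python
source = range(1, 4)  # corresponds to 1..3

# Smooth: each iteration contributes two values to one flat list.
smooth = [value for n in source for value in (n, n * 10)]

# Chunky: each iteration contributes one two-element list.
chunky = [[n, n * 10] for n in source]

print(smooth)  # [1, 10, 2, 20, 3, 30]
print(chunky)  # [[1, 10], [2, 20], [3, 30]]
```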

-- Darren Duncan


Re: Smooth or Chunky?

2007-01-23 Thread Larry Wall
On Tue, Jan 23, 2007 at 07:23:31PM -0800, Darren Duncan wrote:
: At 6:22 PM -0800 1/23/07, Larry Wall wrote:
: >Recently I started redefining C<map> to return multislices such that
: >
: >map { $_, $_ * 10 }, 1..3
: >
: >seems to return 1,10,2,20,3,30 by default, but in a multidimensional
: >context:
: >
: >@@multislice := map { $_, $_ * 10 }, 1..3
: >
: >it would have the value [1,10], [2,20], [3,30].
: 
: Maybe I'm missing something important that would affect various new 
: Perl 6 features, but if not then we could simply do this like it is 
: done in Perl 5, which is to use the "smooth" approach all the time, 
: and if people want chunks, they do something that would cause 
: explicit chunking.

Perl 5 has lots of places where it doesn't scale to a multi-programmer
team very well.  Arguably the assumption that you have control over
the entire source code is one of those places.

: Eg, smooth:
: 
:   map { $_, $_ * 10 }, 1..3
: 
: Vs chunky:
: 
:   map { [$_, $_ * 10] }, 1..3

You seem to have picked the one function where that approach is possible. :)

(Well, arguably C could use that approach too if you have control
of the code doing the gathering.  But what about things like C
and C?)

But the point is *not* to force it one way or the other--the point is
that many such functions would probably prefer not to commit one way or
the other, and they can't do that if they automatically throw away the
"dimensional" information.  Looking at it from the other end, you may
not have control of the code that is returning the multislice.  If the
code predecides for you, then either you have to explicitly strip away
the top level of [] all the time, or you can't even reproduce the []
structure because the run-length information is discarded.  That's why
I think the default should be to hide the top-level [] in what we call
a multislice, and give the user the choice of whether to unflatten it
or leave it (seemingly) flat to begin with.
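
A minimal Python sketch of that default, under the assumption that a multislice is simply a list of chunks whose grouping is hidden unless requested. The class Multislice and its chunky() method are hypothetical names for illustration, not a proposed API.

```python
from itertools import chain

class Multislice:
    """Hypothetical container: remembers chunk boundaries, looks flat by default."""

    def __init__(self, chunks):
        self._chunks = [list(chunk) for chunk in chunks]

    def __iter__(self):
        # Default view is smooth: the top-level grouping is hidden.
        return chain.from_iterable(self._chunks)

    def chunky(self):
        # The grouping is still available to callers who ask for it.
        return [list(chunk) for chunk in self._chunks]

# Chunks of uneven length, as a split with captures might produce.
result = Multislice([["a"], ["b", "c"], ["d"]])

flat = list(result)
print(flat)             # ['a', 'b', 'c', 'd']
print(result.chunky())  # [['a'], ['b', 'c'], ['d']]
```

From the flat list alone the chunk lengths 1, 2, 1 cannot be recovered, which is the run-length information that gets lost when the producer flattens on the caller's behalf.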

Larry