Re: Numeric Semantics
HaloO,

Darren Duncan wrote:
> Up front, I will say that, all this stuff about 1 vs 1.0 won't matter
> at all if the Int type is an actual subset of the Num type (but whose
> implementation is system-recognized and optimized), meaning that Int
> and Num are not disjoint, as "most folks" usually expect to be the
> case, such that, eg, 1 === 1.0 returns true.

I agree with that except for the last statement. I think that 1 === 1.0
should be False because the types involved are different. The same
applies to 1.0 === Complex(1.0,0.0), which should also be False. In
both cases we should have numeric equality, i.e. 1 == 1.0 and
1.0 == Complex(1.0,0.0) are True.

And of course we have the subtyping chain Int <: Num <: Complex. The
Gaussian integers are a subtype of Complex and a supertype of Int but
not of Num. So in the end we have the type lattice

       Complex
       /     \
    Num       Gaussian
       \     /
        Int

It's interesting how this Gaussian type might be fitted in after the
other three. The link from Int to Gaussian needs a supertyping
construct, something like 'role Gaussian does Complex superdoes Int'.
So consider this an addendum to the supertyping thread.

Regards, TSa.
--
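[Editorial sketch of the semantics being argued for above, in would-be
Perl 6; this is speculative at this point in the design, and the
Complex(1.0,0.0) constructor spelling is taken from the message itself:

    1 == 1.0;                    # True  -- numeric equality, both sides
                                 #          compared as numbers
    1 === 1.0;                   # False -- value identity also takes the
                                 #          type into account
    1.0 == Complex(1.0, 0.0);    # True
    1.0 === Complex(1.0, 0.0);   # False
]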
Re: Numeric Semantics
On Mon, Jan 22, 2007 at 08:47:22PM -0800, Darren Duncan wrote:
: At 5:56 PM -0800 1/22/07, Larry Wall wrote:
: >Whether a Num that happens to be an integer prints out with .0 is a
: >separate issue. My bias is that a Num pretend to be an integer when
: >it can. I think most folks (including mathematicians) think that
: >the integer 1 and the distance from 0 to 1 on the real number line
: >happen to be the same number most of the time. And Perl is not about
: >forcing a type-theoretical viewpoint on the user...
:
: Up front, I will say that, all this stuff about 1 vs 1.0 won't matter
: at all if the Int type is an actual subset of the Num type (but whose
: implementation is system-recognized and optimized), meaning that Int
: and Num are not disjoint, as "most folks" usually expect to be the
: case, such that, eg, 1 === 1.0 returns true.
:
: Of course if we did that, then dealing with Int will have a number of
: the same implementation issues as with dealing with "subset"-declared
: types in general.
:
: I don't know yet whether it was decided one way or the other.

For various practical reasons I don't think we can treat Int as a
subset of Num, especially if Num is representing any of several
approximating types that may or may not have the "headroom" for
arbitrary integer math, or that lose low bits in the process of gaining
high bits. It would be possible to make Int a subset of Rat (assuming
Rat is implemented as Int/Int), but I don't think Rats are very
practical either for most applications. It is unlikely that the
universe uses Rats to calculate QM interactions.

: Whereas, if Int is not an actual subset of Num, and so their values
: are disjoint, then ...

Then 1.0 == 1 !=== 1.0. I'm fine with that.

: FYI, my comment about a stringified Num having a .0 for
: round-tripping was meant to concern Perl code generation in
: particular, such as what .perl() does, but it was brought up in a
: more generic way, to stringification in general, for an attempt at
: some consistency.

It seems like an unnecessary consistency to me. The .perl method is
intended to provide a human-readable, round-trippable form of
serialization, and as a form of serialization it is required to capture
all the information so that the original data structure can be
recreated exactly. This policy is something the type should have
little say in. At most it should have a say in *which* way to
canonicalize all the data.

The purpose of stringification, on the other hand, is whatever the type
wants it to be, and it is specifically allowed to lose information, as
long as the remaining string is suggestive to a human reader how to
reconstruct the information in question (or that the information is
none of their business). A document object could decide to stringify
to a URI, for instance. An imported Perl 5 code reference could decide
to stringify to CODE(0xdeadbeef). :)

: Whether or not that is an issue depends really on whether we consider
: the literal 1.0 in Perl code to be an Int or a Num. If, when we
: parse Perl, we decide that 1.0 is a Num rather than an Int such as 1
: would be, then a .perl() invoked on a Num of value 1.0 should return
: 1.0 also, so that executing that code once again produces the Num we
: started with rather than an Int.

I think the intent of 1.0 in Perl code is clearly more Numish than
Intish, so I'm with you there. So I'm fine with Num(1).perl coming out
simply as "1.0" without further type annotation. But ~1.0 is allowed
to say "1" if that's how Num likes to stringify.
: On the other hand, if .perl() produces some long-hand like "Int(1)"
: or "Num(1)", then it won't matter whether it is "Num(1)" or
: "Num(1.0)" etc.

Well, mostly, unless we consider that Num(1.0) might have to wait till
run time to know what conversion to Num actually means, if Num is
sufficiently delegational. But I think the compiler can probably
require a tighter definition of basic types for optimization purposes,
at least by default.

Larry
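[Editorial sketch of the .perl vs. stringification distinction drawn
above; the exact output assumes Num chooses to stringify without the
trailing .0, which per the message is the type's own call:

    my $n = 1.0;      # a Num literal under the scheme being discussed
    say $n.perl;      # 1.0  -- .perl must round-trip, so the .0 stays
    say ~$n;          # 1    -- stringification may drop it if Num prefers
]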
Re: Numeric Semantics
On 1/22/07, Doug McNutt <[EMAIL PROTECTED]> wrote:
> At 00:32 + 1/23/07, Smylers wrote:
> > % perl -wle 'print 99 / 2'
> > 49.5
>
> I would expect the line to return 49 because you surely meant integer
> division. Perl 5 just doesn't have a user-available type integer.

That doesn't mean that I surely meant integer division. Being used to
how Perl 5 (and many other languages) do things, I would expect
floating point division (though if it's not floating point beneath the
covers, that's fine with me as long as I can still get 49.5 out).

> % perl -wle 'print 99.0 / 2.0'
>
> OR
>
> % perl -wle 'print 99.0 / 2'
>
> would return 49.5 because a coercion was required and float is the
> default for such things. But that may be the mathematician in me.

I don't see why the mathematician in you doesn't expect "regular"
mathematical behavior from Perl. Perhaps it's that you've been using
computers too long and have become used to the limitations of digital
media.

-Scott
--
Jonathan Scott Duff
[EMAIL PROTECTED]
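[Editorial note: Perl 5 will happily give integer division if you ask
for it explicitly; both of these are ordinary Perl 5, nothing
speculative:

    % perl -wle 'print int(99 / 2)'
    49
    % perl -wle 'use integer; print 99 / 2'
    49
]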
Fwd: Numeric Semantics
I accidentally sent this just to Darren ...

-Scott

---------- Forwarded message ----------
From: Jonathan Scott Duff <[EMAIL PROTECTED]>
Date: Jan 22, 2007 6:23 PM
Subject: Re: Numeric Semantics
To: Darren Duncan <[EMAIL PROTECTED]>

On 1/22/07, Darren Duncan <[EMAIL PROTECTED]> wrote:
> I think that 1 should be an Int and 1.0 should be a Num. That makes
> things very predictable for users, as well as easy to parse ... the
> visible radix point indicates that you are usually measuring to
> fractions of an integer, even if you aren't in that exact case. Also
> importantly, it makes it easy for users to choose what they want.
>
> For round-trip consistency, a generic non-formatted num-to-char-string
> operation should include a .0 as appropriate if it is converting from
> a Num, whereas when converting from an Int it would not.
>
> Furthermore, my preference is for Int and Num to be completely
> disjoint types, meaning that "1 === 1.0" will return False. However,
> every Int value can be mapped to a Num value, and so "1 == 1.0" will
> return True as expected, because == casts both sides as Num.

While I'm in general agreement with everything you've said, it makes me
a tad nervous to hinge so much on the difference of one character. Can
you imagine trying to track down the bug where

    if ($alpha === $beta) { ... }

really should have been

    if ($alpha == $beta) { ... }

Anyway, it's not like this problem wasn't already there, it's just that
your email made it stand out to me.

-Scott
--
Jonathan Scott Duff
[EMAIL PROTECTED]
Re: Fwd: Numeric Semantics
> While I'm in general agreement with everything you've said it makes me a
> tad nervous to hinge so much on the difference of one character. Can you
> imagine trying to track down the bug where
>
> if ($alpha === $beta) { ... }
>
> really should have been
>
> if ($alpha == $beta) { ... }
>
> Anyway, it's not like this problem wasn't already there, it's just that
> your email made it stand out to me.

I'm not adding support to either side of the issue. I just wanted to
point out that with Perl 5 and other current languages I occasionally
have to search for that bug right now. Except it is spelled a little
differently:

    if ($alpha = $beta) { ... }

when I really meant:

    if ($alpha == $beta) { ... }

It is rare, though. I think the == vs === bug will be rare also.

Paul
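[Editorial note: Perl 5's warnings already catch the most blatant form
of this typo, but only when the right-hand side is a constant;
assigning one variable to another, as in the example above, slips
through silently:

    % perl -we 'my $x; if ($x = 5) { print "oops\n" }'
    Found = in conditional, should be == at -e line 1.
    oops
]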
Re: Fwd: Numeric Semantics
On 1/23/07, Paul Seamons <[EMAIL PROTECTED]> wrote:
> > While I'm in general agreement with everything you've said it makes me
> > a tad nervous to hinge so much on the difference of one character. Can
> > you imagine trying to track down the bug where
> >
> > if ($alpha === $beta) { ... }
> >
> > really should have been
> >
> > if ($alpha == $beta) { ... }
> >
> > Anyway, it's not like this problem wasn't already there, it's just
> > that your email made it stand out to me.
>
> I'm not adding support to either side of the issue. I just wanted to
> point out that with Perl 5 and other current languages I occasionally
> have to search for that bug right now. Except it is spelled a little
> differently:
>
>     if ($alpha = $beta) { ... }
>
> when I really meant:
>
>     if ($alpha == $beta) { ... }
>
> It is rare, though. I think the == vs === bug will be rare also.

Perhaps. To me, finding the = vs. == bug is a bit easier due to the
large conceptual difference between the operators (or maybe I'm just
used to looking for it after 20+ years of coding in languages that have
= and ==). But for == vs. ===, they are both comparators, and that
tends to muddy the waters a bit when it comes to your brain helping you
find the bug. (at least it does for me)

-Scott
--
Jonathan Scott Duff
[EMAIL PROTECTED]
Smooth or Chunky?
I've been struggling lately with a missing generalization, and I'm not
sure how it's going to play out, so I thought I'd ask for advice, or at
least think out loud a bit.

Perl has always had functions and listops that take a flat list and do
something with each element. Perl has also had various functions that
return flat lists, and these naturally feed into the listops. For
instance, the map function has always produced a flat list. A split
with captures flattens the captures along with the split values.

Recently I started redefining C<map> to return multislices such that

    map { $_, $_ * 10 }, 1..3

seems to return 1,10,2,20,3,30 by default, but in a multidimensional
context:

    @@multislice := map { $_, $_ * 10 }, 1..3

it would have the value [1,10], [2,20], [3,30]. Likewise the values
returned by loop iterations or by C<gather> have been turned into such
multislices.

But then these all seem like special casing something more general.
When I look at the functions we've defined so far, I see that

    zip(1,2,3; 4,5,6)

produces

    [1,4],[2,5],[3,6]

while

    each(1,2,3; 4,5,6)

produces

    1,4,2,5,3,6

and then I have to ask myself, "Why in this case do we have two
separate functions that essentially do the same thing?" It's a design
smell. Which leads me to think that zip should really return

    1,4; 2,5; 3,6

and let the context either flatten or not. Basically, a list return is
a Capture, so a higher order function that calls a list operator
repeatedly is really returning a Capture of Captures, or a List of
Captures, and that's probably what a "multislice" is really.

However, currently the only way to get "chunky" behavior is to bind the
CoC to a multislice array:

    @smooth := zip(1,2,3; 4,5,6)
    @@chunky := zip(1,2,3; 4,5,6)

If the default is "smooth", then we need better rvalue syntax for
explicitly turning a CoC into LoA, such that you could say

    chunky zip(1,2,3; 4,5,6)

and get the list value

    [1,4],[2,5],[3,6]

And indeed, it's easy to define the function in current notation,
something like:

    sub chunky (*@@chunky) { return @chunky }

Basically, this is the inverse of [;], which turns LoA into a CoC.

    [;] chunky mumble()

But "chunky" is clunky, and I'm wondering what syntactic relief we can
give ourselves here. I think people would get tired of writing

    .map({...}).chunky
    chunky split
    chunky for 1..100 { $_, $_*10 }

I think most other languages would probably just default to returning a
structured value and force the user to flatten explicitly. That
doesn't seem much like the Perl Way though...

Distinguish via unary operators maybe?

    |(1,2; 3,4)    # smooth: 1,2,3,4
    =(1,2; 3,4)    # chunky: [1,2],[3,4]

Doesn't seem to work well with method forms though...

    (1,2; 3,4)."|"
    (1,2; 3,4)."="

We could have a special .| form if the default were .=, but .= is taken
already so .| can't be the default there. Maybe something else is
better than = there.

Or maybe we need a naming convention that distinguishes smooth/chunky
variants of all the named functions/methods. Then we have the
non-committal form:

    map

the smooth form

    Xmap
    mapX

and the chunky form

    Ymap
    mapY

for some value of X and Y. But that approach starts to get a bit
obnoxious when you want to start adding other similar modifiers, like
whether map is allowed to be parallel or must be executed serially. It
also doesn't work well with operator forms like ¥ and such. (That
almost suggests it should be another metaoperator. Let's all shudder
together now...but not rule out the possibility.)

Adverbs would be another approach.
Those could conceivably work on operators, though a bit oddly insofar
as applying to all previous list-associative siblings:

    for 1,2 ¥ 3,4 ¥ 5,6 :smooth -> $a, $b, $c {...}
    for 1,2 ¥ 3,4 ¥ 5,6 :chunky -> [$a, $b, $c] {...}

And how do you force the return value of the "for" to be smooth or
chunky? Course, a "chunky" listop would probably do for that.

I should also mention I did (briefly) consider the "null" reduce
operator:

    [] zip(1,2;3,4)

to mean "slap [] around each element", but it runs into ambiguity with
the existing [] form indicating an empty list.

Or maybe a multislice array is a special type, so it's really a type
cast:

    @@(zip(1,2;3,4))

But then people will try to write @@zip and wonder why it doesn't
work...

The possibilities are endless, and I don't doubt that you can think of
a few more...

Larry
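[Editorial sketch: for anyone wanting to poke at the smooth/chunky
distinction with something runnable today, here is a plain Perl 5
version; the names smooth_zip and chunky_zip are made up purely for
illustration and are not proposed Perl 6 spellings:

    use strict;
    use warnings;

    # "chunky": one array reference per position, so the grouping survives
    sub chunky_zip {
        my @lists = @_;                  # each argument is an array reference
        my $len   = @{ $lists[0] };      # assume equal lengths for simplicity
        return map { my $i = $_; [ map { $_->[$i] } @lists ] } 0 .. $len - 1;
    }

    # "smooth": the same pairs, but with the top-level brackets flattened away
    sub smooth_zip { return map { @$_ } chunky_zip(@_) }

    my @chunky = chunky_zip( [1,2,3], [4,5,6] );   # ([1,4], [2,5], [3,6])
    my @smooth = smooth_zip( [1,2,3], [4,5,6] );   # (1, 4, 2, 5, 3, 6)
]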
Re: Smooth or Chunky?
At 6:22 PM -0800 1/23/07, Larry Wall wrote:
>Recently I started redefining C<map> to return multislices such that
>
>map { $_, $_ * 10 }, 1..3
>
>seems to return 1,10,2,20,3,30 by default, but in a multidimensional
>context:
>
>@@multislice := map { $_, $_ * 10 }, 1..3
>
>it would have the value [1,10], [2,20], [3,30].

Maybe I'm missing something important that would affect various new
Perl 6 features, but if not then we could simply do this like it is
done in Perl 5, which is to use the "smooth" approach all the time, and
if people want chunks, they do something that would cause explicit
chunking.

Eg, smooth:

    map { $_, $_ * 10 }, 1..3

Vs chunky:

    map { [$_, $_ * 10] }, 1..3

-- Darren Duncan
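[Editorial note: spelled as ordinary Perl 5 (no comma after the block),
with the results written out:

    my @smooth = map { $_, $_ * 10 } 1..3;      # (1, 10, 2, 20, 3, 30)
    my @chunky = map { [ $_, $_ * 10 ] } 1..3;  # ([1,10], [2,20], [3,30])
]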
Re: Smooth or Chunky?
On Tue, Jan 23, 2007 at 07:23:31PM -0800, Darren Duncan wrote:
: At 6:22 PM -0800 1/23/07, Larry Wall wrote:
: >Recently I started redefining C<map> to return multislices such that
: >
: >map { $_, $_ * 10 }, 1..3
: >
: >seems to return 1,10,2,20,3,30 by default, but in a multidimensional
: >context:
: >
: >@@multislice := map { $_, $_ * 10 }, 1..3
: >
: >it would have the value [1,10], [2,20], [3,30].
:
: Maybe I'm missing something important that would affect various new
: Perl 6 features, but if not then we could simply do this like it is
: done in Perl 5, which is to use the "smooth" approach all the time,
: and if people want chunks, they do something that would cause
: explicit chunking.

Perl 5 has lots of places where it doesn't scale to a multi-programmer
team very well. Arguably the assumption that you have control over the
entire source code is one of those places.

: Eg, smooth:
:
:     map { $_, $_ * 10 }, 1..3
:
: Vs chunky:
:
:     map { [$_, $_ * 10] }, 1..3

You seem to have picked the one function where that approach is
possible. :) (Well, arguably C<gather> could use that approach too if
you have control of the code doing the gathering. But what about
things like C<zip> and C<each>?)

But the point is *not* to force it one way or the other--the point is
that many such functions would probably prefer not to commit one way or
the other, and they can't do that if they automatically throw away the
"dimensional" information.

Looking at it from the other end, you may not have control of the code
that is returning the multislice. If the code predecides for you, then
either you have to explicitly strip away the top level of [] all the
time, or you can't even reproduce the [] structure because the
run-length information is discarded. That's why I think the default
should be to hide the top-level [] in what we call a multislice, and
give the user the choice of whether to unflatten it or leave it
(seemingly) flat to begin with.

Larry
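[Editorial illustration of the run-length point above, in Perl 5 terms
(made-up data, purely to show the asymmetry):

    my @chunky = ( [1], [2, 5], [3, 6, 9] );   # chunks of differing lengths
    my @smooth = map { @$_ } @chunky;          # (1, 2, 5, 3, 6, 9)
    # Going from @chunky to @smooth is trivial; going back is impossible,
    # because the chunk boundaries (the run lengths) have been discarded.
]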