Valid hash keys?
Just a quick question: Are Hash keys still Strings, or can they be arbitrary values? If the latter, can Int 2, Num 2.0 and Str "2" point to different values?

Thanks,
/Autrijus/
Re: Valid hash keys?
Autrijus Tang writes:
> Just a quick question: Are Hash keys still Strings, or can they be
> arbitrary values?

They can be declared to be arbitrary:

    my %hash is shape(Any);

> If the latter, can Int 2, Num 2.0 and Str "2" point to different
> values?

That's an interesting question. Some people would want them to, and some people would definitely not want them to. I think the general consensus is that people would not want them to be different, since in the rest of perl, 2 and "2" are the same.

The object model that I'm working on actually identifies 2 and "2" as the same object, indistinguishable in every respect. But that hasn't been accepted (or even proposed)... yet.

Luke
Re: Valid hash keys?
Luke Palmer writes:
> Autrijus Tang writes:
> > Just a quick question: Are Hash keys still Strings, or can they be
> > arbitrary values?
>
> They can be declared to be arbitrary:
>
>     my %hash is shape(Any);
>
> > If the latter, can Int 2, Num 2.0 and Str "2" point to different
> > values?
>
> That's an interesting question. Some people would want them to, and
> some people would definitely not want them to. I think the general
> consensus is that people would not want them to be different, since in
> the rest of perl, 2 and "2" are the same.

I forgot an important concretity. Hashes should compare based on the generic "equal" operator, which knows how to compare apples and apples, and oranges and oranges, and occasionally a red orange to an apple.

That is:

    3 equal 3              ==> true
    3 equal { codeblock }  ==> false
    3 equal "3"            ==> true

Luke
Re: S06: Pairs as lvalues
Hi,

Luke Palmer writes:
> Ingo Blechschmidt writes:
> > my $x = (a => 42); # $x is a Pair.
> > $x = 13;           # Is $x now the Pair (a => 13) or
> >                    # the Int 13?
>
> You see, in your example, the pair is not "functioning as
> an lvalue". The variable is the thing that is the lvalue,
> not the pair.

Ah! That clears it up! Thanks! :)

--Ingo

-- 
Linux, the choice of a GNU generation on a dual AMD-Athlon! | When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl!
Re: Valid hash keys?
On Sun, 27 Feb 2005 02:20:59 -0700, Luke Palmer wrote:
> Luke Palmer writes:
> > Autrijus Tang writes:
> > > Just a quick question: Are Hash keys still Strings, or can they be
> > > arbitrary values?
> >
> > They can be declared to be arbitrary:
> >
> >     my %hash is shape(Any);
> >
> > > If the latter, can Int 2, Num 2.0 and Str "2" point to different
> > > values?
> >
> > That's an interesting question. Some people would want them to, and
> > some people would definitely not want them to. I think the general
> > consensus is that people would not want them to be different, since in
> > the rest of perl, 2 and "2" are the same.
>
> I forgot an important concretity. Hashes should compare based on the
> generic "equal" operator, which knows how to compare apples and apples,
> and oranges and oranges, and occasionally a red orange to an apple.
>
> That is:
>
>     3 equal 3              ==> true
>     3 equal { codeblock }  ==> false
>     3 equal "3"            ==> true
>

I would have assumed a hash whose shape was defined as C<%hash is shape(Int)> to perform the hashing function directly on the (nominally 32-bit) binary representation of Int 2. Likewise, c, would perform the hashing on the binary rep of the (nom. 64-bit) Double value. And C, on the address of the key passed?

By extension, a C<%hash is shape( Any )> would hash the binary representation of whatever (type of) key it was given, which would make keys of 2, 2.0, '2', '2.0', (Int2)2 etc. all map to different keys.

If C<%hash is shape(Any)> maps all the above representations of 2 to the same value, then C<%hash is shape(Any)> becomes a synonym for C<%hash is shape(Scalar)>. (Is that the same as C?)

If my assumption is correct, then that would imply that each type has or inherits a .binary or .raw method to allow access to the internal representation?

> Luke
Dynamically Scoped Dynamic Scopes
I was thinking about scopes (for a problem unrelated to Perl 6), and I realised that the scoping concepts in P6 are somewhat limited. We have

    my $var     # lexical scope
    temp $var   # lexically-scoped dynamic scope

C<temp> is lexically scoped in that its effect goes away at the closing curly of the lexical scope that contains it.

A concept that we seem to be missing is the possibility of dynamically scoped dynamic scopes. I hesitate to come up with a syntax, but I can think of a couple of examples where it might be used.

Caveat: if you believe that globals are fundamentally evil, and that everything should be objects, then this is unnecessary. But for other people ...

Example 1: Create a dynamic scope, and then spawn N threads. Each thread has its own lexical scope. After each thread has done some work, each reaches a barrier. Once all the threads have reached that barrier, we terminate the dynamic scope that we previously introduced: the threads then continue in their lexical scopes, but with a different dynamic scope.

Example 2: a state machine. Imagine binding a number of variables into a "scope space", in which we then instantiate multiple scopes. We can then create a state machine in which we change the currently visible values of the scoped variables by changing the "current scope" of the "scope space".

One could imagine implementing this by creating the scopes as instances of an object, and then binding the object's attributes onto the variables (i.e. "our $foo := $obj.bar"). The "scope space" object would then be the set of global variables to be bound, and the "scope" object would be the set of values to bind. However, when we want to release the global variables from our scope, we need a way to unbind the variables and restore them to the bindings that existed before they were bound to our scope space. I'm not sure how to do that, because we don't have any builtin concept of dynamically scoped scopes.

Dave.
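One way to read the second example, sketched in Python because the Perl 6 syntax was still unsettled: a "scope space" keeps a stack of saved bindings, so a scope can be entered in one place and left in another, independent of any lexical block. The names `ScopeSpace`, `enter` and `leave` are made up for this illustration and appear nowhere in the proposal; threads and barriers from the first example are not covered.

```python
_MISSING = object()  # marks a variable that had no binding before enter()


class ScopeSpace:
    """Hypothetical 'scope space': a set of shared variables plus a stack
    of the bindings that each entered scope replaced."""

    def __init__(self, variables):
        self.variables = variables  # dict standing in for the globals
        self._saved = []            # previous bindings, innermost last

    def enter(self, scope):
        """Bind the variables named in `scope`, remembering the old bindings."""
        saved = {name: self.variables.get(name, _MISSING) for name in scope}
        self._saved.append(saved)
        self.variables.update(scope)

    def leave(self):
        """Undo the most recent enter(), wherever in the program it happened."""
        saved = self._saved.pop()
        for name, old in saved.items():
            if old is _MISSING:
                self.variables.pop(name, None)
            else:
                self.variables[name] = old
```

Because `leave` is an ordinary method call rather than the end of a block, the code that terminates the scope need not be lexically related to the code that started it, which seems to be the property the message is asking for.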
Re: Valid hash keys?
On Sun, Feb 27, 2005 at 02:20:59AM -0700, Luke Palmer wrote:
> I forgot an important concretity. Hashes should compare based on the
> generic "equal" operator, which knows how to compare apples and apples,
> and oranges and oranges, and occasionally a red orange to an apple.

Um. Hashes don't really compare, though, do they? Maybe you just mean a notional equals operator, which isn't really used; but it seems to me that what hashes actually implement is more of a 'canonicalize' operator.

Actually, it would be useful sometimes to be able to give a hash an explicit canonicalizer:

    my %msdos_files is canonicalized_by lc;

    my %fractions is canonicalized_by gcd;

Alex
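The canonicalizer idea can be sketched in Python, used here only for illustration since `canonicalized_by` is a suggestion in this message and not an existing trait. `CanonicalDict` and `reduce_fraction` are hypothetical names; fractions are assumed to be `(numerator, denominator)` pairs with a nonzero denominator.

```python
from math import gcd


class CanonicalDict(dict):
    """dict that passes every key through a canonicalizer before using it."""

    def __init__(self, canonicalize):
        super().__init__()
        self._canon = canonicalize

    def __setitem__(self, key, value):
        super().__setitem__(self._canon(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self._canon(key))

    def __contains__(self, key):
        return super().__contains__(self._canon(key))


def reduce_fraction(pair):
    """Canonical form of a fraction: divide both parts by their gcd."""
    numerator, denominator = pair
    divisor = gcd(numerator, denominator)
    return (numerator // divisor, denominator // divisor)


msdos_files = CanonicalDict(str.lower)
msdos_files["README.TXT"] = "found"

fractions = CanonicalDict(reduce_fraction)
fractions[(2, 4)] = "one half"
```

With this, `msdos_files["readme.txt"]` and `fractions[(3, 6)]` find the entries stored above. Only subscripting and `in` are covered; methods such as `get` or `update` would need the same treatment in a fuller version.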
Re: Valid hash keys?
Nigel Sandever writes:
> On Sun, 27 Feb 2005 02:20:59 -0700, Luke Palmer wrote:
> > I forgot an important concretity. Hashes should compare based on the
> > generic "equal" operator, which knows how to compare apples and apples,
> > and oranges and oranges, and occasionally a red orange to an apple.
> >
> > That is:
> >
> >     3 equal 3              ==> true
> >     3 equal { codeblock }  ==> false
> >     3 equal "3"            ==> true
> >
>
> I would have assumed a hash whose shape was defined as
> C<%hash is shape(Int)> to perform the hashing function directly on the
> (nominally 32-bit) binary representation of Int 2.

I wasn't even thinking about implementation. Sometimes it's good to let implementation drive language, but I don't think it's appropriate here.

When we're talking about hashes of everything, there are a couple of routes we can take. We can go Java's way and define a .hash method on every object. We can go C++'s way and not hash at all, but instead define a transitive ordering on every object. We can go Perl's way and find a string representation of the object and map the problem back to string hashing, which we can do well.

But the biggest problem is that if the user overloads 'equal' on two objects, the hash should consider them equal. We could require that to overload 'equal', you also have to overload .hash so that you've given some thought to the thing. The worry I have is that people will do:

    method hash() { 0 }

But I suppose that's okay. That just punts the work off to 'equal', albeit in linear time.

That may be the right way to go. Use a Javaesque .hash method with a sane default (an introspecting default, perhaps), and use a sane equivalent default for 'equal'.

As far as getting 2, 2.0, and "2" to hash to the same object, well, we know they're 'equal', so we just need to know how to hash them the same way. In fact, I don't believe 2.0 can be represented as a Num. At least in Perl 5, it translates itself back to an int. So we can just stringify and hash for the scalar types.

Luke
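The `method hash() { 0 }` worry can be illustrated in Python, whose `__hash__`/`__eq__` pair is roughly the Javaesque design described here. This is only an analogy, not the proposed Perl 6 mechanism, and `ConstantHashKey` is a made-up class. It shows that a constant hash still gives correct lookups, with every colliding entry checked through the equality method.

```python
class ConstantHashKey:
    """Key whose hash is always 0, like `method hash() { 0 }` in the thread."""

    eq_calls = 0  # counts equality checks so the cost is visible

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 0  # every key lands in the same bucket

    def __eq__(self, other):
        ConstantHashKey.eq_calls += 1
        return isinstance(other, ConstantHashKey) and self.value == other.value


# Fifty keys that all collide; the table still stores them separately.
table = {ConstantHashKey(i): i for i in range(50)}

# Look up the last key inserted and count how many comparisons that takes.
ConstantHashKey.eq_calls = 0
found = table[ConstantHashKey(49)]
comparisons = ConstantHashKey.eq_calls
```

The lookup returns the right value, but `comparisons` ends up well above one because the table has to fall back on equality for each colliding entry it probes, which is the linear-time behaviour mentioned above.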
Re: How are types related to classes and roles?
HaloO,

Larry Wall wrote:
On Fri, Feb 25, 2005 at 12:45:45AM +0800, Autrijus Tang wrote:
: So, I think late binding is a sensible (and practical) default, but
: do you think it may be a good thing to have a type inference mode that
: assign static contexts to expressions, and prebind as much as possible?
: It may be possible to enable via a pragma or a compiler switch...

Well, that is the optimizer everybody keeps talking about. And the more type input it has, the better it can pre-select multi methods. A very interesting feature for later versions of Perl 6 could even allow complete program optimization, where code passages in modules that can't be pre-selected on local information alone could be optimized when all type information is available.

It's certainly something to explore. If I recall, I took some kind of compromise position in the Apocalypse where I said we probably wouldn't treat return-type-MMD with the same authority as parameter MMD, but we might be able to use return type as a tie-breaker on otherwise equivalent routines.

I guess "equivalent routines" shall mean "same specificity of invocant type"? At that point, choosing the multi with the lower return type seems tricky and might lead to surprises. Note the following little diagram, where the two operators

    <:  subtype
    :>  supertype

---which BTW would make nice standard operators :) --- are used to show the function subtyping and method selection in one picture:

    multi sub f2 ( Inv2 : Arg2 ) returns Ret2  <:  multi sub Ret1 f1 ( Inv1 : Arg1 )

        Inv2 <: Inv1    covariant (method selection)
        Ret2 <: Ret1    covariant (subtyping of Code objects)
        Arg2 :> Arg1    contravariant (subtyping of Code objects)

For method selection the short names have to be the same, of course. The Arg types are handled as type errors at runtime or compile time, right? No tertiary tie breaking?
:) Basically, instead of writing a single routine with a big switch statement in it, you'd be able to write multiple routines with the same parameters but different return types as a form of syntactic sugar over the switch statement. It's not clear if such an approach would buy us anything in terms of type inferencing, except insofar as sub declarations are easier to mine the return types out of than an embedded switch statement. Maybe that buys us a lot, though, just as having class metadata available at compile time is a big improvement over Perl 5's @ISA.

Anyway, I don't profess to have thought deeply about type inferencing. But I do know that I don't want to turn Perl 6 into ML just yet...

Hasn't type inferencing become easy with the full power of junctions? I imagine the compiler annotating the AST with types from the leaves up. At multi calls which can't be pre-selected at compile time, a one() junction of all possibly matching multis' return types is assigned.

BTW, how many types does Int|Str|Num produce? All these: Int|Str|Num, Int|Str, Int|Num, Str|Num, Int, Str, Num? Or just the last three? What is then the role of Int^Str^Num?

Is the syntax

    type Criterion ::= KeyExtractor | Comparator | Pair(KeyExtractor, Comparator) ;

used in the sort ruling still current? There the RHS looks more like a grammar rule alternation which is checked in turn than a real any() junction.

Regards,
-- 
TSa (Thomas Sandlaß)
Re: Valid hash keys?
On Sun, 27 Feb 2005 15:36:42 -0700, Luke Palmer wrote:
> Nigel Sandever writes:
> > On Sun, 27 Feb 2005 02:20:59 -0700, Luke Palmer wrote:
> > > I forgot an important concretity. Hashes should compare based on the
> > > generic "equal" operator, which knows how to compare apples and apples,
> > > and oranges and oranges, and occasionally a red orange to an apple.
> > >
> > > That is:
> > >
> > >     3 equal 3              ==> true
> > >     3 equal { codeblock }  ==> false
> > >     3 equal "3"            ==> true
> > >
> >
> > I would have assumed a hash whose shape was defined as
> > C<%hash is shape(Int)> to perform the hashing function directly on the
> > (nominally 32-bit) binary representation of Int 2.
>
> I wasn't even thinking about implementation. Sometimes it's good to let
> implementation drive language, but I don't think it's appropriate here.
>
> When we're talking about hashes of everything, there are a couple of
> routes we can take. We can go Java's way and define a .hash method on
> every object. We can go C++'s way and not hash at all, but instead
> define a transitive ordering on every object. We can go Perl's way and
> find a string representation of the object and map the problem back to
> string hashing, which we can do well.
>
> But the biggest problem is that if the user overloads 'equal' on two
> objects, the hash should consider them equal. We could require that to
> overload 'equal', you also have to overload .hash so that you've given
> some thought to the thing. The worry I have is that people will do:
>
>     method hash() { 0 }
>
> But I suppose that's okay. That just punts the work off to 'equal',
> albeit in linear time.
>
> That may be the right way to go. Use a Javaesque .hash method with a
> sane default (an introspecting default, perhaps), and use a sane
> equivalent default for 'equal'.
>
> As far as getting 2, 2.0, and "2" to hash to the same object, well, we
> know they're 'equal', so we just need to know how to hash them the same
> way. In fact, I don't believe 2.0 can be represented as a Num. At
> least in Perl 5, it translates itself back to an int. So we can just
> stringify and hash for the scalar types.
>

My thought is that if c uses the stringified values of the keys, then it is no different to C.

I think it would be useful for shape(Any) to be different from an ordinary hash, hashing the binary representation of the key, so that

    (Int)2, (Num)2, (String)2, (uint)2, (uint4)2 etc.

would be a useful way of collating things according to their "type" rather than their value?

> Luke

njs.
Re: Valid hash keys?
Alex Burr wrote:
> [..] Actually, it would be useful sometimes
> to be able to give a hash an explicit canonicalizer:
>
> my %msdos_files is canonicalized_by lc;
>
> my %fractions is canonicalized_by gcd;

Shouldn't that be handled by container subclasses of Hash? Like PersitentScalar or SparseArray?

Regards,
-- 
TSa (Thomas Sandlaß)
Re: Valid hash keys?
Nigel Sandever writes:
> On Sun, 27 Feb 2005 15:36:42 -0700, Luke Palmer wrote:
> > As far as getting 2, 2.0, and "2" to hash to the same object, well, we
> > know they're 'equal', so we just need to know how to hash them the same
> > way. In fact, I don't believe 2.0 can be represented as a Num. At
> > least in Perl 5, it translates itself back to an int. So we can just
> > stringify and hash for the scalar types.
>
> My thought is that if c uses the stringified values of
> the keys, then it is no different to C,

Indeed, but I meant just for our non-reference scalar types, such as Num, Int, and Str.

> I think it would be useful for shape(Any) to be different from an ordinary
> hash, hashing the binary representation of the key, so that
>
>     (Int)2, (Num)2, (String)2, (uint)2, (uint4)2 etc.
>
> would be a useful way of collating things according to their "type"
> rather than their value?

That may indeed be a useful kind of map to have, but it's hardly what people expect when they declare a hash keyed by Any. The reason I want to identify 2 and "2" is because they are identical everywhere else in the language.

Also, if you hash on the binary representation, what's to say that uint hashes differently from uint4? And even if we do guarantee that they hash differently, there will be collisions. And then uint(2) equal uint4(2) (for any reasonable implementation of equal), and they are the same, but only for collision values.

A better way to make a type-collated hash would be to:

    class TypeHash is Hash[shape => [Class; Any]] {
        method postcircumfix:<{ }> (Any $arg) {
            .SUPER::{$arg.type; $arg}
        }
    }

And if you really think that it's going to be that common, then you can write a module that implements that. It would be about five lines long.

Luke
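A rough Python analogue of the TypeHash idea, offered only as an illustration and not as the Perl 6 semantics: keying on the pair (type, value) keeps 2, 2.0 and "2" apart, where a plain Python dict already treats 2 and 2.0 as one key because they compare equal and hash alike. `TypeDict` is a hypothetical name.

```python
class TypeDict:
    """Mapping keyed on (type, value), so equal values of different
    types get separate entries."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[(type(key), key)] = value

    def __getitem__(self, key):
        return self._data[(type(key), key)]

    def __len__(self):
        return len(self._data)


# Value-keyed: 2 and 2.0 collapse into one entry, "2" stays separate.
plain = {}
plain[2] = "int"
plain[2.0] = "float"
plain["2"] = "str"

# Type-collated: all three keys stay distinct.
typed = TypeDict()
typed[2] = "int"
typed[2.0] = "float"
typed["2"] = "str"
```

Here `len(plain)` is 2 while `len(typed)` is 3. As in the class above, the type is simply made part of the key; nothing depends on the binary representation of the values.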
Re: Valid hash keys?
On Sun, Feb 27, 2005 at 11:57:30PM +0100, Thomas Sandlaß wrote:
> Alex Burr wrote:
>
> >[..] Actually, it would be useful sometimes
> >to be able to give a hash an explicit canonicalizer:
> >
> >my %msdos_files is canonicalized_by lc;
> >
> >my %fractions is canonicalized_by gcd;
>
> Shouldn't that be handled by container subclasses of Hash?
> Like PersitentScalar or SparseArray?

Possibly. Clearly that's what one would do in any other language. What I was thinking was that if hashes are going to have a canonicalizer function *anyway*, maybe the default implementation could be overridable with a trait (or role?). But I can't actually claim to have followed perl6 development enough to be able to argue that it really makes sense.

Alex
Re: Valid hash keys?
On Sun, 27 Feb 2005 15:36:42 -0700, Luke Palmer wrote:
> Nigel Sandever writes:
>
> When we're talking about hashes of everything, there are a couple of
> routes we can take. We can go Java's way and define a .hash method on
> every object. We can go C++'s way and not hash at all, but instead
> define a transitive ordering on every object.

The more I think about this, please, no. The reason hashes are called hashes is because they hash.

If we need bags, sets, or orderThingies with overloadable transitive ordering, they can be written as classes--that possibly overload the hash syntax obj<> = value or whatever, but don't tack all that overhead on to the basic primitive.

> We can go Perl's way and
> find a string representation of the object and map the problem back to
> string hashing, which we can do well.

The only question is why does the thing that gets hashed have to be stringified first?

In p5, I often use

    $hash{ pack 'V', $key } = $value;   # or 'd'

1) Because for large hashes using numeric keys it uses up less space for the keys: 4 bytes rather than 10 for 2**32.

2) By using all 256 values of each byte, it tends to spread the keys more evenly across fewer buckets:

    use Devel::Size qw[total_size size];
    undef $h{ pack 'V', $_ } for map{ $_ * 1 } 0 .. 99;
    print total_size \%h;
    18418136
    print scalar %h;
    292754/524288

versus

    use Devel::Size qw[total_size size];
    undef $h{ $_ } for map{ $_ * 1 } 0 .. 99;
    print total_size \%h;
    48083250
    print scalar %h;
    644301/1048576

It would also avoid the need for hacks like Tie::RefHash, by hashing the address of the ref rather than the stringified ref and forcing the key to be stored twice and the creation of zillions of anonymous arrays to hold the unstringified ref+value.

The same could be extended to hashing the composite binary representations of whole structures and objects.

njs.
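The key-size half of this argument can be checked with a small Python sketch; `struct.pack("<L", n)` is the counterpart of Perl's `pack 'V', $n` (unsigned 32-bit, little-endian). The helper names `packed_key` and `decimal_key` are made up here, and the sketch compares only the length of the key bytes, not the total memory figures or bucket counts quoted above.

```python
import struct


def packed_key(n):
    """Fixed-width binary key: always 4 bytes for any unsigned 32-bit value."""
    return struct.pack("<L", n)


def decimal_key(n):
    """Stringified key: one byte per decimal digit."""
    return str(n).encode("ascii")


largest = 2**32 - 1  # biggest value that fits in an unsigned 32-bit key
sizes = (len(packed_key(largest)), len(decimal_key(largest)))
```

For the largest 32-bit value, `sizes` comes out as `(4, 10)`. For small numbers the decimal form is the shorter one (a single byte for 0 to 9), so the saving only applies to large key values.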
Re: Valid hash keys?
Luke Palmer wrote:
> The object model that I'm working on actually identifies 2 and "2" as
> the same object, indistinguishable in every respect.

Okay, that's fine, since C< 2 eq "2" > and C< 2 == "2" >.

But what about 2.0 and "2.0"? In Perl5, C< 2.0 == "2.0" >, but C< 2.0 ne "2.0" >.

-- Rod Adams
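The Perl 5 behaviour described here can be imitated in Python to show where the mismatch comes from. This sketch assumes that Perl 5 stringifies numbers roughly as `%.15g` does, so the number 2.0 becomes the string "2"; the helpers `perl_like_str`, `num_eq` and `str_eq` are illustrative names, and only strings that look like numbers are handled.

```python
def perl_like_str(x):
    """Stringify roughly the way Perl 5 does: strings are left alone,
    numbers are formatted with %.15g (assumption), so 2.0 becomes "2"."""
    if isinstance(x, str):
        return x
    return "%.15g" % x


def num_eq(a, b):
    """Numeric comparison, like Perl's ==: both sides are made numbers."""
    return float(a) == float(b)


def str_eq(a, b):
    """String comparison, like Perl's eq: both sides are made strings."""
    return perl_like_str(a) == perl_like_str(b)
```

Under these rules `num_eq(2.0, "2.0")` holds but `str_eq(2.0, "2.0")` does not, because the number stringifies to "2" while the string keeps its ".0". A hash that stringifies its keys would therefore file 2.0 and "2.0" separately, even though it treats 2 and "2" as the same key.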