Valid hash keys?

2005-02-27 Thread Autrijus Tang
Just a quick question: Are Hash keys still Strings, or can they
be arbitrary values? If the latter, can Int 2, Num 2.0 and Str "2"
point to different values?

Thanks,
/Autrijus/




Re: Valid hash keys?

2005-02-27 Thread Luke Palmer
Autrijus Tang writes:
> Just a quick question: Are Hash keys still Strings, or can they be
> arbitrary values?

They can be declared to be arbitrary:

my %hash is shape(Any);


> If the latter, can Int 2, Num 2.0 and Str "2" point to different
> values?

That's an interesting question.  Some people would want them to, and
some people would definitely not want them to.  I think the general
consensus is that people would not want them to be different, since in
the rest of perl, 2 and "2" are the same.

The object model that I'm working on actually identifies 2 and "2" as
the same object, indistinguishable in every respect.  But that hasn't
been accepted (or even proposed)... yet.

Luke


Re: Valid hash keys?

2005-02-27 Thread Luke Palmer
Luke Palmer writes:
> Autrijus Tang writes:
> > Just a quick question: Are Hash keys still Strings, or can they be
> > arbitrary values?
> 
> They can be declared to be arbitrary:
> 
> my %hash is shape(Any);
> 
> 
> > If the latter, can Int 2, Num 2.0 and Str "2" point to different
> > values?
> 
> That's an interesting question.  Some people would want them to, and
> some people would definitely not want them to.  I think the general
> consensus is that people would not want them to be different, since in
> the rest of perl, 2 and "2" are the same.

I forgot an important concretity.  Hashes should compare based on the
generic "equal" operator, which knows how to compare apples and apples,
and oranges and oranges, and occasionally a red orange to an apple.

That is:

3 equal 3              ==> true
3 equal { codeblock }  ==> false
3 equal "3"            ==> true

Luke
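
The three cases above can be sketched in Python (used here only as a runnable stand-in; this is not Perl 6 and not a proposed API). The sketch assumes that numbers and numeric-looking strings compare by value, that other strings compare as strings, and that anything else is equal only to itself. The names `canonical` and `generic_equal` are hypothetical.

```python
def canonical(value):
    """Return a comparable form for scalars, or None for everything else."""
    if isinstance(value, bool):
        return None
    if isinstance(value, (int, float)):
        return ("num", value)
    if isinstance(value, str):
        # Assumption: a string that parses as a number is treated as that number.
        for parse in (int, float):
            try:
                return ("num", parse(value))
            except ValueError:
                pass
        return ("str", value)
    return None


def generic_equal(a, b):
    """Apples to apples, oranges to oranges, and sometimes 3 to "3"."""
    ca, cb = canonical(a), canonical(b)
    if ca is None or cb is None:
        return a is b  # e.g. a code block is only equal to itself
    return ca == cb


def codeblock():
    pass


results = [generic_equal(3, 3), generic_equal(3, codeblock), generic_equal(3, "3")]
```

Under these assumptions `results` holds the same three answers as the list above, in the same order.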


Re: S06: Pairs as lvalues

2005-02-27 Thread Ingo Blechschmidt
Hi,

Luke Palmer writes:
> Ingo Blechschmidt writes:
> >   my $x = (a => 42); # $x is a Pair.
> >   $x = 13;           # Is $x now the Pair (a => 13) or
> >                      #   the Int 13?
> 
> You see, in your example, the pair is not "functioning as
> an lvalue".  The variable is the thing that is the lvalue,
> not the pair.

ah! That clears it up! Thanks! :)


--Ingo

-- 
Linux, the choice of a GNU | When cryptography is outlawed, bayl bhgynjf
generation on a dual AMD-  | jvyy unir cevinpl!
Athlon!                    |



Re: Valid hash keys?

2005-02-27 Thread Nigel Sandever
On Sun, 27 Feb 2005 02:20:59 -0700, [EMAIL PROTECTED] (Luke Palmer) wrote:
> Luke Palmer writes:
> > Autrijus Tang writes:
> > > Just a quick question: Are Hash keys still Strings, or can they be
> > > arbitrary values?
> > 
> > They can be declared to be arbitrary:
> > 
> > my %hash is shape(Any);
> > 
> > 
> > > If the latter, can Int 2, Num 2.0 and Str "2" point to different
> > > values?
> > 
> > That's an interesting question.  Some people would want them to, and
> > some people would definitely not want them to.  I think the general
> > consensus is that people would not want them to be different, since in
> > the rest of perl, 2 and "2" are the same.
> 
> I forgot an important concretity.  Hashes should compare based on the
> generic "equal" operator, which knows how to compare apples and apples,
> and oranges and oranges, and occasionally a red orange to an apple.
> 
> That is:
> 
> 3 equal 3              ==> true
> 3 equal { codeblock }  ==> false
> 3 equal "3"            ==> true
> 

I would have assumed a hash whose shape was defined as C to perform the
hashing function directly on the (nominally 32-bit) binary representation of
Int 2.

Likewise, c, would perform the hashing on the binary rep of the
(nom. 64-bit) Double value.

And C, on the address of the key passed?

By extension, a C<%hash is shape( Any )> would hash the binary representation
of whatever (type of) key it was given, which would make keys of 2, 2.0, '2',
'2.0', (Int2)2 etc. all map to different keys.

If C<%hash is shape(Any)> maps all the above representations of 2 to the same
value, then C becomes a synonym for

C<%hash is shape(Scalar)>.

(is that the same as C?).

If my assumption is correct, then that would imply that each type has or
inherits a .binary or .raw method to allow access to the internal
representation?
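
A rough Python sketch of the idea in this message: use the raw bytes of each key, rather than its value, as the lookup key. The function name `raw_bytes`, the 32-bit and 64-bit layouts, and the use of `id()` as a stand-in for an address are all assumptions made for illustration; nothing here is an actual `.binary` or `.raw` method.

```python
import struct


def raw_bytes(value):
    """Return a byte string standing in for the key's internal representation."""
    if isinstance(value, int):
        return struct.pack("<i", value)    # nominally 32-bit Int (small values only)
    if isinstance(value, float):
        return struct.pack("<d", value)    # 64-bit double
    if isinstance(value, str):
        return value.encode("utf-8")       # the characters themselves
    return struct.pack("<Q", id(value))    # stand-in for "address of the key"


table = {}
for key in (2, 2.0, "2", "2.0"):
    table[raw_bytes(key)] = repr(key)

distinct_slots = len(table)
```

With these layouts the four spellings of "two" occupy four separate slots, because their byte strings differ in length or content. Two different types that happen to share a layout would not be kept apart by this scheme.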

> Luke
 




Dynamically Scoped Dynamic Scopes

2005-02-27 Thread Dave Whipp
I was thinking about scopes (for a problem unrelated to Perl 6), and I 
realised that the scoping concepts in P6 are somewhat limited.

We have
  my $var   # lexical scope
  temp $var # lexically-scoped dynamic scope
C is lexically scoped in that its effect goes away at the closing 
curly of the lexical scope that contains it.

A concept that we seem to be missing is the possibility of dynamically 
scoped dynamic scopes. I hesitate to come up with a syntax; but I can 
think of a couple of examples where it might be used. Caveat: if you 
believe that globals are fundamentally evil, and that everything should 
be objects, then this is unnecessary. But for other people ...

Example 1: Create a dynamic scope, and then spawn N threads. Each thread 
has its own lexical scope. After each thread has done some work, each 
reaches a barrier. Once all the threads have reached that barrier, we 
terminate the dynamic scope that we previously introduced: the threads 
then continue in their lexical scopes, but with a different dynamic scope.

Example 2: a state machine: imagine binding a number of variables into a 
"scope space", in which we then instance multiple scopes. We can then 
create a state machine in which we change the currently visible values 
of the scoped variables by changing the "current scope" of the "scope 
space".

One could imagine implementing this by creating the scopes as instances 
of an object, and then binding the object's attributes onto the 
variables (i.e. "our $foo := $obj.bar"). The "scope space" object would 
then be the set of global variables to be bound; and the "scope" object 
would be the set of values to bind.

However, when we want to release the global variables from our scope, 
then we need a way to unbind the variables, and restore them to the 
bindings that existed before they were bound to our scope space. I'm not 
sure how to do that, because we don't have any builtin concept of 
dynamically scoped scopes.

Dave.
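
One way to picture the bind-and-restore behaviour described above is a small Python sketch. It assumes the "scope space" is just a list of global names plus a stack of saved bindings, and that a "scope" is a mapping of values for those names. `ScopeSpace`, `enter` and `leave` are hypothetical names, and a plain dict stands in for the global namespace.

```python
_MISSING = object()  # marks a name that had no binding before we entered


class ScopeSpace:
    """A set of global names whose values can be swapped in and out."""

    def __init__(self, namespace, names):
        self.namespace = namespace   # dict standing in for the globals
        self.names = tuple(names)
        self._saved = []             # stack of bindings to restore later

    def enter(self, scope):
        # Remember what each name was bound to, then install the scope's values.
        self._saved.append({n: self.namespace.get(n, _MISSING) for n in self.names})
        for name in self.names:
            self.namespace[name] = scope[name]

    def leave(self):
        # Restore the bindings that existed before the matching enter().
        for name, old in self._saved.pop().items():
            if old is _MISSING:
                self.namespace.pop(name, None)
            else:
                self.namespace[name] = old


global_vars = {"foo": "outer"}
space = ScopeSpace(global_vars, ["foo"])

space.enter({"foo": "state A"})
seen_inside = global_vars["foo"]
space.leave()
seen_after = global_vars["foo"]
```

Because `enter` and `leave` are ordinary calls rather than something tied to a closing curly, the point at which the bindings are released is decided at run time, which is the property the message is asking for.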


Re: Valid hash keys?

2005-02-27 Thread Alex Burr

On Sun, Feb 27, 2005 at 02:20:59AM -0700, Luke Palmer wrote:
> I forgot an important concretity.  Hashes should compare based on the
> generic "equal" operator, which knows how to compare apples and apples,
> and oranges and oranges, and occasionally a red orange to an apple.

Um. Hashes don't really compare, though, do they? Maybe you
just mean a notional equals operator, which isn't really used; but
it seems to me that what hashes actually implement is more of a 
'canonicalize' operator. Actually, it would be useful sometimes
to be able to give a hash an explicit canonicalizer:

my %msdos_files is canonicalized_by lc;

my %fractions is canonicalized_by gcd;

Alex
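
As a rough illustration of an explicit canonicalizer, here is a Python sketch in which every key is passed through a user-supplied function before it reaches the underlying table. `CanonicalHash` and `reduce_fraction` are hypothetical names. Reading "canonicalized_by gcd" as "reduce a (numerator, denominator) pair by its gcd" is an interpretation, and the sketch assumes a nonzero, positive denominator.

```python
from math import gcd


class CanonicalHash:
    """A mapping that canonicalizes each key before storing or looking it up."""

    def __init__(self, canonicalize):
        self._canonicalize = canonicalize
        self._data = {}

    def __setitem__(self, key, value):
        self._data[self._canonicalize(key)] = value

    def __getitem__(self, key):
        return self._data[self._canonicalize(key)]

    def __contains__(self, key):
        return self._canonicalize(key) in self._data

    def __len__(self):
        return len(self._data)


def reduce_fraction(pair):
    """Reduce (numerator, denominator) to lowest terms."""
    numerator, denominator = pair
    divisor = gcd(numerator, denominator)
    return (numerator // divisor, denominator // divisor)


msdos_files = CanonicalHash(str.lower)
msdos_files["AUTOEXEC.BAT"] = "boot script"

fractions = CanonicalHash(reduce_fraction)
fractions[(2, 4)] = "one half"
```

Here `msdos_files["autoexec.bat"]` and `fractions[(1, 2)]` find the entries stored under the differently spelled keys, since both spellings canonicalize to the same thing.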


Re: Valid hash keys?

2005-02-27 Thread Luke Palmer
Nigel Sandever writes:
> On Sun, 27 Feb 2005 02:20:59 -0700, [EMAIL PROTECTED] (Luke Palmer) wrote:
> > I forgot an important concretity.  Hashes should compare based on the
> > generic "equal" operator, which knows how to compare apples and apples,
> > and oranges and oranges, and occasionally a red orange to an apple.
> > 
> > That is:
> > 
> > 3 equal 3              ==> true
> > 3 equal { codeblock }  ==> false
> > 3 equal "3"            ==> true
> > 
> 
> I would have assumed a hash whose shape was defined as C to perform
> the hashing function directly on the (nominally 32-bit) binary
> representation Int 2. 

I wasn't even thinking about implementation.  Sometimes it's good to let
implementation drive language, but I don't think it's appropriate here.

When we're talking about hashes of everything, there are a couple of
routes we can take.  We can go Java's way and define a .hash method on
every object.  We can go C++'s way and not hash at all, but instead
define a transitive ordering on every object.  We can go Perl's way and
find a string representation of the object and map the problem back to
string hashing, which we can do well.

But the biggest problem is that if the user overloads 'equal' on two
objects, the hash should consider them equal.  We could require that to
overload 'equal', you also have to overload .hash so that you've given
some thought to the thing.  The worry I have is that people will do:

method hash() { 0 }

But I suppose that's okay.  That just punts the work off to 'equal',
albeit in linear time.

That may be the right way to go.  Use a Javaesque .hash method with a
sane default (an introspecting default, perhaps), and use a sane
equivalent default for 'equal'.  

As far as getting 2, 2.0, and "2" to hash to the same object, well, we
know they're 'equal', so we just need to know how to hash them the same
way.  In fact, I don't believe 2.0 can be represented as a Num.  At
least in Perl 5, it translates itself back to an int.  So we can just
stringify and hash for the scalar types.

Luke
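
The "method hash() { 0 }" worry can be tried out in Python, whose dictionaries follow the same Java-style contract of a hash method plus an equality method. This is only an analogy to the Perl 6 proposal; the class `Point` is a made-up example.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Legal but degenerate: every Point lands in the same bucket,
        # so each lookup falls back on __eq__ against colliding entries.
        return 0


table = {Point(i, i): i for i in range(100)}
found = table[Point(42, 42)]

# Python's own numeric types already hash alike when they compare equal.
numbers_hash_alike = hash(2) == hash(2.0)
```

The lookup still returns the right value; the constant hash only costs speed, because equality has to sort out every collision.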


Re: How are types related to classes and roles?

2005-02-27 Thread Thomas Sandlaß
HaloO,
Larry Wall wrote:
> On Fri, Feb 25, 2005 at 12:45:45AM +0800, Autrijus Tang wrote:
> : So, I think late binding is a sensible (and practical) default, but
> : do you think it may be a good thing to have a type inference mode that
> : assigns static contexts to expressions, and prebind as much as possible?
> : It may be possible to enable via a pragma or a compiler switch...

Well, that is the optimizer everybody keeps talking about. And the more
type input it has, the better it can pre-select multi methods. A very
interesting feature for later versions of Perl 6 could even be whole-program
optimization, where code passages in modules that can't be pre-selected on
local information alone are optimized once all type information is available.

> It's certainly something to explore.  If I recall, I took some kind of
> compromise position in the Apocalypse where I said we probably wouldn't
> treat return-type-MMD with the same authority as parameter MMD, but
> we might be able to use return type as a tie-breaker on otherwise
> equivalent routines.

I guess "equivalent routines" shall mean "same specificity of invocant
type"? At that point, choosing the multi with the lower return type seems
tricky and might lead to surprises. Note the following little diagram where
the two operators

  <:  subtype
  :>  supertype

---which BTW would make nice standard operators :) ---
are used to show the function subtyping and method selection in one picture:

                             method selection   covariant
                +------------------------ <: ------------------------+
                |                                                    |
multi sub f2 ( Inv2 : Arg2 ) returns Ret2   <:  multi sub Ret1 f1 ( Inv1 : Arg1 )
                       |              |                    |                |
 subtyping of          | covariant    +-------- <: --------+                |
 Code objects          |                                                    |
                       +------------------------ :> ------------------------+
                                           contravariant

For method selection the short names have to be the same of course. The Arg
types are handled as type errors at runtime or compile time, right?
No tertiary tie breaking? :)
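
The subtyping rule for Code objects in the diagram (return types covariant, argument types contravariant) can be checked mechanically. The Python sketch below models a signature as a hypothetical (argument type, return type) pair and leaves invocants and method selection out; `Animal`, `Dog` and `is_subtype` are illustrative names only.

```python
class Animal:
    pass


class Dog(Animal):
    pass


def is_subtype(sub_sig, super_sig):
    """True if a routine with sub_sig can stand in for one with super_sig."""
    sub_arg, sub_ret = sub_sig
    super_arg, super_ret = super_sig
    returns_ok = issubclass(sub_ret, super_ret)  # covariant:     Ret2 <: Ret1
    args_ok = issubclass(super_arg, sub_arg)     # contravariant: Arg2 :> Arg1
    return returns_ok and args_ok


f1 = (Dog, Animal)   # takes a Dog, returns an Animal
f2 = (Animal, Dog)   # takes any Animal, returns a Dog

f2_below_f1 = is_subtype(f2, f1)
f1_below_f2 = is_subtype(f1, f2)
```

With these definitions f2 <: f1 holds and the reverse does not: f2 accepts everything f1 accepts and promises a more specific result.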

> Basically, instead of writing a single routine
> with a big switch statement in it, you'd be able to write multiple
> routines with the same parameters but different return types as a
> form of syntactic sugar over the switch statement.  It's not clear if
> such an approach would buy us anything in terms of type inferencing,
> except insofar as sub declarations are easier to mine the return types
> out of than an embedded switch statement.  Maybe that buys us a lot,
> though, just as having class metadata available at compile time is
> a big improvement over Perl 5's @ISA.
>
> Anyway, I don't profess to have thought deeply about type inferencing.
> But I do know that I don't want to turn Perl 6 into ML just yet...

Hasn't type inferencing become easy with the full power of junctions?
I imagine the compiler annotating the AST with types from the leaves up.
At multi calls which can't be pre-selected at compile time, a one() junction
of all possibly matching multis' return types is assigned.

BTW, how many types does Int|Str|Num produce?
All these: Int|Str|Num, Int|Str, Int|Num, Str|Num, Int, Str, Num?
Or just the last three? What then is the role of Int^Str^Num?

Is the syntax

  type Criterion ::= KeyExtractor
                   | Comparator
                   | Pair(KeyExtractor, Comparator)
                   ;

used in the sort ruling still current? There the RHS looks more like
a grammar rule alternation which is checked in turn than a real any()
junction.

Regards,
--
TSa (Thomas Sandlaß)


Re: Valid hash keys?

2005-02-27 Thread Nigel Sandever
On Sun, 27 Feb 2005 15:36:42 -0700, [EMAIL PROTECTED] (Luke Palmer) wrote:
> Nigel Sandever writes:
> > On Sun, 27 Feb 2005 02:20:59 -0700, [EMAIL PROTECTED] (Luke Palmer) wrote:
> > > I forgot an important concretity.  Hashes should compare based on the
> > > generic "equal" operator, which knows how to compare apples and apples,
> > > and oranges and oranges, and occasionally a red orange to an apple.
> > > 
> > > That is:
> > > 
> > > 3 equal 3              ==> true
> > > 3 equal { codeblock }  ==> false
> > > 3 equal "3"            ==> true
> > > 
> > 
> > I would have assumed a hash whose shape was defined as C to perform
> > the hashing function directly on the (nominally 32-bit) binary
> > representation Int 2. 
> 
> I wasn't even thinking about implementation.  Sometimes it's good to let
> implementation drive language, but I don't think it's appropriate here.
> 
> When we're talking about hashes of everything, there are a couple of
> routes we can take.  We can go Java's way and define a .hash method on
> every object.  We can go C++'s way and not hash at all, but instead
> define a transitive ordering on every object.  We can go Perl's way and
> find a string representation of the object and map the problem back to
> string hashing, which we can do well.
> 
> But the biggest problem is that if the user overloads 'equal' on two
> objects, the hash should consider them equal.  We could require that to
> overload 'equal', you also have to overload .hash so that you've given
> some thought to the thing.  The worry I have is that people will do:
> 
> method hash() { 0 }
> 
> But I suppose that's okay.  That just punts the work off to 'equal',
> albeit in linear time.
> 
> That may be the right way to go.  Use a Javaesque .hash method with a
> sane default (an introspecting default, perhaps), and use a sane
> equivalent default for 'equal'.  
> 
> As far as getting 2, 2.0, and "2" to hash to the same object, well, we
> know they're 'equal', so we just need to know how to hash them the same
> way.  In fact, I don't believe 2.0 can be represented as a Num.  At
> least in Perl 5, it translates itself back to an int.  So we can just
> stringify and hash for the scalar types.
>

My thought is that if c uses the stringified values of
the keys, then it is no different to C,

I think it would be useful for shape(Any) to be different to an ordinary hash,
hashing the binary representation of the key, so that

(Int)2, (Num)2, (String)2, (uint)2, (uint4)2 etc.

would be kept distinct: a useful way of collating things according to their
"type" rather than their value?
> 
> Luke

njs.





Re: Valid hash keys?

2005-02-27 Thread Thomas Sandlaß
Alex Burr wrote:
> [..] Actually, it would be useful sometimes
> to be able to give a hash an explicit canonicalizer:
>
> my %msdos_files is canonicalized_by lc;
>
> my %fractions is canonicalized_by gcd;

Shouldn't that be handled by container subclasses of Hash?
Like PersitentScalar or SparseArray?

Regards,
--
TSa (Thomas Sandlaß)


Re: Valid hash keys?

2005-02-27 Thread Luke Palmer
Nigel Sandever writes:
> On Sun, 27 Feb 2005 15:36:42 -0700, [EMAIL PROTECTED] (Luke Palmer) wrote:
> > As far as getting 2, 2.0, and "2" to hash to the same object, well, we
> > know they're 'equal', so we just need to know how to hash them the same
> > way.  In fact, I don't believe 2.0 can be represented as a Num.  At
> > least in Perl 5, it translates itself back to an int.  So we can just
> > stringify and hash for the scalar types.
> 
> My thought is that if c uses the stringified values of
> the keys, then it is no different to C,

Indeed, but I meant just for our non-reference scalar types, such as
Num, Int, and Str.

> I think it would be useful for shape(Any) be different to an ordinary
> hash, and hashing the binary representation of the key, so that 
> 
> (Int)2, (Num)2, (String)2, (uint)2 (uint4)2 etc.
> 
> would be a useful way of collating things according to their "type"
> rather than their value?

That may indeed be a useful kind of map to have, but it's hardly what
people expect when they declare a hash keyed by Any.  The reason I want
to identify 2 and "2" is because they are identical everywhere else in
the language.

Also, if you hash on the binary representation, what's to say that uint
hashes differently from uint4?  And even if we do guarantee that they
hash differently, there will be collisions.  And then uint(2) equal
uint4(2) (for any reasonable implementation of equal), and they are the
same, but only for collision values.

A better way to make a type-collated hash would be to:

class TypeHash is Hash[shape => [Class; Any]] {
    method postcircumfix:<{ }> (Any $arg) {
        .SUPER::{$arg.type; $arg}
    }
}

And if you really think that it's going to be that common, then you can
write a module that implements that.  It would be about five lines long.

Luke
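
For comparison, the same two-level idea (key on the type first, then on the value) is easy to try in Python. This is an analogy to the TypeHash class above rather than a translation of it; the class name is reused only for readability.

```python
class TypeHash:
    """A mapping that keeps keys of different types apart, even if they compare equal."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[(type(key), key)] = value

    def __getitem__(self, key):
        return self._data[(type(key), key)]

    def __contains__(self, key):
        return (type(key), key) in self._data

    def __len__(self):
        return len(self._data)


plain = {}
typed = TypeHash()
for key in (2, 2.0, "2"):
    plain[key] = type(key).__name__
    typed[key] = type(key).__name__
```

In the plain dict 2 and 2.0 share one slot because they compare equal, so it ends up with two entries; the type-collated version keeps all three.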


Re: Valid hash keys?

2005-02-27 Thread Alex Burr
On Sun, Feb 27, 2005 at 11:57:30PM +0100, Thomas Sandlaß wrote:
> Alex Burr wrote:
> 
> >[..] Actually, it would be useful sometimes
> >to be able to give a hash an explicit canonicalizer:
> >
> >my %msdos_files is canonicalized_by lc;
> >
> >my %fractions is canonicalized_by gcd;
> 
> Shouldn't that be handled by container subclasses of Hash?
> Like PersitentScalar or SparseArray?

Possibly. Clearly that's what one would do in any other language.
What I was thinking was that if hashes are going to have a
canonicalizer function *anyway*, maybe the default implementation
could be overridable with a trait (or role?). But I can't 
actually claim to have followed perl6 development enough to
be able to argue that it really makes sense.

Alex


Re: Valid hash keys?

2005-02-27 Thread Nigel Sandever
On Sun, 27 Feb 2005 15:36:42 -0700, [EMAIL PROTECTED] (Luke Palmer) wrote:
> Nigel Sandever writes:
> 
> When we're talking about hashes of everything, there are a couple of
> routes we can take.  We can go Java's way and define a .hash method on
> every object.  We can go C++'s way and not hash at all, but instead
> define a transitive ordering on every object.  

The more I think about this, please, no. The reason hashes are called hashes is 
because they hash.

If we need bags, sets, or orderThingies with overloadable transitive ordering, 
they can be written as classes--that possibly overload the hash syntax 

obj<> = value

or whatever, but don't tack all that overhead on to the basic primitive.


> We can go Perl's way and
> find a string representation of the object and map the problem back to
> string hashing, which we can do well.

The only question is: why does the thing that gets hashed have to be
stringified first?

In p5, I often use $hash{ pack 'V', $key } = $value;  # or 'd'

1) Because for large hashes using numeric keys it uses up less space for the
keys: 4 bytes rather than 10 for 2**32.

2) By using all 256 values of each byte, it tends to spread the keys more
evenly across fewer buckets:

use Devel::Size qw[total_size size];
undef $h{ pack 'V', $_ } for map{ $_ * 1  } 0 .. 99;

print total_size \%h;
18418136

print scalar %h;
292754/524288

versus

use Devel::Size qw[total_size size];
undef $h{ $_ } for map{ $_ * 1  } 0 .. 99;

print total_size \%h;
48083250

print scalar %h;
644301/1048576

It would also avoid the need for hacks like Tie::RefHash, by hashing the address
of the ref rather than the stringified ref, which forces the key to be stored
twice and creates zillions of anonymous arrays to hold the unstringified
ref+value.

The same could be extended to hashing the composite binary representations of 
whole structures and objects.

njs.
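
The key-size argument in point 1 can be reproduced in Python, where `struct.pack("<I", ...)` produces the same unsigned 32-bit little-endian layout as Perl's `pack 'V'`. The sketch only compares key lengths. It says nothing about bucket distribution or total memory, which depend on the hash implementation, and the chosen key value is just an example near the top of the 32-bit range.

```python
import struct

key = 2**32 - 1                      # largest value that fits in 32 bits

packed_key = struct.pack("<I", key)  # 4 raw bytes, like pack 'V' in Perl
text_key = str(key)                  # decimal digits, one byte each

packed_len = len(packed_key)
text_len = len(text_key)

# The packed form is still a faithful key: it round-trips to the same number.
(round_trip,) = struct.unpack("<I", packed_key)
```

For this key the packed form is 4 bytes against 10 for the decimal string, matching the figures quoted in the message.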




Re: Valid hash keys?

2005-02-27 Thread Rod Adams
Luke Palmer wrote:
> The object model that I'm working on actually identifies 2 and "2" as
> the same object, indistinguishable in every respect.

Okay, that's fine, since C< 2 eq "2" > and C< 2 == "2" >. But what about 
2.0 and "2.0"?

In Perl5, C< 2.0 == "2.0" >, but  C< 2.0 ne "2.0" >.
-- Rod Adams
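
The asymmetry in the last line can be imitated in Python. The sketch assumes that Perl 5 turns a number into a string roughly as `%.15g` would, so 2.0 becomes "2"; the helper `perl_like_str` is hypothetical and only approximates that behaviour.

```python
def perl_like_str(number):
    """Approximate Perl 5's number-to-string conversion (assumed to be %.15g)."""
    return "%.15g" % number


# Numeric comparison: the string is converted to a number first.
numerically_equal = float("2.0") == 2.0

# String comparison: the number is converted to a string first.
stringwise_equal = perl_like_str(2.0) == "2.0"
```

The numeric comparison succeeds while the string comparison fails, because the number 2.0 stringifies to "2" rather than "2.0". A hash that stringifies its keys would therefore file 2.0 under "2" and the string "2.0" under a separate key.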