Re: RFC 163 (v2) Objects: Autoaccessors for object data structures

Glenn Linderman Mon, 18 Sep 2000 15:53:30 -0700
Michael G Schwern wrote:

> I guess I'm trying to say something about micro-optimizations being
> more trouble than they're worth and usually hurting more than they help.
>
> > So let's posit you've cured the accessor overhead problem.  Now
> > we're left with set_const being 40% slower for hash, and set_var
> > 166% slower for hash.  Still want to ignore it?  Why?

> A further 40% reduction
> results in a 2% overall increase.  Who cares?  Spend the time elsewhere.

Other people complain about the overhead that exception handling might add... but
that's unlikely to be more than a 2% increase.  So if we save this 2%, we can
spend it on exception handling :)

> > I can sure believe this.  There'd be indexes from multiple base classes.  I
> > don't know how Perl does multiple inheritance anyway, so I can't comment
> > effectively on whether this is or would be a problem.  If Perl does multiple
> > inheritance, I haven't stumbled across the documentation for it, but neither
> > have I looked.  I don't use multiple inheritance.
>
> The phrase "multiple inheritance" only comes up in perlboot, perltoot and
> perltootc.  perlobj only implies that MI works because otherwise it
> would be $ISA, not @ISA.

I reread perlobj after sending this, and got reminded that @ISA is, indeed, an
array.

> I use MI a lot and really couldn't see a language without it.

OK, I have no argument with your using MI.  I've never needed to myself, yet.

But thinking about the issues more, there is only one thing about arrays with
named indices that causes problems for MI: the use of both names and numbers to
access the same object array.  If, for arrays used as base classes, you referenced
only by name, or even enforced that only named references could be done, then it
could be made to work.

When an object is created in the subclass, it calls (or should call) the
constructor of each superclass, and those should create their members in the
object, and get assignments of name to number.  The subclass creates its own
members, too, either before or after calling those constructors; the order
doesn't really matter.  Some before, some after, even.  Just don't reference
them by number and all is fine.
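
A minimal sketch of that idea, emulated with today's arrays and hashes.  The
package names (NamedSlots, Engine, Radio, Car), the init convention, and the
choice to keep the name-to-index table in slot 0 are all assumptions made for
illustration; the proposal itself would hide that table inside the array.

```perl
use strict;
use warnings;

# Hypothetical emulation: slot 0 of the object array holds the name-to-index
# table that the proposal would keep inside the array itself.
package NamedSlots;

sub slot {
    my ($obj, $name) = @_;
    my $table = $obj->[0];
    if (!exists $table->{$name}) {
        my $next = scalar(keys %$table) + 1;   # slot 0 is taken by the table
        $table->{$name} = $next;
    }
    return $table->{$name};
}

package Engine;
sub init { my ($obj) = @_; $obj->[ NamedSlots::slot($obj, 'cylinders') ] = 4; return }

package Radio;
sub init { my ($obj) = @_; $obj->[ NamedSlots::slot($obj, 'station') ] = 'KPERL'; return }

package Car;
our @ISA = ('Engine', 'Radio');

sub new {
    my ($class, %opt) = @_;
    my $obj = bless [ {} ], $class;
    # Call each superclass initializer, in either order.
    my @parents = $opt{reverse} ? reverse(@ISA) : @ISA;
    $_->can('init')->($obj) for @parents;
    $obj->[ NamedSlots::slot($obj, 'color') ] = 'red';
    return $obj;
}

# Access is by name only, so the numbering never leaks out.
sub get { my ($obj, $name) = @_; return $obj->[ NamedSlots::slot($obj, $name) ] }

package main;

my $forward  = Car->new;
my $backward = Car->new(reverse => 1);
print "$_: ", $forward->get($_), " / ", $backward->get($_), "\n"
    for qw(cylinders station color);
```

The two objects end up with different slot numbers for the same names, yet
every by-name read agrees.  Only code that hard-wires a number would notice.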

Of course, you have exactly the same name conflict problems discussed in RFC 188,
and those solutions could be applied here, too.

> > >         Muddles the behavior of typed variables
> > >
> > Not sure what this means.
>
> Currently, the only thing really using the C<my Dog $spot> syntax is
> pseudo-hashes.
>
>     my Dog $ph;
>     $ph->{cat} = 'Mrs. Chippy';  # $ph->[$Dog::FIELDS{cat}]
>
> and there have been several RFCs about clarifying what typed variables
> mean (usually in reference to objects).  Pseudo-hashes get in the way
> of a lot of those proposals.

Oh, that explains why I haven't seen much of that syntax, in spite of quite a bit
of discussion on this list.  I must not use modules implemented as pseudo-hashes.

> Also, the whole fields and base modules are troublesome.  If you wish
> to write a subclass but use a pseudohash for your object instead of a
> hash, you really can't unless the class author was careful enough to
> declare all their fields (a rare occurrence).  Also, consider the case
> of a pseudo-hash friendly class, but with a subclass that uses @ISA
> directly instead of base.pm and hashes instead of pseudo-hashes.  A
> subclass of that subclass will no longer see the fields and thus the
> pseudo-hashes are wrecked.

That seems to be because pseudo-hashes restrict the set of keys.  My proposal
doesn't.

> > My proposal is different, because it would require additional
> > complication of array operations.  Hashes wouldn't be affected at
> > all.
>
> You're just shifting the additional complexities from hashes to arrays.

That's one way to think about it.  But also, by not limiting the keys, some of the
complexities are reduced, I suspect.  Maybe some of the original benefits of
pseudo-hashes are also reduced.

> > Arrays would be augmented with an internal hash (probably) to
> > do the key to index translation at compile time, the run-time code
> > wouldn't notice that.
>
> Consider the following:

I elided the "following" as it discussed MI, and I answered those concerns above.
The problem is just name to number consistency.

> > >         Inconsistencies between typed and untyped access.
> > >
> > I don't know what this means, either.
>
>     my Dog $ph = [\%Cat::FIELDS];
>     $ph->{name} = 'Foofer';  # $ph->[$Dog::FIELDS{name}]
>
>     foo($ph);
>
>     sub foo {
>         my $ph = shift;
>         print $ph->{name};  # $ph->[$ph->[0]{name}]
>     }
>
> Forget to type your lexicals and you might get something really really
> weird.

Another side effect of limited key sets, I guess.

> > >         Pseudo-hashes, unless used very carefully, often turn out slower
> > >                 than hashes.
> > >
> > Maybe so.  I'm not sure why, or why not, or what all the restrictions on
> > pseudo-hashes are.
>
> Untyped pseudohashes have to look at $ph->[0] to do their key-to-index
> translation.  So in effect you have to do an array lookup, a hash
> lookup and then another array lookup.  Typed pseudohashes are compiled
> to their array representation and only involve an array lookup.
>
> In the end, it means untyped pseudohashes are 15% slower than hashes.
> And it's not always possible to type.

OK.  I agree pseudo-hashes are bad.  But you haven't yet convinced me my proposal
is the same, or is bad.  Maybe it is, though.
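
To make the lookup counts in the quoted explanation concrete, here is a rough
sketch that spells both access paths out by hand with plain arrays and hashes,
so it does not depend on the pseudo-hash feature itself.  The field table, the
NAME constant, and fetch_untyped are hypothetical names used only for
illustration.

```perl
use strict;
use warnings;
use constant NAME => 1;   # "typed" path: the index is fixed at compile time

# Hypothetical field table, stored in slot 0 the way a pseudo-hash does it.
my %FIELDS = (name => 1, breed => 2);
my $ph     = [ \%FIELDS, 'Foofer', 'collie' ];

# "Untyped" path: three lookups at run time.
sub fetch_untyped {
    my ($obj, $key) = @_;
    my $table = $obj->[0];        # 1. array lookup for the field table
    my $slot  = $table->{$key};   # 2. hash lookup for the index
    return $obj->[$slot];         # 3. array lookup for the value
}

my $untyped = fetch_untyped($ph, 'name');
my $typed   = $ph->[NAME];        # a single array lookup
print "$untyped $typed\n";
```

Both paths return the same value; the difference is only in how much work is
left for run time.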

> > > Pseudo-hashes were added to solve three problems: restrict keyspace,
> > >
> > Not part of my proposal.
>
> Obviously part of your proposal.  You'll have a strictly defined set
> of keys, unless you want new keys to magically append to the array?

I think I even explicitly stated the magical appending code in the original
proposal.  That, if nothing else, makes my proposal different than pseudo-hashes.
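
A small sketch of what appending on first use could look like, emulated with
an ordinary hash beside an ordinary array.  The names %index, @data, and
slot_for are hypothetical stand-ins for whatever the array would keep
internally.

```perl
use strict;
use warnings;

my %index;   # name => position (hypothetical stand-in for the internal hash)
my @data;    # the array the names point into

# Return the position for $name; an unseen name gets the next free position.
sub slot_for {
    my ($name) = @_;
    if (!exists $index{$name}) {
        my $next = scalar keys %index;
        $index{$name} = $next;
    }
    return $index{$name};
}

$data[ slot_for('name') ] = 'Foofer';
$data[ slot_for('age')  ] = 3;        # new key: appended, not rejected
$data[ slot_for('name') ] = 'Rex';    # existing key: same slot is reused

print scalar(@data), " slots, name is $data[ slot_for('name') ]\n";
```

A pseudo-hash would refuse the unknown key; here the key set simply grows,
at the cost of carrying the index along with the data.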

> > > reduce memory usage
> > >
> > Not part of my proposal.  May be a side effect, I doubt it, though.
>
> If you do your proposal on a per-class basis, you're going to win some
> memory (but nothing to write home about).  If you do it on a per
> object basis, you're going to lose a lot, since each AV would have an
> associated HV.

OK, I guess my proposal loses a lot of memory.  Not sure that is bad, yet, if it
offers a performance win.

> > After looking at these points, I'm missing how you jumped to the
> > conclusion that I'm proposing pseudo-hashes.... they seem quite
> > different than my proposal in many details.
>
> You're proposing that string-based keys be mapped directly onto a
> numerically indexed array.  That's pseudohashes in a nutshell.  Replace
> "mapped" with "pseudo-randomly mapped" and you've got hashes in a
> nutshell.
>
> The only real difference being that you are using [] instead of {} and
> the $a->[string] syntax relieves some of the ambiguities of pseudohash
> vs hash access.

OK, my proposal is quite similar in some respects to pseudo-hashes.  I count the
following differences:

1) no restrictions on the key set [this costs memory per object]
2) faster set_const (40%) and set_var (166%) operations than hashes [per your
benchmark]
3) simpler implementation due to syntax differences
4) useful outside of OO environment as well as within it (pseudo-hash seems
restricted to OO?)

I can see one more problem with using my proposal to implement objects, though:
a class that subclasses another must use the same underlying structure as its
parent, either a hash or my proposal.  All of today's objects use hashes, so a
wholesale conversion for just a 40% accessor speedup seems unlikely.  Although
there could be one; I don't think it would be beyond p52p6 to do something like
that.

But I think the idea has merit outside of OO, and could be implemented quite
nicely.  Outside of OO, variable keys are more likely to come up, and that is
where the 166% performance benefit applies.

It could also be a nice way of implementing sets as arrays, which several have
asked for.  But maybe that isn't a benefit, either, and I doubt it would be faster
overall than using hashes for sets.

--
Glenn
=====
Even if you're on the right track,
you'll get run over if you just sit there.
                       -- Will Rogers


