RE: [RfC] vtable->dump

Dan Sugalski Wed, 10 Sep 2003 07:05:36 -0700

On Tue, 9 Sep 2003, Gordon Henriksen wrote:

> Dan Sugalski <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, 9 Sep 2003, Gordon Henriksen wrote:
> > 
> > > Random thought....
> > > 
> > > There's some discussion on perl-qa right now about how Test::More
> > > should implement "is_deeply", which executes a code block and tests
> > > that the return value is equivalent to a particular nested data 
> > > structure. (The question posed on that list is with regard to how to
> > > handle tie()'d objects, which is not what I'm addressing here.)
> > > Regardless of that discussion's outcome, should the serialization API
> > > being discussed here enable the following behavior?
> > > 
> > >     ok(freeze($expected) eq freeze($actual));
> > > 
> > > I bring this up because using object addresses as IDs in the
> > > serialization entirely prevents this usage.
> > 
> > Good. Having waded through one of the threads on p5p just a minute
> > ago, I'm not willing to guarantee this. In fact, I'm willing to 
> > explicitly not guarantee this. If we want to compare two frozen
> > structures, string equality is *not* the way to go.
> > 
> > Each freeze could, theoretically, choose a different freezing method
> > yet still represent the original.
> 
> What do you mean by a "different freezing method?" That key-value pairs
> from two externally identical hashes could be frozen in different
> orders? I can see that. Sorting large hashes can be expensive, and
> certainly requires memory allocation.


That's possible, yes. We may well decide to have each hash have a separate 
randomness seed. While a bit excessive, it's not too unreasonable.
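To illustrate the per-hash seed idea, here is a minimal sketch in Python rather than anything Parrot-specific. The class SeededHash and the function naive_freeze are hypothetical names invented for this example. It assumes a serializer that simply walks the hash in iteration order, so two hashes with equal contents but different seeds may freeze in different orders.

```python
import hashlib
import random


class SeededHash:
    """Toy hash whose iteration order depends on a private per-instance seed."""

    def __init__(self, items=(), seed=None):
        # Assumption: each hash draws its own seed unless one is supplied.
        self._seed = random.getrandbits(64) if seed is None else seed
        self._data = dict(items)

    def _slot(self, key):
        # Mix the seed into the key's digest so ordering varies per instance.
        return hashlib.sha256(f"{self._seed}:{key}".encode("utf-8")).digest()

    def keys(self):
        # Iteration order is determined by the seeded digest, not by the key.
        return sorted(self._data, key=self._slot)

    def __getitem__(self, key):
        return self._data[key]

    def __eq__(self, other):
        # Equality looks only at contents, never at the seed or the order.
        return isinstance(other, SeededHash) and self._data == other._data


def naive_freeze(h):
    # Walks the hash in iteration order, as a cheap serializer might.
    return ";".join(f"{k}={h[k]}" for k in h.keys())


a = SeededHash({"x": 1, "y": 2, "z": 3}, seed=1)
b = SeededHash({"x": 1, "y": 2, "z": 3}, seed=2)

# The hashes are equal, and the frozen pairs are the same set, but nothing
# promises that naive_freeze(a) and naive_freeze(b) are the same string.
same_contents = sorted(naive_freeze(a).split(";")) == sorted(naive_freeze(b).split(";"))
```

Sorting the keys before freezing would restore a stable order, at the cost in time and memory mentioned in the quoted question.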
 
> Or do you mean that freeze($objA) and freeze($objB)--freezing
> (definitionally identical!) object graphs and with no intervening code
> between the two calls to freeze--could internally and arbitrarily select
> a significantly divergent object graph encoding? I can't see that at
> ALL...

This is, in fact, what I mean. (Well, actually, what I mean is that 
freeze($a) and freeze($a), with no intervening changes at all to $a, may 
produce different frozen representations.)

There are two reasons for this:

1) We make no promises on the format for the freeze, just that it will be 
able to reconstitute to the original structure. That means that the system 
may, if it chooses, use one of several formats. The freeze format can be 
chosen for any number of reasons including current free memory, phase of 
the moon, or sheer random chance.

2) We may choose to use internal characteristics, including addresses of
immovable structures, as unique identifiers for the freezing.

#2 produces relatively minor differences between two otherwise functionally 
identical graphs, but they are nonetheless significant enough to make string 
equality testing infeasible.

#1 means that the first freeze may give you the graph in the internal 
binary format and the second freeze may give you the graph in YAML. (And the 
third potentially in XML.)
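Both reasons can be sketched in a few lines. This is an illustration in Python, not the Parrot freeze API: freeze, thaw and freeze_with_ids are hypothetical names, and JSON and pickle stand in for whatever formats the engine might choose between. The optional fmt argument exists only so the example can be exercised deterministically.

```python
import json
import pickle
import random


def freeze(obj, fmt=None):
    """Hypothetical freeze: picks an encoding (at random by default) and tags it."""
    if fmt is None:
        fmt = random.choice(["json", "pickle"])
    if fmt == "json":
        return b"J" + json.dumps(obj).encode("utf-8")
    return b"P" + pickle.dumps(obj)


def thaw(blob):
    """Reconstitute from either encoding, dispatching on the leading tag byte."""
    tag, body = blob[:1], blob[1:]
    if tag == b"J":
        return json.loads(body.decode("utf-8"))
    if tag == b"P":
        # pickle is only safe on trusted input; it is used here for brevity.
        return pickle.loads(body)
    raise ValueError("unknown freeze format")


def freeze_with_ids(obj):
    # Reason 2: the object's address (id() in CPython) is used as its identifier.
    return json.dumps({"id": id(obj), "value": obj})


data = {"name": "parrot", "tags": ["a", "b"]}
as_json = freeze(data, fmt="json")
as_pickle = freeze(data, fmt="pickle")

# Reason 1: the byte strings differ, yet both thaw back to the same structure.
formats_differ = as_json != as_pickle
both_round_trip = thaw(as_json) == data and thaw(as_pickle) == data

# Reason 2: two equal but distinct live objects carry different identifiers.
x = [1, 2]
y = [1, 2]
ids_differ = freeze_with_ids(x) != freeze_with_ids(y)
```

In both cases an eq test on the frozen output reports a difference that is not there in the data.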
 
> Over time (and revisions), I certainly can see a desire not to marry
> parrot-freeze to a specific binary representation. That's not the
> question I intended to raise--I asked a question only of repeatability,
> not of permanent format invariance.

I don't want to make any guarantees of repeatability. Past history 
suggests that when behaviour isn't guaranteed it's best to actively 
randomize the behaviour if there's no significant penalty to doing so, as 
otherwise people will count on the non-promised behaviour to the point 
where the installed base makes it infeasible to change when the future 
rolls around.

> > This is a Good Place for a black-box comparison op, which string
> > equality definitely is not.
> 
> (At which point do extremely complex routines cease to be operators?)
> 
> A black-box comparison of the (live) object graphs, or black-box
> comparison of the serializations themselves?

Both, though I was thinking the latter, which is significantly more 
useful.

> Comparing serialized object graphs strikes me as tremendously esoteric,
> e.g. a maintenance burden to be used by very few clients.

It actually makes things potentially much simpler, if you consider 
comparison of live objects for equivalence a special case of this.

> (One of the
> few significant uses being that of replacing string-equals for the
> testing of the serializer itself.) It also strikes me as very, very,
> very complicated should the "freeze methods" diverge even slightly. I've
> never seen any such mechanism in any other environment.

Which is fine. I suppose we get to do at least one new thing with Parrot, 
though I expect this isn't anywhere near new. It's also going to be 
necessary to handle duplicate detection when getting objects over the 
network or from long-term storage.

> To compare
> graphs that I had saved in long-term storage, I as a caller would expect
> to need to deserialize the graphs and use a deep equals

Why? This one makes no sense to me. If I have two black-box serialized 
representations of objects, I'd expect to *not* have to reconstitute them, 
and would actively not want to do so, since I may not know if there are 
side-effects (such as filehandles needing opening or databases needing 
connecting to) that would be undesirable if all I want to do is check for 
functional equivalence.
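One way such a comparison might look, as a rough sketch in Python: the frozen layout below (a "root" ID plus a "nodes" table with "class", "values" and "refs" per node) is an assumption made up for this example, as is the name frozen_equivalent. The point is that the two graphs are walked as plain data, so no class is instantiated and no filehandle or database connection is ever opened, and differing node IDs do not matter.

```python
import json


def frozen_equivalent(blob_a, blob_b):
    """Compare two frozen graphs structurally without building live objects."""
    a, b = json.loads(blob_a), json.loads(blob_b)
    forward = {}   # node ID in a -> node ID in b
    backward = {}  # node ID in b -> node ID in a
    stack = [(a["root"], b["root"])]
    while stack:
        id_a, id_b = stack.pop()
        if id_a in forward or id_b in backward:
            # Already paired: the pairing must be the same one as before.
            if forward.get(id_a) != id_b or backward.get(id_b) != id_a:
                return False
            continue
        forward[id_a], backward[id_b] = id_b, id_a
        node_a, node_b = a["nodes"][id_a], b["nodes"][id_b]
        if node_a["class"] != node_b["class"]:
            return False
        if node_a["values"] != node_b["values"]:
            return False
        if node_a["refs"].keys() != node_b["refs"].keys():
            return False
        for field in node_a["refs"]:
            stack.append((node_a["refs"][field], node_b["refs"][field]))
    return True


def two_node_cycle(head_id, tail_id, tail_name="tail"):
    # Builds a frozen two-node cycle; the IDs play the role of addresses.
    return json.dumps({
        "root": head_id,
        "nodes": {
            tail_id: {"class": "Node", "values": {"name": tail_name},
                      "refs": {"next": head_id}},
            head_id: {"class": "Node", "values": {"name": "head"},
                      "refs": {"next": tail_id}},
        },
    })


first = two_node_cycle("0x1000", "0x2000")
second = two_node_cycle("0x9000", "0x9100")
third = two_node_cycle("0x1000", "0x2000", tail_name="end")
```

Here first and second differ as strings but compare as equivalent, while first and third do not. A real version would also have to cope with blobs in different formats, which is where support from the engine comes in.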

This does argue for some support from the engine, however, both in the 
functions it provides and in the functionality required of freeze formats.

                                        Dan
