On Tue, 9 Sep 2003, Gordon Henriksen wrote:
> Dan Sugalski <[EMAIL PROTECTED]> wrote:
>
> > On Tue, 9 Sep 2003, Gordon Henriksen wrote:
> >
> > > Random thought....
> > >
> > > There's some discussion on perl-qa right now about how Test::More
> > > should implement "is_deeply", which executes a code block and tests
> > > that the return value is equivalent to a particular nested data
> > > structure. (The question posed on that list is with regard to how to
> > > handle tie()'d objects, which is not what I'm addressing here.)
> > > Result of that discussion's outcome, should the serialization API
> > > being discussed here enable the following behavior?
> > >
> > > ok(freeze($expected) eq freeze($actual));
> > >
> > > I bring this up because using object addresses as IDs in the
> > > serialization entirely prevents this usage.
> >
> > Good. Having waded through one of the threads on p5p just a minute
> > ago, I'm not willing to guarantee this. In fact, I'm willing to
> > explicitly not guarantee this. If we want to compare two frozen
> > structures, string equality is *not* the way to go.
> >
> > Each freeze could, theoretically, choose a different freezing method
> > yet still represent the original.
>
> What do you mean by a "different freezing method?" That key-value pairs
> from two externally identical hashes could be frozen in different
> orders? I can see that. Sorting large hashes can be expensive, and
> certainly requires memory allocation.
That's possible, yes. We may well decide to have each hash have a separate randomness seed. While a bit excessive, it's not too unreasonable.

> Or do you mean that freeze($objA) and freeze($objB)--freezing
> (definitionally identical!) object graphs and with no intervening code
> between the two calls to freeze--could internally and arbitrarily select
> a significantly divergent object graph encoding? I can't see that at
> ALL...

This is, in fact, what I mean. (Well, actually, what I mean is that freeze($a) and freeze($a), with no intervening changes at all to $a, may produce different freeze representations.)

There are two reasons for this:

1) We make no promises on the format for the freeze, just that it will be able to reconstitute to the original structure. That means that the system may, if it chooses, use one of several formats. The freeze format can be chosen for any number of reasons including current free memory, phase of the moon, or sheer random chance.

2) We may choose to use internal characteristics, including addresses of immovable structures, as unique identifiers for the freezing.

#2 gets relatively minor changes between two otherwise functionally identical graphs, but nonetheless significant enough to make string equality testing infeasible. #1 means that the first freeze may give you the graph in the internal binary format and the second freeze gives you the graph in YAML. (And the third potentially in XML.)

> Over time (and revisions), I certainly can see a desire not to marry
> parrot-freeze to a specific binary representation. That's not the
> question I intended to raise--I asked a question only of repeatability,
> not of permanent format invariance.

I don't want to make any guarantees of repeatability.
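To make the point concrete, here is a minimal sketch in Python rather than anything Parrot-specific. The freeze() and thaw() below are hypothetical toys, not a proposed API: the randomized key order and the coin flip between a text and a binary format are stand-ins for reasons #1 and #2, and only the top-level hash is shuffled.

```python
import json
import pickle
import random

def freeze(obj, rng=random):
    """Toy freeze: neither the key order nor the format is promised."""
    if isinstance(obj, dict):
        items = list(obj.items())
        rng.shuffle(items)            # stand-in for per-hash randomized ordering
        obj = dict(items)
    if rng.random() < 0.5:
        # Text format, tagged with a leading "J".
        return b"J" + json.dumps(obj).encode("utf-8")
    # Binary format, tagged with a leading "P".
    return b"P" + pickle.dumps(obj)

def thaw(blob):
    """Reconstitute the structure from whichever format freeze() picked."""
    tag, payload = blob[:1], blob[1:]
    if tag == b"J":
        return json.loads(payload.decode("utf-8"))
    if tag == b"P":
        return pickle.loads(payload)
    raise ValueError("unknown freeze format")
```

With this toy, freeze(data) == freeze(data) often fails even though thaw(freeze(data)) == data always holds for plain hashes, lists, strings and numbers. That is the only promise being made here.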
Past history suggests that when behaviour isn't guaranteed it's best to actively randomize the behaviour if there's no significant penalty to doing so, as otherwise people will count on the non-promised behaviour to the point where the installed base makes it infeasible to change when the future rolls around.

> > This is a Good Place for a black-box comparison op, which string
> > equality definitely is not.
>
> (At which point do extremely complex routines cease to be operators?)
>
> A black-box comparison of the (live) object graphs, or black-box
> comparison of the serializations themselves?

Both, though I was thinking the latter, which is significantly more useful.

> Comparing serialized object graphs strikes me as tremendously esoteric,
> e.g. a maintenance burden to be used by very few clients.

It actually makes things potentially much simpler, if you consider comparison of live objects for equivalence a special case of this.

> (One of the
> few significant uses being that of replacing string-equals for the
> testing of the serializer itself.) It also strikes me as very, very,
> very complicated should the "freeze methods" diverge even slightly. I've
> never seen any such mechanism in any other environment.

Which is fine. I suppose we get to do at least one new thing with Parrot, though I expect this isn't anywhere near new. It's also going to be necessary to handle duplicate detection when getting objects over the network or from long-term storage.

> To compare
> graphs that I had saved in long-term storage, I as a caller would expect
> to need to deserialize the graphs and use a deep equals

Why? This one makes no sense to me.
If I have two black-box serialized representations of objects, I'd expect to *not* have to reconstitute them, and would actively not want to do so, since I may not know if there are side-effects (such as filehandles needing opening or databases needing connecting to) that would be undesirable if all I want to do is check for functional equivalence. This does argue for some support from the engine, however, in both functional and required functionality of freeze formats.

					Dan
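As a rough illustration of comparing frozen forms without reconstituting anything, here is a sketch in Python. It assumes, purely for the example, that the frozen form is JSON text describing the object; frozen_digest() and frozen_equal() are hypothetical names, and Parrot's real freeze formats are not specified anywhere in this thread. The blobs are only parsed as inert data, so no filehandle is opened and no constructor runs.

```python
import hashlib
import json

def frozen_digest(blob):
    """Digest of a frozen blob that ignores key order and whitespace."""
    tree = json.loads(blob)   # plain dicts/lists/scalars only; nothing is reconstituted
    canonical = json.dumps(tree, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def frozen_equal(blob_a, blob_b):
    """Black-box comparison of two frozen blobs."""
    return frozen_digest(blob_a) == frozen_digest(blob_b)

# Two freezes of the "same" filehandle description, and one that differs.
a = '{"class": "FileHandle", "path": "/tmp/log", "mode": "r"}'
b = '{"mode":"r","path":"/tmp/log","class":"FileHandle"}'
c = '{"mode":"w","path":"/tmp/log","class":"FileHandle"}'
```

Here a and b differ as strings yet compare equal under frozen_equal(), while c does not. The same digest could serve as a key for duplicate detection on objects arriving over the network or from long-term storage, provided every freeze format can be reduced to some such canonical form, which is the engine support argued for above.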