>>>>> "DC" == Damian Conway <[EMAIL PROTECTED]> writes:

  DC> Uri wrote:

  DC> @out = sort
  DC>         [ { ~ %lookup{ .{remotekey} } },        #1

  >> if string cmp is the default, wouldn't that ~ be redundant?

  DC> How do you know that the values of %lookup are strings?
  DC> How would the optimizer know?

because that would be the default comparison and the extracted key value
would be stringified unless some other marker is used. most sorts are on
strings so this would be a useful bit of huffman coding and removal of a
redundancy.

  DC>           { + substr( 0, 10 ) },                #3
  DC>           { int /foo(\d+)bar/ },                #4

  >> i would also expect int to be a default over float as it will be used
  >> more often. + is needed there since the regex returns a string. in the
  >> #3 case that would be an int as well. so we need a 'float' cast
  >> thingy.

  DC> Unary C<+> *is* the "float cast thingy"!

hmm, so + is float but int is needed otherwise? int is more commonly a
sort key than float. it just feels asymmetrical with one having a symbol
and the other a named operator.

  DC> If you want to force numeric comparison of keys you explicitly cast
  DC> each key to number using unary C<+> or C<int>. If you want to force
  DC> stringific comparison you explicitly cast the key to string using
  DC> unary C<~>.

or ~ is the default if nothing else is specified. it matches up with cmp
being the default comparison as you agreed above.

  DC> If you don't explicitly cast either way, C<sort> just DWIMs by looking
  DC> at the actual type of the keys returned by the extractor. If any of
  DC> them isn't a number, it defaults to C<cmp>.

that doesn't work for me. the GRT can't scan all keys before it decides
on what comparison operator is needed. the GRT needs to merge keys as
they are extracted and it needs to do the correct conversion to a byte
string then and not later. you can't DWIM this away if you want the
optimization. the ST can get away with it since you are still using a
compare block even if it is generated internally by the sort function.

  DC>           { just_guess $^b, $^a },              #7

  >> is that a reverse order sort?
  >> why not skip the args and do this:

  >>           { &just_guess is descending },        #7

  DC> Because I wanted to show a plain old two-parameter block being used as
  DC> a *comparator* (not a one-parameter block being used as a key
  DC> extractor).

that seems like extra redundant code just to mark a callback vs an
extract block. there should be a cleaner way to do that. maybe a null
extract key in the pair?

	{ '' => &just_guess }
	{ undef => &just_guess }	# will => autoquote undef there?

then the records are full keys with no special extraction but there is a
callback comparator. no need to declare any arguments to the callback
sub here since you know it is that and not key extract code.

  >> but what about this odd case,

  >>         sort [...], [...], [...]

  >> now that is stupid code but it could be trying to sort the refs by
  >> their address in string mode.

  DC> In which case we probably should have written it:

  DC>         sort <== [...], [...], [...]

i did ask in another post whether <== or ==> would fit in here. so that
line forces it all to be data and my silly example has the first anon
list of criteria and the rest as data. works for me so far.

  >> or it could be a sort criteria list followed by
  >> 2 refs to input records.

  DC> Only if the first array ref contains nothing but Criterion objects.

but what defines them as criteria objects? they look just like records
which could be sorted as well. i don't see any syntactical markers that
{} means criteria block vs a hash ref. DWIM guessing isn't good enough
for me here. sort in p5 already had some issues with that IIRC.

  >> as i pointed out above, i don't see why
  >> you even need to show the $^a and $^b args?

  DC> So the block knows it has two parameters.

but the callback sub knows it has two params since it has to be written
that way. sort always calls the code block with 2 params.

  >> they will be passed into just_guess that way. let 'is descending'
  >> handle the sort ordering.

  DC> But you *can't* apply C<is descending> to a Code reference.

then how did you apply 'is insensitive'? aside from how it is done
(traits and such), we need a syntax that conveys the semantics of
insensitive and descending into the sort func. it will use that info to
reverse the key order to comparison subs or modify the key merging of
the GRT to effect those flags.

what i am saying is i think that you need to go back to the drawing
board to find a clean syntax to mark those flags. note that neither the
code block nor the callback needs to see them, only the sort guts ever
needs to see them. so we are communicating info to the sort about this
key. the code block/callback only ever sees two arguments to compare and
nothing else. maybe this will clarify for you the intentions of those
flags and why they have nothing to do with the code block but rather
describe the behavior of this key.

  DC> Nor are we sure that the order *is* descending. Maybe the
  DC> C<just_guess> predicate is invariant to argument order and there
  DC> were other reasons to pass the args in that order. Or maybe we
  DC> reversed the order because we know that C<just_guess> never
  DC> returns zero, but defaults to its second argument being smaller,
  DC> in which case we needed to reverse the args so that the C<sort>
  DC> remained stable.

i just don't like the reverse args concept at all. it is not
semantically useful to sorting. sorts care about ascending and
descending, not a vs b in args.

  DC> The point is that I wanted to show a vanilla two-parameter compare
  DC> block. (And, boy, am I ever sorry I whimsically reversed the args
  DC> to indicate generality ;-)

but i am glad you did since it brings up this issue. i don't think we
should allow any args there at all as they are not needed. compare
blocks get called with 2 args. some sort of descending marker reverses
the args before the call. also in the GRT, descending causes special
data munging so the keys will sort in descending order so that has to be
passed on to the guts somehow.
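
A minimal sketch of the point above: descending is information for the sort guts, and neither the key extractor nor the comparator ever sees it. It is written in Python only so that it runs. The names `grt_sort` and `cmp_sort`, the `descending` parameter, and the packed-key layout (a 4-byte big-endian unsigned key followed by the record index) are all assumptions made for illustration, not how any Perl implementation does it.

```python
import struct
from functools import cmp_to_key


def grt_sort(records, extract, descending=False):
    """GRT-style sketch: merge each key into a byte string as it is
    extracted, sort the byte strings with the default comparison, then
    recover the records. Assumes keys are integers in 0 .. 2**32 - 1."""
    packed = []
    for index, record in enumerate(records):
        # Big-endian unsigned ints compare bytewise in numeric order.
        key = struct.pack(">I", extract(record))
        if descending:
            # The "special data munging": complement every byte so the
            # plain ascending byte sort yields descending key order.
            key = bytes(255 - b for b in key)
        # Append the record index so ties keep their input order.
        packed.append(key + struct.pack(">I", index))
    packed.sort()  # no callback is involved in the comparison
    return [records[struct.unpack(">I", p[4:])[0]] for p in packed]


def cmp_sort(records, compare, descending=False):
    """Comparator-style sketch: the sort swaps the two arguments itself
    when descending is requested; compare() is written for ascending
    order and never learns about the flag."""
    if descending:
        ordered = lambda a, b: compare(b, a)
    else:
        ordered = compare
    return sorted(records, key=cmp_to_key(ordered))
```

In both functions the flag is consumed before any user code is called, which is the separation argued for above: the extractor returns a key, the comparator gets two arguments, and only the sort knows the direction.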

  DC>         @sorted = sort {-M} @unsorted;

  >> that still wants to be cached somehow as -M is expensive.

  DC> It *is* cached. It's a one-parameter block. So it's a key extractor. So
  DC> it automagically caches the keys it extracts.

??? who and what caches it? it will get called again and again for the
same record. where is the definition that 1 param code blocks do
caching? do they all do that?

  >> assuming no internal caching

  DC> Key extractors will always cache.

so this is a key extraction feature about caching. the ST and GRT don't
need caching so that is a waste if those are used. only an orcish sort
needs caching.

  >> @sorted = sort {%M{$_} //= -M} @unsorted;

  >> i assume //= will be optimized and -M won't be called if it is
  >> cached.
  >> also where does %M get declared and/or cleared before this?

  DC> Exactly the problem. That's why key extractors always cache.

but they can't always cache. it depends on the implementation and
possibly at runtime (selecting orcish, ST or GRT based on some other
criteria).

  >> can it be
  >> done in the block:

  >> @sorted = sort {my %M ; %M{$_} //= -M} @unsorted;

  DC> If you'd gone insane and particularly wanted to do it that way, you'd
  DC> need something like:

  DC>         @sorted = sort {state %M ; %M{$_} //= -M} @unsorted;

  DC> to ensure the cache persisted between calls to the key extractor.

and will that get cleared before each sort?

  >> another -M problem is that it currently returns a float so that must be
  >> marked/cast as a float.

  >> @sorted = sort {float -M} @unsorted;

  DC> No. *Because* -M returns a number, C<sort> automatically knows to use
  DC> numeric comparison on those keys.

  >> maybe the fact that the compiler knows -M returns a float can be used to
  >> mark it internally and the explicit float isn't needed here.

  DC> Exactly.

ok for this case but not for data from a record.

  >> but data
  >> from a user record will need to be marked as float as the compiler can't
  >> tell.

  DC> It *can* tell if the elements are typed.
  DC> But, yes, most of the time if
  DC> you want to ensure numeric comparison you will explicitly prefix with
  DC> a C<+> to give the compiler a hint. Otherwise C<sort> will have to
  DC> fall back on looking at the keys that are extracted and working out at
  DC> run-time which type of comparison to use (kinda like the smartmatch
  DC> operator does).

only in the ST. the orcish does its compares on the fly without a full
prescan. and the GRT can't prescan as it munges and merges keys on the
fly.

  >> anyhow, i am glad i invoked your name and summoned you into this
  >> thread. :)

  DC> Well, that makes *one* of us.
  DC> ;-)

when are you going to get your life sorted out? :)

uri

-- 
Uri Guttman  ------  [EMAIL PROTECTED]  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org
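
For reference, a rough sketch of why the caching question in this message depends on the sort strategy. It is in Python only so that it runs, and `orcish_sort` and `st_sort` are hypothetical names. It also assumes the orcish cache lives for exactly one sort call, which is one possible answer to "will that get cleared before each sort?" and not something settled in the thread.

```python
from functools import cmp_to_key


def orcish_sort(records, extract):
    """Orcish-maneuver sketch: keys are extracted lazily inside the
    comparison and remembered in a cache, so the extractor runs once
    per distinct record. Assumes records are hashable."""
    cache = {}  # fresh for every call, so nothing leaks between sorts

    def key_of(record):
        if record not in cache:
            cache[record] = extract(record)
        return cache[record]

    def compare(a, b):
        ka, kb = key_of(a), key_of(b)
        return (ka > kb) - (ka < kb)

    return sorted(records, key=cmp_to_key(compare))


def st_sort(records, extract):
    """Schwartzian-transform sketch: extract every key up front,
    sort the (key, index) pairs, then strip the keys off again.
    No cache is needed because no key is ever asked for twice."""
    decorated = [(extract(r), i, r) for i, r in enumerate(records)]
    decorated.sort(key=lambda item: item[:2])
    return [r for _, _, r in decorated]
```

Only the first function needs a cache at all. The second has every key in hand before any comparison happens, which is also why a prescan of key types is possible there and not in the orcish case.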