>>>>> "PM" == Ph Marek <[EMAIL PROTECTED]> writes:
>> ... >> so here is a (very rough and probably broken) syntax idea building on >> that: >> >> sort :key { :descend :string .foo('bar').substr( 10, 3) } >> >> :key { :int .foo('baz') } >> :key { :float .foo('amount') } @unsorted ; PM> I see a kind of problem here: If the parts of the key are not PM> fixed length but can vary you can put them in strings *only* after PM> processing all and verifying the needed length. varying keys has no effect on this. you still build up a merged key of whatever length is needed for each record. perl5 does that with pack or .= or sprintf (all used in variants of the GRT). PM> Example: PM> sort :key { :descend :string .foo('bar') } PM> :key { :int .foo('baz') } PM> :key { :float .foo('amount') } @unsorted ; PM> Now .foo('bar') isn't bounded with any length - so you don't know how much PM> space to reserve. who needs to reserve space? you just let the guts do a realloc like p5 does now. PM> And I believe that PM> - generating keys on every list element PM> - storing them into a array (array of array) and PM> - after having processed all checking the length, and PM> - now generate the to-be-sorted-strings PM> - sort PM> isn't the optimal way. PM> BTW: this requires that *all* keys are generated. PM> In cases like PM> - by name, PM> - by age, PM> - by height, PM> - by number of toes left, PM> - and finally sort by the social security number PM> most of the extractions (and possibly database-queries of calculations or PM> whatever) will not be done - at least in the current form of PM> sort { $a->{"name"} cmp $b->{"name"} || PM> $a->{"age"} <=> $b->{"age"} || PM> ... PM> That is to say, I very much like the syntax you propose, but I'm PM> not sure if pre-generating *every* key-part is necessarily a PM> speed-up. see the paper for more on that. the work you do in the callback, even if you only do the first compare is O(N log N). in large sorts that will outweigh the O(N) work on extracting the keys one time. and as with all algorithms, real world cases will differ. short sorts can be faster with complex comparisons because of the initial overhead. maybe the internal code can decide to do that instead of a GRT based on the sort array size. the key extract code is the same in both cases so that would be an interesting optimization. PM> If there are expensive calculations you can always cut them short PM> be pre-calculating them into a hash by object, and just query this PM> in sort. again, the goal is to eliminate work in the comparison. but as i said, this is an implementation problem. we should focus on the language level needs of the sort func and not how it is done internally. i shouldn't have brought up the internal design here. PM> Also I fear that the amount of memory necessary to sort an array PM> of length N is not N*2 (unsorted, sorted), but more like N*3 PM> (unsorted, keys, sorted), which could cause troubles on bigger PM> arrays .... as larry said, virtual memory. and there are ways to reduce the copies internally with pointers (something p5 couldn't do at p5 code level). other than copying the extracted keys, you could leave the original array alone and sort it in place by indexing it based on the sorted pointers. and since the records will all be the same PMC type (usually), you could just swap the array's pointers and not move any data around. so no extra copies would be made other than key extraction. uri -- Uri Guttman ------ [EMAIL PROTECTED] -------- http://www.stemsystems.com --Perl Consulting, Stem Development, Systems Architecture, Design and Coding- Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org