At 06:59 PM 3/31/06, Mr. Shawn H. Corey wrote:

On Fri, 2006-31-03 at 15:45 -0800, Tom Phoenix wrote:
> You should loop over the input, pushing each item on to an array. If
> at any time you have 2000 items in the array, sort them and discard
> any you don't want to keep.
>
> $#data = 999 if $#data > 999; # OBperl: one way to discard elements
>
> When you finish looping, sort-and-discard again. You'll never need
> more than 2N items in memory at any given time. Does that algorithm
> work for your needs?

Sorting is not necessary. If he keeps an array of the best, that means
lowest, records then all he has to do is compare every new entry with
the highest. This is called "fail early." This means, if it's going to
fail, it should at the earliest opportunity. If it succeeds then it
searches down thru the list, to find its place. This is called "succeed
early." Given that the procedure can flip between these two methods, it
is faster than any sort.


I'm not the OP, but I have a script with a similar problem. The script has some logic that generates many (thousands of billions) of combinations from a little bit of data and only the best 100 combos are output. For each combination produced by script, a value is calculated. I maintain an array of "best" results so far; if size of this array is <100, then a "new" result is automatically added. If there are already 100 items, I find the "worst" entry in the array. If the "new" entry is better than the "worst", the "new" one replaces the "worst".

I am not sure how your reference to "fail early" and "succeed early" apply to this situation.

At the moment, the array is left unsorted. If I use a sorted array, it needs to resorted every time a "worst" entry is replaced by a "new" entry. Can I avoid sorting the array every iteration? How else to find the "worst" entry? If I replace "worst" with a "new" entry, doesn't the array need to be resorted? How does the cost of searching an unsorted array compare to resorting on every update and searching a sorted array.

Tom's idea about growing to 200, then chopping back to 100 also sounds interesting.

Frank

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>


Reply via email to