Danl001 wrote:
>
> I do not have access to the sort operation. All I have is a file that is
> "sorted" but I don't know exactly the mechanism by which it was sorted.
> What I am trying to do is write a comparison function--given any two
> lines in this file, return -1, 0, 1 as perl's cmp function does. I don't
> have to sort the whole file, I just need to be able to tell, given two
> lines, which should come before the other in the sorted ordering. (I
> need this to do a binary search over the file, which is quite large).
>
> Anyhow, I'm posting the function I wrote to do this. It is quite long
> and to me is naive. It is also pretty slow. I have tested the function in a
> larger script that I wrote. As a method of comparison, the script takes
> 23 sec to run over a 500k file using my function. When I substituted
> perl's cmp for my function, the same script ran in 12 sec. I'd really
> like to get this running more quickly as I have to run my script on
> files MUCH larger than 500k.
>
> I'm thinking the way the file is sorted is something simple, yet
> something I don't recognize! As a result, you'll see that my method is
> probably very over-complicated.
>
> I have also posted some more data that is representative of what I have
> to work with. Both the comparison function I'm using and the sample data
> are attached if anyone wants to check it out. I appreciate any suggestions.

Hello Dan.

Thanks for the data. I assume it's just the first part of a huge file; if
you end up needing a lot of help with this I would consider putting the
entire thing up on a Web site that we can all access.

In the meantime, the data you gave sorts into exactly the same order with
just

  chomp (my @data = <DATA>);
  my @sorted = sort @data;

except that the last two records:

  ABC-MARKET.ABC-MARKET
  ABC-MARKET

are reversed.
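
If it helps, here is a minimal sketch of one way to find such pairs: walk the list and report every adjacent pair that Perl's built-in cmp considers out of order. The name out_of_order and the first sample record are my own inventions for illustration; only the last two records come from your data.

```perl
use strict;
use warnings;

# Return the index pairs [i, i+1] where the list is not in
# ascending order according to Perl's built-in cmp.
sub out_of_order {
    my @lines = @_;
    my @bad;
    for my $i (0 .. $#lines - 1) {
        push @bad, [$i, $i + 1] if ($lines[$i] cmp $lines[$i + 1]) > 0;
    }
    return @bad;
}

# Hypothetical sample: 'AAA-EXAMPLE' is made up, the other two
# records are the pair mentioned above.
my @sample = ('AAA-EXAMPLE', 'ABC-MARKET.ABC-MARKET', 'ABC-MARKET');

for my $pair (out_of_order(@sample)) {
    print "out of order: $sample[$pair->[0]] / $sample[$pair->[1]]\n";
}
```

Running something like this over the whole file should show whether plain cmp disagrees with the file's order in only a few places or in many.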

You must be careful: as I hinted in my last post, a sort that looks like

  sort { 0 } @data;

will leave the list untouched, and there may well be several pairs
of records which are, 'sortwise', equal, and could therefore appear in
either order.
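
To illustrate what I mean by 'sortwise' equal, here is a sketch with a made-up rule that compares only the text before the first '.'. I am not suggesting this is the rule used on your file; cmp_before_dot is purely hypothetical. Under it, your last two records compare as equal, so a sort using it has no reason to put either one first.

```perl
use strict;
use warnings;

# Hypothetical rule, NOT known to be the one used on the real file:
# compare only the text before the first '.'.
sub cmp_before_dot {
    my ($x, $y) = @_;
    my ($kx) = $x =~ /^([^.]*)/;   # always matches, possibly empty
    my ($ky) = $y =~ /^([^.]*)/;
    return $kx cmp $ky;
}

my @records = ('ABC-MARKET.ABC-MARKET', 'ABC-MARKET');

# Both records have the key 'ABC-MARKET', so the result is 0.
print cmp_before_dot(@records), "\n";

# Sorting with this rule cannot distinguish the two records.
my @sorted = sort { cmp_before_dot($a, $b) } @records;
print "$_\n" for @sorted;
```

If your real comparison has ties like this, a binary search can land on either record of an equal pair, so you may need to look at its neighbours as well.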

Rob


