Doug Barton wrote:
> Oliver Fromme wrote:
> > I assume, with "this" you mean my solution to the slow
> > shell loop problem (not quoted above), not Yoshihiro Ota's
> > awk proposal?
>
> I meant the solution using comm, sorry. (I forgot to mention that I
> would probably use cmp here, but that's a personal preference.)

I see. No problem.

However, I think cmp wouldn't work here, because cmp only
detects whether there is a difference between two files.
In this case we need to know whether one file is a subset of
the other: For every hash there must be a .gz file, but it
doesn't hurt if there are more files. So the list of hashes
can be a subset of the list of .gz files; the two lists don't
have to be equal.

While I was at it, I skimmed through the cmp source and
found a bug (or inefficiency): When the -s option is
specified (i.e. silent, exit code only), it would be
sufficient to terminate when the first difference is
encountered. But it always compares the whole files.
I'll try to make a patch to improve this.

> > Yes, it can. I already explained pretty much all of that
> > (useless cat etc.) in my first post in this thread. Did
> > you read it?
>
> Yes, I was attempting to agree with you. :)

OK, sorry. I misunderstood. :)

> > My suggestion (after a small correction by
> > Christoph Mallon) was to replace the cat|cut|grep|cut
> > sequence with this single awk command:
> >
> >     awk -F "|" '$2 ~ /^f/ {print $7}' "$@"
> >
> > For those not fluent with awk, it means this:
> >  - Treat "|" as field separator.
> >  - Search for lines where the second field matches ^f
> >    (i.e. it starts with an "f").
> >  - Print the 7th field of those matching lines.
>
> Like I said, I haven't seen the files, but this looks good at first
> blush. That said, the generation of the hash list file is just a drop
> in the bucket. The real inefficiency in this function is the test -f
> for 64k files, one at a time.

Yes, definitely.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606, Geschäftsfuehrung:
secnetix Verwaltungsgesellsch.
mbH, Handelsregister: Registergericht München, HRB 125758,
Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD services, products and more: http://www.secnetix.de/bsd

"We will perhaps eventually be writing only small modules which are
identified by name as they are used to build larger ones, so that devices
like indentation, rather than delimiters, might become feasible for
expressing local structure in the source language."
        -- Donald E. Knuth, 1974
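
P.S. For readers who want to see the subset check from the comm
discussion above spelled out, here is a rough sketch. It is not the
actual patch from this thread. The function name missing_hashes, its
two arguments, and the assumption that each hash corresponds to a file
named <hash>.gz in a single directory are all made up for illustration.
It lists the directory once instead of running test -f per hash, then
lets comm report the hashes that have no file.

```shell
#!/bin/sh
# missing_hashes HASHLIST DIR   (hypothetical helper, for illustration only)
# Prints every hash from HASHLIST that has no DIR/<hash>.gz file.
# Empty output means the hash list is a subset of the .gz files.
missing_hashes() {
    hashlist=$1
    dir=$2
    tmp=$(mktemp "${TMPDIR:-/tmp}/gzlist.XXXXXX") || return 1
    # List the directory once, keep only *.gz names, strip the suffix.
    # LC_ALL=C makes sort and comm agree on the collation order.
    ls "$dir" | sed -n 's/\.gz$//p' | LC_ALL=C sort -u > "$tmp"
    # comm -23 prints the lines that occur only in the first input,
    # i.e. the hashes for which no .gz file exists.
    LC_ALL=C sort -u "$hashlist" | LC_ALL=C comm -23 - "$tmp"
    rm -f "$tmp"
}
```

A caller could then test the result with something like
[ -z "$(missing_hashes hashes.txt files)" ], where hashes.txt and files
are placeholder names. Extra .gz files in the directory are ignored,
which matches the "subset, not equal" requirement.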