Tim Kientzle wrote:
I'm currently running a gamut of tests (500 tests, per package -- 128 total on my server), and outputting all data to CSV files to interpret later, using another Perl script to interpret calculated averages and standard deviations.

Excellent!  Much-needed work.

Using basic printf(3) calls with clock_gettime(2), I have determined that the majority of the issues are disk-bound (as Tim Kientzle put it).

Next question:  What are those disk operations and are any
of them redundant?

The scope of my problem is not to analyze tar,...

I've spent the last three years+ doing exactly that.
Make sure you're using the newest bsdtar/libarchive,
which has some very noticeable performance improvements.

but I've discovered that a lot of time is spent in reading and interpreting the +CONTENTS and related files (most notably in parsing commands to be honest).

Oh?  That's interesting.  Is data being re-parsed (in which case
some structural changes to parse it once and store the results
may help)?  Or is the parser just slow?

Will post more conclusive results tomorrow once all of my results are available.

I don't follow ports@ so didn't see these "conclusive results"
of yours.  I'm very interested, though.

Tim Kientzle
Some extra notes:
- My tests are still running, but almost done (unfortunately I won't be able to post any results before tonight since I'm going to work now). It's taking a lot longer than I originally thought it would (I've produced several gigabytes of logfiles and CSV files... eep).
- I placed the timing calls around what I considered pkg_install-specific sensitive areas, i.e. locations where tar was run or the meta files were processed.
- I tried implementing a small buffering technique (read in 10 lines at once, parse the 10 lines, and repeat, instead of read 1 line and parse, then repeat) around the +CONTENTS file parsing function, and the majority of the time it yielded good results (the buffering technique won over the non-buffering technique 9 times out of 10). Given that success I'm going to try implementing the file reading in terms of fgetc(3) to properly read in a number of lines all at once, and see what happens (my hunch is those results may be more favorable).
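
To make the buffering idea in the last note concrete, here is a minimal sketch of "read 10 lines, parse 10 lines, repeat". It is not the pkg_install code: it uses fgets(3) rather than fgetc(3), the names read_batch, parse_batch, parse_stream and struct plist_counts are hypothetical, the 1024-byte line limit is an assumption, and the "parser" is only a stand-in that counts '@' command lines versus file lines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BATCH_LINES	10	/* lines read per batch, as in the experiment */
#define LINE_MAX_LEN	1024	/* assumed maximum line length */

struct plist_counts {
	size_t commands;	/* lines starting with '@' */
	size_t files;		/* all other non-empty lines */
};

/* Read up to BATCH_LINES lines from fp; return how many were read. */
size_t
read_batch(FILE *fp, char lines[BATCH_LINES][LINE_MAX_LEN])
{
	size_t n = 0;

	assert(fp != NULL);
	while (n < BATCH_LINES && fgets(lines[n], LINE_MAX_LEN, fp) != NULL) {
		lines[n][strcspn(lines[n], "\n")] = '\0';	/* strip newline */
		n++;
	}
	return n;
}

/* Stand-in for the real parser: classify each line in the batch. */
void
parse_batch(char lines[BATCH_LINES][LINE_MAX_LEN], size_t n,
    struct plist_counts *counts)
{
	for (size_t i = 0; i < n; i++) {
		if (lines[i][0] == '@')
			counts->commands++;
		else if (lines[i][0] != '\0')
			counts->files++;
	}
}

/* Read and parse the whole stream, one batch at a time. */
struct plist_counts
parse_stream(FILE *fp)
{
	static char lines[BATCH_LINES][LINE_MAX_LEN];
	struct plist_counts counts = {0, 0};
	size_t n;

	while ((n = read_batch(fp, lines)) > 0)
		parse_batch(lines, n, &counts);
	return counts;
}
```

Note that stdio already buffers the underlying read(2) calls, so any gain from batching at this level probably comes from how the parse loop is structured rather than from fewer disk reads; measuring both variants, as described above, is the way to find out. Lines longer than the assumed limit would be split across entries in this sketch.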
Thanks,
-Garrett
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers