FileBench aficionados,

I have just uploaded a webrev of a set of bug fixes and other modifications to FileBench to the OpenSolaris.org webrev site. This set of changes started out as a modification to essentially merge "files" into "filesets", but ended up addressing a number of related issues. Here are the details:

This started out as just a fix for bug:

6601818 Turn FileBench "files" into filesets with 1 entry

However, to do this properly, I found it helped to fix or implement five other outstanding bugs and feature requests:

6601341 flowop_endop() needs to have the actual number of bytes of I/O done passed to it
6581691 pre-allocation is molasses
6568378 Flowop reads and writes should be consistent about memory buffer usage
6595374 "tf_memsize smaller than IO size" error when reading/writing large file
6564960 filebench should handle larger iosize's or gracefully error out with a nicer message

To make reviewing easier, here are some comments about the individual bugs that were fixed and the files that were changed as part of each bug's specific fix.

6601818 Turn FileBench "files" into filesets with 1 entry:

Basically, this involves changes to parser_gram.y so that the "define files" command creates a fileset with a single entry. It is implemented by creating a new subroutine, parser_fileset_define_common(), which allocates a fileset and fills in all attributes that are common to files and filesets. The old parser_file_define() calls that routine, then sets default values for the fileset attributes that don't apply when only a single entry (i.e. a "file") is involved. Similarly, parser_fileset_define() first calls the common routine, then sets the attributes specific to filesets.

Also, fileset.c and fileset.h were modified to support raw devices as a special case for filesets (replacing the raw device mode for files, which, incidentally, was not enabled).

During testing I also discovered a subtle difference between files and filesets for "non allocated" files. Files actually create a file of size 0 at the specified path in that case, while filesets don't create a file at all. After consultation with other Sun engineers, I decided to let "files" adopt the fileset behavior of "no file", and to modify the two workloads that depend on having a file of size 0 so that they specifically request the "alloc" of a 0 length file.
Thus, for both filesets (as was the case before) and the "new" files, leaving off the "alloc" attribute will result in no initial file creation on the disk.

As part of this fix, I needed to preserve the parallel allocation feature of files, so filesets can now be allocated in parallel, with up to 32 allocation threads running across all filesets and their constituent files, addressing:

6581691 pre-allocation is molasses

The changes to turn files into filesets also involved removing the "files" code path from the seven flowops that do I/O, and replacing it with some special code to handle raw devices. There were also a few changes to the create / open / close / delete flowops. There was a lot of (almost) duplicate code here, which I was planning to eliminate as part of:

6568378 Flowop reads and writes should be consistent about memory buffer usage

so I decided to tackle that one too. I created a common routine to select filesetentries and determine memory buffer pointers and file offsets. As mentioned, the "files" portion disappeared as part of that. Also, the selection of memory locations to read from or write to was unified. If thread memory has been specified (tf_memsize > 0), then a random offset into tf_mem is calculated and passed back to the calling routine. Otherwise, a private fo_buf is allocated or reused, with its location passed back. The private buffer is created large enough to hold fo_iosize bytes, which may be much larger than the 1 MB that the old method always allocated. We therefore keep track of the size of the buffer, and free() it and malloc() a new one if the existing one is too small. This change thus also addresses:

6564960 filebench should handle larger iosize's or gracefully error out with a nicer message

If tf_mem is in use, the code already provides an error message if iosize is larger than the tf_mem size, and that now applies to all seven I/O flowops (read, write, aiowrite, readwholefile, writewholefile, appendfile, appendfilerand).
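
To make the unified buffer selection described above concrete, here is a minimal sketch. It is not the code in the webrev: the struct types, the function name sketch_get_iobuf(), and the fo_buf_size field are hypothetical stand-ins, and only the names tf_mem, tf_memsize and fo_buf come from the description. It also assumes a plain rand() for the random offset, which the real code need not use.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the thread and flowop structures. */
typedef struct {
    char   *tf_mem;      /* shared thread memory, if any */
    size_t  tf_memsize;  /* 0 means no thread memory was specified */
} sketch_thread_t;

typedef struct {
    char   *fo_buf;      /* private buffer, allocated on demand */
    size_t  fo_buf_size; /* current size of fo_buf (assumed field) */
} sketch_flowop_t;

/*
 * Return a buffer that can hold iosize bytes, or NULL on error.
 * With thread memory: a random offset into tf_mem that leaves room
 * for iosize bytes. Without it: the private buffer, grown if needed.
 */
char *
sketch_get_iobuf(sketch_thread_t *tf, sketch_flowop_t *fo, size_t iosize)
{
    if (iosize == 0)
        return NULL;    /* the sketch requires a nonzero I/O size */

    if (tf->tf_memsize > 0) {
        if (iosize > tf->tf_memsize) {
            fprintf(stderr, "tf_memsize smaller than IO size\n");
            return NULL;
        }
        /* Offsets in [0, span) keep the whole I/O inside tf_mem. */
        size_t span = tf->tf_memsize - iosize;
        size_t off = (span == 0) ? 0 : (size_t)rand() % span;
        return tf->tf_mem + off;
    }

    /* Reuse the private buffer when it is already big enough. */
    if (fo->fo_buf != NULL && fo->fo_buf_size >= iosize)
        return fo->fo_buf;

    /* Otherwise free the old buffer and allocate a larger one. */
    free(fo->fo_buf);
    fo->fo_buf = malloc(iosize);
    fo->fo_buf_size = (fo->fo_buf != NULL) ? iosize : 0;
    return fo->fo_buf;
}
```

A caller would invoke this once per I/O and treat a NULL return as a failed flowop. The point of tracking the buffer size is that a later, larger iosize triggers a reallocation instead of overrunning a fixed 1 MB buffer.
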
If tf_mem is not specified or is set to size 0, then private buffers (fo_buf) of iosize bytes will be allocated.

The old readwholefile and writewholefile ignored any supplied iosize (actually, they set it to the file size AFTER the first execution of the flowop), and arbitrarily broke the request into 1 megabyte chunks. This size gives full performance with many current disk drives and file systems, but not all, and can be too small for full performance with some RAIDed systems. So, for backwards compatibility, if iosize is 0 (what the legacy workloads use for those two flowops), they will read or write the entire file in one I/O. Note that the thread memory, if it exists, must then be at least as large as the largest size the file(s) can be. If iosize is set, the whole file will be read or written in "iosize" increments (or whatever is left of the file on the last I/O). So, if you have thread memory of 10 MB, you can set iosize to 10 MB (or less) to prevent "tf_mem too small" errors.

Changing readwholefile and writewholefile to do multiple iosize transfers (instead of multiple 1 MB transfers) until the whole file is read or written necessitated fixing:

6601341 flowop_endop() needs to have the actual number of bytes of I/O done passed to it

so that correct accounting of transferred bytes occurs. This also makes the accounting more accurate for sequential reads, which can read less than iosize when they hit the end of the file. Finally, it fixes:

6595374 "tf_memsize smaller than IO size" error when reading/writing large file

which was caused by the way readwholefile and writewholefile manipulated iosize so that the old flowop_endop() would (almost) do correct accounting of bytes transferred.

The webrev for all of this has been posted to OpenSolaris.org at:

http://cr.opensolaris.org/~dreww/filebench_files2filesets

Looking forward to your comments.

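
For reference, the chunked whole-file transfer and byte accounting described above can be sketched roughly as follows. This is an illustration only, not the code in the webrev: the function name sketch_read_whole_file() and its parameters are hypothetical, and it uses plain read() on an already-open descriptor. The returned total stands for the "actual number of bytes of I/O done" that the accounting needs.

```c
#include <sys/types.h>
#include <unistd.h>
#include <stddef.h>

/*
 * Read up to file_size bytes from fd in chunks of at most iosize bytes.
 * An iosize of 0 means "one I/O for the whole file" (the legacy case).
 * buf must hold at least one chunk. Returns the number of bytes
 * actually read, or -1 on a read error.
 */
ssize_t
sketch_read_whole_file(int fd, char *buf, size_t file_size, size_t iosize)
{
    size_t chunk = (iosize == 0) ? file_size : iosize;
    size_t total = 0;

    while (total < file_size) {
        /* The last I/O transfers whatever is left of the file. */
        size_t want = file_size - total;
        if (want > chunk)
            want = chunk;

        ssize_t got = read(fd, buf, want);
        if (got < 0)
            return -1;
        if (got == 0)
            break;      /* hit end of file earlier than expected */

        /* Count what was actually transferred, not what was asked for. */
        total += (size_t)got;
    }
    return (ssize_t)total;
}
```

Counting the bytes returned by each read(), instead of assuming every I/O moved a full iosize, is what keeps the totals right for a short final chunk or an early end of file.
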
Drew Wilson