Re: FILE I/O Performance

George Gatling (Contractor) Wed, 26 May 2004 11:23:33 -0700

David

This would work if I were defining the specification of the file. As it is, I am writing a library to create files that someone else has defined (namely, FITS files). One of the limitations of this file format is that pointers are not allowed. The data *must* be in sequential order, hence the requirement to insert rather than append, and all the ensuing file shuffling. Since my original post I have been doing some preliminary work using a separate temporary file for each table in the target file. This seems promising and substantially reduces both the data shuffling and the code complexity. The only drawbacks I can see so far are the obligation to manage more file pointers (to the temporary files) and the possibly big job of concatenating the files when the target file is closed (or flushed). Fortunately, these operations are not likely to be in a loop, so it is not too painful if they are "slow".

George
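
For illustration only, here is a minimal Python sketch of the temporary-file approach described above (the actual library is in LabVIEW, so this is not its code). The class name SequentialTableWriter and its methods are hypothetical, and the sketch ignores everything specific to FITS, such as headers and block padding. It only shows that each append touches one temporary file and that the sequential layout is produced once, on close.

```python
import shutil
import tempfile


class SequentialTableWriter:
    """Hypothetical helper: one temporary file per table, joined in order on close."""

    def __init__(self, target_path, table_names):
        self.target_path = target_path
        # dicts preserve insertion order (Python 3.7+), so this is the output order.
        self._tables = {name: tempfile.TemporaryFile() for name in table_names}

    def append(self, table_name, data):
        # Appending to one table never moves the bytes of any other table.
        self._tables[table_name].write(data)

    def close(self):
        # Concatenate the temporary files, in table order, into the target file.
        with open(self.target_path, "wb") as out:
            for tmp in self._tables.values():
                tmp.seek(0)
                shutil.copyfileobj(tmp, out)
                tmp.close()
```

Writing alternately to two tables in a loop then costs one append per call, and the data is copied only once more when close() runs.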

At 02:06 PM 5/26/2004, you wrote:

How big are the "chunks" that need to be inserted? If they are relatively large, you can simply append each chunk to the end of the file and keep a list of pointers to the chunks. You can shuffle the pointers at will to put them in the proper order, while each one points to a different, nonsequential part of the data. You can then keep the pointer list in a separate file, or in a dedicated large space at the beginning of the data file.
David
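
A rough Python sketch of the pointer-list idea in the message above, for illustration only. The class ChunkIndexedFile and its method names are hypothetical, and the index is kept in memory here rather than in a separate file or a reserved space at the start of the data file. Chunks are always appended, and the index records which table each chunk belongs to and where it lives.

```python
import io


class ChunkIndexedFile:
    """Hypothetical helper: append-only chunks plus a list of pointers to them."""

    def __init__(self, stream):
        self.stream = stream  # any seekable binary stream
        self.index = []       # (table_name, offset, length), in write order

    def append_chunk(self, table_name, data):
        # New data always goes at the end, so nothing already written is moved.
        offset = self.stream.seek(0, io.SEEK_END)
        self.stream.write(data)
        self.index.append((table_name, offset, len(data)))

    def read_table(self, table_name):
        # Follow the pointers for one table to reassemble it in order.
        parts = []
        for name, offset, length in self.index:
            if name == table_name:
                self.stream.seek(offset)
                parts.append(self.stream.read(length))
        return b"".join(parts)
```

For example, ChunkIndexedFile(io.BytesIO()) can be used to try this in memory. The data on disk is interleaved, so this only suits a format that permits such an index.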
I am still working on our new file I/O library and I have a performance/reliability question. The specification of the file format requires the data to be sequential in the file, so if I want to add data to a table at the beginning of the file, I am required to shift all of the following tables to make room for the new data. Therefore, if a program were writing to two tables in the file in a loop, there would be a lot of data shuffling, and it would get worse the longer the loop ran. Does anyone have any experience with this sort of thing? Am I worried about something that I wouldn't even be able to notice in the end? Our files routinely reach several hundred megabytes, and gigabyte files are not unheard of. What would be the performance hit of inserting data at the beginning of a file of several hundred megabytes?
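
To make the cost in question concrete, here is a small Python sketch of an in-place insert, for illustration only. The function insert_bytes and its block_size parameter are hypothetical. It shifts everything after the insertion point toward the end of the file and returns the number of bytes it had to move, which is the whole tail of the file on every call.

```python
import os


def insert_bytes(path, offset, data, block_size=1 << 20):
    """Insert data at offset by shifting the tail of the file; return bytes moved."""
    gap = len(data)
    with open(path, "r+b") as f:
        size = f.seek(0, os.SEEK_END)
        # Copy the tail last block first, so that no block is overwritten
        # before it has been read.
        pos = size
        while pos > offset:
            n = min(block_size, pos - offset)
            pos -= n
            f.seek(pos)
            block = f.read(n)
            f.seek(pos + gap)
            f.write(block)
        # The gap is now free; write the new data into it.
        f.seek(offset)
        f.write(data)
    return size - offset
```

Since each insert near the start rewrites nearly the whole file, the total number of bytes moved by repeated inserts in a loop grows roughly with the square of the number of inserts.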

An alternative I am considering is to maintain each table in a separate temporary file while the file is open. When the file is closed, LabVIEW would concatenate the temporary files to produce the output file according to the format specification. The downside is that I am now managing who-knows-how-many temporary file refnums while the file is open. This is not a huge deal from the library programming point of view, but are there issues in LabVIEW with having many files open at the same time? Are there other disadvantages to this approach that I have not thought of?

Finally, does anyone have other suggestions for how I could cleanly write the sorts of files I am describing here?
Thanks
George Gatling
Applied Technology Division, SFA Inc.
Space Physics Simulation Chamber
US Naval Research Laboratory
202-404-5405 (phone)
202-767-3553 (fax)
If trees could scream, would we be so cavalier about cutting them down?
We might, if they screamed all the time, for no good reason.  --Jack Handy
--
David Ferster
Actimetrics, Inc.
1621 Elmwood Ave., Wilmette, IL 60091
http://www.actimetrics.com
847/922-2643 Phone
847/589-8103 FAX


