"D'Arcy J.M. Cain" <da...@druid.net> writes: > Just curious, what database were you using that wouldn't keep up with > you? I use PostgreSQL and would never consider going back to flat > files.
Try making a file with a billion or so names and addresses, then
compare the speed of inserting that many rows into a postgres table
against the speed of copying the file.

> The only thing I can think of that might make flat files faster is
> that flat files are buffered whereas PG guarantees that your
> information is written to disk before returning

Don't forget all the shadow page operations and the index operations,
and that a lot of these operations require reading as well as writing
remote parts of the disk, so buffering doesn't help avoid every disk
seek.

Generally when faced with this sort of problem I find it worthwhile to
ask myself whether the mainframe programmers of the 1960's-70's had to
deal with the same thing, e.g. when sending out millions of phone
bills, or processing credit card transactions (TPF), then ask myself
how they did it.  Their computers had very little memory or disk space
by today's standards, so their main bulk storage medium was mag tape.
A heck of a lot of these data processing problems can be recast in
terms of sorting large files on tape, rather than updating a database
one record at a time on disk or in memory.  And that is still what
(e.g.) large search clusters spend a lot of their time doing (look up
the term "pennysort" for more info).
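For what it's worth, here is roughly what that tape-era batch pattern
looks like in Python: a sorted master file and a sorted transaction
file merged in one sequential pass to produce a new master, instead of
poking at an indexed table one row at a time.  The file names and the
one-record-per-line "key,value" layout are just assumptions for the
sketch, not anything from the original problem:

    # Minimal sketch of the master-file/transaction-file pattern:
    # both inputs are sorted by key, so one sequential pass produces
    # the updated master -- no random seeks, no per-record index
    # maintenance.

    def records(path):
        """Yield (key, value) pairs from a sorted 'key,value' text file."""
        with open(path) as f:
            for line in f:
                key, _, value = line.rstrip("\n").partition(",")
                yield key, value

    def merge_update(master_path, trans_path, out_path):
        """Write a new master file; a transaction record replaces
        (or adds) the master record with the same key."""
        with open(out_path, "w") as out:
            master = records(master_path)
            trans = records(trans_path)
            m = next(master, None)
            t = next(trans, None)
            while m is not None or t is not None:
                if t is None or (m is not None and m[0] < t[0]):
                    out.write("%s,%s\n" % m)
                    m = next(master, None)
                else:
                    out.write("%s,%s\n" % t)
                    # same key: the transaction supersedes the old record
                    if m is not None and m[0] == t[0]:
                        m = next(master, None)
                    t = next(trans, None)

    merge_update("master.txt", "transactions.txt", "master.new.txt")

Sort the transaction file first (an external sort, or just the Unix
sort utility) and the whole update is sequential I/O, which is exactly
what those tape drives were good at.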