Gabriel Genellina wrote:
> En Tue, 30 Jun 2009 22:52:18 -0300, Mag Gam <magaw...@gmail.com> escribió:
>
>> I am very new to python and I am in the process of loading a very
>> large compressed csv file into another format. I was wondering if I
>> can do this in a multi thread approach.
>
> Does the format conversion involve a significant processing time? If
> not, the total time is dominated by the I/O time (reading and writing
> the file) so it's doubtful you gain anything from multiple threads.
Well, the OP didn't say anything about multiple processors, so multiple threads may not help with respect to processing time. However, if the file is large and the OS can schedule the I/O in a way that avoids a seek disaster (which is hard to assure with today's hard disk storage density, though SSDs may benefit), multiple threads reading separate partial streams may still reduce the overall runtime thanks to increased I/O throughput.

That said, the OP mentioned that the data is compressed, so I doubt that I/O bandwidth is the problem here.

As another poster put it: why bother? Run a few benchmarks first to see where (and if!) things really get slow, and then check what to do about the real problem.

Stefan
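P.S. To illustrate what I mean by threads reading partial streams: a rough sketch along these lines (the file name, thread count and chunk size below are made up, and it only makes sense for an uncompressed, seekable file, since a gzip stream can't be split at arbitrary byte offsets). Plain file reads release the GIL in CPython, so the threads can overlap their I/O waits.

import os
import threading

def read_range(path, offset, length, chunk_size=1024 * 1024):
    # each worker opens its own handle and reads only its byte range
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining > 0:
            data = f.read(min(chunk_size, remaining))
            if not data:
                break
            remaining -= len(data)

def parallel_read(path, num_threads=4):
    # split the file into num_threads byte ranges and read them concurrently
    size = os.path.getsize(path)
    part = size // num_threads
    threads = []
    for i in range(num_threads):
        offset = i * part
        length = part if i < num_threads - 1 else size - offset
        t = threading.Thread(target=read_range, args=(path, offset, length))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()

if __name__ == "__main__":
    parallel_read("big_file.csv")   # made-up name; an uncompressed file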
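P.P.S. And for the "benchmark first" part, a minimal timing sketch (Python 3 style) that measures decompression, CSV parsing and the conversion step separately, so the OP can see which of the three actually dominates. The file name and convert_row() are invented placeholders, not anything from the original post.

import csv
import gzip
import time

def convert_row(row):
    # hypothetical stand-in for the real CSV-to-other-format conversion
    return "\t".join(row)

def timed(label, func):
    start = time.time()
    func()
    print("%s: %.2f seconds" % (label, time.time() - start))

def decompress_only(path):
    # raw gunzip throughput: read and throw away the uncompressed bytes
    with gzip.open(path, "rb") as f:
        while f.read(1024 * 1024):
            pass

def decompress_and_parse(path):
    # add the CSV parsing cost on top of decompression
    with gzip.open(path, "rt", newline="") as f:
        for row in csv.reader(f):
            pass

def full_conversion(path):
    # add the actual per-row conversion work
    with gzip.open(path, "rt", newline="") as f:
        for row in csv.reader(f):
            convert_row(row)

if __name__ == "__main__":
    path = "big_file.csv.gz"   # made-up name, substitute the real file
    timed("decompress only", lambda: decompress_only(path))
    timed("decompress + parse", lambda: decompress_and_parse(path))
    timed("decompress + parse + convert", lambda: full_conversion(path))

If the first number already accounts for most of the runtime, threads won't buy much; if the third is far larger than the second, it's the conversion code that deserves attention.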