AMD wrote:
> Hello,
>
> I need to split a very big file (10 gigabytes) into several thousand
> smaller files according to a hash algorithm, I do this one line at a
> time. The problem I have is that opening a file using append, writing
> the line and closing the file is very time consuming. I'd rather have
> the files all open for the duration, do all writes and then close them
> all at the end.
> The problem I have under windows is that as soon as I get to 500 files I
> get the Too many open files message. I tried the same thing in Delphi
> and I can get to 3000 files. How can I increase the number of open files
> in Python?
>
> Thanks in advance for any answers!
>
> Andre M. Descombes

Try something like this:

Instead of opening several thousand files:

 * Create several thousand lists.
 * Open the input file and process each line, dropping it into the
   correct list.
 * Whenever a single list passes some size threshold, open its file
   in append mode, write the batch, and immediately close the file.
 * Similarly at the end (or when the total of all lists passes some
   size threshold), loop through the several thousand lists, opening,
   writing, and closing.

This will keep the open/write/close operations to a minimum, and
you'll never have more than 2 files open at a time (the input file and
one output file). Both of those are wins for you.

Gary Herron
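
P.S. The steps above can be sketched roughly as follows. This is only a minimal illustration, not tested against a 10 GB file. The function name split_by_hash, the use of zlib.crc32 as the hash, the part_NNNNN.txt file naming, and both threshold defaults are assumptions made for the example; substitute whatever hash you are actually using and tune the thresholds to the memory you have.

```python
import os
import zlib
from collections import defaultdict


def split_by_hash(input_path, output_dir, num_buckets=3000,
                  bucket_limit=1000, total_limit=200000):
    """Split input_path into per-bucket files, buffering lines in lists.

    All names and defaults here are hypothetical, chosen for illustration.
    """
    os.makedirs(output_dir, exist_ok=True)
    buffers = defaultdict(list)   # bucket number -> list of pending lines
    total = 0                     # lines currently held in all lists

    def flush(bucket):
        # Open, write the whole batch, and close again right away.
        lines = buffers.pop(bucket)
        path = os.path.join(output_dir, "part_%05d.txt" % bucket)
        with open(path, "ab") as out:
            out.writelines(lines)
        return len(lines)

    # Binary mode keeps the original line endings untouched.
    with open(input_path, "rb") as src:
        for line in src:
            # Assumption: crc32 stands in for the poster's hash algorithm.
            bucket = zlib.crc32(line) % num_buckets
            buffers[bucket].append(line)
            total += 1
            if len(buffers[bucket]) >= bucket_limit:
                # One list passed its threshold: write just that list.
                total -= flush(bucket)
            elif total >= total_limit:
                # Too much buffered overall: write every pending list.
                for b in list(buffers):
                    total -= flush(b)

    # End of input: write whatever is still sitting in the lists.
    for b in list(buffers):
        flush(b)
```

You would call it as something like split_by_hash("big.txt", "out") (both paths made up). Note that the output files are opened in append mode, so start with an empty output directory or a second run will add to the files left by the first.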