I have an application where I need to create 50K files spread
uniformly across 50 folders in Python. The content of each file can be
its own name repeated 10 times. I wrote code using normal for loops,
but it is taking hours to finish. Can someone please share the code
for it using multithreading? As I'm new to Python, I know little about
threading concepts.
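Since code was requested, here is a minimal sketch of the task using a thread pool. The `out` base directory and the `dir_NN`/`file_NN_NNNN.txt` naming are my own assumptions, and `FILES_PER_DIR` is kept small here for a quick trial; set it to 1000 for the full 50K-file run:

```python
import os
from concurrent.futures import ThreadPoolExecutor

BASE = "out"         # hypothetical output directory -- adjust to taste
N_DIRS = 50
FILES_PER_DIR = 10   # set to 1000 for the full 50K-file run

def fill_dir(d):
    """Create one folder and write its share of the files."""
    path = os.path.join(BASE, f"dir_{d:02d}")
    os.makedirs(path, exist_ok=True)
    for i in range(FILES_PER_DIR):
        name = f"file_{d:02d}_{i:04d}.txt"
        with open(os.path.join(path, name), "w") as f:
            f.write(name * 10)  # the file's own name repeated 10 times

# One task per folder; file I/O releases the GIL, so threads can overlap.
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(fill_dir, range(N_DIRS)))  # list() surfaces any exceptions
```

That said, on most systems a plain sequential loop writes 50K tiny files in well under an hour, so if the original code takes hours, the bottleneck is likely elsewhere rather than the lack of threads.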

I'm not sure multi-threading will assist much with what appears to be a filesystem & I/O bottleneck. Far more important will be factors like:

- what OS are you running?

- what filesystem are you writing to and does it journal? (NTFS? FAT? ext[234]? JFS? XFS? tempfs/tmpfs? etc...) Some filesystems deal much better with lots of small files while others perform better with fewer-but-larger files.

- how is that filesystem mounted? Are syncs forced for all writes? How large are your write caches?

- what sort of hardware is it hitting? A 4500 RPM drive will have far worse performance than a 10k RPM drive; or you could try against a flash drive or RAID-0 which should have higher sustained write throughput.

- how sparse is the drive layout beforehand? If you're cramped for space, the FS drivers have to scatter your files across disparate space; while lots of free-space may allow it to spew the files without great consideration.

Hope this gives you some ways to test your configuration...
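To put rough numbers on those factors, something like the following timing harness (function and parameter names are my own, and it writes into a throwaway temp directory) measures how long a small sequential batch takes on your particular filesystem and hardware:

```python
import os
import tempfile
import time

def write_batch(base, n_dirs=5, files_per_dir=20):
    """Sequentially create n_dirs folders of small files under base."""
    for d in range(n_dirs):
        path = os.path.join(base, f"dir_{d:02d}")
        os.makedirs(path, exist_ok=True)
        for i in range(files_per_dir):
            name = f"file_{d:02d}_{i:04d}.txt"
            with open(os.path.join(path, name), "w") as f:
                f.write(name * 10)

# Time the batch; scale the result up to estimate the full 50K-file job.
with tempfile.TemporaryDirectory() as base:
    start = time.perf_counter()
    write_batch(base)
    elapsed = time.perf_counter() - start
    print(f"wrote {5 * 20} files in {elapsed:.3f}s")
```

Running it against different mount points (spinning disk vs. flash vs. tmpfs) should make the filesystem/hardware differences above directly visible.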

-tkc

--
http://mail.python.org/mailman/listinfo/python-list