On Mar 31, 1:15 pm, Steven D'Aprano
<ste...@remove.this.cybersource.com.au> wrote:
> On Mon, 30 Mar 2009 22:44:41 -0700, venutaurus...@gmail.com wrote:
>
> > Hello all,
> >          I've a requirement where I need to create around 1000
> > files under a given folder with each file size of around 1GB. The
> > constraints here are each file should have random data and no two files
> > should be unique even if I run the same script multiple times.
>
> I don't understand what you mean. "No two files should be unique" means
> literally that only *one* file is unique, the others are copies of each
> other.
>
> Do you mean that no two files should be the same?
>
> > Moreover
> > the filenames should also be unique every time I run the script. One
> > possibility is that we can use Unix time format for the file names
> > with some extensions.
>
> That's easy. Start a counter at 0, and every time you create a new file,
> name the file by that counter, then increase the counter by one.
>
> > Can this be done within few minutes of time. Is it
> > possble only using threads or can be done in any other way. This has to
> > be done in Windows.
>
> Is it possible? Sure. In a couple of minutes? I doubt it. 1000 files of
> 1GB each means you are writing 1TB of data to a HDD. The fastest HDDs can
> reach about 125 MB per second under ideal circumstances, so that will
> take at least 8 seconds per 1GB file or 8000 seconds in total. If you try
> to write them all in parallel, you'll probably just make the HDD waste
> time seeking backwards and forwards from one place to another.
>
> --
> Steven
That time is reasonable. The randomness should be such that no two files
ever have the same MD5 checksum, even across runs of the script. The main
reason for generating such a huge amount of data is stress testing of our
product.
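For what it's worth, below is a rough sketch of the kind of script I have
in mind, along the lines Steven suggested: counter-based names, prefixed
with the run's Unix time so that names stay unique across runs, plus a
unique UUID header at the start of each file so that no two files can
share an MD5 checksum even if the random payloads somehow collided. The
target folder and the sizes are my own placeholders, not anything from a
real setup; os.urandom is convenient but not fast, so a plain PRNG may be
worth swapping in if generating the bytes (rather than the disk) turns
out to be the bottleneck.

    import os
    import time
    import uuid

    NUM_FILES = 1000        # number of files to create
    FILE_SIZE = 1024 ** 3   # 1 GB per file
    CHUNK_SIZE = 1024 ** 2  # write in 1 MB chunks to bound memory use
    TARGET_DIR = r"C:\stress_test_data"   # hypothetical output folder

    def create_files():
        if not os.path.isdir(TARGET_DIR):
            os.makedirs(TARGET_DIR)
        run_id = int(time.time())   # makes filenames unique across runs
        for i in range(NUM_FILES):
            name = "%d_%04d.dat" % (run_id, i)
            path = os.path.join(TARGET_DIR, name)
            with open(path, "wb") as f:
                # A unique 16-byte header guarantees distinct content
                # (and therefore distinct MD5 sums) for every file.
                f.write(uuid.uuid4().bytes)
                written = 16
                while written < FILE_SIZE:
                    n = min(CHUNK_SIZE, FILE_SIZE - written)
                    f.write(os.urandom(n))
                    written += n

    if __name__ == "__main__":
        create_files()

Even with a script like this, Steven's arithmetic still holds: 1000 x 1 GB
is about 1 TB, and at roughly 125 MB/s of sequential write throughput that
is on the order of 8000 seconds, so threads won't shorten the wall-clock
time; the disk is the limit.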