andrea wrote:
On 31 Mar, 12:14, "venutaurus...@gmail.com" <venutaurus...@gmail.com>
wrote:
That time is reasonable. The randomness should be such that the MD5
checksums of no two files are the same. The main reason for having such
a huge amount of data is stress testing of our product.
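For what it's worth, that uniqueness requirement is easy to check once
the files exist; the sketch below assumes names like 0.txt through
999.txt (an assumption, matching the snippet later in the thread) and
hashes each file in chunks so memory use stays bounded.

import hashlib

def md5_of(path, chunk_size=1024 * 1024):
    # Hash the file in chunks so a 1 GiB file never has to fit in memory.
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

digests = [md5_of('%i.txt' % i) for i in range(1000)]   # assumed naming scheme
assert len(set(digests)) == len(digests), "duplicate MD5 checksums found"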
If randomness is not necessary (as I understood it), you can just
create one single file and then modify one bit of it iteratively,
1000 times.
That's enough to make the checksum change.
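One reading of that suggestion, as a minimal sketch (the 1 GiB size and
the file names here are assumptions, not from the thread): write a
single base file once, then give each copy a different single byte so
every copy hashes to a different MD5.

import shutil

SIZE = 1024 * 1024 * 1024            # assumed target size: 1 GiB per file
CHUNK = 1024 * 1024

# Write the base file once, a chunk at a time to keep memory use low.
with open('base.bin', 'wb') as f:
    for _ in range(SIZE // CHUNK):
        f.write(b'\0' * CHUNK)

# Each copy differs from every other copy by one byte, which is enough
# to change its checksum.
for i in range(1000):
    name = '%i.bin' % i
    shutil.copyfile('base.bin', name)
    with open(name, 'r+b') as f:
        f.seek(i)                    # change a different byte position in each copy
        f.write(b'\x01')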
Is there a way to create a file that big without actually writing
anything in Python (just give me the garbage that is already on the
disk)?
Not exactly AFAIK, but this line of thinking does remind me of
sparse files[1] if your filesystem supports them:
for i in range(1000):                        # one ~1 GiB file per iteration
    data = ('%i\n' % i).encode()             # unique tail so each file hashes differently
    with open('%i.txt' % i, 'wb') as f:
        f.seek(1024*1024*1024 - len(data))   # seek ~1 GiB out; the skipped range becomes a hole
        f.write(data)                        # writing past EOF here yields a sparse file
On FS's that support sparse files, it's blindingly fast and
creates a virtual file of that size without the overhead of
writing all the bits to the file. However, this same
optimization may also throw off any benchmarking you do, as it
doesn't have to read a gig off the physical media. This may be a
good metric for hash calculation across such files, but not a
good metric for I/O.
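One way to see that effect on a Unix-like system (where os.stat
exposes st_blocks) is to compare a file's apparent size with the
blocks actually allocated on disk; the filename below is just the
first one from the snippet above.

import os

st = os.stat('0.txt')              # one of the files created above
apparent = st.st_size              # logical size, ~1 GiB
allocated = st.st_blocks * 512     # st_blocks is counted in 512-byte units
print('apparent size:     %d bytes' % apparent)
print('allocated on disk: %d bytes' % allocated)

On a filesystem with sparse-file support, the allocated figure will be
a tiny fraction of the apparent size; without it, the two will be
comparable (ls -l versus du shows the same thing from the shell).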
-tkc
[1] http://en.wikipedia.org/wiki/Sparse_file