Re: writing large files quickly

2006-01-29 Thread Thomas Bellman
Grant Edwards <[EMAIL PROTECTED]> writes: > $ dd if=/dev/zero of=zeros bs=64k count=1024 > 1024+0 records in > 1024+0 records out > $ ls -l zeros > -rw-r--r-- 1 grante users 67108864 Jan 28 14:49 zeros > $ du -h zeros > 65M zeros > In my book that's 64MB not 65MB, but that's an argument for ...
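
For the record, the arithmetic behind the quibble (a quick sanity check, not from the thread; the overhead remark is an assumption about ext2-style indirect blocks):

    size = 64 * 1024 * 1024        # dd wrote bs=64k * count=1024 blocks
    print(size)                    # 67108864 bytes
    print(size // (1024 * 1024))   # 64 -- exactly 64 binary megabytes; du's
                                   # "65M" plausibly rounds up after counting
                                   # filesystem overhead such as indirect blocks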

Re: writing large files quickly

2006-01-29 Thread Fredrik Lundh
Bengt Richter wrote: > >> How the heck does that make a 400 MB file that fast? It literally takes > >> a second or two while every other solution takes at least 2 - 5 minutes. > >> Awesome... thanks for the tip!!! > > > >Because it isn't really writing the zeros. You can make these > >files all day long and not run out of disk space, because this kind of file doesn't take very many blocks ...

Re: writing large files quickly

2006-01-28 Thread Grant Edwards
On 2006-01-28, Bengt Richter <[EMAIL PROTECTED]> wrote: >>Because it isn't really writing the zeros. You can make these >>files all day long and not run out of disk space, because this >>kind of file doesn't take very many blocks. The blocks that >>were never written are virtual blocks, inasmuch as read() at that location will cause the filesystem to return zeroes ...
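
A minimal sketch of the trick Grant is describing, assuming a filesystem with sparse-file support (Python 2 spelling, matching the thread; st_blocks is POSIX-only):

    import os

    f = open('sparse.bin', 'wb')
    f.seek(409600000 - 1)          # jump almost 400 MB past the start
    f.write('\x00')                # one real byte; everything before it is a hole
    f.close()

    st = os.stat('sparse.bin')
    print(st.st_size)              # 409600000 -- the logical size
    print(st.st_blocks * 512)      # only a few KB -- blocks actually allocated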

Re: writing large files quickly

2006-01-28 Thread Bengt Richter
On Fri, 27 Jan 2006 12:30:49 -0800, Donn Cave <[EMAIL PROTECTED]> wrote: >In article <[EMAIL PROTECTED]>, > rbt <[EMAIL PROTECTED]> wrote: >> Won't work!? It's absolutely fabulous! I just need something big, quick >> and zeros work great. >> >> How the heck does that make a 400 MB file that fast? ...

Re: writing large files quickly

2006-01-28 Thread Ivan Voras
Jens Theisen wrote: > cp bigfile bigfile2 > > cat bigfile > bigfile3 > > du bigfile* > 8 bigfile2 > 1032 bigfile3 > > So it's not consuming 0's. It just doesn't store unwritten data. And I ... Very possibly cp "understands" sparse files and cat (doing what it's meant to do) doesn't :) ...
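
A toy version of what is very possibly happening there (Python 2 spelling): a sparse-aware copier can seek() over all-zero blocks, leaving holes in the destination, while a plain read/write loop like cat's has to write every byte it reads. GNU cp's real hole detection is more involved; this sketch simply treats any all-zero block as a hole:

    def sparse_copy(src, dst, blocksize=65536):
        # Copy src to dst, turning all-zero blocks into holes via seek().
        fin = open(src, 'rb')
        fout = open(dst, 'wb')
        zero = '\x00' * blocksize
        data = fin.read(blocksize)
        while data:
            if data == zero:
                fout.seek(len(data), 1)   # relative seek: leave a hole
            else:
                fout.write(data)
            data = fin.read(blocksize)
        fout.truncate()                   # fix the length if we ended on a hole
        fout.close()
        fin.close()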

Re: writing large files quickly

2006-01-28 Thread Tim Peters
[Jens Theisen] > ... > Actually I'm not sure what this optimisation should give you anyway. The > only circumstance under which files with only zeroes are meaningful is > testing, and that's exactly when you don't want that optimisation. In most cases, a guarantee that reading "uninitialized" file data will return zeroes ...
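
A small illustration of that guarantee (Python 2 spelling; 'sparse.bin' is the hypothetical sparse file from the sketch above):

    f = open('sparse.bin', 'rb')
    f.seek(123456789)                 # land somewhere inside a hole
    print(f.read(4) == '\x00' * 4)    # True: the OS synthesizes NUL bytes,
    f.close()                         # so stale deleted data can never leak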

Re: writing large files quickly

2006-01-28 Thread Scott David Daniels
Jens Theisen wrote: > Ivan wrote: >> I read somewhere that it has a use in database software, but the only >> thing I can imagine for this is when using heap queues >> (http://python.active-venture.com/lib/node162.html). I've used this feature eons ago where the file was essentially a single large ...

Re: writing large files quickly

2006-01-28 Thread Jens Theisen
Ivan wrote: > ext2 is a reimplementation of BSD UFS, so it does. Here: > f = file('bigfile', 'w') > f.seek(1024*1024) > f.write('a') > $ l bigfile > -rw-r--r-- 1 ivoras wheel 1048577 Jan 28 14:57 bigfile > $ du bigfile > 8 bigfile Interesting: cp bigfile bigfile2 cat bigfile > bigfile3 du bigfile* 8 bigfile2 1032 bigfile3 ...

Re: writing large files quickly

2006-01-28 Thread Ivan Voras
Jens Theisen wrote: > Ivan wrote: >>Yes, but AFAIK the only "modern" (meaning: in wide use today) file >>system that doesn't have this support is FAT/FAT32. > > I don't think ext2fs does this either. At least the du and df commands > tell something different. ext2 is a reimplementation of BSD UFS, so it does ...

Re: writing large files quickly

2006-01-28 Thread Jens Theisen
Donn wrote: > Because it isn't really writing the zeros. You can make these > files all day long and not run out of disk space, because this > kind of file doesn't take very many blocks. The blocks that > were never written are virtual blocks, inasmuch as read() at > that location will cause the filesystem to return zeroes ...

Re: writing large files quickly

2006-01-28 Thread Jens Theisen
Ivan wrote: > Steven D'Aprano wrote: >> Isn't this a file system specific solution though? Won't your file system >> need to have support for "sparse files", or else it won't work? > Yes, but AFAIK the only "modern" (meaning: in wide use today) file > system that doesn't have this support is FAT/FAT32 ...

Re: writing large files quickly

2006-01-28 Thread Jens Theisen
Donn wrote: >> How the heck does that make a 400 MB file that fast? It literally takes >> a second or two while every other solution takes at least 2 - 5 minutes. >> Awesome... thanks for the tip!!! > Because it isn't really writing the zeros. You can make these > files all day long and not run out of disk space ...

Re: writing large files quickly

2006-01-27 Thread Grant Edwards
On 2006-01-27, Steven D'Aprano <[EMAIL PROTECTED]> wrote: >> Because it isn't really writing the zeros. You can make these >> files all day long and not run out of disk space, because this >> kind of file doesn't take very many blocks. The blocks that >> were never written are virtual blocks, ...

Re: writing large files quickly

2006-01-27 Thread Ivan Voras
Steven D'Aprano wrote: > Isn't this a file system specific solution though? Won't your file system > need to have support for "sparse files", or else it won't work? Yes, but AFAIK the only "modern" (meaning: in wide use today) file system that doesn't have this support is FAT/FAT32.

Re: writing large files quickly

2006-01-27 Thread Steven D'Aprano
On Fri, 27 Jan 2006 12:30:49 -0800, Donn Cave wrote: > In article <[EMAIL PROTECTED]>, > rbt <[EMAIL PROTECTED]> wrote: >> Won't work!? It's absolutely fabulous! I just need something big, quick >> and zeros work great. >> >> How the heck does that make a 400 MB file that fast? It literally takes a second or two ...

Re: writing large files quickly

2006-01-27 Thread Grant Edwards
On 2006-01-27, rbt <[EMAIL PROTECTED]> wrote: > OK I finally get it. It's too good to be true :) Sorry about that. I should have paid closer attention to what you were going to do with the file. > I'm going back to using _real_ files... files that don't look > as if they are there but aren't. ...

Re: writing large files quickly

2006-01-27 Thread Grant Edwards
On 2006-01-27, Erik Andreas Brandstadmoen <[EMAIL PROTECTED]> wrote: > Grant Edwards wrote: >> Because the filesystem code keeps track of where you are in >> that 400MB stream, and returns 0x00 anytime you're reading from >> a "hole". The "cp" program and the "md5sum" just open the file >> and start read()ing ...

Re: writing large files quickly

2006-01-27 Thread rbt
Grant Edwards wrote: > On 2006-01-27, rbt <[EMAIL PROTECTED]> wrote: > > >>Hmmm... when I copy the file to a different drive, it takes up >>409,600,000 bytes. Also, an md5 checksum on the generated file and on >>copies placed on other drives are the same. It looks like a regular, big >>file...

Re: writing large files quickly

2006-01-27 Thread Grant Edwards
On 2006-01-27, rbt <[EMAIL PROTECTED]> wrote: > OK, I'm still trying to pick my jaw up off of the floor. One > question... how big of a file could this method create? 20GB, > 30GB, limit depends on filesystem, etc? Right. Back in the day, the old libc and ext2 code had a 2GB file size limit at ...
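
A quick probe of that old barrier, in the spirit of the seek trick (a sketch, Python 2 spelling; on a large-file-capable libc and filesystem this succeeds instantly and sparsely, while an old one would fail at the seek or write):

    import os

    f = open('big_test.bin', 'wb')
    f.seek(2**31)                            # one byte past the signed 32-bit limit
    f.write('\x00')
    f.close()
    print(os.path.getsize('big_test.bin'))   # 2147483649 if large files work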

Re: writing large files quickly

2006-01-27 Thread Erik Andreas Brandstadmoen
Grant Edwards wrote: > Because the filesystem code keeps track of where you are in > that 400MB stream, and returns 0x00 anytime you're reading from > a "hole". The "cp" program and the "md5sum" just open the file > and start read()ing. The filesystem code returns 0x00 bytes > for all of the reads ...

Re: writing large files quickly

2006-01-27 Thread Grant Edwards
On 2006-01-27, rbt <[EMAIL PROTECTED]> wrote: > Hmmm... when I copy the file to a different drive, it takes up > 409,600,000 bytes. Also, an md5 checksum on the generated file and on > copies placed on other drives are the same. It looks like a regular, big > file... I don't get it. Because the filesystem code keeps track of where you are in that 400MB stream, and returns 0x00 anytime you're reading from a "hole" ...
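
This is also why the md5 checksums match: a hashing tool only sees what read() returns, and read() yields the same NUL bytes whether they were stored on disk or synthesized from a hole. A sketch of what md5sum effectively does (hashlib shown for brevity; the Python of the day spelled it `import md5`):

    import hashlib

    def md5sum(path, blocksize=65536):
        # Hash a file the way md5sum does: sequential read()s, which the
        # filesystem fills in with NULs wherever the file has a hole.
        h = hashlib.md5()
        f = open(path, 'rb')
        data = f.read(blocksize)
        while data:
            h.update(data)
            data = f.read(blocksize)
        f.close()
        return h.hexdigest()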

Re: writing large files quickly

2006-01-27 Thread Grant Edwards
On 2006-01-27, rbt <[EMAIL PROTECTED]> wrote: >>>fd.write('0') >> >>[cut] >> >>f = file('large_file.bin','wb') >>f.seek(409600000-1) >>f.write('\x00') >> >>While a mindblowingly simple/elegant/fast solution (kudos!), the >>OP's file ends up full of the character zero (ASCII 0x30) ...

Re: writing large files quickly

2006-01-27 Thread Robert Kern
rbt wrote: > Hmmm... when I copy the file to a different drive, it takes up > 409,600,000 bytes. Also, an md5 checksum on the generated file and on > copies placed on other drives are the same. It looks like a regular, big > file... I don't get it. google("sparse files") -- Robert Kern

Re: writing large files quickly

2006-01-27 Thread rbt
Donn Cave wrote: > In article <[EMAIL PROTECTED]>, > rbt <[EMAIL PROTECTED]> wrote: > >>Won't work!? It's absolutely fabulous! I just need something big, quick >>and zeros work great. >> >>How the heck does that make a 400 MB file that fast? It literally takes >>a second or two while every other solution takes at least 2 - 5 minutes ...

Re: writing large files quickly

2006-01-27 Thread Donn Cave
In article <[EMAIL PROTECTED]>, rbt <[EMAIL PROTECTED]> wrote: > Won't work!? It's absolutely fabulous! I just need something big, quick > and zeros work great. > > How the heck does that make a 400 MB file that fast? It literally takes > a second or two while every other solution takes at least 2 - 5 minutes ...

Re: writing large files quickly

2006-01-27 Thread rbt
Grant Edwards wrote: > On 2006-01-27, rbt <[EMAIL PROTECTED]> wrote: > > >>I've been doing some file system benchmarking. In the process, I need to >>create a large file to copy around to various drives. I'm creating the >>file like this: >> >>fd = file('large_file.bin', 'wb') >>for x in xrange(409600000): ...

Re: writing large files quickly

2006-01-27 Thread rbt
Grant Edwards wrote: > On 2006-01-27, Tim Chase <[EMAIL PROTECTED]> wrote: >>>fd.write('0') >> >>[cut] >> >>>f = file('large_file.bin','wb') >>>f.seek(409600000-1) >>>f.write('\x00') >> >>While a mindblowingly simple/elegant/fast solution (kudos!), the >>OP's file ends up full of the character zero (ASCII 0x30) ...

Re: writing large files quickly

2006-01-27 Thread Grant Edwards
On 2006-01-27, Tim Chase <[EMAIL PROTECTED]> wrote: >>> fd.write('0') > [cut] >> >> f = file('large_file.bin','wb') >> f.seek(409600000-1) >> f.write('\x00') > > While a mindblowingly simple/elegant/fast solution (kudos!), the > OP's file ends up full of the character zero (ASCII 0x30), while your solution ends up full of the NUL character (ASCII 0x00) ...

Re: writing large files quickly

2006-01-27 Thread Tim Chase
>> fd.write('0') [cut] > > f = file('large_file.bin','wb') > f.seek(409600000-1) > f.write('\x00') While a mindblowingly simple/elegant/fast solution (kudos!), the OP's file ends up full of the character zero (ASCII 0x30), while your solution ends up full of the NUL character (ASCII 0x00) ...
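
The distinction Tim is drawing, made concrete:

    print(ord('0'))       # 48, i.e. 0x30 -- the ASCII digit the OP's loop writes
    print(ord('\x00'))    # 0,  i.e. 0x00 -- the NUL byte the seek trick produces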

Re: writing large files quickly

2006-01-27 Thread casevh
Oops. I did mean fd.write(block). The only limit is available memory. I've used 1MB block sizes when I did read/write tests. I was comparing NFS vs. local disk performance. I know Python can do at least 100MB/sec.

Re: writing large files quickly

2006-01-27 Thread Grant Edwards
On 2006-01-27, rbt <[EMAIL PROTECTED]> wrote: > I've been doing some file system benchmarking. In the process, I need to > create a large file to copy around to various drives. I'm creating the > file like this: > > fd = file('large_file.bin', 'wb') > for x in xrange(409600000): > fd.write('0') ...

Re: writing large files quickly

2006-01-27 Thread Paul Rubin
Tim Chase <[EMAIL PROTECTED]> writes: > Is there anything preventing one from just doing the following? > fd.write("0" * 409600000) > It's one huge string for a very short time. It skips all the looping > and allows Python to pump the file out to the disk as fast as the OS > can handle it. ...

Re: writing large files quickly

2006-01-27 Thread Tim Chase
> Untested, but this should be faster. > > block = '0' * 409600 > fd = file('large_file.bin', 'wb') > for x in range(1000): > fd.write('0') > fd.close() Just checking...you mean fd.write(block) right? :) Otherwise, you end up with just 1000 "0" characters in your file :) Is there anything preventing one from just doing the following? fd.write("0" * 409600000) It's one huge string for a very short time ...
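
Putting casevh's suggestion and Tim's correction together, a sketch of the blocked version (Python 2 spelling; this writes real data, so it takes seconds rather than minutes, and the result is not sparse):

    block = '0' * 409600               # 400 KB of ASCII zeros per write
    fd = open('large_file.bin', 'wb')
    for x in range(1000):              # 1000 * 409600 = 409,600,000 bytes
        fd.write(block)                # the fix: write the block, not '0'
    fd.close()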

Re: writing large files quickly

2006-01-27 Thread casevh
rbt wrote: > I've been doing some file system benchmarking. In the process, I need to > create a large file to copy around to various drives. I'm creating the > file like this: > > fd = file('large_file.bin', 'wb') > for x in xrange(409600000): > fd.write('0') > fd.close() > > This takes a few minutes to do. How can I speed this up? ...

Re: writing large files quickly

2006-01-27 Thread superfun
One way to speed this up is to write larger strings: fd = file('large_file.bin', 'wb') for x in xrange(5120): fd.write('0' * 80000) fd.close() However, I bet within an hour or so you will have a much better answer or 10. =)

writing large files quickly

2006-01-27 Thread rbt
I've been doing some file system benchmarking. In the process, I need to create a large file to copy around to various drives. I'm creating the file like this: fd = file('large_file.bin', 'wb') for x in xrange(409600000): fd.write('0') fd.close() This takes a few minutes to do. How can I speed this up?