On 04/06/10 06:28 PM, Phil Stracchino wrote:
> On 04/06/10 12:06, Josh Fisher wrote:
>> On 4/6/2010 8:42 AM, Phil Stracchino wrote:
>>> On 04/06/10 02:37, Craig Ringer wrote:
>>> Well, just off the top of my head, the first thing that comes to mind is
>>> that the only ways such a scheme is not going to result in massive disk
>>> fragmentation are:
>>>
>>> (a) it's built on top of a custom filesystem with custom device drivers
>>> to allow pre-positioning of volumes spaced across the disk surface, in
>>> which case it's going to be horribly slow because it's going to spend
>>> almost all its time seeking track-to-track; or
>>
>> I disagree. A filesystem making use of extents and multi-block
>> allocation, such as ext4, is designed for large file efficiency by
>> keeping files mostly contiguous on disk. Also, filesystems with delayed
>> allocation, such as ext4/XFS/ZFS, are much better at concurrent i/o than
>> non-delayed allocation filesystems like ext2/3, reiser3, etc. The
>> thrashing you mentioned is substantially reduced on writes, and for
>> restores, the files (volumes) remain mostly contiguous. So with a modern
>> filesystem, concurrent jobs writing to separate volume files will be
>> pretty much as efficient as concurrent jobs writing to the same volume
>> file, and restores will be much faster with no job interleaving.
>
> I think you're missing the point, though perhaps that's because I didn't
> make it clear enough.
>
> Let me try restating it this way:
>
> When you are writing large volumes of data from multiple sources onto
> the same set of disks, you have two choices. Either you accept
> fragmentation, or you use a space allocation algorithm that keeps the
> distinct file targets self-contiguous, in which case you must accept
> hammering the disks as you constantly seek back and forth between the
> different areas you're writing your data streams to.
>
> Yes, aggressive write caching can help a bit with this. But when we're
> getting into data sizes where this realistically matters on modern
> hardware, the data amounts have long since passed the range it's
> reasonable to cache in memory before writing. Delayed allocation can
> only help just so much when you're talking multiple half-terabyte backup
> data streams.
No - aggressive write caching is key to solving a large part of this
problem. Write caching to DRAM in particular is a very efficient way of
doing this, since DRAM is relatively cheap and most modern servers have
plenty of DRAM banks. It also leaves room for flexibility, since you can
easily tune your cache size to your workload.

I have no problem saturating a 4 Gbit LAG group (~400 MB/s) when running
backups via Bacula, and data *only* touches the disks every 15 to 20
seconds, when ZFS flushes its transaction groups to spinning rust. Adding
more DRAM would probably push this all the way to 30 seconds, perhaps
less once I convert this box to 10 Gbit ethernet.

These 15-20 seconds are more than enough for ZFS's block allocator to do
its magic (rough numbers are sketched below).

--
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet
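For a rough sense of scale, the figures quoted above (~400 MB/s of ingest
and a 15-30 second flush interval) translate into the amount of dirty data
that DRAM has to absorb between transaction-group flushes. This is just
back-of-the-envelope Python, not anything ZFS-specific:

    # Back-of-the-envelope: dirty data buffered in DRAM between ZFS
    # transaction-group flushes, using the figures quoted in this thread.
    ingest_mb_per_s = 400              # ~4 Gbit LAG group saturated by Bacula
    flush_intervals_s = (15, 20, 30)   # observed / hoped-for flush intervals

    for interval in flush_intervals_s:
        dirty_gib = ingest_mb_per_s * interval / 1024
        print(f"{interval:2d} s between flushes -> "
              f"~{dirty_gib:.1f} GiB of dirty data in DRAM")

That works out to roughly 6, 8 and 12 GiB respectively, so a server with a
few tens of gigabytes of RAM can comfortably hold a full flush interval's
worth of backup data in cache, which is what gives the allocator room to
lay each stream out contiguously.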
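To see how a delayed-allocation, extent-based filesystem copes with the
concurrent-writer scenario debated above, an experiment along these lines
can be run. It is only a sketch: the directory, file sizes and writer count
are made up, and it assumes a Linux host with ext4 or XFS under the test
path and the filefrag utility from e2fsprogs installed:

    import os
    import subprocess
    import threading

    TEST_DIR = "frag-test"  # relative path: place it on the filesystem under test
    FILE_SIZE = 1 << 30     # 1 GiB per simulated volume
    CHUNK = 1 << 20         # stream in 1 MiB writes, roughly like a backup job
    WRITERS = 4             # number of concurrent "jobs"

    def writer(path):
        """Stream CHUNK-sized writes to path until FILE_SIZE bytes are written."""
        buf = os.urandom(CHUNK)
        with open(path, "wb") as f:
            for _ in range(FILE_SIZE // CHUNK):
                f.write(buf)

    os.makedirs(TEST_DIR, exist_ok=True)
    paths = [os.path.join(TEST_DIR, f"volume{i}.dat") for i in range(WRITERS)]

    threads = [threading.Thread(target=writer, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # filefrag reports how many extents each file occupies; a low count means
    # the allocator kept each stream essentially contiguous despite the
    # concurrent writers.
    for p in paths:
        result = subprocess.run(["filefrag", p], capture_output=True, text=True)
        print(result.stdout.strip())

If the allocator is doing its job, each file should report only a handful
of extents rather than thousands, which is the behaviour Josh describes
for concurrent jobs writing to separate volume files.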