On Sat, Mar 11, 2006 at 08:25:08PM -0800, Elliott Mitchell wrote:
> But those are unlikely to move around, and hence unlikely to change major
> number often. Devices that do move around are likely to carry their
> journal with them, so having the hint contain only the minor number would
> be sufficient. This could even be handled through a hook in hotplug.

Nope.  Consider the use case where the data filesystem is on a SCSI or
RAID disk, and the journal is on a battery-backed memory disk which
looks, feels, and smells like an IDE disk.  In that case, the major
number of the data partition != the major number of the journal
device.  So as you can see, just storing the minor number in the hint
will not save you.

Or consider the case where you are using SCSI id #5 for the data disk,
but in order to get faster performance, you have the external journal
on a separate spindle, which is SCSI id #6.  Now the system
administrator does a clean shutdown of the system and removes SCSI id
#4.  *Poof*, the SCSI minor device numbers get renumbered, so what used
to be /dev/sdc1 and /dev/sdd1 now become /dev/sdb1 and /dev/sdc1, and
any hints based on major/minor device numbers are invalidated.  If the
system uses blkid to do mount-by-label, mount has no problem finding
the data disk on /dev/sdb1, but the hint for the external journal is
now incorrect.

Since the entire system was cleanly shut down, there is no reason why
the system administrator should need to force an fsck just to update
the hint; that's just inelegant.  The solution is that the mount
program needs to be able to use the blkid library to find the new
location of the external journal as well.

> > Nah, it's too hard, especially when you consider what might happen
> > with iSCSI and Fibre Channel. Searching for filesystems by UUID
> > really does belong in userspace. But mount does need to know how to
> > specify the external journal to the filesystem, just as today it
> > passes block device for the filesystem itself to the kernel.
>
> Strikes me as inelegant to not be able to directly call mount(). :-(

You can't directly call mount if you are (a) mounting an NFS
partition, without doing a lot of NFS-specific DNS name resolution,
etc., or (b) doing any kind of mount-by-label or mount-by-uuid.

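To make the SCSI renumbering case above concrete, here is a toy model
in Python.  It is only a sketch, not kernel or blkid code.  It assumes
that disks are named in ascending SCSI id order, and uses the Linux sd
convention of major 8 with 16 minors per disk.  The extra disk at SCSI
id #0 and all of the UUID strings are made up for illustration.

```python
SD_MAJOR = 8          # Linux major number for SCSI disks
MINORS_PER_DISK = 16  # sda is 8:0, sdb is 8:16, and so on


def probe(disks):
    """Name the first partition of each disk in ascending SCSI id order.

    `disks` maps SCSI id -> filesystem UUID (hypothetical values).
    Returns {uuid: (device name, (major, minor))}.
    """
    table = {}
    for index, scsi_id in enumerate(sorted(disks)):
        name = "/dev/sd" + chr(ord("a") + index) + "1"
        devno = (SD_MAJOR, index * MINORS_PER_DISK + 1)
        table[disks[scsi_id]] = (name, devno)
    return table


def find_by_devno(table, devno):
    """What a major/minor hint does: look the device up by number."""
    for name, number in table.values():
        if number == devno:
            return name
    return None


def find_by_uuid(table, uuid):
    """What a blkid-style lookup does: look the device up by UUID."""
    entry = table.get(uuid)
    return entry[0] if entry else None


# Before the shutdown: ids 0, 4, 5 (data) and 6 (journal) are present.
disks = {0: "uuid-root", 4: "uuid-spare", 5: "uuid-data", 6: "uuid-journal"}
before = probe(disks)
journal_hint = before["uuid-journal"][1]  # the stored major/minor hint

# The administrator removes SCSI id #4 and reboots.
del disks[4]
after = probe(disks)

stale = find_by_devno(after, journal_hint)
fresh = find_by_uuid(after, "uuid-journal")
print("journal via major/minor hint:", stale)
print("journal via UUID lookup:     ", fresh)
```

In this model the stored hint no longer matches any device after the
reboot, while the UUID lookup still finds the journal at its new name.
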
Direct mount() calls don't work in those cases because putting DNS
resolution or find-block-device-by-UUID into the kernel is insane.
This is another example of needing to pass *all* of the parameters of
the mount command into the kernel, and the location of the external
journal is just one of the mount parameters, just as the IP address of
the NFS server or where to find the data partition is one of the mount
parameters.

The hint was a convenience to system administrators for simple cases,
but you can make the argument that we should never have implemented
the hint, since it left us in a position where we didn't have all of
the pieces (i.e., the journal_device mount option should have been
implemented a long time ago), and it made people lazy and complacent
about finishing the userspace support for external journals.

						- Ted