Re: [CFC/CFT] large changes in the loader(8) code
On Wednesday, June 27, 2012 12:50:20 am Andrey V. Elsukov wrote:
> On 26.06.2012 21:37, John Baldwin wrote:
> >> 4. The gptboot now searches the backup GPT header in the previous sectors,
> >> when it finds the "GEOM::" signature in the last sector. PMBR code also
> >> tries to do the same:
> >> common/gpt.c
> >> i386/pmbr/pmbr.s
> >
> > GPT really wants the backup header at the last LBA. I know you can set it,
> > but I've interpreted that as a way to see if the primary header is correct
> > or not. It seems to me that GPT tables created in this fashion (inside a
> > GEOM provider) will not work properly with partition editors for other
> > OS's. I'm hesitant to encourage the use of this as I do think putting GPT
> > inside of a gmirror violates the GPT spec.
>
> The standard says:
> "The following test must be performed to determine if a GPT is valid:
> • Check the Signature
> • Check the Header CRC
> • Check that the MyLBA entry points to the LBA that contains the GUID
>   Partition Table
> • Check the CRC of the GUID Partition Entry Array
> If the GPT is the primary table, stored at LBA 1:
> • Check the AlternateLBA to see if it is a valid GPT
> If the primary GPT is corrupt, software must check the last LBA of the
> device to see if it has a valid GPT Header and point to a valid GPT
> Partition Entry Array."

Right, we break the last rule. If you want to use a partition editor that doesn't grok gmirror (because you are using another OS's editor), to repair a GPT, it will do the wrong thing.

> If a user wants modify GPT in the disk editor from the another OS,
> he can do it, and it should work. The result depends only from the
> partition editor, it might overwrite the last sector and might don't.

I would not assume it would work at all. If it can't trust the primary GPT, it has to assume the alternate is at the last LBA.

> >> 5. Also the pmbr image now contains one fake partition record.
> >> When several first sectors are damaged the kernel can't detect GPT
> >> (see RECOVERING section in the gpart(8)). We can restore PMBR with dd(1)
> >> command, but the old pmbr image has an empty partition table and
> >> loader doesn't able to boot from GPT, when there is no partition record
> >> in the PMBR. Now it will be able. When pmbr is installed via 'gpart
> >> bootcode' command, the kernel correctly modifies this partition record.
> >> So, this is only for the first rescue step.
> >
> > As I said earlier, I do not think this is appropriate and that instead
> > gpart should have an appropriate 'recover' command to install just the
> > pmbr on a disk and also create a correct entry in the MBR if needed while
> > doing so.
>
> gpart(8) is only one of several geom(8)' tools to manage objects of a GEOM
> class. It only sends control requests to the kernel. If GPT is not detected,
> there is no geom objects to manage. And we can't write bootcode with
> gpart(8). I think that adding such functions to the gpart(8) is not good.
> Maybe, the boot0cfg is the better tool for that. Also we still haven't any
> tool to install zfsboot.

We can't write bootcode with gpart? What do you think the 'bootcode' command does?

Also, there is no reason we can't have a 'recover' command that attempts to recover a corrupted table including repairing the PMBR. gpart(8) already generates a full PMBR when you use 'gpart create' to create a GPT even though there isn't a GPT object yet.

-- 
John Baldwin
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
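[Editorial illustration, not part of the thread.] The validity checks quoted from the standard above can be sketched roughly as follows. This is a minimal sketch, assuming the usual 92-byte little-endian GPT header layout and the standard CRC-32 that `zlib.crc32` computes; the function name `check_gpt_header` and its return convention are hypothetical and are not taken from the loader code under discussion.

```python
import struct
import zlib

GPT_SIGNATURE = b"EFI PART"
# signature, revision, header size, header CRC, reserved, MyLBA, AlternateLBA,
# first usable LBA, last usable LBA, disk GUID, entry array LBA,
# number of entries, entry size, entry array CRC (92 bytes in total)
HEADER_FMT = "<8sIIIIQQQQ16sQIII"


def check_gpt_header(sector: bytes, lba: int, entries: bytes) -> list:
    """Return the names of the failed checks; an empty list means 'looks valid'.

    `sector` is the raw sector holding the header, `lba` is where it was read
    from, and `entries` is the raw partition entry array it points to.
    """
    problems = []
    (sig, _rev, hdr_size, hdr_crc, _reserved, my_lba, _alt_lba, _first, _last,
     _guid, _ent_lba, n_ent, ent_size, ent_crc) = struct.unpack_from(HEADER_FMT, sector)

    if sig != GPT_SIGNATURE:
        problems.append("signature")

    # The header CRC covers hdr_size bytes with the CRC field itself zeroed.
    if not 92 <= hdr_size <= len(sector):
        problems.append("header size")
    else:
        zeroed = sector[:16] + b"\0\0\0\0" + sector[20:hdr_size]
        if zlib.crc32(zeroed) != hdr_crc:
            problems.append("header crc")

    # MyLBA must name the LBA the header was actually read from.
    if my_lba != lba:
        problems.append("MyLBA")

    if zlib.crc32(entries[:n_ent * ent_size]) != ent_crc:
        problems.append("entry array crc")
    return problems
```

The sketch deliberately leaves out the last rule (falling back to the last LBA of the device when the primary header is corrupt), since where that fallback should look is exactly what this thread is arguing about.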
Re: [CFC/CFT] large changes in the loader(8) code
On Tuesday, June 26, 2012 5:23:08 pm Pawel Jakub Dawidek wrote:
> On Tue, Jun 26, 2012 at 01:37:11PM -0400, John Baldwin wrote:
> > > 4. The gptboot now searches the backup GPT header in the previous sectors,
> > > when it finds the "GEOM::" signature in the last sector. PMBR code also
> > > tries to do the same:
> > > common/gpt.c
> > > i386/pmbr/pmbr.s
> >
> > GPT really wants the backup header at the last LBA. I know you can set it,
> > but I've interpreted that as a way to see if the primary header is correct
> > or not. [...]
>
> My interpretation is different: The way to verify if the header is valid
> is to check its checksum, not to check if the backup header location in
> the primary header points at the last LBA.
>
> Of course if primary header's checksum is incorrect it is hard to trust
> that the backup header location is correct. And we need the backup
> header when the primary header is invalid...

Right, which is why this fails.

> > [...] It seems to me that GPT tables created in this fashion (inside a
> > GEOM provider) will not work properly with partition editors for other
> > OS's. I'm hesitant to encourage the use of this as I do think putting GPT
> > inside of a gmirror violates the GPT spec.
>
> I don't think so. Most common case is to configure partitions on top of
> a mirror. Mirroring partitions is less common. Mostly because of
> hardware RAIDs being popular. You don't expect hardware RAID vendor to
> mirror partitions. Partition editors for other OS's won't work, but only
> because they don't support gmirror. If they wouldn't recognize and
> support some hardware (or pseudo-hardware) RAIDs there will be the same
> problem.

Hardware RAIDs hide the metadata from the disk that the BIOS (and disk editors) see. Thus, putting a GPT on a hardware RAID volume works fine as the logical volume is always seen by all OS's consistently. The same is even true of the "software" RAID that graid supports since the metadata is defined by the vendor and thus the logical volume is always seen by other OS's consistently.

My approach has been to only use gmirror with MBR so far, though I realize that doesn't work above 2TB (until recently one had to have a hardware RAID to get above 2TB anyway which made this last a moot point).

I won't object to patching our tools to handle this, but I think it is a really bad idea that users will have a hard time recovering from when they are bitten by it.

-- 
John Baldwin
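[Editorial illustration, not part of the thread.] Item 4 quoted above describes a fallback: if the last sector carries a "GEOM::" signature, look for the backup GPT header in the preceding sectors. A rough sketch of that idea follows. It is not the code from common/gpt.c; the `read_sector` callback, the `max_back` limit and the function name are all assumptions made for illustration.

```python
from typing import Callable, Optional

GPT_SIGNATURE = b"EFI PART"
GEOM_MAGIC = b"GEOM::"  # prefix mentioned in the thread for GEOM class metadata


def find_backup_header(read_sector: Callable[[int], bytes],
                       last_lba: int,
                       max_back: int = 1) -> Optional[int]:
    """Return the LBA that appears to hold the backup GPT header, or None.

    `read_sector(lba)` is a hypothetical callback returning one raw sector.
    `max_back` (an assumption) bounds how many sectors before the last one
    are examined when GEOM metadata occupies the last sector.
    """
    last = read_sector(last_lba)
    if last[:len(GPT_SIGNATURE)] == GPT_SIGNATURE:
        return last_lba            # the layout the specification expects
    if last[:len(GEOM_MAGIC)] != GEOM_MAGIC:
        return None                # neither GPT nor GEOM metadata: give up
    # GEOM metadata sits in the last sector, so step backwards.
    for lba in range(last_lba - 1, last_lba - 1 - max_back, -1):
        if lba < 2:
            break                  # LBA 0 and 1 belong to the PMBR and primary header
        if read_sector(lba)[:len(GPT_SIGNATURE)] == GPT_SIGNATURE:
            return lba
    return None
```

A tool that does not know about the "GEOM::" prefix only ever takes the first branch, which is the interoperability concern raised in the reply above.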
Re: [CFC/CFT] large changes in the loader(8) code
On 27.06.2012 16:07, John Baldwin wrote:
>> • Check the Signature
>> • Check the Header CRC
>> • Check that the MyLBA entry points to the LBA that contains the GUID
>>   Partition Table
>> • Check the CRC of the GUID Partition Entry Array
>> If the GPT is the primary table, stored at LBA 1:
>> • Check the AlternateLBA to see if it is a valid GPT
>> If the primary GPT is corrupt, software must check the last LBA of the
>> device to see if it has a valid GPT Header and point to a valid GPT
>> Partition Entry Array."
>
> Right, we break the last rule. If you want to use a partition editor
> that doesn't grok gmirror (because you are using another OS's editor),
> to repair a GPT, it will do the wrong thing.

When we are in the FreeBSD, our loader can detect that device size is lower than it see and it will work. When primary header is OK, then other OSes should work with this GPT. When it isn't OK, you just can't load other OS :)

>>> As I said earlier, I do not think this is appropriate and that instead
>>> gpart should have an appropriate 'recover' command to install just the
>>> pmbr on a disk and also create a correct entry in the MBR if needed
>>> while doing so.
>>
>> gpart(8) is only one of several geom(8)' tools to manage objects of a GEOM
>> class. It only sends control requests to the kernel. If GPT is not detected,
>> there is no geom objects to manage. And we can't write bootcode with
>> gpart(8). I think that adding such functions to the gpart(8) is not good.
>> Maybe, the boot0cfg is the better tool for that. Also we still haven't any
>> tool to install zfsboot.
>
> We can't write bootcode with gpart? What do you think the 'bootcode' command
> does?

`gpart bootcode -b` reads file, creates ioctl request and sends this data to the GEOM_PART class. GEOM_PART receives the control request, checks the data and writes it to the provider.
`gpart bootcode -p` works like dd(1) and writes bootcode to the given partition.
gpart(8) haven't any knowledge about specific partitioning scheme.

> Also, there is no reason we can't have a 'recover' command that attempts to
> recover a corrupted table including repairing the PMBR. gpart(8) already
> generates a full PMBR when you use 'gpart create' to create a GPT even though
> there isn't a GPT object yet.

`gpart create` creates only ioctl control request to the GEOM_PART class. GEOM_PART class creates new GPT geom object and this objects writes PMBR and its metadata to the provider.

-- 
WBR, Andrey V. Elsukov
Re: [CFC/CFT] large changes in the loader(8) code
On Wed, Jun 27, 2012 at 08:22:25AM -0400, John Baldwin wrote:
> > I don't think so. Most common case is to configure partitions on top of
> > a mirror. Mirroring partitions is less common. Mostly because of
> > hardware RAIDs being popular. You don't expect hardware RAID vendor to
> > mirror partitions. Partition editors for other OS's won't work, but only
> > because they don't support gmirror. If they wouldn't recognize and
> > support some hardware (or pseudo-hardware) RAIDs there will be the same
> > problem.
>
> Hardware RAIDs hide the metadata from the disk that the BIOS (and disk
> editors) see. Thus, putting a GPT on a hardware RAID volume works fine
> as the logical volume is always seen by all OS's consistently. [...]

Only if you won't connect this disk to a different controller.

> [...] The same
> is even true of the "software" RAID that graid supports since the metadata
> is defined by the vendor and thus the logical volume is always seen by other
> OS's consistently.

But is it seen without metadata by the boot loader?

What I'm trying to say is that it is fair to expect from the user to not use gmirror-configured disk on different OS. If the user wants to use this disk in different OS then he has to use format that is recognized by both.

Because gmirror is supported by FreeBSD we should improve the support by teaching boot loader about it. Pretending gmirror is special and recommending to mirror partitions with it instead of raw disks is not the solution.

I really can't see how gmirror is different in this regard from any other software RAID or volume manager. If you try to use disk that contains unrecognized metadata the behaviour is undefined (but hopefully not a panic).

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl
Re: [CFC/CFT] large changes in the loader(8) code
On Wednesday, June 27, 2012 10:08:17 am Pawel Jakub Dawidek wrote:
> On Wed, Jun 27, 2012 at 08:22:25AM -0400, John Baldwin wrote:
> > > I don't think so. Most common case is to configure partitions on top of
> > > a mirror. Mirroring partitions is less common. Mostly because of
> > > hardware RAIDs being popular. You don't expect hardware RAID vendor to
> > > mirror partitions. Partition editors for other OS's won't work, but only
> > > because they don't support gmirror. If they wouldn't recognize and
> > > support some hardware (or pseudo-hardware) RAIDs there will be the same
> > > problem.
> >
> > Hardware RAIDs hide the metadata from the disk that the BIOS (and disk
> > editors) see. Thus, putting a GPT on a hardware RAID volume works fine
> > as the logical volume is always seen by all OS's consistently. [...]
>
> Only if you won't connect this disk to a different controller.

Yes, but people do not expect to be able to yank a hardware RAID drive out and hook it up to a "raw" disk controller and have it work.

> > [...] The same
> > is even true of the "software" RAID that graid supports since the metadata
> > is defined by the vendor and thus the logical volume is always seen by
> > other OS's consistently.
>
> But is it seen without metadata by the boot loader?

Yes. The logical volume shows up as a BIOS disk device.

> What I'm trying to say is that it is fair to expect from the user to not
> use gmirror-configured disk on different OS. If the user wants to use
> this disk in different OS then he has to use format that is recognized
> by both.
>
> Because gmirror is supported by FreeBSD we should improve the support by
> teaching boot loader about it. Pretending gmirror is special and
> recommending to mirror partitions with it instead of raw disks is not
> the solution.
>
> I really can't see how gmirror is different in this regard from any
> other software RAID or volume manager. If you try to use disk that
> contains unrecognized metadata the behaviour is undefined (but hopefully
> not a panic).

It is not gmirror I am complaining about, it is the non-standard use of GPT. Note that gmirror + MBR works fine without violating what little standard there is for the MBR. Using a dedicated GPT partition to hold the gmirror metadata would work with GPT (but be a good bit harder to work with in terms of GEOM I realize).

But as I said, I won't object to these patches.

-- 
John Baldwin
Re: [CFC/CFT] large changes in the loader(8) code
On Wednesday, June 27, 2012 8:45:45 am Andrey V. Elsukov wrote:
> On 27.06.2012 16:07, John Baldwin wrote:
> >> • Check the Signature
> >> • Check the Header CRC
> >> • Check that the MyLBA entry points to the LBA that contains the GUID
> >>   Partition Table
> >> • Check the CRC of the GUID Partition Entry Array
> >> If the GPT is the primary table, stored at LBA 1:
> >> • Check the AlternateLBA to see if it is a valid GPT
> >> If the primary GPT is corrupt, software must check the last LBA of the
> >> device to see if it has a valid GPT Header and point to a valid GPT
> >> Partition Entry Array."
> >
> > Right, we break the last rule. If you want to use a partition editor
> > that doesn't grok gmirror (because you are using another OS's editor),
> > to repair a GPT, it will do the wrong thing.
>
> When we are in the FreeBSD, our loader can detect that device size
> is lower than it see and it will work. When primary header is OK, then
> other OSes should work with this GPT. When it isn't OK, you just can't
> load other OS :)

Ah, yes. The solution to violating standards is to make sure you never use standards-compliant software. That's a great argument. :) (Although not entirely uncommon. Standards aren't always perfect, but if we had a way to not gratuitously violate them it would be nice to avoid doing so.)

> > We can't write bootcode with gpart? What do you think the 'bootcode'
> > command does?
>
> `gpart bootcode -b` reads file, creates ioctl request and sends this data to
> the GEOM_PART class. GEOM_PART receives the control request, checks the data
> and writes it to the provider.
> `gpart bootcode -p` works like dd(1) and writes bootcode to the given
> partition.
> gpart(8) haven't any knowledge about specific partitioning scheme.

Correct, but in both cases it writes "bootcode".

> > Also, there is no reason we can't have a 'recover' command that attempts to
> > recover a corrupted table including repairing the PMBR. gpart(8) already
> > generates a full PMBR when you use 'gpart create' to create a GPT even
> > though there isn't a GPT object yet.
>
> `gpart create` creates only ioctl control request to the GEOM_PART class.
> GEOM_PART class creates new GPT geom object and this objects writes PMBR and
> its metadata to the provider.

You can't add a new ioctl?

-- 
John Baldwin
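[Editorial illustration, not part of the thread.] For readers unfamiliar with what "a correct entry in the MBR" means for a protective MBR, here is a minimal sketch of the layout as the GPT specification is commonly summarized: a single entry of type 0xEE starting at LBA 1 and covering the rest of the disk (capped at 32 bits), followed by the 0x55 0xAA signature. The function name is hypothetical, the ending CHS value is simplified to 0xFFFFFF, and this is neither what gpart(8) nor what the kernel actually does.

```python
import struct

MBR_SIZE = 512
BOOTCODE_MAX = 446          # bytes available before the partition table
PMBR_TYPE = 0xEE            # "GPT protective" partition type


def protective_mbr(bootcode: bytes, disk_sectors: int) -> bytes:
    """Build a 512-byte protective MBR around the given boot code."""
    if len(bootcode) > BOOTCODE_MAX:
        raise ValueError("boot code does not fit before the partition table")
    # The entry covers everything after LBA 0, capped at what 32 bits can hold.
    size = min(disk_sectors - 1, 0xFFFFFFFF)
    entry = struct.pack(
        "<B3sB3sII",
        0x00,               # not bootable
        b"\x00\x02\x00",    # starting CHS corresponding to LBA 1
        PMBR_TYPE,
        b"\xff\xff\xff",    # ending CHS: simplified to "not representable"
        1,                  # starting LBA
        size,               # size in sectors
    )
    table = entry + bytes(3 * 16)       # the other three entries stay empty
    mbr = bootcode.ljust(BOOTCODE_MAX, b"\0") + table + b"\x55\xaa"
    assert len(mbr) == MBR_SIZE
    return mbr
```

An image with an all-zero partition table differs from this only in the 16 entry bytes, which is why the question of who fills them in (the image, gpart, or the kernel) matters for the rescue case discussed above.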
Re: [CFC/CFT] large changes in the loader(8) code
On Jun 26, 2012, at 10:37 AM, John Baldwin wrote:
>
> GPT really wants the backup header at the last LBA. I know you can set it,
> but I've interpreted that as a way to see if the primary header is correct or
> not. It seems to me that GPT tables created in this fashion (inside a GEOM
> provider) will not work properly with partition editors for other OS's. I'm
> hesitant to encourage the use of this as I do think putting GPT inside of a
> gmirror violates the GPT spec.

Agreed.

While it is a nice trick to use the last sector for meta data, it does create 2 problems. 1 is mentioned above. The second is that when there's different metadata in the first *and* the last sector, you can't decide which is to take precedence without also looking at the other and know how to interpret it. We have not solved this second problem at all. We do get reports about the problems though. At best we're handwaving or kluging.

I think it's unwise to depend on FreeBSD-specific extensions or features in industry-standard partitioning schemes and as such make the use of "foreign" tools hard if not impossible.

A much more flexible approach is to support out-of-band configuration data. This allows us to mirror GPT disks without having to become non-standard as it removes the need to use the last sector for meta-data. The ability to construct GEOM hierarchies unambiguously is very important and our current approach has proven to not deliver on that. This is actually impacting existing FreeBSD consumers already, like Juniper. So, we should not go deeper into this rabbit hole. We should finally solve this problem for real...

-- 
Marcel Moolenaar
mar...@xcllnt.net
Re: [CFC/CFT] large changes in the loader(8) code
On Jun 26, 2012, at 2:43 PM, Pawel Jakub Dawidek wrote:
>
> As for sharing disk with other OS. If you share the disk with OS that
> doesn't support gmirror, you shouldn't use gmirror in the first place.
> You probably want to use only formats that are recognized by all your
> OSes.

This statement is ridiculous by virtue of not being in touch with reality and by making gmirror useless for such wide range of cases that one can question why we have it at all.

Put differently: a mirroring class is a fairly basic and useful thing to have. Limiting its use is nothing but artificial and follows from having to use the underlying provider to store metadata. This then changes the view of the underlying provider to consumers above gmirror in a way that makes the presence or absence of gmirror visible. Solving the visibility problem makes gmirror useful all the time. I see that as a better way of looking at it than simply blurting out that you shouldn't use gmirror when certain awkward and artificial conditions apply.

-- 
Marcel Moolenaar
mar...@xcllnt.net
Re: Freeze when running freebsd-update
On Wed, Jun 27, 2012 at 2:33 AM, Dieter BSD wrote:
>>> Robert writes:
>>>> 3) the box is responsive to hitting enter at the console (it produces
>>>> another login: prompt)
>>>
>>> Getty is in memory and can run.
>>>
>>>> 5) if I try to login to the console, it lets me enter a username then
>>>> locks up totally, it does not present me with a password: prompt.
>>>
>>> Login(1) is not in memory, and the kernel cannot read it from disk
>>> for some reason.
>>>
>>> I can get this symptom by writing a large file to a disk on a
>>> controller that FreeBSD doesn't support NCQ on. I assume there
>>> is a logjam in the buffer cache. Something trivial like reading
>>> login in from disk that would normally happen in well under a
>>> second can take many minutes.
>>>
>>> Perhaps geli is causing a similar logjam? Does it hang forever or
>>> is it just obscenely slow? If it truly hangs forever it is
>>> probably something else. Is there disk activity after it hangs?
>>> Can you try it without geli? systat -vmstat might provide a clue.
>>
>> Well, it is geli. I'm unable to reproduce the freeze on the same
>> exact system with everything else the same except for no geli. I'm
>> going to move this thread over to geom, and continue it there. Thanks
>> for your help!
>
> It occurs to me that it will need twice as much memory for disk i/o.
> 1 buffer for encrypted and 1 for unencrypted. I know nothing about geli,
> so I don't know if it uses the buffer cache for both, or what.
> Could it be that the kernel isn't keeping enough memory free and
> manages to paint itself into a corner and not have space to store
> the unencrypted version of disk reads, and can't page/swap anything
> out to make space because it doesn't have space to store the encrypted
> version to write?

I think that's probably about what is happening. I'm still waiting for an answer on the geom mailing list, but I will do some testing with increasing memory sizes and see where the problem stops occurring.
Re: [CFC/CFT] large changes in the loader(8) code
On Jun 26, 2012, at 9:50 PM, Andrey V. Elsukov wrote:
> If the primary GPT is corrupt, software must check the last LBA of the
> device to see if it has a valid GPT Header and point to a valid GPT
> Partition Entry Array."
>
> For the FreeBSD an each GEOM provider can be treated as disk device.
> So, i don't see anything criminal if we will add some quirks in the our
> loader for the better supporting of our technologies.

You can't just re-interpret standards to match a context you know very well isn't applicable and consequently redefine what the word "device" means. You're on a slippery slope and while you may not see it as a problem, you do make it a problem for FreeBSD users. It's our users we should be keeping in mind when we solve problems.

> If a user wants modify GPT in the disk editor from the another OS,
> he can do it, and it should work. The result depends only from the
> partition editor, it might overwrite the last sector and might don't.

Right. Another happy user that sees his/her FreeBSD installation destroyed or degraded (no mirroring, warning messages about corrupted GPT, etc) for no apparent reason and without any kind of warning that what he/she is doing is potentially harmful... That's the spirit!

-- 
Marcel Moolenaar
mar...@xcllnt.net
Re: [CFC/CFT] large changes in the loader(8) code
On Wed, Jun 27, 2012 at 10:37:11AM -0700, Marcel Moolenaar wrote:
>
> On Jun 26, 2012, at 10:37 AM, John Baldwin wrote:
> >
> > GPT really wants the backup header at the last LBA. I know you can set it,
> > but I've interpreted that as a way to see if the primary header is correct
> > or not. It seems to me that GPT tables created in this fashion (inside a
> > GEOM provider) will not work properly with partition editors for other
> > OS's. I'm hesitant to encourage the use of this as I do think putting GPT
> > inside of a gmirror violates the GPT spec.
>
> Agreed.

Guys. This doesn't violate the GPT spec in any way. The spec is narrow-minded if it talks only about raw disks, but you should think about gmirror as pseudo-hardware RAID. That's all. If putting GPT on top of RAID array is spec violation, then I guess we just have to live with it.

> While it is a nice trick to use the last sector for meta data, it does
> create 2 problems. 1 is mentioned above. [...]

It doesn't really matter where gmirror puts its metadata. If gmirror would keep its metadata in the first sector, gpart/gpt will find its metadata in the last sector and will complain about missing primary header.

> [...] The second is that when there's
> different metadata in the first *and* the last sector, you can't decide
> which is to take precedence without also looking at the other and know
> how to interpret it. We have not solved this second problem at all. We
> do get reports about the problems though. At best we're handwaving or
> kluging.

This is different kind of problem. It took me a while to realize that, but now I know:) The real problem is that not all metadata formats are suitable for autodetection. That's all. The metadata I use in my GEOM classes play nice with autodetection. The solution is very easy - keep size of the disk device within metadata. This allows gmirror to figure out if it is configured on raw disk, last slice or last partition within last slice, etc.

If GPT would keep disk size in its metadata the second problem you mentioned would not exist. And to be honest GPT kinda does that by having backup header's LBA stored in the primary header. And this is fine as long the primary header is valid.

The same problem is with things like UFS labels. There is no way to properly support them using GEOM autodetection, because there is no provider size in UFS superblock. UFS superblock contains file system size, but it is not the same, as one can create smaller file system than the underlying disk device.

> I think it's unwise to depend on FreeBSD-specific extensions or features
> in industry-standard partitioning schemes and as such make the use of
> "foreign" tools hard if not impossible.

If you plan to use the given disk with FreeBSD only, what's the problem? Partitioning is not the end of the world. Even if you use "industry-standard partitioning schemes" what file system are you going to use to actually access your data? FAT?

Of course if you do share your disk between various OSes then probably your best bet is to use MBR or GPT on raw disk and FAT file system. But if you use your disk with FreeBSD only, then I see no reason to not to leverage FreeBSD-specific features (be it gmirror, geli or zfs).

> A much more flexible approach is to support out-of-band configuration
> data. This allows us to mirror GPT disks without having to become non-
> standard as it removes the need to use the last sector for meta-data.
> The ability to construct GEOM hierarchies unambiguously is very
> important and our current approach has proven to not deliver on that.
> This is actually impacting existing FreeBSD consumers already, like
> Juniper. So, we should not go deeper into this rabbit hole. We should
> finally solve this problem for real...

Marcel, nothing stops anyone from implementing GEOM mirror class that uses no on-disk metadata. GEOM is not a limiting factor here. GEOM does provide mechanism for autoconfiguration, but it is totally optional and GEOM class might choose not to use it.

As an example you can take a look at two other GEOM classes of mine: gconcat(8) and gstripe(8). You can use 'label' subcommand to store metadata on component disks, which will take advantage of GEOM autodetection and autoconfiguration. You can also use 'create' subcommand to create ad hoc provider that stores no metadata and makes use of entire disks, which also means it won't be automatically created on next boot.

For Juniper it might be more handy to use out-of-band configuration as you know the hardware you are running on, so you know where the disks are exactly, etc. My company build appliances too, so I have been there. For most of our users automatic configuration is simply better, as they can shuffle disks around and not wonder if the system will boot or not.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes
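[Editorial illustration, not part of the thread.] The autodetection argument in the message above (store the provider size in the metadata so the class can tell which of several providers sharing the same last sector it was configured on) can be sketched as below. This is only a sketch under assumptions: the `Provider` type, the function name and the example sizes are invented for illustration, and this is not gmirror's actual tasting code.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    size: int  # size in sectors


def owner_of_metadata(candidates: Iterable[Provider],
                      stored_size: int) -> Optional[str]:
    """Pick the provider the metadata was written for.

    `candidates` are providers whose last sector is the same physical
    sector (for example a disk, its last slice, and the last partition in
    that slice), so all of them "see" the same metadata block.
    `stored_size` is the provider size recorded inside that metadata.
    Returns the unique matching provider name, or None if no provider
    (or more than one) matches.
    """
    matches = [p.name for p in candidates if p.size == stored_size]
    return matches[0] if len(matches) == 1 else None


# Hypothetical layout: the slice and the partition both end at the disk's
# last sector, so the metadata block is visible through all three.
providers = [
    Provider("ada0", 1000),
    Provider("ada0s1", 937),
    Provider("ada0s1a", 900),
]
chosen = owner_of_metadata(providers, stored_size=937)  # selects "ada0s1"
```

A format that records only its own extent, such as the file system size in a UFS superblock as mentioned above, cannot be disambiguated this way because the recorded value need not equal any provider's size.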
Re: [CFC/CFT] large changes in the loader(8) code
On Wed, Jun 27, 2012 at 10:45:35AM -0700, Marcel Moolenaar wrote:
>
> On Jun 26, 2012, at 2:43 PM, Pawel Jakub Dawidek wrote:
> >
> > As for sharing disk with other OS. If you share the disk with OS that
> > doesn't support gmirror, you shouldn't use gmirror in the first place.
> > You probably want to use only formats that are recognized by all your
> > OSes.
>
> This statement is ridiculous by virtue of not being in touch with
> reality and by making gmirror useless for such wide range of cases
> that one can question why we have it at all.
>
> Put differently: a mirroring class is a fairly basic and useful thing
> to have. Limiting its use is nothing but artificial and follows from
> having to use the underlying provider to store metadata. This then
> changes the view of the underlying provider to consumers above gmirror
> in a way that makes the presence or absence of gmirror visible.
> Solving the visibility problem makes gmirror useful all the time.
> I see that as a better way of looking at it than simply blurting out
> that you shouldn't use gmirror when certain awkward and artificial
> conditions apply.

I'm sorry, Marcel, but what you describe here has nothing to do with reality. To be able to implement reliable mirroring you have to use on-disk metadata. There is no way around that. You can implement non-redundant GEOM classes without using on-disk metadata, but out-of-band configuration in case of mirroring is simply naive. How do you detect that components are out of sync, for example?

And when it comes to visibility. Are you suggesting that gmirror should present entire underlying provider to upper layers? Including its metadata? I hope not, because we went through that hell already (remember skipping first 16 sectors by UFS, as BSDlabel metadata might be there? The same for swap?).

I think I did pretty good job by making the metadata as simple as possible - I use exactly one sector at the end of the target device. I'm really having a hard time to think of a simpler format.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl
Re: [CFC/CFT] large changes in the loader(8) code
On Jun 27, 2012, at 11:34 AM, Pawel Jakub Dawidek wrote:
>
> I'm sorry, Marcel, but what you describe here has nothing to do with
> reality. To be able to implement reliable mirroring you have to use
> on-disk metadata. There is no way around that. You can implement
> non-redundant GEOM classes without using on-disk metadata, but
> out-of-band configuration in case of mirroring is simply naive. How do
> you detect that components are out of sync, for example?

GEOM configuration and per-class runtime state are not to be treated the same. Out-of-band configuration is trivial. Per-class runtime state, like whether elements in a mirrored configuration are in sync or not is more difficult, but does not a priori require on-disk metadata as it's implemented now. You can have the configuration tell the GEOM where that state is being kept, so that you can put it in a partition on the disks involved, or even keep it independent from the disks, which then requires disks to be uniquely identifiable, for sure. But that's what GPT gives you anyway.

But even without identification, you can invert the question from "how do I detect that components are out of sync" to "how do I prove they are in fact in sync". That question has a very simple O(n) answer. So, if time isn't a concern or your storage is small, you can always scan all sectors and as such prove that the disks are in sync.

The point being: the current implementation isn't the only one. Granted, it can easily be the simplest one or even the best one in some cases, but that's beside the point you were making.

-- 
Marcel Moolenaar
mar...@xcllnt.net
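[Editorial illustration, not part of the thread.] The "very simple O(n) answer" mentioned above, proving that two mirror components are in sync by scanning everything, amounts to a block-by-block comparison. A minimal sketch, assuming the two components can be opened as ordinary binary file-like objects; the function name and block size are arbitrary choices for illustration:

```python
import io
from typing import BinaryIO, Optional


def first_mismatch(a: BinaryIO, b: BinaryIO,
                   block_size: int = 64 * 1024) -> Optional[int]:
    """Compare two components block by block.

    Returns the byte offset of the first block that differs (including the
    case where one component is shorter), or None if both are identical.
    The cost is linear in the size of the components.
    """
    offset = 0
    while True:
        block_a = a.read(block_size)
        block_b = b.read(block_size)
        if block_a != block_b:
            return offset
        if not block_a:          # both streams ended at the same point
            return None
        offset += len(block_a)


if __name__ == "__main__":
    # In-memory stand-ins for two mirror components.
    left = io.BytesIO(bytes(1024))
    right = io.BytesIO(bytes(1024))
    print("in sync" if first_mismatch(left, right, 512) is None else "out of sync")
```

Note that this only establishes equality; deciding which side is stale when the scan does find a difference is the part that needs some recorded state, wherever that state is kept.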
Re: [CFC/CFT] large changes in the loader(8) code
On 06/27/12 16:28, John Baldwin wrote:
> On Wednesday, June 27, 2012 8:45:45 am Andrey V. Elsukov wrote:
>
>> When we are in the FreeBSD, our loader can detect that device size
>> is lower than it see and it will work. When primary header is OK, then
>> other OSes should work with this GPT. When it isn't OK, you just can't
>> load other OS :)
>
> Ah, yes. The solution to violating standards is to make sure you never
> use standards-compliant software. That's a great argument. :)
>
> (Although not entirely uncommon. Standards aren't always perfect, but if
> we had a way to not gratuitously violate them it would be nice to avoid
> doing so.)

To be standards compliant and allow whole-disk based mirroring to work at the same time wouldn't nested GPT work like this?

Whole disk (start)
| GPT header
| GPT partition of type freebsd-geom (start)
| | gmirror device (start)
| | | GPT header
| | | | freebsd-boot
| | | | freebsd-ufs
| | | | freebsd-swap
| | | GPT backup header
| | gmirror metadata
| | gmirror device (end)
| GPT partition of type freebsd-geom (end)
| GPT backup header
Whole disk (end)

Nothing but FreeBSD would understand the freebsd-geom partition type, so the inner GPT device should be valid and standards compliant.

The boot loader would of course need to understand this setup but that shouldn't be impossible.

Just a thought. It might be too complicated compared to the non-standards compliant way it works now which works quite well in practice though.

--
Christian Laursen
Re: [CFC/CFT] large changes in the loader(8) code
On 2012-06-26 14:50, Andrey V. Elsukov wrote: > Some time ago i have started reading the code in the sys/boot. > Especially i'm interested in the partition tables handling. > I found several problems: > 1. There are several copies of the same code in the libi386/biosdisk.c > and common/disk.c, and partially libpc98/biosdisk.c. > 2. ZFS probing is very slow, because the ZFS code doesn't know how many > disks and partitions the system has: > http://www.freebsd.org/cgi/query-pr.cgi?pr=148296 > http://www.freebsd.org/cgi/query-pr.cgi?pr=161897 > 3. The GPT support doesn't check CRC and even doesn't know anything > about the secondary GPT header/table. > > So, i have created the branch and committed the changes: > http://svnweb.freebsd.org/base/user/ae/bootcode/ > The patch is here: > http://people.freebsd.org/~ae/boot.diff FWIW, I verified it compiles OK with clang, and especially boot2's size isn't increased at all. It would be nice if you could check it with clang now and again, before you finally merge this project into head. ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
Re: [CFC/CFT] large changes in the loader(8) code
On Jun 27, 2012, at 11:20 AM, Pawel Jakub Dawidek wrote: > On Wed, Jun 27, 2012 at 10:37:11AM -0700, Marcel Moolenaar wrote: >> >> On Jun 26, 2012, at 10:37 AM, John Baldwin wrote: >>> >>> GPT really wants the backup header at the last LBA. I know you can set it, >>> but I've interpreted that as a way to see if the primary header is correct >>> or >>> not. It seems to me that GPT tables created in this fashion (inside a GEOM >>> provider) will not work properly with partition editors for other OS's. >>> I'm >>> hesitant to encourage the use of this as I do think putting GPT inside of a >>> gmirror violates the GPT spec. >> >> Agreed. > > Guys. This doesn't violate the GPT spec in any way. The spec is > narrow-minded if it talks only about raw disks, but you should think > about gmirror as pseudo-hardware RAID. I'm sorry, but this is a contradiction. If it doesn't violate the spec, then the spec is not narrow-minded on the grounds of what we're discussing. If the spec *is* narrow-minded then obviously it doesn't capture our scenario, which means that we're violating the spec. Clearly we're not discussing anything that falls well within the spec, or is undebatable. This makes the whole topic dangerous anyway. When you're in the grey area (this is only for argument's sake -- we're in violation for sure) you're opening yourself up to compatibility problems. Should we deliberately go there? -- Marcel Moolenaar mar...@xcllnt.net ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
Re: [CFC/CFT] large changes in the loader(8) code
On Wednesday, June 27, 2012 1:45:35 pm Marcel Moolenaar wrote:
>
> On Jun 26, 2012, at 2:43 PM, Pawel Jakub Dawidek wrote:
> >
> > As for sharing disk with other OS. If you share the disk with OS that
> > doesn't support gmirror, you shouldn't use gmirror in the first place.
> > You probably want to use only formats that are recognized by all your
> > OSes.
>
> This statement is ridiculous by virtue of not being in touch with
> reality and by making gmirror useless for such wide range of cases
> that one can question why we have it at all.
>
> Put differently: a mirroring class is a fairly basic and useful thing
> to have. Limiting its use is nothing but artificial and follows from
> having to use the underlying provider to store metadata. This then
> changes the view of the underlying provider to consumers above gmirror
> in a way that makes the presence or absence of gmirror visible.
> Solving the visibility problem makes gmirror useful all the time.
> I see that as a better way of looking at it than simply blurting out
> that you shouldn't use gmirror when certain awkward and artificial
> conditions apply.

I'm not sure we can force gmirror to be anything except FreeBSD-specific, but it would be nice to not make non-standard GPT tables while we are at it. The reason the metadata for things like Intel's onboard SATA RAID does work ok is because the metadata format is enforced by the vendor, so it is reasonable to assume that metadata format will work across other OS's.

Anyway, I've said my piece and will let the matter drop from my end at this point.

--
John Baldwin
Re: [CFC/CFT] large changes in the loader(8) code
On Jun 27, 2012, at 12:08 PM, Christian Laursen wrote: > On 06/27/12 16:28, John Baldwin wrote: >> On Wednesday, June 27, 2012 8:45:45 am Andrey V. Elsukov wrote: >> >>> When we are in the FreeBSD, our loader can detect that device size >>> is lower than it see and it will work. When primary header is OK, then >>> other OSes should work with this GPT. When it isn't OK, you just can't >>> load other OS :) >> >> Ah, yes. The solution to violating standards is to make sure you never >> use standards-compliant software. That's a great argument. :) >> >> (Although not entirely uncommon. Standards aren't always perfect, but if >> we had a way to not gratuitously violate them it would be nice to avoid >> doing so.) > > To be standards compliant and allow whole-disk based mirroring to work at the > same time wouldn't nested GPT work like this? GPTs don't nest. > Nothing but FreeBSD would understand the freebsd-geom partition type, so the > inner GPT device should be valid and standards compliant. If it were standards compliant, it would be discoverable by non-FreeBSD. That clearly isn't the case -- hence it's not standards compliant. What for example if someone wanted to share the swap partition between Linux and FreeBSD? -- Marcel Moolenaar mar...@xcllnt.net ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
Re: [CFC/CFT] large changes in the loader(8) code
On 27.06.2012 21:55, Marcel Moolenaar wrote:
> You can't just re-interpret standards to match a context you know very well
> isn't applicable and consequently redefine what the word "device" means.
> You're on a slippery slope and while you may not see it as a problem, you
> do make it a problem for FreeBSD users. It's our users we should be keeping
> in mind when we solve problems.
>
>> If a user wants modify GPT in the disk editor from the another OS,
>> he can do it, and it should work. The result depends only from the partition
>> editor,
>> it might overwrite the last sector and might don't.
>
> Right. Another happy user that sees his/her FreeBSD installation destroyed
> or degraded (no mirroring, warning messages about corrupted GPT, etc) for
> no apparent reason and without any kind of warning that what he/she is doing
> is potentially harmful... That's the spirit!

OK. Let's return to my patches. They don't add any new ways to shoot yourself in the foot. We are talking about the *FreeBSD loader*. This is the program that starts the FreeBSD kernel. It doesn't start other OSes. We already have many users who use FreeBSD as the only system on the machine. Many of them use GPT inside of some GEOM provider. You can just read the lists, articles about installing FreeBSD, forums, etc. We already have these users and I hope they will use FreeBSD as before. So why can't we add a simple quirk to make their systems a bit more reliable?

As I understand it, there are two parts where we don't have a consensus:

1. You are against this: our loader detects that the primary GPT header is damaged. It tries to read the backup GPT header from the last LBA and detects the "GEOM::" signature there. It tries to read the previous sector and finds a *valid* GPT header. It is valid because its CRC is valid and its self_LBA is valid. For *FreeBSD* users it is better not to use this GPT and just complain "I'm sorry, can't boot". The other OSes can't, and we shouldn't.

2. You are against having one fake PMBR entry by default in the /boot/pmbr image. OK, I can propose several ways to resolve this:
* remove from the loader's GPT probing code the requirement that the MBR contain a PMBR partition record;
* teach the boot0cfg command to write the PMBR properly;
* add a new condition to mark the GPT as corrupt when it has an invalid PMBR. Thus, when you write a PMBR with an empty partition table with dd(1), the kernel will complain and you will be forced to run `gpart recover`.

--
WBR, Andrey V. Elsukov
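The backup-header search argued over in this message can be sketched as below. This is an illustration under stated assumptions, not the code from the patch: the field offsets follow the GPT header layout in the UEFI specification, the check that the last sector starts with "GEOM::" is an assumption based on the signature named in the thread, and `header_is_valid`, `find_backup_header`, `make_header` and the `read_sector` callback are hypothetical names. The partition entry array CRC is deliberately not checked here, and whether a header found one sector early should count as valid is exactly what the thread disputes.

```python
import struct
import zlib

SECTOR_SIZE = 512            # assumed sector size for the illustration
GPT_SIGNATURE = b"EFI PART"
GEOM_MAGIC = b"GEOM::"       # prefix named in the thread (assumed at offset 0)


def header_is_valid(sector, lba):
    """Check signature, header CRC32 and MyLBA of a candidate GPT header."""
    if len(sector) < 92 or sector[:8] != GPT_SIGNATURE:
        return False
    header_size, stored_crc = struct.unpack_from("<II", sector, 12)
    if not 92 <= header_size <= len(sector):
        return False
    # The CRC covers the header with its own CRC field set to zero.
    zeroed = sector[:16] + b"\x00" * 4 + sector[20:header_size]
    if zlib.crc32(zeroed) & 0xFFFFFFFF != stored_crc:
        return False
    (my_lba,) = struct.unpack_from("<Q", sector, 24)
    return my_lba == lba


def find_backup_header(read_sector, last_lba):
    """Return the LBA holding a usable backup header, or None."""
    last = read_sector(last_lba)
    if header_is_valid(last, last_lba):
        return last_lba
    # The quirk under discussion: GEOM metadata sits in the last sector,
    # so a GPT created inside the provider ends one sector earlier.
    if last.startswith(GEOM_MAGIC) and last_lba > 1:
        if header_is_valid(read_sector(last_lba - 1), last_lba - 1):
            return last_lba - 1
    return None


def make_header(lba):
    """Build a minimal header sector, for demonstration only."""
    hdr = bytearray(SECTOR_SIZE)
    hdr[:8] = GPT_SIGNATURE
    struct.pack_into("<II", hdr, 8, 0x00010000, 92)   # revision, header size
    struct.pack_into("<Q", hdr, 24, lba)              # MyLBA
    struct.pack_into("<I", hdr, 16, zlib.crc32(bytes(hdr[:92])) & 0xFFFFFFFF)
    return bytes(hdr)


# A 100-sector "disk": provider metadata in LBA 99, backup header in LBA 98.
disk = {99: b"GEOM::MIRROR".ljust(SECTOR_SIZE, b"\x00"), 98: make_header(98)}
print(find_backup_header(lambda lba: disk.get(lba, bytes(SECTOR_SIZE)), 99))  # prints 98
```

Note that the MyLBA check passes only because the header was written relative to the smaller provider, which is why a tool that knows nothing about the GEOM metadata would look at the last LBA and find nothing it recognizes.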
Re: [CFC/CFT] large changes in the loader(8) code
On Tue, 26 Jun 2012 12:37:11 -0500, John Baldwin wrote:
> I'm hesitant to encourage the use of this as I do think putting GPT inside
> of a gmirror violates the GPT spec.

I personally think this use case is a bit ... odd, anyway.

I have only one request to those that manage GPT/GEOM/etc -- as I'm used to doing multiple mdadm RAID components on Linux for maximum flexibility, using gmirror upon multiple GPT partitions upon the same physical device is OK with me. My only complaint is that recovery is very, very stupid. We should by default detect and only rebuild ONE gmirror device at a time on the same physical provider. You get nothing but a smokin' angry head if you allow multiple to rebuild at the same time because it's fighting over sequential writes all the way across the platters.

It would also be nice if gmirror rebuild could also be detected by fsck and fsck could either hold off or gmirror could be paused until a consistent filesystem state exists. It's probably best for the background fsck to go first so you can get the system up and running, but then when it's finished gmirror should continue.

Otherwise I have no issues with gmirror -- it does exactly the job I need it to.
Re: [CFC/CFT] large changes in the loader(8) code
On Jun 27, 2012, at 12:27 PM, Andrey V. Elsukov wrote:
> On 27.06.2012 21:55, Marcel Moolenaar wrote:
>> You can't just re-interpret standards to match a context you know very well
>> isn't applicable and consequently redefine what the word "device" means.
>> You're on a slippery slope and while you may not see it as a problem, you
>> do make it a problem for FreeBSD users. It's our users we should be keeping
>> in mind when we solve problems.
>>
>>> If a user wants modify GPT in the disk editor from the another OS,
>>> he can do it, and it should work. The result depends only from the
>>> partition editor,
>>> it might overwrite the last sector and might don't.
>>
>> Right. Another happy user that sees his/her FreeBSD installation destroyed
>> or degraded (no mirroring, warning messages about corrupted GPT, etc) for
>> no apparent reason and without any kind of warning that what he/she is doing
>> is potentially harmful... That's the spirit!
>
> Ok. Let's return back to my patches. They don't add any new methods to
> shoot in the foot. We are talking about the *FreeBSD loader*.
> This is the program that starts FreeBSD kernel. It doesn't start other
> OS. We already have many users who uses FreeBSD as a single system on
> the machine. Many of them use GPT inside of some GEOM provider.

Your patches are a continuation on a path that, as we're discussing, isn't necessarily the path we should be on. While you don't make things worse from a compliance perspective, you make it worse by adding the non-compliant behaviour to more components.

> As i understand there two parts where we haven't a consensus:
>
> 1. You are against from:
> Our loader detects that primary GPT header is damaged. It tries to read
> backup GPT header from the last LBA and it detects that there is
> "GEOM::" signature. It tries to read one previous sector and there is
> *valid* GPT header.

How do you know it's valid? It's in a location that is not valid to begin with. Validity is based on rules and you're violating the rules without defining exactly what we call valid given the new rules. This may seem nitpicking, but having gone through the hassle of dealing with the broken way we created the dangerously dedicated disk, I appreciate the importance of being anal when it comes to something that lives on non-volatile storage and gets to be exposed to a world much larger than FreeBSD.

> 2. You are against from having one fake PMBR entry by default in the
> /boot/pmbr image.

I don't understand what you're saying or what I'm being accused of being against.

--
Marcel Moolenaar
mar...@xcllnt.net
Re: [CFC/CFT] large changes in the loader(8) code
On 28.06.2012 00:14, Marcel Moolenaar wrote: >> Our loader detects that primary GPT header is damaged. It tries to read >> backup GPT header from the last LBA and it detects that there is >> "GEOM::" signature. It tries to read one previous sector and there is >> *valid* GPT header. > > How do you know it's valid? It's in a location that is not valid > to begin with. Validity is based on rules and you're violating the > the rules without defining exactly what we call valid given the > new rules. This may seem nitpicking, but having went through the > hassle of dealing with the broken way we created the dangerously > dedicated disk, I appreciate the importance of being anal when it > comes to something that lives on non-volatile storage and gets to > be exposed to a world much larger than FreeBSD. So why do you not prevent to attach GEOM_PART_GPT to any providers that are not the disk drive? This will be the right solution to all our problems. Just don't create invalid GPT. -- WBR, Andrey V. Elsukov ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
Re: [CFC/CFT] large changes in the loader(8) code
On Jun 27, 2012, at 1:48 PM, Andrey V. Elsukov wrote: > On 28.06.2012 00:14, Marcel Moolenaar wrote: >>> Our loader detects that primary GPT header is damaged. It tries to read >>> backup GPT header from the last LBA and it detects that there is >>> "GEOM::" signature. It tries to read one previous sector and there is >>> *valid* GPT header. >> >> How do you know it's valid? It's in a location that is not valid >> to begin with. Validity is based on rules and you're violating the >> the rules without defining exactly what we call valid given the >> new rules. This may seem nitpicking, but having went through the >> hassle of dealing with the broken way we created the dangerously >> dedicated disk, I appreciate the importance of being anal when it >> comes to something that lives on non-volatile storage and gets to >> be exposed to a world much larger than FreeBSD. > > So why do you not prevent to attach GEOM_PART_GPT to any providers that > are not the disk drive? This will be the right solution to all our > problems. Just don't create invalid GPT. It's not even the right solution, as it prevents legit nesting of gpart GEOMs *and* is fundamentally based on a flawed assumption that any non-disk GEOM underneath gpart yields an invalid GPT. Think gnop. -- Marcel Moolenaar mar...@xcllnt.net ___ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
"magic" crashes - mostly solved but
the reason was most probably out-of-date vbox and fuse kernel modules.

after making everything in sync system boots successfully with WITNESS, INVARIANT etc. options enabled.

STILL - mostly at booting i'm getting few messages.

first comes when executing /etc/rc.d/named (at mounting devfs IMHO):

Jun 27 18:32:23 foo kernel: lock order reversal:
Jun 27 18:32:23 foo kernel: 1st 0xff80f5859800 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2636
Jun 27 18:32:24 foo kernel:
Jun 27 18:32:24 foo kernel: 2nd 0xff0005c82200 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:285
Jun 27 18:32:24 foo kernel: KDB: stack backtrace:
Jun 27 18:32:24 foo kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x27
Jun 27 18:32:24 foo kernel: em0: link state changed to UP
Jun 27 18:32:24 foo kernel: kdb_backtrace() at kdb_backtrace+0x3e
Jun 27 18:32:24 foo kernel: _witness_debugger() at _witness_debugger+0x24
Jun 27 18:32:24 foo kernel: witness_checkorder() at witness_checkorder+0xae7
Jun 27 18:32:24 foo kernel: _sx_xlock() at _sx_xlock+0xbf
Jun 27 18:32:24 foo kernel: ufsdirhash_acquire() at ufsdirhash_acquire+0x4f
Jun 27 18:32:24 foo kernel: ufsdirhash_remove() at ufsdirhash_remove+0x1c
Jun 27 18:32:24 foo kernel: ufs_dirremove() at ufs_dirremove+0x12c
Jun 27 18:32:24 foo kernel: ufs_remove() at ufs_remove+0x8f
Jun 27 18:32:24 foo kernel: VOP_REMOVE_APV() at VOP_REMOVE_APV+0xf4
Jun 27 18:32:24 foo kernel: VOP_REMOVE() at VOP_REMOVE+0x45
Jun 27 18:32:24 foo kernel: kern_unlinkat() at kern_unlinkat+0x1ce
Jun 27 18:32:24 foo kernel: kern_unlink() at kern_unlink+0x28
Jun 27 18:32:24 foo kernel: unlink() at unlink+0x25
Jun 27 18:32:24 foo kernel: syscallenter() at syscallenter+0x2e3
Jun 27 18:32:24 foo kernel: amd64_syscall() at amd64_syscall+0x58
Jun 27 18:32:24 foo kernel:
Jun 27 18:32:24 foo kernel: Xfast_syscall() at Xfast_syscall+0xfc
Jun 27 18:32:24 foo kernel: --- syscall (10, FreeBSD ELF64, unlink), rip = 0xeede070c, rsp = 0x7fffdb08, rbp = 0x7fffef58 ---
Jun 27 18:32:24 foo kernel: lock order reversal:
Jun 27 18:32:24 foo kernel: 1st 0xff00080a8270 ufs (ufs) @ /usr/src/sys/kern/vfs_mount.c:1081
Jun 27 18:32:24 foo kernel: 2nd 0xff00085397f8 devfs (devfs) @ /usr/src/sys/kern/vfs_subr.c:2169
Jun 27 18:32:24 foo kernel: KDB: stack backtrace:
Jun 27 18:32:24 foo kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x27
Jun 27 18:32:24 foo kernel: kdb_backtrace() at kdb_backtrace+0x3e
Jun 27 18:32:24 foo kernel: _witness_debugger() at _witness_debugger+0x24
Jun 27 18:32:24 foo kernel: witness_checkorder() at witness_checkorder+0xae7
Jun 27 18:32:24 foo kernel: __lockmgr_args() at __lockmgr_args+0x68d
Jun 27 18:32:24 foo kernel: _lockmgr_args() at _lockmgr_args+0x6f
Jun 27 18:32:24 foo kernel: vop_stdlock() at vop_stdlock+0x67
Jun 27 18:32:24 foo kernel: VOP_LOCK1_APV() at VOP_LOCK1_APV+0xfd
Jun 27 18:32:24 foo kernel: VOP_LOCK1() at VOP_LOCK1+0x4b
Jun 27 18:32:24 foo kernel: _vn_lock() at _vn_lock+0x64
Jun 27 18:32:24 foo kernel: vget() at vget+0xe9
Jun 27 18:32:24 foo kernel: devfs_allocv() at devfs_allocv+0x125
Jun 27 18:32:24 foo kernel: devfs_root() at devfs_root+0x5a
Jun 27 18:32:24 foo kernel: vfs_domount() at vfs_domount+0xcdb
Jun 27 18:32:24 foo kernel: vfs_donmount() at vfs_donmount+0x78e
Jun 27 18:32:24 foo kernel: nmount() at nmount+0x7e
Jun 27 18:32:24 foo kernel: syscallenter() at syscallenter+0x2e3
Jun 27 18:32:24 foo kernel: amd64_syscall() at amd64_syscall+0x58
Jun 27 18:32:24 foo kernel: Xfast_syscall() at Xfast_syscall+0xfc
Jun 27 18:32:24 foo kernel: --- syscall (378, FreeBSD ELF64, nmount), rip = 0xeee6535c, rsp = 0x7fffdd18, rbp = 0xef206048 ---
Jun 27 18:32:24 foo named[1071]: starting BIND 9.6.-ESV-R7-P1 -t/var/named -u bind
Jun 27 18:32:24 foo kernel: Starting named.

few more when mounting or unmounting (i'm not sure) pendrive.

Jun 27 18:57:09 foo kernel: lock order reversal:
Jun 27 18:57:09 foo kernel: 1st 0xff011ec78098 ufs (ufs) @ /usr/src/sys/kern/vfs_lookup.c:504
Jun 27 18:57:09 foo kernel: 2nd 0xff80f5e1bb80 bufwait (bufwait) @ /usr/src/sys/ufs/ffs/ffs_softdep.c:6193
Jun 27 18:57:09 foo kernel: 3rd 0xff011ead3d80 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2169
Jun 27 18:57:09 foo kernel: KDB: stack backtrace:
Jun 27 18:57:09 foo kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x27
Jun 27 18:57:09 foo kernel: kdb_backtrace() at kdb_backtrace+0x3e
Jun 27 18:57:09 foo kernel: _witness_debugger() at _witness_debugger+0x24
Jun 27 18:57:09 foo kernel: witness_checkorder() at witness_checkorder+0xae7
Jun 27 18:57:09 foo kernel: __lockmgr_args() at __lockmgr_args+0x68d
Jun 27 18:57:09 foo kernel: _lockmgr_args() at _lockmgr_args+0x6f
Jun 27 18:57:09 foo kernel: ffs_lock() at ffs_lock+0xaa
Jun 27 18:57:09 foo kernel: VOP_LOCK1_APV() at VOP_LOCK1_APV+0xfd
Jun 27 18:57:09 foo kernel: VOP_LOCK1() at VOP_LOCK1+0x4b
Jun 27 18:57:09 foo kernel: _vn_lock() at _vn_lock+0x64
Jun 27 18:57:09 foo kernel: vget() at vget+0xe9
Jun 27 18:57:
Re: [CFC/CFT] large changes in the loader(8) code
I would like to point out that all other operating systems which have had this precise problem have solved it by adding a bootfs partition to hold the kernel+modules required to truly understand the disk-layout.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
Re: /etc/resolv.conf getting over written with dhcp
On Wed, 2012-06-20 at 13:39 +0530, Varuna wrote:
> Ian Lepore wrote:
> >
> > Using the 'prepend' or 'supersede' keywords in /etc/dhclient.conf is
> > pretty much the standard way of handling a mix of static and dhcp
> > interfaces where the static config needs to take precedence. I'm not
> > sure why you dismiss it as essentially good, but somehow not good
> > enough. It's been working for me for years.
> >
> > -- Ian
>
> The issue that I had indicated that the issue with the /etc/resolv.conf is
> being caused by an error in /sbin/dhclient-script; hence, I am definitely not
> looking at solving the issue either with /etc/dhclient.conf or
> /etc/dhclient-exit-hooks configuration file.
>
> BTW, resolver(5) / resolv.conf(5) does not mention the usage of
> /etc/dhclient-exit-hooks file to protect the earlier contents of
> /etc/resolv.conf file. Will put this issue in the freebsd-doc mailing list.
>
> With regards,
> Varuna
> Eudaemonic Systems
> Simple, Specific & Insightful

I have re-read your original message and I think the confusion is here:

> 2***# When resolv.conf is not changed actually, we don't
>      # need to update it.
>      # If /usr is not mounted yet, we cannot use cmp, then
>      # the following test fails. In such case, we simply
>      # ignore an error and do update resolv.conf.
> 3***if cmp -s $tmpres /etc/resolv.conf; then
>          rm -f $tmpres
>          return 0
>      fi 2>/dev/null
> [...]
> I guess, the 1***, 3*** and 4*** is causing the recreation of
> /etc/resolv.conf.
> Is this correct? I did a small modification to 3*** which is:
>      if !(cmp -s $tmpres /etc/resolv.conf); then
>          rm -f $tmpres
>          return 0
>      fi 2>/dev/null
> This seems to have solved the issue of /etc/resolv.conf getting overwritten
> with just: nameserver 192.168.98.4. This ensures that: If there is a
> difference between $tmpres and /etc/resolv.conf, then it exits post removal
> of $tmpres. If the execution of 3*** returns a 0, a new file gets created.
> I guess the modification get the intent of 3*** working.
>
> Have I barked up the wrong tree?

I think yes, you have barked up the wrong tree. The intent of the code at 3*** is not to exit if there is a difference, it is to exit if there is NO difference. In other words, if the old and new files are identical then there is no need to re-write the file, just cleanup and exit. If the files are different then replace the existing file with the new one.

This is just the (sometimes annoying) way dhcp works. If the dhcp server provides new resolver info it completely replaces any existing resolver info unless you've configured your dhclient.conf to prevent it. It only does so if the interface being configured is the current default-route interface, or there is no current default-route interface.

-- Ian
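The intent of the test at 3*** can be restated as a small sketch. This is not the dhclient-script code, only an illustration of the same "leave the file alone when nothing changed" logic; the function name `install_if_changed` is hypothetical, and replacing the file with `os.replace` is an assumption made for the example.

```python
import filecmp
import os
import tempfile


def install_if_changed(tmp_path, target_path):
    """Install tmp_path over target_path only when the contents differ.

    Returns True if target_path was rewritten, False if it was left alone.
    """
    try:
        same = filecmp.cmp(tmp_path, target_path, shallow=False)
    except OSError:
        # Comparison not possible (for example, no existing file):
        # ignore the error and go ahead with the update.
        same = False
    if same:
        os.remove(tmp_path)   # identical: just clean up and stop
        return False
    os.replace(tmp_path, target_path)
    return True


# Demonstration in a scratch directory rather than /etc.
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "resolv.conf")
candidate = os.path.join(workdir, "resolv.conf.new")
with open(target, "w") as f:
    f.write("nameserver 192.168.98.4\n")
with open(candidate, "w") as f:
    f.write("nameserver 192.168.98.4\n")
print(install_if_changed(candidate, target))  # prints False
```

Negating the comparison, as in the modified script, turns this into "exit when the files differ", so the new resolver information is silently discarded instead of installed.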
Re: Freeze when running freebsd-update
Robert writes: > 3) the box is responsive to hitting enter at the console (it produces > another login: prompt) Getty is in memory and can run. > 5) if I try to login to the console, it lets me enter a username then > locks up totally, it does not present me with a password: prompt. Login(1) is not in memory, and the kernel cannot read it from disk for some reason. I can get this symptom by writing a large file to a disk on a controller that FreeBSD doesn't support NCQ on. I assume there is a logjam in the buffer cache. Something trivial like reading login in from disk that would normally happen in well under a second can take many minutes. Perhaps geli is causing a similar logjam? Does it hang forever or is it just obscenely slow? If it truely hangs forever it is probably something else. Is there disk activity after it hangs? Can you try it without geli? systat -vmstat might provide a clue. >>> >>> Well, it is geli. I'm unable to reproduce the freeze on the same >>> exact system with everything else the same except for no geli. I'm >>> going to move this thread over to geom, and continue it there. Thanks >>> for your help! >> >> It occurs to me that it will need twice as much memory for disk i/o. >> 1 buffer for encrypted and 1 for unencrypted. I know nothing about geli, >> so I don't know if it uses the buffer cache for both, or what. >> Could it be that the kernel isn't keeping enough memory free and >> manages to paint itself into a corner and not have space to store >> the unencrypted version of disk reads, and can't page/swap anything >> out to make space because it doesn't have space to store the encrypted >> version to write? > > I think that's probably about what is happening. I'm still waiting > for an answer on the geom mailing list, but I will do some testing > with increasing memory sizes and see where the problem stops > occurring. Some of the vfs.*buf sysctls might be useful? 
Re: "magic" crashes - mostly solved but
On Wed, 27 Jun 2012, Wojciech Puchar wrote:
> the reason was most probably out-of-date vbox and fuse kernel modules.
>
> after making everything in sync system boots successfully with WITNESS,
> INVARIANT etc. options enabled.
>
> STILL - mostly at booting i'm getting few messages.
>
> first comes when executing /etc/rc.d/named (at mounting devfs IMHO):
>
> Jun 27 18:32:23 foo kernel: lock order reversal:
> Jun 27 18:32:23 foo kernel: 1st 0xff80f5859800 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2636
> Jun 27 18:32:24 foo kernel:
> Jun 27 18:32:24 foo kernel: 2nd 0xff0005c82200 dirhash (dirhash) @ /usr/src/sys/ufs/ufs/ufs_dirhash.c:285

http://ipv4.sources.zabbadoz.net/freebsd/lor/261.html

> few more when mounting or unmounting (i'm not sure) pendrive.
>
> Jun 27 18:57:09 foo kernel: lock order reversal:
> Jun 27 18:57:09 foo kernel: 1st 0xff011ec78098 ufs (ufs) @ /usr/src/sys/kern/vfs_lookup.c:504
> Jun 27 18:57:09 foo kernel: 2nd 0xff80f5e1bb80 bufwait (bufwait) @ /usr/src/sys/ufs/ffs/ffs_softdep.c:6193
> Jun 27 18:57:09 foo kernel: 3rd 0xff011ead3d80 ufs (ufs) @

http://ipv4.sources.zabbadoz.net/freebsd/lor/285.html