wd2a: id not found writing fsbn 488397104 (wd2 bn 8796581419375;
cn  547561868 tn 158 sn 1), retrying

It may be trivial but I wonder where in the line you highlighted is the
clue that gave you the answer.

During the 4.1->4.2 development cycle, the disklabel layout has been
modified to allow sector numbers of up to 48 bits, instead of 32 bits.
This has been done by ``packing'' existing fields of the structure,
to get room for the extended values.

Disklabel handling is done by both the kernel (which needs to read
the label for its own needs, and also provides ioctls for userland
tools to be able to read and write labels), and the userland system
administration tools such as disklabel(8).

Of course, for this to work correctly, all the involved components
need to agree on the disklabel layout.

In the OP's problem, the wd2a error message reports an unreachable
block number (the bn value) which is a huge number: it fits in 48 bits,
but not in 32. This is a sure sign that disklabel(8) wrote an
old-style label on the disk.
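
For anyone who wants to check the arithmetic, here is a small Python
sketch. The two numbers are copied from the error message; the variable
names and the split into a 16-bit high part and a 32-bit low part are
only my illustration, not the kernel's code.

```python
# Values copied from the wd2a error message.
bn = 8796581419375    # absolute block number on wd2
fsbn = 488397104      # block number relative to the partition

# Does bn fit in the old 32-bit field, or only in the new 48-bit one?
fits_32 = bn < 2**32
fits_48 = bn < 2**48

# Split bn into an assumed 16-bit high part and 32-bit low part.
high16 = bn >> 32
low32 = bn & 0xFFFFFFFF

# The partition start, as the kernel understood it, is bn - fsbn.
start = bn - fsbn

print(fits_32, fits_48)       # False True
print(hex(high16))            # 0x800
print(low32 - fsbn)           # 63
print(f"{start:#014x}")       # 0x08000000003f
```

The low 32 bits are exactly fsbn plus 63, so the label apparently meant
the partition to start at sector 63, and the 0x800 in the high part is
what pushes the request far beyond the end of the disk.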

What exactly happened was:
- there was no label on the disk.
- disklabel -E starts by reading the label. Since there wasn't any,
  the kernel returns an empty label, flagged as being a ``new-style''
  48 bit layout.
- disklabel itself (because it is still the 4.1 binary) does not
  know about the new style format, and happily constructs a 32-bit
  style label. Unfortunately, this does not overwrite the ``new-style''
  flag.
- when disklabel asks the kernel to write the new label, the kernel
  does so and handles the label disklabel gave it as a ``new-style''
  format, not knowing that it comes from the old disklabel binary.

Editing the label with the old binary causes extra high-order bits
to appear where the new layout stores the higher part of a 48 bit
value on little-endian platforms; this causes the partition, which
disklabel wants to start at sector #0x0000.00xx (a 32 bit value), to
be handled by the kernel as #0x0800.0000.00xx (a 48 bit value). And
of course, this sector number does not exist on the device, hence the
I/O errors.
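
As a rough illustration of that last step, the Python sketch below
combines a 32-bit low part and a 16-bit high part into one 48-bit sector
number. The names sector_48, offset_lo and offset_hi are hypothetical,
0x3f merely stands in for the "xx" above, and this is not the actual
kernel code.

```python
def sector_48(offset_lo, offset_hi):
    """Combine a 32-bit low part and a 16-bit high part (hypothetical names)."""
    if not 0 <= offset_lo < 2**32:
        raise ValueError("low part must fit in 32 bits")
    if not 0 <= offset_hi < 2**16:
        raise ValueError("high part must fit in 16 bits")
    return (offset_hi << 32) | offset_lo


def dotted_hex(value):
    """Format a 48-bit value as 0xhhhh.hhhh.hhhh, as written in the text."""
    digits = f"{value:012x}"
    return "0x" + ".".join(digits[i:i + 4] for i in range(0, 12, 4))


# What the old disklabel binary meant: high part left at zero.
intended = sector_48(0x3f, 0x0000)
# What the kernel saw: stray bits sitting in the high part.
misread = sector_48(0x3f, 0x0800)

print(dotted_hex(intended))   # 0x0000.0000.003f
print(dotted_hex(misread))    # 0x0800.0000.003f
```

The low 32 bits are identical in both cases; only the high part differs,
which is why the label looks sane to the old binary while the kernel
computes a sector that does not exist.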

Miod
