On Fri, Jan 05, 2018 at 01:31:08PM +0100, Jan Stary wrote:

> On Dec 25 22:07:12, o...@drijf.net wrote:
> > On Mon, Dec 25, 2017 at 11:48:22AM +0100, Jan Stary wrote:
> >
> > > This is current/amd64. The nightly dump (full mail and daily.local
> > > and df -hi at the bottom) reports a lot of errors such as
> > >
> > > read error from /dev/rsd2a: Invalid argument: [block -25232932862]:
> > > count=10240
> > >
> > > Obviously, there is no block -25232932862, but dump(8) must mean
> > > some certain block by that, if it knows the count.
> > > Is this simply an overflow in the (traverse.c)
> > >
> > >     msg("read error from %s: %s: [block %lld]: count=%d\n",
> > >         disk, strerror(errno), (long long)blkno, size);
> > >
> > > or the previous pread()? The disk seems to be working just fine,
> > > and a complete dd read of sd2c reports no errors:
> > >
> > > sd2:
> > > 3815602+1 records in
> > > 3815602+1 records out
> > > 250059350016 bytes transferred in 46028.063 secs (5432758 bytes/sec)
> >
> > I would start with unmounting the filesystem and doing a (forced) fsck
> > to see if your filesystem is corrupted.
>
> It's clean and fsck did not report anything.
>
> I should have mentioned that the failure happened during heavy activity
> on the disk (bitcoin full node synchronizing itself).
>
>
> On Jan 04 16:47:41, o...@drijf.net wrote:
> > On Thu, Jan 04, 2018 at 02:53:31PM +0100, Otto Moerbeek wrote:
> >
> > > On Thu, Jan 04, 2018 at 09:11:04AM +0100, Otto Moerbeek wrote:
> > >
> > > > On Wed, Jan 03, 2018 at 09:44:55PM -0600, Colton Lewis wrote:
> > > >
> > > > > When I try to run fsck on partition m of this disk:
> > > > >
> > > > > # /dev/rsd1c:
> > > > > type: SCSI
> > > > > disk: SCSI disk
> > > > > label: TOSHIBA MD04ACA4
> > > > > duid: 8ad0895bc1395d21
> > > > > flags:
> > > > > bytes/sector: 512
> > > > > sectors/track: 63
> > > > > tracks/cylinder: 255
> > > > > sectors/cylinder: 16065
> > > > > cylinders: 486401
> > > > > total sectors: 7814037168
> > > > > boundstart: 262208
> > > > > boundend: 7814037168
> > > > > drivedata: 0
> > > > >
> > > > > 16 partitions:
> > > > > #                size           offset  fstype [fsize bsize   cpg]
> > > > >   a:          1136000           262208  4.2BSD   2048 16384  8875
> > > > >   b:          1821490          1398208    swap
> > > > >   c:       7814037168                0  unused
> > > > >   d:          1571840          3219712  4.2BSD   2048 16384 12280
> > > > >   e:          2318784          4791552  4.2BSD   2048 16384 12958
> > > > >   f:          2672000          7110336  4.2BSD   2048 16384 12958
> > > > >   g:          1545856          9782336  4.2BSD   2048 16384 12077
> > > > >   h:          4944064         11328192  4.2BSD   2048 16384 12958
> > > > >   i:           262144               64   MSDOS
> > > > >   j:          2428672         16272256  4.2BSD   2048 16384 12958
> > > > >   k:          6954496         18700928  4.2BSD   2048 16384 12958
> > > > >   l:          7898912         25655424  4.2BSD   2048 16384 12958
> > > > >   m:       7780482560         33554560  4.2BSD   8192 65536     1
> > > > >
> > > > > fsck reports that it cannot read negative block numbers:
> > > > >
> > > > > ** /dev/rsd1m
> > > > > BAD SUPER BLOCK: MAGIC NUMBER WRONG
> > > > >
> > > > > LOOK FOR ALTERNATE SUPERBLOCKS? yes
> > > > >
> > > > >
> > > > > CANNOT READ: BLK 749213312
> > > > > CONTINUE? yes
> > > > >
> > > > > THE FOLLOWING DISK SECTORS COULD NOT BE READ: 749213312, 749213313,
> > > > > 749213314, 749213315, 749213316, 749213317, 749213318, 749213319,
> > > > >
> > > > > CANNOT READ: BLK -2147483648
> > > > > CONTINUE? yes
> > > > >
> > > > > THE FOLLOWING DISK SECTORS COULD NOT BE READ: -2147483648,
> > > > > -2147483647, -2147483646, -2147483645, -2147483644, -2147483643,
> > > > > -2147483642, -2147483641, -2147483640, -2147483639, -2147483638,
> > > > > -2147483637, -2147483636, -2147483635, -2147483634, -2147483633,
> > > > >
> > > > > ...<repeat for the rest of the disk>
> > > > >
> > > > > How can I make sure fsck can handle a partition this size? There is
> > > > > nothing important on there at the moment.
> > > > >
> > > > > --
> > > > > Sincerely,
> > > > > Colton Lewis
> > > >
> > > > Did you actually newfs that partition? It looks like not, since no
> > > > superblock or alternate is found.
> > > >
> > > > That said, it looks like there's an overflow somewhere. I do not have
> > > > the hardware to investigate this though.
> > > >
> > > > On a side note: a partition that large will cause problems in other
> > > > areas. Even if it would work, the memory needed to do an fsck will be
> > > > huge.
> > > >
> > > > Also: provide dmesg! The platform involved can play a role in this.
> > > >
> > > > -Otto
> > >
> > > I tried to reproduce your problem using a vnd image backed by a sparse
> > > file.
> > >
> > > If I do not newfs the device, I get results very similar to what you
> > > are seeing.
> > >
> > > If I newfs the partition first, an fsck -f works as expected. So without
> > > further information, I assume you did not run newfs.
> > >
> > > I'll investigate the negative block numbers.
> > >
> > > -Otto
> >
> > This diff should fix the negative block numbers here,
>
> Given that my dump(8) problem above also reported negative block
> numbers, is there a similar glitch in dump? At some places, blkno
> is cast to different int_ types (but the disk code is way over my head).

Yes, that is very well possible.

	-Otto

>
> Jan
>
> > Index: fsck.h
> > ===================================================================
> > RCS file: /cvs/src/sbin/fsck_ffs/fsck.h,v
> > retrieving revision 1.31
> > diff -u -p -r1.31 fsck.h
> > --- fsck.h	19 Jan 2015 18:20:47 -0000	1.31
> > +++ fsck.h	4 Jan 2018 15:46:37 -0000
> > @@ -229,7 +229,7 @@ extern long numdirs, listmax, inplast;
> >  long	secsize;		/* actual disk sector size */
> >  char	nflag;			/* assume a no response */
> >  char	yflag;			/* assume a yes response */
> > -int	bflag;			/* location of alternate super block */
> > +daddr_t	bflag;			/* location of alternate super block */
> >  int	debug;			/* output debugging info */
> >  int	cvtlevel;		/* convert to newer file system format */
> >  char	usedsoftdep;		/* just fix soft dependency inconsistencies */
> > Index: main.c
> > ===================================================================
> > RCS file: /cvs/src/sbin/fsck_ffs/main.c,v
> > retrieving revision 1.50
> > diff -u -p -r1.50 main.c
> > --- main.c	9 Sep 2016 15:37:15 -0000	1.50
> > +++ main.c	4 Jan 2018 15:46:37 -0000
> > @@ -48,7 +48,7 @@
> >
> >  volatile sig_atomic_t returntosingle;
> >
> > -int	argtoi(int, char *, char *, int);
> > +long long argtoi(int, char *, char *, int);
> >  int	checkfilesys(char *, char *, long, int);
> >  int	main(int, char *[]);
> >
> > @@ -78,7 +78,8 @@ main(int argc, char *argv[])
> >  		case 'b':
> >  			skipclean = 0;
> >  			bflag = argtoi('b', "number", optarg, 10);
> > -			printf("Alternate super block location: %d\n", bflag);
> > +			printf("Alternate super block location: %lld\n",
> > +			    (long long)bflag);
> >  			break;
> >
> >  		case 'c':
> > @@ -140,13 +141,13 @@ main(int argc, char *argv[])
> >  	exit(ret);
> >  }
> >
> > -int
> > +long long
> >  argtoi(int flag, char *req, char *str, int base)
> >  {
> >  	char *cp;
> > -	int ret;
> > +	long long ret;
> >
> > -	ret = (int)strtol(str, &cp, base);
> > +	ret = strtoll(str, &cp, base);
> >  	if (cp == str || *cp)
> >  		errexit("-%c flag requires a %s\n", flag, req);
> >  	return (ret);
> > Index: setup.c
> > ===================================================================
> > RCS file: /cvs/src/sbin/fsck_ffs/setup.c,v
> > retrieving revision 1.63
> > diff -u -p -r1.63 setup.c
> > --- setup.c	9 Sep 2016 15:37:15 -0000	1.63
> > +++ setup.c	4 Jan 2018 15:46:37 -0000
> > @@ -202,7 +202,7 @@ setup(char *dev, int isfsdb)
> >  		}
> >  found:
> >  		doskipclean = 0;
> > -		pwarn("USING ALTERNATE SUPERBLOCK AT %d\n", bflag);
> > +		pwarn("USING ALTERNATE SUPERBLOCK AT %lld\n", (long long)bflag);
> >  	}
> >  	if (debug)
> >  		printf("clean = %d\n", sblock.fs_clean);
> >