I'm trying to get my Linux-based NTFS backup drive to pass a CHKDSK and came upon this curious situation where CHKDSK finds errors.
It seems to be some issue with how ntfs-3g modifies a directory index when renaming many files. The CHKDSK error always seems to be of the form:

    Stage 2: Examining file name linkage ...
    The first free byte, 0xc0, and bytes available, 0x150, for root index $I30 in file 0x40 are not equal.

I've attached a python script (mkbaddir.py) that creates two (apparently) identical directories, one of which reliably causes this CHKDSK error; the other doesn't.

How to demonstrate:

- Format an NTFS partition or thumbdrive using Windows or mkfs.ntfs.
- Mount the partition on a Linux system. I used Mint 20 with ntfs-3g 2017.3.23AR.3 integrated FUSE 28 and python 3.8.2.
- Chdir to the new NTFS partition and run the script:

    /tmp/mkbaddir.py           # creates 'baddir' in current dir.
    /tmp/mkbaddir.py -G        # creates 'gooddir' in current dir.
    diff -r baddir gooddir     # no difference
    du -sB1 baddir gooddir     # same size (128K)

- Boot into Windows (10 v1903) and run (from a terminal) chkdsk X: (where X: is the NTFS drive).
- This will say: "Errors found. CHKDSK cannot continue in read-only mode."
- Delete baddir (I used cygwin's rm -rf), and run chkdsk X: again.
- This will now have no errors.

My guess at what's happening:

The script creates a directory of 410 empty files and then renames them with slightly longer names, which as I understand it leaves a bunch of unused nodes in the b-tree. The -G option just renames the 410 known files by name; without the -G option, the script uses os.walk() to traverse the directory, which I'm guessing leaves the b-tree in a slightly different state with even more unused nodes. The 410 was chosen by trial and error so that some internal threshold is just exceeded by the baddir but not by the gooddir. With more than 410 (using the -c option; say -c 500), both baddir and gooddir will cause CHKDSK errors.

If I run the script on Windows/cygwin (Python 3.6.9) to create the folders, it does not give any CHKDSK errors, even with many more files.
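One way to check the guess about os.walk() would be to compare the order in which it yields the files with the order the -G path uses. The sketch below is only a diagnostic idea, not part of mkbaddir.py: walk_order() and same_order_as_creation() are hypothetical helper names, and it assumes a flat directory like the one the script creates.

```python
import os

def walk_order(dirname):
    """Return file names in the order os.walk() yields them.

    Assumes a flat directory (no subdirectories), as in the test case.
    """
    names = []
    for dirpath, dirs, files in os.walk(dirname, topdown=False):
        names.extend(files)
    return names

def same_order_as_creation(dirname, count, mkname):
    """True if os.walk() visits the files in the same order as range(count).

    mkname is the function that maps an index to a file name, as in the
    reproduction script. This only compares visiting order; it says
    nothing about the on-disk state of the index.
    """
    expected = [mkname(i) for i in range(count)]
    return walk_order(dirname) == expected
```

Running this on a freshly created directory, before any renaming, would show whether the two variants even visit the files in a different order. If the order is the same, the difference between baddir and gooddir would have to come from somewhere else, for example from the directory reads themselves.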
So there seems to be some issue with how ntfs-3g modifies the b-tree when renaming many files that is causing CHKDSK to complain.

I encountered this issue when trying to get my Linux-based NTFS backup drive to consistently pass a CHKDSK. I use a script to first rename POSIX names to valid Windows names, replacing '?' with '@@3F', etc., so I can reverse the renaming afterwards. I have some website mirror folders with many files of the form:

    details.asp?id=xxxxx&key=val

which gave rise to this issue. (In the mkbaddir script I use only alphanumeric names to make clear that this is not an illegal-character issue.)

--------------------- mkbaddir.py ------------------------------------------------
#!/usr/bin/python3
import os, re, argparse

def mkname(i):
    return "detaildetaildetail-N-%04d" % i

parser = argparse.ArgumentParser()
parser.add_argument('-G', '--good', action='store_true')
parser.add_argument( '-c', '--count', action='store', default=410)
opts = parser.parse_args()
count = int(opts.count)

if opts.good:
    dirname = 'gooddir'
else:
    dirname = 'baddir'

# Create a dir of files
os.mkdir(dirname)
for i in range(count):
    f = dirname + '/' + mkname(i)
    open(f,'a').close() # touch

# rename them
if opts.good:
    for i in range(count):
        f = mkname(i)
        nf = re.sub('N', "KK3F", f)
        os.rename(dirname+'/'+f, dirname+'/'+nf)
else:
    for dirpath, dirs, files in os.walk(dirname, topdown=False):
        for f in files:
            nf = re.sub('N', "KK3F", f)
            os.rename(dirname+'/'+f, dirname+'/'+nf)
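For context, the reversible renaming I mentioned ('?' becomes '@@3F', etc.) could look roughly like the sketch below. This is not my actual backup script: encode_name(), decode_name() and the WINDOWS_RESERVED set are names made up for illustration, and the sketch assumes that no original name already contains '@@' followed by two uppercase hex digits, since such a name would not survive the round trip.

```python
import re

# Characters that Windows does not allow in file names. This set is an
# assumption for the sketch; '/' is left out because it cannot occur in
# a POSIX file name anyway.
WINDOWS_RESERVED = '<>:"\\|?*'

def encode_name(name):
    """Replace each reserved character with '@@' plus its two-digit hex code."""
    return ''.join(
        '@@%02X' % ord(ch) if ch in WINDOWS_RESERVED else ch
        for ch in name
    )

def decode_name(name):
    """Reverse encode_name(): turn each '@@XX' escape back into its character.

    Assumes the original name contained no literal '@@XX' sequence.
    """
    return re.sub(
        r'@@([0-9A-F]{2})',
        lambda m: chr(int(m.group(1), 16)),
        name,
    )
```

With this scheme, details.asp?id=1&key=val is stored as details.asp@@3Fid=1&key=val, and decode_name() restores the original name. The encoded name is four characters longer per escaped character, which is the same kind of "slightly longer name" rename that mkbaddir.py imitates with N to KK3F.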
_______________________________________________ ntfs-3g-devel mailing list ntfs-3g-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/ntfs-3g-devel