For several years I have been making daily backups of my four Debian computers using rsync and a small script of my own devising. The data has been accumulating on an external USB drive, in a partition labeled 'gfx5'.

Some time ago I decided to make a copy of these data, so I would have more than one copy. I had to use rsync to do this because, if I were to use cp, the copies of files that are labeled by different dates and hard-linked together on gfx5 would exceed the capacity of the target disk (which was/is labeled 'gfx2'). With rsync this is a simple one-line command.

When I tried it, the job would always crash well before completion. Sometimes a simple repeat invocation would make further progress, sometimes not. I became curious. As I tried different ways of observing the progress of the transfer as it happened, I accumulated copies of failed transfers, and then discovered that I could not reliably delete a failed copy using the obvious 'rm -rfv ...'. I found that 'find -depth -print -delete' sometimes worked when 'rm -rfv ...' did not. But in both cases the deletion ultimately failed because 'gfx2' had been remounted read-only, which makes it impossible to update the target directory tree.
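For reference, the copy can be made with a single rsync invocation along these lines (a sketch only; the mount points /mnt/gfx5 and /mnt/gfx2 are placeholders, not my actual paths):

    # Copy the whole backup tree from one disk to the other.
    # -a  archive mode: recurse, keep permissions, times, symlinks
    # -H  preserve hard links, so files shared between the dated
    #     snapshots are stored only once; plain cp would write each
    #     linked copy out in full and overflow the target disk
    rsync -aH /mnt/gfx5/ /mnt/gfx2/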
I have not tried it, but from my investigation I'm sure that a massive delete of some obsolete file structure from the HD that was /dev/sda1 during the Debian install would trigger a remount-ro, which would surely lead to a system crash in short order.

I investigated further. These investigations were done on a computer that I call 'gq'. I set up experiments on 'gq' by using ssh to issue commands on 'gq' from my main desktop computer, 'big', with several ssh windows into 'gq'.

My first discovery was that after a crash while attempting to delete with 'find -depth -print -delete', there was a long delay in remounting 'gfx2' while the mount command replayed the ext4 journal on 'gfx2'.

Next I tried 'find -depth -print -delete' with some extra windows into 'gq' in which I issued the command 'sync'. The return from 'sync' was delayed, sometimes by as much as a minute, and if I didn't issue 'sync' commands frequently enough, there was never a return from 'sync', just the crash of the 'find' command. So frequent sync commands delayed the crash.

I found two other ways to delay the crash:

1) using nice, as in 'nice -n 19 find -depth -print -delete' (this, I think, slows down the main running job relative to the running of the kernel);

2) using Ctrl-Z to pause the 'find' job for a while (which I think also allows the kernel to catch up with the journal).

I could also monitor the progress of the journal run by issuing a sync command in a separate ssh window. (A sketch of the whole setup follows at the end of this message.)

I'm worried about what I found. I want to interest someone who has far more knowledge of how the kernel actually works internally to look into this. I have done other experiments that are more complicated to report. I can't find anything comforting about this situation. If you think it's OK, you probably don't understand, IMHO.

Kind regards,
-- 
Paul E Condon
pecon...@mesanetworks.net
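P.S. For anyone who wants to reproduce the experiment, here is roughly how I run it (a sketch; the path /mnt/gfx2/failed-copy and the 10-second interval are placeholders, not my exact values):

    # ssh window 1 into gq: run the delete at the lowest scheduling
    # priority, which in my tests lets the kernel keep up longer
    # before the remount-ro hits.
    cd /mnt/gfx2/failed-copy
    nice -n 19 find . -depth -print -delete

    # ssh window 2 into gq: issue sync frequently. Each sync returns
    # only after the queued writes reach the disk, so a slow return
    # shows how far behind the journal is running.
    while true; do
        time sync
        sleep 10
    done

Ctrl-Z in window 1 pauses the find job, and 'fg' resumes it once the sync in window 2 starts returning quickly again.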