> On Jun 28, 2015, at 18:35, Philip Guenther <guent...@gmail.com> wrote:
>
> On Sun, Jun 28, 2015 at 4:27 PM, Chris Bennett
> <chrisbenn...@bennettconstruction.us> wrote:
>> On Mon, Jun 29, 2015 at 12:56:04AM +0200, nerv wrote:
>>> On Sun, 28 Jun 2015 17:39:18 -0500
>>> Chris Bennett <chrisbenn...@bennettconstruction.us> wrote:
>>>
>>>> I had 4 different hardrives that were failing.
>>>> I bought a 2TB usb drive to back up all the home folders.
>>>>
>>>> I now would like to cp all of the folders and files to another empty
>>>> partition.
>>>>
>>>> But I don't want to overwrite any files with same name but different
>>>> content.
>>>>
>>>> For example:
>>>>
>>>> /homeX/index.html to /homePerfect
>>>> /homeY/index.html to /homePerfect
>>>>
>>>> both have same name but different contents.
>>>>
>>>> I googled but couldn't find any solutions.
>>>> Ideally I would like a list of failed file copies.
>>>>
>>>> Any ideas or scripts or ports?
>>>> Browsing through 4 home folders is a nightmare.
>>>>
>>>> Chris Bennett
>>>>
>>>
>>> If you can't find a switch for cp you may have an easier time using
>>> rsync, but I'm not too familiar with it so I couldn't tell you
>>> what switches to use (It may be able to natively do what you're asking
>>> however).
>>> Writing a script for it using cp should be quite easy, for each of the
>>> partition have the script recursively go into all folders and copy the
>>> files after verifying if the name already exists in the target
>>> partition. If it does, compare checksums,
>>> same checksum : do nothing and go to the next file,
>>> different checksum : copy and append a number to its name (or append to
>>> it a name for the source partition).
>>
>> I looked at rsync and cp and gnu cp.
>> noclobber just won't do what I want.
>>
>> Using checksums seems like a good part of the answer, but name changing
>> would be very complicated. I have everything read-only except for
>> regular /home, /var, / and /tmp. I do some of my programming in /home
>> folder and I also have many html files. I already wrote software to
>> change file contents to new values, but that adds even more
>> complications for both of those areas.
>>
>> And I want to do this for 4 home folders!!???
>
> IMO, you're over thinking it.
>
> Step 1) GET THE DATA OFF THE FAILING DRIVES. Doing *anything* before
> that's done means you *want* to lose data.
>
> Step 2) okay, *now* that the data is safe, compare files between trees
> and delete duplicates
>
> Note that trying to dedup as it's copied will probably *increase* the
> number of times the data has to be read and thus increase the chance
> of lost data.
>
>
> Philip Guenther
>

Agreed. Save your data first, then merge.

rsync (pkgs) will help you with both steps.

For the initial save:

    # -a  archive mode: recurses and preserves dates, times and permissions
    # -H  preserves hard links <- can be memory intensive
    # -v  if you want to see each file by name
    # --progress  to see name + ETA
    rsync -aH /mnt/failing_w/homes/ /mnt/2tb/w/
    rsync -aH /mnt/failing_x/homes/ /mnt/2tb/x/
    rsync -aH /mnt/failing_y/homes/ /mnt/2tb/y/
    rsync -aH /mnt/failing_z/homes/ /mnt/2tb/z/

Merging:

Running rsync with the --backup and --suffix options renames a file that
already exists in the destination (by appending the suffix, in the same
directory) before the changed version is copied into place. Following is
an example. I recommend reading the rsync man page to understand the
options first.

    # disk w archive
    rsync -aH /mnt/2tb/w/ /mnt/2tb/merged/

    # disk x archive
    # -b == --backup
    # -c == --checksum
    #
    # Set a backup suffix that means something to you and change
    # it for each drive. Note that the suffix ends up on the copy
    # that was already in merged/, not on the incoming file.
    rsync -aHcb /mnt/2tb/x/ /mnt/2tb/merged/ --suffix=_x_sync.bak

Repeat for each remaining disk, changing the source directory and the
suffix.

Another option is to just use --dry-run to see the differences:

    rsync -aH /mnt/2tb/w/ /mnt/2tb/merged/
    rsync -aHcv /mnt/2tb/x/ /mnt/2tb/merged/ --dry-run

Using --dry-run alone shows what has changed or been added. Add --delete
to also list files in merged/ that do not exist in x/:

    rsync -aHcv /mnt/2tb/x/ /mnt/2tb/merged/ --dry-run --delete

--Aaron
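
P.S. If a plain list of name clashes is wanted before merging (same relative path, different contents), something along these lines may help. It is only a sketch: the function name list_conflicts is made up for illustration, it uses nothing beyond find and cmp, and it assumes file names contain no newlines. It only reads the two trees and never copies or changes anything.

```shell
#!/bin/sh
# list_conflicts SRC DST  (hypothetical helper, not part of rsync)
# Print the relative path of every regular file that exists in both
# trees but whose contents differ. Files found in only one tree and
# files with identical contents are not listed.
# The body runs in a subshell, so the cd and the variables do not
# leak into the calling shell.
list_conflicts() (
    src=$1
    # Resolve DST to an absolute path because we cd into SRC below.
    dst=$(cd "$2" && pwd) || exit 1
    cd "$src" || exit 1
    # Assumption: no newlines in file names (read splits on newlines).
    find . -type f | while IFS= read -r f; do
        # cmp -s is silent and exits non-zero when the files differ.
        if [ -f "$dst/$f" ] && ! cmp -s "$f" "$dst/$f"; then
            printf '%s\n' "${f#./}"
        fi
    done
)
```

For example, after sourcing the function, `list_conflicts /mnt/2tb/x /mnt/2tb/merged > x_conflicts.txt` would write the clashing paths for disk x to a file that can be reviewed before running the merge.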