Re: diff or deduplicate two volumes with different folder structures

2016-09-22 Thread Chris Murphy
On Thu, Sep 22, 2016 at 12:56 PM, Matthew Miller wrote:
> On Thu, Sep 22, 2016 at 07:57:48PM +0200, Roberto Ragusa wrote:
>> > Don't use MD5. You will get unintentional file collisions. (SHA-256 is
>> > good. It depends on just how much you are comparing.)
>> MD5 unintentional collisions?
>> It is

Re: diff or deduplicate two volumes with different folder structures

2016-09-22 Thread Matthew Miller
On Thu, Sep 22, 2016 at 07:57:48PM +0200, Roberto Ragusa wrote:
> > Don't use MD5. You will get unintentional file collisions. (SHA-256 is
> > good. It depends on just how much you are comparing.)
> MD5 unintentional collisions?
> It is 128 bit, so you will have a collision after about 2^64 files,

Re: diff or deduplicate two volumes with different folder structures

2016-09-22 Thread Roberto Ragusa
On 09/21/2016 01:01 AM, a...@clueserver.org wrote:
> Don't use MD5. You will get unintentional file collisions. (SHA-256 is
> good. It depends on just how much you are comparing.)
MD5 unintentional collisions?
It is 128 bit, so you will have a collision after about 2^64 files,
according to the bi
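The 2^64 figure in this message can be checked with the usual birthday approximation. What follows is a rough worked estimate, not part of the original message. It assumes MD5 behaves like an ideal random 128-bit function, so it covers accidental collisions only, not deliberately constructed ones. The file count n = 10^9 is an illustrative assumption.

```latex
% n = number of distinct files, b = hash width in bits
\[
  P(\text{collision}) \;\approx\; 1 - e^{-n(n-1)/2^{b+1}} \;\approx\; \frac{n^{2}}{2^{b+1}}
  \qquad (n \ll 2^{b/2})
\]
% Example with b = 128 and n = 10^9 files:
\[
  P \;\approx\; \frac{10^{18}}{2^{129}} \;\approx\; 1.5 \times 10^{-21}
\]
% The probability reaches 1/2 near
\[
  n \;\approx\; \sqrt{2 \ln 2 \cdot 2^{128}} \;\approx\; 1.18 \cdot 2^{64} \;\approx\; 2.2 \times 10^{19}
\]
```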

Re: diff or deduplicate two volumes with different folder structures

2016-09-21 Thread Chris Murphy
What I ended up doing:
$ find /brickA -type f -exec md5sum "{}" + > brickA.txt
$ find /brickB -type f -exec md5sum "{}" + > brickB.txt
$ cut -c 1-32 brickA.txt > brickA_md5.txt
$ grep -v -F -f brickA_md5.txt brickB.txt > onbrickB_notonbrickA.txt
Thanks for the help everyone.
Chris Murphy
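The same idea can be wrapped in a small function. This is a minimal sketch, not the poster's exact method. The name `missing_from` is hypothetical, and it assumes GNU coreutils and file names without newlines or backslashes. It swaps in `sha256sum`, as suggested elsewhere in the thread, and compares the hash field only, whereas `grep -F -f` would also match a hash string that happens to occur inside a file name.

```shell
# Hypothetical helper (not from the thread): list files under $2 whose
# content hash does not occur anywhere under $1.
# Assumes GNU coreutils sha256sum; file names must not contain newlines
# or backslashes, because sha256sum escapes those in its output.
missing_from() {
    a_hashes=$(mktemp) || return 1
    # One hash per line for every regular file under the first tree.
    find "$1" -type f -exec sha256sum {} + | cut -c 1-64 > "$a_hashes"
    # Load those hashes, then print only the lines from the second tree
    # whose first field (the hash) was not seen in the first tree.
    find "$2" -type f -exec sha256sum {} + |
        awk -v list="$a_hashes" '
            BEGIN { while ((getline line < list) > 0) seen[line] = 1 }
            !($1 in seen)
        '
    rm -f "$a_hashes"
}
```

Usage would look like `missing_from /brickA /brickB > onbrickB_notonbrickA.txt`. Swapping the two arguments gives the list for the other direction.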

Re: diff or deduplicate two volumes with different folder structures

2016-09-20 Thread alan
> On Tue, Sep 20, 2016 at 10:52:10PM +0200, Ahmad Samir wrote:
>> One last try (sometimes an issue nags):
>> $ find A -exec md5sum '{}' + > a-md5
>> $ find B -exec md5sum '{}' + > b-md5
>> $ cat a-md5 b-md5 > All
>> $ sort -u -k 1,1 All > dupes
>>
>> Now, (I hopefully got my head around it this tim

Re: diff or deduplicate two volumes with different folder structures

2016-09-20 Thread Jon LaBadie
On Tue, Sep 20, 2016 at 10:52:10PM +0200, Ahmad Samir wrote:
> One last try (sometimes an issue nags):
> $ find A -exec md5sum '{}' + > a-md5
> $ find B -exec md5sum '{}' + > b-md5
> $ cat a-md5 b-md5 > All
> $ sort -u -k 1,1 All > dupes
>
> Now, (I hopefully got my head around it this time...), t

Re: diff or deduplicate two volumes with different folder structures

2016-09-20 Thread Ahmad Samir
One last try (sometimes an issue nags):
$ find A -exec md5sum '{}' + > a-md5
$ find B -exec md5sum '{}' + > b-md5
$ cat a-md5 b-md5 > All
$ sort -u -k 1,1 All > dupes
Now, (I hopefully got my head around it this time...), the dupes file should contain a list of files that exist in _both_ A and B;
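On its own, `sort -u -k 1,1` keeps one line per distinct hash, so its output also includes hashes that occur only once. Below is a minimal sketch of one way to list only the entries whose hash appears in both listings, assuming `a-md5` and `b-md5` were produced as in the message. The function name `in_both` is hypothetical and not from the thread.

```shell
# Hypothetical helper (not from the thread): print the lines of the second
# md5sum listing whose hash also occurs in the first listing.
in_both() {
    awk -v list="$1" '
        BEGIN {
            # Remember the first field (the hash) of every line in the first listing.
            while ((getline line < list) > 0) {
                split(line, field, " ")
                seen[field[1]] = 1
            }
        }
        # Print a line of the second listing only if its hash was remembered.
        $1 in seen
    ' "$2"
}
```

Usage would look like `in_both a-md5 b-md5 > dupes`. Adding `-type f` to the `find` commands avoids passing directories to `md5sum`.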

Re: diff or deduplicate two volumes with different folder structures

2016-09-20 Thread stan
On Mon, 19 Sep 2016 17:23:39 -0600 Chris Murphy wrote:
> Drives A and B have many overlapping files but I want to find out what
> files don't exist on each. Thwarting this is directory structure
> differs between the two drives, and I'm fairly certain some of the
> file names differ on the two dr

Re: diff or deduplicate two volumes with different folder structures

2016-09-20 Thread Chris Murphy
On Tue, Sep 20, 2016 at 11:55 AM, Ahmad Samir wrote:
> On 20 September 2016 at 13:00, Ahmad Samir wrote:
>> On 20 September 2016 at 12:34, Ahmad Samir wrote:
>>> On 20 September 2016 at 10:33, Ahmad Samir wrote:
>>>> Here's a crude way:
>>>> $ find /brickA -type f -exec md5sum "{}" + | sor

Re: diff or deduplicate two volumes with different folder structures

2016-09-20 Thread Ahmad Samir
On 20 September 2016 at 13:00, Ahmad Samir wrote:
> On 20 September 2016 at 12:34, Ahmad Samir wrote:
>> On 20 September 2016 at 10:33, Ahmad Samir wrote:
>>>
>>> Here's a crude way:
>>> $ find /brickA -type f -exec md5sum "{}" + | sort > brickA.txt
>>> $ find /brickB -type f -exec md5sum "{}" +

Re: diff or deduplicate two volumes with different folder structures

2016-09-20 Thread Ahmad Samir
On 20 September 2016 at 12:34, Ahmad Samir wrote:
> On 20 September 2016 at 10:33, Ahmad Samir wrote:
>>
>> Here's a crude way:
>> $ find /brickA -type f -exec md5sum "{}" + | sort > brickA.txt
>> $ find /brickB -type f -exec md5sum "{}" + | sort > brickB.txt
>> $ diff -U 0 brickA.txt brickB.txt

Re: diff or deduplicate two volumes with different folder structures

2016-09-20 Thread Ahmad Samir
On 20 September 2016 at 10:33, Ahmad Samir wrote:
> On 20 September 2016 at 01:23, Chris Murphy wrote:
>> Drives A and B have many overlapping files but I want to find out what
>> files don't exist on each. Thwarting this is directory structure
>> differs between the two drives, and I'm fairly ce

Re: diff or deduplicate two volumes with different folder structures

2016-09-20 Thread Ahmad Samir
On 20 September 2016 at 01:23, Chris Murphy wrote:
> Drives A and B have many overlapping files but I want to find out what
> files don't exist on each. Thwarting this is directory structure
> differs between the two drives, and I'm fairly certain some of the
> file names differ on the two drives

Re: diff or deduplicate two volumes with different folder structures

2016-09-19 Thread geo.inbox.ignored
On 09/19/2016 06:23 PM, Chris Murphy wrote:
> Drives A and B have many overlapping files but I want to find out what
> files don't exist on each.
you might consider;
rsync -avh /brickA/ /brickB/
then
rsync -avh /brickB/ /brickA/
to dupe files on both drives. read 'man rsync' for argument