On Sun, May 20, 2012 at 4:44 AM, Girish Venkatachalam <
[email protected]> wrote:

> On Sun, May 20, 2012 at 2:43 AM, Parikshith Mechineni
> <[email protected]> wrote:
> > I have two folders with jpeg files,
> > I am trying to figure out how to delete files that are in folder one that
> > also exist in folder two.
> > The file names of the two identical files are not the same, but the
> > hashes are the same (tried MD5 and SHA-1).
> > Does anyone have any idea how to do it?
> > Thanks in advance
> >
>
> Very simple query.
>
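> for f in "$oldfolder"/*
> do
>     # keep only the hash; sha1sum also prints the file name
>     sha=$(sha1sum "$f" | cut -d' ' -f1)
>     for f2 in "$newfolder"/*
>     do
>         sha2=$(sha1sum "$f2" | cut -d' ' -f1)
>         if [ "$sha" = "$sha2" ]; then
>             rm "$f2"
>         fi
>     done
> done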
>
> --
> Gayatri Hitech
> http://gayatri-hitech.com
> _______________________________________________
> ILUGC Mailing List:
> http://www.ae.iitm.ac.in/mailman/listinfo/ilugc
>


Store the SHA sum of each file in the first folder in a hash table (for
example, C++ std::unordered_map). Then take the SHA sum of each file in the
second folder. If the hash is already present, the file is a duplicate, so
delete it; if it is not present, keep the file in the second folder or copy
it to the first folder (or take whatever action you like). This hashes each
file only once, so it needs O(n) hash computations and lookups in total,
instead of rehashing the second folder for every file in the first as the
nested loop does.
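
A minimal sketch of the idea in bash, assuming bash 4 or later (for
associative arrays) and GNU sha1sum. The names dedupe_by_hash, keep_dir and
prune_dir are made up for illustration; only files directly inside the two
folders are considered, not subfolders.

```shell
#!/usr/bin/env bash
# Sketch only: dedupe_by_hash, keep_dir and prune_dir are illustrative names.
# Assumes bash 4+ (associative arrays) and GNU sha1sum.

dedupe_by_hash() {
    local keep_dir=$1 prune_dir=$2
    local -A seen=()      # hash table: SHA-1 -> path of a file in keep_dir
    local f sum key

    # Pass 1: record the SHA-1 of every regular file in keep_dir.
    for f in "$keep_dir"/*; do
        [ -f "$f" ] || continue
        sum=$(sha1sum < "$f") || continue
        key=${sum%% *}    # strip everything after the hash
        seen[$key]=$f
    done

    # Pass 2: hash each file in prune_dir once; delete it if the hash is known.
    for f in "$prune_dir"/*; do
        [ -f "$f" ] || continue
        sum=$(sha1sum < "$f") || continue
        key=${sum%% *}
        if [[ -n ${seen[$key]+x} ]]; then
            echo "removing $f (same content as ${seen[$key]})"
            rm -- "$f"
        fi
    done
}
```

To delete from folder one whatever also exists in folder two, as in the
original question, the call would be dedupe_by_hash folder2 folder1.
Replacing the rm line with a plain echo first gives a dry run.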

Hope this helps,
Prasanna Kumar T S M
