Re: [go-nuts] Re: Duplicate File Checker Performance

2016-10-21 Thread Sri G
…the entire file must be hashed, so sadly I can't use these optimizations.

> On Sunday, October 16, 2016 at 1:26:24 PM UTC-4, Michael Jones wrote:
> Sri G,
> How does this time compare to my “Dup” program? I can’t test for you… since it is your filesystem… but
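(For context: the "optimizations" mentioned aren't quoted in this digest. A common one in duplicate checkers is to group files by size first and hash only the groups that collide, which is useless when every file must be fully hashed anyway. A minimal Go sketch of that size-grouping idea, assuming a single root-directory argument; this is illustrative, not code from the thread:)

    // Group files by size; only same-size files can be duplicates,
    // so only those groups need hashing at all.
    package main

    import (
        "fmt"
        "io/fs"
        "os"
        "path/filepath"
    )

    func main() {
        if len(os.Args) < 2 {
            fmt.Fprintln(os.Stderr, "usage: sizegroup DIR")
            return
        }
        bySize := make(map[int64][]string)
        filepath.WalkDir(os.Args[1], func(path string, d fs.DirEntry, err error) error {
            if err != nil || !d.Type().IsRegular() {
                return nil // skip unreadable entries and non-regular files
            }
            info, err := d.Info()
            if err != nil {
                return nil
            }
            bySize[info.Size()] = append(bySize[info.Size()], path)
            return nil
        })
        for size, paths := range bySize {
            if len(paths) > 1 {
                fmt.Printf("size %d: %d candidates to hash\n", size, len(paths))
            }
        }
    }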

Re: [go-nuts] Re: Duplicate File Checker Performance

2016-10-16 Thread Michael Jones
Oh, I see. Well, if you must read and hash every byte of every file, then you really are mostly measuring device speed.

From: … on behalf of Sri G
Date: Sunday, October 16, 2016 at 12:17 PM
To: golang-nuts
Cc:
Subject: Re: [go-nuts] Re: Duplicate File Checker Performance

This isn't…

Re: [go-nuts] Re: Duplicate File Checker Performance

2016-10-16 Thread Sri G
> …since it is your filesystem… but I thought I had it going about as fast as possible a few years ago when I wrote that one.
>
> https://github.com/MichaelTJones/dup
>
> Michael
>
> From: … on behalf of Sri G <sriakhil...@gmail.com>
> Date: Saturday, October 15, 2016 at 6:46 PM…

Re: [go-nuts] Re: Duplicate File Checker Performance

2016-10-16 Thread Michael Jones
…Date: Saturday, October 15, 2016 at 6:46 PM
To: golang-nuts
Subject: [go-nuts] Re: Duplicate File Checker Performance

Thanks. Made the Go code similar to the Python version using CopyBuffer with a block size of 65536:

    buf := make([]byte, 65536)
    if _, err := io.CopyBuffer(hash, file, buf); err…

[go-nuts] Re: Duplicate File Checker Performance

2016-10-15 Thread Sri G
Thanks. Made the Go code similar to the Python version using CopyBuffer with a block size of 65536:

    buf := make([]byte, 65536)
    if _, err := io.CopyBuffer(hash, file, buf); err != nil {
        fmt.Println(err)
    }

Didn't make too much of a difference; it was slightly faster. What got it to the…
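(Filled out as a self-contained sketch: hash and file aren't shown in the snippet above, so the md5 choice and the command-line argument here are assumptions, md5 matching the find -exec md5 benchmark later in the thread:)

    // Hash one file with a reusable 64 KiB buffer via io.CopyBuffer.
    package main

    import (
        "crypto/md5"
        "fmt"
        "io"
        "os"
    )

    func main() {
        file, err := os.Open(os.Args[1])
        if err != nil {
            fmt.Println(err)
            return
        }
        defer file.Close()

        hash := md5.New()
        buf := make([]byte, 65536) // same 64 KiB block size as the Python version
        if _, err := io.CopyBuffer(hash, file, buf); err != nil {
            fmt.Println(err)
            return
        }
        fmt.Printf("%x\n", hash.Sum(nil))
    }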

[go-nuts] Re: Duplicate File Checker Performance

2016-10-15 Thread Sri G
To diagnose this issue, I tried some benchmarks with time-tested tools. On the same directory:

    find DIR -type f -exec md5 {} \;
    5.36s user 2.93s system 50% cpu 16.552 total

Adding a hashmap on top of that wouldn't significantly increase the time. Making this multi-processed (32 processes): …
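(The snippet is cut off before the 32-process numbers. The Go analogue of that multi-process run would be a pool of worker goroutines; a minimal sketch, assuming 32 workers, md5 to match the benchmark, and a root-directory argument:)

    // 32 goroutines pull paths from a channel and md5 them concurrently,
    // each reusing its own 64 KiB copy buffer. Output order is
    // nondeterministic.
    package main

    import (
        "crypto/md5"
        "fmt"
        "io"
        "io/fs"
        "os"
        "path/filepath"
        "sync"
    )

    func main() {
        paths := make(chan string, 64)
        var wg sync.WaitGroup
        for i := 0; i < 32; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                buf := make([]byte, 65536)
                for p := range paths {
                    f, err := os.Open(p)
                    if err != nil {
                        continue
                    }
                    h := md5.New()
                    if _, err := io.CopyBuffer(h, f, buf); err == nil {
                        fmt.Printf("%x  %s\n", h.Sum(nil), p)
                    }
                    f.Close()
                }
            }()
        }
        filepath.WalkDir(os.Args[1], func(p string, d fs.DirEntry, err error) error {
            if err == nil && d.Type().IsRegular() {
                paths <- p
            }
            return nil
        })
        close(paths)
        wg.Wait()
    }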