Yes, you could say it calculates hashes in some manner. However, all the test content so far is pure ASCII, which stays the same regardless of how you look at it (unlike Unicode), and the hashing is done on words only, i.e., spaces and line endings do not affect the hashes.
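To illustrate the point about words only, here is a minimal sketch, not fsimilar's actual code. The `wordHash` helper and the choice of FNV-1a are assumptions made purely for illustration; the idea is only that when the input is split on whitespace before hashing, LF vs. CRLF line endings and extra spaces cannot change the result.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// wordHash is a hypothetical helper (not taken from fsimilar).
// It splits s on any whitespace and feeds only the words to the hash,
// so spaces, tabs, LF and CRLF never reach the hash function.
func wordHash(s string) uint64 {
	h := fnv.New64a()
	for _, w := range strings.Fields(s) {
		h.Write([]byte(w))
		h.Write([]byte{0}) // word separator, independent of the original whitespace
	}
	return h.Sum64()
}

func main() {
	unix := "Python Standard Library\n"
	dos := "Python  Standard\r\nLibrary\r\n"
	// Same words, different spacing and line endings.
	fmt.Println(wordHash(unix) == wordHash(dos)) // true
}
```

If the real tokenizer behaves like this, a line-ending conversion during git checkout would not by itself explain a different result on Travis.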
Thanks a lot for your help though, Jakob.

On Mon, Sep 4, 2017 at 2:17 AM, Jakob Borg <ja...@kastelo.net> wrote:

> Hi,
>
> It's not especially clear from your mail what your tool does, exactly. But
> assuming that it calculates hashes of content in some manner, my first
> guess would be that your test data character set and/or line endings get
> changed by the git checkin/checkout procedure.
>
> //jb
>
> > On 4 Sep 2017, at 05:34, Tong Sun <suntong...@gmail.com> wrote:
> >
> > Hi,
> >
> > I've bumped into another "same code different result" problem -- my `go test` runs fine locally but on Travis,
> > https://travis-ci.org/go-dedup/fsimilar/builds/271540570
> > it is broken.
> >
> > I've verified at least four or five times that all my local code have been pushed to github. Now I've run out of ideas why the same source will have different behavior after compiling into executables on different machines.
> >
> > Mine is go 1.9 under Ubuntu 17.04.
> >
> > Somebody help please.
> >
> > FYI, the tool I'm building would spot similar files within the file system very quickly.
> >
> > $ fsimilar
> > find/file similar
> > Version 0.1.0 built on 2017-09-03
> >
> > Find similar files
> >
> > Options:
> >
> >   -h, --help             display help information
> >   -S, --size-given       size of the files in input as first field
> >   -Q, --query-size       query the file sizes from os
> >   -i, --input            *input from stdin or the given file (mandatory)
> >   -p, --phonetic         use phonetic as words for further error tolerant
> >   -F, --final            produce final output, the recommendations
> >   -c, --cp[=$FSIM_CP]    config path, path that hold all template files
> >   -v, --verbose          verbose mode (multiple -v increase the verbosity)
> >
> > Commands:
> >
> >   sim   Filter the input using simhash similarity check
> >   vec   Use Vector Space for similarity check
> >
> > $ cat test/sim.lstA
> > test/sim/Audio Book - The Grey Coloured Bunnie.mp3
> > test/sim/GNU - Python Standard Library (2001).rar
> > test/sim/PopupTest.java
> > test/sim/(eBook) GNU - Python Standard Library 2001.pdf
> > test/sim/Python Standard Library.zip
> > test/sim/GNU - 2001 - Python Standard Library.pdf
> > test/sim/LayoutTest.java
> > test/sim/ColoredGrayBunny.ogg
> >
> > $ fsimilar sim
> > Filter the input using simhash similarity check
> >
> > Usage:
> >   mlocate -i soccer | fsimilar sim -i
> >
> > Options:
> >
> >   -h, --help             display help information
> >   -S, --size-given       size of the files in input as first field
> >   -Q, --query-size       query the file sizes from os
> >   -i, --input            *input from stdin or the given file (mandatory)
> >   -p, --phonetic         use phonetic as words for further error tolerant
> >   -F, --final            produce final output, the recommendations
> >   -c, --cp[=$FSIM_CP]    config path, path that hold all template files
> >   -v, --verbose          verbose mode (multiple -v increase the verbosity)
> >   -d, --dist[=3]         the hamming distance of hashes within which to deem similar
> >
> > $ fsimilar sim -i test/sim.lstA -d 12
> >   1 test/sim/(eBook) GNU - Python Standard Library 2001.pdf
> >   1 test/sim/GNU - Python Standard Library (2001).rar
> >
> >   1 test/sim/GNU - 2001 - Python Standard Library.pdf
> >   1 test/sim/Python Standard Library.zip
> >
> > $ fsimilar vec
> > Use Vector Space for similarity check
> >
> > Usage:
> >   { mlocate -i soccer; mlocate -i football; } | fsimilar sim -i | fsimilar vec -i -S -Q -F
> >
> > Options:
> >
> >   -h, --help             display help information
> >   -S, --size-given       size of the files in input as first field
> >   -Q, --query-size       query the file sizes from os
> >   -i, --input            *input from stdin or the given file (mandatory)
> >   -p, --phonetic         use phonetic as words for further error tolerant
> >   -F, --final            produce final output, the recommendations
> >   -c, --cp[=$FSIM_CP]    config path, path that hold all template files
> >   -v, --verbose          verbose mode (multiple -v increase the verbosity)
> >   -t, --thr[=0.86]       the threshold above which to deem similar (0.8 = 80%)
> >
> > $ fsimilar vec -i test/sim.lstA -t 0.7
> >   1 test/sim/GNU - Python Standard Library (2001).rar
> >   1 test/sim/(eBook) GNU - Python Standard Library 2001.pdf
> >   1 test/sim/Python Standard Library.zip
> >   1 test/sim/GNU - 2001 - Python Standard Library.pdf
> >
> > $ fsimilar vec -i test/sim.lstA -t 0.7 -p
> >   1 test/sim/Audio Book - The Grey Coloured Bunnie.mp3
> >   1 test/sim/ColoredGrayBunny.ogg
> >
> >   1 test/sim/GNU - Python Standard Library (2001).rar
> >   1 test/sim/(eBook) GNU - Python Standard Library 2001.pdf
> >   1 test/sim/Python Standard Library.zip
> >   1 test/sim/GNU - 2001 - Python Standard Library.pdf
> >
> > I meant, I hope you can try pulling off from remote yourself and try testing it with your local machine, as it would be a useful tool eventually.
> >
> > Thanks for helping!
> >
> > --
> > You received this message because you are subscribed to the Google Groups "golang-nuts" group.
> > To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
> > For more options, visit https://groups.google.com/d/optout.
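For readers unfamiliar with the `-d` option quoted above ("the hamming distance of hashes within which to deem similar"), the comparison can be sketched roughly as follows. This is an illustration under assumptions, not fsimilar's implementation: the 64-bit hash width and the `hammingDistance` and `similar` names are hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hammingDistance counts the bit positions in which two hashes differ.
// A 64-bit hash is assumed here for illustration only.
func hammingDistance(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}

// similar reports whether two hashes differ in at most maxDist bits,
// which is the role the -d threshold appears to play.
func similar(a, b uint64, maxDist int) bool {
	return hammingDistance(a, b) <= maxDist
}

func main() {
	a := uint64(0xF0F0)
	b := uint64(0xF0FF) // differs from a in the lowest four bits

	fmt.Println(hammingDistance(a, b)) // 4
	fmt.Println(similar(a, b, 3))      // false
	fmt.Println(similar(a, b, 12))     // true
}
```

A larger threshold such as `-d 12` accepts pairs that the default of 3 would reject, so a small difference in the computed hashes between two machines can be enough to change which files get grouped.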