...oh, and in case you wonder about count and size variation, home directory trees are noisy...
celeste:since mtj$ since -v -d 1m ~
/Users/mtj/Library/Application Support/Google/Chrome
/Users/mtj/Library/Application Support/Google/Chrome/Default/Application Cache/Cache/data_1
/Users/mtj/Library/Application Support/Google/Chrome/Default/Cookies
/Users/mtj/Library/Application Support/Google/Chrome/Default/Cookies-journal
/Users/mtj/Library/Application Support/Google/Chrome/Default/Extension State/000016.log
/Users/mtj/Library/Application Support/Google/Chrome/Default/File System/023/p/.usage
/Users/mtj/Library/Application Support/Google/Chrome/Default/GPUCache/data_1
/Users/mtj/Library/Application Support/Google/Chrome/Default/IndexedDB/https_docs.google.com_0.indexeddb.leveldb
/Users/mtj/Library/Application Support/Google/Chrome/Default/IndexedDB/https_docs.google.com_0.indexeddb.leveldb/LOG
/Users/mtj/Library/Application Support/Google/Chrome/Default/Local Extension Settings/ghbmnnjooekpmoecnnnilnnbdlolhkhi/000003.log
/Users/mtj/Library/Application Support/Google/Chrome/Default/Local Storage/leveldb/059599.log
/Users/mtj/Library/Application Support/Google/Chrome/Default/QuotaManager
/Users/mtj/Library/Application Support/Google/Chrome/Default/QuotaManager-journal
/Users/mtj/Library/Application Support/Google/Chrome/Default/Session Storage/008658.log
/Users/mtj/Library/Application Support/Google/Chrome/Local State
/Users/mtj/Library/Caches/Google/Chrome/Default/Cache
/Users/mtj/Library/Caches/Google/Chrome/Default/Cache/data_0
/Users/mtj/Library/Caches/Google/Chrome/Default/Cache/data_1
/Users/mtj/Library/Caches/Google/Chrome/Default/Cache/data_3
/Users/mtj/Library/Logs/CreativeCloud/CoreSync/CoreSync-2018-10-07.log
/Users/mtj/Library/Saved Application State/com.apple.Terminal.savedState/data.data
/Users/mtj/Library/Saved Application State/com.apple.Terminal.savedState/window_2.data
2018/10/07 09:17:11 total: 293233 files (100.00%), 512173553646 bytes (100.00%)
2018/10/07 09:17:11 recent: 22 files (0.01%), 144624988 bytes (0.03%) in 1.1883 seconds
celeste:since mtj$

...changes in my last minute, each invisible to me.

On Sun, Oct 7, 2018 at 9:10 AM Michael Jones <michael.jo...@gmail.com> wrote:

> impressively patient response!
>
> choosing a serial vs concurrent approach matters too in terms of
> performance.
>
> celeste:tour4 mtj$ tour4 ~
> Go walker [/Users/mtj]
> walked 293426 files containing 512174988291 bytes in 4.895 seconds
> walked 293426 files containing 512174988537 bytes in 4.918 seconds
> walked 293425 files containing 512174988493 bytes in 6.096 seconds
>
> fs walker [/Users/mtj]
> walked 293425 files containing 512174988493 bytes in 4.181 seconds
> walked 293425 files containing 512174988493 bytes in 3.858 seconds
> walked 293425 files containing 512174035591 bytes in 4.328 seconds
>
> am walker [/Users/mtj]
> walked 293425 files containing 512174035591 bytes in 3.177 seconds
> walked 293425 files containing 512174035591 bytes in 3.878 seconds
> walked 293425 files containing 512174035591 bytes in 3.453 seconds
>
> mtj walker [/Users/mtj]
> walked 293425 files containing 512174035837 bytes in 1.199 seconds
> walked 293425 files containing 512174035837 bytes in 1.074 seconds
> walked 293425 files containing 512174035837 bytes in 1.371 seconds
> celeste:tour4 mtj$
>
> look at walk, since, and dup here for examples:
> https://github.com/MichaelTJones
>
>
> On Sun, Oct 7, 2018 at 8:40 AM Marvin Renich <m...@renich.org> wrote:
>
>> [I've set reply-to to include you (per your reply-to) but to exclude me; I
>> prefer to read my list mail on the list rather than in my personal inbox.]
>>
>> * rob solomon <drrob...@verizon.net> [181006 15:17]:
>> > I've been trying to do something simple like this, but I'm not interested in
>> > following symlinks.  Here I just am interested in summing all subdirectories
>> > from my start directory.  But I'm not getting consistent sums, especially if
>> > I start from my home directory.
>> >
>> > I guess I'm not handling errors, but I don't know how to handle them in a
>> > way that allows continuing w/ all directories until all non-error-producing
>> > directories are walked and summed.
>> >
>> > I don't want to follow symlinks.
>> >
>> > I use this on Ubuntu amd64 16.04, 18.04 and Win10.
>> >
>> > Compiled w/ 1.11.1 (and earlier, but that doesn't matter now).
>> >
>> > I don't know how to post code without bombing this list.
>>
>> This list likes to use The Go Playground at https://play.golang.org to
>> share code that is not exceedingly large.  I have taken your program,
>> repaired line breaks added by the email handling programs, fixed a typo,
>> run gofmt, and pasted it into play.golang.org.  Clicking on the "Share"
>> button gives the following link:
>> <https://play.golang.org/p/XnmzUnbnhQQ>.
>>
>> Many simple programs run on the Playground, but when I attempted to run
>> yours, I discovered that the playground passes an empty first command
>> line argument (os.Args[1] == ""), so your program just gives the usage
>> message.  I'm not sure whether or not this Playground behavior is
>> intentional.
>>
>> Note that programs that are already running and which modify the
>> directory being scanned (e.g. Firefox may be frequently updating a
>> cache subdirectory of your home directory) may cause the program to give
>> different results for each run.  However, I think your problems lie
>> elsewhere.
>>
>> One problem is that you use fi.Name() in DirAlreadyWalked, but fi.Name()
>> is only the file name without the directory (e.g. filepath.Base(fpath)).
>> You want to use fpath.
>>
>> The filepath.Walk function does not follow symlinks, and a normal file
>> system will not have any cycles, so you do not need any of the logic
>> associated with DirAlreadyWalked.  This would remove your problem with
>> fi.Name as well.
>>
>> The documentation for WalkFunc is not clear on what errors might be
>> passed in as the err argument, but I suspect things like errors from the
>> underlying syscalls for stat or lstat.  However, it is clear that if err
>> is not nil on entry, the Walk function will already skip that directory
>> without you needing to return SkipDir.  You should return nil in this
>> case unless you want to abort the walk completely.
>>
>> Also in your WalkFunc, you return SkipDir for non-regular files that are
>> not directories (e.g. device or pipe).  You probably want to return nil
>> in this case as well.
>>
>> When I first started writing this, I took "not getting consistent sums"
>> to mean that you were getting different results from successive runs.
>> Now I realize you may mean results that are not close to the output of
>> du.  Being more specific about what your program produced and what you
>> expected it to produce would help here.
>>
>> The *nix program du specifically gives you space taken on disk, unless
>> you pass an appropriate option to return the sum of apparent file sizes.
>> Your program sums file sizes, not disk space used.  It also ignores
>> sizes of directories (which can be large for directories with many files
>> and subdirectories).
>>
>> When you start producing output, you create an output file on disk, and
>> then write to Stdout if dirList is small, leaving the empty disk file,
>> or write to the disk file otherwise.  It would be better to do something
>> like this:
>>
>>     var isFileOutput = len(dirList) >= 30
>>     var w io.Writer
>>     if !isFileOutput {
>>         w = os.Stdout
>>     } else {
>>         var outfile, err = os.Create(outfilename)
>>         if err != nil {
>>             // print a message to os.Stderr and exit
>>             ...
>>         }
>>         defer outfile.Close()
>>         var bufoutfile = bufio.NewWriter(outfile)
>>         defer bufoutfile.Flush()
>>         w = bufoutfile
>>     }
>>
>> I would put this code after filepath.Walk, but before any output.
>> You can then use isFileOutput to ensure that the summary info is written to
>> both Stdout and the output file, but only have to generate it once.
>>
>>     var b0 = []byte(fmt.Sprintf("start dir is %s, found %d files in this tree. GrandTotal is %s, or %s, and number of directories is %d\n",
>>         startDirectory, TotalOfFiles, GrandTotalString, s2, len(DirMap)))
>>     if isFileOutput {
>>         // Display summary info to Stdout as well.
>>         os.Stdout.Write(b0)
>>     }
>>     _, err = w.Write(b0)
>>     if err != nil {
>>         // print a message to os.Stderr and exit
>>         ...
>>     }
>>
>> And then your main output loop would look like this:
>>
>>     for _, d := range dirList {
>>         var str = strconv.FormatInt(d.subtotal, 10)
>>         str = AddCommas(str)
>>         var _, err = fmt.Fprintf(w, "%s size is %s", d.name, str)
>>         if err != nil {
>>             // print a message to os.Stderr and exit
>>             ...
>>         }
>>     }
>>
>> I hope these comments help.
>>
>> ...Marvin
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to golang-nuts+unsubscr...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>
> --
> Michael T. Jones
> michael.jo...@gmail.com

--
Michael T. Jones
michael.jo...@gmail.com