Hi, Matthew -- Regarding your message of 05-Jul-2005 concerning rsync memory usage (sorry that I am not directly replying to it; I am not as yet subscribed to the list and my mailer doesn't allow me to hard-code an In-Reply-To or References header):
While I applaud anyone who wants to encourage open-source development, it seems to me that, if your problem is in fact that you are running out of memory because you are transferring too many files at a time, there are much cheaper solutions for your company than paying for the change to rsync described in the FAQ.

1) Free: break your rsync runs into several executions rather than one huge one. Do several sub-directory trees, each separately. If your data files are not organized in such a way that they can easily be divided into a reasonable number of sub-directory trees, consider re-organizing them so that they can be: it will pay off in many other sys-admin benefits as well.

2) Cheap: buy more swap space. These days random-access magnetic storage runs close to 0.50 USD per gigabyte (e.g. http://www.buy.com/retail/product.asp?sku=10360313 is 200 GB for $105 in the US, including shipping). At the stated rate of 100 bytes per file, that is enough storage to add 2 billion files to each rsync that you run, for less than many programmers charge for a week of coding. If you have much more than 2 billion files in each sub-directory tree, you are probably doing something very wrong. :-)

3) Free: if your problem is not that you are running *out* of memory but rather that rsync is (temporarily) 'stealing' core (solid-state) memory from other, 'more important' (i.e. quicker-response-time) processes -- causing their data to get swapped out, which might hurt response time when that data later needs to be swapped back in -- you might consider using the operating system either to lock down the memory used by your important server programs so that it cannot be swapped out, or to give them higher priority (memory priority, not CPU-scheduling priority, though that might be a good idea as well) so that rsync gets swapped out before they do and then maintains a small footprint in physical memory. (I am not sure whether this is possible or how to do it under Linux, but I would be interested to know -- a sort of variant of the nice command but for core usage, or a per-process maximum-in-core parameter.) I would use some caution with either of these, since the general-purpose VM swap-out algorithms in most modern operating systems usually do a pretty good job of getting everything serviced in reasonable response time: forcing rsync to thrash the swap cache, as it might if the lists are traversed as often as the FAQ implies, will not necessarily improve the overall performance of the system. Solution (1) above will also greatly improve this situation.

Otherwise, the final suggestion:

4) Expensive: buy more solid-state memory. Possibly still cheaper than paying for coding, but in my experience more core is rarely the best solution for lack-of-core problems.

Another thought to consider: the method for the proposed "week of coding" solution isn't specified, but it may well involve spooling the lists to temporary files, in which case you'll probably need to buy the storage from solution (2) above anyway, in addition to paying for the coding -- and end up with what amounts to nearly the same solution as (2) and/or (3).
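For what it's worth, suggestion (1) might look something like this in practice -- a minimal sketch only, with a made-up local layout and a hypothetical remote destination; the 'echo' prints each command rather than running it, so drop the 'echo' to do a real transfer:

```shell
#!/bin/sh
# Sketch of suggestion (1): one rsync per top-level sub-directory,
# so each run builds a much smaller in-memory file list.
# SRC layout and DEST host are hypothetical placeholders.
SRC=./data
DEST=backup.example.com:/srv/data

mkdir -p "$SRC/home" "$SRC/mail" "$SRC/projects"   # stand-in layout so the sketch runs as-is

cmds=""
for dir in "$SRC"/*/; do
    name=$(basename "$dir")
    cmd="rsync -a --delete $dir $DEST/$name/"
    echo "$cmd"             # print instead of run, for illustration
    cmds="$cmds$cmd
"
done
```

Note the trailing slash on the source directory, which tells rsync to copy its contents into "$DEST/$name/" rather than creating an extra nesting level.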
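And, just to double-check the arithmetic behind suggestion (2) -- a throwaway calculation, nothing more:

```shell
# Sanity-checking the numbers in suggestion (2):
# $105 for 200 GB of disk, and rsync's stated 100 bytes of memory per file.
awk 'BEGIN { printf "%.3f USD/GB\n", 105 / 200 }'    # price per gigabyte
awk 'BEGIN { printf "%d files\n", 200e9 / 100 }'     # files covered at 100 bytes each
```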
That said, I am always in favor of frugal use of core -- it all depends on what the proposed solution is. If it involves substituting user-space 'swapping' to disk for kernel-space swapping, it is likely not worth the (apparently large) effort, which could be better directed at other improvements, especially considering the likely decrease in the cost of solid-state memory in the future.

Finally, experimenting first with solutions (1) and (2) above may help you to determine whether the problem really is what you think it is, before you shell out for a software solution. Also keep in mind that, in my experience, when most programmers estimate "1 week of coding" it often ends up taking 2 or 3, and sometimes 8.

Just my (rambling) thoughts as a fellow programmer and system administrator. Anyhow, I really admire someone who is willing to pay for improvements to open-source code!

--
David Favro
Senior Partner, Meta-Dynamic Solutions
meta-dynamic.com

--
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html