On Thu, Feb 13, 2014 at 12:49 PM, <vita...@yourcmc.ru> wrote: > Hi! > > I want to report two bugs in Subversion 1.8.5 libsvn_subr > subversion/libsvn_subr/cache-membuffer.c, a more important one and a less > important one. I've fixed them, and patches are attached - please apply > them. >
Hi Vitali, Thank you very much for taking the time to debug the hangs and to come up with a patch. It also tells me that the caching code can still be understood. > First bug (important): > > There is an incorrect integer addition to cache->data_used which makes > 32-bit threaded svnserve sometimes hang and eat 100% CPU. When it hangs, > one thread is doing an infinite loop in ensure_data_insertable() with write > lock acquired, and others just wait for it to release the lock. So it looks > like 100% usage of a single CPU core and non-working svn:// access. > > The problem is in line 1377: "cache->data_used += size - entry->size;" is > incorrect because cache->data_used is always apr_uint64_t, and size and > entry->size are size_t which is unsigned 32-bit integer type on 32-bit > platforms. And size is always [almost always] less than entry->size at that > line, so cache->data_used becomes > 0x100000000 at once which makes > ensure_data_insertable() to loop infinitely because all entries in cache > "become" less than 1/8 * average entry size. > > The patch is in the attachment 'fix-membuffer-u32-overflow' > Fixed in r1567985 using Peter's suggestion. > Second bug (less important): > > There is also another bug in libsvn_subr membuffer cache implementation: > cache->hit_count and entry->hit_count are incremented without proper mutual > exclusion (only with read lock acquired), so some increments get lost in > multithreaded svnserve; as it's unlikely two threads access the same entry > at once, usually cache->hit_count (not entry->hit_count) increments are > lost, which leads to cache->hit_count being less than sum(entry->hit_count). > > So cache->hit_count may overflow during subtraction of entry->hit_count > from it and lead to removing of extra values from the cache in > ensure_data_insertable(), because the retain threshold may be very big; > also it may overflow again, become less than cache->entries_used and also > lead to removing of extra items. > > Instead of killing the performance by mutual exclusion I think it's > possible to just check for integer overflow. > This one is more subtle. My patch for it is in r1567996 with a follow-up in 1568009. It took me a while to see that your proposal is actually a stable solution, i.e. not likely to make the global counter value drift over time. I also fixed a related issue with 32 bit hit counter overflows (rare but may happen after some time for large directories or items close to the root of the access path). > > This doesn't affect general svnserve usability, so severity is minor. > These problems may affect svnserve in threaded more more than httpd but the latter should still exhibit the infinite loop behaviour. > The patch is in the attachment 'fix-membuffer-hit_count-overflow' > > Hope you'll apply these patches soon, or fix the bugs some another way :) > Well, the good news is that 1.7 is not affected (all access is fully synchronized there) nor 1.9 / trunk. The latter uses a more sophisticated scheme and will exit the critical loop after a few iterations without affecting overall performance. Thanks again for the report! -- Stefan^2.