Sean Hamilton proposes:
Wouldn't it seem logical to have [randomized disk cache expiration] in place at all times?
Terry Lambert responds:
I really dislike the idea of random expiration; I don't understand
the point, unless you are trying to get better numbers on some
benchmark.
Matt Dillon concedes:
... it's only useful when you are cycling through a [large] data set ...
Cycling through large data sets is not really that uncommon.  I do something like the following pretty regularly:

    find /usr/src -type f | xargs grep function_name

Even scanning through a large data set once can really hurt competing applications on the same machine by flushing their data from the cache for no gain.  I think this is where randomized expiration might really win, by reducing the penalty for disk-cache-friendly applications that are competing with disk-cache-unfriendly ones.

There's an extensive literature on randomized algorithms.  Although I'm certainly no expert, I understand that such algorithms work very well in exactly this sort of application, since they "usually" avoid worst-case behavior over a broad variety of inputs.  The current cache is, in essence, tuned specifically to work badly on a system where applications are scanning through large amounts of data.  No matter what deterministic caching algorithm you use, you're choosing to behave badly in some situation.

Personally, I think there's a lot of merit to _trying_ randomized disk cache expiry and seeing how it works in practice.  (I would also observe here that 5.0 now has a fast, high-quality source of randomness that seems ideal for exactly such applications.)  I don't believe it would _prevent_ applications from using optimizations such as those Terry suggests, while it could well provide reasonable performance under a broader range of scenarios than the current cache supports.

Sounds like a good idea to me.
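To make the idea a bit more concrete, here is a toy userland sketch of the kind of policy I have in mind.  This is not the real buffer cache code; the 16-buffer cache, the struct, and the 25% random-eviction knob are all invented for illustration, and in the kernel you'd presumably reach for arc4random() rather than rand().  The only interesting part is choose_victim(), which usually recycles the least-recently-used buffer but occasionally recycles a randomly chosen one, so that a single big sequential scan can't deterministically flush every hot buffer:

/*
 * Toy illustration only -- NOT the FreeBSD buffer cache.  It shows the
 * flavor of "randomized expiration": instead of always recycling the
 * strictly least-recently-used buffer, occasionally recycle a randomly
 * chosen one, so a big sequential scan cannot deterministically flush
 * every hot buffer.
 */
#include <stdio.h>
#include <stdlib.h>

#define NBUF             16     /* toy cache size */
#define RANDOM_EVICT_PCT 25     /* % of evictions chosen at random */

struct buf {
    int blkno;                  /* disk block cached here, -1 if empty */
    unsigned long lru;          /* last-use timestamp; smaller == older */
};

static struct buf cache[NBUF];
static unsigned long ticks;

/* Pick a buffer to recycle: usually the LRU one, sometimes a random one. */
static int
choose_victim(void)
{
    int i, victim = 0;

    if (rand() % 100 < RANDOM_EVICT_PCT)    /* arc4random(9) in-kernel */
        return (rand() % NBUF);
    for (i = 1; i < NBUF; i++)
        if (cache[i].lru < cache[victim].lru)
            victim = i;
    return (victim);
}

/* Simulate a read of blkno; returns 1 on a cache hit, 0 on a miss. */
static int
bread_sim(int blkno)
{
    int i;

    ticks++;
    for (i = 0; i < NBUF; i++)
        if (cache[i].blkno == blkno) {
            cache[i].lru = ticks;
            return (1);
        }
    i = choose_victim();
    cache[i].blkno = blkno;
    cache[i].lru = ticks;
    return (0);
}

int
main(void)
{
    int i, pass, hits = 0, total = 0;

    for (i = 0; i < NBUF; i++)
        cache[i].blkno = -1;

    /* A small hot working set competing with a large sequential scan. */
    for (pass = 0; pass < 100; pass++) {
        for (i = 0; i < 4; i++) {               /* hot blocks 0..3 */
            hits += bread_sim(i);
            total++;
        }
        for (i = 100; i < 100 + 2 * NBUF; i++)  /* the big scan */
            bread_sim(i);
    }
    printf("hot-set hit rate: %d/%d\n", hits, total);
    return (0);
}

Nothing else changes relative to a plain LRU recycler; main() just runs a small hot working set against a large sequential scan so you can watch how the hot-set hit rate moves as RANDOM_EVICT_PCT is varied.

Tim Kientzle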