On Friday 31 January 2003 03:47, David Schultz wrote:
> You have found an optimal replacement algorithm for the case of
> repeated sequential reads. In fact, if you know in advance what
> the access pattern is going to be, it is *always* possible to find
> an optimal replacement algorithm. Specifically, you always
> replace the block in the cache that will not be referenced again
> for the longest time …
For those thinking of playing with predictive caching
(likely an area of considerable student endeavour/interest
these days at both the filesystem and "web" level):
---
Matthew Dillon:
> So there is no 'perfect' caching algorithm. There
> are simply too many variables even in a well-defined
> environment …
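What Schultz is describing is the classical offline-optimal (Belady MIN)
policy. A minimal sketch in C, assuming the whole access trace is known up
front; the toy trace, cache size, and all names are illustrative, not from
any real kernel:

#include <stdio.h>

#define CACHE_SLOTS 4

/* Position of the next reference to blk after pos, or len if none. */
static int next_use(const int *trace, int len, int pos, int blk)
{
    for (int i = pos + 1; i < len; i++)
        if (trace[i] == blk)
            return i;
    return len;
}

int main(void)
{
    int trace[] = {0,1,2,3,4, 0,1,2,3,4, 0,1,2,3,4};  /* cyclic reads */
    int len = sizeof(trace) / sizeof(trace[0]);
    int cache[CACHE_SLOTS], used = 0, hits = 0;

    for (int pos = 0; pos < len; pos++) {
        int blk = trace[pos], found = -1;
        for (int i = 0; i < used; i++)
            if (cache[i] == blk) { found = i; break; }
        if (found >= 0) { hits++; continue; }
        if (used < CACHE_SLOTS) { cache[used++] = blk; continue; }
        /* MIN: evict the resident block whose next use is farthest away. */
        int victim = 0, farthest = -1;
        for (int i = 0; i < used; i++) {
            int n = next_use(trace, len, pos, cache[i]);
            if (n > farthest) { farthest = n; victim = i; }
        }
        cache[victim] = blk;
    }
    printf("hits: %d of %d\n", hits, len);
    return 0;
}

On this 5-block cyclic trace with 4 slots it scores 8 hits out of 15, where
pure LRU would score 0. The catch, as the rest of the thread points out, is
that the kernel never has the trace in advance.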
Thus spake Fergal Daly <[EMAIL PROTECTED]>:
> [EMAIL PROTECTED] (Tim Kientzle) wrote in message
> news:...
> > Personally, I think there's a lot of merit to _trying_
>
> There's even more merit in only pretending to try...
Welcome to my quotes file.
> As you can see, the locking cache is always …
:I think you missed Matt's point, which is well-taken:
:
:Even if everybody accesses it sequentially, if you have 100 processes
:accessing it sequentially at the *same* time, then it would be to your
:benefit to leave the "old" pages around because even though *this*
:process won't access it again …
Here are the results of a quick simulation of a cache using random replacement.
I've also included a scheme for a caching algorithm …
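The simulation itself did not survive in this digest, so here is a rough
reconstruction of the experiment in C: cyclic sequential reads through a
file twice the size of the cache, comparing deterministic FIFO-order
eviction (which behaves like LRU on this access pattern) against random
replacement. All parameters are invented for illustration.

#include <stdio.h>
#include <stdlib.h>

#define FILE_BLOCKS  1000           /* file is twice the cache size */
#define CACHE_BLOCKS 500
#define PASSES       20

static int cached[FILE_BLOCKS];     /* 1 if the block is resident */
static int slots[CACHE_BLOCKS];     /* which block each slot holds */

static double run(int randomize)
{
    int used = 0, fifo = 0;
    long hits = 0, refs = 0;

    for (int b = 0; b < FILE_BLOCKS; b++)
        cached[b] = 0;
    for (int p = 0; p < PASSES; p++) {
        for (int b = 0; b < FILE_BLOCKS; b++) {
            refs++;
            if (cached[b]) { hits++; continue; }
            int v;
            if (used < CACHE_BLOCKS) {
                v = used++;
            } else {
                v = randomize ? rand() % CACHE_BLOCKS
                              : fifo++ % CACHE_BLOCKS;
                cached[slots[v]] = 0;       /* evict the old block */
            }
            slots[v] = b;
            cached[b] = 1;
        }
    }
    return (double)hits / refs;
}

int main(void)
{
    srand(1);
    printf("FIFO/LRU hit rate:   %.3f\n", run(0));
    printf("random replacement:  %.3f\n", run(1));
    return 0;
}

The deterministic policy ends up with a hit rate of essentially zero,
because each eviction removes exactly the block that will be wanted one
pass later, while random replacement settles around one hit in five: the
effect the posts below argue about.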
Hi, re:
> If a file's access history were stored as a "hint"
> associated with the file, then it would
> be possible to make better up-front decisions about
> how to allocate cache space.
I believe at one time this was a hot area, and now
maybe it is back. I vaguely recall a recent PhD in …
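For those curious how such a hint might be persisted, one conceivable
mechanism on FreeBSD is an extended attribute on the file. The attribute
name and the one-byte encoding below are invented for this sketch, and
nothing in the kernel consumes them; this only shows the bookkeeping side.

#include <sys/types.h>
#include <sys/extattr.h>
#include <err.h>

enum { HINT_UNKNOWN = 0, HINT_SEQUENTIAL = 1, HINT_RANDOM = 2 };

/* Record the observed access pattern as a user extended attribute. */
static void save_hint(const char *path, unsigned char hint)
{
    if (extattr_set_file(path, EXTATTR_NAMESPACE_USER,
                         "access-hint", &hint, sizeof(hint)) < 0)
        warn("extattr_set_file(%s)", path);
}

int main(int argc, char **argv)
{
    if (argc > 1)
        save_hint(argv[1], HINT_SEQUENTIAL);
    return 0;
}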
On Thursday 30 January 2003 07:06 pm, Tim Kientzle wrote:
| Matthew Dillon wrote:
| > Your idea of 'sequential' access cache restriction only
| > works if there is just one process doing the accessing.
|
| Not necessarily. I suspect that there is
| a strong tendency to access particular files
| in particular ways. E.g., in your example of
| a download server, those files are always
| read sequentially. You can make similar assertions
| about a lot of files: manpages, gzip files,
| C source code files, …
On Thursday 30 January 2003 05:22 pm, Matthew Dillon wrote:
| Well, here's a counterpoint. Let's say you have an FTP
| server with 1G of ram full of, say, pirated CDs at 600MB a
| pop.
|
| Now let's say someone puts up a new Madonna CD and suddenly
| you have thousands of people from all over the world trying
| to download a single 600MB file.
The suggestion here basically boils down to this: if the system could
act on hints that somebody will be doing sequential access, then it
should be more timid about caching for that file access. That is to
say, it should allow that file to "use up" a smaller number of blocks
from the cache …
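In code, the "be more timid" policy might look something like the toy
fragment below: once a file hinted as sequential reaches its per-file block
budget, new blocks recycle the file's own oldest block rather than stealing
from the global pool. The structures and names are hypothetical, not
FreeBSD's.

#include <stdbool.h>
#include <stddef.h>

struct file_cache {
    size_t blocks_held;         /* blocks this file currently caches */
    size_t seq_budget;          /* cap applied when hinted sequential */
    bool   sequential_hint;
};

enum blk_source { FROM_GLOBAL_POOL, RECYCLE_OWN_OLDEST };

/* Decide where the next cache block for this file should come from. */
static enum blk_source choose_source(const struct file_cache *fc)
{
    if (fc->sequential_hint && fc->blocks_held >= fc->seq_budget)
        return RECYCLE_OWN_OLDEST;  /* timid: stop growing our share */
    return FROM_GLOBAL_POOL;        /* fall back to normal replacement */
}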
Basically what it comes down to is that without foreknowledge
of the data locations being accessed, it is not possible for any
cache algorithm to adapt to all the myriad ways data might be accessed.
If you focus the cache on one methodology it will probably perform
terribly …
"Brian T. Schellenberger" wrote:
> 2. For sequential access, you should stop caching before you throw away
> your own blocks. If it's sequential it is, it seems to me, always a
> lose to throw away your *own* processes older bllocks on thee same
> file.
You can not have a block in a vm object whi
Sean Hamilton wrote:
> In my case I have a webserver serving up a few dozen files of about 10 MB
> each. While yes it is true that I could purchase more memory, and I could
> purchase more drives and stripe them, I am more interested in the fact that
> this server is constantly grinding away because …
Tim Kientzle wrote:
> Cycling through large data sets is not really that uncommon.
> I do something like the following pretty regularly:
> find /usr/src -type f | xargs grep function_name
>
> Even scanning through a large dataset once can really hurt
> competing applications on the same machine.
Brian T. Schellenberger wrote:
This to me is eminently sensible.
In fact there seem to be two rules that have come up in this discussion:
1. For sequential access, you should be very hesitant to throw away
*another* process's blocks, at least once you have used more than, say,
25% of the cache …
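Taken together with rule 2 (quoted further up), the two rules reduce to a
small victim-selection decision for a sequential reader. A hypothetical
sketch, with the 25% threshold taken straight from the suggestion above
rather than from any tuning:

#include <stddef.h>

enum seq_action { STEAL_FROM_OTHERS, STOP_CACHING };

static enum seq_action
sequential_policy(size_t my_blocks, size_t cache_total)
{
    /* Rule 1: below ~25% of the cache we may still grow at other
     * processes' expense; past it, we must not.
     * Rule 2: never evict our own older blocks of the same file,
     * so when growth stops, caching of new blocks stops too. */
    if (my_blocks * 4 < cache_total)
        return STEAL_FROM_OTHERS;
    return STOP_CACHING;
}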
Sean Hamilton proposes:
> Wouldn't it seem logical to have [randomized disk cache expiration] in
> place at all times?
Terry Lambert responds:
:I really dislike the idea of random expiration; I don't understand
:the point, unless you are trying to get better numbers on some
:benchmark.
Matt …
Matthew Dillon wrote:
> :I really dislike the idea of random expiration; I don't understand
> :the point, unless you are trying to get better numbers on some
> :benchmark.
>
> Well, the basic scenario is something like this: Let's say you have
> 512MB of ram and you are reading a 1GB file sequentially, over and over
> again. The …
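Rough arithmetic on that scenario (mine, not Matt's): with 512MB of cache
against a 1GB file read cyclically, strict LRU evicts on every miss exactly
the block that will be wanted soonest, so after the first pass every single
read misses and each pass re-reads the full 1GB from disk. A policy that
simply pinned the first ~512MB of the file and never replaced anything
would serve half of every pass from memory, and even the "optimal" rule
quoted at the top of the thread does essentially no better than that on
this workload.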
Matthew Dillon wrote:
> Hi Sean. I've wanted to have a random-disk-cache-expiration feature
> for a long time. We do not have one now. We do have mechanisms in
> place to reduce the impact of sequentially cycling a large dataset so
> it does not totally destroy unrelated cached data.
Greetings,
I have a situation where I am reading large quantities of data from disk
sequentially. The problem is that as the data is read, the oldest cached
blocks are thrown away in favor of new ones. When I start re-reading data
from the beginning, it has to read the entire file from disk again.