> 4. Even if we could accurately estimate the percentage of the table
> that is cached, what then? For example, suppose that a user issues a
> query which retrieves 1% of a table, and we know that 1% of that table
> is cached. How much of the data that the user asked for is cached?
> Hard to say, right? It could be none of it or all of it. The second
> scenario is easy to imagine - just suppose the query's been executed
> twice. The first scenario isn't hard to imagine either.
>
> I have a set of slow disks, and whether data sits on them rather than
> on the fast disks can impact performance nearly as much as whether it
> is cached in memory.
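(To put the quoted example in concrete terms: if a query touches a fraction q of the table and a fraction c of the table is cached, the overlap can be anywhere from max(0, q + c - 1) to min(q, c) of the table. With q = c = 1%, the cached share of what the query actually reads really can be anywhere from 0% to 100%.)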
How practical would it be for analyze to keep a record of response times for given sections of a table as it randomly accesses them, and to build some kind of map of expected response times for the pieces of data it is analysing? It may well discover, on its own, that recent data (1 month old or less) has a random read response time of N, that older data (1 year old) in a different section of the relation tends to have a response time of 1000N, and that really old data (5 years old) tends to have a response time of 3000N.
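Something along these lines could be prototyped outside the server first. Below is a rough, hypothetical sketch (plain Python, not PostgreSQL code) of what such a sampling pass might look like: time a handful of random block reads in each region of a relation's data file and keep the average per region as a crude response-time map. The file path, block size, region count and sample count are illustration values, not anything analyze actually does today.

import os
import random
import time

BLOCK_SIZE = 8192        # PostgreSQL-style 8 kB blocks
NUM_REGIONS = 10         # hypothetical: carve the relation into 10 equal regions

def sample_response_times(path, samples_per_region=50):
    """Randomly read blocks from each region of a file, timing each read,
    and return the average response time per region (a crude response map)."""
    total_blocks = max(os.path.getsize(path) // BLOCK_SIZE, 1)
    region_blocks = max(total_blocks // NUM_REGIONS, 1)
    region_avgs = []
    with open(path, "rb", buffering=0) as f:      # unbuffered at the Python level
        for region in range(NUM_REGIONS):
            start = region * region_blocks
            end = min(start + region_blocks, total_blocks)
            if start >= end:
                break                             # file smaller than NUM_REGIONS blocks
            timings = []
            for _ in range(samples_per_region):
                blk = random.randrange(start, end)
                t0 = time.perf_counter()
                f.seek(blk * BLOCK_SIZE)
                f.read(BLOCK_SIZE)
                timings.append(time.perf_counter() - t0)
            region_avgs.append(sum(timings) / len(timings))
    return region_avgs

if __name__ == "__main__":
    # "/path/to/relation" is a stand-in for a table's underlying data file.
    for i, avg in enumerate(sample_response_times("/path/to/relation")):
        print(f"region {i}: avg random read {avg * 1000:.3f} ms")

One obvious wrinkle, which loops back to the quoted point: these reads go through the OS page cache, so blocks that happen to be cached will look uniformly fast no matter which disk they live on. A real implementation would have to account for (or deliberately bypass) caching when building such a map.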