On Fri Dec 5 14:32:56 EST 2008, [EMAIL PROTECTED] wrote:
> But random access patterns suck at being speculatively cached.
> Linear access patterns still require reasonably careful work for the
> caching to do the right thing.
> Expecting your entire frame buffer to be cached in L2 isn't particularly
> reasonable.
>
> Paul

i'm just not convinced that nvidia's poor performance has anything to do
with pcie latency or processor stalls. a 500x500 window takes ~1sec to
uncover. that's like 2 billion instructions. since a cacheline is ~128
bytes (close enough), that's ~8000 stall opportunities. if it takes all
of them, that's only 8 million instructions, on the order of 1/1000th of
the actual delay.

if WC were the issue, i should see 100x improvement in reading from the
card.

- erik
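
the estimate above can be redone as a rough sketch. the post does not give the pixel depth or the cost of a stall; 4 bytes per pixel and ~1000 instructions lost per stall are assumptions chosen because they reproduce the ~8000 and 8 million figures. the names below are made up for illustration.

```python
# back-of-envelope check of the stall estimate.
# ASSUMED (not in the post): 32-bit pixels, ~1000 instructions lost per stall.
WIDTH, HEIGHT = 500, 500          # window size from the post
BYTES_PER_PIXEL = 4               # assumption
CACHELINE_BYTES = 128             # "close enough", per the post
INSTR_PER_STALL = 1000            # assumption: 8000 stalls -> 8 million instructions
INSTR_PER_SEC = 2_000_000_000     # ~1 sec is "like 2 billion instructions"


def stall_estimate(width, height, bytes_per_pixel, cacheline_bytes,
                   instr_per_stall, instr_per_sec, seconds=1.0):
    """Return (stalls, instructions lost to stalls, fraction of the delay)."""
    window_bytes = width * height * bytes_per_pixel
    # one potential stall per cache line touched, rounded up
    stalls = -(-window_bytes // cacheline_bytes)
    stall_instr = stalls * instr_per_stall
    fraction = stall_instr / (instr_per_sec * seconds)
    return stalls, stall_instr, fraction


stalls, stall_instr, fraction = stall_estimate(
    WIDTH, HEIGHT, BYTES_PER_PIXEL, CACHELINE_BYTES,
    INSTR_PER_STALL, INSTR_PER_SEC)
print(stalls, stall_instr, fraction)
```

under these assumptions the stalls account for well under 1% of the observed delay, which is the point of the argument: even if every cache line stalls, that alone does not explain a one-second redraw.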