On Sat, 20 Jan 2024 at 16:35, vignesh C <vignes...@gmail.com> wrote:
> I'm seeing that there has been no activity in this thread for more
> than 6 months, I'm planning to close this in the current commitfest
> unless someone is planning to take it forward.

Thanks for the reminder about this.

Since the heapgettup/heapgettup_pagemode refactor, I have been unable to see the same performance gains as I did before. Also, since reading "The Art of Writing Efficient Programs", I'm led to believe that modern processor hardware prefetchers can detect and prefetch on both forward and backward access patterns. I also saw some discussion on Twitter about this [1].

I'm not sure yet how this translates to non-uniform access patterns, e.g. tuples are a varying number of cachelines apart and we do something like only deform attributes in the first cacheline. Will the prefetcher still see the pattern in this case? If it's non-uniform, then how does it know which cacheline to fetch? If the tuple spans multiple cachelines and we deform the whole tuple, will accessing the next cacheline in a forward direction make the hardware prefetcher forget about the more general backward access that's going on?

These are questions I'll need to learn the answers to before I can understand what the best thing to do in this area is. The only way to tell is to design a benchmark and see how far we can go before the hardware prefetcher no longer detects the pattern.

I've withdrawn the patch. I can resubmit once I've done some more experimentation, if that experimentation yields positive results.

David

[1] https://twitter.com/ID_AA_Carmack/status/1470832912149135360
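
A minimal sketch of the kind of benchmark mentioned above, in C. None of it comes from the patch or from PostgreSQL: build_offsets, walk, time_walk and the 64-byte CACHELINE are hypothetical names and assumptions for illustration only. The idea is to touch one byte per "tuple", with tuples a fixed or varying number of cachelines apart, and compare forward and backward walks over a buffer much larger than the last-level cache.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L	/* for clock_gettime() */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

#define CACHELINE 64			/* assumed cacheline size in bytes */

/*
 * Hypothetical helper: fill offsets[] with n byte offsets, each a multiple
 * of CACHELINE.  Consecutive offsets are between min_gap and max_gap
 * cachelines apart; min_gap == max_gap gives a uniform stride.
 * Returns the buffer size in bytes needed to hold every offset.
 */
static size_t
build_offsets(size_t *offsets, size_t n,
			  unsigned min_gap, unsigned max_gap, unsigned seed)
{
	size_t		pos = 0;

	assert(min_gap >= 1 && max_gap >= min_gap);
	srand(seed);
	for (size_t i = 0; i < n; i++)
	{
		unsigned	gap = min_gap + (unsigned) rand() % (max_gap - min_gap + 1);

		offsets[i] = pos;
		pos += (size_t) gap * CACHELINE;
	}
	return pos;
}

/*
 * Read one byte at each offset, visiting the offsets in ascending or
 * descending order.  The volatile qualifier and the returned sum keep
 * the compiler from discarding the loads.
 */
static uint64_t
walk(const volatile unsigned char *buf, const size_t *offsets,
	 size_t n, int backward)
{
	uint64_t	sum = 0;

	for (size_t i = 0; i < n; i++)
	{
		size_t		idx = backward ? n - 1 - i : i;

		sum += buf[offsets[idx]];
	}
	return sum;
}

/* Time a single walk in seconds using the monotonic clock. */
static double
time_walk(const unsigned char *buf, const size_t *offsets,
		  size_t n, int backward, uint64_t *sum)
{
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	*sum = walk(buf, offsets, n, backward);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (double) (t1.tv_sec - t0.tv_sec) +
		(double) (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

To use it, one would allocate a buffer of the size returned by build_offsets, then call time_walk with backward set to 0 and to 1, first with min_gap == max_gap and then with a widening gap range. Note that the offsets array is itself read sequentially, which may affect the result, so this is only a starting point and not a definitive measurement.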