On Wed, Mar 27, 2019 at 7:08 PM Gary Guo <g...@garyguo.net> wrote: > > On 27/03/2019 11:42, Anup Patel wrote: > > On Wed, Mar 27, 2019 at 4:57 PM Gary Guo <g...@garyguo.net> wrote: > >> > >> Hi Anup, > >> > >> This won't work in an actual hardware with ASID support. There're more > > > > Can you elaborate why > > > > > This implementation is based on Linux ARM64 ASID allocator which is > > tested for large number of CPUs on real HW. > > > >> interactions with TLB flushes that need to be considered. You won't see > > > > Yap, already considered. Please point me to unhandled case. > > > When memory mapping is changed, you need to flush it from all cores that > previously have that process executed, etc. Our patches both take > inspiration from ARM's code, but the major difference between my code is > handling of cache invalidations, see my code's cache_mask, etc. This is > actually the most error-prone part, and I spent more time trying to find > an optimal solution for this than porting the ASID allocator. The major > difference is that ARM has a much more expressive sets of TLB flush > instructions compared to RISC-V.
We should not require explicit cache maintenance anywhere in RISC-V because caches are transparent to SW in RISC-V. The HW can use "sfence.vma" hints for cache maintenance. Further, (just like ARM world) the page table walks are cache coherent in RISC-V so we should not require any cache flushes along with TLB flushes. Now if you seeing inconsistent cache contents then it might due to some bug in your HW. I am just guessing here. Regarding changing memory mapping, I am sure generic Linux kernel will issue appropriate flush_tlb_xyz() calls. > > > >> this on both QEMU and SiFive board, as QEMU does not have ASID, so it > >> pretends that ASID is supported by just flushing its TLB everytime you > > > > Nope, it does not. It detects whether ASID is supported or not. If supported > > it will also figure-out number of ASID bits supported by HW. > > > Except that you can detect that QEMU supports ASID, but actually it does > not. However QEMU is still correct because it always flush TLB when you > set SATP/SPTBR. You won't be able to find out bugs in your code by just > testing on QEMU. I am not advocating that testing on QEMU is sufficient. Its just functionally correct and works on HW without ASID support. > > > SiFive board does not have ASID bits so this implementation successfully > > detects that number of ASID bits are 0 and fallbacks to original way of > > local TLB flushes. > > >> change sptbr. I suspect the performance gain you see is just due to > >> saved TLB flush as TLB flush is super expensive in QEMU (all translation > >> block jumps need to be cleared). > > > > Yes, performance gain is due to saved TLB flushes. > > > > On HW which supports ASID bits, we will see more performance > > improvements. > > A hardware TLB flush is cheaper than QEMU' TLB flush. As no hardware > supports ASID at the moment the performance gain is minimal. > >> > >> I have my version here https://github.com/nbdd0121/linux/tree/asid. I > >> haven't done code cleanups yet, but this version has correctness of its > >> ASID code tested on our TLB simulator tool (which unfortunately I can't > >> share right now as it involves with unpublished works). > > > > Except few minor differences. You version of ASID allocator is same as > > mine. In fact there are lot of similar code framgements in your version > > compared to Linux ARM64 as well. I am sure this patch will work for you. > > >> > >> In fact my submit my previous patch series exactly as the basis of this > >> patch. > > > > This patch is based your patch series so I suggest you take this patch > > and try it on your simulator. > > > I've tested, and it does not boot. Thanks for the info. Now help me make this patch better. Regards, Anup