Hi Sahil, On Tue, Mar 18, 2025 at 12:06:30AM +0530, Sahil Siddiq wrote: > Hello Stafford and Geert, > > Thank you for the reviews. > > On 3/17/25 1:55 PM, Geert Uytterhoeven wrote: > > On Sun, 16 Mar 2025 at 07:59, Stafford Horne <sho...@gmail.com> wrote: > > > [...] > > > > +struct cache_desc { > > > > + u32 size; > > > > + u32 sets; > > > > + u32 block_size; > > > > + u32 ways; > > > > > > Considering the changes below to add cache available checks, maybe we > > > want to add a field here, such as `bool present`. Or a flags field like > > > is used in loongarch? > > > > I assume cache_desc.size is zero when the cache is not present? > > Yes, cache_desc.size will be zero when there's no cache component since > cpuinfo_or1k[NR_CPUS] is declared as a global variable in setup.c. The > cache attributes are stored in this struct. > > On 3/17/25 5:30 PM, Stafford Horne wrote: > > [...] > > Yes, good point, would be clean too work too. I was not too happy with > > using > > cache_desc.ways as is done below. Also there ended up bieng 2 different > > ways > > that were used. > > > > I am happy to use size too, but I think checking the SPR would be faster or > > just > > as fast as using the struct. I am not too fussed either way. > > There are a few places (e.g. cache_loop() in cache.c) where > cpuinfo_or1k[smp_processor_id()] is not being referenced. This will have to be > referenced to access cache_desc. In these cases, I think it would be better to > read from the SPR instead. For consistency, the SPR can also be read where > cpuinfo_or1k[smp_processor_id()] is already being referenced. > > On 3/16/25 12:28 PM, Stafford Horne wrote: > > On Sun, Mar 16, 2025 at 02:09:37AM +0530, Sahil Siddiq wrote: > > > Add cacheinfo support for OpenRISC. > > > [...] > > > diff --git a/arch/openrisc/kernel/cacheinfo.c > > > b/arch/openrisc/kernel/cacheinfo.c > > > new file mode 100644 > > > index 000000000000..6bb81e246f7e > > > --- /dev/null > > > +++ b/arch/openrisc/kernel/cacheinfo.c > > > @@ -0,0 +1,106 @@ > > > +// SPDX-License-Identifier: GPL-2.0-or-later > > > +/* > > > + * OpenRISC cacheinfo support > > > + * > > > + * Based on work done for MIPS and LoongArch. All original copyrights > > > + * apply as per the original source declaration. > > > + * > > > + * OpenRISC implementation: > > > + * Copyright (C) 2025 Sahil Siddiq <sahil...@proton.me> > > > + */ > > > + > > > +#include <linux/cacheinfo.h> > > > +#include <asm/cpuinfo.h> > > > +#include <asm/spr.h> > > > +#include <asm/spr_defs.h> > > > + > > > +static inline void ci_leaf_init(struct cacheinfo *this_leaf, enum > > > cache_type type, > > > + unsigned int level, struct cache_desc *cache, > > > int cpu) > > > +{ > > > + this_leaf->type = type; > > > + this_leaf->level = level; > > > + this_leaf->coherency_line_size = cache->block_size; > > > + this_leaf->number_of_sets = cache->sets; > > > + this_leaf->ways_of_associativity = cache->ways; > > > + this_leaf->size = cache->size; > > > + cpumask_set_cpu(cpu, &this_leaf->shared_cpu_map); > > > +} > > > + > > > +int init_cache_level(unsigned int cpu) > > > +{ > > > + struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()]; > > > + struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu); > > > + int leaves = 0, levels = 0; > > > + unsigned long upr = mfspr(SPR_UPR); > > > + unsigned long iccfgr, dccfgr; > > > + > > > + if (!(upr & SPR_UPR_UP)) { > > > + printk(KERN_INFO > > > + "-- no UPR register... unable to detect > > > configuration\n"); > > > + return -ENOENT; > > > + } > > > + > > > + if (upr & SPR_UPR_DCP) { > > > + dccfgr = mfspr(SPR_DCCFGR); > > > + cpuinfo->dcache.ways = 1 << (dccfgr & SPR_DCCFGR_NCW); > > > + cpuinfo->dcache.sets = 1 << ((dccfgr & SPR_DCCFGR_NCS) >> 3); > > > + cpuinfo->dcache.block_size = 16 << ((dccfgr & SPR_DCCFGR_CBS) > > > >> 7); > > > + cpuinfo->dcache.size = > > > + cpuinfo->dcache.sets * cpuinfo->dcache.ways * > > > cpuinfo->dcache.block_size; > > > + leaves += 1; > > > + printk(KERN_INFO > > > + "-- dcache: %d bytes total, %d bytes/line, %d set(s), %d > > > way(s)\n", > > > + cpuinfo->dcache.size, cpuinfo->dcache.block_size, > > > + cpuinfo->dcache.sets, > > > + cpuinfo->dcache.ways); > > > > The indentation of sets looks a bit off here. > > > > > + } else > > > + printk(KERN_INFO "-- dcache disabled\n"); > > > + > > > + if (upr & SPR_UPR_ICP) { > > > + iccfgr = mfspr(SPR_ICCFGR); > > > + cpuinfo->icache.ways = 1 << (iccfgr & SPR_ICCFGR_NCW); > > > + cpuinfo->icache.sets = 1 << ((iccfgr & SPR_ICCFGR_NCS) >> 3); > > > + cpuinfo->icache.block_size = 16 << ((iccfgr & SPR_ICCFGR_CBS) > > > >> 7); > > > + cpuinfo->icache.size = > > > + cpuinfo->icache.sets * cpuinfo->icache.ways * > > > cpuinfo->icache.block_size; > > > + leaves += 1; > > > + printk(KERN_INFO > > > + "-- icache: %d bytes total, %d bytes/line, %d set(s), %d > > > way(s)\n", > > > + cpuinfo->icache.size, cpuinfo->icache.block_size, > > > + cpuinfo->icache.sets, > > > + cpuinfo->icache.ways); > > > > The indentation of sets looks a bit off here. Maybe its the others that are > > out > > of line, but can you fix? Also I think sets and ways could be on the same > > line. > > Sorry, I must have missed this. I'll fix it. > > > > [...] > > > diff --git a/arch/openrisc/kernel/dma.c b/arch/openrisc/kernel/dma.c > > > index b3edbb33b621..ffb161e41e9d 100644 > > > --- a/arch/openrisc/kernel/dma.c > > > +++ b/arch/openrisc/kernel/dma.c > > > @@ -36,8 +36,10 @@ page_set_nocache(pte_t *pte, unsigned long addr, > > > flush_tlb_kernel_range(addr, addr + PAGE_SIZE); > > > /* Flush page out of dcache */ > > > - for (cl = __pa(addr); cl < __pa(next); cl += cpuinfo->dcache_block_size) > > > - mtspr(SPR_DCBFR, cl); > > > + if (cpuinfo->dcache.ways) { > > > + for (cl = __pa(addr); cl < __pa(next); cl += > > > cpuinfo->dcache.block_size) > > > + mtspr(SPR_DCBFR, cl); > > > + } > > > > I think it would be better to move this to cacheflush.h as a function like > > flush_dcache_range() or local_dcache_range_flush(). > > > > > return 0; > > > } > > > @@ -104,15 +106,19 @@ void arch_sync_dma_for_device(phys_addr_t addr, > > > size_t size, > > > switch (dir) { > > > case DMA_TO_DEVICE: > > > /* Flush the dcache for the requested range */ > > > - for (cl = addr; cl < addr + size; > > > - cl += cpuinfo->dcache_block_size) > > > - mtspr(SPR_DCBFR, cl); > > > + if (cpuinfo->dcache.ways) { > > > + for (cl = addr; cl < addr + size; > > > + cl += cpuinfo->dcache.block_size) > > > + mtspr(SPR_DCBFR, cl); > > > + } > > > > Also here,I think it would be better to move this to cacheflush.h as a > > function like > > flush_dcache_range(). > > > > Or, local_dcache_range_flush(), which seems to be the convention we use in > > cacheflush.h/cache.c. > > > > > > > break; > > > case DMA_FROM_DEVICE: > > > /* Invalidate the dcache for the requested range */ > > > - for (cl = addr; cl < addr + size; > > > - cl += cpuinfo->dcache_block_size) > > > - mtspr(SPR_DCBIR, cl); > > > + if (cpuinfo->dcache.ways) { > > > + for (cl = addr; cl < addr + size; > > > + cl += cpuinfo->dcache.block_size) > > > + mtspr(SPR_DCBIR, cl); > > > + } > > > > This one could be invalidate_dcache_range(). Note, this will also be > > useful > > for the kexec patches that I am working on. > > > > Or, local_dcache_range_inv(), which seems to be the convention we use in > > cacheflush.h/cache.c. > > Understood. > > > > [...] > > > @@ -29,13 +34,13 @@ static __always_inline void cache_loop(struct page > > > *page, const unsigned int reg > > > void local_dcache_page_flush(struct page *page) > > > { > > > - cache_loop(page, SPR_DCBFR); > > > + cache_loop(page, SPR_DCBFR, SPR_UPR_DCP); > > > } > > > EXPORT_SYMBOL(local_dcache_page_flush); > > > void local_icache_page_inv(struct page *page) > > > { > > > - cache_loop(page, SPR_ICBIR); > > > + cache_loop(page, SPR_ICBIR, SPR_UPR_ICP); > > > } > > > EXPORT_SYMBOL(local_icache_page_inv); > > > > OK, if we move the range flush and invalidate here we will need to add to > > this > > cache_loop a bit more. > > Right. I'll see what can be done to keep it concise. > > > > diff --git a/arch/openrisc/mm/init.c b/arch/openrisc/mm/init.c > > > index d0cb1a0126f9..bbe16546c5b9 100644 > > > --- a/arch/openrisc/mm/init.c > > > +++ b/arch/openrisc/mm/init.c > > > @@ -124,6 +124,7 @@ static void __init map_ram(void) > > > void __init paging_init(void) > > > { > > > int i; > > > + unsigned long upr; > > > printk(KERN_INFO "Setting up paging and PTEs.\n"); > > > @@ -176,8 +177,11 @@ void __init paging_init(void) > > > barrier(); > > > /* Invalidate instruction caches after code modification */ > > > - mtspr(SPR_ICBIR, 0x900); > > > - mtspr(SPR_ICBIR, 0xa00); > > > + upr = mfspr(SPR_UPR); > > > + if (upr & SPR_UPR_UP & SPR_UPR_ICP) { > > > + mtspr(SPR_ICBIR, 0x900); > > > + mtspr(SPR_ICBIR, 0xa00); > > > + } > > > > Here we could use new utilities such as local_icache_range_inv(0x900, > > L1_CACHE_BYTES); > > > > Or something like local_icache_block_inv(0x900). This only needs to flush a > > single block as the code it is invalidating is just 2 instructions 8 bytes: > > > > .org 0x900 > > l.j boot_dtlb_miss_handler > > l.nop > > > > .org 0xa00 > > l.j boot_itlb_miss_handler > > l.nop > > Given that there'll be generic local_(i|d)cache_range_inv(start, stop) utility > functions, would it make sense to simply have a macro defined as: > > #define local_icache_block_inv(addr) local_icache_range_inv(start, > L1_CACHE_BYTES) > > instead of having a separate function for invalidating a single cache line? > This would > still use cache_loop() under the hood. The alternative would be to use > local_icache_range_inv(start, L1_CACHE_BYTES) directly but using the macro > might be > more readable.
Yes, I think a macro would be fine. Should we use cache_desc.block_size or L1_CACHE_BYTES? It doesn't make much difference as L1_CACHE_BYTES is defined as 16 bytes which is the minimum block size and using that will always invalidate a whole block. It would be good to have a comment explaining why using L1_CACHE_BYTES is enough. -Stafford