On Sat, Mar 15, 2025 at 01:34:03AM +0530, Sahil Siddiq wrote:
> Hi,
>
> On 3/14/25 2:54 AM, Stafford Horne wrote:
> > On Tue, Mar 11, 2025 at 12:43:57AM +0530, Sahil Siddiq wrote:
> > > Add cacheinfo support for OpenRISC.
> > >
> > > [...]
> > > None of the functions in drivers/base/cacheinfo.c that are capable of
> > > pulling these details (e.g.: cache_size) have been used. This is because
> > > they pull these details by reading properties present in the device tree
> > > file. In setup.c, for example, the value of "clock-frequency" is pulled
> > > from the device tree file.
> > >
> > > Cache-related properties are currently not present in OpenRISC's device
> > > tree files.
> >
> > If we want to add L2 caches and define them in the device tree, would
> > it "just work" or is more work needed?
>
> A little more work will have to be done. The implementations of
> "init_cache_level" and "populate_cache_leaves" will have to be extended.
> To pull L2 cache attributes, they'll need to make calls to the
> "of_get_property" family of functions, similar to what's being done for
> RISC-V and PowerPC.
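>
> To illustrate, the extension to "init_cache_level" could look roughly like
> this untested sketch. The "cache-size", "cache-sets" and "cache-block-size"
> property names come from the generic device tree cache bindings, and the
> "l2cache" member of "struct cpuinfo_or1k" is a hypothetical addition:
>
>         struct device_node *cpu_np = of_get_cpu_node(cpu, NULL);
>         /* Follows the "next-level-cache" phandle to the L2 node, if any. */
>         struct device_node *l2 = of_find_next_cache_node(cpu_np);
>         const __be32 *prop;
>
>         of_node_put(cpu_np);
>         if (l2) {
>                 prop = of_get_property(l2, "cache-size", NULL);
>                 if (prop)
>                         cpuinfo->l2cache.size = be32_to_cpup(prop);
>                 prop = of_get_property(l2, "cache-sets", NULL);
>                 if (prop)
>                         cpuinfo->l2cache.sets = be32_to_cpup(prop);
>                 prop = of_get_property(l2, "cache-block-size", NULL);
>                 if (prop)
>                         cpuinfo->l2cache.block_size = be32_to_cpup(prop);
>                 of_node_put(l2);
>                 levels++;
>                 leaves++;
>         }
>
> "populate_cache_leaves" would then initialise one extra unified leaf from
> these values.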
>
> Shall I resubmit this patch with those extensions? I think I'll be able to
> test those changes with a modified device tree file that has an L2 cache
> component.

Since we don't have any such hardware now, I don't think it's needed.

> > > Regarding the "shared_cpu_map" cache attribute, I wasn't able to find
> > > anything in the OpenRISC architecture manual to indicate that processors
> > > in a multi-processor system may share the same cache component. MIPS uses
> > > "globalnumber" to detect siblings. LoongArch uses a "CACHE_PRIVATE" flag
> > > to detect siblings sharing the same cache.
> >
> > In an SMP environment the L1 caches are not shared; they are specific to
> > each CPU.
> >
> > Also, we do not have hyperthreading in OpenRISC, so shared_cpu_map should
> > be a 1-to-1 mapping with the CPU. Do you need to do extra work to set up
> > that mapping?
>
> No extra work has to be done to set up the 1-to-1 mapping. This is already
> being done in "ci_leaf_init()".

OK.

> > > I am running with the assumption that every OpenRISC core has its own
> > > icache and dcache. Given that OpenRISC does not support a multi-level
> > > cache architecture and that icache and dcache are like L1 caches, I
> > > think this assumption is reasonable. What are your thoughts on this?
> >
> > Currently this is the case, but it could be possible to create an SoC
> > with L2 caches. I could imagine these would be outside of the CPU and we
> > could define them with the device tree.
>
> In this case, some extra work will have to be done to set the
> "shared_cpu_map" appropriately. But I think the modifications will be
> quite small. If the L2 cache is external to all CPUs, then all online
> CPUs will have their corresponding bit set in the "shared_cpu_map".

Yes, it could be so. For now, let's not do this as no such hardware exists.

> > > Another issue I noticed is that the unit used in ...cache/indexN/size
> > > is KB. The actual value of the size is right-shifted by 10 before being
> > > reported. When testing these changes using QEMU (and without making any
> > > modifications to the values stored in DCCFGR and ICCFGR), the cache size
> > > is far smaller than 1KB. Consequently, this is reported as 0K. For cache
> > > sizes smaller than 1KB, should something be done to report it in another
> > > unit? Reporting 0K seems a little misleading.
> >
> > I think this is fine, as long as we pass in the correct size in bytes.
>
> Understood.

OK.

> > > [...]
> > > +
> > > +int init_cache_level(unsigned int cpu)
> > > +{
> > > +        struct cpuinfo_or1k *cpuinfo = &cpuinfo_or1k[smp_processor_id()];
> > > +        struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> > > +        int leaves = 0, levels = 0;
> > > +        unsigned long upr = mfspr(SPR_UPR);
> > > +        unsigned long iccfgr, dccfgr;
> > > +
> > > +        if (!(upr & SPR_UPR_UP)) {
> > > +                printk(KERN_INFO
> > > +                       "-- no UPR register... unable to detect configuration\n");
> > > +                return -ENOENT;
> > > +        }
> > > +
> > > +        if (upr & SPR_UPR_DCP) {
> > > +                dccfgr = mfspr(SPR_DCCFGR);
> > > +                cpuinfo->dcache.ways = 1 << (dccfgr & SPR_DCCFGR_NCW);
> > > +                cpuinfo->dcache.sets = 1 << ((dccfgr & SPR_DCCFGR_NCS) >> 3);
> > > +                cpuinfo->dcache.block_size = 16 << ((dccfgr & SPR_DCCFGR_CBS) >> 7);
> > > +                cpuinfo->dcache.size =
> > > +                        cpuinfo->dcache.sets * cpuinfo->dcache.ways * cpuinfo->dcache.block_size;
> > > +                leaves += 1;
> > > +                printk(KERN_INFO
> > > +                       "-- dcache: %4d bytes total, %2d bytes/line, %d way(s)\n",
> > > +                       cpuinfo->dcache.size, cpuinfo->dcache.block_size,
> > > +                       cpuinfo->dcache.ways);
> >
> > Can we print the number of sets here too? Also, is there a reason to pad
> > these ints with 4 and 2 spaces? I am not sure the padding is needed.
> >
> > > +        } else
> > > +                printk(KERN_INFO "-- dcache disabled\n");
> > > +
> > > +        if (upr & SPR_UPR_ICP) {
> > > +                iccfgr = mfspr(SPR_ICCFGR);
> > > +                cpuinfo->icache.ways = 1 << (iccfgr & SPR_ICCFGR_NCW);
> > > +                cpuinfo->icache.sets = 1 << ((iccfgr & SPR_ICCFGR_NCS) >> 3);
> > > +                cpuinfo->icache.block_size = 16 << ((iccfgr & SPR_ICCFGR_CBS) >> 7);
> > > +                cpuinfo->icache.size =
> > > +                        cpuinfo->icache.sets * cpuinfo->icache.ways * cpuinfo->icache.block_size;
> > > +                leaves += 1;
> > > +                printk(KERN_INFO
> > > +                       "-- icache: %4d bytes total, %2d bytes/line, %d way(s)\n",
> > > +                       cpuinfo->icache.size, cpuinfo->icache.block_size,
> > > +                       cpuinfo->icache.ways);
> >
> > Same here.
>
> Sure, I'll print the number of sets as well.
>
> I don't think there's any reason for the padding. It was part of the
> original implementation in setup.c. There shouldn't be any issues in
> removing them.
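>
> For the dcache, the message would then become something along these lines
> (just a sketch of what I have in mind, with the sets included and the
> padding dropped; the icache message would change in the same way):
>
>         printk(KERN_INFO
>                "-- dcache: %d bytes total, %d bytes/line, %d way(s), %d set(s)\n",
>                cpuinfo->dcache.size, cpuinfo->dcache.block_size,
>                cpuinfo->dcache.ways, cpuinfo->dcache.sets);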

Right, it would be good to fix.

> > > [...]
> > >          seq_printf(m, "frequency\t\t: %ld\n", loops_per_jiffy * HZ);
> > > -        seq_printf(m, "dcache size\t\t: %d bytes\n", cpuinfo->dcache_size);
> > > -        seq_printf(m, "dcache block size\t: %d bytes\n",
> > > -                   cpuinfo->dcache_block_size);
> > > -        seq_printf(m, "dcache ways\t\t: %d\n", cpuinfo->dcache_ways);
> > > -        seq_printf(m, "icache size\t\t: %d bytes\n", cpuinfo->icache_size);
> > > -        seq_printf(m, "icache block size\t: %d bytes\n",
> > > -                   cpuinfo->icache_block_size);
> > > -        seq_printf(m, "icache ways\t\t: %d\n", cpuinfo->icache_ways);
> > >          seq_printf(m, "immu\t\t\t: %d entries, %lu ways\n",
> > >                     1 << ((mfspr(SPR_DMMUCFGR) & SPR_DMMUCFGR_NTS) >> 2),
> > >                     1 + (mfspr(SPR_DMMUCFGR) & SPR_DMMUCFGR_NTW));
> > > --
> > > 2.48.1
> >
> > This pretty much looks ok to me.
>
> Thank you for the review.

Thank you.

-Stafford