On 4/29/20 3:41 PM, Marek Vasut wrote: > On 4/29/20 11:25 PM, Stephen Warren wrote: >> On 4/29/20 1:56 PM, Marek Vasut wrote: >>> On 4/29/20 9:51 PM, Stephen Warren wrote: >>>> On 4/29/20 1:14 PM, Marek Vasut wrote: >>>>> The DWMAC4 IP has the possibility to skip up to 7 64bit words after >>>>> the descriptor, use this to pad the descriptors to cacheline size. >>>> >>>>> diff --git a/drivers/net/dwc_eth_qos.c b/drivers/net/dwc_eth_qos.c >>>> >>>>> /* Descriptors */ >>>>> +/* >>>> >>>> Nit: Blank line between those two comments to match the existing style. >>>> >>>>> + * #if EQOS_DESCRIPTOR_SIZE < ARCH_DMA_MINALIGN fails if we use sizeof() >>>>> + * or offsetof() to calculate descriptor size, since neither is allowed >>>>> + * in C preprocessor macros, so just hard-code this to 16=4*sizeof(u32). >>>>> + */ >>>>> +#define EQOS_DESCRIPTOR_SIZE 16 >>>>> +#define EQOS_DESCRIPTOR_PAD ((ARCH_DMA_MINALIGN - >>>>> EQOS_DESCRIPTOR_SIZE) / 4) >>>>> +struct eqos_desc { >>>>> + u32 des0; >>>>> + u32 des1; >>>>> + u32 des2; >>>>> + u32 des3; >>>>> + u32 __pad[EQOS_DESCRIPTOR_PAD]; >>>>> +}; >>>> >>>> I think the padding needs to be optional based on the same condition in >>>> the following existing warning logic, and the warning logic may need to >>>> be removed/updated to account for the fact that padding is now being >>>> used to avoid the issue: >>>> >>>> (quoting this from existing code) >>>>> /* >>>>> * Warn if the cache-line size is larger than the descriptor size. In such >>>>> * cases the driver will likely fail because the CPU needs to flush the >>>>> cache >>>>> * when requeuing RX buffers, therefore descriptors written by the >>>>> hardware >>>>> * may be discarded. Architectures with full IO coherence, such as x86, >>>>> do not >>>>> * experience this issue, and hence are excluded from this condition. >>>>> * >>>>> * This can be fixed by defining CONFIG_SYS_NONCACHED_MEMORY which will >>>>> cause >>>>> * the driver to allocate descriptors from a pool of non-cached memory. >>>>> */ >>>>> #if EQOS_DESCRIPTOR_SIZE < ARCH_DMA_MINALIGN >>>>> #if !defined(CONFIG_SYS_NONCACHED_MEMORY) && \ >>>>> !CONFIG_IS_ENABLED(SYS_DCACHE_OFF) && !defined(CONFIG_X86) >>>>> #warning Cache line size is larger than descriptor size >>>>> #endif >>>>> #endif >>>> >>>> (back to quoting from the patch) >>>>> -#define EQOS_DESCRIPTOR_WORDS 4 >>>>> -#define EQOS_DESCRIPTOR_SIZE (EQOS_DESCRIPTOR_WORDS * 4) >>>> ... >>>>> -struct eqos_desc { >>>>> - u32 des0; >>>>> - u32 des1; >>>>> - u32 des2; >>>>> - u32 des3; >>>>> -}; >>>> >>>> This patch would be a lot smaller if those definitions were kept in >>>> their existing location instead of moving them... >>> >>> Could you at least test it on the tegra ? >> >> This patch (applied on top of current u-boot/master) does cause network >> failures on Jetson TX2 (Tegra186), which obviously uses this driver. > > That's what I was afraid of. > >> The docs for our version of the IP block do indicate that the DSL shift >> is supported, so I think the patch should work. I wonder if it's because >> of the bus width of our EQoS IP block? > > The STM32MP1 docs say that the shift is in 64bit words, maybe that > actually translate into bus words rather than 8 bytes indeed ?
I assume the STM32 doc writer only wrote the specific case that applies to their HW, which presumably has a 64-bit bus. Our docs are IIRC unmodified from the generic IP docs that Synopsis supplied, so mention all the cases and leave the reader to figure out which option applies! > btw what is your ARCH_DMA_MINALIGN on t186, 64 ? It's 64 I believe, since ARM64 selects SYS_CACHE_SHIFT_6, which then triggers the logic below, and I don't think Tegra overrides any of this: config SYS_CACHELINE_SIZE int default 64 if SYS_CACHE_SHIFT_6 > I'm worried that if something has it set to 128 or more, then this DSL > padding won't be able to cover it, since it can only skip up to 7 64bit > words. Oh yes, I meant to mention that in the patch; it would be good to validate the calculated DSL value doesn't overflow its field width. >> Our docs indicate that DSL is a >> multiple of the bus width, so describes a skip of word/dword/lword count >> based on 32/64/128 bus width. Perhaps the bus width needs to be >> parametrized when calculating the value? Our docs don't seem to indicate >> which bus width our HW uses:-( Aha, but changing the "/2" to "/4" in the >> code in this patch which calculates DSL value does yield a working >> system, so I guess we have a 128-bit bus width. And indeed this matches >> a comment I wrote in the driver: >> >> /* >> * Burst length must be < 1/2 FIFO size. >> * FIFO size in tqs is encoded as (n / 256) - 1. >> * Each burst is n * 8 (PBLX8) * 16 (AXI width) == 128 bytes. >> >> 16 byte AXI width == 128 bit width. > > So how do we parametrize that, based on compatible string I guess ? > We already have params for the tegra186 and stm32mp1 variants of the IP, > so adding one more should be OK. Yes, adding another field to struct eqos_config would make sense. > Also, thanks for looking into it.