> On 03/02/2012 04:39 PM, Marek Vasut wrote: > >> ensure that transmit and receive buffers are cache-line aligned > >> > >> invalidate cache after each packet received > >> flush cache before transmitting > >> > >> Signed-off-by: Eric Nelson<eric.nel...@boundarydevices.com> > >> > >> --- > >> > >> drivers/net/fec_mxc.c | 248 > >> > >> ++++++++++++++++++++++++++++++++++++------------- drivers/net/fec_mxc.h > >> | > >> > >> 19 +---- > >> 2 files changed, 184 insertions(+), 83 deletions(-) > >> > >> diff --git a/drivers/net/fec_mxc.c b/drivers/net/fec_mxc.c > >> index 1fdd071..f72304b 100644 > >> --- a/drivers/net/fec_mxc.c > >> +++ b/drivers/net/fec_mxc.c > >> > >> <snip> > >> > >> /* > >> > >> * This routine transmits one frame. This routine only accepts > >> > >> @@ -650,15 +708,21 @@ static int fec_send(struct eth_device *dev, > >> volatile void* packet, int length) } > >> > >> /* > >> > >> - * Setup the transmit buffer > >> - * Note: We are always using the first buffer for transmission, > >> - * the second will be empty and only used to stop the DMA engine > >> + * Setup the transmit buffer. We are always using the first buffer for > >> + * transmission, the second will be empty and only used to stop the > >> DMA + * engine. We also flush the packet to RAM here to avoid cache > >> trouble. > >> > >> */ > >> > >> #ifdef CONFIG_FEC_MXC_SWAP_PACKET > >> > >> swap_packet((uint32_t *)packet, length); > >> > >> #endif > >> > >> - writew(length,&fec->tbd_base[fec->tbd_index].data_length); > >> - writel((uint32_t)packet,&fec->tbd_base[fec->tbd_index].data_pointer); > >> + > >> + addr = (uint32_t)packet; > >> + size = roundup(length, CONFIG_FEC_DATA_ALIGNMENT); > >> + flush_dcache_range(addr, addr + size); > > > > What if size of the packet isn't aligned on cacheline boundary? Won't it > > choke here? > > Here's a quandary... > > flush_dcache_range() can potentially force writes into unintended areas, > which could conceivably include areas owned by the ethernet receiver if > the source object weren't aligned to a cache-line boundary and size. > > In practice, it appears that net/net.c only transmits from one of two > places:
You can also invalidate memory and loose some information. Though I'm not quite sure this driver can be the case. > - a dedicated transmit buffer (NetTxPacket), which is guaranteed > to be aligned to PKTALIGN (32). > - a receive buffer (ICMP and ARP replies) > > The networking API certainly allows for transmits from arbitrary > areas in memory, but I'm not seeing where or how this can be invoked. OK > > Because there's no way to query how the memory for a packet is > allocated, the only way around this I can see is to memcpy to > a dedicated transmit buffer within fec_mxc.c, which would defeat > any benefit of cache. Indeed ... a bounce buffer. But such a bounce buffer should be implemented close to the place where the misalignment originates. If the net core can align packets everywhere well, we're safe and happy. > > AFAICT, all of the other calls to 'flush_dcache_range()' are okay because > they're dealing with buffers allocated by the driver. > I'd love to see someone else review this. This is very important to be made right. > >> + > >> + fec->tbd_base[fec->tbd_index].data_length = length; > >> + fec->tbd_base[fec->tbd_index].data_pointer = addr; > >> + > >> > >> /* > >> > >> * update BD's status now > >> > >> * This block: > >> @@ -667,9 +731,18 @@ static int fec_send(struct eth_device *dev, > >> volatile void* packet, int length) * - might be the last BD in the > >> list, so the address counter should * wrap (-> keep the WRAP flag) > >> > >> */ > >> > >> - status = readw(&fec->tbd_base[fec->tbd_index].status)& FEC_TBD_WRAP; > >> + status = fec->tbd_base[fec->tbd_index].status& FEC_TBD_WRAP; > >> > >> status |= FEC_TBD_LAST | FEC_TBD_TC | FEC_TBD_READY; > >> > >> - writew(status,&fec->tbd_base[fec->tbd_index].status); > >> + fec->tbd_base[fec->tbd_index].status = status; > >> + > >> + /* > >> + * Flush data cache. This code flushes both TX descriptors to RAM. > >> + * After this code this code, the descritors will be safely in RAM > >> + * and we can start DMA. > >> + */ > >> + size = roundup(2 * sizeof(struct fec_bd), CONFIG_FEC_DATA_ALIGNMENT); > >> + addr = (uint32_t)fec->tbd_base; > >> + flush_dcache_range(addr, addr + size); > > > > Same concern here and everywhere else ... I believe aligning it like this > > is quite unsafe :-( > > This one looks safe because you've controlled the allocation of tbd_base. > > >> <snip> > >> > >> @@ -751,6 +846,13 @@ static int fec_recv(struct eth_device *dev) > >> > >> frame = (struct nbuf *)readl(&rbd->data_pointer); > >> frame_length = readw(&rbd->data_length) - 4; > >> /* > >> > >> + * Invalidate data cache over the buffer > >> + */ > >> + addr = (uint32_t)frame; > >> + size = roundup(frame_length, CONFIG_FEC_DATA_ALIGNMENT); > >> + invalidate_dcache_range(addr, addr + size); > > > > DTTO here, frame length might not be aligned properly, or will it be? > > Network stack must be properly analyzed here. > > The hardware won't return an unaligned value here, so this should be good. Are you sure? You can't receive frame aligned to 8 bytes boundary? > > >> <snip> > >> > >> @@ -765,11 +867,27 @@ static int fec_recv(struct eth_device *dev) > >> > >> (ulong)rbd->data_pointer, > >> bd_status); > >> > >> } > >> > >> + > >> > >> /* > >> > >> - * free the current buffer, restart the engine > >> - * and move forward to the next buffer > >> + * Free the current buffer, restart the engine and move forward > >> + * to the next buffer. Here we check if the whole cacheline of > >> + * descriptors was already processed and if so, we mark it free > >> + * as whole. > >> > >> */ > >> > >> - fec_rbd_clean(fec->rbd_index == (FEC_RBD_NUM - 1) ? 1 : 0, rbd); > >> + size = (CONFIG_FEC_DATA_ALIGNMENT / sizeof(struct fec_bd)) - 1; > >> + if ((fec->rbd_index& size) == size) { > >> + i = fec->rbd_index - size; > >> + addr = (uint32_t)&fec->rbd_base[i]; > >> + for (; i<= fec->rbd_index + size; i++) { > >> + if (i == (FEC_RBD_NUM - 1)) > >> + fec_rbd_clean(1,&fec->rbd_base[i]); > >> + else > >> + fec_rbd_clean(0,&fec->rbd_base[i]); > >> + } > >> + flush_dcache_range(addr, > >> + addr + CONFIG_FEC_DATA_ALIGNMENT); > >> + } > >> + > > > > This was the worst part. I don't quite remember how and why I did those > > alignment decisions (well it's obvious here, you flush rxdesc once whole > > cacheline is done), but some of the pieces are far less obvious now that > > I read the code. > > I'm not grokking this one either. Definitely needs fresher eyes than I have > at the moment. Indeed, I'm thinking the same thing. Let's sleep on this! Thanks for your good work on putting this into shape! M _______________________________________________ U-Boot mailing list U-Boot@lists.denx.de http://lists.denx.de/mailman/listinfo/u-boot