On 03/02/2012 04:39 PM, Marek Vasut wrote:
        ensure that transmit and receive buffers are cache-line aligned
         invalidate cache after each packet received
         flush cache before transmitting

Signed-off-by: Eric Nelson<eric.nel...@boundarydevices.com>
>> ---
>>   drivers/net/fec_mxc.c |  248
>> ++++++++++++++++++++++++++++++++++++------------- drivers/net/fec_mxc.h |
>>   19 +----
>>   2 files changed, 184 insertions(+), 83 deletions(-)
>>
>> diff --git a/drivers/net/fec_mxc.c b/drivers/net/fec_mxc.c
>> index 1fdd071..f72304b 100644
>> --- a/drivers/net/fec_mxc.c
>> +++ b/drivers/net/fec_mxc.c
>>
>> <snip>
>>
        /*
         * This routine transmits one frame.  This routine only accepts
@@ -650,15 +708,21 @@ static int fec_send(struct eth_device *dev, volatile
void* packet, int length) }

        /*
-        * Setup the transmit buffer
-        * Note: We are always using the first buffer for transmission,
-        * the second will be empty and only used to stop the DMA engine
+        * Setup the transmit buffer. We are always using the first buffer for
+        * transmission, the second will be empty and only used to stop the DMA
+        * engine. We also flush the packet to RAM here to avoid cache trouble.
         */
  #ifdef        CONFIG_FEC_MXC_SWAP_PACKET
        swap_packet((uint32_t *)packet, length);
  #endif
-       writew(length,&fec->tbd_base[fec->tbd_index].data_length);
-       writel((uint32_t)packet,&fec->tbd_base[fec->tbd_index].data_pointer);
+
+       addr = (uint32_t)packet;
+       size = roundup(length, CONFIG_FEC_DATA_ALIGNMENT);
+       flush_dcache_range(addr, addr + size);

What if size of the packet isn't aligned on cacheline boundary? Won't it choke
here?


Here's a quandary...

flush_dcache_range() can potentially force writes into unintended areas,
which could conceivably include areas owned by the ethernet receiver if
the source object weren't aligned to a cache-line boundary and size.

In practice, it appears that net/net.c only transmits from one of two
places:
        - a dedicated transmit buffer (NetTxPacket), which is guaranteed
        to be aligned to PKTALIGN (32).
        - a receive buffer (ICMP and ARP replies)

The networking API certainly allows for transmits from arbitrary
areas in memory, but I'm not seeing where or how this can be invoked.

Because there's no way to query how the memory for a packet is
allocated, the only way around this I can see is to memcpy to
a dedicated transmit buffer within fec_mxc.c, which would defeat
any benefit of cache.

AFAICT, all of the other calls to 'flush_dcache_range()' are okay because
they're dealing with buffers allocated by the driver.

+
+       fec->tbd_base[fec->tbd_index].data_length = length;
+       fec->tbd_base[fec->tbd_index].data_pointer = addr;
+
        /*
         * update BD's status now
         * This block:
@@ -667,9 +731,18 @@ static int fec_send(struct eth_device *dev, volatile
void* packet, int length) * - might be the last BD in the list, so the
address counter should *   wrap (->  keep the WRAP flag)
         */
-       status = readw(&fec->tbd_base[fec->tbd_index].status)&  FEC_TBD_WRAP;
+       status = fec->tbd_base[fec->tbd_index].status&  FEC_TBD_WRAP;
        status |= FEC_TBD_LAST | FEC_TBD_TC | FEC_TBD_READY;
-       writew(status,&fec->tbd_base[fec->tbd_index].status);
+       fec->tbd_base[fec->tbd_index].status = status;
+
+       /*
+        * Flush data cache. This code flushes both TX descriptors to RAM.
+        * After this code this code, the descritors will be safely in RAM
+        * and we can start DMA.
+        */
+       size = roundup(2 * sizeof(struct fec_bd), CONFIG_FEC_DATA_ALIGNMENT);
+       addr = (uint32_t)fec->tbd_base;
+       flush_dcache_range(addr, addr + size);

Same concern here and everywhere else ... I believe aligning it like this is
quite unsafe :-(

This one looks safe because you've controlled the allocation of tbd_base.


<snip>

@@ -751,6 +846,13 @@ static int fec_recv(struct eth_device *dev)
                        frame = (struct nbuf *)readl(&rbd->data_pointer);
                        frame_length = readw(&rbd->data_length) - 4;
                        /*
+                        * Invalidate data cache over the buffer
+                        */
+                       addr = (uint32_t)frame;
+                       size = roundup(frame_length, CONFIG_FEC_DATA_ALIGNMENT);
+                       invalidate_dcache_range(addr, addr + size);

DTTO here, frame length might not be aligned properly, or will it be? Network
stack must be properly analyzed here.


The hardware won't return an unaligned value here, so this should be good.


>> <snip>
>>
@@ -765,11 +867,27 @@ static int fec_recv(struct eth_device *dev)
                                                (ulong)rbd->data_pointer,
                                                bd_status);
                }
+
                /*
-                * free the current buffer, restart the engine
-                * and move forward to the next buffer
+                * Free the current buffer, restart the engine and move forward
+                * to the next buffer. Here we check if the whole cacheline of
+                * descriptors was already processed and if so, we mark it free
+                * as whole.
                 */
-               fec_rbd_clean(fec->rbd_index == (FEC_RBD_NUM - 1) ? 1 : 0, rbd);
+               size = (CONFIG_FEC_DATA_ALIGNMENT / sizeof(struct fec_bd)) - 1;
+               if ((fec->rbd_index&  size) == size) {
+                       i = fec->rbd_index - size;
+                       addr = (uint32_t)&fec->rbd_base[i];
+                       for (; i<= fec->rbd_index + size; i++) {
+                               if (i == (FEC_RBD_NUM - 1))
+                                       fec_rbd_clean(1,&fec->rbd_base[i]);
+                               else
+                                       fec_rbd_clean(0,&fec->rbd_base[i]);
+                       }
+                       flush_dcache_range(addr,
+                               addr + CONFIG_FEC_DATA_ALIGNMENT);
+               }
+

This was the worst part. I don't quite remember how and why I did those
alignment decisions (well it's obvious here, you flush rxdesc once whole
cacheline is done), but some of the pieces are far less obvious now that I read
the code.


I'm not grokking this one either. Definitely needs fresher eyes than I have at
the moment.
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot

Reply via email to