On Wed, 20 Jul 2011 08:36:12 -0700 "J. William Campbell" <jwilliamcampb...@comcast.net> wrote:
> On 7/20/2011 7:35 AM, Albert ARIBAUD wrote:
> > On 20/07/2011 16:01, J. William Campbell wrote:
> >> On 7/20/2011 6:02 AM, Albert ARIBAUD wrote:
> >>> On 19/07/2011 22:11, J. William Campbell wrote:
> >>>
> >>>> If this is true, then it means that the cache is of type write-back
> >>>> (as opposed to write-through). From a (very brief) look at the ARM7
> >>>> manuals, it appears that both types of cache may be present in the
> >>>> CPU. Do you know how this operates?
> >>>
> >>> Usually, copy-back (rather than write-back) and write-through are
> >>> modes of operation, not cache types.
> >>
> >> Hi Albert,
> >> On some CPUs, both cache modes are available. On many other CPUs (I
> >> would guess most), you have one fixed mode available, but not both. I
> >> have always seen the two modes described as write-back and
> >> write-through, but I am sure we are talking about the same things.
> >
> > We are. Copy-back is another name for write-back, not used by ARM but
> > by some others.
> >
> >> The examples that I am familiar with that have both modes have the
> >> mode as a "global" setting. It is not controlled by bits in the TLB
> >> or anything like that. How does it work on ARM? Is it fixed globally,
> >> globally controlled, or controlled by memory management?
> >
> > Well, it's a bit complicated, because it depends on the architecture
> > version *and* implementation -- ARM themselves do not mandate things,
> > and it is up to the SoC designer to specify what cache they want and
> > what mode it supports, both at L1 and L2, in their specific instance
> > of ARM cores. And yes, you can have memory areas that are write-back
> > and others that are write-through in the same system.
> >
> >> If it is controlled by memory management, it looks to me like lots of
> >> problems could be avoided by operating with input-type buffers set as
> >> write-through.
> >> One probably isn't going to be writing to input buffers much under
> >> program control anyway, so the performance loss should be minimal.
> >> This gets rid of the alignment restrictions on these buffers, but
> >> not the invalidate/flush requirements.
> >
> > There's not much you can do about alignment issues except align to
> > cache line boundaries.
> >
> >> However, if memory management is required to set the cache mode, it
> >> might be best to operate with the buffers and descriptors un-cached.
> >> That gets rid of the flush/invalidate requirement at the expense of
> >> slowing down copying from read buffers.
> >
> > That makes 'best' a subjective choice, doesn't it? :)
>
> Hi All,
> Yes, it probably depends on the usage.
>
> >> Probably a reasonable price to pay for the associated simplicity.
> >
> > Others would say that spending some time setting up alignments and
> > flushes and invalidates is a reasonable price to pay for increased
> > performance... That's an open debate where no solution is The Right
> > One(tm).
> >
> > For instance, consider the TFTP image reading. People would like the
> > image to end up in cached memory because we'll do some checksumming on
> > it before we give it control, and having it cached makes this step
> > quite a bit faster; but we'll lose that if we put it in non-cached
> > memory, because it comes through the Ethernet controller's DMA; and it
> > would be worse to receive packets in non-cached memory only to move
> > their contents into cached memory later on.
> >
> > I think properly aligning descriptors and buffers is enough to avoid
> > the mixed flush/invalidate line issue, and wisely placed instruction
> > barriers should be enough to get the added performance of the cache
> > without too much of the hassle of memory management.
>
> I am pretty sure that all the drivers read the input data into
> intermediate buffers in all cases. There is no practical way to be sure
> the next packet received is the "right one" for the TFTP.
> Plus there are headers involved, and furthermore there is no way to
> ensure that a TFTP destination is located on a sector boundary. In
> short, you are going to copy from an input buffer to a destination.
> However, it is still correct that copying from a non-cached area is
> slower than from cached areas, because of burst reads vs. individual
> reads. That said, I doubt that the U-Boot user can tell the difference,
> as the network latency will far exceed the difference in copy time. The
> question is which is easier to do, and that is probably a matter of
> opinion. However, it is safe to say that so far a cached solution has
> eluded us. That may be changing, but it would still be nice to know how
> to allocate a section of un-cached RAM on the ARM processor, in so far
> as the question has a single answer! That would allow easy portability
> of drivers that do not know about caches, of which there seem to be
> many.

I agree. Unfortunately, my time is up for now, and I can't go on trying
to fix this driver. Maybe I'll pick it up again after my vacation. For
now I have settled for the ugly solution of keeping the dcache disabled
while Ethernet is being used :-(

IMHO, doing cache maintenance all over the driver is not an easy or nice
solution. Implementing a non-cached memory pool in the MMU and a
corresponding dma_malloc() sounds much more universally applicable to
any driver.

Best regards,

--
David Jander
Protonic Holland.

_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot