On Sun, 24 Jun 2007, Russell King wrote: > On Sun, Jun 24, 2007 at 11:24:16AM +0100, Hugh Dickins wrote: > > On Sun, 24 Jun 2007, Russell King wrote: > > > On Fri, Jun 22, 2007 at 07:39:33PM +0100, Hugh Dickins wrote: > > > > > > Please forward the original problem report. > > > > Done. > > Okay, that seems to back up my suspicions - it's definitely AT91-based. > Since AT91-based machines do not have a DMA coherent cache, > arch_is_coherent() must be defined to '0'. The only way that kmalloc > could be reached is if that were defined to something other than '0', > and if that's done on a machine with DMA incoherent caches, that will > lead to data corruption.
Yes, having looked through that now, I agree with you 100%. > > I think we need to wait for Nicolas to respond on this issue before > running headlong into applying a sticky plaster for something which is > actually a deeper issue. No need for Nicolas to respond, I think I've found what's "wrong": see below. > > However, the arch_is_coherent() path _is_ buggy as it stands, but in > more than the way identified thus far. Eg, it doesn't set __GFP_DMA > appropriately for various DMA masks, so it might return DMA inaccessible > memory. I expect you're right, but that's a separate issue. I had thought you were approving Christoph's ARM patch because both you and he seemed to agree that kmalloc was inappropriate for use in dma_alloc_coherent, whatever additional issues you saw with it. I still don't see why kmalloc is wrong there myself: for a while I bought Christoph's alignment argument, but now I don't see why (more than long) alignment is important to it. But I'm easily wrong when it comes to DMA mapping issues. > > If we're after a simple fix for 2.6.22, the _easiest_ solution would be > to delete the entire arch_is_coherent() branches in arch/arm/mm/consistent.c; > that will result in a working solution for everyone, albiet at a slightly > lower performance for the DMA-coherent CPUs. The fix for 2.6.22 is my PageSlab test in page_mapping which Linus already put into -git. And I now rather think that needs to stay, not be replaced by the VM_BUG_ON Christoph was proposing for 2.6.23 (which earlier I acked). Christoph responded to my page_mapping patch by looking at arch/arm, and there finding a kmalloc in dma_alloc_coherent which he didn't like; but you're right, it's entirely irrelevant to Nicolas' oops. The slub allocation which gives rise to Nicolas' oops in not in ARM, but (I'm guessing) in drivers/mmc/core/sd.c: one of those status = kmalloc(64, GFP_KERNEL); where status is passed down for the response from mmc_sd_switch. And what is wrong with using kmalloc there? Why should that be changed to allocate a whole page? How many other such cases might there be? And the flush_dcache_page in at91mci_post_dma_read looks correct to me too: it has just filled and perhaps also swabbed a buffer, that buffer might in some cases be mapped into userspace, so it calls flush_dcache_page. In the kmalloc case it's not mapped into userspace: flush_dcache_page should detect that and do nothing, as it does with slab; but slub was reusing page->mapping for something else, so we oopsed. Hugh - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/