On 11/10/2011 07:05 PM, Jerome Glisse wrote: > On Thu, Nov 10, 2011 at 11:27:33AM +0100, Thomas Hellstrom wrote: > >> On 11/09/2011 09:22 PM, j.glisse at gmail.com wrote: >> >>> From: Jerome Glisse<jglisse at redhat.com> >>> >>> This is an overhaul of the ttm memory accounting. This tries to keep >>> the same global behavior while removing the whole zone concept. It >>> keeps a distrinction for dma32 so that we make sure that ttm don't >>> starve the dma32 zone. >>> >>> There is 3 threshold for memory allocation : >>> - max_mem is the maximum memory the whole ttm infrastructure is >>> going to allow allocation for (exception of system process see >>> below) >>> - emer_mem is the maximum memory allowed for system process, this >>> limit is> to max_mem >>> - swap_limit is the threshold at which point ttm will start to >>> try to swap object because ttm is getting close the max_mem >>> limit >>> - swap_dma32_limit is the threshold at which point ttm will start >>> swap object to try to reduce the pressure on the dma32 zone. Note >>> that we don't specificly target object to swap to it might very >>> well free more memory from highmem rather than from dma32 >>> >>> Accounting is done through used_mem& used_dma32_mem, which sum give >>> the total amount of memory actually accounted by ttm. >>> >>> Idea is that allocation will fail if (used_mem + used_dma32_mem)> >>> max_mem and if swapping fail to make enough room. >>> >>> The used_dma32_mem can be updated as a later stage, allowing to >>> perform accounting test before allocating a whole batch of pages. >>> >>> >> Jerome, you're removing a fair amount of functionality here, without >> justifying >> why it could be removed. >> > All this code was overkill. >
[1] I don't agree, and since it's well tested, thought throught and working, I see no obvious reason to alter it, within the context of this patch series unless it's absolutely required for the functionality. > > >> Consider a low-end system with 1G of kernel memory and 10G of >> highmem. How do we avoid putting stress on the kernel memory? I also >> wouldn't be too surprised if DMA32 zones appear in HIGHMEM systems >> in the future making the current zone concept good to keep. >> > Right now kernel memory is accounted against all zone so it decrease > not only the kernel zone but also the dma32& highmem if present. > Do you mean that the code is incorrect? In that case, did you consider the fact that zones may overlap? (Although I admit the name "highmem" might be misleading. Should be "total"). > Note also that kernel zone in current code == dma32 zone. > Last time I looked, the highmem split was typically at slightly less than 1GB, depending on the hardware and desired setup. I admit that was some time ago, but has that really changed? On all archs? Furthermore, on !Highmem systems, All pages are in the kernel zone. > When it comes to future it looks a lot simpler, it seems everyone > is moving toward more capable and more advanced iommu that can remove > all the restriction on memory from the device pov. I mean even arm > are getting more and more advance iommu. I don't see any architecture > worse supporting not going down that road. > While the proposed change is probably possible, with different low <-> high splits depending on whether HIGHMEM is defined or not, I think [1] is a good reason for not pushing it through. > >> Also, in effect you move the DOS from *all* zones into the DMA32 >> zone and create a race in that multiple simultaneous allocators can >> first pre-allocate out of the global zone, and then update the DMA32 >> zone without synchronization. In this way you might theoretically >> end up with more DMA32 pages allocated than present in the zone. >> > Ok a respin attached with a simple change, things will be > accounted against dma32 zone and only when we get page we will > decrease the dma32 zone usage that way no DOS on dma32. > > It also deals with the case where there is still lot of highmem > but no more dma32. > So why not just do a ttm_mem_global_alloc() for the pages you want to allocate, and add a proper adjustment function if memory turns out to be either HIGHMEM or !DMA32 > >> With the proposed code there's also a theoretical problem in that a >> potentially huge number of pages are unaccounted before they are >> actually freed. >> > What you mean unaccounted ? The way it works is : > r = global_memory_alloc(size) > if (r) fail > alloc pages > update memory accounting according to what page where allocated > > So memory is always accounted before even being allocated (exception > are the kernel object for vmwgfx& ttm_bo but we can move accounting > there too if you want, those are small allocation and i didn't think > it was worse changing that). > No, I mean the sequence unaccount_page_array() ---Race-- free_page_array() /Thomas > Cheers, > Jerome >