UUID support
First of all, I'm not subscribed to the list, so please CC me on any
answers.

I'm being forced by my distro to specify the boot device by UUID. The
problem is I don't know how to add UUID support to the kernel, that is,
I don't know which option I should enable. Some might say «go fix your
distro», some might say «change your distro». Yes, those are options,
but for me it's more interesting to know which part of the kernel is
responsible for this.

So, thanks in advance for any answer.

-- 
(Not so) Random fortune:
They were technicians, mechanics -- and never thought of it in that
manner [that they were taking all the power of decision]. They just
wanted to right an obvious wrong.
                -- Harry Harrison, "To the Stars"
Re: committed memory, mmaps and shms
On Fri, Mar 13, 2015 at 11:58:51AM -0300, Marcos Dione wrote:
> On Thu, Mar 12, 2015 at 01:56:00PM -0300, Marcos Dione wrote:
> > On Thu, Mar 12, 2015 at 11:35:13AM -0400, Michal Hocko wrote:
> > > On Wed 11-03-15 19:10:44, Marcos Dione wrote:
> > > > I also read Documentation/vm/overcommit-accounting
> > >
> > > What would help you to understand it better?
>
> I think it's mostly a language barrier. The doc talks about how the
> kernel handles memory, but leaves userland people 'watching from outside
> the fence'. From the point of view of a sysadmin or non-kernel developer
> (who doesn't necessarily know all the kinds of things that can be done
> with malloc/mmap/shm/&c), this is what I think the doc refers to:
>
> > How It Works
> > ------------
> >
> > The overcommit is based on the following rules
> >
> > For a file backed map
>
> mmaps. are there more?

answering myself: yes, code maps behave like this.

> > 	SHARED or READ-only	- 0 cost (the file is the map not swap)
> > 	PRIVATE WRITABLE	- size of mapping per instance

code is not writable, so only private writable mmaps are left. I wonder
why shared writable ones are accounted.

> > For an anonymous
>
> malloc'ed memory
>
> > or /dev/zero map
>
> hmmm, (read only?) mmap'ing on top of /dev/zero?
>
> > 	SHARED			- size of mapping
>
> a shared anonymous memory is a shm?
>
> > 	PRIVATE READ-only	- 0 cost (but of little use)
> > 	PRIVATE WRITABLE	- size of mapping per instance
>
> I can't translate these two terms, unless the latter is the one
> referring specifically to mallocs. I wonder how one could create several
> instances of the 'same' mapping in that case. forks?
>
> > Additional accounting
> > 	Pages made writable copies by mmap
>
> Hmmm, copy-on-write pages for when you write in a shared mmap? I'm
> wild guessing here, even when what I say doesn't make any sense.
>
> > 	shmfs memory drawn from the same pool
>
> Beats me.

[...]

> Now it seems too simple! What am I missing? :)
>
> Cheers,

untrue, I'm still in the dark on what those mean. Maybe someone can
translate those terms to userland terms? malloc, shm, mmap, code maps?
probably I'm missing some.

cheers,

-- 
Marcos.
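The file-backed rules quoted above lend themselves to a quick userland
experiment. The sketch below is not from the thread: it assumes Linux, a
reasonably quiet machine (Committed_AS is system-wide and noisy), and
invents a scratch file name plus a meminfo_kb() helper. It maps the same
file once SHARED and once PRIVATE writable, sampling Committed_AS in
between; per the rules, only the private mapping should add roughly its
full size.

/* overcommit_file.c - illustrative sketch: commit cost of SHARED vs
 * PRIVATE WRITABLE file-backed mappings.
 * Build: cc -O2 -o overcommit_file overcommit_file.c */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

#define LEN (256UL * 1024 * 1024)	/* 256 MiB scratch file */

/* Return the value (in kB) of one /proc/meminfo field, or -1. */
static long meminfo_kb(const char *key)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[128];
	long val = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof line, f))
		if (!strncmp(line, key, strlen(key)) &&
		    sscanf(line + strlen(key), ": %ld", &val) == 1)
			break;
	fclose(f);
	return val;
}

int main(void)
{
	int fd = open("scratch.bin", O_RDWR | O_CREAT, 0600);
	long base, shared_cost, private_cost;
	void *s, *p;

	if (fd < 0 || ftruncate(fd, LEN) < 0)
		return 1;
	base = meminfo_kb("Committed_AS");

	/* "SHARED or READ-only - 0 cost": the file itself is the
	 * backing store, so no swap needs to be reserved. */
	s = mmap(NULL, LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	shared_cost = meminfo_kb("Committed_AS") - base;

	/* "PRIVATE WRITABLE - size of mapping per instance": every page
	 * may become an anonymous copy on write, so the full size is
	 * charged up front. */
	p = mmap(NULL, LEN, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	private_cost = meminfo_kb("Committed_AS") - base - shared_cost;

	if (s == MAP_FAILED || p == MAP_FAILED)
		return 1;
	printf("shared  map: +%ld kB committed\n", shared_cost);
	printf("private map: +%ld kB committed\n", private_cost);

	munmap(s, LEN);
	munmap(p, LEN);
	close(fd);
	unlink("scratch.bin");
	return 0;
}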
committed memory, mmaps and shms
Hi everybody. First, I hope this is the right list for such
questions; I searched in the list of lists[1] for an MM-specific one,
but didn't find any. Second, I'm not subscribed, so please CC me and my
other address when answering.

I'm trying to figure out how Linux really accounts for memory, both
globally and for each individual process. Most users' first approach to
memory monitoring is running free (no pun intended):

$ free
             total       used       free     shared    buffers     cached
Mem:     396895176  395956332     938844          0       8972  356409952
-/+ buffers/cache:   39537408  357357768
Swap:      8385788    8385788          0

This reports 378GiB of RAM, 377 used; of those, 8MiB in buffers and
339GiB in cache, leaving only 38GiB for processes (for some reason this
value is not displayed, which should probably be a warning of what is to
come); and 1GiB free. So far all seems good.

Now, this machine has (at least) a 108GiB shm. All this memory is
clearly counted as cache. This is my first surprise. shms are not caches
of anything on disk, but spaces of shared memory (duh); at most, their
pages can end up in swap, but not in a file somewhere. Maybe I'm not
correctly interpreting the meaning of (what is accounted as) cache.

The next tool in the toolbox is ps:

$ ps ux | grep 27595
USER       PID %CPU %MEM        VSZ      RSS TTY STAT START   TIME COMMAND
osysops  27595 49.5 12.7 5506723020 50525312 ?   Sl   05:20 318:02 otf_be v2.9.0.13 : FQ_E08AS FQ_E08-FQDSIALT #1 [processing daemon lib, msg type: undefined]

This process is not only attached to that shm, it's also attached
to 5TiB of mmap'ed files (128 LMDB databases), for a total of 5251GiB.
For context, know that another 9 processes do the same. This tells me
that shms and mmaps are counted as part of the virtual size, which makes
sense.

Of those, only 48GiB are resident... but a couple of paragraphs
before I said that there were only 38GiB used by processes. Clearly some
part of each individual process's RSS also counts at least some part of
the mmaps. /proc/27595/smaps has more info:

$ cat /proc/27595/smaps | awk 'BEGIN { count = 0; } /Rss/ { count = count + $2; print } /Pss/ { print } /Swap/ { print } /^Size/ { print } /-/ { print } END { print count }'
[...]
7f2987e92000-7f3387e92000 rw-s fc:11 3225448420 /instant/LMDBMedium_00/data.mdb
Size:           41943040 kB
Rss:              353164 kB
Pss:              166169 kB
Swap:                  0 kB
[...]
7f33df965000-7f4f1cdcc000 rw-s 00:04 454722576 /SYSV (deleted)
Size:          114250140 kB
Rss:             5587224 kB
Pss:             3856206 kB
Swap:                  0 kB
[...]
51652180

Notice that the sum is not the same as the one reported before;
maybe that's because I took them at different points in time while
redacting this mail. So this confirms that a process's RSS value
includes shms and mmaps, at least their resident parts. In the case of
the mmaps, the resident part must be the part that currently sits in the
cache; in the case of the shms, I suppose it's the part that has ever
been used. An internal tool tells me that currently 24GiB of that shm
are in use, but only 5 are reported as part of that process's RSS. Maybe
it's that process's used part?

And now I reach what I find most confusing (uninteresting values
removed):

$ cat /proc/meminfo
MemTotal:       396895176 kB
MemFree:           989392 kB
Buffers:             8448 kB
Cached:         344059556 kB
SwapTotal:        8385788 kB
SwapFree:               0 kB
Mapped:         147188944 kB
Shmem:          109114792 kB
CommitLimit:    206833376 kB
Committed_AS:   349194180 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      1222960 kB
VmallocChunk:   34157188704 kB

Again, values might vary due to timing.
Mapped clearly includes Shmem but not mmaps; in theory 36GiB are
'pure' (not shm'ed, not mmap'ed) process memory, close to what I
calculated before. Again, this is not segregated anywhere, which again
makes one wonder why. Probably it's more like "it doesn't make sense to
do it".

Last but definitely not least, Committed_AS is 333GiB, close to the
total mem. man proc says it's «The amount of memory presently allocated
on the system. The committed memory is a sum of all of the memory which
has been allocated by processes, even if it has not been "used" by them
as of yet». What is not clear is whether this counts mmaps (I think it
doesn't, or it would be either 5TiB or 50TiB, depending on whether you
count each attachment by each process) and/or shms (once? multiple
times?). In a rough calculation, the 83 procs using the same 108GiB shm
would account for 9TiB, so at least it's not counting that multiple
times.

While we're at it, I would like to know what VmallocTotal (32TiB)
is accounting. The explanation in man proc («Total size of vmalloc
memory area.», where vmalloc seems to be a kernel-internal function to
«allocate» memory) doesn't help much.

In summary, my questions are:

* Why isn't 'pure' malloc'ed memory ever reported? Does it make sense to
  talk about it?
* What does the RSS value mean for the shms in each proc's smaps file?
  And for mmaps?
* Is my conclusion about Shmem being counted into Mapped correct?
* What is actually counted in Committed_AS? Does it count shms or mmaps?
  How?
* What is VmallocTotal?

Cheers,

-- 
Marcos.
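As an aside, the arithmetic done above on free's numbers can be redone
straight from /proc/meminfo. A minimal sketch in C; the meminfo_kb()
helper and the "left for processes" label are this example's own, not an
official kernel counter:

/* leftover.c - redo the "used minus buffers/cache" arithmetic from
 * /proc/meminfo.  Build: cc -O2 -o leftover leftover.c */
#include <stdio.h>
#include <string.h>

/* Return the value (in kB) of one /proc/meminfo field, or -1. */
static long meminfo_kb(const char *key)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[128];
	long val = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof line, f))
		if (!strncmp(line, key, strlen(key)) &&
		    sscanf(line + strlen(key), ": %ld", &val) == 1)
			break;
	fclose(f);
	return val;
}

int main(void)
{
	long total  = meminfo_kb("MemTotal");
	long mfree  = meminfo_kb("MemFree");
	long bufs   = meminfo_kb("Buffers");
	long cached = meminfo_kb("Cached");
	long shmem  = meminfo_kb("Shmem");

	printf("used:               %ld kB\n", total - mfree);
	/* what free(1) prints as the "-/+ buffers/cache" used column */
	printf("left for processes: %ld kB\n",
	       total - mfree - bufs - cached);
	/* Cached includes Shmem, which is not a cache of anything on
	 * disk -- the surprise discussed above. */
	printf("of Cached, shmem:   %ld kB\n", shmem);
	return 0;
}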
Re: committed memory, mmaps and shms
On Thu, Mar 12, 2015 at 08:40:53AM -0400, Michal Hocko wrote:
> [CCing MM mailing list]

Shall we completely migrate the rest of the conversation there?

> On Wed 11-03-15 19:10:44, Marcos Dione wrote:
> >
> > Hi everybody. First, I hope this is the right list for such
> > questions; I searched in the list of lists[1] for an MM-specific one,
> > but didn't find any. Second, I'm not subscribed, so please CC me and my
> > other address when answering.
> >
> > I'm trying to figure out how Linux really accounts for memory, both
> > globally and for each individual process. Most users' first approach to
> > memory monitoring is running free (no pun intended):
> >
> > $ free
> >              total       used       free     shared    buffers     cached
> > Mem:     396895176  395956332     938844          0       8972  356409952
> > -/+ buffers/cache:   39537408  357357768
> > Swap:      8385788    8385788          0
> >
> > This reports 378GiB of RAM, 377 used; of those, 8MiB in buffers and
> > 339GiB in cache, leaving only 38GiB for processes (for some reason this
>
> I am not sure I understand your math here. 339G in the cache should be
> reclaimable (be careful about the shmem though). It is the rest which
> might be harder to reclaim.

The 38GiB I mention are the rest of the 378 available minus the 339
in cache. To me this difference represents the sum of the resident
anonymous memory malloc'ed by all processes. Unless there's some other
kind of pages accounted in 'Used'.

> shmem (tmpfs) is an in-memory filesystem. Pages backing shmem mappings
> are maintained in the page cache. Their backing storage is swap as you
> said.

So from a conceptual point of view this makes a lot of sense. Now
it's completely clear, thanks.

> > * Why isn't 'pure' malloc'ed memory ever reported? Does it make sense
> > to talk about it?
>
> This is simply private anonymous memory. And you can see it as such in
> /proc/<pid>/[s]maps

Yes, but my question was more along the lines of 'why do free or
/proc/meminfo not show it'. Maybe it's just that it's difficult to
define (like I said, "sum of resident anonymous..." &c) or nobody really
cares about this. Maybe I shouldn't either.

> > * What does the RSS value mean for the shms in each proc's smaps file?
> > And for mmaps?
>
> The amount of shmem backed pages mapped in to the user address space.

Perfect.

> > * Is my conclusion about Shmem being counted into Mapped correct?
>
> Mapped will tell you how much page cache is mapped via pagetable to a
> process. So it is a subset of pagecache, same as Shmem is a subset. Note
> that shmem doesn't have to be mapped anywhere (e.g. simply read a file
> on a tmpfs filesystem - it will be in the pagecache but not mapped).

> > * What is actually counted in Committed_AS? Does it count shms or
> > mmaps? How?
>
> This depends on the overcommit configuration. See
> Documentation/sysctl/vm.txt for more information.

I understand what /proc/sys/vm/overcommit_memory is for; what I
don't understand is what exactly is counted in the Committed_AS line in
/proc/meminfo. I also read Documentation/vm/overcommit-accounting and
even mm/mmap.c, but I'm still in the dark here.

> > * What is VmallocTotal?
>
> Vmalloc areas are used by _kernel_ to map larger physically
> non-contiguous memory areas. More on that e.g. here
> http://www.makelinux.net/books/lkd2/ch11lev1sec5. You can safely ignore
> it.

It's already forgotten, thanks :)

Cheers,

-- 
Marcos.
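Michal's point that shmem pages live in the page cache can be observed
from userland. The following sketch is illustrative only (the segment
name, size and the meminfo_kb() helper are made up here): it creates a
POSIX shm segment, touches it, and shows Cached and Shmem growing
together in /proc/meminfo. Older glibc needs -lrt for shm_open.

/* shm_cache.c - touching a shm segment raises both Shmem and Cached.
 * Build: cc -O2 -o shm_cache shm_cache.c -lrt */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

#define LEN (64UL * 1024 * 1024)	/* 64 MiB segment */

/* Return the value (in kB) of one /proc/meminfo field, or -1. */
static long meminfo_kb(const char *key)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[128];
	long val = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof line, f))
		if (!strncmp(line, key, strlen(key)) &&
		    sscanf(line + strlen(key), ": %ld", &val) == 1)
			break;
	fclose(f);
	return val;
}

int main(void)
{
	long cached0 = meminfo_kb("Cached");
	long shmem0  = meminfo_kb("Shmem");
	int fd = shm_open("/shm_cache_demo", O_RDWR | O_CREAT, 0600);
	char *p;

	if (fd < 0 || ftruncate(fd, LEN) < 0)
		return 1;
	p = mmap(NULL, LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return 1;
	memset(p, 1, LEN);	/* actually allocate the pages */

	/* both counters should grow by roughly LEN/1024 kB */
	printf("Cached grew by %ld kB\n", meminfo_kb("Cached") - cached0);
	printf("Shmem  grew by %ld kB\n", meminfo_kb("Shmem") - shmem0);

	munmap(p, LEN);
	close(fd);
	shm_unlink("/shm_cache_demo");
	return 0;
}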
Re: committed memory, mmaps and shms
On Thu, Mar 12, 2015 at 11:35:13AM -0400, Michal Hocko wrote:
> > > On Wed 11-03-15 19:10:44, Marcos Dione wrote:
> [...]
> > > > $ free
> > > >              total       used       free     shared    buffers     cached
> > > > Mem:     396895176  395956332     938844          0       8972  356409952
> > > > -/+ buffers/cache:   39537408  357357768
> > > > Swap:      8385788    8385788          0
> > > >
> > > > This reports 378GiB of RAM, 377 used; of those, 8MiB in buffers and
> > > > 339GiB in cache, leaving only 38GiB for processes (for some reason this
> > >
> > > I am not sure I understand your math here. 339G in the cache should be
> > > reclaimable (be careful about the shmem though). It is the rest which
> > > might be harder to reclaim.
> >
> > The 38GiB I mention are the rest of the 378 available minus the 339
> > in cache. To me this difference represents the sum of the resident
> > anonymous memory malloc'ed by all processes. Unless there's some other
> > kind of pages accounted in 'Used'.
>
> The kernel needs memory as well for its internal data structures
> (stacks, page tables, slab objects, memory used by drivers and what not).

Are those in or out of the total memory reported by free? I had the
impression they were out: 396895176 kB accounts for only 378.5GiB of the
384 available in the machine; I assumed the missing 5.5 were kernel
memory.

> > Yes, but my question was more along the lines of 'why do free or
> > /proc/meminfo not show it'. Maybe it's just that it's difficult to
> > define (like I said, "sum of resident anonymous..." &c) or nobody
> > really cares about this. Maybe I shouldn't either.
>
> meminfo is exporting this information as AnonPages.

I think that what I'm trying to do is figure out what each value
represents and where it's included, as if to make a graph like this
(fields in /proc/meminfo between []'s; dots are inactive, plus signs
active):

    RAM                                            swap      other (mmaps)
|------------------------------------------------|---------|-...
|.|          kernel [Slab+KernelStack+PageTables+?]
|.|          buffers [Buffers]
| . . . . .. .|  swap cached (not necessarily like this, but you get the
             idea) (I'm assuming that it only includes anon pages, shms
             and private mmaps) [SwapCached]
|++..|       resident anon (malloc'ed) [AnonPages/Active/Inactive(anon)]
|++++++|     cache [Cached/Active/Inactive(file)]
|+++...|     (resident?) shms [Shmem]
|+++..|      resident mmaps
|.|          other fs cache
|..|         free [MemFree]
|.|          used swap [SwapTotal-SwapFree]
|...|        swap free [SwapFree]

Note that there are no details on how the swap is split between anon
pages, shms and others, nor about mmaps, except in /proc/<pid>/smaps. If
someone were really interested in that, they would have to poll an
interesting number of files, but it's definitely doable. Just cat'ing
one of these files for a process with 128 mmaps and 1 shm as before gave
these times:

real    0m0.802s
user    0m0.004s
sys     0m0.244s

> > I understand what /proc/sys/vm/overcommit_memory is for; what I
> > don't understand is what exactly is counted in the Committed_AS line
> > in /proc/meminfo.
>
> It accounts all the address space reservations - e.g. mmap(len), len
> will get added. The things are slightly more complicated, but looking
> at the callers of security_vm_enough_memory_mm should give you an idea
> of what is included. How this number is used depends on the overcommit
> mode; __vm_enough_memory would give you a better picture.
>
> > I also read Documentation/vm/overcommit-accounting
>
> What would help you to understand it better?

I think that after this dip in terminology I should go back to it
and try again to figure it out myself :) Of course findings will be
posted.

Cheers,

-- 
Marcos.
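Michal's «mmap(len), len will get added» is easy to watch: reserve a big
anonymous private area and never touch it. A minimal sketch under the
same caveats as before (Committed_AS is a noisy system-wide counter; the
meminfo_kb() helper is illustrative):

/* commit_reserve.c - Committed_AS counts reservations, not touched
 * pages.  Build: cc -O2 -o commit_reserve commit_reserve.c */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define LEN (1UL << 30)		/* 1 GiB, never written to */

/* Return the value (in kB) of one /proc/meminfo field, or -1. */
static long meminfo_kb(const char *key)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[128];
	long val = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof line, f))
		if (!strncmp(line, key, strlen(key)) &&
		    sscanf(line + strlen(key), ": %ld", &val) == 1)
			break;
	fclose(f);
	return val;
}

int main(void)
{
	long before = meminfo_kb("Committed_AS");
	void *p = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	long after = meminfo_kb("Committed_AS");

	if (p == MAP_FAILED)
		return 1;
	/* The reservation is charged in full, yet RSS barely moves:
	 * compare VmRSS in /proc/self/status before and after. */
	printf("reserved %lu kB, Committed_AS grew by %ld kB\n",
	       LEN / 1024, after - before);
	munmap(p, LEN);
	return 0;
}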
Re: committed memory, mmaps and shms
On Thu, Mar 12, 2015 at 01:56:00PM -0300, Marcos Dione wrote:
> On Thu, Mar 12, 2015 at 11:35:13AM -0400, Michal Hocko wrote:
> > On Wed 11-03-15 19:10:44, Marcos Dione wrote:
>
> I think that what I'm trying to do is figure out what each value
> represents and where it's included, as if to make a graph like this
> (fields in /proc/meminfo between []'s; dots are inactive, plus signs
> active):
>
>     RAM                                            swap      other (mmaps)
> |------------------------------------------------|---------|-...
> |.|          kernel [Slab+KernelStack+PageTables+?]
> |.|          buffers [Buffers]
> | . . . . .. .|  swap cached (not necessarily like this, but you get the
>              idea) (I'm assuming that it only includes anon pages, shms
>              and private mmaps) [SwapCached]
> |++..|       resident anon (malloc'ed) [AnonPages/Active/Inactive(anon)]
> |++++++|     cache [Cached/Active/Inactive(file)]
> |+++...|     (resident?) shms [Shmem]
> |+++..|      resident mmaps
> |.|          other fs cache
> |..|         free [MemFree]
> |.|          used swap [SwapTotal-SwapFree]
> |...|        swap free [SwapFree]

Did I get this right so far?

> > > I understand what /proc/sys/vm/overcommit_memory is for; what I
> > > don't understand is what exactly is counted in the Committed_AS line
> > > in /proc/meminfo.
> >
> > It accounts all the address space reservations - e.g. mmap(len), len
> > will get added. The things are slightly more complicated, but looking
> > at the callers of security_vm_enough_memory_mm should give you an idea
> > of what is included. How this number is used depends on the overcommit
> > mode; __vm_enough_memory would give you a better picture.
> >
> > > I also read Documentation/vm/overcommit-accounting
> >
> > What would help you to understand it better?

I think it's mostly a language barrier. The doc talks about how the
kernel handles memory, but leaves userland people 'watching from outside
the fence'. From the point of view of a sysadmin or non-kernel developer
(who doesn't necessarily know all the kinds of things that can be done
with malloc/mmap/shm/&c), this is what I think the doc refers to:

> How It Works
> ------------
>
> The overcommit is based on the following rules
>
> For a file backed map

mmaps. are there more?

> 	SHARED or READ-only	- 0 cost (the file is the map not swap)
> 	PRIVATE WRITABLE	- size of mapping per instance
>
> For an anonymous

malloc'ed memory

> or /dev/zero map

hmmm, (read only?) mmap'ing on top of /dev/zero?

> 	SHARED			- size of mapping

a shared anonymous memory is a shm?

> 	PRIVATE READ-only	- 0 cost (but of little use)
> 	PRIVATE WRITABLE	- size of mapping per instance

I can't translate these two terms, unless the latter is the one
referring specifically to mallocs. I wonder how one could create several
instances of the 'same' mapping in that case. forks?

> Additional accounting
> 	Pages made writable copies by mmap

Hmmm, copy-on-write pages for when you write in a shared mmap? I'm
wild guessing here, even when what I say doesn't make any sense.

> 	shmfs memory drawn from the same pool

Beats me.

> Status
> ------

This section goes back mostly to userland terminology.

> o We account mmap memory mappings
> o We account mprotect changes in commit
> o We account mremap changes in size
> o We account brk

This I know is part of the implementation of malloc.
> o We account munmap
> o We report the commit status in /proc
> o Account and check on fork
> o Review stack handling/building on exec
> o SHMfs accounting
> o Implement actual limit enforcement
>
> To Do
> -----
> o Account ptrace pages (this is hard)

I know ptrace, and this seems to hint that ptrace also uses a good
amount of pages, but in normal operation I can ignore this.

In summary, so far:

* only private writable mmaps are counted 'once per instance', which I
  assume means that if the same process uses the 'same' mmap twice (two
  instances), then it gets counted twice, because each instance is
  separate from the other.

* malloc'ed and shared memory are accounted, again once per instance.

* those two 'Additional accounting' things I couldn't figure out.

Now it seems too simple! What am I missing? :)

Cheers,

-- 
Marcos.
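The 'size of mapping per instance' rule together with the 'Account and
check on fork' item suggests one more experiment: hold a private
writable mapping and fork(); the child's copy should be charged again. A
hedged sketch (the pipe only keeps the child alive while the parent
samples /proc/meminfo; names, sizes and the meminfo_kb() helper are this
example's own):

/* fork_commit.c - a private writable mapping is charged once more for
 * the child after fork().  Build: cc -O2 -o fork_commit fork_commit.c */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

#define LEN (256UL * 1024 * 1024)	/* 256 MiB */

/* Return the value (in kB) of one /proc/meminfo field, or -1. */
static long meminfo_kb(const char *key)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[128];
	long val = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof line, f))
		if (!strncmp(line, key, strlen(key)) &&
		    sscanf(line + strlen(key), ": %ld", &val) == 1)
			break;
	fclose(f);
	return val;
}

int main(void)
{
	void *p = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int pfd[2];
	long one, two;
	pid_t pid;

	if (p == MAP_FAILED || pipe(pfd) < 0)
		return 1;
	one = meminfo_kb("Committed_AS");

	pid = fork();			/* second instance of the mapping */
	if (pid == 0) {
		char c;
		read(pfd[0], &c, 1);	/* stay alive until parent sampled */
		_exit(0);
	}
	two = meminfo_kb("Committed_AS");
	write(pfd[1], "x", 1);
	waitpid(pid, NULL, 0);

	printf("fork added %ld kB of commit for a %lu kB mapping\n",
	       two - one, LEN / 1024);
	munmap(p, LEN);
	return 0;
}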
Re: committed memory, mmaps and shms
On Fri, Mar 13, 2015 at 03:09:58PM +0100, Michal Hocko wrote:
> Well, the memory management subsystem is rather complex and it is not
> really trivial to match all the possible combinations into simple
> counters.

Yes, I imagine.

> I would be interested in the particular usecase where you want the
> specific information and it is important outside of debugging purposes.

Well, now it's more sheer curiosity than anything else, except for
Committed_AS, which is directly related to work. I personally prefer to
a) have a full picture in my head and b) have it documented somewhere,
even if only in this thread.

-- 
Marcos.