On 19.07.18 12:30, Timothy Arceri wrote:
On 19.07.18 11:47, Timothy Arceri wrote:
On 19/07/18 08:31, Eric Anholt wrote:
Danylo Piliaiev <danylo.pilia...@gmail.com> writes:

After optimization passes and many trasfromations most of memory

"transformations"

NIR holds is a garbage which was being freed only after shader deletion.

"is garbage"

Freeing it at the end of linking will save memory which would be useful
in case there are a lot of complex shaders being compiled.
The common case for this issue is a 32-bit game running under Wine.

The cost of the optimization is around 3-5% of compilation speed
with complex shaders.

Signed-off-by: Danylo Piliaiev <danylo.pilia...@globallogic.com>

This seems good, and I'm running it through the CTS now.


The problem is this does the sweep too early. We still do lots of work on NIR after this. I've thought about this a few times, and it really seems we should be able to call a driver-specific function from the st and pass it the IR so it can do whatever it wants (lowering, opts, etc.) and spit it back out. Once this is done we should then call sweep and cache the IR.

At the very least we should call sweep before we cache NIR rather than where this patch places it.
I debugged Mesa once more and cannot see where else memory is allocated for NIR after this sweep. Could you point me to it? After this sweep the memory NIR holds never grows. Later this NIR is cloned, and the cloned NIRs are swept elsewhere.

Ah yes, you are right. We do clone it when creating variants; I'd forgotten about that. In that case please ignore my comment.

Good, I thought I really missed something...
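
To make the sweep under discussion concrete, here is a toy Python model of a hierarchical allocator with a sweep step: everything under a root is moved to a temporary context, the live allocations are stolen back, and the remainder is dropped. It is a sketch only, not Mesa code; the names Node, adopt and sweep are hypothetical, and real ralloc/NIR sweeping walks the IR to decide what is live.

```python
class Node:
    """Toy stand-in for a ralloc allocation: a size plus child allocations."""

    def __init__(self, size, parent=None):
        self.size = size
        self.children = []
        self.parent = None
        if parent is not None:
            parent.adopt(self)

    def adopt(self, child):
        # Re-parent `child` under this node, detaching it from its old parent.
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)

    def total_size(self):
        # Memory held by this node and everything below it.
        return self.size + sum(c.total_size() for c in self.children)


def sweep(root, live):
    """Keep only the allocations in `live` under `root`; return bytes dropped."""
    garbage = Node(0)                  # temporary context for everything
    for child in list(root.children):
        garbage.adopt(child)
    for node in live:                  # steal the reachable allocations back
        root.adopt(node)
    freed = garbage.total_size()       # what is left was unreachable
    garbage.children.clear()           # "free" it
    return freed


# Hypothetical shader with one live function and one stale allocation.
shader = Node(64)
func = Node(128, shader)
stale = Node(4096, shader)
freed = sweep(shader, [func])
```

In this model the stale allocation is released at sweep time instead of surviving until the root itself is destroyed, which is the memory saving the patch is after.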

Also, during my investigation of Mesa memory usage I wrote a gdb pretty printer which shows how much memory a variable holds in its ralloc context (with all its children). It is crudely written and at the moment has several limitations: it is x64 only, depends on the internal malloc implementation and other hardcoded things, and I wasn't able to nicely display the children of a variable. The pretty printer was done this way because calling a C function (e.g. malloc_usable_size) corrupts the backtrace somehow.

Example usage:

(gdb) source ralloc_info_pretty_printer.py
(gdb) backtrace
#0  brw_link_shader (ctx=0x5555558ca0a0 <size: 96432>, shProg=0x555555d850c0 <size: 528>) at brw_link.cpp:320
#1  0x00007ffff2732b6f in _mesa_glsl_link_shader (ctx=0x5555558ca0a0 <size: 96432>, prog=0x555555d850c0 <size: 528>) at program/ir_to_mesa.cpp:3174
#2  0x00007ffff25a1862 in link_program (no_error=false, shProg=0x555555d850c0 <size: 528>, ctx=0x5555558ca0a0 <size: 96432>) at main/shaderapi.c:1206
#3  link_program_error (ctx=0x5555558ca0a0 <size: 96432>, shProg=0x555555d850c0 <size: 528>) at main/shaderapi.c:1286
#4  0x00007ffff25a2f00 in _mesa_LinkProgram (programObj=3) at main/shaderapi.c:1778
#5  0x0000555555556de1 in main () at /home/danylo/Projects/shader_compilation_memory_test/test.cpp:421
(gdb) p prog->nir
$1 = 0x555555dc2b20 <size: 241296>

If there is any interest in having it in Mesa I can clean it up. You can find its code in attachment.

import sys  # needed for the sys.version_info checks below

import gdb
import gdb.types
import gdb.printing

have_python_2 = (sys.version_info[0] == 2)
have_python_3 = (sys.version_info[0] == 3)

if have_python_3:
    intptr = int
elif have_python_2:
    intptr = long

def get_ralloc_header(val):
    try:
        ralloc_header_type = gdb.lookup_type("ralloc_header")
        ralloc_header_type_pointer = ralloc_header_type.pointer()

        val_as_header_ptr = val.cast(ralloc_header_type_pointer)

        if not val_as_header_ptr:
            return None

        header_ptr = val_as_header_ptr - 1

        if not header_ptr:
            return None

        canary = header_ptr["canary"]

        if canary and int(canary) == 0x5A1106:
            return header_ptr
        else:
            return None
    except Exception:
        return None


class RAllocPrinter:
    "Pretty Printer for ralloc"
    printer_name = 'ralloc'

    def __init__(self, val, header):
        self.val = val
        self.header = header
        self.has_child = True

    def calc_alloc_size(self, header_ptr):
        headers = [header_ptr]
        size = 0

        current_inferior = gdb.selected_inferior()

        size_bits_mask = ~(0x1 | 0x2 | 0x4)

        while headers:
            current_header_ptr = headers.pop()

            child_ptr = current_header_ptr["child"]
            while intptr(child_ptr) != 0:
                canary = child_ptr["canary"]
                if not canary or int(canary) != 0x5A1106:
                    return -1

                headers.append(child_ptr)
                child_ptr = child_ptr["next"]

            #struct malloc_chunk {
                #INTERNAL_SIZE_T prev_size;  /* Size of previous chunk (if free).  */
                #INTERNAL_SIZE_T size;       /* Size in bytes, including overhead. */   <----------
                # ------USABLE MEMORY--------

            mem = current_inferior.read_memory(int(current_header_ptr) - 8, 8)

            size += mem.cast('l')[0] & size_bits_mask

        return size

    def to_string(self):
        val_size = self.calc_alloc_size(self.header)
        return hex(intptr(self.val)) + " <size: " + str(int(val_size)) + ">"

    # def children(self):
    #     try:
    #         for field in self.val.dereference().type.fields():
    #             yield field.name, self.val[field.name]
    #     except:
    #         raise StopIteration


def lookup_type(val):
    if val.type.strip_typedefs().code != gdb.TYPE_CODE_PTR or intptr(val) == 0:
        return None

    header_ptr = get_ralloc_header(val)

    if header_ptr:
        return RAllocPrinter(val, header_ptr)
    else:
        return None


gdb.pretty_printers.append(lookup_type)
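
The size computation in calc_alloc_size can be tried outside gdb. The sketch below decodes the 8-byte size field that glibc malloc stores just before the user pointer and masks off the three low flag bits, as the script does with size_bits_mask. It assumes a little-endian 64-bit target; chunk_size and the sample bytes are hypothetical and only for illustration.

```python
import struct

# The low three bits of glibc's chunk size field are flags
# (PREV_INUSE, IS_MMAPPED, NON_MAIN_ARENA), not part of the size.
FLAG_BITS = 0x1 | 0x2 | 0x4


def chunk_size(raw):
    """Decode the 8-byte size field stored just before the user pointer."""
    (field,) = struct.unpack("<Q", raw)  # assumes little-endian, 64-bit size_t
    return field & ~FLAG_BITS


# Hypothetical header bytes: chunk size 0x90 with the PREV_INUSE flag set.
raw = struct.pack("<Q", 0x90 | 0x1)
size = chunk_size(raw)
```

Reading the field as an unsigned 64-bit value avoids any sign surprises; the result includes malloc's own per-chunk overhead, so it is slightly larger than the size that was requested.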
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev