Well, the best placement is guaranteed as long as the application doesn't do anything nonsensical (e.g. trying to allocate a buffer larger than the available VRAM).

The VM_ALWAYS_VALID flag doesn't affect any of that handling.

Regards,
Christian.

On 13.05.22 at 00:17, Marek Olšák wrote:
Would it be better to set the VM_ALWAYS_VALID flag to have a greater guarantee that the best placement will be chosen?

See, the main feature is getting the best placement, not being discardable. The best placement is a hardware design requirement because this memory is used for purposes that are expected to perform similarly to on-chip SRAM. We need to make sure the best placement is guaranteed if it's VRAM.

Marek

On Thu., May 12, 2022, 03:26 Christian König, <ckoenig.leichtzumer...@gmail.com> wrote:

    On 12.05.22 at 00:06, Marek Olšák wrote:
    3rd question: Is it worth using this on APUs?

    It makes memory management somewhat easier when we are really OOM.

    E.g. it should also work for GTT allocations and when the core
    kernel says "Hey please free something up or I will start the
    OOM-killer" it's something we can easily throw away.

    Not sure how many of those buffers we have, but marking everything
    which is temporary with that flag is probably a good idea.


    Thanks,
    Marek

    On Wed, May 11, 2022 at 5:58 PM Marek Olšák <mar...@gmail.com> wrote:

        Will the kernel keep all discardable buffers in VRAM if VRAM
        is not overcommitted by discardable buffers, or will other
        buffers also affect the placement of discardable buffers?


    Regarding the eviction pressure the buffers will be handled like
    any other buffer, but instead of preserving the content it is just
    discarded on eviction.


        Do evictions deallocate the buffer, or do they keep an
        allocation in GTT and only the copy is skipped?


    It really deallocates the backing store of the buffer, just keeps
    a dummy page array around where all entries are NULL.

    There is a patch set on the mailing list to make this a little bit
    more efficient, but even using the dummy page array should only
    have a few bytes overhead.
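The eviction behaviour described above (backing store freed, page array kept with all entries NULL) can be pictured with a small userspace toy model. This is only a sketch: `struct toy_bo`, `toy_bo_create`, `toy_bo_evict`, `toy_bo_backed_pages`, `toy_bo_destroy` and `TOY_PAGE_SIZE` are all hypothetical names invented for illustration and are not part of amdgpu or TTM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE 4096u /* hypothetical page size for the model */

/* Hypothetical model of a buffer object; not a kernel structure. */
struct toy_bo {
    bool discardable;
    size_t num_pages;
    void **pages; /* one entry per page, NULL when there is no backing store */
};

static void toy_bo_destroy(struct toy_bo *bo)
{
    if (!bo)
        return;
    if (bo->pages) {
        for (size_t i = 0; i < bo->num_pages; i++)
            free(bo->pages[i]); /* free(NULL) is a no-op */
        free(bo->pages);
    }
    free(bo);
}

static struct toy_bo *toy_bo_create(size_t num_pages, bool discardable)
{
    if (num_pages == 0)
        return NULL;

    struct toy_bo *bo = calloc(1, sizeof(*bo));
    if (!bo)
        return NULL;

    bo->pages = calloc(num_pages, sizeof(*bo->pages));
    if (!bo->pages) {
        free(bo);
        return NULL;
    }
    bo->discardable = discardable;
    bo->num_pages = num_pages;

    for (size_t i = 0; i < num_pages; i++) {
        bo->pages[i] = calloc(1, TOY_PAGE_SIZE);
        if (!bo->pages[i]) {
            toy_bo_destroy(bo);
            return NULL;
        }
    }
    return bo;
}

/*
 * Model of an eviction. A normal buffer keeps its contents (a real driver
 * would move them to another domain). A discardable buffer loses its
 * backing store, but the page array itself stays with every entry NULL.
 */
static void toy_bo_evict(struct toy_bo *bo)
{
    assert(bo != NULL);
    if (!bo->discardable)
        return;
    for (size_t i = 0; i < bo->num_pages; i++) {
        free(bo->pages[i]);
        bo->pages[i] = NULL;
    }
}

/* Count the pages that still have backing store. */
static size_t toy_bo_backed_pages(const struct toy_bo *bo)
{
    size_t n = 0;
    for (size_t i = 0; i < bo->num_pages; i++)
        if (bo->pages[i])
            n++;
    return n;
}
```

After `toy_bo_evict()`, a discardable toy buffer reports zero backed pages while its `pages` array is still allocated; a non-discardable one is left unchanged.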

    Regards,
    Christian.


        Thanks,
        Marek

        On Wed, May 11, 2022 at 3:08 AM Marek Olšák
        <mar...@gmail.com> wrote:

            OK that sounds good.

            Marek

            On Wed, May 11, 2022 at 2:04 AM Christian König
            <ckoenig.leichtzumer...@gmail.com> wrote:

                Hi Marek,

                On 10.05.22 at 22:43, Marek Olšák wrote:
                A better flag name would be:
                AMDGPU_GEM_CREATE_BEST_PLACEMENT_OR_DISCARD

                A bit long for my taste and I think the best
                placement is just a side effect.


                Marek

                On Tue, May 10, 2022 at 4:13 PM Marek Olšák
                <mar...@gmail.com> wrote:

                    Does this really guarantee VRAM placement? The
                    code doesn't say anything about that.


                Yes, see the code here:


                        diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
                        index 8b7ee1142d9a..1944ef37a61e 100644
                        --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
                        +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
                        @@ -567,6 +567,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
                                        bp->domain;
                                bo->allowed_domains = bo->preferred_domains;
                                if (bp->type != ttm_bo_type_kernel &&
                        +           !(bp->flags & AMDGPU_GEM_CREATE_DISCARDABLE) &&
                                    bo->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM)
                                        bo->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT;


                The only case where this could be circumvented is
                when you try to allocate more than physically
                available on an APU.

                E.g. you only have something like 32 MiB VRAM and
                request 64 MiB, then the GEM code will catch the
                error and fall back to GTT (IIRC).

                Regards,
                Christian.

