On Mon, Mar 4, 2019 at 4:43 PM Alyssa Rosenzweig <aly...@rosenzweig.io> wrote:
>
> Wooo!!!!!!!
>
> Seeing this patch in my inbox has been some of the best news all day!
>
> Without further ado, my (solicited?) comments:
>
> > +/* Copyright 2018, Linaro, Ltd., Rob Herring <r...@kernel.org> */
>
> Please triple check upstream that this license is okay; the other files
> in include/drm-uapi are BSDish.
>
> > +/* timeouts are specified in clock-monotonic absolute times (to simplify
> > + * restarting interrupted ioctls). The following struct is logically the
> > + * same as 'struct timespec' but 32/64b ABI safe.
> > + */
> > +struct drm_panfrost_timespec {
> > +	__s64 tv_sec;  /* seconds */
> > +	__s64 tv_nsec; /* nanoseconds */
> > +};
>
> What's this for -- is there not a shared struct for this if it's
> necessary? (I'm assuming this was copied from etna..?)
>
> > +	__u32 flags; /* in, mask of ETNA_BO_x */
>
> As Rob said, s/ETNA/PAN/g
>
> > +	struct drm_panfrost_timespec timeout; /* in */
>
> (Presumably we just want a timeout in one of nanoseconds / milliseconds /
> seconds, not the full timespec... 64-bit time gives a pretty wide range
> even for high-precision timeouts, which I don't know that we need. Also,
> it's signed for some reason...?)
>
> > +	struct drm_panfrost_gem_submit_atom_dep deps[2];
>
> I'm concerned about hardcoding deps to 2 here. I know the Arm driver
> does it, but I'm super uncomfortable with them doing it too. Keep in
> mind, when they have complex dependencies they insert dummy
> "dependency-only" jobs into the graph, though I concede I haven't seen
> this yet.
Why aren't we using regular dma-buf fences here? The submit ioctl should
be able to take a number of in fences to wait on and return an out fence
if requested. See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_ARRAY in
https://cgit.freedesktop.org/~airlied/linux/tree/include/uapi/drm/i915_drm.h?h=drm-next
or MSM_SUBMIT_FENCE_FD_IN/OUT in
https://cgit.freedesktop.org/~airlied/linux/tree/include/uapi/drm/msm_drm.h?h=drm-next

> I'm not at all sure what the right number is, especially since the new
> kernel interface seems to support waiting on BOs? Or am I
> misinterpreting this?
>
> Regardless, this will start to matter when we implement support for
> more complex FBOs (where various FBOs sample from various other FBOs),
> which I think can have arbitrarily complicated dependency graphs. So
> hardcoding to [2] is inappropriate if we want to use deps for waiting
> on FBOs. On the other hand, if we use a kernel-side BO wait or fences
> or something to resolve dependencies, do we even need two deps, or
> just one?
>
> > +	// TODO cache allocations
> > +	// TODO properly handle errors
> > +	// TODO take into account extra_flags
>
> Not a blocker at all for merge, but these should be taken care of
> before we drop support for the non-DRM stuff (since at least the lack
> of GROWABLE support represents a regression compared to the Arm
> driver).
>
> > +	// TODO map and unmap on demand?
>
> I don't know if this matters too much...? Usually if we're allocating
> a slab, that means we want it mapped immediately, unless we set the
> INVISIBLE flag, which means we don't want it mapped at all.
>
> Maybe we'll have fancier scenarios in the future, but at the moment I
> think we can safely cross off at least the first half of the TODO.
> As you know I'm not a believer in unmapping/freeing resources ever, so
> I don't get an opinion there ;)
>
> > +	gem_info.handle = gem_new.handle;
> > +	ret = drmIoctl(drm->fd, DRM_IOCTL_PANFROST_GEM_INFO, &gem_info);
> > +	if (ret) {
> > +		fprintf(stderr, "DRM_IOCTL_PANFROST_GEM_INFO failed: %d\n", ret);
> > +		assert(0);
> > +	}
>
> Maybe a question for Rob, but why isn't this info passed along with the
> allocation in the first place? I guess the extra roundtrip to the
> kernel isn't a huge deal, but it seems a little odd...?
>
> > +static uint8_t
> > +allocate_atom()
> > +{
> > +	/* Used to allocate atom numbers for jobs. We probably want to
> > +	 * overhaul this in kernel space at some point. */
> > +	static uint8_t atom_counter = 0;
> > +
> > +	atom_counter++;
> > +
> > +	/* Workaround quirk where atoms must be strictly positive */
> > +	if (atom_counter == 0)
> > +		atom_counter++;
> > +
> > +	return atom_counter;
> > +}
>
> So, this routine (which is copied straight from the non-DRM code) is
> specifically to work around a quirk in the Arm driver where atom
> numbers must be non-zero u8's. I'm skeptical the DRM interface needs
> this.
>
> > +	idx++;
>
> Nice one, no idea why I didn't think of doing it this way :)
>
> > +panfrost_fence_create(struct panfrost_context *ctx)
>
> I'd appreciate a link to documentation about mainline fences, since I
> have no idea what's happening here :)
>
> > +static void
> > +panfrost_drm_enable_counters(struct panfrost_screen *screen)
> > +{
> > +	fprintf(stderr, "unimplemented: %s\n", __func__);
> > +}
>
> If nobody else is taking it, would anyone mind if I play with adding
> performance counters to the kernel?
> I want to at least dip my feet into the other side of the stack, and
> that seems like a nice low-hanging fruit (especially since I actually
> grok the performance counters, more or less) :)
>
> > +	return drmSyncobjCreate(drm->fd, DRM_SYNCOBJ_CREATE_SIGNALED,
>
> (See fence question)

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev