Re: [Intel-gfx] [PATCH] drm: Fix deadlock due to getconnector locking changes
On Sun, 22 Feb 2015, Daniel Vetter wrote:
> In
>
> daniel@phenom:~/linux/src$ git show ccfc08655

copy-paste fail?

J.

> commit ccfc08655d5fd5076828f45fb09194c070f2f63a
> Author: Rob Clark
> Date:   Thu Dec 18 16:01:48 2014 -0500
>
>     drm: tweak getconnector locking
>
> We need to extend the locking to cover connector->state reading for
> atomic drivers, but the above commit was a bit too eager and also
> included the fill_modes callback. Which on i915 on old platforms using
> load detection needs to acquire modeset locks, resulting in a deadlock
> on output probing.
>
> Reported-by: Marc Finet
> Cc: Marc Finet
> Cc: robdcl...@gmail.com
> Signed-off-by: Daniel Vetter
> ---
>  drivers/gpu/drm/drm_crtc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
> index b15d720eda4c..ce5f1193ecd6 100644
> --- a/drivers/gpu/drm/drm_crtc.c
> +++ b/drivers/gpu/drm/drm_crtc.c
> @@ -2180,7 +2180,6 @@ int drm_mode_getconnector(struct drm_device *dev, void *data,
>  	DRM_DEBUG_KMS("[CONNECTOR:%d:?]\n", out_resp->connector_id);
>
>  	mutex_lock(&dev->mode_config.mutex);
> -	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
>
>  	connector = drm_connector_find(dev, out_resp->connector_id);
>  	if (!connector) {
> @@ -2210,6 +2209,8 @@ int drm_mode_getconnector(struct drm_device *dev, void *data,
>  	out_resp->mm_height = connector->display_info.height_mm;
>  	out_resp->subpixel = connector->display_info.subpixel_order;
>  	out_resp->connection = connector->status;
> +
> +	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
>  	encoder = drm_connector_get_encoder(connector);
>  	if (encoder)
>  		out_resp->encoder_id = encoder->base.id;
> --
> 2.1.4
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Jani Nikula, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: logical-not-parenthesis gcc 5.0 fixes
On Mon, 23 Feb 2015, Chris Wilson wrote:
> On Sun, Feb 22, 2015 at 08:10:11PM +0100, François Tigeot wrote:
>> * Originally added by John Marino in DragonFly's
>>   eecf6c3c3b6f7127edd8b8f8c2a83e2f882ed0da
>>   commit.
...
>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> index 3b0fe9f..91264b2 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -13299,7 +13299,7 @@ intel_check_plane_mapping(struct intel_crtc *crtc)
>>  	val = I915_READ(reg);
>>
>>  	if ((val & DISPLAY_PLANE_ENABLE) &&
>> -	    (!!(val & DISPPLANE_SEL_PIPE_MASK) == crtc->pipe))
>> +	    (!!( (val & DISPPLANE_SEL_PIPE_MASK) == crtc->pipe) ))
>
> That's wrong.

Indeed; this has introduced a bug in DragonFly. Not that I care, but so
you know.

BR,
Jani.

-- 
Jani Nikula, Intel Open Source Technology Center
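To see why the proposed change is more than a warning fix, the two expressions can be compared in isolation. The sketch below is a standalone illustration, not the driver code: SEL_PIPE_MASK is a hypothetical single-bit mask with an arbitrary value, and the function names are made up for this example. The original form normalizes the masked bit to 0 or 1 before comparing it with the pipe number; the proposed form compares the raw masked value with the pipe number, which fails as soon as the bit is set anywhere other than bit 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pipe-select bit; the value is illustrative only. */
#define SEL_PIPE_MASK (1u << 24)

/* Original form: "!" binds tighter than "==", so the masked bit is
 * first normalized to 0 or 1 and then compared with the pipe number. */
static bool plane_on_pipe_original(uint32_t val, int pipe)
{
	return (!!(val & SEL_PIPE_MASK)) == pipe;
}

/* Proposed form: the raw masked value (0 or 1 << 24 here) is compared
 * with the pipe number, and only the comparison result is normalized. */
static bool plane_on_pipe_proposed(uint32_t val, int pipe)
{
	return !!((val & SEL_PIPE_MASK) == (uint32_t)pipe);
}
```

With the bit set and pipe equal to 1, the first function reports a match and the second does not, which is the behavior change objected to in the thread.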
Re: [Intel-gfx] [PATCH i-g-t] tests/gem_render_linear_blits: split into two subtests
> -Original Message- > From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] > Sent: Wednesday, February 18, 2015 10:46 AM > To: Gore, Tim > Cc: intel-gfx@lists.freedesktop.org; Wood, Thomas > Subject: Re: [PATCH i-g-t] tests/gem_render_linear_blits: split into two > subtests > > On Tue, Feb 17, 2015 at 11:40:17AM +, tim.g...@intel.com wrote: > > From: Tim Gore > > > > The gem_render_linear_blits test tends to get oom killed on low memory > > (< 4GB) Android systems. This is because the test tries to allocate > > (sysinfo.totalram * 9 / 10) in buffer objects and the remaining 10% of > > memory is not always enough for the Android system. > > After a discussion with Chris Wilson I have split this test into a > > "basic" and an "apperture-thrash" subtest, in the same way as > > gem_linear_blits. The basic test uses just two buffer objects and the > > apperture-thrash test is skipped if there is insuffiecient memory. > > > > Signed-off-by: Tim Gore > > --- > > tests/gem_render_linear_blits.c | 55 > > ++--- > > 1 file changed, 30 insertions(+), 25 deletions(-) > > > > diff --git a/tests/gem_render_linear_blits.c > > b/tests/gem_render_linear_blits.c index 60ba831..3a548d3 100644 > > --- a/tests/gem_render_linear_blits.c > > +++ b/tests/gem_render_linear_blits.c > > @@ -80,18 +80,14 @@ check_bo(int fd, uint32_t handle, uint32_t val) > > } > > } > > > > -int main(int argc, char **argv) > > +static void run_test (int fd, int count) > > { > > drm_intel_bufmgr *bufmgr; > > struct intel_batchbuffer *batch; > > uint32_t *start_val; > > drm_intel_bo **bo; > > uint32_t start = 0; > > - int i, j, fd, count; > > - > > - igt_simple_init(argc, argv); > > - > > - fd = drm_open_any(); > > + int i, j; > > > > render_copy = igt_get_render_copyfunc(intel_get_drm_devid(fd)); > > igt_require(render_copy); > > @@ -99,24 +95,6 @@ int main(int argc, char **argv) > > bufmgr = drm_intel_bufmgr_gem_init(fd, 4096); > > batch = intel_batchbuffer_alloc(bufmgr, intel_get_drm_devid(fd)); > > > > - 
count = 0; > > - if (igt_run_in_simulation()) > > - count = 2; > > - if (argc > 1) > > - count = atoi(argv[1]); > > - > > - if (count == 0) > > - count = 3 * gem_aperture_size(fd) / SIZE / 2; > > - else if (count < 2) { > > - igt_warn("count must be >= 2\n"); > > - return 1; > > - } > > - > > - if (count > intel_get_total_ram_mb() * 9 / 10) { > > - count = intel_get_total_ram_mb() * 9 / 10; > > - igt_info("not enough RAM to run test, reducing buffer > count\n"); > > - } > > - > > bo = malloc(sizeof(*bo)*count); > > start_val = malloc(sizeof(*start_val)*count); > > > > @@ -153,7 +131,7 @@ int main(int argc, char **argv) > > check_bo(fd, bo[i]->handle, start_val[i]); > > > > if (igt_run_in_simulation()) > > - return 0; > > + return; > > > > igt_info("Cyclic blits, backward...\n"); > > for (i = 0; i < count * 4; i++) { > > @@ -200,5 +178,32 @@ int main(int argc, char **argv) > > for (i = 0; i < count; i++) > > check_bo(fd, bo[i]->handle, start_val[i]); > > > > + intel_batchbuffer_free(batch); > > + drm_intel_bufmgr_destroy(bufmgr); > > +} > > + > > +int main(int argc, char **argv) > > +{ > > + static int fd = 0; > > static! > > > + int count=0; > > + > > + igt_subtest_init(argc, argv); > > + igt_fixture { > > + fd = drm_open_any(); > > + } > > + > > + igt_subtest("basic") { > > + run_test(fd, 2); > > + } > > + > > + igt_subtest("apperture-thrash") { > > + if (argc > 1) > > + count = atoi(argv[1]); > > With automated testing we want to perform the same test over and over > again. If it is called aperture-thrash, let's make sure we do! > > If anyone ever wants to manually run this with varying amounts of > objects: first they should consider enhancing the test to capture the scenario > of concern, then secondly add the manual option. > Hi Chris, are you suggesting that I remove the command line option for count? 
Tim > > + if (count == 0) > > + count = 3 * gem_aperture_size(fd) / SIZE / 2; > > + igt_require(count > 1); > > + intel_require_memory(count, SIZE, CHECK_RAM); > > + run_test(fd, count); > > + } > > igt_exit(); > > } > > -- > > 2.3.0 > > > > -- > Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/bdw: PCI IDs ending in 0xb are ULT.
On Wed, 21 Jan 2015, Rodrigo Vivi wrote:
> When reviewing patch that fixes VGA on BDW Halo Jani noticed that
> we also had other ULT IDs that weren't listed there.
>
> So this follow-up patch add these pci-ids as halo and fix comments
> on i915_pciids.h
>
> Cc: Jani Nikula
> Signed-off-by: Rodrigo Vivi

Pushed to -fixes, cc: stable, thanks for the patch. And sorry for the
delay.

BR,
Jani.

> ---
>  drivers/gpu/drm/i915/i915_drv.h | 1 +
>  include/drm/i915_pciids.h       | 4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index ccb5403..e60df36 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2375,6 +2375,7 @@ struct drm_i915_cmd_table {
>  				 (INTEL_DEVID(dev) & 0xFF00) == 0x0C00)
>  #define IS_BDW_ULT(dev)		(IS_BROADWELL(dev) && \
>  				 ((INTEL_DEVID(dev) & 0xf) == 0x6 ||	\
> +				 (INTEL_DEVID(dev) & 0xf) == 0xb ||	\
>  				 (INTEL_DEVID(dev) & 0xf) == 0xe))
>  #define IS_BDW_GT3(dev)		(IS_BROADWELL(dev) && \
>  				 (INTEL_DEVID(dev) & 0x00F0) == 0x0020)
> diff --git a/include/drm/i915_pciids.h b/include/drm/i915_pciids.h
> index 180ad0e..d016dc5 100644
> --- a/include/drm/i915_pciids.h
> +++ b/include/drm/i915_pciids.h
> @@ -214,9 +214,9 @@
>  	INTEL_VGA_DEVICE((((gt) - 1) << 4) | (id), info)
>
>  #define _INTEL_BDW_M_IDS(gt, info) \
> -	_INTEL_BDW_M(gt, 0x1602, info), /* ULT */ \
> +	_INTEL_BDW_M(gt, 0x1602, info), /* Halo */ \
>  	_INTEL_BDW_M(gt, 0x1606, info), /* ULT */ \
> -	_INTEL_BDW_M(gt, 0x160B, info), /* Iris */ \
> +	_INTEL_BDW_M(gt, 0x160B, info), /* ULT */ \
>  	_INTEL_BDW_M(gt, 0x160E, info) /* ULX */
>
>  #define _INTEL_BDW_D_IDS(gt, info) \
> --
> 2.1.0
>

-- 
Jani Nikula, Intel Open Source Technology Center
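The low-nibble test that the patch extends can be tried out on its own. This is a minimal sketch under stated assumptions: the function name is made up for illustration, and the IS_BROADWELL() half of the real macro is deliberately left out, so it only mirrors the device-ID arithmetic shown in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors only the (INTEL_DEVID(dev) & 0xf) comparisons of IS_BDW_ULT
 * after the patch; the platform check is omitted in this sketch. */
static bool devid_low_nibble_is_ult(uint16_t devid)
{
	uint16_t low = devid & 0xf;

	return low == 0x6 || low == 0xb || low == 0xe;
}
```

With the IDs from the i915_pciids.h hunk, 0x1606, 0x160B and 0x160E pass this check while 0x1602 does not, matching the updated comments.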
Re: [Intel-gfx] [PATCH 2/2] drm: WARN if drm_handle_vblank is called errornously
On Sun, Feb 22, 2015 at 03:11:20PM +0100, Daniel Vetter wrote:
> KMS drivers are in full control of their irq and vblank handling, if
> they get a vblank interrupt before drm_vblank_init or after
> drm_vblank_cleanup that's just a driver bug.
>
> For ums driver there's only r128 and radeon which support vblank, and
> they call drm_vblank_init in their driver load functions. Which again
> means that userspace can do whatever it wants with interrupt, vblank
> structures will always be there.
>
> So this should never happen, let's catch driver issues with a WARN_ON.
> Motivated by some discussions with Imre.
>
> v2: Use WARN_ON_ONCE as suggested by Imre.
>
> Cc: Imre Deak
> Reviewed-by: Imre Deak
> Signed-off-by: Daniel Vetter

Merged the entire series to dinq with Dave's irc-ack for the core bits.
Thanks a lot for the review feedback.
-Daniel

> ---
>  drivers/gpu/drm/drm_irq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index 3c18e522cc3b..dbece03979f3 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -1682,7 +1682,7 @@ bool drm_handle_vblank(struct drm_device *dev, int crtc)
>  	struct timeval tvblank;
>  	unsigned long irqflags;
>
> -	if (!dev->num_crtcs)
> +	if (WARN_ON_ONCE(!dev->num_crtcs))
>  		return false;
>
>  	if (WARN_ON(crtc >= dev->num_crtcs))
> --
> 1.9.3
>

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
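The pattern the patch relies on, a check that still returns the condition but only reports it the first time, can be sketched in userspace. Everything here is an assumption made for illustration: warn_on_once(), handle_vblank() and the warnings_printed counter are hypothetical stand-ins, not the kernel's WARN_ON_ONCE() macro or drm_handle_vblank().

```c
#include <stdbool.h>
#include <stdio.h>

static int warnings_printed; /* counts emitted warnings, for illustration */

/* Returns cond unchanged; reports only the first time cond is true
 * for the given "warned" flag. */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical handler with the same shape as the patched check. */
static bool handle_vblank(unsigned int num_crtcs, unsigned int crtc)
{
	static bool warned_uninit;

	if (warn_on_once(num_crtcs == 0, &warned_uninit,
			 "vblank interrupt without initialized vblank state"))
		return false;

	if (crtc >= num_crtcs)
		return false;

	return true;
}
```

A misbehaving caller is still rejected on every call, but the log only gets a single message, which is the reason given for preferring the once-only variant in v2.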
Re: [Intel-gfx] [PATCH 1/5] drm/irq: Add drm_crtc_vblank_reset
On Sun, Feb 15, 2015 at 04:08:31PM +0200, Laurent Pinchart wrote:
> Hi Daniel,
>
> Thank you for the patch.
>
> On Friday 13 February 2015 21:03:42 Daniel Vetter wrote:
> > At driver load we need to tell the vblank code about the state of the
> > pipes, so that the logic around reject vblank_get when the pipe is off
> > works correctly.
> >
> > Thus far i915 used drm_vblank_off, but one of the side-effects of it
> > is that it also saves the vblank counter. And for that it calls down
> > into the ->get_vblank_counter hook. Which isn't really a good idea
> > when the pipe is off for a few reasons:
> > - With runtime pm the register might not respond.
> > - If the pipe is off some datastructures might not be around or
> >   unitialized.
> >
> > The later is what blew up on gen3: We look at intel_crtc->config to
> > compute the vblank counter, and for a disabled pipe at boot-up that's
> > just not there. Thus far this was papered over by a check for
> > intel_crtc->active, but I want to get rid of that (since it's fairly
> > race, vblank hooks are called from all kinds of places).
> >
> > So prep for that by adding a _reset functions which only does what we
> > really need to be done at driver load: Mark the vblank pipe as off,
> > but don't do any vblank counter saving or event flushing - neither of
> > that is required.
> >
> > v2: Clarify the code flow slightly as suggested by Ville.
> >
> > Cc: Ville Syrjälä
> > Cc: Laurent Pinchart
> > Cc: Imre Deak
> > Reviewed-by: Imre Deak
> > Signed-off-by: Daniel Vetter
> > ---
> >  drivers/gpu/drm/drm_irq.c            | 32
> >  drivers/gpu/drm/i915/intel_display.c | 6 +++---
> >  include/drm/drmP.h                   | 1 +
> >  3 files changed, 36 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> > index 75647e7f012b..1e5fb1b994d7 100644
> > --- a/drivers/gpu/drm/drm_irq.c
> > +++ b/drivers/gpu/drm/drm_irq.c
> > @@ -1226,6 +1226,38 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc)
> >  EXPORT_SYMBOL(drm_crtc_vblank_off);
> >
> >  /**
> > + * drm_crtc_vblank_reset - reset vblank state to off on a CRTC
> > + * @crtc: CRTC in question
> > + *
> > + * Drivers can use this function to reset the vblank state to off at load time.
> > + * Drivers should use this together with the drm_crtc_vblank_off() and
> > + * drm_crtc_vblank_on() functions. The diffrence comparet to
>
> s/diffrence comparet/difference compared/
>
> With that fixed,
>
> Acked-by: Laurent Pinchart

Fixed while applying, thanks for the feedback.
-Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
[Intel-gfx] [PATCH v2] drm/i915: avoid processing spurious/shared interrupts in low-power states
Atm, it's possible that the interrupt handler is called when the device is in D3 or some other low-power state. It can be due to another device that is still in D0 state and shares the interrupt line with i915, or on some platforms there could be spurious interrupts even without sharing the interrupt line. The latter case was reported by Klaus Ethgen using a Lenovo x61p machine (gen 4). He noticed this issue via a system suspend/resume hang and bisected it to the following commit: commit e11aa362308f5de467ce355a2a2471321b15a35c Author: Jesse Barnes Date: Wed Jun 18 09:52:55 2014 -0700 drm/i915: use runtime irq suspend/resume in freeze/thaw This is a problem, since in low-power states IIR will always read 0x resulting in an endless IRQ servicing loop. Fix this by handling interrupts only when the driver explicitly enables them and so it's guaranteed that the interrupt registers return a valid value. Note that this issue existed even before the above commit, since during runtime suspend/resume we never unregistered the handler. v2: - clarify the purpose of smp_mb() vs. 
synchronize_irq() in the code comment(Chris) Reference: https://lkml.org/lkml/2015/2/11/205 Reported-and-bisected-by: Klaus Ethgen Signed-off-by: Imre Deak --- drivers/gpu/drm/i915/i915_irq.c | 51 + 1 file changed, 47 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 9073119..b17c953 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -1889,6 +1889,9 @@ static irqreturn_t valleyview_irq_handler(int irq, void *arg) u32 iir, gt_iir, pm_iir; irqreturn_t ret = IRQ_NONE; + if (!intel_irqs_enabled(dev_priv)) + return IRQ_NONE; + while (true) { /* Find, clear, then process each source of interrupt */ @@ -1933,6 +1936,9 @@ static irqreturn_t cherryview_irq_handler(int irq, void *arg) u32 master_ctl, iir; irqreturn_t ret = IRQ_NONE; + if (!intel_irqs_enabled(dev_priv)) + return IRQ_NONE; + for (;;) { master_ctl = I915_READ(GEN8_MASTER_IRQ) & ~GEN8_MASTER_IRQ_CONTROL; iir = I915_READ(VLV_IIR); @@ -2205,6 +2211,9 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg) u32 de_iir, gt_iir, de_ier, sde_ier = 0; irqreturn_t ret = IRQ_NONE; + if (!intel_irqs_enabled(dev_priv)) + return IRQ_NONE; + /* We get interrupts on unclaimed registers, so check for this before we * do any I915_{READ,WRITE}. 
*/ intel_uncore_check_errors(dev); @@ -2276,6 +2285,9 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg) enum pipe pipe; u32 aux_mask = GEN8_AUX_CHANNEL_A; + if (!intel_irqs_enabled(dev_priv)) + return IRQ_NONE; + if (IS_GEN9(dev)) aux_mask |= GEN9_AUX_CHANNEL_B | GEN9_AUX_CHANNEL_C | GEN9_AUX_CHANNEL_D; @@ -3768,6 +3780,9 @@ static irqreturn_t i8xx_irq_handler(int irq, void *arg) I915_DISPLAY_PLANE_A_FLIP_PENDING_INTERRUPT | I915_DISPLAY_PLANE_B_FLIP_PENDING_INTERRUPT; + if (!intel_irqs_enabled(dev_priv)) + return IRQ_NONE; + iir = I915_READ16(IIR); if (iir == 0) return IRQ_NONE; @@ -3948,6 +3963,9 @@ static irqreturn_t i915_irq_handler(int irq, void *arg) I915_DISPLAY_PLANE_B_FLIP_PENDING_INTERRUPT; int pipe, ret = IRQ_NONE; + if (!intel_irqs_enabled(dev_priv)) + return IRQ_NONE; + iir = I915_READ(IIR); do { bool irq_received = (iir & ~flip_mask) != 0; @@ -4168,6 +4186,9 @@ static irqreturn_t i965_irq_handler(int irq, void *arg) I915_DISPLAY_PLANE_A_FLIP_PENDING_INTERRUPT | I915_DISPLAY_PLANE_B_FLIP_PENDING_INTERRUPT; + if (!intel_irqs_enabled(dev_priv)) + return IRQ_NONE; + iir = I915_READ(IIR); for (;;) { @@ -4469,6 +4490,28 @@ void intel_hpd_init(struct drm_i915_private *dev_priv) spin_unlock_irq(&dev_priv->irq_lock); } +static void intel_irq_set_state(struct drm_i915_private *dev_priv, + bool enabled) +{ + dev_priv->pm.irqs_enabled = enabled; + /* +* Before we unmask or after we masked the interrupt make sure that +* any interrupt handler running on another CPU sees the updated +* irqs_enabled value. Note that when masking the interrupt the +* subsequent synchronize_irq doesn't guarantee this, because the +* interrupt handler can be called to service shared/spurious +* interrupts even after synchronize_irq. +*/ + smp_mb(); + /* +* Make sure that any interrupt handler pending on another CPU +* finishes. Otherwise such a handle
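The gating idea described in the commit message, where each handler bails out unless the driver has explicitly enabled interrupts, can be sketched outside the kernel with C11 atomics. This is only an illustration under stated assumptions: struct fake_dev, dev_set_irq_state() and dev_irq_handler() are hypothetical names, a sequentially consistent atomic store stands in for the flag write plus smp_mb(), and the synchronize_irq() step has no equivalent here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

enum irq_result { IRQ_NOT_HANDLED, IRQ_WAS_HANDLED };

/* Hypothetical device state: an "interrupts enabled" flag and a fake
 * interrupt identity register. */
struct fake_dev {
	atomic_bool irqs_enabled;
	uint32_t iir;
};

/* Publish the new state; the default seq_cst store plays the role of
 * the flag write followed by a full barrier in this sketch. */
static void dev_set_irq_state(struct fake_dev *dev, bool enabled)
{
	atomic_store(&dev->irqs_enabled, enabled);
}

/* Never touch the register while interrupts are disabled, since its
 * contents cannot be trusted in a low-power state. */
static enum irq_result dev_irq_handler(struct fake_dev *dev)
{
	if (!atomic_load(&dev->irqs_enabled))
		return IRQ_NOT_HANDLED;

	if (dev->iir == 0)
		return IRQ_NOT_HANDLED;

	dev->iir = 0; /* acknowledge everything that was pending */
	return IRQ_WAS_HANDLED;
}
```

The early return is what breaks the endless servicing loop: a bogus register value is never read, so it can never look like pending work.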
Re: [Intel-gfx] [PATCH 5/5] drm/atomic-helpers: make mode_set hooks optional
On Sun, Feb 22, 2015 at 08:17:04PM +0200, Laurent Pinchart wrote: > Hi Daniel, > > On Sunday 22 February 2015 19:53:23 Laurent Pinchart wrote: > > On Sunday 22 February 2015 12:24:20 Daniel Vetter wrote: > > > With runtime PM the hw might still be off while doing the ->mode_set > > > callbacks - runtime PM get/put should only happen in the > > > enable/disable hooks to properly support DPMS. Which essentially makes > > > these callbacks useless for drivers support runtime PM, so make them > > > optional. Again motivated by discussions with Laurent. > > > > > > Cc: Laurent Pinchart > > > Signed-off-by: Daniel Vetter > > > > I think we should go one step further and remove .mode_set() completely for > > drivers converted to atomic updates. There are two cases to consider: > > > > - Drivers that implement runtime PM can't use the .mode_set() callback for > > the reason explained above. Those drivers will thus not implement > > .mode_set() and will perform mode setting related hardware configuration in > > .enable(). > > > > - Drivers that don't implement runtime PM (we probably want to discourage > > this globally, but that's a different topic) can use the .mode_set() > > callbacks, but they could equally well perform mode setting in .enable() as > > the runtime PM- enabled drivers, without any drawback. > > > > To increase consistency, I thus believe we should get rid of .mode_set() > > completely for drivers converted to atomic updates. > > On second thought, I've confused .mode_set() and .mode_set_nofb(). > .mode_set() > still makes sense for encoders, but the above reasoning should apply in my > opinion for the CRTC .mode_set_nofb(). You're reasoning is correct, but we need to keep smooth transitioning in mind for driver coming from crtc helpers. And since those use ->mode_set(_nofb) all over the place it's imo better to keep this. 
At least until we've run out of drivers to convert ;-) We could add a DRM_INFO_ONCE though to remind drivers that they're using deprecated hooks and should convert over. I plan to submit such a patch at least for dpms/prepare/commit, maybe we could throw in ->mode_set into the mix too. > > > However, this patch is good as a first step, so if you want to apply it > > already, > > > > Acked-by: Laurent Pinchart Merged the entire series to drm-misc, thanks for the feedback. -Daniel > > > > > --- > > > > > > drivers/gpu/drm/drm_atomic_helper.c | 5 +++-- > > > include/drm/drm_crtc_helper.h | 3 ++- > > > 2 files changed, 5 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c > > > b/drivers/gpu/drm/drm_atomic_helper.c index 9fd3466bf277..5e10bcb7d98d > > > 100644 > > > --- a/drivers/gpu/drm/drm_atomic_helper.c > > > +++ b/drivers/gpu/drm/drm_atomic_helper.c > > > @@ -723,7 +723,7 @@ crtc_set_mode(struct drm_device *dev, struct > > > drm_atomic_state *old_state) > > > > > > funcs = crtc->helper_private; > > > > > > - if (crtc->state->enable) { > > > + if (crtc->state->enable && funcs->mode_set_nofb) { > > > > > > DRM_DEBUG_ATOMIC("modeset on [CRTC:%d]\n", > > > > > >crtc->base.id); > > > > > > @@ -759,7 +759,8 @@ crtc_set_mode(struct drm_device *dev, struct > > > drm_atomic_state *old_state) * Each encoder has at most one connector > > > (since we always steal * it away), so we won't call call mode_set hooks > > > twice. 
> > > > > >*/ > > > > > > - funcs->mode_set(encoder, mode, adjusted_mode); > > > + if (funcs->mode_set) > > > + funcs->mode_set(encoder, mode, adjusted_mode); > > > > > > if (encoder->bridge && encoder->bridge->funcs->mode_set) > > > > > > encoder->bridge->funcs->mode_set(encoder->bridge, > > > > > > diff --git a/include/drm/drm_crtc_helper.h b/include/drm/drm_crtc_helper.h > > > index c250a22b39ab..92d5135b55d2 100644 > > > --- a/include/drm/drm_crtc_helper.h > > > +++ b/include/drm/drm_crtc_helper.h > > > @@ -89,6 +89,7 @@ struct drm_crtc_helper_funcs { > > > > > > int (*mode_set)(struct drm_crtc *crtc, struct drm_display_mode *mode, > > > > > > struct drm_display_mode *adjusted_mode, int x, int y, > > > struct drm_framebuffer *old_fb); > > > > > > + /* Actually set the mode for atomic helpers, optional */ > > > > > > void (*mode_set_nofb)(struct drm_crtc *crtc); > > > > > > /* Move the crtc on the current fb to the given position *optional* > */ > > > > > > @@ -119,7 +120,7 @@ struct drm_crtc_helper_funcs { > > > > > > * @mode_fixup: try to fixup proposed mode for this connector > > > * @prepare: part of the disable sequence, called before the CRTC modeset > > > * @commit: called after the CRTC modeset > > > > > > - * @mode_set: set this mode > > > + * @mode_set: set this mode, optional for atomic helpers > > > > > > * @get_crtc: return CRTC that
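Making a hook optional comes down to a NULL check before the indirect call, as in the two hunks of the patch. The following is a simplified sketch, not the DRM helper code: struct encoder_hooks, apply_mode() and program_mode() are hypothetical, and the hook signatures are reduced to a single integer "mode" to keep the example self-contained.

```c
#include <stddef.h>

/* Hypothetical, simplified hook table; not the real helper vtable. */
struct encoder_hooks {
	void (*mode_set)(int *programmed_mode, int mode); /* optional, may be NULL */
	void (*enable)(int *programmed_mode, int mode);   /* expected to be set */
};

/* Call the optional hook only when the driver provides it, then enable. */
static void apply_mode(const struct encoder_hooks *hooks,
		       int *programmed_mode, int mode)
{
	if (hooks->mode_set)
		hooks->mode_set(programmed_mode, mode);

	hooks->enable(programmed_mode, mode);
}

/* Example hook body: record the mode as if it were written to hardware. */
static void program_mode(int *programmed_mode, int mode)
{
	*programmed_mode = mode;
}
```

A driver that does all of its mode programming in the enable path can then leave mode_set unset, which is the runtime PM case discussed in the thread.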
Re: [Intel-gfx] eDP display control registers in Linux kernel
On Sun, Feb 22, 2015 at 11:59 PM, Jani Nikula wrote:
>
> Hi Michael -
>
> Please always cc: the relevant mailing lists; done now.
>
> On Sun, 22 Feb 2015, Michael Leuchtenburg wrote:
>> Hi Jani,
>>
>> I've been trying to figure out how to control the dynamic backlight control
>> on my new Dell XPS 13 since the defaults are atrocious - huge swings in
>> brightness from a black background to a white one, over a few seconds
>> period so it's very obvious. I eventually tracked down a patch from the
>> Chromium folks that adds a sysfs interface (
>> https://chromium.googlesource.com/chromiumos/third_party/kernel/+/db5eacd6ac7a0cbda4ea1010fabbd3ff6b30e0bc%5E%21/),
>> but it seems to depend on your patch that adds eDP display control
>> registers (
>> http://lists.freedesktop.org/archives/dri-devel/2013-November/049098.html),
>> which was never merged into mainline as far as I can tell.
>>
>> Do you know what the status of that is? Is it still wending its way through
>> the process (now, over a year later) or did it die somewhere along the way?
>> The patch doesn't apply to mainline today, though it's simple enough that I
>> suspect it'd be easy to adapt. I'd rather see where it went, though.
>>
>> I'd appreciate any help you can offer.
>
> The Chrome OS patches wouldn't be acceptable upstream, and indeed
> they've never been posted upstream. A more driver agnostic approach
> would be required.

I actually asked Eric to make a property version of this:
https://chromium-review.googlesource.com/#/c/244165/

Once this lands in Chrome OS, we should upstream it.

Stéphane

>
> My patches were simple, adding some macros etc. They were reviewed but
> apparently forgotten, also by me. I'll repost them, but they won't do
> you much good in this case.
>
> I'm also not convinced yet that your problem would be solved by the
> patches; are you sure Dell XPS 13 does have dynamic backlight control in
> the panel, adjustable via DPCD?
>
> BR,
> Jani.
>
>
> --
> Jani Nikula, Intel Open Source Technology Center
[Intel-gfx] [PATCH 5/7] drm/i915: Remove DRIVER_MODESET checks from gem code
Hooray!

Signed-off-by: Daniel Vetter
---
 drivers/gpu/drm/i915/i915_gem.c | 14 --
 1 file changed, 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f28f0dea6c96..4e05f57b9c54 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4606,10 +4606,6 @@ i915_gem_suspend(struct drm_device *dev)
 	i915_gem_retire_requests(dev);
 
-	/* Under UMS, be paranoid and evict. */
-	if (!drm_core_check_feature(dev, DRIVER_MODESET))
-		i915_gem_evict_everything(dev);
-
 	i915_gem_stop_ringbuffers(dev);
 	mutex_unlock(&dev->struct_mutex);
 
@@ -4973,18 +4969,8 @@ i915_gem_load(struct drm_device *dev)
 			  i915_gem_idle_work_handler);
 	init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
 
-	/* On GEN3 we really need to make sure the ARB C3 LP bit is set */
-	if (!drm_core_check_feature(dev, DRIVER_MODESET) && IS_GEN3(dev)) {
-		I915_WRITE(MI_ARB_STATE,
-			   _MASKED_BIT_ENABLE(MI_ARB_C3_LP_WRITE_ENABLE));
-	}
-
 	dev_priv->relative_constants_mode = I915_EXEC_CONSTANTS_REL_GENERAL;
 
-	/* Old X drivers will take 0-2 for front, back, depth buffers */
-	if (!drm_core_check_feature(dev, DRIVER_MODESET))
-		dev_priv->fence_reg_start = 3;
-
 	if (INTEL_INFO(dev)->gen >= 7 && !IS_VALLEYVIEW(dev))
 		dev_priv->num_fence_regs = 32;
 	else if (INTEL_INFO(dev)->gen >= 4 || IS_I945G(dev) || IS_I945GM(dev) || IS_G33(dev))
-- 
2.1.4
[Intel-gfx] [PATCH 3/7] drm/i915: Remove DRIVER_MODESET checks in the gpu reset code
Again, good riddance to UMS! Signed-off-by: Daniel Vetter --- drivers/gpu/drm/i915/i915_drv.c | 49 +++-- 1 file changed, 23 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index c1a5377caff0..cc6c51107047 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -863,38 +863,35 @@ int i915_reset(struct drm_device *dev) * was running at the time of the reset (i.e. we weren't VT * switched away). */ - if (drm_core_check_feature(dev, DRIVER_MODESET)) { - /* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset */ - dev_priv->gpu_error.reload_in_reset = true; - ret = i915_gem_init_hw(dev); + /* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset */ + dev_priv->gpu_error.reload_in_reset = true; - dev_priv->gpu_error.reload_in_reset = false; + ret = i915_gem_init_hw(dev); - mutex_unlock(&dev->struct_mutex); - if (ret) { - DRM_ERROR("Failed hw init on reset %d\n", ret); - return ret; - } - - /* -* FIXME: This races pretty badly against concurrent holders of -* ring interrupts. This is possible since we've started to drop -* dev->struct_mutex in select places when waiting for the gpu. -*/ + dev_priv->gpu_error.reload_in_reset = false; - /* -* rps/rc6 re-init is necessary to restore state lost after the -* reset and the re-install of gt irqs. Skip for ironlake per -* previous concerns that it doesn't respond well to some forms -* of re-init after reset. -*/ - if (INTEL_INFO(dev)->gen > 5) - intel_enable_gt_powersave(dev); - } else { - mutex_unlock(&dev->struct_mutex); + mutex_unlock(&dev->struct_mutex); + if (ret) { + DRM_ERROR("Failed hw init on reset %d\n", ret); + return ret; } + /* +* FIXME: This races pretty badly against concurrent holders of +* ring interrupts. This is possible since we've started to drop +* dev->struct_mutex in select places when waiting for the gpu. 
+*/ + + /* +* rps/rc6 re-init is necessary to restore state lost after the +* reset and the re-install of gt irqs. Skip for ironlake per +* previous concerns that it doesn't respond well to some forms +* of re-init after reset. +*/ + if (INTEL_INFO(dev)->gen > 5) + intel_enable_gt_powersave(dev); + return 0; } -- 2.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 4/7] drm/i915: Remove irq-related FIXME in reset code
With the two-step reset counter increments which bracket the actual
reset code and the subsequent wake-up we're guaranteeing that all the
lockless waiters _will_ be woken up. And since we unconditionally bail
out of waits with -EAGAIN (or -EIO) in that case there is no risk of
lost interrupt enabling bits when the lockless wait code races against
a gpu reset.

Let's remove this FIXME as resolved then.

Signed-off-by: Daniel Vetter
---
 drivers/gpu/drm/i915/i915_drv.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index cc6c51107047..89741e6e2d08 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -878,12 +878,6 @@ int i915_reset(struct drm_device *dev)
 	}
 
 	/*
-	 * FIXME: This races pretty badly against concurrent holders of
-	 * ring interrupts. This is possible since we've started to drop
-	 * dev->struct_mutex in select places when waiting for the gpu.
-	 */
-
-	/*
 	 * rps/rc6 re-init is necessary to restore state lost after the
 	 * reset and the re-install of gt irqs. Skip for ironlake per
 	 * previous concerns that it doesn't respond well to some forms
-- 
2.1.4
[Intel-gfx] [PATCH 7/7] drm/i915: Remove DRIVER_MODESET checks from modeset code
Mostly just checks in i915-private modeset ioctls. Signed-off-by: Daniel Vetter --- drivers/gpu/drm/i915/intel_display.c | 3 --- drivers/gpu/drm/i915/intel_opregion.c | 6 ++ drivers/gpu/drm/i915/intel_overlay.c | 2 -- drivers/gpu/drm/i915/intel_sprite.c | 6 -- 4 files changed, 2 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 3b0fe9f1f3c9..253a201e20dd 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -12372,9 +12372,6 @@ int intel_get_pipe_from_crtc_id(struct drm_device *dev, void *data, struct drm_crtc *drmmode_crtc; struct intel_crtc *crtc; - if (!drm_core_check_feature(dev, DRIVER_MODESET)) - return -ENODEV; - drmmode_crtc = drm_crtc_find(dev, pipe_from_crtc_id->crtc_id); if (!drmmode_crtc) { diff --git a/drivers/gpu/drm/i915/intel_opregion.c b/drivers/gpu/drm/i915/intel_opregion.c index d8de1d5140a7..71e87abdcae7 100644 --- a/drivers/gpu/drm/i915/intel_opregion.c +++ b/drivers/gpu/drm/i915/intel_opregion.c @@ -744,10 +744,8 @@ void intel_opregion_init(struct drm_device *dev) return; if (opregion->acpi) { - if (drm_core_check_feature(dev, DRIVER_MODESET)) { - intel_didl_outputs(dev); - intel_setup_cadls(dev); - } + intel_didl_outputs(dev); + intel_setup_cadls(dev); /* Notify BIOS we are ready to handle ACPI video ext notifs. * Right now, all the events are handled by the ACPI video module. diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c index f93dfc174495..823d1d97a000 100644 --- a/drivers/gpu/drm/i915/intel_overlay.c +++ b/drivers/gpu/drm/i915/intel_overlay.c @@ -1065,7 +1065,6 @@ int intel_overlay_put_image(struct drm_device *dev, void *data, struct put_image_params *params; int ret; - /* No need to check for DRIVER_MODESET - we don't set it up then. 
*/ overlay = dev_priv->overlay; if (!overlay) { DRM_DEBUG("userspace bug: no overlay\n"); @@ -1261,7 +1260,6 @@ int intel_overlay_attrs(struct drm_device *dev, void *data, struct overlay_registers __iomem *regs; int ret; - /* No need to check for DRIVER_MODESET - we don't set it up then. */ overlay = dev_priv->overlay; if (!overlay) { DRM_DEBUG("userspace bug: no overlay\n"); diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c index f2d408dd7c15..4f8fa0534954 100644 --- a/drivers/gpu/drm/i915/intel_sprite.c +++ b/drivers/gpu/drm/i915/intel_sprite.c @@ -1301,9 +1301,6 @@ int intel_sprite_set_colorkey(struct drm_device *dev, void *data, struct intel_plane *intel_plane; int ret = 0; - if (!drm_core_check_feature(dev, DRIVER_MODESET)) - return -ENODEV; - /* Make sure we don't try to enable both src & dest simultaneously */ if ((set->flags & (I915_SET_COLORKEY_DESTINATION | I915_SET_COLORKEY_SOURCE)) == (I915_SET_COLORKEY_DESTINATION | I915_SET_COLORKEY_SOURCE)) return -EINVAL; @@ -1332,9 +1329,6 @@ int intel_sprite_get_colorkey(struct drm_device *dev, void *data, struct intel_plane *intel_plane; int ret = 0; - if (!drm_core_check_feature(dev, DRIVER_MODESET)) - return -ENODEV; - drm_modeset_lock_all(dev); plane = drm_plane_find(dev, get->plane_id); -- 2.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 6/7] drm/i915: Remove regfile code&data for UMS suspend/resume
Lots of lines to remove! Signed-off-by: Daniel Vetter --- drivers/gpu/drm/i915/i915_drv.h | 133 - drivers/gpu/drm/i915/i915_suspend.c | 215 +- drivers/gpu/drm/i915/i915_ums.c | 552 3 files changed, 2 insertions(+), 898 deletions(-) delete mode 100644 drivers/gpu/drm/i915/i915_ums.c diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 3f210aa7652e..e4a988ef7f16 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -889,150 +889,21 @@ struct intel_gmbus { }; struct i915_suspend_saved_registers { - u8 saveLBB; - u32 saveDSPACNTR; - u32 saveDSPBCNTR; u32 saveDSPARB; - u32 savePIPEACONF; - u32 savePIPEBCONF; - u32 savePIPEASRC; - u32 savePIPEBSRC; - u32 saveFPA0; - u32 saveFPA1; - u32 saveDPLL_A; - u32 saveDPLL_A_MD; - u32 saveHTOTAL_A; - u32 saveHBLANK_A; - u32 saveHSYNC_A; - u32 saveVTOTAL_A; - u32 saveVBLANK_A; - u32 saveVSYNC_A; - u32 saveBCLRPAT_A; - u32 saveTRANSACONF; - u32 saveTRANS_HTOTAL_A; - u32 saveTRANS_HBLANK_A; - u32 saveTRANS_HSYNC_A; - u32 saveTRANS_VTOTAL_A; - u32 saveTRANS_VBLANK_A; - u32 saveTRANS_VSYNC_A; - u32 savePIPEASTAT; - u32 saveDSPASTRIDE; - u32 saveDSPASIZE; - u32 saveDSPAPOS; - u32 saveDSPAADDR; - u32 saveDSPASURF; - u32 saveDSPATILEOFF; - u32 savePFIT_PGM_RATIOS; - u32 saveBLC_HIST_CTL; - u32 saveBLC_PWM_CTL; - u32 saveBLC_PWM_CTL2; - u32 saveBLC_CPU_PWM_CTL; - u32 saveBLC_CPU_PWM_CTL2; - u32 saveFPB0; - u32 saveFPB1; - u32 saveDPLL_B; - u32 saveDPLL_B_MD; - u32 saveHTOTAL_B; - u32 saveHBLANK_B; - u32 saveHSYNC_B; - u32 saveVTOTAL_B; - u32 saveVBLANK_B; - u32 saveVSYNC_B; - u32 saveBCLRPAT_B; - u32 saveTRANSBCONF; - u32 saveTRANS_HTOTAL_B; - u32 saveTRANS_HBLANK_B; - u32 saveTRANS_HSYNC_B; - u32 saveTRANS_VTOTAL_B; - u32 saveTRANS_VBLANK_B; - u32 saveTRANS_VSYNC_B; - u32 savePIPEBSTAT; - u32 saveDSPBSTRIDE; - u32 saveDSPBSIZE; - u32 saveDSPBPOS; - u32 saveDSPBADDR; - u32 saveDSPBSURF; - u32 saveDSPBTILEOFF; - u32 saveVGA0; - u32 saveVGA1; - u32 saveVGA_PD; - u32 saveVGACNTRL; 
- u32 saveADPA; u32 saveLVDS; u32 savePP_ON_DELAYS; u32 savePP_OFF_DELAYS; - u32 saveDVOA; - u32 saveDVOB; - u32 saveDVOC; u32 savePP_ON; u32 savePP_OFF; u32 savePP_CONTROL; u32 savePP_DIVISOR; - u32 savePFIT_CONTROL; - u32 save_palette_a[256]; - u32 save_palette_b[256]; u32 saveFBC_CONTROL; - u32 saveIER; - u32 saveIIR; - u32 saveIMR; - u32 saveDEIER; - u32 saveDEIMR; - u32 saveGTIER; - u32 saveGTIMR; - u32 saveFDI_RXA_IMR; - u32 saveFDI_RXB_IMR; u32 saveCACHE_MODE_0; u32 saveMI_ARB_STATE; u32 saveSWF0[16]; u32 saveSWF1[16]; u32 saveSWF2[3]; - u8 saveMSR; - u8 saveSR[8]; - u8 saveGR[25]; - u8 saveAR_INDEX; - u8 saveAR[21]; - u8 saveDACMASK; - u8 saveCR[37]; uint64_t saveFENCE[I915_MAX_NUM_FENCES]; - u32 saveCURACNTR; - u32 saveCURAPOS; - u32 saveCURABASE; - u32 saveCURBCNTR; - u32 saveCURBPOS; - u32 saveCURBBASE; - u32 saveCURSIZE; - u32 saveDP_B; - u32 saveDP_C; - u32 saveDP_D; - u32 savePIPEA_GMCH_DATA_M; - u32 savePIPEB_GMCH_DATA_M; - u32 savePIPEA_GMCH_DATA_N; - u32 savePIPEB_GMCH_DATA_N; - u32 savePIPEA_DP_LINK_M; - u32 savePIPEB_DP_LINK_M; - u32 savePIPEA_DP_LINK_N; - u32 savePIPEB_DP_LINK_N; - u32 saveFDI_RXA_CTL; - u32 saveFDI_TXA_CTL; - u32 saveFDI_RXB_CTL; - u32 saveFDI_TXB_CTL; - u32 savePFA_CTL_1; - u32 savePFB_CTL_1; - u32 savePFA_WIN_SZ; - u32 savePFB_WIN_SZ; - u32 savePFA_WIN_POS; - u32 savePFB_WIN_POS; - u32 savePCH_DREF_CONTROL; - u32 saveDISP_ARB_CTL; - u32 savePIPEA_DATA_M1; - u32 savePIPEA_DATA_N1; - u32 savePIPEA_LINK_M1; - u32 savePIPEA_LINK_N1; - u32 savePIPEB_DATA_M1; - u32 savePIPEB_DATA_N1; - u32 savePIPEB_LINK_M1; - u32 savePIPEB_LINK_N1; - u32 saveMCHBAR_RENDER_STANDBY; u32 savePCH_PORT_HOTPLUG; u16 saveGCDGMBUS; }; @@ -3124,10 +2995,6 @@ int i915_parse_cmds(struct intel_engine_cs *ring, extern int i915_save_state(struct drm_device *dev); extern int i915_restore_state(struct drm_device *dev); -/* i915_ums.c */ -void i915_save_display_reg(struct drm_device *dev); -void i915_restore_display_reg(struct drm_device *dev); - /* i915_sysfs.c 
*/ void
[Intel-gfx] [PATCH 1/7] drm/i915: Remove DRIVER_MODESET checks in load/unload/close code
UMS is gone, this is dead code. Signed-off-by: Daniel Vetter --- drivers/gpu/drm/i915/i915_dma.c | 95 - 1 file changed, 37 insertions(+), 58 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 5804aa5f9df0..63a001b6abed 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -638,17 +638,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) info = (struct intel_device_info *) flags; - /* Refuse to load on gen6+ without kms enabled. */ - if (info->gen >= 6 && !drm_core_check_feature(dev, DRIVER_MODESET)) { - DRM_INFO("Your hardware requires kernel modesetting (KMS)\n"); - DRM_INFO("See CONFIG_DRM_I915_KMS, nomodeset, and i915.modeset parameters\n"); - return -ENODEV; - } - - /* UMS needs agp support. */ - if (!drm_core_check_feature(dev, DRIVER_MODESET) && !dev->agp) - return -EINVAL; - dev_priv = kzalloc(sizeof(*dev_priv), GFP_KERNEL); if (dev_priv == NULL) return -ENOMEM; @@ -718,20 +707,18 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) if (ret) goto out_regs; - if (drm_core_check_feature(dev, DRIVER_MODESET)) { - /* WARNING: Apparently we must kick fbdev drivers before vgacon, -* otherwise the vga fbdev driver falls over. */ - ret = i915_kick_out_firmware_fb(dev_priv); - if (ret) { - DRM_ERROR("failed to remove conflicting framebuffer drivers\n"); - goto out_gtt; - } + /* WARNING: Apparently we must kick fbdev drivers before vgacon, +* otherwise the vga fbdev driver falls over. 
*/ + ret = i915_kick_out_firmware_fb(dev_priv); + if (ret) { + DRM_ERROR("failed to remove conflicting framebuffer drivers\n"); + goto out_gtt; + } - ret = i915_kick_out_vgacon(dev_priv); - if (ret) { - DRM_ERROR("failed to remove conflicting VGA console\n"); - goto out_gtt; - } + ret = i915_kick_out_vgacon(dev_priv); + if (ret) { + DRM_ERROR("failed to remove conflicting VGA console\n"); + goto out_gtt; } pci_set_master(dev->pdev); @@ -835,12 +822,10 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) intel_power_domains_init(dev_priv); - if (drm_core_check_feature(dev, DRIVER_MODESET)) { - ret = i915_load_modeset_init(dev); - if (ret < 0) { - DRM_ERROR("failed to init modeset\n"); - goto out_power_well; - } + ret = i915_load_modeset_init(dev); + if (ret < 0) { + DRM_ERROR("failed to init modeset\n"); + goto out_power_well; } /* @@ -929,28 +914,25 @@ int i915_driver_unload(struct drm_device *dev) acpi_video_unregister(); - if (drm_core_check_feature(dev, DRIVER_MODESET)) - intel_fbdev_fini(dev); + intel_fbdev_fini(dev); drm_vblank_cleanup(dev); - if (drm_core_check_feature(dev, DRIVER_MODESET)) { - intel_modeset_cleanup(dev); - - /* -* free the memory space allocated for the child device -* config parsed from VBT -*/ - if (dev_priv->vbt.child_dev && dev_priv->vbt.child_dev_num) { - kfree(dev_priv->vbt.child_dev); - dev_priv->vbt.child_dev = NULL; - dev_priv->vbt.child_dev_num = 0; - } + intel_modeset_cleanup(dev); - vga_switcheroo_unregister_client(dev->pdev); - vga_client_register(dev->pdev, NULL, NULL, NULL); + /* +* free the memory space allocated for the child device +* config parsed from VBT +*/ + if (dev_priv->vbt.child_dev && dev_priv->vbt.child_dev_num) { + kfree(dev_priv->vbt.child_dev); + dev_priv->vbt.child_dev = NULL; + dev_priv->vbt.child_dev_num = 0; } + vga_switcheroo_unregister_client(dev->pdev); + vga_client_register(dev->pdev, NULL, NULL, NULL); + /* Free error state after interrupts are fully disabled. 
*/ cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work); i915_destroy_error_state(dev); @@ -960,17 +942,15 @@ int i915_driver_unload(struct drm_device *dev) intel_opregion_fini(dev); - if (drm_core_check_feature(dev, DRIVER_MODESET)) { - /* Flush any outstanding unpin_work. */ - flush_workqueue(dev_priv->wq); + /* Flush any outstanding unpin_work. */ + flus
[Intel-gfx] [PATCH 2/7] drm/i915: Remove DRIVER_MODESET checks from suspend/resume code
UMS is dead, yay! Signed-off-by: Daniel Vetter --- drivers/gpu/drm/i915/i915_drv.c | 113 ++-- 1 file changed, 52 insertions(+), 61 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index ba6862f5b6b2..c1a5377caff0 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -574,6 +574,7 @@ static int i915_drm_suspend(struct drm_device *dev) struct drm_i915_private *dev_priv = dev->dev_private; struct drm_crtc *crtc; pci_power_t opregion_target_state; + int error; /* ignore lid events during suspend */ mutex_lock(&dev_priv->modeset_restore_lock); @@ -588,37 +589,32 @@ static int i915_drm_suspend(struct drm_device *dev) pci_save_state(dev->pdev); - /* If KMS is active, we do the leavevt stuff here */ - if (drm_core_check_feature(dev, DRIVER_MODESET)) { - int error; - - error = i915_gem_suspend(dev); - if (error) { - dev_err(&dev->pdev->dev, - "GEM idle failed, resume might fail\n"); - return error; - } + error = i915_gem_suspend(dev); + if (error) { + dev_err(&dev->pdev->dev, + "GEM idle failed, resume might fail\n"); + return error; + } - intel_suspend_gt_powersave(dev); + intel_suspend_gt_powersave(dev); - /* -* Disable CRTCs directly since we want to preserve sw state -* for _thaw. Also, power gate the CRTC power wells. -*/ - drm_modeset_lock_all(dev); - for_each_crtc(dev, crtc) - intel_crtc_control(crtc, false); - drm_modeset_unlock_all(dev); + /* +* Disable CRTCs directly since we want to preserve sw state +* for _thaw. Also, power gate the CRTC power wells. 
+*/ + drm_modeset_lock_all(dev); + for_each_crtc(dev, crtc) + intel_crtc_control(crtc, false); + drm_modeset_unlock_all(dev); - intel_dp_mst_suspend(dev); + intel_dp_mst_suspend(dev); - intel_runtime_pm_disable_interrupts(dev_priv); - intel_hpd_cancel_work(dev_priv); + intel_runtime_pm_disable_interrupts(dev_priv); + intel_hpd_cancel_work(dev_priv); - intel_suspend_encoders(dev_priv); + intel_suspend_encoders(dev_priv); - intel_suspend_hw(dev); - } + intel_suspend_hw(dev); i915_gem_suspend_gtt_mappings(dev); @@ -690,53 +686,48 @@ static int i915_drm_resume(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; - if (drm_core_check_feature(dev, DRIVER_MODESET)) { - mutex_lock(&dev->struct_mutex); - i915_gem_restore_gtt_mappings(dev); - mutex_unlock(&dev->struct_mutex); - } + mutex_lock(&dev->struct_mutex); + i915_gem_restore_gtt_mappings(dev); + mutex_unlock(&dev->struct_mutex); i915_restore_state(dev); intel_opregion_setup(dev); - /* KMS EnterVT equivalent */ - if (drm_core_check_feature(dev, DRIVER_MODESET)) { - intel_init_pch_refclk(dev); - drm_mode_config_reset(dev); + intel_init_pch_refclk(dev); + drm_mode_config_reset(dev); - mutex_lock(&dev->struct_mutex); - if (i915_gem_init_hw(dev)) { - DRM_ERROR("failed to re-initialize GPU, declaring wedged!\n"); - atomic_set_mask(I915_WEDGED, &dev_priv->gpu_error.reset_counter); - } - mutex_unlock(&dev->struct_mutex); + mutex_lock(&dev->struct_mutex); + if (i915_gem_init_hw(dev)) { + DRM_ERROR("failed to re-initialize GPU, declaring wedged!\n"); + atomic_set_mask(I915_WEDGED, &dev_priv->gpu_error.reset_counter); + } + mutex_unlock(&dev->struct_mutex); - /* We need working interrupts for modeset enabling ... */ - intel_runtime_pm_enable_interrupts(dev_priv); + /* We need working interrupts for modeset enabling ... 
*/ + intel_runtime_pm_enable_interrupts(dev_priv); - intel_modeset_init_hw(dev); + intel_modeset_init_hw(dev); - spin_lock_irq(&dev_priv->irq_lock); - if (dev_priv->display.hpd_irq_setup) - dev_priv->display.hpd_irq_setup(dev); - spin_unlock_irq(&dev_priv->irq_lock); + spin_lock_irq(&dev_priv->irq_lock); + if (dev_priv->display.hpd_irq_setup) + dev_priv->display.hpd_irq_setup(dev); + spin_unlock_irq(&dev_priv->irq_lock); - drm_modeset_lock_all(dev); - intel_modeset_setup_hw_
Re: [Intel-gfx] [PATCH 3/4] drm/i915: Flatten DRIVER_MODESET checks in i915_irq.c
Hi Dave,

On Thu, 2015-02-19 at 15:42 +0200, Imre Deak wrote:
> On Thu, 2015-02-19 at 15:39 +0200, Imre Deak wrote:
> > On Thu, 2015-02-19 at 12:25 +, Dave Gordon wrote:
> > > On 12/02/15 22:38, Imre Deak wrote:
> > > > On Tue, 2015-02-03 at 11:30 +0100, Daniel Vetter wrote:
> > > >> UMS is no more!
> > > >>
> > > >> Signed-off-by: Daniel Vetter
> > >
> > > Some machines now won't boot in "recovery mode", which specifies
> > > "nomodeset" and therefore results in various important bits of code not
> > > being executed. Will we eventually ignore "modeset" completely, or just
> > > refuse to load at all if "nomodeset" is explicitly specified?
> >
> > The driver will already refuse to load with nomodeset for GEN6+ for
> > quite some time now. On old platforms UMS would still work before this
> > patch, but afaik there was a decision to stop supporting UMS. Note that
> > this doesn't mean "recovery mode" or equivalently nomodeset will break
> > booting, it just means user space will fall back to vesa/vga or text
> > mode.
>
> Ah, or did you mean after this patch we should refuse loading the driver
> in case of nomodeset even for old platforms? That would make sense
> indeed.

I was wrong here, I was thinking only about the GEN6 MODESET check in
i915_driver_load. As Daniel pointed out on IRC, in addition to that we
also silently fail to load the driver in i915_init for !MODESET, always
and regardless of the platform, so the check in i915_driver_load is
redundant. Based on this it's safe to remove the !MODESET parts.

--Imre
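To make the redundancy argument above concrete, here is a small userspace sketch. It assumes a two-stage load path where the module-level stage bails out for !MODESET before the per-device stage is ever reached; the fake_ names and the struct are hypothetical stand-ins, not the real i915_init or i915_driver_load.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the information the two checks look at. */
struct fake_device {
	int gen;	/* hardware generation */
	bool modeset;	/* false models booting with "nomodeset" */
};

/* Models the module-level check: without modesetting the driver is
 * never registered, on any platform. */
static bool fake_module_init(const struct fake_device *dev)
{
	return dev->modeset;
}

/* Models the old per-device check that the patch removes. */
static int fake_driver_load_check(const struct fake_device *dev)
{
	if (dev->gen >= 6 && !dev->modeset)
		return -ENODEV;
	return 0;
}

/* Returns true if the driver ends up bound to the device. */
static bool fake_probe(const struct fake_device *dev)
{
	int ret;

	if (!fake_module_init(dev))
		return false;	/* the per-device stage is never called */

	/* modeset is known to be true here, so this check cannot fire */
	ret = fake_driver_load_check(dev);
	assert(ret == 0);
	return ret == 0;
}
```

In this model the per-device check only runs once modeset is already known to be set, which is the sense in which the quoted check in i915_driver_load is redundant.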
Re: [Intel-gfx] [PATCH i-g-t] tests/gem_render_linear_blits: split into two subtests
On Mon, Feb 23, 2015 at 09:25:22AM +, Gore, Tim wrote:
> > > +	igt_subtest("apperture-thrash") {
> > > +		if (argc > 1)
> > > +			count = atoi(argv[1]);
> >
> > With automated testing we want to perform the same test over and over
> > again. If it is called aperture-thrash, let's make sure we do!
> >
> > If anyone ever wants to manually run this with varying amounts of
> > objects: first they should consider enhancing the test to capture the
> > scenario of concern, then secondly add the manual option.
> >
>
> Hi Chris, are you suggesting that I remove the command line option for count?

Yes, I don't think it makes much sense to keep it. Whilst you are here,
could you add a third subtest for "swap-thrash" (the check is
intel_require_memory(CHECK_RAM | CHECK_SWAP), and I'll let you work out
how to compute count :)
-Chris

>

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH v2] drm/i915: avoid processing spurious/shared interrupts in low-power states
On Mon, Feb 23, 2015 at 11:58:14AM +0200, Imre Deak wrote:
> Atm, it's possible that the interrupt handler is called when the device
> is in D3 or some other low-power state. It can be due to another device
> that is still in D0 state and shares the interrupt line with i915, or on
> some platforms there could be spurious interrupts even without sharing
> the interrupt line. The latter case was reported by Klaus Ethgen using a
> Lenovo x61p machine (gen 4). He noticed this issue via a system
> suspend/resume hang and bisected it to the following commit:
>
> commit e11aa362308f5de467ce355a2a2471321b15a35c
> Author: Jesse Barnes
> Date: Wed Jun 18 09:52:55 2014 -0700
>
> drm/i915: use runtime irq suspend/resume in freeze/thaw
>
> This is a problem, since in low-power states IIR will always read
> 0x resulting in an endless IRQ servicing loop.
>
> Fix this by handling interrupts only when the driver explicitly enables
> them and so it's guaranteed that the interrupt registers return a valid
> value.
>
> Note that this issue existed even before the above commit, since during
> runtime suspend/resume we never unregistered the handler.
>
> v2:
> - clarify the purpose of smp_mb() vs. synchronize_irq() in the
>   code comment (Chris)
>
> Reference: https://lkml.org/lkml/2015/2/11/205
> Reported-and-bisected-by: Klaus Ethgen
> Signed-off-by: Imre Deak

Thanks! That comment makes it very clear.
Reviewed-by: Chris Wilson
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
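The fix described in the quoted commit message boils down to a guard at the top of the interrupt handler. A minimal userspace sketch of that idea follows. The fake_ names, the struct layout and the way the register is modelled are assumptions of this sketch and are not taken from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not the driver's types. */
enum fake_irqreturn { FAKE_IRQ_NONE, FAKE_IRQ_HANDLED };

struct fake_i915 {
	bool irqs_enabled;	/* set only while the driver has irqs enabled */
	uint32_t iir;		/* pretend interrupt identity register */
	unsigned int reg_reads;	/* how often the register was touched */
};

static uint32_t fake_read_iir(struct fake_i915 *i915)
{
	/* Reading the register of a powered-down device is the bug. */
	assert(i915->irqs_enabled);
	i915->reg_reads++;
	return i915->iir;
}

static enum fake_irqreturn fake_irq_handler(struct fake_i915 *i915)
{
	uint32_t iir;

	/* Shared or spurious interrupt while the device is asleep:
	 * report "not ours" without touching any register. */
	if (!i915->irqs_enabled)
		return FAKE_IRQ_NONE;

	iir = fake_read_iir(i915);
	if (!iir)
		return FAKE_IRQ_NONE;

	i915->iir = 0;	/* acknowledge everything we saw */
	return FAKE_IRQ_HANDLED;
}
```

The point of the guard is that the handler can return without ever reading a register whose value cannot be trusted, so a shared interrupt line cannot drive it into the servicing loop described above.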
[Intel-gfx] [PATCH] drm/i915: Add debugfs entry for DRRS
From: Vandana Kannan Adding a debugfs entry to determine if DRRS is supported or not V2: [By Ram]: Following details about the active crtc will be filled in seq-file of the debugfs 1. Encoder output type 2. DRRS Support on this CRTC 3. DRRS current state 4. Current Vrefresh Format is as follows: CRTC 1: Output: eDP, DRRS Supported: Yes (Seamless), DRRS_State: DRRS_HIGH_RR, Vrefresh: 60 CRTC 2: Output: HDMI, DRRS Supported : No, VBT DRRS_type: Seamless CRTC 1: Output: eDP, DRRS Supported: Yes (Seamless), DRRS_State: DRRS_LOW_RR, Vrefresh: 40 CRTC 2: Output: HDMI, DRRS Supported : No, VBT DRRS_type: Seamless V3: [By Ram]: Readability is improved. Another error case is covered [Daniel] V4: [By Ram]: Current status of the Idleness DRRS along with the Front buffer bits are added to the debugfs. [Rodrigo] V5: [By Ram]: Rephrased to make it easy to understand. And format is modified. [Rodrigo] Signed-off-by: Vandana Kannan Signed-off-by: Ramalingam C --- drivers/gpu/drm/i915/i915_debugfs.c | 113 +++ 1 file changed, 113 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 164fa82..e51001c 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2869,6 +2869,118 @@ static int i915_ddb_info(struct seq_file *m, void *unused) return 0; } +static void drrs_status_per_crtc(struct seq_file *m, + struct drm_device *dev, struct intel_crtc *intel_crtc) +{ + struct intel_encoder *intel_encoder; + struct drm_i915_private *dev_priv = dev->dev_private; + struct i915_drrs *drrs = &dev_priv->drrs; + int vrefresh = 0; + u32 work_status; + + for_each_encoder_on_crtc(dev, &intel_crtc->base, intel_encoder) { + /* Encoder connected on this CRTC */ + switch (intel_encoder->type) { + case INTEL_OUTPUT_EDP: + seq_puts(m, "eDP:\n"); + break; + case INTEL_OUTPUT_DSI: + seq_puts(m, "DSI:\n"); + break; + case INTEL_OUTPUT_HDMI: + seq_puts(m, "HDMI:\n"); + break; + case INTEL_OUTPUT_DISPLAYPORT: + seq_puts(m, 
"DP:\n"); + break; + default: + seq_printf(m, "Other encoder (id=%d).\n", + intel_encoder->type); + return; + } + } + + if (dev_priv->vbt.drrs_type == STATIC_DRRS_SUPPORT) + seq_puts(m, "\tVBT: DRRS_type: Static"); + else if (dev_priv->vbt.drrs_type == SEAMLESS_DRRS_SUPPORT) + seq_puts(m, "\tVBT: DRRS_type: Seamless"); + else if (dev_priv->vbt.drrs_type == DRRS_NOT_SUPPORTED) + seq_puts(m, "\tVBT: DRRS_type: None"); + else + seq_puts(m, "\tVBT: DRRS_type: FIXME: Unrecognized Value"); + + seq_puts(m, "\n\n"); + + if (intel_crtc->config->has_drrs) { + struct intel_panel *panel; + + panel = &drrs->dp->attached_connector->panel; + /* DRRS Supported */ + seq_puts(m, "\tDRRS Supported: Yes\n"); + seq_printf(m, "\t\tBusy_frontbuffer_bits: 0x%X", + drrs->busy_frontbuffer_bits); + + seq_puts(m, "\n\t\t"); + work_status = work_busy(&drrs->work.work); + if (drrs->busy_frontbuffer_bits) { + seq_puts(m, "Front buffer: Busy.\n"); + seq_puts(m, "\t\tIdleness DRRS: Disabled"); + } else { + seq_puts(m, "Front buffer: Idle"); + seq_puts(m, "\n\t\t"); + if (drrs->refresh_rate_type == DRRS_HIGH_RR) { + if (work_status) + seq_puts(m, "Idleness DRRS: Enabled"); + else + seq_puts(m, "Idleness DRRS: Disabled"); + } else if (drrs->refresh_rate_type == DRRS_LOW_RR) { + seq_puts(m, "Idleness DRRS: Enabled"); + } + } + + seq_puts(m, "\n\t\t"); + if (drrs->refresh_rate_type == DRRS_HIGH_RR) { + seq_puts(m, "DRRS_State: DRRS_HIGH_RR\n"); + vrefresh = panel->fixed_mode->vrefresh; + } else if (drrs->refresh_rate_type == DRRS_LOW_RR) { + seq_puts(m, "DRRS_State: DRRS_LOW_RR\n"); + vrefresh = panel->downclock_mode->vrefresh; + } else { + seq_printf(m, "DRRS_State: Unknown(%d)\n", +
[Intel-gfx] [PATCH] drm/i915: Enhancing eDP DRRS debug message
When Downclock mode is not found, the same info is added to the
corresponding debug log.

Signed-off-by: Ramalingam C
---
 drivers/gpu/drm/i915/intel_dp.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index e9862e7..8d674f4 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -5083,7 +5083,7 @@ intel_dp_drrs_init(struct intel_connector *intel_connector,
 					(dev, fixed_mode, connector);
 
 	if (!downclock_mode) {
-		DRM_DEBUG_KMS("DRRS not supported\n");
+		DRM_DEBUG_KMS("Downclock mode is not found. DRRS not supported\n");
 		return NULL;
 	}
 
-- 
1.7.9.5
Re: [Intel-gfx] [PATCH] drm: Fix deadlock due to getconnector locking changes
On Sun, Feb 22, 2015 at 5:38 AM, Daniel Vetter wrote:
> In
>
> daniel@phenom:~/linux/src$ git show ccfc08655
> commit ccfc08655d5fd5076828f45fb09194c070f2f63a
> Author: Rob Clark
> Date: Thu Dec 18 16:01:48 2014 -0500
>
> drm: tweak getconnector locking
>
> We need to extend the locking to cover connector->state reading for
> atomic drivers, but the above commit was a bit too eager and also
> included the fill_modes callback. Which on i915 on old platforms using
> load detection needs to acquire modeset locks, resulting in a deadlock
> on output probing.
>
> Reported-by: Marc Finet
> Cc: Marc Finet
> Cc: robdcl...@gmail.com
> Signed-off-by: Daniel Vetter

Reviewed-by: Rob Clark

> ---
>  drivers/gpu/drm/drm_crtc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
> index b15d720eda4c..ce5f1193ecd6 100644
> --- a/drivers/gpu/drm/drm_crtc.c
> +++ b/drivers/gpu/drm/drm_crtc.c
> @@ -2180,7 +2180,6 @@ int drm_mode_getconnector(struct drm_device *dev, void *data,
>  	DRM_DEBUG_KMS("[CONNECTOR:%d:?]\n", out_resp->connector_id);
>
>  	mutex_lock(&dev->mode_config.mutex);
> -	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
>
>  	connector = drm_connector_find(dev, out_resp->connector_id);
>  	if (!connector) {
> @@ -2210,6 +2209,8 @@ int drm_mode_getconnector(struct drm_device *dev, void *data,
>  	out_resp->mm_height = connector->display_info.height_mm;
>  	out_resp->subpixel = connector->display_info.subpixel_order;
>  	out_resp->connection = connector->status;
> +
> +	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
>  	encoder = drm_connector_get_encoder(connector);
>  	if (encoder)
>  		out_resp->encoder_id = encoder->base.id;
> --
> 2.1.4
>
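The lock ordering problem in the quoted patch can be reproduced with a toy non-recursive lock in userspace. This is a sketch only: the toy_ names are hypothetical, and the toy lock reports -EDEADLK at the point where a real mutex taken twice by the same task would simply hang.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock; not a DRM or pthread primitive. */
struct toy_lock {
	bool held;
};

static int toy_lock_acquire(struct toy_lock *lock)
{
	if (lock->held)
		return -EDEADLK;	/* a real mutex would block forever */
	lock->held = true;
	return 0;
}

static void toy_lock_release(struct toy_lock *lock)
{
	assert(lock->held);
	lock->held = false;
}

/* Stand-in for a fill_modes callback that does load detection and
 * therefore needs the connection lock for itself. */
static int toy_fill_modes(struct toy_lock *connection_lock)
{
	int ret = toy_lock_acquire(connection_lock);

	if (ret)
		return ret;
	/* ... probe the output here ... */
	toy_lock_release(connection_lock);
	return 0;
}

/* Ordering before the fix: the lock is held across the probe. */
static int toy_getconnector_before(struct toy_lock *connection_lock)
{
	int ret = toy_lock_acquire(connection_lock);

	if (ret)
		return ret;
	ret = toy_fill_modes(connection_lock);	/* fails: lock already held */
	toy_lock_release(connection_lock);
	return ret;
}

/* Ordering after the fix: probe first, lock only for the state read. */
static int toy_getconnector_after(struct toy_lock *connection_lock)
{
	int ret = toy_fill_modes(connection_lock);

	if (ret)
		return ret;
	ret = toy_lock_acquire(connection_lock);
	if (ret)
		return ret;
	/* ... read the encoder and connector state here ... */
	toy_lock_release(connection_lock);
	return 0;
}
```

Moving the acquire below the probe keeps the state read covered while leaving the callback free to take the lock itself, which is the shape of the two-hunk diff quoted above.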
Re: [Intel-gfx] [BISECTED REGRESSION in 3.19-rc1] [drm/i915] WARNING: drivers/gpu/drm/drm_irq.c:1077 drm_wait_one_vblank
On Mon, 16 Feb 2015, Paul Bolle wrote:
> I still see this on v3.19. I booted with drm.debug=0x4. The almost 2K
> lines in dmesg containing either "[drm" or this WARNING are pasted
> below. I really know nothing about all this, but I do note that only the
> WARNINGS are preceded by:
> [drm:intel_calculate_wm] FIFO watermark level: -5
> [drm:i9xx_update_wm] FIFO watermarks - A: 26, B: 8
>
> But perhaps that's another symptom of the same issue. A bit of staring
> at the code couldn't help me determine that.
>
> Perhaps these debug messages help someone in discovering what might be
> going on here.

Please try v4.0-rc1 or try cherry-picking this on top of v3.19 and
report back:

commit f9b61ff6bce9a44555324b29e593fdffc9a115bc
Author: Daniel Vetter
Date: Wed Jan 7 13:54:39 2015 +0100

drm/i915: Push vblank enable/disable past encoder->enable/disable

BR,
Jani.

-- 
Jani Nikula, Intel Open Source Technology Center
Re: [Intel-gfx] [BISECTED REGRESSION in 3.19-rc1] [drm/i915] WARNING: drivers/gpu/drm/drm_irq.c:1077 drm_wait_one_vblank
On Mon, 23 Feb 2015, Jani Nikula wrote:
> On Mon, 16 Feb 2015, Paul Bolle wrote:
>> I still see this on v3.19. I booted with drm.debug=0x4. The almost 2K
>> lines in dmesg containing either "[drm" or this WARNING are pasted
>> below. I really know nothing about all this, but I do note that only the
>> WARNINGS are preceded by:
>> [drm:intel_calculate_wm] FIFO watermark level: -5
>> [drm:i9xx_update_wm] FIFO watermarks - A: 26, B: 8
>>
>> But perhaps that's another symptom of the same issue. A bit of staring
>> at the code couldn't help me determine that.
>>
>> Perhaps these debug messages help someone in discovering what might be
>> going on here.
>
> Please try v4.0-rc1 or try cherry-picking this on top of v3.19 and
> report back:
>
> commit f9b61ff6bce9a44555324b29e593fdffc9a115bc
> Author: Daniel Vetter
> Date: Wed Jan 7 13:54:39 2015 +0100
>
> drm/i915: Push vblank enable/disable past encoder->enable/disable

https://bugs.freedesktop.org/show_bug.cgi?id=89108

>
> BR,
> Jani.
>
> --
> Jani Nikula, Intel Open Source Technology Center

-- 
Jani Nikula, Intel Open Source Technology Center
Re: [Intel-gfx] [PATCH] drm/i915: Push vblank enable/disable past encoder->enable/disable
On Mon, 16 Feb 2015, Jani Nikula wrote: > On Wed, 07 Jan 2015, Daniel Vetter wrote: >> It is platform/output dependent when exactly the pipe will start >> running. Sometimes we just need the (cpu) pipe enabled, in other cases >> the pch transcoder is enough and in yet other cases the (DP) port is >> sending the frame start signal. >> >> In a perfect world we'd put the drm_crtc_vblank_on call exactly where >> the pipe starts running, but due to cloning and similar things this >> will get messy. And the current approach of picking the most >> conservative place for all combinations also doesn't work since that >> results in legit vblank waits (in encoder->enable hooks, e.g. the 2 >> vblank waits for sdvo) failing. >> >> Completely going back to the old world before >> >> commit 51e31d49c89055299e34b8f44d13f70e19d1 >> Author: Daniel Vetter >> Date: Mon Sep 15 12:36:02 2014 +0200 >> >> drm/i915: Use generic vblank wait# Please enter the commit message for >> your changes. Lines starting >> >> isn't great either since screaming when the vblank wait work because >> the pipe is off is kinda nice. >> >> Pick a compromise and move the drm_crtc_vblank_on right before the >> encoder->enable call. This is a lie on some outputs/platforms, but >> after the ->enable callback the pipe is guaranteed to run everywhere. >> So not that bad really. Suggested by Ville. >> >> v2: Same treatment for drm_crtc_vblank_off and encoder->disable: I've >> missed the ibx pipe B select w/a, which also has a vblank wait in the >> disable function (while the pipe is obviously still running). >> >> Cc: Ville Syrjälä >> Cc: Chris Wilson >> Acked-by: Ville Syrjälä >> Signed-off-by: Daniel Vetter > > Should this be forwarded to stable 3.19? https://bugs.freedesktop.org/show_bug.cgi?id=89108 and probably http://mid.gmane.org/20150131211609.GA3710@yulia-desktop BR, Jani. > > BR, > Jani.
> > >> --- >> drivers/gpu/drm/i915/intel_display.c | 42 >> ++-- >> 1 file changed, 21 insertions(+), 21 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/intel_display.c >> b/drivers/gpu/drm/i915/intel_display.c >> index a1dbe747a372..e224820ea5a4 100644 >> --- a/drivers/gpu/drm/i915/intel_display.c >> +++ b/drivers/gpu/drm/i915/intel_display.c >> @@ -4301,15 +4301,15 @@ static void ironlake_crtc_enable(struct drm_crtc >> *crtc) >> if (intel_crtc->config.has_pch_encoder) >> ironlake_pch_enable(crtc); >> >> +assert_vblank_disabled(crtc); >> +drm_crtc_vblank_on(crtc); >> + >> for_each_encoder_on_crtc(dev, crtc, encoder) >> encoder->enable(encoder); >> >> if (HAS_PCH_CPT(dev)) >> cpt_verify_modeset(dev, intel_crtc->pipe); >> >> -assert_vblank_disabled(crtc); >> -drm_crtc_vblank_on(crtc); >> - >> intel_crtc_enable_planes(crtc); >> } >> >> @@ -4421,14 +4421,14 @@ static void haswell_crtc_enable(struct drm_crtc >> *crtc) >> if (intel_crtc->config.dp_encoder_is_mst) >> intel_ddi_set_vc_payload_alloc(crtc, true); >> >> +assert_vblank_disabled(crtc); >> +drm_crtc_vblank_on(crtc); >> + >> for_each_encoder_on_crtc(dev, crtc, encoder) { >> encoder->enable(encoder); >> intel_opregion_notify_encoder(encoder, true); >> } >> >> -assert_vblank_disabled(crtc); >> -drm_crtc_vblank_on(crtc); >> - >> /* If we change the relative order between pipe/planes enabling, we need >> * to change the workaround. 
*/ >> haswell_mode_set_planes_workaround(intel_crtc); >> @@ -4479,12 +4479,12 @@ static void ironlake_crtc_disable(struct drm_crtc >> *crtc) >> >> intel_crtc_disable_planes(crtc); >> >> -drm_crtc_vblank_off(crtc); >> -assert_vblank_disabled(crtc); >> - >> for_each_encoder_on_crtc(dev, crtc, encoder) >> encoder->disable(encoder); >> >> +drm_crtc_vblank_off(crtc); >> +assert_vblank_disabled(crtc); >> + >> if (intel_crtc->config.has_pch_encoder) >> intel_set_pch_fifo_underrun_reporting(dev_priv, pipe, false); >> >> @@ -4544,14 +4544,14 @@ static void haswell_crtc_disable(struct drm_crtc >> *crtc) >> >> intel_crtc_disable_planes(crtc); >> >> -drm_crtc_vblank_off(crtc); >> -assert_vblank_disabled(crtc); >> - >> for_each_encoder_on_crtc(dev, crtc, encoder) { >> intel_opregion_notify_encoder(encoder, false); >> encoder->disable(encoder); >> } >> >> +drm_crtc_vblank_off(crtc); >> +assert_vblank_disabled(crtc); >> + >> if (intel_crtc->config.has_pch_encoder) >> intel_set_pch_fifo_underrun_reporting(dev_priv, TRANSCODER_A, >>false); >> @@ -5021,12 +5021,12 @@ static void valleyview_crtc_enable(struct drm_crtc >> *crtc) >> intel_update_watermarks(crtc); >> intel_enable_pipe(intel_crtc); >> >> -for_each_encoder_on_crtc(dev, crtc, encoder) >> -
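The ordering argument in the quoted commit message (vblank on before encoder->enable, vblank off after encoder->disable) can be checked with a toy model. Everything below is a hypothetical sketch: the toy_ names are invented, and the assumption that the encoder hooks wait for a vblank stands in for the sdvo and ibx pipe B cases the message mentions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one crtc; not the driver's structures. */
struct toy_crtc {
	bool vblank_on;
	int failed_waits;	/* vblank waits issued while vblank was off */
};

static void toy_wait_for_vblank(struct toy_crtc *crtc)
{
	if (!crtc->vblank_on)
		crtc->failed_waits++;	/* the real code would WARN here */
}

/* Assumed: both encoder hooks need a vblank wait on this output. */
static void toy_encoder_enable(struct toy_crtc *crtc)
{
	toy_wait_for_vblank(crtc);
}

static void toy_encoder_disable(struct toy_crtc *crtc)
{
	toy_wait_for_vblank(crtc);
}

/* Old ordering: vblank handling sits outside the encoder hooks. */
static void toy_crtc_enable_old(struct toy_crtc *crtc)
{
	toy_encoder_enable(crtc);
	assert(!crtc->vblank_on);
	crtc->vblank_on = true;
}

static void toy_crtc_disable_old(struct toy_crtc *crtc)
{
	crtc->vblank_on = false;
	toy_encoder_disable(crtc);
}

/* New ordering: vblank is on for the whole time the hooks run. */
static void toy_crtc_enable_new(struct toy_crtc *crtc)
{
	assert(!crtc->vblank_on);
	crtc->vblank_on = true;
	toy_encoder_enable(crtc);
}

static void toy_crtc_disable_new(struct toy_crtc *crtc)
{
	toy_encoder_disable(crtc);
	crtc->vblank_on = false;
}
```

With the old ordering both hooks wait while vblank handling is off; with the new ordering neither does, at the cost the message admits to: vblank is nominally on slightly before the pipe is guaranteed to run.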
Re: [Intel-gfx] [PATCH] drm/i915: Dell Chromebook 11 has PWM backlight
On Thu, 19 Feb 2015, Jani Nikula wrote:
> On Thu, 19 Feb 2015, Jani Nikula wrote:
>> Add quirk for Dell Chromebook 11 backlight.
>
> Also
>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93451
>
>> Tested-by: Owen Garland
>> Cc: sta...@vger.kernel.org
>> Signed-off-by: Jani Nikula

Pushed to drm-intel-fixes with Damien's IRC ack.

BR,
Jani.

>> ---
>>  drivers/gpu/drm/i915/intel_display.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> index 3b0fe9f1f3c9..d630ce46298a 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -13100,6 +13100,9 @@ static struct intel_quirk intel_quirks[] = {
>>
>>  	/* HP Chromebook 14 (Celeron 2955U) */
>>  	{ 0x0a06, 0x103c, 0x21ed, quirk_backlight_present },
>> +
>> +	/* Dell Chromebook 11 */
>> +	{ 0x0a06, 0x1028, 0x0a35, quirk_backlight_present },
>>  };
>>
>>  static void intel_init_quirks(struct drm_device *dev)
>> --
>> 2.1.4
>>
>> ___
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Jani Nikula, Intel Open Source Technology Center

-- 
Jani Nikula, Intel Open Source Technology Center
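For readers unfamiliar with how such a quirk entry takes effect, here is a self-contained sketch of a table lookup keyed on the three IDs in each row. The two rows are copied from the quoted diff; the toy_ struct, the hook signature and the exact-match rule are assumptions of this sketch and may differ from what intel_init_quirks really does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state a quirk hook modifies. */
struct toy_panel {
	bool backlight_present;
};

struct toy_quirk {
	int device;
	int subsystem_vendor;
	int subsystem_device;
	void (*hook)(struct toy_panel *panel);
};

static void toy_quirk_backlight_present(struct toy_panel *panel)
{
	panel->backlight_present = true;
}

/* Rows taken from the quoted diff; only the hook name is changed. */
static const struct toy_quirk toy_quirks[] = {
	/* HP Chromebook 14 (Celeron 2955U) */
	{ 0x0a06, 0x103c, 0x21ed, toy_quirk_backlight_present },
	/* Dell Chromebook 11 */
	{ 0x0a06, 0x1028, 0x0a35, toy_quirk_backlight_present },
};

/* Runs the hook of every matching row and returns how many matched. */
static int toy_apply_quirks(struct toy_panel *panel, int device,
			    int subsystem_vendor, int subsystem_device)
{
	int applied = 0;
	size_t i;

	for (i = 0; i < sizeof(toy_quirks) / sizeof(toy_quirks[0]); i++) {
		const struct toy_quirk *q = &toy_quirks[i];

		if (q->device == device &&
		    q->subsystem_vendor == subsystem_vendor &&
		    q->subsystem_device == subsystem_device) {
			assert(q->hook != NULL);
			q->hook(panel);
			applied++;
		}
	}
	return applied;
}
```

Keying on the subsystem IDs as well as the device ID is what lets a row single out one laptop model even though many machines share the same GPU device ID.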
Re: [Intel-gfx] [PATCH] drm/i915/skl: handle all pixel formats in skylake_update_primary_plane()
On Mon, 16 Feb 2015, Damien Lespiau wrote: > On Tue, Feb 10, 2015 at 01:15:49PM +0200, Jani Nikula wrote: >> skylake_update_primary_plane() did not handle all pixel formats returned >> by skl_format_to_fourcc(). Handle alpha similar to skl_update_plane(). >> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89052 >> Signed-off-by: Jani Nikula > > Given the discussion with Ville, it's quite likely we'll default to > alpha blending for pre-multiplied fbs (for plane supporting alpha), even > with the blending properties added. In that context, we can provide a > single, fixed, (but useful) blending mode before we get to implement the > full thing. So: > > Reviewed-by: Damien Lespiau Pushed to drm-intel-fixes, thanks for the review. BR, Jani. > > -- > Damien > >> --- >> >> This is purely cargo culting to avoid the BUG. >> --- >> drivers/gpu/drm/i915/intel_display.c | 9 + >> 1 file changed, 9 insertions(+) >> >> diff --git a/drivers/gpu/drm/i915/intel_display.c >> b/drivers/gpu/drm/i915/intel_display.c >> index 3fe95982be93..cede05256d56 100644 >> --- a/drivers/gpu/drm/i915/intel_display.c >> +++ b/drivers/gpu/drm/i915/intel_display.c >> @@ -2751,10 +2751,19 @@ static void skylake_update_primary_plane(struct >> drm_crtc *crtc, >> case DRM_FORMAT_XRGB: >> plane_ctl |= PLANE_CTL_FORMAT_XRGB_; >> break; >> +case DRM_FORMAT_ARGB: >> +plane_ctl |= PLANE_CTL_FORMAT_XRGB_; >> +plane_ctl |= PLANE_CTL_ALPHA_SW_PREMULTIPLY; >> +break; >> case DRM_FORMAT_XBGR: >> plane_ctl |= PLANE_CTL_ORDER_RGBX; >> plane_ctl |= PLANE_CTL_FORMAT_XRGB_; >> break; >> +case DRM_FORMAT_ABGR: >> +plane_ctl |= PLANE_CTL_ORDER_RGBX; >> +plane_ctl |= PLANE_CTL_FORMAT_XRGB_; >> +plane_ctl |= PLANE_CTL_ALPHA_SW_PREMULTIPLY; >> +break; >> case DRM_FORMAT_XRGB2101010: >> plane_ctl |= PLANE_CTL_FORMAT_XRGB_2101010; >> break; >> -- >> 2.1.4 >> >> ___ >> Intel-gfx mailing list >> Intel-gfx@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Jani Nikula, Intel Open Source 
Technology Center
Re: [Intel-gfx] [PATCH] drm/i915: logical-not-parenthesis gcc 5.0 fixes
On 22/02/15 19:10, François Tigeot wrote: > * This change prevents "logical not is only applied to the left hand side of > comparison" > gcc 5.0 warnings. > > * Originally added by John Marino in DragonFly's > eecf6c3c3b6f7127edd8b8f8c2a83e2f882ed0da > commit. > > Signed-off-by: François Tigeot > --- > drivers/gpu/drm/i915/i915_gem_tiling.c | 2 +- > drivers/gpu/drm/i915/intel_display.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c > b/drivers/gpu/drm/i915/i915_gem_tiling.c > index 7a24bd1..402179c 100644 > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c > @@ -512,7 +512,7 @@ i915_gem_object_do_bit_17_swizzle(struct > drm_i915_gem_object *obj) > struct page *page = sg_page_iter_page(&sg_iter); > char new_bit_17 = page_to_phys(page) >> 17; > if ((new_bit_17 & 0x1) != > - (test_bit(i, obj->bit_17) != 0)) { > + (test_bit(i, obj->bit_17) ? 1 : 0)) { test_bit() already returns a bool; the last line of the definition says: return ((mask & *addr) != 0); so comparing it with "!= 0" OR "? 1 : 0" is completely redundant. I'd suggest that the clearest formulation is: + char new_bit_17 = (page_to_phys(page) >> 17) & 1; + if (new_bit_17 != test_bit(i, obj->bit_17)) { > i915_gem_swizzle_page(page); > set_page_dirty(page); > } > diff --git a/drivers/gpu/drm/i915/intel_display.c > b/drivers/gpu/drm/i915/intel_display.c > index 3b0fe9f..91264b2 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -13299,7 +13299,7 @@ intel_check_plane_mapping(struct intel_crtc *crtc) > val = I915_READ(reg); > > if ((val & DISPLAY_PLANE_ENABLE) && > - (!!(val & DISPPLANE_SEL_PIPE_MASK) == crtc->pipe)) > + (!!( (val & DISPPLANE_SEL_PIPE_MASK) == crtc->pipe) )) You never need a "!!" after a "&&" as it's already implicitly "!= 0". Also, the "!!" here applies to the result of "==" which is already bool. 
So it's not equivalent to the previous version which applied the "!!" to the result of the "&". It might be clearer to break it into several steps, especially as in general DISPPLANE_SEL_PIPE_MASK is more than one bit, and crtc->pipe is an enum not a bool! BTW, are these definitions right? #define DISPPLANE_STEREO_ENABLE (1<<25) #define DISPPLANE_STEREO_DISABLE 0 #define DISPPLANE_PIPE_CSC_ENABLE (1<<24) #define DISPPLANE_SEL_PIPE_SHIFT 24 #define DISPPLANE_SEL_PIPE_MASK (3
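The difference between the two placements of "!!" can be checked in isolation. The following is a minimal user-space sketch, not driver code: the DEMO_* names and the two-bit mask at bit 24 are assumptions made for illustration (the real mask definition is cut off in the quote above), and the three helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-ins for the register layout; not the driver's definitions. */
#define DEMO_SEL_PIPE_SHIFT 24
#define DEMO_SEL_PIPE_MASK  (3u << DEMO_SEL_PIPE_SHIFT)

enum demo_pipe { DEMO_PIPE_A = 0, DEMO_PIPE_B = 1, DEMO_PIPE_C = 2 };

/* Pre-patch shape: "!!" collapses the masked field to 0 or 1, then compares. */
bool pipe_matches_collapsed(uint32_t val, enum demo_pipe pipe)
{
	int any_bit_set = !!(val & DEMO_SEL_PIPE_MASK);

	return any_bit_set == (int)pipe;
}

/* Patched shape: the unshifted masked bits are compared with the pipe index;
 * the outer "!!" is a no-op because "==" already yields 0 or 1. */
bool pipe_matches_unshifted(uint32_t val, enum demo_pipe pipe)
{
	return !!((val & DEMO_SEL_PIPE_MASK) == (uint32_t)pipe);
}

/* Stepwise shape: extract the field first, then compare it with the pipe. */
bool pipe_matches_field(uint32_t val, enum demo_pipe pipe)
{
	uint32_t sel = (val & DEMO_SEL_PIPE_MASK) >> DEMO_SEL_PIPE_SHIFT;

	return sel == (uint32_t)pipe;
}
```

Under these assumed values the three forms disagree as soon as a non-zero pipe is selected, which is why breaking the test into separate steps is easier to review.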
Re: [Intel-gfx] [PATCH] drm/i915: Fix a use after free, and unbalanced refcounting
> -Original Message- > From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf Of > Nick Hoath > Sent: Thursday, February 19, 2015 4:31 PM > To: intel-gfx@lists.freedesktop.org > Subject: [Intel-gfx] [PATCH] drm/i915: Fix a use after free, and unbalanced > refcounting > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88652 > > When converting from implicitly tracked execlist queue items to ref counted > requests, not all frees of requests were replaced with unrefs, and extraneous > refs/unrefs of contexts were added. > Correct the unbalanced refcount & replace the frees. > Remove a noisy warning when hitting the request creation path. > > drm_i915_gem_request and intel_context are both kref reference counted > structures. Upon allocation, drm_i915_gem_request's ref count should be > bumped using kref_init. When a context is assigned to the request, > the context's reference count should be bumped using > i915_gem_context_reference. > i915_gem_request_reference will reduce the context reference count when > the request is freed. > > Problem introduced in > commit 6d3d8274bc45de4babb62d64562d92af984dd238 > Author: Nick Hoath > AuthorDate: Thu Jan 15 13:10:39 2015 + > > drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request > > v2: Added comments explaining how the ctx pointer and the request object > should > be ref-counted. Removed noisy warning. 
> > v3: Cleaned up the language used in the commit & the header > description (Thanks David Gordon) > > Signed-off-by: Nick Hoath > --- > drivers/gpu/drm/i915/i915_drv.h | 11 ++- > drivers/gpu/drm/i915/i915_gem.c | 3 +-- > drivers/gpu/drm/i915/intel_lrc.c | 8 > 4 files changed, 16 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 2dedd43..956fe26 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -2121,6 +2121,9 @@ void i915_gem_track_fb(struct drm_i915_gem_object > *old, > * number comparisons on buffer last_read|write_seqno. It also allows an > * emission time to be associated with the request for tracking how far ahead > * of the GPU the submission is. > + * > + * The requests are reference counted, so upon creation they should have an > + * initial reference taken using kref_init > */ > struct drm_i915_gem_request { > struct kref ref; > @@ -2144,7 +2147,16 @@ struct drm_i915_gem_request { > /** Position in the ringbuffer of the end of the whole request */ > u32 tail; > > - /** Context related to this request */ > + /** > + * Context related to this request > + * Contexts are refcounted, so when this request is associated with a > + * context, we must increment the context's refcount, to guarantee that > + * it persists while any request is linked to it. Requests themselves > + * are also refcounted, so the request will only be freed when the last > + * reference to it is dismissed, and the code in > + * i915_gem_request_free() will then decrement the refcount on the > + * context. 
> + */ > struct intel_context *ctx; > > /** Batch buffer related to this request if any */ > diff --git a/drivers/gpu/drm/i915/i915_gem.c > b/drivers/gpu/drm/i915/i915_gem.c > index 61134ab..996f60f 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -2664,8 +2664,7 @@ static void i915_gem_reset_ring_cleanup(struct > drm_i915_private *dev_priv, > if (submit_req->ctx != ring->default_context) > intel_lr_context_unpin(ring, submit_req->ctx); > > - i915_gem_context_unreference(submit_req->ctx); > - kfree(submit_req); > + i915_gem_request_unreference(submit_req); > } > > /* > diff --git a/drivers/gpu/drm/i915/intel_lrc.c > b/drivers/gpu/drm/i915/intel_lrc.c > index aafcef3..62a2b2a 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -512,18 +512,19 @@ static int execlists_context_queue(struct > intel_engine_cs *ring, >* If there isn't a request associated with this submission, >* create one as a temporary holder. >*/ > - WARN(1, "execlist context submission without request"); > request = kzalloc(sizeof(*request), GFP_KERNEL); > if (request == NULL) > return -ENOMEM; > request->ring = ring; > request->ctx = to; > + kref_init(&request->ref); > + request->uniq = dev_priv->request_uniq++; > + i915_gem_context_reference(request->ctx); > } else { > + i915_gem_request_reference(request); > WARN_ON(to != request->ctx); > } > request->tail = tail; > - i915_gem_request_reference(request); > - i915_gem_context_r
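The ownership rule described in the commit message (a request pins its context, and the last request unreference drops that pin) can be sketched in user space. This is only an illustration under stated assumptions: plain int counters stand in for struct kref, every demo_* name is hypothetical, and no locking or atomicity is modelled.

```c
#include <assert.h>
#include <stdlib.h>

/* Live-object counters so the example can observe when things are freed. */
int demo_live_contexts;
int demo_live_requests;

struct demo_ctx {
	int refcount;
};

struct demo_request {
	int refcount;
	struct demo_ctx *ctx;
};

struct demo_ctx *demo_ctx_create(void)
{
	struct demo_ctx *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return NULL;
	ctx->refcount = 1;		/* creator's reference */
	demo_live_contexts++;
	return ctx;
}

void demo_ctx_get(struct demo_ctx *ctx)
{
	ctx->refcount++;
}

void demo_ctx_put(struct demo_ctx *ctx)
{
	assert(ctx->refcount > 0);
	if (--ctx->refcount == 0) {
		demo_live_contexts--;
		free(ctx);
	}
}

struct demo_request *demo_request_create(struct demo_ctx *ctx)
{
	struct demo_request *req = calloc(1, sizeof(*req));

	if (!req)
		return NULL;
	req->refcount = 1;		/* plays the role of kref_init() */
	req->ctx = ctx;
	demo_ctx_get(ctx);		/* the request pins its context */
	demo_live_requests++;
	return req;
}

void demo_request_get(struct demo_request *req)
{
	req->refcount++;
}

void demo_request_put(struct demo_request *req)
{
	assert(req->refcount > 0);
	if (--req->refcount == 0) {
		demo_ctx_put(req->ctx);	/* drop the pin taken at creation */
		demo_live_requests--;
		free(req);
	}
}
```

With this shape a caller never frees a request directly and never drops the context reference on the request's behalf; both happen in one place, when the last request reference goes away.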
[Intel-gfx] Screen locksup with only black screen and cursor
I upgraded to the 3.19.0 kernel several days ago on my Ubuntu 14.04.2 LTS system, a Dell Optiplex 780 with 4 GB RAM:

kernel 3.19.0-031900-generic #201502091451 SMP Mon Feb 9 14:52:52 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Since then I quite often see this in my syslog:

Feb 22 18:00:46 localhost kernel: [ 9088.033815] [drm:intel_crtc_cursor_set_obj] cursor off
Feb 22 18:00:46 localhost kernel: [ 9088.033821] [drm:g4x_check_srwm] SR watermark: display plane 92, cursor 2
Feb 22 18:00:46 localhost kernel: [ 9088.033822] [drm:g4x_check_srwm] display watermark is too large(92/63), disabling
Feb 22 18:00:46 localhost kernel: [ 9088.033824] [drm:intel_set_memory_cxsr] memory self-refresh is disabled
Feb 22 18:00:46 localhost kernel: [ 9088.033826] [drm:g4x_update_wm] Setting FIFO watermarks - A: plane=40, cursor=2, B: plane=2, cursor=2, SR: plane=0, cursor=0

Quite often I'll also come back to a black screen with only the mouse cursor. The cursor will move; however, I can't get back to the desktop. Today when this happened I was able to CTRL>ALT>ESC, log in and run sudo reboot, which rebooted the system.

I really don't know if any of the above has anything to do with the lockups, but if there are any other log items I need to look for I'll gladly do it. I have hourly log snippets for months. I'm also attaching the output of lshw and dmidecode in case it helps.

http://pastebin.com/GDKCfGCX lshw
http://pastebin.com/KH9aknM2 dmidecode

I'm not even sure this is the correct list to post to. If not, possibly someone could point me to the correct place to post, and tell me whether or not to file a bug report and where to file it.

Thanks for your time
Chris

--
Chris
KeyID 0xE372A7DA98E6705C
31.11°N 97.89°W (Elev. 1092 ft)
07:49:44 up 44 min, 1 user, load average: 0.24, 0.25, 0.31
Ubuntu 14.04.2 LTS, kernel 3.19.0-031900-generic
Re: [Intel-gfx] [PATCH v2] drm/i915: avoid processing spurious/shared interrupts in low-power states
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang...@intel.com)
Task id: 5809

-Summary-
Platform  Delta  drm-intel-nightly  Series Applied
PNV       -1     281/281            280/281
ILK              308/308            308/308
SNB       -2     326/326            324/326
IVB              380/380            380/380
BYT              294/294            294/294
HSW       -3     421/421            418/421
BDW       -1     316/316            315/316

-Detailed-
Platform  Test                                                          drm-intel-nightly       Series Applied
*PNV      igt_gen3_render_mixed_blits                                   PASS(3)                 CRASH(1)PASS(1)
*SNB      igt_kms_plane_plane-position-hole-pipe-B-plane-2              DMESG_WARN(12)PASS(2)   TIMEOUT(1)PASS(1)
*SNB      igt_kms_rotation_crc_sprite-rotation                          DMESG_WARN(11)PASS(3)   FAIL(1)PASS(1)
*HSW      igt_gem_pwrite_pread_snooped-pwrite-blt-cpu_mmap-performance  PASS(3)                 DMESG_WARN(1)PASS(1)
 HSW      igt_gem_storedw_loop_vebox                                    DMESG_WARN(1)PASS(1)    DMESG_WARN(1)PASS(1)
*HSW      igt_kms_flip_plain-flip-fb-recreate-interruptible             PASS(2)                 TIMEOUT(2)
*BDW      igt_gem_gtt_hog                                               PASS(2)                 DMESG_WARN(1)PASS(1)

Note: Pay more attention to lines starting with '*'.
[Intel-gfx] [PATCH v3] drm/i915: avoid processing spurious/shared interrupts in low-power states
Atm, it's possible that the interrupt handler is called when the device is in D3 or some other low-power state. It can be due to another device that is still in D0 state and shares the interrupt line with i915, or on some platforms there could be spurious interrupts even without sharing the interrupt line. The latter case was reported by Klaus Ethgen using a Lenovo x61p machine (gen 4). He noticed this issue via a system suspend/resume hang and bisected it to the following commit:

commit e11aa362308f5de467ce355a2a2471321b15a35c
Author: Jesse Barnes
Date:   Wed Jun 18 09:52:55 2014 -0700

    drm/i915: use runtime irq suspend/resume in freeze/thaw

This is a problem, since in low-power states IIR will always read 0x resulting in an endless IRQ servicing loop.

Fix this by handling interrupts only when the driver explicitly enables them and so it's guaranteed that the interrupt registers return a valid value.

Note that this issue existed even before the above commit, since during runtime suspend/resume we never unregistered the handler.

v2:
- clarify the purpose of smp_mb() vs. synchronize_irq() in the code comment (Chris)

v3:
- no need for an explicit smp_mb(), we can assume that synchronize_irq() and the mmio read/writes in the install hooks provide for this (Daniel)
- remove code comment as the remaining synchronize_irq() is self explanatory (Daniel)

Reference: https://lkml.org/lkml/2015/2/11/205
Reported-and-bisected-by: Klaus Ethgen
Signed-off-by: Imre Deak
---
 drivers/gpu/drm/i915/i915_irq.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 9073119..612c9c0 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1889,6 +1889,9 @@ static irqreturn_t valleyview_irq_handler(int irq, void *arg)
 	u32 iir, gt_iir, pm_iir;
 	irqreturn_t ret = IRQ_NONE;
 
+	if (!intel_irqs_enabled(dev_priv))
+		return IRQ_NONE;
+
 	while (true) {
 		/* Find, clear, then process each source of interrupt */
 
@@ -1933,6 +1936,9 @@ static irqreturn_t cherryview_irq_handler(int irq, void *arg)
 	u32 master_ctl, iir;
 	irqreturn_t ret = IRQ_NONE;
 
+	if (!intel_irqs_enabled(dev_priv))
+		return IRQ_NONE;
+
 	for (;;) {
 		master_ctl = I915_READ(GEN8_MASTER_IRQ) & ~GEN8_MASTER_IRQ_CONTROL;
 		iir = I915_READ(VLV_IIR);
@@ -2205,6 +2211,9 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
 	u32 de_iir, gt_iir, de_ier, sde_ier = 0;
 	irqreturn_t ret = IRQ_NONE;
 
+	if (!intel_irqs_enabled(dev_priv))
+		return IRQ_NONE;
+
 	/* We get interrupts on unclaimed registers, so check for this before we
 	 * do any I915_{READ,WRITE}. */
 	intel_uncore_check_errors(dev);
@@ -2276,6 +2285,9 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
 	enum pipe pipe;
 	u32 aux_mask = GEN8_AUX_CHANNEL_A;
 
+	if (!intel_irqs_enabled(dev_priv))
+		return IRQ_NONE;
+
 	if (IS_GEN9(dev))
 		aux_mask |= GEN9_AUX_CHANNEL_B | GEN9_AUX_CHANNEL_C |
 			GEN9_AUX_CHANNEL_D;
@@ -3768,6 +3780,9 @@ static irqreturn_t i8xx_irq_handler(int irq, void *arg)
 		I915_DISPLAY_PLANE_A_FLIP_PENDING_INTERRUPT |
 		I915_DISPLAY_PLANE_B_FLIP_PENDING_INTERRUPT;
 
+	if (!intel_irqs_enabled(dev_priv))
+		return IRQ_NONE;
+
 	iir = I915_READ16(IIR);
 	if (iir == 0)
 		return IRQ_NONE;
@@ -3948,6 +3963,9 @@ static irqreturn_t i915_irq_handler(int irq, void *arg)
 		I915_DISPLAY_PLANE_B_FLIP_PENDING_INTERRUPT;
 	int pipe, ret = IRQ_NONE;
 
+	if (!intel_irqs_enabled(dev_priv))
+		return IRQ_NONE;
+
 	iir = I915_READ(IIR);
 	do {
 		bool irq_received = (iir & ~flip_mask) != 0;
@@ -4168,6 +4186,9 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 		I915_DISPLAY_PLANE_A_FLIP_PENDING_INTERRUPT |
 		I915_DISPLAY_PLANE_B_FLIP_PENDING_INTERRUPT;
 
+	if (!intel_irqs_enabled(dev_priv))
+		return IRQ_NONE;
+
 	iir = I915_READ(IIR);
 
 	for (;;) {
@@ -4504,6 +4525,7 @@ void intel_irq_uninstall(struct drm_i915_private *dev_priv)
 	drm_irq_uninstall(dev_priv->dev);
 	intel_hpd_cancel_work(dev_priv);
 	dev_priv->pm.irqs_enabled = false;
+	synchronize_irq(dev_priv->dev->irq);
 }
 
 /**
@@ -4517,6 +4539,7 @@ void intel_runtime_pm_disable_interrupts(struct drm_i915_private *dev_priv)
 {
 	dev_priv->dev->driver->irq_uninstall(dev_priv->dev);
 	dev_priv->pm.irqs_enabled = false;
+	synchronize_irq(dev_priv->dev->irq);
 }
 
 /**
-- 
2.1.0
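The guard added to each handler can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the driver: the demo_* names are hypothetical, the "stuck" register value is an arbitrary placeholder rather than the value elided in the message above, and synchronize_irq() (which in the real patch waits for in-flight handlers after the flag is cleared) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary non-zero placeholder for what an unpowered device returns. */
#define DEMO_STUCK_IIR 0xdeadbeefu

enum demo_irqreturn { DEMO_IRQ_NONE = 0, DEMO_IRQ_HANDLED = 1 };

struct demo_device {
	bool irqs_enabled;	/* set on install, cleared on uninstall/suspend */
	bool powered;		/* false models D3 or another low-power state */
	uint32_t iir;		/* pending interrupt bits while powered */
	unsigned int reg_reads;	/* counts simulated register accesses */
};

/* Simulated MMIO read: a powered-down device never reports "nothing pending". */
uint32_t demo_read_iir(struct demo_device *dev)
{
	dev->reg_reads++;
	return dev->powered ? dev->iir : DEMO_STUCK_IIR;
}

enum demo_irqreturn demo_irq_handler(struct demo_device *dev)
{
	enum demo_irqreturn ret = DEMO_IRQ_NONE;

	/* Shared-line or spurious interrupt while disabled: touch no registers. */
	if (!dev->irqs_enabled)
		return DEMO_IRQ_NONE;

	/* Without the guard above, an unpowered device would never leave this
	 * loop, because its register read never returns zero. */
	for (;;) {
		uint32_t iir = demo_read_iir(dev);

		if (iir == 0)
			break;
		dev->iir &= ~iir;	/* acknowledge what was seen */
		ret = DEMO_IRQ_HANDLED;
	}
	return ret;
}

void demo_suspend(struct demo_device *dev)
{
	dev->irqs_enabled = false;	/* the handler bails out from now on */
	dev->powered = false;
}
```

Returning "not handled" without reading any register is what lets another device on the same line, or the core's spurious-interrupt accounting, deal with the event.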
Re: [Intel-gfx] [PATCH v4 05/24] drm/i915: page table abstractions
On 2/18/2015 11:27 AM, Mika Kuoppala wrote: Michel Thierry writes: From: Ben Widawsky When we move to dynamic page allocation, keeping page_directory and pagetabs as separate structures will help to break actions into simpler tasks. To help transition the code nicely there is some wasted space in gen6/7. This will be ameliorated shortly. Following the x86 pagetable terminology: PDPE = struct i915_page_directory_pointer_entry. PDE = struct i915_page_directory_entry [page_directory]. PTE = struct i915_page_table_entry [page_tables]. v2: fixed mismatches after clean-up/rebase. v3: Clarify the names of the multiple levels of page tables (Daniel) Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v2, v3) --- drivers/gpu/drm/i915/i915_gem_gtt.c | 177 ++-- drivers/gpu/drm/i915/i915_gem_gtt.h | 23 - 2 files changed, 107 insertions(+), 93 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index b48b586..98b4698 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -334,7 +334,8 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm, I915_CACHE_LLC, use_scratch); while (num_entries) { - struct page *page_table = ppgtt->gen8_pt_pages[pdpe][pde]; + struct i915_page_directory_entry *pd = &ppgtt->pdp.page_directory[pdpe]; + struct page *page_table = pd->page_tables[pde].page; last_pte = pte + num_entries; if (last_pte > GEN8_PTES_PER_PAGE) @@ -378,8 +379,12 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, if (WARN_ON(pdpe >= GEN8_LEGACY_PDPES)) break; - if (pt_vaddr == NULL) - pt_vaddr = kmap_atomic(ppgtt->gen8_pt_pages[pdpe][pde]); + if (pt_vaddr == NULL) { + struct i915_page_directory_entry *pd = &ppgtt->pdp.page_directory[pdpe]; + struct page *page_table = pd->page_tables[pde].page; + + pt_vaddr = kmap_atomic(page_table); + } pt_vaddr[pte] = gen8_pte_encode(sg_page_iter_dma_address(&sg_iter), @@ -403,29 +408,33 @@ static void 
gen8_ppgtt_insert_entries(struct i915_address_space *vm, } } -static void gen8_free_page_tables(struct page **pt_pages) +static void gen8_free_page_tables(struct i915_page_directory_entry *pd) { int i; - if (pt_pages == NULL) + if (pd->page_tables == NULL) return; for (i = 0; i < GEN8_PDES_PER_PAGE; i++) - if (pt_pages[i]) - __free_pages(pt_pages[i], 0); + if (pd->page_tables[i].page) + __free_page(pd->page_tables[i].page); } -static void gen8_ppgtt_free(const struct i915_hw_ppgtt *ppgtt) +static void gen8_free_page_directories(struct i915_page_directory_entry *pd) ^ You only free one directory so why plural here? +{ If you free the page tables for the directory here.. + kfree(pd->page_tables); + __free_page(pd->page); +} + +static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt) { int i; for (i = 0; i < ppgtt->num_pd_pages; i++) { - gen8_free_page_tables(ppgtt->gen8_pt_pages[i]); - kfree(ppgtt->gen8_pt_pages[i]); + gen8_free_page_tables(&ppgtt->pdp.page_directory[i]); ...this loop will be cleaner. Also consider renaming 'num_pd_pages' to 'num_pd'. But if it does cause a lot of rebase burden dont worry about it. num_pd_pages will go away in patch #19, so I rather not change that. All other comments addressed in v4. 
Thanks, -Michel + gen8_free_page_directories(&ppgtt->pdp.page_directory[i]); kfree(ppgtt->gen8_pt_dma_addr[i]); } - - __free_pages(ppgtt->pd_pages, get_order(ppgtt->num_pd_pages << PAGE_SHIFT)); } static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt) @@ -460,86 +469,75 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm) gen8_ppgtt_free(ppgtt); } -static struct page **__gen8_alloc_page_tables(void) +static int gen8_ppgtt_allocate_dma(struct i915_hw_ppgtt *ppgtt) { - struct page **pt_pages; int i; - pt_pages = kcalloc(GEN8_PDES_PER_PAGE, sizeof(struct page *), GFP_KERNEL); - if (!pt_pages) - return ERR_PTR(-ENOMEM); - - for (i = 0; i < GEN8_PDES_PER_PAGE; i++) { - pt_pages[i] = alloc_page(GFP_KERNEL); - if (!pt_pages[i]) - goto bail; + for (i = 0; i < ppgtt->num_pd_pages; i++) { + ppgtt->gen8_pt_dma_addr[i] = kcalloc(GEN8_PDES_PER_PAGE, +sizeof(dma_addr_t), +
Re: [Intel-gfx] [PATCH v4 09/24] drm/i915: Track GEN6 page table usage
On 2/20/2015 4:41 PM, Mika Kuoppala wrote: Michel Thierry writes: From: Ben Widawsky Instead of implementing the full tracking + dynamic allocation, this patch does a bit less than half of the work, by tracking and warning on unexpected conditions. The tracking itself follows which PTEs within a page table are currently being used for objects. The next patch will modify this to actually allocate the page tables only when necessary. With the current patch there isn't much in the way of making a gen agnostic range allocation function. However, in the next patch we'll add more specificity which makes having separate functions a bit easier to manage. One important change introduced here is that DMA mappings are created/destroyed at the same page directories/tables are allocated/deallocated. Notice that aliasing PPGTT is not managed here. The patch which actually begins dynamic allocation/teardown explains the reasoning for this. v2: s/pdp.page_directory/pdp.page_directorys Make a scratch page allocation helper v3: Rebase and expand commit message. v4: Allocate required pagetables only when it is needed, _bind_to_vm instead of bind_vma (Daniel). v5: Rebased to remove the unnecessary noise in the diff, also: - PDE mask is GEN agnostic, renamed GEN6_PDE_MASK to I915_PDE_MASK. - Removed unnecessary checks in gen6_alloc_va_range. - Changed map/unmap_px_single macros to use dma functions directly and be part of a static inline function instead. - Moved drm_device plumbing through page tables operation to its own patch. - Moved allocate/teardown_va_range calls until they are fully implemented (in subsequent patch). - Merged pt and scratch_pt unmap_and_free path. - Moved scratch page allocator helper to the patch that will use it. v6: Reduce complexity by not tearing down pagetables dynamically, the same can be achieved while freeing empty vms. 
(Daniel) Cc: Daniel Vetter Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v3+) --- drivers/gpu/drm/i915/i915_gem_gtt.c | 191 +--- drivers/gpu/drm/i915/i915_gem_gtt.h | 75 ++ 2 files changed, 206 insertions(+), 60 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index e2bcd10..760585e 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -274,29 +274,88 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr, return pte; } -static void unmap_and_free_pt(struct i915_page_table_entry *pt, struct drm_device *dev) +#define i915_dma_unmap_single(px, dev) \ + __i915_dma_unmap_single((px)->daddr, dev) + +static inline void __i915_dma_unmap_single(dma_addr_t daddr, + struct drm_device *dev) +{ + struct device *device = &dev->pdev->dev; + + dma_unmap_page(device, daddr, 4096, PCI_DMA_BIDIRECTIONAL); +} + +/** + * i915_dma_map_px_single() - Create a dma mapping for a page table/dir/etc. + * @px:Page table/dir/etc to get a DMA map for + * @dev: drm device + * + * Page table allocations are unified across all gens. They always require a + * single 4k allocation, as well as a DMA mapping. If we keep the structs + * symmetric here, the simple macro covers us for every page table type. + * + * Return: 0 if success. + */ +#define i915_dma_map_px_single(px, dev) \ + i915_dma_map_page_single((px)->page, (dev), &(px)->daddr) + If this is symmetrical to i915_dma_unmap_single() is the _px_ needed? 
+static inline int i915_dma_map_page_single(struct page *page, + struct drm_device *dev, + dma_addr_t *daddr) +{ + struct device *device = &dev->pdev->dev; + + *daddr = dma_map_page(device, page, 0, 4096, PCI_DMA_BIDIRECTIONAL); + return dma_mapping_error(device, *daddr); +} + +static void unmap_and_free_pt(struct i915_page_table_entry *pt, + struct drm_device *dev) { if (WARN_ON(!pt->page)) return; + + i915_dma_unmap_single(pt, dev); __free_page(pt->page); + kfree(pt->used_ptes); kfree(pt); } static struct i915_page_table_entry *alloc_pt_single(struct drm_device *dev) { struct i915_page_table_entry *pt; + const size_t count = INTEL_INFO(dev)->gen >= 8 ? + GEN8_PTES_PER_PAGE : I915_PPGTT_PT_ENTRIES; + int ret = -ENOMEM; pt = kzalloc(sizeof(*pt), GFP_KERNEL); if (!pt) return ERR_PTR(-ENOMEM); + pt->used_ptes = kcalloc(BITS_TO_LONGS(count), sizeof(*pt->used_ptes), + GFP_KERNEL); + + if (!pt->used_ptes) + goto fail_bitmap; + pt->page = alloc_page(GFP_KERNEL | __GFP_ZERO); - if (!pt->page) { - kfree(pt); - return ERR_PTR(-ENOMEM); - } + if (!pt->page) +
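The allocation pattern in alloc_pt_single() (struct first, then the used-PTE bitmap, then the backing page, unwinding in reverse on failure) can be sketched in user space. Everything named demo_* is hypothetical: the 512-entry count and 4096-byte page are assumptions, the fail_page label is invented because the quoted hunk is cut off, NULL is returned where the patch uses ERR_PTR(), and demo_allocs_left is only a test hook for forcing a failure.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define DEMO_PTES_PER_TABLE	512	/* assumed entries per page table */
#define DEMO_PAGE_SIZE		4096
#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n) \
	(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_page_table {
	void *page;			/* backing storage for the entries */
	unsigned long *used_ptes;	/* one bit per entry in use */
};

/* Test hook: allocations still allowed to succeed; negative = unlimited. */
int demo_allocs_left = -1;

static void *demo_zalloc(size_t n, size_t size)
{
	if (demo_allocs_left == 0)
		return NULL;
	if (demo_allocs_left > 0)
		demo_allocs_left--;
	return calloc(n, size);
}

struct demo_page_table *demo_alloc_pt(void)
{
	struct demo_page_table *pt = demo_zalloc(1, sizeof(*pt));

	if (!pt)
		return NULL;

	pt->used_ptes = demo_zalloc(DEMO_BITS_TO_LONGS(DEMO_PTES_PER_TABLE),
				    sizeof(*pt->used_ptes));
	if (!pt->used_ptes)
		goto fail_bitmap;

	pt->page = demo_zalloc(1, DEMO_PAGE_SIZE);
	if (!pt->page)
		goto fail_page;

	return pt;

fail_page:
	free(pt->used_ptes);
fail_bitmap:
	free(pt);
	return NULL;
}

void demo_free_pt(struct demo_page_table *pt)
{
	free(pt->page);
	free(pt->used_ptes);
	free(pt);
}

/* Record that entries [first, first + count) are now in use. */
void demo_mark_used(struct demo_page_table *pt, unsigned int first,
		    unsigned int count)
{
	for (unsigned int i = first; i < first + count; i++)
		pt->used_ptes[i / DEMO_BITS_PER_LONG] |=
			1UL << (i % DEMO_BITS_PER_LONG);
}

unsigned int demo_count_used(const struct demo_page_table *pt)
{
	unsigned int used = 0;

	for (unsigned int i = 0; i < DEMO_PTES_PER_TABLE; i++)
		if (pt->used_ptes[i / DEMO_BITS_PER_LONG] &
		    (1UL << (i % DEMO_BITS_PER_LONG)))
			used++;
	return used;
}
```

Each label frees only what was allocated before the failing step, so a new step (such as the DMA mapping in the patch) adds one label rather than another nested cleanup block.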
Re: [Intel-gfx] [PATCH] i-g-t: Adding test case to test background color.
On 21 February 2015 at 00:12, Chandra Konduru wrote: > From: chandra konduru > > Adding i-g-t test case to test display crtc background color. > > Signed-off-by: chandra konduru > --- > lib/igt_kms.c | 60 +++ > lib/igt_kms.h | 4 + > tests/Android.mk | 1 + > tests/Makefile.sources| 1 + > tests/kms_crtc_background_color.c | 220 > ++ > 5 files changed, 286 insertions(+) > create mode 100644 tests/kms_crtc_background_color.c Please could you also add the test to .gitignore and include a short description using the IGT_TEST_DESCRIPTION macro? > > diff --git a/lib/igt_kms.c b/lib/igt_kms.c > index d0c3690..7246e59 100644 > --- a/lib/igt_kms.c > +++ b/lib/igt_kms.c > @@ -926,6 +926,22 @@ igt_plane_set_property(igt_plane_t *plane, uint32_t > prop_id, uint64_t value) > DRM_MODE_OBJECT_PLANE, prop_id, value); > } > > +static bool > +get_crtc_property(int drm_fd, uint32_t crtc_id, const char *name, > + uint32_t *prop_id /* out */, uint64_t *value /* out */, > + drmModePropertyPtr *prop /* out */) > +{ > + return kmstest_get_property(drm_fd, crtc_id, DRM_MODE_OBJECT_CRTC, > + name, prop_id, value, prop); > +} > + > +static void > +igt_crtc_set_property(igt_output_t *output, uint32_t prop_id, uint64_t value) > +{ > + drmModeObjectSetProperty(output->display->drm_fd, > + output->config.crtc->crtc_id, DRM_MODE_OBJECT_CRTC, prop_id, > value); > +} > + > /* > * Walk a plane's property list to determine its type. 
If we don't > * find a type property, then the kernel doesn't support universal > @@ -1083,6 +1099,7 @@ void igt_display_init(igt_display_t *display, int > drm_fd) > igt_assert(display->outputs); > > for (i = 0; i < display->n_outputs; i++) { > + int j; > igt_output_t *output = &display->outputs[i]; > > /* > @@ -1094,6 +,19 @@ void igt_display_init(igt_display_t *display, int > drm_fd) > output->display = display; > > igt_output_refresh(output); > + > + for (j = 0; j < display->n_pipes; j++) { > + uint64_t prop_value; > + igt_pipe_t *pipe = &display->pipes[j]; > + if (output->config.crtc) { > + get_crtc_property(display->drm_fd, > output->config.crtc->crtc_id, > + "background_color", > + &pipe->background_property, > + &prop_value, > + NULL); > + pipe->background = (uint32_t)prop_value; > + } > + } > } > > drmModeFreePlaneResources(plane_resources); > @@ -1513,6 +1543,13 @@ static int igt_output_commit(igt_output_t *output, > > pipe = igt_output_get_driving_pipe(output); > > + if (pipe->background_changed) { > + igt_crtc_set_property(output, pipe->background_property, > + pipe->background); > + > + pipe->background_changed = false; > + } > + > for (i = 0; i < pipe->n_planes; i++) { > igt_plane_t *plane = &pipe->planes[i]; > > @@ -1765,6 +1802,29 @@ void igt_plane_set_rotation(igt_plane_t *plane, > igt_rotation_t rotation) > plane->rotation_changed = true; > } > > +/** > + * igt_crtc_set_background: > + * @pipe: pipe pointer to which background color to be set > + * @background: background color value > + * > + * Sets background color for requested pipe. Color value provided here > + * will be actually submitted at output commit time via "background_color" > + * property. it might be helpful to describe the format for "background_color" here too. 
> + */ > +void igt_crtc_set_background(igt_pipe_t *pipe, uint64_t background) > +{ > + igt_display_t *display = pipe->display; > + > + LOG(display, "%s.%d: crtc_set_background(%lu)\n", > + kmstest_pipe_name(pipe->pipe), > + pipe->pipe, background); > + > + pipe->background = background; > + > + pipe->background_changed = true; > +} > + > + > void igt_wait_for_vblank(int drm_fd, enum pipe pipe) > { > drmVBlank wait_vbl; > diff --git a/lib/igt_kms.h b/lib/igt_kms.h > index a1483a4..4fada1b 100644 > --- a/lib/igt_kms.h > +++ b/lib/igt_kms.h > @@ -206,6 +206,9 @@ struct igt_pipe { > #define IGT_MAX_PLANES 4 > int n_planes; > igt_plane_t planes[IGT_MAX_PLANES]; > + uint64_t background; /* Background color MSB BGR 16bpc LSB */ > + uint32_t background_changed : 1; > + uint32_t background_property; > }; > > typedef struct { > @@ -251,6 +254,7 @@
Re: [Intel-gfx] [PATCH v4 07/24] drm/i915: Create page table allocators
On 2/20/2015 4:50 PM, Mika Kuoppala wrote: Michel Thierry writes: From: Ben Widawsky As we move toward dynamic page table allocation, it becomes much easier to manage our data structures if break do things less coarsely by breaking up all of our actions into individual tasks. This makes the code easier to write, read, and verify. Aside from the dissection of the allocation functions, the patch statically allocates the page table structures without a page directory. This remains the same for all platforms, The patch itself should not have much functional difference. The primary noticeable difference is the fact that page tables are no longer allocated, but rather statically declared as part of the page directory. This has non-zero overhead, but things gain non-trivial complexity as a result. This patch exists for a few reasons: 1. Splitting out the functions allows easily combining GEN6 and GEN8 code. Page tables have no difference based on GEN8. As we'll see in a future patch when we add the DMA mappings to the allocations, it requires only one small change to make work, and error handling should just fall into place. 2. Unless we always want to allocate all page tables under a given PDE, we'll have to eventually break this up into an array of pointers (or pointer to pointer). 3. Having the discrete functions is easier to review, and understand. All allocations and frees now take place in just a couple of locations. Reviewing, and catching leaks should be easy. 4. Less important: the GFP flags are confined to one location, which makes playing around with such things trivial. v2: Updated commit message to explain why this patch exists v3: For lrc, s/pdp.page_directory[i].daddr/pdp.page_directory[i]->daddr/ v4: Renamed free_pt/pd_single functions to unmap_and_free_pt/pd (Daniel) v5: Added additional safety checks in gen8 clear/free/unmap. 
Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v3, v4, v5) --- drivers/gpu/drm/i915/i915_gem_gtt.c | 251 drivers/gpu/drm/i915/i915_gem_gtt.h | 4 +- drivers/gpu/drm/i915/intel_lrc.c| 16 +-- 3 files changed, 179 insertions(+), 92 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 0fe5c1e..85ea535 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -275,6 +275,99 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr, return pte; } +static void unmap_and_free_pt(struct i915_page_table_entry *pt) +{ + if (WARN_ON(!pt->page)) + return; + __free_page(pt->page); + kfree(pt); +} + +static struct i915_page_table_entry *alloc_pt_single(void) +{ + struct i915_page_table_entry *pt; + + pt = kzalloc(sizeof(*pt), GFP_KERNEL); + if (!pt) + return ERR_PTR(-ENOMEM); + + pt->page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!pt->page) { + kfree(pt); + return ERR_PTR(-ENOMEM); + } + + return pt; +} + +/** + * alloc_pt_range() - Allocate a multiple page tables + * @pd:The page directory which will have at least @count entries + * available to point to the allocated page tables. + * @pde: First page directory entry for which we are allocating. + * @count: Number of pages to allocate. + * + * Allocates multiple page table pages and sets the appropriate entries in the + * page table structure within the page directory. Function cleans up after + * itself on any failures. + * + * Return: 0 if allocation succeeded. + */ +static int alloc_pt_range(struct i915_page_directory_entry *pd, uint16_t pde, size_t count) +{ + int i, ret; + + /* 512 is the max page tables per page_directory on any platform. +* TODO: make WARN after patch series is done +*/ + BUG_ON(pde + count > GEN6_PPGTT_PD_ENTRIES); + WARN_ON in here and return -EINVAL. -Mika I applied the changes in v6. Thanks for the review. 
-Michel
[Intel-gfx] [PATCH v5 07/32] drm/i915: Track page table reload need
From: Ben Widawsky

This patch was formerly known as "Force pd restore when PDEs change, gen6-7." I had to change the name because it is needed for GEN8 too.

The real issue this is trying to solve is when a new object is mapped into the current address space. The GPU does not snoop the new mapping, so we must do the gen-specific action to reload the page tables.

GEN8 and GEN7 do differ in the way they load page tables for the RCS. GEN8 does so with the context restore, while GEN7 requires the proper load commands in the command streamer. Non-render is similar for both.

Caveat for GEN7:
The docs say you cannot change the PDEs of a currently running context. We never map new PDEs of a running context, and expect them to be present - so I think this is okay. (We can unmap, but this should also be okay since we only unmap unreferenced objects that the GPU shouldn't be trying to va->pa xlate.) The MI_SET_CONTEXT command does have a flag to signal that even if the context is the same, force a reload. It's unclear exactly what this does, but I have a hunch it's the right thing to do.

The logic assumes that we always emit a context switch after mapping new PDEs, and before we submit a batch. This is the case today, and has been the case since the inception of hardware contexts. A note in the comment lets the user know.

It's not just for gen8. If the current context has mappings change, we need a context reload to switch.

v2: Rebased after ppgtt clean up patches. Split the warning for aliasing and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt is always null.

v3: Invalidate PPGTT TLBs inside alloc_va_range.

v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when neither ctx->ppgtt nor aliasing_ppgtt exist.

v5: Removed references to teardown_va_range.

Signed-off-by: Ben Widawsky
Signed-off-by: Michel Thierry (v2+)
---
 drivers/gpu/drm/i915/i915_gem_context.c    | 29 -
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 +++
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 11 +++
 drivers/gpu/drm/i915/i915_gem_gtt.h        |  1 +
 4 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 6206d27..437cdcc 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -569,8 +569,20 @@ static inline bool should_skip_switch(struct intel_engine_cs *ring,
 				      struct intel_context *from,
 				      struct intel_context *to)
 {
-	if (from == to && !to->remap_slice)
-		return true;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+
+	if (to->remap_slice)
+		return false;
+
+	if (to->ppgtt) {
+		if (from == to && !test_bit(ring->id,
+				&to->ppgtt->pd_dirty_rings))
+			return true;
+	} else if (dev_priv->mm.aliasing_ppgtt) {
+		if (from == to && !test_bit(ring->id,
+				&dev_priv->mm.aliasing_ppgtt->pd_dirty_rings))
+			return true;
+	}
 
 	return false;
 }
@@ -587,9 +599,8 @@ needs_pd_load_pre(struct intel_engine_cs *ring, struct intel_context *to)
 static bool
 needs_pd_load_post(struct intel_engine_cs *ring, struct intel_context *to)
 {
-	return (!to->legacy_hw_ctx.initialized ||
-		i915_gem_context_is_default(to)) &&
-		to->ppgtt && IS_GEN8(ring->dev);
+	return IS_GEN8(ring->dev) &&
+		(to->ppgtt || &to->ppgtt->pd_dirty_rings);
 }
 
 static int do_switch(struct intel_engine_cs *ring,
@@ -634,6 +645,12 @@ static int do_switch(struct intel_engine_cs *ring,
 		ret = to->ppgtt->switch_mm(to->ppgtt, ring);
 		if (ret)
 			goto unpin_out;
+
+		/* Doing a PD load always reloads the page dirs */
+		if (to->ppgtt)
+			clear_bit(ring->id, &to->ppgtt->pd_dirty_rings);
+		else
+			clear_bit(ring->id, &dev_priv->mm.aliasing_ppgtt->pd_dirty_rings);
 	}
 
 	if (ring != &dev_priv->ring[RCS]) {
@@ -672,6 +689,8 @@ static int do_switch(struct intel_engine_cs *ring,
 	 */
 	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
+	else if (to->ppgtt && test_and_clear_bit(ring->id, &to->ppgtt->pd_dirty_rings))
+		hw_flags |= MI_FORCE_RESTORE;
 
 	ret = mi_set_context(ring, to, hw_flags);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b773368..1961107 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/dr
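
For readers following the "Track page table reload need" patch, the per-ring dirty tracking can be modelled outside the kernel. The following is only a sketch under stated assumptions: `ppgtt_model`, `ctx_model`, `NUM_RINGS` and the helper names other than `should_skip_switch` and `mark_tlbs_dirty` are hypothetical, plain bit operations stand in for the kernel's `test_bit()`/`clear_bit()`, and the aliasing PPGTT case is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_RINGS 5 /* hypothetical ring count for this sketch */

struct ppgtt_model {
	unsigned long pd_dirty_rings; /* one bit per ring, set when PDEs changed */
};

struct ctx_model {
	struct ppgtt_model *ppgtt;
	bool remap_slice;
};

/* Called after new page tables are mapped: every ring must reload. */
static void mark_tlbs_dirty(struct ppgtt_model *ppgtt)
{
	ppgtt->pd_dirty_rings = (1UL << NUM_RINGS) - 1;
}

static bool ring_is_dirty(const struct ppgtt_model *ppgtt, unsigned int ring_id)
{
	assert(ring_id < NUM_RINGS);
	return (ppgtt->pd_dirty_rings & (1UL << ring_id)) != 0;
}

/* Called once the page directory load has been emitted on this ring. */
static void clear_ring_dirty(struct ppgtt_model *ppgtt, unsigned int ring_id)
{
	assert(ring_id < NUM_RINGS);
	ppgtt->pd_dirty_rings &= ~(1UL << ring_id);
}

/*
 * A switch may be skipped only when the context is unchanged, no slice remap
 * is pending, and this ring has already reloaded the current page directories.
 */
static bool should_skip_switch(unsigned int ring_id,
			       const struct ctx_model *from,
			       const struct ctx_model *to)
{
	if (to->remap_slice)
		return false;
	if (to->ppgtt == NULL)
		return false; /* aliasing PPGTT is not modelled here */
	return from == to && !ring_is_dirty(to->ppgtt, ring_id);
}
```

The point of keeping one bit per ring is that a reload emitted on one ring says nothing about the others, so each ring clears only its own bit.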
[Intel-gfx] [PATCH v5 06/32] drm/i915: Extract context switch skip and pd load logic
From: Ben Widawsky

We have some fanciness coming up. This patch just breaks out the logic of context switch skip, pd load pre, and pd load post.

v2: Use new functions to replace the logic right away (Daniel)

Cc: Daniel Vetter
Signed-off-by: Ben Widawsky
Signed-off-by: Michel Thierry (v2)
---
 drivers/gpu/drm/i915/i915_gem_context.c | 40 +
 1 file changed, 31 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 755b415..6206d27 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -565,6 +565,33 @@ mi_set_context(struct intel_engine_cs *ring,
 	return ret;
 }
 
+static inline bool should_skip_switch(struct intel_engine_cs *ring,
+				      struct intel_context *from,
+				      struct intel_context *to)
+{
+	if (from == to && !to->remap_slice)
+		return true;
+
+	return false;
+}
+
+static bool
+needs_pd_load_pre(struct intel_engine_cs *ring, struct intel_context *to)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+
+	return ((INTEL_INFO(ring->dev)->gen < 8) ||
+		(ring != &dev_priv->ring[RCS])) && to->ppgtt;
+}
+
+static bool
+needs_pd_load_post(struct intel_engine_cs *ring, struct intel_context *to)
+{
+	return (!to->legacy_hw_ctx.initialized ||
+		i915_gem_context_is_default(to)) &&
+		to->ppgtt && IS_GEN8(ring->dev);
+}
+
 static int do_switch(struct intel_engine_cs *ring,
 		     struct intel_context *to)
 {
@@ -573,9 +600,6 @@ static int do_switch(struct intel_engine_cs *ring,
 	u32 hw_flags = 0;
 	bool uninitialized = false;
 	struct i915_vma *vma;
-	bool needs_pd_load_pre = ((INTEL_INFO(ring->dev)->gen < 8) ||
-			(ring != &dev_priv->ring[RCS])) && to->ppgtt;
-	bool needs_pd_load_post = false;
 	int ret, i;
 
 	if (from != NULL && ring == &dev_priv->ring[RCS]) {
@@ -583,7 +607,7 @@ static int do_switch(struct intel_engine_cs *ring,
 		BUG_ON(!i915_gem_obj_is_pinned(from->legacy_hw_ctx.rcs_state));
 	}
 
-	if (from == to && !to->remap_slice)
+	if (should_skip_switch(ring, from, to))
 		return 0;
 
 	/* Trying to pin first makes error handling easier. */
@@ -601,7 +625,7 @@ static int do_switch(struct intel_engine_cs *ring,
 	 */
 	from = ring->last_context;
 
-	if (needs_pd_load_pre) {
+	if (needs_pd_load_pre(ring, to)) {
 		/* Older GENs and non render rings still want the load first,
 		 * "PP_DCLV followed by PP_DIR_BASE register through Load
 		 * Register Immediate commands in Ring Buffer before submitting
@@ -646,16 +670,14 @@ static int do_switch(struct intel_engine_cs *ring,
 	 * XXX: If we implemented page directory eviction code, this
 	 * optimization needs to be removed.
 	 */
-	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) {
+	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
 		hw_flags |= MI_RESTORE_INHIBIT;
-		needs_pd_load_post = to->ppgtt && IS_GEN8(ring->dev);
-	}
 
 	ret = mi_set_context(ring, to, hw_flags);
 	if (ret)
 		goto unpin_out;
 
-	if (needs_pd_load_post) {
+	if (needs_pd_load_post(ring, to)) {
 		ret = to->ppgtt->switch_mm(to->ppgtt, ring);
 		/* The hardware context switch is emitted, but we haven't
 		 * actually changed the state - so it's probably safe to bail
-- 
2.1.1

[Intel-gfx] [PATCH v5 14/32] drm/i915/bdw: Update pdp switch and point unused PDPs to scratch page
From: Ben Widawsky

One important part of this patch is we now write a scratch page directory into any unused PDP descriptors. This matters for two reasons: first, we're not allowed to just use 0, or an invalid pointer, and second, we must wipe out any previous contents from the last context.

The latter point only matters with full PPGTT. The former point only affects platforms with less than 4GB memory.

v2: Updated commit message to point out that we must set unused PDPs to the scratch page.

Signed-off-by: Ben Widawsky
Signed-off-by: Michel Thierry (v2)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 29 ++---
 drivers/gpu/drm/i915/i915_gem_gtt.h |  5 -
 2 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a359f62..079a742 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -442,8 +442,9 @@ static struct i915_page_directory_entry *alloc_pd_single(void)
 }
 
 /* Broadwell Page Directory Pointer Descriptors */
-static int gen8_write_pdp(struct intel_engine_cs *ring, unsigned entry,
-			   uint64_t val)
+static int gen8_write_pdp(struct intel_engine_cs *ring,
+			  unsigned entry,
+			  dma_addr_t addr)
 {
 	int ret;
 
@@ -455,10 +456,10 @@ static int gen8_write_pdp(struct intel_engine_cs *ring, unsigned entry,
 
 	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
 	intel_ring_emit(ring, GEN8_RING_PDP_UDW(ring, entry));
-	intel_ring_emit(ring, (u32)(val >> 32));
+	intel_ring_emit(ring, upper_32_bits(addr));
 	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
 	intel_ring_emit(ring, GEN8_RING_PDP_LDW(ring, entry));
-	intel_ring_emit(ring, (u32)(val));
+	intel_ring_emit(ring, lower_32_bits(addr));
 	intel_ring_advance(ring);
 
 	return 0;
@@ -469,12 +470,12 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
 {
 	int i, ret;
 
-	/* bit of a hack to find the actual last used pd */
-	int used_pd = ppgtt->num_pd_entries / GEN8_PDES_PER_PAGE;
-
-	for (i = used_pd - 1; i >= 0; i--) {
-		dma_addr_t addr = ppgtt->pdp.page_directory[i]->daddr;
-		ret = gen8_write_pdp(ring, i, addr);
+	for (i = GEN8_LEGACY_PDPES - 1; i >= 0; i--) {
+		struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[i];
+		dma_addr_t pd_daddr = pd ? pd->daddr : ppgtt->scratch_pd->daddr;
+		/* The page directory might be NULL, but we need to clear out
+		 * whatever the previous context might have used. */
+		ret = gen8_write_pdp(ring, i, pd_daddr);
 		if (ret)
 			return ret;
 	}
@@ -816,10 +817,16 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 	ppgtt->base.start = 0;
 	ppgtt->base.total = size;
 
+	ppgtt->scratch_pd = alloc_pt_scratch(ppgtt->base.dev);
+	if (IS_ERR(ppgtt->scratch_pd))
+		return PTR_ERR(ppgtt->scratch_pd);
+
 	/* 1. Do all our allocations for page directories and page tables. */
 	ret = gen8_ppgtt_alloc(ppgtt, ppgtt->base.start, ppgtt->base.total);
-	if (ret)
+	if (ret) {
+		unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev);
 		return ret;
+	}
 
 	/*
 	 * 2. Create DMA mappings for the page directories and page tables.
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 70ce50d..f7d2af5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -306,7 +306,10 @@ struct i915_hw_ppgtt {
 		struct i915_page_directory_entry pd;
 	};
 
-	struct i915_page_table_entry *scratch_pt;
+	union {
+		struct i915_page_table_entry *scratch_pt;
+		struct i915_page_table_entry *scratch_pd; /* Just need the daddr */
+	};
 
 	struct drm_i915_file_private *file_priv;
-- 
2.1.1

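
The scratch fallback described in the "point unused PDPs to scratch page" patch can be illustrated in isolation. This is a minimal userspace sketch, not the driver code: `pd_model`, `fill_pdp_regs`, `hi32` and `lo32` are hypothetical names (the last two stand in for the kernel's `upper_32_bits()`/`lower_32_bits()`), and a plain array replaces the ring emission.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LEGACY_PDPES 4 /* four PDP descriptors in the legacy 32b layout */

struct pd_model {
	uint64_t daddr; /* DMA address of the page directory */
};

/* Minimal stand-ins for upper_32_bits()/lower_32_bits(). */
static uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }
static uint32_t lo32(uint64_t v) { return (uint32_t)v; }

/*
 * Fill regs[2*i] (upper dword) and regs[2*i + 1] (lower dword) for each PDP.
 * A NULL page directory is replaced by the scratch address, so stale values
 * from a previous context are always overwritten with something valid.
 */
static void fill_pdp_regs(struct pd_model *const pds[LEGACY_PDPES],
			  const struct pd_model *scratch,
			  uint32_t regs[2 * LEGACY_PDPES])
{
	assert(scratch != NULL);

	for (int i = LEGACY_PDPES - 1; i >= 0; i--) {
		uint64_t daddr = pds[i] ? pds[i]->daddr : scratch->daddr;

		regs[2 * i] = hi32(daddr);
		regs[2 * i + 1] = lo32(daddr);
	}
}
```

Writing all four descriptors unconditionally, rather than only the used ones, is what removes the need for the old "last used pd" calculation.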
[Intel-gfx] [PATCH v5 12/32] drm/i915/bdw: page directories rework allocation
From: Ben Widawsky

Start using gen8_for_each_pdpe macro to allocate the page directories.

v2: Rebased after s/free_pt_*/unmap_and_free_pt/ change.
v3: Rebased after teardown va range logic was removed.

Signed-off-by: Ben Widawsky
Signed-off-by: Michel Thierry (v2+)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 43 ++---
 1 file changed, 30 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0289176..2d7359e 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -681,25 +681,39 @@ unwind_out:
 	return -ENOMEM;
 }
 
-static int gen8_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt,
-						const int max_pdp)
+static int gen8_ppgtt_alloc_page_directories(struct i915_page_directory_pointer_entry *pdp,
+				     uint64_t start,
+				     uint64_t length)
 {
-	int i;
-
-	for (i = 0; i < max_pdp; i++) {
-		ppgtt->pdp.page_directory[i] = alloc_pd_single();
-		if (IS_ERR(ppgtt->pdp.page_directory[i]))
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(pdp, struct i915_hw_ppgtt, pdp);
+	struct i915_page_directory_entry *unused;
+	uint64_t temp;
+	uint32_t pdpe;
+
+	/* FIXME: PPGTT container_of won't work for 64b */
+	BUG_ON((start + length) > 0x8ULL);
+
+	gen8_for_each_pdpe(unused, pdp, start, length, temp, pdpe) {
+		BUG_ON(unused);
+		pdp->page_directory[pdpe] = alloc_pd_single();
+		if (IS_ERR(ppgtt->pdp.page_directory[pdpe]))
 			goto unwind_out;
+
+		ppgtt->num_pd_pages++;
 	}
 
-	ppgtt->num_pd_pages = max_pdp;
 	BUG_ON(ppgtt->num_pd_pages > GEN8_LEGACY_PDPES);
 
 	return 0;
 
 unwind_out:
-	while (i--)
-		unmap_and_free_pd(ppgtt->pdp.page_directory[i]);
+	while (pdpe--) {
+		unmap_and_free_pd(ppgtt->pdp.page_directory[pdpe]);
+		ppgtt->num_pd_pages--;
+	}
+
+	WARN_ON(ppgtt->num_pd_pages);
 
 	return -ENOMEM;
 }
@@ -709,7 +723,8 @@ static int gen8_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt,
 {
 	int ret;
 
-	ret = gen8_ppgtt_allocate_page_directories(ppgtt, max_pdp);
+	ret = gen8_ppgtt_alloc_page_directories(&ppgtt->pdp, ppgtt->base.start,
+					ppgtt->base.total);
 	if (ret)
 		return ret;
 
@@ -785,6 +800,10 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 	if (size % (1<<30))
 		DRM_INFO("Pages will be wasted unless GTT size (%llu) is divisible by 1GB\n", size);
 
+	ppgtt->base.start = 0;
+	ppgtt->base.total = size;
+	BUG_ON(ppgtt->base.total == 0);
+
 	/* 1. Do all our allocations for page directories and page tables. */
 	ret = gen8_ppgtt_alloc(ppgtt, max_pdp);
 	if (ret)
@@ -832,8 +851,6 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 	ppgtt->base.clear_range = gen8_ppgtt_clear_range;
 	ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
 	ppgtt->base.cleanup = gen8_ppgtt_cleanup;
-	ppgtt->base.start = 0;
-	ppgtt->base.total = ppgtt->num_pd_entries * GEN8_PTES_PER_PAGE * PAGE_SIZE;
 
 	ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true);
 
-- 
2.1.1

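
The range-based allocation above relies on turning a (start, length) pair into page directory pointer indices. As a rough illustration only, the following sketch shows that arithmetic for the legacy 32b layout. The shift values are assumptions derived from 4 KiB pages with 512 entries per page table and per page directory (so one page directory spans 1 GiB, consistent with the "divisible by 1GB" message in the patch); all macro and function names here are hypothetical and this is not the gen8_for_each_pdpe macro itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 4 KiB pages, 512 PTEs per table, 512 PDEs per directory. */
#define PTE_SHIFT   12
#define PDE_SHIFT   21
#define PDPE_SHIFT  30
#define INDEX_MASK  0x1ffu /* 9 index bits per level */
#define PDPE_MASK   0x3u   /* 4 PDPs in the legacy 32b layout */

static uint32_t pte_index(uint64_t addr)
{
	return (uint32_t)((addr >> PTE_SHIFT) & INDEX_MASK);
}

static uint32_t pde_index(uint64_t addr)
{
	return (uint32_t)((addr >> PDE_SHIFT) & INDEX_MASK);
}

static uint32_t pdpe_index(uint64_t addr)
{
	return (uint32_t)((addr >> PDPE_SHIFT) & PDPE_MASK);
}

/* Number of page directories needed to cover [start, start + length). */
static uint32_t pdpes_spanned(uint64_t start, uint64_t length)
{
	assert(length > 0);

	uint64_t first = start >> PDPE_SHIFT;
	uint64_t last = (start + length - 1) >> PDPE_SHIFT;

	return (uint32_t)(last - first + 1);
}
```

For a 4 GiB address space starting at 0 this yields four page directories, and a range that is one byte over 1 GiB already needs a second one.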
[Intel-gfx] [PATCH v5 05/32] drm/i915: Track GEN6 page table usage
From: Ben Widawsky

Instead of implementing the full tracking + dynamic allocation, this patch does a bit less than half of the work, by tracking and warning on unexpected conditions. The tracking itself follows which PTEs within a page table are currently being used for objects. The next patch will modify this to actually allocate the page tables only when necessary.

With the current patch there isn't much in the way of making a gen agnostic range allocation function. However, in the next patch we'll add more specificity which makes having separate functions a bit easier to manage.

One important change introduced here is that DMA mappings are created/destroyed at the same time page directories/tables are allocated/deallocated.

Notice that aliasing PPGTT is not managed here. The patch which actually begins dynamic allocation/teardown explains the reasoning for this.

v2: s/pdp.page_directory/pdp.page_directorys
Make a scratch page allocation helper

v3: Rebase and expand commit message.

v4: Allocate required pagetables only when it is needed, _bind_to_vm instead of bind_vma (Daniel).

v5: Rebased to remove the unnecessary noise in the diff, also:
 - PDE mask is GEN agnostic, renamed GEN6_PDE_MASK to I915_PDE_MASK.
 - Removed unnecessary checks in gen6_alloc_va_range.
 - Changed map/unmap_px_single macros to use dma functions directly and be part of a static inline function instead.
 - Moved drm_device plumbing through page tables operation to its own patch.
 - Moved allocate/teardown_va_range calls until they are fully implemented (in subsequent patch).
 - Merged pt and scratch_pt unmap_and_free path.
 - Moved scratch page allocator helper to the patch that will use it.

v6: Reduce complexity by not tearing down pagetables dynamically, the same can be achieved while freeing empty vms. (Daniel)

v7: s/i915_dma_map_px_single/i915_dma_map_single
s/gen6_write_pdes/gen6_write_pde
Prevent a NULL case when only GGTT is available. (Mika)

Cc: Daniel Vetter
Cc: Mika Kuoppala
Signed-off-by: Ben Widawsky
Signed-off-by: Michel Thierry (v3+)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 198 +---
 drivers/gpu/drm/i915/i915_gem_gtt.h |  75 ++
 2 files changed, 211 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 65a506c..5ee92ce 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -278,29 +278,88 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr,
 	return pte;
 }
 
-static void unmap_and_free_pt(struct i915_page_table_entry *pt, struct drm_device *dev)
+#define i915_dma_unmap_single(px, dev) \
+	__i915_dma_unmap_single((px)->daddr, dev)
+
+static inline void __i915_dma_unmap_single(dma_addr_t daddr,
+					struct drm_device *dev)
+{
+	struct device *device = &dev->pdev->dev;
+
+	dma_unmap_page(device, daddr, 4096, PCI_DMA_BIDIRECTIONAL);
+}
+
+/**
+ * i915_dma_map_single() - Create a dma mapping for a page table/dir/etc.
+ * @px:	Page table/dir/etc to get a DMA map for
+ * @dev:	drm device
+ *
+ * Page table allocations are unified across all gens. They always require a
+ * single 4k allocation, as well as a DMA mapping. If we keep the structs
+ * symmetric here, the simple macro covers us for every page table type.
+ *
+ * Return: 0 if success.
+ */
+#define i915_dma_map_single(px, dev) \
+	i915_dma_map_page_single((px)->page, (dev), &(px)->daddr)
+
+static inline int i915_dma_map_page_single(struct page *page,
+					   struct drm_device *dev,
+					   dma_addr_t *daddr)
+{
+	struct device *device = &dev->pdev->dev;
+
+	*daddr = dma_map_page(device, page, 0, 4096, PCI_DMA_BIDIRECTIONAL);
+	return dma_mapping_error(device, *daddr);
+}
+
+static void unmap_and_free_pt(struct i915_page_table_entry *pt,
+			      struct drm_device *dev)
 {
 	if (WARN_ON(!pt->page))
 		return;
+
+	i915_dma_unmap_single(pt, dev);
 	__free_page(pt->page);
+	kfree(pt->used_ptes);
 	kfree(pt);
 }
 
 static struct i915_page_table_entry *alloc_pt_single(struct drm_device *dev)
 {
 	struct i915_page_table_entry *pt;
+	const size_t count = INTEL_INFO(dev)->gen >= 8 ?
+		GEN8_PTES_PER_PAGE : I915_PPGTT_PT_ENTRIES;
+	int ret = -ENOMEM;
 
 	pt = kzalloc(sizeof(*pt), GFP_KERNEL);
 	if (!pt)
 		return ERR_PTR(-ENOMEM);
 
+	pt->used_ptes = kcalloc(BITS_TO_LONGS(count), sizeof(*pt->used_ptes),
+				GFP_KERNEL);
+
+	if (!pt->used_ptes)
+		goto fail_bitmap;
+
 	pt->page = alloc_page(GFP_KERNEL | __GFP_ZERO);
-	if (!pt->page) {
-		kfree(pt);
-		return ERR_PTR(-ENOMEM);
-	}
+	if (!pt->page)
+
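
The used_ptes tracking introduced in this patch is a bitmap with one bit per PTE. A standalone sketch of the idea follows; it assumes nothing about the driver beyond what the commit message states. `pt_model` and all `pt_*` helpers are hypothetical names, and `calloc`/`free` stand in for the kernel allocators.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define WORDS_FOR(n)  (((n) + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct pt_model {
	size_t count;             /* number of PTEs in this table */
	unsigned long *used_ptes; /* one bit per PTE */
};

/* Allocate the tracking structure; returns NULL on failure, leaking nothing. */
static struct pt_model *pt_alloc(size_t count)
{
	struct pt_model *pt = calloc(1, sizeof(*pt));

	if (!pt)
		return NULL;

	pt->used_ptes = calloc(WORDS_FOR(count), sizeof(*pt->used_ptes));
	if (!pt->used_ptes) {
		free(pt);
		return NULL;
	}

	pt->count = count;
	return pt;
}

static void pt_free(struct pt_model *pt)
{
	if (!pt)
		return;
	free(pt->used_ptes);
	free(pt);
}

/* Mark PTEs [first, first + n) as in use. */
static void pt_mark_used(struct pt_model *pt, size_t first, size_t n)
{
	assert(first + n <= pt->count);

	for (size_t i = first; i < first + n; i++)
		pt->used_ptes[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
}

static bool pt_is_used(const struct pt_model *pt, size_t i)
{
	assert(i < pt->count);
	return ((pt->used_ptes[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL) != 0;
}

/* A table with no bits set is a candidate for being freed. */
static bool pt_is_empty(const struct pt_model *pt)
{
	for (size_t w = 0; w < WORDS_FOR(pt->count); w++)
		if (pt->used_ptes[w])
			return false;
	return true;
}
```

Tracking at PTE granularity is what later lets an allocator decide whether a whole page table is still needed without walking the table contents.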
[Intel-gfx] [PATCH v5 00/32] PPGTT dynamic page allocations and 48b addressing
This patchset starts addressing comments from v4 by Mika, and also has been rebased on top of nightly.

For GEN8, it has also been extended to work in logical ring submission (lrc) mode, as it will be the preferred mode of operation. I also tried to update the lrc code at the same time the ppgtt refactoring occurred, leaving only one patch that is exclusively for lrc.

I'm also now including the required patches for PPGTT with 48b addressing. In order to expand the GPU address space, a 4th level translation is added, the Page Map Level 4 (PML4). This PML4 has 256 PML4 Entries (PML4E), PML4[0-255], each pointing to a PDP.

For now, this feature will only be available in BDW, in LRC submission mode (execlists) and when i915.enable_ppgtt=3 is set. Also note that this expanded address space is only available for full PPGTT; aliasing PPGTT remains 32b.

This list can be seen in 3 parts:
[01-10] Add page table allocation for GEN6/GEN7
[11-20] Enable dynamic allocation in GEN8, for both legacy and execlist submission modes.
[21-32] PML4 support in BDW.

Ben Widawsky (26):
  drm/i915: page table abstractions
  drm/i915: Complete page table structures
  drm/i915: Create page table allocators
  drm/i915: Track GEN6 page table usage
  drm/i915: Extract context switch skip and pd load logic
  drm/i915: Track page table reload need
  drm/i915: Initialize all contexts
  drm/i915: Finish gen6/7 dynamic page table allocation
  drm/i915/bdw: Use dynamic allocation idioms on free
  drm/i915/bdw: page directories rework allocation
  drm/i915/bdw: pagetable allocation rework
  drm/i915/bdw: Update pdp switch and point unused PDPs to scratch page
  drm/i915: num_pd_pages/num_pd_entries isn't useful
  drm/i915: Extract PPGTT param from page_directory alloc
  drm/i915/bdw: Split out mappings
  drm/i915/bdw: begin bitmap tracking
  drm/i915/bdw: Dynamic page table allocations
  drm/i915/bdw: Make pdp allocation more dynamic
  drm/i915/bdw: Abstract PDP usage
  drm/i915/bdw: Add dynamic page trace events
  drm/i915/bdw: Add ppgtt info for dynamic pages
  drm/i915/bdw: implement alloc/free for 4lvl
  drm/i915/bdw: Add 4 level switching infrastructure
  drm/i915/bdw: Generalize PTE writing for GEN8 PPGTT
  drm/i915: Plumb sg_iter through va allocation ->maps
  drm/i915: Expand error state's address width to 64b

Michel Thierry (6):
  drm/i915: Plumb drm_device through page tables operations
  drm/i915: Add dynamic page trace events
  drm/i915/bdw: Support dynamic pdp updates in lrc mode
  drm/i915/bdw: Support 64 bit PPGTT in lrc mode
  drm/i915/bdw: Add 4 level support in insert_entries and clear_range
  drm/i915/bdw: Flip the 48b switch

 drivers/gpu/drm/i915/i915_debugfs.c        |   26 +-
 drivers/gpu/drm/i915/i915_drv.h            |   11 +-
 drivers/gpu/drm/i915/i915_gem.c            |   11 +
 drivers/gpu/drm/i915/i915_gem_context.c    |   64 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   11 +
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 1534 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.h        |  248 -
 drivers/gpu/drm/i915/i915_gpu_error.c      |   17 +-
 drivers/gpu/drm/i915/i915_params.c         |    2 +-
 drivers/gpu/drm/i915/i915_reg.h            |    1 +
 drivers/gpu/drm/i915/i915_trace.h          |  111 ++
 drivers/gpu/drm/i915/intel_lrc.c           |  149 ++-
 12 files changed, 1786 insertions(+), 399 deletions(-)

-- 
2.1.1

[Intel-gfx] [PATCH v5 13/32] drm/i915/bdw: pagetable allocation rework
From: Ben Widawsky

Start using gen8_for_each_pde macro to allocate page tables.

v2: teardown_va_range references removed.

Signed-off-by: Ben Widawsky
Signed-off-by: Michel Thierry (v2)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 46 +++--
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2d7359e..a359f62 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -661,22 +661,27 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 	gen8_ppgtt_free(ppgtt);
 }
 
-static int gen8_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt)
+static int gen8_ppgtt_alloc_pagetabs(struct i915_page_directory_entry *pd,
+				     uint64_t start,
+				     uint64_t length,
+				     struct drm_device *dev)
 {
-	int i, ret;
+	struct i915_page_table_entry *unused;
+	uint64_t temp;
+	uint32_t pde;
 
-	for (i = 0; i < ppgtt->num_pd_pages; i++) {
-		ret = alloc_pt_range(ppgtt->pdp.page_directory[i],
-				     0, GEN8_PDES_PER_PAGE, ppgtt->base.dev);
-		if (ret)
+	gen8_for_each_pde(unused, pd, start, length, temp, pde) {
+		BUG_ON(unused);
+		pd->page_tables[pde] = alloc_pt_single(dev);
+		if (IS_ERR(pd->page_tables[pde]))
 			goto unwind_out;
 	}
 
 	return 0;
 
 unwind_out:
-	while (i--)
-		gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
+	while (pde--)
+		unmap_and_free_pt(pd->page_tables[pde], dev);
 
 	return -ENOMEM;
 }
@@ -719,20 +724,28 @@ unwind_out:
 }
 
 static int gen8_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt,
-			    const int max_pdp)
+			    uint64_t start,
+			    uint64_t length)
 {
+	struct i915_page_directory_entry *pd;
+	uint64_t temp;
+	uint32_t pdpe;
 	int ret;
 
-	ret = gen8_ppgtt_alloc_page_directories(&ppgtt->pdp, ppgtt->base.start,
-					ppgtt->base.total);
+	ret = gen8_ppgtt_alloc_page_directories(&ppgtt->pdp, start, length);
 	if (ret)
 		return ret;
 
-	ret = gen8_ppgtt_allocate_page_tables(ppgtt);
-	if (ret)
-		goto err_out;
+	gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
+		ret = gen8_ppgtt_alloc_pagetabs(pd, start, length,
+						ppgtt->base.dev);
+		if (ret)
+			goto err_out;
+
+		ppgtt->num_pd_entries += GEN8_PDES_PER_PAGE;
+	}
 
-	ppgtt->num_pd_entries = max_pdp * GEN8_PDES_PER_PAGE;
+	BUG_ON(pdpe > ppgtt->num_pd_pages);
 
 	return 0;
 
@@ -802,10 +815,9 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size)
 
 	ppgtt->base.start = 0;
 	ppgtt->base.total = size;
-	BUG_ON(ppgtt->base.total == 0);
 
 	/* 1. Do all our allocations for page directories and page tables. */
-	ret = gen8_ppgtt_alloc(ppgtt, ppgtt->base.start, ppgtt->base.total);
+	ret = gen8_ppgtt_alloc(ppgtt, ppgtt->base.start, ppgtt->base.total);
 	if (ret)
 		return ret;
 
-- 
2.1.1

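
The allocate-then-unwind shape used by gen8_ppgtt_alloc_pagetabs can be tried out in userspace. The sketch below is illustrative only: `pd_model`, `alloc_range`, `alloc_fn`, `alloc_budget` and `budget_alloc` are hypothetical names, and `malloc`/`free` replace the driver's page table allocator. One deliberate difference from the patch: the unwind loop here stops at `first` rather than at index 0, which only matters when a range does not begin at the start of the directory.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define ENTRIES 512 /* assumed number of page tables per page directory */

struct pd_model {
	void *page_tables[ENTRIES];
};

/* Hypothetical allocator hook so that failure can be injected. */
typedef void *(*alloc_fn)(void);

/* Failure-injection helper: succeeds alloc_budget times, then fails. */
static int alloc_budget;

static void *budget_alloc(void)
{
	if (alloc_budget <= 0)
		return NULL;
	alloc_budget--;
	return malloc(16);
}

/*
 * Populate slots [first, first + count). On failure, free only the slots this
 * call filled and reset them to NULL, so the directory looks untouched.
 */
static int alloc_range(struct pd_model *pd, size_t first, size_t count,
		       alloc_fn alloc_one)
{
	size_t i;

	assert(first + count <= ENTRIES);

	for (i = first; i < first + count; i++) {
		assert(pd->page_tables[i] == NULL); /* slot must be unused */
		pd->page_tables[i] = alloc_one();
		if (!pd->page_tables[i])
			goto unwind;
	}

	return 0;

unwind:
	/* i is the slot that failed; walk back over the ones that succeeded. */
	while (i-- > first) {
		free(pd->page_tables[i]);
		pd->page_tables[i] = NULL;
	}

	return -ENOMEM;
}
```

Resetting the freed slots to NULL keeps a later retry from tripping the "slot must be unused" check.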
[Intel-gfx] [PATCH v5 16/32] drm/i915: Extract PPGTT param from page_directory alloc
From: Ben Widawsky

Now that we don't need to trace num_pd_pages, we may as well kill all need for the PPGTT structure in the alloc_page_directorys. This is very useful for when we move to 48b addressing, and the PDP isn't the root of the page table structure.

The param is replaced with drm_device, which is an unavoidable wart throughout the series. (in other words, not extra flagrant).

Signed-off-by: Ben Widawsky
Signed-off-by: Michel Thierry
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 781b751..7849769 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -689,8 +689,6 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_page_directory_pointer_
 				     uint64_t start,
 				     uint64_t length)
 {
-	struct i915_hw_ppgtt *ppgtt =
-		container_of(pdp, struct i915_hw_ppgtt, pdp);
 	struct i915_page_directory_entry *unused;
 	uint64_t temp;
 	uint32_t pdpe;
@@ -701,7 +699,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_page_directory_pointer_
 	gen8_for_each_pdpe(unused, pdp, start, length, temp, pdpe) {
 		BUG_ON(unused);
 		pdp->page_directory[pdpe] = alloc_pd_single();
-		if (IS_ERR(ppgtt->pdp.page_directory[pdpe]))
+		if (IS_ERR(pdp->page_directory[pdpe]))
 			goto unwind_out;
 	}
 
@@ -709,7 +707,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_page_directory_pointer_
 
 unwind_out:
 	while (pdpe--)
-		unmap_and_free_pd(ppgtt->pdp.page_directory[pdpe]);
+		unmap_and_free_pd(pdp->page_directory[pdpe]);
 
 	return -ENOMEM;
 }
-- 
2.1.1

[Intel-gfx] [PATCH v5 27/32] drm/i915/bdw: Support 64 bit PPGTT in lrc mode
In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains the base address to PML4, while the other PDP registers are ignored. Also, the addressing mode must be specified in every context descriptor. Signed-off-by: Michel Thierry --- drivers/gpu/drm/i915/intel_lrc.c | 167 ++- 1 file changed, 114 insertions(+), 53 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index f461631..2b6d262 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -255,7 +255,8 @@ u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj) } static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring, -struct drm_i915_gem_object *ctx_obj) +struct drm_i915_gem_object *ctx_obj, +bool legacy_64bit_ctx) { struct drm_device *dev = ring->dev; uint64_t desc; @@ -264,7 +265,10 @@ static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring, WARN_ON(lrca & 0x0FFFULL); desc = GEN8_CTX_VALID; - desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT; + if (legacy_64bit_ctx) + desc |= LEGACY_64B_CONTEXT << GEN8_CTX_MODE_SHIFT; + else + desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT; desc |= GEN8_CTX_L3LLC_COHERENT; desc |= GEN8_CTX_PRIVILEGE; desc |= lrca; @@ -292,16 +296,17 @@ static void execlists_elsp_write(struct intel_engine_cs *ring, struct drm_i915_private *dev_priv = dev->dev_private; uint64_t temp = 0; uint32_t desc[4]; + bool legacy_64bit_ctx = USES_FULL_48BIT_PPGTT(dev); /* XXX: You must always write both descriptors in the order below. 
*/ if (ctx_obj1) - temp = execlists_ctx_descriptor(ring, ctx_obj1); + temp = execlists_ctx_descriptor(ring, ctx_obj1, legacy_64bit_ctx); else temp = 0; desc[1] = (u32)(temp >> 32); desc[0] = (u32)temp; - temp = execlists_ctx_descriptor(ring, ctx_obj0); + temp = execlists_ctx_descriptor(ring, ctx_obj0, legacy_64bit_ctx); desc[3] = (u32)(temp >> 32); desc[2] = (u32)temp; @@ -332,37 +337,60 @@ static int execlists_update_context(struct drm_i915_gem_object *ctx_obj, reg_state[CTX_RING_TAIL+1] = tail; reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj); - /* True PPGTT with dynamic page allocation: update PDP registers and -* point the unallocated PDPs to the scratch page -*/ - if (ppgtt) { + if (ppgtt && USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) { + /* True 64b PPGTT (48bit canonical) +* PDP0_DESCRIPTOR contains the base address to PML4 and +* other PDP Descriptors are ignored +*/ + reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pml4.daddr); + reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pml4.daddr); + } else if (ppgtt) { + /* True 32b PPGTT with dynamic page allocation: update PDP +* registers and point the unallocated PDPs to the scratch page +*/ if (test_bit(3, ppgtt->pdp.used_pdpes)) { - reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[3]->daddr); - reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[3]->daddr); + reg_state[CTX_PDP3_UDW+1] = + upper_32_bits(ppgtt->pdp.page_directory[3]->daddr); + reg_state[CTX_PDP3_LDW+1] = + lower_32_bits(ppgtt->pdp.page_directory[3]->daddr); } else { - reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr); - reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr); + reg_state[CTX_PDP3_UDW+1] = + upper_32_bits(ppgtt->scratch_pd->daddr); + reg_state[CTX_PDP3_LDW+1] = + lower_32_bits(ppgtt->scratch_pd->daddr); } if (test_bit(2, ppgtt->pdp.used_pdpes)) { - reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[2]->daddr); - 
reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[2]->daddr); + reg_state[CTX_PDP2_UDW+1] = + upper_32_bits(ppgtt->pdp.page_directory[2]->daddr); + reg_state[CTX_PDP2_LDW+1] = + lower_32_bits(ppgtt->pdp.page_directory[2]->daddr); } else { -
[Intel-gfx] [PATCH v5 15/32] drm/i915: num_pd_pages/num_pd_entries isn't useful
From: Ben Widawsky These values are never quite useful for dynamic allocations of the page tables. Getting rid of them will help prevent later confusion. v2: Updated to use unmap_and_free_pd functions. v3: Updated gen8_ppgtt_free after teardown logic was removed. Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v2+) --- drivers/gpu/drm/i915/i915_debugfs.c | 2 -- drivers/gpu/drm/i915/i915_gem_gtt.c | 72 - drivers/gpu/drm/i915/i915_gem_gtt.h | 7 ++-- 3 files changed, 28 insertions(+), 53 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index e8ad450..e85da9d 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2149,8 +2149,6 @@ static void gen8_ppgtt_info(struct seq_file *m, struct drm_device *dev) if (!ppgtt) return; - seq_printf(m, "Page directories: %d\n", ppgtt->num_pd_pages); - seq_printf(m, "Page tables: %d\n", ppgtt->num_pd_entries); for_each_ring(ring, dev_priv, unused) { seq_printf(m, "%s\n", ring->name); for (i = 0; i < 4; i++) { diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 079a742..781b751 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -613,9 +613,7 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt) struct pci_dev *hwdev = ppgtt->base.dev->pdev; int i, j; - for (i = 0; i < ppgtt->num_pd_pages; i++) { - /* TODO: In the future we'll support sparse mappings, so this -* will have to change. 
*/ + for (i = 0; i < GEN8_LEGACY_PDPES; i++) { if (!ppgtt->pdp.page_directory[i]->daddr) continue; @@ -644,7 +642,7 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt) { int i; - for (i = 0; i < ppgtt->num_pd_pages; i++) { + for (i = 0; i < GEN8_LEGACY_PDPES; i++) { if (WARN_ON(!ppgtt->pdp.page_directory[i])) continue; @@ -705,21 +703,13 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_page_directory_pointer_ pdp->page_directory[pdpe] = alloc_pd_single(); if (IS_ERR(ppgtt->pdp.page_directory[pdpe])) goto unwind_out; - - ppgtt->num_pd_pages++; } - BUG_ON(ppgtt->num_pd_pages > GEN8_LEGACY_PDPES); - return 0; unwind_out: - while (pdpe--) { + while (pdpe--) unmap_and_free_pd(ppgtt->pdp.page_directory[pdpe]); - ppgtt->num_pd_pages--; - } - - WARN_ON(ppgtt->num_pd_pages); return -ENOMEM; } @@ -742,12 +732,8 @@ static int gen8_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt, ppgtt->base.dev); if (ret) goto err_out; - - ppgtt->num_pd_entries += GEN8_PDES_PER_PAGE; } - BUG_ON(pdpe > ppgtt->num_pd_pages); - return 0; err_out: @@ -808,7 +794,6 @@ static int gen8_ppgtt_setup_page_tables(struct i915_hw_ppgtt *ppgtt, static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size) { const int max_pdp = DIV_ROUND_UP(size, 1 << 30); - const int min_pt_pages = GEN8_PDES_PER_PAGE * max_pdp; int i, j, ret; if (size % (1<<30)) @@ -872,12 +857,6 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size) ppgtt->base.cleanup = gen8_ppgtt_cleanup; ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true); - - DRM_DEBUG_DRIVER("Allocated %d pages for page directories (%d wasted)\n", -ppgtt->num_pd_pages, ppgtt->num_pd_pages - max_pdp); - DRM_DEBUG_DRIVER("Allocated %d pages for page tables (%lld wasted)\n", -ppgtt->num_pd_entries, -(ppgtt->num_pd_entries - min_pt_pages) + size % (1<<30)); return 0; bail: @@ -888,26 +867,20 @@ bail: static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m) { - struct drm_i915_private *dev_priv = 
ppgtt->base.dev->dev_private; struct i915_address_space *vm = &ppgtt->base; - gen6_gtt_pte_t __iomem *pd_addr; + struct i915_page_table_entry *unused; gen6_gtt_pte_t scratch_pte; uint32_t pd_entry; - int pte, pde; + uint32_t pte, pde, temp; + uint32_t start = ppgtt->base.start, length = ppgtt->base.total; scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC, true, 0); - pd_addr = (gen6_gtt_pte_t __iomem *)dev_priv->gtt.gsm + - ppgtt->pd.pd_offset / sizeof(gen6_gtt_pte_t); - - seq_printf(m, " VM %p (pd_offset %x-%x):\n", vm, - ppgtt->pd.pd_offset, - ppgtt->pd.pd_offset + ppgtt->
[Intel-gfx] [PATCH v5 28/32] drm/i915/bdw: Generalize PTE writing for GEN8 PPGTT
From: Ben Widawsky The insert_entries function was the function used to write PTEs. For the PPGTT it was "hardcoded" to only understand two level page tables, which was the case for GEN7. We can reuse this for 4 level page tables, and remove the concept of insert_entries, which was never viable past 2 level page tables anyway, but it requires a bit of rework to make the function a bit more generic. This patch begins the generalization work, and it will be heavily used upon when the 48b code is complete. The patch series attempts to make each function which touches a part of code specific to the page table level and here is no exception. Having extra variables (such as the PPGTT) distracts and provides room to add bugs since the function shouldn't be touching anything in the higher order page tables. Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry --- drivers/gpu/drm/i915/i915_gem_gtt.c | 55 + 1 file changed, 38 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 2c3f2db..ad7e274 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -618,23 +618,19 @@ static int gen8_48b_mm_switch(struct i915_hw_ppgtt *ppgtt, return gen8_write_pdp(ring, 0, ppgtt->pml4.daddr); } -static void gen8_ppgtt_clear_range(struct i915_address_space *vm, - uint64_t start, - uint64_t length, - bool use_scratch) +static void gen8_ppgtt_clear_pte_range(struct i915_page_directory_pointer_entry *pdp, + uint64_t start, + uint64_t length, + gen8_gtt_pte_t scratch_pte, + const bool flush) { - struct i915_hw_ppgtt *ppgtt = - container_of(vm, struct i915_hw_ppgtt, base); - struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */ - gen8_gtt_pte_t *pt_vaddr, scratch_pte; + gen8_gtt_pte_t *pt_vaddr; unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK; unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK; unsigned pte = start >> GEN8_PTE_SHIFT & 
GEN8_PTE_MASK; unsigned num_entries = length >> PAGE_SHIFT; unsigned last_pte, i; - scratch_pte = gen8_pte_encode(ppgtt->base.scratch.addr, - I915_CACHE_LLC, use_scratch); while (num_entries) { struct i915_page_directory_entry *pd; @@ -667,7 +663,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm, num_entries--; } - if (!HAS_LLC(ppgtt->base.dev)) + if (flush) drm_clflush_virt_range(pt_vaddr, PAGE_SIZE); kunmap_atomic(pt_vaddr); @@ -679,14 +675,27 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm, } } -static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, - struct sg_table *pages, - uint64_t start, - enum i915_cache_level cache_level, u32 unused) +static void gen8_ppgtt_clear_range(struct i915_address_space *vm, + uint64_t start, + uint64_t length, + bool use_scratch) { struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base); struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */ + + gen8_gtt_pte_t scratch_pte = gen8_pte_encode(ppgtt->base.scratch.addr, +I915_CACHE_LLC, use_scratch); + + gen8_ppgtt_clear_pte_range(pdp, start, length, scratch_pte, !HAS_LLC(vm->dev)); +} + +static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_entry *pdp, + struct sg_table *pages, + uint64_t start, + enum i915_cache_level cache_level, + const bool flush) +{ gen8_gtt_pte_t *pt_vaddr; unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK; unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK; @@ -708,7 +717,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, gen8_pte_encode(sg_page_iter_dma_address(&sg_iter), cache_level, true); if (++pte == GEN8_PTES_PER_PAGE) { - if (!HAS_LLC(ppgtt->base.dev)) + if (flush) drm_clflush_virt_range(pt
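The index arithmetic this patch relies on (`start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK` and its PDE/PTE siblings) can be sketched in plain C as below. This is only an illustration: the shift and mask values are assumptions based on 4 KiB pages and 512 entries per level, not values quoted from the patch, and `struct va_index` and `split_va` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 4 KiB pages, 512 (2^9) entries per table level. */
#define PTE_SHIFT   12
#define PDE_SHIFT   21
#define PDPE_SHIFT  30
#define PML4E_SHIFT 39
#define INDEX_MASK  0x1ffu

/* Hypothetical container for the per-level indices of one address. */
struct va_index {
	unsigned pml4e;
	unsigned pdpe;
	unsigned pde;
	unsigned pte;
};

/* Split a GPU virtual address into one index per page table level. */
static struct va_index split_va(uint64_t va)
{
	struct va_index idx;

	idx.pml4e = (unsigned)((va >> PML4E_SHIFT) & INDEX_MASK);
	idx.pdpe  = (unsigned)((va >> PDPE_SHIFT) & INDEX_MASK);
	idx.pde   = (unsigned)((va >> PDE_SHIFT) & INDEX_MASK);
	idx.pte   = (unsigned)((va >> PTE_SHIFT) & INDEX_MASK);
	return idx;
}
```

In the legacy 32b layout only four PDPEs exist, so a narrower PDPE mask would presumably apply there; the sketch uses one mask for every level to keep the 4 level case readable.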
[Intel-gfx] [PATCH v5 09/32] drm/i915: Finish gen6/7 dynamic page table allocation
From: Ben Widawsky This patch continues on the idea from the previous patch. From here on, in the steady state, PDEs are all pointing to the scratch page table (as recommended in the spec). When an object is allocated in the VA range, the code will determine if we need to allocate a page for the page table. Similarly when the object is destroyed, we will remove, and free the page table pointing the PDE back to the scratch page. Following patches will work to unify the code a bit as we bring in GEN8 support. GEN6 and GEN8 are different enough that I had a hard time to get to this point with as much common code as I do. The aliasing PPGTT must pre-allocate all of the page tables. There are a few reasons for this. Two trivial ones: aliasing ppgtt goes through the ggtt paths, so it's hard to maintain, we currently do not restore the default context (assuming the previous force reload is indeed necessary). Most importantly though, the only way (it seems from empirical evidence) to invalidate the CS TLBs on non-render ring is to either use ring sync (which requires actually stopping the rings in order to synchronize when the sync completes vs. where you are in execution), or to reload DCLV. Since without full PPGTT we do not ever reload the DCLV register, there is no good way to achieve this. The simplest solution is just to not support dynamic page table creation/destruction in the aliasing PPGTT. We could always reload DCLV, but this seems like quite a bit of excess overhead only to save at most 2MB-4k of memory for the aliasing PPGTT page tables. v2: Make the page table bitmap declared inside the function (Chris) Simplify the way scratching address space works. Move the alloc/teardown tracepoints up a level in the call stack so that both all implementations get the trace. v3: Updated trace event to spit out a name v4: Aliasing ppgtt is now initialized differently (in setup global gtt) v5: Rebase to latest code. 
Also removed unnecessary aliasing ppgtt check for trace, as it is no longer possible after the PPGTT cleanup patch series of a couple of months ago (Daniel). v6: Implement changes from code review (Daniel): - allocate/teardown_va_range calls added. - Add a scratch page allocation helper (only need the address). - Move trace events to a new patch. - Use updated mark_tlbs_dirty. - Moved pt preallocation for aliasing ppgtt into gen6_ppgtt_init. v7: teardown_va_range removed (Daniel). In init, gen6_ppgtt_clear_range call is only needed for aliasing ppgtt. Cc: Daniel Vetter Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v4+) --- drivers/gpu/drm/i915/i915_debugfs.c | 3 +- drivers/gpu/drm/i915/i915_gem.c | 9 +++ drivers/gpu/drm/i915/i915_gem_gtt.c | 125 +++- drivers/gpu/drm/i915/i915_gem_gtt.h | 3 + 4 files changed, 123 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 4d07030..e8ad450 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2181,6 +2181,8 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev) seq_printf(m, "PP_DIR_BASE_READ: 0x%08x\n", I915_READ(RING_PP_DIR_BASE_READ(ring))); seq_printf(m, "PP_DIR_DCLV: 0x%08x\n", I915_READ(RING_PP_DIR_DCLV(ring))); } + seq_printf(m, "ECOCHK: 0x%08x\n\n", I915_READ(GAM_ECOCHK)); + if (dev_priv->mm.aliasing_ppgtt) { struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt; @@ -2197,7 +2199,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev) get_pid_task(file->pid, PIDTYPE_PID)->comm); idr_for_each(&file_priv->context_idr, per_file_ctx, m); } - seq_printf(m, "ECOCHK: 0x%08x\n", I915_READ(GAM_ECOCHK)); } static int i915_ppgtt_info(struct seq_file *m, void *data) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 61134ab..312b7d2 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3599,6 
+3599,15 @@ search_free: if (ret) goto err_remove_node; + /* allocate before insert / bind */ + if (vma->vm->allocate_va_range) { + ret = vma->vm->allocate_va_range(vma->vm, + vma->node.start, + vma->node.size); + if (ret) + goto err_remove_node; + } + trace_i915_vma_bind(vma, flags); ret = i915_vma_bind(vma, obj->cache_level, flags & PIN_GLOBAL ? GLOBAL_BIND : 0); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 18d7b28..85c8a51 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -362,6 +362,16 @@ fail_bitmap: return
[Intel-gfx] [PATCH v5 20/32] drm/i915/bdw: Support dynamic pdp updates in lrc mode
Logic ring contexts need to know the PDPs when they are populated. With dynamic page table allocations, these PDPs may not exist yet. Check if PDPs have been allocated and use the scratch page if they do not exist yet. Before submission, update the PDPs in the logic ring context as PDPs have been allocated. v2: Renamed commit title (Daniel) Cc: Daniel Vetter Signed-off-by: Michel Thierry --- drivers/gpu/drm/i915/intel_lrc.c | 80 +++- 1 file changed, 70 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index bc9c7c3..f461631 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -320,6 +320,7 @@ static void execlists_elsp_write(struct intel_engine_cs *ring, static int execlists_update_context(struct drm_i915_gem_object *ctx_obj, struct drm_i915_gem_object *ring_obj, + struct i915_hw_ppgtt *ppgtt, u32 tail) { struct page *page; @@ -331,6 +332,40 @@ static int execlists_update_context(struct drm_i915_gem_object *ctx_obj, reg_state[CTX_RING_TAIL+1] = tail; reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(ring_obj); + /* True PPGTT with dynamic page allocation: update PDP registers and +* point the unallocated PDPs to the scratch page +*/ + if (ppgtt) { + if (test_bit(3, ppgtt->pdp.used_pdpes)) { + reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[3]->daddr); + reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[3]->daddr); + } else { + reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr); + reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr); + } + if (test_bit(2, ppgtt->pdp.used_pdpes)) { + reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[2]->daddr); + reg_state[CTX_PDP2_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[2]->daddr); + } else { + reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr); + reg_state[CTX_PDP2_LDW+1] = 
lower_32_bits(ppgtt->scratch_pd->daddr); + } + if (test_bit(1, ppgtt->pdp.used_pdpes)) { + reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[1]->daddr); + reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[1]->daddr); + } else { + reg_state[CTX_PDP1_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr); + reg_state[CTX_PDP1_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr); + } + if (test_bit(0, ppgtt->pdp.used_pdpes)) { + reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[0]->daddr); + reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[0]->daddr); + } else { + reg_state[CTX_PDP0_UDW+1] = upper_32_bits(ppgtt->scratch_pd->daddr); + reg_state[CTX_PDP0_LDW+1] = lower_32_bits(ppgtt->scratch_pd->daddr); + } + } + kunmap_atomic(reg_state); return 0; @@ -349,7 +384,7 @@ static void execlists_submit_contexts(struct intel_engine_cs *ring, WARN_ON(!i915_gem_obj_is_pinned(ctx_obj0)); WARN_ON(!i915_gem_obj_is_pinned(ringbuf0->obj)); - execlists_update_context(ctx_obj0, ringbuf0->obj, tail0); + execlists_update_context(ctx_obj0, ringbuf0->obj, to0->ppgtt, tail0); if (to1) { ringbuf1 = to1->engine[ring->id].ringbuf; @@ -358,7 +393,7 @@ static void execlists_submit_contexts(struct intel_engine_cs *ring, WARN_ON(!i915_gem_obj_is_pinned(ctx_obj1)); WARN_ON(!i915_gem_obj_is_pinned(ringbuf1->obj)); - execlists_update_context(ctx_obj1, ringbuf1->obj, tail1); + execlists_update_context(ctx_obj1, ringbuf1->obj, to1->ppgtt, tail1); } execlists_elsp_write(ring, ctx_obj0, ctx_obj1); @@ -1735,14 +1770,39 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(ring, 1); reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0); reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0); - reg_state[CTX_PDP3_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[3]->daddr); - reg_state[CTX_PDP3_LDW+1] = lower_32_bits(ppgtt->pdp.page_directory[3]->daddr); - 
reg_state[CTX_PDP2_UDW+1] = upper_32_bits(ppgtt->pdp.page_directory[2]->daddr); - reg_state[CTX_PD
[Intel-gfx] [PATCH v5 21/32] drm/i915/bdw: Make pdp allocation more dynamic
From: Ben Widawsky This transitional patch doesn't do much for the existing code. However, it should make upcoming patches to use the full 48b address space a bit easier to swallow. The patch also introduces the PML4, ie. the new top level structure of the page tables. v2: Renamed pdp_free to be similar to pd/pt (unmap_and_free_pdp), To facilitate testing, 48b mode will be available on Broadwell, when i915.enable_ppgtt = 3. Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v2) --- drivers/gpu/drm/i915/i915_drv.h | 7 ++- drivers/gpu/drm/i915/i915_gem_gtt.c | 108 +--- drivers/gpu/drm/i915/i915_gem_gtt.h | 41 +++--- 3 files changed, 126 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 3cc0196..662d6c1 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2433,7 +2433,12 @@ struct drm_i915_cmd_table { #define HAS_HW_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 6) #define HAS_LOGICAL_RING_CONTEXTS(dev) (INTEL_INFO(dev)->gen >= 8) #define USES_PPGTT(dev)(i915.enable_ppgtt) -#define USES_FULL_PPGTT(dev) (i915.enable_ppgtt == 2) +#define USES_FULL_PPGTT(dev) (i915.enable_ppgtt >= 2) +#ifdef CONFIG_64BIT +# define USES_FULL_48BIT_PPGTT(dev)(i915.enable_ppgtt == 3) +#else +# define USES_FULL_48BIT_PPGTT(dev)false +#endif #define HAS_OVERLAY(dev) (INTEL_INFO(dev)->has_overlay) #define OVERLAY_NEEDS_PHYSICAL(dev) (INTEL_INFO(dev)->overlay_needs_physical) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 63caaed..1cd5f65 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -100,10 +100,17 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt) { bool has_aliasing_ppgtt; bool has_full_ppgtt; + bool has_full_64bit_ppgtt; has_aliasing_ppgtt = INTEL_INFO(dev)->gen >= 6; has_full_ppgtt = INTEL_INFO(dev)->gen >= 7; +#ifdef CONFIG_64BIT + has_full_64bit_ppgtt = IS_BROADWELL(dev) 
&& false; /* FIXME: 64b */ +#else + has_full_64bit_ppgtt = false; +#endif + if (intel_vgpu_active(dev)) has_full_ppgtt = false; /* emulation is too hard */ @@ -121,6 +128,9 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt) if (enable_ppgtt == 2 && has_full_ppgtt) return 2; + if (enable_ppgtt == 3 && has_full_64bit_ppgtt) + return 3; + #ifdef CONFIG_INTEL_IOMMU /* Disable ppgtt on SNB if VT-d is on. */ if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped) { @@ -461,6 +471,45 @@ free_pd: return ERR_PTR(ret); } +static void __pdp_fini(struct i915_page_directory_pointer_entry *pdp) +{ + kfree(pdp->used_pdpes); + kfree(pdp->page_directory); + /* HACK */ + pdp->page_directory = NULL; +} + +static void unmap_and_free_pdp(struct i915_page_directory_pointer_entry *pdp, + struct drm_device *dev) +{ + __pdp_fini(pdp); + if (USES_FULL_48BIT_PPGTT(dev)) + kfree(pdp); +} + +static int __pdp_init(struct i915_page_directory_pointer_entry *pdp, + struct drm_device *dev) +{ + size_t pdpes = I915_PDPES_PER_PDP(dev); + + pdp->used_pdpes = kcalloc(BITS_TO_LONGS(pdpes), + sizeof(unsigned long), + GFP_KERNEL); + if (!pdp->used_pdpes) + return -ENOMEM; + + pdp->page_directory = kcalloc(pdpes, sizeof(*pdp->page_directory), GFP_KERNEL); + if (!pdp->page_directory) { + kfree(pdp->used_pdpes); + /* the PDP might be the statically allocated top level. Keep it +* as clean as possible */ + pdp->used_pdpes = NULL; + return -ENOMEM; + } + + return 0; +} + /* Broadwell Page Directory Pointer Descriptors */ static int gen8_write_pdp(struct intel_engine_cs *ring, unsigned entry, @@ -490,7 +539,7 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt, { int i, ret; - for (i = GEN8_LEGACY_PDPES - 1; i >= 0; i--) { + for (i = 3; i >= 0; i--) { struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[i]; dma_addr_t pd_daddr = pd ? 
pd->daddr : ppgtt->scratch_pd->daddr; /* The page directory might be NULL, but we need to clear out @@ -579,9 +628,6 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, pt_vaddr = NULL; for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) { - if (WARN_ON(pdpe >= GEN8_LEGACY_PDPES)) - break; - if (pt_vaddr == NULL) { struct i915_page_directory_entry *pd =
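The allocate-then-unwind pattern in __pdp_init() and __pdp_fini() can be sketched in userspace C as below, with calloc()/free() standing in for kcalloc()/kfree(). `struct pdp`, `pdp_init` and `pdp_fini` are hypothetical names for illustration only, and the sketch clears both pointers on teardown, which is a choice of this example rather than a description of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_ULONGS(n) (((n) + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/* Hypothetical stand-in for the page directory pointer structure. */
struct pdp {
	unsigned long *used;   /* bitmap: one bit per page directory slot */
	void **page_directory; /* one pointer per page directory slot */
};

/* Allocate both arrays; on partial failure leave the struct clean. */
static int pdp_init(struct pdp *pdp, size_t pdpes)
{
	pdp->used = calloc(BITS_TO_ULONGS(pdpes), sizeof(unsigned long));
	if (!pdp->used)
		return -ENOMEM;

	pdp->page_directory = calloc(pdpes, sizeof(*pdp->page_directory));
	if (!pdp->page_directory) {
		/* Undo the first allocation so the caller sees no stale pointer. */
		free(pdp->used);
		pdp->used = NULL;
		return -ENOMEM;
	}

	return 0;
}

/* Release both arrays and reset the pointers. */
static void pdp_fini(struct pdp *pdp)
{
	free(pdp->used);
	free(pdp->page_directory);
	pdp->used = NULL;
	pdp->page_directory = NULL;
}
```

The point of resetting `used` on the error path is the same as in the patch comment: the structure may be the statically embedded top level, so it should not be left holding a dangling pointer.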
[Intel-gfx] [PATCH v5 17/32] drm/i915/bdw: Split out mappings
From: Ben Widawsky When we do dynamic page table allocations for gen8, we'll need to have more control over how and when we map page tables, similar to gen6. In particular, DMA mappings for page directories/tables occur at allocation time. This patch adds the functionality and calls it at init, which should have no functional change. The PDPEs are still a special case for now. We'll need a function for that in the future as well. v2: Handle renamed unmap_and_free_page functions. v3: Updated after teardown_va logic was removed. Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v2+) --- drivers/gpu/drm/i915/i915_gem_gtt.c | 176 ++-- 1 file changed, 69 insertions(+), 107 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 7849769..3a75408 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -416,17 +416,20 @@ err_out: return ret; } -static void unmap_and_free_pd(struct i915_page_directory_entry *pd) +static void unmap_and_free_pd(struct i915_page_directory_entry *pd, + struct drm_device *dev) { if (pd->page) { + i915_dma_unmap_single(pd, dev); __free_page(pd->page); kfree(pd); } } -static struct i915_page_directory_entry *alloc_pd_single(void) +static struct i915_page_directory_entry *alloc_pd_single(struct drm_device *dev) { struct i915_page_directory_entry *pd; + int ret; pd = kzalloc(sizeof(*pd), GFP_KERNEL); if (!pd) @@ -438,6 +441,13 @@ static struct i915_page_directory_entry *alloc_pd_single(void) return ERR_PTR(-ENOMEM); } + ret = i915_dma_map_single(pd, dev); + if (ret) { + __free_page(pd->page); + kfree(pd); + return ERR_PTR(ret); + } + return pd; } @@ -592,6 +602,36 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, } } +static void __gen8_do_map_pt(gen8_ppgtt_pde_t *pde, +struct i915_page_table_entry *pt, +struct drm_device *dev) +{ + gen8_ppgtt_pde_t entry = + gen8_pde_encode(dev, pt->daddr, I915_CACHE_LLC); + *pde = entry; +} + +/* 
It's likely we'll map more than one pagetable at a time. This function will + * save us unnecessary kmap calls, but do no more functionally than multiple + * calls to map_pt. */ +static void gen8_map_pagetable_range(struct i915_page_directory_entry *pd, +uint64_t start, +uint64_t length, +struct drm_device *dev) +{ + gen8_ppgtt_pde_t *page_directory = kmap_atomic(pd->page); + struct i915_page_table_entry *pt; + uint64_t temp, pde; + + gen8_for_each_pde(pt, pd, start, length, temp, pde) + __gen8_do_map_pt(page_directory + pde, pt, dev); + + if (!HAS_LLC(dev)) + drm_clflush_virt_range(page_directory, PAGE_SIZE); + + kunmap_atomic(page_directory); +} + static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct drm_device *dev) { int i; @@ -647,7 +687,7 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt) continue; gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev); - unmap_and_free_pd(ppgtt->pdp.page_directory[i]); + unmap_and_free_pd(ppgtt->pdp.page_directory[i], ppgtt->base.dev); } } @@ -687,7 +727,8 @@ unwind_out: static int gen8_ppgtt_alloc_page_directories(struct i915_page_directory_pointer_entry *pdp, uint64_t start, -uint64_t length) +uint64_t length, +struct drm_device *dev) { struct i915_page_directory_entry *unused; uint64_t temp; @@ -698,7 +739,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_page_directory_pointer_ gen8_for_each_pdpe(unused, pdp, start, length, temp, pdpe) { BUG_ON(unused); - pdp->page_directory[pdpe] = alloc_pd_single(); + pdp->page_directory[pdpe] = alloc_pd_single(dev); if (IS_ERR(pdp->page_directory[pdpe])) goto unwind_out; } @@ -707,21 +748,24 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_page_directory_pointer_ unwind_out: while (pdpe--) - unmap_and_free_pd(pdp->page_directory[pdpe]); + unmap_and_free_pd(pdp->page_directory[pdpe], dev); return -ENOMEM; } -static int gen8_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt, - uint64_t start, - uint64_t length)
[Intel-gfx] [PATCH v5 08/32] drm/i915: Initialize all contexts
From: Ben Widawsky

The problem is we're going to switch to a new context, which could be the default context. The plan was to use restore inhibit, which would be fine, except if we are using dynamic page tables (which we will). If we use dynamic page tables and we don't load new page tables, the previous page tables might go away, and future operations will fault.

CTXA runs.
Switch to default, restore inhibit.
CTXA dies and has its address space taken away.
Run CTXB, which tries to save using context A's address space - this fails.

The general solution is to make sure every context has its own state, and its own address space. For cases when we must restore inhibit, the first thing we do is load a valid address space. I thought this would be enough, but apparently there are references within the context itself which will refer to the old address space - therefore, we also must reinitialize.

It was tricky to track this down as we don't have much insight into what happens in a context save.

This is required for the next patch, which enables dynamic page tables.

v2: to->ppgtt is only valid in full ppgtt.
Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v2) --- drivers/gpu/drm/i915/i915_gem_context.c | 25 +++-- 1 file changed, 11 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 437cdcc..6a583c3 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -596,13 +596,6 @@ needs_pd_load_pre(struct intel_engine_cs *ring, struct intel_context *to) (ring != &dev_priv->ring[RCS])) && to->ppgtt; } -static bool -needs_pd_load_post(struct intel_engine_cs *ring, struct intel_context *to) -{ - return IS_GEN8(ring->dev) && - (to->ppgtt || &to->ppgtt->pd_dirty_rings); -} - static int do_switch(struct intel_engine_cs *ring, struct intel_context *to) { @@ -683,20 +676,24 @@ static int do_switch(struct intel_engine_cs *ring, /* GEN8 does *not* require an explicit reload if the PDPs have been * setup, and we do not wish to move them. -* -* XXX: If we implemented page directory eviction code, this -* optimization needs to be removed. */ - if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) + if (!to->legacy_hw_ctx.initialized) { hw_flags |= MI_RESTORE_INHIBIT; - else if (to->ppgtt && test_and_clear_bit(ring->id, &to->ppgtt->pd_dirty_rings)) + /* NB: If we inhibit the restore, the context is not allowed to +* die because future work may end up depending on valid address +* space. This means we must enforce that a page table load +* occur when this occurs. */ + } else if (to->ppgtt && test_and_clear_bit(ring->id, &to->ppgtt->pd_dirty_rings)) hw_flags |= MI_FORCE_RESTORE; ret = mi_set_context(ring, to, hw_flags); if (ret) goto unpin_out; - if (needs_pd_load_post(ring, to)) { + if (IS_GEN8(ring->dev) && to->ppgtt && (hw_flags & MI_RESTORE_INHIBIT)) { + /* We have a valid page directory (scratch) to switch to. This +* allows the old VM to be freed. 
Note that if anything occurs +* between the set context, and here, we are f*cked */ ret = to->ppgtt->switch_mm(to->ppgtt, ring); /* The hardware context switch is emitted, but we haven't * actually changed the state - so it's probably safe to bail @@ -746,7 +743,7 @@ static int do_switch(struct intel_engine_cs *ring, i915_gem_context_unreference(from); } - uninitialized = !to->legacy_hw_ctx.initialized && from == NULL; + uninitialized = !to->legacy_hw_ctx.initialized; to->legacy_hw_ctx.initialized = true; done: -- 2.1.1
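The flag selection that do_switch() ends up with after this patch can be modelled roughly as below. This is a sketch under stated assumptions: the flag values are placeholders rather than the hardware encodings, `struct ctx_state` is a hypothetical reduction of the context to the three fields the decision reads, and a plain bitmask stands in for test_and_clear_bit().

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bits, not the real MI_SET_CONTEXT encodings. */
#define FLAG_RESTORE_INHIBIT (1u << 0)
#define FLAG_FORCE_RESTORE   (1u << 1)

/* Hypothetical reduction of a context to what the decision needs. */
struct ctx_state {
	bool initialized;             /* has a valid saved hw state */
	bool has_ppgtt;               /* full ppgtt in use */
	unsigned long pd_dirty_rings; /* one dirty bit per ring */
};

/*
 * Uninitialized contexts inhibit the restore. Otherwise, if the page
 * directories are dirty for this ring, clear the bit and force a restore.
 */
static unsigned pick_hw_flags(struct ctx_state *to, unsigned ring_id)
{
	unsigned flags = 0;

	if (!to->initialized) {
		flags |= FLAG_RESTORE_INHIBIT;
	} else if (to->has_ppgtt &&
		   (to->pd_dirty_rings & (1UL << ring_id))) {
		to->pd_dirty_rings &= ~(1UL << ring_id);
		flags |= FLAG_FORCE_RESTORE;
	}

	return flags;
}

/* After the set-context: load page tables only if restore was inhibited. */
static bool needs_mm_switch_after(const struct ctx_state *to, bool is_gen8,
				  unsigned flags)
{
	return is_gen8 && to->has_ppgtt && (flags & FLAG_RESTORE_INHIBIT);
}
```

Note that the dirty bit is left untouched when the restore is inhibited, which matches the `else if` ordering in the diff.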
[Intel-gfx] [PATCH v5 29/32] drm/i915: Plumb sg_iter through va allocation ->maps
From: Ben Widawsky As a step towards implementing 4 levels, while not discarding the existing pte map functions, we need to pass the sg_iter through. The current function understands to the page directory granularity. An object's pages may span the page directory, and so using the iter directly as we write the PTEs allows the iterator to stay coherent through a VMA mapping operation spanning multiple page table levels. Signed-off-by: Ben Widawsky --- drivers/gpu/drm/i915/i915_gem_gtt.c | 46 +++-- 1 file changed, 29 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index ad7e274..483dd73 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -691,7 +691,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm, } static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_entry *pdp, - struct sg_table *pages, + struct sg_page_iter *sg_iter, uint64_t start, enum i915_cache_level cache_level, const bool flush) @@ -700,11 +700,10 @@ static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_ent unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK; unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK; unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK; - struct sg_page_iter sg_iter; pt_vaddr = NULL; - for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) { + while (__sg_page_iter_next(sg_iter)) { if (pt_vaddr == NULL) { struct i915_page_directory_entry *pd = pdp->page_directory[pdpe]; struct i915_page_table_entry *pt = pd->page_tables[pde]; @@ -714,7 +713,7 @@ static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_ent } pt_vaddr[pte] = - gen8_pte_encode(sg_page_iter_dma_address(&sg_iter), + gen8_pte_encode(sg_page_iter_dma_address(sg_iter), cache_level, true); if (++pte == GEN8_PTES_PER_PAGE) { if (flush) @@ -743,8 +742,10 @@ static void gen8_ppgtt_insert_entries(struct 
i915_address_space *vm, { struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base); struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */ + struct sg_page_iter sg_iter; - gen8_ppgtt_insert_pte_entries(pdp, pages, start, cache_level, !HAS_LLC(vm->dev)); + __sg_page_iter_start(&sg_iter, pages->sgl, sg_nents(pages->sgl), 0); + gen8_ppgtt_insert_pte_entries(pdp, &sg_iter, start, cache_level, !HAS_LLC(vm->dev)); } static void __gen8_do_map_pt(gen8_ppgtt_pde_t * const pde, @@ -1106,10 +1107,12 @@ err_out: return -ENOMEM; } -static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, - struct i915_page_directory_pointer_entry *pdp, - uint64_t start, - uint64_t length) +static int __gen8_alloc_vma_range_3lvl(struct i915_address_space *vm, + struct i915_page_directory_pointer_entry *pdp, + struct sg_page_iter *sg_iter, + uint64_t start, + uint64_t length, + u32 flags) { unsigned long *new_page_dirs, **new_page_tables; struct drm_device *dev = vm->dev; @@ -1178,7 +1181,11 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, gen8_pte_index(pd_start), gen8_pte_count(pd_start, pd_len)); - /* Our pde is now pointing to the pagetable, pt */ + if (sg_iter) { + BUG_ON(!sg_iter->__nents); + gen8_ppgtt_insert_pte_entries(pdp, sg_iter, pd_start, + flags, !HAS_LLC(vm->dev)); + } set_bit(pde, pd->used_pdes); } @@ -1203,10 +1210,12 @@ err_out: return ret; } -static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm, - struct i915_pml4 *pml4, - uint64_t start, - uint64_t length) +static int __gen8_alloc_vma_range_4lvl(struct i915_address_space *vm, + struct i915_pml4 *pml4, +
[Intel-gfx] [PATCH v5 30/32] drm/i915/bdw: Add 4 level support in insert_entries and clear_range
When 48b is enabled, gen8_ppgtt_insert_entries needs to read the Page Map Level 4 (PML4), before it selects which Page Directory Pointer (PDP) it will write to. Similarly, gen8_ppgtt_clear_range needs to get the correct PDP/PD range. Also add a scratch page for PML4. This patch was inspired by Ben's "Depend exclusively on map and unmap_vma". Signed-off-by: Michel Thierry --- drivers/gpu/drm/i915/i915_gem_gtt.c | 66 ++--- drivers/gpu/drm/i915/i915_gem_gtt.h | 12 +++ 2 files changed, 67 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 483dd73..cd57c22 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -675,24 +675,52 @@ static void gen8_ppgtt_clear_pte_range(struct i915_page_directory_pointer_entry } } +static void gen8_ppgtt_clear_range_4lvl(struct i915_hw_ppgtt *ppgtt, + gen8_gtt_pte_t scratch_pte, + uint64_t start, + uint64_t length) +{ + struct i915_page_directory_pointer_entry *pdp; + uint64_t templ4, templ3, pml4e, pdpe; + + gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) { + struct i915_page_directory_entry *pd; + uint64_t pdp_len = gen8_clamp_pdp(start, length); + uint64_t pdp_start = start; + + gen8_for_each_pdpe(pd, pdp, pdp_start, pdp_len, templ3, pdpe) { + uint64_t pd_len = gen8_clamp_pd(pdp_start, pdp_len); + uint64_t pd_start = pdp_start; + + gen8_ppgtt_clear_pte_range(pdp, pd_start, pd_len, + scratch_pte, !HAS_LLC(ppgtt->base.dev)); + } + } +} + static void gen8_ppgtt_clear_range(struct i915_address_space *vm, - uint64_t start, - uint64_t length, + uint64_t start, uint64_t length, bool use_scratch) { struct i915_hw_ppgtt *ppgtt = - container_of(vm, struct i915_hw_ppgtt, base); - struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */ - + container_of(vm, struct i915_hw_ppgtt, base); gen8_gtt_pte_t scratch_pte = gen8_pte_encode(ppgtt->base.scratch.addr, I915_CACHE_LLC, use_scratch); - 
gen8_ppgtt_clear_pte_range(pdp, start, length, scratch_pte, !HAS_LLC(vm->dev)); + if (!USES_FULL_48BIT_PPGTT(vm->dev)) { + struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; + + gen8_ppgtt_clear_pte_range(pdp, start, length, scratch_pte, + !HAS_LLC(ppgtt->base.dev)); + } else { + gen8_ppgtt_clear_range_4lvl(ppgtt, scratch_pte, start, length); + } } static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_entry *pdp, struct sg_page_iter *sg_iter, uint64_t start, + size_t pages, enum i915_cache_level cache_level, const bool flush) { @@ -703,7 +731,7 @@ static void gen8_ppgtt_insert_pte_entries(struct i915_page_directory_pointer_ent pt_vaddr = NULL; - while (__sg_page_iter_next(sg_iter)) { + while (pages-- && __sg_page_iter_next(sg_iter)) { if (pt_vaddr == NULL) { struct i915_page_directory_entry *pd = pdp->page_directory[pdpe]; struct i915_page_table_entry *pt = pd->page_tables[pde]; @@ -741,11 +769,26 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, u32 unused) { struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base); - struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */ + struct i915_page_directory_pointer_entry *pdp; struct sg_page_iter sg_iter; __sg_page_iter_start(&sg_iter, pages->sgl, sg_nents(pages->sgl), 0); - gen8_ppgtt_insert_pte_entries(pdp, &sg_iter, start, cache_level, !HAS_LLC(vm->dev)); + + if (!USES_FULL_48BIT_PPGTT(vm->dev)) { + pdp = &ppgtt->pdp; + gen8_ppgtt_insert_pte_entries(pdp, &sg_iter, start, + sg_nents(pages->sgl), + cache_level, !HAS_LLC(vm->dev)); + } else { + struct i915_pml4 *pml4; + unsigned pml4e = gen8_pml4e_index(start); + + pml4 = &ppgtt->pml4; + pdp = pml4->pdps[
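For readers following the 4-level walk, here is a minimal userspace sketch of how a 48b address could be split into the four table indices before picking the PDP to write to. The shift values (39/30/21/12) and the 9-bit index width are assumptions modelled on a 4 KiB page, 512-entries-per-level layout; `va_split` and `struct va_indices` are hypothetical names for illustration, not functions from this series.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for the sketch: 4 KiB pages, 512 entries per level. */
#define SKETCH_PTE_SHIFT   12
#define SKETCH_PDE_SHIFT   21
#define SKETCH_PDPE_SHIFT  30
#define SKETCH_PML4E_SHIFT 39
#define SKETCH_INDEX_MASK  0x1ffu

/* Hypothetical container for the per-level indices of one address. */
struct va_indices {
	unsigned pml4e;
	unsigned pdpe;
	unsigned pde;
	unsigned pte;
};

/* Split a 48-bit virtual address into its four table indices. */
static struct va_indices va_split(uint64_t addr)
{
	struct va_indices idx;

	assert(addr < (1ULL << 48));

	idx.pml4e = (unsigned)((addr >> SKETCH_PML4E_SHIFT) & SKETCH_INDEX_MASK);
	idx.pdpe  = (unsigned)((addr >> SKETCH_PDPE_SHIFT) & SKETCH_INDEX_MASK);
	idx.pde   = (unsigned)((addr >> SKETCH_PDE_SHIFT) & SKETCH_INDEX_MASK);
	idx.pte   = (unsigned)((addr >> SKETCH_PTE_SHIFT) & SKETCH_INDEX_MASK);

	return idx;
}
```

With this split, the PML4 index selects the PDP and the remaining three indices are used exactly as in the 3-level case, which is why the existing PTE-range helpers can be reused once the right PDP has been looked up.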
[Intel-gfx] [PATCH v5 23/32] drm/i915/bdw: Add dynamic page trace events
From: Ben Widawsky The dynamic page allocation patch series added it for GEN6, this patch adds them for GEN8. v2: Consolidate pagetable/page_directory events v3: Multiple rebases. Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v3) --- drivers/gpu/drm/i915/i915_gem_gtt.c | 23 +++ drivers/gpu/drm/i915/i915_trace.h | 16 2 files changed, 31 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 92ca430..a6dad95 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -672,19 +672,24 @@ static void __gen8_do_map_pt(gen8_ppgtt_pde_t * const pde, /* It's likely we'll map more than one pagetable at a time. This function will * save us unnecessary kmap calls, but do no more functionally than multiple * calls to map_pt. */ -static void gen8_map_pagetable_range(struct i915_page_directory_entry *pd, +static void gen8_map_pagetable_range(struct i915_address_space *vm, +struct i915_page_directory_entry *pd, uint64_t start, -uint64_t length, -struct drm_device *dev) +uint64_t length) { gen8_ppgtt_pde_t * const page_directory = kmap_atomic(pd->page); struct i915_page_table_entry *pt; uint64_t temp, pde; - gen8_for_each_pde(pt, pd, start, length, temp, pde) - __gen8_do_map_pt(page_directory + pde, pt, dev); + gen8_for_each_pde(pt, pd, start, length, temp, pde) { + __gen8_do_map_pt(page_directory + pde, pt, vm->dev); + trace_i915_page_table_entry_map(vm, pde, pt, +gen8_pte_index(start), +gen8_pte_count(start, length), +GEN8_PTES_PER_PAGE); + } - if (!HAS_LLC(dev)) + if (!HAS_LLC(vm->dev)) drm_clflush_virt_range(page_directory, PAGE_SIZE); kunmap_atomic(page_directory); @@ -814,6 +819,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm, pd->page_tables[pde] = pt; set_bit(pde, new_pts); + trace_i915_page_table_entry_alloc(vm, pde, start, GEN8_PDE_SHIFT); } return 0; @@ -875,6 +881,7 @@ static int gen8_ppgtt_alloc_page_directories(struct 
i915_address_space *vm, pdp->page_directory[pdpe] = pd; set_bit(pdpe, new_pds); + trace_i915_page_directory_entry_alloc(vm, pdpe, start, GEN8_PDPE_SHIFT); } return 0; @@ -1013,7 +1020,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, } set_bit(pdpe, pdp->used_pdpes); - gen8_map_pagetable_range(pd, start, length, dev); + gen8_map_pagetable_range(vm, pd, start, length); } free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes); @@ -1114,7 +1121,7 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt) } gen8_for_each_pdpe(pd, pdp, start, size, temp, pdpe) - gen8_map_pagetable_range(pd, start, size, dev); + gen8_map_pagetable_range(&ppgtt->base, pd,start, size); ppgtt->base.allocate_va_range = NULL; ppgtt->base.clear_range = gen8_ppgtt_clear_range; diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index 0038dc2..10cd830 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -214,6 +214,22 @@ DEFINE_EVENT(i915_page_table_entry, i915_page_table_entry_alloc, TP_ARGS(vm, pde, start, pde_shift) ); +DEFINE_EVENT_PRINT(i915_page_table_entry, i915_page_directory_entry_alloc, + TP_PROTO(struct i915_address_space *vm, u32 pdpe, u64 start, u64 pdpe_shift), + TP_ARGS(vm, pdpe, start, pdpe_shift), + + TP_printk("vm=%p, pdpe=%d (0x%llx-0x%llx)", +__entry->vm, __entry->pde, __entry->start, __entry->end) +); + +DEFINE_EVENT_PRINT(i915_page_table_entry, i915_page_directory_pointer_entry_alloc, + TP_PROTO(struct i915_address_space *vm, u32 pml4e, u64 start, u64 pml4e_shift), + TP_ARGS(vm, pml4e, start, pml4e_shift), + + TP_printk("vm=%p, pml4e=%d (0x%llx-0x%llx)", +__entry->vm, __entry->pde, __entry->start, __entry->end) +); + /* Avoid extra math because we only support two sizes. The format is defined by * bitmap_scnprintf. 
Each 32 bits is 8 HEX digits followed by comma */ #define TRACE_PT_SIZE(bits) \ -- 2.1.1
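The sizing rule in the comment above (8 hex digits plus a comma per 32 bits, plus a terminator) can be written out as plain arithmetic. The following is a rough userspace sketch, not the tracepoint code: `sketch_bitmap_str_size` and `sketch_format_bitmap` are hypothetical helpers, and the formatter prints a trailing comma after every word as a simplification.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bytes needed to print a bitmap of 'bits' bits: 9 characters per
 * 32-bit word (8 hex digits and a comma) plus one for the NUL. */
static size_t sketch_bitmap_str_size(size_t bits)
{
	assert(bits > 0 && bits % 32 == 0);
	return (bits / 32) * 9 + 1;
}

/* Print the words most-significant first, each followed by a comma.
 * Returns the number of characters written, excluding the NUL. */
static size_t sketch_format_bitmap(char *buf, size_t size,
				   const uint32_t *words, size_t nwords)
{
	size_t pos = 0;
	size_t i;

	/* The caller must size the buffer with sketch_bitmap_str_size(). */
	assert(size >= nwords * 9 + 1);

	buf[0] = '\0';
	for (i = nwords; i-- > 0; )
		pos += (size_t)snprintf(buf + pos, size - pos,
					"%08" PRIx32 ",", words[i]);

	return pos;
}
```

For the two sizes mentioned in the comment this gives 289 bytes for 1024 bits and 145 bytes for 512 bits, which matches the 288/144 plus one in the macro.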
[Intel-gfx] [PATCH v5 03/32] drm/i915: Create page table allocators
From: Ben Widawsky

As we move toward dynamic page table allocation, it becomes much easier to manage our data structures if we do things less coarsely, breaking up all of our actions into individual tasks. This makes the code easier to write, read, and verify.

Aside from the dissection of the allocation functions, the patch statically allocates the page table structures without a page directory. This remains the same for all platforms.

The patch itself should not have much functional difference. The primary noticeable difference is the fact that page tables are no longer allocated, but rather statically declared as part of the page directory. This has non-zero overhead, but things gain non-trivial complexity as a result.

This patch exists for a few reasons:
1. Splitting out the functions allows easily combining GEN6 and GEN8 code. Page tables have no difference based on GEN8. As we'll see in a future patch when we add the DMA mappings to the allocations, it requires only one small change to make work, and error handling should just fall into place.
2. Unless we always want to allocate all page tables under a given PDE, we'll have to eventually break this up into an array of pointers (or pointer to pointer).
3. Having discrete functions makes the code easier to review and understand. All allocations and frees now take place in just a couple of locations. Reviewing and catching leaks should be easy.
4. Less important: the GFP flags are confined to one location, which makes playing around with such things trivial.

v2: Updated commit message to explain why this patch exists
v3: For lrc, s/pdp.page_directory[i].daddr/pdp.page_directory[i]->daddr/
v4: Renamed free_pt/pd_single functions to unmap_and_free_pt/pd (Daniel)
v5: Added additional safety checks in gen8 clear/free/unmap.
v6: Use WARN_ON and return -EINVAL in alloc_pt_range (Mika).
Cc: Mika Kuoppala Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v3+) --- drivers/gpu/drm/i915/i915_gem_gtt.c | 252 drivers/gpu/drm/i915/i915_gem_gtt.h | 4 +- drivers/gpu/drm/i915/intel_lrc.c| 16 +-- 3 files changed, 178 insertions(+), 94 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index eb0714c..65c77e5 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -279,6 +279,98 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr, return pte; } +static void unmap_and_free_pt(struct i915_page_table_entry *pt) +{ + if (WARN_ON(!pt->page)) + return; + __free_page(pt->page); + kfree(pt); +} + +static struct i915_page_table_entry *alloc_pt_single(void) +{ + struct i915_page_table_entry *pt; + + pt = kzalloc(sizeof(*pt), GFP_KERNEL); + if (!pt) + return ERR_PTR(-ENOMEM); + + pt->page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!pt->page) { + kfree(pt); + return ERR_PTR(-ENOMEM); + } + + return pt; +} + +/** + * alloc_pt_range() - Allocate a multiple page tables + * @pd:The page directory which will have at least @count entries + * available to point to the allocated page tables. + * @pde: First page directory entry for which we are allocating. + * @count: Number of pages to allocate. + * + * Allocates multiple page table pages and sets the appropriate entries in the + * page table structure within the page directory. Function cleans up after + * itself on any failures. + * + * Return: 0 if allocation succeeded. + */ +static int alloc_pt_range(struct i915_page_directory_entry *pd, uint16_t pde, size_t count) +{ + int i, ret; + + /* 512 is the max page tables per page_directory on any platform. 
*/ + if (WARN_ON(pde + count > GEN6_PPGTT_PD_ENTRIES)) + return -EINVAL; + + for (i = pde; i < pde + count; i++) { + struct i915_page_table_entry *pt = alloc_pt_single(); + + if (IS_ERR(pt)) { + ret = PTR_ERR(pt); + goto err_out; + } + WARN(pd->page_tables[i], +"Leaking page directory entry %d (%pa)\n", +i, pd->page_tables[i]); + pd->page_tables[i] = pt; + } + + return 0; + +err_out: + while (i--) + unmap_and_free_pt(pd->page_tables[i]); + return ret; +} + +static void unmap_and_free_pd(struct i915_page_directory_entry *pd) +{ + if (pd->page) { + __free_page(pd->page); + kfree(pd); + } +} + +static struct i915_page_directory_entry *alloc_pd_single(void) +{ + struct i915_page_directory_entry *pd; + + pd = kzalloc(sizeof(*pd), GFP_KERNEL); + if (!pd) + return ERR_PTR(-ENOMEM); + + pd->page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!pd->page) { +
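The allocate-a-range-and-clean-up-on-failure idiom described in this message can be sketched in userspace as follows. This is only an illustration under stated assumptions: `malloc`-family calls stand in for the kernel allocators, every `sketch_*` name is hypothetical, `sketch_fail_after` is a test hook that does not exist in the driver, and the unwind here deliberately stops at the first index this call allocated.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PD_ENTRIES 512

struct sketch_pt {
	void *page;
};

struct sketch_pd {
	struct sketch_pt *tables[SKETCH_PD_ENTRIES];
};

/* Test hook: allocations that succeed before one fails; negative = never. */
static int sketch_fail_after = -1;

/* Allocate one page table and its backing page, or return NULL. */
static struct sketch_pt *sketch_alloc_pt(void)
{
	struct sketch_pt *pt;

	if (sketch_fail_after == 0)
		return NULL;
	if (sketch_fail_after > 0)
		sketch_fail_after--;

	pt = calloc(1, sizeof(*pt));
	if (!pt)
		return NULL;

	pt->page = calloc(1, 4096);
	if (!pt->page) {
		free(pt);
		return NULL;
	}

	return pt;
}

static void sketch_free_pt(struct sketch_pt *pt)
{
	if (!pt)
		return;
	free(pt->page);
	free(pt);
}

/* Allocate 'count' page tables starting at entry 'first'.
 * Returns 0 on success or a negative errno, leaving pd untouched on failure. */
static int sketch_alloc_pt_range(struct sketch_pd *pd, size_t first, size_t count)
{
	size_t i;

	if (first > SKETCH_PD_ENTRIES || count > SKETCH_PD_ENTRIES - first)
		return -EINVAL;

	for (i = first; i < first + count; i++) {
		struct sketch_pt *pt = sketch_alloc_pt();

		if (!pt)
			goto err_out;
		pd->tables[i] = pt;
	}

	return 0;

err_out:
	/* Unwind only the entries this call filled in: [first, i). */
	while (i-- > first) {
		sketch_free_pt(pd->tables[i]);
		pd->tables[i] = NULL;
	}
	return -ENOMEM;
}
```

Keeping allocation and free in one pair of helpers is what makes the unwind loop short: the range function never needs to know how a single table is built.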
[Intel-gfx] [PATCH v5 11/32] drm/i915/bdw: Use dynamic allocation idioms on free
From: Ben Widawsky The page directory freer is left here for now as it's still useful given that GEN8 still preallocates. Once the allocation functions are broken up into more discrete chunks, we'll follow suit and destroy this leftover piece. v2: Match trace_i915_va_teardown params v3: Multiple rebases. v4: Updated to use unmap_and_free_pt. v5: teardown_va_range logic no longer needed. Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v2+) --- drivers/gpu/drm/i915/i915_gem_gtt.c | 26 ++-- drivers/gpu/drm/i915/i915_gem_gtt.h | 47 + 2 files changed, 60 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 93b7bce..0289176 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -607,19 +607,6 @@ static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct d } } -static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt) -{ - int i; - - for (i = 0; i < ppgtt->num_pd_pages; i++) { - if (WARN_ON(!ppgtt->pdp.page_directory[i])) - continue; - - gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev); - unmap_and_free_pd(ppgtt->pdp.page_directory[i]); - } -} - static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt) { struct pci_dev *hwdev = ppgtt->base.dev->pdev; @@ -652,6 +639,19 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt) } } +static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt) +{ + int i; + + for (i = 0; i < ppgtt->num_pd_pages; i++) { + if (WARN_ON(!ppgtt->pdp.page_directory[i])) + continue; + + gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev); + unmap_and_free_pd(ppgtt->pdp.page_directory[i]); + } +} + static void gen8_ppgtt_cleanup(struct i915_address_space *vm) { struct i915_hw_ppgtt *ppgtt = diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 43b5adf..70ce50d 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ 
b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -383,6 +383,53 @@ static inline uint32_t gen6_pde_index(uint32_t addr) return i915_pde_index(addr, GEN6_PDE_SHIFT); } +#define gen8_for_each_pde(pt, pd, start, length, temp, iter) \ + for (iter = gen8_pde_index(start), pt = (pd)->page_tables[iter]; \ +length > 0 && iter < GEN8_PDES_PER_PAGE; \ +pt = (pd)->page_tables[++iter],\ +temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start,\ +temp = min(temp, length), \ +start += temp, length -= temp) + +#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter) \ + for (iter = gen8_pdpe_index(start), pd = (pdp)->page_directory[iter]; \ +length > 0 && iter < GEN8_LEGACY_PDPES;\ +pd = (pdp)->page_directory[++iter], \ +temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start, \ +temp = min(temp, length), \ +start += temp, length -= temp) + +/* Clamp length to the next page_directory boundary */ +static inline uint64_t gen8_clamp_pd(uint64_t start, uint64_t length) +{ + uint64_t next_pd = ALIGN(start + 1, 1 << GEN8_PDPE_SHIFT); + + if (next_pd > (start + length)) + return length; + + return next_pd - start; +} + +static inline uint32_t gen8_pte_index(uint64_t address) +{ + return i915_pte_index(address, GEN8_PDE_SHIFT); +} + +static inline uint32_t gen8_pde_index(uint64_t address) +{ + return i915_pde_index(address, GEN8_PDE_SHIFT); +} + +static inline uint32_t gen8_pdpe_index(uint64_t address) +{ + return (address >> GEN8_PDPE_SHIFT) & GEN8_PDPE_MASK; +} + +static inline uint32_t gen8_pml4e_index(uint64_t address) +{ + BUG(); /* For 64B */ +} + int i915_gem_gtt_init(struct drm_device *dev); void i915_gem_init_global_gtt(struct drm_device *dev); void i915_global_gtt_cleanup(struct drm_device *dev); -- 2.1.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
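The clamp-to-next-boundary step used by the iterators above can be tried in isolation. Below is a small userspace sketch, assuming power-of-two boundaries and ignoring 64-bit overflow; the `sketch_*` names are hypothetical, and the shift is widened with `1ULL` here so that boundaries above bit 31 also work, which is a choice made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of align (align is a power of two). */
static uint64_t sketch_align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Clamp length so that [start, start + length) does not cross the
 * next (1 << shift) boundary after start. */
static uint64_t sketch_clamp_to_boundary(uint64_t start, uint64_t length,
					 unsigned shift)
{
	uint64_t next;

	assert(shift < 64);

	next = sketch_align_up(start + 1, 1ULL << shift);
	if (next > start + length)
		return length;

	return next - start;
}

/* Walk a range boundary by boundary and count the pieces visited. */
static unsigned sketch_count_chunks(uint64_t start, uint64_t length,
				    unsigned shift)
{
	unsigned chunks = 0;

	while (length > 0) {
		uint64_t step = sketch_clamp_to_boundary(start, length, shift);

		start += step;
		length -= step;
		chunks++;
	}

	return chunks;
}
```

Aligning `start + 1` rather than `start` is what makes a range that begins exactly on a boundary advance by a full unit instead of by zero.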
Re: [Intel-gfx] [PATCH v2] drm/i915: FIFO space query code refactor
On Fri, Feb 20, 2015 at 11:34:29AM +0200, Mika Kuoppala wrote:
> Dave Gordon writes:
>
> > When querying the GTFIFOCTL register to check the FIFO space, the read value
> > must be masked. The operation is repeated explicitly in several places. This
> > change refactors the read-and-mask code into a function call.
> >
> > v2: rebased on top of Mika's forcewake patch set, specifically:
> > [PATCH 8/8] drm/i915: Enum forcewake domains and domain identifiers
> >
> > Change-Id: Id1a9f3785cb20b82d4caa330c37b31e4e384a3ef
> > Signed-off-by: Dave Gordon
>
> Reviewed-by: Mika Kuoppala

Queued for -next, thanks for the patch.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
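The refactor discussed in this thread, folding a repeated read-and-mask into one function, might look roughly like the sketch below. The patch itself is not quoted here, so everything in the sketch is an assumption: the 7-bit mask, the fake register variable standing in for an MMIO read, and all `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for the sketch: the low 7 bits hold the free-entry count. */
#define SKETCH_FIFO_FREE_ENTRIES_MASK 0x7fu

/* Stand-in for the hardware register; a driver would do an MMIO read. */
static uint32_t sketch_fifo_ctl_reg;

static uint32_t sketch_read_fifo_ctl(void)
{
	return sketch_fifo_ctl_reg;
}

/* The one place that reads the register and applies the mask. */
static uint32_t sketch_fifo_free_entries(void)
{
	return sketch_read_fifo_ctl() & SKETCH_FIFO_FREE_ENTRIES_MASK;
}

/* Example caller: is there room beyond a reserved number of entries? */
static int sketch_fifo_has_room(uint32_t reserved)
{
	assert(reserved <= SKETCH_FIFO_FREE_ENTRIES_MASK);
	return sketch_fifo_free_entries() > reserved;
}
```

Callers then never see the unmasked value, so a forgotten mask in one call site can no longer produce a bogus free-entry count.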
[Intel-gfx] [PATCH v5 10/32] drm/i915: Add dynamic page trace events
Traces for page directories and tables allocation and map. v2: Removed references to teardown. v3: bitmap_scnprintf has been deprecated. Signed-off-by: Michel Thierry --- drivers/gpu/drm/i915/i915_gem.c | 2 + drivers/gpu/drm/i915/i915_gem_gtt.c | 5 ++ drivers/gpu/drm/i915/i915_trace.h | 95 + 3 files changed, 102 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 312b7d2..4e51275 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3601,6 +3601,8 @@ search_free: /* allocate before insert / bind */ if (vma->vm->allocate_va_range) { + trace_i915_va_alloc(vma->vm, vma->node.start, vma->node.size, + VM_TO_TRACE_NAME(vma->vm)); ret = vma->vm->allocate_va_range(vma->vm, vma->node.start, vma->node.size); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 85c8a51..93b7bce 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1210,6 +1210,7 @@ static int gen6_alloc_va_range(struct i915_address_space *vm, ppgtt->pd.page_tables[pde] = pt; set_bit(pde, new_page_tables); + trace_i915_page_table_entry_alloc(vm, pde, start, GEN6_PDE_SHIFT); } start = start_save; @@ -1225,6 +1226,10 @@ static int gen6_alloc_va_range(struct i915_address_space *vm, if (test_and_clear_bit(pde, new_page_tables)) gen6_write_pde(&ppgtt->pd, pde, pt); + trace_i915_page_table_entry_map(vm, pde, pt, +gen6_pte_index(start), +gen6_pte_count(start, length), +I915_PPGTT_PT_ENTRIES); bitmap_or(pt->used_ptes, tmp_bitmap, pt->used_ptes, I915_PPGTT_PT_ENTRIES); } diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index f004d3d..0038dc2 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -156,6 +156,101 @@ TRACE_EVENT(i915_vma_unbind, __entry->obj, __entry->offset, __entry->size, __entry->vm) ); +#define VM_TO_TRACE_NAME(vm) \ + (i915_is_ggtt(vm) ? 
"GGTT" : \ + "Private VM") + +DECLARE_EVENT_CLASS(i915_va, + TP_PROTO(struct i915_address_space *vm, u64 start, u64 length, const char *name), + TP_ARGS(vm, start, length, name), + + TP_STRUCT__entry( + __field(struct i915_address_space *, vm) + __field(u64, start) + __field(u64, end) + __string(name, name) + ), + + TP_fast_assign( + __entry->vm = vm; + __entry->start = start; + __entry->end = start + length; + __assign_str(name, name); + ), + + TP_printk("vm=%p (%s), 0x%llx-0x%llx", + __entry->vm, __get_str(name), __entry->start, __entry->end) +); + +DEFINE_EVENT(i915_va, i915_va_alloc, +TP_PROTO(struct i915_address_space *vm, u64 start, u64 length, const char *name), +TP_ARGS(vm, start, length, name) +); + +DECLARE_EVENT_CLASS(i915_page_table_entry, + TP_PROTO(struct i915_address_space *vm, u32 pde, u64 start, u64 pde_shift), + TP_ARGS(vm, pde, start, pde_shift), + + TP_STRUCT__entry( + __field(struct i915_address_space *, vm) + __field(u32, pde) + __field(u64, start) + __field(u64, end) + ), + + TP_fast_assign( + __entry->vm = vm; + __entry->pde = pde; + __entry->start = start; + __entry->end = (start + (1ULL << pde_shift)) & ~((1ULL << pde_shift)-1); + ), + + TP_printk("vm=%p, pde=%d (0x%llx-0x%llx)", + __entry->vm, __entry->pde, __entry->start, __entry->end) +); + +DEFINE_EVENT(i915_page_table_entry, i915_page_table_entry_alloc, +TP_PROTO(struct i915_address_space *vm, u32 pde, u64 start, u64 pde_shift), +TP_ARGS(vm, pde, start, pde_shift) +); + +/* Avoid extra math because we only support two sizes. The format is defined by + * bitmap_scnprintf. Each 32 bits is 8 HEX digits followed by comma */ +#define TRACE_PT_SIZE(bits) \ + bits) == 1024) ? 288 : 144) + 1) + +DECLARE_EVENT_CLASS(i915_page_table_entry_update, + TP_PROTO(struct i915_address_space *vm, u32 pde, +struct i915_page_table_entry *pt, u32 first, u32 len, size_t bits), + TP_ARGS(vm, pde, pt, first, len, bits), + + TP_STRUCT__entry( + __field(struct i915_address_space *, vm) + __field(u32,
[Intel-gfx] [PATCH v5 24/32] drm/i915/bdw: Add ppgtt info for dynamic pages
From: Ben Widawsky Note that there is no gen8 ppgtt debug_dump function yet. Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry --- drivers/gpu/drm/i915/i915_debugfs.c | 19 ++- drivers/gpu/drm/i915/i915_gem_gtt.c | 32 drivers/gpu/drm/i915/i915_gem_gtt.h | 9 + 3 files changed, 51 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index e85da9d..c877957 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2165,7 +2165,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; struct intel_engine_cs *ring; - struct drm_file *file; int i; if (INTEL_INFO(dev)->gen == 6) @@ -2189,14 +2188,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev) ppgtt->debug_dump(ppgtt, m); } - - list_for_each_entry_reverse(file, &dev->filelist, lhead) { - struct drm_i915_file_private *file_priv = file->driver_priv; - - seq_printf(m, "proc: %s\n", - get_pid_task(file->pid, PIDTYPE_PID)->comm); - idr_for_each(&file_priv->context_idr, per_file_ctx, m); - } } static int i915_ppgtt_info(struct seq_file *m, void *data) @@ -2204,6 +2195,7 @@ static int i915_ppgtt_info(struct seq_file *m, void *data) struct drm_info_node *node = m->private; struct drm_device *dev = node->minor->dev; struct drm_i915_private *dev_priv = dev->dev_private; + struct drm_file *file; int ret = mutex_lock_interruptible(&dev->struct_mutex); if (ret) @@ -2215,6 +2207,15 @@ static int i915_ppgtt_info(struct seq_file *m, void *data) else if (INTEL_INFO(dev)->gen >= 6) gen6_ppgtt_info(m, dev); + list_for_each_entry_reverse(file, &dev->filelist, lhead) { + struct drm_i915_file_private *file_priv = file->driver_priv; + + seq_printf(m, "\nproc: %s\n", + get_pid_task(file->pid, PIDTYPE_PID)->comm); + idr_for_each(&file_priv->context_idr, per_file_ctx, +(void *)(unsigned long)m); + } + intel_runtime_pm_put(dev_priv); 
mutex_unlock(&dev->struct_mutex); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index a6dad95..1bf457a 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -2127,6 +2127,38 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm, readl(gtt_base); } +void gen8_for_every_pdpe_pde(struct i915_hw_ppgtt *ppgtt, +void (*callback)(struct i915_page_directory_pointer_entry *pdp, + struct i915_page_directory_entry *pd, + struct i915_page_table_entry *pt, + unsigned pdpe, + unsigned pde, + void *data), +void *data) +{ + uint64_t start = ppgtt->base.start; + uint64_t length = ppgtt->base.total; + uint64_t pdpe, pde, temp; + + struct i915_page_directory_entry *pd; + struct i915_page_table_entry *pt; + + gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) { + uint64_t pd_start = start, pd_length = length; + int i; + + if (pd == NULL) { + for (i = 0; i < GEN8_PDES_PER_PAGE; i++) + callback(&ppgtt->pdp, NULL, NULL, pdpe, i, data); + continue; + } + + gen8_for_each_pde(pt, pd, pd_start, pd_length, temp, pde) { + callback(&ppgtt->pdp, pd, pt, pdpe, pde, data); + } + } +} + static void gen6_ggtt_clear_range(struct i915_address_space *vm, uint64_t start, uint64_t length, diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index a33c6e9..144858e 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -483,6 +483,15 @@ static inline size_t gen8_pde_count(uint64_t addr, uint64_t length) return i915_pde_index(end, GEN8_PDE_SHIFT) - i915_pde_index(addr, GEN8_PDE_SHIFT); } +void gen8_for_every_pdpe_pde(struct i915_hw_ppgtt *ppgtt, +void (*callback)(struct i915_page_directory_pointer_entry *pdp, + struct i915_page_directory_entry *pd, + struct i915_page_table_entry *pt, +
[Intel-gfx] [PATCH v5 04/32] drm/i915: Plumb drm_device through page tables operations
The next patch in the series will require it for alloc_pt_single. Signed-off-by: Michel Thierry --- drivers/gpu/drm/i915/i915_gem_gtt.c | 29 - 1 file changed, 16 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 65c77e5..65a506c 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -142,7 +142,6 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt) return has_aliasing_ppgtt ? 1 : 0; } - static void ppgtt_bind_vma(struct i915_vma *vma, enum i915_cache_level cache_level, u32 flags); @@ -279,7 +278,7 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr, return pte; } -static void unmap_and_free_pt(struct i915_page_table_entry *pt) +static void unmap_and_free_pt(struct i915_page_table_entry *pt, struct drm_device *dev) { if (WARN_ON(!pt->page)) return; @@ -287,7 +286,7 @@ static void unmap_and_free_pt(struct i915_page_table_entry *pt) kfree(pt); } -static struct i915_page_table_entry *alloc_pt_single(void) +static struct i915_page_table_entry *alloc_pt_single(struct drm_device *dev) { struct i915_page_table_entry *pt; @@ -317,7 +316,9 @@ static struct i915_page_table_entry *alloc_pt_single(void) * * Return: 0 if allocation succeeded. 
*/ -static int alloc_pt_range(struct i915_page_directory_entry *pd, uint16_t pde, size_t count) +static int alloc_pt_range(struct i915_page_directory_entry *pd, uint16_t pde, size_t count, + struct drm_device *dev) + { int i, ret; @@ -326,7 +327,7 @@ static int alloc_pt_range(struct i915_page_directory_entry *pd, uint16_t pde, si return -EINVAL; for (i = pde; i < pde + count; i++) { - struct i915_page_table_entry *pt = alloc_pt_single(); + struct i915_page_table_entry *pt = alloc_pt_single(dev); if (IS_ERR(pt)) { ret = PTR_ERR(pt); @@ -342,7 +343,7 @@ static int alloc_pt_range(struct i915_page_directory_entry *pd, uint16_t pde, si err_out: while (i--) - unmap_and_free_pt(pd->page_tables[i]); + unmap_and_free_pt(pd->page_tables[i], dev); return ret; } @@ -521,7 +522,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, } } -static void gen8_free_page_tables(struct i915_page_directory_entry *pd) +static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct drm_device *dev) { int i; @@ -532,7 +533,7 @@ static void gen8_free_page_tables(struct i915_page_directory_entry *pd) if (WARN_ON(!pd->page_tables[i])) continue; - unmap_and_free_pt(pd->page_tables[i]); + unmap_and_free_pt(pd->page_tables[i], dev); pd->page_tables[i] = NULL; } } @@ -545,7 +546,7 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt) if (WARN_ON(!ppgtt->pdp.page_directory[i])) continue; - gen8_free_page_tables(ppgtt->pdp.page_directory[i]); + gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev); unmap_and_free_pd(ppgtt->pdp.page_directory[i]); } } @@ -597,7 +598,7 @@ static int gen8_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt) for (i = 0; i < ppgtt->num_pd_pages; i++) { ret = alloc_pt_range(ppgtt->pdp.page_directory[i], -0, GEN8_PDES_PER_PAGE); +0, GEN8_PDES_PER_PAGE, ppgtt->base.dev); if (ret) goto unwind_out; } @@ -606,7 +607,7 @@ static int gen8_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt) unwind_out: while (i--) - 
gen8_free_page_tables(ppgtt->pdp.page_directory[i]); + gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev); return -ENOMEM; } @@ -1087,7 +1088,7 @@ static void gen6_ppgtt_free(struct i915_hw_ppgtt *ppgtt) int i; for (i = 0; i < ppgtt->num_pd_entries; i++) - unmap_and_free_pt(ppgtt->pd.page_tables[i]); + unmap_and_free_pt(ppgtt->pd.page_tables[i], ppgtt->base.dev); unmap_and_free_pd(&ppgtt->pd); } @@ -1152,7 +1153,9 @@ static int gen6_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt) if (ret) return ret; - ret = alloc_pt_range(&ppgtt->pd, 0, ppgtt->num_pd_entries); + ret = alloc_pt_range(&ppgtt->pd, 0, ppgtt->num_pd_entries, + ppgtt->base.dev); + if (ret) { drm_mm_remove_node(&ppgtt->node); return ret; -- 2.1.1 ___ Intel-gfx
[Intel-gfx] [PATCH v5 22/32] drm/i915/bdw: Abstract PDP usage
From: Ben Widawsky Up until now, ppgtt->pdp has always been the root of our page tables. Legacy 32b addresses acted like it had 1 PDP with 4 PDPEs. In preparation for 4 level page tables, we need to stop using ppgtt->pdp directly unless we know it's what we want. The future structure will use ppgtt->pml4 for the top level, and the pdp is just one of the entries being pointed to by a pml4e. This patch addresses some carelessness done throughout development wrt assumptions made of the root page tables. v2: Updated after dynamic page allocation changes. Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v2) --- drivers/gpu/drm/i915/i915_gem_gtt.c | 123 1 file changed, 70 insertions(+), 53 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 1cd5f65..92ca430 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -559,6 +559,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm, { struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base); + struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */ gen8_gtt_pte_t *pt_vaddr, scratch_pte; unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK; unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK; @@ -574,10 +575,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm, struct i915_page_table_entry *pt; struct page *page_table; - if (WARN_ON(!ppgtt->pdp.page_directory[pdpe])) + if (WARN_ON(!pdp->page_directory[pdpe])) continue; - pd = ppgtt->pdp.page_directory[pdpe]; + pd = pdp->page_directory[pdpe]; if (WARN_ON(!pd->page_tables[pde])) continue; @@ -619,6 +620,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, { struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base); + struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */ gen8_gtt_pte_t *pt_vaddr; unsigned pdpe = start >> GEN8_PDPE_SHIFT &
GEN8_PDPE_MASK; unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK; @@ -629,7 +631,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) { if (pt_vaddr == NULL) { - struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[pdpe]; + struct i915_page_directory_entry *pd = pdp->page_directory[pdpe]; struct i915_page_table_entry *pt = pd->page_tables[pde]; struct page *page_table = pt->page; @@ -707,16 +709,17 @@ static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct d static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt) { struct pci_dev *hwdev = ppgtt->base.dev->pdev; + struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */ int i, j; - for_each_set_bit(i, ppgtt->pdp.used_pdpes, + for_each_set_bit(i, pdp->used_pdpes, I915_PDPES_PER_PDP(ppgtt->base.dev)) { struct i915_page_directory_entry *pd; - if (WARN_ON(!ppgtt->pdp.page_directory[i])) + if (WARN_ON(!pdp->page_directory[i])) continue; - pd = ppgtt->pdp.page_directory[i]; + pd = pdp->page_directory[i]; if (!pd->daddr) pci_unmap_page(hwdev, pd->daddr, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL); @@ -742,15 +745,21 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt) { int i; - for_each_set_bit(i, ppgtt->pdp.used_pdpes, - I915_PDPES_PER_PDP(ppgtt->base.dev)) { - if (WARN_ON(!ppgtt->pdp.page_directory[i])) - continue; + if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) { + for_each_set_bit(i, ppgtt->pdp.used_pdpes, +I915_PDPES_PER_PDP(ppgtt->base.dev)) { + if (WARN_ON(!ppgtt->pdp.page_directory[i])) + continue; - gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev); - unmap_and_free_pd(ppgtt->pdp.page_directory[i], ppgtt->base.dev); + gen8_free_page_tables(ppgtt->pdp.page_directory[i], + ppgtt->base.dev); + unmap_and_free_pd(ppgtt->pdp.page_directory[i], + ppgtt->base.dev); + } + unmap_and_free_pdp(&ppgtt->pdp
[Intel-gfx] [PATCH v5 18/32] drm/i915/bdw: begin bitmap tracking
From: Ben Widawsky Like with gen6/7, we can enable bitmap tracking with all the preallocations to make sure things actually don't blow up. v2: Rebased to match changes from previous patches. v3: Without teardown logic, rely on used_pdpes and used_pdes when freeing page tables. Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v2+) --- drivers/gpu/drm/i915/i915_gem_gtt.c | 75 - drivers/gpu/drm/i915/i915_gem_gtt.h | 24 2 files changed, 81 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 3a75408..d9b488a 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -422,6 +422,7 @@ static void unmap_and_free_pd(struct i915_page_directory_entry *pd, if (pd->page) { i915_dma_unmap_single(pd, dev); __free_page(pd->page); + kfree(pd->used_pdes); kfree(pd); } } @@ -429,26 +430,35 @@ static void unmap_and_free_pd(struct i915_page_directory_entry *pd, static struct i915_page_directory_entry *alloc_pd_single(struct drm_device *dev) { struct i915_page_directory_entry *pd; - int ret; + int ret = -ENOMEM; pd = kzalloc(sizeof(*pd), GFP_KERNEL); if (!pd) return ERR_PTR(-ENOMEM); + pd->used_pdes = kcalloc(BITS_TO_LONGS(GEN8_PDES_PER_PAGE), + sizeof(*pd->used_pdes), GFP_KERNEL); + if (!pd->used_pdes) + goto free_pd; + pd->page = alloc_page(GFP_KERNEL | __GFP_ZERO); - if (!pd->page) { - kfree(pd); - return ERR_PTR(-ENOMEM); - } + if (!pd->page) + goto free_bitmap; ret = i915_dma_map_single(pd, dev); - if (ret) { - __free_page(pd->page); - kfree(pd); - return ERR_PTR(ret); - } + if (ret) + goto free_page; return pd; + +free_page: + __free_page(pd->page); +free_bitmap: + kfree(pd->used_pdes); +free_pd: + kfree(pd); + + return ERR_PTR(ret); } /* Broadwell Page Directory Pointer Descriptors */ @@ -639,7 +649,7 @@ static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct d if (!pd->page) return; - for (i = 0; i < GEN8_PDES_PER_PAGE; i++) { + 
for_each_set_bit(i, pd->used_pdes, GEN8_PDES_PER_PAGE) { if (WARN_ON(!pd->page_tables[i])) continue; @@ -653,15 +663,18 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt) struct pci_dev *hwdev = ppgtt->base.dev->pdev; int i, j; - for (i = 0; i < GEN8_LEGACY_PDPES; i++) { - if (!ppgtt->pdp.page_directory[i]->daddr) + for_each_set_bit(i, ppgtt->pdp.used_pdpes, GEN8_LEGACY_PDPES) { + struct i915_page_directory_entry *pd; + + if (WARN_ON(!ppgtt->pdp.page_directory[i])) continue; - pci_unmap_page(hwdev, ppgtt->pdp.page_directory[i]->daddr, PAGE_SIZE, - PCI_DMA_BIDIRECTIONAL); + pd = ppgtt->pdp.page_directory[i]; + if (!pd->daddr) + pci_unmap_page(hwdev, pd->daddr, PAGE_SIZE, + PCI_DMA_BIDIRECTIONAL); - for (j = 0; j < GEN8_PDES_PER_PAGE; j++) { - struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[i]; + for_each_set_bit(j, pd->used_pdes, GEN8_PDES_PER_PAGE) { struct i915_page_table_entry *pt; dma_addr_t addr; @@ -682,7 +695,7 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt) { int i; - for (i = 0; i < GEN8_LEGACY_PDPES; i++) { + for_each_set_bit(i, ppgtt->pdp.used_pdpes, GEN8_LEGACY_PDPES) { if (WARN_ON(!ppgtt->pdp.page_directory[i])) continue; @@ -725,6 +738,7 @@ unwind_out: return -ENOMEM; } +/* bitmap of new page_directories */ static int gen8_ppgtt_alloc_page_directories(struct i915_page_directory_pointer_entry *pdp, uint64_t start, uint64_t length, @@ -740,6 +754,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_page_directory_pointer_ gen8_for_each_pdpe(unused, pdp, start, length, temp, pdpe) { BUG_ON(unused); pdp->page_directory[pdpe] = alloc_pd_single(dev); + if (IS_ERR(pdp->page_directory[pdpe])) goto unwind_out; } @@ -760,10 +775,13 @@ static int gen8_alloc_va_range(struct i915_address_space *vm, struct i915_hw_ppgtt *ppgtt = container_of(vm, struct i915_hw_ppgtt, base); struct i915_page_di
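For readers following the series, the idea behind the bitmap tracking in this patch can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct dir`, `dir_alloc_entry()`, `dir_free_used()` and `NUM_ENTRIES` are hypothetical names, not the i915 structures, and the loop tests each bit in turn instead of using the kernel's `for_each_set_bit()` helper.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_ENTRIES 512 /* assumed directory size, for illustration only */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((NUM_ENTRIES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for a page directory with a "used" bitmap. */
struct dir {
	unsigned long used[BITMAP_WORDS]; /* one bit per populated slot */
	void *entries[NUM_ENTRIES];
};

static int is_used(const struct dir *d, size_t i)
{
	return (int)((d->used[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL);
}

/* Allocate slot i and record it in the bitmap; returns 0 or -1. */
static int dir_alloc_entry(struct dir *d, size_t i)
{
	if (is_used(d, i))
		return 0; /* already populated, nothing to do */

	d->entries[i] = calloc(1, 64);
	if (!d->entries[i])
		return -1;

	d->used[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
	return 0;
}

/* Free only the slots whose bit is set; returns how many were freed. */
static size_t dir_free_used(struct dir *d)
{
	size_t i, freed = 0;

	for (i = 0; i < NUM_ENTRIES; i++) {
		if (!is_used(d, i))
			continue;
		free(d->entries[i]);
		d->entries[i] = NULL;
		d->used[i / BITS_PER_WORD] &= ~(1UL << (i % BITS_PER_WORD));
		freed++;
	}
	return freed;
}
```

The point of the design is that teardown walks the bitmap rather than the whole array, so slots that were never allocated are never touched.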
[Intel-gfx] [PATCH v5 19/32] drm/i915/bdw: Dynamic page table allocations
From: Ben Widawsky

This finishes off the dynamic page table allocations, in the legacy 3 level style that already exists. Most everything has already been set up to this point; the patch finishes off the enabling by setting the appropriate function pointers.

v2: Update aliasing/true ppgtt allocate/teardown/clear functions for gen 6 & 7.

v3: Rebase.

v4: Remove BUG() from ppgtt_unbind_vma, but keep checking that either teardown_va_range or clear_range functions exist (Daniel).

v5: Similar to gen6, in init, gen8_ppgtt_clear_range call is only needed for aliasing ppgtt. Zombie tracking was originally added for teardown function and is no longer required.

v6: Update err_out case in gen8_alloc_va_range (missed from latest rebase).

Cc: Daniel Vetter
Signed-off-by: Ben Widawsky
Signed-off-by: Michel Thierry (v2+)
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 300 +--- 1 file changed, 246 insertions(+), 54 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index d9b488a..63caaed 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -612,7 +612,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, } } -static void __gen8_do_map_pt(gen8_ppgtt_pde_t *pde, +static void __gen8_do_map_pt(gen8_ppgtt_pde_t * const pde, struct i915_page_table_entry *pt, struct drm_device *dev) { @@ -629,7 +629,7 @@ static void gen8_map_pagetable_range(struct i915_page_directory_entry *pd, uint64_t length, struct drm_device *dev) { - gen8_ppgtt_pde_t *page_directory = kmap_atomic(pd->page); + gen8_ppgtt_pde_t * const page_directory = kmap_atomic(pd->page); struct i915_page_table_entry *pt; uint64_t temp, pde; @@ -713,58 +713,163 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm) gen8_ppgtt_free(ppgtt); } -static int gen8_ppgtt_alloc_pagetabs(struct i915_page_directory_entry *pd, +/** + * gen8_ppgtt_alloc_pagetabs() - Allocate page tables for VA range.
+ * @ppgtt: Master ppgtt structure. + * @pd:Page directory for this address range. + * @start: Starting virtual address to begin allocations. + * @length Size of the allocations. + * @new_pts: Bitmap set by function with new allocations. Likely used by the + * caller to free on error. + * + * Allocate the required number of page tables. Extremely similar to + * gen8_ppgtt_alloc_page_directories(). The main difference is here we are limited by + * the page directory boundary (instead of the page directory pointer). That + * boundary is 1GB virtual. Therefore, unlike gen8_ppgtt_alloc_page_directories(), it is + * possible, and likely that the caller will need to use multiple calls of this + * function to achieve the appropriate allocation. + * + * Return: 0 if success; negative error code otherwise. + */ +static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt, +struct i915_page_directory_entry *pd, uint64_t start, uint64_t length, -struct drm_device *dev) +unsigned long *new_pts) { - struct i915_page_table_entry *unused; + struct i915_page_table_entry *pt; uint64_t temp; uint32_t pde; - gen8_for_each_pde(unused, pd, start, length, temp, pde) { - BUG_ON(unused); - pd->page_tables[pde] = alloc_pt_single(dev); - if (IS_ERR(pd->page_tables[pde])) + gen8_for_each_pde(pt, pd, start, length, temp, pde) { + /* Don't reallocate page tables */ + if (pt) { + /* Scratch is never allocated this way */ + WARN_ON(pt->scratch); + continue; + } + + pt = alloc_pt_single(ppgtt->base.dev); + if (IS_ERR(pt)) goto unwind_out; + + pd->page_tables[pde] = pt; + set_bit(pde, new_pts); } return 0; unwind_out: - while (pde--) - unmap_and_free_pt(pd->page_tables[pde], dev); + for_each_set_bit(pde, new_pts, GEN8_PDES_PER_PAGE) + unmap_and_free_pt(pd->page_tables[pde], ppgtt->base.dev); return -ENOMEM; } -/* bitmap of new page_directories */ -static int gen8_ppgtt_alloc_page_directories(struct i915_page_directory_pointer_entry *pdp, +/** + * gen8_ppgtt_alloc_page_directories() - Allocate 
page directories for VA range. + * @ppgtt: Master ppgtt structure. + * @pdp: Page directory pointer for this address range. + * @start: Starting virtual address to begin allocations. + * @length Size of
[Intel-gfx] [PATCH v5 01/32] drm/i915: page table abstractions
From: Ben Widawsky When we move to dynamic page allocation, keeping page_directory and pagetabs as separate structures will help to break actions into simpler tasks. To help transition the code nicely there is some wasted space in gen6/7. This will be ameliorated shortly. Following the x86 pagetable terminology: PDPE = struct i915_page_directory_pointer_entry. PDE = struct i915_page_directory_entry [page_directory]. PTE = struct i915_page_table_entry [page_tables]. v2: fixed mismatches after clean-up/rebase. v3: Clarify the names of the multiple levels of page tables (Daniel) v4: Addressing Mika's review comments. s/gen8_free_page_directories/gen8_free_page_directory and free the page tables for the directory there. In gen8_ppgtt_allocate_page_directories, do not leak previously allocated pt in case the page_directory alloc fails. Update error return handling in gen8_ppgtt_alloc. Cc: Mika Kuoppala Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v2+) --- drivers/gpu/drm/i915/i915_gem_gtt.c | 178 ++-- drivers/gpu/drm/i915/i915_gem_gtt.h | 23 - 2 files changed, 109 insertions(+), 92 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index e54b2a0..10026d3 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -338,7 +338,8 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm, I915_CACHE_LLC, use_scratch); while (num_entries) { - struct page *page_table = ppgtt->gen8_pt_pages[pdpe][pde]; + struct i915_page_directory_entry *pd = &ppgtt->pdp.page_directory[pdpe]; + struct page *page_table = pd->page_tables[pde].page; last_pte = pte + num_entries; if (last_pte > GEN8_PTES_PER_PAGE) @@ -382,8 +383,12 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, if (WARN_ON(pdpe >= GEN8_LEGACY_PDPES)) break; - if (pt_vaddr == NULL) - pt_vaddr = kmap_atomic(ppgtt->gen8_pt_pages[pdpe][pde]); + if (pt_vaddr == NULL) { + struct i915_page_directory_entry 
*pd = &ppgtt->pdp.page_directory[pdpe]; + struct page *page_table = pd->page_tables[pde].page; + + pt_vaddr = kmap_atomic(page_table); + } pt_vaddr[pte] = gen8_pte_encode(sg_page_iter_dma_address(&sg_iter), @@ -407,29 +412,33 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, } } -static void gen8_free_page_tables(struct page **pt_pages) +static void gen8_free_page_tables(struct i915_page_directory_entry *pd) { int i; - if (pt_pages == NULL) + if (pd->page_tables == NULL) return; for (i = 0; i < GEN8_PDES_PER_PAGE; i++) - if (pt_pages[i]) - __free_pages(pt_pages[i], 0); + if (pd->page_tables[i].page) + __free_page(pd->page_tables[i].page); } -static void gen8_ppgtt_free(const struct i915_hw_ppgtt *ppgtt) +static void gen8_free_page_directory(struct i915_page_directory_entry *pd) +{ + gen8_free_page_tables(pd); + kfree(pd->page_tables); + __free_page(pd->page); +} + +static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt) { int i; for (i = 0; i < ppgtt->num_pd_pages; i++) { - gen8_free_page_tables(ppgtt->gen8_pt_pages[i]); - kfree(ppgtt->gen8_pt_pages[i]); + gen8_free_page_directory(&ppgtt->pdp.page_directory[i]); kfree(ppgtt->gen8_pt_dma_addr[i]); } - - __free_pages(ppgtt->pd_pages, get_order(ppgtt->num_pd_pages << PAGE_SHIFT)); } static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt) @@ -464,86 +473,77 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm) gen8_ppgtt_free(ppgtt); } -static struct page **__gen8_alloc_page_tables(void) +static int gen8_ppgtt_allocate_dma(struct i915_hw_ppgtt *ppgtt) { - struct page **pt_pages; int i; - pt_pages = kcalloc(GEN8_PDES_PER_PAGE, sizeof(struct page *), GFP_KERNEL); - if (!pt_pages) - return ERR_PTR(-ENOMEM); - - for (i = 0; i < GEN8_PDES_PER_PAGE; i++) { - pt_pages[i] = alloc_page(GFP_KERNEL); - if (!pt_pages[i]) - goto bail; + for (i = 0; i < ppgtt->num_pd_pages; i++) { + ppgtt->gen8_pt_dma_addr[i] = kcalloc(GEN8_PDES_PER_PAGE, +sizeof(dma_addr_t), +GFP_KERNEL); + if 
(!ppgtt->gen8_pt_dma_addr[i]) + return -ENOMEM; } - return pt_pages; - -bail: -
[Intel-gfx] [PATCH v5 31/32] drm/i915: Expand error state's address width to 64b
From: Ben Widawsky

v2: 0 pad the new 8B fields or else intel_error_decode has a hard time.
Note, regardless we need an igt update.

v3: Make reloc_offset 64b also.

Signed-off-by: Ben Widawsky
---
 drivers/gpu/drm/i915/i915_drv.h       |  4 ++--
 drivers/gpu/drm/i915/i915_gpu_error.c | 17 +
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 662d6c1..d28abd1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -459,7 +459,7 @@ struct drm_i915_error_state {
 
 		struct drm_i915_error_object {
 			int page_count;
-			u32 gtt_offset;
+			u64 gtt_offset;
 			u32 *pages[0];
 		} *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page;
 
@@ -485,7 +485,7 @@ struct drm_i915_error_state {
 		u32 size;
 		u32 name;
 		u32 rseqno, wseqno;
-		u32 gtt_offset;
+		u64 gtt_offset;
 		u32 read_domains;
 		u32 write_domain;
 		s32 fence_reg:I915_MAX_NUM_FENCE_BITS;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index a982849..bbf25d0 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -195,7 +195,7 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
 	err_printf(m, " %s [%d]:\n", name, count);
 
 	while (count--) {
-		err_printf(m, "%08x %8u %02x %02x %x %x",
+		err_printf(m, "%016llx %8u %02x %02x %x %x",
 			   err->gtt_offset,
 			   err->size,
 			   err->read_domains,
@@ -415,7 +415,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 				err_printf(m, " (submitted by %s [%d])",
 					   error->ring[i].comm,
 					   error->ring[i].pid);
-			err_printf(m, " --- gtt_offset = 0x%08x\n",
+			err_printf(m, " --- gtt_offset = 0x%016llx\n",
 				   obj->gtt_offset);
 			print_error_obj(m, obj);
 		}
@@ -423,7 +423,8 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		obj = error->ring[i].wa_batchbuffer;
 		if (obj) {
 			err_printf(m, "%s (w/a) --- gtt_offset = 0x%08x\n",
-				   dev_priv->ring[i].name, obj->gtt_offset);
+				   dev_priv->ring[i].name,
+				   lower_32_bits(obj->gtt_offset));
 			print_error_obj(m, obj);
 		}
 
@@ -442,14 +443,14 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		if ((obj = error->ring[i].ringbuffer)) {
 			err_printf(m, "%s --- ringbuffer = 0x%08x\n",
 				   dev_priv->ring[i].name,
-				   obj->gtt_offset);
+				   lower_32_bits(obj->gtt_offset));
 			print_error_obj(m, obj);
 		}
 
 		if ((obj = error->ring[i].hws_page)) {
 			err_printf(m, "%s --- HW Status = 0x%08x\n",
 				   dev_priv->ring[i].name,
-				   obj->gtt_offset);
+				   lower_32_bits(obj->gtt_offset));
 			offset = 0;
 			for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
 				err_printf(m, "[%04x] %08x %08x %08x %08x\n",
@@ -465,13 +466,13 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		if ((obj = error->ring[i].ctx)) {
 			err_printf(m, "%s --- HW Context = 0x%08x\n",
 				   dev_priv->ring[i].name,
-				   obj->gtt_offset);
+				   lower_32_bits(obj->gtt_offset));
 			print_error_obj(m, obj);
 		}
 	}
 
 	if ((obj = error->semaphore_obj)) {
-		err_printf(m, "Semaphore page = 0x%08x\n", obj->gtt_offset);
+		err_printf(m, "Semaphore page = 0x%016llx\n", obj->gtt_offset);
 		for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
 			err_printf(m, "[%04x] %08x %08x %08x %08x\n",
 				   elt * 4,
@@ -571,7 +572,7 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
 	int num_pages;
 	bool use_ggtt;
 	int i = 0;
-	u32 reloc_offset;
+	u64 reloc_offset;
 
 	if (src == NULL || src->pages == NULL)
 		return NULL;
-- 
2.1.1
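The two output styles used in this patch (zero-padded 64-bit offsets in some places, the low 32 bits in others) can be tried out in userspace with the sketch below. It is an illustration only: `lower_32()`, `format_gtt_offset()` and `format_gtt_offset_legacy()` are hypothetical helpers standing in for the kernel's `lower_32_bits()` and `err_printf()` calls, and `PRIx64` is used in place of the kernel's `%llx`.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's lower_32_bits() macro. */
static uint32_t lower_32(uint64_t v)
{
	return (uint32_t)(v & 0xffffffffu);
}

/* Full-width form: 16 hex digits, zero padded, as with "0x%016llx". */
static int format_gtt_offset(char *buf, size_t len, uint64_t offset)
{
	return snprintf(buf, len, "0x%016" PRIx64, offset);
}

/* Legacy form: only the low 32 bits, as with "0x%08x". */
static int format_gtt_offset_legacy(char *buf, size_t len, uint64_t offset)
{
	return snprintf(buf, len, "0x%08" PRIx32, lower_32(offset));
}
```

With an offset above 4GB the legacy form silently drops the upper bits, which is why any decoder reading the new format needs to expect the wider field.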
[Intel-gfx] [PATCH v5 26/32] drm/i915/bdw: Add 4 level switching infrastructure
From: Ben Widawsky Map is easy, it's the same register as the PDP descriptor 0, but it only has one entry. v2: PML4 update in legacy context switch is left for historic reasons, the preferred mode of operation is with lrc context based submission. Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry --- drivers/gpu/drm/i915/i915_gem_gtt.c | 56 + drivers/gpu/drm/i915/i915_gem_gtt.h | 4 ++- drivers/gpu/drm/i915/i915_reg.h | 1 + 3 files changed, 55 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 44f8fa5..2c3f2db 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -192,6 +192,9 @@ static inline gen8_ppgtt_pde_t gen8_pde_encode(struct drm_device *dev, return pde; } +#define gen8_pdpe_encode gen8_pde_encode +#define gen8_pml4e_encode gen8_pde_encode + static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr, enum i915_cache_level level, bool valid, u32 unused) @@ -591,8 +594,8 @@ static int gen8_write_pdp(struct intel_engine_cs *ring, return 0; } -static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt, - struct intel_engine_cs *ring) +static int gen8_legacy_mm_switch(struct i915_hw_ppgtt *ppgtt, +struct intel_engine_cs *ring) { int i, ret; @@ -609,6 +612,12 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt, return 0; } +static int gen8_48b_mm_switch(struct i915_hw_ppgtt *ppgtt, + struct intel_engine_cs *ring) +{ + return gen8_write_pdp(ring, 0, ppgtt->pml4.daddr); +} + static void gen8_ppgtt_clear_range(struct i915_address_space *vm, uint64_t start, uint64_t length, @@ -752,6 +761,37 @@ static void gen8_map_pagetable_range(struct i915_address_space *vm, kunmap_atomic(page_directory); } +static void gen8_map_page_directory(struct i915_page_directory_pointer_entry *pdp, + struct i915_page_directory_entry *pd, + int index, + struct drm_device *dev) +{ + gen8_ppgtt_pdpe_t *page_directorypo; + gen8_ppgtt_pdpe_t pdpe; + + /* We do not need to clflush 
because no platform requiring flush +* supports 64b pagetables. */ + if (!USES_FULL_48BIT_PPGTT(dev)) + return; + + page_directorypo = kmap_atomic(pdp->page); + pdpe = gen8_pdpe_encode(dev, pd->daddr, I915_CACHE_LLC); + page_directorypo[index] = pdpe; + kunmap_atomic(page_directorypo); +} + +static void gen8_map_page_directory_pointer(struct i915_pml4 *pml4, + struct i915_page_directory_pointer_entry *pdp, + int index, + struct drm_device *dev) +{ + gen8_ppgtt_pml4e_t *pagemap = kmap_atomic(pml4->page); + gen8_ppgtt_pml4e_t pml4e = gen8_pml4e_encode(dev, pdp->daddr, I915_CACHE_LLC); + BUG_ON(!USES_FULL_48BIT_PPGTT(dev)); + pagemap[index] = pml4e; + kunmap_atomic(pagemap); +} + static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct drm_device *dev) { int i; @@ -1123,6 +1163,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm, set_bit(pdpe, pdp->used_pdpes); gen8_map_pagetable_range(vm, pd, start, length); + gen8_map_page_directory(pdp, pd, pdpe, dev); } free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes); @@ -1191,6 +1232,8 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm, ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length); if (ret) goto err_out; + + gen8_map_page_directory_pointer(pml4, pdp, pml4e, vm->dev); } bitmap_or(pml4->used_pml4es, new_pdps, pml4->used_pml4es, @@ -1250,14 +1293,14 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size) ppgtt->base.cleanup = gen8_ppgtt_cleanup; ppgtt->base.insert_entries = gen8_ppgtt_insert_entries; - ppgtt->switch_mm = gen8_mm_switch; - if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) { int ret = pml4_init(ppgtt); if (ret) { unmap_and_free_pt(ppgtt->scratch_pd, ppgtt->base.dev); return ret; } + + ppgtt->switch_mm = gen8_48b_mm_switch; } else { int ret = __pdp_init(&ppgtt->pdp, false); if (ret) { @@ -1265,6 +1308,7 @@ static int gen8_ppgtt_init_common(struct i915_hw_ppgtt *ppgtt, uint64_t size)
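As background for the 4 level layout this patch wires up, the sketch below shows how a 48b virtual address would split into per-level indices. It assumes 4KB pages and 512 entries (9 bits) per level, which matches the 1GB page directory boundary mentioned earlier in the series; the macro names, `struct va_indices` and `split_48b_address()` are hypothetical and not taken from the driver.

```c
#include <stdint.h>

#define LEVEL_BITS  9                          /* assumed: 512 entries per level */
#define LEVEL_MASK  ((1u << LEVEL_BITS) - 1)
#define PTE_SHIFT   12                         /* assumed: 4KB pages */
#define PDE_SHIFT   (PTE_SHIFT + LEVEL_BITS)   /* 21: one PDE covers 2MB */
#define PDPE_SHIFT  (PDE_SHIFT + LEVEL_BITS)   /* 30: one PDPE covers 1GB */
#define PML4E_SHIFT (PDPE_SHIFT + LEVEL_BITS)  /* 39: one PML4E covers 512GB */

/* Hypothetical container for the four table indices of one address. */
struct va_indices {
	uint32_t pml4e;
	uint32_t pdpe;
	uint32_t pde;
	uint32_t pte;
};

static struct va_indices split_48b_address(uint64_t addr)
{
	struct va_indices idx;

	idx.pml4e = (uint32_t)(addr >> PML4E_SHIFT) & LEVEL_MASK;
	idx.pdpe  = (uint32_t)(addr >> PDPE_SHIFT) & LEVEL_MASK;
	idx.pde   = (uint32_t)(addr >> PDE_SHIFT) & LEVEL_MASK;
	idx.pte   = (uint32_t)(addr >> PTE_SHIFT) & LEVEL_MASK;

	return idx;
}
```

Under these assumptions the legacy 3 level mode corresponds to the case where the PML4 index is always 0 and only the lower three indices vary.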
[Intel-gfx] [PATCH v5 02/32] drm/i915: Complete page table structures
From: Ben Widawsky

Move the remaining members over to the new page table structures. This can be squashed with the previous commit if desired. The reasoning is the same as that patch. I simply felt it is easier to review if split.

v2: In lrc: s/ppgtt->pd_dma_addr[i]/ppgtt->pdp.page_directory[i].daddr/

v3: Rebase.

Signed-off-by: Ben Widawsky
Signed-off-by: Michel Thierry (v2, v3)
---
drivers/gpu/drm/i915/i915_debugfs.c | 2 +- drivers/gpu/drm/i915/i915_gem_gtt.c | 85 + drivers/gpu/drm/i915/i915_gem_gtt.h | 14 +++--- drivers/gpu/drm/i915/intel_lrc.c| 16 +++ 4 files changed, 44 insertions(+), 73 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 63be374..4d07030 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2185,7 +2185,7 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev) struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt; seq_puts(m, "aliasing PPGTT:\n"); - seq_printf(m, "pd gtt offset: 0x%08x\n", ppgtt->pd_offset); + seq_printf(m, "pd gtt offset: 0x%08x\n", ppgtt->pd.pd_offset); ppgtt->debug_dump(ppgtt, m); } diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 10026d3..eb0714c 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -311,7 +311,7 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt, int used_pd = ppgtt->num_pd_entries / GEN8_PDES_PER_PAGE; for (i = used_pd - 1; i >= 0; i--) { - dma_addr_t addr = ppgtt->pd_dma_addr[i]; + dma_addr_t addr = ppgtt->pdp.page_directory[i].daddr; ret = gen8_write_pdp(ring, i, addr); if (ret) return ret; @@ -437,7 +437,6 @@ static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt) for (i = 0; i < ppgtt->num_pd_pages; i++) { gen8_free_page_directory(&ppgtt->pdp.page_directory[i]); - kfree(ppgtt->gen8_pt_dma_addr[i]); } } @@ -449,14 +448,14 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt) for (i =
0; i < ppgtt->num_pd_pages; i++) { /* TODO: In the future we'll support sparse mappings, so this * will have to change. */ - if (!ppgtt->pd_dma_addr[i]) + if (!ppgtt->pdp.page_directory[i].daddr) continue; - pci_unmap_page(hwdev, ppgtt->pd_dma_addr[i], PAGE_SIZE, + pci_unmap_page(hwdev, ppgtt->pdp.page_directory[i].daddr, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL); for (j = 0; j < GEN8_PDES_PER_PAGE; j++) { - dma_addr_t addr = ppgtt->gen8_pt_dma_addr[i][j]; + dma_addr_t addr = ppgtt->pdp.page_directory[i].page_tables[j].daddr; if (addr) pci_unmap_page(hwdev, addr, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL); @@ -473,32 +472,19 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm) gen8_ppgtt_free(ppgtt); } -static int gen8_ppgtt_allocate_dma(struct i915_hw_ppgtt *ppgtt) -{ - int i; - - for (i = 0; i < ppgtt->num_pd_pages; i++) { - ppgtt->gen8_pt_dma_addr[i] = kcalloc(GEN8_PDES_PER_PAGE, -sizeof(dma_addr_t), -GFP_KERNEL); - if (!ppgtt->gen8_pt_dma_addr[i]) - return -ENOMEM; - } - - return 0; -} - static int gen8_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt) { int i, j; for (i = 0; i < ppgtt->num_pd_pages; i++) { + struct i915_page_directory_entry *pd = &ppgtt->pdp.page_directory[i]; for (j = 0; j < GEN8_PDES_PER_PAGE; j++) { - struct i915_page_table_entry *pt = &ppgtt->pdp.page_directory[i].page_tables[j]; + struct i915_page_table_entry *pt = &pd->page_tables[j]; pt->page = alloc_page(GFP_KERNEL | __GFP_ZERO); if (!pt->page) goto unwind_out; + } } @@ -561,10 +547,6 @@ static int gen8_ppgtt_alloc(struct i915_hw_ppgtt *ppgtt, ppgtt->num_pd_entries = max_pdp * GEN8_PDES_PER_PAGE; - ret = gen8_ppgtt_allocate_dma(ppgtt); - if (ret) - goto err_out; - return 0; err_out: @@ -586,7 +568,7 @@ static int gen8_ppgtt_setup_page_directories(struct i915_hw_ppgtt *ppgtt, if (ret) return ret; - ppgtt->pd_dma_addr[pd] = pd_addr; + ppgtt->pdp.page_directory[pd].daddr = pd_addr;
[Intel-gfx] [PATCH v5 25/32] drm/i915/bdw: implement alloc/free for 4lvl
From: Ben Widawsky The code for 4lvl works just as one would expect, and nicely it is able to call into the existing 3lvl page table code to handle all of the lower levels. PML4 has no special attributes, and there will always be a PML4. So simply initialize it at creation, and destroy it at the end. v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the compiler happy. And define ret only in one place. Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl. Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v2) --- drivers/gpu/drm/i915/i915_gem_gtt.c | 240 +++- drivers/gpu/drm/i915/i915_gem_gtt.h | 11 +- 2 files changed, 217 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 1bf457a..44f8fa5 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -482,9 +482,12 @@ static void __pdp_fini(struct i915_page_directory_pointer_entry *pdp) static void unmap_and_free_pdp(struct i915_page_directory_pointer_entry *pdp, struct drm_device *dev) { - __pdp_fini(pdp); - if (USES_FULL_48BIT_PPGTT(dev)) + if (USES_FULL_48BIT_PPGTT(dev)) { + __pdp_fini(pdp); + i915_dma_unmap_single(pdp, dev); + __free_page(pdp->page); kfree(pdp); + } } static int __pdp_init(struct i915_page_directory_pointer_entry *pdp, @@ -510,6 +513,60 @@ static int __pdp_init(struct i915_page_directory_pointer_entry *pdp, return 0; } +static struct i915_page_directory_pointer_entry *alloc_pdp_single(struct i915_hw_ppgtt *ppgtt, + struct i915_pml4 *pml4) +{ + struct drm_device *dev = ppgtt->base.dev; + struct i915_page_directory_pointer_entry *pdp; + int ret; + + BUG_ON(!USES_FULL_48BIT_PPGTT(dev)); + + pdp = kmalloc(sizeof(*pdp), GFP_KERNEL); + if (!pdp) + return ERR_PTR(-ENOMEM); + + pdp->page = alloc_page(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO); + if (!pdp->page) { + kfree(pdp); + return ERR_PTR(-ENOMEM); + } + + ret = __pdp_init(pdp, dev); + if (ret) { + 
__free_page(pdp->page); + kfree(pdp); + return ERR_PTR(ret); + } + + i915_dma_map_single(pdp, dev); + + return pdp; +} + +static void pml4_fini(struct i915_pml4 *pml4) +{ + struct i915_hw_ppgtt *ppgtt = + container_of(pml4, struct i915_hw_ppgtt, pml4); + i915_dma_unmap_single(pml4, ppgtt->base.dev); + __free_page(pml4->page); + /* HACK */ + pml4->page = NULL; +} + +static int pml4_init(struct i915_hw_ppgtt *ppgtt) +{ + struct i915_pml4 *pml4 = &ppgtt->pml4; + + pml4->page = alloc_page(GFP_KERNEL | __GFP_ZERO); + if (!pml4->page) + return -ENOMEM; + + i915_dma_map_single(pml4, ppgtt->base.dev); + + return 0; +} + /* Broadwell Page Directory Pointer Descriptors */ static int gen8_write_pdp(struct intel_engine_cs *ring, unsigned entry, @@ -711,14 +768,13 @@ static void gen8_free_page_tables(struct i915_page_directory_entry *pd, struct d } } -static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt) +static void gen8_ppgtt_unmap_pages_3lvl(struct i915_page_directory_pointer_entry *pdp, + struct drm_device *dev) { - struct pci_dev *hwdev = ppgtt->base.dev->pdev; - struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */ + struct pci_dev *hwdev = dev->pdev; int i, j; - for_each_set_bit(i, pdp->used_pdpes, - I915_PDPES_PER_PDP(ppgtt->base.dev)) { + for_each_set_bit(i, pdp->used_pdpes, I915_PDPES_PER_PDP(dev)) { struct i915_page_directory_entry *pd; if (WARN_ON(!pdp->page_directory[i])) @@ -746,27 +802,73 @@ static void gen8_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt) } } -static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt) +static void gen8_ppgtt_unmap_pages_4lvl(struct i915_hw_ppgtt *ppgtt) { + struct pci_dev *hwdev = ppgtt->base.dev->pdev; int i; - if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) { - for_each_set_bit(i, ppgtt->pdp.used_pdpes, -I915_PDPES_PER_PDP(ppgtt->base.dev)) { - if (WARN_ON(!ppgtt->pdp.page_directory[i])) - continue; + for_each_set_bit(i, ppgtt->pml4.used_pml4es, GEN8_PML4ES_PER_PML4) { + struct 
i915_page_directory_pointer_entry *pdp; - gen8_free_page_tables(ppgtt->pdp.page_directory[i], - ppgtt->base.dev); - unmap_and_free_pd(ppgtt-
[Intel-gfx] [PATCH v5 32/32] drm/i915/bdw: Flip the 48b switch
Use 48b addresses if hw supports it and i915.enable_ppgtt=3.

Aliasing PPGTT remains 32b only.

Signed-off-by: Michel Thierry
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 7 ++-
 drivers/gpu/drm/i915/i915_params.c  | 2 +-
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index cd57c22..cebb868 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -106,7 +106,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 	has_full_ppgtt = INTEL_INFO(dev)->gen >= 7;
 
 #ifdef CONFIG_64BIT
-	has_full_64bit_ppgtt = IS_BROADWELL(dev) && false; /* FIXME: 64b */
+	has_full_64bit_ppgtt = IS_BROADWELL(dev);
 #else
 	has_full_64bit_ppgtt = false;
 #endif
@@ -1075,9 +1075,6 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
 
 	BUG_ON(!bitmap_empty(new_pds, pdpes));
 
-	/* FIXME: PPGTT container_of won't work for 64b */
-	BUG_ON((start + length) > 0x8ULL);
-
 	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
 		if (pd)
 			continue;
@@ -1396,7 +1393,7 @@ static int gen8_aliasing_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_device *dev = ppgtt->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b */
+	struct i915_page_directory_pointer_entry *pdp = &ppgtt->pdp; /* FIXME: 48b? */
 	struct i915_page_directory_entry *pd;
 	uint64_t temp, start = 0, size = dev_priv->gtt.base.total;
 	uint32_t pdpe;
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 44f2262..1cd43b0 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -119,7 +119,7 @@ MODULE_PARM_DESC(enable_hangcheck,
 module_param_named_unsafe(enable_ppgtt, i915.enable_ppgtt, int, 0400);
 MODULE_PARM_DESC(enable_ppgtt,
 	"Override PPGTT usage. "
-	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full)");
+	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full, 3=full_64b)");
 
 module_param_named(enable_execlists, i915.enable_execlists, int, 0400);
 MODULE_PARM_DESC(enable_execlists,
-- 
2.1.1
Re: [Intel-gfx] [PATCH 11/17] drm/i915: Update the EDID automated compliance test function
On Wed, Dec 10, 2014 at 04:53:11PM -0700, Todd Previte wrote:
> Updates the EDID compliance test function to perform the EDID read as
> required by the tests. This read needs to take place in the kernel for
> reasons of speed and efficiency. The results of the EDID read are handed
> off to userspace so that the remainder of the test can be conducted there.
>
> V2:
> - Addressed mailing list feedback
> - Removed excess debug messages
> - Removed extraneous comments
> - Fixed formatting issues (line length > 80)
> - Updated the debug message in compute_edid_checksum to output hex values
>   instead of decimal
>
> Signed-off-by: Todd Previte

Returning the abstract discussion about where to put the edid checksum
checks back to the concrete patch at hand.

> ---
>  drivers/gpu/drm/i915/intel_dp.c | 72 -
>  1 file changed, 71 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index b6f5a72..2a13124 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -40,6 +40,13 @@
>
>  #define DP_LINK_CHECK_TIMEOUT (10 * 1000)
>
> +/* Compliance test status bits */
> +#define INTEL_DP_EDID_OK (0<<0)
> +#define INTEL_DP_EDID_CORRUPT (1<<0)
> +#define INTEL_DP_RESOLUTION_PREFERRED (1<<2)
> +#define INTEL_DP_RESOLUTION_STANDARD (1<<3)
> +#define INTEL_DP_RESOLUTION_FAILSAFE (1<<4)
> +
>  struct dp_link_dpll {
>  	int link_bw;
>  	struct dpll dpll;
> @@ -3761,9 +3768,72 @@ static uint8_t intel_dp_autotest_video_pattern(struct intel_dp *intel_dp)
>  	return test_result;
>  }
>
> +static bool intel_dp_compute_edid_checksum(uint8_t *edid_data,
> +					   uint8_t *edid_checksum)
> +{
> +	uint32_t byte_total = 0;
> +	uint8_t i = 0;
> +	bool edid_ok = true;
> +
> +	/* Don't include last byte (the checksum) in the computation */
> +	for (i = 0; i < EDID_LENGTH - 2; i++)
> +		byte_total += edid_data[i];
> +
> +	*edid_checksum = 256 - (byte_total % 256);
> +
> +	if (*edid_checksum != edid_data[EDID_LENGTH - 1]) {
> +		DRM_DEBUG_KMS("Invalid EDID checksum %02x, should be %02x\n",
> +			      edid_data[EDID_LENGTH - 40 - 1], *edid_checksum);
> +		edid_ok = false;
> +	}
> +
> +	return edid_ok;
> +}
> +
>  static uint8_t intel_dp_autotest_edid(struct intel_dp *intel_dp)
>  {
> -	uint8_t test_result = DP_TEST_NAK;
> +	struct drm_connector *connector = &intel_dp->attached_connector->base;
> +	struct i2c_adapter *adapter = &intel_dp->aux.ddc;
> +	struct edid *edid_read = NULL;
> +	uint8_t *edid_data = NULL;
> +	uint8_t test_result = DP_TEST_NAK, checksum = 0;
> +	uint32_t ret = 0;
> +
> +	intel_dp->aux.i2c_nack_count = 0;
> +	intel_dp->aux.i2c_defer_count = 0;
> +
> +	edid_read = drm_get_edid(connector, adapter);
> +
> +	if (edid_read == NULL) {

So if the edid core thinks your edid is foul (e.g. checksum mismatch) you
won't ever see the edid and land in this case.

> +		/* Check for NACKs/DEFERs, use failsafe if detected
> +			(DP CTS 1.2 Core Rev 1.1, 4.2.2.4, 4.2.2.5) */
> +		if (intel_dp->aux.i2c_nack_count > 0 ||
> +			intel_dp->aux.i2c_defer_count > 0)
> +			DRM_DEBUG_KMS("EDID read had %d NACKs, %d DEFERs\n",
> +				      intel_dp->aux.i2c_nack_count,
> +				      intel_dp->aux.i2c_defer_count);
> +		intel_dp->compliance_test_data = INTEL_DP_EDID_CORRUPT |
> +						 INTEL_DP_RESOLUTION_FAILSAFE;

And return the above error condition of "corrupt edid" to the tester.

> +	} else {
> +		edid_data = (uint8_t *) edid_read;
> +
> +		if (intel_dp_compute_edid_checksum(edid_data, &checksum)) {
> +			ret = drm_dp_dpcd_write(&intel_dp->aux,
> +						DP_TEST_EDID_CHECKSUM,
> +						&edid_read->checksum, 1);
> +			test_result = DP_TEST_ACK |
> +				      DP_TEST_EDID_CHECKSUM_WRITE;
> +			intel_dp->compliance_test_data =
> +				INTEL_DP_EDID_OK |
> +				INTEL_DP_RESOLUTION_PREFERRED;
> +		} else {

Which means except when there is some difference between your checksum code
and the drm_edid.c checksum code this case here is dead code. And if there
_is_ a difference then they'll never agree, so the "edid ok" case above
can't happen.

Either way one of the two sides of this if is dead code (at least if I
haven't missed something). That's essentially the reason why I think we
should fix up the checksum code in drm_edid.c and drop this.
-Daniel

> +
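The check being debated reduces to the EDID rule that all 128 bytes of a block sum to zero modulo 256. A minimal standalone sketch of that rule follows, assuming plain userspace C; the names EDID_BLOCK_SIZE, edid_expected_checksum() and edid_block_checksum_ok() are invented for illustration and are not the drm_edid.c helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EDID_BLOCK_SIZE 128 /* size of one EDID block; name is local to this sketch */

/* Checksum byte that makes all 128 bytes of the block sum to 0 mod 256. */
static uint8_t edid_expected_checksum(const uint8_t *block)
{
	uint8_t sum = 0;
	size_t i;

	assert(block != NULL);

	/* Sum every byte except the final one, which stores the checksum. */
	for (i = 0; i < EDID_BLOCK_SIZE - 1; i++)
		sum += block[i];

	/* Unsigned wrap-around yields (256 - sum) mod 256, so a sum of 0 stays 0. */
	return (uint8_t)(0u - sum);
}

/* True when the stored checksum byte matches the computed one. */
static bool edid_block_checksum_ok(const uint8_t *block)
{
	return block[EDID_BLOCK_SIZE - 1] == edid_expected_checksum(block);
}
```

With a single helper like this there is only one definition of "valid", which is the point of the review comment: two independent checksum implementations either always agree, making one branch dead, or never agree.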
[Intel-gfx] [PATCH 2/7] drm/i915/skl: Allow scanning out Y and Yf fbs
From: Damien Lespiau

Skylake is able to scan out those tiling formats. We need to allow them in
the ADDFB ioctl and tell the hardware about it.

v2: Rebased for addfb2 interface. (Tvrtko Ursulin)
v3: Rebased for fb modifier changes. (Tvrtko Ursulin)
v4: Don't allow Y tiled fbs just yet. (Tvrtko Ursulin)
v5: Check for stride alignment and max pitch. (Tvrtko Ursulin)
v6: Simplify maximum pitch check. (Ville Syrjälä)
v7: Drop the gen9 check since requirements are no different. (Ville Syrjälä)

Signed-off-by: Damien Lespiau
Signed-off-by: Tvrtko Ursulin
Cc: Ville Syrjälä
---
 drivers/gpu/drm/i915/intel_display.c | 115 ---
 drivers/gpu/drm/i915/intel_drv.h     |   2 +
 drivers/gpu/drm/i915/intel_sprite.c  |  18 --
 3 files changed, 95 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 6e70748..a523d84 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2728,6 +2728,34 @@ static void ironlake_update_primary_plane(struct drm_crtc *crtc,
 	POSTING_READ(reg);
 }
 
+u32 intel_fb_stride_alignment(struct drm_device *dev, uint64_t fb_modifier,
+			      uint32_t pixel_format)
+{
+	u32 bits_per_pixel = drm_format_plane_cpp(pixel_format, 0) * 8;
+
+	/*
+	 * The stride is either expressed as a multiple of 64 bytes
+	 * chunks for linear buffers or in number of tiles for tiled
+	 * buffers.
+	 */
+	switch (fb_modifier) {
+	case DRM_FORMAT_MOD_NONE:
+		return 64;
+	case I915_FORMAT_MOD_X_TILED:
+		return 512;
+	case I915_FORMAT_MOD_Y_TILED:
+		return 128;
+	case I915_FORMAT_MOD_Yf_TILED:
+		if (bits_per_pixel == 8)
+			return 64;
+		else
+			return 128;
+	default:
+		MISSING_CASE(fb_modifier);
+		return 64;
+	}
+}
+
 static void skylake_update_primary_plane(struct drm_crtc *crtc,
 					 struct drm_framebuffer *fb,
 					 int x, int y)
@@ -2735,10 +2763,9 @@ static void skylake_update_primary_plane(struct drm_crtc *crtc,
 	struct drm_device *dev = crtc->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	struct intel_framebuffer *intel_fb;
 	struct drm_i915_gem_object *obj;
 	int pipe = intel_crtc->pipe;
-	u32 plane_ctl, stride;
+	u32 plane_ctl, stride_div;
 
 	if (!intel_crtc->primary_enabled) {
 		I915_WRITE(PLANE_CTL(pipe, 0), 0);
@@ -2782,29 +2809,30 @@ static void skylake_update_primary_plane(struct drm_crtc *crtc,
 		BUG();
 	}
 
-	intel_fb = to_intel_framebuffer(fb);
-	obj = intel_fb->obj;
-
-	/*
-	 * The stride is either expressed as a multiple of 64 bytes chunks for
-	 * linear buffers or in number of tiles for tiled buffers.
-	 */
 	switch (fb->modifier[0]) {
 	case DRM_FORMAT_MOD_NONE:
-		stride = fb->pitches[0] >> 6;
 		break;
 	case I915_FORMAT_MOD_X_TILED:
 		plane_ctl |= PLANE_CTL_TILED_X;
-		stride = fb->pitches[0] >> 9;
+		break;
+	case I915_FORMAT_MOD_Y_TILED:
+		plane_ctl |= PLANE_CTL_TILED_Y;
+		break;
+	case I915_FORMAT_MOD_Yf_TILED:
+		plane_ctl |= PLANE_CTL_TILED_YF;
 		break;
 	default:
-		BUG();
+		MISSING_CASE(fb->modifier[0]);
 	}
 
 	plane_ctl |= PLANE_CTL_PLANE_GAMMA_DISABLE;
 	if (crtc->primary->state->rotation == BIT(DRM_ROTATE_180))
 		plane_ctl |= PLANE_CTL_ROTATE_180;
 
+	obj = intel_fb_obj(fb);
+	stride_div = intel_fb_stride_alignment(dev, fb->modifier[0],
+					       fb->pixel_format);
+
 	I915_WRITE(PLANE_CTL(pipe, 0), plane_ctl);
 
 	DRM_DEBUG_KMS("Writing base %08lX %d,%d,%d,%d pitch=%d\n",
@@ -2817,7 +2845,7 @@ static void skylake_update_primary_plane(struct drm_crtc *crtc,
 	I915_WRITE(PLANE_SIZE(pipe, 0),
 		   (intel_crtc->config->pipe_src_h - 1) << 16 |
 		   (intel_crtc->config->pipe_src_w - 1));
-	I915_WRITE(PLANE_STRIDE(pipe, 0), stride);
+	I915_WRITE(PLANE_STRIDE(pipe, 0), fb->pitches[0] / stride_div);
 	I915_WRITE(PLANE_SURF(pipe, 0), i915_gem_obj_ggtt_offset(obj));
 
 	POSTING_READ(PLANE_SURF(pipe, 0));
@@ -12657,14 +12685,43 @@ static const struct drm_framebuffer_funcs intel_fb_funcs = {
 	.create_handle = intel_user_framebuffer_create_handle,
 };
 
+static
+u32 intel_fb_pitch_limit(struct drm_device *dev, uint64_t fb_modifier,
+			 uint32_t pixel_format)
+{
+	u32 gen = INTEL_INF
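The stride handling in this patch can be read as a unit conversion: the byte pitch is divided by a per-tiling unit before being written to PLANE_STRIDE. A minimal userspace sketch of that arithmetic follows, using the unit sizes from intel_fb_stride_alignment(); the enum and function names are hypothetical stand-ins, not driver symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fb modifiers named in the patch. */
enum fb_tiling_sketch {
	FB_SKETCH_LINEAR,
	FB_SKETCH_X_TILED,
	FB_SKETCH_Y_TILED,
	FB_SKETCH_YF_TILED,
};

/* Bytes per stride unit: 64-byte chunks when linear, tile width when tiled. */
static uint32_t sketch_stride_unit(enum fb_tiling_sketch tiling, uint32_t bpp)
{
	switch (tiling) {
	case FB_SKETCH_X_TILED:
		return 512;
	case FB_SKETCH_Y_TILED:
		return 128;
	case FB_SKETCH_YF_TILED:
		return bpp == 8 ? 64 : 128;
	case FB_SKETCH_LINEAR:
	default:
		return 64;
	}
}

/* Convert a pitch in bytes into the value a stride register would hold. */
static uint32_t sketch_pitch_to_stride(uint32_t pitch,
				       enum fb_tiling_sketch tiling,
				       uint32_t bpp)
{
	uint32_t unit = sketch_stride_unit(tiling, bpp);

	/* The pitch must be a whole number of units for the division to be exact. */
	assert(pitch % unit == 0);
	return pitch / unit;
}
```

For a 4096-byte pitch this gives 64 units for a linear buffer, 8 for X tiling and 32 for Y tiling, which matches the old `>> 6` and `>> 9` shifts for the two cases that existed before.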
[Intel-gfx] [PATCH 5/7] drm/i915/skl: Adjust get_plane_config() to support Yb/Yf tiling
From: Damien Lespiau

v2: Rebased for addfb2 interface and consolidated a bit. (Tvrtko Ursulin)
v3: Rebased for fb modifier changes. (Tvrtko Ursulin)

Signed-off-by: Damien Lespiau
Signed-off-by: Tvrtko Ursulin
Reviewed-by: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/intel_display.c | 45 ++--
 1 file changed, 28 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 358a97e..c622b11 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7718,7 +7718,7 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
 {
 	struct drm_device *dev = crtc->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	u32 val, base, offset, stride_mult;
+	u32 val, base, offset, stride_mult, tiling;
 	int pipe = crtc->pipe;
 	int fourcc, pixel_format;
 	int aligned_height;
@@ -7737,11 +7737,6 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
 	if (!(val & PLANE_CTL_ENABLE))
 		goto error;
 
-	if (val & PLANE_CTL_TILED_MASK) {
-		plane_config->tiling = I915_TILING_X;
-		fb->modifier[0] = I915_FORMAT_MOD_X_TILED;
-	}
-
 	pixel_format = val & PLANE_CTL_FORMAT_MASK;
 	fourcc = skl_format_to_fourcc(pixel_format,
 				      val & PLANE_CTL_ORDER_RGBX,
@@ -7749,6 +7744,33 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
 	fb->pixel_format = fourcc;
 	fb->bits_per_pixel = drm_format_plane_cpp(fourcc, 0) * 8;
 
+	tiling = val & PLANE_CTL_TILED_MASK;
+	switch (tiling) {
+	case PLANE_CTL_TILED_LINEAR:
+		fb->modifier[0] = DRM_FORMAT_MOD_NONE;
+		stride_mult = 64;
+		break;
+	case PLANE_CTL_TILED_X:
+		plane_config->tiling = I915_TILING_X;
+		fb->modifier[0] = I915_FORMAT_MOD_X_TILED;
+		stride_mult = 512;
+		break;
+	case PLANE_CTL_TILED_Y:
+		fb->modifier[0] = I915_FORMAT_MOD_Y_TILED;
+		stride_mult = 128;
+		break;
+	case PLANE_CTL_TILED_YF:
+		fb->modifier[0] = I915_FORMAT_MOD_Yf_TILED;
+		if (fb->bits_per_pixel == 8)
+			stride_mult = 64;
+		else
+			stride_mult = 128;
+		break;
+	default:
+		MISSING_CASE(tiling);
+		goto error;
+	}
+
 	base = I915_READ(PLANE_SURF(pipe, 0)) & 0xf000;
 	plane_config->base = base;
 
@@ -7759,17 +7781,6 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
 	fb->width = ((val >> 0) & 0x1fff) + 1;
 
 	val = I915_READ(PLANE_STRIDE(pipe, 0));
-	switch (plane_config->tiling) {
-	case I915_TILING_NONE:
-		stride_mult = 64;
-		break;
-	case I915_TILING_X:
-		stride_mult = 512;
-		break;
-	default:
-		MISSING_CASE(plane_config->tiling);
-		goto error;
-	}
 	fb->pitches[0] = (val & 0x3ff) * stride_mult;
 
 	aligned_height = intel_fb_align_height(dev, fb->height,
-- 
2.3.0
[Intel-gfx] [PATCH 1/7] drm/i915/skl: Add new displayable tiling formats
From: Tvrtko Ursulin

Starting with SKL, the display engine can scan out Y and the newly introduced
Yf tiling formats, so add the latter to the frame buffer modifier space.

v2: Definitions moved to drm_fourcc.h.
v3: Try to document the format better.

Signed-off-by: Tvrtko Ursulin
Reviewed-by: Damien Lespiau
---
 include/uapi/drm/drm_fourcc.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index 1a5a357..e6efac2 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -192,4 +192,19 @@
  */
 #define I915_FORMAT_MOD_Y_TILED	fourcc_mod_code(INTEL, 2)
 
+/*
+ * Intel Yf-tiling layout
+ *
+ * This is a tiled layout using 4Kb tiles in row-major layout.
+ * Within the tile pixels are laid out in 16 256 byte units / sub-tiles which
+ * are arranged in four groups (two wide, two high) with column-major layout.
+ * Each group therefore consists of four 256 byte units, which are also laid
+ * out as 2x2 column-major.
+ * 256 byte units are made out of four 64 byte blocks of pixels, producing
+ * either a square block or a 2:1 unit.
+ * 64 byte blocks of pixels contain four pixel rows of 16 bytes, where the width
+ * in pixel depends on the pixel depth.
+ */
+#define I915_FORMAT_MOD_Yf_TILED fourcc_mod_code(INTEL, 3)
+
 #endif /* DRM_FOURCC_H */
-- 
2.3.0
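For readers unfamiliar with how these modifier tokens are built: fourcc_mod_code() packs an 8-bit vendor id into the top byte of a 64-bit value and a vendor-defined code into the low 56 bits. A small function-based sketch of that packing follows; the sketch_ names are hypothetical, and 0x01 is the Intel vendor id used by this series.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_VENDOR_INTEL 0x01u /* Intel vendor id, as used by the series */

/* Pack a vendor id (top 8 bits) and a vendor-specific code (low 56 bits). */
static uint64_t sketch_mod_code(uint8_t vendor, uint64_t val)
{
	/* Codes wider than 56 bits would collide with the vendor byte. */
	assert((val >> 56) == 0);
	return ((uint64_t)vendor << 56) | (val & 0x00ffffffffffffffULL);
}

/* Recover the vendor id from a packed modifier. */
static uint8_t sketch_mod_vendor(uint64_t modifier)
{
	return (uint8_t)(modifier >> 56);
}
```

Under this packing the Y and Yf modifiers differ only in the low bits (2 versus 3) and share the Intel vendor byte, so they cannot collide with another vendor's codes.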
[Intel-gfx] [PATCH 4/7] drm/i915/skl: Teach pin_and_fence_fb_obj() about Y tiling constraints
From: Damien Lespiau

1Mb!

v2: Rebased for addfb2 interface. (Tvrtko Ursulin)
v3: Rebased for fb modifier changes. (Tvrtko Ursulin)

Signed-off-by: Damien Lespiau
Signed-off-by: Tvrtko Ursulin
Reviewed-by: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/intel_display.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 4f0033a..358a97e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2262,8 +2262,12 @@ intel_pin_and_fence_fb_obj(struct drm_plane *plane,
 		}
 		break;
 	case I915_FORMAT_MOD_Y_TILED:
-		WARN(1, "Y tiled bo slipped through, driver bug!\n");
-		return -EINVAL;
+	case I915_FORMAT_MOD_Yf_TILED:
+		if (WARN_ONCE(INTEL_INFO(dev)->gen < 9,
+			      "Y tiling bo slipped through, driver bug!\n"))
+			return -EINVAL;
+		alignment = 1 * 1024 * 1024;
+		break;
 	default:
 		MISSING_CASE(fb->modifier[0]);
 		return -EINVAL;
-- 
2.3.0
[Intel-gfx] [PATCH 3/7] drm/i915/skl: Adjust intel_fb_align_height() for Yb/Yf tiling
From: Damien Lespiau

We now need the bpp of the fb as Yf tiling has different tile widths
depending on it.

v2: Rebased for the new addfb2 interface. (Tvrtko Ursulin)
v3: Rebased for fb modifier changes. (Tvrtko Ursulin)

Signed-off-by: Damien Lespiau
Signed-off-by: Tvrtko Ursulin
Reviewed-by: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/intel_display.c | 31 +--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index a523d84..4f0033a 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2195,9 +2195,36 @@ intel_fb_align_height(struct drm_device *dev, int height,
 		      uint64_t fb_format_modifier)
 {
 	int tile_height;
+	uint32_t bits_per_pixel;
 
-	tile_height = fb_format_modifier == I915_FORMAT_MOD_X_TILED ?
-		(IS_GEN2(dev) ? 16 : 8) : 1;
+	switch (fb_format_modifier) {
+	case I915_FORMAT_MOD_X_TILED:
+		tile_height = IS_GEN2(dev) ? 16 : 8;
+		break;
+	case I915_FORMAT_MOD_Y_TILED:
+		tile_height = 32;
+		break;
+	case I915_FORMAT_MOD_Yf_TILED:
+		bits_per_pixel = drm_format_plane_cpp(pixel_format, 0) * 8;
+		switch (bits_per_pixel) {
+		default:
+		case 8:
+			tile_height = 64;
+			break;
+		case 16:
+		case 32:
+			tile_height = 32;
+			break;
+		case 64:
+		case 128:
+			tile_height = 16;
+			break;
+		}
+		break;
+	default:
+		tile_height = 1;
+		break;
+	}
 
 	return ALIGN(height, tile_height);
 }
-- 
2.3.0
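To make the effect of the new tile heights concrete, here is a small standalone sketch of the same rounding, assuming a non-gen2 device (X tiles 8 rows high). The enum and the sketch_ function names are hypothetical; the tile heights are the ones in the hunk above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fb modifiers handled by the patch. */
enum tile_layout_sketch {
	TILE_SKETCH_LINEAR,
	TILE_SKETCH_X,
	TILE_SKETCH_Y,
	TILE_SKETCH_YF,
};

/* Tile height in rows; Yf depends on bits per pixel. */
static uint32_t sketch_tile_height(enum tile_layout_sketch layout, uint32_t bpp)
{
	uint32_t rows = 1; /* linear buffers need no height alignment */

	switch (layout) {
	case TILE_SKETCH_X:
		rows = 8; /* assumption: not gen2, where the patch uses 16 */
		break;
	case TILE_SKETCH_Y:
		rows = 32;
		break;
	case TILE_SKETCH_YF:
		if (bpp == 16 || bpp == 32)
			rows = 32;
		else if (bpp == 64 || bpp == 128)
			rows = 16;
		else
			rows = 64; /* 8 bpp and the patch's default case */
		break;
	case TILE_SKETCH_LINEAR:
	default:
		break;
	}

	return rows;
}

/* Round the height up to a whole number of tile rows. */
static uint32_t sketch_align_height(uint32_t height,
				    enum tile_layout_sketch layout,
				    uint32_t bpp)
{
	uint32_t rows = sketch_tile_height(layout, bpp);

	assert(rows != 0);
	return (height + rows - 1) / rows * rows;
}
```

A 1080-row framebuffer stays at 1080 rows when linear or X tiled, but is padded to 1088 rows for Y tiling and for Yf at 8, 32 or 64 bpp.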
[Intel-gfx] [PATCH 0/7] Skylake Y tiled scanout
From: Tvrtko Ursulin

Starting with Skylake the display engine can scan out Y tiled objects. (Both
legacy Y tiled, and the new Yf format.)

This series takes the original work by Damien Lespiau and converts it to use the
new frame buffer modifiers instead of object set/get tiling. Some patches needed
to be dropped, some added and some refactored.

Lightly tested with "testdisplay -m -y" and "testdisplay -m --yf".

v2: Rebased on v2 of "i915 fb modifier support, respun".

v3:
 * Part which allows Y tiled fb creation extracted out and moved to the end
   of series.
 * Dropped redundant "drm/i915/skl: Allow Y tiling for sprites".
 * Also see individual change logs.

Damien Lespiau (4):
  drm/i915/skl: Allow scanning out Y and Yf fbs
  drm/i915/skl: Adjust intel_fb_align_height() for Yb/Yf tiling
  drm/i915/skl: Teach pin_and_fence_fb_obj() about Y tiling constraints
  drm/i915/skl: Adjust get_plane_config() to support Yb/Yf tiling

Tvrtko Ursulin (3):
  drm/i915/skl: Add new displayable tiling formats
  drm/i915/skl: Update watermarks for Y tiling
  drm/i915/skl: Allow Y (and Yf) frame buffer creation

 drivers/gpu/drm/i915/intel_display.c | 218 +--
 drivers/gpu/drm/i915/intel_drv.h     |   3 +
 drivers/gpu/drm/i915/intel_pm.c      |  33 +-
 drivers/gpu/drm/i915/intel_sprite.c  |  24 +++-
 include/uapi/drm/drm_fourcc.h        |  15 +++
 5 files changed, 222 insertions(+), 71 deletions(-)

-- 
2.3.0
[Intel-gfx] [PATCH i-g-t 05/12] lib: Don't give a struct igt_buf * to fast_copy_pitch()
From: Damien Lespiau

So we can use this function in a "raw" (ie without igt_buf) version.

Signed-off-by: Damien Lespiau
---
 lib/intel_batchbuffer.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
index 51552b0..cbe3520 100644
--- a/lib/intel_batchbuffer.c
+++ b/lib/intel_batchbuffer.c
@@ -486,12 +486,12 @@ void igt_buf_write_to_png(struct igt_buf *buf, const char *filename)
  * pitches are in bytes if the surfaces are linear, number of dwords
  * otherwise
  */
-static uint32_t fast_copy_pitch(struct igt_buf *buf)
+static uint32_t fast_copy_pitch(unsigned int stride, enum i915_tiling tiling)
 {
-	if (buf->tiling != I915_TILING_NONE)
-		return buf->stride / 4;
+	if (tiling != I915_TILING_NONE)
+		return stride / 4;
 	else
-		return buf->stride;
+		return stride;
 }
 
 /**
@@ -519,8 +519,8 @@ void igt_blitter_fast_copy(struct intel_batchbuffer *batch,
 	uint32_t src_pitch, dst_pitch;
 	uint32_t dword0 = 0, dword1 = 0;
 
-	src_pitch = fast_copy_pitch(src);
-	dst_pitch = fast_copy_pitch(dst);
+	src_pitch = fast_copy_pitch(src->stride, src->tiling);
+	dst_pitch = fast_copy_pitch(dst->stride, src->tiling);
 
 #define CHECK_RANGE(x)	((x) >= 0 && (x) < (1 << 15))
 	assert(CHECK_RANGE(src_x) && CHECK_RANGE(src_y) &&
-- 
2.3.0
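The rule stated in the code comment is that a linear surface's pitch is given in bytes and a tiled surface's pitch in dwords. The sketch below illustrates that rule in isolation. It assumes each surface's pitch follows its own tiling mode, which differs from the second call in the hunk (it passes src->tiling for the destination); whether that call is intentional is not stated in the patch. All sketch_ names and the struct are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tiling enum; only "none" versus "anything else" matters here. */
enum sketch_tiling {
	SKETCH_TILING_NONE,
	SKETCH_TILING_X,
	SKETCH_TILING_Y,
	SKETCH_TILING_YF,
};

/* Hypothetical surface description used only by this sketch. */
struct sketch_surface {
	uint32_t stride; /* in bytes */
	enum sketch_tiling tiling;
};

/* Bytes for linear surfaces, dwords (stride / 4) for tiled ones. */
static uint32_t sketch_fast_copy_pitch(uint32_t stride, enum sketch_tiling tiling)
{
	if (tiling != SKETCH_TILING_NONE) {
		/* A tiled stride is expected to be a whole number of dwords. */
		assert(stride % 4 == 0);
		return stride / 4;
	}
	return stride;
}

/* Assumption of this sketch: every surface is converted with its own tiling. */
static uint32_t sketch_surface_pitch(const struct sketch_surface *surf)
{
	return sketch_fast_copy_pitch(surf->stride, surf->tiling);
}
```

With a linear source and a Y tiled destination that both have a 4096-byte stride, this yields pitches of 4096 and 1024 respectively.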
[Intel-gfx] [PATCH i-g-t 06/12] lib: Split two helpers to build fast copy's dword0 and dword1
From: Damien Lespiau Again, these helpers will be useful for a raw version of the gen9 fast copy. Signed-off-by: Damien Lespiau --- lib/intel_batchbuffer.c | 96 + 1 file changed, 57 insertions(+), 39 deletions(-) diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c index cbe3520..8ef5ada 100644 --- a/lib/intel_batchbuffer.c +++ b/lib/intel_batchbuffer.c @@ -494,46 +494,14 @@ static uint32_t fast_copy_pitch(unsigned int stride, enum i915_tiling tiling) return stride; } -/** - * igt_blitter_fast_copy: - * @batch: batchbuffer object - * @context: libdrm hardware context to use - * @src: source i-g-t buffer object - * @src_x: source pixel x-coordination - * @src_y: source pixel y-coordination - * @width: width of the copied rectangle - * @height: height of the copied rectangle - * @dst: destination i-g-t buffer object - * @dst_x: destination pixel x-coordination - * @dst_y: destination pixel y-coordination - * - * Copy @src into @dst using the gen9 fast copy blitter comamnd. - * - * The source and destination surfaces cannot overlap. 
- */ -void igt_blitter_fast_copy(struct intel_batchbuffer *batch, - struct igt_buf *src, unsigned src_x, unsigned src_y, - unsigned width, unsigned height, - struct igt_buf *dst, unsigned dst_x, unsigned dst_y) +static uint32_t fast_copy_dword0(unsigned int src_tiling, +unsigned int dst_tiling) { - uint32_t src_pitch, dst_pitch; - uint32_t dword0 = 0, dword1 = 0; - - src_pitch = fast_copy_pitch(src->stride, src->tiling); - dst_pitch = fast_copy_pitch(dst->stride, src->tiling); - -#define CHECK_RANGE(x) ((x) >= 0 && (x) < (1 << 15)) - assert(CHECK_RANGE(src_x) && CHECK_RANGE(src_y) && - CHECK_RANGE(dst_x) && CHECK_RANGE(dst_y) && - CHECK_RANGE(width) && CHECK_RANGE(height) && - CHECK_RANGE(src_x + width) && CHECK_RANGE(src_y + height) && - CHECK_RANGE(dst_x + width) && CHECK_RANGE(dst_y + height) && - CHECK_RANGE(src_pitch) && CHECK_RANGE(dst_pitch)); -#undef CHECK_RANGE + uint32_t dword0 = 0; dword0 |= XY_FAST_COPY_BLT; - switch (src->tiling) { + switch (src_tiling) { case I915_TILING_X: dword0 |= XY_FAST_COPY_SRC_TILING_X; break; @@ -549,7 +517,7 @@ void igt_blitter_fast_copy(struct intel_batchbuffer *batch, break; } - switch (dst->tiling) { + switch (dst_tiling) { case I915_TILING_X: dword0 |= XY_FAST_COPY_DST_TILING_X; break; @@ -565,13 +533,63 @@ void igt_blitter_fast_copy(struct intel_batchbuffer *batch, break; } - if (src->tiling == I915_TILING_Yf) + return dword0; +} + +static uint32_t fast_copy_dword1(unsigned int src_tiling, +unsigned int dst_tiling) +{ + uint32_t dword1 = 0; + + if (src_tiling == I915_TILING_Yf) dword1 |= XY_FAST_COPY_SRC_TILING_Yf; - if (dst->tiling == I915_TILING_Yf) + if (dst_tiling == I915_TILING_Yf) dword1 |= XY_FAST_COPY_DST_TILING_Yf; dword1 |= XY_FAST_COPY_COLOR_DEPTH_32; + return dword1; +} + +/** + * igt_blitter_fast_copy: + * @batch: batchbuffer object + * @context: libdrm hardware context to use + * @src: source i-g-t buffer object + * @src_x: source pixel x-coordination + * @src_y: source pixel y-coordination + * @width: 
width of the copied rectangle + * @height: height of the copied rectangle + * @dst: destination i-g-t buffer object + * @dst_x: destination pixel x-coordination + * @dst_y: destination pixel y-coordination + * + * Copy @src into @dst using the gen9 fast copy blitter comamnd. + * + * The source and destination surfaces cannot overlap. + */ +void igt_blitter_fast_copy(struct intel_batchbuffer *batch, + struct igt_buf *src, unsigned src_x, unsigned src_y, + unsigned width, unsigned height, + struct igt_buf *dst, unsigned dst_x, unsigned dst_y) +{ + uint32_t src_pitch, dst_pitch; + uint32_t dword0, dword1; + + src_pitch = fast_copy_pitch(src->stride, src->tiling); + dst_pitch = fast_copy_pitch(dst->stride, src->tiling); + dword0 = fast_copy_dword0(src->tiling, dst->tiling); + dword1 = fast_copy_dword1(src->tiling, dst->tiling); + +#define CHECK_RANGE(x) ((x) >= 0 && (x) < (1 << 15)) + assert(CHECK_RANGE(src_x) && CHECK_RANGE(src_y) && + CHECK_RANGE(dst_x) && CHECK_RANGE(dst_y) && + CHECK_RANGE(width) && CHECK_RANGE(height) && + CHECK_RANGE(src_x + width) && CHECK_RANGE(src_y + height) && + CHECK_RANGE(dst_x + width) && CHECK_RANGE(dst_y + height) && + CHECK_RANGE(src_pitch) && CHE
[Intel-gfx] [PATCH i-g-t 01/12] tests/kms_addfb: Add support for fb modifiers
From: Tvrtko Ursulin Just a few basic tests to make sure fb modifiers can be used and behave sanely when mixed with the old set_tiling API. v2: * Review feedback from Daniel Vetter: 1. Move cap detection into the subtest so skipping works. 2. Added some gtkdoc comments. 3. Two more test cases. 4. Removed unused parts for now. v3: * Removed two tests which do not make sense any more after the fb modifier rewrite. v4: * Moved gtkdoc comments into .c file. * Moved all initialization into fixtures. * Rebased for fb modifier changes. Signed-off-by: Tvrtko Ursulin --- lib/ioctl_wrappers.c | 23 +++ lib/ioctl_wrappers.h | 30 tests/kms_addfb.c| 65 3 files changed, 118 insertions(+) diff --git a/lib/ioctl_wrappers.c b/lib/ioctl_wrappers.c index cd6884a..0ab25c4 100644 --- a/lib/ioctl_wrappers.c +++ b/lib/ioctl_wrappers.c @@ -1142,3 +1142,26 @@ off_t prime_get_size(int dma_buf_fd) return ret; } + +/** + * igt_require_fb_modifiers: + * @fd: Open DRM file descriptor. + * + * Requires presence of DRM_CAP_ADDFB2_MODIFIERS. 
+ */ +void igt_require_fb_modifiers(int fd) +{ + static bool has_modifiers, cap_modifiers_tested; + + if (!cap_modifiers_tested) { + uint64_t cap_modifiers; + int ret; + + ret = drmGetCap(fd, LOCAL_DRM_CAP_ADDFB2_MODIFIERS, &cap_modifiers); + igt_assert(ret == 0 || errno == EINVAL); + has_modifiers = ret == 0 && cap_modifiers == 1; + cap_modifiers_tested = true; + } + + igt_require(has_modifiers); +} diff --git a/lib/ioctl_wrappers.h b/lib/ioctl_wrappers.h index 7c0c87e..3c85e8b 100644 --- a/lib/ioctl_wrappers.h +++ b/lib/ioctl_wrappers.h @@ -135,4 +135,34 @@ int prime_handle_to_fd(int fd, uint32_t handle); uint32_t prime_fd_to_handle(int fd, int dma_buf_fd); off_t prime_get_size(int dma_buf_fd); +/* addfb2 fb modifiers */ +struct local_drm_mode_fb_cmd2 { + uint32_t fb_id; + uint32_t width, height; + uint32_t pixel_format; + uint32_t flags; + uint32_t handles[4]; + uint32_t pitches[4]; + uint32_t offsets[4]; + uint64_t modifier[4]; +}; + +#define LOCAL_DRM_MODE_FB_MODIFIERS(1<<1) + +#define LOCAL_DRM_FORMAT_MOD_VENDOR_INTEL 0x01 + +#define local_fourcc_mod_code(vendor, val) \ + uint64_t)LOCAL_DRM_FORMAT_MOD_VENDOR_## vendor) << 56) | \ + (val & 0x00ffL)) + +#define LOCAL_DRM_FORMAT_MOD_NONE (0) +#define LOCAL_I915_FORMAT_MOD_X_TILED local_fourcc_mod_code(INTEL, 1) + +#define LOCAL_DRM_IOCTL_MODE_ADDFB2DRM_IOWR(0xB8, \ +struct local_drm_mode_fb_cmd2) + +#define LOCAL_DRM_CAP_ADDFB2_MODIFIERS 0x10 + +void igt_require_fb_modifiers(int fd); + #endif /* IOCTL_WRAPPERS_H */ diff --git a/tests/kms_addfb.c b/tests/kms_addfb.c index 756589e..0a82619 100644 --- a/tests/kms_addfb.c +++ b/tests/kms_addfb.c @@ -213,6 +213,69 @@ static void size_tests(int fd) } } +static void addfb25_tests(int fd) +{ + struct local_drm_mode_fb_cmd2 f = {}; + + igt_fixture { + gem_bo = gem_create(fd, 1024*1024*4); + igt_assert(gem_bo); + + memset(&f, 0, sizeof(f)); + + f.width = 1024; + f.height = 1024; + f.pixel_format = DRM_FORMAT_XRGB; + f.pitches[0] = 1024*4; + f.modifier[0] = 
LOCAL_DRM_FORMAT_MOD_NONE; + + f.handles[0] = gem_bo; + } + + igt_subtest("addfb25-modifier-no-flag") { + igt_require_fb_modifiers(fd); + + f.modifier[0] = LOCAL_I915_FORMAT_MOD_X_TILED; + igt_assert(drmIoctl(fd, LOCAL_DRM_IOCTL_MODE_ADDFB2, &f) < 0 && errno == EINVAL); + } + + igt_fixture { + gem_set_tiling(fd, gem_bo, I915_TILING_X, 1024*4); + f.flags = LOCAL_DRM_MODE_FB_MODIFIERS; + } + + igt_subtest("addfb25-X-tiled-mismatch") { + igt_require_fb_modifiers(fd); + + f.modifier[0] = LOCAL_DRM_FORMAT_MOD_NONE; + igt_assert(drmIoctl(fd, LOCAL_DRM_IOCTL_MODE_ADDFB2, &f) < 0 && errno == EINVAL); + } + + igt_subtest("addfb25-X-tiled") { + igt_require_fb_modifiers(fd); + + f.modifier[0] = LOCAL_I915_FORMAT_MOD_X_TILED; + igt_assert(drmIoctl(fd, LOCAL_DRM_IOCTL_MODE_ADDFB2, &f) == 0); + igt_assert(drmIoctl(fd, DRM_IOCTL_MODE_RMFB, &f.fb_id) == 0); + f.fb_id = 0; + } + + igt_subtest("addfb25-framebuffer-vs-set-tiling") { + igt_require_fb_modifiers(fd); + + f.modifier[0] = LOCAL_I915_FORMAT_MOD_X_TILED; + igt_assert(drmIoctl(fd, LOCAL_DRM_IOCTL_MODE_ADDFB2, &f) == 0); + igt_assert(__gem_set_
[Intel-gfx] [PATCH i-g-t 03/12] tests/kms_addfb: Y tiled testcases
From: Tvrtko Ursulin v2: Moved all init into fixtures. Signed-off-by: Tvrtko Ursulin --- lib/ioctl_wrappers.h | 2 ++ tests/kms_addfb.c| 70 +++- 2 files changed, 71 insertions(+), 1 deletion(-) diff --git a/lib/ioctl_wrappers.h b/lib/ioctl_wrappers.h index 3c85e8b..99fc7fd 100644 --- a/lib/ioctl_wrappers.h +++ b/lib/ioctl_wrappers.h @@ -157,6 +157,8 @@ struct local_drm_mode_fb_cmd2 { #define LOCAL_DRM_FORMAT_MOD_NONE (0) #define LOCAL_I915_FORMAT_MOD_X_TILED local_fourcc_mod_code(INTEL, 1) +#define LOCAL_I915_FORMAT_MOD_Y_TILED local_fourcc_mod_code(INTEL, 2) +#define LOCAL_I915_FORMAT_MOD_Yf_TILED local_fourcc_mod_code(INTEL, 3) #define LOCAL_DRM_IOCTL_MODE_ADDFB2DRM_IOWR(0xB8, \ struct local_drm_mode_fb_cmd2) diff --git a/tests/kms_addfb.c b/tests/kms_addfb.c index 0a82619..d474e95 100644 --- a/tests/kms_addfb.c +++ b/tests/kms_addfb.c @@ -38,6 +38,7 @@ #include "ioctl_wrappers.h" #include "drmtest.h" #include "drm_fourcc.h" +#include "intel_chipset.h" uint32_t gem_bo; uint32_t gem_bo_small; @@ -276,12 +277,77 @@ static void addfb25_tests(int fd) } } +static void addfb25_ytile(int fd, int gen) +{ + struct local_drm_mode_fb_cmd2 f = {}; + int shouldret; + + igt_fixture { + gem_bo = gem_create(fd, 1024*1024*4); + igt_assert(gem_bo); + gem_bo_small = gem_create(fd, 1024*1023*4); + igt_assert(gem_bo_small); + + shouldret = gen >= 9 ? 
0 : -1; + + memset(&f, 0, sizeof(f)); + + f.width = 1024; + f.height = 1024; + f.pixel_format = DRM_FORMAT_XRGB; + f.pitches[0] = 1024*4; + f.flags = LOCAL_DRM_MODE_FB_MODIFIERS; + f.modifier[0] = LOCAL_DRM_FORMAT_MOD_NONE; + + f.handles[0] = gem_bo; + } + + igt_subtest("addfb25-Y-tiled") { + igt_require_fb_modifiers(fd); + + f.modifier[0] = LOCAL_I915_FORMAT_MOD_Y_TILED; + igt_assert(drmIoctl(fd, LOCAL_DRM_IOCTL_MODE_ADDFB2, &f) == shouldret); + if (!shouldret) + igt_assert(drmIoctl(fd, DRM_IOCTL_MODE_RMFB, &f.fb_id) == 0); + f.fb_id = 0; + } + + igt_subtest("addfb25-Yf-tiled") { + igt_require_fb_modifiers(fd); + + f.modifier[0] = LOCAL_I915_FORMAT_MOD_Yf_TILED; + igt_assert(drmIoctl(fd, LOCAL_DRM_IOCTL_MODE_ADDFB2, &f) == shouldret); + if (!shouldret) + igt_assert(drmIoctl(fd, DRM_IOCTL_MODE_RMFB, &f.fb_id) == 0); + f.fb_id = 0; + } + + igt_subtest("addfb25-Y-tiled-small") { + igt_require_fb_modifiers(fd); + igt_require(gen >= 9); + + f.modifier[0] = LOCAL_I915_FORMAT_MOD_Y_TILED; + f.height = 1023; + f.handles[0] = gem_bo_small; + igt_assert(drmIoctl(fd, LOCAL_DRM_IOCTL_MODE_ADDFB2, &f) < 0 && errno == EINVAL); + f.fb_id = 0; + } + + igt_fixture { + gem_close(fd, gem_bo); + gem_close(fd, gem_bo_small); + } +} + int fd; +int gen; igt_main { - igt_fixture + igt_fixture { fd = drm_open_any_master(); + gen = intel_gen(intel_get_drm_devid(fd)); + } pitch_tests(fd); @@ -289,6 +355,8 @@ igt_main addfb25_tests(fd); + addfb25_ytile(fd, gen); + igt_fixture close(fd); } -- 2.3.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH i-g-t 08/12] tiling: Convert framebuffer helpers to use fb modifiers
From: Tvrtko Ursulin This converts the IGT API only, underneath legacy set_tiling is still used. Signed-off-by: Tvrtko Ursulin --- lib/igt_fb.c| 20 ++-- lib/igt_fb.h| 10 +- lib/igt_kms.h | 1 + tests/kms_3d.c | 2 +- tests/kms_cursor_crc.c | 8 +--- tests/kms_fbc_crc.c | 4 ++-- tests/kms_fence_pin_leak.c | 4 ++-- tests/kms_flip.c| 6 +++--- tests/kms_flip_event_leak.c | 4 ++-- tests/kms_flip_tiling.c | 7 --- tests/kms_mmio_vs_cs_flip.c | 12 ++-- tests/kms_pipe_crc_basic.c | 2 +- tests/kms_plane.c | 8 tests/kms_psr_sink_crc.c| 8 +--- tests/kms_pwrite_crc.c | 4 ++-- tests/kms_render.c | 8 tests/kms_rotation_crc.c| 4 ++-- tests/kms_setmode.c | 2 +- tests/kms_sink_crc_basic.c | 6 -- tests/kms_universal_plane.c | 18 +- tests/pm_lpsp.c | 2 +- tests/pm_rpm.c | 26 ++ tests/testdisplay.c | 4 ++-- 23 files changed, 90 insertions(+), 80 deletions(-) diff --git a/lib/igt_fb.c b/lib/igt_fb.c index 9b41301..853b2f9 100644 --- a/lib/igt_fb.c +++ b/lib/igt_fb.c @@ -75,7 +75,7 @@ static struct format_desc_struct { /* helpers to create nice-looking framebuffers */ static int create_bo_for_fb(int fd, int width, int height, int bpp, - unsigned int tiling, unsigned bo_size, + uint64_t tiling, unsigned bo_size, uint32_t *gem_handle_ret, unsigned *size_ret, unsigned *stride_ret) @@ -84,7 +84,7 @@ static int create_bo_for_fb(int fd, int width, int height, int bpp, int size, ret = 0; unsigned stride; - if (tiling) { + if (tiling != LOCAL_DRM_FORMAT_MOD_NONE) { int v; /* Round the tiling up to the next power-of-two and the @@ -112,8 +112,8 @@ static int create_bo_for_fb(int fd, int width, int height, int bpp, bo_size = size; gem_handle = gem_create(fd, bo_size); - if (tiling) - ret = __gem_set_tiling(fd, gem_handle, tiling, stride); + if (tiling != LOCAL_DRM_FORMAT_MOD_NONE) + ret = __gem_set_tiling(fd, gem_handle, I915_TILING_X, stride); *stride_ret = stride; *size_ret = size; @@ -385,7 +385,7 @@ void igt_paint_image(cairo_t *cr, const char *filename, * @width: width of the framebuffer in 
pixel * @height: height of the framebuffer in pixel * @format: drm fourcc pixel format code - * @tiling: tiling layout of the framebuffer + * @tiling: tiling layout of the framebuffer (as framebuffer modifier) * @fb: pointer to an #igt_fb structure * @bo_size: size of the backing bo (0 for minimum needed size) * @@ -401,7 +401,7 @@ void igt_paint_image(cairo_t *cr, const char *filename, */ unsigned int igt_create_fb_with_bo_size(int fd, int width, int height, - uint32_t format, unsigned int tiling, + uint32_t format, uint64_t tiling, struct igt_fb *fb, unsigned bo_size) { uint32_t handles[4]; @@ -417,7 +417,7 @@ igt_create_fb_with_bo_size(int fd, int width, int height, bpp = igt_drm_format_to_bpp(format); - igt_debug("%s(width=%d, height=%d, format=0x%x [bpp=%d], tiling=%d, size=%d\n", + igt_debug("%s(width=%d, height=%d, format=0x%x [bpp=%d], tiling=%llx, size=%d\n", __func__, width, height, format, bpp, tiling, bo_size); do_or_die(create_bo_for_fb(fd, width, height, bpp, tiling, bo_size, &fb->gem_handle, &fb->size, &fb->stride)); @@ -460,7 +460,7 @@ igt_create_fb_with_bo_size(int fd, int width, int height, * The kms id of the created framebuffer. */ unsigned int igt_create_fb(int fd, int width, int height, uint32_t format, - unsigned int tiling, struct igt_fb *fb) + uint64_t tiling, struct igt_fb *fb) { return igt_create_fb_with_bo_size(fd, width, height, format, tiling, fb, 0); } @@ -489,7 +489,7 @@ unsigned int igt_create_fb(int fd, int width, int height, uint32_t format, * failure. */ unsigned int igt_create_color_fb(int fd, int width, int height, -uint32_t format, unsigned int tiling, +uint32_t format, uint64_t tiling, double r, double g, double b, struct igt_fb *fb /* out */) { @@ -583,7 +583,7 @@ static void stereo_fb_layout_from_mode(struct stereo_fb_layout *layout, * failure. */ unsigned int igt_create_stereo_fb(int drm_fd, drmModeModeInfo *mode, - uint32_t format, unsigned int tiling) + uint
[Intel-gfx] [PATCH 7/7] drm/i915/skl: Allow Y (and Yf) frame buffer creation
From: Tvrtko Ursulin

With all underlying bits implemented by the preceding patches, this patch
actually enables the feature.

Signed-off-by: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/intel_display.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 74d4923..f100086 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12781,6 +12781,14 @@ static int intel_framebuffer_init(struct drm_device *dev,
 			DRM_DEBUG("tiling_mode doesn't match fb modifier\n");
 			return -EINVAL;
 		}
+
+		if (INTEL_INFO(dev)->gen < 9 &&
+		    (mode_cmd->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
+		     mode_cmd->modifier[0] == I915_FORMAT_MOD_Yf_TILED)) {
+			DRM_DEBUG("Unsupported tiling 0x%llx!\n",
+				  mode_cmd->modifier[0]);
+			return -EINVAL;
+		}
 	} else {
 		if (obj->tiling_mode == I915_TILING_X)
 			mode_cmd->modifier[0] = I915_FORMAT_MOD_X_TILED;
@@ -12790,11 +12798,6 @@ static int intel_framebuffer_init(struct drm_device *dev,
 		}
 	}
 
-	if (mode_cmd->modifier[0] == I915_FORMAT_MOD_Y_TILED) {
-		DRM_DEBUG("hardware does not support tiling Y\n");
-		return -EINVAL;
-	}
-
 	stride_alignment = intel_fb_stride_alignment(dev, mode_cmd->modifier[0],
 						     mode_cmd->pixel_format);
 	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
-- 
2.3.0
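The two checks visible in this diff, the generation gate for Y/Yf and the power-of-two stride alignment test, can be sketched as one standalone predicate. This is an illustration under assumptions, not the driver's intel_framebuffer_init(): the enum, the function name and its parameters are invented here, and the alignment is passed in rather than derived from the modifier.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fb modifiers checked in the patch. */
enum sketch_fb_tiling {
	SKETCH_FB_LINEAR,
	SKETCH_FB_X_TILED,
	SKETCH_FB_Y_TILED,
	SKETCH_FB_YF_TILED,
};

/* Returns 0 when the parameters pass both checks, -EINVAL otherwise. */
static int sketch_check_fb(int gen, enum sketch_fb_tiling tiling,
			   uint32_t pitch, uint32_t stride_alignment)
{
	/* The mask test below only works for power-of-two alignments. */
	assert(stride_alignment != 0 &&
	       (stride_alignment & (stride_alignment - 1)) == 0);

	/* Y and Yf scanout is only accepted from gen9 onwards. */
	if (gen < 9 &&
	    (tiling == SKETCH_FB_Y_TILED || tiling == SKETCH_FB_YF_TILED))
		return -EINVAL;

	/* The pitch must be a multiple of the stride alignment. */
	if (pitch & (stride_alignment - 1))
		return -EINVAL;

	return 0;
}
```

The ordering mirrors the patch: the unsupported-tiling rejection happens before any stride validation, so a pre-gen9 device fails early with -EINVAL regardless of the pitch.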
[Intel-gfx] [PATCH i-g-t 00/12] Testing the Y tiled display
From: Tvrtko Ursulin

Starting with Skylake the display engine can scan out Y tiled objects. (Both
legacy Y tiled, and the new Yf format.)

This series takes the original work by Damien Lespiau and converts it to use the
new frame buffer modifiers instead of object set/get tiling. Some patches needed
to be dropped, some added and some refactored.

v2: Refactored for fb modifier changes.

Damien Lespiau (7):
  lib: Extract igt_buf_write_to_png() from gem_render_copy
  lib/skl: Add gen9 specific igt_blitter_fast_copy()
  lib: Don't give a struct igt_buf * to fast_copy_pitch()
  lib: Split two helpers to build fast copy's dword0 and dword1
  lib: Provide a raw version of the gen9 fast copy blits
  lib: Allow the creation of Ys/Yf tiled FBs
  testdisplay/skl: Add command line options for Yb/Yf tiled fbs

Tvrtko Ursulin (5):
  tests/kms_addfb: Add support for fb modifiers
  tests/kms_addfb: Y tiled testcases
  tiling: Convert framebuffer helpers to use fb modifiers
  lib: Add support for new extension to the ADDFB2 ioctl.
  lib/igt_fb: Use new ADDFB2 extension for new tiling modes

 lib/igt_fb.c                | 163 +
 lib/igt_fb.h                |  10 +-
 lib/igt_kms.h               |   1 +
 lib/intel_batchbuffer.c     | 281
 lib/intel_batchbuffer.h     |  37 ++
 lib/intel_reg.h             |  18 +++
 lib/ioctl_wrappers.c        |  49
 lib/ioctl_wrappers.h        |  41 +++
 tests/gem_render_copy.c     |  24 +---
 tests/kms_3d.c              |   2 +-
 tests/kms_addfb.c           | 135 -
 tests/kms_cursor_crc.c      |   8 +-
 tests/kms_fbc_crc.c         |   4 +-
 tests/kms_fence_pin_leak.c  |   4 +-
 tests/kms_flip.c            |   6 +-
 tests/kms_flip_event_leak.c |   4 +-
 tests/kms_flip_tiling.c     |   7 +-
 tests/kms_mmio_vs_cs_flip.c |  12 +-
 tests/kms_pipe_crc_basic.c  |   2 +-
 tests/kms_plane.c           |   8 +-
 tests/kms_psr_sink_crc.c    |   8 +-
 tests/kms_pwrite_crc.c      |   4 +-
 tests/kms_render.c          |   8 +-
 tests/kms_rotation_crc.c    |   4 +-
 tests/kms_setmode.c         |   2 +-
 tests/kms_sink_crc_basic.c  |   6 +-
 tests/kms_universal_plane.c |  18 +--
 tests/pm_lpsp.c             |   2 +-
 tests/pm_rpm.c              |  26 ++--
 tests/testdisplay.c         |  20 +++-
 30 files changed, 795 insertions(+), 119 deletions(-)

-- 
2.3.0
[Intel-gfx] [PATCH i-g-t 10/12] lib/igt_fb: Use new ADDFB2 extension for new tiling modes
From: Tvrtko Ursulin Signed-off-by: Tvrtko Ursulin --- lib/igt_fb.c | 36 +++- 1 file changed, 23 insertions(+), 13 deletions(-) diff --git a/lib/igt_fb.c b/lib/igt_fb.c index 853b2f9..c54907e 100644 --- a/lib/igt_fb.c +++ b/lib/igt_fb.c @@ -404,16 +404,10 @@ igt_create_fb_with_bo_size(int fd, int width, int height, uint32_t format, uint64_t tiling, struct igt_fb *fb, unsigned bo_size) { - uint32_t handles[4]; - uint32_t pitches[4]; - uint32_t offsets[4]; uint32_t fb_id; int bpp; memset(fb, 0, sizeof(*fb)); - memset(handles, 0, sizeof(handles)); - memset(pitches, 0, sizeof(pitches)); - memset(offsets, 0, sizeof(offsets)); bpp = igt_drm_format_to_bpp(format); @@ -422,14 +416,30 @@ igt_create_fb_with_bo_size(int fd, int width, int height, do_or_die(create_bo_for_fb(fd, width, height, bpp, tiling, bo_size, &fb->gem_handle, &fb->size, &fb->stride)); - handles[0] = fb->gem_handle; - pitches[0] = fb->stride; - igt_debug("%s(handle=%d, pitch=%d)\n", - __func__, handles[0], pitches[0]); - do_or_die(drmModeAddFB2(fd, width, height, format, - handles, pitches, offsets, - &fb_id, 0)); + __func__, fb->gem_handle, fb->stride); + + if (tiling != LOCAL_DRM_FORMAT_MOD_NONE && + tiling != LOCAL_I915_FORMAT_MOD_X_TILED) { + do_or_die(__kms_addfb(fd, fb->gem_handle, width, height, + fb->stride, format, tiling, + LOCAL_DRM_MODE_FB_MODIFIERS, &fb_id)); + } else { + uint32_t handles[4]; + uint32_t pitches[4]; + uint32_t offsets[4]; + + memset(handles, 0, sizeof(handles)); + memset(pitches, 0, sizeof(pitches)); + memset(offsets, 0, sizeof(offsets)); + + handles[0] = fb->gem_handle; + pitches[0] = fb->stride; + + do_or_die(drmModeAddFB2(fd, width, height, format, + handles, pitches, offsets, + &fb_id, 0)); + } fb->width = width; fb->height = height; -- 2.3.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
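The patch above routes framebuffer creation through one of two paths depending on the requested tiling. A minimal sketch of just that decision follows, assuming stand-in modifier values; the `EX_*` names and `fb_needs_modifier_path()` are hypothetical and only for illustration, the real definitions are the `LOCAL_*` ones used in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Stand-in modifier values, for illustration only. The real code uses the
 * LOCAL_DRM_FORMAT_MOD_NONE / LOCAL_I915_FORMAT_MOD_* definitions.
 */
#define EX_MOD_NONE     0ULL
#define EX_MOD_X_TILED  ((1ULL << 56) | 1)
#define EX_MOD_Y_TILED  ((1ULL << 56) | 2)
#define EX_MOD_Yf_TILED ((1ULL << 56) | 3)

/*
 * Linear and X-tiled framebuffers keep using the legacy AddFB2 call;
 * anything else has to go through the modifier-aware wrapper.
 */
static bool fb_needs_modifier_path(uint64_t tiling)
{
	return tiling != EX_MOD_NONE && tiling != EX_MOD_X_TILED;
}
```

With this split, existing linear and X-tiled users are untouched and only the new Y/Yf cases depend on kernel support for fb modifiers.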
[Intel-gfx] [PATCH i-g-t 11/12] lib: Allow the creation of Ys/Yf tiled FBs
From: Damien Lespiau There's no fencing for those tiling layouts, so we create a linear bo for cairo to play with, and when cairo is finished with it, we do a fast copy blit to the fb BO with its final tiling. v2: Move to correct domain after CPU is done with the object (-EINVAL). (Tvrtko Ursulin) Correct arguments passed in to framebuffer creation (segfault). (Tvrtko Ursulin) Pass zero stride to kernel as it expects for Yf&Ys. (Tvrtko Ursulin) v3: Rebase for gem_mmap__cpu changes. (Tvrtko Ursulin) v4: Rebase for addfb2.5. (Tvrtko Ursulin) Signed-off-by: Damien Lespiau Signed-off-by: Tvrtko Ursulin --- lib/igt_fb.c | 109 +-- 1 file changed, 106 insertions(+), 3 deletions(-) diff --git a/lib/igt_fb.c b/lib/igt_fb.c index c54907e..5c92fac 100644 --- a/lib/igt_fb.c +++ b/lib/igt_fb.c @@ -112,7 +112,7 @@ static int create_bo_for_fb(int fd, int width, int height, int bpp, bo_size = size; gem_handle = gem_create(fd, bo_size); - if (tiling != LOCAL_DRM_FORMAT_MOD_NONE) + if (tiling == LOCAL_I915_FORMAT_MOD_X_TILED) ret = __gem_set_tiling(fd, gem_handle, I915_TILING_X, stride); *stride_ret = stride; @@ -629,6 +629,104 @@ static cairo_format_t drm_format_to_cairo(uint32_t drm_format) drm_format, igt_format_str(drm_format)); } +struct fb_blit_upload { + int fd; + struct igt_fb *fb; + struct { + uint32_t handle; + unsigned size, stride; + uint8_t *map; + } linear; +}; + +static void destroy_cairo_surface__blit(void *arg) +{ + struct fb_blit_upload *blit = arg; + struct igt_fb *fb = blit->fb; + unsigned int obj_tiling = I915_TILING_NONE; + + munmap(blit->linear.map, blit->linear.size); + fb->cairo_surface = NULL; + + gem_set_domain(blit->fd, blit->linear.handle, + I915_GEM_DOMAIN_GTT, 0); + + switch (fb->tiling) { + case LOCAL_I915_FORMAT_MOD_X_TILED: + obj_tiling = I915_TILING_X; + break; + case LOCAL_I915_FORMAT_MOD_Y_TILED: + obj_tiling = I915_TILING_Y; + break; + case LOCAL_I915_FORMAT_MOD_Yf_TILED: + obj_tiling = I915_TILING_Yf; + break; + } + + 
igt_blitter_fast_copy__raw(blit->fd, + blit->linear.handle, + blit->linear.stride, + I915_TILING_NONE, + 0, 0, /* src_x, src_y */ + fb->width, fb->height, + fb->gem_handle, + fb->stride, + obj_tiling, + 0, 0 /* dst_x, dst_y */); + + gem_sync(blit->fd, blit->linear.handle); + gem_close(blit->fd, blit->linear.handle); + + free(blit); +} + +static void create_cairo_surface__blit(int fd, struct igt_fb *fb) +{ + struct fb_blit_upload *blit; + cairo_format_t cairo_format; + int bpp, ret; + + blit = malloc(sizeof(*blit)); + igt_assert(blit); + + /* +* We create a linear BO that we'll map for the CPU to write to (using +* cairo). This linear bo will be then blitted to its final +* destination, tiling it at the same time. +*/ + bpp = igt_drm_format_to_bpp(fb->drm_format); + ret = create_bo_for_fb(fd, fb->width, fb->height, bpp, + LOCAL_DRM_FORMAT_MOD_NONE, 0, + &blit->linear.handle, + &blit->linear.size, + &blit->linear.stride); + + igt_assert(ret == 0); + + blit->fd = fd; + blit->fb = fb; + blit->linear.map = gem_mmap__cpu(fd, +blit->linear.handle, +0, +blit->linear.size, +PROT_READ | PROT_WRITE); + igt_assert(blit->linear.map); + + gem_set_domain(fd, blit->linear.handle, + I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU); + + cairo_format = drm_format_to_cairo(fb->drm_format); + fb->cairo_surface = + cairo_image_surface_create_for_data(blit->linear.map, + cairo_format, + fb->width, fb->height, + blit->linear.stride); + + cairo_surface_set_user_data(fb->cairo_surface, + (cairo_user_data_key_t *)create_cairo_surface__blit, + blit, destroy_cairo_surface__blit); +} + static void destroy_cairo_surface__gtt(void *arg) {
[Intel-gfx] [PATCH i-g-t 09/12] lib: Add support for new extension to the ADDFB2 ioctl.
From: Tvrtko Ursulin

New functionality accessed via the __kms_addfb wrapper.

Signed-off-by: Tvrtko Ursulin
---
 lib/ioctl_wrappers.c | 26 ++
 lib/ioctl_wrappers.h |  9 +
 2 files changed, 35 insertions(+)

diff --git a/lib/ioctl_wrappers.c b/lib/ioctl_wrappers.c
index 0ab25c4..536431a 100644
--- a/lib/ioctl_wrappers.c
+++ b/lib/ioctl_wrappers.c
@@ -1165,3 +1165,29 @@ void igt_require_fb_modifiers(int fd)

 	igt_require(has_modifiers);
 }
+
+int __kms_addfb(int fd, uint32_t handle, uint32_t width, uint32_t height,
+		uint32_t stride, uint32_t pixel_format, uint64_t modifier,
+		uint32_t flags, uint32_t *buf_id)
+{
+	struct local_drm_mode_fb_cmd2 f;
+	int ret;
+
+	igt_require_fb_modifiers(fd);
+
+	memset(&f, 0, sizeof(f));
+
+	f.width = width;
+	f.height = height;
+	f.pixel_format = pixel_format;
+	f.flags = flags;
+	f.handles[0] = handle;
+	f.pitches[0] = stride;
+	f.modifier[0] = modifier;
+
+	ret = drmIoctl(fd, LOCAL_DRM_IOCTL_MODE_ADDFB2, &f);
+
+	*buf_id = f.fb_id;
+
+	return ret < 0 ? -errno : ret;
+}
diff --git a/lib/ioctl_wrappers.h b/lib/ioctl_wrappers.h
index 99fc7fd..ced7ef3 100644
--- a/lib/ioctl_wrappers.h
+++ b/lib/ioctl_wrappers.h
@@ -167,4 +167,13 @@ struct local_drm_mode_fb_cmd2 {

 void igt_require_fb_modifiers(int fd);

+/**
+ * __kms_addfb:
+ *
+ * Creates a framebuffer object.
+ */
+int __kms_addfb(int fd, uint32_t handle, uint32_t width, uint32_t height,
+		uint32_t stride, uint32_t pixel_format, uint64_t modifier,
+		uint32_t flags, uint32_t *buf_id);
+
 #endif /* IOCTL_WRAPPERS_H */
--
2.3.0
[Intel-gfx] [PATCH i-g-t 07/12] lib: Provide a raw version of the gen9 fast copy blits
From: Damien Lespiau So we can use it with bare kernel types, without going through libdrm bos. v2: Don't forget the object handle. (Tvrtko) Correct surface pitch calculation. (Tvrtko) Signed-off-by: Damien Lespiau Signed-off-by: Tvrtko Ursulin --- lib/intel_batchbuffer.c | 134 +++- lib/intel_batchbuffer.h | 18 +++ 2 files changed, 151 insertions(+), 1 deletion(-) diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c index 8ef5ada..5cf65cf 100644 --- a/lib/intel_batchbuffer.c +++ b/lib/intel_batchbuffer.c @@ -41,6 +41,8 @@ #include "intel_reg.h" #include "rendercopy.h" #include "media_fill.h" +#include "ioctl_wrappers.h" + #include /** @@ -486,7 +488,7 @@ void igt_buf_write_to_png(struct igt_buf *buf, const char *filename) * pitches are in bytes if the surfaces are linear, number of dwords * otherwise */ -static uint32_t fast_copy_pitch(unsigned int stride, enum i915_tiling tiling) +static uint32_t fast_copy_pitch(unsigned int stride, unsigned int tiling) { if (tiling != I915_TILING_NONE) return stride / 4; @@ -551,6 +553,136 @@ static uint32_t fast_copy_dword1(unsigned int src_tiling, return dword1; } +static void +fill_relocation(struct drm_i915_gem_relocation_entry *reloc, + uint32_t gem_handle, uint32_t offset, /* in dwords */ + uint32_t read_domains, uint32_t write_domains) +{ + reloc->target_handle = gem_handle; + reloc->delta = 0; + reloc->offset = offset * sizeof(uint32_t); + reloc->presumed_offset = 0; + reloc->read_domains = read_domains; + reloc->write_domain = write_domains; +} + +static void +fill_object(struct drm_i915_gem_exec_object2 *obj, uint32_t gem_handle, + struct drm_i915_gem_relocation_entry *relocs, uint32_t count) +{ + memset(obj, 0, sizeof(*obj)); + obj->handle = gem_handle; + obj->relocation_count = count; + obj->relocs_ptr = (uint64_t)relocs; +} + +static void exec_blit(int fd, + struct drm_i915_gem_exec_object2 *objs, uint32_t count, + uint32_t batch_len /* in dwords */) +{ + struct drm_i915_gem_execbuffer2 exec; + + 
exec.buffers_ptr = (uint64_t)objs; + exec.buffer_count = count; + exec.batch_start_offset = 0; + exec.batch_len = batch_len * 4; + exec.DR1 = exec.DR4 = 0; + exec.num_cliprects = 0; + exec.cliprects_ptr = 0; + exec.flags = I915_EXEC_BLT; + i915_execbuffer2_set_context_id(exec, 0); + exec.rsvd2 = 0; + + gem_execbuf(fd, &exec); +} + +/** + * igt_blitter_fast_copy__raw: + * @fd: file descriptor of the i915 driver + * @src_handle: GEM handle of the source buffer + * @src_stride: Stride (in bytes) of the source buffer + * @src_tiling: Tiling mode of the source buffer + * @src_x: X coordinate of the source region to copy + * @src_y: Y coordinate of the source region to copy + * @width: Width of the region to copy + * @height: Height of the region to copy + * @dst_handle: GEM handle of the source buffer + * @dst_stride: Stride (in bytes) of the destination buffer + * @dst_tiling: Tiling mode of the destination buffer + * @dst_x: X coordinate of destination + * @dst_y: Y coordinate of destination + * + * Like igt_blitter_fast_copy(), but talking to the kernel directly. 
+ */ +void igt_blitter_fast_copy__raw(int fd, + /* src */ + uint32_t src_handle, + unsigned int src_stride, + unsigned int src_tiling, + unsigned int src_x, unsigned src_y, + + /* size */ + unsigned int width, unsigned int height, + + /* dst */ + uint32_t dst_handle, + unsigned int dst_stride, + unsigned int dst_tiling, + unsigned int dst_x, unsigned dst_y) +{ + uint32_t batch[12]; + struct drm_i915_gem_exec_object2 objs[3]; + struct drm_i915_gem_relocation_entry relocs[2]; + uint32_t batch_handle; + uint32_t dword0, dword1; + uint32_t src_pitch, dst_pitch; + int i = 0; + + src_pitch = fast_copy_pitch(src_stride, src_tiling); + dst_pitch = fast_copy_pitch(dst_stride, dst_tiling); + dword0 = fast_copy_dword0(src_tiling, dst_tiling); + dword1 = fast_copy_dword1(src_tiling, dst_tiling); + +#define CHECK_RANGE(x) ((x) >= 0 && (x) < (1 << 15)) + assert(CHECK_RANGE(src_x) && CHECK_RANGE(src_y) && + CHECK_RANGE(dst_x) && CHECK_RANGE(dst_y) && + CHECK_RANGE(width) && CHECK_RANGE(height) && + CHECK_RANGE(src_x + width) && CHECK_RANGE(src_y + height) && + CHECK_RANGE(dst_x + width) && CHECK_RANGE(dst_y + height) && +
[Intel-gfx] [PATCH i-g-t 04/12] lib/skl: Add gen9 specific igt_blitter_fast_copy()
From: Damien Lespiau v2: Adjust for BB handling changes. (Tvrtko Ursulin) Correct XY_FAST_COPY_DST_TILING_Yf. (Tvrtko Ursulin) v3: New tiling modes are not defined in the kernel any more. (Tvrtko Ursulin) Signed-off-by: Damien Lespiau Signed-off-by: Tvrtko Ursulin --- lib/intel_batchbuffer.c | 106 lib/intel_batchbuffer.h | 17 lib/intel_reg.h | 18 3 files changed, 141 insertions(+) diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c index 5226910..51552b0 100644 --- a/lib/intel_batchbuffer.c +++ b/lib/intel_batchbuffer.c @@ -482,6 +482,112 @@ void igt_buf_write_to_png(struct igt_buf *buf, const char *filename) drm_intel_bo_unmap(buf->bo); } +/* + * pitches are in bytes if the surfaces are linear, number of dwords + * otherwise + */ +static uint32_t fast_copy_pitch(struct igt_buf *buf) +{ + if (buf->tiling != I915_TILING_NONE) + return buf->stride / 4; + else + return buf->stride; +} + +/** + * igt_blitter_fast_copy: + * @batch: batchbuffer object + * @context: libdrm hardware context to use + * @src: source i-g-t buffer object + * @src_x: source pixel x-coordination + * @src_y: source pixel y-coordination + * @width: width of the copied rectangle + * @height: height of the copied rectangle + * @dst: destination i-g-t buffer object + * @dst_x: destination pixel x-coordination + * @dst_y: destination pixel y-coordination + * + * Copy @src into @dst using the gen9 fast copy blitter comamnd. + * + * The source and destination surfaces cannot overlap. 
+ */ +void igt_blitter_fast_copy(struct intel_batchbuffer *batch, + struct igt_buf *src, unsigned src_x, unsigned src_y, + unsigned width, unsigned height, + struct igt_buf *dst, unsigned dst_x, unsigned dst_y) +{ + uint32_t src_pitch, dst_pitch; + uint32_t dword0 = 0, dword1 = 0; + + src_pitch = fast_copy_pitch(src); + dst_pitch = fast_copy_pitch(dst); + +#define CHECK_RANGE(x) ((x) >= 0 && (x) < (1 << 15)) + assert(CHECK_RANGE(src_x) && CHECK_RANGE(src_y) && + CHECK_RANGE(dst_x) && CHECK_RANGE(dst_y) && + CHECK_RANGE(width) && CHECK_RANGE(height) && + CHECK_RANGE(src_x + width) && CHECK_RANGE(src_y + height) && + CHECK_RANGE(dst_x + width) && CHECK_RANGE(dst_y + height) && + CHECK_RANGE(src_pitch) && CHECK_RANGE(dst_pitch)); +#undef CHECK_RANGE + + dword0 |= XY_FAST_COPY_BLT; + + switch (src->tiling) { + case I915_TILING_X: + dword0 |= XY_FAST_COPY_SRC_TILING_X; + break; + case I915_TILING_Y: + case I915_TILING_Yf: + dword0 |= XY_FAST_COPY_SRC_TILING_Yb_Yf; + break; + case I915_TILING_Ys: + dword0 |= XY_FAST_COPY_SRC_TILING_Ys; + break; + case I915_TILING_NONE: + default: + break; + } + + switch (dst->tiling) { + case I915_TILING_X: + dword0 |= XY_FAST_COPY_DST_TILING_X; + break; + case I915_TILING_Y: + case I915_TILING_Yf: + dword0 |= XY_FAST_COPY_DST_TILING_Yb_Yf; + break; + case I915_TILING_Ys: + dword0 |= XY_FAST_COPY_DST_TILING_Ys; + break; + case I915_TILING_NONE: + default: + break; + } + + if (src->tiling == I915_TILING_Yf) + dword1 |= XY_FAST_COPY_SRC_TILING_Yf; + if (dst->tiling == I915_TILING_Yf) + dword1 |= XY_FAST_COPY_DST_TILING_Yf; + + dword1 |= XY_FAST_COPY_COLOR_DEPTH_32; + + BEGIN_BATCH(10, 2); + OUT_BATCH(dword0); + OUT_BATCH(dword1 | dst_pitch); + OUT_BATCH((dst_y << 16) | dst_x); /* dst x1,y1 */ + OUT_BATCH(((dst_y + height) << 16) | (dst_x + width)); /* dst x2,y2 */ + OUT_RELOC(dst->bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, 0); + OUT_BATCH(0); /* dst address upper bits */ + OUT_BATCH((src_y << 16) | src_x); /* src x1,y1 */ + 
OUT_BATCH(src_pitch); + OUT_RELOC(src->bo, I915_GEM_DOMAIN_RENDER, 0, 0); + OUT_BATCH(0); /* src address upper bits */ + ADVANCE_BATCH(); + + intel_batchbuffer_flush(batch); +} + /** * igt_get_render_copyfunc: * @devid: pci device id diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h index e2afc3b..0f22cd6 100644 --- a/lib/intel_batchbuffer.h +++ b/lib/intel_batchbuffer.h @@ -186,6 +186,18 @@ void intel_copy_bo(struct intel_batchbuffer *batch, long int size); /** + * Yf/Ys tiling + * + * Tiling mode in the I915_TILING_... namespace for new tiling modes which are + * defined in the kernel. (They are not fenceable so the kernel does not need + * to know about them.) + * + * They are to be used the the blitting routines below. + */ +#defin
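The pitch rule and the range assertion in the patch above can be exercised in isolation. The following is a minimal sketch under the assumption that only the two rules stated in the patch matter (bytes for linear surfaces, dwords otherwise, and every value below 1 << 15); the `ex_` names are hypothetical and not part of i-g-t.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the I915_TILING_* values used by the patch. */
enum ex_tiling { EX_TILING_NONE, EX_TILING_X, EX_TILING_Y };

/* Pitch is programmed in bytes for linear surfaces, in dwords otherwise. */
static uint32_t ex_fast_copy_pitch(unsigned int stride_bytes,
				   enum ex_tiling tiling)
{
	return tiling != EX_TILING_NONE ? stride_bytes / 4 : stride_bytes;
}

/* Mirrors CHECK_RANGE(): the value has to fit below 1 << 15. */
static bool ex_fits_fast_copy_range(unsigned int value)
{
	return value < (1u << 15);
}
```

One consequence worth noting: a linear surface with a 32768 byte stride fails the check, while the same stride on a tiled surface becomes a pitch of 8192 dwords and passes.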
[Intel-gfx] [PATCH i-g-t 02/12] lib: Extract igt_buf_write_to_png() from gem_render_copy
From: Damien Lespiau Now that the Android build has cairo, we can put cairo-dependant code back into lib/ Signed-off-by: Damien Lespiau --- lib/intel_batchbuffer.c | 25 + lib/intel_batchbuffer.h | 2 ++ tests/gem_render_copy.c | 24 +++- 3 files changed, 30 insertions(+), 21 deletions(-) diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c index c70f6d8..5226910 100644 --- a/lib/intel_batchbuffer.c +++ b/lib/intel_batchbuffer.c @@ -31,6 +31,8 @@ #include #include +#include + #include "drm.h" #include "drmtest.h" #include "intel_batchbuffer.h" @@ -458,6 +460,29 @@ unsigned igt_buf_height(struct igt_buf *buf) } /** + * igt_buf_write_to_png: + * @buf: an i-g-t buffer object + * + * Writes the content of @buf as a PNG file + */ +void igt_buf_write_to_png(struct igt_buf *buf, const char *filename) +{ + cairo_surface_t *surface; + cairo_status_t ret; + + drm_intel_bo_map(buf->bo, 0); + surface = cairo_image_surface_create_for_data(buf->bo->virtual, + CAIRO_FORMAT_RGB24, + igt_buf_width(buf), + igt_buf_height(buf), + buf->stride); + ret = cairo_surface_write_to_png(surface, filename); + igt_assert(ret == CAIRO_STATUS_SUCCESS); + cairo_surface_destroy(surface); + drm_intel_bo_unmap(buf->bo); +} + +/** * igt_get_render_copyfunc: * @devid: pci device id * diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h index 12f7be1..e2afc3b 100644 --- a/lib/intel_batchbuffer.h +++ b/lib/intel_batchbuffer.h @@ -210,6 +210,8 @@ struct igt_buf { unsigned igt_buf_width(struct igt_buf *buf); unsigned igt_buf_height(struct igt_buf *buf); +void igt_buf_write_to_png(struct igt_buf *buf, const char *filename); + /** * igt_render_copyfunc_t: * @batch: batchbuffer object diff --git a/tests/gem_render_copy.c b/tests/gem_render_copy.c index 6348eee..6aa9e0d 100644 --- a/tests/gem_render_copy.c +++ b/tests/gem_render_copy.c @@ -31,7 +31,6 @@ #include #include -#include #include #include #include @@ -71,23 +70,6 @@ typedef struct { static int opt_dump_png = false; static int 
check_all_pixels = false; -static void scratch_buf_write_to_png(struct igt_buf *buf, const char *filename) -{ - cairo_surface_t *surface; - cairo_status_t ret; - - drm_intel_bo_map(buf->bo, 0); - surface = cairo_image_surface_create_for_data(buf->bo->virtual, - CAIRO_FORMAT_RGB24, - igt_buf_width(buf), - igt_buf_height(buf), - buf->stride); - ret = cairo_surface_write_to_png(surface, filename); - igt_assert(ret == CAIRO_STATUS_SUCCESS); - cairo_surface_destroy(surface); - drm_intel_bo_unmap(buf->bo); -} - static void scratch_buf_init(data_t *data, struct igt_buf *buf, int width, int height, int stride, uint32_t color) { @@ -165,8 +147,8 @@ int main(int argc, char **argv) scratch_buf_check(&data, &dst, WIDTH / 2, HEIGHT / 2, DST_COLOR); if (opt_dump_png) { - scratch_buf_write_to_png(&src, "source.png"); - scratch_buf_write_to_png(&dst, "destination.png"); + igt_buf_write_to_png(&src, "source.png"); + igt_buf_write_to_png(&dst, "destination.png"); } if (opt_dump_aub) { @@ -188,7 +170,7 @@ int main(int argc, char **argv) &dst, WIDTH / 2, HEIGHT / 2); if (opt_dump_png) - scratch_buf_write_to_png(&dst, "result.png"); + igt_buf_write_to_png(&dst, "result.png"); if (opt_dump_aub) { drm_intel_gem_bo_aub_dump_bmp(dst.bo, -- 2.3.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH i-g-t 12/12] testdisplay/skl: Add command line options for Yb/Yf tiled fbs
From: Damien Lespiau

Signed-off-by: Damien Lespiau
---
 tests/testdisplay.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/tests/testdisplay.c b/tests/testdisplay.c
index 64ce4d7..dab9e12 100644
--- a/tests/testdisplay.c
+++ b/tests/testdisplay.c
@@ -51,6 +51,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -71,8 +72,10 @@
 #include
 #include
-#define SUBTEST_OPTS 1
+#define SUBTEST_OPTS	 1
 #define HELP_DESCRIPTION 2
+#define Yb_OPT		 3
+#define Yf_OPT		 4

 static int tio_fd;
 struct termios saved_tio;
@@ -544,7 +547,7 @@ int update_display(void)
 	return 1;
 }
-static char optstr[] = "3hiaf:s:d:p:mrto:j:";
+static char optstr[] = "3hiaf:s:d:p:mrto:j:y";

 static void __attribute__((noreturn)) usage(char *name, char opt)
 {
@@ -645,6 +648,8 @@ int main(int argc, char **argv)
 		{"run-subtest", 1, 0, SUBTEST_OPTS},
 		{"help-description", 0, 0, HELP_DESCRIPTION},
 		{"help", 0, 0, 'h'},
+		{"yb", 0, 0, Yb_OPT},
+		{"yf", 0, 0, Yf_OPT},
 		{ 0, 0, 0, 0 }
 	};
@@ -697,6 +702,13 @@ int main(int argc, char **argv)
 		case 't':
 			tiling = LOCAL_I915_FORMAT_MOD_X_TILED;
 			break;
+		case 'y':
+		case Yb_OPT:
+			tiling = LOCAL_I915_FORMAT_MOD_Y_TILED;
+			break;
+		case Yf_OPT:
+			tiling = LOCAL_I915_FORMAT_MOD_Yf_TILED;
+			break;
 		case 'r':
 			qr_code = 1;
 			break;
--
2.3.0
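The option handling added by this patch mixes a short flag (-y) with long-only options (--yb, --yf) that return small integer codes. A minimal standalone sketch of that pattern with getopt_long() follows; `ex_parse_tiling()` and the `EX_TILING_*` enum are hypothetical names, and the option string is trimmed to the tiling flags only.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical result type; testdisplay stores a modifier value instead. */
enum ex_tiling_choice {
	EX_TILING_LINEAR,
	EX_TILING_X,
	EX_TILING_Y,
	EX_TILING_YF
};

/* Long-only options return codes that cannot clash with option letters. */
#define Yb_OPT 3
#define Yf_OPT 4

static enum ex_tiling_choice ex_parse_tiling(int argc, char **argv)
{
	static const struct option long_opts[] = {
		{"yb", no_argument, NULL, Yb_OPT},
		{"yf", no_argument, NULL, Yf_OPT},
		{NULL, 0, NULL, 0}
	};
	enum ex_tiling_choice tiling = EX_TILING_LINEAR;
	int c;

	optind = 1; /* restart scanning so the helper can be called again */
	while ((c = getopt_long(argc, argv, "ty", long_opts, NULL)) != -1) {
		switch (c) {
		case 't':
			tiling = EX_TILING_X;
			break;
		case 'y': /* short alias for the legacy Y tiled layout */
		case Yb_OPT:
			tiling = EX_TILING_Y;
			break;
		case Yf_OPT:
			tiling = EX_TILING_YF;
			break;
		default:
			break;
		}
	}

	return tiling;
}
```

Keeping the long-only codes below the printable ASCII range is what lets 'y' and --yb share a case label without ambiguity.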
Re: [Intel-gfx] [PATCH] drm/i915: Push vblank enable/disable past encoder->enable/disable
On Wed, Jan 07, 2015 at 09:40:50PM +, Chris Wilson wrote:
> On Wed, Jan 07, 2015 at 02:38:46PM +0100, Daniel Vetter wrote:
> > It is platform/output dependent when exactly the pipe will start
> > running. Sometimes we just need the (cpu) pipe enabled, in other cases
> > the pch transcoder is enough and in yet other cases the (DP) port is
> > sending the frame start signal.
> >
> > In a perfect world we'd put the drm_crtc_vblank_on call exactly where
> > the pipe starts running, but due to cloning and similar things this
> > will get messy. And the current approach of picking the most
> > conservative place for all combinations also doesn't work since that
> > results in legit vblank waits (in encoder->enable hooks, e.g. the 2
> > vblank waits for sdvo) failing.
> >
> > Completely going back to the old world before
> >
> > commit 51e31d49c89055299e34b8f44d13f70e19d1
> > Author: Daniel Vetter
> > Date: Mon Sep 15 12:36:02 2014 +0200
> >
> > drm/i915: Use generic vblank wait# Please enter the commit message for
> > your changes. Lines starting
> >
> > isn't great either since screaming when the vblank wait work because
> > the pipe is off is kinda nice.
> >
> > Pick a compromise and move the drm_crtc_vblank_on right before the
> > encoder->enable call. This is a lie on some outputs/platforms, but
> > after the ->enable callback the pipe is guaranteed to run everywhere.
> > So not that bad really. Suggested by Ville.
> >
> > v2: Same treatment for drm_crtc_vblank_off and encoder->disable: I've
> > missed the ibx pipe B select w/a, which also has a vblank wait in the
> > disable function (while the pipe is obviously still running).
> >
> > Cc: Ville Syrjälä
> > Cc: Chris Wilson
> > Acked-by: Ville Syrjälä
> > Signed-off-by: Daniel Vetter
>
> Rather than decreasing the number of WARNs on my pnv during boot, this
> doubled them.
>
> The original was:
>
> [ 34.136161] WARNING: CPU: 3 PID: 206 at drivers/gpu/drm/drm_irq.c:1130 drm_wait_one_vblank+0x15a/0x160()
> [ 34.136166] vblank wait timed out on crtc 1
> [ 34.136402] [] drm_wait_one_vblank+0x15a/0x160
> [ 34.136415] [] ? prepare_to_wait_event+0xd0/0xd0
> [ 34.136433] [] i9xx_crtc_disable+0x59/0x400
>
> and the interloper:
>
> [ 47.012212] WARNING: CPU: 2 PID: 1423 at drivers/gpu/drm/drm_irq.c:1130 drm_wait_one_vblank+0x15a/0x160()
> [ 47.012217] vblank wait timed out on crtc 1
> [ 47.012400] [] drm_wait_one_vblank+0x15a/0x160
> [ 47.012409] [] ? prepare_to_wait_event+0xd0/0xd0
> [ 47.012420] [] intel_pipe_set_base+0x11e/0x1f0

Are you sure? The patch strictly increases the coverage for when vblanks
work, so more sounds really funky ...
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
[Intel-gfx] [PATCH 6/7] drm/i915/skl: Update watermarks for Y tiling
From: Tvrtko Ursulin Display watermarks need different programming for different tiling modes. Set the relevant flag so this happens during the plane commit and add relevant data into a structure made available to the watermark computation code. v2: Pass in tiling info to sprite plane updates as well. v3: Rebased for plane handling changes. v4: Handle fb == NULL when plane is disabled. v5: Refactored for addfb2 interface. v6: Refactored for fb modifier changes. v7: Updated for atomic commit by only updating watermarks when tiling changes. Signed-off-by: Tvrtko Ursulin Acked-by: Ander Conselvan de Oliveira Acked-by: Matt Roper --- drivers/gpu/drm/i915/intel_display.c | 6 ++ drivers/gpu/drm/i915/intel_drv.h | 1 + drivers/gpu/drm/i915/intel_pm.c | 33 - drivers/gpu/drm/i915/intel_sprite.c | 6 ++ 4 files changed, 41 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index c622b11..74d4923 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -11985,6 +11985,12 @@ intel_check_primary_plane(struct drm_plane *plane, INTEL_FRONTBUFFER_PRIMARY(intel_crtc->pipe); intel_crtc->atomic.update_fbc = true; + + /* Update watermarks on tiling changes. 
*/ + if (!plane->state->fb || !state->base.fb || + plane->state->fb->modifier[0] != + state->base.fb->modifier[0]) + intel_crtc->atomic.update_wm = true; } return 0; diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 399d2b2..b124548 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -501,6 +501,7 @@ struct intel_plane_wm_parameters { uint8_t bytes_per_pixel; bool enabled; bool scaled; + u64 tiling; }; struct intel_plane { diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index f7c9938..006e635 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2662,6 +2662,7 @@ static void skl_compute_wm_pipe_parameters(struct drm_crtc *crtc, struct intel_crtc *intel_crtc = to_intel_crtc(crtc); enum pipe pipe = intel_crtc->pipe; struct drm_plane *plane; + struct drm_framebuffer *fb; int i = 1; /* Index for sprite planes start */ p->active = intel_crtc_active(crtc); @@ -2677,6 +2678,14 @@ static void skl_compute_wm_pipe_parameters(struct drm_crtc *crtc, crtc->primary->fb->bits_per_pixel / 8; p->plane[0].horiz_pixels = intel_crtc->config->pipe_src_w; p->plane[0].vert_pixels = intel_crtc->config->pipe_src_h; + p->plane[0].tiling = DRM_FORMAT_MOD_NONE; + fb = crtc->primary->fb; + /* +* Framebuffer can be NULL on plane disable, but it does not +* matter for watermarks if we assume no tiling in that case. 
+*/ + if (fb) + p->plane[0].tiling = fb->modifier[0]; p->cursor.enabled = true; p->cursor.bytes_per_pixel = 4; @@ -2702,6 +2711,7 @@ static bool skl_compute_plane_wm(struct skl_pipe_wm_parameters *p, { uint32_t method1, method2, plane_bytes_per_line, res_blocks, res_lines; uint32_t result_bytes; + uint32_t y_tile_minimum; if (mem_value == 0 || !p->active || !p_params->enabled) return false; @@ -2718,11 +2728,16 @@ static bool skl_compute_plane_wm(struct skl_pipe_wm_parameters *p, plane_bytes_per_line = p_params->horiz_pixels * p_params->bytes_per_pixel; - /* For now xtile and linear */ - if (((ddb_allocation * 512) / plane_bytes_per_line) >= 1) - result_bytes = min(method1, method2); - else - result_bytes = method1; + if (p_params->tiling == I915_FORMAT_MOD_Y_TILED || + p_params->tiling == I915_FORMAT_MOD_Yf_TILED) { + y_tile_minimum = plane_bytes_per_line * 4; + result_bytes = max(method2, y_tile_minimum); + } else { + if (((ddb_allocation * 512) / plane_bytes_per_line) >= 1) + result_bytes = min(method1, method2); + else + result_bytes = method1; + } res_blocks = DIV_ROUND_UP(result_bytes, 512) + 1; res_lines = DIV_ROUND_UP(result_bytes, plane_bytes_per_line); @@ -3153,12 +3168,20 @@ skl_update_sprite_wm(struct drm_plane *plane, struct drm_crtc *crtc, int pixel_size, bool enabled, bool scaled) { struct intel_plane *intel_plane = to_intel_plane(plane); + struct drm_framebuffer *fb = plane->fb; intel_plane->wm.enabled
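The watermark change in skl_compute_plane_wm() above boils down to a different selection rule for Y/Yf tiled planes. A minimal sketch of that selection in isolation follows, assuming method1 and method2 have already been computed elsewhere and that plane_bytes_per_line is non-zero; the `ex_` function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Select the byte count the watermark is derived from.
 * Y/Yf tiled planes take the larger of method2 and four lines worth of
 * data; linear and X tiled planes keep the previous rule.
 * plane_bytes_per_line must be non-zero.
 */
static uint32_t ex_wm_result_bytes(bool y_tiled,
				   uint32_t method1, uint32_t method2,
				   uint32_t ddb_allocation,
				   uint32_t plane_bytes_per_line)
{
	if (y_tiled) {
		uint32_t y_tile_minimum = plane_bytes_per_line * 4;

		return method2 > y_tile_minimum ? method2 : y_tile_minimum;
	}

	if ((ddb_allocation * 512) / plane_bytes_per_line >= 1)
		return method1 < method2 ? method1 : method2;

	return method1;
}

/* Same rounding as DIV_ROUND_UP(result_bytes, 512) + 1 in the patch. */
static uint32_t ex_wm_blocks(uint32_t result_bytes)
{
	return (result_bytes + 511) / 512 + 1;
}
```

For a 1920 pixel wide plane at 4 bytes per pixel (7680 bytes per line), the Y tiled minimum is 30720 bytes, which works out to 61 blocks regardless of a smaller method2 value.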
Re: [Intel-gfx] [PATCH 3/5] drm/i915: Trim the command parser allocations
On Fri, Feb 13, 2015 at 04:43:22PM +, John Harrison wrote:
> On 13/02/2015 13:23, Chris Wilson wrote:
> >On Fri, Feb 13, 2015 at 01:08:59PM +, John Harrison wrote:
> >>>@@ -1155,40 +1154,30 @@ i915_gem_execbuffer_parse(struct intel_engine_cs *ring,
> >>> batch_start_offset,
> >>> batch_len,
> >>> is_master);
> >>>- if (ret) {
> >>>- if (ret == -EACCES)
> >>>- return batch_obj;
> >>>- } else {
> >>>- struct i915_vma *vma;
> >>>+ if (ret)
> >>>+ goto err;
> >>>- memset(shadow_exec_entry, 0, sizeof(*shadow_exec_entry));
> >>>+ ret = i915_gem_obj_ggtt_pin(shadow_batch_obj, 0, 0);
> >>There is no explicit unpin for this. Does it happen automatically
> >>due to adding the vma to the eb->vmas list?
> >We set the exec_flag that tells us to unpin the obj when unwinding the
> >execbuf.
> >>Also, does it matter that it will be pinned again (and explicitly
> >>unpinned) if the SECURE flag is set?
> >No, pin/unpin is just a counter, it just needs to be balanced. (Long
> >answer, yes, the restrictions given to both pin requests must match or
> >else we will attempt to repin the buffer and fail miserably as the
> >object is already pinned.)
> >-Chris
> >
>
> Reviewed-by: John Harrison

Queued for -next, thanks for the patch.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [Intel-gfx] [v2 2/5] drm/i915: Limit max VCO supported in CHV to 6.48GHz
On Mon, Feb 16, 2015 at 01:21:34PM +0200, Ville Syrjälä wrote:
> On Mon, Feb 16, 2015 at 03:07:59PM +0530, Vijay Purushothaman wrote:
> > As per the recommendation from PHY team, limit the max vco supported in CHV
> > to 6.48 GHz
> >
> > Signed-off-by: Vijay Purushothaman
> > ---
> >  drivers/gpu/drm/i915/intel_display.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 3b0fe9f..4e710f6 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -390,7 +390,7 @@ static const intel_limit_t intel_limits_chv = {
> >          * them would make no difference.
> >          */
> >         .dot = { .min = 25000 * 5, .max = 540000 * 5},
> > -       .vco = { .min = 4860000, .max = 6700000 },
> > +       .vco = { .min = 4860000, .max = 6480000 },
>
> I have a patch here to reduce the minimum to 4.80 GHz, otherwise I can't
> get my 2560x1440 HDMI display working (241.5 MHz clock). With that change
> we still have a gap (233-240 MHz) in the frequencies we can produce.
> Reducing the max to 6.48 GHz will increase that gap to 216-240 MHz, which
> is a bit unfortunate. But if that's the recommendation we should follow
> it I suppose, and hope no HDMI displays will want such frequencies.
>
> Is there an updated spreadsheet available with the new limits? Quite a
> few of the frequencies in the original spreadsheet did have vco>6.48
> GHz.

Has the updated doc been dug up meanwhile? A big part of review is getting
access to docs and making sure they're up-to-date too ...
-Daniel

>
> In any case this seems OK, so
> Acked-by: Ville Syrjälä
>
> >         .n = { .min = 1, .max = 1 },
> >         .m1 = { .min = 2, .max = 2 },
> >         .m2 = { .min = 24 << 22, .max = 175 << 22 },
> > --
> > 1.7.9.5
>
> --
> Ville Syrjälä
> Intel OTC
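The 216-240 MHz gap Ville mentions can be reproduced with simple division. What follows is a rough sketch under stated assumptions: the pixel clock is taken to be the VCO frequency divided by one total divider, and the two neighbouring total dividers are assumed to be 20 and 30. Those divider values are inferred from the figures in the mail (4.80 GHz with 240 MHz, 6.48 GHz with 216 MHz) and are not stated anywhere in the thread; the demo_* names are hypothetical.

```c
/* All frequencies are in kHz, the same unit as the .vco limits in the patch. */
static unsigned int demo_pixel_clock_khz(unsigned int vco_khz,
					 unsigned int total_div)
{
	return vco_khz / total_div;
}

/* Lowest clock reachable with the smaller divider at a 4.80 GHz VCO minimum. */
static unsigned int demo_gap_upper_khz(void)
{
	return demo_pixel_clock_khz(4800000, 20);
}

/* Highest clock reachable with the larger divider at a 6.48 GHz VCO maximum. */
static unsigned int demo_gap_lower_khz(void)
{
	return demo_pixel_clock_khz(6480000, 30);
}
```

Any pixel clock strictly between the two results cannot be produced by either divider under these assumptions, which is the hole in the frequency range the mail refers to.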
Re: [Intel-gfx] [v2 0/5] More DPIO magic for CHV HDMI & DP
On Mon, Feb 16, 2015 at 03:07:57PM +0530, Vijay Purushothaman wrote:
> Changes since version 1:
>       Addressed Ville's review comments
>       Decoded the magic numbers as much as possible
>       Split the single patch into logical patch set
>       Dropped the DPIO_CLK_EN changes
>
>
> Vijay Purushothaman (5):
>   drm/i915: Add new PHY reg definitions for lock threshold
>   drm/i915: Limit max VCO supported in CHV to 6.48GHz
>   drm/i915: Disable M2 frac division for integer case
>   drm/i915: Initialize CHV digital lock detect threshold
>   drm/i915: Update prop, int co-eff and gain threshold for CHV

Merged the first two patches from this series to dinq, thanks.
-Daniel

>
>  drivers/gpu/drm/i915/i915_reg.h      |   11 +
>  drivers/gpu/drm/i915/intel_display.c |   78 +-
>  2 files changed, 70 insertions(+), 19 deletions(-)
>
> --
> 1.7.9.5
Re: [Intel-gfx] [PATCH 1/7] drm/i915/skl: Added new macros
On Tue, Feb 17, 2015 at 03:32:35PM +0000, Goel, Akash wrote:
> Will prefer GT_INTERVAL_FROM_US, as GT_EVALUATION_COUNTER_FROM_US would be
> more specific.

Is there a new patch with revised #defines? I haven't yet caught up with
mail ...
-Daniel

>
> Best regards
> Akash
>
> -----Original Message-----
> From: Lespiau, Damien
> Sent: Tuesday, February 17, 2015 8:56 PM
> To: Goel, Akash
> Cc: intel-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 1/7] drm/i915/skl: Added new macros
>
> How about GT_INTERVAL_FROM_US()? GT_EVALUATION_COUNTER_FROM_US()?
> something along these lines I guess.
>
> --
> Damien
>
> On Tue, Feb 17, 2015 at 03:20:53PM +0000, Goel, Akash wrote:
> > Thanks for the review. Agree it's not an appropriate name.
> > Please kindly suggest one.
> > 'GT_TIME_COUNTER_UNITS_FROM_PERIOD' ??
> >
> > Best regards
> > Akash
> >
> > -----Original Message-----
> > From: Lespiau, Damien
> > Sent: Tuesday, February 17, 2015 8:10 PM
> > To: Goel, Akash
> > Cc: intel-gfx@lists.freedesktop.org
> > Subject: Re: [PATCH 1/7] drm/i915/skl: Added new macros
> >
> > On Fri, Feb 06, 2015 at 08:26:32PM +0530, akash.g...@intel.com wrote:
> > > +#define FREQ_1_28_US(us)     (((us) * 100) >> 7)
> > > +#define FREQ_1_33_US(us)     (((us) * 3) >> 2)
> > > +#define GT_FREQ_FROM_PERIOD(us, dev) (IS_GEN9(dev) ? \
> > > +                             FREQ_1_33_US(us) : \
> > > +                             FREQ_1_28_US(us))
> >
> > I'm not sure why you call that GT_FREQ when it looks like a time for
> > evaluation intervals.
> >
> > --
> > Damien
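For readers following the naming discussion, here is what the quoted macros compute: they convert a duration in microseconds into counter units of 1.28 us or roughly 1.33 us, which is why "interval" fits better than "freq". The two FREQ_* macros are copied from the quoted patch; demo_gt_interval_from_us() is a hypothetical stand-in that takes a plain boolean where the patch uses IS_GEN9(dev), so it can be compiled outside the kernel.

```c
#include <stdbool.h>

/* As quoted in the patch: microseconds to hardware interval units. */
#define FREQ_1_28_US(us)	(((us) * 100) >> 7)	/* us * 100 / 128 = us / 1.28 */
#define FREQ_1_33_US(us)	(((us) * 3) >> 2)	/* us * 3 / 4, about us / 1.33 */

/*
 * Hypothetical helper mirroring GT_FREQ_FROM_PERIOD(us, dev); the is_gen9
 * flag replaces the IS_GEN9(dev) check from the patch.
 */
static unsigned int demo_gt_interval_from_us(unsigned int us, bool is_gen9)
{
	return is_gen9 ? FREQ_1_33_US(us) : FREQ_1_28_US(us);
}
```

For example, 128 us is 100 units of 1.28 us, and 4 us is 3 units of about 1.33 us. Both conversions truncate, since the shift discards the remainder.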
Re: [Intel-gfx] [PATCH 1/7] drm/i915/skl: Added new macros
On Mon, Feb 23, 2015 at 05:21:35PM +0100, Daniel Vetter wrote:
> On Tue, Feb 17, 2015 at 03:32:35PM +0000, Goel, Akash wrote:
> > Will prefer GT_INTERVAL_FROM_US, as GT_EVALUATION_COUNTER_FROM_US would be
> > more specific.
>
> Is there a new patch with revised #defines? I haven't yet caught up with
> mail ...

Yes, but it's awfully entangled, each new series being sent as a reply to
another series. Sending a separate thread for a new version is so much
easier for the reader when several patches change.

--
Damien
Re: [Intel-gfx] [PATCH 12/18] drm/i915/skl: Implement WaDisablePowerCompilerClockGating
On Wed, Feb 18, 2015 at 09:57:52AM +0000, Nick Hoath wrote:
> On 11/02/2015 17:48, Lespiau, Damien wrote:
> >On Wed, Feb 11, 2015 at 03:29:51PM +0000, Nick Hoath wrote:
> >>On 09/02/2015 19:33, Damien Lespiau wrote:
> >>>Signed-off-by: Damien Lespiau
> >>>---
> >>> drivers/gpu/drm/i915/i915_reg.h         | 5 +++--
> >>> drivers/gpu/drm/i915/intel_ringbuffer.c | 8 ++++++++
> >>> 2 files changed, 11 insertions(+), 2 deletions(-)
> >>>
> >>>diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> >>>index a457c28..fdfbdb3 100644
> >>>--- a/drivers/gpu/drm/i915/i915_reg.h
> >>>+++ b/drivers/gpu/drm/i915/i915_reg.h
> >>>@@ -5241,8 +5241,9 @@ enum skl_disp_power_wells {
> >>> #define COMMON_SLICE_CHICKEN2                  0x7014
> >>> # define GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE  (1<<0)
> >>>
> >>>-#define HIZ_CHICKEN                            0x7018
> >>>-# define CHV_HZ_8X8_MODE_IN_1X                 (1<<15)
> >>>+#define HIZ_CHICKEN                                    0x7018
> >>>+# define CHV_HZ_8X8_MODE_IN_1X                         (1<<15)
> >>>+# define BDW_HIZ_POWER_COMPILER_CLOCK_GATING_DISABLE   (1<<3)
> >>>
> >>> #define GEN9_SLICE_COMMON_ECO_CHICKEN0         0x7308
> >>> #define  DISABLE_PIXEL_MASK_CAMMING            (1<<14)
> >>>diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >>>index 27d101c..3135192 100644
> >>>--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> >>>+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >>>@@ -930,8 +930,16 @@ static int gen9_init_workarounds(struct intel_engine_cs *ring)
> >>>
> >>> static int skl_init_workarounds(struct intel_engine_cs *ring)
> >>> {
> >>>+       struct drm_device *dev = ring->dev;
> >>>+       struct drm_i915_private *dev_priv = dev->dev_private;
> >>>+
> >>>        gen9_init_workarounds(ring);
> >>>
> >>>+       /* WaDisablePowerCompilerClockGating:skl */
> >>>+       if (INTEL_REVID(dev) == SKL_REVID_B0)
> >>
> >>Should this be <= ?
> >
> >Nop, both specs (SKL:GT2:B) and the wa db (SIWA_ONLY_SKL_B0) state
> >firmly B0 only.
> >
>
> In that case:
> Reviewed-by: Nick Hoath

Queued for -next, thanks for the patch.
-Daniel
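The review question above ("Should this be <= ?") comes down to whether the workaround applies to one stepping or to every stepping up to it. A minimal sketch of the difference, assuming steppings are ordered integers; the demo_revid enum and its values are hypothetical and are not the driver's SKL_REVID_* definitions.

```c
#include <stdbool.h>

/* Hypothetical stepping IDs, in increasing order, for illustration only. */
enum demo_revid {
	DEMO_REVID_A0,
	DEMO_REVID_B0,
	DEMO_REVID_C0,
};

/* "B0 only": the form used in the patch (==). */
static bool demo_wa_b0_only(enum demo_revid rev)
{
	return rev == DEMO_REVID_B0;
}

/* "Everything up to and including B0": what <= would have selected. */
static bool demo_wa_up_to_b0(enum demo_revid rev)
{
	return rev <= DEMO_REVID_B0;
}
```

The two checks differ only for steppings before B0, which is why Damien's answer points at the spec and the workaround database stating B0 only.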
Re: [Intel-gfx] [RFC 2/2] drm/i915: Clean-up PPGTT on context destruction
On Fri, Feb 13, 2015 at 10:05:16AM +0000, Chris Wilson wrote:
> On Fri, Feb 13, 2015 at 10:51:36AM +0100, Daniel Vetter wrote:
> > On Thu, Feb 12, 2015 at 08:05:02PM +0000, rafael.barba...@intel.com wrote:
> > > From: Rafael Barbalho
> > >
> > > With full PPGTT enabled an object's VMA entry into a PPGTT VM needs to be
> > > cleaned up so that the PPGTT PDE & PTE allocations can be freed.
> > >
> > > This problem only shows up with full PPGTT because an object's VMA is
> > > only cleaned-up when the object is destroyed. However, if the object has
> > > been shared between multiple processes this may not happen, which leads to
> > > references to the PPGTT still being kept the object was shared.
> > >
> > > Under android the sharing of GEM objects is a fairly common operation, thus
> > > the clean-up has to be more aggressive.
> > >
> > > Signed-off-by: Rafael Barbalho
> > > Cc: Daniel Vetter
> > > Cc: Jon Bloomfield
> >
> > So when we've merged this we iirc talked about this issue and decided that
> > the shrinker should be good enough in cleaning up the crap from shared
> > objects. Not a pretty solution, but it should have worked.
> >
> > Is this again the lowmemory killer wreaking havoc with our i915 shrinker,
> > or is there something else going on? And do you have some igt testcase for
> > this? If sharing is all that's required the following should do the trick:
> > 1. allocate obj
> > 2. create new context
> > 3. do dummy execbuf with that obj to map it into the ppgtt
> > 4. free context
> > 5. goto 2 often enough to OOM
>
> You know I have patches to fix all of this... It just happens to fall
> out of tracking vma in requests, and by extension vm.

Since you replied to my description of the igt testcase ... What kind of
bugs do you mean? I kinda can't find the context here ...
-Daniel
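The five-step testcase Daniel outlines in the quoted mail can be pictured with a toy model. This is not an igt test and does not touch the DRM ioctls; it only counts how many per-context mappings a long-lived object still holds, under the assumption taken from the commit message that a mapping is otherwise released only when the object itself is destroyed. All demo_* names are hypothetical.

```c
#include <stdbool.h>

/* Toy model: the shared object only tracks how many VMAs are bound to it. */
struct demo_obj {
	unsigned int bound_vmas;
};

/* Step 3: a dummy execbuf maps the object into the new context's PPGTT. */
static void demo_execbuf(struct demo_obj *obj)
{
	obj->bound_vmas++;
}

/* Step 4: freeing the context; the mapping is dropped only if cleanup is on. */
static void demo_ctx_free(struct demo_obj *obj, bool cleanup_on_ctx_free)
{
	if (cleanup_on_ctx_free && obj->bound_vmas > 0)
		obj->bound_vmas--;
}

/* Steps 1 to 5: one object, many short-lived contexts. */
static unsigned int demo_run(unsigned int iterations, bool cleanup_on_ctx_free)
{
	struct demo_obj obj = { 0 };	/* step 1: allocate obj */
	unsigned int i;

	for (i = 0; i < iterations; i++) {	/* step 5: repeat from step 2 */
		/* step 2: a new context is implied by each iteration */
		demo_execbuf(&obj);
		demo_ctx_free(&obj, cleanup_on_ctx_free);
	}

	return obj.bound_vmas;
}
```

Without cleanup at context destruction the count grows by one per iteration while the object stays alive, which is the growth the loop is meant to drive until memory runs out; with cleanup it stays at zero.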