Re: [Intel-gfx] [RFC 00/12] i915 init-time configuration.
On Fri, 13 Feb 2015, Bob Paauwe wrote: > Background: > > This capability is targeted at deeply embedded appliance like devices > that make use of Intel integrated graphics. There are a few use cases > that are not currently supported by the i915 driver. For example, > they may not be running userspace code that is capable of querying and > setting the various DRM properties and would like them set during the > driver initialization. Also they may be using a custom firmware bootloader > that does not include any graphics initialization or VBT information. > > This level of initialization configuration has been available in > the Intel EMGD kernel driver and a similar level of configurability will > be expected as designs transition to the i915 driver. > > This patch set provides a framework that makes use of ACPI property > tables containing configuration information. It also includes some > examples on how this may be applied to various aspects of the i915 > driver initialization. The biggest issue I have with this series is the introduction of another source of configuration in addition to VBT (and to a lesser extent ACPI OpRegion) without an attempt to abstract them. Information from both will get used. The mixture is completed in patch 9 that initializes some of the same data structures as intel_bios.c but without reuse of the code. Maybe we need to better abstract our use of the VBT information to begin with, so that we could plug in additional (complementary or replacement) sources of the configuration. Offhand, I am not sure if what you propose as intel_config.c API could be developed into such an abstraction, or if there's something ready made in kernel we could use. I do know we already and historically have had problems with the forward compatibility of the VBT data. It's been getting better, but we need to avoid the same mistakes. On a related note, I'd really appreciate it if the specification for your data could be made public. Oh, one other thing, this thing needs to build with CONFIG_ACPI=n. At least that's been the case for i915 for a long time. I'll add some random notes on the patches too. BR, Jani. > > Series description: > > Patch 1 creates the initial framework. It looks up a specific ACPI > property table and builds lists containing the configuration found > in that table. It includes functions that can make use of that > configuration information. > > Patch 2 adds a function to i915 that provides a unique name for > each output. We previously had something similar to this in the > driver for debug output, it was not be used and removed recently. > > Patch 3 is the first example usage. We check the configuration for > a CRTC bits-per-pixel value and use that if EDID does not provide > this. > > Patch 4 is an example of using the configuration to specify a > default value for the DP panel fitter property. > > Patch 5 is an example of using the configuration to specify default > values for a couple of common connector properties. > > Patch 6 modifies the framework slightly to better support the > remaining examples. > > Patch 7 adds a function to the framework that looks for a > workaround section. If found, it builds a list of workarounds that > can be used in place of of the workarounds hardcoded in the driver. > > Patch 8 changes the workaround initialization code to make use > of the workaround list from the configuration instead of the > built-in workaround list. 
> > Patch 9 adds functions to the framework that look for a VBT > section and parse that information into the driver's VBT structures. > > Patch 10 adds an example/test ACPI property table and adds code to > the framework to build this table into the driver. This is mainly for > testing the framework, but may also be useful for truly embedded > devices as a way to embed the configuration. > > Patch 11 adds an example workaround section to the test ACPI property > table. > > Patch 12 adds an example VBT section to the test ACPI property table. > > Bob Paauwe (12): > drm/i915/config: Initial framework > drm/i915/config: Introduce intel_output_name > drm/i915/config: Add init-time configuration of bits per color. > drm/i915/config: Set dp panel fitter property based on init-time > config. > drm/i915/config: Set general connector properties using config. > drm/i915/config: Split out allocation of list nodes. > drm/i915/config: Get workaround information from configuration. > drm/i915/config: Use workarounds list from configuration. > drm/i915/config: Add VBT settings configuration. > drm/i915/config: Introduce a test table and code to make use of it. > drm/i915/config: Add workaround properties to ACPI table. > drm/i915/config: Add ACPI device examples for VBT configuration. > > drivers/gpu/drm/i915/Makefile | 3 +- > drivers/gpu/drm/i915/i915-properties.asl | 340 ++ > drivers/gpu/drm/i915/i915-properties.hex | 409 > drivers/gpu/drm/i915/i915_
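As a rough illustration of the abstraction Jani asks about above -- one interface behind which VBT, ACPI OpRegion and ACPI property tables could all sit -- something along these lines could work. This is only a sketch: intel_config_source, intel_config_query_int and dev_priv->config_sources are made-up names, not existing i915 code or anything proposed in this series.

/*
 * Sketch of a pluggable configuration-source abstraction.  Sources
 * (VBT, OpRegion, ACPI properties, ...) register themselves and are
 * queried in priority order; all names here are illustrative.
 */
struct intel_config_source {
	const char *name;
	/* returns 0 and fills *value if this source knows the property */
	int (*get_int)(struct drm_i915_private *dev_priv,
		       const char *property, int *value);
	/* lower number == higher priority when sources disagree */
	int priority;
	struct list_head link;
};

static int intel_config_query_int(struct drm_i915_private *dev_priv,
				  const char *property, int *value)
{
	struct intel_config_source *src;

	/* hypothetical list, kept sorted by priority at registration time */
	list_for_each_entry(src, &dev_priv->config_sources, link)
		if (src->get_int(dev_priv, property, value) == 0)
			return 0;

	return -ENOENT;
}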
Re: [Intel-gfx] [RFC 02/12] drm/i915/config: Introduce intel_output_name
On Fri, 13 Feb 2015, Bob Paauwe wrote: > Human readable name for each output type to correspond with names > used in the ACPI property tables. Could you not use drm_connector and drm_encoder type and name fields? BR, Jani. > > Signed-off-by: Bob Paauwe > --- > drivers/gpu/drm/i915/intel_display.c | 57 > > drivers/gpu/drm/i915/intel_drv.h | 1 + > 2 files changed, 58 insertions(+) > > diff --git a/drivers/gpu/drm/i915/intel_display.c > b/drivers/gpu/drm/i915/intel_display.c > index 3b0fe9f..de6de83 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -12440,6 +12440,63 @@ static bool intel_crt_present(struct drm_device *dev) > return true; > } > > +/* > + * Provide a name for the various outputs. > + */ > +const char *intel_output_name(struct intel_connector *connector) > +{ > + int output; > + static const char *names[] = { > + [INTEL_OUTPUT_UNUSED] = "Unused", > + [INTEL_OUTPUT_ANALOG] = "Analog", > + [INTEL_OUTPUT_DVO] = "DVO", > + [INTEL_OUTPUT_SDVO] = "SDVO", > + [INTEL_OUTPUT_LVDS] = "LVDS", > + [INTEL_OUTPUT_TVOUT] = "TV", > + [INTEL_OUTPUT_HDMI] = "HDMI", > + [INTEL_OUTPUT_DISPLAYPORT] = "DisplayPort", > + [INTEL_OUTPUT_EDP] = "eDP", > + [INTEL_OUTPUT_DSI] = "DSI", > + [INTEL_OUTPUT_UNKNOWN] = "Unknown", > + }; > + static const char *name_ex[] = { > + [0] = "HDMI_A", > + [1] = "HDMI_B", > + [2] = "HDMI_C", > + [3] = "HDMI_D", > + [4] = "DisplayPort_A", > + [5] = "DisplayPort_B", > + [6] = "DisplayPort_C", > + [7] = "DisplayPort_D", > + [8] = "eDP_A", > + [9] = "eDP_B", > + [10] = "eDP_C", > + [11] = "eDP_D", > + }; > + > + if (!connector || !connector->encoder) > + return "Unknown"; > + > + switch (connector->encoder->type) { > + case INTEL_OUTPUT_HDMI: > + case INTEL_OUTPUT_DISPLAYPORT: > + case INTEL_OUTPUT_EDP: > + output = ((connector->encoder->type - INTEL_OUTPUT_HDMI) * 4) + > + enc_to_dig_port(&connector->encoder->base)->port; > + > + if (output < 0 || output >= ARRAY_SIZE(name_ex)) > + return "Invalid"; > + > + return name_ex[output]; > + default: > + if (output < 0 || output >= ARRAY_SIZE(names) || !names[output]) > + return "Invalid"; > + > + return names[output]; > + } > +} > + > + > static void intel_setup_outputs(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = dev->dev_private; > diff --git a/drivers/gpu/drm/i915/intel_drv.h > b/drivers/gpu/drm/i915/intel_drv.h > index aefd95e..4c81ee9 100644 > --- a/drivers/gpu/drm/i915/intel_drv.h > +++ b/drivers/gpu/drm/i915/intel_drv.h > @@ -893,6 +893,7 @@ void i915_audio_component_cleanup(struct drm_i915_private > *dev_priv); > > /* intel_display.c */ > extern const struct drm_plane_funcs intel_plane_funcs; > +const char *intel_output_name(struct intel_connector *intel_connector); > bool intel_has_pending_fb_unpin(struct drm_device *dev); > int intel_pch_rawclk(struct drm_device *dev); > void intel_mark_busy(struct drm_device *dev); > -- > 2.1.0 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Jani Nikula, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
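For reference, Jani's suggestion amounts to something like the sketch below: reuse the connector names the DRM core already generates (e.g. "eDP-1", "HDMI-A-2") instead of a driver-private table. Illustrative only -- whether those generated names line up with what the ACPI property tables expect is exactly the open question raised by the review comment.

/*
 * Sketch of the alternative: rely on the name drm_connector_init()
 * assigns to the connector rather than a private lookup table.
 */
static const char *intel_output_name(struct intel_connector *connector)
{
	if (!connector)
		return "Unknown";

	return connector->base.name;
}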
Re: [Intel-gfx] [RFC 06/12] drm/i915/config: Split out allocation of list nodes.
On Fri, 13 Feb 2015, Bob Paauwe wrote: > We'll reduce some duplicate code if we move the list node allocation > to its own function when we start processing future config items like > workaround or vbt information. Should probably just be part of patch 1. BR, Jani. > > Signed-off-by: Bob Paauwe > --- > drivers/gpu/drm/i915/intel_config.c | 49 > ++--- > 1 file changed, 29 insertions(+), 20 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_config.c > b/drivers/gpu/drm/i915/intel_config.c > index cf7da93..fb495ed 100644 > --- a/drivers/gpu/drm/i915/intel_config.c > +++ b/drivers/gpu/drm/i915/intel_config.c > @@ -161,6 +161,21 @@ static bool node_property(struct intel_config_node *n, > } > > > +static bool alloc_new_node(struct acpi_device *cl, struct list_head *list) > +{ > + struct intel_config_node *new_node; > + > + new_node = kzalloc(sizeof(*new_node), GFP_KERNEL); > + if (!new_node) > + return false; > + > + new_node->adev = cl; > + INIT_LIST_HEAD(&new_node->node); > + list_add_tail(&new_node->node, list); > + > + return true; > +} > + > /** > * intel_config_init - > * > @@ -232,26 +247,20 @@ void intel_config_init(struct drm_device *dev) > > cname = acpi_device_bid(component); > > - list_for_each_entry(cl, &component->children, node) { > - new_node = kzalloc(sizeof(*new_node), GFP_KERNEL); > - if (!new_node) > - goto bail; > - new_node->adev = cl; > - INIT_LIST_HEAD(&new_node->node); > - > - /* Add to the appropriate list */ > - if (strcmp(cname, i915_COMPONENT_CRTC) == 0) { > - list_add_tail(&new_node->node, > - &info->crtc_list); > - } else if (strcmp(cname, i915_COMPONENT_CONNECTOR) == > 0) { > - list_add_tail(&new_node->node, > - &info->connector_list); > - } else if (strcmp(cname, i915_COMPONENT_PLANE) == 0) { > - list_add_tail(&new_node->node, > - &info->plane_list); > - } else { > - /* unknown component, ignore it */ > - kfree(new_node); > + if (strcmp(cname, i915_COMPONENT_CRTC) == 0) { > + list_for_each_entry(cl, &component->children, node) { > + if (!alloc_new_node(cl, &info->crtc_list)) > + goto bail; > + } > + } else if (strcmp(cname, i915_COMPONENT_CONNECTOR) == 0) { > + list_for_each_entry(cl, &component->children, node) { > + if (!alloc_new_node(cl, &info->crtc_list)) > + goto bail; > + } > + } else if (strcmp(cname, i915_COMPONENT_PLANE) == 0) { > + list_for_each_entry(cl, &component->children, node) { > + if (!alloc_new_node(cl, &info->crtc_list)) > + goto bail; > } > } > } > -- > 2.1.0 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Jani Nikula, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC 01/12] drm/i915/config: Initial framework
On Fri, 13 Feb 2015, Bob Paauwe wrote: > This adds an init-time configuration framework that parses configuration > data from an ACPI property table. The table is assumed to have well > defined sub-device property tables that correspond to the various > driver components. Initially the following sub-device tables are > defined: > > CRTC (CRTC) >The CRTC sub-device table contains additional sub-device tables >where each one corresponds to a CRTC. Each CRTC sub-device must >include a property called "id" whose value matches the driver's >crtc id. Additional properties for the CRTC are used to configure >the crtc. > > Connector (CNCT) >The CNCT sub-device table contains additional sub-device tables >where each one corresponds to a connector. Each of the connector >sub-device tables must include a property called "name" whose value >matches a connector name assigned by the driver (see later patch >for output name function). Additional connector properties can >be set through these tables. > > Plane (PLNS) >The PLNS sub-device table contains additional sub-device tables >where each one corresponds to a plane. [this needs additional work] > > In addition, the main device property table for the device may > contain configuration information that applies to general driver > configuration. > > The framework includes a couple of helper functions to access the > configuration data. > >intel_config_get_integer() will look up a configuration property >and return the integer value associated with it. > >intel_config_init__property() will look up a >configuration property and assign the value to a drm >property of the same name. These functions are used to >initialize drm property instances to specific values. > > Signed-off-by: Bob Paauwe > --- > drivers/gpu/drm/i915/Makefile | 3 +- > drivers/gpu/drm/i915/i915_dma.c | 4 + > drivers/gpu/drm/i915/i915_drv.h | 16 ++ > drivers/gpu/drm/i915/i915_params.c | 6 + > drivers/gpu/drm/i915/intel_config.c | 542 > > drivers/gpu/drm/i915/intel_drv.h| 28 ++ > 6 files changed, 598 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/i915/intel_config.c > > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index f025e7f..462de19 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -12,7 +12,8 @@ i915-y := i915_drv.o \ >i915_suspend.o \ > i915_sysfs.o \ > intel_pm.o \ > - intel_runtime_pm.o > + intel_runtime_pm.o \ > + intel_config.o > > i915-$(CONFIG_COMPAT) += i915_ioc32.o > i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index 5804aa5..9501360 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -656,6 +656,8 @@ int i915_driver_load(struct drm_device *dev, unsigned > long flags) > dev->dev_private = dev_priv; > dev_priv->dev = dev; > > + intel_config_init(dev); > + > /* Setup the write-once "constant" device info */ > device_info = (struct intel_device_info *)&dev_priv->info; > memcpy(device_info, info, sizeof(dev_priv->info)); > @@ -929,6 +931,8 @@ int i915_driver_unload(struct drm_device *dev) > > acpi_video_unregister(); > > + intel_config_shutdown(dev); > + > if (drm_core_check_feature(dev, DRIVER_MODESET)) > intel_fbdev_fini(dev); > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 2dedd43..165091c 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1645,6 +1645,20 @@ struct i915_virtual_gpu { > 
bool active; > }; > > +struct intel_config_node { > + struct acpi_device *adev; > + struct list_head node; > + struct list_head list; > +}; > + > +struct intel_config_info { > + struct intel_config_node base; > + struct list_head crtc_list; > + struct list_head connector_list; > + struct list_head plane_list; > +}; > + > + > struct drm_i915_private { > struct drm_device *dev; > struct kmem_cache *slab; > @@ -1886,6 +1900,7 @@ struct drm_i915_private { > u32 long_hpd_port_mask; > u32 short_hpd_port_mask; > struct work_struct dig_port_work; > + struct intel_config_info *config_info; > > /* >* if we get a HPD irq from DP and a HPD irq from non-DP > @@ -2528,6 +2543,7 @@ struct i915_params { > int enable_ips; > int invert_brightness; > int enable_cmd_parser; > + char cfg_firmware[PATH_MAX]; > /* leave bools at the end to not create holes */ > bool enable_hangcheck; > bool fastboot; > diff --git a/drivers/gpu/drm/i915/i915_params.c > b/drivers/gpu/drm/i915/i915_params.c > index 44f2262..f92621c 100644 > --- a/drivers/gpu/drm/i9
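The body of the patch is cut off above, so purely as a point of reference, a lookup like the intel_config_get_integer() helper described in the cover text could be written on top of the unified device property API that landed in v3.19. The struct layout follows the intel_config_node hunk above, but the body below is an assumption, not the code from the patch.

#include <linux/acpi.h>
#include <linux/property.h>

/*
 * Illustrative sketch: read an integer _DSD property (e.g. "id" or a
 * bits-per-color value) from the ACPI sub-device backing a config node.
 */
static int intel_config_get_integer(struct intel_config_node *node,
				    const char *name, u32 *value)
{
	if (!node || !node->adev)
		return -ENODEV;

	return fwnode_property_read_u32(&node->adev->fwnode, name, value);
}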
Re: [Intel-gfx] [PATCH] drm/i915: Fix frontbuffer false positive.
On Thu, Feb 12, 2015 at 05:17:04PM -0800, Rodrigo Vivi wrote: > No, we had solved old frontbuffer false positives... some missing > flush somewhere at that time... > > So, I added a bunch of printk and I insist that it is conceptually > wrong to set intel_crtc_atomic_commit on check times when you do > memset(&intel_crtc->atomic, 0, sizeof(intel_crtc->atomic)); > on every finish_commit. > > With exception of atomic.disabled_planes I believe the rest shouldn't > work in the way it is implemented because you can have one check > followed by many commits, but after the first commit all atomic > variables are zeroed, except the disabled_planes that is set outside > check... Ok here's the trouble: Every commit should have at exactly one check for the new state objects. Unfortunately in the transition that seems to have been lost for some cases. > For instance: on every cursor movement atomic.fb_bits was 0x000 when > it should be 0x002. This is why this patch solved the false positive, > i.e. setting it on commit instead on check time we get it propperly > set. One of the problems is the false positive but also it breaks > entirely SW tracking on VLV/CHV > > I believe wait_for flips, update_fbc, watermarks, etc should keep the > value got on check for the commit or the check should be done at > commit plane instead of on check. > > I started doing a patch here to move all atomic sets from check to > commit functions but gave up on middle when I noticed the > prepare_commit would almost get empty... All state precomputation must be done in check, at commit time you have a lot less information since the old state is somewhat gone. You can still get at it, but as soon as we add an async flip queue that will get really ugly. The current placement is imo the correct one. Instead we need to figure out where we're doing a ->commit without properly calling ->check beforehand. > Another idea was to make a atomic set per plane and just memset(0) on > begin of every check... But this would require reliable access to the > plane being updated on finish_commit... I believe loop on all planes > would be messy and cause other issues... > > So, I'll be out returning only next wed. Please let me know if you > have any suggestion of best changes to do that I can implement the > changes. Since you've done this testing I've landed Matt's patches to switch legacy plane entry points over to atomic. Which means cursor updates should now be done properly using atomic, always. But even then the old transitional plane helpers should have called the check functions ... So not sure where exactly we're loosing that check call. Matt Roper might have more insights. Thanks, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
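To make the check/commit rule Daniel describes concrete, a schematic (with entirely made-up names, not i915 or atomic-helper code) looks like the fragment below: all decisions are taken in check, while old and new state are both visible, and commit only applies the precomputed result. Zeroing the precomputed bits between one check and several commits therefore loses information, which is the bug being discussed.

/* Schematic only -- placeholder names and values, not driver code. */
struct example_plane_state {
	bool visible;
	unsigned int fb_bits;	/* decided once, at check time */
};

static int example_check(struct example_plane_state *new_state,
			 const struct drm_framebuffer *new_fb)
{
	/* precompute everything while old and new state are both known */
	new_state->visible = new_fb != NULL;
	new_state->fb_bits = new_state->visible ? 0x2 : 0;
	return 0;
}

static void example_commit(const struct example_plane_state *state)
{
	/* apply only; a commit without a matching check sees stale zeros */
	if (state->fb_bits)
		example_frontbuffer_flush(state->fb_bits); /* placeholder */
}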
Re: [Intel-gfx] [PATCH] drm/i915: Check obj->vma_list under the struct_mutex
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang...@intel.com) Task id: 5765
-Summary-
Platform  Delta  drm-intel-nightly  Series Applied
PNV       -1     282/282            281/282
ILK              313/313            313/313
SNB              309/323            309/323
IVB              380/380            380/380
BYT              296/296            296/296
HSW       -1     425/425            424/425
BDW       -1     318/318            317/318
-Detailed-
Platform  Test                          drm-intel-nightly  Series Applied
*PNV      igt_gen3_render_linear_blits  PASS(5)            CRASH(1)
*HSW      igt_gem_storedw_loop_blt      PASS(3)            DMESG_WARN(1)
*BDW      igt_gem_gtt_hog               PASS(8)            DMESG_WARN(1)
Note: You need to pay more attention to line start with '*' ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915/skl: Use a LRI for WaDisableDgMirrorFixInHalfSliceChicken5
On Thu, Feb 12, 2015 at 01:36:29PM +, Nick Hoath wrote: > On 11/02/2015 18:21, Lespiau, Damien wrote: > >I have no idea how that crept in, but we need to do the write from the > >ring and this is a masked register. Two fixes in 1! > > > >Cc: Nick Hoath > >Signed-off-by: Damien Lespiau > > Reviewed-by: Nick Hoath Merged four more skl wa patches, thanks. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/5] agp/intel: Serialise after GTT updates
On Fri, Feb 06, 2015 at 09:32:29AM +0100, Daniel Vetter wrote: > On Thu, Feb 05, 2015 at 04:11:00PM -0800, Jesse Barnes wrote: > > On Wed, 14 Jan 2015 11:20:55 + > > Chris Wilson wrote: > > > > > diff --git a/drivers/char/agp/intel-gtt.c > > > b/drivers/char/agp/intel-gtt.c index 92aa43fa8d70..15685ca39193 100644 > > > --- a/drivers/char/agp/intel-gtt.c > > > +++ b/drivers/char/agp/intel-gtt.c > > > @@ -225,7 +225,7 @@ static int i810_insert_dcache_entries(struct > > > agp_memory *mem, off_t pg_start, > > > intel_private.driver->write_entry(addr, i, type); > > > } > > > - readl(intel_private.gtt+i-1); > > > + readl(intel_private.gtt+pg_start); > > > > Any idea why? This one scares me... is it that the read is being > > serviced from the WC buffer w/o being flushed? Or is the compiler > > optimizing the last read based on the previous write? > > > > Writing a non-sequential address should also cause a flush, but I don't > > remember the rules for reads. We should get this figured out while we > > have an easy way to reproduce and a willing tester. > > Yeah agreed, but apparently a full mb(); is good enough too. So that's > what has landed. I was first wondering if we need something like this for gen6+ too, but then I relalized the UC GFX_FLUSH_CNTL access should cause the WC flush already. Hmm, except we don't do it for clear_range(), but I guess that's not a huge issue, just means someone could clobber some other memory besides the scratch page if accidentally writing to a cleared area of the ggtt. -- Ville Syrjälä Intel OTC ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
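For readers following along, the pattern under discussion is isolated below: GTT PTE writes go through a write-combining mapping and must be flushed before the GPU may use the entries, either with a read back through the same mapping (the posting read being debated) or with a full memory barrier, which is the variant Daniel says was merged. The fragment is illustrative, not the intel-gtt.c implementation.

/* Illustrative fragment only, not the intel-gtt.c code. */
static void example_insert_gtt_entries(u32 __iomem *gtt_base,
				       const dma_addr_t *pages,
				       unsigned int first, unsigned int count)
{
	unsigned int i;

	/* PTE writes land in the CPU's write-combining buffer first */
	for (i = 0; i < count; i++)
		writel(lower_32_bits(pages[i]) | 1 /* "valid" bit, illustrative */,
		       &gtt_base[first + i]);

	/*
	 * Flush the WC buffer so the updated PTEs are visible to the GPU
	 * before anything accesses the mapped range.  A readl() of one of
	 * the just-written entries would serve the same purpose.
	 */
	mb();
}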
Re: [Intel-gfx] [PATCH] drm/i915: Check obj->vma_list under the struct_mutex
On Thu, Feb 12, 2015 at 09:48:23AM +, Chris Wilson wrote: > On Thu, Feb 12, 2015 at 10:43:44AM +0100, Daniel Vetter wrote: > > On Thu, Feb 12, 2015 at 07:53:18AM +, Chris Wilson wrote: > > > When we walk the list of vma, or even for protecting against concurrent > > > framebuffer creation, we must hold the struct_mutex or else a second > > > thread can corrupt the list as we walk it. > > > > > > Fixes regression from > > > commit d7f46fc4e7323887494db13f063a8e59861fefb0 > > > Author: Ben Widawsky > > > Date: Fri Dec 6 14:10:55 2013 -0800 > > > > > > drm/i915: Make pin count per VMA > > > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=89085 > > > Signed-off-by: Chris Wilson > > > --- > > > drivers/gpu/drm/i915/i915_gem_tiling.c | 7 --- > > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c > > > b/drivers/gpu/drm/i915/i915_gem_tiling.c > > > index 7a24bd1a51f6..6377b22269ad 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_tiling.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c > > > @@ -335,9 +335,10 @@ i915_gem_set_tiling(struct drm_device *dev, void > > > *data, > > > return -EINVAL; > > > } > > > > > > + mutex_lock(&dev->struct_mutex); > > > if (i915_gem_obj_is_pinned(obj) || obj->framebuffer_references) { > > > > Since the removal of userspace pinning we shouldn't be able to see pinned > > objects here which are _not_ framebuffers too. But we still need the lock > > for synchronization and to avoid races, but perhaps we could drop the list > > walk? > > It would be possible for us to catch an object in the process of being > executed. More so, we *only* care about GTT pinning here, but still we > need to the lock to prevent that disappearing underneath us. We still need to grab dev->struct_mutex of course to avoid seeing bo pinned for execbuf. Just thought we could avoid the list walk in set_tiling as a super-micro-opt. > > Either way this is > > > > Reviewed-by: Daniel Vetter > > Cc: sta...@vger.kernel.org (we have some vague evidence that it blows up > > at last) > > > > I've also audited all the other callers of is_pinned, the only other > > suspicious one is the one in capture_bo. Perhaps we should also move that > > over to obj->framebuffer_references? > > We killed that over a year ago in the conversion of error capture over > to vma for full-ppgtt prepartion... Right? No, that was left semantically unchanged in the switch. So I guess we should dump vma->pin_count and obj->framebuffer_references? -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Check obj->vma_list under the struct_mutex
On Fri, Feb 13, 2015 at 10:03:49AM +0100, Daniel Vetter wrote: > On Thu, Feb 12, 2015 at 09:48:23AM +, Chris Wilson wrote: > > On Thu, Feb 12, 2015 at 10:43:44AM +0100, Daniel Vetter wrote: > > > On Thu, Feb 12, 2015 at 07:53:18AM +, Chris Wilson wrote: > We still need to grab dev->struct_mutex of course to avoid seeing bo > pinned for execbuf. Just thought we could avoid the list walk in > set_tiling as a super-micro-opt. When we have an igt that combines thousands of vm with set-tiling, then we might notice! :) > > > Either way this is > > > > > > Reviewed-by: Daniel Vetter > > > Cc: sta...@vger.kernel.org (we have some vague evidence that it blows up > > > at last) > > > > > > I've also audited all the other callers of is_pinned, the only other > > > suspicious one is the one in capture_bo. Perhaps we should also move that > > > over to obj->framebuffer_references? > > > > We killed that over a year ago in the conversion of error capture over > > to vma for full-ppgtt prepartion... Right? > > No, that was left semantically unchanged in the switch. So I guess we > should dump vma->pin_count and obj->framebuffer_references? I meant we sent patches to improve error states for full-ppgtt. vma->pin_count is not interesting, since that is only done for execbuf, I only care about the GTT pinned objects since they are what we have pinned on behalf of hardware (and so are useful for crosschecking against register state), and that is what we specifically dump. Adding obj->framebuffer_references would be interesting, as well as the list of current framebuffers i.e. i915_gem_framebuffers. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/5] agp/intel: Serialise after GTT updates
On Fri, Feb 13, 2015 at 10:59:42AM +0200, Ville Syrjälä wrote: > Hmm, except we don't do it for clear_range(), but I guess that's not a > huge issue, just means someone could clobber some other memory besides > the scratch page if accidentally writing to a cleared area of the ggtt. It's a very small hole, since it is the ggtt and the only user is the kernel. (Except for i915.enable_ppgtt=0.) It is possible to forgo that clear entirely (looks at the PD implementation for a current violator, and that is a good example of where the extra WC writes can have a measurable impact on synmark). -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Fix an incorrect free rather than dereference issue.
On Thu, Feb 12, 2015 at 12:29:21PM +, Nick Hoath wrote: > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88652 > > Signed-off-by: Nick Hoath Commit message is missing the absolutely crucial detail about which patch introduced this regression: commit 6d3d8274bc45de4babb62d64562d92af984dd238 Author: Nick Hoath AuthorDate: Thu Jan 15 13:10:39 2015 + drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request Another thing I've noticed is that we explicitly drop the context reference for the request before dropping the request reference. Without clearing the req->ctx pointer. That has a very high chance to leading to tears, imo the context unreferenceing should be pushed into i915_gem_request_free. Except that it's there already, which means we have a double unref now? Also this patch is for the legacy ringbuffer code, but the referenced bug is for gen8+ execlists. We're definitely not running this code here I think. Imo step one is to drop all the explicit ctx refcounting for req->ctx and always rely on the implicit reference. Then see what happens. Cheers, Daniel > --- > drivers/gpu/drm/i915/i915_drv.c | 2 +- > drivers/gpu/drm/i915/i915_gem.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 1765989..dc10d86 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -2661,7 +2661,7 @@ static void i915_gem_reset_ring_cleanup(struct > drm_i915_private *dev_priv, > intel_lr_context_unpin(ring, submit_req->ctx); > > i915_gem_context_unreference(submit_req->ctx); > - kfree(submit_req); > + i915_gem_request_unreference(submit_req); > } > > /* > -- > 2.1.1 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
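The shape Daniel argues for -- a single place that drops the request's reference on its context, so no caller unreferences ctx by hand -- would look roughly like this. The function and field names follow the thread, but the body is a sketch; as the reply notes, the real free function of that era already does the unreference, which is exactly why a double unref is suspected.

/*
 * Sketch of the single-owner refcounting pattern being discussed:
 * the request's reference on its context is dropped only here.
 */
void i915_gem_request_free(struct kref *req_ref)
{
	struct drm_i915_gem_request *req =
		container_of(req_ref, typeof(*req), ref);

	if (req->ctx) {
		/* any execlists-specific unpinning would also live here */
		i915_gem_context_unreference(req->ctx);
		req->ctx = NULL;
	}

	kfree(req);
}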
Re: [Intel-gfx] [RFC 2/2] drm/i915: Clean-up PPGTT on context destruction
On Thu, Feb 12, 2015 at 08:05:02PM +, rafael.barba...@intel.com wrote: > From: Rafael Barbalho > > With full PPGTT enabled an object's VMA entry into a PPGTT VM needs to be > cleaned up so that the PPGTT PDE & PTE allocations can be freed. > > This problem only shows up with full PPGTT because an object's VMA is > only cleaned-up when the object is destroyed. However, if the object has > been shared between multiple processes this may not happen, which leads to > references to the PPGTT still being kept the object was shared. > > Under android the sharing of GEM objects is a fairly common operation, thus > the clean-up has to be more agressive. > > Signed-off-by: Rafael Barbalho > Cc: Daniel Vetter > Cc: Jon Bloomfield So when we've merged this we iirc talked about this issue and decided that the shrinker should be good enough in cleaning up the crap from shared objects. Not a pretty solution, but it should have worked. Is this again the lowmemory killer wreaking havoc with our i915 shrinker, or is there something else going on? And do you have some igt testcase for this? If sharing is all that's required the following should do the trick: 1. allocate obj 2. create new context 3. do dummy execbuf with that obj to map it into the ppgtt 4. free context 5. goto 2 often enough to OOM The shrinker should eventually kick in and clean up, but maybe I'm wrong about that ... One thing I've thought of is that if the shared object is pinned as a scanout buffer then the shrinker won't touch it. But not sure whether you can actually do this to end up with a stable leak - after each pageflip the shrinker evict the vma/obj for this eventually. > --- > drivers/gpu/drm/i915/i915_gem.c | 7 +++--- > drivers/gpu/drm/i915/i915_gem_context.c | 2 +- > drivers/gpu/drm/i915/i915_gem_gtt.c | 43 > - > drivers/gpu/drm/i915/i915_gem_gtt.h | 7 ++ > 4 files changed, 54 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 60b8bd1..e509d89 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -4529,11 +4529,12 @@ void i915_gem_vma_destroy(struct i915_vma *vma) > return; > > vm = vma->vm; > + list_del(&vma->vma_link); > > - if (!i915_is_ggtt(vm)) > + if (!i915_is_ggtt(vm)) { > + list_del(&vma->vm_link); > i915_ppgtt_put(i915_vm_to_ppgtt(vm)); > - > - list_del(&vma->vma_link); > + } > > kfree(vma); > } > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c > b/drivers/gpu/drm/i915/i915_gem_context.c > index a5221d8..4319a93 100644 > --- a/drivers/gpu/drm/i915/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > @@ -140,7 +140,7 @@ void i915_gem_context_free(struct kref *ctx_ref) > if (i915.enable_execlists) > intel_lr_context_free(ctx); > > - i915_ppgtt_put(ctx->ppgtt); > + i915_ppgtt_destroy(ctx->ppgtt); > > if (ctx->legacy_hw_ctx.rcs_state) > drm_gem_object_unreference(&ctx->legacy_hw_ctx.rcs_state->base); > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c > b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 6f410cf..9ef2f67 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -1097,6 +1097,7 @@ static int __hw_ppgtt_init(struct drm_device *dev, > struct i915_hw_ppgtt *ppgtt) > else > BUG(); > } > + > int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt) > { > struct drm_i915_private *dev_priv = dev->dev_private; > @@ -1108,6 +1109,8 @@ int i915_ppgtt_init(struct drm_device *dev, struct > i915_hw_ppgtt *ppgtt) > drm_mm_init(&ppgtt->base.mm, 
ppgtt->base.start, > ppgtt->base.total); > i915_init_vm(dev_priv, &ppgtt->base); > + > + INIT_LIST_HEAD(&ppgtt->vma_list); > } > > return ret; > @@ -1177,14 +1180,49 @@ void i915_ppgtt_release(struct kref *kref) > /* vmas should already be unbound */ > WARN_ON(!list_empty(&ppgtt->base.active_list)); > WARN_ON(!list_empty(&ppgtt->base.inactive_list)); > + WARN_ON(!list_empty(&ppgtt->vma_list)); > > list_del(&ppgtt->base.global_link); > drm_mm_takedown(&ppgtt->base.mm); > > ppgtt->base.cleanup(&ppgtt->base); > + > kfree(ppgtt); > } > > +void > +i915_ppgtt_destroy(struct i915_hw_ppgtt *ppgtt) This is misnamed since what it really does is unbind everything and then drop the reference. Unbinding everything is already implemented as i915_gem_evict_vm. Also unconditionally evicting the entire vm upon context destruction means we essentially assume a 1:1 link between ctx and ppgtt. Which is currently true but will change with the interfaces planned for buffered svm. I think we need a new ctx_use_count besides the pointer refcount that we inc/dec on context d
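The reproduction loop Daniel sketches above could be turned into a small standalone program along the following lines: one long-lived ("shared") bo gets mapped into a fresh ppgtt via a dummy execbuf, the context is destroyed, and the loop repeats until memory pressure shows whether the per-context ppgtt allocations are ever reclaimed. This is only an illustration of that idea, built against the libdrm headers; the device path, the missing error handling and the fixed iteration count are all simplifications.

#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <drm.h>
#include <i915_drm.h>

static uint32_t bo_create(int fd, uint64_t size)
{
	struct drm_i915_gem_create create = { .size = size };

	ioctl(fd, DRM_IOCTL_I915_GEM_CREATE, &create);
	return create.handle;
}

int main(void)
{
	const uint32_t bbe[2] = { 0xA << 23, 0 }; /* MI_BATCH_BUFFER_END */
	int fd = open("/dev/dri/card0", O_RDWR);	/* assumed node */
	uint32_t shared = bo_create(fd, 4096);
	uint32_t batch = bo_create(fd, 4096);
	struct drm_i915_gem_pwrite pwrite = {
		.handle = batch,
		.size = sizeof(bbe),
		.data_ptr = (uintptr_t)bbe,
	};
	unsigned int i;

	ioctl(fd, DRM_IOCTL_I915_GEM_PWRITE, &pwrite);

	for (i = 0; i < 100000; i++) {
		struct drm_i915_gem_context_create ctx_create = {};
		struct drm_i915_gem_context_destroy ctx_destroy = {};
		/* the batch must be last; the shared bo just rides along */
		struct drm_i915_gem_exec_object2 obj[2] = {
			{ .handle = shared },
			{ .handle = batch },
		};
		struct drm_i915_gem_execbuffer2 execbuf = {
			.buffers_ptr = (uintptr_t)obj,
			.buffer_count = 2,
			.batch_len = sizeof(bbe),
			.flags = I915_EXEC_RENDER,
		};

		ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &ctx_create);
		execbuf.rsvd1 = ctx_create.ctx_id;

		/* dummy execbuf: maps shared + batch into the new ppgtt */
		ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf);

		ctx_destroy.ctx_id = ctx_create.ctx_id;
		ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_DESTROY, &ctx_destroy);
		/*
		 * "shared" stays open, so only its vma should be keeping
		 * the dead context's ppgtt allocations alive.
		 */
	}

	close(fd);
	return 0;
}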
Re: [Intel-gfx] [RFC 2/2] drm/i915: Clean-up PPGTT on context destruction
On Thu, Feb 12, 2015 at 09:03:06PM +, Chris Wilson wrote: > On Thu, Feb 12, 2015 at 08:05:02PM +, rafael.barba...@intel.com wrote: > > From: Rafael Barbalho > > > > With full PPGTT enabled an object's VMA entry into a PPGTT VM needs to be > > cleaned up so that the PPGTT PDE & PTE allocations can be freed. > > > > This problem only shows up with full PPGTT because an object's VMA is > > only cleaned-up when the object is destroyed. However, if the object has > > been shared between multiple processes this may not happen, which leads to > > references to the PPGTT still being kept the object was shared. > > > > Under android the sharing of GEM objects is a fairly common operation, thus > > the clean-up has to be more agressive. > > Not quite. You need an active refcount as we do not expect close(fd) to > stall. The trick is to "simply" use requests to retire vma (as well as > the object management it does today, though that just becomes a second > layer for GEM API management, everything else goes through vma). Linking into the ctx unref should give us that for free since requests do hold a reference on the context. So this will only be run when the buffers are idle. Well except that our unbind code is too dense to do that correctly for shared buffers, so we need to move obj->active to vma->active first. And yeah the commit message should have explained this. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Fix an incorrect free rather than dereference issue.
On 13/02/2015 09:32, Daniel Vetter wrote:
> On Thu, Feb 12, 2015 at 12:29:21PM +, Nick Hoath wrote:
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88652
>> Signed-off-by: Nick Hoath
> Commit message is missing the absolutely crucial detail about which patch introduced this regression:
> commit 6d3d8274bc45de4babb62d64562d92af984dd238
> Author: Nick Hoath
> AuthorDate: Thu Jan 15 13:10:39 2015 +
>     drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request
> Another thing I've noticed is that we explicitly drop the context reference for the request before dropping the request reference. Without clearing the req->ctx pointer. That has a very high chance to leading to tears, imo the context unreferenceing should be pushed into i915_gem_request_free. Except that it's there already, which means we have a double unref now?

Looking at the code, it looks like that's the case.

> Also this patch is for the legacy ringbuffer code, but the referenced bug is for gen8+ execlists. We're definitely not running this code here I think.

i915_gem_reset_ring_cleanup is used in execlists in the hang recovery case.

> Imo step one is to drop all the explicit ctx refcounting for req->ctx and always rely on the implicit reference. Then see what happens.

I agree that the refcounting needs re-evaluating after the merge of execlist queue entries & requests, however I think the cleanup of the double unref/removing the refcounting should be done in another patchset. This patch is purely to fix the issue raised in 88652. Depends on the relative priorities.

> Cheers, Daniel
>> --- drivers/gpu/drm/i915/i915_drv.c | 2 +- drivers/gpu/drm/i915/i915_gem.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 1765989..dc10d86 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2661,7 +2661,7 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv, intel_lr_context_unpin(ring, submit_req->ctx); i915_gem_context_unreference(submit_req->ctx); - kfree(submit_req); + i915_gem_request_unreference(submit_req); } /* -- 2.1.1
___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] INTEL ATOM E3826 Feedback from an industrial customer.
Hi, My name is Stéphane ANCELOT, and I work in the Numalliance R&D team in France (http://www.numalliance.com). We make our own wire-bending CNC platform, Linux based, using Intel PC platforms (automation and GUI in the same PC). This may be the wrong place, but I think it is important to report my experience with Intel graphics performance when benchmarking Intel Atom platforms for use in our CNC. You may be able to forward this to the right people in the Intel group. Our application needs realtime performance to run automation tasks. This is done using realtime patches against standard Linux kernels. This means we cannot use the most recent kernels, only stabilised kernel release versions (at the time of writing, RT preempt: kernel 3.14, Xenomai API: kernel 3.16...). I used kernel 3.16.2. We are using a 19-inch vertical display at 1280x1024 resolution. We faced the following problems with the GFX driver:
a/ Console flickering at the bottom of the screen in kernel 3.16.2. The problem increased when there was CPU/disk/network activity. This problem no longer appears from the kernel 3.18.2 release. Unfortunately, in our environment we cannot use the 3.18 kernel because it is not yet ready with realtime patches.
b/ 2D performance. Poor 2D performance; it looks as if we have no 2D acceleration. The poor performance is visible when raising/lowering a fullscreen window. When moving an object in a drawing application (Inkscape), the object does not follow the mouse smoothly.
c/ 3D performance. In our application we make heavy use of 3D for CNC simulation (some screenshots available on request only). We have seen much better performance than the Atom D2550 we tried in the past. That seems a good thing.
Conclusion: Although Intel wishes to provide Atom platforms ready for industry, they are not ready yet, because we cannot change kernel release versions when validating a product. This requirement should be considered. In the same way, we cannot change the PC platform every year, because of processor obsolescence. In our case, we depend on the realtime Ethernet driver, realtime patches, and 2D and 3D graphics performance. We also think that since Atom platforms are not as widespread and common as desktop platforms, the BayTrail drivers are not yet as efficient. I am sure they will be... but in maybe 2 years... For these reasons, we will stop benchmarking Atom platforms and will benchmark Core iX platforms instead, since we think those GFX chipsets are better supported by the drivers. Am I right? I am an open-minded guy, so feel free to give your positive or negative opinion! ;-) I can give more details if needed. Have a look at what we are doing with an Intel platform: https://www.youtube.com/watch?v=wj30CeAFwuk Regards Stephane ANCELOT sance...@numalliance.com ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC 2/2] drm/i915: Clean-up PPGTT on context destruction
On Fri, Feb 13, 2015 at 10:51:36AM +0100, Daniel Vetter wrote: > On Thu, Feb 12, 2015 at 08:05:02PM +, rafael.barba...@intel.com wrote: > > From: Rafael Barbalho > > > > With full PPGTT enabled an object's VMA entry into a PPGTT VM needs to be > > cleaned up so that the PPGTT PDE & PTE allocations can be freed. > > > > This problem only shows up with full PPGTT because an object's VMA is > > only cleaned-up when the object is destroyed. However, if the object has > > been shared between multiple processes this may not happen, which leads to > > references to the PPGTT still being kept the object was shared. > > > > Under android the sharing of GEM objects is a fairly common operation, thus > > the clean-up has to be more agressive. > > > > Signed-off-by: Rafael Barbalho > > Cc: Daniel Vetter > > Cc: Jon Bloomfield > > So when we've merged this we iirc talked about this issue and decided that > the shrinker should be good enough in cleaning up the crap from shared > objects. Not a pretty solution, but it should have worked. > > Is this again the lowmemory killer wreaking havoc with our i915 shrinker, > or is there something else going on? And do you have some igt testcase for > this? If sharing is all that's required the following should do the trick: > 1. allocate obj > 2. create new context > 3. do dummy execbuf with that obj to map it into the ppgtt > 4. free context > 5. goto 2 often enough to OOM You know I have patches to fix all of this... It just happens to fall out of tracking vma in requests, and by extension vm. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 0/6] eDP DRRS based on frontbuffer tracking
This series includes a preparation patch for DRRS support across different platforms in intel_dp_set_m_n, along with the last 5 pending patches of the v3 eDP DRRS patch series. The new series is submitted to make the review more comfortable and to show the dependency between the patches explicitly. Durgadoss R (1): drm/i915: Enable eDP DRRS for CHV Ramalingam C (1): drm/i915: Add support for DRRS in intel_dp_set_m_n Vandana Kannan (4): drm/i915/bdw: Add support for DRRS to switch RR drm/i915: Support for RR switching on VLV Documentation/drm: DocBook integration for DRRS drm/i915: Add debugfs entry for DRRS Documentation/DocBook/drm.tmpl | 11 drivers/gpu/drm/i915/i915_debugfs.c | 99 drivers/gpu/drm/i915/i915_reg.h | 1 + drivers/gpu/drm/i915/intel_display.c | 32 ++--- drivers/gpu/drm/i915/intel_dp.c | 121 -- drivers/gpu/drm/i915/intel_drv.h | 22 ++- 6 files changed, 273 insertions(+), 13 deletions(-) -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/6] drm/i915: Add support for DRRS in intel_dp_set_m_n
Till Gen 7 we have two sets of M_N registers, but Gen 8 onwards we have only one M_N register set. To support DRRS on both scenarios a input parameter to intel_dp_set_m_n is added. In case of DRRS, When platform provides two set of M_N registers for dp, we can program them with two different dividers and switch between them. But when only one such register set is provided, we have to program the required divider M_N value on that registers itself. Two enum members M1_N1 and M2_N2 are defined to represent the above scenarios. M1_N1: Program dp_m_n on M1_N1 registers dp_m2_n2 on M2_N2 registers (If supported) M2_N2: Program dp_m2_n2 on M1_N1 registers M2_N2 registers are not supported Signed-off-by: Ramalingam C --- drivers/gpu/drm/i915/intel_display.c | 30 +++--- drivers/gpu/drm/i915/intel_drv.h | 22 +- 2 files changed, 44 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 3b0fe9f..2af24a7 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -4322,7 +4322,7 @@ static void ironlake_crtc_enable(struct drm_crtc *crtc) intel_prepare_shared_dpll(intel_crtc); if (intel_crtc->config->has_dp_encoder) - intel_dp_set_m_n(intel_crtc); + intel_dp_set_m_n(intel_crtc, M1_N1); intel_set_pipe_timings(intel_crtc); @@ -4430,7 +4430,7 @@ static void haswell_crtc_enable(struct drm_crtc *crtc) intel_enable_shared_dpll(intel_crtc); if (intel_crtc->config->has_dp_encoder) - intel_dp_set_m_n(intel_crtc); + intel_dp_set_m_n(intel_crtc, M1_N1); intel_set_pipe_timings(intel_crtc); @@ -5044,7 +5044,7 @@ static void valleyview_crtc_enable(struct drm_crtc *crtc) } if (intel_crtc->config->has_dp_encoder) - intel_dp_set_m_n(intel_crtc); + intel_dp_set_m_n(intel_crtc, M1_N1); intel_set_pipe_timings(intel_crtc); @@ -5120,7 +5120,7 @@ static void i9xx_crtc_enable(struct drm_crtc *crtc) i9xx_set_pll_dividers(intel_crtc); if (intel_crtc->config->has_dp_encoder) - intel_dp_set_m_n(intel_crtc); + intel_dp_set_m_n(intel_crtc, M1_N1); intel_set_pipe_timings(intel_crtc); @@ -5895,13 +5895,29 @@ static void intel_cpu_transcoder_set_m_n(struct intel_crtc *crtc, } } -void intel_dp_set_m_n(struct intel_crtc *crtc) +void intel_dp_set_m_n(struct intel_crtc *crtc, enum link_m_n_set m_n) { + struct intel_link_m_n *dp_m_n, *dp_m2_n2 = NULL; + + if (m_n == M1_N1) { + dp_m_n = &crtc->config->dp_m_n; + dp_m2_n2 = &crtc->config->dp_m2_n2; + } else if (m_n == M2_N2) { + + /* +* M2_N2 registers are not supported. Hence m2_n2 divider value +* needs to be programmed into M1_N1. +*/ + dp_m_n = &crtc->config->dp_m2_n2; + } else { + DRM_ERROR("Unsupported divider value\n"); + return; + } + if (crtc->config->has_pch_encoder) intel_pch_transcoder_set_m_n(crtc, &crtc->config->dp_m_n); else - intel_cpu_transcoder_set_m_n(crtc, &crtc->config->dp_m_n, - &crtc->config->dp_m2_n2); + intel_cpu_transcoder_set_m_n(crtc, dp_m_n, dp_m2_n2); } static void vlv_update_pll(struct intel_crtc *crtc, diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 1de8e20..1fb1529 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -593,6 +593,26 @@ struct intel_hdmi { struct intel_dp_mst_encoder; #define DP_MAX_DOWNSTREAM_PORTS0x10 +/* + * enum link_m_n_set: + * When platform provides two set of M_N registers for dp, we can + * program them and switch between them incase of DRRS. 
+ * But When only one such register is provided, we have to program the + * required divider value on that registers itself based on the DRRS state. + * + * M1_N1 : Program dp_m_n on M1_N1 registers + * dp_m2_n2 on M2_N2 registers (If supported) + * + * M2_N2 : Program dp_m2_n2 on M1_N1 registers + * M2_N2 registers are not supported + */ + +enum link_m_n_set { + /* Sets the m1_n1 and m2_n2 */ + M1_N1 = 0, + M2_N2 +}; + struct intel_dp { uint32_t output_reg; uint32_t aux_ch_ctl_reg; @@ -994,7 +1014,7 @@ void hsw_enable_pc8(struct drm_i915_private *dev_priv); void hsw_disable_pc8(struct drm_i915_private *dev_priv); void intel_dp_get_m_n(struct intel_crtc *crtc, struct intel_crtc_state *pipe_config); -void intel_dp_set_m_n(struct intel_crtc
[Intel-gfx] [PATCH 3/6] drm/i915: Support for RR switching on VLV
From: Vandana Kannan Definition of VLV RR switch bit and corresponding toggling in set_drrs function. Signed-off-by: Vandana Kannan Signed-off-by: Uma Shankar Reviewed-by: Jani Nikula Reviewed-by: Rodrigo Vivi --- drivers/gpu/drm/i915/i915_reg.h |1 + drivers/gpu/drm/i915/intel_dp.c | 10 -- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 39bdbf9..944f788 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -3880,6 +3880,7 @@ enum skl_disp_power_wells { #define PIPECONF_INTERLACE_MODE_MASK (7 << 21) #define PIPECONF_EDP_RR_MODE_SWITCH (1 << 20) #define PIPECONF_CXSR_DOWNCLOCK (1<<16) +#define PIPECONF_EDP_RR_MODE_SWITCH_VLV (1 << 14) #define PIPECONF_COLOR_RANGE_SELECT (1 << 13) #define PIPECONF_BPC_MASK(0x7 << 5) #define PIPECONF_8BPC(0<<5) diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index 6ffbf57..9f3da8f 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -4810,9 +4810,15 @@ static void intel_dp_set_drrs_state(struct drm_device *dev, int refresh_rate) val = I915_READ(reg); if (index > DRRS_HIGH_RR) { - val |= PIPECONF_EDP_RR_MODE_SWITCH; + if (IS_VALLEYVIEW(dev)) + val |= PIPECONF_EDP_RR_MODE_SWITCH_VLV; + else + val |= PIPECONF_EDP_RR_MODE_SWITCH; } else { - val &= ~PIPECONF_EDP_RR_MODE_SWITCH; + if (IS_VALLEYVIEW(dev)) + val &= ~PIPECONF_EDP_RR_MODE_SWITCH_VLV; + else + val &= ~PIPECONF_EDP_RR_MODE_SWITCH; } I915_WRITE(reg, val); } -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 4/6] drm/i915: Enable eDP DRRS for CHV
From: Durgadoss R This patch enables eDP DRRS for CHV by adding the required IS_CHERRYVIEW() checks. CHV uses the same register bit as VLV. [Vandana]: Since CHV has 2 sets of M_N registers, it will follow the same code path as gen < 8. Added CHV check in dp_set_m_n() [Ram]: Rebased on top of previous patch modifications Signed-off-by: Durgadoss R Signed-off-by: Vandana Kannan Signed-off-by: Ramalingam C --- drivers/gpu/drm/i915/intel_display.c |2 +- drivers/gpu/drm/i915/intel_dp.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 2af24a7..6548524 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -5879,7 +5879,7 @@ static void intel_cpu_transcoder_set_m_n(struct intel_crtc *crtc, * for gen < 8) and if DRRS is supported (to make sure the * registers are not unnecessarily accessed). */ - if (m2_n2 && INTEL_INFO(dev)->gen < 8 && + if (m2_n2 && (IS_CHERRYVIEW(dev) || INTEL_INFO(dev)->gen < 8) && crtc->config->has_drrs) { I915_WRITE(PIPE_DATA_M2(transcoder), TU_SIZE(m2_n2->tu) | m2_n2->gmch_m); diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index 9f3da8f..dfbe97d 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -4793,7 +4793,7 @@ static void intel_dp_set_drrs_state(struct drm_device *dev, int refresh_rate) return; } - if (INTEL_INFO(dev)->gen >= 8) { + if (INTEL_INFO(dev)->gen >= 8 && !IS_CHERRYVIEW(dev)) { switch (index) { case DRRS_HIGH_RR: intel_dp_set_m_n(intel_crtc, M1_N1); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 6/6] drm/i915: Add debugfs entry for DRRS
From: Vandana Kannan Adding a debugfs entry to determine if DRRS is supported or not V2: [By Ram]: Following details about the active crtc will be filled in seq-file of the debugfs 1. Encoder output type 2. DRRS Support on this CRTC 3. DRRS current state 4. Current Vrefresh Format is as follows: CRTC 1: Output: eDP, DRRS Supported: Yes (Seamless), DRRS_State: DRRS_HIGH_RR, Vrefresh: 60 CRTC 2: Output: HDMI, DRRS Supported : No, VBT DRRS_type: Seamless CRTC 1: Output: eDP, DRRS Supported: Yes (Seamless), DRRS_State: DRRS_LOW_RR, Vrefresh: 40 CRTC 2: Output: HDMI, DRRS Supported : No, VBT DRRS_type: Seamless V3: [By Ram]: Readability is improved. Another error case is covered [Daniel] V4: [By Ram]: Current status of the Idleness DRRS along with the Front buffer bits are added to the debugfs. [Rodrigo] Signed-off-by: Vandana Kannan Signed-off-by: Ramalingam C --- drivers/gpu/drm/i915/i915_debugfs.c | 99 +++ 1 file changed, 99 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 164fa82..e08d63f 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2869,6 +2869,104 @@ static int i915_ddb_info(struct seq_file *m, void *unused) return 0; } +static void drrs_status_per_crtc(struct seq_file *m, + struct drm_device *dev, struct intel_crtc *intel_crtc) +{ + struct intel_encoder *intel_encoder; + struct drm_i915_private *dev_priv = dev->dev_private; + struct i915_drrs *drrs = &dev_priv->drrs; + int vrefresh = 0; + + for_each_encoder_on_crtc(dev, &intel_crtc->base, intel_encoder) { + /* Encoder connected on this CRTC */ + switch (intel_encoder->type) { + case INTEL_OUTPUT_EDP: + seq_puts(m, "Output: eDP, "); + break; + case INTEL_OUTPUT_DSI: + seq_puts(m, "Output: DSI, "); + break; + case INTEL_OUTPUT_HDMI: + seq_puts(m, "Output: HDMI, "); + break; + case INTEL_OUTPUT_DISPLAYPORT: + seq_puts(m, "Output: DP, "); + break; + default: + seq_printf(m, "Output: Others (id=%d), ", + intel_encoder->type); + } + } + + if (intel_crtc->config->has_drrs) { + struct intel_panel *panel; + + panel = &drrs->dp->attached_connector->panel; + /* DRRS Supported */ + seq_puts(m, "DRRS Supported: Yes (Seamless), "); + seq_printf(m, "busy_frontbuffer_bits: 0x%X,\n\t", + drrs->busy_frontbuffer_bits); + + if (drrs->busy_frontbuffer_bits) { + seq_puts(m, "Front buffer: busy, "); + seq_puts(m, "Idleness Timer: Suspended, "); + } else { + seq_puts(m, "Front buffer: Idle, "); + if (drrs->refresh_rate_type == DRRS_HIGH_RR) + seq_puts(m, "Idleness Timer: Ticking, "); + else + seq_puts(m, "Idleness Timer: Suspended, "); + } + + if (drrs->refresh_rate_type == DRRS_HIGH_RR) { + seq_puts(m, "DRRS_State: DRRS_HIGH_RR, "); + vrefresh = panel->fixed_mode->vrefresh; + } else if (drrs->refresh_rate_type == DRRS_LOW_RR) { + seq_puts(m, "DRRS_State: DRRS_LOW_RR, "); + vrefresh = panel->downclock_mode->vrefresh; + } else { + seq_printf(m, "DRRS_State: Unknown(%d), ", + drrs->refresh_rate_type); + } + seq_printf(m, "Vrefresh: %d", vrefresh); + + } else { + /* DRRS not supported. 
Print the VBT parameter*/ + seq_puts(m, "DRRS Supported : No, "); + if (dev_priv->vbt.drrs_type == STATIC_DRRS_SUPPORT) + seq_puts(m, "VBT DRRS_type: Static"); + else if (dev_priv->vbt.drrs_type == SEAMLESS_DRRS_SUPPORT) + seq_puts(m, "VBT DRRS_type: Seamless"); + else if (dev_priv->vbt.drrs_type == DRRS_NOT_SUPPORTED) + seq_puts(m, "VBT DRRS_type: None"); + else + seq_puts(m, "VBT DRRS_type: Unrecognized Value"); + } + seq_puts(m, "\n"); +} + +static int i915_drrs_status(struct seq_file *m, void *unused) +{ + struct drm_info_node *node = m->private; + struct drm_device *dev = node->minor->dev; + struct intel_crtc *intel_crtc; +
[Intel-gfx] [PATCH 2/6] drm/i915/bdw: Add support for DRRS to switch RR
From: Vandana Kannan For Broadwell, there is one instance of Transcoder MN values per transcoder. For dynamic switching between multiple refreshr rates, M/N values may be reprogrammed on the fly. Link N programming triggers update of all data and link M & N registers and the new M/N values will be used in the next frame that is output. V2: [By Ram]: intel_dp_set_m_n() is rewritten to accommodate gen >= 8 [Rodrigo] V3: Coding style correction [Ram] V4: [By Ram] intel_dp_set_m_n modifications are moved into a separate patch, retaining only DRRS related changes here [Rodrigo] Signed-off-by: Vandana Kannan Signed-off-by: Pradeep Bhat Signed-off-by: Ramalingam C --- drivers/gpu/drm/i915/intel_dp.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index 868a07b..6ffbf57 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -4793,12 +4793,24 @@ static void intel_dp_set_drrs_state(struct drm_device *dev, int refresh_rate) return; } - if (INTEL_INFO(dev)->gen > 6 && INTEL_INFO(dev)->gen < 8) { + if (INTEL_INFO(dev)->gen >= 8) { + switch (index) { + case DRRS_HIGH_RR: + intel_dp_set_m_n(intel_crtc, M1_N1); + break; + case DRRS_LOW_RR: + intel_dp_set_m_n(intel_crtc, M2_N2); + break; + case DRRS_MAX_RR: + default: + DRM_ERROR("Unsupported refreshrate type\n"); + } + } else if (INTEL_INFO(dev)->gen > 6) { reg = PIPECONF(intel_crtc->config->cpu_transcoder); val = I915_READ(reg); + if (index > DRRS_HIGH_RR) { val |= PIPECONF_EDP_RR_MODE_SWITCH; - intel_dp_set_m_n(intel_crtc); } else { val &= ~PIPECONF_EDP_RR_MODE_SWITCH; } -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 5/6] Documentation/drm: DocBook integration for DRRS
From: Vandana Kannan Adding an overview of DRRS in general and the implementation for eDP DRRS. Also, describing the functions related to eDP DRRS. Signed-off-by: Vandana Kannan Reviewed-by: Rodrigo Vivi --- Documentation/DocBook/drm.tmpl | 11 + drivers/gpu/drm/i915/intel_dp.c | 95 +++ 2 files changed, 106 insertions(+) diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl index 249f0c9..7a45775 100644 --- a/Documentation/DocBook/drm.tmpl +++ b/Documentation/DocBook/drm.tmpl @@ -4053,6 +4053,17 @@ int num_ioctls; !Idrivers/gpu/drm/i915/intel_fbc.c +Display Refresh Rate Switching (DRRS) +!Pdrivers/gpu/drm/i915/intel_dp.c Display Refresh Rate Switching (DRRS) +!Fdrivers/gpu/drm/i915/intel_dp.c intel_dp_set_drrs_state +!Fdrivers/gpu/drm/i915/intel_dp.c intel_edp_drrs_enable +!Fdrivers/gpu/drm/i915/intel_dp.c intel_edp_drrs_disable +!Fdrivers/gpu/drm/i915/intel_dp.c intel_edp_drrs_invalidate +!Fdrivers/gpu/drm/i915/intel_dp.c intel_edp_drrs_flush +!Fdrivers/gpu/drm/i915/intel_dp.c intel_dp_drrs_init + + + DPIO !Pdrivers/gpu/drm/i915/i915_reg.h DPIO diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index dfbe97d..e9862e7 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -4736,6 +4736,18 @@ intel_dp_init_panel_power_sequencer_registers(struct drm_device *dev, I915_READ(pp_div_reg)); } +/** + * intel_dp_set_drrs_state - program registers for RR switch to take effect + * @dev: DRM device + * @refresh_rate: RR to be programmed + * + * This function gets called when refresh rate (RR) has to be changed from + * one frequency to another. Switches can be between high and low RR + * supported by the panel or to any other RR based on media playback (in + * this case, RR value needs to be passed from user space). + * + * The caller of this function needs to take a lock on dev_priv->drrs. + */ static void intel_dp_set_drrs_state(struct drm_device *dev, int refresh_rate) { struct drm_i915_private *dev_priv = dev->dev_private; @@ -4828,6 +4840,12 @@ static void intel_dp_set_drrs_state(struct drm_device *dev, int refresh_rate) DRM_DEBUG_KMS("eDP Refresh Rate set to : %dHz\n", refresh_rate); } +/** + * intel_edp_drrs_enable - init drrs struct if supported + * @intel_dp: DP struct + * + * Initializes frontbuffer_bits and drrs.dp + */ void intel_edp_drrs_enable(struct intel_dp *intel_dp) { struct drm_device *dev = intel_dp_to_dev(intel_dp); @@ -4855,6 +4873,11 @@ unlock: mutex_unlock(&dev_priv->drrs.mutex); } +/** + * intel_edp_drrs_disable - Disable DRRS + * @intel_dp: DP struct + * + */ void intel_edp_drrs_disable(struct intel_dp *intel_dp) { struct drm_device *dev = intel_dp_to_dev(intel_dp); @@ -4914,6 +4937,17 @@ unlock: mutex_unlock(&dev_priv->drrs.mutex); } +/** + * intel_edp_drrs_invalidate - Invalidate DRRS + * @dev: DRM device + * @frontbuffer_bits: frontbuffer plane tracking bits + * + * When there is a disturbance on screen (due to cursor movement/time + * update etc), DRRS needs to be invalidated, i.e. need to switch to + * high RR. + * + * Dirty frontbuffers relevant to DRRS are tracked in busy_frontbuffer_bits. + */ void intel_edp_drrs_invalidate(struct drm_device *dev, unsigned frontbuffer_bits) { @@ -4941,6 +4975,17 @@ void intel_edp_drrs_invalidate(struct drm_device *dev, mutex_unlock(&dev_priv->drrs.mutex); } +/** + * intel_edp_drrs_flush - Flush DRRS + * @dev: DRM device + * @frontbuffer_bits: frontbuffer plane tracking bits + * + * When there is no movement on screen, DRRS work can be scheduled. 
+ * This DRRS work is responsible for setting relevant registers after a + * timeout of 1 second. + * + * Dirty frontbuffers relevant to DRRS are tracked in busy_frontbuffer_bits. + */ void intel_edp_drrs_flush(struct drm_device *dev, unsigned frontbuffer_bits) { @@ -4965,6 +5010,56 @@ void intel_edp_drrs_flush(struct drm_device *dev, unsigned frontbuffer_bits) mutex_unlock(&dev_priv->drrs.mutex); } +/** + * DOC: Display Refresh Rate Switching (DRRS) + * + * Display Refresh Rate Switching (DRRS) is a power conservation feature + * which enables switching between low and high refresh rates, + * dynamically, based on the usage scenario. This feature is applicable + * for internal panels. + * + * Indication that the panel supports DRRS is given by the panel EDID, which + * would list multiple refresh rates for one resolution. + * + * DRRS is of 2 types - static and seamless. + * Static DRRS involves changing refresh rate (RR) by doing a full modeset + * (may appear as a blink on screen) and is used in dock-undock scenario. + * Seamless DRRS involves changing RR without any visual effect to the user + * and can be used during normal system usage. This is done by programming + * certain registers. + * + * Supp
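[Editor's note: the call sites for the invalidate/flush pair documented above live in the frontbuffer tracking code rather than in this patch, so the following is only an illustrative sketch of how they are meant to be driven; the caller function names are made up:]

/* When a plane feeding an eDP pipe is dirtied: bail out to the high RR. */
static void example_frontbuffer_dirtied(struct drm_device *dev,
					unsigned frontbuffer_bits)
{
	intel_edp_drrs_invalidate(dev, frontbuffer_bits);
}

/* When the rendering has been flushed out: re-arm the idleness timer
 * so DRRS may downclock again after the 1 second timeout. */
static void example_frontbuffer_flushed(struct drm_device *dev,
					unsigned frontbuffer_bits)
{
	intel_edp_drrs_flush(dev, frontbuffer_bits);
}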
Re: [Intel-gfx] [RFC 2/2] drm/i915: Clean-up PPGTT on context destruction
On Fri, Feb 13, 2015 at 10:55:46AM +0100, Daniel Vetter wrote: > On Thu, Feb 12, 2015 at 09:03:06PM +, Chris Wilson wrote: > > On Thu, Feb 12, 2015 at 08:05:02PM +, rafael.barba...@intel.com wrote: > > > From: Rafael Barbalho > > > > > > With full PPGTT enabled an object's VMA entry into a PPGTT VM needs to be > > > cleaned up so that the PPGTT PDE & PTE allocations can be freed. > > > > > > This problem only shows up with full PPGTT because an object's VMA is > > > only cleaned-up when the object is destroyed. However, if the object has > > > been shared between multiple processes this may not happen, which leads to > > > references to the PPGTT still being kept the object was shared. > > > > > > Under android the sharing of GEM objects is a fairly common operation, > > > thus > > > the clean-up has to be more agressive. > > > > Not quite. You need an active refcount as we do not expect close(fd) to > > stall. The trick is to "simply" use requests to retire vma (as well as > > the object management it does today, though that just becomes a second > > layer for GEM API management, everything else goes through vma). > > Linking into the ctx unref should give us that for free since requests do > hold a reference on the context. So this will only be run when the buffers > are idle. > > Well except that our unbind code is too dense to do that correctly for > shared buffers, so we need to move obj->active to vma->active first. We unbind vma, so what do you mean? This is how I forsee the code: static int context_idr_cleanup(int id, void *p, void *data) { struct intel_context *ctx = p; if (ctx->ppgtt && !i915_gem_context_is_default(ctx)) { struct list_head *list; struct i915_vma *vma; /* Decouple the remaining vma to keep the next lookup fast */ list = &ctx->ppgtt->base.vma_list; while (!list_empty(list)) { vma = list_first_entry(list, typeof(*vma), vm_link); list_del_init(&vma->vm_link); list_del_init(&vma->obj_link); i915_vma_put(vma); } /* Drop active references to this vm upon retire */ ctx->ppgtt->base.closed = true; /* Drop all inactive references (via vma->vm reference) */ list = &ctx->ppgtt->base.inactive_list; while (!list_empty(list)) { struct drm_i915_gem_object *obj; int ret; vma = list_first_entry(list, typeof(*vma), mm_list); obj = vma->obj; drm_gem_object_reference(&obj->base); ret = i915_vma_unbind(vma); drm_gem_object_unreference(&obj->base); if (WARN_ON(ret)) break; } } ctx->file_priv = NULL; i915_gem_context_unreference(ctx); return 0; } -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Add process identifier to requests
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang...@intel.com)
Task id: 5766

-Summary-

Platform    Delta    drm-intel-nightly    Series Applied
PNV                  282/282              282/282
ILK                  313/313              313/313
SNB                  309/323              309/323
IVB                  380/380              380/380
BYT                  296/296              296/296
HSW         -1       425/425              424/425
BDW         -1       318/318              317/318

-Detailed-

Platform    Test                            drm-intel-nightly    Series Applied
*HSW        igt_gem_storedw_loop_vebox      PASS(2)              DMESG_WARN(1)PASS(1)
*BDW        igt_gem_gtt_hog                 PASS(8)              DMESG_WARN(1)PASS(1)

Note: You need to pay more attention to line start with '*'

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/4] drm/irq: Add drm_crtc_vblank_reset
On pe, 2015-02-13 at 08:44 +0100, Daniel Vetter wrote: > On Thu, Feb 12, 2015 at 11:56:50PM +0200, Imre Deak wrote: > > On Tue, 2015-02-03 at 11:30 +0100, Daniel Vetter wrote: > > > At driver load we need to tell the vblank code about the state of the > > > pipes, so that the logic around reject vblank_get when the pipe is off > > > works correctly. > > > > > > Thus far i915 used drm_vblank_off, but one of the side-effects of it > > > is that it also saves the vblank counter. And for that it calls down > > > into the ->get_vblank_counter hook. Which isn't really a good idea > > > when the pipe is off for a few reasons: > > > - With runtime pm the register might not respond. > > > - If the pipe is off some datastructures might not be around or > > > unitialized. > > > > > > The later is what blew up on gen3: We look at intel_crtc->config to > > > compute the vblank counter, and for a disabled pipe at boot-up that's > > > just not there. Thus far this was papered over by a check for > > > intel_crtc->active, but I want to get rid of that (since it's fairly > > > race, vblank hooks are called from all kinds of places). > > > > > > So prep for that by adding a _reset functions which only does what we > > > really need to be done at driver load: Mark the vblank pipe as off, > > > but don't do any vblank counter saving or event flushing - neither of > > > that is required. > > > > > > Cc: Laurent Pinchart > > > Signed-off-by: Daniel Vetter > > > --- > > > drivers/gpu/drm/drm_irq.c| 32 > > > > > > drivers/gpu/drm/i915/intel_display.c | 4 ++-- > > > include/drm/drmP.h | 1 + > > > 3 files changed, 35 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c > > > index 75647e7f012b..1e5fb1b994d7 100644 > > > --- a/drivers/gpu/drm/drm_irq.c > > > +++ b/drivers/gpu/drm/drm_irq.c > > > @@ -1226,6 +1226,38 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc) > > > EXPORT_SYMBOL(drm_crtc_vblank_off); > > > > > > /** > > > + * drm_crtc_vblank_reset - reset vblank state to off on a CRTC > > > + * @crtc: CRTC in question > > > + * > > > + * Drivers can use this function to reset the vblank state to off at > > > load time. > > > + * Drivers should use this together with the drm_crtc_vblank_off() and > > > + * drm_crtc_vblank_on() functions. The diffrence comparet to > > > + * drm_crtc_vblank_off() is that this function doesn't save the vblank > > > counter > > > + * and hence doesn't need to call any driver hooks. > > > + */ > > > +void drm_crtc_vblank_reset(struct drm_crtc *drm_crtc) > > > +{ > > > + struct drm_device *dev = drm_crtc->dev; > > > + unsigned long irqflags; > > > + int crtc = drm_crtc_index(drm_crtc); > > > + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; > > > + > > > + spin_lock_irqsave(&dev->vbl_lock, irqflags); > > > + /* > > > + * Prevent subsequent drm_vblank_get() from enabling the vblank > > > + * interrupt by bumping the refcount. 
> > > + */ > > > + if (!vblank->inmodeset) { > > > + atomic_inc(&vblank->refcount); > > > + vblank->inmodeset = 1; > > > + } > > > + spin_unlock_irqrestore(&dev->vbl_lock, irqflags); > > > + > > > + WARN_ON(!list_empty(&dev->vblank_event_list)); > > > +} > > > +EXPORT_SYMBOL(drm_crtc_vblank_reset); > > > + > > > +/** > > > * drm_vblank_on - enable vblank events on a CRTC > > > * @dev: DRM device > > > * @crtc: CRTC in question > > > diff --git a/drivers/gpu/drm/i915/intel_display.c > > > b/drivers/gpu/drm/i915/intel_display.c > > > index 423ef959264d..f8871a184747 100644 > > > --- a/drivers/gpu/drm/i915/intel_display.c > > > +++ b/drivers/gpu/drm/i915/intel_display.c > > > @@ -13296,9 +13296,9 @@ static void intel_sanitize_crtc(struct intel_crtc > > > *crtc) > > > /* restore vblank interrupts to correct state */ > > > if (crtc->active) { > > > update_scanline_offset(crtc); > > > - drm_vblank_on(dev, crtc->pipe); > > > + drm_crtc_vblank_on(&crtc->base); > > > } else > > > - drm_vblank_off(dev, crtc->pipe); > > > + drm_crtc_vblank_reset(&crtc->base); > > > > Since DRM_IOCTL_WAIT_VBLANK is an unlocked ioctl it could trigger the > > WARN in drm_crtc_vblank_reset() if the ioctl is called during driver > > loading. I know it's a corner case and that probably other ioctls are > > already broken in this regard, but we could try not to make things > > worse. One way to that would be to call drm_crtc_vblank_reset() > > unconditionally as Ville suggested, but before enabling irqs. > > You can't open the drm file until driver load completes, drm_global_mutex > ensures that. Oops, I didn't think about checking open too. What I wrote above can be ignored then. > Which is totally not how it's supposed to be done (correct > way is to delay registering the dev node until it's all loaded), but until > we've completely ripped out UMS we can't switch over. Agreed. Also driver u
[Intel-gfx] [PATCH 07/51] drm/i915: Early alloc request in execbuff
From: John Harrison Start of explicit request management in the execbuffer code path. This patch adds a call to allocate a request structure before all the actual hardware work is done. Thus guaranteeing that all that work is tagged by a known request. At present, nothing further is done with the request, the rest comes later in the series. The only noticable change is that failure to get a request (e.g. due to lack of memory) will be caught earlier in the sequence. It now occurs right at the start before any un-undoable work has been done. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index ca85803..61471e9 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1356,7 +1356,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, const u32 ctx_id = i915_execbuffer2_get_context_id(*args); u32 dispatch_flags; int ret; - bool need_relocs; + bool need_relocs, batch_pinned = false; if (!i915_gem_check_execbuffer(args)) return -EINVAL; @@ -1525,10 +1525,16 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, if (ret) goto err; + batch_pinned = true; params->batch_obj_vm_offset = i915_gem_obj_ggtt_offset(batch_obj); } else params->batch_obj_vm_offset = i915_gem_obj_offset(batch_obj, vm); + /* Allocate a request for this batch buffer nice and early. */ + ret = dev_priv->gt.alloc_request(ring, ctx); + if (ret) + goto err; + /* * Save assorted stuff away to pass through to *_submission(). * NB: This data should be 'persistent' and not local as it will @@ -1544,15 +1550,16 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, ret = dev_priv->gt.do_execbuf(params, args, &eb->vmas); +err: /* * FIXME: We crucially rely upon the active tracking for the (ppgtt) * batch vma for correctness. For less ugly and less fragility this * needs to be adjusted to also track the ggtt batch vma properly as * active. */ - if (dispatch_flags & I915_DISPATCH_SECURE) + if (batch_pinned) i915_gem_object_ggtt_unpin(batch_obj); -err: + /* the request owns the ref now */ i915_gem_context_unreference(ctx); eb_destroy(eb); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 16/51] drm/i915: Update i915_gpu_idle() to manage its own request
From: John Harrison Added explicit request creation and submission to the GPU idle code path. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem.c | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 00a031b..8923ecd 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3109,11 +3109,27 @@ int i915_gpu_idle(struct drm_device *dev) /* Flush everything onto the inactive list. */ for_each_ring(ring, dev_priv, i) { if (!i915.enable_execlists) { - ret = i915_switch_context(ring, ring->default_context); + struct drm_i915_gem_request *req; + + ret = dev_priv->gt.alloc_request(ring, ring->default_context, &req); if (ret) return ret; + + ret = i915_switch_context(req->ring, ring->default_context); + if (ret) { + i915_gem_request_unreference(req); + return ret; + } + + ret = i915_add_request_no_flush(req->ring); + if (ret) { + i915_gem_request_unreference(req); + return ret; + } } + WARN_ON(ring->outstanding_lazy_request); + ret = intel_ring_idle(ring); if (ret) return ret; -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 10/51] drm/i915: Update the dispatch tracepoint to use params->request
From: John Harrison Updated a couple of trace points to use the now cached request pointer rather than extracting it from the ring. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +- drivers/gpu/drm/i915/intel_lrc.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 10462f6..883cabd 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1277,7 +1277,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params, return ret; } - trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), params->dispatch_flags); + trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags); i915_gem_execbuffer_move_to_active(vmas, ring); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 325ef2c..2ab6922 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -703,7 +703,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params, if (ret) return ret; - trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), params->dispatch_flags); + trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags); i915_gem_execbuffer_move_to_active(vmas, ring); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 03/51] drm/i915: Cache ringbuf pointer in request structure
From: John Harrison In execlist mode, the ringbuf is a function of the ring and context whereas in legacy mode, it is derived from the ring alone. Thus the calculation required to determine the ringbuf pointer from the ring (and context) also needs to test execlist mode or not. This is messy. Further, the request structure holds a pointer to both the ring and the context for which it was created. Thus, given a request, it is possible to derive the ringbuf in either legacy or execlist mode. Hence it is necessary to pass just the request in to all the low level functions rather than some combination of request, ring, context and ringbuf. However, rather than recalculating it each time, it is much simpler to just cache the ringbuf pointer in the request structure itself. Caching the pointer means the calculation is done one at request creation time and all further code and simply read it directly from the request structure. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h |3 ++- drivers/gpu/drm/i915/i915_gem.c | 14 +- drivers/gpu/drm/i915/intel_lrc.c|6 -- drivers/gpu/drm/i915/intel_ringbuffer.c |1 + 4 files changed, 8 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f2a825e..e90b786 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2137,8 +2137,9 @@ struct drm_i915_gem_request { /** Position in the ringbuffer of the end of the whole request */ u32 tail; - /** Context related to this request */ + /** Context and ring buffer related to this request */ struct intel_context *ctx; + struct intel_ringbuffer *ringbuf; /** Batch buffer related to this request if any */ struct drm_i915_gem_object *batch_obj; diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index c26d36c..2ebe914 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2758,7 +2758,6 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring) while (!list_empty(&ring->request_list)) { struct drm_i915_gem_request *request; - struct intel_ringbuffer *ringbuf; request = list_first_entry(&ring->request_list, struct drm_i915_gem_request, @@ -2769,23 +2768,12 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring) trace_i915_gem_request_retire(request); - /* This is one of the few common intersection points -* between legacy ringbuffer submission and execlists: -* we need to tell them apart in order to find the correct -* ringbuffer to which the request belongs to. -*/ - if (i915.enable_execlists) { - struct intel_context *ctx = request->ctx; - ringbuf = ctx->engine[ring->id].ringbuf; - } else - ringbuf = ring->buffer; - /* We know the GPU must have read the request to have * sent us the seqno + interrupt, so use the position * of tail of the request to update the last known position * of the GPU head. */ - ringbuf->last_retired_head = request->postfix; + request->ringbuf->last_retired_head = request->postfix; i915_gem_free_request(request); } diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 73c1861..762136b 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -878,12 +878,14 @@ static int logical_ring_alloc_request(struct intel_engine_cs *ring, return ret; } - /* Hold a reference to the context this request belongs to + /* +* Hold a reference to the context this request belongs to * (we will need it when the time comes to emit/retire the -* request). +* request). 
Likewise, the ringbuff is useful to keep track of. */ request->ctx = ctx; i915_gem_context_reference(request->ctx); + request->ringbuf = ctx->engine[ring->id].ringbuf; ring->outstanding_lazy_request = request; return 0; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index d611608..ca9e7e6 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -2100,6 +2100,7 @@ intel_ring_alloc_request(struct intel_engine_cs *ring) kref_init(&request->ref); request->ring = ring; + request->ringbuf = ring->buffer; request->uniq = dev_private->request_uniq++; ret = i915_gem_get_seqno(ring->dev, &request->seqno); -- 1.7.9.5
[Intel-gfx] [PATCH 15/51] drm/i915: Update i915_gem_object_sync() to take a request structure
From: John Harrison The plan is to pass requests around as the basic submission tracking structure rather than rings and contexts. This patch updates the i915_gem_object_sync() code path. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h|2 +- drivers/gpu/drm/i915/i915_gem.c|7 --- drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +- drivers/gpu/drm/i915/intel_lrc.c |2 +- 4 files changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 5c87876..1bfb8d3 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2728,7 +2728,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj) int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); int i915_gem_object_sync(struct drm_i915_gem_object *obj, -struct intel_engine_cs *to); +struct drm_i915_gem_request *to_req); void i915_vma_move_to_active(struct i915_vma *vma, struct intel_engine_cs *ring); int i915_gem_dumb_create(struct drm_file *file_priv, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index ef561e5..00a031b 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2951,7 +2951,7 @@ out: * i915_gem_object_sync - sync an object to a ring. * * @obj: object which may be in use on another ring. - * @to: ring we wish to use the object on. May be NULL. + * @to_req: request we wish to use the object for. May be NULL. * * This code is meant to abstract object synchronization with the GPU. * Calling with NULL implies synchronizing the object with the CPU @@ -2961,8 +2961,9 @@ out: */ int i915_gem_object_sync(struct drm_i915_gem_object *obj, -struct intel_engine_cs *to) +struct drm_i915_gem_request *to_req) { + struct intel_engine_cs *to = to_req ? to_req->ring : NULL; struct intel_engine_cs *from; u32 seqno; int ret, idx; @@ -3948,7 +3949,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, if (ret) return ret; - ret = i915_gem_object_sync(obj, req->ring); + ret = i915_gem_object_sync(obj, req); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 76f6dcf..2cd0579 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -838,7 +838,7 @@ i915_gem_execbuffer_move_to_gpu(struct drm_i915_gem_request *req, list_for_each_entry(vma, vmas, exec_list) { struct drm_i915_gem_object *obj = vma->obj; - ret = i915_gem_object_sync(obj, req->ring); + ret = i915_gem_object_sync(obj, req); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 318500c..cb12eea 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -580,7 +580,7 @@ static int execlists_move_to_gpu(struct drm_i915_gem_request *req, list_for_each_entry(vma, vmas, exec_list) { struct drm_i915_gem_object *obj = vma->obj; - ret = i915_gem_object_sync(obj, req->ring); + ret = i915_gem_object_sync(obj, req); if (ret) return ret; -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 44/51] drm/i915: Update ring->sync_to() to take a request structure
From: John Harrison Updated the ring->sync_to() implementations to take a request instead of a ring. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem.c |2 +- drivers/gpu/drm/i915/intel_ringbuffer.c |6 -- drivers/gpu/drm/i915/intel_ringbuffer.h |4 ++-- 3 files changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 38f8a3b..e60ea05 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2980,7 +2980,7 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj, return ret; trace_i915_gem_ring_sync_to(from, to, obj->last_read_req); - ret = to->semaphore.sync_to(to, from, seqno); + ret = to->semaphore.sync_to(to_req, from, seqno); if (!ret) /* We use last_read_req because sync_to() * might have just caused seqno wrap under diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index cf23767..aa521c7 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1126,10 +1126,11 @@ static inline bool i915_gem_has_seqno_wrapped(struct drm_device *dev, */ static int -gen8_ring_sync(struct intel_engine_cs *waiter, +gen8_ring_sync(struct drm_i915_gem_request *waiter_req, struct intel_engine_cs *signaller, u32 seqno) { + struct intel_engine_cs *waiter = waiter_req->ring; struct drm_i915_private *dev_priv = waiter->dev->dev_private; int ret; @@ -1151,10 +1152,11 @@ gen8_ring_sync(struct intel_engine_cs *waiter, } static int -gen6_ring_sync(struct intel_engine_cs *waiter, +gen6_ring_sync(struct drm_i915_gem_request *waiter_req, struct intel_engine_cs *signaller, u32 seqno) { + struct intel_engine_cs *waiter = waiter_req->ring; u32 dw1 = MI_SEMAPHORE_MBOX | MI_SEMAPHORE_COMPARE | MI_SEMAPHORE_REGISTER; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 514ddcb..4b4fd2d 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -219,8 +219,8 @@ struct intel_engine_cs { }; /* AKA wait() */ - int (*sync_to)(struct intel_engine_cs *ring, - struct intel_engine_cs *to, + int (*sync_to)(struct drm_i915_gem_request *to_req, + struct intel_engine_cs *from, u32 seqno); int (*signal)(struct intel_engine_cs *signaller, /* num_dwords needed by caller */ -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 13/51] drm/i915: Add flag to i915_add_request() to skip the cache flush
From: John Harrison In order to explcitly track all GPU work (and completely remove the outstanding lazy request), it is necessary to add extra i915_add_request() calls to various places. Some of these do not need the implicit cache flush done as part of the standard batch buffer submission process. This patch adds a flag to _add_request() to specify whether the flush is required or not. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h |7 +-- drivers/gpu/drm/i915/i915_gem.c | 25 +++-- drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +- drivers/gpu/drm/i915/i915_gem_render_state.c |2 +- drivers/gpu/drm/i915/intel_lrc.c |2 +- 5 files changed, 19 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 21a2b35..5c87876 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2815,9 +2815,12 @@ int __must_check i915_gpu_idle(struct drm_device *dev); int __must_check i915_gem_suspend(struct drm_device *dev); int __i915_add_request(struct intel_engine_cs *ring, struct drm_file *file, - struct drm_i915_gem_object *batch_obj); + struct drm_i915_gem_object *batch_obj, + bool flush_caches); #define i915_add_request(ring) \ - __i915_add_request(ring, NULL, NULL) + __i915_add_request(ring, NULL, NULL, true) +#define i915_add_request_no_flush(ring) \ + __i915_add_request(ring, NULL, NULL, false) int __i915_wait_request(struct drm_i915_gem_request *req, unsigned reset_counter, bool interruptible, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 9546992..96f9155 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2408,7 +2408,8 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno) int __i915_add_request(struct intel_engine_cs *ring, struct drm_file *file, - struct drm_i915_gem_object *obj) + struct drm_i915_gem_object *obj, + bool flush_caches) { struct drm_i915_private *dev_priv = ring->dev->dev_private; struct drm_i915_gem_request *request; @@ -2433,12 +2434,11 @@ int __i915_add_request(struct intel_engine_cs *ring, * is that the flush _must_ happen before the next request, no matter * what. 
*/ - if (i915.enable_execlists) { - ret = logical_ring_flush_all_caches(ringbuf, request->ctx); - if (ret) - return ret; - } else { - ret = intel_ring_flush_all_caches(ring); + if (flush_caches) { + if (i915.enable_execlists) + ret = logical_ring_flush_all_caches(ringbuf, request->ctx); + else + ret = intel_ring_flush_all_caches(ring); if (ret) return ret; } @@ -2450,15 +2450,12 @@ int __i915_add_request(struct intel_engine_cs *ring, */ request->postfix = intel_ring_get_tail(ringbuf); - if (i915.enable_execlists) { + if (i915.enable_execlists) ret = ring->emit_request(ringbuf, request); - if (ret) - return ret; - } else { + else ret = ring->add_request(ring); - if (ret) - return ret; - } + if (ret) + return ret; request->head = request_start; request->tail = intel_ring_get_tail(ringbuf); diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index f7c19bc..76f6dcf 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -997,7 +997,7 @@ i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params) /* Add a breadcrumb for the completion of the batch buffer */ return __i915_add_request(params->ring, params->file, - params->batch_obj); + params->batch_obj, true); } static int diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index 521548a..aba39c3 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c @@ -173,7 +173,7 @@ int i915_gem_render_state_init(struct intel_engine_cs *ring) i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring); - ret = __i915_add_request(ring, NULL, so.obj); + ret = __i915_add_request(ring, NULL, so.obj, true); /* __i915_add_request moves object to inactive if it fails */ out: i915_gem_render_state_fini(&so); diff
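[Editor's note: for reference, this is how the two variants end up being used later in the series; the i915_gpu_idle() patch in this posting already uses the no-flush form for its housekeeping request:]

	/* normal batch completion: flush GPU caches, then emit the request */
	ret = i915_add_request(ring);

	/* housekeeping requests (idle context switch, flips, ...) where no
	 * batch has dirtied the caches can skip the flush */
	ret = i915_add_request_no_flush(ring);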
[Intel-gfx] [PATCH 41/51] drm/i915: Update ring->emit_request() to take a request structure
From: John Harrison Updated the ring->emit_request() implementation to take a request instead of a ringbuf/request pair. Also removed it's use of the OLR for obtaining the request's seqno. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem.c |2 +- drivers/gpu/drm/i915/intel_lrc.c|7 +++ drivers/gpu/drm/i915/intel_ringbuffer.h |3 +-- 3 files changed, 5 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 616b34a..38f8a3b 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2450,7 +2450,7 @@ int __i915_add_request(struct drm_i915_gem_request *request, request->postfix = intel_ring_get_tail(ringbuf); if (i915.enable_execlists) - ret = ring->emit_request(ringbuf, request); + ret = ring->emit_request(request); else ret = ring->add_request(request); if (ret) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 21bda2d..02769f8 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1292,9 +1292,9 @@ static void gen8_set_seqno(struct intel_engine_cs *ring, u32 seqno) intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno); } -static int gen8_emit_request(struct intel_ringbuffer *ringbuf, -struct drm_i915_gem_request *request) +static int gen8_emit_request(struct drm_i915_gem_request *request) { + struct intel_ringbuffer *ringbuf = request->ringbuf; struct intel_engine_cs *ring = ringbuf->ring; u32 cmd; int ret; @@ -1311,8 +1311,7 @@ static int gen8_emit_request(struct intel_ringbuffer *ringbuf, (ring->status_page.gfx_addr + (I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT))); intel_logical_ring_emit(ringbuf, 0); - intel_logical_ring_emit(ringbuf, - i915_gem_request_get_seqno(ring->outstanding_lazy_request)); + intel_logical_ring_emit(ringbuf, i915_gem_request_get_seqno(request)); intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT); intel_logical_ring_emit(ringbuf, MI_NOOP); intel_logical_ring_advance_and_submit(ringbuf, request->ctx, request); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index e33e010..ef20c49 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -233,8 +233,7 @@ struct intel_engine_cs { struct list_head execlist_retired_req_list; u8 next_context_status_buffer; u32 irq_keep_mask; /* bitmask for interrupts that should not be masked */ - int (*emit_request)(struct intel_ringbuffer *ringbuf, - struct drm_i915_gem_request *request); + int (*emit_request)(struct drm_i915_gem_request *request); int (*emit_flush)(struct drm_i915_gem_request *request, u32 invalidate_domains, u32 flush_domains); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 36/51] drm/i915: Update switch_mm() to take a request structure
From: John Harrison Updated the switch_mm() code paths to take a request instead of a ring. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_context.c |2 +- drivers/gpu/drm/i915/i915_gem_gtt.c | 20 drivers/gpu/drm/i915/i915_gem_gtt.h |2 +- 3 files changed, 14 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 9e66fac..816a442 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -591,7 +591,7 @@ static int do_switch(struct drm_i915_gem_request *req) if (to->ppgtt) { trace_switch_mm(ring, to); - ret = to->ppgtt->switch_mm(to->ppgtt, req->ring); + ret = to->ppgtt->switch_mm(to->ppgtt, req); if (ret) goto unpin_out; } diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index e7c9137..89bbc2c 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -276,9 +276,10 @@ static gen6_gtt_pte_t iris_pte_encode(dma_addr_t addr, } /* Broadwell Page Directory Pointer Descriptors */ -static int gen8_write_pdp(struct intel_engine_cs *ring, unsigned entry, - uint64_t val) +static int gen8_write_pdp(struct drm_i915_gem_request *req, unsigned entry, + uint64_t val) { + struct intel_engine_cs *ring = req->ring; int ret; BUG_ON(entry >= 4); @@ -299,7 +300,7 @@ static int gen8_write_pdp(struct intel_engine_cs *ring, unsigned entry, } static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt, - struct intel_engine_cs *ring) + struct drm_i915_gem_request *req) { int i, ret; @@ -308,7 +309,7 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt, for (i = used_pd - 1; i >= 0; i--) { dma_addr_t addr = ppgtt->pd_dma_addr[i]; - ret = gen8_write_pdp(ring, i, addr); + ret = gen8_write_pdp(req, i, addr); if (ret) return ret; } @@ -773,8 +774,9 @@ static uint32_t get_pd_offset(struct i915_hw_ppgtt *ppgtt) } static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt, -struct intel_engine_cs *ring) +struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; int ret; /* NB: TLBs must be flushed and invalidated before a switch */ @@ -798,8 +800,9 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt, } static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt, - struct intel_engine_cs *ring) + struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; int ret; /* NB: TLBs must be flushed and invalidated before a switch */ @@ -830,8 +833,9 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt, } static int gen6_mm_switch(struct i915_hw_ppgtt *ppgtt, - struct intel_engine_cs *ring) + struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; struct drm_device *dev = ppgtt->base.dev; struct drm_i915_private *dev_priv = dev->dev_private; @@ -1215,7 +1219,7 @@ int i915_ppgtt_init_ring(struct drm_i915_gem_request *req) if (!ppgtt) return 0; - return ppgtt->switch_mm(ppgtt, req->ring); + return ppgtt->switch_mm(ppgtt, req); } struct i915_hw_ppgtt * diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 0804bbc..96a58d4 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -289,7 +289,7 @@ struct i915_hw_ppgtt { int (*enable)(struct i915_hw_ppgtt *ppgtt); int (*switch_mm)(struct i915_hw_ppgtt *ppgtt, -struct intel_engine_cs *ring); +struct drm_i915_gem_request *req); void (*debug_dump)(struct i915_hw_ppgtt *ppgtt, struct seq_file *m); }; -- 1.7.9.5 ___ Intel-gfx mailing list 
Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 28/51] drm/i915: Update queue_flip() to do explicit request management
From: John Harrison Updated the display page flip code to do explicit request creation and submission rather than relying on the OLR and just hoping that the request actually gets submitted at some random point. The sequence is now to create a request, queue the work to the ring, assign the known request to the flip queue work item then actually submit the work and post the request. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h |2 +- drivers/gpu/drm/i915/intel_display.c| 47 --- drivers/gpu/drm/i915/intel_ringbuffer.c |2 +- drivers/gpu/drm/i915/intel_ringbuffer.h |1 - 4 files changed, 32 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index e5132d3..4b82b2e 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -559,7 +559,7 @@ struct drm_i915_display_funcs { int (*queue_flip)(struct drm_device *dev, struct drm_crtc *crtc, struct drm_framebuffer *fb, struct drm_i915_gem_object *obj, - struct intel_engine_cs *ring, + struct drm_i915_gem_request *req, uint32_t flags); void (*update_primary_plane)(struct drm_crtc *crtc, struct drm_framebuffer *fb, diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index ebf973c..f23f28e 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9201,9 +9201,10 @@ static int intel_gen2_queue_flip(struct drm_device *dev, struct drm_crtc *crtc, struct drm_framebuffer *fb, struct drm_i915_gem_object *obj, -struct intel_engine_cs *ring, +struct drm_i915_gem_request *req, uint32_t flags) { + struct intel_engine_cs *ring = req->ring; struct intel_crtc *intel_crtc = to_intel_crtc(crtc); u32 flip_mask; int ret; @@ -9228,7 +9229,6 @@ static int intel_gen2_queue_flip(struct drm_device *dev, intel_ring_emit(ring, 0); /* aux display base address, unused */ intel_mark_page_flip_active(intel_crtc); - __intel_ring_advance(ring); return 0; } @@ -9236,9 +9236,10 @@ static int intel_gen3_queue_flip(struct drm_device *dev, struct drm_crtc *crtc, struct drm_framebuffer *fb, struct drm_i915_gem_object *obj, -struct intel_engine_cs *ring, +struct drm_i915_gem_request *req, uint32_t flags) { + struct intel_engine_cs *ring = req->ring; struct intel_crtc *intel_crtc = to_intel_crtc(crtc); u32 flip_mask; int ret; @@ -9260,7 +9261,6 @@ static int intel_gen3_queue_flip(struct drm_device *dev, intel_ring_emit(ring, MI_NOOP); intel_mark_page_flip_active(intel_crtc); - __intel_ring_advance(ring); return 0; } @@ -9268,9 +9268,10 @@ static int intel_gen4_queue_flip(struct drm_device *dev, struct drm_crtc *crtc, struct drm_framebuffer *fb, struct drm_i915_gem_object *obj, -struct intel_engine_cs *ring, +struct drm_i915_gem_request *req, uint32_t flags) { + struct intel_engine_cs *ring = req->ring; struct drm_i915_private *dev_priv = dev->dev_private; struct intel_crtc *intel_crtc = to_intel_crtc(crtc); uint32_t pf, pipesrc; @@ -9299,7 +9300,6 @@ static int intel_gen4_queue_flip(struct drm_device *dev, intel_ring_emit(ring, pf | pipesrc); intel_mark_page_flip_active(intel_crtc); - __intel_ring_advance(ring); return 0; } @@ -9307,9 +9307,10 @@ static int intel_gen6_queue_flip(struct drm_device *dev, struct drm_crtc *crtc, struct drm_framebuffer *fb, struct drm_i915_gem_object *obj, -struct intel_engine_cs *ring, +struct drm_i915_gem_request *req, uint32_t flags) { + struct intel_engine_cs *ring = req->ring; struct drm_i915_private *dev_priv = dev->dev_private; struct intel_crtc *intel_crtc = 
to_intel_crtc(crtc); uint32_t pf, pipesrc; @@ -9335,7 +9336,6 @@ static int intel_gen6_queue_flip(struct drm_device *dev, intel_ring_emit(ring, pf | pipesrc); intel_mark_page_
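[Editor's note: the caller-side sequence described in the commit message (create the request, queue the flip, remember the request on the work item, then post it) is not part of the quoted hunks. A rough sketch of what intel_crtc_page_flip() might end up doing; the local variable, label and work-item field names here are assumptions, only the helpers shown elsewhere in this series are real:]

	struct drm_i915_gem_request *request;

	ret = dev_priv->gt.alloc_request(ring, ring->default_context, &request);
	if (ret)
		goto cleanup_unpin;

	ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, request,
					   page_flip_flags);
	if (ret)
		goto cleanup_request;

	/* remember which request completes this flip */
	work->flip_queued_req = i915_gem_request_reference(request);

	/* submit the work; a flip does not need the implicit cache flush */
	i915_add_request_no_flush(request->ring);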
[Intel-gfx] [PATCH 04/51] drm/i915: Merged the many do_execbuf() parameters into a structure
From: John Harrison The do_execbuf() function takes quite a few parameters. The actual set of parameters is going to change with the conversion to passing requests around. Further, it is due to grow massively with the arrival of the GPU scheduler. This patch simplies the prototype by passing a parameter structure instead. Changing the parameter set in the future is then simply a matter of adding/removing items to the structure. Note that the structure does not contain absolutely everything that is passed in. This is because the intention is to use this structure more extensively later in this patch series and more especially in the GPU scheduler that is coming soon. The latter requires hanging on to the structure as the final hardware submission can be delayed until long after the execbuf IOCTL has returned to user land. Thus it is unsafe to put anything in the structure that is local to the IOCTL call itself - such as the 'args' parameter. All entries must be copies of data or pointers to structures that are reference counted in someway and guaranteed to exist for the duration of the batch buffer's life. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h| 27 +++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 56 ++-- drivers/gpu/drm/i915/intel_lrc.c | 26 +++-- drivers/gpu/drm/i915/intel_lrc.h |9 ++--- 4 files changed, 67 insertions(+), 51 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index e90b786..e6d616b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1640,6 +1640,16 @@ struct i915_workarounds { u32 count; }; +struct i915_execbuffer_params { + struct drm_device *dev; + struct drm_file *file; + uint32_tdispatch_flags; + uint32_tbatch_obj_vm_offset; + struct intel_engine_cs *ring; + struct drm_i915_gem_object *batch_obj; + struct intel_context*ctx; +}; + struct drm_i915_private { struct drm_device *dev; struct kmem_cache *slab; @@ -1891,13 +1901,9 @@ struct drm_i915_private { /* Abstract the submission mechanism (legacy ringbuffer or execlists) away */ struct { - int (*do_execbuf)(struct drm_device *dev, struct drm_file *file, - struct intel_engine_cs *ring, - struct intel_context *ctx, + int (*do_execbuf)(struct i915_execbuffer_params *params, struct drm_i915_gem_execbuffer2 *args, - struct list_head *vmas, - struct drm_i915_gem_object *batch_obj, - u64 exec_start, u32 flags); + struct list_head *vmas); int (*init_rings)(struct drm_device *dev); void (*cleanup_ring)(struct intel_engine_cs *ring); void (*stop_ring)(struct intel_engine_cs *ring); @@ -2622,14 +2628,9 @@ void i915_gem_execbuffer_retire_commands(struct drm_device *dev, struct drm_file *file, struct intel_engine_cs *ring, struct drm_i915_gem_object *obj); -int i915_gem_ringbuffer_submission(struct drm_device *dev, - struct drm_file *file, - struct intel_engine_cs *ring, - struct intel_context *ctx, +int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params, struct drm_i915_gem_execbuffer2 *args, - struct list_head *vmas, - struct drm_i915_gem_object *batch_obj, - u64 exec_start, u32 flags); + struct list_head *vmas); int i915_gem_execbuffer(struct drm_device *dev, void *data, struct drm_file *file_priv); int i915_gem_execbuffer2(struct drm_device *dev, void *data, diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index ec9ea45..93b0ef0 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1132,17 
+1132,15 @@ i915_gem_execbuffer_parse(struct intel_engine_cs *ring, } int -i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file, - struct intel_engine_cs *ring, - struct intel_context *ctx, +i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params, struct drm_i915_gem_execbuffer2 *args, -
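[Editor's note: not all of the call-site conversion is visible above, but the shape is clear from the struct definition: the execbuffer IOCTL fills in one params block and hands it to the submission backend. A rough sketch of the caller side, with illustrative field values and a guessed local variable name; the do_execbuf() call with these arguments appears verbatim in patch 07 of this series:]

	struct i915_execbuffer_params params_master, *params = &params_master;

	memset(params, 0, sizeof(*params));
	params->dev = dev;
	params->file = file;
	params->ring = ring;
	params->ctx = ctx;
	params->batch_obj = batch_obj;
	params->dispatch_flags = dispatch_flags;
	/* params->batch_obj_vm_offset is filled in once the batch is pinned */

	ret = dev_priv->gt.do_execbuf(params, args, &eb->vmas);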
[Intel-gfx] [PATCH 05/51] drm/i915: Add return code check to i915_gem_execbuffer_retire_commands()
From: John Harrison For some reason, the i915_add_request() call in i915_gem_execbuffer_retire_commands() was explicitly having its return code ignored. The _retire_commands() function itself was 'void'. Given that _add_request() can fail without dispatching the batch buffer, this seems odd. Also shrunk the parameter list to a single structure as everything it requires is available in the execbuff_params object. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h|5 + drivers/gpu/drm/i915/i915_gem_execbuffer.c | 16 +++- drivers/gpu/drm/i915/intel_lrc.c |3 +-- 3 files changed, 9 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index e6d616b..143bc63 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2624,10 +2624,7 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); void i915_gem_execbuffer_move_to_active(struct list_head *vmas, struct intel_engine_cs *ring); -void i915_gem_execbuffer_retire_commands(struct drm_device *dev, -struct drm_file *file, -struct intel_engine_cs *ring, -struct drm_i915_gem_object *obj); +int i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params); int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params, struct drm_i915_gem_execbuffer2 *args, struct list_head *vmas); diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 93b0ef0..ca85803 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -989,17 +989,15 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas, } } -void -i915_gem_execbuffer_retire_commands(struct drm_device *dev, - struct drm_file *file, - struct intel_engine_cs *ring, - struct drm_i915_gem_object *obj) +int +i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params) { /* Unconditionally force add_request to emit a full flush. */ - ring->gpu_caches_dirty = true; + params->ring->gpu_caches_dirty = true; /* Add a breadcrumb for the completion of the batch buffer */ - (void)__i915_add_request(ring, file, obj); + return __i915_add_request(params->ring, params->file, + params->batch_obj); } static int @@ -1282,8 +1280,8 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params, trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), params->dispatch_flags); i915_gem_execbuffer_move_to_active(vmas, ring); - i915_gem_execbuffer_retire_commands(params->dev, params->file, ring, - params->batch_obj); + + ret = i915_gem_execbuffer_retire_commands(params); error: kfree(cliprects); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index ca29290..90400d0d 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -706,9 +706,8 @@ int intel_execlists_submission(struct i915_execbuffer_params *params, trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), params->dispatch_flags); i915_gem_execbuffer_move_to_active(vmas, ring); - i915_gem_execbuffer_retire_commands(params->dev, params->file, ring, params->batch_obj); - return 0; + return i915_gem_execbuffer_retire_commands(params); } void intel_execlists_retire_requests(struct intel_engine_cs *ring) -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 14/51] drm/i915: Update pin_to_display_plane() to do explicit request management
From: John Harrison Added explicit creation and submission of the request structure to the display object pinning code. This removes any reliance on the OLR keeping track of the request and the unknown randomness that can ensue with other work becoming part of the same request. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem.c | 21 ++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 96f9155..ef561e5 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -3938,9 +3938,24 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, int ret; if (pipelined != i915_gem_request_get_ring(obj->last_read_req)) { - ret = i915_gem_object_sync(obj, pipelined); - if (ret) - return ret; + if (!pipelined) { + ret = i915_gem_object_wait_rendering(obj, false); + } else { + struct drm_i915_private *dev_priv = pipelined->dev->dev_private; + struct drm_i915_gem_request *req; + + ret = dev_priv->gt.alloc_request(pipelined, pipelined->default_context, &req); + if (ret) + return ret; + + ret = i915_gem_object_sync(obj, req->ring); + if (ret) + return ret; + + ret = i915_add_request_no_flush(req->ring); + if (ret) + return ret; + } } /* Mark the pin_display early so that we account for the -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 02/51] drm/i915: Add missing trace point to LRC execbuff code path
From: John Harrison There is a trace point in the legacy execbuffer execution path that is missing from the execlist path. Trace points are extremely useful for debugging and are used by various automated validation tests. Hence, this patch adds the missing trace point back in. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/intel_lrc.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 0376285..73c1861 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -701,6 +701,8 @@ int intel_execlists_submission(struct drm_device *dev, struct drm_file *file, if (ret) return ret; + trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), dispatch_flags); + i915_gem_execbuffer_move_to_active(vmas, ring); i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 11/51] drm/i915: Update move_to_gpu() to take a request structure
From: John Harrison The plan is to pass requests around as the basic submission tracking structure rather than rings and contexts. This patch updates the move_to_gpu() code paths. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 10 +- drivers/gpu/drm/i915/intel_lrc.c | 10 -- 2 files changed, 9 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 883cabd..da1e232 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -828,7 +828,7 @@ err: } static int -i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs *ring, +i915_gem_execbuffer_move_to_gpu(struct drm_i915_gem_request *req, struct list_head *vmas) { struct i915_vma *vma; @@ -838,7 +838,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs *ring, list_for_each_entry(vma, vmas, exec_list) { struct drm_i915_gem_object *obj = vma->obj; - ret = i915_gem_object_sync(obj, ring); + ret = i915_gem_object_sync(obj, req->ring); if (ret) return ret; @@ -849,7 +849,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs *ring, } if (flush_chipset) - i915_gem_chipset_flush(ring->dev); + i915_gem_chipset_flush(req->ring->dev); if (flush_domains & I915_GEM_DOMAIN_GTT) wmb(); @@ -857,7 +857,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_engine_cs *ring, /* Unconditionally invalidate gpu caches and ensure that we do flush * any residual writes from the previous batch. */ - return intel_ring_invalidate_all_caches(ring); + return intel_ring_invalidate_all_caches(req->ring); } static bool @@ -1186,7 +1186,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params, } } - ret = i915_gem_execbuffer_move_to_gpu(ring, vmas); + ret = i915_gem_execbuffer_move_to_gpu(params->request, vmas); if (ret) goto error; diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 2ab6922..1dbf4b1 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -569,11 +569,9 @@ static int logical_ring_invalidate_all_caches(struct intel_ringbuffer *ringbuf, return 0; } -static int execlists_move_to_gpu(struct intel_ringbuffer *ringbuf, -struct intel_context *ctx, +static int execlists_move_to_gpu(struct drm_i915_gem_request *req, struct list_head *vmas) { - struct intel_engine_cs *ring = ringbuf->ring; struct i915_vma *vma; uint32_t flush_domains = 0; bool flush_chipset = false; @@ -582,7 +580,7 @@ static int execlists_move_to_gpu(struct intel_ringbuffer *ringbuf, list_for_each_entry(vma, vmas, exec_list) { struct drm_i915_gem_object *obj = vma->obj; - ret = i915_gem_object_sync(obj, ring); + ret = i915_gem_object_sync(obj, req->ring); if (ret) return ret; @@ -598,7 +596,7 @@ static int execlists_move_to_gpu(struct intel_ringbuffer *ringbuf, /* Unconditionally invalidate gpu caches and ensure that we do flush * any residual writes from the previous batch. */ - return logical_ring_invalidate_all_caches(ringbuf, ctx); + return logical_ring_invalidate_all_caches(req->ringbuf, req->ctx); } /** @@ -677,7 +675,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params, return -EINVAL; } - ret = execlists_move_to_gpu(ringbuf, params->ctx, vmas); + ret = execlists_move_to_gpu(params->request, vmas); if (ret) return ret; -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 21/51] drm/i915: Set context in request from creation even in legacy mode
From: John Harrison In execlist mode, the context object pointer is written in to the request structure (and reference counted) at the point of request creation. In legacy mode, this only happens inside i915_add_request(). This patch updates the legacy code path to match the execlist version. This allows all the intermediate code between request creation and request submission to get at the context object given only a request structure. Thus negating the need to pass context pointers here, there and everywhere. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem.c |9 + drivers/gpu/drm/i915/intel_ringbuffer.c |2 ++ 2 files changed, 3 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index e34672e..02b921b 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2469,14 +2469,7 @@ int __i915_add_request(struct intel_engine_cs *ring, WARN_ON(request->batch_obj && obj); request->batch_obj = obj; - if (!i915.enable_execlists) { - /* Hold a reference to the current context so that we can inspect -* it later in case a hangcheck error event fires. -*/ - request->ctx = ring->last_context; - if (request->ctx) - i915_gem_context_reference(request->ctx); - } + WARN_ON(request->ctx != ring->last_context); request->emitted_jiffies = jiffies; list_add_tail(&request->list, &ring->request_list); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 40b5d83..84a1e22 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -2107,6 +2107,8 @@ intel_ring_alloc_request(struct intel_engine_cs *ring, request->ring = ring; request->ringbuf = ring->buffer; request->uniq = dev_private->request_uniq++; + request->ctx = ctx; + i915_gem_context_reference(request->ctx); ret = i915_gem_get_seqno(ring->dev, &request->seqno); if (ret) { -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
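A small hedged sketch of what this buys: any code holding only the request can reach the context, in legacy mode just as in execlist mode, because req->ctx is now assigned (and reference counted) at creation time. example_is_default_context() is a hypothetical name:

static bool example_is_default_context(struct drm_i915_gem_request *req)
{
	/* Valid from request creation onwards in both submission modes. */
	return req->ctx == req->ring->default_context;
}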
[Intel-gfx] [PATCH 00/51] Remove the outstanding_lazy_request
From: John Harrison

The driver tracks GPU work using request structures. Unfortunately, this tracking is not currently explicit but is done by means of a catch-all request that floats around in the background hoovering up work until it gets submitted. This background request (ring->outstanding_lazy_request or OLR) is created at the point of actually writing to the ring rather than when a particular piece of GPU work is started. This scheme sort of hangs together but causes a number of issues. It can mean that multiple pieces of independent work are lumped together in the same request or that work is not officially submitted until much later than it was created.

This patch series completely removes the OLR and explicitly tracks each piece of work with its own personal request structure from start to submission.

The patch set seems to fix the "'gem_ringfill --r render' + ctrl-c straight after boot" issue logged as BZ:40112. I haven't done any analysis of that particular issue but the descriptions I've seen appear to blame an inconsistent or mangled OLR.

Note also that by the end of this series, a number of differences between the legacy and execlist code paths have been removed. For example, add_request() and emit_request() now have the same signature and thus could be merged back to a single function pointer. Merging some of these together would also allow the removal of a bunch of 'if(execlists)' tests where the difference is simply to call the legacy function or the execlist one.

[Patches against drm-intel-nightly tree fetched 03/02/2015]

John Harrison (51):
  drm/i915: Rename 'flags' to 'dispatch_flags' for better code reading
  drm/i915: Add missing trace point to LRC execbuff code path
  drm/i915: Cache ringbuf pointer in request structure
  drm/i915: Merged the many do_execbuf() parameters into a structure
  drm/i915: Add return code check to i915_gem_execbuffer_retire_commands()
  drm/i915: Wrap request allocation with a function pointer
  drm/i915: Early alloc request in execbuff
  drm/i915: Update alloc_request to return the allocated request
  drm/i915: Add request to execbuf params and add explicit cleanup
  drm/i915: Update the dispatch tracepoint to use params->request
  drm/i915: Update move_to_gpu() to take a request structure
  drm/i915: Update execbuffer_move_to_active() to take a request structure
  drm/i915: Add flag to i915_add_request() to skip the cache flush
  drm/i915: Update pin_to_display_plane() to do explicit request management
  drm/i915: Update i915_gem_object_sync() to take a request structure
  drm/i915: Update i915_gpu_idle() to manage its own request
  drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring
  drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable()
  drm/i915: Add explicit request management to i915_gem_init_hw()
  drm/i915: Update ppgtt_init_ring() & context_enable() to take requests
  drm/i915: Set context in request from creation even in legacy mode
  drm/i915: Update i915_switch_context() to take a request structure
  drm/i915: Update do_switch() to take a request structure
  drm/i915: Update deferred context creation to do explicit request management
  drm/i915: Update init_context() to take a request structure
  drm/i915: Update render_state_init() to take a request structure
  drm/i915: Update overlay code to do explicit request management
  drm/i915: Update queue_flip() to do explicit request management
  drm/i915: Update add_request() to take a request structure
  drm/i915: Update [vma|object]_move_to_active() to take request structures
  drm/i915: Update l3_remap to take a request structure
  drm/i915: Update mi_set_context() to take a request structure
  drm/i915: Update a bunch of execbuffer helpers to take request structures
  drm/i915: Update workarounds_emit() to take request structures
  drm/i915: Update flush_all_caches() to take request structures
  drm/i915: Update switch_mm() to take a request structure
  drm/i915: Update ring->flush() to take a request structure
  drm/i915: Update some flush helpers to take request structures
  drm/i915: Update ring->emit_flush() to take a request structure
  drm/i915: Update ring->add_request() to take a request structure
  drm/i915: Update ring->emit_request() to take a request structure
  drm/i915: Update ring->dispatch_execbuffer() to take a request structure
  drm/i915: Update ring->emit_bb_start() to take a request structure
  drm/i915: Update ring->sync_to() to take a request structure
  drm/i915: Update ring->signal() to take a request structure
  drm/i915: Update cacheline_align() to take a request structure
  drm/i915: Update ironlake_enable_rc6() to do explicit request management
  drm/i915: Update intel_ring_begin() to take a request structure
  drm/i915: Update intel_logical_ring_begin() to take a request structure
  drm/i915: Remove the now obsolete intel_ring_get_request()
  drm/i915: Remove the now obsolete 'outstanding_lazy_request'

drivers/gpu/drm/i
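For orientation, a rough sketch of the request life cycle the series converges on, written against the interfaces as they look part-way through the series (alloc_request() returning the request, i915_add_request() taking one; error handling trimmed). example_emit_work() is a hypothetical illustration, not code from any of the patches:

static int example_emit_work(struct intel_engine_cs *ring,
			     struct intel_context *ctx)
{
	struct drm_i915_private *dev_priv = ring->dev->dev_private;
	struct drm_i915_gem_request *req;
	int ret;

	/* Explicit creation instead of an implicit outstanding_lazy_request. */
	ret = dev_priv->gt.alloc_request(ring, ctx, &req);
	if (ret)
		return ret;

	/* Emit the commands that belong to this piece of work. */
	ret = intel_ring_begin(ring, 2);
	if (ret) {
		i915_gem_request_unreference(req);
		return ret;
	}
	intel_ring_emit(ring, MI_NOOP);
	intel_ring_emit(ring, MI_NOOP);
	intel_ring_advance(ring);

	/* Explicit submission of exactly this request, not "whatever the OLR is". */
	return i915_add_request(req);
}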
[Intel-gfx] [PATCH 38/51] drm/i915: Update some flush helpers to take request structures
From: John Harrison Updated intel_emit_post_sync_nonzero_flush(), gen7_render_ring_cs_stall_wa(), gen7_ring_fbc_flush() and gen8_emit_pipe_control() to take requests instead of rings. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/intel_ringbuffer.c | 29 - 1 file changed, 16 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index bc3c0e6..2f9ba79 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -214,8 +214,9 @@ gen4_render_ring_flush(struct drm_i915_gem_request *req, * really our business. That leaves only stall at scoreboard. */ static int -intel_emit_post_sync_nonzero_flush(struct intel_engine_cs *ring) +intel_emit_post_sync_nonzero_flush(struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES; int ret; @@ -258,7 +259,7 @@ gen6_render_ring_flush(struct drm_i915_gem_request *req, int ret; /* Force SNB workarounds for PIPE_CONTROL flushes */ - ret = intel_emit_post_sync_nonzero_flush(ring); + ret = intel_emit_post_sync_nonzero_flush(req); if (ret) return ret; @@ -302,8 +303,9 @@ gen6_render_ring_flush(struct drm_i915_gem_request *req, } static int -gen7_render_ring_cs_stall_wa(struct intel_engine_cs *ring) +gen7_render_ring_cs_stall_wa(struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; int ret; ret = intel_ring_begin(ring, 4); @@ -320,8 +322,9 @@ gen7_render_ring_cs_stall_wa(struct intel_engine_cs *ring) return 0; } -static int gen7_ring_fbc_flush(struct intel_engine_cs *ring, u32 value) +static int gen7_ring_fbc_flush(struct drm_i915_gem_request *req, u32 value) { + struct intel_engine_cs *ring = req->ring; int ret; if (!ring->fbc_dirty) @@ -389,7 +392,7 @@ gen7_render_ring_flush(struct drm_i915_gem_request *req, /* Workaround: we must issue a pipe_control with CS-stall bit * set before a pipe_control command that has the state cache * invalidate bit set. 
*/ - gen7_render_ring_cs_stall_wa(ring); + gen7_render_ring_cs_stall_wa(req); } ret = intel_ring_begin(ring, 4); @@ -403,15 +406,16 @@ gen7_render_ring_flush(struct drm_i915_gem_request *req, intel_ring_advance(ring); if (!invalidate_domains && flush_domains) - return gen7_ring_fbc_flush(ring, FBC_REND_NUKE); + return gen7_ring_fbc_flush(req, FBC_REND_NUKE); return 0; } static int -gen8_emit_pipe_control(struct intel_engine_cs *ring, +gen8_emit_pipe_control(struct drm_i915_gem_request *req, u32 flags, u32 scratch_addr) { + struct intel_engine_cs *ring = req->ring; int ret; ret = intel_ring_begin(ring, 6); @@ -433,9 +437,8 @@ static int gen8_render_ring_flush(struct drm_i915_gem_request *req, u32 invalidate_domains, u32 flush_domains) { - struct intel_engine_cs *ring = req->ring; u32 flags = 0; - u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES; + u32 scratch_addr = req->ring->scratch.gtt_offset + 2 * CACHELINE_BYTES; int ret; flags |= PIPE_CONTROL_CS_STALL; @@ -455,7 +458,7 @@ gen8_render_ring_flush(struct drm_i915_gem_request *req, flags |= PIPE_CONTROL_GLOBAL_GTT_IVB; /* WaCsStallBeforeStateCacheInvalidate:bdw,chv */ - ret = gen8_emit_pipe_control(ring, + ret = gen8_emit_pipe_control(req, PIPE_CONTROL_CS_STALL | PIPE_CONTROL_STALL_AT_SCOREBOARD, 0); @@ -463,12 +466,12 @@ gen8_render_ring_flush(struct drm_i915_gem_request *req, return ret; } - ret = gen8_emit_pipe_control(ring, flags, scratch_addr); + ret = gen8_emit_pipe_control(req, flags, scratch_addr); if (ret) return ret; if (!invalidate_domains && flush_domains) - return gen7_ring_fbc_flush(ring, FBC_REND_NUKE); + return gen7_ring_fbc_flush(req, FBC_REND_NUKE); return 0; } @@ -2388,7 +2391,7 @@ static int gen6_ring_flush(struct drm_i915_gem_request *req, if (!invalidate && flush) { if (IS_GEN7(dev)) - return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN); + return gen7_ring_fbc_flush(req, FBC_REND_CACHE_CLEAN); else if (IS_BROADWELL(dev)) dev_priv->fbc.need_sw_cache_clean = true; } -- 1.7.9.5 _
[Intel-gfx] [PATCH 01/51] drm/i915: Rename 'flags' to 'dispatch_flags' for better code reading
From: John Harrison There is a flags word that is passed through the execbuffer code path all the way from initial decoding of the user parameters down to the very final dispatch buffer call. It is simply called 'flags'. Unfortuantely, there are many other flags words floating around in the same blocks of code. Even more once the GPU scheduler arrives. This patch makes it more obvious exactly which flags word is which by renaming 'flags' to 'dispatch_flags'. Note that the bit definitions for this flags word already have an 'I915_DISPATCH_' prefix on them and so are not quite so ambiguous. For: VIZ-1587 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 25 ++-- drivers/gpu/drm/i915/intel_lrc.c | 10 drivers/gpu/drm/i915/intel_lrc.h |2 +- drivers/gpu/drm/i915/intel_ringbuffer.c| 35 drivers/gpu/drm/i915/intel_ringbuffer.h|4 ++-- 5 files changed, 41 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index b773368..ec9ea45 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1138,7 +1138,7 @@ i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file, struct drm_i915_gem_execbuffer2 *args, struct list_head *vmas, struct drm_i915_gem_object *batch_obj, - u64 exec_start, u32 flags) + u64 exec_start, u32 dispatch_flags) { struct drm_clip_rect *cliprects = NULL; struct drm_i915_private *dev_priv = dev->dev_private; @@ -1266,19 +1266,19 @@ i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file, ret = ring->dispatch_execbuffer(ring, exec_start, exec_len, - flags); + dispatch_flags); if (ret) goto error; } } else { ret = ring->dispatch_execbuffer(ring, exec_start, exec_len, - flags); + dispatch_flags); if (ret) return ret; } - trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), flags); + trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), dispatch_flags); i915_gem_execbuffer_move_to_active(vmas, ring); i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj); @@ -1353,7 +1353,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct i915_address_space *vm; const u32 ctx_id = i915_execbuffer2_get_context_id(*args); u64 exec_start = args->batch_start_offset; - u32 flags; + u32 dispatch_flags; int ret; bool need_relocs; @@ -1364,15 +1364,15 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, if (ret) return ret; - flags = 0; + dispatch_flags = 0; if (args->flags & I915_EXEC_SECURE) { if (!file->is_master || !capable(CAP_SYS_ADMIN)) return -EPERM; - flags |= I915_DISPATCH_SECURE; + dispatch_flags |= I915_DISPATCH_SECURE; } if (args->flags & I915_EXEC_IS_PINNED) - flags |= I915_DISPATCH_PINNED; + dispatch_flags |= I915_DISPATCH_PINNED; if ((args->flags & I915_EXEC_RING_MASK) > LAST_USER_RING) { DRM_DEBUG("execbuf with unknown ring: %d\n", @@ -1495,7 +1495,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, args->batch_start_offset, args->batch_len, file->is_master, - &flags); + &dispatch_flags); if (IS_ERR(batch_obj)) { ret = PTR_ERR(batch_obj); goto err; @@ -1507,7 +1507,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, /* snb/ivb/vlv conflate the "batch in ppgtt" bit with the "non-secure * batch" bit. Hence we need to pin secure batches into the global gtt. * hsw should have this fixed, but bdw mucks it up again. 
*/ - if (flags & I915_DISPATCH_SECURE) { + if (dispatch_flags & I915_DISPATCH_SECURE) { /* * So on first glance it looks freaky that we pin the ba
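A trimmed, hypothetical illustration of the ambiguity being removed: the userspace execbuffer flags (args->flags) and the driver-internal dispatch flags now carry visibly different names. The real code also validates I915_EXEC_SECURE against file->is_master and CAP_SYS_ADMIN, as the diff above shows:

static u32 example_decode_dispatch_flags(const struct drm_i915_gem_execbuffer2 *args)
{
	u32 dispatch_flags = 0;

	if (args->flags & I915_EXEC_SECURE)
		dispatch_flags |= I915_DISPATCH_SECURE;

	if (args->flags & I915_EXEC_IS_PINNED)
		dispatch_flags |= I915_DISPATCH_PINNED;

	return dispatch_flags;
}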
[Intel-gfx] [PATCH 25/51] drm/i915: Update init_context() to take a request structure
From: John Harrison Now that everything above has been converted to use requests, it is possible to update init_context() to take a request pointer instead of a ring/context pair. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_context.c |4 ++-- drivers/gpu/drm/i915/intel_lrc.c|9 - drivers/gpu/drm/i915/intel_ringbuffer.c |7 +++ drivers/gpu/drm/i915/intel_ringbuffer.h |3 +-- 4 files changed, 10 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index eedb994..938cd26 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -412,7 +412,7 @@ int i915_gem_context_enable(struct drm_i915_gem_request *req) if (ring->init_context == NULL) return 0; - ret = ring->init_context(req->ring, ring->default_context); + ret = ring->init_context(req); } else ret = i915_switch_context(req); @@ -678,7 +678,7 @@ done: if (uninitialized) { if (ring->init_context) { - ret = ring->init_context(req->ring, to); + ret = ring->init_context(req); if (ret) DRM_ERROR("ring init context: %d\n", ret); } diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 2882d3f..4689853 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1322,16 +1322,15 @@ static int gen8_emit_request(struct intel_ringbuffer *ringbuf, return 0; } -static int gen8_init_rcs_context(struct intel_engine_cs *ring, - struct intel_context *ctx) +static int gen8_init_rcs_context(struct drm_i915_gem_request *req) { int ret; - ret = intel_logical_ring_workarounds_emit(ring, ctx); + ret = intel_logical_ring_workarounds_emit(req->ring, req->ctx); if (ret) return ret; - return intel_lr_context_render_state_init(ring, ctx); + return intel_lr_context_render_state_init(req->ring, req->ctx); } /** @@ -1909,7 +1908,7 @@ int intel_lr_context_deferred_create(struct intel_context *ctx, if (ret) return ret; - ret = ring->init_context(req->ring, ctx); + ret = ring->init_context(req); if (ret) { DRM_ERROR("ring init context: %d\n", ret); i915_gem_request_unreference(req); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 84a1e22..a0a9d71 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -713,16 +713,15 @@ static int intel_ring_workarounds_emit(struct intel_engine_cs *ring, return 0; } -static int intel_rcs_ctx_init(struct intel_engine_cs *ring, - struct intel_context *ctx) +static int intel_rcs_ctx_init(struct drm_i915_gem_request *req) { int ret; - ret = intel_ring_workarounds_emit(ring, ctx); + ret = intel_ring_workarounds_emit(req->ring, req->ctx); if (ret != 0) return ret; - ret = i915_gem_render_state_init(ring); + ret = i915_gem_render_state_init(req->ring); if (ret) DRM_ERROR("init render state: %d\n", ret); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index fdeaa66..36631e2 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -143,8 +143,7 @@ struct intel_engine_cs { int (*init_hw)(struct intel_engine_cs *ring); - int (*init_context)(struct intel_engine_cs *ring, - struct intel_context *ctx); + int (*init_context)(struct drm_i915_gem_request *req); void(*write_tail)(struct intel_engine_cs *ring, u32 value); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 33/51] drm/i915: Update a bunch of execbuffer helpers to take request structures
From: John Harrison Updated *_ring_invalidate_all_caches(), i915_reset_gen7_sol_offsets() and i915_emit_box() to take request structures instead of ring or ringbuf/context pairs. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 12 +++- drivers/gpu/drm/i915/intel_lrc.c |9 - drivers/gpu/drm/i915/intel_ringbuffer.c|3 ++- drivers/gpu/drm/i915/intel_ringbuffer.h|2 +- 4 files changed, 14 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index dc13751..a79c893 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -857,7 +857,7 @@ i915_gem_execbuffer_move_to_gpu(struct drm_i915_gem_request *req, /* Unconditionally invalidate gpu caches and ensure that we do flush * any residual writes from the previous batch. */ - return intel_ring_invalidate_all_caches(req->ring); + return intel_ring_invalidate_all_caches(req); } static bool @@ -1002,8 +1002,9 @@ i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params) static int i915_reset_gen7_sol_offsets(struct drm_device *dev, - struct intel_engine_cs *ring) + struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; struct drm_i915_private *dev_priv = dev->dev_private; int ret, i; @@ -1028,10 +1029,11 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev, } static int -i915_emit_box(struct intel_engine_cs *ring, +i915_emit_box(struct drm_i915_gem_request *req, struct drm_clip_rect *box, int DR1, int DR4) { + struct intel_engine_cs *ring = req->ring; int ret; if (box->y2 <= box->y1 || box->x2 <= box->x1 || @@ -1247,7 +1249,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params, } if (args->flags & I915_EXEC_GEN7_SOL_RESET) { - ret = i915_reset_gen7_sol_offsets(params->dev, ring); + ret = i915_reset_gen7_sol_offsets(params->dev, params->request); if (ret) goto error; } @@ -1258,7 +1260,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params, if (cliprects) { for (i = 0; i < args->num_cliprects; i++) { - ret = i915_emit_box(ring, &cliprects[i], + ret = i915_emit_box(params->request, &cliprects[i], args->DR1, args->DR4); if (ret) goto error; diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 245a5da..0c1a8e5 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -550,10 +550,9 @@ static int execlists_context_queue(struct intel_engine_cs *ring, return 0; } -static int logical_ring_invalidate_all_caches(struct intel_ringbuffer *ringbuf, - struct intel_context *ctx) +static int logical_ring_invalidate_all_caches(struct drm_i915_gem_request *req) { - struct intel_engine_cs *ring = ringbuf->ring; + struct intel_engine_cs *ring = req->ring; uint32_t flush_domains; int ret; @@ -561,7 +560,7 @@ static int logical_ring_invalidate_all_caches(struct intel_ringbuffer *ringbuf, if (ring->gpu_caches_dirty) flush_domains = I915_GEM_GPU_DOMAINS; - ret = ring->emit_flush(ringbuf, ctx, + ret = ring->emit_flush(req->ringbuf, req->ctx, I915_GEM_GPU_DOMAINS, flush_domains); if (ret) return ret; @@ -597,7 +596,7 @@ static int execlists_move_to_gpu(struct drm_i915_gem_request *req, /* Unconditionally invalidate gpu caches and ensure that we do flush * any residual writes from the previous batch. 
*/ - return logical_ring_invalidate_all_caches(req->ringbuf, req->ctx); + return logical_ring_invalidate_all_caches(req); } /** diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 178bf49..fc5bc48 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -2776,8 +2776,9 @@ intel_ring_flush_all_caches(struct intel_engine_cs *ring) } int -intel_ring_invalidate_all_caches(struct intel_engine_cs *ring) +intel_ring_invalidate_all_caches(struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; uint32_t flush_domains; int ret; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 582b0ec..411cd76 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer
[Intel-gfx] [PATCH 27/51] drm/i915: Update overlay code to do explicit request management
From: John Harrison The overlay update code path to do explicit request creation and submission rather than relying on the OLR to do the right thing. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/intel_overlay.c | 64 +- 1 file changed, 48 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c index f93dfc1..dc209bf 100644 --- a/drivers/gpu/drm/i915/intel_overlay.c +++ b/drivers/gpu/drm/i915/intel_overlay.c @@ -209,19 +209,19 @@ static void intel_overlay_unmap_regs(struct intel_overlay *overlay, } static int intel_overlay_do_wait_request(struct intel_overlay *overlay, +struct drm_i915_gem_request *req, void (*tail)(struct intel_overlay *)) { struct drm_device *dev = overlay->dev; - struct drm_i915_private *dev_priv = dev->dev_private; - struct intel_engine_cs *ring = &dev_priv->ring[RCS]; int ret; BUG_ON(overlay->last_flip_req); - i915_gem_request_assign(&overlay->last_flip_req, -ring->outstanding_lazy_request); - ret = i915_add_request(ring); - if (ret) + i915_gem_request_assign(&overlay->last_flip_req, req); + ret = i915_add_request(req->ring); + if (ret) { + i915_gem_request_unreference(req); return ret; + } overlay->flip_tail = tail; ret = i915_wait_request(overlay->last_flip_req); @@ -239,6 +239,7 @@ static int intel_overlay_on(struct intel_overlay *overlay) struct drm_device *dev = overlay->dev; struct drm_i915_private *dev_priv = dev->dev_private; struct intel_engine_cs *ring = &dev_priv->ring[RCS]; + struct drm_i915_gem_request *req; int ret; BUG_ON(overlay->active); @@ -246,17 +247,23 @@ static int intel_overlay_on(struct intel_overlay *overlay) WARN_ON(IS_I830(dev) && !(dev_priv->quirks & QUIRK_PIPEA_FORCE)); - ret = intel_ring_begin(ring, 4); + ret = dev_priv->gt.alloc_request(ring, ring->default_context, &req); if (ret) return ret; + ret = intel_ring_begin(ring, 4); + if (ret) { + i915_gem_request_unreference(req); + return ret; + } + intel_ring_emit(ring, MI_OVERLAY_FLIP | MI_OVERLAY_ON); intel_ring_emit(ring, overlay->flip_addr | OFC_UPDATE); intel_ring_emit(ring, MI_WAIT_FOR_EVENT | MI_WAIT_FOR_OVERLAY_FLIP); intel_ring_emit(ring, MI_NOOP); intel_ring_advance(ring); - return intel_overlay_do_wait_request(overlay, NULL); + return intel_overlay_do_wait_request(overlay, req, NULL); } /* overlay needs to be enabled in OCMD reg */ @@ -266,6 +273,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay, struct drm_device *dev = overlay->dev; struct drm_i915_private *dev_priv = dev->dev_private; struct intel_engine_cs *ring = &dev_priv->ring[RCS]; + struct drm_i915_gem_request *req; u32 flip_addr = overlay->flip_addr; u32 tmp; int ret; @@ -280,18 +288,27 @@ static int intel_overlay_continue(struct intel_overlay *overlay, if (tmp & (1 << 17)) DRM_DEBUG("overlay underrun, DOVSTA: %x\n", tmp); - ret = intel_ring_begin(ring, 2); + ret = dev_priv->gt.alloc_request(ring, ring->default_context, &req); if (ret) return ret; + ret = intel_ring_begin(ring, 2); + if (ret) { + i915_gem_request_unreference(req); + return ret; + } + intel_ring_emit(ring, MI_OVERLAY_FLIP | MI_OVERLAY_CONTINUE); intel_ring_emit(ring, flip_addr); intel_ring_advance(ring); WARN_ON(overlay->last_flip_req); - i915_gem_request_assign(&overlay->last_flip_req, -ring->outstanding_lazy_request); - return i915_add_request(ring); + i915_gem_request_assign(&overlay->last_flip_req, req); + ret = i915_add_request(req->ring); + if (ret) + i915_gem_request_unreference(req); + + return ret; } static void 
intel_overlay_release_old_vid_tail(struct intel_overlay *overlay) @@ -326,6 +343,7 @@ static int intel_overlay_off(struct intel_overlay *overlay) struct drm_device *dev = overlay->dev; struct drm_i915_private *dev_priv = dev->dev_private; struct intel_engine_cs *ring = &dev_priv->ring[RCS]; + struct drm_i915_gem_request *req; u32 flip_addr = overlay->flip_addr; int ret; @@ -337,10 +355,16 @@ static int intel_overlay_off(struct intel_overlay *overlay) * of the hw. Do it in both cases */ flip_addr |= OFC_UPDATE; - ret = intel_ring_begin(ring, 6); + ret = dev_priv->gt.alloc_request(ring, rin
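A condensed sketch of the overlay's resulting pattern with an explicit request: track it, submit it, then wait on it, dropping the reference if submission fails. This paraphrases intel_overlay_do_wait_request() from the diff above with the flip_tail handling trimmed; example_overlay_wait() is a hypothetical name:

static int example_overlay_wait(struct intel_overlay *overlay,
				struct drm_i915_gem_request *req)
{
	int ret;

	/* Remember which request this flip corresponds to. */
	i915_gem_request_assign(&overlay->last_flip_req, req);

	/* Submit it; if that fails, drop the caller's reference to the request. */
	ret = i915_add_request(req->ring);
	if (ret) {
		i915_gem_request_unreference(req);
		return ret;
	}

	/* Block until the flip has executed on the GPU. */
	return i915_wait_request(overlay->last_flip_req);
}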
[Intel-gfx] [PATCH 17/51] drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring
From: John Harrison The i915_gem_init_hw() function calls a bunch of smaller initialisation functions. Multiple of which have generic sections and per ring sections. This means multiple passes are done over the rings. Each pass writes data to the ring which floats around in that ring's OLR until some random point in the future when an add_request() is done by some random other piece of code. This patch breaks i915_ppgtt_init_hw() in two with the per ring initialisation now being done in i915_ppgtt_init_ring(). The ring looping is now done at the top level in i915_gem_init_hw(). For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem.c | 25 +++-- drivers/gpu/drm/i915/i915_gem_gtt.c | 28 drivers/gpu/drm/i915/i915_gem_gtt.h |1 + 3 files changed, 36 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 8923ecd..e298119 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4839,19 +4839,32 @@ i915_gem_init_hw(struct drm_device *dev) */ init_unused_rings(dev); + ret = i915_ppgtt_init_hw(dev); + if (ret) { + DRM_ERROR("PPGTT enable HW failed %d\n", ret); + return ret; + } + + /* Need to do basic initialisation of all rings first: */ for_each_ring(ring, dev_priv, i) { ret = ring->init_hw(ring); if (ret) return ret; } - for (i = 0; i < NUM_L3_SLICES(dev); i++) - i915_gem_l3_remap(&dev_priv->ring[RCS], i); + /* Now it is safe to go back round and do everything else: */ + for_each_ring(ring, dev_priv, i) { + if (ring->id == RCS) { + for (i = 0; i < NUM_L3_SLICES(dev); i++) + i915_gem_l3_remap(ring, i); + } - ret = i915_ppgtt_init_hw(dev); - if (ret && ret != -EIO) { - DRM_ERROR("PPGTT enable failed %d\n", ret); - i915_gem_cleanup_ringbuffer(dev); + ret = i915_ppgtt_init_ring(ring); + if (ret && ret != -EIO) { + DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret); + i915_gem_cleanup_ringbuffer(dev); + return ret; + } } ret = i915_gem_context_enable(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 746f77f..1528e77 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1186,11 +1186,6 @@ int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt) int i915_ppgtt_init_hw(struct drm_device *dev) { - struct drm_i915_private *dev_priv = dev->dev_private; - struct intel_engine_cs *ring; - struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt; - int i, ret = 0; - /* In the case of execlists, PPGTT is enabled by the context descriptor * and the PDPs are contained within the context itself. We don't * need to do anything here. 
*/ @@ -1209,16 +1204,25 @@ int i915_ppgtt_init_hw(struct drm_device *dev) else MISSING_CASE(INTEL_INFO(dev)->gen); - if (ppgtt) { - for_each_ring(ring, dev_priv, i) { - ret = ppgtt->switch_mm(ppgtt, ring); - if (ret != 0) - return ret; - } - } + return 0; +} + +int i915_ppgtt_init_ring(struct intel_engine_cs *ring) +{ + struct drm_i915_private *dev_priv = ring->dev->dev_private; + struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt; + int ret = 0; + + if (!ppgtt) + return 0; + + ret = ppgtt->switch_mm(ppgtt, ring); + if (ret != 0) + return ret; return ret; } + struct i915_hw_ppgtt * i915_ppgtt_create(struct drm_device *dev, struct drm_i915_file_private *fpriv) { diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index e377c7d..78a107e 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -300,6 +300,7 @@ void i915_global_gtt_cleanup(struct drm_device *dev); int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt); int i915_ppgtt_init_hw(struct drm_device *dev); +int i915_ppgtt_init_ring(struct intel_engine_cs *ring); void i915_ppgtt_release(struct kref *kref); struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_device *dev, struct drm_i915_file_private *fpriv); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 26/51] drm/i915: Update render_state_init() to take a request structure
From: John Harrison Updated the two render_state_init() functions to take a request pointer instead of a ring. This removes their reliance on the OLR. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_render_state.c | 18 +- drivers/gpu/drm/i915/i915_gem_render_state.h |2 +- drivers/gpu/drm/i915/intel_lrc.c | 23 +++ drivers/gpu/drm/i915/intel_lrc.h |2 -- drivers/gpu/drm/i915/intel_ringbuffer.c |2 +- 5 files changed, 22 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index 989476e..85cc746 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c @@ -152,29 +152,29 @@ int i915_gem_render_state_prepare(struct intel_engine_cs *ring, return 0; } -int i915_gem_render_state_init(struct intel_engine_cs *ring) +int i915_gem_render_state_init(struct drm_i915_gem_request *req) { struct render_state so; int ret; - ret = i915_gem_render_state_prepare(ring, &so); + ret = i915_gem_render_state_prepare(req->ring, &so); if (ret) return ret; if (so.rodata == NULL) return 0; - ret = ring->dispatch_execbuffer(ring, - so.ggtt_offset, - so.rodata->batch_items * 4, - I915_DISPATCH_SECURE); + ret = req->ring->dispatch_execbuffer(req->ring, +so.ggtt_offset, +so.rodata->batch_items * 4, +I915_DISPATCH_SECURE); if (ret) goto out; - i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring); + i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req->ring); - WARN_ON(ring->outstanding_lazy_request->batch_obj); - ring->outstanding_lazy_request->batch_obj = so.obj; + WARN_ON(req->batch_obj); + req->batch_obj = so.obj; /* __i915_add_request moves object to inactive if it fails */ out: i915_gem_render_state_fini(&so); diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.h b/drivers/gpu/drm/i915/i915_gem_render_state.h index c44961e..7aa7372 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.h +++ b/drivers/gpu/drm/i915/i915_gem_render_state.h @@ -39,7 +39,7 @@ struct render_state { int gen; }; -int i915_gem_render_state_init(struct intel_engine_cs *ring); +int i915_gem_render_state_init(struct drm_i915_gem_request *req); void i915_gem_render_state_fini(struct render_state *so); int i915_gem_render_state_prepare(struct intel_engine_cs *ring, struct render_state *so); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 4689853..ad13cc7 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -203,6 +203,7 @@ enum { }; #define GEN8_CTX_ID_SHIFT 32 +static int intel_lr_context_render_state_init(struct drm_i915_gem_request *req); static int intel_lr_context_pin(struct intel_engine_cs *ring, struct intel_context *ctx); @@ -1330,7 +1331,7 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req) if (ret) return ret; - return intel_lr_context_render_state_init(req->ring, req->ctx); + return intel_lr_context_render_state_init(req); } /** @@ -1586,31 +1587,29 @@ cleanup_render_ring: return ret; } -int intel_lr_context_render_state_init(struct intel_engine_cs *ring, - struct intel_context *ctx) +static int intel_lr_context_render_state_init(struct drm_i915_gem_request *req) { - struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf; struct render_state so; int ret; - ret = i915_gem_render_state_prepare(ring, &so); + ret = i915_gem_render_state_prepare(req->ring, &so); if (ret) return ret; if (so.rodata == NULL) return 0; - ret = ring->emit_bb_start(ringbuf, - ctx, - 
so.ggtt_offset, - I915_DISPATCH_SECURE); + ret = req->ring->emit_bb_start(req->ringbuf, + req->ctx, + so.ggtt_offset, + I915_DISPATCH_SECURE); if (ret) goto out; - i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), ring); + i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req->ring); - WARN_ON(ring->outstanding_lazy_request->batch_obj); - ring->outstanding_lazy_request->batch_obj = so.obj; + WARN_ON(req->batch_obj
[Intel-gfx] [PATCH 08/51] drm/i915: Update alloc_request to return the allocated request
From: John Harrison The alloc_request() function does not actually return the newly allocated request. Instead, it must be pulled from ring->outstanding_lazy_request. This patch fixes this so that code can create a request and start using it knowing exactly which request it actually owns. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h|3 ++- drivers/gpu/drm/i915/i915_gem_execbuffer.c |3 ++- drivers/gpu/drm/i915/intel_lrc.c | 13 + drivers/gpu/drm/i915/intel_lrc.h |3 ++- drivers/gpu/drm/i915/intel_ringbuffer.c| 14 ++ drivers/gpu/drm/i915/intel_ringbuffer.h|3 ++- 6 files changed, 27 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 7959dfa..92c183f 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1902,7 +1902,8 @@ struct drm_i915_private { /* Abstract the submission mechanism (legacy ringbuffer or execlists) away */ struct { int (*alloc_request)(struct intel_engine_cs *ring, -struct intel_context *ctx); +struct intel_context *ctx, +struct drm_i915_gem_request **req_out); int (*do_execbuf)(struct i915_execbuffer_params *params, struct drm_i915_gem_execbuffer2 *args, struct list_head *vmas); diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 61471e9..37dcc6f 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1353,6 +1353,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct i915_address_space *vm; struct i915_execbuffer_params params_master; /* XXX: will be removed later */ struct i915_execbuffer_params *params = ¶ms_master; + struct drm_i915_gem_request *request; const u32 ctx_id = i915_execbuffer2_get_context_id(*args); u32 dispatch_flags; int ret; @@ -1531,7 +1532,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, params->batch_obj_vm_offset = i915_gem_obj_offset(batch_obj, vm); /* Allocate a request for this batch buffer nice and early. 
*/ - ret = dev_priv->gt.alloc_request(ring, ctx); + ret = dev_priv->gt.alloc_request(ring, ctx, &request); if (ret) goto err; diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 2f906a2..325ef2c 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -847,13 +847,17 @@ void intel_lr_context_unpin(struct intel_engine_cs *ring, } int intel_logical_ring_alloc_request(struct intel_engine_cs *ring, -struct intel_context *ctx) +struct intel_context *ctx, +struct drm_i915_gem_request **req_out) { struct drm_i915_gem_request *request; struct drm_i915_private *dev_private = ring->dev->dev_private; int ret; - if (ring->outstanding_lazy_request) + if (!req_out) + return -EINVAL; + + if ((*req_out = ring->outstanding_lazy_request) != NULL) return 0; request = kzalloc(sizeof(*request), GFP_KERNEL); @@ -888,7 +892,7 @@ int intel_logical_ring_alloc_request(struct intel_engine_cs *ring, i915_gem_context_reference(request->ctx); request->ringbuf = ctx->engine[ring->id].ringbuf; - ring->outstanding_lazy_request = request; + *req_out = ring->outstanding_lazy_request = request; return 0; } @@ -1041,6 +1045,7 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf, int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf, struct intel_context *ctx, int num_dwords) { + struct drm_i915_gem_request *req; struct intel_engine_cs *ring = ringbuf->ring; struct drm_device *dev = ring->dev; struct drm_i915_private *dev_priv = dev->dev_private; @@ -1056,7 +1061,7 @@ int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf, return ret; /* Preallocate the olr before touching the ring */ - ret = intel_logical_ring_alloc_request(ring, ctx); + ret = intel_logical_ring_alloc_request(ring, ctx, &req); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h index 2b1bf83..b4620b9 100644 --- a/drivers/gpu/drm/i915/intel_lrc.h +++ b/drivers/gpu/drm/i915/intel_lrc.h @@ -35,7 +35,8 @@ /* Logical Rings */ int __must_check intel_logical_ring_alloc_request(struct intel_engine_cs *ring, -
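The caller-side effect, as a hedged sketch: the caller now receives the request it owns through the out parameter instead of fishing it out of ring->outstanding_lazy_request afterwards. example_allocate_request() is hypothetical; stashing the request in the execbuffer params is what the neighbouring patches in the series then do with it:

static int example_allocate_request(struct drm_i915_private *dev_priv,
				    struct intel_engine_cs *ring,
				    struct intel_context *ctx,
				    struct i915_execbuffer_params *params)
{
	struct drm_i915_gem_request *request;
	int ret;

	/* The newly allocated request comes back through the out parameter. */
	ret = dev_priv->gt.alloc_request(ring, ctx, &request);
	if (ret)
		return ret;

	/* The rest of the submission path can now refer to exactly this request. */
	params->request = request;
	return 0;
}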
[Intel-gfx] [PATCH 18/51] drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable()
From: John Harrison The start of day context initialisation code in i915_gem_context_enable() loops over each ring and calls the legacy switch context or the execlist init context code as appropriate. This patch moves the ring looping out of that function in to the top level caller i915_gem_init_hw(). This means the a single pass can be made over all rings doing the PPGTT, L3 remap and context initialisation of each ring altogether. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h |2 +- drivers/gpu/drm/i915/i915_gem.c | 18 ++--- drivers/gpu/drm/i915/i915_gem_context.c | 32 +++ 3 files changed, 23 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 1bfb8d3..099a3ee 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2973,7 +2973,7 @@ int __must_check i915_gem_context_init(struct drm_device *dev); void i915_gem_context_fini(struct drm_device *dev); void i915_gem_context_reset(struct drm_device *dev); int i915_gem_context_open(struct drm_device *dev, struct drm_file *file); -int i915_gem_context_enable(struct drm_i915_private *dev_priv); +int i915_gem_context_enable(struct intel_engine_cs *ring); void i915_gem_context_close(struct drm_device *dev, struct drm_file *file); int i915_switch_context(struct intel_engine_cs *ring, struct intel_context *to); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index e298119..1c711c0 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4839,6 +4839,8 @@ i915_gem_init_hw(struct drm_device *dev) */ init_unused_rings(dev); + BUG_ON(!dev_priv->ring[RCS].default_context); + ret = i915_ppgtt_init_hw(dev); if (ret) { DRM_ERROR("PPGTT enable HW failed %d\n", ret); @@ -4854,6 +4856,8 @@ i915_gem_init_hw(struct drm_device *dev) /* Now it is safe to go back round and do everything else: */ for_each_ring(ring, dev_priv, i) { + WARN_ON(!ring->default_context); + if (ring->id == RCS) { for (i = 0; i < NUM_L3_SLICES(dev); i++) i915_gem_l3_remap(ring, i); @@ -4865,17 +4869,17 @@ i915_gem_init_hw(struct drm_device *dev) i915_gem_cleanup_ringbuffer(dev); return ret; } - } - ret = i915_gem_context_enable(dev_priv); - if (ret && ret != -EIO) { - DRM_ERROR("Context enable failed %d\n", ret); - i915_gem_cleanup_ringbuffer(dev); + ret = i915_gem_context_enable(ring); + if (ret && ret != -EIO) { + DRM_ERROR("Context enable ring #%d failed %d\n", i, ret); + i915_gem_cleanup_ringbuffer(dev); - return ret; + return ret; + } } - return ret; + return 0; } int i915_gem_init(struct drm_device *dev) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 8603bf4..dd83d61 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -403,32 +403,22 @@ void i915_gem_context_fini(struct drm_device *dev) i915_gem_context_unreference(dctx); } -int i915_gem_context_enable(struct drm_i915_private *dev_priv) +int i915_gem_context_enable(struct intel_engine_cs *ring) { - struct intel_engine_cs *ring; - int ret, i; - - BUG_ON(!dev_priv->ring[RCS].default_context); + int ret; if (i915.enable_execlists) { - for_each_ring(ring, dev_priv, i) { - if (ring->init_context) { - ret = ring->init_context(ring, - ring->default_context); - if (ret) { - DRM_ERROR("ring init context: %d\n", - ret); - return ret; - } - } - } + if (ring->init_context == NULL) + return 0; + ret = ring->init_context(ring, ring->default_context); } else - 
for_each_ring(ring, dev_priv, i) { - ret = i915_switch_context(ring, ring->default_context); - if (ret) - return ret; - } + ret = i915_switch_context(ring, ring->default_context); + + if (ret) { + DRM_ERROR("ring init context: %d\n", ret); + return ret; + } return 0; } -- 1.7.9.5 __
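Taking this patch together with the earlier i915_ppgtt_init_hw() split, a condensed sketch of the shape i915_gem_init_hw() now has (error handling simplified; later patches in the series convert these per-ring helpers to take requests). example_init_hw_shape() is a hypothetical paraphrase of the diffs above:

static int example_init_hw_shape(struct drm_device *dev)
{
	struct drm_i915_private *dev_priv = dev->dev_private;
	struct intel_engine_cs *ring;
	int i, j, ret;

	/* Generic, once-per-device PPGTT setup. */
	ret = i915_ppgtt_init_hw(dev);
	if (ret)
		return ret;

	/* Basic bring-up of every ring first... */
	for_each_ring(ring, dev_priv, i) {
		ret = ring->init_hw(ring);
		if (ret)
			return ret;
	}

	/* ...then a single pass doing all remaining per-ring GPU work. */
	for_each_ring(ring, dev_priv, i) {
		if (ring->id == RCS)
			for (j = 0; j < NUM_L3_SLICES(dev); j++)
				i915_gem_l3_remap(ring, j);

		ret = i915_ppgtt_init_ring(ring);
		if (ret)
			return ret;

		ret = i915_gem_context_enable(ring);
		if (ret)
			return ret;
	}

	return 0;
}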
[Intel-gfx] [PATCH 12/51] drm/i915: Update execbuffer_move_to_active() to take a request structure
From: John Harrison The plan is to pass requests around as the basic submission tracking structure rather than rings and contexts. This patch updates the execbuffer_move_to_active() code path. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h|2 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c |6 +++--- drivers/gpu/drm/i915/intel_lrc.c |2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index df9b5d7..21a2b35 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2627,7 +2627,7 @@ int i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); void i915_gem_execbuffer_move_to_active(struct list_head *vmas, - struct intel_engine_cs *ring); + struct drm_i915_gem_request *req); int i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params); int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params, struct drm_i915_gem_execbuffer2 *args, diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index da1e232..f7c19bc 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -950,9 +950,9 @@ i915_gem_validate_context(struct drm_device *dev, struct drm_file *file, void i915_gem_execbuffer_move_to_active(struct list_head *vmas, - struct intel_engine_cs *ring) + struct drm_i915_gem_request *req) { - struct drm_i915_gem_request *req = intel_ring_get_request(ring); + struct intel_engine_cs *ring = i915_gem_request_get_ring(req); struct i915_vma *vma; list_for_each_entry(vma, vmas, exec_list) { @@ -1279,7 +1279,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params, trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags); - i915_gem_execbuffer_move_to_active(vmas, ring); + i915_gem_execbuffer_move_to_active(vmas, params->request); ret = i915_gem_execbuffer_retire_commands(params); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 1dbf4b1..450eed4 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -703,7 +703,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params, trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags); - i915_gem_execbuffer_move_to_active(vmas, ring); + i915_gem_execbuffer_move_to_active(vmas, params->request); return i915_gem_execbuffer_retire_commands(params); } -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 29/51] drm/i915: Update add_request() to take a request structure
From: John Harrison Now that all callers of i915_add_request() have a request pointer to hand, it is possible to update the add request function to take a request pointer rather than pulling it out of the OLR. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h| 10 +- drivers/gpu/drm/i915/i915_gem.c| 24 drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +- drivers/gpu/drm/i915/intel_display.c |2 +- drivers/gpu/drm/i915/intel_lrc.c |2 +- drivers/gpu/drm/i915/intel_overlay.c |4 ++-- drivers/gpu/drm/i915/intel_ringbuffer.c|3 ++- 7 files changed, 24 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4b82b2e..b7c01e2 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2814,14 +2814,14 @@ void i915_gem_init_swizzling(struct drm_device *dev); void i915_gem_cleanup_ringbuffer(struct drm_device *dev); int __must_check i915_gpu_idle(struct drm_device *dev); int __must_check i915_gem_suspend(struct drm_device *dev); -int __i915_add_request(struct intel_engine_cs *ring, +int __i915_add_request(struct drm_i915_gem_request *req, struct drm_file *file, struct drm_i915_gem_object *batch_obj, bool flush_caches); -#define i915_add_request(ring) \ - __i915_add_request(ring, NULL, NULL, true) -#define i915_add_request_no_flush(ring) \ - __i915_add_request(ring, NULL, NULL, false) +#define i915_add_request(req) \ + __i915_add_request(req, NULL, NULL, true) +#define i915_add_request_no_flush(req) \ + __i915_add_request(req, NULL, NULL, false) int __i915_wait_request(struct drm_i915_gem_request *req, unsigned reset_counter, bool interruptible, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 5f17ade..8b0bfbd 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1161,7 +1161,7 @@ i915_gem_check_olr(struct drm_i915_gem_request *req) ret = 0; if (req == req->ring->outstanding_lazy_request) - ret = i915_add_request(req->ring); + ret = i915_add_request(req); return ret; } @@ -2406,25 +2406,25 @@ i915_gem_get_seqno(struct drm_device *dev, u32 *seqno) return 0; } -int __i915_add_request(struct intel_engine_cs *ring, +int __i915_add_request(struct drm_i915_gem_request *request, struct drm_file *file, struct drm_i915_gem_object *obj, bool flush_caches) { - struct drm_i915_private *dev_priv = ring->dev->dev_private; - struct drm_i915_gem_request *request; + struct intel_engine_cs *ring; + struct drm_i915_private *dev_priv; struct intel_ringbuffer *ringbuf; u32 request_start; int ret; - request = ring->outstanding_lazy_request; if (WARN_ON(request == NULL)) return -ENOMEM; - if (i915.enable_execlists) { - ringbuf = request->ctx->engine[ring->id].ringbuf; - } else - ringbuf = ring->buffer; + ring = request->ring; + dev_priv = ring->dev->dev_private; + ringbuf = request->ringbuf; + + WARN_ON(request != ring->outstanding_lazy_request); request_start = intel_ring_get_tail(ringbuf); /* @@ -3113,7 +3113,7 @@ int i915_gpu_idle(struct drm_device *dev) return ret; } - ret = i915_add_request_no_flush(req->ring); + ret = i915_add_request_no_flush(req); if (ret) { i915_gem_request_unreference(req); return ret; @@ -3961,7 +3961,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, if (ret) return ret; - ret = i915_add_request_no_flush(req->ring); + ret = i915_add_request_no_flush(req); if (ret) return ret; } @@ -4879,7 +4879,7 @@ i915_gem_init_hw(struct drm_device *dev) return ret; } - ret = i915_add_request_no_flush(ring); + 
ret = i915_add_request_no_flush(req); if (ret) { DRM_ERROR("Add request ring #%d failed: %d\n", i, ret); i915_gem_request_unreference(req); diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 1e2fc80..15e33a9 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -996,7
[Intel-gfx] [PATCH 24/51] drm/i915: Update deferred context creation to do explicit request management
From: John Harrison In execlist mode, context initialisation is deferred until first use of the given context. This is because execlist mode has many more contexts than legacy mode and many are never actually used. Previously, the initialisation commands were written to the ring and tagged with some random request structure via the OLR. This seemed to be causing a null pointer deference bug under certain circumstances (BZ:40112). This patch adds explicit request creation and submission to the deferred initialisation code path. Thus removing any reliance on or randomness caused by the OLR. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/intel_lrc.c | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index de01cae..2882d3f 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1822,6 +1822,7 @@ static void lrc_setup_hardware_status_page(struct intel_engine_cs *ring, int intel_lr_context_deferred_create(struct intel_context *ctx, struct intel_engine_cs *ring) { + struct drm_i915_private *dev_priv = ring->dev->dev_private; const bool is_global_default_ctx = (ctx == ring->default_context); struct drm_device *dev = ring->dev; struct drm_i915_gem_object *ctx_obj; @@ -1902,13 +1903,27 @@ int intel_lr_context_deferred_create(struct intel_context *ctx, lrc_setup_hardware_status_page(ring, ctx_obj); else if (ring->id == RCS && !ctx->rcs_initialized) { if (ring->init_context) { - ret = ring->init_context(ring, ctx); + struct drm_i915_gem_request *req; + + ret = dev_priv->gt.alloc_request(ring, ctx, &req); + if (ret) + return ret; + + ret = ring->init_context(req->ring, ctx); if (ret) { DRM_ERROR("ring init context: %d\n", ret); + i915_gem_request_unreference(req); ctx->engine[ring->id].ringbuf = NULL; ctx->engine[ring->id].state = NULL; goto error; } + + ret = i915_add_request_no_flush(req->ring); + if (ret) { + DRM_ERROR("ring init context: %d\n", ret); + i915_gem_request_unreference(req); + goto error; + } } ctx->rcs_initialized = true; -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 31/51] drm/i915: Update l3_remap to take a request structure
From: John Harrison Converted i915_gem_l3_remap() to take a request structure instead of a ring. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h |2 +- drivers/gpu/drm/i915/i915_gem.c |5 +++-- drivers/gpu/drm/i915/i915_gem_context.c |2 +- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index be1e143..8f20c37 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2809,7 +2809,7 @@ int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj); int __must_check i915_gem_init(struct drm_device *dev); int i915_gem_init_rings(struct drm_device *dev); int __must_check i915_gem_init_hw(struct drm_device *dev); -int i915_gem_l3_remap(struct intel_engine_cs *ring, int slice); +int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice); void i915_gem_init_swizzling(struct drm_device *dev); void i915_gem_cleanup_ringbuffer(struct drm_device *dev); int __must_check i915_gpu_idle(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index e8257dd..ab31cb0 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4639,8 +4639,9 @@ err: return ret; } -int i915_gem_l3_remap(struct intel_engine_cs *ring, int slice) +int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice) { + struct intel_engine_cs *ring = req->ring; struct drm_device *dev = ring->dev; struct drm_i915_private *dev_priv = dev->dev_private; u32 reg_base = GEN7_L3LOG_BASE + (slice * 0x200); @@ -4859,7 +4860,7 @@ i915_gem_init_hw(struct drm_device *dev) if (ring->id == RCS) { for (i = 0; i < NUM_L3_SLICES(dev); i++) - i915_gem_l3_remap(ring, i); + i915_gem_l3_remap(req, i); } ret = i915_ppgtt_init_ring(req); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index e4d75be..475d1fd 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -636,7 +636,7 @@ static int do_switch(struct drm_i915_gem_request *req) if (!(to->remap_slice & (1
[Intel-gfx] [PATCH 34/51] drm/i915: Update workarounds_emit() to take request structures
From: John Harrison Updated the *_ring_workarounds_emit() functions to take requests instead of ring/context pairs. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/intel_lrc.c| 14 +++--- drivers/gpu/drm/i915/intel_ringbuffer.c |6 +++--- 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 0c1a8e5..ee1f062 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1067,11 +1067,11 @@ int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf, return 0; } -static int intel_logical_ring_workarounds_emit(struct intel_engine_cs *ring, - struct intel_context *ctx) +static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req) { int ret, i; - struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf; + struct intel_engine_cs *ring = req->ring; + struct intel_ringbuffer *ringbuf = req->ringbuf; struct drm_device *dev = ring->dev; struct drm_i915_private *dev_priv = dev->dev_private; struct i915_workarounds *w = &dev_priv->workarounds; @@ -1080,11 +1080,11 @@ static int intel_logical_ring_workarounds_emit(struct intel_engine_cs *ring, return 0; ring->gpu_caches_dirty = true; - ret = logical_ring_flush_all_caches(ringbuf, ctx); + ret = logical_ring_flush_all_caches(ringbuf, req->ctx); if (ret) return ret; - ret = intel_logical_ring_begin(ringbuf, ctx, w->count * 2 + 2); + ret = intel_logical_ring_begin(ringbuf, req->ctx, w->count * 2 + 2); if (ret) return ret; @@ -1098,7 +1098,7 @@ static int intel_logical_ring_workarounds_emit(struct intel_engine_cs *ring, intel_logical_ring_advance(ringbuf); ring->gpu_caches_dirty = true; - ret = logical_ring_flush_all_caches(ringbuf, ctx); + ret = logical_ring_flush_all_caches(ringbuf, req->ctx); if (ret) return ret; @@ -1326,7 +1326,7 @@ static int gen8_init_rcs_context(struct drm_i915_gem_request *req) { int ret; - ret = intel_logical_ring_workarounds_emit(req->ring, req->ctx); + ret = intel_logical_ring_workarounds_emit(req); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index fc5bc48..32cae54 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -674,10 +674,10 @@ err: return ret; } -static int intel_ring_workarounds_emit(struct intel_engine_cs *ring, - struct intel_context *ctx) +static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req) { int ret, i; + struct intel_engine_cs *ring = req->ring; struct drm_device *dev = ring->dev; struct drm_i915_private *dev_priv = dev->dev_private; struct i915_workarounds *w = &dev_priv->workarounds; @@ -717,7 +717,7 @@ static int intel_rcs_ctx_init(struct drm_i915_gem_request *req) { int ret; - ret = intel_ring_workarounds_emit(req->ring, req->ctx); + ret = intel_ring_workarounds_emit(req); if (ret != 0) return ret; -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 06/51] drm/i915: Wrap request allocation with a function pointer
From: John Harrison In order to explicitly manage requests from creation to submission, it is necessary to be able to explicitly create them in the first place. This patch adds an indirection wrapper to the request creation function so that it can be called from generic code without having to worry about execlist vs legacy mode. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h |2 ++ drivers/gpu/drm/i915/i915_gem.c |2 ++ drivers/gpu/drm/i915/intel_lrc.c|6 +++--- drivers/gpu/drm/i915/intel_lrc.h|2 ++ drivers/gpu/drm/i915/intel_ringbuffer.c |6 +++--- drivers/gpu/drm/i915/intel_ringbuffer.h |2 ++ 6 files changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 143bc63..7959dfa 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1901,6 +1901,8 @@ struct drm_i915_private { /* Abstract the submission mechanism (legacy ringbuffer or execlists) away */ struct { + int (*alloc_request)(struct intel_engine_cs *ring, +struct intel_context *ctx); int (*do_execbuf)(struct i915_execbuffer_params *params, struct drm_i915_gem_execbuffer2 *args, struct list_head *vmas); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 2ebe914..9546992 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4855,11 +4855,13 @@ int i915_gem_init(struct drm_device *dev) } if (!i915.enable_execlists) { + dev_priv->gt.alloc_request = intel_ring_alloc_request; dev_priv->gt.do_execbuf = i915_gem_ringbuffer_submission; dev_priv->gt.init_rings = i915_gem_init_rings; dev_priv->gt.cleanup_ring = intel_cleanup_ring_buffer; dev_priv->gt.stop_ring = intel_stop_ring_buffer; } else { + dev_priv->gt.alloc_request = intel_logical_ring_alloc_request; dev_priv->gt.do_execbuf = intel_execlists_submission; dev_priv->gt.init_rings = intel_logical_rings_init; dev_priv->gt.cleanup_ring = intel_logical_ring_cleanup; diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 90400d0d..2f906a2 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -846,8 +846,8 @@ void intel_lr_context_unpin(struct intel_engine_cs *ring, } } -static int logical_ring_alloc_request(struct intel_engine_cs *ring, - struct intel_context *ctx) +int intel_logical_ring_alloc_request(struct intel_engine_cs *ring, +struct intel_context *ctx) { struct drm_i915_gem_request *request; struct drm_i915_private *dev_private = ring->dev->dev_private; @@ -1056,7 +1056,7 @@ int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf, return ret; /* Preallocate the olr before touching the ring */ - ret = logical_ring_alloc_request(ring, ctx); + ret = intel_logical_ring_alloc_request(ring, ctx); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h index ae2f3ed..2b1bf83 100644 --- a/drivers/gpu/drm/i915/intel_lrc.h +++ b/drivers/gpu/drm/i915/intel_lrc.h @@ -34,6 +34,8 @@ #define RING_CONTEXT_STATUS_PTR(ring) ((ring)->mmio_base+0x3a0) /* Logical Rings */ +int __must_check intel_logical_ring_alloc_request(struct intel_engine_cs *ring, + struct intel_context *ctx); void intel_logical_ring_stop(struct intel_engine_cs *ring); void intel_logical_ring_cleanup(struct intel_engine_cs *ring); int intel_logical_rings_init(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index ca9e7e6..c80e20d 100644 --- 
a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -2084,8 +2084,8 @@ int intel_ring_idle(struct intel_engine_cs *ring) return i915_wait_request(req); } -static int -intel_ring_alloc_request(struct intel_engine_cs *ring) +int +intel_ring_alloc_request(struct intel_engine_cs *ring, struct intel_context *ctx) { int ret; struct drm_i915_gem_request *request; @@ -2150,7 +2150,7 @@ int intel_ring_begin(struct intel_engine_cs *ring, return ret; /* Preallocate the olr before touching the ring */ - ret = intel_ring_alloc_request(ring); + ret = intel_ring_alloc_request(ring, NULL); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 26e5774..74df0fc 100
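The indirection this patch introduces is a per-mode function pointer chosen once at init time, so generic code never branches on execlists vs legacy. A compilable sketch of that dispatch, with hypothetical names standing in for the driver's types:

#include <stdio.h>

struct engine;		/* stand-in for intel_engine_cs */
struct context;		/* stand-in for intel_context   */

struct submission_ops {
	/* filled in once, depending on the submission mechanism in use */
	int (*alloc_request)(struct engine *ring, struct context *ctx);
};

static int legacy_alloc_request(struct engine *ring, struct context *ctx)
{
	(void)ring; (void)ctx;
	puts("legacy ringbuffer request allocated");
	return 0;
}

static int execlists_alloc_request(struct engine *ring, struct context *ctx)
{
	(void)ring; (void)ctx;
	puts("execlist request allocated");
	return 0;
}

int main(void)
{
	struct submission_ops gt;
	int enable_execlists = 1;	/* would come from the module option */

	gt.alloc_request = enable_execlists ? execlists_alloc_request
					    : legacy_alloc_request;

	/* callers no longer need to know which mode is active */
	return gt.alloc_request(NULL, NULL);
}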
[Intel-gfx] [PATCH 40/51] drm/i915: Update ring->add_request() to take a request structure
From: John Harrison Updated the various ring->add_request() implementations to take a request instead of a ring. This removes their reliance on the OLR to obtain the seqno value that the request should be tagged with. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem.c |2 +- drivers/gpu/drm/i915/intel_ringbuffer.c | 26 -- drivers/gpu/drm/i915/intel_ringbuffer.h |2 +- 3 files changed, 14 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index a728f91..616b34a 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2452,7 +2452,7 @@ int __i915_add_request(struct drm_i915_gem_request *request, if (i915.enable_execlists) ret = ring->emit_request(ringbuf, request); else - ret = ring->add_request(ring); + ret = ring->add_request(request); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 2f9ba79..ce1dab4 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1081,16 +1081,16 @@ static int gen6_signal(struct intel_engine_cs *signaller, /** * gen6_add_request - Update the semaphore mailbox registers - * - * @ring - ring that is adding a request - * @seqno - return seqno stuck into the ring + * + * @request - request to write to the ring * * Update the mailbox registers in the *other* rings with the current seqno. * This acts like a signal in the canonical semaphore. */ static int -gen6_add_request(struct intel_engine_cs *ring) +gen6_add_request(struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; int ret; if (ring->semaphore.signal) @@ -1103,8 +1103,7 @@ gen6_add_request(struct intel_engine_cs *ring) intel_ring_emit(ring, MI_STORE_DWORD_INDEX); intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT); - intel_ring_emit(ring, - i915_gem_request_get_seqno(ring->outstanding_lazy_request)); + intel_ring_emit(ring, i915_gem_request_get_seqno(req)); intel_ring_emit(ring, MI_USER_INTERRUPT); __intel_ring_advance(ring); @@ -1201,8 +1200,9 @@ do { \ } while (0) static int -pc_render_add_request(struct intel_engine_cs *ring) +pc_render_add_request(struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES; int ret; @@ -1222,8 +1222,7 @@ pc_render_add_request(struct intel_engine_cs *ring) PIPE_CONTROL_WRITE_FLUSH | PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE); intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT); - intel_ring_emit(ring, - i915_gem_request_get_seqno(ring->outstanding_lazy_request)); + intel_ring_emit(ring, i915_gem_request_get_seqno(req)); intel_ring_emit(ring, 0); PIPE_CONTROL_FLUSH(ring, scratch_addr); scratch_addr += 2 * CACHELINE_BYTES; /* write to separate cachelines */ @@ -1242,8 +1241,7 @@ pc_render_add_request(struct intel_engine_cs *ring) PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | PIPE_CONTROL_NOTIFY); intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT); - intel_ring_emit(ring, - i915_gem_request_get_seqno(ring->outstanding_lazy_request)); + intel_ring_emit(ring, i915_gem_request_get_seqno(req)); intel_ring_emit(ring, 0); __intel_ring_advance(ring); @@ -1474,8 +1472,9 @@ bsd_ring_flush(struct drm_i915_gem_request *req, } static int -i9xx_add_request(struct intel_engine_cs *ring) +i9xx_add_request(struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; int ret; ret = 
intel_ring_begin(ring, 4); @@ -1484,8 +1483,7 @@ i9xx_add_request(struct intel_engine_cs *ring) intel_ring_emit(ring, MI_STORE_DWORD_INDEX); intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT); - intel_ring_emit(ring, - i915_gem_request_get_seqno(ring->outstanding_lazy_request)); + intel_ring_emit(ring, i915_gem_request_get_seqno(req)); intel_ring_emit(ring, MI_USER_INTERRUPT); __intel_ring_advance(ring); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 373ebf3..e33e010 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -150,7 +150,7 @@ struct intel_engine_cs { int __must_check (*flush)(struct drm_i915_gem_request *req,
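The functional change here is where the seqno comes from: it is read from the request being emitted rather than fished out of the ring's outstanding lazy request (OLR). A tiny sketch of that difference, again with simplified stand-in types:

#include <stdio.h>

struct request {
	unsigned int seqno;
};

struct engine {
	/* the OLR: implicit per-ring state the old code depended on */
	struct request *outstanding_lazy_request;
};

/* Old shape: the seqno is taken from whatever the OLR happens to be. */
static void emit_seqno_from_olr(struct engine *ring)
{
	printf("emit seqno %u\n", ring->outstanding_lazy_request->seqno);
}

/* New shape: the seqno comes from the request actually being emitted. */
static void emit_seqno_from_request(struct request *req)
{
	printf("emit seqno %u\n", req->seqno);
}

int main(void)
{
	struct request req = { .seqno = 42 };
	struct engine ring = { .outstanding_lazy_request = &req };

	emit_seqno_from_olr(&ring);
	emit_seqno_from_request(&req);
	return 0;
}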
[Intel-gfx] [PATCH 35/51] drm/i915: Update flush_all_caches() to take request structures
From: John Harrison Updated the *_ring_flush_all_caches() functions to take requests instead of rings or ringbuf/context pairs. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem.c |4 ++-- drivers/gpu/drm/i915/intel_lrc.c| 11 +-- drivers/gpu/drm/i915/intel_lrc.h|3 +-- drivers/gpu/drm/i915/intel_ringbuffer.c |7 --- drivers/gpu/drm/i915/intel_ringbuffer.h |2 +- 5 files changed, 13 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index ab31cb0..a728f91 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2435,9 +2435,9 @@ int __i915_add_request(struct drm_i915_gem_request *request, */ if (flush_caches) { if (i915.enable_execlists) - ret = logical_ring_flush_all_caches(ringbuf, request->ctx); + ret = logical_ring_flush_all_caches(request); else - ret = intel_ring_flush_all_caches(ring); + ret = intel_ring_flush_all_caches(request); if (ret) return ret; } diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index ee1f062..a31cc33 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -760,16 +760,15 @@ void intel_logical_ring_stop(struct intel_engine_cs *ring) I915_WRITE_MODE(ring, _MASKED_BIT_DISABLE(STOP_RING)); } -int logical_ring_flush_all_caches(struct intel_ringbuffer *ringbuf, - struct intel_context *ctx) +int logical_ring_flush_all_caches(struct drm_i915_gem_request *req) { - struct intel_engine_cs *ring = ringbuf->ring; + struct intel_engine_cs *ring = req->ring; int ret; if (!ring->gpu_caches_dirty) return 0; - ret = ring->emit_flush(ringbuf, ctx, 0, I915_GEM_GPU_DOMAINS); + ret = ring->emit_flush(req->ringbuf, req->ctx, 0, I915_GEM_GPU_DOMAINS); if (ret) return ret; @@ -1080,7 +1079,7 @@ static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req) return 0; ring->gpu_caches_dirty = true; - ret = logical_ring_flush_all_caches(ringbuf, req->ctx); + ret = logical_ring_flush_all_caches(req); if (ret) return ret; @@ -1098,7 +1097,7 @@ static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req) intel_logical_ring_advance(ringbuf); ring->gpu_caches_dirty = true; - ret = logical_ring_flush_all_caches(ringbuf, req->ctx); + ret = logical_ring_flush_all_caches(req); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h index 975effb..474597e 100644 --- a/drivers/gpu/drm/i915/intel_lrc.h +++ b/drivers/gpu/drm/i915/intel_lrc.h @@ -41,8 +41,7 @@ void intel_logical_ring_stop(struct intel_engine_cs *ring); void intel_logical_ring_cleanup(struct intel_engine_cs *ring); int intel_logical_rings_init(struct drm_device *dev); -int logical_ring_flush_all_caches(struct intel_ringbuffer *ringbuf, - struct intel_context *ctx); +int logical_ring_flush_all_caches(struct drm_i915_gem_request *req); void intel_logical_ring_advance_and_submit( struct intel_ringbuffer *ringbuf, struct intel_context *ctx, diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 32cae54..6a53bfc 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -686,7 +686,7 @@ static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req) return 0; ring->gpu_caches_dirty = true; - ret = intel_ring_flush_all_caches(ring); + ret = intel_ring_flush_all_caches(req); if (ret) return ret; @@ -704,7 +704,7 @@ static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req) 
intel_ring_advance(ring); ring->gpu_caches_dirty = true; - ret = intel_ring_flush_all_caches(ring); + ret = intel_ring_flush_all_caches(req); if (ret) return ret; @@ -2758,8 +2758,9 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev) } int -intel_ring_flush_all_caches(struct intel_engine_cs *ring) +intel_ring_flush_all_caches(struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; int ret; if (!ring->gpu_caches_dirty) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 411cd76..8a672b1 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -413,7 +413,7 @@
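As a reminder of what these helpers do, the flush is guarded by the per-ring gpu_caches_dirty flag, so it is a no-op unless earlier work dirtied the caches. A minimal sketch of that guard, with the flush itself reduced to a printf:

#include <stdbool.h>
#include <stdio.h>

struct engine {
	bool gpu_caches_dirty;
};

struct request {
	struct engine *ring;
};

static int emit_flush(struct request *req)
{
	(void)req;
	printf("flush emitted\n");
	return 0;
}

/* Nothing to do unless work since the last flush dirtied the caches. */
static int flush_all_caches(struct request *req)
{
	int ret;

	if (!req->ring->gpu_caches_dirty)
		return 0;

	ret = emit_flush(req);
	if (ret)
		return ret;

	req->ring->gpu_caches_dirty = false;
	return 0;
}

int main(void)
{
	struct engine rcs = { .gpu_caches_dirty = true };
	struct request req = { .ring = &rcs };

	flush_all_caches(&req);	/* emits a flush */
	flush_all_caches(&req);	/* no-op */
	return 0;
}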
[Intel-gfx] [PATCH 20/51] drm/i915: Update ppgtt_init_ring() & context_enable() to take requests
From: John Harrison The final step in removing the OLR from i915_gem_init_hw() is to pass the newly allocated request structure in to each step rather than passing a ring structure. This patch updates both i915_ppgtt_init_ring() and i915_gem_context_enable() to take request pointers. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h |2 +- drivers/gpu/drm/i915/i915_gem.c |4 ++-- drivers/gpu/drm/i915/i915_gem_context.c |7 --- drivers/gpu/drm/i915/i915_gem_gtt.c |6 +++--- drivers/gpu/drm/i915/i915_gem_gtt.h |2 +- 5 files changed, 11 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 2229238..2eab9f4 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2974,7 +2974,7 @@ int __must_check i915_gem_context_init(struct drm_device *dev); void i915_gem_context_fini(struct drm_device *dev); void i915_gem_context_reset(struct drm_device *dev); int i915_gem_context_open(struct drm_device *dev, struct drm_file *file); -int i915_gem_context_enable(struct intel_engine_cs *ring); +int i915_gem_context_enable(struct drm_i915_gem_request *req); void i915_gem_context_close(struct drm_device *dev, struct drm_file *file); int i915_switch_context(struct intel_engine_cs *ring, struct intel_context *to); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 5f24ce1..e34672e 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4872,7 +4872,7 @@ i915_gem_init_hw(struct drm_device *dev) i915_gem_l3_remap(ring, i); } - ret = i915_ppgtt_init_ring(ring); + ret = i915_ppgtt_init_ring(req); if (ret && ret != -EIO) { DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret); i915_gem_request_unreference(req); @@ -4880,7 +4880,7 @@ i915_gem_init_hw(struct drm_device *dev) return ret; } - ret = i915_gem_context_enable(ring); + ret = i915_gem_context_enable(req); if (ret && ret != -EIO) { DRM_ERROR("Context enable ring #%d failed %d\n", i, ret); i915_gem_request_unreference(req); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index dd83d61..04d2a20 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -403,17 +403,18 @@ void i915_gem_context_fini(struct drm_device *dev) i915_gem_context_unreference(dctx); } -int i915_gem_context_enable(struct intel_engine_cs *ring) +int i915_gem_context_enable(struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; int ret; if (i915.enable_execlists) { if (ring->init_context == NULL) return 0; - ret = ring->init_context(ring, ring->default_context); + ret = ring->init_context(req->ring, ring->default_context); } else - ret = i915_switch_context(ring, ring->default_context); + ret = i915_switch_context(req->ring, ring->default_context); if (ret) { DRM_ERROR("ring init context: %d\n", ret); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 6048ed9..e7c9137 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1207,15 +1207,15 @@ int i915_ppgtt_init_hw(struct drm_device *dev) return 0; } -int i915_ppgtt_init_ring(struct intel_engine_cs *ring) +int i915_ppgtt_init_ring(struct drm_i915_gem_request *req) { - struct drm_i915_private *dev_priv = ring->dev->dev_private; + struct drm_i915_private *dev_priv = req->ring->dev->dev_private; struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt; if (!ppgtt) return 0; 
- return ppgtt->switch_mm(ppgtt, ring); + return ppgtt->switch_mm(ppgtt, req->ring); } struct i915_hw_ppgtt * diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 78a107e..0804bbc 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -300,7 +300,7 @@ void i915_global_gtt_cleanup(struct drm_device *dev); int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt); int i915_ppgtt_init_hw(struct drm_device *dev); -int i915_ppgtt_init_ring(struct intel_engine_cs *ring); +int i915_ppgtt_init_ring(struct drm_i915_gem_request *req); void i915_ppgtt_release(struct kref *kref); struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_device *dev, struct drm_i915_file_private *fpriv); -- 1.7.9.5 _
[Intel-gfx] [PATCH 39/51] drm/i915: Update ring->emit_flush() to take a request structure
From: John Harrison Updated the various ring->emit_flush() implementations to take a request instead of a ringbuf/context pair. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/intel_lrc.c| 17 - drivers/gpu/drm/i915/intel_ringbuffer.h |3 +-- 2 files changed, 9 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index a31cc33..21bda2d 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -560,8 +560,7 @@ static int logical_ring_invalidate_all_caches(struct drm_i915_gem_request *req) if (ring->gpu_caches_dirty) flush_domains = I915_GEM_GPU_DOMAINS; - ret = ring->emit_flush(req->ringbuf, req->ctx, - I915_GEM_GPU_DOMAINS, flush_domains); + ret = ring->emit_flush(req, I915_GEM_GPU_DOMAINS, flush_domains); if (ret) return ret; @@ -768,7 +767,7 @@ int logical_ring_flush_all_caches(struct drm_i915_gem_request *req) if (!ring->gpu_caches_dirty) return 0; - ret = ring->emit_flush(req->ringbuf, req->ctx, 0, I915_GEM_GPU_DOMAINS); + ret = ring->emit_flush(req, 0, I915_GEM_GPU_DOMAINS); if (ret) return ret; @@ -1201,18 +1200,18 @@ static void gen8_logical_ring_put_irq(struct intel_engine_cs *ring) spin_unlock_irqrestore(&dev_priv->irq_lock, flags); } -static int gen8_emit_flush(struct intel_ringbuffer *ringbuf, - struct intel_context *ctx, +static int gen8_emit_flush(struct drm_i915_gem_request *request, u32 invalidate_domains, u32 unused) { + struct intel_ringbuffer *ringbuf = request->ringbuf; struct intel_engine_cs *ring = ringbuf->ring; struct drm_device *dev = ring->dev; struct drm_i915_private *dev_priv = dev->dev_private; uint32_t cmd; int ret; - ret = intel_logical_ring_begin(ringbuf, ctx, 4); + ret = intel_logical_ring_begin(ringbuf, request->ctx, 4); if (ret) return ret; @@ -1240,11 +1239,11 @@ static int gen8_emit_flush(struct intel_ringbuffer *ringbuf, return 0; } -static int gen8_emit_flush_render(struct intel_ringbuffer *ringbuf, - struct intel_context *ctx, +static int gen8_emit_flush_render(struct drm_i915_gem_request *request, u32 invalidate_domains, u32 flush_domains) { + struct intel_ringbuffer *ringbuf = request->ringbuf; struct intel_engine_cs *ring = ringbuf->ring; u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES; u32 flags = 0; @@ -1268,7 +1267,7 @@ static int gen8_emit_flush_render(struct intel_ringbuffer *ringbuf, flags |= PIPE_CONTROL_GLOBAL_GTT_IVB; } - ret = intel_logical_ring_begin(ringbuf, ctx, 6); + ret = intel_logical_ring_begin(ringbuf, request->ctx, 6); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 824c71b..373ebf3 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -235,8 +235,7 @@ struct intel_engine_cs { u32 irq_keep_mask; /* bitmask for interrupts that should not be masked */ int (*emit_request)(struct intel_ringbuffer *ringbuf, struct drm_i915_gem_request *request); - int (*emit_flush)(struct intel_ringbuffer *ringbuf, - struct intel_context *ctx, + int (*emit_flush)(struct drm_i915_gem_request *request, u32 invalidate_domains, u32 flush_domains); int (*emit_bb_start)(struct intel_ringbuffer *ringbuf, -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 22/51] drm/i915: Update i915_switch_context() to take a request structure
From: John Harrison Now that the request is guaranteed to specify the context, it is possible to update the context switch code to use requests rather than ring and context pairs. This patch updates i915_switch_context() accordingly. Also removed the warning that the request's context must match the last context switch's context. As the context switch now gets the context object from the request structure, there is no longer any scope for the two to become out of step. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h|3 +-- drivers/gpu/drm/i915/i915_gem.c|4 +--- drivers/gpu/drm/i915/i915_gem_context.c| 19 +-- drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +- 4 files changed, 12 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 2eab9f4..e5132d3 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2976,8 +2976,7 @@ void i915_gem_context_reset(struct drm_device *dev); int i915_gem_context_open(struct drm_device *dev, struct drm_file *file); int i915_gem_context_enable(struct drm_i915_gem_request *req); void i915_gem_context_close(struct drm_device *dev, struct drm_file *file); -int i915_switch_context(struct intel_engine_cs *ring, - struct intel_context *to); +int i915_switch_context(struct drm_i915_gem_request *req); struct intel_context * i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id); void i915_gem_context_free(struct kref *ctx_ref); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 02b921b..5f17ade 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2469,8 +2469,6 @@ int __i915_add_request(struct intel_engine_cs *ring, WARN_ON(request->batch_obj && obj); request->batch_obj = obj; - WARN_ON(request->ctx != ring->last_context); - request->emitted_jiffies = jiffies; list_add_tail(&request->list, &ring->request_list); request->file_priv = NULL; @@ -3109,7 +3107,7 @@ int i915_gpu_idle(struct drm_device *dev) if (ret) return ret; - ret = i915_switch_context(req->ring, ring->default_context); + ret = i915_switch_context(req); if (ret) { i915_gem_request_unreference(req); return ret; diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 04d2a20..b326f8d 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -414,7 +414,7 @@ int i915_gem_context_enable(struct drm_i915_gem_request *req) ret = ring->init_context(req->ring, ring->default_context); } else - ret = i915_switch_context(req->ring, ring->default_context); + ret = i915_switch_context(req); if (ret) { DRM_ERROR("ring init context: %d\n", ret); @@ -693,8 +693,7 @@ unpin_out: /** * i915_switch_context() - perform a GPU context switch. - * @ring: ring for which we'll execute the context switch - * @to: the context to switch to + * @req: request for which we'll execute the context switch * * The context life cycle is simple. The context refcount is incremented and * decremented by 1 and create and destroy. If the context is in use by the GPU, @@ -705,25 +704,25 @@ unpin_out: * switched by writing to the ELSP and requests keep a reference to their * context. 
*/ -int i915_switch_context(struct intel_engine_cs *ring, - struct intel_context *to) +int i915_switch_context(struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; struct drm_i915_private *dev_priv = ring->dev->dev_private; WARN_ON(i915.enable_execlists); WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex)); - if (to->legacy_hw_ctx.rcs_state == NULL) { /* We have the fake context */ - if (to != ring->last_context) { - i915_gem_context_reference(to); + if (req->ctx->legacy_hw_ctx.rcs_state == NULL) { /* We have the fake context */ + if (req->ctx != ring->last_context) { + i915_gem_context_reference(req->ctx); if (ring->last_context) i915_gem_context_unreference(ring->last_context); - ring->last_context = to; + ring->last_context = req->ctx; } return 0; } - return do_switch(ring, to); + return do_switch(req->ring, req->ctx); } static bool contexts_enabled(struct drm_device *dev) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/
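For the curious, the last_context bookkeeping for the "fake context" case looks roughly like the sketch below once everything is derived from the request; the reference-count helpers and the has_hw_state flag are simplified stand-ins for the real ones.

#include <stddef.h>
#include <stdio.h>

struct context {
	int refcount;
	int has_hw_state;	/* stand-in for legacy_hw_ctx.rcs_state != NULL */
};

struct engine {
	struct context *last_context;
};

struct request {
	struct engine  *ring;
	struct context *ctx;
};

static void context_reference(struct context *c)   { c->refcount++; }
static void context_unreference(struct context *c) { c->refcount--; }

static int do_switch(struct request *req)
{
	printf("full context switch to %p\n", (void *)req->ctx);
	return 0;
}

/* Both the ring and the target context come from the request, so the two
 * can no longer get out of step. */
static int switch_context(struct request *req)
{
	struct engine *ring = req->ring;

	if (!req->ctx->has_hw_state) {		/* the "fake" context */
		if (req->ctx != ring->last_context) {
			context_reference(req->ctx);
			if (ring->last_context)
				context_unreference(ring->last_context);
			ring->last_context = req->ctx;
		}
		return 0;
	}

	return do_switch(req);
}

int main(void)
{
	struct context fake = { .refcount = 1, .has_hw_state = 0 };
	struct engine rcs = { .last_context = NULL };
	struct request req = { .ring = &rcs, .ctx = &fake };

	return switch_context(&req);
}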
[Intel-gfx] [PATCH 42/51] drm/i915: Update ring->dispatch_execbuffer() to take a request structure
From: John Harrison Updated the various ring->dispatch_execbuffer() implementations to take a request instead of a ring. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_execbuffer.c |4 ++-- drivers/gpu/drm/i915/i915_gem_render_state.c |3 +-- drivers/gpu/drm/i915/intel_ringbuffer.c | 18 -- drivers/gpu/drm/i915/intel_ringbuffer.h |2 +- 4 files changed, 16 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index a79c893..8b4f8a9 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1265,14 +1265,14 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params, if (ret) goto error; - ret = ring->dispatch_execbuffer(ring, + ret = ring->dispatch_execbuffer(params->request, exec_start, exec_len, params->dispatch_flags); if (ret) goto error; } } else { - ret = ring->dispatch_execbuffer(ring, + ret = ring->dispatch_execbuffer(params->request, exec_start, exec_len, params->dispatch_flags); if (ret) diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index 866274c..cdf2fee 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c @@ -164,8 +164,7 @@ int i915_gem_render_state_init(struct drm_i915_gem_request *req) if (so.rodata == NULL) return 0; - ret = req->ring->dispatch_execbuffer(req->ring, -so.ggtt_offset, + ret = req->ring->dispatch_execbuffer(req, so.ggtt_offset, so.rodata->batch_items * 4, I915_DISPATCH_SECURE); if (ret) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index ce1dab4..cf23767 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1615,10 +1615,11 @@ gen8_ring_put_irq(struct intel_engine_cs *ring) } static int -i965_dispatch_execbuffer(struct intel_engine_cs *ring, +i965_dispatch_execbuffer(struct drm_i915_gem_request *req, u64 offset, u32 length, unsigned dispatch_flags) { + struct intel_engine_cs *ring = req->ring; int ret; ret = intel_ring_begin(ring, 2); @@ -1641,10 +1642,11 @@ i965_dispatch_execbuffer(struct intel_engine_cs *ring, #define I830_TLB_ENTRIES (2) #define I830_WA_SIZE max(I830_TLB_ENTRIES*4096, I830_BATCH_LIMIT) static int -i830_dispatch_execbuffer(struct intel_engine_cs *ring, +i830_dispatch_execbuffer(struct drm_i915_gem_request *req, u64 offset, u32 len, unsigned dispatch_flags) { + struct intel_engine_cs *ring = req->ring; u32 cs_offset = ring->scratch.gtt_offset; int ret; @@ -1703,10 +1705,11 @@ i830_dispatch_execbuffer(struct intel_engine_cs *ring, } static int -i915_dispatch_execbuffer(struct intel_engine_cs *ring, +i915_dispatch_execbuffer(struct drm_i915_gem_request *req, u64 offset, u32 len, unsigned dispatch_flags) { + struct intel_engine_cs *ring = req->ring; int ret; ret = intel_ring_begin(ring, 2); @@ -2283,10 +2286,11 @@ static int gen6_bsd_ring_flush(struct drm_i915_gem_request *req, } static int -gen8_ring_dispatch_execbuffer(struct intel_engine_cs *ring, +gen8_ring_dispatch_execbuffer(struct drm_i915_gem_request *req, u64 offset, u32 len, unsigned dispatch_flags) { + struct intel_engine_cs *ring = req->ring; bool ppgtt = USES_PPGTT(ring->dev) && !(dispatch_flags & I915_DISPATCH_SECURE); int ret; @@ -2306,10 +2310,11 @@ gen8_ring_dispatch_execbuffer(struct intel_engine_cs *ring, } static int -hsw_ring_dispatch_execbuffer(struct intel_engine_cs *ring, 
+hsw_ring_dispatch_execbuffer(struct drm_i915_gem_request *req, u64 offset, u32 len, unsigned dispatch_flags) { + struct intel_engine_cs *ring = req->ring; int ret; ret = intel_ring_begin(ring, 2); @@ -2328,10 +2333,11 @@ hsw_ring_dispatch_execbuffer(struct intel_engine_cs *ring, } static int -gen6_ring_dispatch_execbuffer(struct intel_engine_cs *ring, +gen6_
[Intel-gfx] [PATCH 23/51] drm/i915: Update do_switch() to take a request structure
From: John Harrison Updated do_switch() to take a request pointer instead of a ring/context pair. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_context.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index b326f8d..eedb994 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -556,9 +556,10 @@ mi_set_context(struct intel_engine_cs *ring, return ret; } -static int do_switch(struct intel_engine_cs *ring, -struct intel_context *to) +static int do_switch(struct drm_i915_gem_request *req) { + struct intel_context *to = req->ctx; + struct intel_engine_cs *ring = req->ring; struct drm_i915_private *dev_priv = ring->dev->dev_private; struct intel_context *from = ring->last_context; u32 hw_flags = 0; @@ -591,7 +592,7 @@ static int do_switch(struct intel_engine_cs *ring, if (to->ppgtt) { trace_switch_mm(ring, to); - ret = to->ppgtt->switch_mm(to->ppgtt, ring); + ret = to->ppgtt->switch_mm(to->ppgtt, req->ring); if (ret) goto unpin_out; } @@ -627,7 +628,7 @@ static int do_switch(struct intel_engine_cs *ring, if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) hw_flags |= MI_RESTORE_INHIBIT; - ret = mi_set_context(ring, to, hw_flags); + ret = mi_set_context(req->ring, to, hw_flags); if (ret) goto unpin_out; @@ -635,7 +636,7 @@ static int do_switch(struct intel_engine_cs *ring, if (!(to->remap_slice & (1legacy_hw_ctx.rcs_state->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; - i915_vma_move_to_active(i915_gem_obj_to_ggtt(from->legacy_hw_ctx.rcs_state), ring); + i915_vma_move_to_active(i915_gem_obj_to_ggtt(from->legacy_hw_ctx.rcs_state), req->ring); /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the * whole damn pipeline, we don't need to explicitly mark the * object dirty. The only exception is that the context must be @@ -677,7 +678,7 @@ done: if (uninitialized) { if (ring->init_context) { - ret = ring->init_context(ring, to); + ret = ring->init_context(req->ring, to); if (ret) DRM_ERROR("ring init context: %d\n", ret); } @@ -722,7 +723,7 @@ int i915_switch_context(struct drm_i915_gem_request *req) return 0; } - return do_switch(req->ring, req->ctx); + return do_switch(req); } static bool contexts_enabled(struct drm_device *dev) -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 30/51] drm/i915: Update [vma|object]_move_to_active() to take request structures
From: John Harrison Now that everything above has been converted to use request structures, it is possible to update the lower level move_to_active() functions to be request based as well. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h |2 +- drivers/gpu/drm/i915/i915_gem.c | 17 - drivers/gpu/drm/i915/i915_gem_context.c |2 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +- drivers/gpu/drm/i915/i915_gem_render_state.c |2 +- drivers/gpu/drm/i915/intel_lrc.c |2 +- 6 files changed, 13 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index b7c01e2..be1e143 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2731,7 +2731,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev); int i915_gem_object_sync(struct drm_i915_gem_object *obj, struct drm_i915_gem_request *to_req); void i915_vma_move_to_active(struct i915_vma *vma, -struct intel_engine_cs *ring); +struct drm_i915_gem_request *req); int i915_gem_dumb_create(struct drm_file *file_priv, struct drm_device *dev, struct drm_mode_create_dumb *args); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 8b0bfbd..e8257dd 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2264,17 +2264,16 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) static void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, - struct intel_engine_cs *ring) + struct drm_i915_gem_request *req) { - struct drm_i915_gem_request *req; - struct intel_engine_cs *old_ring; + struct intel_engine_cs *new_ring, *old_ring; - BUG_ON(ring == NULL); + BUG_ON(req == NULL); - req = intel_ring_get_request(ring); + new_ring = i915_gem_request_get_ring(req); old_ring = i915_gem_request_get_ring(obj->last_read_req); - if (old_ring != ring && obj->last_write_req) { + if (old_ring != new_ring && obj->last_write_req) { /* Keep the request relative to the current ring */ i915_gem_request_assign(&obj->last_write_req, req); } @@ -2285,16 +2284,16 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj, obj->active = 1; } - list_move_tail(&obj->ring_list, &ring->active_list); + list_move_tail(&obj->ring_list, &new_ring->active_list); i915_gem_request_assign(&obj->last_read_req, req); } void i915_vma_move_to_active(struct i915_vma *vma, -struct intel_engine_cs *ring) +struct drm_i915_gem_request *req) { list_move_tail(&vma->mm_list, &vma->vm->active_list); - return i915_gem_object_move_to_active(vma->obj, ring); + return i915_gem_object_move_to_active(vma->obj, req); } static void diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 938cd26..e4d75be 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -652,7 +652,7 @@ static int do_switch(struct drm_i915_gem_request *req) */ if (from != NULL) { from->legacy_hw_ctx.rcs_state->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION; - i915_vma_move_to_active(i915_gem_obj_to_ggtt(from->legacy_hw_ctx.rcs_state), req->ring); + i915_vma_move_to_active(i915_gem_obj_to_ggtt(from->legacy_hw_ctx.rcs_state), req); /* As long as MI_SET_CONTEXT is serializing, ie. it flushes the * whole damn pipeline, we don't need to explicitly mark the * object dirty. 
The only exception is that the context must be diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 15e33a9..dc13751 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -966,7 +966,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas, obj->base.pending_read_domains |= obj->base.read_domains; obj->base.read_domains = obj->base.pending_read_domains; - i915_vma_move_to_active(vma, ring); + i915_vma_move_to_active(vma, req); if (obj->base.write_domain) { obj->dirty = 1; i915_gem_request_assign(&obj->last_write_req, req); diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c index 85cc746..866274c 100644 --- a/drivers/gpu/drm/i915/i915_gem_render_state.c +++ b/driv
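The end result of this patch is that an object remembers the request that last read it rather than a bare ring pointer, with the ring still reachable through the request. A small sketch of that tracking, with request_assign() reduced to a plain pointer store (the real helper also handles reference counting):

#include <stddef.h>
#include <stdio.h>

struct engine {
	const char *name;
};

struct request {
	struct engine *ring;
};

struct gem_object {
	int active;
	struct request *last_read_req;	/* was: a ring pointer */
};

/* Simplified stand-in for i915_gem_request_assign(); the real helper also
 * adjusts request reference counts. */
static void request_assign(struct request **slot, struct request *req)
{
	*slot = req;
}

static void move_to_active(struct gem_object *obj, struct request *req)
{
	if (!obj->active)
		obj->active = 1;

	/* the engine is still available when needed */
	printf("object active on %s\n", req->ring->name);
	request_assign(&obj->last_read_req, req);
}

int main(void)
{
	struct engine rcs = { .name = "render" };
	struct request req = { .ring = &rcs };
	struct gem_object obj = { 0 };

	move_to_active(&obj, &req);
	return 0;
}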
[Intel-gfx] [PATCH 32/51] drm/i915: Update mi_set_context() to take a request structure
From: John Harrison Updated mi_set_context() to take a request structure instead of a ring and context pair. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_context.c |9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 475d1fd..9e66fac 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -472,10 +472,9 @@ i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id) } static inline int -mi_set_context(struct intel_engine_cs *ring, - struct intel_context *new_context, - u32 hw_flags) +mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags) { + struct intel_engine_cs *ring = req->ring; u32 flags = hw_flags | MI_MM_SPACE_GTT; const int num_rings = /* Use an extended w/a on ivb+ if signalling from other rings */ @@ -527,7 +526,7 @@ mi_set_context(struct intel_engine_cs *ring, intel_ring_emit(ring, MI_NOOP); intel_ring_emit(ring, MI_SET_CONTEXT); - intel_ring_emit(ring, i915_gem_obj_ggtt_offset(new_context->legacy_hw_ctx.rcs_state) | + intel_ring_emit(ring, i915_gem_obj_ggtt_offset(req->ctx->legacy_hw_ctx.rcs_state) | flags); /* * w/a: MI_SET_CONTEXT must always be followed by MI_NOOP @@ -628,7 +627,7 @@ static int do_switch(struct drm_i915_gem_request *req) if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) hw_flags |= MI_RESTORE_INHIBIT; - ret = mi_set_context(req->ring, to, hw_flags); + ret = mi_set_context(req, hw_flags); if (ret) goto unpin_out; -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 19/51] drm/i915: Add explicit request management to i915_gem_init_hw()
From: John Harrison Now that a single per ring loop is being done for all the different initialisation steps in i915_gem_init_hw(), it is possible to add proper request management as well. The last remaining issue is that the context enable call eventually ends up within *_render_state_init() and this does its own private _i915_add_request() call. This patch adds explicit request creation and submission to the top level loop and removes the add_request() from deep within the sub-functions. Note that the old add_request() call was being passed a batch object. This is now explicitly written to the request object instead. A warning has also been added to i915_add_request() to ensure that there is never an attempt to add two batch objects to a single request - e.g. because render_state_init() was called during execbuffer processing. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h |3 ++- drivers/gpu/drm/i915/i915_gem.c | 18 ++ drivers/gpu/drm/i915/i915_gem_gtt.c |7 +-- drivers/gpu/drm/i915/i915_gem_render_state.c |3 ++- drivers/gpu/drm/i915/intel_lrc.c |8 +++- 5 files changed, 26 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 099a3ee..2229238 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2151,7 +2151,8 @@ struct drm_i915_gem_request { struct intel_context *ctx; struct intel_ringbuffer *ringbuf; - /** Batch buffer related to this request if any */ + /** Batch buffer related to this request if any (used for + error state dump only) */ struct drm_i915_gem_object *batch_obj; /** Time at which this request was emitted, in jiffies. */ diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 1c711c0..5f24ce1 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2466,6 +2466,7 @@ int __i915_add_request(struct intel_engine_cs *ring, * inactive_list and lose its active reference. Hence we do not need * to explicitly hold another reference here.
*/ + WARN_ON(request->batch_obj && obj); request->batch_obj = obj; if (!i915.enable_execlists) { @@ -4856,8 +4857,16 @@ i915_gem_init_hw(struct drm_device *dev) /* Now it is safe to go back round and do everything else: */ for_each_ring(ring, dev_priv, i) { + struct drm_i915_gem_request *req; + WARN_ON(!ring->default_context); + ret = dev_priv->gt.alloc_request(ring, ring->default_context, &req); + if (ret) { + i915_gem_cleanup_ringbuffer(dev); + return ret; + } + if (ring->id == RCS) { for (i = 0; i < NUM_L3_SLICES(dev); i++) i915_gem_l3_remap(ring, i); @@ -4866,6 +4875,7 @@ i915_gem_init_hw(struct drm_device *dev) ret = i915_ppgtt_init_ring(ring); if (ret && ret != -EIO) { DRM_ERROR("PPGTT enable ring #%d failed %d\n", i, ret); + i915_gem_request_unreference(req); i915_gem_cleanup_ringbuffer(dev); return ret; } @@ -4873,8 +4883,16 @@ i915_gem_init_hw(struct drm_device *dev) ret = i915_gem_context_enable(ring); if (ret && ret != -EIO) { DRM_ERROR("Context enable ring #%d failed %d\n", i, ret); + i915_gem_request_unreference(req); i915_gem_cleanup_ringbuffer(dev); + return ret; + } + ret = i915_add_request_no_flush(ring); + if (ret) { + DRM_ERROR("Add request ring #%d failed: %d\n", i, ret); + i915_gem_request_unreference(req); + i915_gem_cleanup_ringbuffer(dev); return ret; } } diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 1528e77..6048ed9 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1211,16 +1211,11 @@ int i915_ppgtt_init_ring(struct intel_engine_cs *ring) { struct drm_i915_private *dev_priv = ring->dev->dev_private; struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt; - int ret = 0; if (!ppgtt) return 0; - ret = ppgtt->switch_mm(ppgtt, ring); - if (ret != 0) - return ret; - - return ret; + return ppgtt->switch_mm(ppgtt, ring); } struct i915_hw_ppgtt * diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c ind
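The per-ring loop therefore follows a simple allocate / use / submit-or-unreference lifecycle. Below is a self-contained sketch of that flow with the init steps stubbed out; the helper names are illustrative, not the driver's.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

struct request { int unused; };

static int alloc_request(struct request **out)
{
	*out = malloc(sizeof(**out));
	return *out ? 0 : -ENOMEM;
}

static void request_unreference(struct request *req) { free(req); }
static int ppgtt_init(struct request *req)     { (void)req; return 0; }
static int context_enable(struct request *req) { (void)req; return 0; }

static int add_request(struct request *req)
{
	/* submission hands the request over to the active list; modelled
	 * here by simply freeing it */
	puts("request submitted");
	free(req);
	return 0;
}

static int init_one_ring(void)
{
	struct request *req;
	int ret;

	ret = alloc_request(&req);
	if (ret)
		return ret;

	ret = ppgtt_init(req);
	if (ret)
		goto err;

	ret = context_enable(req);
	if (ret)
		goto err;

	return add_request(req);

err:
	/* any failure before submission must drop the reference explicitly */
	request_unreference(req);
	return ret;
}

int main(void)
{
	return init_one_ring();
}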
[Intel-gfx] [PATCH 37/51] drm/i915: Update ring->flush() to take a request structure
From: John Harrison Updated the various ring->flush() functions to take a request instead of a ring. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_context.c |2 +- drivers/gpu/drm/i915/i915_gem_gtt.c |6 +++--- drivers/gpu/drm/i915/intel_ringbuffer.c | 30 +++--- drivers/gpu/drm/i915/intel_ringbuffer.h |2 +- 4 files changed, 24 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 816a442..384f481 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -489,7 +489,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags) * itlb_before_ctx_switch. */ if (IS_GEN6(ring->dev)) { - ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, 0); + ret = ring->flush(req, I915_GEM_GPU_DOMAINS, 0); if (ret) return ret; } diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 89bbc2c..e3a65c3 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -780,7 +780,7 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt, int ret; /* NB: TLBs must be flushed and invalidated before a switch */ - ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS); + ret = ring->flush(req, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS); if (ret) return ret; @@ -806,7 +806,7 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt, int ret; /* NB: TLBs must be flushed and invalidated before a switch */ - ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS); + ret = ring->flush(req, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS); if (ret) return ret; @@ -824,7 +824,7 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt, /* XXX: RCS is the only one to auto invalidate the TLBs?
*/ if (ring->id != RCS) { - ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS); + ret = ring->flush(req, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS); if (ret) return ret; } diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 6a53bfc..bc3c0e6 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -91,10 +91,11 @@ static void __intel_ring_advance(struct intel_engine_cs *ring) } static int -gen2_render_ring_flush(struct intel_engine_cs *ring, +gen2_render_ring_flush(struct drm_i915_gem_request *req, u32 invalidate_domains, u32 flush_domains) { + struct intel_engine_cs *ring = req->ring; u32 cmd; int ret; @@ -117,10 +118,11 @@ gen2_render_ring_flush(struct intel_engine_cs *ring, } static int -gen4_render_ring_flush(struct intel_engine_cs *ring, +gen4_render_ring_flush(struct drm_i915_gem_request *req, u32 invalidate_domains, u32 flush_domains) { + struct intel_engine_cs *ring = req->ring; struct drm_device *dev = ring->dev; u32 cmd; int ret; @@ -247,9 +249,10 @@ intel_emit_post_sync_nonzero_flush(struct intel_engine_cs *ring) } static int -gen6_render_ring_flush(struct intel_engine_cs *ring, - u32 invalidate_domains, u32 flush_domains) +gen6_render_ring_flush(struct drm_i915_gem_request *req, + u32 invalidate_domains, u32 flush_domains) { + struct intel_engine_cs *ring = req->ring; u32 flags = 0; u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES; int ret; @@ -341,9 +344,10 @@ static int gen7_ring_fbc_flush(struct intel_engine_cs *ring, u32 value) } static int -gen7_render_ring_flush(struct intel_engine_cs *ring, +gen7_render_ring_flush(struct drm_i915_gem_request *req, u32 invalidate_domains, u32 flush_domains) { + struct intel_engine_cs *ring = req->ring; u32 flags = 0; u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES; int ret; @@ -426,9 +430,10 @@ gen8_emit_pipe_control(struct intel_engine_cs *ring, } static int -gen8_render_ring_flush(struct intel_engine_cs *ring, +gen8_render_ring_flush(struct drm_i915_gem_request *req, u32 invalidate_domains, u32 flush_domains) { + struct intel_engine_cs *ring = req->ring; u32 flags = 0; u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES; int ret; @@ -1448,10 +1453,11 @@ void intel_ring_setup_status_page(struct intel_engine_cs *ring) } static int -bsd_ring_flush(struct intel_engine_cs *ring, +bsd_r
[Intel-gfx] [PATCH 43/51] drm/i915: Update ring->emit_bb_start() to take a request structure
From: John Harrison Updated the ring->emit_bb_start() implementation to take a request instead of a ringbuf/context pair. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/intel_lrc.c| 12 +--- drivers/gpu/drm/i915/intel_ringbuffer.h |3 +-- 2 files changed, 6 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 02769f8..54e6a25 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -696,7 +696,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params, exec_start = params->batch_obj_vm_offset + args->batch_start_offset; - ret = ring->emit_bb_start(ringbuf, params->ctx, exec_start, params->dispatch_flags); + ret = ring->emit_bb_start(params->request, exec_start, params->dispatch_flags); if (ret) return ret; @@ -1146,14 +1146,14 @@ static int gen8_init_render_ring(struct intel_engine_cs *ring) return init_workarounds_ring(ring); } -static int gen8_emit_bb_start(struct intel_ringbuffer *ringbuf, - struct intel_context *ctx, +static int gen8_emit_bb_start(struct drm_i915_gem_request *req, u64 offset, unsigned dispatch_flags) { + struct intel_ringbuffer *ringbuf = req->ringbuf; bool ppgtt = !(dispatch_flags & I915_DISPATCH_SECURE); int ret; - ret = intel_logical_ring_begin(ringbuf, ctx, 4); + ret = intel_logical_ring_begin(ringbuf, req->ctx, 4); if (ret) return ret; @@ -1595,9 +1595,7 @@ static int intel_lr_context_render_state_init(struct drm_i915_gem_request *req) if (so.rodata == NULL) return 0; - ret = req->ring->emit_bb_start(req->ringbuf, - req->ctx, - so.ggtt_offset, + ret = req->ring->emit_bb_start(req, so.ggtt_offset, I915_DISPATCH_SECURE); if (ret) goto out; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 545b867..514ddcb 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -237,8 +237,7 @@ struct intel_engine_cs { int (*emit_flush)(struct drm_i915_gem_request *request, u32 invalidate_domains, u32 flush_domains); - int (*emit_bb_start)(struct intel_ringbuffer *ringbuf, -struct intel_context *ctx, + int (*emit_bb_start)(struct drm_i915_gem_request *req, u64 offset, unsigned dispatch_flags); /** -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 09/51] drm/i915: Add request to execbuf params and add explicit cleanup
From: John Harrison Rather than just having a local request variable in the execbuff code, the request pointer is now stored in the execbuff params structure. Also added explicit cleanup of the request (plus wiping the OLR to match) in the error case. This means that the execbuff code is no longer dependent upon the OLR keeping track of the request so as to not leak it when things do go wrong. Note that in the success case, the i915_add_request() at the end of the submission function will tidy up the request and clear the OLR. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_drv.h|1 + drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +++-- 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 92c183f..df9b5d7 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1648,6 +1648,7 @@ struct i915_execbuffer_params { struct intel_engine_cs *ring; struct drm_i915_gem_object *batch_obj; struct intel_context*ctx; + struct drm_i915_gem_request *request; }; struct drm_i915_private { diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 37dcc6f..10462f6 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1353,7 +1353,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct i915_address_space *vm; struct i915_execbuffer_params params_master; /* XXX: will be removed later */ struct i915_execbuffer_params *params = ¶ms_master; - struct drm_i915_gem_request *request; const u32 ctx_id = i915_execbuffer2_get_context_id(*args); u32 dispatch_flags; int ret; @@ -1532,7 +1531,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, params->batch_obj_vm_offset = i915_gem_obj_offset(batch_obj, vm); /* Allocate a request for this batch buffer nice and early. */ - ret = dev_priv->gt.alloc_request(ring, ctx, &request); + ret = dev_priv->gt.alloc_request(ring, ctx, ¶ms->request); if (ret) goto err; @@ -1565,6 +1564,16 @@ err: i915_gem_context_unreference(ctx); eb_destroy(eb); + /* +* If the request was created but not successfully submitted then it +* must be freed again. If it was submitted then it is being tracked +* on the active request list and no clean up is required here. +*/ + if (ret && params->request) { + i915_gem_request_unreference(params->request); + ring->outstanding_lazy_request = NULL; + } + mutex_unlock(&dev->struct_mutex); pre_mutex_err: -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
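The ownership rule this patch spells out can be reduced to a few lines: the request lives in the params structure, and only a request that was created but never submitted needs freeing in the error path. A hedged sketch, with allocation and submission stubbed:

#include <stddef.h>
#include <stdlib.h>

struct request { int unused; };

struct execbuffer_params {
	struct request *request;	/* owned here rather than in a local */
};

static int alloc_request(struct request **out)
{
	*out = calloc(1, sizeof(**out));
	return *out ? 0 : -1;
}

static void request_unreference(struct request *req) { free(req); }

static int submit(struct execbuffer_params *p)
{
	/* on success the request is handed over to the active list;
	 * modelled here by freeing it */
	free(p->request);
	return 0;
}

static int do_execbuffer(void)
{
	struct execbuffer_params params = { .request = NULL };
	int ret;

	ret = alloc_request(&params.request);
	if (ret)
		goto err;

	ret = submit(&params);

err:
	/* created but never submitted: free it; once submitted it is
	 * tracked elsewhere and needs no cleanup here */
	if (ret && params.request)
		request_unreference(params.request);

	return ret;
}

int main(void)
{
	return do_execbuffer();
}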
[Intel-gfx] [PATCH 45/51] drm/i915: Update ring->signal() to take a request structure
From: John Harrison Updated the various ring->signal() implementations to take a request instead of a ring. This removes their reliance on the OLR to obtain the seqno value that should be used for the signal. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/intel_ringbuffer.c | 20 ++-- drivers/gpu/drm/i915/intel_ringbuffer.h |2 +- 2 files changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index aa521c7..e04e881 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -964,10 +964,11 @@ static void render_ring_cleanup(struct intel_engine_cs *ring) intel_fini_pipe_control(ring); } -static int gen8_rcs_signal(struct intel_engine_cs *signaller, +static int gen8_rcs_signal(struct drm_i915_gem_request *signaller_req, unsigned int num_dwords) { #define MBOX_UPDATE_DWORDS 8 + struct intel_engine_cs *signaller = signaller_req->ring; struct drm_device *dev = signaller->dev; struct drm_i915_private *dev_priv = dev->dev_private; struct intel_engine_cs *waiter; @@ -987,8 +988,7 @@ static int gen8_rcs_signal(struct intel_engine_cs *signaller, if (gtt_offset == MI_SEMAPHORE_SYNC_INVALID) continue; - seqno = i915_gem_request_get_seqno( - signaller->outstanding_lazy_request); + seqno = i915_gem_request_get_seqno(signaller_req); intel_ring_emit(signaller, GFX_OP_PIPE_CONTROL(6)); intel_ring_emit(signaller, PIPE_CONTROL_GLOBAL_GTT_IVB | PIPE_CONTROL_QW_WRITE | @@ -1005,10 +1005,11 @@ static int gen8_rcs_signal(struct intel_engine_cs *signaller, return 0; } -static int gen8_xcs_signal(struct intel_engine_cs *signaller, +static int gen8_xcs_signal(struct drm_i915_gem_request *signaller_req, unsigned int num_dwords) { #define MBOX_UPDATE_DWORDS 6 + struct intel_engine_cs *signaller = signaller_req->ring; struct drm_device *dev = signaller->dev; struct drm_i915_private *dev_priv = dev->dev_private; struct intel_engine_cs *waiter; @@ -1028,8 +1029,7 @@ static int gen8_xcs_signal(struct intel_engine_cs *signaller, if (gtt_offset == MI_SEMAPHORE_SYNC_INVALID) continue; - seqno = i915_gem_request_get_seqno( - signaller->outstanding_lazy_request); + seqno = i915_gem_request_get_seqno(signaller_req); intel_ring_emit(signaller, (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW); intel_ring_emit(signaller, lower_32_bits(gtt_offset) | @@ -1044,9 +1044,10 @@ static int gen8_xcs_signal(struct intel_engine_cs *signaller, return 0; } -static int gen6_signal(struct intel_engine_cs *signaller, +static int gen6_signal(struct drm_i915_gem_request *signaller_req, unsigned int num_dwords) { + struct intel_engine_cs *signaller = signaller_req->ring; struct drm_device *dev = signaller->dev; struct drm_i915_private *dev_priv = dev->dev_private; struct intel_engine_cs *useless; @@ -1064,8 +1065,7 @@ static int gen6_signal(struct intel_engine_cs *signaller, for_each_ring(useless, dev_priv, i) { u32 mbox_reg = signaller->semaphore.mbox.signal[i]; if (mbox_reg != GEN6_NOSYNC) { - u32 seqno = i915_gem_request_get_seqno( - signaller->outstanding_lazy_request); + u32 seqno = i915_gem_request_get_seqno(signaller_req); intel_ring_emit(signaller, MI_LOAD_REGISTER_IMM(1)); intel_ring_emit(signaller, mbox_reg); intel_ring_emit(signaller, seqno); @@ -1094,7 +1094,7 @@ gen6_add_request(struct drm_i915_gem_request *req) int ret; if (ring->semaphore.signal) - ret = ring->semaphore.signal(ring, 4); + ret = ring->semaphore.signal(req, 4); else ret = intel_ring_begin(ring, 4); diff --git 
a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 4b4fd2d..70f3f5d 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -222,7 +222,7 @@ struct intel_engine_cs { int (*sync_to)(struct drm_i915_gem_request *to_req, struct intel_engine_cs *from, u32 seqno); - int (*signal)(struct intel_engine_cs *signaller, + int (*signal)(struct drm_i915_gem_request *signaller_req,
[Intel-gfx] [PATCH 49/51] drm/i915: Update intel_logical_ring_begin() to take a request structure
From: John Harrison Now that everything above has been converted to use requests, intel_logical_ring_begin() can be updated to take a request instead of a ringbuf/context pair. This also means that it no longer needs to lazily allocate a request if no-one happens to have done it earlier. Note that this change makes the execlist signature the same as the legacy version. Thus the two functions could be merged into a ring->begin() wrapper if required. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/intel_lrc.c | 35 +-- drivers/gpu/drm/i915/intel_lrc.h |3 --- 2 files changed, 17 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 54e6a25..c5408bc 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -203,6 +203,8 @@ enum { }; #define GEN8_CTX_ID_SHIFT 32 +static int intel_logical_ring_begin(struct drm_i915_gem_request *req, + int num_dwords); static int intel_lr_context_render_state_init(struct drm_i915_gem_request *req); static int intel_lr_context_pin(struct intel_engine_cs *ring, struct intel_context *ctx); @@ -680,7 +682,7 @@ int intel_execlists_submission(struct i915_execbuffer_params *params, if (ring == &dev_priv->ring[RCS] && instp_mode != dev_priv->relative_constants_mode) { - ret = intel_logical_ring_begin(ringbuf, params->ctx, 4); + ret = intel_logical_ring_begin(params->request, 4); if (ret) return ret; @@ -1028,7 +1030,7 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf, /** * intel_logical_ring_begin() - prepare the logical ringbuffer to accept some commands * - * @ringbuf: Logical ringbuffer. + * @request: The request to start some new work for * @num_dwords: number of DWORDs that we plan to write to the ringbuffer. * * The ringbuffer might not be ready to accept the commands right away (maybe it needs to @@ -1038,30 +1040,27 @@ static int logical_ring_prepare(struct intel_ringbuffer *ringbuf, * * Return: non-zero if the ringbuffer is not ready to be written to. 
*/ -int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf, -struct intel_context *ctx, int num_dwords) +static int intel_logical_ring_begin(struct drm_i915_gem_request *req, + int num_dwords) { - struct drm_i915_gem_request *req; - struct intel_engine_cs *ring = ringbuf->ring; + struct intel_engine_cs *ring = req->ring; struct drm_device *dev = ring->dev; struct drm_i915_private *dev_priv = dev->dev_private; int ret; + WARN_ON(req == NULL); + ret = i915_gem_check_wedge(&dev_priv->gpu_error, dev_priv->mm.interruptible); if (ret) return ret; - ret = logical_ring_prepare(ringbuf, ctx, num_dwords * sizeof(uint32_t)); - if (ret) - return ret; - - /* Preallocate the olr before touching the ring */ - ret = intel_logical_ring_alloc_request(ring, ctx, &req); + ret = logical_ring_prepare(req->ringbuf, req->ctx, + num_dwords * sizeof(uint32_t)); if (ret) return ret; - ringbuf->space -= num_dwords * sizeof(uint32_t); + req->ringbuf->space -= num_dwords * sizeof(uint32_t); return 0; } @@ -1082,7 +1081,7 @@ static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req) if (ret) return ret; - ret = intel_logical_ring_begin(ringbuf, req->ctx, w->count * 2 + 2); + ret = intel_logical_ring_begin(req, w->count * 2 + 2); if (ret) return ret; @@ -1153,7 +1152,7 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req, bool ppgtt = !(dispatch_flags & I915_DISPATCH_SECURE); int ret; - ret = intel_logical_ring_begin(ringbuf, req->ctx, 4); + ret = intel_logical_ring_begin(req, 4); if (ret) return ret; @@ -1211,7 +1210,7 @@ static int gen8_emit_flush(struct drm_i915_gem_request *request, uint32_t cmd; int ret; - ret = intel_logical_ring_begin(ringbuf, request->ctx, 4); + ret = intel_logical_ring_begin(request, 4); if (ret) return ret; @@ -1267,7 +1266,7 @@ static int gen8_emit_flush_render(struct drm_i915_gem_request *request, flags |= PIPE_CONTROL_GLOBAL_GTT_IVB; } - ret = intel_logical_ring_begin(ringbuf, request->ctx, 6); + ret = intel_logical_ring_begin(request, 6); if (ret) return ret; @@ -1299,7 +1298,7 @@ static int gen8_emit_request(struct drm_i915_gem_request *request) u32 cmd; int ret; - ret = intel_logi
[Intel-gfx] [PATCH 50/51] drm/i915: Remove the now obsolete intel_ring_get_request()
From: John Harrison Much of the driver has now been converted to passing requests around instead of rings/ringbufs/contexts. Thus the function for retrieving the request from a ring (i.e. the OLR) is no longer used and can be removed. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/intel_ringbuffer.h |7 --- 1 file changed, 7 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 35799dc..e20ba9e 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -432,11 +432,4 @@ static inline u32 intel_ring_get_tail(struct intel_ringbuffer *ringbuf) return ringbuf->tail; } -static inline struct drm_i915_gem_request * -intel_ring_get_request(struct intel_engine_cs *ring) -{ - BUG_ON(ring->outstanding_lazy_request == NULL); - return ring->outstanding_lazy_request; -} - #endif /* _INTEL_RINGBUFFER_H_ */ -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 47/51] drm/i915: Update ironlake_enable_rc6() to do explicit request management
From: John Harrison Updated ironlake_enable_rc6() to do explicit request creation and submission. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/intel_pm.c | 31 +-- 1 file changed, 21 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 6ece663..0844166 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4959,6 +4959,7 @@ static void ironlake_enable_rc6(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; struct intel_engine_cs *ring = &dev_priv->ring[RCS]; + struct drm_i915_gem_request *req = NULL; bool was_interruptible; int ret; @@ -4977,16 +4978,17 @@ static void ironlake_enable_rc6(struct drm_device *dev) was_interruptible = dev_priv->mm.interruptible; dev_priv->mm.interruptible = false; + ret = dev_priv->gt.alloc_request(ring, NULL, &req); + if (ret) + goto err; + /* * GPU can automatically power down the render unit if given a page * to save state. */ ret = intel_ring_begin(ring, 6); - if (ret) { - ironlake_teardown_rc6(dev); - dev_priv->mm.interruptible = was_interruptible; - return; - } + if (ret) + goto err; intel_ring_emit(ring, MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN); intel_ring_emit(ring, MI_SET_CONTEXT); @@ -5000,6 +5002,11 @@ static void ironlake_enable_rc6(struct drm_device *dev) intel_ring_emit(ring, MI_FLUSH); intel_ring_advance(ring); + ret = i915_add_request_no_flush(req); + if (ret) + goto err; + req = NULL; + /* * Wait for the command parser to advance past MI_SET_CONTEXT. The HW * does an implicit flush, combined with MI_FLUSH above, it should be @@ -5007,16 +5014,20 @@ static void ironlake_enable_rc6(struct drm_device *dev) */ ret = intel_ring_idle(ring); dev_priv->mm.interruptible = was_interruptible; - if (ret) { - DRM_ERROR("failed to enable ironlake power savings\n"); - ironlake_teardown_rc6(dev); - return; - } + if (ret) + goto err; I915_WRITE(PWRCTXA, i915_gem_obj_ggtt_offset(dev_priv->ips.pwrctx) | PWRCTX_EN); I915_WRITE(RSTDBYCTL, I915_READ(RSTDBYCTL) & ~RCX_SW_EXIT); intel_print_rc6_info(dev, GEN6_RC_CTL_RC6_ENABLE); + +err: + DRM_ERROR("failed to enable ironlake power savings\n"); + ironlake_teardown_rc6(dev); + dev_priv->mm.interruptible = was_interruptible; + if (req) + i915_gem_request_unreference(req); } static unsigned long intel_pxfreq(u32 vidfreq) -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 46/51] drm/i915: Update cacheline_align() to take a request structure
From: John Harrison Updated intel_ring_cacheline_align() to take a request instead of a ring. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/intel_display.c|2 +- drivers/gpu/drm/i915/intel_ringbuffer.c |3 ++- drivers/gpu/drm/i915/intel_ringbuffer.h |2 +- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 4aaa190..30fa5e1 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9388,7 +9388,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev, * then do the cacheline alignment, and finally emit the * MI_DISPLAY_FLIP. */ - ret = intel_ring_cacheline_align(ring); + ret = intel_ring_cacheline_align(req); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index e04e881..b3aca4a 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -2179,8 +2179,9 @@ int intel_ring_begin(struct intel_engine_cs *ring, } /* Align the ring tail to a cacheline boundary */ -int intel_ring_cacheline_align(struct intel_engine_cs *ring) +int intel_ring_cacheline_align(struct drm_i915_gem_request *req) { + struct intel_engine_cs *ring = req->ring; int num_dwords = (ring->buffer->tail & (CACHELINE_BYTES - 1)) / sizeof(uint32_t); int ret; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 70f3f5d..ba7213f 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -387,7 +387,7 @@ void intel_stop_ring_buffer(struct intel_engine_cs *ring); void intel_cleanup_ring_buffer(struct intel_engine_cs *ring); int __must_check intel_ring_begin(struct intel_engine_cs *ring, int n); -int __must_check intel_ring_cacheline_align(struct intel_engine_cs *ring); +int __must_check intel_ring_cacheline_align(struct drm_i915_gem_request *req); int __must_check intel_ring_alloc_request(struct intel_engine_cs *ring, struct intel_context *ctx, struct drm_i915_gem_request **req_out); -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 48/51] drm/i915: Update intel_ring_begin() to take a request structure
From: John Harrison Now that everything above has been converted to use requests, intel_ring_begin() can be updated to take a request instead of a ring. This also means that it no longer needs to lazily allocate a request if no-one happens to have done it earlier. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem.c|2 +- drivers/gpu/drm/i915/i915_gem_context.c|2 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c |8 ++-- drivers/gpu/drm/i915/i915_gem_gtt.c|6 +-- drivers/gpu/drm/i915/intel_display.c | 12 ++--- drivers/gpu/drm/i915/intel_overlay.c |8 ++-- drivers/gpu/drm/i915/intel_pm.c|2 +- drivers/gpu/drm/i915/intel_ringbuffer.c| 72 +--- drivers/gpu/drm/i915/intel_ringbuffer.h|2 +- 9 files changed, 55 insertions(+), 59 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index e60ea05..4777eb2 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4651,7 +4651,7 @@ int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice) if (!HAS_L3_DPF(dev) || !remap_info) return 0; - ret = intel_ring_begin(ring, GEN7_L3LOG_SIZE / 4 * 3); + ret = intel_ring_begin(req, GEN7_L3LOG_SIZE / 4 * 3); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 384f481..e348424 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -503,7 +503,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags) if (INTEL_INFO(ring->dev)->gen >= 7) len += 2 + (num_rings ? 4*num_rings + 2 : 0); - ret = intel_ring_begin(ring, len); + ret = intel_ring_begin(req, len); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 8b4f8a9..6a703e6 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1013,7 +1013,7 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev, return -EINVAL; } - ret = intel_ring_begin(ring, 4 * 3); + ret = intel_ring_begin(req, 4 * 3); if (ret) return ret; @@ -1044,7 +1044,7 @@ i915_emit_box(struct drm_i915_gem_request *req, } if (INTEL_INFO(ring->dev)->gen >= 4) { - ret = intel_ring_begin(ring, 4); + ret = intel_ring_begin(req, 4); if (ret) return ret; @@ -1053,7 +1053,7 @@ i915_emit_box(struct drm_i915_gem_request *req, intel_ring_emit(ring, ((box->x2 - 1) & 0x) | (box->y2 - 1) << 16); intel_ring_emit(ring, DR4); } else { - ret = intel_ring_begin(ring, 6); + ret = intel_ring_begin(req, 6); if (ret) return ret; @@ -1235,7 +1235,7 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params, if (ring == &dev_priv->ring[RCS] && instp_mode != dev_priv->relative_constants_mode) { - ret = intel_ring_begin(ring, 4); + ret = intel_ring_begin(params->request, 4); if (ret) goto error; diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index e3a65c3..417a89e 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -284,7 +284,7 @@ static int gen8_write_pdp(struct drm_i915_gem_request *req, unsigned entry, BUG_ON(entry >= 4); - ret = intel_ring_begin(ring, 6); + ret = intel_ring_begin(req, 6); if (ret) return ret; @@ -784,7 +784,7 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt, if (ret) return ret; - ret = intel_ring_begin(ring, 6); + ret = intel_ring_begin(req, 6); if (ret) return ret; @@ -810,7 +810,7 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt, if (ret) return ret; - ret = 
intel_ring_begin(ring, 6); + ret = intel_ring_begin(req, 6); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 30fa5e1..ef53839 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -9209,7 +9209,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev, u32 flip_mask; int ret; - ret = intel_ring_begin(ring, 6); + ret = intel_ring_begin(req, 6); if (ret) return ret; @@ -9244,7 +9244,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev, u32 flip_
[Intel-gfx] [PATCH 51/51] drm/i915: Remove the now obsolete 'outstanding_lazy_request'
From: John Harrison The outstanding_lazy_request is no longer used anywhere in the driver. Everything that was looking at it now has a request explicitly passed in from on high. Everything that was relying upon behind the scenes is now explicitly creating/passing/submitting it's own private request. Thus the OLR can be removed. For: VIZ-5115 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem.c| 16 +--- drivers/gpu/drm/i915/i915_gem_execbuffer.c |4 +--- drivers/gpu/drm/i915/intel_lrc.c |6 ++ drivers/gpu/drm/i915/intel_ringbuffer.c| 17 ++--- drivers/gpu/drm/i915/intel_ringbuffer.h|4 5 files changed, 6 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 4777eb2..8febd58 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1155,15 +1155,9 @@ i915_gem_check_wedge(struct i915_gpu_error *error, int i915_gem_check_olr(struct drm_i915_gem_request *req) { - int ret; - WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex)); - ret = 0; - if (req == req->ring->outstanding_lazy_request) - ret = i915_add_request(req); - - return ret; + return 0; } static void fake_irq(unsigned long data) @@ -2423,8 +2417,6 @@ int __i915_add_request(struct drm_i915_gem_request *request, dev_priv = ring->dev->dev_private; ringbuf = request->ringbuf; - WARN_ON(request != ring->outstanding_lazy_request); - request_start = intel_ring_get_tail(ringbuf); /* * Emit any outstanding flushes - execbuf can fail to emit the flush @@ -2483,7 +2475,6 @@ int __i915_add_request(struct drm_i915_gem_request *request, } trace_i915_gem_request_add(request); - ring->outstanding_lazy_request = NULL; i915_queue_hangcheck(ring->dev); @@ -2667,9 +2658,6 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv, i915_gem_free_request(request); } - - /* This may not have been flushed before the reset, so clean it now */ - i915_gem_request_assign(&ring->outstanding_lazy_request, NULL); } void i915_gem_restore_fences(struct drm_device *dev) @@ -3119,8 +3107,6 @@ int i915_gpu_idle(struct drm_device *dev) } } - WARN_ON(ring->outstanding_lazy_request); - ret = intel_ring_idle(ring); if (ret) return ret; diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 6a703e6..0eae592 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1571,10 +1571,8 @@ err: * must be freed again. If it was submitted then it is being tracked * on the active request list and no clean up is required here. 
*/ - if (ret && params->request) { + if (ret && params->request) i915_gem_request_unreference(params->request); - ring->outstanding_lazy_request = NULL; - } mutex_unlock(&dev->struct_mutex); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index c5408bc..1e23702 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -855,8 +855,7 @@ int intel_logical_ring_alloc_request(struct intel_engine_cs *ring, if (!req_out) return -EINVAL; - if ((*req_out = ring->outstanding_lazy_request) != NULL) - return 0; + *req_out = NULL; request = kzalloc(sizeof(*request), GFP_KERNEL); if (request == NULL) @@ -890,7 +889,7 @@ int intel_logical_ring_alloc_request(struct intel_engine_cs *ring, i915_gem_context_reference(request->ctx); request->ringbuf = ctx->engine[ring->id].ringbuf; - *req_out = ring->outstanding_lazy_request = request; + *req_out = request; return 0; } @@ -1346,7 +1345,6 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *ring) intel_logical_ring_stop(ring); WARN_ON((I915_READ_MODE(ring) & MODE_IDLE) == 0); - i915_gem_request_assign(&ring->outstanding_lazy_request, NULL); if (ring->cleanup) ring->cleanup(ring); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 9b4cf99..1cbde62 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1955,7 +1955,6 @@ void intel_cleanup_ring_buffer(struct intel_engine_cs *ring) intel_unpin_ringbuffer_obj(ringbuf); intel_destroy_ringbuffer_obj(ringbuf); - i915_gem_request_assign(&ring->outstanding_lazy_request, NULL); if (ring->cleanup) ring->cleanup(ring); @@ -2074,15 +2073,6 @@ static
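The pattern that replaces the OLR throughout the series can be condensed into one sketch. This is an illustration distilled from the patches above, not a hunk from any of them; the function name and the NOOP payload are made up for the example, while alloc_request(), intel_ring_begin(req, ...), i915_add_request_no_flush() and i915_gem_request_unreference() are the calls the series itself uses:

	static int example_emit_work(struct drm_i915_private *dev_priv,
				     struct intel_engine_cs *ring)
	{
		struct drm_i915_gem_request *req = NULL;
		int ret;

		/* Explicit creation: no more lazily allocated request */
		ret = dev_priv->gt.alloc_request(ring, NULL, &req);
		if (ret)
			return ret;

		/* Ring space reservation and emission are tied to this request */
		ret = intel_ring_begin(req, 2);
		if (ret)
			goto err;

		intel_ring_emit(ring, MI_NOOP);
		intel_ring_emit(ring, MI_NOOP);
		intel_ring_advance(ring);

		/* Explicit submission; on success the active list owns the request */
		ret = i915_add_request_no_flush(req);
		if (ret)
			goto err;

		return 0;

	err:
		i915_gem_request_unreference(req);
		return ret;
	}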
Re: [Intel-gfx] [PATCH] drm/i915: Fix an incorrect free rather than dereference issue.
On Fri, Feb 13, 2015 at 10:58 AM, Nick Hoath wrote: > On 13/02/2015 09:32, Daniel Vetter wrote: >> >> On Thu, Feb 12, 2015 at 12:29:21PM +, Nick Hoath wrote: >>> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88652 >>> >>> Signed-off-by: Nick Hoath >> >> >> Commit message is missing the absolutely crucial detail about which patch >> introduced this regression: >> >> commit 6d3d8274bc45de4babb62d64562d92af984dd238 >> Author: Nick Hoath >> AuthorDate: Thu Jan 15 13:10:39 2015 + >> >> drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request >> >> Another thing I've noticed is that we explicitly drop the context >> reference for the request before dropping the request reference. Without >> clearing the req->ctx pointer. That has a very high chance to leading to >> tears, imo the context unreferenceing should be pushed into >> i915_gem_request_free. >> >> Except that it's there already, which means we have a double unref now? > > > Looking at the code, it looks like that's the case. > >> >> Also this patch is for the legacy ringbuffer code, but the referenced bug >> is for gen8+ execlists. We're definitely not running this code here I >> think. > > > i915_gem_reset_ring_cleanup is used in execlists in the hang recovery case. Oops I've missed that, somehow I've thought intel_lrc.c has it's own reset recovery code. But I've just mixed up function names a bit. >> Imo step one is to drop all the explicit ctx refcounting for req->ctx and >> always rely on the implicit reference. Then see what happens. > > > I agree that the refcounting needs re-evaluating after the merge of execlist > queue entries & requests, however I think the cleanup of the double > unref/removing the refcounting should be done in another patchset. This > patch is purely to fix the issue raised in 88652. Depends on the relative > priorities. Well this patch here won't work because there's now a double unref. And I spotted that one because intel_execlists_retire_requests seems to have the exact same double unref already. Hence why I think we should fix up that first and then reasses what's left. The bug here (before your patch) is just a use-after-free, if there's some other reference to the request. And it will also be fixed with the redone req->ctx refcounting. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
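A minimal sketch of the direction Daniel argues for, purely illustrative and not taken from any posted patch (the 'ref' kref member and the exact cleanup needed besides the context are assumptions): the request owns a single context reference, dropped in exactly one place, the final free, with the pointer cleared so a dangling req->ctx can never be dereferenced afterwards.

	void i915_gem_request_free(struct kref *req_ref)
	{
		struct drm_i915_gem_request *req =
			container_of(req_ref, typeof(*req), ref);

		if (req->ctx) {
			/* The one and only context unreference for this request */
			i915_gem_context_unreference(req->ctx);
			req->ctx = NULL;
		}

		kfree(req);
	}

With the unreference living only here, call sites such as the reset cleanup and execlist retirement simply drop their request reference and cannot double-unreference the context.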
Re: [Intel-gfx] [PATCH 24/51] drm/i915: Update deferred context creation to do explicit request management
On Fri, Feb 13, 2015 at 11:48:33AM +, john.c.harri...@intel.com wrote: > From: John Harrison > > In execlist mode, context initialisation is deferred until first use of the > given context. This is because execlist mode has many more contexts than > legacy > mode and many are never actually used. That's not correct. There are no more contexts in execlists than legacy. There are more ringbuffers, or rather the contexts have an extra state object associated with them. > Previously, the initialisation commands > were written to the ring and tagged with some random request structure via the > OLR. This seemed to be causing a null pointer deference bug under certain > circumstances (BZ:40112). > > This patch adds explicit request creation and submission to the deferred > initialisation code path. Thus removing any reliance on or randomness caused > by > the OLR. This is upside down though. The request should be referencing the context (thus instantiating it on demand) and nothing in the context allocation requires the request. The initialisation here should be during i915_request_switch_context(), since it can be entirely shared with legacy. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
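To put the proposed ordering in concrete terms, a rough sketch only (the helper names and the initialised flag are hypothetical, not code from the series): the request simply references its context, and the one-off first-use initialisation is emitted from the context-switch path, so legacy and execlist submission share it.

	static int i915_request_switch_context(struct drm_i915_gem_request *req)
	{
		struct intel_context *ctx = req->ctx;
		int ret;

		if (!ctx->engine_state_initialised) {
			/* Deferred first-use initialisation, emitted on this request */
			ret = emit_deferred_context_init(req);
			if (ret)
				return ret;
			ctx->engine_state_initialised = true;
		}

		/* The regular switch, identical for legacy and execlists */
		return emit_context_switch(req);
	}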
Re: [Intel-gfx] [PATCH 49/51] drm/i915: Update intel_logical_ring_begin() to take a request structure
On Fri, Feb 13, 2015 at 11:48:58AM +, john.c.harri...@intel.com wrote: > From: John Harrison > > Now that everything above has been converted to use requests, > intel_logical_ring_begin() can be updated to take a request instead of a > ringbuf/context pair. This also means that it no longer needs to lazily > allocate > a request if no-one happens to have done it earlier. > > Note that this change makes the execlist signature the same as the legacy > version. Thus the two functions could be merged into a ring->begin() wrapper > if > required. It should be noted that you don't even need to virtualise the function... Please kill all the duplicated code. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
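For illustration of the point (not a patch from the series, and glossing over the fact that the patch makes intel_logical_ring_begin() static to intel_lrc.c): with both variants taking only a request, a single non-virtualised entry point is enough, and a plain mode check covers the remaining difference until the duplicated body is removed entirely.

	int i915_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
	{
		/* One caller-visible function; no ring->begin() vfunc needed */
		if (i915.enable_execlists)
			return intel_logical_ring_begin(req, num_dwords);

		return intel_ring_begin(req, num_dwords);
	}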
Re: [Intel-gfx] [PATCH 47/51] drm/i915: Update ironlake_enable_rc6() to do explicit request management
On Fri, Feb 13, 2015 at 11:48:56AM +, john.c.harri...@intel.com wrote: > From: John Harrison > > Updated ironlake_enable_rc6() to do explicit request creation and submission. If you merged the context here with the common context switching code, we don't even need to touch the ring here. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 48/51] drm/i915: Update intel_ring_begin() to take a request structure
On Fri, Feb 13, 2015 at 11:48:57AM +, john.c.harri...@intel.com wrote: > From: John Harrison > > Now that everything above has been converted to use requests, > intel_ring_begin() > can be updated to take a request instead of a ring. This also means that it no > longer needs to lazily allocate a request if no-one happens to have done it > earlier. Hmm, you missed out on returning @ring@ from intel_ring_begin() to make it explicit that accessing @ring@ through any other means is verboten. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
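A sketch of what the suggestion could look like, assuming a hypothetical intel_ring_prepare() helper for the wait-for-space step: begin() hands back the engine (or an error pointer), so the request becomes the only sanctioned route to the ring for emitting commands.

	static struct intel_engine_cs *
	intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
	{
		int ret;

		ret = intel_ring_prepare(req, num_dwords * sizeof(uint32_t));
		if (ret)
			return ERR_PTR(ret);

		req->ringbuf->space -= num_dwords * sizeof(uint32_t);
		return req->ring;
	}

Callers would then be forced into the pattern: ring = intel_ring_begin(req, 4); if (IS_ERR(ring)) return PTR_ERR(ring); intel_ring_emit(ring, ...); making any other way of reaching the ring stand out in review.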
Re: [Intel-gfx] [PATCH 08/51] drm/i915: Update alloc_request to return the allocated request
On Fri, Feb 13, 2015 at 11:48:17AM +, john.c.harri...@intel.com wrote: > From: John Harrison > > The alloc_request() function does not actually return the newly allocated > request. Instead, it must be pulled from ring->outstanding_lazy_request. This > patch fixes this so that code can create a request and start using it knowing > exactly which request it actually owns. Why do we have different functions in the first place? -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Fix an incorrect free rather than dereference issue.
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang...@intel.com) Task id: 5767
-Summary-
Platform  Delta  drm-intel-nightly  Series Applied
PNV       -1     282/282            281/282
ILK              313/313            313/313
SNB              309/323            309/323
IVB              380/380            380/380
BYT              296/296            296/296
HSW       -2     425/425            423/425
BDW       -1     318/318            317/318
-Detailed-
Platform  Test                                               drm-intel-nightly  Series Applied
*PNV      igt_gen3_render_linear_blits                       PASS(5)            CRASH(1)PASS(1)
 HSW      igt_kms_flip_plain-flip-fb-recreate                TIMEOUT(2)PASS(1)  TIMEOUT(1)PASS(1)
 HSW      igt_kms_flip_plain-flip-fb-recreate-interruptible  TIMEOUT(2)PASS(2)  TIMEOUT(1)PASS(1)
*BDW      igt_gem_gtt_hog                                    PASS(8)            DMESG_WARN(1)PASS(1)
Note: You need to pay more attention to line start with '*' ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Prevent TLB error on first execution on SNB
Long ago I found that I was getting sporadic errors when booting SNB, with the symptom being that the first batch died with IPEHR != *ACTHD, typically caused by the TLB being invalid. These magically disappeared if I held the forcewake during the entire ring initialisation sequence. (It can probably be shortened to a short critical section, but the whole initialisation is full of register writes and so we would be taking and releasing forcewake almost continually, and so holding it over the entire sequence will probably be a net win!) Note some of the kernels I encounted the issue already had the deferred forcewake release, so it is still relevant. I know that there have been a few other reports with similar failure conditions on SNB, I think such as References: https://bugs.freedesktop.org/show_bug.cgi?id=80913 Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 8d15c8110962..2426f6d9b5a5 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4868,6 +4868,7 @@ int i915_gem_init(struct drm_device *dev) dev_priv->gt.stop_ring = intel_logical_ring_stop; } + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); ret = i915_gem_init_userptr(dev); if (ret) goto out_unlock; @@ -4894,6 +4895,7 @@ int i915_gem_init(struct drm_device *dev) } out_unlock: + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL); mutex_unlock(&dev->struct_mutex); return ret; -- 2.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/5] drm/i915: Trim the command parser allocations
Hello, Apparently, I've been volunteered to review these patches despite not knowing too much about these areas of the driver... On 14/01/2015 11:20, Chris Wilson wrote: Currently, the command parser tries to create a secondary batch exactly as large as the original, and vmap both. This is open to abuse by userspace using extremely large batch objects, but only executing very short batches. For example, this would be if userspace were to implement a command submission ringbuffer. However, we only need to allocate pages for just the contents of the command sequence in the batch - all relocations copied to the secondary batch will reference the original batch and so there can be no access to the secondary batch outside of the explicit execution region. Testcase: igt/gem_exec_big #ivb,byt,hsw Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88308 Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_cmd_parser.c | 74 ++ drivers/gpu/drm/i915/i915_gem_execbuffer.c | 74 -- 2 files changed, 73 insertions(+), 75 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 806e812340d0..9a6da3536ae5 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -818,24 +818,26 @@ static bool valid_reg(const u32 *table, int count, u32 addr) return false; } -static u32 *vmap_batch(struct drm_i915_gem_object *obj) +static u32 *vmap_batch(struct drm_i915_gem_object *obj, + unsigned start, unsigned len) { int i; void *addr = NULL; struct sg_page_iter sg_iter; + int first_page = start >> PAGE_SHIFT; + int last_page = (len + start + 4095) >> PAGE_SHIFT; + int npages = last_page - first_page; struct page **pages; - pages = drm_malloc_ab(obj->base.size >> PAGE_SHIFT, sizeof(*pages)); + pages = drm_malloc_ab(npages, sizeof(*pages)); if (pages == NULL) { DRM_DEBUG_DRIVER("Failed to get space for pages\n"); goto finish; } i = 0; - for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 0) { - pages[i] = sg_page_iter_page(&sg_iter); - i++; - } + for_each_sg_page(obj->pages->sgl, &sg_iter, npages, first_page) + pages[i++] = sg_page_iter_page(&sg_iter); addr = vmap(pages, i, 0, PAGE_KERNEL); if (addr == NULL) { @@ -855,61 +857,61 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, u32 batch_start_offset, u32 batch_len) { - int ret = 0; int needs_clflush = 0; - u32 *src_base, *dest_base = NULL; - u32 *src_addr, *dest_addr; - u32 offset = batch_start_offset / sizeof(*dest_addr); - u32 end = batch_start_offset + batch_len; + void *src_base, *src; + void *dst = NULL; + int ret; - if (end > dest_obj->base.size || end > src_obj->base.size) + if (batch_len > dest_obj->base.size || + batch_len + batch_start_offset > src_obj->base.size) return ERR_PTR(-E2BIG); ret = i915_gem_obj_prepare_shmem_read(src_obj, &needs_clflush); if (ret) { - DRM_DEBUG_DRIVER("CMD: failed to prep read\n"); + DRM_DEBUG_DRIVER("CMD: failed to prepare shadow batch\n"); return ERR_PTR(ret); } - src_base = vmap_batch(src_obj); + src_base = vmap_batch(src_obj, batch_start_offset, batch_len); if (!src_base) { DRM_DEBUG_DRIVER("CMD: Failed to vmap batch\n"); ret = -ENOMEM; goto unpin_src; } - src_addr = src_base + offset; - - if (needs_clflush) - drm_clflush_virt_range((char *)src_addr, batch_len); + ret = i915_gem_object_get_pages(dest_obj); + if (ret) { + DRM_DEBUG_DRIVER("CMD: Failed to get pages for shadow batch\n"); + goto unmap_src; + } + i915_gem_object_pin_pages(dest_obj); ret = i915_gem_object_set_to_cpu_domain(dest_obj, true); 
if (ret) { - DRM_DEBUG_DRIVER("CMD: Failed to set batch CPU domain\n"); + DRM_DEBUG_DRIVER("CMD: Failed to set shadow batch to CPU\n"); goto unmap_src; } - dest_base = vmap_batch(dest_obj); - if (!dest_base) { + dst = vmap_batch(dest_obj, 0, batch_len); + if (!dst) { DRM_DEBUG_DRIVER("CMD: Failed to vmap shadow batch\n"); + i915_gem_object_unpin_pages(dest_obj); ret = -ENOMEM; goto unmap_src; } - dest_addr = dest_base + offset; - - if (batch_start_offset != 0) - memset((u8 *)dest_base, 0, batch_start_offset); + src = src_base + offset_in_page(batch_start_offset); + if (needs_clflush
Re: [Intel-gfx] [PATCH 3/5] drm/i915: Trim the command parser allocations
On Fri, Feb 13, 2015 at 01:08:59PM +, John Harrison wrote: > >@@ -1155,40 +1154,30 @@ i915_gem_execbuffer_parse(struct intel_engine_cs > >*ring, > > batch_start_offset, > > batch_len, > > is_master); > >-if (ret) { > >-if (ret == -EACCES) > >-return batch_obj; > >-} else { > >-struct i915_vma *vma; > >+if (ret) > >+goto err; > >-memset(shadow_exec_entry, 0, sizeof(*shadow_exec_entry)); > >+ret = i915_gem_obj_ggtt_pin(shadow_batch_obj, 0, 0); > There is no explicit unpin for this. Does it happen automatically > due to adding the vma to the eb->vmas list? We set the exec_flag that tells us to unpin the obj when unwinding the execbuf. > Also, does it matter that it will be pinned again (and explicitly > unpinned) if the SECURE flag is set? No, pin/unpin is just a counter, it just needs to be balanced. (Long answer, yes, the restrictions given to both pin requests must match or else we will attempt to repin the buffer and fail miserably as the object is already pinned.) -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
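A toy illustration of the counting rule, not driver code (the wrapper function is invented for the example; the pin/unpin calls are the ones used by the patch): nested pins are fine as long as every pin is matched by an unpin and overlapping pins ask for compatible placement.

	static int example_nested_pin(struct drm_i915_gem_object *obj)
	{
		int ret;

		ret = i915_gem_obj_ggtt_pin(obj, 0, 0);	/* pin count 0 -> 1 */
		if (ret)
			return ret;

		ret = i915_gem_obj_ggtt_pin(obj, 0, 0);	/* 1 -> 2, same constraints */
		if (ret)
			goto out;

		/* ... use the GGTT binding ... */

		i915_gem_object_ggtt_unpin(obj);	/* 2 -> 1 */
	out:
		i915_gem_object_ggtt_unpin(obj);	/* 1 -> 0, balanced again */
		return ret;
	}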
[Intel-gfx] [PATCH] drm/i915: Fix a use after free, and unbalanced refcounting
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88652 When converting from implicitly tracked execlist queue items to ref counted requests, not all free's of requests were replaced with unrefs, and extraneous refs/unrefs of contexts were added. Correct the unbalanced refcount & replace the free's. Problem introduced in: commit 6d3d8274bc45de4babb62d64562d92af984dd238 Author: Nick Hoath AuthorDate: Thu Jan 15 13:10:39 2015 + drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request Signed-off-by: Nick Hoath --- drivers/gpu/drm/i915/i915_gem.c | 3 +-- drivers/gpu/drm/i915/intel_lrc.c | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 1765989..79e48b2 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2660,8 +2660,7 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv, if (submit_req->ctx != ring->default_context) intel_lr_context_unpin(ring, submit_req->ctx); - i915_gem_context_unreference(submit_req->ctx); - kfree(submit_req); + i915_gem_request_unreference(submit_req); } /* diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index aafcef3..a18925d 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -518,12 +518,12 @@ static int execlists_context_queue(struct intel_engine_cs *ring, return -ENOMEM; request->ring = ring; request->ctx = to; + i915_gem_context_reference(request->ctx); } else { WARN_ON(to != request->ctx); } request->tail = tail; i915_gem_request_reference(request); - i915_gem_context_reference(request->ctx); intel_runtime_pm_get(dev_priv); @@ -740,7 +740,6 @@ void intel_execlists_retire_requests(struct intel_engine_cs *ring) if (ctx_obj && (ctx != ring->default_context)) intel_lr_context_unpin(ring, ctx); intel_runtime_pm_put(dev_priv); - i915_gem_context_unreference(ctx); list_del(&req->execlist_link); i915_gem_request_unreference(req); } -- 2.1.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 4/5] drm/i915: Cache last obj->pages location for i915_gem_object_get_page()
On 14/01/2015 11:20, Chris Wilson wrote: The biggest user of i915_gem_object_get_page() is the relocation processing during execbuffer. Typically userspace passes in a set of relocations in sorted order. Sadly, we alternate between relocations increasing from the start of the buffers, and relocations decreasing from the end. However the majority of consecutive lookups will still be in the same page. We could cache the start of the last sg chain, however for most callers, the entire sgl is inside a single chain and so we see no improve from the extra layer of caching. References: https://bugs.freedesktop.org/show_bug.cgi?id=88308 Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 31 ++- drivers/gpu/drm/i915/i915_gem.c | 4 2 files changed, 30 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 66f0c607dbef..04a7d594d933 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2005,6 +2005,10 @@ struct drm_i915_gem_object { struct sg_table *pages; int pages_pin_count; + struct get_page { + struct scatterlist *sg; + int last; + } get_page; /* prime dma-buf support */ void *dma_buf_vmapping; @@ -2612,15 +2616,32 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj, int *needs_clflush); int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object *obj); -static inline struct page *i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n) + +static inline int sg_page_count(struct scatterlist *sg) +{ + return PAGE_ALIGN(sg->offset + sg->length) >> PAGE_SHIFT; +} + +static inline struct page * +i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n) { - struct sg_page_iter sg_iter; + if (WARN_ON(n >= obj->base.size >> PAGE_SHIFT)) + return NULL; - for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, n) - return sg_page_iter_page(&sg_iter); + if (n < obj->get_page.last) { + obj->get_page.sg = obj->pages->sgl; + obj->get_page.last = 0; + } + + while (obj->get_page.last + sg_page_count(obj->get_page.sg) <= n) { + obj->get_page.last += sg_page_count(obj->get_page.sg); + if (unlikely(sg_is_chain(++obj->get_page.sg))) Is it safe to do the ++ inside a nested pair of macros? There is at least one definition of 'unlikely' in the linux source that would cause multiple evaluations. + obj->get_page.sg = sg_chain_ptr(obj->get_page.sg); + } - return NULL; + return nth_page(sg_page(obj->get_page.sg), n - obj->get_page.last); } + static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) { BUG_ON(obj->pages == NULL); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 6c403654e33a..d710da099bdb 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2260,6 +2260,10 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) return ret; list_add_tail(&obj->global_list, &dev_priv->mm.unbound_list); + + obj->get_page.sg = obj->pages->sgl; + obj->get_page.last = 0; + return 0; } ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 4/5] drm/i915: Cache last obj->pages location for i915_gem_object_get_page()
Accidentally hit send too early, ignore the other reply! On 14/01/2015 11:20, Chris Wilson wrote: The biggest user of i915_gem_object_get_page() is the relocation processing during execbuffer. Typically userspace passes in a set of relocations in sorted order. Sadly, we alternate between relocations increasing from the start of the buffers, and relocations decreasing from the end. However the majority of consecutive lookups will still be in the same page. We could cache the start of the last sg chain, however for most callers, the entire sgl is inside a single chain and so we see no improve from the extra layer of caching. References: https://bugs.freedesktop.org/show_bug.cgi?id=88308 Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 31 ++- drivers/gpu/drm/i915/i915_gem.c | 4 2 files changed, 30 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 66f0c607dbef..04a7d594d933 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2005,6 +2005,10 @@ struct drm_i915_gem_object { struct sg_table *pages; int pages_pin_count; + struct get_page { + struct scatterlist *sg; + int last; + } get_page; /* prime dma-buf support */ void *dma_buf_vmapping; @@ -2612,15 +2616,32 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj, int *needs_clflush); int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object *obj); -static inline struct page *i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n) + +static inline int sg_page_count(struct scatterlist *sg) +{ + return PAGE_ALIGN(sg->offset + sg->length) >> PAGE_SHIFT; Does this need to be rounded up or are sg->offset and sg->length guaranteed to always be a multiple of the page size? +} + +static inline struct page * +i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n) { - struct sg_page_iter sg_iter; + if (WARN_ON(n >= obj->base.size >> PAGE_SHIFT)) + return NULL; - for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, n) - return sg_page_iter_page(&sg_iter); + if (n < obj->get_page.last) { + obj->get_page.sg = obj->pages->sgl; + obj->get_page.last = 0; + } + + while (obj->get_page.last + sg_page_count(obj->get_page.sg) <= n) { + obj->get_page.last += sg_page_count(obj->get_page.sg); + if (unlikely(sg_is_chain(++obj->get_page.sg))) Is it safe to do the ++ inside a nested pair of macros? There is at least one definition of 'unlikely' in the linux source that would cause multiple evaluations. + obj->get_page.sg = sg_chain_ptr(obj->get_page.sg); + } - return NULL; + return nth_page(sg_page(obj->get_page.sg), n - obj->get_page.last); } + static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj) { BUG_ON(obj->pages == NULL); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 6c403654e33a..d710da099bdb 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2260,6 +2260,10 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj) return ret; list_add_tail(&obj->global_list, &dev_priv->mm.unbound_list); + + obj->get_page.sg = obj->pages->sgl; + obj->get_page.last = 0; + return 0; } ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Prevent TLB error on first execution on SNB
On Fri, Feb 13, 2015 at 12:59:45PM +, Chris Wilson wrote: > Long ago I found that I was getting sporadic errors when booting SNB, > with the symptom being that the first batch died with IPEHR != *ACTHD, > typically caused by the TLB being invalid. These magically disappeared > if I held the forcewake during the entire ring initialisation sequence. > (It can probably be shortened to a short critical section, but the whole > initialisation is full of register writes and so we would be taking and > releasing forcewake almost continually, and so holding it over the > entire sequence will probably be a net win!) > > Note some of the kernels I encounted the issue already had the deferred > forcewake release, so it is still relevant. > > I know that there have been a few other reports with similar failure > conditions on SNB, I think such as > References: https://bugs.freedesktop.org/show_bug.cgi?id=80913 > > Signed-off-by: Chris Wilson Given that we've already added a forcewake critical section around individual ring inits this makes maybe a bit too much sense. But I do wonder whether we don't need the same for resume and gpu resets? With the split into hw/sw setup we could get that by pushing the forcewake_get/put into i915_gem_init_hw. Does the magic still work with that? And if we put it there then the fw_get/put in init_ring_common is fully redundant and could be removed. -Daniel > --- > drivers/gpu/drm/i915/i915_gem.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 8d15c8110962..2426f6d9b5a5 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -4868,6 +4868,7 @@ int i915_gem_init(struct drm_device *dev) > dev_priv->gt.stop_ring = intel_logical_ring_stop; > } > > + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); > ret = i915_gem_init_userptr(dev); > if (ret) > goto out_unlock; > @@ -4894,6 +4895,7 @@ int i915_gem_init(struct drm_device *dev) > } > > out_unlock: > + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL); > mutex_unlock(&dev->struct_mutex); > > return ret; > -- > 2.1.4 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Fix a use after free, and unbalanced refcounting
On Fri, Feb 13, 2015 at 01:30:35PM +, Nick Hoath wrote: > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88652 > > When converting from implicitly tracked execlist queue items to ref counted > requests, not all free's of requests were replaced with unrefs, and extraneous > refs/unrefs of contexts were added. > Correct the unbalanced refcount & replace the free's. > > Problem introduced in: > commit 6d3d8274bc45de4babb62d64562d92af984dd238 > Author: Nick Hoath > AuthorDate: Thu Jan 15 13:10:39 2015 + > > drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request Imo the commit message should be ammended with a short paragraph explainig the various pointers and implied and explicit references we now have around requests and contexts. That way review of this will get a bit easier and we'll avoid another misunderstanding. I even think we should add a comment in the header to request.ctx to explain the rules since apparently they've not been fully clear. > Signed-off-by: Nick Hoath But yeah this makes a lot more sense imo. Please feed this to QA for stress-testing in all the relevant bugs. Today I have my head full with kms code so not a good time for a full in-depth review. But I think it'd be good if other people take a look anyway, so please throw this at a few ppl from the vpg core team too. Thanks, Daniel > --- > drivers/gpu/drm/i915/i915_gem.c | 3 +-- > drivers/gpu/drm/i915/intel_lrc.c | 3 +-- > 2 files changed, 2 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 1765989..79e48b2 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -2660,8 +2660,7 @@ static void i915_gem_reset_ring_cleanup(struct > drm_i915_private *dev_priv, > if (submit_req->ctx != ring->default_context) > intel_lr_context_unpin(ring, submit_req->ctx); > > - i915_gem_context_unreference(submit_req->ctx); > - kfree(submit_req); > + i915_gem_request_unreference(submit_req); > } > > /* > diff --git a/drivers/gpu/drm/i915/intel_lrc.c > b/drivers/gpu/drm/i915/intel_lrc.c > index aafcef3..a18925d 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -518,12 +518,12 @@ static int execlists_context_queue(struct > intel_engine_cs *ring, > return -ENOMEM; > request->ring = ring; > request->ctx = to; > + i915_gem_context_reference(request->ctx); > } else { > WARN_ON(to != request->ctx); > } > request->tail = tail; > i915_gem_request_reference(request); > - i915_gem_context_reference(request->ctx); > > intel_runtime_pm_get(dev_priv); > > @@ -740,7 +740,6 @@ void intel_execlists_retire_requests(struct > intel_engine_cs *ring) > if (ctx_obj && (ctx != ring->default_context)) > intel_lr_context_unpin(ring, ctx); > intel_runtime_pm_put(dev_priv); > - i915_gem_context_unreference(ctx); > list_del(&req->execlist_link); > i915_gem_request_unreference(req); > } > -- > 2.1.1 > -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
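The kind of header comment being asked for might read roughly as follows. This is illustrative wording only, not a quote from any patch; only the two relevant members are shown and the rules stated are the behaviour the fix aims for.

	struct drm_i915_gem_request {
		/** Reference count; the request is freed via i915_gem_request_free(). */
		struct kref ref;

		/**
		 * Context this request will execute in. The request owns exactly
		 * one reference on it, taken when the request is created and
		 * dropped only from the final free path. Submission and
		 * retirement paths (including the execlist queue) must not take
		 * or drop additional references through this pointer.
		 */
		struct intel_context *ctx;

		/* ... remaining members unchanged ... */
	};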
Re: [Intel-gfx] [PATCH 5/5] drm/i915: Tidy batch pool logic
On 14/01/2015 11:20, Chris Wilson wrote: Move the madvise logic out of the execbuffer main path into the relatively rare allocation path, making the execbuffer manipulation less fragile. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_cmd_parser.c | 12 +++- drivers/gpu/drm/i915/i915_gem_batch_pool.c | 31 +++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 +++ 3 files changed, 21 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 9a6da3536ae5..3e5e8cb54a88 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -866,6 +866,9 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, batch_len + batch_start_offset > src_obj->base.size) return ERR_PTR(-E2BIG); + if (WARN_ON(dest_obj->pages_pin_count == 0)) + return ERR_PTR(-ENODEV); + ret = i915_gem_obj_prepare_shmem_read(src_obj, &needs_clflush); if (ret) { DRM_DEBUG_DRIVER("CMD: failed to prepare shadow batch\n"); @@ -879,13 +882,6 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, goto unpin_src; } - ret = i915_gem_object_get_pages(dest_obj); - if (ret) { - DRM_DEBUG_DRIVER("CMD: Failed to get pages for shadow batch\n"); - goto unmap_src; - } - i915_gem_object_pin_pages(dest_obj); - ret = i915_gem_object_set_to_cpu_domain(dest_obj, true); if (ret) { DRM_DEBUG_DRIVER("CMD: Failed to set shadow batch to CPU\n"); @@ -895,7 +891,6 @@ static u32 *copy_batch(struct drm_i915_gem_object *dest_obj, dst = vmap_batch(dest_obj, 0, batch_len); if (!dst) { DRM_DEBUG_DRIVER("CMD: Failed to vmap shadow batch\n"); - i915_gem_object_unpin_pages(dest_obj); ret = -ENOMEM; goto unmap_src; } @@ -1126,7 +1121,6 @@ int i915_parse_cmds(struct intel_engine_cs *ring, } vunmap(batch_base); - i915_gem_object_unpin_pages(shadow_batch_obj); return ret; } diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c b/drivers/gpu/drm/i915/i915_gem_batch_pool.c index c690170a1c4f..e5c88ddd8452 100644 --- a/drivers/gpu/drm/i915/i915_gem_batch_pool.c +++ b/drivers/gpu/drm/i915/i915_gem_batch_pool.c @@ -66,9 +66,7 @@ void i915_gem_batch_pool_fini(struct i915_gem_batch_pool *pool) struct drm_i915_gem_object, batch_pool_list); - WARN_ON(obj->active); - - list_del_init(&obj->batch_pool_list); + list_del(&obj->batch_pool_list); drm_gem_object_unreference(&obj->base); } } @@ -96,10 +94,9 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool, WARN_ON(!mutex_is_locked(&pool->dev->struct_mutex)); list_for_each_entry_safe(tmp, next, - &pool->cache_list, batch_pool_list) { - +&pool->cache_list, batch_pool_list) { if (tmp->active) - continue; + break; /* While we're looping, do some clean up */ if (tmp->madv == __I915_MADV_PURGED) { @@ -113,25 +110,27 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool, * but not 'too much' bigger. A better way to do this * might be to bucket the pool objects based on size. 
*/ - if (tmp->base.size >= size && - tmp->base.size <= (2 * size)) { + if (tmp->base.size >= size && tmp->base.size <= 2 * size) { obj = tmp; break; } } - if (!obj) { + if (obj == NULL) { + int ret; + obj = i915_gem_alloc_object(pool->dev, size); - if (!obj) + if (obj == NULL) return ERR_PTR(-ENOMEM); - list_add_tail(&obj->batch_pool_list, &pool->cache_list); - } - else - /* Keep list in LRU order */ - list_move_tail(&obj->batch_pool_list, &pool->cache_list); + ret = i915_gem_object_get_pages(obj); + if (ret) + return ERR_PTR(ret); - obj->madv = I915_MADV_WILLNEED; + obj->madv = I915_MADV_DONTNEED; + } + list_move_tail(&obj->batch_pool_list, &pool->cache_list); Why is it now safe to do a move_tail instead of add_tail if the node has just been allocated? Was the original add_tail() wrong or am I not spotting some critical difference to how new pool objects are created? + i915_gem_object_pin_pages(obj); Is it worth updating the function description comment to add a line about the returned buffer now being pinned a
Re: [Intel-gfx] [PATCH] drm/i915: Prevent TLB error on first execution on SNB
On Fri, Feb 13, 2015 at 02:43:40PM +0100, Daniel Vetter wrote: > On Fri, Feb 13, 2015 at 12:59:45PM +, Chris Wilson wrote: > > Long ago I found that I was getting sporadic errors when booting SNB, > > with the symptom being that the first batch died with IPEHR != *ACTHD, > > typically caused by the TLB being invalid. These magically disappeared > > if I held the forcewake during the entire ring initialisation sequence. > > (It can probably be shortened to a short critical section, but the whole > > initialisation is full of register writes and so we would be taking and > > releasing forcewake almost continually, and so holding it over the > > entire sequence will probably be a net win!) > > > > Note some of the kernels I encounted the issue already had the deferred > > forcewake release, so it is still relevant. > > > > I know that there have been a few other reports with similar failure > > conditions on SNB, I think such as > > References: https://bugs.freedesktop.org/show_bug.cgi?id=80913 > > > > Signed-off-by: Chris Wilson > > Given that we've already added a forcewake critical section around > individual ring inits this makes maybe a bit too much sense. But I do > wonder whether we don't need the same for resume and gpu resets? > > With the split into hw/sw setup we could get that by pusing the > forcewake_get/put inti i915_gem_init_hw. Does the magic still work with > that? And if we put it there there fw_get/put in init_ring_common is fully > redundant and could be remove. Hmm, my original thought was to keep the engine alive from the first programming of CTL up until we fed in the first request (which is the ppgtt/context init). We can add a second forcewake layer into init_hw to give the same security blanket for resume/reset. Sound reasonable? And I should add a comment saying that this is a security blanket. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 4/5] drm/i915: Cache last obj->pages location for i915_gem_object_get_page()
On Fri, Feb 13, 2015 at 01:35:26PM +, John Harrison wrote: > Accidentally hit send too early, ignore the other reply! > > On 14/01/2015 11:20, Chris Wilson wrote: > >The biggest user of i915_gem_object_get_page() is the relocation > >processing during execbuffer. Typically userspace passes in a set of > >relocations in sorted order. Sadly, we alternate between relocations > >increasing from the start of the buffers, and relocations decreasing > >from the end. However the majority of consecutive lookups will still be > >in the same page. We could cache the start of the last sg chain, however > >for most callers, the entire sgl is inside a single chain and so we see > >no improve from the extra layer of caching. > > > >References: https://bugs.freedesktop.org/show_bug.cgi?id=88308 > >Signed-off-by: Chris Wilson > >--- > > drivers/gpu/drm/i915/i915_drv.h | 31 ++- > > drivers/gpu/drm/i915/i915_gem.c | 4 > > 2 files changed, 30 insertions(+), 5 deletions(-) > > > >diff --git a/drivers/gpu/drm/i915/i915_drv.h > >b/drivers/gpu/drm/i915/i915_drv.h > >index 66f0c607dbef..04a7d594d933 100644 > >--- a/drivers/gpu/drm/i915/i915_drv.h > >+++ b/drivers/gpu/drm/i915/i915_drv.h > >@@ -2005,6 +2005,10 @@ struct drm_i915_gem_object { > > struct sg_table *pages; > > int pages_pin_count; > >+struct get_page { > >+struct scatterlist *sg; > >+int last; > >+} get_page; > > /* prime dma-buf support */ > > void *dma_buf_vmapping; > >@@ -2612,15 +2616,32 @@ int i915_gem_obj_prepare_shmem_read(struct > >drm_i915_gem_object *obj, > > int *needs_clflush); > > int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object > > *obj); > >-static inline struct page *i915_gem_object_get_page(struct > >drm_i915_gem_object *obj, int n) > >+ > >+static inline int sg_page_count(struct scatterlist *sg) > >+{ > >+return PAGE_ALIGN(sg->offset + sg->length) >> PAGE_SHIFT; > Does this need to be rounded up or are sg->offset and sg->length > guaranteed to always be a multiple of the page size? For our sg, sg->offset is always 0, but sg->length may be a multiple of pages. I kept the generic version, but we could just do sg->length >> PAGE_SHIFT. > >+while (obj->get_page.last + sg_page_count(obj->get_page.sg) <= n) { > >+obj->get_page.last += sg_page_count(obj->get_page.sg); > >+if (unlikely(sg_is_chain(++obj->get_page.sg))) > Is it safe to do the ++ inside a nested pair of macros? There is at > least one definition of 'unlikely' in the linux source that would > cause multiple evaluations. That's easy enough to fix. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
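Neither fix is spelled out in the thread, but the reworked lookup might plausibly look like this (a sketch only): the increment is hoisted out of unlikely() so the branch-annotation macro never sees an expression with side effects, and, per the reply above, sg_page_count() can either keep the generic rounding or be reduced to sg->length >> PAGE_SHIFT since i915 objects always have sg->offset == 0.

	static inline struct page *
	i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n)
	{
		if (WARN_ON(n >= obj->base.size >> PAGE_SHIFT))
			return NULL;

		/* Restart the walk when looking up an earlier page */
		if (n < obj->get_page.last) {
			obj->get_page.sg = obj->pages->sgl;
			obj->get_page.last = 0;
		}

		while (obj->get_page.last + sg_page_count(obj->get_page.sg) <= n) {
			obj->get_page.last += sg_page_count(obj->get_page.sg);
			obj->get_page.sg++;	/* side effect kept outside unlikely() */
			if (unlikely(sg_is_chain(obj->get_page.sg)))
				obj->get_page.sg = sg_chain_ptr(obj->get_page.sg);
		}

		return nth_page(sg_page(obj->get_page.sg), n - obj->get_page.last);
	}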
[Intel-gfx] [PATCH v2] drm/i915: Prevent TLB error on first execution on SNB
Long ago I found that I was getting sporadic errors when booting SNB, with the symptom being that the first batch died with IPEHR != *ACTHD, typically caused by the TLB being invalid. These magically disappeared if I held the forcewake during the entire ring initialisation sequence. (It can probably be shortened to a short critical section, but the whole initialisation is full of register writes and so we would be taking and releasing forcewake almost continually, and so holding it over the entire sequence will probably be a net win!) Note some of the kernels I encounted the issue already had the deferred forcewake release, so it is still relevant. I know that there have been a few other reports with similar failure conditions on SNB, I think such as References: https://bugs.freedesktop.org/show_bug.cgi?id=80913 v2: Wrap i915_gem_init_hw() with its own security blanket as we take that path following resume and reset. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 8d15c8110962..08450922f373 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4783,6 +4783,9 @@ i915_gem_init_hw(struct drm_device *dev) if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt()) return -EIO; + /* Double layer security blanket, see i915_gem_init() */ + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); + if (dev_priv->ellc_size) I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf)); @@ -4815,7 +4818,7 @@ i915_gem_init_hw(struct drm_device *dev) for_each_ring(ring, dev_priv, i) { ret = ring->init_hw(ring); if (ret) - return ret; + goto out; } for (i = 0; i < NUM_L3_SLICES(dev); i++) @@ -4832,9 +4835,11 @@ i915_gem_init_hw(struct drm_device *dev) DRM_ERROR("Context enable failed %d\n", ret); i915_gem_cleanup_ringbuffer(dev); - return ret; + goto out; } +out: + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL); return ret; } @@ -4868,6 +4873,14 @@ int i915_gem_init(struct drm_device *dev) dev_priv->gt.stop_ring = intel_logical_ring_stop; } + /* This is just a security blanket to placate dragons. +* On some systems, we very sporadically observe that the first TLBs +* used by the CS may be stale, despite us poking the TLB reset. If +* we hold the forcewake during initialisation these problems +* just magically go away. +*/ + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); + ret = i915_gem_init_userptr(dev); if (ret) goto out_unlock; @@ -4894,6 +4907,7 @@ int i915_gem_init(struct drm_device *dev) } out_unlock: + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL); mutex_unlock(&dev->struct_mutex); return ret; -- 2.1.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Remove references to previously removed UMS config option
On Fri, Feb 06, 2015 at 12:48:57PM +0200, Jani Nikula wrote: > On Fri, 06 Feb 2015, Andreas Ruprecht wrote: > > Commit 03dae59c72d8 ("drm/i915: Ditch UMS config option") removed > > CONFIG_DRM_I915_UMS from the Kconfig file, but i915_drv.c still > > references this option in two #ifndef statements. > > > > As an undefined config option will always be 'false', we can drop > > the #ifndefs alltogether and adapt the printed error message. > > > > This inconsistency was found with the undertaker tool. > > > > Signed-off-by: Andreas Ruprecht > > --- > > drivers/gpu/drm/i915/i915_drv.c | 6 +- > > 1 file changed, 1 insertion(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c > > b/drivers/gpu/drm/i915/i915_drv.c > > index 8039cec..4ecf85f 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.c > > +++ b/drivers/gpu/drm/i915/i915_drv.c > > @@ -1630,11 +1630,9 @@ static int __init i915_init(void) > > > > if (!(driver.driver_features & DRIVER_MODESET)) { > > driver.get_vblank_timestamp = NULL; > > -#ifndef CONFIG_DRM_I915_UMS > > /* Silently fail loading to not upset userspace. */ > > - DRM_DEBUG_DRIVER("KMS and UMS disabled.\n"); > > + DRM_DEBUG_DRIVER("KMS disabled.\n"); > > I'm not sure if this logging change is required. UMS will still also be > disabled. Or maybe make it "KMS disabled, UMS not > supported.\n". Background in > > commit c9cd7b65db50175a5f1ff64bbad6d5affdad6aba > Author: Jani Nikula > Date: Mon Jun 2 16:58:30 2014 +0300 > > drm/i915: tell the user if both KMS and UMS are disabled > > Other than that, Undone ... > > Reviewed-by: Jani Nikula and merged for 3.21, thanks for patch&review. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 5/5] drm/i915: Tidy batch pool logic
On Fri, Feb 13, 2015 at 02:00:50PM +, John Harrison wrote: > >+list_move_tail(&obj->batch_pool_list, &pool->cache_list); > Why is it now safe to do a move_tail instead of add_tail if the node > has just been allocated? Was the original add_tail() wrong or am I > not spotting some critical difference to how new pool objects are > created? The link is initialised in i915_gem_object_init(). It was always safe to use list_move_tail. > >+i915_gem_object_pin_pages(obj); > Is it worth updating the function description comment to add a line > about the returned buffer now being pinned and the caller must worry > about unpinning it? Didn't even spot that there was a function description. The other choice is to just push the pinning into the caller, the emphasis was on moving get_pages() into the allocator, and so for consistency it should also pin the pages. Will update. Now I just want to rename it from batch pool to buffer pool... -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
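Assuming the update keeps the pinning inside the pool rather than pushing it to the caller, the one-line addition John asks for might read roughly as follows (illustrative wording, not the committed comment):

	/**
	 * i915_gem_batch_pool_get() - select a buffer from the pool
	 * @pool: the batch buffer pool
	 * @size: the minimum desired size of the returned buffer
	 *
	 * Finds or allocates an inactive buffer in the pool with at least
	 * @size bytes. The returned buffer has its backing pages pinned;
	 * the caller must call i915_gem_object_unpin_pages() once it has
	 * finished with the buffer.
	 *
	 * Return: the buffer object or an error pointer
	 */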