[PATCH] drm/radeon/kms: Fix device tree linkage of DP i2c buses too
Properly set the parent device of DP i2c buses before registering them too. Signed-off-by: Jean Delvare Cc: Dave Airlie Cc: Alex Deucher --- I'm sorry, my previous patch missed the fact that DP i2c buses are handled separately so they need the same fix. drivers/gpu/drm/radeon/radeon_i2c.c |1 + 1 file changed, 1 insertion(+) --- linux-3.3-rc1.orig/drivers/gpu/drm/radeon/radeon_i2c.c 2012-01-31 08:55:20.596108280 +0100 +++ linux-3.3-rc1/drivers/gpu/drm/radeon/radeon_i2c.c 2012-01-31 08:55:22.334108306 +0100 @@ -956,6 +956,7 @@ struct radeon_i2c_chan *radeon_i2c_creat i2c->rec = *rec; i2c->adapter.owner = THIS_MODULE; i2c->adapter.class = I2C_CLASS_DDC; + i2c->adapter.dev.parent = &dev->pdev->dev; i2c->dev = dev; snprintf(i2c->adapter.name, sizeof(i2c->adapter.name), "Radeon aux bus %s", name); -- Jean Delvare Suse L3 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] dma-buf: add dma_data_direction to unmap dma_buf_op
Hi Sumit, > On Friday 27 January 2012 10:43:28 Sumit Semwal wrote: [snip] > static inline void dma_buf_unmap_attachment(struct dma_buf_attachment > *attach, > -struct sg_table *sg) > + struct sg_table *sg, enum dma_data_direction write) On a second thought, would it make sense to store the direction in struct dma_buf_attachment in dma_buf_map_attachment(), and pass the value directly to the .unmap_dma_buf() instead of requiring the dma_buf_unmap_attachment() caller to remember it ? Or is an attachment allowed to map the buffer several times with different directions ? -- Regards, Laurent Pinchart ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 45432] New: memory errors on cayman with VM support + nexuiz
https://bugs.freedesktop.org/show_bug.cgi?id=45432 Bug #: 45432 Summary: memory errors on cayman with VM support + nexuiz Classification: Unclassified Product: Mesa Version: git Platform: Other OS/Version: All Status: NEW Severity: normal Priority: medium Component: Drivers/Gallium/r600 AssignedTo: dri-devel@lists.freedesktop.org ReportedBy: airl...@freedesktop.org Running latest kernel + git mesa on cayman start nexuiz-sdl timedemo demos/demo1.dem radeon: Failed to allocate a buffer: radeon:size : 2048 bytes radeon:alignment : 512 bytes radeon:domains : 2 EE r600_texture.c:715 r600_texture_get_transfer - failed to create temporary texture to hold untiled copy Mesa: User error: GL_OUT_OF_MEMORY in glTexImage2D radeon: Failed to allocate a buffer: radeon:size : 2048 bytes radeon:alignment : 512 bytes radeon:domains : 2 EE r600_texture.c:715 r600_texture_get_transfer - failed to create temporary texture to hold untiled copy in dmesg radeon :01:00.0: bo 88021a117400 va 0x09B4F000 conflict with (bo 8801ff5ef000 0x09B4F000 0x09B5) radeon :01:00.0: bo 8801ff528000 va 0x0AA8 conflict with (bo 88021efb6400 0x0AA8 0x0AA84000) -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 45432] memory errors on cayman with VM support + nexuiz
https://bugs.freedesktop.org/show_bug.cgi?id=45432 Dave Airlie changed: What|Removed |Added Keywords||regression -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] dma-buf: add dma_data_direction to unmap dma_buf_op
On Tue, Jan 31, 2012 at 10:42:59AM +0100, Laurent Pinchart wrote: > Hi Sumit, > > > On Friday 27 January 2012 10:43:28 Sumit Semwal wrote: > > [snip] > > > static inline void dma_buf_unmap_attachment(struct dma_buf_attachment > > *attach, > > -struct sg_table *sg) > > + struct sg_table *sg, enum dma_data_direction write) > > On a second thought, would it make sense to store the direction in struct > dma_buf_attachment in dma_buf_map_attachment(), and pass the value directly > to > the .unmap_dma_buf() instead of requiring the dma_buf_unmap_attachment() > caller to remember it ? Or is an attachment allowed to map the buffer several > times with different directions ? Current dma api functions already require you to supply the direction argument on unmap and I think for cpu access I'm also leaning towards an interface where the importer has to supply the direction argument for both begin_access and end_access. So for consistency reasons I'm leaning towards adding it to unmap. -Daniel -- Daniel Vetter Mail: dan...@ffwll.ch Mobile: +41 (0)79 365 57 48 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
r600g: Trying to remove the busywait for fence completion, but hitting inexplicable behaviour
Hello, When profiling my workload on an AMD E-350 (PALM GPU) to see why it still wasn't performing well with Jerome's WIP macrotiling patches, I noticed that r600_fence_finish was taking 10% of my CPU time. I determined experimentally that changing from sched_yield() to os_time_sleep(10) fixed this and resolved my last performance issue on AMD Fusion as compared to Intel Atom, but felt that this was hacky. I've therefore tried to use INT_SEL of 0b10 in the EVENT_WRITE_EOP in Mesa, combined with a new ioctl to wait for a changed value, but it's not working the way I would expect. I'll be sending patches as replies to this message, so that you can see exactly what I've done, but in brief, I have an ioctl that uses wait_event to wait for a chosen offset in a BO to change value. I've added a suitable waitqueue, and made radeon_fence_process call wake_up_all. I'm seeing behaviour from this that I can't explain; as you'll see in the patches, I've moved some IRQ prints from DRM_DEBUG to printk(KERN_INFO), and I'm seeing that I don't get the EOP interrupt in a timely fashion - either because memory is not as coherent between the GPU and CPU as I would like (so I'm reading stale data when I call wait_event), or because the interrupt is genuinely delayed. From dmesg (with commentary): X11 and my GL compositor start. [ 84.423567] IH: CP EOP [ 84.423600] Woke kernel fences [ 84.423606] Waking up all waiters This looks like an EOP for a kernel-side fence. [ 84.651320] wait_user_fence offset 4 value 0 timeout -1 [ 84.651332] Current data value 0 This is my compositor coming in via my new ioctl, to wait for EOP. I get bored of waiting for the ioctl to complete, and send the compositor SIGSTOP then SIGCONT. [ 149.970629] wait_user_fence offset 4 value 0 timeout -1 [ 149.970635] Finished data value 1 My new ioctl completes, as the data has changed value. I was expecting an EOP interrupt before this, which hasn't arrived - why? [ 150.224675] wait_user_fence offset 8 value 0 timeout -1 [ 150.224692] Current data value 0 The compositor comes in again, waiting on a different fence. [ 150.250166] IH: CP EOP [ 150.250194] Woke kernel fences [ 150.250198] Waking up all waiters This looks like an EOP for a kernel-side fence. [ 150.250212] IH: CP EOP [ 150.250216] Waking up all waiters This looks like an EOP for the fence that completed at time 149.970 - why's it been delayed? [ 150.250219] IH: CP EOP [ 150.250221] Waking up all waiters And another EOP that I can't tie up to command buffers. [ 150.250327] IH: CP EOP [ 150.250335] Woke kernel fences [ 150.250337] Waking up all waiters Kernel fence. [ 150.250559] IH: CP EOP [ 150.250567] Woke kernel fences [ 150.250570] Waking up all waiters Another kernel fence. [ 150.250581] IH: CP EOP [ 150.250583] Waking up all waiters [ 150.250595] wait_user_fence offset 8 value 0 timeout -1 [ 150.250604] IH: CP EOP [ 150.250608] Waking up all waiters [ 150.250615] Finished data value 1 Two user fence EOPs in a row, one of which woke up my process. [ 150.251462] IH: CP EOP [ 150.251477] Woke kernel fences [ 150.251480] Waking up all waiters Kernel fence. [ 150.256806] wait_user_fence offset 0 value 0 timeout -1 [ 150.256828] Current data value 0 Stalled again, waiting for EOP interrupt. Could be because the GPU and CPU have different views of memory at this point. There will be two patches in reply to this mail - one is the Mesa patch, one the kernel patch; I would greatly appreciate help getting this going. -- Simon Farnsworth Software Engineer ONELAN Limited http://www.onelan.com/ signature.asc Description: This is a digitally signed message part. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/radeon/kms: Fix device tree linkage of DP i2c buses too
On Tue, Jan 31, 2012 at 3:55 AM, Jean Delvare wrote: > Properly set the parent device of DP i2c buses before registering them > too. > > Signed-off-by: Jean Delvare > Cc: Dave Airlie > Cc: Alex Deucher Reviewed-by: Alex Deucher > --- > I'm sorry, my previous patch missed the fact that DP i2c buses are > handled separately so they need the same fix. > > drivers/gpu/drm/radeon/radeon_i2c.c | 1 + > 1 file changed, 1 insertion(+) > > --- linux-3.3-rc1.orig/drivers/gpu/drm/radeon/radeon_i2c.c 2012-01-31 > 08:55:20.596108280 +0100 > +++ linux-3.3-rc1/drivers/gpu/drm/radeon/radeon_i2c.c 2012-01-31 > 08:55:22.334108306 +0100 > @@ -956,6 +956,7 @@ struct radeon_i2c_chan *radeon_i2c_creat > i2c->rec = *rec; > i2c->adapter.owner = THIS_MODULE; > i2c->adapter.class = I2C_CLASS_DDC; > + i2c->adapter.dev.parent = &dev->pdev->dev; > i2c->dev = dev; > snprintf(i2c->adapter.name, sizeof(i2c->adapter.name), > "Radeon aux bus %s", name); > > -- > Jean Delvare > Suse L3 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2] drm/radeon/kms: remove benchmarks shorter than one page
On Mon, Jan 30, 2012 at 11:10 PM, Ilija Hadzic wrote: > copy_blit operation works only on integral number of pages > so benchmarks shorter than one page size (4K) do not make sense > > v2: use RADEON_GPU_PAGE_SIZE instead of "magic" 1024 number and > sweep sizes between 1x to 16x doubling > the size in each iteration; we get the same coverage, as > in the original benchmark, but guarantee integer multiples > of page size > > Signed-off-by: Ilija Hadzic Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/radeon/radeon_benchmark.c | 12 ++-- > 1 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/radeon/radeon_benchmark.c > b/drivers/gpu/drm/radeon/radeon_benchmark.c > index 815f234..d3c0002 100644 > --- a/drivers/gpu/drm/radeon/radeon_benchmark.c > +++ b/drivers/gpu/drm/radeon/radeon_benchmark.c > @@ -208,22 +208,22 @@ void radeon_benchmark(struct radeon_device *rdev, int > test_number) > break; > case 3: > /* GTT to VRAM, buffer size sweep, powers of 2 */ > - for (i = 1; i <= 65536; i <<= 1) > - radeon_benchmark_move(rdev, i*1024, > + for (i = 1; i <= 16384; i <<= 1) > + radeon_benchmark_move(rdev, i*RADEON_GPU_PAGE_SIZE, > RADEON_GEM_DOMAIN_GTT, > RADEON_GEM_DOMAIN_VRAM); > break; > case 4: > /* VRAM to GTT, buffer size sweep, powers of 2 */ > - for (i = 1; i <= 65536; i <<= 1) > - radeon_benchmark_move(rdev, i*1024, > + for (i = 1; i <= 16384; i <<= 1) > + radeon_benchmark_move(rdev, i*RADEON_GPU_PAGE_SIZE, > RADEON_GEM_DOMAIN_VRAM, > RADEON_GEM_DOMAIN_GTT); > break; > case 5: > /* VRAM to VRAM, buffer size sweep, powers of 2 */ > - for (i = 1; i <= 65536; i <<= 1) > - radeon_benchmark_move(rdev, i*1024, > + for (i = 1; i <= 16384; i <<= 1) > + radeon_benchmark_move(rdev, i*RADEON_GPU_PAGE_SIZE, > RADEON_GEM_DOMAIN_VRAM, > RADEON_GEM_DOMAIN_VRAM); > break; > -- > 1.7.7 > > ___ > dri-devel mailing list > dri-devel@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] [r600g] Use new kernel interface to wait for fences
Instead of busywaiting for the GPU to finish a fence, use the new kernel interface to wait for fence completion. This code needs completion - in particular, we should fall back to busywaiting (using the nokernel function that's in radeon_drm_bo.c) if the kernel doesn't support the new interface. Signed-off-by: Simon Farnsworth --- src/gallium/drivers/r600/r600_hw_context.c|2 +- src/gallium/drivers/r600/r600_pipe.c | 12 +++-- src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 30 + src/gallium/winsys/radeon/drm/radeon_winsys.h | 16 + 4 files changed, 51 insertions(+), 9 deletions(-) diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c index 8eb8e6d..35a57a7 100644 --- a/src/gallium/drivers/r600/r600_hw_context.c +++ b/src/gallium/drivers/r600/r600_hw_context.c @@ -1618,7 +1618,7 @@ void r600_context_emit_fence(struct r600_context *ctx, struct r600_resource *fen ctx->pm4[ctx->pm4_cdwords++] = EVENT_TYPE(EVENT_TYPE_CACHE_FLUSH_AND_INV_TS_EVENT) | EVENT_INDEX(5); ctx->pm4[ctx->pm4_cdwords++] = va & 0xUL; /* ADDRESS_LO */ /* DATA_SEL | INT_EN | ADDRESS_HI */ - ctx->pm4[ctx->pm4_cdwords++] = (1 << 29) | (0 << 24) | ((va >> 32UL) & 0xFF); + ctx->pm4[ctx->pm4_cdwords++] = (1 << 29) | (2 << 24) | ((va >> 32UL) & 0xFF); ctx->pm4[ctx->pm4_cdwords++] = value; /* DATA_LO */ ctx->pm4[ctx->pm4_cdwords++] = 0; /* DATA_HI */ ctx->pm4[ctx->pm4_cdwords++] = PKT3(PKT3_NOP, 0, 0); diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c index c38fbc5..12c5bf5 100644 --- a/src/gallium/drivers/r600/r600_pipe.c +++ b/src/gallium/drivers/r600/r600_pipe.c @@ -595,7 +595,6 @@ static boolean r600_fence_finish(struct pipe_screen *pscreen, struct r600_screen *rscreen = (struct r600_screen *)pscreen; struct r600_fence *rfence = (struct r600_fence*)fence; int64_t start_time = 0; - unsigned spins = 0; if (timeout != PIPE_TIMEOUT_INFINITE) { start_time = os_time_get(); @@ -605,13 +604,10 @@ static boolean r600_fence_finish(struct pipe_screen *pscreen, } while (rscreen->fences.data[rfence->index] == 0) { - if (++spins % 256) - continue; -#ifdef PIPE_OS_UNIX - sched_yield(); -#else - os_time_sleep(10); -#endif + rscreen->ws->buffer_wait_fence(rscreen->fences.bo->buf, + rfence->index << 2, + 0, + timeout); if (timeout != PIPE_TIMEOUT_INFINITE && os_time_get() - start_time >= timeout) { return FALSE; diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c index 143dcf9..b552c11 100644 --- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c +++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c @@ -879,6 +879,35 @@ static uint64_t radeon_winsys_bo_va(struct pb_buffer *buffer) return bo->va; } +/* No kernel support for doing this faster - just spin */ +static void radeon_winsys_bo_wait_fence_nokernel(struct pb_buffer *buf, +unsigned offset, +uint32_t value, +uint64_t timeout) +{ +#ifdef PIPE_OS_UNIX +sched_yield(); +#else +os_time_sleep(10); +#endif +} + +static void radeon_winsys_bo_wait_fence(struct pb_buffer *_buf, + unsigned offset, + uint32_t value, + uint64_t timeout) +{ +struct radeon_bo *bo = get_radeon_bo(_buf); +struct drm_radeon_gem_wait_user_fence args; +memset(&args, 0, sizeof(args)); +args.handle = bo->handle; +args.offset = offset; +args.value = value; +args.timeout_usec = timeout; +while (drmCommandWrite(bo->rws->fd, DRM_RADEON_GEM_WAIT_USER_FENCE, + &args, sizeof(args)) == -EBUSY); +} + void radeon_bomgr_init_functions(struct radeon_drm_winsys *ws) { ws->base.buffer_get_cs_handle = radeon_drm_get_cs_handle; @@ -892,4 +921,5 @@ void radeon_bomgr_init_functions(struct radeon_drm_winsys *ws) ws->base.buffer_from_handle = radeon_winsys_bo_from_handle; ws->base.buffer_get_handle = radeon_winsys_bo_get_handle; ws->base.buffer_get_virtual_address = radeon_winsys_bo_va; +ws->base.buffer_wait_fence = radeon_winsys_bo_wait_fence; } diff --git a/src/gallium/winsys/radeon/drm/radeon_winsys.h b/src/gallium/winsys/radeon/drm/radeon_winsys.h index e462e86..869961f 100644 --- a/src/gallium/winsys/radeon/drm/radeon_winsys.h +++ b/src/gal
Re: [PATCH v2] drm/radeon/kms: remove benchmarks shorter than one page
On Tue, 31 Jan 2012, Alex Deucher wrote: Signed-off-by: Ilija Hadzic Reviewed-by: Alex Deucher Thanks. There will be a v3, though to address one tirvial comment (whitespace between binary '*' operator) that I received in E-mail sent directly to me. I guess the diff v2/v3 will be trivial enough that we can consider that this review applies too. -- Ilija ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2] drm/radeon/kms: remove benchmarks shorter than one page
On Tue, Jan 31, 2012 at 9:27 AM, Ilija Hadzic wrote: > > > On Tue, 31 Jan 2012, Alex Deucher wrote: > >>> >>> Signed-off-by: Ilija Hadzic >> >> >> Reviewed-by: Alex Deucher >> > > Thanks. There will be a v3, though to address one tirvial comment > (whitespace between binary '*' operator) that I received in E-mail sent > directly to me. > > I guess the diff v2/v3 will be trivial enough that we can consider that this > review applies too. yes, consider that trivial change reviewed as well. Alex > > -- Ilija ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm/radeon: Add support for userspace fence waits
Userspace currently busywaits for fences to complete; on my workload, this busywait consumes 10% of the available CPU time. Provide an ioctl so that userspace can wait for an EOP interrupt that corresponds to a previous EVENT_WRITE_EOP. This currently doesn't work, hence the debug code piled in. Signed-off-by: Simon Farnsworth --- drivers/gpu/drm/radeon/evergreen.c |8 ++-- drivers/gpu/drm/radeon/radeon.h|3 + drivers/gpu/drm/radeon/radeon_device.c |1 + drivers/gpu/drm/radeon/radeon_fence.c |3 + drivers/gpu/drm/radeon/radeon_gem.c| 70 drivers/gpu/drm/radeon/radeon_kms.c|1 + include/drm/radeon_drm.h | 28 + 7 files changed, 110 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index 0c5dd78..5b886b0 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -3083,11 +3083,11 @@ restart_ih: case 176: /* CP_INT in ring buffer */ case 177: /* CP_INT in IB1 */ case 178: /* CP_INT in IB2 */ - DRM_DEBUG("IH: CP int: 0x%08x\n", src_data); + printk(KERN_INFO "IH: CP int: 0x%08x\n", src_data); radeon_fence_process(rdev, RADEON_RING_TYPE_GFX_INDEX); break; case 181: /* CP EOP event */ - DRM_DEBUG("IH: CP EOP\n"); + printk(KERN_INFO "IH: CP EOP\n"); if (rdev->family >= CHIP_CAYMAN) { switch (src_data) { case 0: @@ -3104,12 +3104,12 @@ restart_ih: radeon_fence_process(rdev, RADEON_RING_TYPE_GFX_INDEX); break; case 233: /* GUI IDLE */ - DRM_DEBUG("IH: GUI idle\n"); + printk(KERN_INFO "IH: GUI idle\n"); rdev->pm.gui_idle = true; wake_up(&rdev->irq.idle_queue); break; default: - DRM_DEBUG("Unhandled interrupt: %d %d\n", src_id, src_data); + printk(KERN_INFO "Unhandled interrupt: %d %d\n", src_id, src_data); break; } diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 2859406..fb0eafd 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -1348,6 +1348,8 @@ int radeon_gem_set_tiling_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); int radeon_gem_get_tiling_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); +int radeon_gem_wait_user_fence_ioctl(struct drm_device *dev, void *data, +struct drm_file *filp); /* VRAM scratch page for HDP bug, default vram page */ struct r600_vram_scratch { @@ -1444,6 +1446,7 @@ struct radeon_device { struct radeon_mman mman; rwlock_tfence_lock; struct radeon_fence_driver fence_drv[RADEON_NUM_RINGS]; + wait_queue_head_t userspace_fence_wait_queue; struct radeon_semaphore_driver semaphore_drv; struct radeon_ring ring[RADEON_NUM_RINGS]; struct radeon_ib_pool ib_pool; diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 0afb13b..dcf11e5 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -730,6 +730,7 @@ int radeon_device_init(struct radeon_device *rdev, mutex_init(&rdev->pm.mutex); mutex_init(&rdev->vram_mutex); rwlock_init(&rdev->fence_lock); + init_waitqueue_head(&rdev->userspace_fence_wait_queue); rwlock_init(&rdev->semaphore_drv.lock); INIT_LIST_HEAD(&rdev->gem.objects); init_waitqueue_head(&rdev->irq.vblank_queue); diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index 64ea3dd..5b8270f 100644 --- a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -355,7 +355,10 @@ void radeon_fence_process(struct radeon_device *rdev, int ring) write_unlock_irqrestore(&rdev->fence_lock, irq_flags); if (wake) { wake_up_all(&rdev->fence_drv[ring].queue); + printk( KERN_INFO "Woke kernel fences\n" ); } + printk( KERN_INFO "Waking up all waiters\n" ); + wake_up_interruptible_all(&rdev->userspace_fence_wait_queue); } int radeon_fence_count_emitted(struct radeon_device *rdev, int ring) diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c index 7337850..6866f75 100644 --- a/drivers/
[PATCH v3] drm/radeon/kms: remove benchmarks shorter than one page
copy_blit operation works only on integral number of pages so benchmarks shorter than one page size (4K) do not make sense v2: use RADEON_GPU_PAGE_SIZE instead of "magic" 1024 number and sweep sizes between 1 * to 16K * doubling the size in each iteration; we get the same coverage, as in the original benchmark, but guarantee integer multiples of page size v3: add whitespace between '*' operator per review received from zaj...@gmail.com Signed-off-by: Ilija Hadzic --- drivers/gpu/drm/radeon/radeon_benchmark.c | 12 ++-- 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_benchmark.c b/drivers/gpu/drm/radeon/radeon_benchmark.c index 815f234..58cee89 100644 --- a/drivers/gpu/drm/radeon/radeon_benchmark.c +++ b/drivers/gpu/drm/radeon/radeon_benchmark.c @@ -208,22 +208,22 @@ void radeon_benchmark(struct radeon_device *rdev, int test_number) break; case 3: /* GTT to VRAM, buffer size sweep, powers of 2 */ - for (i = 1; i <= 65536; i <<= 1) - radeon_benchmark_move(rdev, i*1024, + for (i = 1; i <= 16384; i <<= 1) + radeon_benchmark_move(rdev, i * RADEON_GPU_PAGE_SIZE, RADEON_GEM_DOMAIN_GTT, RADEON_GEM_DOMAIN_VRAM); break; case 4: /* VRAM to GTT, buffer size sweep, powers of 2 */ - for (i = 1; i <= 65536; i <<= 1) - radeon_benchmark_move(rdev, i*1024, + for (i = 1; i <= 16384; i <<= 1) + radeon_benchmark_move(rdev, i * RADEON_GPU_PAGE_SIZE, RADEON_GEM_DOMAIN_VRAM, RADEON_GEM_DOMAIN_GTT); break; case 5: /* VRAM to VRAM, buffer size sweep, powers of 2 */ - for (i = 1; i <= 65536; i <<= 1) - radeon_benchmark_move(rdev, i*1024, + for (i = 1; i <= 16384; i <<= 1) + radeon_benchmark_move(rdev, i * RADEON_GPU_PAGE_SIZE, RADEON_GEM_DOMAIN_VRAM, RADEON_GEM_DOMAIN_VRAM); break; -- 1.7.7 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/radeon: Add support for userspace fence waits
Alex Deucher pointed out an error on IRC; I'm not using radeon_irq_kms_sw_irq_get and put to manage the IRQ enablement. I've fixed this up (as per the partial hunk below), and my bug goes away. I will be cleaning these patches up for proper submission. Simon On Tuesday 31 January 2012, Simon Farnsworth wrote: > +int radeon_gem_wait_user_fence_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp) > +{ > + struct drm_radeon_gem_wait_user_fence *args = data; > + struct radeon_device *rdev = dev->dev_private; > + struct drm_gem_object *gobj; > + struct radeon_bo *robj; > + void *buffer_data; > + uint32_t *fence_data; > + int r = 0; > + long timeout; > + > + printk( KERN_INFO "wait_user_fence offset %lld value %d timeout > %lld\n", args->offset, args->value, args->timeout_usec ); > + > + gobj = drm_gem_object_lookup(dev, filp, args->handle); > + if (gobj == NULL) { > + return -ENOENT; > + } > + robj = gem_to_radeon_bo(gobj); > + > + if (gobj->size < args->offset) { > + printk( KERN_INFO "Offset too large\n" ); > + r = -EINVAL; > + goto unreference; > + } > + > + r = radeon_bo_reserve(robj, true); > + if (r) { > + printk( KERN_INFO "Reserve fail\n" ); > + goto unreference; > + } > + > + r = radeon_bo_pin(robj, RADEON_GEM_DOMAIN_GTT, NULL); > + if (r) { > + printk( KERN_INFO "Pin fail\n" ); > + goto unreserve; > + } > + > + r = radeon_bo_kmap(robj, &buffer_data); > + if (r) { > + printk( KERN_INFO "kmap fail\n" ); > + goto unpin; > + } > + Missing radeon_irq_kms_sw_irq_get(rdev, RADEON_RING_TYPE_GFX_INDEX); here > + fence_data = (uint32_t*)buffer_data; > + > + printk( KERN_INFO "Current data value %d\n", fence_data[args->offset >> > 2] ); > + > + timeout = > wait_event_interruptible_timeout(rdev->userspace_fence_wait_queue, > +fence_data[args->offset >> > 2] != args->value, > + > usecs_to_jiffies(args->timeout_usec)); And missing radeon_irq_kms_sw_irq_put(rdev, RADEON_RING_TYPE_GFX_INDEX); here. > + if (timeout == 0) > + r = -ETIMEDOUT; > + else if (timeout < 0) > + r = timeout; > + > + printk( KERN_INFO "wait_user_fence offset %lld value %d timeout > %lld\n", args->offset, args->value, args->timeout_usec ); > + printk( KERN_INFO "Finished data value %d\n", fence_data[args->offset > >> 2] ); > + > + radeon_bo_kunmap(robj); > +unpin: > + radeon_bo_unpin(robj); > +unreserve: > + radeon_bo_unreserve(robj); > +unreference: > + drm_gem_object_unreference_unlocked(gobj); > + > + return r; > +} -- Simon Farnsworth Software Engineer ONELAN Limited http://www.onelan.com/ signature.asc Description: This is a digitally signed message part. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: r600g: Trying to remove the busywait for fence completion, but hitting inexplicable behaviour
On Tuesday 31 January 2012, Alan Swanson wrote: > On Tue, 2012-01-31 at 13:16 +, Simon Farnsworth wrote: > > Hello, > > > > When profiling my workload on an AMD E-350 (PALM GPU) to see why it still > > wasn't performing well with Jerome's WIP macrotiling patches, I noticed that > > r600_fence_finish was taking 10% of my CPU time. I determined experimentally > > that changing from sched_yield() to os_time_sleep(10) fixed this and > > resolved my last performance issue on AMD Fusion as compared to Intel Atom, > > but felt that this was hacky. > > No, you were right the first time, sched_yield should definitely not be > being used in this busy wait under Linux (even with its preceding few > spins). > Given that I can get the hardware to raise an interrupt when the fence finishes, I think the sleep is hacky - I should not be spinning under normal circumstances. I'm preparing patches that remove the sleep completely if you have new enough kernel, libdrm and Mesa. If someone else prepares patches to remove the sched_yield usage completely, I'll happily test them on my Radeon hardware. -- Simon Farnsworth Software Engineer ONELAN Limited http://www.onelan.com/ signature.asc Description: This is a digitally signed message part. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: r600g: Trying to remove the busywait for fence completion, but hitting inexplicable behaviour
On Tue, 2012-01-31 at 13:16 +, Simon Farnsworth wrote: > Hello, > > When profiling my workload on an AMD E-350 (PALM GPU) to see why it still > wasn't performing well with Jerome's WIP macrotiling patches, I noticed that > r600_fence_finish was taking 10% of my CPU time. I determined experimentally > that changing from sched_yield() to os_time_sleep(10) fixed this and > resolved my last performance issue on AMD Fusion as compared to Intel Atom, > but felt that this was hacky. No, you were right the first time, sched_yield should definitely not be being used in this busy wait under Linux (even with its preceding few spins). Back in 2003 when Linux 2.5 changed sched_yield to move processes from run queue to the expired queue instead, its use was discussed on the DRI devel list. http://lwn.net/Articles/31462/ http://thread.gmane.org/gmane.comp.video.dri.devel/5455/focus=6779 There stills seems to be five uses that should be checked. src/gallium/drivers/r600/r600_pipe.c src/gallium/drivers/nouveau/nouveau_fence.c src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c src/gallium/winsys/radeon/drm/radeon_drm_bo.c (*2) -- Alan. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] Don't build Intel DRM if $CHOST is not i?86-* or x86_64-*
On Mon, 30 Jan 2012 15:25:20 -0800, Jeremy Huddleston wrote: > This fixes a failure in 'make check' found by the tinderbox when trying to > build this code on Linux/ppc. This code is only designed to run on > Intel platforms, so don't even bother building it if we're not in that set. Looks reasonable to me, except it should probably be checking $target_os (cross-compile target) rather than $host_os (cross compile build host). pgp3K5slIMz9Q.pgp Description: PGP signature ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm/radeon: Add support for userspace fence waits
Userspace currently busywaits for fences to complete; on my workload, this busywait consumes 10% of the available CPU time. Provide an ioctl so that userspace can wait for an EOP interrupt that corresponds to a previous EVENT_WRITE_EOP. Signed-off-by: Simon Farnsworth --- I've been working on top of Jerome's tiling patches, so this doesn't apply directly on top of current upstream kernels. I can easily rebase to another version upon request - just point me to a git tree. My goal is to remove the sched_yield in Mesa's r600_fence_finish given up to date enough kernel; I hope, though, that the interface is clean enough for other users to extend it in the future (e.g. using compute rings). drivers/gpu/drm/radeon/radeon.h |3 + drivers/gpu/drm/radeon/radeon_drv.c |3 +- drivers/gpu/drm/radeon/radeon_fence.c |2 + drivers/gpu/drm/radeon/radeon_gem.c | 70 + drivers/gpu/drm/radeon/radeon_kms.c |1 + include/drm/radeon_drm.h | 30 ++ 6 files changed, 108 insertions(+), 1 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 2859406..00c187b 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -217,6 +217,7 @@ struct radeon_fence_driver { unsigned long last_jiffies; unsigned long last_timeout; wait_queue_head_t queue; + wait_queue_head_t userspace_queue; struct list_headcreated; struct list_heademitted; struct list_headsignaled; @@ -1348,6 +1349,8 @@ int radeon_gem_set_tiling_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); int radeon_gem_get_tiling_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); +int radeon_gem_wait_user_fence_ioctl(struct drm_device *dev, void *data, +struct drm_file *filp); /* VRAM scratch page for HDP bug, default vram page */ struct r600_vram_scratch { diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 4ae2e1d..9f82fa9 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -56,9 +56,10 @@ * 2.12.0 - RADEON_CS_KEEP_TILING_FLAGS * 2.13.0 - virtual memory support * 2.14.0 - add evergreen tiling informations + * 2.15.0 - gem_wait_user_fence ioctl */ #define KMS_DRIVER_MAJOR 2 -#define KMS_DRIVER_MINOR 14 +#define KMS_DRIVER_MINOR 15 #define KMS_DRIVER_PATCHLEVEL 0 int radeon_driver_load_kms(struct drm_device *dev, unsigned long flags); int radeon_driver_unload_kms(struct drm_device *dev); diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index 64ea3dd..d86bc28 100644 --- a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -356,6 +356,7 @@ void radeon_fence_process(struct radeon_device *rdev, int ring) if (wake) { wake_up_all(&rdev->fence_drv[ring].queue); } + wake_up_interruptible_all(&rdev->fence_drv[ring].userspace_queue); } int radeon_fence_count_emitted(struct radeon_device *rdev, int ring) @@ -421,6 +422,7 @@ static void radeon_fence_driver_init_ring(struct radeon_device *rdev, int ring) INIT_LIST_HEAD(&rdev->fence_drv[ring].emitted); INIT_LIST_HEAD(&rdev->fence_drv[ring].signaled); init_waitqueue_head(&rdev->fence_drv[ring].queue); + init_waitqueue_head(&rdev->fence_drv[ring].userspace_queue); rdev->fence_drv[ring].initialized = false; } diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c index 7337850..602274f 100644 --- a/drivers/gpu/drm/radeon/radeon_gem.c +++ b/drivers/gpu/drm/radeon/radeon_gem.c @@ -531,3 +531,73 @@ int radeon_mode_dumb_destroy(struct drm_file *file_priv, { return drm_gem_handle_delete(file_priv, handle); } + +int radeon_gem_wait_user_fence_ioctl(struct drm_device *dev, void *data, + struct drm_file *filp) +{ + struct drm_radeon_gem_wait_user_fence *args = data; + struct radeon_device *rdev = dev->dev_private; + struct drm_gem_object *gobj; + struct radeon_bo *robj; + void *buffer_data; + uint32_t *fence_data; + int r = 0; + long timeout; + int ring = RADEON_RING_TYPE_GFX_INDEX; + + /* If you're implementing this for other rings, you'll want to share + code with radeon_cs_get_ring in radeon_cs.c */ + if (args->ring != RADEON_CS_RING_GFX) { + return -EINVAL; + } + + gobj = drm_gem_object_lookup(dev, filp, args->handle); + if (gobj == NULL) { + return -ENOENT; + } + robj = gem_to_radeon_bo(gobj); + + if (gobj->size
Re: [PATCH] drm/radeon: Add support for userspace fence waits
On Die, 2012-01-31 at 16:59 +, Simon Farnsworth wrote: > Userspace currently busywaits for fences to complete; on my workload, this > busywait consumes 10% of the available CPU time. > > Provide an ioctl so that userspace can wait for an EOP interrupt that > corresponds to a previous EVENT_WRITE_EOP. > > Signed-off-by: Simon Farnsworth > --- > I've been working on top of Jerome's tiling patches, so this doesn't apply > directly on top of current upstream kernels. I can easily rebase to another > version upon request - just point me to a git tree. > > My goal is to remove the sched_yield in Mesa's r600_fence_finish given up to > date enough kernel; I hope, though, that the interface is clean enough for > other users to extend it in the future (e.g. using compute rings). I'm afraid not: Unless I'm missing something, userspace can't know which ring the kernel submitted the CS to, and the kernel can't guess which ring userspace needs to wait for. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 30502] [RADEON:R600C] atmospheres of planets get transparent when going away from planets in celestia
https://bugs.freedesktop.org/show_bug.cgi?id=30502 aceman changed: What|Removed |Added Status|NEW |RESOLVED Resolution||WORKSFORME --- Comment #15 from aceman 2012-01-31 10:15:27 PST --- I have not seen it since the report so let's close it for now. And I am on R600g as it gets usable and the default. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 31423] Stars in Celestia misrendered
https://bugs.freedesktop.org/show_bug.cgi?id=31423 --- Comment #2 from aceman 2012-01-31 10:17:34 PST --- On R600g, Mesa 8.0rc2 the stars seem to have proper colors. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 28905] [RADEON:KMS:R600C] missing scene triangle in scorched3d
https://bugs.freedesktop.org/show_bug.cgi?id=28905 aceman changed: What|Removed |Added Status|NEW |RESOLVED Resolution||WORKSFORME --- Comment #6 from aceman 2012-01-31 10:19:04 PST --- With R600g, Mesa 8.0rc2 it looks much better, the triangle is not missing and there are much more scene details and bugs fixed. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/radeon: Add support for userspace fence waits
Some comments below. > + struct radeon_device *rdev = dev->dev_private; > + struct drm_gem_object *gobj; > + struct radeon_bo *robj; > + void *buffer_data; > + uint32_t *fence_data; > + int r = 0; > + long timeout; > + int ring = RADEON_RING_TYPE_GFX_INDEX; > + > + /* If you're implementing this for other rings, you'll want to > share > +code with radeon_cs_get_ring in radeon_cs.c */ > + if (args->ring != RADEON_CS_RING_GFX) { > + return -EINVAL; > + } > + > + gobj = drm_gem_object_lookup(dev, filp, args->handle); > + if (gobj == NULL) { > + return -ENOENT; > + } > + robj = gem_to_radeon_bo(gobj); > + > + if (gobj->size < args->offset) { > + r = -EINVAL; > + goto unreference; > + } > + > + r = radeon_bo_reserve(robj, true); > + if (r) { > + goto unreference; > + } > + > + r = radeon_bo_pin(robj, RADEON_GEM_DOMAIN_GTT, NULL); > + if (r) { > + goto unreserve; > + } > + > + r = radeon_bo_kmap(robj, &buffer_data); > + if (r) { > + goto unpin; > + } > + Do you need to pin it? I think if you have it reserved and you are in here you shouldn't need to. (unless kmapping requires a pin)/ > + radeon_irq_kms_sw_irq_get(rdev, ring); > + > + fence_data = (uint32_t*)buffer_data; > + timeout = > an > + * interrupt and the value in the buffer might have changed. > + */ > +struct drm_radeon_gem_wait_user_fence { > + uint32_thandle; > + uint32_tring; > + uint64_toffset; > + uint32_tvalue; > + uint64_ttimeout_usec; Align things here, 64 then 64 then 32 32 32 and a 32 pad. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/radeon: Add support for userspace fence waits
On Tue, Jan 31, 2012 at 06:56:01PM +0100, Michel Dänzer wrote: > On Die, 2012-01-31 at 16:59 +, Simon Farnsworth wrote: > > Userspace currently busywaits for fences to complete; on my workload, this > > busywait consumes 10% of the available CPU time. > > > > Provide an ioctl so that userspace can wait for an EOP interrupt that > > corresponds to a previous EVENT_WRITE_EOP. > > > > Signed-off-by: Simon Farnsworth > > --- > > I've been working on top of Jerome's tiling patches, so this doesn't apply > > directly on top of current upstream kernels. I can easily rebase to another > > version upon request - just point me to a git tree. > > > > My goal is to remove the sched_yield in Mesa's r600_fence_finish given up to > > date enough kernel; I hope, though, that the interface is clean enough for > > other users to extend it in the future (e.g. using compute rings). > > I'm afraid not: Unless I'm missing something, userspace can't know which > ring the kernel submitted the CS to, and the kernel can't guess which > ring userspace needs to wait for. iirc the plan was to add a return value to cs ioctl and add an ioctl to allow to wait on this return value. ie allowing userspace to wait on specific submited cs. Cheers, Jerome ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/radeon: Add support for userspace fence waits
On Tue, Jan 31, 2012 at 01:55:43PM -0500, David Airlie wrote: > > > Some comments below. > > > + struct radeon_device *rdev = dev->dev_private; > > + struct drm_gem_object *gobj; > > + struct radeon_bo *robj; > > + void *buffer_data; > > + uint32_t *fence_data; > > + int r = 0; > > + long timeout; > > + int ring = RADEON_RING_TYPE_GFX_INDEX; > > + > > + /* If you're implementing this for other rings, you'll want to > > share > > + code with radeon_cs_get_ring in radeon_cs.c */ > > + if (args->ring != RADEON_CS_RING_GFX) { > > + return -EINVAL; > > + } > > + > > + gobj = drm_gem_object_lookup(dev, filp, args->handle); > > + if (gobj == NULL) { > > + return -ENOENT; > > + } > > + robj = gem_to_radeon_bo(gobj); > > + > > + if (gobj->size < args->offset) { > > + r = -EINVAL; > > + goto unreference; > > + } > > + > > + r = radeon_bo_reserve(robj, true); > > + if (r) { > > + goto unreference; > > + } > > + > > + r = radeon_bo_pin(robj, RADEON_GEM_DOMAIN_GTT, NULL); > > + if (r) { > > + goto unreserve; > > + } > > + > > + r = radeon_bo_kmap(robj, &buffer_data); > > + if (r) { > > + goto unpin; > > + } > > + > > > Do you need to pin it? I think if you have it reserved and you are in here > you shouldn't need to. (unless kmapping requires a pin)/ No you don't need to pin, actually it's bad to pin the buffer you might want to wait on might be in vram. Also you never assume that things could go wrong and that your value might never be written to the buffer. So it will wait forever in the kernel. As i said in the other mail i would rather as a wait on submited cs ioctl and add some return value from cs ioctl. I can't remember the outcome of this discusion we had when we were doing virtual memory support. Alex ? Michel ? better memory than i do ? :) Cheers, Jerome > > > > + radeon_irq_kms_sw_irq_get(rdev, ring); > > + > > + fence_data = (uint32_t*)buffer_data; > > + timeout = > > an > > + * interrupt and the value in the buffer might have changed. > > + */ > > +struct drm_radeon_gem_wait_user_fence { > > + uint32_thandle; > > + uint32_tring; > > + uint64_toffset; > > + uint32_tvalue; > > + uint64_ttimeout_usec; > > Align things here, 64 then 64 then 32 32 32 and a 32 pad. > > ___ > dri-devel mailing list > dri-devel@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/radeon: Add support for userspace fence waits
On Tue, Jan 31, 2012 at 2:07 PM, Jerome Glisse wrote: > On Tue, Jan 31, 2012 at 01:55:43PM -0500, David Airlie wrote: >> >> >> Some comments below. >> >> > + struct radeon_device *rdev = dev->dev_private; >> > + struct drm_gem_object *gobj; >> > + struct radeon_bo *robj; >> > + void *buffer_data; >> > + uint32_t *fence_data; >> > + int r = 0; >> > + long timeout; >> > + int ring = RADEON_RING_TYPE_GFX_INDEX; >> > + >> > + /* If you're implementing this for other rings, you'll want to >> > share >> > + code with radeon_cs_get_ring in radeon_cs.c */ >> > + if (args->ring != RADEON_CS_RING_GFX) { >> > + return -EINVAL; >> > + } >> > + >> > + gobj = drm_gem_object_lookup(dev, filp, args->handle); >> > + if (gobj == NULL) { >> > + return -ENOENT; >> > + } >> > + robj = gem_to_radeon_bo(gobj); >> > + >> > + if (gobj->size < args->offset) { >> > + r = -EINVAL; >> > + goto unreference; >> > + } >> > + >> > + r = radeon_bo_reserve(robj, true); >> > + if (r) { >> > + goto unreference; >> > + } >> > + >> > + r = radeon_bo_pin(robj, RADEON_GEM_DOMAIN_GTT, NULL); >> > + if (r) { >> > + goto unreserve; >> > + } >> > + >> > + r = radeon_bo_kmap(robj, &buffer_data); >> > + if (r) { >> > + goto unpin; >> > + } >> > + >> >> >> Do you need to pin it? I think if you have it reserved and you are in here >> you shouldn't need to. (unless kmapping requires a pin)/ > > No you don't need to pin, actually it's bad to pin the buffer you might > want to wait on might be in vram. > > Also you never assume that things could go wrong and that your value > might never be written to the buffer. So it will wait forever in the > kernel. > > As i said in the other mail i would rather as a wait on submited cs > ioctl and add some return value from cs ioctl. I can't remember the > outcome of this discusion we had when we were doing virtual memory > support. Alex ? Michel ? better memory than i do ? :) > I vaguely recall the discussion, but I don't remember the details, I'll have to look through my old mail. Maybe a new CS ioctl flag requesting EOP irqs for the CS would be a better approach than a separate ioctl. Alex > Cheers, > Jerome > >> >> >> > + radeon_irq_kms_sw_irq_get(rdev, ring); >> > + >> > + fence_data = (uint32_t*)buffer_data; >> > + timeout = >> > an >> > + * interrupt and the value in the buffer might have changed. >> > + */ >> > +struct drm_radeon_gem_wait_user_fence { >> > + uint32_t handle; >> > + uint32_t ring; >> > + uint64_t offset; >> > + uint32_t value; >> > + uint64_t timeout_usec; >> >> Align things here, 64 then 64 then 32 32 32 and a 32 pad. >> >> ___ >> dri-devel mailing list >> dri-devel@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/dri-devel > ___ > dri-devel mailing list > dri-devel@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/radeon: Add support for userspace fence waits
On Tue, Jan 31, 2012 at 02:13:00PM -0500, Alex Deucher wrote: > On Tue, Jan 31, 2012 at 2:07 PM, Jerome Glisse wrote: > > On Tue, Jan 31, 2012 at 01:55:43PM -0500, David Airlie wrote: > >> > >> > >> Some comments below. > >> > >> > + struct radeon_device *rdev = dev->dev_private; > >> > + struct drm_gem_object *gobj; > >> > + struct radeon_bo *robj; > >> > + void *buffer_data; > >> > + uint32_t *fence_data; > >> > + int r = 0; > >> > + long timeout; > >> > + int ring = RADEON_RING_TYPE_GFX_INDEX; > >> > + > >> > + /* If you're implementing this for other rings, you'll want to > >> > share > >> > + code with radeon_cs_get_ring in radeon_cs.c */ > >> > + if (args->ring != RADEON_CS_RING_GFX) { > >> > + return -EINVAL; > >> > + } > >> > + > >> > + gobj = drm_gem_object_lookup(dev, filp, args->handle); > >> > + if (gobj == NULL) { > >> > + return -ENOENT; > >> > + } > >> > + robj = gem_to_radeon_bo(gobj); > >> > + > >> > + if (gobj->size < args->offset) { > >> > + r = -EINVAL; > >> > + goto unreference; > >> > + } > >> > + > >> > + r = radeon_bo_reserve(robj, true); > >> > + if (r) { > >> > + goto unreference; > >> > + } > >> > + > >> > + r = radeon_bo_pin(robj, RADEON_GEM_DOMAIN_GTT, NULL); > >> > + if (r) { > >> > + goto unreserve; > >> > + } > >> > + > >> > + r = radeon_bo_kmap(robj, &buffer_data); > >> > + if (r) { > >> > + goto unpin; > >> > + } > >> > + > >> > >> > >> Do you need to pin it? I think if you have it reserved and you are in here > >> you shouldn't need to. (unless kmapping requires a pin)/ > > > > No you don't need to pin, actually it's bad to pin the buffer you might > > want to wait on might be in vram. > > > > Also you never assume that things could go wrong and that your value > > might never be written to the buffer. So it will wait forever in the > > kernel. > > > > As i said in the other mail i would rather as a wait on submited cs > > ioctl and add some return value from cs ioctl. I can't remember the > > outcome of this discusion we had when we were doing virtual memory > > support. Alex ? Michel ? better memory than i do ? :) > > > > I vaguely recall the discussion, but I don't remember the details, > I'll have to look through my old mail. Maybe a new CS ioctl flag > requesting EOP irqs for the CS would be a better approach than a > separate ioctl. > > Alex I think the idea was to return u64 value of specific ring and only enable irq if new wait cs ioctl was call, a little bit like bo wait. Will try to dig through my mail too. Cheers, Jerome > > > >> > >> > >> > + radeon_irq_kms_sw_irq_get(rdev, ring); > >> > + > >> > + fence_data = (uint32_t*)buffer_data; > >> > + timeout = > >> > an > >> > + * interrupt and the value in the buffer might have changed. > >> > + */ > >> > +struct drm_radeon_gem_wait_user_fence { > >> > + uint32_t handle; > >> > + uint32_t ring; > >> > + uint64_t offset; > >> > + uint32_t value; > >> > + uint64_t timeout_usec; > >> > >> Align things here, 64 then 64 then 32 32 32 and a 32 pad. > >> > >> ___ > >> dri-devel mailing list > >> dri-devel@lists.freedesktop.org > >> http://lists.freedesktop.org/mailman/listinfo/dri-devel > > ___ > > dri-devel mailing list > > dri-devel@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/dri-devel ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/radeon: Add support for userspace fence waits
2012/1/31 Jerome Glisse : > On Tue, Jan 31, 2012 at 06:56:01PM +0100, Michel Dänzer wrote: >> On Die, 2012-01-31 at 16:59 +, Simon Farnsworth wrote: >> > Userspace currently busywaits for fences to complete; on my workload, this >> > busywait consumes 10% of the available CPU time. >> > >> > Provide an ioctl so that userspace can wait for an EOP interrupt that >> > corresponds to a previous EVENT_WRITE_EOP. >> > >> > Signed-off-by: Simon Farnsworth >> > --- >> > I've been working on top of Jerome's tiling patches, so this doesn't apply >> > directly on top of current upstream kernels. I can easily rebase to another >> > version upon request - just point me to a git tree. >> > >> > My goal is to remove the sched_yield in Mesa's r600_fence_finish given up >> > to >> > date enough kernel; I hope, though, that the interface is clean enough for >> > other users to extend it in the future (e.g. using compute rings). >> >> I'm afraid not: Unless I'm missing something, userspace can't know which >> ring the kernel submitted the CS to, and the kernel can't guess which >> ring userspace needs to wait for. > > iirc the plan was to add a return value to cs ioctl and add an ioctl to > allow to wait on this return value. ie allowing userspace to wait on > specific submited cs. You don't need a new API for that, r300g already does that. It adds a dummy relocation and later uses GEM_WAIT_IDLE to wait for it. r600g can be updated to do the same thing without kernel changes (besides, we must support the old kernels as well, so this is a no-brainer). What would be much more useful is to be able to wait for a fence, which can be in the middle of a CS. Now that's something that would justify changes in the kernel interface. Marek ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] Don't build Intel DRM if $CHOST is not i?86-* or x86_64-*
On Tue, 31 Jan 2012 10:34:38 -0800, Jeremy Huddleston wrote: > > On Jan 31, 2012, at 8:59 AM, Eric Anholt wrote: > > > On Mon, 30 Jan 2012 15:25:20 -0800, Jeremy Huddleston > > wrote: > >> This fixes a failure in 'make check' found by the tinderbox when trying to > >> build this code on Linux/ppc. This code is only designed to run on > >> Intel platforms, so don't even bother building it if we're not in that set. > > > > Looks reasonable to me, except it should probably be checking $target_os > > (cross-compile target) rather than $host_os (cross compile build host). > > I think you are misunderstanding the variables (or perhaps you are > following Mozilla's usage of the variables, which is wrong but > internally consistent). You're right. And that was with me looking up information on these variables before mailing last time. :( pgpeZcImYSNkh.pgp Description: PGP signature ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: Re: [PATCH] drm/radeon: Add support for userspace fence waits
On -10.01.-28163 20:59, Jerome Glisse wrote: On Tue, Jan 31, 2012 at 02:13:00PM -0500, Alex Deucher wrote: On Tue, Jan 31, 2012 at 2:07 PM, Jerome Glisse wrote: On Tue, Jan 31, 2012 at 01:55:43PM -0500, David Airlie wrote: Some comments below. + struct radeon_device *rdev = dev->dev_private; + struct drm_gem_object *gobj; + struct radeon_bo *robj; + void *buffer_data; + uint32_t *fence_data; + int r = 0; + long timeout; + int ring = RADEON_RING_TYPE_GFX_INDEX; + + /* If you're implementing this for other rings, you'll want to share + code with radeon_cs_get_ring in radeon_cs.c */ + if (args->ring != RADEON_CS_RING_GFX) { + return -EINVAL; + } + + gobj = drm_gem_object_lookup(dev, filp, args->handle); + if (gobj == NULL) { + return -ENOENT; + } + robj = gem_to_radeon_bo(gobj); + + if (gobj->size< args->offset) { + r = -EINVAL; + goto unreference; + } + + r = radeon_bo_reserve(robj, true); + if (r) { + goto unreference; + } + + r = radeon_bo_pin(robj, RADEON_GEM_DOMAIN_GTT, NULL); + if (r) { + goto unreserve; + } + + r = radeon_bo_kmap(robj,&buffer_data); + if (r) { + goto unpin; + } + Do you need to pin it? I think if you have it reserved and you are in here you shouldn't need to. (unless kmapping requires a pin)/ No you don't need to pin, actually it's bad to pin the buffer you might want to wait on might be in vram. Also you never assume that things could go wrong and that your value might never be written to the buffer. So it will wait forever in the kernel. As i said in the other mail i would rather as a wait on submited cs ioctl and add some return value from cs ioctl. I can't remember the outcome of this discusion we had when we were doing virtual memory support. Alex ? Michel ? better memory than i do ? :) I vaguely recall the discussion, but I don't remember the details, I'll have to look through my old mail. Maybe a new CS ioctl flag requesting EOP irqs for the CS would be a better approach than a separate ioctl. Alex I think the idea was to return u64 value of specific ring and only enable irq if new wait cs ioctl was call, a little bit like bo wait. Will try to dig through my mail too. Cheers, Jerome As far as I remember we wanted the CS ioctl to return ring+fence number (expanded to 64bit) and add another ioctl to wait for a ring to reach that fence number. That has the clear advantages that a) we no longer need a fence bo for each application and b) not all rings can write a fence value from an userspace IB. I've started to implement that and at least had the 64bit expansion of fence values and the returning of them in the CS ioctl running, but then got to work on other things first. I probably have those patches still laying around somewhere, on the other hand it should be pretty easy to implement... Regards, Christian. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 45018] [bisected] rendering regression since added support for virtual address space on cayman v11
https://bugs.freedesktop.org/show_bug.cgi?id=45018 --- Comment #11 from Alexandre Demers 2012-01-31 16:35:53 PST --- (In reply to comment #10) > (In reply to comment #9) > > The bugs is not visible under kernel 3.2, [...] > > 3.2 lacks Radeon virtual address space support. > > > I will try with a 3.3-rc2 kernel once it will be available. > > That's unlikely to make a difference, this is likely a userspace bug. Indeed and I can now confirm it. 3.3-rc2 doesn't change the problem. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 45473] New: vdpau-r300 requires softpipe
https://bugs.freedesktop.org/show_bug.cgi?id=45473 Bug #: 45473 Summary: vdpau-r300 requires softpipe Classification: Unclassified Product: Mesa Version: git Platform: x86 (IA32) OS/Version: Linux (All) Status: NEW Severity: normal Priority: medium Component: Drivers/Gallium/r300 AssignedTo: dri-devel@lists.freedesktop.org ReportedBy: neotheu...@ymail.com When compiling vdpau-r300 (libvdpau_r300.so) I get a compiling error: No rule to make target `../../../../src/gallium/drivers/softpipe/libsoftpipe.a', needed by `../../../../lib/gallium/libvdpau_r300.so'. Stop. The solution: Instead of ./autogen.sh --with-dri-drivers="" --with-gallium-drivers=r300 --enable-vdpau add swrast to gallium driver ./autogen.sh --with-dri-drivers="" --with-gallium-drivers=r300,swrast --enable-vdpau Compiles fine now and without error. Should gallium targets for real hardware use softpipe code / is it necessary? If so, maybe move the necessary code from softpipe into a common directory which other gallium targets can easily find, without having to build a whole new gallium driver such as softpipe to satisfy dependencies. Reported by Alec Ari -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 45290] [bisected r200] fdo23670-drawpix_stencil fails and crashes
https://bugs.freedesktop.org/show_bug.cgi?id=45290 --- Comment #4 from Ian Romanick 2012-01-31 22:25:22 PST --- (In reply to comment #2) > Can you upgrade to a DDX from -git? > > I think the fix is in there, it was allocating too small depth buffers. I can, but it won't be until next week. I'm currently traveling. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] Don't build Intel DRM if $CHOST is not i?86-* or x86_64-*
On Jan 31, 2012, at 8:59 AM, Eric Anholt wrote: > On Mon, 30 Jan 2012 15:25:20 -0800, Jeremy Huddleston > wrote: >> This fixes a failure in 'make check' found by the tinderbox when trying to >> build this code on Linux/ppc. This code is only designed to run on >> Intel platforms, so don't even bother building it if we're not in that set. > > Looks reasonable to me, except it should probably be checking $target_os > (cross-compile target) rather than $host_os (cross compile build host). I think you are misunderstanding the variables (or perhaps you are following Mozilla's usage of the variables, which is wrong but internally consistent). CBUILD - The platform of the machine being built on. CHOST - The platform that the delivered product will run on. CTARGET- The platform that the delivered product will generate code for. CTARGET is only really relevant for toolchain packages. eg. I want to build a gcc that builds mips code and runs on my intel box: CTARGET=mips-* CHOST=i686-* If I want to do the building of that compiler on my ppc box: CBUILD=powerpc-* CTARGET=mips-* CHOST=i686-* --Jeremy ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: radeon issues on MacBook Pro 8,2
On Fri, Jan 20, 2012 at 05:08:31PM -0600, Seth Forshee wrote: > > Can you track down who is calling the connector->detect() callbacks > > during suspend and resume? > > I got two different stack traces, see below. > > And to slightly amend my statement above, I'm only seeing the wrong > status returned on the suspend side of things. The status during resume > always seems to be correct. > > Pid: 49, comm: kworker/0:2 Tainted: G C O 3.2.0-10-generic #17-Ubuntu > Call Trace: > [] radeon_dp_detect+0x1de/0x230 [radeon] > [] output_poll_execute+0xbd/0x1a0 [drm_kms_helper] > [] ? drm_helper_mode_fill_fb_struct+0x30/0x30 > [drm_kms_helper] > [] process_one_work+0x11a/0x480 > [] worker_thread+0x164/0x370 > [] ? manage_workers.isra.30+0x130/0x130 > [] kthread+0x8c/0xa0 > [] kernel_thread_helper+0x4/0x10 > [] ? flush_kthread_worker+0xa0/0xa0 > [] ? gs_change+0x13/0x13 > > Pid: 49, comm: kworker/0:2 Tainted: G C O 3.2.0-10-generic #17-Ubuntu > Call Trace: > [] radeon_dp_detect+0x1de/0x230 [radeon] > [] drm_helper_probe_single_connector_modes+0x2c3/0x380 > [drm_kms_helper] > [] drm_fb_helper_hotplug_event+0xf2/0x150 [drm_kms_helper] > [] radeon_fb_output_poll_changed+0x15/0x20 [radeon] > [] radeon_output_poll_changed+0x15/0x20 [radeon] > [] output_poll_execute+0x190/0x1a0 [drm_kms_helper] > [] ? drm_helper_mode_fill_fb_struct+0x30/0x30 > [drm_kms_helper] > [] process_one_work+0x11a/0x480 > [] worker_thread+0x164/0x370 > [] ? manage_workers.isra.30+0x130/0x130 > [] kthread+0x8c/0xa0 > [] kernel_thread_helper+0x4/0x10 > [] ? flush_kthread_worker+0xa0/0xa0 > [] ? gs_change+0x13/0x13 >From these traces it looks like all that really needs to happen is to disable the output polling during suspend. The following patch seems to get rid of the problems I'm seeing. Does this look okay to you? >From 0c01950f248c541198b7560793cf57c59b2c11f8 Mon Sep 17 00:00:00 2001 From: Seth Forshee Date: Tue, 31 Jan 2012 15:37:02 -0600 Subject: [PATCH 1/1] drm/radeon/kms: disable output polling when suspended Polling the outputs when the device is suspended can result in erroneous status updates. Disable output polling during suspend to prevent this from happening. Signed-off-by: Seth Forshee --- drivers/gpu/drm/radeon/radeon_device.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index cec51a5..49f7cb7 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -883,6 +883,8 @@ int radeon_suspend_kms(struct drm_device *dev, pm_message_t state) if (dev->switch_power_state == DRM_SWITCH_POWER_OFF) return 0; + drm_kms_helper_poll_disable(dev); + /* turn off display hw */ list_for_each_entry(connector, &dev->mode_config.connector_list, head) { drm_helper_connector_dpms(connector, DRM_MODE_DPMS_OFF); @@ -972,6 +974,8 @@ int radeon_resume_kms(struct drm_device *dev) list_for_each_entry(connector, &dev->mode_config.connector_list, head) { drm_helper_connector_dpms(connector, DRM_MODE_DPMS_ON); } + + drm_kms_helper_poll_enable(dev); return 0; } -- 1.7.8.3 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] dma-buf: add dma_data_direction to unmap dma_buf_op
Hi Laurent, On Jan 30, 2012 7:48 PM, "Laurent Pinchart" < laurent.pinchart at ideasonboard.com> wrote: > > Hi Sumit, > > On Friday 27 January 2012 10:43:28 Sumit Semwal wrote: > > Some exporters may use DMA map/unmap APIs in dma-buf ops, which require > > enum dma_data_direction for both map and unmap operations. > > > > Thus, the unmap dma_buf_op also needs to have enum dma_data_direction as > > a parameter. > > > > Reported-by: Tomasz Stanislawski > > Signed-off-by: Sumit Semwal > > --- > > drivers/base/dma-buf.c |7 +-- > > include/linux/dma-buf.h |8 +--- > > 2 files changed, 10 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c > > index 8afe2dd..c9a945f 100644 > > --- a/drivers/base/dma-buf.c > > +++ b/drivers/base/dma-buf.c > > @@ -271,16 +271,19 @@ EXPORT_SYMBOL_GPL(dma_buf_map_attachment); > > * dma_buf_ops. > > * @attach: [in]attachment to unmap buffer from > > * @sg_table:[in]scatterlist info of the buffer to unmap > > + * @direction: [in]direction of DMA transfer > > * > > */ > > void dma_buf_unmap_attachment(struct dma_buf_attachment *attach, > > - struct sg_table *sg_table) > > + struct sg_table *sg_table, > > + enum dma_data_direction direction) > > { > > if (WARN_ON(!attach || !attach->dmabuf || !sg_table)) > > return; > > > > mutex_lock(&attach->dmabuf->lock); > > - attach->dmabuf->ops->unmap_dma_buf(attach, sg_table); > > + attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, > > + direction); > > mutex_unlock(&attach->dmabuf->lock); > > > > } > > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > > index 86f6241..847b026 100644 > > --- a/include/linux/dma-buf.h > > +++ b/include/linux/dma-buf.h > > @@ -63,7 +63,8 @@ struct dma_buf_ops { > > struct sg_table * (*map_dma_buf)(struct dma_buf_attachment *, > > enum dma_data_direction); > > void (*unmap_dma_buf)(struct dma_buf_attachment *, > > - struct sg_table *); > > + struct sg_table *, > > + enum dma_data_direction); > > /* TODO: Add try_map_dma_buf version, to return immed with -EBUSY > >* if the call would block. > >*/ > > @@ -122,7 +123,8 @@ void dma_buf_put(struct dma_buf *dmabuf); > > > > struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *, > > enum dma_data_direction); > > -void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table > > *); +void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct > > sg_table *, + enum dma_data_direction); > > #else > > > > static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf > > *dmabuf, @@ -166,7 +168,7 @@ static inline struct sg_table > > *dma_buf_map_attachment( } > > > > static inline void dma_buf_unmap_attachment(struct dma_buf_attachment > > *attach, -struct sg_table *sg) > > + struct sg_table *sg, enum dma_data_direction write) > > s/write/dir/ (or direction) ? > :-) sure. > > { > > return; > > } > > -- > Regards, > > Laurent Pinchart Best regards, -Sumit. -- next part -- An HTML attachment was scrubbed... URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20120131/399047b5/attachment-0001.html>
[PATCH] drm/radeon/kms: Fix device tree linkage of DP i2c buses too
Properly set the parent device of DP i2c buses before registering them too. Signed-off-by: Jean Delvare Cc: Dave Airlie Cc: Alex Deucher --- I'm sorry, my previous patch missed the fact that DP i2c buses are handled separately so they need the same fix. drivers/gpu/drm/radeon/radeon_i2c.c |1 + 1 file changed, 1 insertion(+) --- linux-3.3-rc1.orig/drivers/gpu/drm/radeon/radeon_i2c.c 2012-01-31 08:55:20.596108280 +0100 +++ linux-3.3-rc1/drivers/gpu/drm/radeon/radeon_i2c.c 2012-01-31 08:55:22.334108306 +0100 @@ -956,6 +956,7 @@ struct radeon_i2c_chan *radeon_i2c_creat i2c->rec = *rec; i2c->adapter.owner = THIS_MODULE; i2c->adapter.class = I2C_CLASS_DDC; + i2c->adapter.dev.parent = &dev->pdev->dev; i2c->dev = dev; snprintf(i2c->adapter.name, sizeof(i2c->adapter.name), "Radeon aux bus %s", name); -- Jean Delvare Suse L3
[PATCH] dma-buf: add dma_data_direction to unmap dma_buf_op
Hi Sumit, > On Friday 27 January 2012 10:43:28 Sumit Semwal wrote: [snip] > static inline void dma_buf_unmap_attachment(struct dma_buf_attachment > *attach, > -struct sg_table *sg) > + struct sg_table *sg, enum dma_data_direction write) On a second thought, would it make sense to store the direction in struct dma_buf_attachment in dma_buf_map_attachment(), and pass the value directly to the .unmap_dma_buf() instead of requiring the dma_buf_unmap_attachment() caller to remember it ? Or is an attachment allowed to map the buffer several times with different directions ? -- Regards, Laurent Pinchart
[Bug 45432] New: memory errors on cayman with VM support + nexuiz
https://bugs.freedesktop.org/show_bug.cgi?id=45432 Bug #: 45432 Summary: memory errors on cayman with VM support + nexuiz Classification: Unclassified Product: Mesa Version: git Platform: Other OS/Version: All Status: NEW Severity: normal Priority: medium Component: Drivers/Gallium/r600 AssignedTo: dri-devel at lists.freedesktop.org ReportedBy: airlied at freedesktop.org Running latest kernel + git mesa on cayman start nexuiz-sdl timedemo demos/demo1.dem radeon: Failed to allocate a buffer: radeon:size : 2048 bytes radeon:alignment : 512 bytes radeon:domains : 2 EE r600_texture.c:715 r600_texture_get_transfer - failed to create temporary texture to hold untiled copy Mesa: User error: GL_OUT_OF_MEMORY in glTexImage2D radeon: Failed to allocate a buffer: radeon:size : 2048 bytes radeon:alignment : 512 bytes radeon:domains : 2 EE r600_texture.c:715 r600_texture_get_transfer - failed to create temporary texture to hold untiled copy in dmesg radeon :01:00.0: bo 88021a117400 va 0x09B4F000 conflict with (bo 8801ff5ef000 0x09B4F000 0x09B5) radeon :01:00.0: bo 8801ff528000 va 0x0AA8 conflict with (bo 88021efb6400 0x0AA8 0x0AA84000) -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[Bug 45432] memory errors on cayman with VM support + nexuiz
https://bugs.freedesktop.org/show_bug.cgi?id=45432 Dave Airlie changed: What|Removed |Added Keywords||regression -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[PATCH] dma-buf: add dma_data_direction to unmap dma_buf_op
On Tue, Jan 31, 2012 at 10:42:59AM +0100, Laurent Pinchart wrote: > Hi Sumit, > > > On Friday 27 January 2012 10:43:28 Sumit Semwal wrote: > > [snip] > > > static inline void dma_buf_unmap_attachment(struct dma_buf_attachment > > *attach, > > -struct sg_table *sg) > > + struct sg_table *sg, enum dma_data_direction write) > > On a second thought, would it make sense to store the direction in struct > dma_buf_attachment in dma_buf_map_attachment(), and pass the value directly > to > the .unmap_dma_buf() instead of requiring the dma_buf_unmap_attachment() > caller to remember it ? Or is an attachment allowed to map the buffer several > times with different directions ? Current dma api functions already require you to supply the direction argument on unmap and I think for cpu access I'm also leaning towards an interface where the importer has to supply the direction argument for both begin_access and end_access. So for consistency reasons I'm leaning towards adding it to unmap. -Daniel -- Daniel Vetter Mail: daniel at ffwll.ch Mobile: +41 (0)79 365 57 48
r600g: Trying to remove the busywait for fence completion, but hitting inexplicable behaviour
Hello, When profiling my workload on an AMD E-350 (PALM GPU) to see why it still wasn't performing well with Jerome's WIP macrotiling patches, I noticed that r600_fence_finish was taking 10% of my CPU time. I determined experimentally that changing from sched_yield() to os_time_sleep(10) fixed this and resolved my last performance issue on AMD Fusion as compared to Intel Atom, but felt that this was hacky. I've therefore tried to use INT_SEL of 0b10 in the EVENT_WRITE_EOP in Mesa, combined with a new ioctl to wait for a changed value, but it's not working the way I would expect. I'll be sending patches as replies to this message, so that you can see exactly what I've done, but in brief, I have an ioctl that uses wait_event to wait for a chosen offset in a BO to change value. I've added a suitable waitqueue, and made radeon_fence_process call wake_up_all. I'm seeing behaviour from this that I can't explain; as you'll see in the patches, I've moved some IRQ prints from DRM_DEBUG to printk(KERN_INFO), and I'm seeing that I don't get the EOP interrupt in a timely fashion - either because memory is not as coherent between the GPU and CPU as I would like (so I'm reading stale data when I call wait_event), or because the interrupt is genuinely delayed.
[PATCH] drm/radeon/kms: Fix device tree linkage of DP i2c buses too
On Tue, Jan 31, 2012 at 3:55 AM, Jean Delvare wrote: > Properly set the parent device of DP i2c buses before registering them > too. > > Signed-off-by: Jean Delvare > Cc: Dave Airlie > Cc: Alex Deucher Reviewed-by: Alex Deucher > --- > I'm sorry, my previous patch missed the fact that DP i2c buses are > handled separately so they need the same fix. > > ?drivers/gpu/drm/radeon/radeon_i2c.c | ? ?1 + > ?1 file changed, 1 insertion(+) > > --- linux-3.3-rc1.orig/drivers/gpu/drm/radeon/radeon_i2c.c ? ? ?2012-01-31 > 08:55:20.596108280 +0100 > +++ linux-3.3-rc1/drivers/gpu/drm/radeon/radeon_i2c.c ? 2012-01-31 > 08:55:22.334108306 +0100 > @@ -956,6 +956,7 @@ struct radeon_i2c_chan *radeon_i2c_creat > ? ? ? ?i2c->rec = *rec; > ? ? ? ?i2c->adapter.owner = THIS_MODULE; > ? ? ? ?i2c->adapter.class = I2C_CLASS_DDC; > + ? ? ? i2c->adapter.dev.parent = &dev->pdev->dev; > ? ? ? ?i2c->dev = dev; > ? ? ? ?snprintf(i2c->adapter.name, sizeof(i2c->adapter.name), > ? ? ? ? ? ? ? ? "Radeon aux bus %s", name); > > -- > Jean Delvare > Suse L3
[PATCH v2] drm/radeon/kms: remove benchmarks shorter than one page
On Mon, Jan 30, 2012 at 11:10 PM, Ilija Hadzic wrote: > copy_blit operation works only on integral number of pages > so benchmarks shorter than one page size (4K) do not make sense > > v2: use RADEON_GPU_PAGE_SIZE instead of "magic" 1024 number and > ? ?sweep sizes between 1x to 16x doubling > ? ?the size in each iteration; we get the same coverage, as > ? ?in the original benchmark, but guarantee integer multiples > ? ?of page size > > Signed-off-by: Ilija Hadzic Reviewed-by: Alex Deucher > --- > ?drivers/gpu/drm/radeon/radeon_benchmark.c | ? 12 ++-- > ?1 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/radeon/radeon_benchmark.c > b/drivers/gpu/drm/radeon/radeon_benchmark.c > index 815f234..d3c0002 100644 > --- a/drivers/gpu/drm/radeon/radeon_benchmark.c > +++ b/drivers/gpu/drm/radeon/radeon_benchmark.c > @@ -208,22 +208,22 @@ void radeon_benchmark(struct radeon_device *rdev, int > test_number) > ? ? ? ? ? ? ? ?break; > ? ? ? ?case 3: > ? ? ? ? ? ? ? ?/* GTT to VRAM, buffer size sweep, powers of 2 */ > - ? ? ? ? ? ? ? for (i = 1; i <= 65536; i <<= 1) > - ? ? ? ? ? ? ? ? ? ? ? radeon_benchmark_move(rdev, i*1024, > + ? ? ? ? ? ? ? for (i = 1; i <= 16384; i <<= 1) > + ? ? ? ? ? ? ? ? ? ? ? radeon_benchmark_move(rdev, i*RADEON_GPU_PAGE_SIZE, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?RADEON_GEM_DOMAIN_GTT, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?RADEON_GEM_DOMAIN_VRAM); > ? ? ? ? ? ? ? ?break; > ? ? ? ?case 4: > ? ? ? ? ? ? ? ?/* VRAM to GTT, buffer size sweep, powers of 2 */ > - ? ? ? ? ? ? ? for (i = 1; i <= 65536; i <<= 1) > - ? ? ? ? ? ? ? ? ? ? ? radeon_benchmark_move(rdev, i*1024, > + ? ? ? ? ? ? ? for (i = 1; i <= 16384; i <<= 1) > + ? ? ? ? ? ? ? ? ? ? ? radeon_benchmark_move(rdev, i*RADEON_GPU_PAGE_SIZE, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?RADEON_GEM_DOMAIN_VRAM, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?RADEON_GEM_DOMAIN_GTT); > ? ? ? ? ? ? ? ?break; > ? ? ? ?case 5: > ? ? ? ? ? ? ? ?/* VRAM to VRAM, buffer size sweep, powers of 2 */ > - ? ? ? ? ? ? ? for (i = 1; i <= 65536; i <<= 1) > - ? ? ? ? ? ? ? ? ? ? ? radeon_benchmark_move(rdev, i*1024, > + ? ? ? ? ? ? ? for (i = 1; i <= 16384; i <<= 1) > + ? ? ? ? ? ? ? ? ? ? ? radeon_benchmark_move(rdev, i*RADEON_GPU_PAGE_SIZE, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?RADEON_GEM_DOMAIN_VRAM, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?RADEON_GEM_DOMAIN_VRAM); > ? ? ? ? ? ? ? ?break; > -- > 1.7.7 > > ___ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] [r600g] Use new kernel interface to wait for fences
Instead of busywaiting for the GPU to finish a fence, use the new kernel interface to wait for fence completion. This code needs completion - in particular, we should fall back to busywaiting (using the nokernel function that's in radeon_drm_bo.c) if the kernel doesn't support the new interface. Signed-off-by: Simon Farnsworth --- src/gallium/drivers/r600/r600_hw_context.c|2 +- src/gallium/drivers/r600/r600_pipe.c | 12 +++-- src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 30 + src/gallium/winsys/radeon/drm/radeon_winsys.h | 16 + 4 files changed, 51 insertions(+), 9 deletions(-) diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c index 8eb8e6d..35a57a7 100644 --- a/src/gallium/drivers/r600/r600_hw_context.c +++ b/src/gallium/drivers/r600/r600_hw_context.c @@ -1618,7 +1618,7 @@ void r600_context_emit_fence(struct r600_context *ctx, struct r600_resource *fen ctx->pm4[ctx->pm4_cdwords++] = EVENT_TYPE(EVENT_TYPE_CACHE_FLUSH_AND_INV_TS_EVENT) | EVENT_INDEX(5); ctx->pm4[ctx->pm4_cdwords++] = va & 0xUL; /* ADDRESS_LO */ /* DATA_SEL | INT_EN | ADDRESS_HI */ - ctx->pm4[ctx->pm4_cdwords++] = (1 << 29) | (0 << 24) | ((va >> 32UL) & 0xFF); + ctx->pm4[ctx->pm4_cdwords++] = (1 << 29) | (2 << 24) | ((va >> 32UL) & 0xFF); ctx->pm4[ctx->pm4_cdwords++] = value; /* DATA_LO */ ctx->pm4[ctx->pm4_cdwords++] = 0; /* DATA_HI */ ctx->pm4[ctx->pm4_cdwords++] = PKT3(PKT3_NOP, 0, 0); diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c index c38fbc5..12c5bf5 100644 --- a/src/gallium/drivers/r600/r600_pipe.c +++ b/src/gallium/drivers/r600/r600_pipe.c @@ -595,7 +595,6 @@ static boolean r600_fence_finish(struct pipe_screen *pscreen, struct r600_screen *rscreen = (struct r600_screen *)pscreen; struct r600_fence *rfence = (struct r600_fence*)fence; int64_t start_time = 0; - unsigned spins = 0; if (timeout != PIPE_TIMEOUT_INFINITE) { start_time = os_time_get(); @@ -605,13 +604,10 @@ static boolean r600_fence_finish(struct pipe_screen *pscreen, } while (rscreen->fences.data[rfence->index] == 0) { - if (++spins % 256) - continue; -#ifdef PIPE_OS_UNIX - sched_yield(); -#else - os_time_sleep(10); -#endif + rscreen->ws->buffer_wait_fence(rscreen->fences.bo->buf, + rfence->index << 2, + 0, + timeout); if (timeout != PIPE_TIMEOUT_INFINITE && os_time_get() - start_time >= timeout) { return FALSE; diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c index 143dcf9..b552c11 100644 --- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c +++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c @@ -879,6 +879,35 @@ static uint64_t radeon_winsys_bo_va(struct pb_buffer *buffer) return bo->va; } +/* No kernel support for doing this faster - just spin */ +static void radeon_winsys_bo_wait_fence_nokernel(struct pb_buffer *buf, +unsigned offset, +uint32_t value, +uint64_t timeout) +{ +#ifdef PIPE_OS_UNIX +sched_yield(); +#else +os_time_sleep(10); +#endif +} + +static void radeon_winsys_bo_wait_fence(struct pb_buffer *_buf, + unsigned offset, + uint32_t value, + uint64_t timeout) +{ +struct radeon_bo *bo = get_radeon_bo(_buf); +struct drm_radeon_gem_wait_user_fence args; +memset(&args, 0, sizeof(args)); +args.handle = bo->handle; +args.offset = offset; +args.value = value; +args.timeout_usec = timeout; +while (drmCommandWrite(bo->rws->fd, DRM_RADEON_GEM_WAIT_USER_FENCE, + &args, sizeof(args)) == -EBUSY); +} + void radeon_bomgr_init_functions(struct radeon_drm_winsys *ws) { ws->base.buffer_get_cs_handle = radeon_drm_get_cs_handle; @@ -892,4 +921,5 @@ void radeon_bomgr_init_functions(struct radeon_drm_winsys *ws) ws->base.buffer_from_handle = radeon_winsys_bo_from_handle; ws->base.buffer_get_handle = radeon_winsys_bo_get_handle; ws->base.buffer_get_virtual_address = radeon_winsys_bo_va; +ws->base.buffer_wait_fence = radeon_winsys_bo_wait_fence; } diff --git a/src/gallium/winsys/radeon/drm/radeon_winsys.h b/src/gallium/winsys/radeon/drm/radeon_winsys.h index e462e86..869961f 100644 --- a/src/gallium/winsys/radeon/drm/radeon_winsys.h +++ b/src/galliu
[PATCH v2] drm/radeon/kms: remove benchmarks shorter than one page
On Tue, 31 Jan 2012, Alex Deucher wrote: >> >> Signed-off-by: Ilija Hadzic > > Reviewed-by: Alex Deucher > Thanks. There will be a v3, though to address one tirvial comment (whitespace between binary '*' operator) that I received in E-mail sent directly to me. I guess the diff v2/v3 will be trivial enough that we can consider that this review applies too. -- Ilija
[PATCH v2] drm/radeon/kms: remove benchmarks shorter than one page
On Tue, Jan 31, 2012 at 9:27 AM, Ilija Hadzic wrote: > > > On Tue, 31 Jan 2012, Alex Deucher wrote: > >>> >>> Signed-off-by: Ilija Hadzic >> >> >> Reviewed-by: Alex Deucher >> > > Thanks. There will be a v3, though to address one tirvial comment > (whitespace between binary '*' operator) that I received in E-mail sent > directly to me. > > I guess the diff v2/v3 will be trivial enough that we can consider that this > review applies too. yes, consider that trivial change reviewed as well. Alex > > -- Ilija
[PATCH] drm/radeon: Add support for userspace fence waits
Userspace currently busywaits for fences to complete; on my workload, this busywait consumes 10% of the available CPU time. Provide an ioctl so that userspace can wait for an EOP interrupt that corresponds to a previous EVENT_WRITE_EOP. This currently doesn't work, hence the debug code piled in. Signed-off-by: Simon Farnsworth --- drivers/gpu/drm/radeon/evergreen.c |8 ++-- drivers/gpu/drm/radeon/radeon.h|3 + drivers/gpu/drm/radeon/radeon_device.c |1 + drivers/gpu/drm/radeon/radeon_fence.c |3 + drivers/gpu/drm/radeon/radeon_gem.c| 70 drivers/gpu/drm/radeon/radeon_kms.c|1 + include/drm/radeon_drm.h | 28 + 7 files changed, 110 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index 0c5dd78..5b886b0 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -3083,11 +3083,11 @@ restart_ih: case 176: /* CP_INT in ring buffer */ case 177: /* CP_INT in IB1 */ case 178: /* CP_INT in IB2 */ - DRM_DEBUG("IH: CP int: 0x%08x\n", src_data); + printk(KERN_INFO "IH: CP int: 0x%08x\n", src_data); radeon_fence_process(rdev, RADEON_RING_TYPE_GFX_INDEX); break; case 181: /* CP EOP event */ - DRM_DEBUG("IH: CP EOP\n"); + printk(KERN_INFO "IH: CP EOP\n"); if (rdev->family >= CHIP_CAYMAN) { switch (src_data) { case 0: @@ -3104,12 +3104,12 @@ restart_ih: radeon_fence_process(rdev, RADEON_RING_TYPE_GFX_INDEX); break; case 233: /* GUI IDLE */ - DRM_DEBUG("IH: GUI idle\n"); + printk(KERN_INFO "IH: GUI idle\n"); rdev->pm.gui_idle = true; wake_up(&rdev->irq.idle_queue); break; default: - DRM_DEBUG("Unhandled interrupt: %d %d\n", src_id, src_data); + printk(KERN_INFO "Unhandled interrupt: %d %d\n", src_id, src_data); break; } diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 2859406..fb0eafd 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -1348,6 +1348,8 @@ int radeon_gem_set_tiling_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); int radeon_gem_get_tiling_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); +int radeon_gem_wait_user_fence_ioctl(struct drm_device *dev, void *data, +struct drm_file *filp); /* VRAM scratch page for HDP bug, default vram page */ struct r600_vram_scratch { @@ -1444,6 +1446,7 @@ struct radeon_device { struct radeon_mman mman; rwlock_tfence_lock; struct radeon_fence_driver fence_drv[RADEON_NUM_RINGS]; + wait_queue_head_t userspace_fence_wait_queue; struct radeon_semaphore_driver semaphore_drv; struct radeon_ring ring[RADEON_NUM_RINGS]; struct radeon_ib_pool ib_pool; diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 0afb13b..dcf11e5 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -730,6 +730,7 @@ int radeon_device_init(struct radeon_device *rdev, mutex_init(&rdev->pm.mutex); mutex_init(&rdev->vram_mutex); rwlock_init(&rdev->fence_lock); + init_waitqueue_head(&rdev->userspace_fence_wait_queue); rwlock_init(&rdev->semaphore_drv.lock); INIT_LIST_HEAD(&rdev->gem.objects); init_waitqueue_head(&rdev->irq.vblank_queue); diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index 64ea3dd..5b8270f 100644 --- a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -355,7 +355,10 @@ void radeon_fence_process(struct radeon_device *rdev, int ring) write_unlock_irqrestore(&rdev->fence_lock, irq_flags); if (wake) { wake_up_all(&rdev->fence_drv[ring].queue); + printk( KERN_INFO "Woke kernel fences\n" ); } + printk( KERN_INFO "Waking up all waiters\n" ); + wake_up_interruptible_all(&rdev->userspace_fence_wait_queue); } int radeon_fence_count_emitted(struct radeon_device *rdev, int ring) diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c index 7337850..6866f75 100644 --- a/drivers/gpu
[PATCH v3] drm/radeon/kms: remove benchmarks shorter than one page
copy_blit operation works only on integral number of pages so benchmarks shorter than one page size (4K) do not make sense v2: use RADEON_GPU_PAGE_SIZE instead of "magic" 1024 number and sweep sizes between 1 * to 16K * doubling the size in each iteration; we get the same coverage, as in the original benchmark, but guarantee integer multiples of page size v3: add whitespace between '*' operator per review received from zajec5 at gmail.com Signed-off-by: Ilija Hadzic --- drivers/gpu/drm/radeon/radeon_benchmark.c | 12 ++-- 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_benchmark.c b/drivers/gpu/drm/radeon/radeon_benchmark.c index 815f234..58cee89 100644 --- a/drivers/gpu/drm/radeon/radeon_benchmark.c +++ b/drivers/gpu/drm/radeon/radeon_benchmark.c @@ -208,22 +208,22 @@ void radeon_benchmark(struct radeon_device *rdev, int test_number) break; case 3: /* GTT to VRAM, buffer size sweep, powers of 2 */ - for (i = 1; i <= 65536; i <<= 1) - radeon_benchmark_move(rdev, i*1024, + for (i = 1; i <= 16384; i <<= 1) + radeon_benchmark_move(rdev, i * RADEON_GPU_PAGE_SIZE, RADEON_GEM_DOMAIN_GTT, RADEON_GEM_DOMAIN_VRAM); break; case 4: /* VRAM to GTT, buffer size sweep, powers of 2 */ - for (i = 1; i <= 65536; i <<= 1) - radeon_benchmark_move(rdev, i*1024, + for (i = 1; i <= 16384; i <<= 1) + radeon_benchmark_move(rdev, i * RADEON_GPU_PAGE_SIZE, RADEON_GEM_DOMAIN_VRAM, RADEON_GEM_DOMAIN_GTT); break; case 5: /* VRAM to VRAM, buffer size sweep, powers of 2 */ - for (i = 1; i <= 65536; i <<= 1) - radeon_benchmark_move(rdev, i*1024, + for (i = 1; i <= 16384; i <<= 1) + radeon_benchmark_move(rdev, i * RADEON_GPU_PAGE_SIZE, RADEON_GEM_DOMAIN_VRAM, RADEON_GEM_DOMAIN_VRAM); break; -- 1.7.7
[PATCH] drm/radeon: Add support for userspace fence waits
Alex Deucher pointed out an error on IRC; I'm not using radeon_irq_kms_sw_irq_get and put to manage the IRQ enablement. I've fixed this up (as per the partial hunk below), and my bug goes away. I will be cleaning these patches up for proper submission. Simon On Tuesday 31 January 2012, Simon Farnsworth wrote: > +int radeon_gem_wait_user_fence_ioctl(struct drm_device *dev, void *data, > + struct drm_file *filp) > +{ > + struct drm_radeon_gem_wait_user_fence *args = data; > + struct radeon_device *rdev = dev->dev_private; > + struct drm_gem_object *gobj; > + struct radeon_bo *robj; > + void *buffer_data; > + uint32_t *fence_data; > + int r = 0; > + long timeout; > + > + printk( KERN_INFO "wait_user_fence offset %lld value %d timeout > %lld\n", args->offset, args->value, args->timeout_usec ); > + > + gobj = drm_gem_object_lookup(dev, filp, args->handle); > + if (gobj == NULL) { > + return -ENOENT; > + } > + robj = gem_to_radeon_bo(gobj); > + > + if (gobj->size < args->offset) { > + printk( KERN_INFO "Offset too large\n" ); > + r = -EINVAL; > + goto unreference; > + } > + > + r = radeon_bo_reserve(robj, true); > + if (r) { > + printk( KERN_INFO "Reserve fail\n" ); > + goto unreference; > + } > + > + r = radeon_bo_pin(robj, RADEON_GEM_DOMAIN_GTT, NULL); > + if (r) { > + printk( KERN_INFO "Pin fail\n" ); > + goto unreserve; > + } > + > + r = radeon_bo_kmap(robj, &buffer_data); > + if (r) { > + printk( KERN_INFO "kmap fail\n" ); > + goto unpin; > + } > + Missing radeon_irq_kms_sw_irq_get(rdev, RADEON_RING_TYPE_GFX_INDEX); here > + fence_data = (uint32_t*)buffer_data; > + > + printk( KERN_INFO "Current data value %d\n", fence_data[args->offset >> > 2] ); > + > + timeout = > wait_event_interruptible_timeout(rdev->userspace_fence_wait_queue, > +fence_data[args->offset >> > 2] != args->value, > + > usecs_to_jiffies(args->timeout_usec)); And missing radeon_irq_kms_sw_irq_put(rdev, RADEON_RING_TYPE_GFX_INDEX); here. > + if (timeout == 0) > + r = -ETIMEDOUT; > + else if (timeout < 0) > + r = timeout; > + > + printk( KERN_INFO "wait_user_fence offset %lld value %d timeout > %lld\n", args->offset, args->value, args->timeout_usec ); > + printk( KERN_INFO "Finished data value %d\n", fence_data[args->offset > >> 2] ); > + > + radeon_bo_kunmap(robj); > +unpin: > + radeon_bo_unpin(robj); > +unreserve: > + radeon_bo_unreserve(robj); > +unreference: > + drm_gem_object_unreference_unlocked(gobj); > + > + return r; > +} -- Simon Farnsworth Software Engineer ONELAN Limited http://www.onelan.com/ -- next part -- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 490 bytes Desc: This is a digitally signed message part. URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20120131/a6c775b9/attachment-0001.pgp>
r600g: Trying to remove the busywait for fence completion, but hitting inexplicable behaviour
On Tuesday 31 January 2012, Alan Swanson wrote: > On Tue, 2012-01-31 at 13:16 +, Simon Farnsworth wrote: > > Hello, > > > > When profiling my workload on an AMD E-350 (PALM GPU) to see why it still > > wasn't performing well with Jerome's WIP macrotiling patches, I noticed that > > r600_fence_finish was taking 10% of my CPU time. I determined experimentally > > that changing from sched_yield() to os_time_sleep(10) fixed this and > > resolved my last performance issue on AMD Fusion as compared to Intel Atom, > > but felt that this was hacky. > > No, you were right the first time, sched_yield should definitely not be > being used in this busy wait under Linux (even with its preceding few > spins). > Given that I can get the hardware to raise an interrupt when the fence finishes, I think the sleep is hacky - I should not be spinning under normal circumstances. I'm preparing patches that remove the sleep completely if you have new enough kernel, libdrm and Mesa. If someone else prepares patches to remove the sched_yield usage completely, I'll happily test them on my Radeon hardware. -- Simon Farnsworth Software Engineer ONELAN Limited http://www.onelan.com/ -- next part -- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 490 bytes Desc: This is a digitally signed message part. URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20120131/17f2538e/attachment.pgp>
r600g: Trying to remove the busywait for fence completion, but hitting inexplicable behaviour
On Tue, 2012-01-31 at 13:16 +, Simon Farnsworth wrote: > Hello, > > When profiling my workload on an AMD E-350 (PALM GPU) to see why it still > wasn't performing well with Jerome's WIP macrotiling patches, I noticed that > r600_fence_finish was taking 10% of my CPU time. I determined experimentally > that changing from sched_yield() to os_time_sleep(10) fixed this and > resolved my last performance issue on AMD Fusion as compared to Intel Atom, > but felt that this was hacky. No, you were right the first time, sched_yield should definitely not be being used in this busy wait under Linux (even with its preceding few spins). Back in 2003 when Linux 2.5 changed sched_yield to move processes from run queue to the expired queue instead, its use was discussed on the DRI devel list. http://lwn.net/Articles/31462/ http://thread.gmane.org/gmane.comp.video.dri.devel/5455/focus=6779 There stills seems to be five uses that should be checked. src/gallium/drivers/r600/r600_pipe.c src/gallium/drivers/nouveau/nouveau_fence.c src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c src/gallium/winsys/radeon/drm/radeon_drm_bo.c (*2) -- Alan.
[PATCH] Don't build Intel DRM if $CHOST is not i?86-* or x86_64-*
On Mon, 30 Jan 2012 15:25:20 -0800, Jeremy Huddleston wrote: > This fixes a failure in 'make check' found by the tinderbox when trying to > build this code on Linux/ppc. This code is only designed to run on > Intel platforms, so don't even bother building it if we're not in that set. Looks reasonable to me, except it should probably be checking $target_os (cross-compile target) rather than $host_os (cross compile build host). -- next part -- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20120131/f3b9dc7d/attachment.pgp>
[PATCH] drm/radeon: Add support for userspace fence waits
Userspace currently busywaits for fences to complete; on my workload, this busywait consumes 10% of the available CPU time. Provide an ioctl so that userspace can wait for an EOP interrupt that corresponds to a previous EVENT_WRITE_EOP. Signed-off-by: Simon Farnsworth --- I've been working on top of Jerome's tiling patches, so this doesn't apply directly on top of current upstream kernels. I can easily rebase to another version upon request - just point me to a git tree. My goal is to remove the sched_yield in Mesa's r600_fence_finish given up to date enough kernel; I hope, though, that the interface is clean enough for other users to extend it in the future (e.g. using compute rings). drivers/gpu/drm/radeon/radeon.h |3 + drivers/gpu/drm/radeon/radeon_drv.c |3 +- drivers/gpu/drm/radeon/radeon_fence.c |2 + drivers/gpu/drm/radeon/radeon_gem.c | 70 + drivers/gpu/drm/radeon/radeon_kms.c |1 + include/drm/radeon_drm.h | 30 ++ 6 files changed, 108 insertions(+), 1 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 2859406..00c187b 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -217,6 +217,7 @@ struct radeon_fence_driver { unsigned long last_jiffies; unsigned long last_timeout; wait_queue_head_t queue; + wait_queue_head_t userspace_queue; struct list_headcreated; struct list_heademitted; struct list_headsignaled; @@ -1348,6 +1349,8 @@ int radeon_gem_set_tiling_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); int radeon_gem_get_tiling_ioctl(struct drm_device *dev, void *data, struct drm_file *filp); +int radeon_gem_wait_user_fence_ioctl(struct drm_device *dev, void *data, +struct drm_file *filp); /* VRAM scratch page for HDP bug, default vram page */ struct r600_vram_scratch { diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 4ae2e1d..9f82fa9 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -56,9 +56,10 @@ * 2.12.0 - RADEON_CS_KEEP_TILING_FLAGS * 2.13.0 - virtual memory support * 2.14.0 - add evergreen tiling informations + * 2.15.0 - gem_wait_user_fence ioctl */ #define KMS_DRIVER_MAJOR 2 -#define KMS_DRIVER_MINOR 14 +#define KMS_DRIVER_MINOR 15 #define KMS_DRIVER_PATCHLEVEL 0 int radeon_driver_load_kms(struct drm_device *dev, unsigned long flags); int radeon_driver_unload_kms(struct drm_device *dev); diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index 64ea3dd..d86bc28 100644 --- a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -356,6 +356,7 @@ void radeon_fence_process(struct radeon_device *rdev, int ring) if (wake) { wake_up_all(&rdev->fence_drv[ring].queue); } + wake_up_interruptible_all(&rdev->fence_drv[ring].userspace_queue); } int radeon_fence_count_emitted(struct radeon_device *rdev, int ring) @@ -421,6 +422,7 @@ static void radeon_fence_driver_init_ring(struct radeon_device *rdev, int ring) INIT_LIST_HEAD(&rdev->fence_drv[ring].emitted); INIT_LIST_HEAD(&rdev->fence_drv[ring].signaled); init_waitqueue_head(&rdev->fence_drv[ring].queue); + init_waitqueue_head(&rdev->fence_drv[ring].userspace_queue); rdev->fence_drv[ring].initialized = false; } diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c index 7337850..602274f 100644 --- a/drivers/gpu/drm/radeon/radeon_gem.c +++ b/drivers/gpu/drm/radeon/radeon_gem.c @@ -531,3 +531,73 @@ int radeon_mode_dumb_destroy(struct drm_file *file_priv, { return drm_gem_handle_delete(file_priv, handle); } + +int radeon_gem_wait_user_fence_ioctl(struct drm_device *dev, void *data, + struct drm_file *filp) +{ + struct drm_radeon_gem_wait_user_fence *args = data; + struct radeon_device *rdev = dev->dev_private; + struct drm_gem_object *gobj; + struct radeon_bo *robj; + void *buffer_data; + uint32_t *fence_data; + int r = 0; + long timeout; + int ring = RADEON_RING_TYPE_GFX_INDEX; + + /* If you're implementing this for other rings, you'll want to share + code with radeon_cs_get_ring in radeon_cs.c */ + if (args->ring != RADEON_CS_RING_GFX) { + return -EINVAL; + } + + gobj = drm_gem_object_lookup(dev, filp, args->handle); + if (gobj == NULL) { + return -ENOENT; + } + robj = gem_to_radeon_bo(gobj); + + if (gobj->size <
[PATCH] drm/radeon: Add support for userspace fence waits
On Die, 2012-01-31 at 16:59 +, Simon Farnsworth wrote: > Userspace currently busywaits for fences to complete; on my workload, this > busywait consumes 10% of the available CPU time. > > Provide an ioctl so that userspace can wait for an EOP interrupt that > corresponds to a previous EVENT_WRITE_EOP. > > Signed-off-by: Simon Farnsworth > --- > I've been working on top of Jerome's tiling patches, so this doesn't apply > directly on top of current upstream kernels. I can easily rebase to another > version upon request - just point me to a git tree. > > My goal is to remove the sched_yield in Mesa's r600_fence_finish given up to > date enough kernel; I hope, though, that the interface is clean enough for > other users to extend it in the future (e.g. using compute rings). I'm afraid not: Unless I'm missing something, userspace can't know which ring the kernel submitted the CS to, and the kernel can't guess which ring userspace needs to wait for. -- Earthling Michel D?nzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer
[Bug 30502] [RADEON:R600C] atmospheres of planets get transparent when going away from planets in celestia
https://bugs.freedesktop.org/show_bug.cgi?id=30502 aceman changed: What|Removed |Added Status|NEW |RESOLVED Resolution||WORKSFORME --- Comment #15 from aceman 2012-01-31 10:15:27 PST --- I have not seen it since the report so let's close it for now. And I am on R600g as it gets usable and the default. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[Bug 31423] Stars in Celestia misrendered
https://bugs.freedesktop.org/show_bug.cgi?id=31423 --- Comment #2 from aceman 2012-01-31 10:17:34 PST --- On R600g, Mesa 8.0rc2 the stars seem to have proper colors. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[Bug 28905] [RADEON:KMS:R600C] missing scene triangle in scorched3d
https://bugs.freedesktop.org/show_bug.cgi?id=28905 aceman changed: What|Removed |Added Status|NEW |RESOLVED Resolution||WORKSFORME --- Comment #6 from aceman 2012-01-31 10:19:04 PST --- With R600g, Mesa 8.0rc2 it looks much better, the triangle is not missing and there are much more scene details and bugs fixed. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[PATCH] drm/radeon: Add support for userspace fence waits
Some comments below. > + struct radeon_device *rdev = dev->dev_private; > + struct drm_gem_object *gobj; > + struct radeon_bo *robj; > + void *buffer_data; > + uint32_t *fence_data; > + int r = 0; > + long timeout; > + int ring = RADEON_RING_TYPE_GFX_INDEX; > + > + /* If you're implementing this for other rings, you'll want to > share > +code with radeon_cs_get_ring in radeon_cs.c */ > + if (args->ring != RADEON_CS_RING_GFX) { > + return -EINVAL; > + } > + > + gobj = drm_gem_object_lookup(dev, filp, args->handle); > + if (gobj == NULL) { > + return -ENOENT; > + } > + robj = gem_to_radeon_bo(gobj); > + > + if (gobj->size < args->offset) { > + r = -EINVAL; > + goto unreference; > + } > + > + r = radeon_bo_reserve(robj, true); > + if (r) { > + goto unreference; > + } > + > + r = radeon_bo_pin(robj, RADEON_GEM_DOMAIN_GTT, NULL); > + if (r) { > + goto unreserve; > + } > + > + r = radeon_bo_kmap(robj, &buffer_data); > + if (r) { > + goto unpin; > + } > + Do you need to pin it? I think if you have it reserved and you are in here you shouldn't need to. (unless kmapping requires a pin)/ > + radeon_irq_kms_sw_irq_get(rdev, ring); > + > + fence_data = (uint32_t*)buffer_data; > + timeout = > an > + * interrupt and the value in the buffer might have changed. > + */ > +struct drm_radeon_gem_wait_user_fence { > + uint32_thandle; > + uint32_tring; > + uint64_toffset; > + uint32_tvalue; > + uint64_ttimeout_usec; Align things here, 64 then 64 then 32 32 32 and a 32 pad.
[PATCH] drm/radeon: Add support for userspace fence waits
On Tue, Jan 31, 2012 at 06:56:01PM +0100, Michel D?nzer wrote: > On Die, 2012-01-31 at 16:59 +, Simon Farnsworth wrote: > > Userspace currently busywaits for fences to complete; on my workload, this > > busywait consumes 10% of the available CPU time. > > > > Provide an ioctl so that userspace can wait for an EOP interrupt that > > corresponds to a previous EVENT_WRITE_EOP. > > > > Signed-off-by: Simon Farnsworth > > --- > > I've been working on top of Jerome's tiling patches, so this doesn't apply > > directly on top of current upstream kernels. I can easily rebase to another > > version upon request - just point me to a git tree. > > > > My goal is to remove the sched_yield in Mesa's r600_fence_finish given up to > > date enough kernel; I hope, though, that the interface is clean enough for > > other users to extend it in the future (e.g. using compute rings). > > I'm afraid not: Unless I'm missing something, userspace can't know which > ring the kernel submitted the CS to, and the kernel can't guess which > ring userspace needs to wait for. iirc the plan was to add a return value to cs ioctl and add an ioctl to allow to wait on this return value. ie allowing userspace to wait on specific submited cs. Cheers, Jerome
[PATCH] drm/radeon: Add support for userspace fence waits
On Tue, Jan 31, 2012 at 01:55:43PM -0500, David Airlie wrote: > > > Some comments below. > > > + struct radeon_device *rdev = dev->dev_private; > > + struct drm_gem_object *gobj; > > + struct radeon_bo *robj; > > + void *buffer_data; > > + uint32_t *fence_data; > > + int r = 0; > > + long timeout; > > + int ring = RADEON_RING_TYPE_GFX_INDEX; > > + > > + /* If you're implementing this for other rings, you'll want to > > share > > + code with radeon_cs_get_ring in radeon_cs.c */ > > + if (args->ring != RADEON_CS_RING_GFX) { > > + return -EINVAL; > > + } > > + > > + gobj = drm_gem_object_lookup(dev, filp, args->handle); > > + if (gobj == NULL) { > > + return -ENOENT; > > + } > > + robj = gem_to_radeon_bo(gobj); > > + > > + if (gobj->size < args->offset) { > > + r = -EINVAL; > > + goto unreference; > > + } > > + > > + r = radeon_bo_reserve(robj, true); > > + if (r) { > > + goto unreference; > > + } > > + > > + r = radeon_bo_pin(robj, RADEON_GEM_DOMAIN_GTT, NULL); > > + if (r) { > > + goto unreserve; > > + } > > + > > + r = radeon_bo_kmap(robj, &buffer_data); > > + if (r) { > > + goto unpin; > > + } > > + > > > Do you need to pin it? I think if you have it reserved and you are in here > you shouldn't need to. (unless kmapping requires a pin)/ No you don't need to pin, actually it's bad to pin the buffer you might want to wait on might be in vram. Also you never assume that things could go wrong and that your value might never be written to the buffer. So it will wait forever in the kernel. As i said in the other mail i would rather as a wait on submited cs ioctl and add some return value from cs ioctl. I can't remember the outcome of this discusion we had when we were doing virtual memory support. Alex ? Michel ? better memory than i do ? :) Cheers, Jerome > > > > + radeon_irq_kms_sw_irq_get(rdev, ring); > > + > > + fence_data = (uint32_t*)buffer_data; > > + timeout = > > an > > + * interrupt and the value in the buffer might have changed. > > + */ > > +struct drm_radeon_gem_wait_user_fence { > > + uint32_thandle; > > + uint32_tring; > > + uint64_toffset; > > + uint32_tvalue; > > + uint64_ttimeout_usec; > > Align things here, 64 then 64 then 32 32 32 and a 32 pad. > > ___ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm/radeon: Add support for userspace fence waits
On Tue, Jan 31, 2012 at 2:07 PM, Jerome Glisse wrote: > On Tue, Jan 31, 2012 at 01:55:43PM -0500, David Airlie wrote: >> >> >> Some comments below. >> >> > + ? struct radeon_device *rdev = dev->dev_private; >> > + ? struct drm_gem_object *gobj; >> > + ? struct radeon_bo *robj; >> > + ? void *buffer_data; >> > + ? uint32_t *fence_data; >> > + ? int r = 0; >> > + ? long timeout; >> > + ? int ring = RADEON_RING_TYPE_GFX_INDEX; >> > + >> > + ? /* If you're implementing this for other rings, you'll want to >> > share >> > + ? ? ?code with radeon_cs_get_ring in radeon_cs.c */ >> > + ? if (args->ring != RADEON_CS_RING_GFX) { >> > + ? ? ? ? ? return -EINVAL; >> > + ? } >> > + >> > + ? gobj = drm_gem_object_lookup(dev, filp, args->handle); >> > + ? if (gobj == NULL) { >> > + ? ? ? ? ? return -ENOENT; >> > + ? } >> > + ? robj = gem_to_radeon_bo(gobj); >> > + >> > + ? if (gobj->size < args->offset) { >> > + ? ? ? ? ? r = -EINVAL; >> > + ? ? ? ? ? goto unreference; >> > + ? } >> > + >> > + ? r = radeon_bo_reserve(robj, true); >> > + ? if (r) { >> > + ? ? ? ? ? goto unreference; >> > + ? } >> > + >> > + ? r = radeon_bo_pin(robj, RADEON_GEM_DOMAIN_GTT, NULL); >> > + ? if (r) { >> > + ? ? ? ? ? goto unreserve; >> > + ? } >> > + >> > + ? r = radeon_bo_kmap(robj, &buffer_data); >> > + ? if (r) { >> > + ? ? ? ? ? goto unpin; >> > + ? } >> > + >> >> >> Do you need to pin it? I think if you have it reserved and you are in here >> you shouldn't need to. (unless kmapping requires a pin)/ > > No you don't need to pin, actually it's bad to pin the buffer you might > want to wait on might be in vram. > > Also you never assume that things could go wrong and that your value > might never be written to the buffer. So it will wait forever in the > kernel. > > As i said in the other mail i would rather as a wait on submited cs > ioctl and add some return value from cs ioctl. I can't remember the > outcome of this discusion we had when we were doing virtual memory > support. Alex ? Michel ? better memory than i do ? :) > I vaguely recall the discussion, but I don't remember the details, I'll have to look through my old mail. Maybe a new CS ioctl flag requesting EOP irqs for the CS would be a better approach than a separate ioctl. Alex > Cheers, > Jerome > >> >> >> > + ? radeon_irq_kms_sw_irq_get(rdev, ring); >> > + >> > + ? fence_data = (uint32_t*)buffer_data; >> > + ? timeout = >> > an >> > + * interrupt and the value in the buffer might have changed. >> > + */ >> > +struct drm_radeon_gem_wait_user_fence { >> > + ? uint32_t ? ? ? ? ? ? ? ?handle; >> > + ? uint32_t ? ? ? ? ? ? ? ?ring; >> > + ? uint64_t ? ? ? ? ? ? ? ?offset; >> > + ? uint32_t ? ? ? ? ? ? ? ?value; >> > + ? uint64_t ? ? ? ? ? ? ? ?timeout_usec; >> >> Align things here, 64 then 64 then 32 32 32 and a 32 pad. >> >> ___ >> dri-devel mailing list >> dri-devel at lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/dri-devel > ___ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm/radeon: Add support for userspace fence waits
On Tue, Jan 31, 2012 at 02:13:00PM -0500, Alex Deucher wrote: > On Tue, Jan 31, 2012 at 2:07 PM, Jerome Glisse wrote: > > On Tue, Jan 31, 2012 at 01:55:43PM -0500, David Airlie wrote: > >> > >> > >> Some comments below. > >> > >> > + ? struct radeon_device *rdev = dev->dev_private; > >> > + ? struct drm_gem_object *gobj; > >> > + ? struct radeon_bo *robj; > >> > + ? void *buffer_data; > >> > + ? uint32_t *fence_data; > >> > + ? int r = 0; > >> > + ? long timeout; > >> > + ? int ring = RADEON_RING_TYPE_GFX_INDEX; > >> > + > >> > + ? /* If you're implementing this for other rings, you'll want to > >> > share > >> > + ? ? ?code with radeon_cs_get_ring in radeon_cs.c */ > >> > + ? if (args->ring != RADEON_CS_RING_GFX) { > >> > + ? ? ? ? ? return -EINVAL; > >> > + ? } > >> > + > >> > + ? gobj = drm_gem_object_lookup(dev, filp, args->handle); > >> > + ? if (gobj == NULL) { > >> > + ? ? ? ? ? return -ENOENT; > >> > + ? } > >> > + ? robj = gem_to_radeon_bo(gobj); > >> > + > >> > + ? if (gobj->size < args->offset) { > >> > + ? ? ? ? ? r = -EINVAL; > >> > + ? ? ? ? ? goto unreference; > >> > + ? } > >> > + > >> > + ? r = radeon_bo_reserve(robj, true); > >> > + ? if (r) { > >> > + ? ? ? ? ? goto unreference; > >> > + ? } > >> > + > >> > + ? r = radeon_bo_pin(robj, RADEON_GEM_DOMAIN_GTT, NULL); > >> > + ? if (r) { > >> > + ? ? ? ? ? goto unreserve; > >> > + ? } > >> > + > >> > + ? r = radeon_bo_kmap(robj, &buffer_data); > >> > + ? if (r) { > >> > + ? ? ? ? ? goto unpin; > >> > + ? } > >> > + > >> > >> > >> Do you need to pin it? I think if you have it reserved and you are in here > >> you shouldn't need to. (unless kmapping requires a pin)/ > > > > No you don't need to pin, actually it's bad to pin the buffer you might > > want to wait on might be in vram. > > > > Also you never assume that things could go wrong and that your value > > might never be written to the buffer. So it will wait forever in the > > kernel. > > > > As i said in the other mail i would rather as a wait on submited cs > > ioctl and add some return value from cs ioctl. I can't remember the > > outcome of this discusion we had when we were doing virtual memory > > support. Alex ? Michel ? better memory than i do ? :) > > > > I vaguely recall the discussion, but I don't remember the details, > I'll have to look through my old mail. Maybe a new CS ioctl flag > requesting EOP irqs for the CS would be a better approach than a > separate ioctl. > > Alex I think the idea was to return u64 value of specific ring and only enable irq if new wait cs ioctl was call, a little bit like bo wait. Will try to dig through my mail too. Cheers, Jerome > > > >> > >> > >> > + ? radeon_irq_kms_sw_irq_get(rdev, ring); > >> > + > >> > + ? fence_data = (uint32_t*)buffer_data; > >> > + ? timeout = > >> > an > >> > + * interrupt and the value in the buffer might have changed. > >> > + */ > >> > +struct drm_radeon_gem_wait_user_fence { > >> > + ? uint32_t ? ? ? ? ? ? ? ?handle; > >> > + ? uint32_t ? ? ? ? ? ? ? ?ring; > >> > + ? uint64_t ? ? ? ? ? ? ? ?offset; > >> > + ? uint32_t ? ? ? ? ? ? ? ?value; > >> > + ? uint64_t ? ? ? ? ? ? ? ?timeout_usec; > >> > >> Align things here, 64 then 64 then 32 32 32 and a 32 pad. > >> > >> ___ > >> dri-devel mailing list > >> dri-devel at lists.freedesktop.org > >> http://lists.freedesktop.org/mailman/listinfo/dri-devel > > ___ > > dri-devel mailing list > > dri-devel at lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm/radeon: Add support for userspace fence waits
2012/1/31 Jerome Glisse : > On Tue, Jan 31, 2012 at 06:56:01PM +0100, Michel D?nzer wrote: >> On Die, 2012-01-31 at 16:59 +, Simon Farnsworth wrote: >> > Userspace currently busywaits for fences to complete; on my workload, this >> > busywait consumes 10% of the available CPU time. >> > >> > Provide an ioctl so that userspace can wait for an EOP interrupt that >> > corresponds to a previous EVENT_WRITE_EOP. >> > >> > Signed-off-by: Simon Farnsworth >> > --- >> > I've been working on top of Jerome's tiling patches, so this doesn't apply >> > directly on top of current upstream kernels. I can easily rebase to another >> > version upon request - just point me to a git tree. >> > >> > My goal is to remove the sched_yield in Mesa's r600_fence_finish given up >> > to >> > date enough kernel; I hope, though, that the interface is clean enough for >> > other users to extend it in the future (e.g. using compute rings). >> >> I'm afraid not: Unless I'm missing something, userspace can't know which >> ring the kernel submitted the CS to, and the kernel can't guess which >> ring userspace needs to wait for. > > iirc the plan was to add a return value to cs ioctl and add an ioctl to > allow to wait on this return value. ie allowing userspace to wait on > specific submited cs. You don't need a new API for that, r300g already does that. It adds a dummy relocation and later uses GEM_WAIT_IDLE to wait for it. r600g can be updated to do the same thing without kernel changes (besides, we must support the old kernels as well, so this is a no-brainer). What would be much more useful is to be able to wait for a fence, which can be in the middle of a CS. Now that's something that would justify changes in the kernel interface. Marek
[PATCH] Don't build Intel DRM if $CHOST is not i?86-* or x86_64-*
On Tue, 31 Jan 2012 10:34:38 -0800, Jeremy Huddleston wrote: > > On Jan 31, 2012, at 8:59 AM, Eric Anholt wrote: > > > On Mon, 30 Jan 2012 15:25:20 -0800, Jeremy Huddleston > freedesktop.org> wrote: > >> This fixes a failure in 'make check' found by the tinderbox when trying to > >> build this code on Linux/ppc. This code is only designed to run on > >> Intel platforms, so don't even bother building it if we're not in that set. > > > > Looks reasonable to me, except it should probably be checking $target_os > > (cross-compile target) rather than $host_os (cross compile build host). > > I think you are misunderstanding the variables (or perhaps you are > following Mozilla's usage of the variables, which is wrong but > internally consistent). You're right. And that was with me looking up information on these variables before mailing last time. :( -- next part -- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: not available URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20120131/c047449b/attachment.pgp>
[PATCH] Don't build Intel DRM if $CHOST is not i?86-* or x86_64-*
On Jan 31, 2012, at 8:59 AM, Eric Anholt wrote: > On Mon, 30 Jan 2012 15:25:20 -0800, Jeremy Huddleston freedesktop.org> wrote: >> This fixes a failure in 'make check' found by the tinderbox when trying to >> build this code on Linux/ppc. This code is only designed to run on >> Intel platforms, so don't even bother building it if we're not in that set. > > Looks reasonable to me, except it should probably be checking $target_os > (cross-compile target) rather than $host_os (cross compile build host). I think you are misunderstanding the variables (or perhaps you are following Mozilla's usage of the variables, which is wrong but internally consistent). CBUILD - The platform of the machine being built on. CHOST - The platform that the delivered product will run on. CTARGET- The platform that the delivered product will generate code for. CTARGET is only really relevant for toolchain packages. eg. I want to build a gcc that builds mips code and runs on my intel box: CTARGET=mips-* CHOST=i686-* If I want to do the building of that compiler on my ppc box: CBUILD=powerpc-* CTARGET=mips-* CHOST=i686-* --Jeremy
radeon issues on MacBook Pro 8,2
On Fri, Jan 20, 2012 at 05:08:31PM -0600, Seth Forshee wrote: > > Can you track down who is calling the connector->detect() callbacks > > during suspend and resume? > > I got two different stack traces, see below. > > And to slightly amend my statement above, I'm only seeing the wrong > status returned on the suspend side of things. The status during resume > always seems to be correct. > > Pid: 49, comm: kworker/0:2 Tainted: G C O 3.2.0-10-generic #17-Ubuntu > Call Trace: > [] radeon_dp_detect+0x1de/0x230 [radeon] > [] output_poll_execute+0xbd/0x1a0 [drm_kms_helper] > [] ? drm_helper_mode_fill_fb_struct+0x30/0x30 > [drm_kms_helper] > [] process_one_work+0x11a/0x480 > [] worker_thread+0x164/0x370 > [] ? manage_workers.isra.30+0x130/0x130 > [] kthread+0x8c/0xa0 > [] kernel_thread_helper+0x4/0x10 > [] ? flush_kthread_worker+0xa0/0xa0 > [] ? gs_change+0x13/0x13 > > Pid: 49, comm: kworker/0:2 Tainted: G C O 3.2.0-10-generic #17-Ubuntu > Call Trace: > [] radeon_dp_detect+0x1de/0x230 [radeon] > [] drm_helper_probe_single_connector_modes+0x2c3/0x380 > [drm_kms_helper] > [] drm_fb_helper_hotplug_event+0xf2/0x150 [drm_kms_helper] > [] radeon_fb_output_poll_changed+0x15/0x20 [radeon] > [] radeon_output_poll_changed+0x15/0x20 [radeon] > [] output_poll_execute+0x190/0x1a0 [drm_kms_helper] > [] ? drm_helper_mode_fill_fb_struct+0x30/0x30 > [drm_kms_helper] > [] process_one_work+0x11a/0x480 > [] worker_thread+0x164/0x370 > [] ? manage_workers.isra.30+0x130/0x130 > [] kthread+0x8c/0xa0 > [] kernel_thread_helper+0x4/0x10 > [] ? flush_kthread_worker+0xa0/0xa0 > [] ? gs_change+0x13/0x13
[PATCH 1/1] drm/radeon/kms: disable output polling when suspended
Polling the outputs when the device is suspended can result in erroneous status updates. Disable output polling during suspend to prevent this from happening. Signed-off-by: Seth Forshee --- drivers/gpu/drm/radeon/radeon_device.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index cec51a5..49f7cb7 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -883,6 +883,8 @@ int radeon_suspend_kms(struct drm_device *dev, pm_message_t state) if (dev->switch_power_state == DRM_SWITCH_POWER_OFF) return 0; + drm_kms_helper_poll_disable(dev); + /* turn off display hw */ list_for_each_entry(connector, &dev->mode_config.connector_list, head) { drm_helper_connector_dpms(connector, DRM_MODE_DPMS_OFF); @@ -972,6 +974,8 @@ int radeon_resume_kms(struct drm_device *dev) list_for_each_entry(connector, &dev->mode_config.connector_list, head) { drm_helper_connector_dpms(connector, DRM_MODE_DPMS_ON); } + + drm_kms_helper_poll_enable(dev); return 0; } -- 1.7.8.3