[PATCH 13/83] hsa/radeon: Add 2 new IOCTL to kfd, CREATE_QUEUE and DESTROY_QUEUE

2014-07-12 Thread Dave Airlie
> +/* The 64-bit ABI is the authoritative version. */
> +#pragma pack(push, 8)
> +

Don't do this; pad and align things explicitly in structs.

> +struct kfd_ioctl_create_queue_args {
> +   uint64_t ring_base_address; /* to KFD */
> +   uint32_t ring_size; /* to KFD */
> +   uint32_t gpu_id;/* to KFD */
> +   uint32_t queue_type;/* to KFD */
> +   uint32_t queue_percentage;  /* to KFD */
> +   uint32_t queue_priority;/* to KFD */
> +   uint64_t write_pointer_address; /* to KFD */
> +   uint64_t read_pointer_address;  /* to KFD */
> +
> +   uint64_t doorbell_address;  /* from KFD */
> +   uint32_t queue_id;  /* from KFD */
> +};
> +

Maybe put all the uint64_t members at the start, or add explicit padding.
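To make that advice concrete, here is one possible layout (a sketch only, not the ABI that was eventually merged): group the uint64_t members first so the struct is naturally aligned, with no implicit padding and no #pragma pack:

```c
/* Illustrative only: one way to follow the advice above. With all
 * 64-bit members grouped first, every field sits at its natural
 * alignment and the layout is identical for 32- and 64-bit userspace,
 * so no pragma and no compat ioctl handling are needed. */
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

struct kfd_ioctl_create_queue_args {
	uint64_t ring_base_address;	/* to KFD */
	uint64_t write_pointer_address;	/* to KFD */
	uint64_t read_pointer_address;	/* to KFD */
	uint64_t doorbell_address;	/* from KFD */

	uint32_t ring_size;		/* to KFD */
	uint32_t gpu_id;		/* to KFD */
	uint32_t queue_type;		/* to KFD */
	uint32_t queue_percentage;	/* to KFD */
	uint32_t queue_priority;	/* to KFD */
	uint32_t queue_id;		/* from KFD */
};
/* 4 * 8 + 6 * 4 = 56 bytes, 8-byte aligned, no implicit or tail padding. */
```

Because sizeof() and every offset match on all ABIs, the ioctl number (which encodes the struct size) is also stable across architectures.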

Dave.


[Intel-gfx] [v3 09/13] drm/i915: Add rotation property for sprites

2014-07-12 Thread Daniel Vetter
On Tue, Jul 08, 2014 at 10:31:59AM +0530, sonika.jindal at intel.com wrote:
> From: Ville Syrjälä 
> 
> Sprite planes support 180 degree rotation. The lower layers are now in
> place, so hook in the standard rotation property to expose the feature
> to the users.
> 
> v2: Moving rotation_property to drm_plane
> 
> Cc: dri-devel at lists.freedesktop.org
> Signed-off-by: Ville Syrjälä 
> Signed-off-by: Sonika Jindal 
> Reviewed-by: Imre Deak 

Also, this r-b tag was for v1 (which was ok), not for v2. If you carry over
such a Reviewed-by tag and make functional changes not discussed with the
reviewer, you _must_ at least mark the r-b with a (v1) or, if it's a big
change, drop the tag completely.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


[Intel-gfx] [v3 09/13] drm/i915: Add rotation property for sprites

2014-07-12 Thread Daniel Vetter
On Tue, Jul 08, 2014 at 10:31:59AM +0530, sonika.jindal at intel.com wrote:
> From: Ville Syrjälä 
> 
> Sprite planes support 180 degree rotation. The lower layers are now in
> place, so hook in the standard rotation property to expose the feature
> to the users.
> 
> v2: Moving rotation_property to drm_plane
> 
> Cc: dri-devel at lists.freedesktop.org
> Signed-off-by: Ville Syrjälä 
> Signed-off-by: Sonika Jindal 
> Reviewed-by: Imre Deak 
> ---
>  drivers/gpu/drm/i915/intel_sprite.c |   40 
> ++-
>  include/drm/drm_crtc.h  |1 +

One more: A patch titled with "drm/i915: ..." really shouldn't touch
anything outside of drm/i915 directories and so shouldn't introduce any
changes to core drm code. Such changes always need to be split out into a
separate drm patch. Exceptions (like refactoring function interfaces)
obviously apply.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


[Bug 73053] dpm hangs with BTC parts

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=73053

--- Comment #39 from Alexandre Demers  ---
(In reply to comment #38)
> Attachment 102081 [details] fixes the "hard lockup with small vertical blue
> stripes" issue, when applied to 3.15.4, and AFAICS dpm works fine.
> 
> The new problem is that I get kernel panic after a few hours if dpm is
> enabled. With the good old profile method the system is stable.

Could you test with the latest kernel 3.16 RC (the patch is already included)? I
have been running kernel 3.16-RC4 with this patch on Cayman and I don't get
any lockups anymore. I've been running my system for a few days (games, movies
and so on) without problems.

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20140712/7fa5a6b3/attachment.html>


[Bug 73053] dpm hangs with BTC parts

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=73053

--- Comment #40 from Alex Deucher  ---
(In reply to comment #38)
> The new problem is that I get kernel panic after a few hours if dpm is
> enabled. With the good old profile method the system is stable.

Can you get a copy of the panic?  I think it may be related to the page
flipping changes in the last couple kernels.  It's not likely dpm would cause a
panic.



[PATCH 07/83] drm/radeon: Add kfd-->kgd interface of locking srbm_gfx_cntl register

2014-07-12 Thread Bridgman, John
Confirmed. The locking functions are removed from the interface in commit 82:

[PATCH 82/83] drm/radeon: Remove lock functions from kfd2kgd interface

There is an elegant symmetry there, but yeah, we need to find a way to make this 
less awkward to review without screwing up all the work you've done so far. 
It's not obvious how to do that, though. I looked at squashing into a smaller 
number of big commits earlier on, but unless we completely rip the code out and 
recreate it from scratch I don't see anything better than:

- a few foundation commits
- a big code dump that covers everything up to ~patch 54 (with 71 squashed in)
- remaining commits squashed a bit to combine fixes with initial code

Is that what you had in mind when you said ~10 big commits? Our feeling was 
that the need to skip over the original scheduler would make it more like "one 
really big commit and 10-20 smaller ones", and I think we all felt that the 
"big code dump" required to skip over the original scheduler would be a 
non-starter.

I guess there is another option, and maybe that's what you had in mind -- 
breaking the "big code dump" into smaller commits would be possible if we were 
willing to not have working code until we got to the equivalent of ~patch 54 
(+71), when all the new scheduler bits were in. Maybe that would still be an 
improvement?

Thanks,
JB

>-Original Message-
>From: Bridgman, John
>Sent: Friday, July 11, 2014 1:48 PM
>To: 'Jerome Glisse'; Oded Gabbay
>Cc: David Airlie; Deucher, Alexander; linux-kernel at vger.kernel.org; dri-
>devel at lists.freedesktop.org; Lewycky, Andrew; Joerg Roedel; Gabbay, Oded;
>Koenig, Christian
>Subject: RE: [PATCH 07/83] drm/radeon: Add kfd-->kgd interface of locking
>srbm_gfx_cntl register
>
>Checking... we shouldn't need to call the lock from kfd any more. We
>should be able to do any required locking in radeon kgd code.
>
>>-Original Message-
>>From: Jerome Glisse [mailto:j.glisse at gmail.com]
>>Sent: Friday, July 11, 2014 12:35 PM
>>To: Oded Gabbay
>>Cc: David Airlie; Deucher, Alexander; linux-kernel at vger.kernel.org;
>>dri- devel at lists.freedesktop.org; Bridgman, John; Lewycky, Andrew;
>>Joerg Roedel; Gabbay, Oded; Koenig, Christian
>>Subject: Re: [PATCH 07/83] drm/radeon: Add kfd-->kgd interface of
>>locking srbm_gfx_cntl register
>>
>>On Fri, Jul 11, 2014 at 12:50:07AM +0300, Oded Gabbay wrote:
>>> This patch adds a new interface to kfd2kgd_calls structure, which
>>> allows the kfd to lock and unlock the srbm_gfx_cntl register
>>
>>Why does kfd need to lock this register if kfd cannot access any of
>>those registers? This sounds broken to me; exposing a driver-internal
>>mutex to another driver is not something I am a fan of.
>>
>>Cheers,
>>Jérôme
>>
>>>
>>> Signed-off-by: Oded Gabbay 
>>> ---
>>>  drivers/gpu/drm/radeon/radeon_kfd.c | 20 
>>>  include/linux/radeon_kfd.h  |  4 
>>>  2 files changed, 24 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c
>>> b/drivers/gpu/drm/radeon/radeon_kfd.c
>>> index 66ee36b..594020e 100644
>>> --- a/drivers/gpu/drm/radeon/radeon_kfd.c
>>> +++ b/drivers/gpu/drm/radeon/radeon_kfd.c
>>> @@ -43,6 +43,10 @@ static void unkmap_mem(struct kgd_dev *kgd,
>struct
>>> kgd_mem *mem);
>>>
>>>  static uint64_t get_vmem_size(struct kgd_dev *kgd);
>>>
>>> +static void lock_srbm_gfx_cntl(struct kgd_dev *kgd); static void
>>> +unlock_srbm_gfx_cntl(struct kgd_dev *kgd);
>>> +
>>> +
>>>  static const struct kfd2kgd_calls kfd2kgd = {
>>> .allocate_mem = allocate_mem,
>>> .free_mem = free_mem,
>>> @@ -51,6 +55,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>>> .kmap_mem = kmap_mem,
>>> .unkmap_mem = unkmap_mem,
>>> .get_vmem_size = get_vmem_size,
>>> +   .lock_srbm_gfx_cntl = lock_srbm_gfx_cntl,
>>> +   .unlock_srbm_gfx_cntl = unlock_srbm_gfx_cntl,
>>>  };
>>>
>>>  static const struct kgd2kfd_calls *kgd2kfd; @@ -233,3 +239,17 @@
>>> static uint64_t get_vmem_size(struct kgd_dev *kgd)
>>>
>>> return rdev->mc.real_vram_size;
>>>  }
>>> +
>>> +static void lock_srbm_gfx_cntl(struct kgd_dev *kgd) {
>>> +   struct radeon_device *rdev = (struct radeon_device *)kgd;
>>> +
>>> +   mutex_lock(&rdev->srbm_mutex);
>>> +}
>>> +
>>> +static void unlock_srbm_gfx_cntl(struct kgd_dev *kgd) {
>>> +   struct radeon_device *rdev = (struct radeon_device *)kgd;
>>> +
>>> +   mutex_unlock(&rdev->srbm_mutex);
>>> +}
>>> diff --git a/include/linux/radeon_kfd.h b/include/linux/radeon_kfd.h
>>> index c7997d4..40b691c 100644
>>> --- a/include/linux/radeon_kfd.h
>>> +++ b/include/linux/radeon_kfd.h
>>> @@ -81,6 +81,10 @@ struct kfd2kgd_calls {
>>> void (*unkmap_mem)(struct kgd_dev *kgd, struct kgd_mem *mem);
>>>
>>> uint64_t (*get_vmem_size)(struct kgd_dev *kgd);
>>> +
>>> +   /* SRBM_GFX_CNTL mutex */
>>> +   void (*lock_srbm_gfx_cntl)(struct kgd_dev *kgd);
>>> +   void (*unlock_srbm_gfx_cntl)(struct kgd_dev *kgd);
>>>  };
>>>
>>>  bool kgd2kfd_init(unsigned interface_version,

Recall: [PATCH 07/83] drm/radeon: Add kfd-->kgd interface of locking srbm_gfx_cntl register

2014-07-12 Thread Bridgman, John
Bridgman, John would like to recall the message, "[PATCH 07/83] drm/radeon: Add 
kfd-->kgd interface of locking srbm_gfx_cntl register".


[Bug 81255] New: EDEADLK with S. Islands APU+dGPU during ib test on ring 5

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=81255

  Priority: medium
Bug ID: 81255
  Assignee: dri-devel at lists.freedesktop.org
   Summary: EDEADLK with S. Islands APU+dGPU during ib test on
ring 5
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: joshua.r.marshall.1991 at gmail.com
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: XOrg CVS
 Component: DRM/Radeon
   Product: DRI

Created attachment 102648
  --> https://bugs.freedesktop.org/attachment.cgi?id=102648&action=edit
sudo lshw -sanitize

Recently installed an R9 270x in my system.  For the life of me, I cannot
determine what is wrong.  The BIOS uses it by default, and during early boot the
system uses it, but during the VT switchoff my system switches to APU graphics
during a series of GPU page faults.  Also, when running startxfce, the first
attempt after boot fails, and subsequent attempts use only the APU graphics.

sudo lshw -sanitize > http://pastebin.com/tkihM4rH

sudo journalctl -b -kxam > http://pastebin.com/EJcwcYU6

cat /var/log/Xorg.0.log > http://pastebin.com/wCbzws9d

cat /var/log/Xorg.o.log.old > http://pastebin.com/GaAL5Tqa

Xorg is working on auto.



[Bug 81255] EDEADLK with S. Islands APU+dGPU during ib test on ring 5

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=81255

--- Comment #1 from joshua.r.marshall.1991 at gmail.com ---
Created attachment 102649
  --> https://bugs.freedesktop.org/attachment.cgi?id=102649&action=edit
sudo journalctl -b -kxam



[Bug 81255] EDEADLK with S. Islands APU+dGPU during ib test on ring 5

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=81255

--- Comment #2 from joshua.r.marshall.1991 at gmail.com ---
Created attachment 102650
  --> https://bugs.freedesktop.org/attachment.cgi?id=102650&action=edit
Xorg.0.log



[Bug 81255] EDEADLK with S. Islands APU+dGPU during ib test on ring 5

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=81255

--- Comment #3 from joshua.r.marshall.1991 at gmail.com ---
Created attachment 102651
  --> https://bugs.freedesktop.org/attachment.cgi?id=102651&action=edit
Xorg.0.log.old



[Bug 81255] EDEADLK with S. Islands APU+dGPU during ib test on ring 5

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=81255

--- Comment #4 from joshua.r.marshall.1991 at gmail.com ---
I would start at /drivers/gpu/drm/radeon/radeon_fence.c:368, since
that is the clause being hit.  So on line 364, on iteration 5 (the last),
radeon_ring_is_lockup returns true.  The function's definition is beyond me, so
that's where I'll have to leave off.



[PATCH v2 0/2] drm: rework flip-work framework

2014-07-12 Thread Boris BREZILLON
Hello,

This patch series reworks the flip-work framework to make it safe when
calling drm_flip_work_queue from atomic contexts.

The 2nd patch of this series is optional, as it only reworks the
drm_flip_work_init prototype to remove the unneeded size argument and
return code (this function cannot fail anymore).

Best Regards,

Boris


Boris BREZILLON (2):
  drm: rework flip-work helpers to avoid calling func when the FIFO is
full
  drm: flip-work: change drm_flip_work_init prototype

 drivers/gpu/drm/drm_flip_work.c  | 104 ++-
 drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c |  19 ++
 drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c |  16 +
 drivers/gpu/drm/omapdrm/omap_plane.c |  14 +
 drivers/gpu/drm/tilcdc/tilcdc_crtc.c |   6 +-
 include/drm/drm_flip_work.h  |  31 ++---
 6 files changed, 105 insertions(+), 85 deletions(-)

-- 
1.8.3.2



[PATCH v2 1/2] drm: rework flip-work helpers to avoid calling func when the FIFO is full

2014-07-12 Thread Boris BREZILLON
Make use of lists instead of a kfifo in order to dynamically allocate
a task entry when someone requires some delayed work, thus preventing
drm_flip_work_queue from directly calling func instead of queuing the
call.
This allows drm_flip_work_queue to be safely called even from within
irq handlers.

Add new helper functions to allocate a flip work task and queue it when
needed. This prevents allocating data within irq context (which might
impact the time spent in the irq handler).
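The core idea of the patch (producers append to a queued list under a lock, a commit step splices it onto a committed list, and the worker drains the committed list in batches) can be sketched in self-contained userspace C. A pthread mutex stands in for the kernel spinlock here, and all names below are illustrative, not the kernel API:

```c
/* Sketch of the two-list hand-off pattern: queue_task() appends under
 * the lock, commit() moves everything queued so far onto the committed
 * list, and drain() (the "worker") processes only committed tasks.
 * Tasks queued after the last commit() are not drained yet. */
#include <pthread.h>
#include <stdlib.h>
#include <assert.h>

struct task { int data; struct task *next; };

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static struct task *queued;     /* tasks waiting for commit() */
static struct task *committed;  /* tasks the worker may run */

static void queue_task(int data)
{
	struct task *t = malloc(sizeof(*t));
	assert(t);
	t->data = data;
	pthread_mutex_lock(&lock);
	t->next = queued;           /* push onto the queued list */
	queued = t;
	pthread_mutex_unlock(&lock);
}

static void commit(void)
{
	pthread_mutex_lock(&lock);
	struct task *t = queued;    /* detach the whole queued list */
	queued = NULL;
	while (t) {                 /* splice it onto committed */
		struct task *n = t->next;
		t->next = committed;
		committed = t;
		t = n;
	}
	pthread_mutex_unlock(&lock);
}

static int drain(void)          /* worker: returns tasks processed */
{
	pthread_mutex_lock(&lock);
	struct task *t = committed; /* grab the batch, drop the lock */
	committed = NULL;
	pthread_mutex_unlock(&lock);

	int count = 0;
	while (t) {
		struct task *n = t->next;
		count++;            /* work->func(work, t->data) would run here */
		free(t);
		t = n;
	}
	return count;
}
```

Because queue_task() only takes the lock for a list append (no allocation under the lock, no fixed-size FIFO that can fill up), the producer side is safe to call from atomic context, which is exactly the property the patch is after.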

Signed-off-by: Boris BREZILLON 
---
 drivers/gpu/drm/drm_flip_work.c | 96 ++---
 include/drm/drm_flip_work.h | 29 +
 2 files changed, 93 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/drm_flip_work.c b/drivers/gpu/drm/drm_flip_work.c
index f9c7fa3..7441aa8 100644
--- a/drivers/gpu/drm/drm_flip_work.c
+++ b/drivers/gpu/drm/drm_flip_work.c
@@ -25,6 +25,44 @@
 #include "drm_flip_work.h"

 /**
+ * drm_flip_work_allocate_task - allocate a flip-work task
+ * @data: data associated to the task
+ * @flags: allocator flags
+ *
+ * Allocate a drm_flip_task object and attach private data to it.
+ */
+struct drm_flip_task *drm_flip_work_allocate_task(void *data, gfp_t flags)
+{
+   struct drm_flip_task *task;
+
+   task = kzalloc(sizeof(*task), flags);
+   if (task)
+   task->data = data;
+
+   return task;
+}
+EXPORT_SYMBOL(drm_flip_work_allocate_task);
+
+/**
+ * drm_flip_work_queue_task - queue a specific task
+ * @work: the flip-work
+ * @task: the task to handle
+ *
+ * Queues task, that will later be run (passed back to drm_flip_func_t
+ * func) on a work queue after drm_flip_work_commit() is called.
+ */
+void drm_flip_work_queue_task(struct drm_flip_work *work,
+ struct drm_flip_task *task)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&work->lock, flags);
+   list_add_tail(&task->node, &work->queued);
+   spin_unlock_irqrestore(&work->lock, flags);
+}
+EXPORT_SYMBOL(drm_flip_work_queue_task);
+
+/**
  * drm_flip_work_queue - queue work
  * @work: the flip-work
  * @val: the value to queue
@@ -34,10 +72,14 @@
  */
 void drm_flip_work_queue(struct drm_flip_work *work, void *val)
 {
-   if (kfifo_put(&work->fifo, val)) {
-   atomic_inc(&work->pending);
+   struct drm_flip_task *task;
+
+   task = drm_flip_work_allocate_task(val,
+   drm_can_sleep() ? GFP_KERNEL : GFP_ATOMIC);
+   if (task) {
+   drm_flip_work_queue_task(work, task);
} else {
-   DRM_ERROR("%s fifo full!\n", work->name);
+   DRM_ERROR("%s could not allocate task!\n", work->name);
work->func(work, val);
}
 }
@@ -56,9 +98,12 @@ EXPORT_SYMBOL(drm_flip_work_queue);
 void drm_flip_work_commit(struct drm_flip_work *work,
struct workqueue_struct *wq)
 {
-   uint32_t pending = atomic_read(&work->pending);
-   atomic_add(pending, &work->count);
-   atomic_sub(pending, &work->pending);
+   unsigned long flags;
+
+   spin_lock_irqsave(&work->lock, flags);
+   list_splice_tail(&work->queued, &work->commited);
+   INIT_LIST_HEAD(&work->queued);
+   spin_unlock_irqrestore(&work->lock, flags);
queue_work(wq, &work->worker);
 }
 EXPORT_SYMBOL(drm_flip_work_commit);
@@ -66,14 +111,26 @@ EXPORT_SYMBOL(drm_flip_work_commit);
 static void flip_worker(struct work_struct *w)
 {
struct drm_flip_work *work = container_of(w, struct drm_flip_work, 
worker);
-   uint32_t count = atomic_read(&work->count);
-   void *val = NULL;
+   struct list_head tasks;
+   unsigned long flags;

-   atomic_sub(count, &work->count);
+   while (1) {
+   struct drm_flip_task *task, *tmp;

-   while(count--)
-   if (!WARN_ON(!kfifo_get(&work->fifo, &val)))
-   work->func(work, val);
+   INIT_LIST_HEAD(&tasks);
+   spin_lock_irqsave(&work->lock, flags);
+   list_splice_tail(&work->commited, &tasks);
+   INIT_LIST_HEAD(&work->commited);
+   spin_unlock_irqrestore(&work->lock, flags);
+
+   if (list_empty(&tasks))
+   break;
+
+   list_for_each_entry_safe(task, tmp, &tasks, node) {
+   work->func(work, task->data);
+   kfree(task);
+   }
+   }
 }

 /**
@@ -91,19 +148,11 @@ static void flip_worker(struct work_struct *w)
 int drm_flip_work_init(struct drm_flip_work *work, int size,
const char *name, drm_flip_func_t func)
 {
-   int ret;
-
work->name = name;
-   atomic_set(&work->count, 0);
-   atomic_set(&work->pending, 0);
+   INIT_LIST_HEAD(&work->queued);
+   INIT_LIST_HEAD(&work->commited);
work->func = func;

-   ret = kfifo_alloc(&work->fifo, size, GFP_KERNEL);
-   if (ret) {
-   DRM_ERROR("could 

[PATCH v2 2/2] drm: flip-work: change drm_flip_work_init prototype

2014-07-12 Thread Boris BREZILLON
Now that we're using lists instead of a kfifo to store drm flip-work tasks,
we no longer need the size parameter passed to the drm_flip_work_init
function.
Moreover, this function can no longer fail, so we can remove the return
code.

Modify drm_flip_work_init users to take these changes into account.

Signed-off-by: Boris BREZILLON 
---
 drivers/gpu/drm/drm_flip_work.c  |  8 +---
 drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c | 19 ---
 drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c | 16 +++-
 drivers/gpu/drm/omapdrm/omap_plane.c | 14 ++
 drivers/gpu/drm/tilcdc/tilcdc_crtc.c |  6 +-
 include/drm/drm_flip_work.h  |  2 +-
 6 files changed, 12 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/drm_flip_work.c b/drivers/gpu/drm/drm_flip_work.c
index 7441aa8..2b557f2 100644
--- a/drivers/gpu/drm/drm_flip_work.c
+++ b/drivers/gpu/drm/drm_flip_work.c
@@ -136,16 +136,12 @@ static void flip_worker(struct work_struct *w)
 /**
  * drm_flip_work_init - initialize flip-work
  * @work: the flip-work to initialize
- * @size: the max queue depth
  * @name: debug name
  * @func: the callback work function
  *
  * Initializes/allocates resources for the flip-work
- *
- * RETURNS:
- * Zero on success, error code on failure.
  */
-int drm_flip_work_init(struct drm_flip_work *work, int size,
+void drm_flip_work_init(struct drm_flip_work *work,
const char *name, drm_flip_func_t func)
 {
work->name = name;
@@ -154,8 +150,6 @@ int drm_flip_work_init(struct drm_flip_work *work, int size,
work->func = func;

INIT_WORK(&work->worker, flip_worker);
-
-   return 0;
 }
 EXPORT_SYMBOL(drm_flip_work_init);

diff --git a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c 
b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c
index 74cebb5..44d4f93 100644
--- a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c
+++ b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c
@@ -755,10 +755,8 @@ struct drm_crtc *mdp4_crtc_init(struct drm_device *dev,
int ret;

mdp4_crtc = kzalloc(sizeof(*mdp4_crtc), GFP_KERNEL);
-   if (!mdp4_crtc) {
-   ret = -ENOMEM;
-   goto fail;
-   }
+   if (!mdp4_crtc)
+   return ERR_PTR(-ENOMEM);

crtc = &mdp4_crtc->base;

@@ -779,12 +777,9 @@ struct drm_crtc *mdp4_crtc_init(struct drm_device *dev,

spin_lock_init(&mdp4_crtc->cursor.lock);

-   ret = drm_flip_work_init(&mdp4_crtc->unref_fb_work, 16,
+   drm_flip_work_init(&mdp4_crtc->unref_fb_work,
"unref fb", unref_fb_worker);
-   if (ret)
-   goto fail;
-
-   ret = drm_flip_work_init(&mdp4_crtc->unref_cursor_work, 64,
+   drm_flip_work_init(&mdp4_crtc->unref_cursor_work,
"unref cursor", unref_cursor_worker);

INIT_FENCE_CB(&mdp4_crtc->pageflip_cb, pageflip_cb);
@@ -795,10 +790,4 @@ struct drm_crtc *mdp4_crtc_init(struct drm_device *dev,
mdp4_plane_install_properties(mdp4_crtc->plane, &crtc->base);

return crtc;
-
-fail:
-   if (crtc)
-   mdp4_crtc_destroy(crtc);
-
-   return ERR_PTR(ret);
 }
diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c 
b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c
index ebe2e60..a0cb374 100644
--- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c
+++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c
@@ -537,10 +537,8 @@ struct drm_crtc *mdp5_crtc_init(struct drm_device *dev,
int ret;

mdp5_crtc = kzalloc(sizeof(*mdp5_crtc), GFP_KERNEL);
-   if (!mdp5_crtc) {
-   ret = -ENOMEM;
-   goto fail;
-   }
+   if (!mdp5_crtc)
+   return ERR_PTR(-ENOMEM);

crtc = &mdp5_crtc->base;

@@ -553,10 +551,8 @@ struct drm_crtc *mdp5_crtc_init(struct drm_device *dev,
snprintf(mdp5_crtc->name, sizeof(mdp5_crtc->name), "%s:%d",
pipe2name(mdp5_plane_pipe(plane)), id);

-   ret = drm_flip_work_init(&mdp5_crtc->unref_fb_work, 16,
+   drm_flip_work_init(&mdp5_crtc->unref_fb_work,
"unref fb", unref_fb_worker);
-   if (ret)
-   goto fail;

INIT_FENCE_CB(&mdp5_crtc->pageflip_cb, pageflip_cb);

@@ -566,10 +562,4 @@ struct drm_crtc *mdp5_crtc_init(struct drm_device *dev,
mdp5_plane_install_properties(mdp5_crtc->plane, &crtc->base);

return crtc;
-
-fail:
-   if (crtc)
-   mdp5_crtc_destroy(crtc);
-
-   return ERR_PTR(ret);
 }
diff --git a/drivers/gpu/drm/omapdrm/omap_plane.c 
b/drivers/gpu/drm/omapdrm/omap_plane.c
index 3cf31ee..847d1ca 100644
--- a/drivers/gpu/drm/omapdrm/omap_plane.c
+++ b/drivers/gpu/drm/omapdrm/omap_plane.c
@@ -397,14 +397,10 @@ struct drm_plane *omap_plane_init(struct drm_device *dev,

omap_plane = kzalloc(sizeof(*omap_plane), GFP_KERNEL);
if (!omap_plane)
-   goto fail;
+   return NULL;

-   ret = drm_flip_work_init(&omap_plane->unpin_work, 16,
+   drm_fl

[PATCH v2 1/2] drm: rework flip-work helpers to avoid calling func when the FIFO is full

2014-07-12 Thread Boris BREZILLON
On Sat, 12 Jul 2014 09:00:08 +0200
Boris BREZILLON  wrote:

> Make use of lists instead of kfifo in order to dynamically allocate
> task entry when someone require some delayed work, and thus preventing
> drm_flip_work_queue from directly calling func instead of queuing this
> call.
> This allow drm_flip_work_queue to be safely called even within irq
> handlers.
> 
> Add new helper functions to allocate a flip work task and queue it when
> needed. This prevents allocating data within irq context (which might
> impact the time spent in the irq handler).
> 
> Signed-off-by: Boris BREZILLON 
> ---
>  drivers/gpu/drm/drm_flip_work.c | 96 
> ++---
>  include/drm/drm_flip_work.h | 29 +
>  2 files changed, 93 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_flip_work.c b/drivers/gpu/drm/drm_flip_work.c
> index f9c7fa3..7441aa8 100644
> --- a/drivers/gpu/drm/drm_flip_work.c
> +++ b/drivers/gpu/drm/drm_flip_work.c
> @@ -25,6 +25,44 @@
>  #include "drm_flip_work.h"
>  
>  /**
> + * drm_flip_work_allocate_task - allocate a flip-work task
> + * @data: data associated to the task
> + * @flags: allocator flags
> + *
> + * Allocate a drm_flip_task object and attach private data to it.
> + */
> +struct drm_flip_task *drm_flip_work_allocate_task(void *data, gfp_t flags)
> +{
> + struct drm_flip_task *task;
> +
> + task = kzalloc(sizeof(*task), flags);
> + if (task)
> + task->data = data;
> +
> + return task;
> +}
> +EXPORT_SYMBOL(drm_flip_work_allocate_task);
> +
> +/**
> + * drm_flip_work_queue_task - queue a specific task
> + * @work: the flip-work
> + * @task: the task to handle
> + *
> + * Queues task, that will later be run (passed back to drm_flip_func_t
> + * func) on a work queue after drm_flip_work_commit() is called.
> + */
> +void drm_flip_work_queue_task(struct drm_flip_work *work,
> +   struct drm_flip_task *task)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&work->lock, flags);
> + list_add_tail(&task->node, &work->queued);
> + spin_unlock_irqrestore(&work->lock, flags);
> +}
> +EXPORT_SYMBOL(drm_flip_work_queue_task);
> +
> +/**
>   * drm_flip_work_queue - queue work
>   * @work: the flip-work
>   * @val: the value to queue
> @@ -34,10 +72,14 @@
>   */
>  void drm_flip_work_queue(struct drm_flip_work *work, void *val)
>  {
> - if (kfifo_put(&work->fifo, val)) {
> - atomic_inc(&work->pending);
> + struct drm_flip_task *task;
> +
> + task = drm_flip_work_allocate_task(val,
> + drm_can_sleep() ? GFP_KERNEL : GFP_ATOMIC);
> + if (task) {
> + drm_flip_work_queue_task(work, task);
>   } else {
> - DRM_ERROR("%s fifo full!\n", work->name);
> + DRM_ERROR("%s could not allocate task!\n", work->name);
>   work->func(work, val);
>   }
>  }
> @@ -56,9 +98,12 @@ EXPORT_SYMBOL(drm_flip_work_queue);
>  void drm_flip_work_commit(struct drm_flip_work *work,
>   struct workqueue_struct *wq)
>  {
> - uint32_t pending = atomic_read(&work->pending);
> - atomic_add(pending, &work->count);
> - atomic_sub(pending, &work->pending);
> + unsigned long flags;
> +
> + spin_lock_irqsave(&work->lock, flags);
> + list_splice_tail(&work->queued, &work->commited);
> + INIT_LIST_HEAD(&work->queued);
> + spin_unlock_irqrestore(&work->lock, flags);
>   queue_work(wq, &work->worker);
>  }
>  EXPORT_SYMBOL(drm_flip_work_commit);
> @@ -66,14 +111,26 @@ EXPORT_SYMBOL(drm_flip_work_commit);
>  static void flip_worker(struct work_struct *w)
>  {
>   struct drm_flip_work *work = container_of(w, struct drm_flip_work, 
> worker);
> - uint32_t count = atomic_read(&work->count);
> - void *val = NULL;
> + struct list_head tasks;
> + unsigned long flags;
>  
> - atomic_sub(count, &work->count);
> + while (1) {
> + struct drm_flip_task *task, *tmp;
>  
> - while(count--)
> - if (!WARN_ON(!kfifo_get(&work->fifo, &val)))
> - work->func(work, val);
> + INIT_LIST_HEAD(&tasks);
> + spin_lock_irqsave(&work->lock, flags);
> + list_splice_tail(&work->commited, &tasks);
> + INIT_LIST_HEAD(&work->commited);
> + spin_unlock_irqrestore(&work->lock, flags);
> +
> + if (list_empty(&tasks))
> + break;
> +
> + list_for_each_entry_safe(task, tmp, &tasks, node) {
> + work->func(work, task->data);
> + kfree(task);
> + }
> + }
>  }
>  
>  /**
> @@ -91,19 +148,11 @@ static void flip_worker(struct work_struct *w)
>  int drm_flip_work_init(struct drm_flip_work *work, int size,
>   const char *name, drm_flip_func_t func)
>  {
> - int ret;
> -
>   work->name = name;
> - atomic_set(&work->count, 0);
> - atomic_set(&work

[Bug 81255] EDEADLK with S. Islands APU+dGPU during ib test on ring 5

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=81255

--- Comment #5 from Kertesz Laszlo  ---
You should try disabling the APU from the BIOS.

-- 
You are receiving this mail because:
You are the assignee for the bug.


[PATCH 02/83] drm/radeon: reduce number of free VMIDs and pipes in KV

2014-07-12 Thread Christian König
Am 11.07.2014 18:22, schrieb Alex Deucher:
> On Fri, Jul 11, 2014 at 12:18 PM, Christian König
>  wrote:
>> Am 11.07.2014 18:05, schrieb Jerome Glisse:
>>
>>> On Fri, Jul 11, 2014 at 12:50:02AM +0300, Oded Gabbay wrote:
 To support HSA on KV, we need to limit the number of vmids and pipes
 that are available for radeon's use with KV.

 This patch reserves VMIDs 8-15 for KFD (so radeon can only use VMIDs
 0-7) and also makes radeon think that KV has only a single MEC with a
 single pipe in it

 Signed-off-by: Oded Gabbay 
>>> Reviewed-by: Jérôme Glisse 
>>
>> At least for the VMIDs, on-demand allocation should be trivial to implement,
>> so I would rather prefer this instead of a fixed assignment.
> IIRC, the way the CP hw scheduler works you have to give it a range of
> vmids and it assigns them dynamically as queues are mapped so
> effectively they are potentially in use once the CP scheduler is set
> up.

That's not what I meant. Changing it completely on the fly is nice to 
have, but we should at least make it configurable as a module parameter.

And even if we hardcode it we should use a define for it somewhere 
instead of hardcoding 8 VMIDs on the KGD side and 8 VMIDs on KFD side 
without any relation to each other.

Christian.
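
Christian's suggestion of a shared define could look roughly like this (a
hypothetical header fragment — every name below is illustrative, nothing
here is from the patch set):

```c
/* Hypothetical shared header fragment: both radeon (KGD) and KFD
 * would include this, so the VMID split is defined in one place
 * instead of being hardcoded as "8" on each side independently. */
#define RADEON_TOTAL_NUM_VMID   16
#define RADEON_GFX_NUM_VMID      8                   /* VMIDs 0-7  */
#define KFD_VMID_START           RADEON_GFX_NUM_VMID /* VMIDs 8-15 */
#define KFD_NUM_VMID   (RADEON_TOTAL_NUM_VMID - RADEON_GFX_NUM_VMID)
```

With something like this, both sides derive their ranges from a single
constant, and turning the split into a module parameter later only means
replacing the one define.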

> Alex
>
>
>> Christian.
>>
>>
 ---
drivers/gpu/drm/radeon/cik.c | 48
 ++--
1 file changed, 24 insertions(+), 24 deletions(-)

 diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
 index 4bfc2c0..e0c8052 100644
 --- a/drivers/gpu/drm/radeon/cik.c
 +++ b/drivers/gpu/drm/radeon/cik.c
 @@ -4662,12 +4662,11 @@ static int cik_mec_init(struct radeon_device
 *rdev)
  /*
   * KV:2 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 64 Queues total
   * CI/KB: 1 MEC, 4 Pipes/MEC, 8 Queues/Pipe - 32 Queues total
 +* Nonetheless, we assign only 1 pipe because all other pipes
 will
 +* be handled by KFD
   */
 -   if (rdev->family == CHIP_KAVERI)
 -   rdev->mec.num_mec = 2;
 -   else
 -   rdev->mec.num_mec = 1;
 -   rdev->mec.num_pipe = 4;
 +   rdev->mec.num_mec = 1;
 +   rdev->mec.num_pipe = 1;
  rdev->mec.num_queue = rdev->mec.num_mec * rdev->mec.num_pipe * 8;
  if (rdev->mec.hpd_eop_obj == NULL) {
 @@ -4809,28 +4808,24 @@ static int cik_cp_compute_resume(struct
 radeon_device *rdev)
  /* init the pipes */
  mutex_lock(&rdev->srbm_mutex);
 -   for (i = 0; i < (rdev->mec.num_pipe * rdev->mec.num_mec); i++) {
 -   int me = (i < 4) ? 1 : 2;
 -   int pipe = (i < 4) ? i : (i - 4);
- eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr + (i *
 MEC_HPD_SIZE * 2);
 +   eop_gpu_addr = rdev->mec.hpd_eop_gpu_addr;
- cik_srbm_select(rdev, me, pipe, 0, 0);
 +   cik_srbm_select(rdev, 0, 0, 0, 0);
- /* write the EOP addr */
 -   WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
 -   WREG32(CP_HPD_EOP_BASE_ADDR_HI,
 upper_32_bits(eop_gpu_addr) >> 8);
 +   /* write the EOP addr */
 +   WREG32(CP_HPD_EOP_BASE_ADDR, eop_gpu_addr >> 8);
 +   WREG32(CP_HPD_EOP_BASE_ADDR_HI, upper_32_bits(eop_gpu_addr) >>
 8);
- /* set the VMID assigned */
 -   WREG32(CP_HPD_EOP_VMID, 0);
 +   /* set the VMID assigned */
 +   WREG32(CP_HPD_EOP_VMID, 0);
 +
 +   /* set the EOP size, register value is 2^(EOP_SIZE+1) dwords */
 +   tmp = RREG32(CP_HPD_EOP_CONTROL);
 +   tmp &= ~EOP_SIZE_MASK;
 +   tmp |= order_base_2(MEC_HPD_SIZE / 8);
 +   WREG32(CP_HPD_EOP_CONTROL, tmp);
- /* set the EOP size, register value is 2^(EOP_SIZE+1)
 dwords */
 -   tmp = RREG32(CP_HPD_EOP_CONTROL);
 -   tmp &= ~EOP_SIZE_MASK;
 -   tmp |= order_base_2(MEC_HPD_SIZE / 8);
 -   WREG32(CP_HPD_EOP_CONTROL, tmp);
 -   }
 -   cik_srbm_select(rdev, 0, 0, 0, 0);
  mutex_unlock(&rdev->srbm_mutex);
  /* init the queues.  Just two for now. */
 @@ -5876,8 +5871,13 @@ int cik_ib_parse(struct radeon_device *rdev,
 struct radeon_ib *ib)
 */
int cik_vm_init(struct radeon_device *rdev)
{
 -   /* number of VMs */
 -   rdev->vm_manager.nvm = 16;
 +   /*
 +* number of VMs
 +* VMID 0 is reserved for Graphics
 +* radeon compute will use VMIDs 1-7
 +* KFD will use VMIDs 8-15
 +*/
 +   rdev->vm_manager.nvm = 8;
  /* base offset of vram pages */
  if (rdev->flags & RADEON_IS_IGP) {
  u64 t

[PATCH 00/83] AMD HSA kernel driver

2014-07-12 Thread Christian König
Am 11.07.2014 23:18, schrieb Jerome Glisse:
> On Thu, Jul 10, 2014 at 10:51:29PM +, Gabbay, Oded wrote:
>> On Thu, 2014-07-10 at 18:24 -0400, Jerome Glisse wrote:
>>> On Fri, Jul 11, 2014 at 12:45:27AM +0300, Oded Gabbay wrote:
   This patch set implements a Heterogeneous System Architecture
 (HSA) driver
   for radeon-family GPUs.
>>>   
>>> This is just quick comments on few things. Given size of this, people
>>> will need to have time to review things.
>>>   
   HSA allows different processor types (CPUs, DSPs, GPUs, etc..) to
 share
   system resources more effectively via HW features including
 shared pageable
   memory, userspace-accessible work queues, and platform-level
 atomics. In
   addition to the memory protection mechanisms in GPUVM and
 IOMMUv2, the Sea
   Islands family of GPUs also performs HW-level validation of
 commands passed
   in through the queues (aka rings).
   The code in this patch set is intended to serve both as a sample
 driver for
   other HSA-compatible hardware devices and as a production driver
 for
   radeon-family processors. The code is architected to support
 multiple CPUs
   each with connected GPUs, although the current implementation
 focuses on a
   single Kaveri/Berlin APU, and works alongside the existing radeon
 kernel
   graphics driver (kgd).
   AMD GPUs designed for use with HSA (Sea Islands and up) share
 some hardware
   functionality between HSA compute and regular gfx/compute (memory,
   interrupts, registers), while other functionality has been added
   specifically for HSA compute  (hw scheduler for virtualized
 compute rings).
   All shared hardware is owned by the radeon graphics driver, and
 an interface
   between kfd and kgd allows the kfd to make use of those shared
 resources,
   while HSA-specific functionality is managed directly by kfd by
 submitting
   packets into an HSA-specific command queue (the "HIQ").
   During kfd module initialization a char device node (/dev/kfd) is
 created
   (surviving until module exit), with ioctls for queue creation &
 management,
   and data structures are initialized for managing HSA device
 topology.
   The rest of the initialization is driven by calls from the radeon
 kgd at
   the following points :
   - radeon_init (kfd_init)
   - radeon_exit (kfd_fini)
   - radeon_driver_load_kms (kfd_device_probe, kfd_device_init)
   - radeon_driver_unload_kms (kfd_device_fini)
   During the probe and init processing per-device data structures
 are
   established which connect to the associated graphics kernel
 driver. This
   information is exposed to userspace via sysfs, along with a
 version number
   allowing userspace to determine if a topology change has occurred
 while it
   was reading from sysfs.
   The interface between kfd and kgd also allows the kfd to request
 buffer
   management services from kgd, and allows kgd to route interrupt
 requests to
   kfd code since the interrupt block is shared between regular
   graphics/compute and HSA compute subsystems in the GPU.
   The kfd code works with an open source usermode library
 ("libhsakmt") which
   is in the final stages of IP review and should be published in a
 separate
   repo over the next few days.
   The code operates in one of three modes, selectable via the
 sched_policy
   module parameter :
   - sched_policy=0 uses a hardware scheduler running in the MEC
 block within
   CP, and allows oversubscription (more queues than HW slots)
   - sched_policy=1 also uses HW scheduling but does not allow
   oversubscription, so create_queue requests fail when we run out
 of HW slots
   - sched_policy=2 does not use HW scheduling, so the driver
 manually assigns
   queues to HW slots by programming registers
   The "no HW scheduling" option is for debug & new hardware bringup
 only, so
   has less test coverage than the other options. Default in the
 current code
   is "HW scheduling without oversubscription" since that is where
 we have the
   most test coverage but we expect to change the default to "HW
 scheduling
   with oversubscription" after further testing. This effectively
 removes the
   HW limit on the number of work queues available to applications.
   Programs running on the GPU are associated with an address space
 through the
   VMID field, which is translated to a unique PASID at access time
 via a set
   of 16 VMID-to-PASID mapping registers. The available VMIDs
 (currently 16)
   are partitioned (under control of the radeon kgd) between current
   gfx/compute and HSA compute, with each getting 8 in the current
 code. The
   VMID-to

[Intel-gfx] [Xen-devel] [RFC][PATCH] gpu:drm:i915:intel_detect_pch: back to check devfn instead of check class type

2014-07-12 Thread Daniel Vetter
On Fri, Jul 11, 2014 at 08:30:59PM +, Tian, Kevin wrote:
> > From: Konrad Rzeszutek Wilk [mailto:konrad.wilk at oracle.com]
> > Sent: Friday, July 11, 2014 12:42 PM
> > 
> > On Fri, Jul 11, 2014 at 08:29:56AM +0200, Daniel Vetter wrote:
> > > On Thu, Jul 10, 2014 at 09:08:24PM +, Tian, Kevin wrote:
> > > > actually I'm curious whether it's still necessary to __detect__ PCH. 
> > > > Could
> > > > we assume a 1:1 mapping between GPU and PCH, e.g. BDW already hard
> > > > code the knowledge:
> > > >
> > > >   } else if (IS_BROADWELL(dev)) {
> > > >   dev_priv->pch_type = PCH_LPT;
> > > >   dev_priv->pch_id =
> > > >
> > INTEL_PCH_LPT_LP_DEVICE_ID_TYPE;
> > > >   DRM_DEBUG_KMS("This is Broadwell,
> > assuming "
> > > > "LynxPoint LP PCH\n");
> > > >
> > > > Or if there is real usage on non-fixed mapping (not majority), could it 
> > > > be a
> > > > better option to have fixed mapping as a fallback instead of leaving as
> > > > PCH_NONE? Then even when Qemu doesn't provide a special tweaked
> > PCH,
> > > > the majority case just works.
> > >
> > > I guess we can do it, at least I haven't seen any strange combinations in
> > > the wild outside of Intel ...
> > 
> > How big is the QA matrix for this? Would it make sense to just
> > include the latest hardware (say going two generations back)
> > and ignore the older one?
> 
> suppose minimal or no QA effort on bare metal, if we only conservatively 
> change the fallback path which is today not supposed to function with 
> PCH_NONE. so it's only same amount of QA effort as whatever else is 
> proposed in this passthru upstreaming task. I agree no need to cover 
> older model, possibly just snb, ivb and hsw, but will leave Tiejun to answer 
> the overall goal.

Yeah, I'd be ok with the approach of using defaults if we can't recognize
the pch - if anyone screams we can either quirk or figure something else
out.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
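
The fallback Daniel agrees to could be sketched like so (a userspace model,
not the actual intel_detect_pch(); the device IDs are examples and the
generation-to-PCH pairing used in the default branch is an assumption):

```c
enum pch_type { PCH_NONE, PCH_CPT, PCH_LPT };

/* Illustrative model of the proposed fallback: try to recognize the
 * PCH from the ISA bridge device ID; if nothing matches (e.g. a
 * virtualized platform exposing an unknown PCH), assume the PCH that
 * normally pairs with the GPU generation instead of leaving PCH_NONE. */
static enum pch_type detect_pch(unsigned short isa_dev_id, int is_hsw_or_bdw)
{
        switch (isa_dev_id) {
        case 0x1c00:                    /* example CougarPoint ID */
                return PCH_CPT;
        case 0x8c00:                    /* example LynxPoint ID */
                return PCH_LPT;
        default:
                /* unrecognized PCH: fall back to the usual pairing */
                return is_hsw_or_bdw ? PCH_LPT : PCH_CPT;
        }
}
```

The point of the default branch is exactly what the thread discusses: the
majority case keeps working even when QEMU does not emulate a recognized
PCH, and odd combinations can still be quirked later if anyone screams.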


[Bug 79074] PRIME with compositing rednering hangs and other rendering issues

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=79074

Christoph Haag  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Christoph Haag  ---
I think this is not a problem anymore with DRI3 offloading and the other
improvements in that commit.

http://cgit.freedesktop.org/mesa/mesa/commit/?id=9320c8fea947fd0f6eb723c67f0bdb947e45c4c3



[PATCH 00/83] AMD HSA kernel driver

2014-07-12 Thread Daniel Vetter
On Sat, Jul 12, 2014 at 11:24:49AM +0200, Christian König wrote:
> Am 11.07.2014 23:18, schrieb Jerome Glisse:
> >On Thu, Jul 10, 2014 at 10:51:29PM +, Gabbay, Oded wrote:
> >>On Thu, 2014-07-10 at 18:24 -0400, Jerome Glisse wrote:
> >>>On Fri, Jul 11, 2014 at 12:45:27AM +0300, Oded Gabbay wrote:
>   This patch set implements a Heterogeneous System Architecture
> (HSA) driver
>   for radeon-family GPUs.
> >>>This is just quick comments on few things. Given size of this, people
> >>>will need to have time to review things.
>   HSA allows different processor types (CPUs, DSPs, GPUs, etc..) to
> share
>   system resources more effectively via HW features including
> shared pageable
>   memory, userspace-accessible work queues, and platform-level
> atomics. In
>   addition to the memory protection mechanisms in GPUVM and
> IOMMUv2, the Sea
>   Islands family of GPUs also performs HW-level validation of
> commands passed
>   in through the queues (aka rings).
>   The code in this patch set is intended to serve both as a sample
> driver for
>   other HSA-compatible hardware devices and as a production driver
> for
>   radeon-family processors. The code is architected to support
> multiple CPUs
>   each with connected GPUs, although the current implementation
> focuses on a
>   single Kaveri/Berlin APU, and works alongside the existing radeon
> kernel
>   graphics driver (kgd).
>   AMD GPUs designed for use with HSA (Sea Islands and up) share
> some hardware
>   functionality between HSA compute and regular gfx/compute (memory,
>   interrupts, registers), while other functionality has been added
>   specifically for HSA compute  (hw scheduler for virtualized
> compute rings).
>   All shared hardware is owned by the radeon graphics driver, and
> an interface
>   between kfd and kgd allows the kfd to make use of those shared
> resources,
>   while HSA-specific functionality is managed directly by kfd by
> submitting
>   packets into an HSA-specific command queue (the "HIQ").
>   During kfd module initialization a char device node (/dev/kfd) is
> created
>   (surviving until module exit), with ioctls for queue creation &
> management,
>   and data structures are initialized for managing HSA device
> topology.
>   The rest of the initialization is driven by calls from the radeon
> kgd at
>   the following points :
>   - radeon_init (kfd_init)
>   - radeon_exit (kfd_fini)
>   - radeon_driver_load_kms (kfd_device_probe, kfd_device_init)
>   - radeon_driver_unload_kms (kfd_device_fini)
>   During the probe and init processing per-device data structures
> are
>   established which connect to the associated graphics kernel
> driver. This
>   information is exposed to userspace via sysfs, along with a
> version number
>   allowing userspace to determine if a topology change has occurred
> while it
>   was reading from sysfs.
>   The interface between kfd and kgd also allows the kfd to request
> buffer
>   management services from kgd, and allows kgd to route interrupt
> requests to
>   kfd code since the interrupt block is shared between regular
>   graphics/compute and HSA compute subsystems in the GPU.
>   The kfd code works with an open source usermode library
> ("libhsakmt") which
>   is in the final stages of IP review and should be published in a
> separate
>   repo over the next few days.
>   The code operates in one of three modes, selectable via the
> sched_policy
>   module parameter :
>   - sched_policy=0 uses a hardware scheduler running in the MEC
> block within
>   CP, and allows oversubscription (more queues than HW slots)
>   - sched_policy=1 also uses HW scheduling but does not allow
>   oversubscription, so create_queue requests fail when we run out
> of HW slots
>   - sched_policy=2 does not use HW scheduling, so the driver
> manually assigns
>   queues to HW slots by programming registers
>   The "no HW scheduling" option is for debug & new hardware bringup
> only, so
>   has less test coverage than the other options. Default in the
> current code
>   is "HW scheduling without oversubscription" since that is where
> we have the
>   most test coverage but we expect to change the default to "HW
> scheduling
>   with oversubscription" after further testing. This effectively
> removes the
>   HW limit on the number of work queues available to applications.
>   Programs running on the GPU are associated with an address space
> through the
>   VMID field, which is translated to a unique PASID at access time
> via a set
>   of 16 VMID-to-PASID mapping registers. The available VMIDs
> (currently 16)
>   are par

[PATCH v2 1/2] drm: rework flip-work helpers to avoid calling func when the FIFO is full

2014-07-12 Thread Rob Clark
On Sat, Jul 12, 2014 at 3:00 AM, Boris BREZILLON
 wrote:
> Use lists instead of a kfifo so that task entries can be allocated
> dynamically when someone requires some delayed work, thus preventing
> drm_flip_work_queue from directly calling func instead of queuing the
> call.
> This allows drm_flip_work_queue to be safely called even within irq
> handlers.
>
> Add new helper functions to allocate a flip work task and queue it when
> needed. This prevents allocating data within irq context (which might
> impact the time spent in the irq handler).
>
> Signed-off-by: Boris BREZILLON 
> ---
>  drivers/gpu/drm/drm_flip_work.c | 96 
> ++---
>  include/drm/drm_flip_work.h | 29 +
>  2 files changed, 93 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_flip_work.c b/drivers/gpu/drm/drm_flip_work.c
> index f9c7fa3..7441aa8 100644
> --- a/drivers/gpu/drm/drm_flip_work.c
> +++ b/drivers/gpu/drm/drm_flip_work.c
> @@ -25,6 +25,44 @@
>  #include "drm_flip_work.h"
>
>  /**
> + * drm_flip_work_allocate_task - allocate a flip-work task
> + * @data: data associated to the task
> + * @flags: allocator flags
> + *
> + * Allocate a drm_flip_task object and attach private data to it.
> + */
> +struct drm_flip_task *drm_flip_work_allocate_task(void *data, gfp_t flags)
> +{
> +   struct drm_flip_task *task;
> +
> +   task = kzalloc(sizeof(*task), flags);
> +   if (task)
> +   task->data = data;
> +
> +   return task;
> +}
> +EXPORT_SYMBOL(drm_flip_work_allocate_task);
> +
> +/**
> + * drm_flip_work_queue_task - queue a specific task
> + * @work: the flip-work
> + * @task: the task to handle
> + *
> + * Queues task, that will later be run (passed back to drm_flip_func_t
> + * func) on a work queue after drm_flip_work_commit() is called.
> + */
> +void drm_flip_work_queue_task(struct drm_flip_work *work,
> + struct drm_flip_task *task)
> +{
> +   unsigned long flags;
> +
> +   spin_lock_irqsave(&work->lock, flags);
> +   list_add_tail(&task->node, &work->queued);
> +   spin_unlock_irqrestore(&work->lock, flags);
> +}
> +EXPORT_SYMBOL(drm_flip_work_queue_task);
> +
> +/**
>   * drm_flip_work_queue - queue work
>   * @work: the flip-work
>   * @val: the value to queue
> @@ -34,10 +72,14 @@
>   */
>  void drm_flip_work_queue(struct drm_flip_work *work, void *val)
>  {
> -   if (kfifo_put(&work->fifo, val)) {
> -   atomic_inc(&work->pending);
> +   struct drm_flip_task *task;
> +
> +   task = drm_flip_work_allocate_task(val,
> +   drm_can_sleep() ? GFP_KERNEL : GFP_ATOMIC);
> +   if (task) {
> +   drm_flip_work_queue_task(work, task);
> } else {
> -   DRM_ERROR("%s fifo full!\n", work->name);
> +   DRM_ERROR("%s could not allocate task!\n", work->name);
> work->func(work, val);
> }
>  }
> @@ -56,9 +98,12 @@ EXPORT_SYMBOL(drm_flip_work_queue);
>  void drm_flip_work_commit(struct drm_flip_work *work,
> struct workqueue_struct *wq)
>  {
> -   uint32_t pending = atomic_read(&work->pending);
> -   atomic_add(pending, &work->count);
> -   atomic_sub(pending, &work->pending);
> +   unsigned long flags;
> +
> +   spin_lock_irqsave(&work->lock, flags);
> +   list_splice_tail(&work->queued, &work->commited);
> +   INIT_LIST_HEAD(&work->queued);
> +   spin_unlock_irqrestore(&work->lock, flags);
> queue_work(wq, &work->worker);
>  }
>  EXPORT_SYMBOL(drm_flip_work_commit);
> @@ -66,14 +111,26 @@ EXPORT_SYMBOL(drm_flip_work_commit);
>  static void flip_worker(struct work_struct *w)
>  {
> struct drm_flip_work *work = container_of(w, struct drm_flip_work, 
> worker);
> -   uint32_t count = atomic_read(&work->count);
> -   void *val = NULL;
> +   struct list_head tasks;
> +   unsigned long flags;
>
> -   atomic_sub(count, &work->count);
> +   while (1) {
> +   struct drm_flip_task *task, *tmp;
>
> -   while(count--)
> -   if (!WARN_ON(!kfifo_get(&work->fifo, &val)))
> -   work->func(work, val);
> +   INIT_LIST_HEAD(&tasks);
> +   spin_lock_irqsave(&work->lock, flags);
> +   list_splice_tail(&work->commited, &tasks);
> +   INIT_LIST_HEAD(&work->commited);
> +   spin_unlock_irqrestore(&work->lock, flags);
> +
> +   if (list_empty(&tasks))
> +   break;
> +
> +   list_for_each_entry_safe(task, tmp, &tasks, node) {
> +   work->func(work, task->data);
> +   kfree(task);
> +   }
> +   }
>  }
>
>  /**
> @@ -91,19 +148,11 @@ static void flip_worker(struct work_struct *w)
>  int drm_flip_work_init(struct drm_flip_work *work, int size,
> const char *name, drm_flip_func_t func)
>  {
> -   i

[Bug 81255] EDEADLK with S. Islands APU+dGPU during ib test on ring 5

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=81255

--- Comment #6 from joshua.r.marshall.1991 at gmail.com ---
That is not a BIOS option.



[Bug 73053] dpm hangs with BTC parts

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=73053

--- Comment #41 from almos  ---
(In reply to comment #40)
> (In reply to comment #38)
> > The new problem is that I get kernel panic after a few hours if dpm is
> > enabled. With the good old profile method the system is stable.
> 
> Can you get a copy of the panic?  I think it may be related to the page
> flipping changes in the last couple kernels.  It's not likely dpm would
> cause a panic.

It seems I spoke too soon. 3.15.4 panics even without dpm.



[Bug 73053] dpm hangs with BTC parts

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=73053

--- Comment #42 from almos  ---
Created attachment 102669
  --> https://bugs.freedesktop.org/attachment.cgi?id=102669&action=edit
image of panic.jpg



[Bug 73911] Color Banding on radeon

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=73911

tomimaki  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #30 from tomimaki  ---
Well, I think we can close it as fix is now in stable and longterm kernels
(except 3.12). :)



[Bug 81255] EDEADLK with S. Islands APU+dGPU during ib test on ring 5

2014-07-12 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=81255

joshua.r.marshall.1991 at gmail.com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |WORKSFORME

--- Comment #7 from joshua.r.marshall.1991 at gmail.com ---
Re-read the manual.  There is an option...albeit described as something else.



[PATCH v3 1/2] drm: rework flip-work helpers to avoid calling func when the FIFO is full

2014-07-12 Thread Boris BREZILLON
Use lists instead of a kfifo so that task entries can be allocated
dynamically when someone requires some delayed work, thus preventing
drm_flip_work_queue from directly calling func instead of queuing the
call.
This allows drm_flip_work_queue to be safely called even within irq
handlers.

Add new helper functions to allocate a flip work task and queue it when
needed. This prevents allocating data within irq context (which might
impact the time spent in the irq handler).

Signed-off-by: Boris BREZILLON 
Reviewed-by: Rob Clark 
---
 drivers/gpu/drm/drm_flip_work.c | 97 +++--
 include/drm/drm_flip_work.h | 31 +
 2 files changed, 96 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/drm_flip_work.c b/drivers/gpu/drm/drm_flip_work.c
index f9c7fa3..6f4ae5b 100644
--- a/drivers/gpu/drm/drm_flip_work.c
+++ b/drivers/gpu/drm/drm_flip_work.c
@@ -25,6 +25,44 @@
 #include "drm_flip_work.h"

 /**
+ * drm_flip_work_allocate_task - allocate a flip-work task
+ * @data: data associated to the task
+ * @flags: allocator flags
+ *
+ * Allocate a drm_flip_task object and attach private data to it.
+ */
+struct drm_flip_task *drm_flip_work_allocate_task(void *data, gfp_t flags)
+{
+   struct drm_flip_task *task;
+
+   task = kzalloc(sizeof(*task), flags);
+   if (task)
+   task->data = data;
+
+   return task;
+}
+EXPORT_SYMBOL(drm_flip_work_allocate_task);
+
+/**
+ * drm_flip_work_queue_task - queue a specific task
+ * @work: the flip-work
+ * @task: the task to handle
+ *
+ * Queues task, that will later be run (passed back to drm_flip_func_t
+ * func) on a work queue after drm_flip_work_commit() is called.
+ */
+void drm_flip_work_queue_task(struct drm_flip_work *work,
+ struct drm_flip_task *task)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&work->lock, flags);
+   list_add_tail(&task->node, &work->queued);
+   spin_unlock_irqrestore(&work->lock, flags);
+}
+EXPORT_SYMBOL(drm_flip_work_queue_task);
+
+/**
  * drm_flip_work_queue - queue work
  * @work: the flip-work
  * @val: the value to queue
@@ -34,10 +72,14 @@
  */
 void drm_flip_work_queue(struct drm_flip_work *work, void *val)
 {
-   if (kfifo_put(&work->fifo, val)) {
-   atomic_inc(&work->pending);
+   struct drm_flip_task *task;
+
+   task = drm_flip_work_allocate_task(val,
+   drm_can_sleep() ? GFP_KERNEL : GFP_ATOMIC);
+   if (task) {
+   drm_flip_work_queue_task(work, task);
} else {
-   DRM_ERROR("%s fifo full!\n", work->name);
+   DRM_ERROR("%s could not allocate task!\n", work->name);
work->func(work, val);
}
 }
@@ -56,9 +98,12 @@ EXPORT_SYMBOL(drm_flip_work_queue);
 void drm_flip_work_commit(struct drm_flip_work *work,
struct workqueue_struct *wq)
 {
-   uint32_t pending = atomic_read(&work->pending);
-   atomic_add(pending, &work->count);
-   atomic_sub(pending, &work->pending);
+   unsigned long flags;
+
+   spin_lock_irqsave(&work->lock, flags);
+   list_splice_tail(&work->queued, &work->commited);
+   INIT_LIST_HEAD(&work->queued);
+   spin_unlock_irqrestore(&work->lock, flags);
queue_work(wq, &work->worker);
 }
 EXPORT_SYMBOL(drm_flip_work_commit);
@@ -66,14 +111,26 @@ EXPORT_SYMBOL(drm_flip_work_commit);
 static void flip_worker(struct work_struct *w)
 {
struct drm_flip_work *work = container_of(w, struct drm_flip_work, 
worker);
-   uint32_t count = atomic_read(&work->count);
-   void *val = NULL;
+   struct list_head tasks;
+   unsigned long flags;

-   atomic_sub(count, &work->count);
+   while (1) {
+   struct drm_flip_task *task, *tmp;

-   while(count--)
-   if (!WARN_ON(!kfifo_get(&work->fifo, &val)))
-   work->func(work, val);
+   INIT_LIST_HEAD(&tasks);
+   spin_lock_irqsave(&work->lock, flags);
+   list_splice_tail(&work->commited, &tasks);
+   INIT_LIST_HEAD(&work->commited);
+   spin_unlock_irqrestore(&work->lock, flags);
+
+   if (list_empty(&tasks))
+   break;
+
+   list_for_each_entry_safe(task, tmp, &tasks, node) {
+   work->func(work, task->data);
+   kfree(task);
+   }
+   }
 }

 /**
@@ -91,19 +148,12 @@ static void flip_worker(struct work_struct *w)
 int drm_flip_work_init(struct drm_flip_work *work, int size,
const char *name, drm_flip_func_t func)
 {
-   int ret;
-
work->name = name;
-   atomic_set(&work->count, 0);
-   atomic_set(&work->pending, 0);
+   INIT_LIST_HEAD(&work->queued);
+   INIT_LIST_HEAD(&work->commited);
+   spin_lock_init(&work->lock);
work->func = func;

-   ret = kfifo_alloc(&work->fifo, size, GFP_

[PATCH v3 0/2] drm: rework flip-work framework

2014-07-12 Thread Boris BREZILLON
Hello,

This patch series reworks the flip-work framework to make it safe when
calling drm_flip_work_queue from atomic contexts.

The 2nd patch of this series is optional, as it only reworks the
drm_flip_work_init prototype to remove the unneeded size argument and
return code (this function cannot fail anymore).

Best Regards,

Boris

Changes since v2:
- add missing spin_lock_init
- fix flip utils description

Changes since v1:
- add gfp flags argument to drm_flip_work_allocate_task function
- make drm_flip_work_queue safe when called from atomic context

Boris BREZILLON (2):
  drm: rework flip-work helpers to avoid calling func when the FIFO is
full
  drm: flip-work: change drm_flip_work_init prototype

 drivers/gpu/drm/drm_flip_work.c  | 105 ++-
 drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c |  19 ++
 drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c |  16 +
 drivers/gpu/drm/omapdrm/omap_plane.c |  14 +
 drivers/gpu/drm/tilcdc/tilcdc_crtc.c |   6 +-
 include/drm/drm_flip_work.h  |  33 +++---
 6 files changed, 108 insertions(+), 85 deletions(-)

-- 
1.8.3.2
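
As a rough illustration of the design this series switches to, here is a
userspace sketch of the queued/committed two-list pattern (plain C with a
hand-rolled singly-linked list; this models the idea only, not the kernel
helpers themselves — names like run_demo are invented for the sketch):

```c
#include <stdlib.h>

/* Userspace model of the reworked flip-work scheme: tasks first go on
 * a "queued" list, are moved wholesale to a "commited" list by commit
 * (the kernel does this under a spinlock with list_splice_tail), and
 * the worker drains only committed tasks.  Ordering is simplified to
 * LIFO here for brevity; the kernel version preserves FIFO order. */
struct task { void *data; struct task *next; };

struct flip_work {
        struct task *queued;
        struct task *commited;
        void (*func)(void *data);
};

static void queue_task(struct flip_work *w, void *data)
{
        struct task *t = calloc(1, sizeof(*t));
        t->data = data;
        t->next = w->queued;
        w->queued = t;
}

static void commit_tasks(struct flip_work *w)
{
        while (w->queued) {             /* splice queued -> commited */
                struct task *t = w->queued;
                w->queued = t->next;
                t->next = w->commited;
                w->commited = t;
        }
}

static int nruns;
static void count_cb(void *data) { (void)data; nruns++; }

static void worker(struct flip_work *w)
{
        while (w->commited) {           /* drain committed tasks only */
                struct task *t = w->commited;
                w->commited = t->next;
                w->func(t->data);
                free(t);
        }
}

static int run_demo(int n)
{
        struct flip_work w = { 0, 0, count_cb };
        int i;

        nruns = 0;
        for (i = 0; i < n; i++)
                queue_task(&w, 0);
        worker(&w);                     /* nothing committed yet: no-op */
        commit_tasks(&w);
        worker(&w);                     /* now all n tasks run */
        return nruns;
}
```

Splitting queue from commit is what lets drm_flip_work_queue run safely
from an irq handler: queuing only links a preallocated task into a list
under the lock, while the actual draining happens later on the workqueue
after commit splices the list heads over.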



[PATCH v3 2/2] drm: flip-work: change drm_flip_work_init prototype

2014-07-12 Thread Boris BREZILLON
Now that we're using lists instead of a kfifo to store drm flip-work tasks,
we no longer need the size parameter passed to the drm_flip_work_init
function.
Moreover, since this function can no longer fail, we can remove the return
code.

Modify drm_flip_work_init users to take account of these changes.

Signed-off-by: Boris BREZILLON 
Reviewed-by: Rob Clark 
---
 drivers/gpu/drm/drm_flip_work.c  |  8 +---
 drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c | 19 ---
 drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c | 16 +++-
 drivers/gpu/drm/omapdrm/omap_plane.c | 14 ++
 drivers/gpu/drm/tilcdc/tilcdc_crtc.c |  6 +-
 include/drm/drm_flip_work.h  |  2 +-
 6 files changed, 12 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/drm_flip_work.c b/drivers/gpu/drm/drm_flip_work.c
index 6f4ae5b..43d9b95 100644
--- a/drivers/gpu/drm/drm_flip_work.c
+++ b/drivers/gpu/drm/drm_flip_work.c
@@ -136,16 +136,12 @@ static void flip_worker(struct work_struct *w)
 /**
  * drm_flip_work_init - initialize flip-work
  * @work: the flip-work to initialize
- * @size: the max queue depth
  * @name: debug name
  * @func: the callback work function
  *
  * Initializes/allocates resources for the flip-work
- *
- * RETURNS:
- * Zero on success, error code on failure.
  */
-int drm_flip_work_init(struct drm_flip_work *work, int size,
+void drm_flip_work_init(struct drm_flip_work *work,
const char *name, drm_flip_func_t func)
 {
work->name = name;
@@ -155,8 +151,6 @@ int drm_flip_work_init(struct drm_flip_work *work, int size,
work->func = func;

INIT_WORK(&work->worker, flip_worker);
-
-   return 0;
 }
 EXPORT_SYMBOL(drm_flip_work_init);

diff --git a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c 
b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c
index 74cebb5..44d4f93 100644
--- a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c
+++ b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c
@@ -755,10 +755,8 @@ struct drm_crtc *mdp4_crtc_init(struct drm_device *dev,
int ret;

mdp4_crtc = kzalloc(sizeof(*mdp4_crtc), GFP_KERNEL);
-   if (!mdp4_crtc) {
-   ret = -ENOMEM;
-   goto fail;
-   }
+   if (!mdp4_crtc)
+   return ERR_PTR(-ENOMEM);

crtc = &mdp4_crtc->base;

@@ -779,12 +777,9 @@ struct drm_crtc *mdp4_crtc_init(struct drm_device *dev,

spin_lock_init(&mdp4_crtc->cursor.lock);

-   ret = drm_flip_work_init(&mdp4_crtc->unref_fb_work, 16,
+   drm_flip_work_init(&mdp4_crtc->unref_fb_work,
"unref fb", unref_fb_worker);
-   if (ret)
-   goto fail;
-
-   ret = drm_flip_work_init(&mdp4_crtc->unref_cursor_work, 64,
+   drm_flip_work_init(&mdp4_crtc->unref_cursor_work,
"unref cursor", unref_cursor_worker);

INIT_FENCE_CB(&mdp4_crtc->pageflip_cb, pageflip_cb);
@@ -795,10 +790,4 @@ struct drm_crtc *mdp4_crtc_init(struct drm_device *dev,
mdp4_plane_install_properties(mdp4_crtc->plane, &crtc->base);

return crtc;
-
-fail:
-   if (crtc)
-   mdp4_crtc_destroy(crtc);
-
-   return ERR_PTR(ret);
 }
diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c 
b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c
index ebe2e60..a0cb374 100644
--- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c
+++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c
@@ -537,10 +537,8 @@ struct drm_crtc *mdp5_crtc_init(struct drm_device *dev,
int ret;

mdp5_crtc = kzalloc(sizeof(*mdp5_crtc), GFP_KERNEL);
-   if (!mdp5_crtc) {
-   ret = -ENOMEM;
-   goto fail;
-   }
+   if (!mdp5_crtc)
+   return ERR_PTR(-ENOMEM);

crtc = &mdp5_crtc->base;

@@ -553,10 +551,8 @@ struct drm_crtc *mdp5_crtc_init(struct drm_device *dev,
snprintf(mdp5_crtc->name, sizeof(mdp5_crtc->name), "%s:%d",
pipe2name(mdp5_plane_pipe(plane)), id);

-   ret = drm_flip_work_init(&mdp5_crtc->unref_fb_work, 16,
+   drm_flip_work_init(&mdp5_crtc->unref_fb_work,
"unref fb", unref_fb_worker);
-   if (ret)
-   goto fail;

INIT_FENCE_CB(&mdp5_crtc->pageflip_cb, pageflip_cb);

@@ -566,10 +562,4 @@ struct drm_crtc *mdp5_crtc_init(struct drm_device *dev,
mdp5_plane_install_properties(mdp5_crtc->plane, &crtc->base);

return crtc;
-
-fail:
-   if (crtc)
-   mdp5_crtc_destroy(crtc);
-
-   return ERR_PTR(ret);
 }
diff --git a/drivers/gpu/drm/omapdrm/omap_plane.c 
b/drivers/gpu/drm/omapdrm/omap_plane.c
index 3cf31ee..847d1ca 100644
--- a/drivers/gpu/drm/omapdrm/omap_plane.c
+++ b/drivers/gpu/drm/omapdrm/omap_plane.c
@@ -397,14 +397,10 @@ struct drm_plane *omap_plane_init(struct drm_device *dev,

omap_plane = kzalloc(sizeof(*omap_plane), GFP_KERNEL);
if (!omap_plane)
-   goto fail;
+   return NULL;

-   ret = drm_flip_work_init(&omap_plane->unpin_

[PATCH] drm/nouveau/disp/dp: drop dead code

2014-07-12 Thread Brian Norris
Since this commit:

  commit 55f083c33feb7231c7574a64cd01b0477715a370
  Author: Ben Skeggs 
  Date:   Tue May 20 10:18:03 2014 +1000

  drm/nouveau/disp/dp: maintain link in response to hpd signal

a few bits of code have been dead.

This was noticed by Coverity Scan.

Signed-off-by: Brian Norris 
Cc: Ben Skeggs 
---
Compile tested only

 drivers/gpu/drm/nouveau/core/engine/disp/dport.c |   10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/core/engine/disp/dport.c 
b/drivers/gpu/drm/nouveau/core/engine/disp/dport.c
index 5a5b59b21130..0f6fbe020c41 100644
--- a/drivers/gpu/drm/nouveau/core/engine/disp/dport.c
+++ b/drivers/gpu/drm/nouveau/core/engine/disp/dport.c
@@ -331,7 +331,6 @@ nouveau_dp_train(struct work_struct *w)
struct dp_state _dp = {
.outp = outp,
}, *dp = &_dp;
-   u32 datarate = 0;
int ret;

/* bring capabilities within encoder limits */
@@ -345,20 +344,13 @@ nouveau_dp_train(struct work_struct *w)
outp->dpcd[1] = outp->base.info.dpconf.link_bw;
dp->pc2 = outp->dpcd[2] & DPCD_RC02_TPS3_SUPPORTED;

-   /* restrict link config to the lowest required rate, if requested */
-   if (datarate) {
-   datarate = (datarate / 8) * 10; /* 8B/10B coding overhead */
-   while (cfg[1].rate >= datarate)
-   cfg++;
-   }
-   cfg--;
-
/* disable link interrupt handling during link training */
nouveau_event_put(outp->irq);

/* enable down-spreading and execute pre-train script from vbios */
dp_link_train_init(dp, outp->dpcd[3] & 0x01);

+   cfg--;
while (ret = -EIO, (++cfg)->rate) {
/* select next configuration supported by encoder and sink */
while (cfg->nr > (outp->dpcd[2] & DPCD_RC02_MAX_LANE_COUNT) ||
-- 
1.7.9.5



[PATCH] drm: omapdrm: fix compiler errors

2014-07-12 Thread Russell King
Regular randconfig nightly testing has detected problems with omapdrm.

omapdrm fails to build when the kernel is built to support 64-bit DMA
addresses and/or 64-bit physical addresses due to an assumption about
the width of these types.

Use %pad to print DMA addresses, rather than %x or %Zx (which is even
more wrong than %x).  Avoid passing a uint32_t pointer into a function
which expects a dma_addr_t pointer.

drivers/gpu/drm/omapdrm/omap_plane.c: In function 'omap_plane_pre_apply':
drivers/gpu/drm/omapdrm/omap_plane.c:145:2: error: format '%x' expects argument 
of type 'unsigned int', but argument 5 has type 'dma_addr_t' [-Werror=format]
drivers/gpu/drm/omapdrm/omap_plane.c:145:2: error: format '%x' expects argument 
of type 'unsigned int', but argument 6 has type 'dma_addr_t' [-Werror=format]
make[5]: *** [drivers/gpu/drm/omapdrm/omap_plane.o] Error 1
drivers/gpu/drm/omapdrm/omap_gem.c: In function 'omap_gem_get_paddr':
drivers/gpu/drm/omapdrm/omap_gem.c:794:4: error: format '%x' expects argument 
of type 'unsigned int', but argument 3 has type 'dma_addr_t' [-Werror=format]
drivers/gpu/drm/omapdrm/omap_gem.c: In function 'omap_gem_describe':
drivers/gpu/drm/omapdrm/omap_gem.c:991:4: error: format '%Zx' expects argument 
of type 'size_t', but argument 7 has type 'dma_addr_t' [-Werror=format]
drivers/gpu/drm/omapdrm/omap_gem.c: In function 'omap_gem_init':
drivers/gpu/drm/omapdrm/omap_gem.c:1470:4: error: format '%x' expects argument 
of type 'unsigned int', but argument 7 has type 'dma_addr_t' [-Werror=format]
make[5]: *** [drivers/gpu/drm/omapdrm/omap_gem.o] Error 1
drivers/gpu/drm/omapdrm/omap_dmm_tiler.c: In function 'dmm_txn_append':
drivers/gpu/drm/omapdrm/omap_dmm_tiler.c:226:2: error: passing argument 3 of 
'alloc_dma' from incompatible pointer type [-Werror]
make[5]: *** [drivers/gpu/drm/omapdrm/omap_dmm_tiler.o] Error 1
make[5]: Target `__build' not remade because of errors.
make[4]: *** [drivers/gpu/drm/omapdrm] Error 2

Signed-off-by: Russell King 
---
 drivers/gpu/drm/omapdrm/omap_dmm_tiler.c |  6 --
 drivers/gpu/drm/omapdrm/omap_gem.c   | 10 +-
 drivers/gpu/drm/omapdrm/omap_plane.c |  4 ++--
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c 
b/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c
index f926b4caf449..56c60552abba 100644
--- a/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c
+++ b/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c
@@ -199,7 +199,7 @@ static struct dmm_txn *dmm_txn_init(struct dmm *dmm, struct 
tcm *tcm)
 static void dmm_txn_append(struct dmm_txn *txn, struct pat_area *area,
struct page **pages, uint32_t npages, uint32_t roll)
 {
-   dma_addr_t pat_pa = 0;
+   dma_addr_t pat_pa = 0, data_pa = 0;
uint32_t *data;
struct pat *pat;
struct refill_engine *engine = txn->engine_handle;
@@ -223,7 +223,9 @@ static void dmm_txn_append(struct dmm_txn *txn, struct 
pat_area *area,
.lut_id = engine->tcm->lut_id,
};

-   data = alloc_dma(txn, 4*i, &pat->data_pa);
+   data = alloc_dma(txn, 4*i, &data_pa);
+   /* FIXME: what if data_pa is more than 32-bit ? */
+   pat->data_pa = data_pa;

while (i--) {
int n = i + roll;
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c 
b/drivers/gpu/drm/omapdrm/omap_gem.c
index 95dbce286a41..d9f5e5241af4 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem.c
@@ -791,7 +791,7 @@ int omap_gem_get_paddr(struct drm_gem_object *obj,
omap_obj->paddr = tiler_ssptr(block);
omap_obj->block = block;

-   DBG("got paddr: %08x", omap_obj->paddr);
+   DBG("got paddr: %pad", &omap_obj->paddr);
}

omap_obj->paddr_cnt++;
@@ -985,9 +985,9 @@ void omap_gem_describe(struct drm_gem_object *obj, struct 
seq_file *m)

off = drm_vma_node_start(&obj->vma_node);

-   seq_printf(m, "%08x: %2d (%2d) %08llx %08Zx (%2d) %p %4d",
+   seq_printf(m, "%08x: %2d (%2d) %08llx %pad (%2d) %p %4d",
omap_obj->flags, obj->name, 
obj->refcount.refcount.counter,
-   off, omap_obj->paddr, omap_obj->paddr_cnt,
+   off, &omap_obj->paddr, omap_obj->paddr_cnt,
omap_obj->vaddr, omap_obj->roll);

if (omap_obj->flags & OMAP_BO_TILED) {
@@ -1467,8 +1467,8 @@ void omap_gem_init(struct drm_device *dev)
entry->paddr = tiler_ssptr(block);
entry->block = block;

-   DBG("%d:%d: %dx%d: paddr=%08x stride=%d", i, j, w, h,
-   entry->paddr,
+   DBG("%d:%d: %dx%d: paddr=%pad stride=%d", i, j, w, h,
+   &entry->paddr,
usergart[i].stride_pfn << PAGE_SHIFT);
}
 

[PATCH] drm: bochs: fix warnings

2014-07-12 Thread Russell King
Regular nightly randconfig build testing discovered these warnings:

drivers/gpu/drm/bochs/bochs_drv.c:100:12: warning: 'bochs_pm_suspend' defined 
but not used [-Wunused-function]
drivers/gpu/drm/bochs/bochs_drv.c:117:12: warning: 'bochs_pm_resume' defined 
but not used [-Wunused-function]

Fix these by adding the same condition that SET_SYSTEM_SLEEP_PM_OPS()
uses.

Signed-off-by: Russell King 
---
There is no MAINTAINERS entry for this driver, so I don't know who this
should be sent to.

 drivers/gpu/drm/bochs/bochs_drv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/bochs/bochs_drv.c 
b/drivers/gpu/drm/bochs/bochs_drv.c
index 9c13df29fd20..f5e0ead974a6 100644
--- a/drivers/gpu/drm/bochs/bochs_drv.c
+++ b/drivers/gpu/drm/bochs/bochs_drv.c
@@ -97,6 +97,7 @@ static struct drm_driver bochs_driver = {
 /* -- */
 /* pm interface   */

+#ifdef CONFIG_PM_SLEEP
 static int bochs_pm_suspend(struct device *dev)
 {
struct pci_dev *pdev = to_pci_dev(dev);
@@ -131,6 +132,7 @@ static int bochs_pm_resume(struct device *dev)
drm_kms_helper_poll_enable(drm_dev);
return 0;
 }
+#endif

 static const struct dev_pm_ops bochs_pm_ops = {
SET_SYSTEM_SLEEP_PM_OPS(bochs_pm_suspend,
-- 
1.8.3.1



[PATCH] drm: cirrus: fix warnings

2014-07-12 Thread Russell King
Regular nightly randconfig build testing discovered these warnings:

drivers/gpu/drm/cirrus/cirrus_drv.c:79:12: warning: 'cirrus_pm_suspend' defined 
but not used [-Wunused-function]
drivers/gpu/drm/cirrus/cirrus_drv.c:96:12: warning: 'cirrus_pm_resume' defined 
but not used [-Wunused-function]

Fix these by adding the same condition that SET_SYSTEM_SLEEP_PM_OPS()
uses.

Signed-off-by: Russell King 
---
There is no MAINTAINERS entry for this driver, so I don't know who this
should be sent to.

 drivers/gpu/drm/cirrus/cirrus_drv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/cirrus/cirrus_drv.c 
b/drivers/gpu/drm/cirrus/cirrus_drv.c
index 08ce520f61a5..4516b052cc67 100644
--- a/drivers/gpu/drm/cirrus/cirrus_drv.c
+++ b/drivers/gpu/drm/cirrus/cirrus_drv.c
@@ -76,6 +76,7 @@ static void cirrus_pci_remove(struct pci_dev *pdev)
drm_put_dev(dev);
 }

+#ifdef CONFIG_PM_SLEEP
 static int cirrus_pm_suspend(struct device *dev)
 {
struct pci_dev *pdev = to_pci_dev(dev);
@@ -110,6 +111,7 @@ static int cirrus_pm_resume(struct device *dev)
drm_kms_helper_poll_enable(drm_dev);
return 0;
 }
+#endif

 static const struct file_operations cirrus_driver_fops = {
.owner = THIS_MODULE,
-- 
1.8.3.1



[PATCH 00/83] AMD HSA kernel driver

2014-07-12 Thread Jerome Glisse
On Sat, Jul 12, 2014 at 01:10:32PM +0200, Daniel Vetter wrote:
> On Sat, Jul 12, 2014 at 11:24:49AM +0200, Christian König wrote:
> > Am 11.07.2014 23:18, schrieb Jerome Glisse:
> > >On Thu, Jul 10, 2014 at 10:51:29PM +, Gabbay, Oded wrote:
> > >>On Thu, 2014-07-10 at 18:24 -0400, Jerome Glisse wrote:
> > >>>On Fri, Jul 11, 2014 at 12:45:27AM +0300, Oded Gabbay wrote:
> >   This patch set implements a Heterogeneous System Architecture
> > (HSA) driver
> >   for radeon-family GPUs.
> > >>>This is just quick comments on few things. Given size of this, people
> > >>>will need to have time to review things.
> >   HSA allows different processor types (CPUs, DSPs, GPUs, etc..) to
> > share
> >   system resources more effectively via HW features including
> > shared pageable
> >   memory, userspace-accessible work queues, and platform-level
> > atomics. In
> >   addition to the memory protection mechanisms in GPUVM and
> > IOMMUv2, the Sea
> >   Islands family of GPUs also performs HW-level validation of
> > commands passed
> >   in through the queues (aka rings).
> >   The code in this patch set is intended to serve both as a sample
> > driver for
> >   other HSA-compatible hardware devices and as a production driver
> > for
> >   radeon-family processors. The code is architected to support
> > multiple CPUs
> >   each with connected GPUs, although the current implementation
> > focuses on a
> >   single Kaveri/Berlin APU, and works alongside the existing radeon
> > kernel
> >   graphics driver (kgd).
> >   AMD GPUs designed for use with HSA (Sea Islands and up) share
> > some hardware
> >   functionality between HSA compute and regular gfx/compute (memory,
> >   interrupts, registers), while other functionality has been added
> >   specifically for HSA compute  (hw scheduler for virtualized
> > compute rings).
> >   All shared hardware is owned by the radeon graphics driver, and
> > an interface
> >   between kfd and kgd allows the kfd to make use of those shared
> > resources,
> >   while HSA-specific functionality is managed directly by kfd by
> > submitting
> >   packets into an HSA-specific command queue (the "HIQ").
> >   During kfd module initialization a char device node (/dev/kfd) is
> > created
> >   (surviving until module exit), with ioctls for queue creation &
> > management,
> >   and data structures are initialized for managing HSA device
> > topology.
> >   The rest of the initialization is driven by calls from the radeon
> > kgd at
> >   the following points :
> >   - radeon_init (kfd_init)
> >   - radeon_exit (kfd_fini)
> >   - radeon_driver_load_kms (kfd_device_probe, kfd_device_init)
> >   - radeon_driver_unload_kms (kfd_device_fini)
> >   During the probe and init processing per-device data structures
> > are
> >   established which connect to the associated graphics kernel
> > driver. This
> >   information is exposed to userspace via sysfs, along with a
> > version number
> >   allowing userspace to determine if a topology change has occurred
> > while it
> >   was reading from sysfs.
> >   The interface between kfd and kgd also allows the kfd to request
> > buffer
> >   management services from kgd, and allows kgd to route interrupt
> > requests to
> >   kfd code since the interrupt block is shared between regular
> >   graphics/compute and HSA compute subsystems in the GPU.
> >   The kfd code works with an open source usermode library
> > ("libhsakmt") which
> >   is in the final stages of IP review and should be published in a
> > separate
> >   repo over the next few days.
> >   The code operates in one of three modes, selectable via the
> > sched_policy
> >   module parameter :
> >   - sched_policy=0 uses a hardware scheduler running in the MEC
> > block within
> >   CP, and allows oversubscription (more queues than HW slots)
> >   - sched_policy=1 also uses HW scheduling but does not allow
> >   oversubscription, so create_queue requests fail when we run out
> > of HW slots
> >   - sched_policy=2 does not use HW scheduling, so the driver
> > manually assigns
> >   queues to HW slots by programming registers
> >   The "no HW scheduling" option is for debug & new hardware bringup
> > only, so
> >   has less test coverage than the other options. Default in the
> > current code
> >   is "HW scheduling without oversubscription" since that is where
> > we have the
> >   most test coverage but we expect to change the default to "HW
> > scheduling
> >   with oversubscription" after further testing. This effectively
> > removes the
> >   HW limit on the number of work queues available to applications.
> >   Programs

[RESEND PATCH v3 05/11] drm: add Atmel HLCDC Display Controller support

2014-07-12 Thread Boris BREZILLON
Hello,

On Mon,  7 Jul 2014 18:42:58 +0200
Boris BREZILLON  wrote:


> +int atmel_hlcdc_layer_disable(struct atmel_hlcdc_layer *layer)
> +{
> + struct atmel_hlcdc_layer_dma_channel *dma = &layer->dma;
> + unsigned long flags;
> + int i;
> +
> + spin_lock_irqsave(&dma->lock, flags);
> + for (i = 0; i < layer->max_planes; i++) {
> + if (!dma->cur[i])
> + break;
> +
> + dma->cur[i]->ctrl = 0;
> + }
> + spin_unlock_irqrestore(&dma->lock, flags);
> +
> + return 0;
> +}


I'm trying to simplify the hlcdc_layer code and in order to do that I
need to know what's expected when a user calls plane_disable (or more
exactly DRM_IOCTL_MODE_SETPLANE ioctl call with the frame buffer ID set
to 0).

The HLCDC Display Controller support two types of disable:

1) The plane is disabled at the end of the current frame (this is the
solution I'm using)

2) The plane is disabled right away (I haven't tested it, but I think
this solution could generate some sort of artifacts for a short period
of time, because the framebuffer might be partially displayed)

If solution 1 is chosen, should I wait for the plane to be actually
disabled before returning?
At the moment, I'm not: I'm just asking for the plane to be disabled and
then returning. And this is where some of my complicated code comes from,
because I must handle the case where a user disables the plane and then
re-enables it right away (the modetest cursor test does a lot of cursor
enable/disable in a short period of time, and this is how I tested all
these weird use cases).

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


[RESEND PATCH v3 05/11] drm: add Atmel HLCDC Display Controller support

2014-07-12 Thread Rob Clark
On Sat, Jul 12, 2014 at 2:16 PM, Boris BREZILLON
 wrote:
> Hello,
>
> On Mon,  7 Jul 2014 18:42:58 +0200
> Boris BREZILLON  wrote:
>
>
>> +int atmel_hlcdc_layer_disable(struct atmel_hlcdc_layer *layer)
>> +{
>> + struct atmel_hlcdc_layer_dma_channel *dma = &layer->dma;
>> + unsigned long flags;
>> + int i;
>> +
>> + spin_lock_irqsave(&dma->lock, flags);
>> + for (i = 0; i < layer->max_planes; i++) {
>> + if (!dma->cur[i])
>> + break;
>> +
>> + dma->cur[i]->ctrl = 0;
>> + }
>> + spin_unlock_irqrestore(&dma->lock, flags);
>> +
>> + return 0;
>> +}
>
>
> I'm trying to simplify the hlcdc_layer code and in order to do that I
> need to know what's expected when a user calls plane_disable (or more
> exactly DRM_IOCTL_MODE_SETPLANE ioctl call with the frame buffer ID set
> to 0).
>
> The HLCDC Display Controller support two types of disable:
>
> 1) The plane is disabled at the end of the current frame (the is the
> solution I'm using)
>
> 2) The plane is disabled right away (I haven't tested it, but I think
> this solution could generate some sort of artifacts for a short period
> of time, because the framebuffer might be partially displayed)
>
> If solution 1 is chosen, should I wait for the plane to be actually
> disabled before returning ?

for the cursor in particular, if you block, it is going to be a massive
slowdown for some apps.  I remember that at least older gdm would rapidly
flash a spinning cursor; as a result, if you wait for vsync each
time, it would take a couple of minutes to log in!

if #2 works, I'd recommend it.  Otherwise you may have to do some of
the same hijinks that I have to do in mdp4_crtc for the cursor.

BR,
-R

> A the moment, I'm not: I'm just asking for the plane to be disabled and
> then return. And this is where some of my complicated code come from,
> because I must handle the case where a user disable the plane then re
> enable it right away (modetest cursor test is doing a lot of cursor
> enable/disable in a short period of time, and this is how I tested all
> this weird use cases).
>
> Best Regards,
>
> Boris
>
> --
> Boris Brezillon, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Bug 73901] Kernel crash after modprobe radeon runpm=1

2014-07-12 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=73901

Pali Rohár  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |CODE_FIX

--- Comment #17 from Pali Rohár  ---
OK, when I set auto control via

$ echo auto > /sys/bus/pci/devices/:01:00.0/power/control

the card is automatically turned off when it is not used. When I set it on via

$ echo on > /sys/bus/pci/devices/:01:00.0/power/control

then it is always on. So it is working as expected, and I am closing this bug
as fixed.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


[Bug 79591] possible circular locking dependency detected

2014-07-12 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=79591

--- Comment #5 from Martin Peres  ---
Created attachment 142831
  --> https://bugzilla.kernel.org/attachment.cgi?id=142831&action=edit
drm/nouveau/therm: fix a potential deadlock in the therm monitoring code

Sorry for the wait. Can you try to reproduce the issue with this patch?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


[PATCH 00/83] AMD HSA kernel driver

2014-07-12 Thread Gabbay, Oded
On Fri, 2014-07-11 at 17:18 -0400, Jerome Glisse wrote:
> On Thu, Jul 10, 2014 at 10:51:29PM +, Gabbay, Oded wrote:
> >  On Thu, 2014-07-10 at 18:24 -0400, Jerome Glisse wrote:
> > >  On Fri, Jul 11, 2014 at 12:45:27AM +0300, Oded Gabbay wrote:
> > > >   This patch set implements a Heterogeneous System Architecture
> > > >  (HSA) driver
> > > >   for radeon-family GPUs.
> > >  This is just quick comments on few things. Given size of this, 
> > > people
> > >  will need to have time to review things.
> > > >   HSA allows different processor types (CPUs, DSPs, GPUs, 
> > > > etc..) to
> > > >  share
> > > >   system resources more effectively via HW features including
> > > >  shared pageable
> > > >   memory, userspace-accessible work queues, and platform-level
> > > >  atomics. In
> > > >   addition to the memory protection mechanisms in GPUVM and
> > > >  IOMMUv2, the Sea
> > > >   Islands family of GPUs also performs HW-level validation of
> > > >  commands passed
> > > >   in through the queues (aka rings).
> > > >   The code in this patch set is intended to serve both as a 
> > > > sample
> > > >  driver for
> > > >   other HSA-compatible hardware devices and as a production 
> > > > driver
> > > >  for
> > > >   radeon-family processors. The code is architected to support
> > > >  multiple CPUs
> > > >   each with connected GPUs, although the current implementation
> > > >  focuses on a
> > > >   single Kaveri/Berlin APU, and works alongside the existing 
> > > > radeon
> > > >  kernel
> > > >   graphics driver (kgd).
> > > >   AMD GPUs designed for use with HSA (Sea Islands and up) share
> > > >  some hardware
> > > >   functionality between HSA compute and regular gfx/compute 
> > > > (memory,
> > > >   interrupts, registers), while other functionality has been 
> > > > added
> > > >   specifically for HSA compute  (hw scheduler for virtualized
> > > >  compute rings).
> > > >   All shared hardware is owned by the radeon graphics driver, 
> > > > and
> > > >  an interface
> > > >   between kfd and kgd allows the kfd to make use of those 
> > > > shared
> > > >  resources,
> > > >   while HSA-specific functionality is managed directly by kfd 
> > > > by
> > > >  submitting
> > > >   packets into an HSA-specific command queue (the "HIQ").
> > > >   During kfd module initialization a char device node 
> > > > (/dev/kfd) is
> > > >  created
> > > >   (surviving until module exit), with ioctls for queue 
> > > > creation &
> > > >  management,
> > > >   and data structures are initialized for managing HSA device
> > > >  topology.
> > > >   The rest of the initialization is driven by calls from the 
> > > > radeon
> > > >  kgd at
> > > >   the following points :
> > > >   - radeon_init (kfd_init)
> > > >   - radeon_exit (kfd_fini)
> > > >   - radeon_driver_load_kms (kfd_device_probe, kfd_device_init)
> > > >   - radeon_driver_unload_kms (kfd_device_fini)
> > > >   During the probe and init processing per-device data 
> > > > structures
> > > >  are
> > > >   established which connect to the associated graphics kernel
> > > >  driver. This
> > > >   information is exposed to userspace via sysfs, along with a
> > > >  version number
> > > >   allowing userspace to determine if a topology change has 
> > > > occurred
> > > >  while it
> > > >   was reading from sysfs.
> > > >   The interface between kfd and kgd also allows the kfd to 
> > > > request
> > > >  buffer
> > > >   management services from kgd, and allows kgd to route 
> > > > interrupt
> > > >  requests to
> > > >   kfd code since the interrupt block is shared between regular
> > > >   graphics/compute and HSA compute subsystems in the GPU.
> > > >   The kfd code works with an open source usermode library
> > > >  ("libhsakmt") which
> > > >   is in the final stages of IP review and should be published 
> > > > in a
> > > >  separate
> > > >   repo over the next few days.
> > > >   The code operates in one of three modes, selectable via the
> > > >  sched_policy
> > > >   module parameter :
> > > >   - sched_policy=0 uses a hardware scheduler running in the MEC
> > > >  block within
> > > >   CP, and allows oversubscription (more queues than HW slots)
> > > >   - sched_policy=1 also uses HW scheduling but does not allow
> > > >   oversubscription, so create_queue requests fail when we run 
> > > > out
> > > >  of HW slots
> > > >   - sched_policy=2 does not use HW scheduling, so the driver
> > > >  manually assigns
> > > >   queues to HW slots by programming registers
> > > >   The "no HW scheduling" option is for debug & new hardware 
> > > > bringup
> > > >  only, so
> > > >   has less test coverage than the other options. Default in the
> > > >  current code
> > > >   is "HW scheduling without oversubscription" since that is 
> > > > where
> > > >  we have the
> > > >   most test coverage but we expect to change the default to "HW
> > > >  scheduling
> > > >   with oversubscription" after further testing. This 
> > > > effectively
> > > >

[PATCH 00/83] AMD HSA kernel driver

2014-07-12 Thread Jerome Glisse
On Sat, Jul 12, 2014 at 09:55:49PM +, Gabbay, Oded wrote:
> On Fri, 2014-07-11 at 17:18 -0400, Jerome Glisse wrote:
> > On Thu, Jul 10, 2014 at 10:51:29PM +, Gabbay, Oded wrote:
> > >  On Thu, 2014-07-10 at 18:24 -0400, Jerome Glisse wrote:
> > > >  On Fri, Jul 11, 2014 at 12:45:27AM +0300, Oded Gabbay wrote:
> > > > >   This patch set implements a Heterogeneous System Architecture
> > > > >  (HSA) driver
> > > > >   for radeon-family GPUs.
> > > >  This is just quick comments on few things. Given size of this, 
> > > > people
> > > >  will need to have time to review things.
> > > > >   HSA allows different processor types (CPUs, DSPs, GPUs, 
> > > > > etc..) to
> > > > >  share
> > > > >   system resources more effectively via HW features including
> > > > >  shared pageable
> > > > >   memory, userspace-accessible work queues, and platform-level
> > > > >  atomics. In
> > > > >   addition to the memory protection mechanisms in GPUVM and
> > > > >  IOMMUv2, the Sea
> > > > >   Islands family of GPUs also performs HW-level validation of
> > > > >  commands passed
> > > > >   in through the queues (aka rings).
> > > > >   The code in this patch set is intended to serve both as a 
> > > > > sample
> > > > >  driver for
> > > > >   other HSA-compatible hardware devices and as a production 
> > > > > driver
> > > > >  for
> > > > >   radeon-family processors. The code is architected to support
> > > > >  multiple CPUs
> > > > >   each with connected GPUs, although the current implementation
> > > > >  focuses on a
> > > > >   single Kaveri/Berlin APU, and works alongside the existing 
> > > > > radeon
> > > > >  kernel
> > > > >   graphics driver (kgd).
> > > > >   AMD GPUs designed for use with HSA (Sea Islands and up) share
> > > > >  some hardware
> > > > >   functionality between HSA compute and regular gfx/compute 
> > > > > (memory,
> > > > >   interrupts, registers), while other functionality has been 
> > > > > added
> > > > >   specifically for HSA compute  (hw scheduler for virtualized
> > > > >  compute rings).
> > > > >   All shared hardware is owned by the radeon graphics driver, 
> > > > > and
> > > > >  an interface
> > > > >   between kfd and kgd allows the kfd to make use of those 
> > > > > shared
> > > > >  resources,
> > > > >   while HSA-specific functionality is managed directly by kfd 
> > > > > by
> > > > >  submitting
> > > > >   packets into an HSA-specific command queue (the "HIQ").
> > > > >   During kfd module initialization a char device node 
> > > > > (/dev/kfd) is
> > > > >  created
> > > > >   (surviving until module exit), with ioctls for queue 
> > > > > creation &
> > > > >  management,
> > > > >   and data structures are initialized for managing HSA device
> > > > >  topology.
> > > > >   The rest of the initialization is driven by calls from the 
> > > > > radeon
> > > > >  kgd at
> > > > >   the following points :
> > > > >   - radeon_init (kfd_init)
> > > > >   - radeon_exit (kfd_fini)
> > > > >   - radeon_driver_load_kms (kfd_device_probe, kfd_device_init)
> > > > >   - radeon_driver_unload_kms (kfd_device_fini)
> > > > >   During the probe and init processing per-device data 
> > > > > structures
> > > > >  are
> > > > >   established which connect to the associated graphics kernel
> > > > >  driver. This
> > > > >   information is exposed to userspace via sysfs, along with a
> > > > >  version number
> > > > >   allowing userspace to determine if a topology change has 
> > > > > occurred
> > > > >  while it
> > > > >   was reading from sysfs.
> > > > >   The interface between kfd and kgd also allows the kfd to 
> > > > > request
> > > > >  buffer
> > > > >   management services from kgd, and allows kgd to route 
> > > > > interrupt
> > > > >  requests to
> > > > >   kfd code since the interrupt block is shared between regular
> > > > >   graphics/compute and HSA compute subsystems in the GPU.
> > > > >   The kfd code works with an open source usermode library
> > > > >  ("libhsakmt") which
> > > > >   is in the final stages of IP review and should be published 
> > > > > in a
> > > > >  separate
> > > > >   repo over the next few days.
> > > > >   The code operates in one of three modes, selectable via the
> > > > >  sched_policy
> > > > >   module parameter :
> > > > >   - sched_policy=0 uses a hardware scheduler running in the MEC
> > > > >  block within
> > > > >   CP, and allows oversubscription (more queues than HW slots)
> > > > >   - sched_policy=1 also uses HW scheduling but does not allow
> > > > >   oversubscription, so create_queue requests fail when we run 
> > > > > out
> > > > >  of HW slots
> > > > >   - sched_policy=2 does not use HW scheduling, so the driver
> > > > >  manually assigns
> > > > >   queues to HW slots by programming registers
> > > > >   The "no HW scheduling" option is for debug & new hardware 
> > > > > bringup
> > > > >  only, so
> > > > >   has less test coverage than the other options. Default in the
> > > > >  current co