ast: cursor flashing softlockups
[ +to Scot Doyle ]

Scot, please take a look at this soft lockup.

Regards,
Peter Hurley

Hi Ming,

On 05/17/2016 02:12 AM, Ming Lei wrote:
> Hi,
>
> On Tue, May 17, 2016 at 4:07 AM, Dann Frazier wrote:
>> Hi,
>> I'm observing a soft lockup issue w/ the ASPEED controller on an
>> arm64 server platform. This was originally seen on Ubuntu's 4.4
>> kernel, but it is reproducible w/ vanilla 4.6-rc7 as well.
>>
>> [ 32.792656] NMI watchdog: BUG: soft lockup - CPU#38 stuck for 22s! [swapper/38:0]
>>
>> I observe this just once each time I boot into debian-installer (I'm
>> using a serial console, but the ast module gets loaded during
>> startup).
>
> I have figured out that it is caused by 'mod_timer(timer, jiffies)' and
> 'ops->cur_blink_jiffies' is observed as zero in cursor_timer_handler()
> when the issue happened.

Thanks for tracking this down.

This softlockup looks to be caused by:

commit 27a4c827c34ac4256a190cc9d24607f953c1c459
Author: Scot Doyle
Date:   Thu Mar 26 13:56:38 2015 +

    fbcon: use the cursor blink interval provided by vt

    vt now provides a cursor blink interval via vc_data. Use this
    interval instead of the currently hardcoded 200 msecs. Store it in
    fbcon_ops to avoid locking the console in cursor_timer_handler().

    Signed-off-by: Scot Doyle
    Acked-by: Pavel Machek
    Signed-off-by: Greg Kroah-Hartman

and

commit bd63364caa8df38bad2b25b11b2a1b849475cce5
Author: Scot Doyle
Date:   Thu Mar 26 13:54:39 2015 +

    vt: add cursor blink interval escape sequence

    Add an escape sequence to specify the current console's cursor blink
    interval. The interval is specified as a number of milliseconds until
    the next cursor display state toggle, from 50 to 65535. /proc/loadavg
    did not show a difference with a one msec interval, but the lower
    bound is set to 50 msecs since slower hardware wasn't tested.

    Store the interval in the vc_data structure for later access by
    fbcon, initializing the value to fbcon's current hardcoded value of
    200 msecs.

    Signed-off-by: Scot Doyle
    Acked-by: Pavel Machek
    Signed-off-by: Greg Kroah-Hartman

> Looks it is a real fbcon/vt issue, see following:
>
> fbcon_init()
>     <-.con_init
>         <-visual_init()
>
> reset_terminal()
>     <-vc_init()
>
> vc->vc_cur_blink_ms is just set in reset_terminal() from vc_init() path,
> and ops->cur_blink_jiffies is figured out from vc->vc_cur_blink_ms
> in fbcon_init().
>
> And visual_init() is always run before vc_init(), so ops->cur_blink_jiffies
> is initialized as zero and cause the soft lockup issue finally.
>
> Thanks,
> Ming
>
>> perf shows that the CPU caught by the NMI is typically in code
>> updating the cursor timer:
>>
>> -  16.92%  swapper  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
>>    - _raw_spin_unlock_irqrestore
>>       + 16.87% mod_timer
>>       + 0.05% cursor_timer_handler
>> -  12.15%  swapper  [kernel.kallsyms]  [k] queue_work_on
>>    - queue_work_on
>>       + 12.00% cursor_timer_handler
>>       + 0.15% call_timer_fn
>> +  10.98%  swapper  [kernel.kallsyms]  [k] run_timer_softirq
>> -   2.23%  swapper  [kernel.kallsyms]  [k] mod_timer
>>    - mod_timer
>>       + 1.97% cursor_timer_handler
>>       + 0.26% call_timer_fn
>>
>> During the same period, I can see that another CPU is actively
>> executing the timer function:
>>
>> -  42.18%  kworker/u96:2  [kernel.kallsyms]  [k] ww_mutex_unlock
>>    - ww_mutex_unlock
>>       - 40.70% ast_dirty_update
>>            ast_imageblit
>>            soft_cursor
>>            bit_cursor
>>            fb_flashcursor
>>            process_one_work
>>            worker_thread
>>            kthread
>>            ret_from_fork
>>       + 1.48% ast_imageblit
>> -  40.15%  kworker/u96:2  [kernel.kallsyms]  [k] __memcpy_toio
>>    - __memcpy_toio
>>       + 31.54% ast_dirty_update
>>       + 8.61% ast_imageblit
>>
>> Using the graph function tracer on fb_flashcursor(), I see that
>> ast_dirty_update usually takes around 60 us, in which it makes 16
>> calls to __memcpy_toio(). However, there is always one instance on
>> every boot of the installer where ast_dirty_update() takes ~98 *ms* to
>> complete, during which it makes 743 calls to __memcpy_toio(). While
>> that doesn't directly account for the full 22s, I do wonder if that
>> maybe a smoking gun.
>>
>> fyi, this is being tracked at: https://bugs.launchpad.net/bugs/1574814
>>
>>  -dann
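A minimal sketch of the livelock Ming describes, plus one defensive way out. The function and field names follow fbcon of that era (drivers/video/console/fbcon.c), but the 200 ms fallback is an assumption for illustration, not necessarily the fix that eventually went upstream.

#include <linux/fb.h>
#include <linux/jiffies.h>
#include <linux/timer.h>
#include <linux/workqueue.h>

/* struct fbcon_ops is fbcon-internal; only the two fields used here are shown */
struct fbcon_ops {
	struct timer_list cursor_timer;
	u32 cur_blink_jiffies;
};

static void cursor_timer_handler(unsigned long dev_addr)
{
	struct fb_info *info = (struct fb_info *)dev_addr;
	struct fbcon_ops *ops = info->fbcon_par;

	schedule_work(&info->queue);	/* blink the cursor from process context */

	/*
	 * If fbcon_init() sampled vc->vc_cur_blink_ms before vc_init() ever
	 * set it, cur_blink_jiffies is 0 and the mod_timer() below re-arms
	 * the timer for "now"; the timer softirq then never stops running
	 * this callback, which is the reported soft lockup.  Falling back
	 * to a sane default breaks that loop.
	 */
	if (!ops->cur_blink_jiffies)
		ops->cur_blink_jiffies = msecs_to_jiffies(200);

	mod_timer(&ops->cursor_timer, jiffies + ops->cur_blink_jiffies);
}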
Re: [Nouveau] [PATCH] drm/nouveau: fix vblank deadlock
On 08/12/2013 07:50 AM, Maarten Lankhorst wrote: This fixes a deadlock inversion when vblank is enabled/disabled by drm. &dev->vblank_time_lock is always taken when the vblank state is toggled, which caused a deadlock when &event->lock was also taken during event_get/put. Solve the race by requiring that lock to change enable/disable state, and always keeping vblank on the event list. Core drm ignores unwanted vblanks, so extra calls to drm_handle_vblank are harmless. I don't feel this is the appropriate solution to the lock inversion between vblank_time_lock and event->lock. Preferably drm core should correct the interface layer bug; ie., calling into a sub-driver holding a lock _and_ requiring the sub-driver to call a drm helper function which claims the same lock is bad design. The console lock suffers from the same design flaw and is a constant problem. Alternatively, the event trigger could be lockless; ie., the event list could be an RCU list instead. In this way, the event->lock does not need to be claimed, and thus no lock inversion is possible. The main drawback here is that currently the event->lock enforces non-overlapping lifetimes between the event handler and the event. Untangling object lifetimes in nouveau is a non-trivial exercise. Regards, Peter Hurley Signed-off-by: Maarten Lankhorst --- diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index 7eb81c1..78bff7c 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -23,43 +23,64 @@ #include #include -static void -nouveau_event_put_locked(struct nouveau_event *event, int index, -struct nouveau_eventh *handler) -{ - if (!--event->index[index].refs) { - if (event->disable) - event->disable(event, index); - } - list_del(&handler->head); -} - void nouveau_event_put(struct nouveau_event *event, int index, struct nouveau_eventh *handler) { unsigned long flags; + if (index >= event->index_nr) + return; + spin_lock_irqsave(&event->lock, flags); - if (index < event->index_nr) - nouveau_event_put_locked(event, index, handler); + list_del(&handler->head); + + if (event->toggle_lock) + spin_lock(event->toggle_lock); + nouveau_event_disable_locked(event, index, 1); + if (event->toggle_lock) + spin_unlock(event->toggle_lock); + spin_unlock_irqrestore(&event->lock, flags); } void +nouveau_event_enable_locked(struct nouveau_event *event, int index) +{ + if (index >= event->index_nr) + return; + + if (!event->index[index].refs++ && event->enable) + event->enable(event, index); +} + +void +nouveau_event_disable_locked(struct nouveau_event *event, int index, int refs) +{ + if (index >= event->index_nr) + return; + + event->index[index].refs -= refs; + if (!event->index[index].refs && event->disable) + event->disable(event, index); +} + +void nouveau_event_get(struct nouveau_event *event, int index, struct nouveau_eventh *handler) { unsigned long flags; + if (index >= event->index_nr) + return; + spin_lock_irqsave(&event->lock, flags); - if (index < event->index_nr) { - list_add(&handler->head, &event->index[index].list); - if (!event->index[index].refs++) { - if (event->enable) - event->enable(event, index); - } - } + list_add(&handler->head, &event->index[index].list); + if (event->toggle_lock) + spin_lock(event->toggle_lock); + nouveau_event_enable_locked(event, index); + if (event->toggle_lock) + spin_unlock(event->toggle_lock); spin_unlock_irqrestore(&event->lock, flags); } @@ -68,6 +89,7 @@ nouveau_event_trigger(struct nouveau_event 
*event, int index) { struct nouveau_eventh *handler, *temp; unsigned long flags; + int refs = 0; if (index >= event->index_nr) return; @@ -75,9 +97,17 @@ nouveau_event_trigger(struct nouveau_event *event, int index) spin_lock_irqsave(&event->lock, flags); list_for_each_entry_safe(handler, temp, &event->index[index].list, head) { if (handler->func(handler, index) == NVKM_EVENT_DROP) { - nouveau_event_put_locked(event, index, handler); + list_del(&handler->head); + refs++;
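A minimal sketch of the RCU alternative suggested in the review above: list updates stay under event->lock, but the trigger path becomes lockless, so it can no longer invert with drm's vblank_time_lock. Simplified, hypothetical types, not the nouveau structures verbatim.

#include <linux/list.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct eventh {
	struct list_head head;
	int (*func)(struct eventh *);
	struct rcu_head rcu;
};

struct event_index {
	struct list_head list;		/* handler list, RCU-protected */
};

/* install/remove stay serialized by the event lock */
static void event_handler_install(spinlock_t *lock, struct event_index *idx,
				  struct eventh *h)
{
	unsigned long flags;

	spin_lock_irqsave(lock, flags);
	list_add_rcu(&h->head, &idx->list);
	spin_unlock_irqrestore(lock, flags);
}

static void event_handler_remove(spinlock_t *lock, struct eventh *h)
{
	unsigned long flags;

	spin_lock_irqsave(lock, flags);
	list_del_rcu(&h->head);
	spin_unlock_irqrestore(lock, flags);
	kfree_rcu(h, rcu);		/* free only after lockless readers drain */
}

/* the trigger takes no lock at all, so no inversion is possible */
static void event_trigger(struct event_index *idx)
{
	struct eventh *h;

	rcu_read_lock();
	list_for_each_entry_rcu(h, &idx->list, head)
		h->func(h);		/* may call drm_handle_vblank() */
	rcu_read_unlock();
}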
[PATCH] drm: Use correct spinlock flavor in drm_vblank_get()
The irq flags state is already established by the outer spin_lock_irqsave(); re-disabling irqs is redundant. Signed-off-by: Peter Hurley --- drivers/gpu/drm/drm_irq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index f92da0a..c7d4af5 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -952,13 +952,13 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) */ int drm_vblank_get(struct drm_device *dev, int crtc) { - unsigned long irqflags, irqflags2; + unsigned long irqflags; int ret = 0; spin_lock_irqsave(&dev->vbl_lock, irqflags); /* Going from 0->1 means we have to enable interrupts again */ if (atomic_add_return(1, &dev->vblank_refcount[crtc]) == 1) { - spin_lock_irqsave(&dev->vblank_time_lock, irqflags2); + spin_lock(&dev->vblank_time_lock); if (!dev->vblank_enabled[crtc]) { /* Enable vblank irqs under vblank_time_lock protection. * All vblank count & timestamp updates are held off @@ -976,7 +976,7 @@ int drm_vblank_get(struct drm_device *dev, int crtc) drm_update_vblank_count(dev, crtc); } } - spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags2); + spin_unlock(&dev->vblank_time_lock); } else { if (!dev->vblank_enabled[crtc]) { atomic_dec(&dev->vblank_refcount[crtc]); -- 1.8.1.2 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
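The pattern the patch switches to, shown in isolation: once the outer lock is taken with spin_lock_irqsave(), local interrupts are already disabled, so the inner lock only needs the plain spin_lock()/spin_unlock() flavor. Illustrative sketch only, with generic lock names.

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(outer);		/* e.g. dev->vbl_lock */
static DEFINE_SPINLOCK(inner);		/* e.g. dev->vblank_time_lock */

static void nested_locking(void)
{
	unsigned long flags;

	spin_lock_irqsave(&outer, flags);	/* disables local irqs */
	spin_lock(&inner);			/* no need to touch irq state again */
	/* ... critical section ... */
	spin_unlock(&inner);
	spin_unlock_irqrestore(&outer, flags);	/* restores the original irq state */
}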
[PATCH 1/9] drm/nouveau: Add priv field for event handlers
Provide private field for event handlers exclusive use. Convert nouveau_fence_wait_uevent() and nouveau_fence_wait_uevent_handler(); drop struct nouveau_fence_uevent. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/include/core/event.h | 1 + drivers/gpu/drm/nouveau/nouveau_fence.c | 20 +++- 2 files changed, 8 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/include/core/event.h b/drivers/gpu/drm/nouveau/core/include/core/event.h index 9e09440..ad4d8c2 100644 --- a/drivers/gpu/drm/nouveau/core/include/core/event.h +++ b/drivers/gpu/drm/nouveau/core/include/core/event.h @@ -7,6 +7,7 @@ struct nouveau_eventh { struct list_head head; + void *priv; int (*func)(struct nouveau_eventh *, int index); }; diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c index be31499..c2e3167 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -165,17 +165,11 @@ nouveau_fence_done(struct nouveau_fence *fence) return !fence->channel; } -struct nouveau_fence_uevent { - struct nouveau_eventh handler; - struct nouveau_fence_priv *priv; -}; - static int -nouveau_fence_wait_uevent_handler(struct nouveau_eventh *event, int index) +nouveau_fence_wait_uevent_handler(struct nouveau_eventh *handler, int index) { - struct nouveau_fence_uevent *uevent = - container_of(event, struct nouveau_fence_uevent, handler); - wake_up_all(&uevent->priv->waiting); + struct nouveau_fence_priv *priv = handler->priv; + wake_up_all(&priv->waiting); return NVKM_EVENT_KEEP; } @@ -186,13 +180,13 @@ nouveau_fence_wait_uevent(struct nouveau_fence *fence, bool intr) struct nouveau_channel *chan = fence->channel; struct nouveau_fifo *pfifo = nouveau_fifo(chan->drm->device); struct nouveau_fence_priv *priv = chan->drm->fence; - struct nouveau_fence_uevent uevent = { - .handler.func = nouveau_fence_wait_uevent_handler, + struct nouveau_eventh handler = { + .func = nouveau_fence_wait_uevent_handler, .priv = priv, }; int ret = 0; - nouveau_event_get(pfifo->uevent, 0, &uevent.handler); + nouveau_event_get(pfifo->uevent, 0, &handler); if (fence->timeout) { unsigned long timeout = fence->timeout - jiffies; @@ -224,7 +218,7 @@ nouveau_fence_wait_uevent(struct nouveau_fence *fence, bool intr) } } - nouveau_event_put(pfifo->uevent, 0, &uevent.handler); + nouveau_event_put(pfifo->uevent, 0, &handler); if (unlikely(ret < 0)) return ret; -- 1.8.1.2 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 2/9] drm/nouveau: Move event index check from critical section
The index_nr field is constant for the lifetime of the event, so serialized access is unnecessary. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index 7eb81c1..e69c463 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -40,9 +40,11 @@ nouveau_event_put(struct nouveau_event *event, int index, { unsigned long flags; + if (index >= event->index_nr) + return; + spin_lock_irqsave(&event->lock, flags); - if (index < event->index_nr) - nouveau_event_put_locked(event, index, handler); + nouveau_event_put_locked(event, index, handler); spin_unlock_irqrestore(&event->lock, flags); } @@ -52,13 +54,14 @@ nouveau_event_get(struct nouveau_event *event, int index, { unsigned long flags; + if (index >= event->index_nr) + return; + spin_lock_irqsave(&event->lock, flags); - if (index < event->index_nr) { - list_add(&handler->head, &event->index[index].list); - if (!event->index[index].refs++) { - if (event->enable) - event->enable(event, index); - } + list_add(&handler->head, &event->index[index].list); + if (!event->index[index].refs++) { + if (event->enable) + event->enable(event, index); } spin_unlock_irqrestore(&event->lock, flags); } -- 1.8.1.2 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 0/9] drm/nouveau: Cleanup event/handler design
This series was originally motivated by a deadlock, introduced in
commit 1d7c71a3e2f77336df536855b0efd2dc5bdeb41b 'drm/nouveau/disp: port vblank
handling to event interface', due to inverted lock order between
nouveau_drm_vblank_enable() and nouveau_drm_vblank_handler() (the complete
lockdep report is included in the patch 4/5 changelog).

Because this series fixes the vblank event deadlock, it is a competing
solution to Maarten Lankhorst's 'drm/nouveau: fix vblank deadlock'.

This series fixes the vblank event deadlock by converting the event handler
list to RCU. This solution allows the event trigger to be lockless, thus
avoiding the lock inversion.

Typical hurdles to RCU conversion include:
1) ensuring the object lifetime exceeds the lockless access;
2) preventing premature object reuse; and
3) verifying that stale object use does not present logical errors.

Because object reuse is not safe in RCU-based operations, the
nouveau_event_get/_put interface is migrated from add/remove semantics to
enable/disable semantics with a separate install/remove step (which also
serves to document the handler lifetime). This also corrects an unsafe
interface design where handlers can mistakenly be reused while in use.

The nouveau driver currently supports 4 events -- gpio, uevent, cevent, and
vblank. Every event is created by and stored within its respective
subdev/engine object -- gpio in the GPIO subdev, uevent and cevent in the
FIFO engine, and vblank in the DISP engine. Thus event lifetime extends until
the subdev is destructed during devobj teardown.

Event handler lifetime varies and is detailed in the table below (apologies
for the wide format):

Event   Handler function                   Container                     Until
-----   ----------------                   ---------                     -----
gpio    nouveau_connector_hotplug          struct nouveau_connector      nouveau_connector_destroy
uevent  nouveau_fence_wait_uevent_handler  local stack object            nouveau_fence_wait_uevent returns
cevent  none                               n/a                           n/a
vblank  nouveau_drm_vblank_handler         struct nouveau_drm            nouveau_drm_remove
vblank  nv50_software_vblsem_release       struct nouveau_software_chan  _nouveau_engctx_dtor
                                                                         (call stack originates with
                                                                          nouveau_abi16_chan_free ioctl)
vblank  nvc0_software_vblsem_release       struct nouveau_software_chan  same as above

RCU lifetime considerations for handlers:

Event   Handler                            Lifetime
-----   -------                            --------
gpio    nouveau_connector_hotplug          kfree_rcu(nv_connector)
uevent  nouveau_fence_wait_uevent_handler  explicit use of nouveau_event_handler_create/_destroy
cevent  none                               n/a
vblank  nouveau_drm_vblank_handler         synchronize_rcu() in nouveau_drm_unload
vblank  nv50_software_vblsem_release       synchronize_rcu() in container dtor
vblank  nvc0_software_vblsem_release       synchronize_rcu() in container dtor

synchronize_rcu() is used for nouveau_object-based containers for which
kfree_rcu() is not suitable/possible.

Stale event handler execution is not a concern for the existing handlers
because either:
1) the handler is responsible for disabling itself (via returning
   NVKM_EVENT_DROP), or
2) the existing handler can already be stale, as is the case with
   nouveau_connector_hotplug, which only schedules a work item, and
   nouveau_drm_vblank_handler, which the drm core expects may be stale.
Peter Hurley (9):
  drm/nouveau: Add priv field for event handlers
  drm/nouveau: Move event index check from critical section
  drm/nouveau: Allocate local event handlers
  drm/nouveau: Allow asymmetric nouveau_event_get/_put
  drm/nouveau: Add install/remove semantics for event handlers
  drm/nouveau: Convert event handler list to RCU
  drm/nouveau: Fold nouveau_event_put_locked into caller
  drm/nouveau: Simplify event interface
  drm/nouveau: Simplify event handler interface

 drivers/gpu/drm/nouveau/core/core/event.c           | 121 +
 .../gpu/drm/nouveau/core/engine/software/nv50.c     |  32 --
 .../gpu/drm/nouveau/core/engine/software/nvc0.c     |  32 --
 drivers/gpu/drm/nouveau/core/include/core/event.h   |  27 -
 .../gpu/drm/nouveau/core/include/engine/software.h  |   2 +-
 drivers/gpu/drm/nouveau/nouveau_connector.c         |  16 ++-
 drivers/gpu/drm/nouveau/nouveau_display.c           |  16 +--
 drivers/gpu/drm/nouveau/nouveau_drm.c               |  35 +++---
 drivers/gpu/drm/nouveau/nouveau_fence.c             |  27 ++---
 9 files changed, 220 insertions(+), 88 deletions(-)

--
1.8.1.2

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://list
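The two RCU teardown patterns referred to in the cover letter's lifetime table, in miniature; the types below are hypothetical stand-ins for the nouveau containers.

#include <linux/list.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

struct handler_a {		/* handler embedded in a plain kmalloc'd object */
	struct list_head head;
	struct rcu_head rcu;
};

struct handler_b {		/* handler embedded in an object with its own dtor */
	struct list_head head;
};

/* Pattern 1: the container is freed with kfree(), so kfree_rcu() suffices. */
static void teardown_a(spinlock_t *lock, struct handler_a *h)
{
	spin_lock_irq(lock);
	list_del_rcu(&h->head);
	spin_unlock_irq(lock);
	kfree_rcu(h, rcu);	/* deferred free after a grace period */
}

/*
 * Pattern 2: the container's destructor does more than kfree() (e.g. a
 * nouveau_object dtor), so block for a grace period instead and then let
 * the normal destructor run.
 */
static void teardown_b(spinlock_t *lock, struct handler_b *h)
{
	spin_lock_irq(lock);
	list_del_rcu(&h->head);
	spin_unlock_irq(lock);
	synchronize_rcu();	/* wait for lockless readers to drain */
	/* the container dtor may now free or reuse h safely */
}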
[PATCH 3/9] drm/nouveau: Allocate local event handlers
Prepare for transition to RCU-based event handler list; since RCU list traversal may have stale pointers, local storage may go out of scope before handler completes. Introduce nouveau_event_handler_create/_destroy which provides suitable semantics for multiple, temporary event handlers. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 24 +++ drivers/gpu/drm/nouveau/core/include/core/event.h | 6 ++ drivers/gpu/drm/nouveau/nouveau_fence.c | 15 +++--- 3 files changed, 38 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index e69c463..1a8d685 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -23,6 +23,30 @@ #include #include +int +nouveau_event_handler_create(struct nouveau_event *event, int index, +int (*func)(struct nouveau_eventh*, int), +void *priv, struct nouveau_eventh **phandler) +{ + struct nouveau_eventh *handler; + + handler = *phandler = kzalloc(sizeof(*handler), GFP_KERNEL); + if (!handler) + return -ENOMEM; + handler->func = func; + handler->priv = priv; + nouveau_event_get(event, index, handler); + return 0; +} + +void +nouveau_event_handler_destroy(struct nouveau_event *event, int index, + struct nouveau_eventh *handler) +{ + nouveau_event_put(event, index, handler); + kfree(handler); +} + static void nouveau_event_put_locked(struct nouveau_event *event, int index, struct nouveau_eventh *handler) diff --git a/drivers/gpu/drm/nouveau/core/include/core/event.h b/drivers/gpu/drm/nouveau/core/include/core/event.h index ad4d8c2..bdf1a0a 100644 --- a/drivers/gpu/drm/nouveau/core/include/core/event.h +++ b/drivers/gpu/drm/nouveau/core/include/core/event.h @@ -34,4 +34,10 @@ void nouveau_event_get(struct nouveau_event *, int index, void nouveau_event_put(struct nouveau_event *, int index, struct nouveau_eventh *); +int nouveau_event_handler_create(struct nouveau_event *, int index, +int (*func)(struct nouveau_eventh*, int), +void *priv, struct nouveau_eventh **); +void nouveau_event_handler_destroy(struct nouveau_event *, int index, + struct nouveau_eventh *); + #endif diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c index c2e3167..6dde483 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -180,13 +180,14 @@ nouveau_fence_wait_uevent(struct nouveau_fence *fence, bool intr) struct nouveau_channel *chan = fence->channel; struct nouveau_fifo *pfifo = nouveau_fifo(chan->drm->device); struct nouveau_fence_priv *priv = chan->drm->fence; - struct nouveau_eventh handler = { - .func = nouveau_fence_wait_uevent_handler, - .priv = priv, - }; - int ret = 0; + struct nouveau_eventh *handler; + int ret; - nouveau_event_get(pfifo->uevent, 0, &handler); + ret = nouveau_event_handler_create(pfifo->uevent, 0, + nouveau_fence_wait_uevent_handler, + priv, &handler); + if (ret) + return ret; if (fence->timeout) { unsigned long timeout = fence->timeout - jiffies; @@ -218,7 +219,7 @@ nouveau_fence_wait_uevent(struct nouveau_fence *fence, bool intr) } } - nouveau_event_put(pfifo->uevent, 0, &handler); + nouveau_event_handler_destroy(pfifo->uevent, 0, handler); if (unlikely(ret < 0)) return ret; -- 1.8.1.2 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 4/9] drm/nouveau: Allow asymmetric nouveau_event_get/_put
Most nouveau event handlers have storage in 'static' containers (structures with lifetimes nearly equivalent to the drm_device), but are dangerously reused via nouveau_event_get/_put. For example, if nouveau_event_get is called more than once for a given handler, the event handler list will be corrupted. Migrate nouveau_event_get/_put from add/remove semantics to enable/disable semantics. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 20 drivers/gpu/drm/nouveau/core/include/core/event.h | 4 drivers/gpu/drm/nouveau/nouveau_drm.c | 8 ++-- 3 files changed, 18 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index 1a8d685..0a65ede 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -51,11 +51,13 @@ static void nouveau_event_put_locked(struct nouveau_event *event, int index, struct nouveau_eventh *handler) { - if (!--event->index[index].refs) { - if (event->disable) - event->disable(event, index); + if (__test_and_clear_bit(NVKM_EVENT_ENABLE, &handler->flags)) { + if (!--event->index[index].refs) { + if (event->disable) + event->disable(event, index); + } + list_del(&handler->head); } - list_del(&handler->head); } void @@ -82,10 +84,12 @@ nouveau_event_get(struct nouveau_event *event, int index, return; spin_lock_irqsave(&event->lock, flags); - list_add(&handler->head, &event->index[index].list); - if (!event->index[index].refs++) { - if (event->enable) - event->enable(event, index); + if (!__test_and_set_bit(NVKM_EVENT_ENABLE, &handler->flags)) { + list_add(&handler->head, &event->index[index].list); + if (!event->index[index].refs++) { + if (event->enable) + event->enable(event, index); + } } spin_unlock_irqrestore(&event->lock, flags); } diff --git a/drivers/gpu/drm/nouveau/core/include/core/event.h b/drivers/gpu/drm/nouveau/core/include/core/event.h index bdf1a0a..3e704d5 100644 --- a/drivers/gpu/drm/nouveau/core/include/core/event.h +++ b/drivers/gpu/drm/nouveau/core/include/core/event.h @@ -5,8 +5,12 @@ #define NVKM_EVENT_DROP 0 #define NVKM_EVENT_KEEP 1 +/* nouveau_eventh.flags bit #s */ +#define NVKM_EVENT_ENABLE 0 + struct nouveau_eventh { struct list_head head; + unsigned long flags; void *priv; int (*func)(struct nouveau_eventh *, int index); }; diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index b29d04b..ccea2c4 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -88,7 +88,6 @@ nouveau_drm_vblank_enable(struct drm_device *dev, int head) if (WARN_ON_ONCE(head > ARRAY_SIZE(drm->vblank))) return -EIO; - WARN_ON_ONCE(drm->vblank[head].func); drm->vblank[head].func = nouveau_drm_vblank_handler; nouveau_event_get(pdisp->vblank, head, &drm->vblank[head]); return 0; @@ -99,11 +98,8 @@ nouveau_drm_vblank_disable(struct drm_device *dev, int head) { struct nouveau_drm *drm = nouveau_drm(dev); struct nouveau_disp *pdisp = nouveau_disp(drm->device); - if (drm->vblank[head].func) - nouveau_event_put(pdisp->vblank, head, &drm->vblank[head]); - else - WARN_ON_ONCE(1); - drm->vblank[head].func = NULL; + + nouveau_event_put(pdisp->vblank, head, &drm->vblank[head]); } static u64 -- 1.8.1.2 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
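A self-contained illustration of the failure mode described above and of the guard the patch introduces: list_add() on a node that is already linked corrupts the list, while a flag bit makes enable/disable idempotent. Simplified, hypothetical types; in the patch the caller holds event->lock, which is why the non-atomic __test_and_set_bit()/__test_and_clear_bit() variants are sufficient.

#include <linux/bitops.h>
#include <linux/list.h>

struct handler {
	struct list_head head;
	unsigned long flags;
};

#define HANDLER_ENABLED 0	/* flags bit # */

/* callers are assumed to hold the list's lock, as in the patch */
static void handler_enable(struct list_head *list, struct handler *h)
{
	/* a second enable becomes a no-op instead of re-linking the node */
	if (!__test_and_set_bit(HANDLER_ENABLED, &h->flags))
		list_add(&h->head, list);
}

static void handler_disable(struct handler *h)
{
	/* disabling an already-disabled handler is likewise harmless */
	if (__test_and_clear_bit(HANDLER_ENABLED, &h->flags))
		list_del(&h->head);
}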
[PATCH 5/9] drm/nouveau: Add install/remove semantics for event handlers
Complete migration of nouveau_event_get/_put from add/remove semantics to enable/disable semantics. Introduce nouveau_event_handler_install/_remove interface to add/remove non-local-scope event handlers (ie., those stored in parent containers). This change in semantics makes explicit the handler lifetime, and distinguishes "one-of" event handlers (such as gpio) from "many temporary" event handlers (such as uevent). Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 63 +++--- .../gpu/drm/nouveau/core/engine/software/nv50.c| 31 +-- .../gpu/drm/nouveau/core/engine/software/nvc0.c| 31 +-- drivers/gpu/drm/nouveau/core/include/core/event.h | 6 +++ .../gpu/drm/nouveau/core/include/engine/software.h | 2 +- drivers/gpu/drm/nouveau/nouveau_connector.c| 10 +++- drivers/gpu/drm/nouveau/nouveau_drm.c | 17 +- 7 files changed, 140 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index 0a65ede..4cd1ebe 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -23,19 +23,60 @@ #include #include +void +nouveau_event_handler_install(struct nouveau_event *event, int index, + int (*func)(struct nouveau_eventh*, int), + void *priv, struct nouveau_eventh *handler) +{ + unsigned long flags; + + if (index >= event->index_nr) + return; + + handler->func = func; + handler->priv = priv; + + spin_lock_irqsave(&event->lock, flags); + list_add(&handler->head, &event->index[index].list); + spin_unlock_irqrestore(&event->lock, flags); +} + +void +nouveau_event_handler_remove(struct nouveau_event *event, int index, +struct nouveau_eventh *handler) +{ + unsigned long flags; + + if (index >= event->index_nr) + return; + + spin_lock_irqsave(&event->lock, flags); + list_del(&handler->head); + spin_unlock_irqrestore(&event->lock, flags); +} + int nouveau_event_handler_create(struct nouveau_event *event, int index, int (*func)(struct nouveau_eventh*, int), void *priv, struct nouveau_eventh **phandler) { struct nouveau_eventh *handler; + unsigned long flags; handler = *phandler = kzalloc(sizeof(*handler), GFP_KERNEL); if (!handler) return -ENOMEM; handler->func = func; handler->priv = priv; - nouveau_event_get(event, index, handler); + __set_bit(NVKM_EVENT_ENABLE, &handler->flags); + + spin_lock_irqsave(&event->lock, flags); + list_add(&handler->head, &event->index[index].list); + if (!event->index[index].refs++) { + if (event->enable) + event->enable(event, index); + } + spin_unlock_irqrestore(&event->lock, flags); return 0; } @@ -43,7 +84,18 @@ void nouveau_event_handler_destroy(struct nouveau_event *event, int index, struct nouveau_eventh *handler) { - nouveau_event_put(event, index, handler); + unsigned long flags; + + if (index >= event->index_nr) + return; + + spin_lock_irqsave(&event->lock, flags); + if (!--event->index[index].refs) { + if (event->disable) + event->disable(event, index); + } + list_del(&handler->head); + spin_unlock_irqrestore(&event->lock, flags); kfree(handler); } @@ -56,7 +108,6 @@ nouveau_event_put_locked(struct nouveau_event *event, int index, if (event->disable) event->disable(event, index); } - list_del(&handler->head); } } @@ -85,7 +136,6 @@ nouveau_event_get(struct nouveau_event *event, int index, spin_lock_irqsave(&event->lock, flags); if (!__test_and_set_bit(NVKM_EVENT_ENABLE, &handler->flags)) { - list_add(&handler->head, &event->index[index].list); if (!event->index[index].refs++) { if (event->enable) event->enable(event, index); @@ -105,8 
+155,9 @@ nouveau_event_trigger(struct nouveau_event *event, int index) spin_lock_irqsave(&event->lock, flags); list_for_each_entry_safe(handler, temp, &event->index[index].list, head) { - if (handler->func(handler, index) == NVKM_EVENT_DROP) { - nouveau_event_put_locked(event, i
[PATCH 6/9] drm/nouveau: Convert event handler list to RCU
Lockdep report [1] correctly identifies a potential deadlock between nouveau_drm_vblank_enable() and nouveau_drm_vblank_handler() due to inverted lock order of event->lock with drm core's dev->vblank_time_lock. Fix vblank event deadlock by converting event handler list to RCU. List updates remain serialized by event->lock. List traversal & handler execution is lockless which prevents inverted lock order problems with the drm core. Safe deletion of the event handler is accomplished with synchronize_rcu() for those handlers stored in nouveau_object-based containers (nouveau_drm & nv50_/nvc0_software_context); otherwise, with kfree_rcu (for nouveau_connector & uevent temporary handlers). [1] == [ INFO: possible circular locking dependency detected ] 3.10.0-0+tip-xeon+lockdep #0+tip Not tainted --- swapper/7/0 is trying to acquire lock: (&(&dev->vblank_time_lock)->rlock){-.}, at: [] drm_handle_vblank+0x69/0x400 [drm] but task is already holding lock: (&(&event->lock)->rlock){-.}, at: [] nouveau_event_trigger+0x4d/0xd0 [nouveau] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&(&event->lock)->rlock){-.}: [] lock_acquire+0x92/0x1f0 [] _raw_spin_lock_irqsave+0x46/0x60 [] nouveau_event_get+0x27/0xa0 [nouveau] [] nouveau_drm_vblank_enable+0x8d/0xf0 [nouveau] [] drm_vblank_get+0xf8/0x2c0 [drm] [] drm_wait_vblank+0x84/0x6f0 [drm] [] drm_ioctl+0x559/0x690 [drm] [] do_vfs_ioctl+0x97/0x590 [] SyS_ioctl+0x91/0xb0 [] system_call_fastpath+0x16/0x1b -> #0 (&(&dev->vblank_time_lock)->rlock){-.}: [] __lock_acquire+0x1c43/0x1d30 [] lock_acquire+0x92/0x1f0 [] _raw_spin_lock_irqsave+0x46/0x60 [] drm_handle_vblank+0x69/0x400 [drm] [] nouveau_drm_vblank_handler+0x27/0x30 [nouveau] [] nouveau_event_trigger+0x90/0xd0 [nouveau] [] nv50_disp_intr+0xdd/0x230 [nouveau] [] nouveau_mc_intr+0xa1/0x100 [nouveau] [] handle_irq_event_percpu+0x6c/0x3d0 [] handle_irq_event+0x48/0x70 [] handle_fasteoi_irq+0x5a/0x100 [] handle_irq+0x22/0x40 [] do_IRQ+0x5a/0xd0 [] ret_from_intr+0x0/0x13 [] arch_cpu_idle+0x26/0x30 [] cpu_startup_entry+0xce/0x410 [] start_secondary+0x255/0x25c other info that might help us debug this: Possible unsafe locking scenario: CPU0CPU1 lock(&(&event->lock)->rlock); lock(&(&dev->vblank_time_lock)->rlock); lock(&(&event->lock)->rlock); lock(&(&dev->vblank_time_lock)->rlock); *** DEADLOCK *** 1 lock held by swapper/7/0: #0: (&(&event->lock)->rlock){-.}, at: [] nouveau_event_trigger+0x4d/0xd0 [nouveau] stack backtrace: CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.10.0-0+tip-xeon+lockdep #0+tip Hardware name: Dell Inc. Precision WorkStation T5400 /0RW203, BIOS A11 04/30/2012 821ccf60 8802bfdc3b08 81755f39 8802bfdc3b58 8174f526 0002 8802bfdc3be8 8802b330a7f8 8802b330a820 8802b330a7f8 0001 Call Trace: [] dump_stack+0x19/0x1b [] print_circular_bug+0x1fb/0x20c [] __lock_acquire+0x1c43/0x1d30 [] ? trace_hardirqs_off+0xd/0x10 [] ? flat_send_IPI_mask+0x88/0xa0 [] lock_acquire+0x92/0x1f0 [] ? drm_handle_vblank+0x69/0x400 [drm] [] _raw_spin_lock_irqsave+0x46/0x60 [] ? drm_handle_vblank+0x69/0x400 [drm] [] drm_handle_vblank+0x69/0x400 [drm] [] nouveau_drm_vblank_handler+0x27/0x30 [nouveau] [] nouveau_event_trigger+0x90/0xd0 [nouveau] [] nv50_disp_intr+0xdd/0x230 [nouveau] [] ? _raw_spin_unlock_irqrestore+0x42/0x80 [] ? __hrtimer_start_range_ns+0x16c/0x530 [] nouveau_mc_intr+0xa1/0x100 [nouveau] [] handle_irq_event_percpu+0x6c/0x3d0 [] handle_irq_event+0x48/0x70 [] ? 
handle_fasteoi_irq+0x1e/0x100 [] handle_fasteoi_irq+0x5a/0x100 [] handle_irq+0x22/0x40 [] do_IRQ+0x5a/0xd0 [] common_interrupt+0x6f/0x6f [] ? default_idle+0x1d/0x290 [] ? default_idle+0x1b/0x290 [] arch_cpu_idle+0x26/0x30 [] cpu_startup_entry+0xce/0x410 [] ? clockevents_register_device+0xdc/0x140 [] start_secondary+0x255/0x25c Reported-by: Dave Jones Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 22 +++--- .../gpu/drm/nouveau/core/engine/software/nv50.c| 1 + .../gpu/drm/nouveau/core/engine/software/nvc0.c| 1 + drivers/gpu/drm/nouveau/core/include/core/event.h | 1 + drivers/gpu/drm/nouveau/nouveau_connector.c| 2 +- drivers/gpu/drm/nouveau/nouveau_drm.c | 1 + 6 files changed, 16 insertion
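For readers skimming the lockdep output above, the inversion reduces to two code paths taking the same pair of locks in opposite order. A compact sketch, with generic lock names standing in for drm's vblank_time_lock and nouveau's event->lock:

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(vblank_time_lock);	/* lock A (drm core) */
static DEFINE_SPINLOCK(event_lock);		/* lock B (nouveau event) */

/* ioctl path: drm_vblank_get() takes vblank_time_lock, then the driver's
 * vblank-enable hook calls nouveau_event_get(), which takes event->lock */
static void enable_path(void)
{
	unsigned long flags;

	spin_lock_irqsave(&vblank_time_lock, flags);	/* A */
	spin_lock(&event_lock);				/* then B */
	spin_unlock(&event_lock);
	spin_unlock_irqrestore(&vblank_time_lock, flags);
}

/* irq path: nouveau_event_trigger() takes event->lock, then the handler
 * calls drm_handle_vblank(), which takes vblank_time_lock */
static void trigger_path(void)
{
	unsigned long flags;

	spin_lock_irqsave(&event_lock, flags);		/* B */
	spin_lock(&vblank_time_lock);			/* then A: AB vs BA deadlock */
	spin_unlock(&vblank_time_lock);
	spin_unlock_irqrestore(&event_lock, flags);
}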
[PATCH 7/9] drm/nouveau: Fold nouveau_event_put_locked into caller
nouveau_event_put_locked() only has 1 call site; fold into caller. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 19 ++- 1 file changed, 6 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index ce0a0ef..45bcb37 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -100,18 +100,6 @@ nouveau_event_handler_destroy(struct nouveau_event *event, int index, kfree_rcu(handler, rcu); } -static void -nouveau_event_put_locked(struct nouveau_event *event, int index, -struct nouveau_eventh *handler) -{ - if (__test_and_clear_bit(NVKM_EVENT_ENABLE, &handler->flags)) { - if (!--event->index[index].refs) { - if (event->disable) - event->disable(event, index); - } - } -} - void nouveau_event_put(struct nouveau_event *event, int index, struct nouveau_eventh *handler) @@ -122,7 +110,12 @@ nouveau_event_put(struct nouveau_event *event, int index, return; spin_lock_irqsave(&event->lock, flags); - nouveau_event_put_locked(event, index, handler); + if (__test_and_clear_bit(NVKM_EVENT_ENABLE, &handler->flags)) { + if (!--event->index[index].refs) { + if (event->disable) + event->disable(event, index); + } + } spin_unlock_irqrestore(&event->lock, flags); } -- 1.8.1.2 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 8/9] drm/nouveau: Simplify event interface
Store event back-pointer and index within struct event_handler; remove superfluous parameters when event_handler is supplied. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 36 +- .../gpu/drm/nouveau/core/engine/software/nv50.c| 11 ++- .../gpu/drm/nouveau/core/engine/software/nvc0.c| 11 ++- drivers/gpu/drm/nouveau/core/include/core/event.h | 15 + drivers/gpu/drm/nouveau/nouveau_connector.c| 5 +-- drivers/gpu/drm/nouveau/nouveau_display.c | 16 +++--- drivers/gpu/drm/nouveau/nouveau_drm.c | 10 ++ drivers/gpu/drm/nouveau/nouveau_fence.c| 2 +- 8 files changed, 44 insertions(+), 62 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index 45bcb37..b7d8ae1 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -34,6 +34,9 @@ nouveau_event_handler_install(struct nouveau_event *event, int index, if (index >= event->index_nr) return; + handler->event = event; + handler->index = index; + handler->func = func; handler->priv = priv; @@ -43,12 +46,12 @@ nouveau_event_handler_install(struct nouveau_event *event, int index, } void -nouveau_event_handler_remove(struct nouveau_event *event, int index, -struct nouveau_eventh *handler) +nouveau_event_handler_remove(struct nouveau_eventh *handler) { + struct nouveau_event *event = handler->event; unsigned long flags; - if (index >= event->index_nr) + if (!event) return; spin_lock_irqsave(&event->lock, flags); @@ -67,6 +70,10 @@ nouveau_event_handler_create(struct nouveau_event *event, int index, handler = *phandler = kzalloc(sizeof(*handler), GFP_KERNEL); if (!handler) return -ENOMEM; + + handler->event = event; + handler->index = index; + handler->func = func; handler->priv = priv; __set_bit(NVKM_EVENT_ENABLE, &handler->flags); @@ -82,13 +89,12 @@ nouveau_event_handler_create(struct nouveau_event *event, int index, } void -nouveau_event_handler_destroy(struct nouveau_event *event, int index, - struct nouveau_eventh *handler) +nouveau_event_handler_destroy(struct nouveau_eventh *handler) { + struct nouveau_event *event = handler->event; + int index = handler->index; unsigned long flags; - if (index >= event->index_nr) - return; spin_lock_irqsave(&event->lock, flags); if (!--event->index[index].refs) { @@ -101,12 +107,13 @@ nouveau_event_handler_destroy(struct nouveau_event *event, int index, } void -nouveau_event_put(struct nouveau_event *event, int index, - struct nouveau_eventh *handler) +nouveau_event_put(struct nouveau_eventh *handler) { + struct nouveau_event *event = handler->event; + int index = handler->index; unsigned long flags; - if (index >= event->index_nr) + if (!event) return; spin_lock_irqsave(&event->lock, flags); @@ -120,12 +127,13 @@ nouveau_event_put(struct nouveau_event *event, int index, } void -nouveau_event_get(struct nouveau_event *event, int index, - struct nouveau_eventh *handler) +nouveau_event_get(struct nouveau_eventh *handler) { + struct nouveau_event *event = handler->event; + int index = handler->index; unsigned long flags; - if (index >= event->index_nr) + if (!event) return; spin_lock_irqsave(&event->lock, flags); @@ -150,7 +158,7 @@ nouveau_event_trigger(struct nouveau_event *event, int index) list_for_each_entry_rcu(handler, &event->index[index].list, head) { if (test_bit(NVKM_EVENT_ENABLE, &handler->flags)) { if (handler->func(handler, index) == NVKM_EVENT_DROP) - nouveau_event_put(event, index, handler); + nouveau_event_put(handler); } } rcu_read_unlock(); diff --git 
a/drivers/gpu/drm/nouveau/core/engine/software/nv50.c b/drivers/gpu/drm/nouveau/core/engine/software/nv50.c index 87aeee1..e969f0c 100644 --- a/drivers/gpu/drm/nouveau/core/engine/software/nv50.c +++ b/drivers/gpu/drm/nouveau/core/engine/software/nv50.c @@ -92,12 +92,11 @@ nv50_software_mthd_vblsem_release(struct nouveau_object *object, u32 mthd, void *args, u32 size) { struct nv50_software_chan *chan = (void *)nv_engctx(object->parent); - struct nouveau_disp *disp = nouveau_disp(object); u32 crtc = *(u32 *)args
[PATCH 9/9] drm/nouveau: Simplify event handler interface
Remove index parameter; access index via handler->index instead. Dissociate handler from related container; use handler->priv to access container. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 6 +++--- drivers/gpu/drm/nouveau/core/engine/software/nv50.c | 7 +++ drivers/gpu/drm/nouveau/core/engine/software/nvc0.c | 7 +++ drivers/gpu/drm/nouveau/core/include/core/event.h | 6 +++--- drivers/gpu/drm/nouveau/nouveau_connector.c | 9 + drivers/gpu/drm/nouveau/nouveau_drm.c | 9 + drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +- 7 files changed, 23 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index b7d8ae1..1240fef 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -26,7 +26,7 @@ void nouveau_event_handler_install(struct nouveau_event *event, int index, - int (*func)(struct nouveau_eventh*, int), + int (*func)(struct nouveau_eventh*), void *priv, struct nouveau_eventh *handler) { unsigned long flags; @@ -61,7 +61,7 @@ nouveau_event_handler_remove(struct nouveau_eventh *handler) int nouveau_event_handler_create(struct nouveau_event *event, int index, -int (*func)(struct nouveau_eventh*, int), +int (*func)(struct nouveau_eventh*), void *priv, struct nouveau_eventh **phandler) { struct nouveau_eventh *handler; @@ -157,7 +157,7 @@ nouveau_event_trigger(struct nouveau_event *event, int index) rcu_read_lock(); list_for_each_entry_rcu(handler, &event->index[index].list, head) { if (test_bit(NVKM_EVENT_ENABLE, &handler->flags)) { - if (handler->func(handler, index) == NVKM_EVENT_DROP) + if (handler->func(handler) == NVKM_EVENT_DROP) nouveau_event_put(handler); } } diff --git a/drivers/gpu/drm/nouveau/core/engine/software/nv50.c b/drivers/gpu/drm/nouveau/core/engine/software/nv50.c index e969f0c..bf6f23b 100644 --- a/drivers/gpu/drm/nouveau/core/engine/software/nv50.c +++ b/drivers/gpu/drm/nouveau/core/engine/software/nv50.c @@ -131,10 +131,9 @@ nv50_software_sclass[] = { **/ static int -nv50_software_vblsem_release(struct nouveau_eventh *event, int head) +nv50_software_vblsem_release(struct nouveau_eventh *handler) { - struct nouveau_software_chan *chan = - container_of(event, struct nouveau_software_chan, vblank.event[head]); + struct nouveau_software_chan *chan = handler->priv; struct nv50_software_priv *priv = (void *)nv_object(chan)->engine; struct nouveau_bar *bar = nouveau_bar(priv); @@ -172,7 +171,7 @@ nv50_software_context_ctor(struct nouveau_object *parent, for (i = 0; i < ARRAY_SIZE(chan->base.vblank.event); i++) { nouveau_event_handler_install(disp->vblank, i, nv50_software_vblsem_release, - NULL, + chan, &chan->base.vblank.event[i]); } return 0; diff --git a/drivers/gpu/drm/nouveau/core/engine/software/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/software/nvc0.c index d06658a..1a2a7a8 100644 --- a/drivers/gpu/drm/nouveau/core/engine/software/nvc0.c +++ b/drivers/gpu/drm/nouveau/core/engine/software/nvc0.c @@ -143,10 +143,9 @@ nvc0_software_sclass[] = { **/ static int -nvc0_software_vblsem_release(struct nouveau_eventh *event, int head) +nvc0_software_vblsem_release(struct nouveau_eventh *handler) { - struct nouveau_software_chan *chan = - container_of(event, struct nouveau_software_chan, vblank.event[head]); + struct nouveau_software_chan *chan = handler->priv; struct nvc0_software_priv *priv = (void *)nv_object(chan)->engine; struct nouveau_bar *bar = nouveau_bar(priv); @@ -178,7 +177,7 @@ nvc0_software_context_ctor(struct 
nouveau_object *parent, for (i = 0; i < ARRAY_SIZE(chan->base.vblank.event); i++) { nouveau_event_handler_install(disp->vblank, i, nvc0_software_vblsem_release, - NULL, + chan, &chan->base.vblank.event[i]); } return 0; diff --git a/drivers/gpu/drm/nouveau/core/inc
Re: [PATCH 0/9] drm/nouveau: Cleanup event/handler design
On 08/28/2013 03:15 AM, Ben Skeggs wrote:
> On Wed, Aug 28, 2013 at 6:12 AM, Peter Hurley wrote:
>> This series was originally motivated by a deadlock, introduced in
>> commit 1d7c71a3e2f77336df536855b0efd2dc5bdeb41b 'drm/nouveau/disp: port
>> vblank handling to event interface', due to inverted lock order between
>> nouveau_drm_vblank_enable() and nouveau_drm_vblank_handler() (the complete
>> lockdep report is included in the patch 4/5 changelog).
>
> Hey Peter,
>
> Thanks for the patch series. I've only taken a cursory glance through the
> patches thus far, as they're definitely too intrusive for -fixes at this
> point.
>
> I will take a proper look through the series within the next week or so, I
> have some work pending which may possibly make some of these changes
> unnecessary, though from what I can tell, there's other good bits here
> we'll want.
>
> In a previous mail on the locking issue, you mentioned the drm core doing
> the "wrong thing" here too, did you have any further thoughts on correcting
> that issue too?

I'm working on the premise that drm_handle_vblank() can be lockless; that
would minimize the api disruption.

I still have a fair bit of analysis to do, but some early observations:

1) The vbl_time_lock spinlock is unnecessary -- drm_handle_vblank() could be
   protected with vbl_lock, since everywhere else vbl_time_lock is held,
   vbl_lock is already held.

2) The _vblank_count[crtc] doesn't need to be atomic. All the code paths that
   update _vblank_count[] are already spinlocked. Even if the code weren't
   spinlocked, the atomic_inc/_adds aren't necessary; only the memory
   barriers and read/write order of the vblank count/timestamp tuple is
   relevant.

3) Careful enabling of drm_handle_vblank() is sufficient to solve the
   concurrency between drm_handle_vblank() and drm_vblank_get() without
   needing a spinlock for exclusion. drm_handle_vblank() would need to
   account for the possibility of having missed a vblank interrupt between
   the reading of the hw vblank counter and the enabling of
   drm_handle_vblank().

4) I'm still thinking about how/whether to exclude drm_handle_vblank() and
   vblank_disable_and_save(). One thought is to replace the timeout timer
   with delayed work instead so that vblank_disable_and_save() could wait
   for drm_handle_vblank() to finish after disabling it. [I'd also need to
   verify that the drm drivers other than intel which use drm_vblank_off()
   do so from non-atomic context.]

I realize that I'm mostly referring to the hw counter/timestamp flavor of
vblank handling; that's primarily because it requires more rigorous handling
and has race conditions that don't apply to the software-only counter.

Regards,
Peter Hurley

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
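One way to realize point 2 above -- publishing the (count, timestamp) tuple with ordering alone rather than a lock -- is a seqcount. This is only a sketch of the ordering argument under assumed names, not existing drm code.

#include <linux/ktime.h>
#include <linux/seqlock.h>
#include <linux/types.h>

struct vblank_sample {
	seqcount_t seq;		/* seqcount_init(&s->seq) once at setup */
	u32 count;
	ktime_t stamp;
};

/* writer: runs from the vblank interrupt, already serialized per crtc */
static void vblank_publish(struct vblank_sample *s, u32 count, ktime_t stamp)
{
	write_seqcount_begin(&s->seq);
	s->count = count;
	s->stamp = stamp;
	write_seqcount_end(&s->seq);
}

/* reader: lockless; retries if it raced with the writer, so it always
 * observes a consistent count/timestamp pair */
static void vblank_read(struct vblank_sample *s, u32 *count, ktime_t *stamp)
{
	unsigned int start;

	do {
		start = read_seqcount_begin(&s->seq);
		*count = s->count;
		*stamp = s->stamp;
	} while (read_seqcount_retry(&s->seq, start));
}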
Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
[+cc dri-devel]

On 09/11/2013 11:38 AM, Steven Rostedt wrote:

On Wed, 11 Sep 2013 11:16:43 -0400 Peter Hurley wrote:

The funny part is, there's a comment there that shows that this was done even
for "PREEMPT_RT". Unfortunately, the call to "get_scanout_position()" can call
functions that use the rt-mutex "sleeping spin locks" and it breaks there.

I guess we need to ask the authors of the mainline patch exactly why that
preempt_disable() is needed?

The drm core associates a timestamp with each vertical blank frame #. Drm
drivers can optionally support a 'high resolution' hw timestamp. The vblank
frame #/timestamp tuple is user-space visible.

The i915 drm driver supports a hw timestamp via this drm helper function
which computes the timestamp from the crtc scan position (based on the pixel
clock).

For mainline, the preempt_disable/_enable() isn't actually necessary because
every call tree that leads here already has preemption disabled.

For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?

No, it should not. Note, any other lock that can be held when it is held
would also need to be raw.

By that, you mean "any other lock" that might be claimed "would also need to
be raw"? Hopefully not "any other lock" already held?

And by taking a quick audit of the code, I see this:

	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);

	/* Reset the chip */

	/* GEN6_GDRST is not in the gt power well, no need to check
	 * for fifo space for the write or forcewake the chip for
	 * the read
	 */
	__raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);

	/* Spin waiting for the device to ack the reset request */
	ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) & GEN6_GRDOM_FULL) == 0, 500);

That spin is unacceptable in RT with preemption and interrupts disabled.

Yep. That would be bad.

AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in
the force-wake set, so raw reads of the registers would probably be
acceptable (thus obviating the need for claiming the uncore.lock).

Except that _ALL_ register access is disabled with the uncore.lock during a
gpu reset. Not sure if that's meant to include crtc registers or not, or what
other synchronization/serialization issues are being handled/hidden by
forcing all register accesses to wait during a gpu reset.

Hopefully an i915 expert can weigh in here?

What's the real issue here?

That the vblank timestamp needs to be an accurate measurement of a realtime
event. Sleeping/servicing interrupts while reading the registers necessary to
compute the timestamp would be bad too. (edit: which hopefully Mario Kleiner
clarified in his reply)

My point earlier was three-fold:

1. Don't need the preempt_disable() for mainline: all callers are already
   holding interrupt-disabling spinlocks.
2. -RT still needs to prevent scheduling there.
3. the problem is i915-specific.
   [update: the radeon driver should also BUG like the i915 driver but
   probably should have mmio_idx_lock spinlock as raw]

Does it have something to do with this dump?

Not sure [but I didn't realize the query was regarding 3.0].
[3.932060] [ cut here ] [3.936809] WARNING: at /home/rostedt/work/git/linux-rt.git/drivers/gpu/drm/i915/i915_drv.c:322 gen6_gt_force_wake_get+0x4d/0x50 [i915]() [3.949229] Hardware name: HP Compaq Pro 6300 SFF [3.953988] Modules linked in: i915(+) video i2c_algo_bit drm_kms_helper drm i2c_core [3.961943] Pid: 220, comm: udevd Not tainted 3.0.89-test-rt117 #117 [3.968343] Call Trace: [3.970851] [] warn_slowpath_common+0x7f/0xc0 [3.970852] [] warn_slowpath_null+0x1a/0x20 [3.970857] [] gen6_gt_force_wake_get+0x4d/0x50 [i915] [3.970865] [] ivybridge_init_clock_gating+0xcb/0x2f0 [i915] [3.970870] [] ? intel_crtc_disable+0x29/0x60 [i915] [3.970877] [] intel_modeset_init+0x751/0x10e0 [i915] [3.970882] [] i915_driver_load+0xfc8/0x17f0 [i915] [3.970885] [] ? device_register+0x1e/0x30 [3.970892] [] ? drm_sysfs_device_add+0x86/0xb0 [drm] [3.970896] [] ? drm_get_minor+0x233/0x300 [drm] [3.970900] [] drm_get_pci_dev+0x189/0x2a0 [drm] [3.970902] [] ? migrate_enable+0x8b/0x1c0 [3.970910] [] i915_pci_probe+0x1b/0x1d [i915] [3.970913] [] local_pci_probe+0x5c/0xd0 [3.970915] [] pci_device_probe+0x109/0x130 [3.970917] [] driver_probe_device+0x9c/0x2b0 [3.970918] [] __driver_attach+0xab/0xb0 [3.970919] [] ? driver_probe_device+0x2b0/0x2b0 [3.970920] [] ? driver_probe_device+0x2b0/0x2b0 [3.970921] [] bus_for_each_dev+0x64/0xa0 [3.970923] [] driver_attach+0x1e/0x20 [3.970924] [] bus_add_driver+0x1b0/0x290 [3.970925] [] driver_register+0x76/0x140 [3.970927] [] __pci_register_driver+0x82/0x10
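The timing constraint being argued about, reduced to a sketch: the scanout-position registers and the clock must be sampled back-to-back so the (stime, etime) pair bounds the timestamp error. Mainline gets the non-preemptible window for free from its irq-disabled callers; on -rt the explicit preempt_disable() provides it, which is why everything inside must be non-sleeping. The helper name below is hypothetical.

#include <linux/ktime.h>
#include <linux/preempt.h>

/* hypothetical, non-sleeping register read; stands in for the raw register
 * accesses done by i915_get_crtc_scanoutpos() */
extern int read_scanout_registers(void *crtc, int *vpos, int *hpos);

static int sample_scanout_position(void *crtc, int *vpos, int *hpos,
				   ktime_t *stime, ktime_t *etime)
{
	int ret;

	preempt_disable();			/* bound the sampling window */
	*stime = ktime_get();			/* clock read just before... */
	ret = read_scanout_registers(crtc, vpos, hpos);
	*etime = ktime_get();			/* ...and just after the registers */
	preempt_enable();

	return ret;				/* etime - stime bounds the timestamp error */
}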
Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
On 09/11/2013 03:31 PM, Peter Hurley wrote: [+cc dri-devel] On 09/11/2013 11:38 AM, Steven Rostedt wrote: On Wed, 11 Sep 2013 11:16:43 -0400 Peter Hurley wrote: The funny part is, there's a comment there that shows that this was done even for "PREEMPT_RT". Unfortunately, the call to "get_scanout_position()" can call functions that use the rt-mutex "sleeping spin locks" and it breaks there. I guess we need to ask the authors of the mainline patch exactly why that preempt_disable() is needed? The drm core associates a timestamp with each vertical blank frame #. Drm drivers can optionally support a 'high resolution' hw timestamp. The vblank frame #/timestamp tuple is user-space visible. The i915 drm driver supports a hw timestamp via this drm helper function which computes the timestamp from the crtc scan position (based on the pixel clock). For mainline, the preempt_disable/_enable() isn't actually necessary because every call tree that leads here already has preemption disabled. For -RT, the maybe i915 register spinlock (uncore.lock) should be raw? No, it should not. Note, any other lock that can be held when it is held would also need to be raw. By that, you mean "any other lock" that might be claimed "would also need to be raw"? Hopefully not "any other lock" already held? And by taking a quick audit of the code, I see this: spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); /* Reset the chip */ /* GEN6_GDRST is not in the gt power well, no need to check * for fifo space for the write or forcewake the chip for * the read */ __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL); /* Spin waiting for the device to ack the reset request */ ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) & GEN6_GRDOM_FULL) == 0, 500); That spin is unacceptable in RT with preemption and interrupts disabled. Yep. That would be bad. AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in the force-wake set, so raw reads of the registers would probably be acceptable (thus obviating the need for claiming the uncore.lock). Except that _ALL_ register access is disabled with the uncore.lock during a gpu reset. Not sure if that's meant to include crtc registers or not, or what other synchronization/serialization issues are being handled/hidden by forcing all register accesses to wait during a gpu reset. Hopefully an i915 expert can weigh in here? Daniel, Can you shed some light on whether the i915+ crtc registers (specifically those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter()) read as part of the vblank counter/timestamp handling need to be prevented during gpu reset? The implied wait with preemption and interrupts disabled is causing grief in -RT, but also a 4ms wait inside an irq handler seems like a bad idea. Regards, Peter Hurley What's the real issue here? That the vblank timestamp needs to be an accurate measurement of a realtime event. Sleeping/servicing interrupts while reading the registers necessary to compute the timestamp would be bad too. (edit: which hopefully Mario Kleiner clarified in his reply) My point earlier was three-fold: 1. Don't need the preempt_disable() for mainline: all callers are already holding interrupt-disabling spinlocks. 2. -RT still needs to prevent scheduling there. 3. the problem is i915-specific. [update: the radeon driver should also BUG like the i915 driver but probably should have mmio_idx_lock spinlock as raw] ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: BUG: sleeping function called from invalid context on 3.10.10-rt7
On 09/17/2013 04:55 PM, Daniel Vetter wrote: On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley wrote: On 09/11/2013 03:31 PM, Peter Hurley wrote: [+cc dri-devel] On 09/11/2013 11:38 AM, Steven Rostedt wrote: On Wed, 11 Sep 2013 11:16:43 -0400 Peter Hurley wrote: The funny part is, there's a comment there that shows that this was done even for "PREEMPT_RT". Unfortunately, the call to "get_scanout_position()" can call functions that use the rt-mutex "sleeping spin locks" and it breaks there. I guess we need to ask the authors of the mainline patch exactly why that preempt_disable() is needed? The drm core associates a timestamp with each vertical blank frame #. Drm drivers can optionally support a 'high resolution' hw timestamp. The vblank frame #/timestamp tuple is user-space visible. The i915 drm driver supports a hw timestamp via this drm helper function which computes the timestamp from the crtc scan position (based on the pixel clock). For mainline, the preempt_disable/_enable() isn't actually necessary because every call tree that leads here already has preemption disabled. For -RT, the maybe i915 register spinlock (uncore.lock) should be raw? No, it should not. Note, any other lock that can be held when it is held would also need to be raw. By that, you mean "any other lock" that might be claimed "would also need to be raw"? Hopefully not "any other lock" already held? And by taking a quick audit of the code, I see this: spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); /* Reset the chip */ /* GEN6_GDRST is not in the gt power well, no need to check * for fifo space for the write or forcewake the chip for * the read */ __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL); /* Spin waiting for the device to ack the reset request */ ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) & GEN6_GRDOM_FULL) == 0, 500); That spin is unacceptable in RT with preemption and interrupts disabled. Yep. That would be bad. AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in the force-wake set, so raw reads of the registers would probably be acceptable (thus obviating the need for claiming the uncore.lock). Except that _ALL_ register access is disabled with the uncore.lock during a gpu reset. Not sure if that's meant to include crtc registers or not, or what other synchronization/serialization issues are being handled/hidden by forcing all register accesses to wait during a gpu reset. Hopefully an i915 expert can weigh in here? Daniel, Can you shed some light on whether the i915+ crtc registers (specifically those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter()) read as part of the vblank counter/timestamp handling need to be prevented during gpu reset? The depency here in the locking is a recent addition: commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a Author: Chris Wilson Date: Fri Jul 19 20:36:51 2013 +0100 drm/i915: Serialize almost all register access It's a (slightly) oversized hammer to work around a hardware issue - we could break it down to register blocks, which can be accessed concurrently, but that tends to be more fragile. But the chip really dies if you access (even just reads) the same block concurrently :( Ouch. But thanks for clarifying that. Ok, so register access needs to be serialized. And a separate but related concern is that gen6+ resets also need to hold-off register access where forcewake is required. 
While I was reviewing the registers that require forcewake handling, I saw this: from i915_reg.h: #define _DPLL_A (dev_priv->info->display_mmio_offset + 0x6014) #define _DPLL_B (dev_priv->info->display_mmio_offset + 0x6018) from i915_drv.c: static const struct intel_device_info intel_valleyview_m_info = { GEN7_FEATURES, .is_mobile = 1, .num_pipes = 2, .is_valleyview = 1, .display_mmio_offset = VLV_DISPLAY_BASE, <<<--- .has_llc = 0, /* legal, last one wins */ }; from intel_uncore.c: #define NEEDS_FORCE_WAKE(dev_priv, reg) \ ((HAS_FORCE_WAKE((dev_priv)->dev)) && \ ((reg) < 0x4) &&\ ((reg) != FORCEWAKE)) Is this is a mistake or do the valleyview PLLs not require the same forcewake handling as the other intel gpus? Regards, Peter Hurley We could try break the spinlock protected section a bit in the reset handler - register access on a hung gpu tends to be ill-defined anyway. The implied wait with preemption and interrupts disabled is causing grief in -RT, but also a 4ms wait inside an irq handler seems like a bad idea. Oops, the magic code in wait_for which is just there to make the imo totally misguided kgdb support work papered over the aweful long wait in atomic context ever since w
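To make the suggestion above concrete -- breaking up the spinlock-protected section of the gen6 reset handler -- here is a hedged sketch, not a proposed patch. The register and accessor names (__raw_i915_write32, GEN6_GDRST, GEN6_GRDOM_FULL, wait_for, uncore.lock) are taken from the code quoted earlier in this thread; whether dropping the lock around the poll is actually safe is exactly what the "register access on a hung gpu is ill-defined" remark is about.

static int gen6_reset_split_lock_sketch(struct drm_i915_private *dev_priv)
{
        unsigned long irqflags;

        /* Issue the reset request with the register lock held... */
        spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
        __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
        spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);

        /*
         * ...but poll for the ack with the lock dropped, so the CPU never
         * spins for up to 500 ms with interrupts off.
         */
        return wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
                         GEN6_GRDOM_FULL) == 0, 500);
}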
Re: [Nouveau] [PATCH 6/6] drm/nouveau: use MSI interrupts
iously the right thing to do. Well, we can't just go around breaking stuff deliberately for the people still using them! I've blacklisted them myself and merged the patch. Ben, This patch causes my dual-head Quadro FX570 (G84) to fail to idle when dragging a window around. It loops for the full timeout (15 sec.) in nouveau_gem_ioctl_cpu_prep() -- ie., never reaches required fence sequence #. lspci -vvv -nn 02:00.0 VGA compatible controller [0300]: NVIDIA Corporation G84 [Quadro FX 570] [10de:040e] (rev a1) (prog-if 00 [VGA controller]) Subsystem: NVIDIA Corporation Device [10de:0474] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 Kernel driver in use: nouveau Kernel modules: nouveau, nvidiafb Regards, Peter Hurley ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [Nouveau] [PATCH 6/6] drm/nouveau: use MSI interrupts
On 09/30/2013 01:27 PM, Peter Hurley wrote: On 09/03/2013 09:45 PM, Ben Skeggs wrote: Well, we can't just go around breaking stuff deliberately for the people still using them! I've blacklisted them myself and merged the patch. Ben, This patch causes my dual-head Quadro FX570 (G84) to fail to idle when dragging a window around. It loops for the full timeout (15 sec.) in nouveau_gem_ioctl_cpu_prep() -- ie., never reaches required fence sequence #. FWIW, I can get the binary 319.32 driver to run this hardware with MSIs on [1] and not crash in similar circumstances. But repeating the testing on 3.12.0-rc2/3 has proved challenging. Although I have the binary driver running on 3.12.0-rc2, the userspace won't fully come up and I haven't figured out why (it's an old userspace that I don't use regularly, so it could be some other problem). Regards, Peter Hurley [1] lspci -vvv -nn on 3.2.x 02:00.0 VGA compatible controller [0300]: NVIDIA Corporation G84GL [Quadro FX 570] [10de:040e] (rev a1) (prog-if 00 [VGA controller]) Subsystem: NVIDIA Corporation Device [10de:0474] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 Kernel driver in use: nvidia Kernel modules: nvidia_319_updates, nouveau, nvidiafb ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm: Fix DRM_FORCE_ON_DIGITAL use
A connector may be forced on from the command line via video= command line setting. The digital output of dual-mode connectors can also be specifically selected and forced on; eg., 'video=DVI-I-2:D'. However, in this case, the connector->status will be mistakenly set to connector_status_disconnected, and the connector will not be mode set. Fix the connector->status when connector->force is DRM_FORCE_ON_DIGITAL. Signed-off-by: Peter Hurley --- drivers/gpu/drm/drm_probe_helper.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c index 6857e9a..7483a47 100644 --- a/drivers/gpu/drm/drm_probe_helper.c +++ b/drivers/gpu/drm/drm_probe_helper.c @@ -118,7 +118,8 @@ static int drm_helper_probe_single_connector_modes_merge_bits(struct drm_connect mode->status = MODE_UNVERIFIED; if (connector->force) { - if (connector->force == DRM_FORCE_ON) + if (connector->force == DRM_FORCE_ON || + connector->force == DRM_FORCE_ON_DIGITAL) connector->status = connector_status_connected; else connector->status = connector_status_disconnected; -- 2.1.1
[PATCH] drm: Remove compiler BUG_ON() test
This code is only reached when modeset->num_connectors is already 0, so the BUG_ON() testing for a non-zero modeset->num_connectors can never fire; remove it. Signed-off-by: Peter Hurley --- drivers/gpu/drm/drm_fb_helper.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index d934346..7198257 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -1573,7 +1573,6 @@ static void drm_setup_crtcs(struct drm_fb_helper *fb_helper) modeset = &fb_helper->crtc_info[i].mode_set; if (modeset->num_connectors == 0) { BUG_ON(modeset->fb); - BUG_ON(modeset->num_connectors); if (modeset->mode) drm_mode_destroy(dev, modeset->mode); modeset->mode = NULL; -- 2.1.1
BUG caused by "Use new drm_fb_helper functions" series
Hi Archit, Just booting 4.4-rc5+, I got this splat [1] At first glance, this appears to be a simple fix. However, I'm concerned that fbcon functions, which may be called with interrupts disabled, are now hooked up to fbdev functions which may assume interrupts are not disabled (as is the case with cfb_imageblit()). For example, in the splat below, it's a simple fix to make the splat go away with GFP_ATOMIC allocation. However, the following fence wait is _never_ going to trigger with interrupts disabled on UP. FWIW, I've been running almost exclusively debug kernel builds so I'm not sure why this hasn't triggered many times before, but it hasn't. Regards, Peter Hurley [1] BUG splat [ 37.438494] BUG: sleeping function called from invalid context at /home/peter/src/kernels/mainline/mm/slub.c:1287 [ 37.438495] in_atomic(): 1, irqs_disabled(): 1, pid: 2276, name: auditd [ 37.438497] 1 lock held by auditd/2276: [ 37.438507] #0: (audit_cmd_mutex){+.+.+.}, at: [] audit_receive+0x1f/0xa0 [ 37.438507] irq event stamp: 1689 [ 37.438511] hardirqs last enabled at (1689): [] vprintk_emit+0x236/0x620 [ 37.438513] hardirqs last disabled at (1688): [] vprintk_emit+0xd4/0x620 [ 37.438518] softirqs last enabled at (1652): [] netlink_poll+0x138/0x1c0 [ 37.438520] softirqs last disabled at (1650): [] netlink_poll+0xf7/0x1c0 [ 37.438522] CPU: 7 PID: 2276 Comm: auditd Not tainted 4.4.0-rc5+wip-xeon+debug #rc5+wip [ 37.438523] Hardware name: Dell Inc. Precision WorkStation T5400 /0RW203, BIOS A11 04/30/2012 [ 37.438526] 81ce5cc8 8802a87c3590 813fb6c5 8802ac768000 [ 37.438528] 8802a87c35b8 810a6fb9 81ce5cc8 0507 [ 37.438530] 8802a87c35e0 810a70b9 024080c0 [ 37.438531] Call Trace: [ 37.438535] [] dump_stack+0x4e/0x79 [ 37.438538] [] ___might_sleep+0x149/0x200 [ 37.438540] [] __might_sleep+0x49/0x80 [ 37.438544] [] kmem_cache_alloc_trace+0x20d/0x2e0 [ 37.438600] [] ? nouveau_fence_new+0x3b/0x90 [nouveau] [ 37.438624] [] nouveau_fence_new+0x3b/0x90 [nouveau] [ 37.438649] [] nouveau_channel_idle+0x42/0xb0 [nouveau] [ 37.438673] [] nouveau_fbcon_sync+0x7f/0xb0 [nouveau] [ 37.438677] [] cfb_imageblit+0x9a/0x4d0 [ 37.438681] [] ? trace_hardirqs_off_caller+0x1f/0xc0 [ 37.438693] [] drm_fb_helper_cfb_imageblit+0xe/0x10 [drm_kms_helper] [ 37.438717] [] nouveau_fbcon_imageblit+0x51/0xd0 [nouveau] [ 37.438719] [] bit_putcs+0x2dc/0x530 [ 37.438721] [] ? trace_hardirqs_off+0xd/0x10 [ 37.438725] [] ? get_color.isra.15+0x34/0x130 [ 37.438727] [] fbcon_putcs+0x128/0x160 [ 37.438728] [] ? bit_cursor+0x5e0/0x5e0 [ 37.438730] [] fbcon_redraw.isra.25+0x16b/0x1d0 [ 37.438731] [] fbcon_scroll+0x1ea/0xce0 [ 37.438734] [] scrup+0x14a/0x160 [ 37.438736] [] lf+0x80/0x90 [ 37.438737] [] vt_console_print+0x2a7/0x3e0 [ 37.438739] [] call_console_drivers.constprop.24+0x144/0x1d0 [ 37.438741] [] console_unlock+0x463/0x540 [ 37.438742] [] vprintk_emit+0x35a/0x620 [ 37.438744] [] vprintk_default+0x29/0x40 [ 37.438748] [] printk+0x4d/0x4f [ 37.438750] [] audit_printk_skb+0x62/0x70 [ 37.438751] [] audit_log_end+0x1d4/0x2d0 [ 37.438752] [] ? audit_log_end+0x30/0x2d0 [ 37.438754] [] audit_log_config_change+0x89/0xa0 [ 37.438756] [] audit_receive_msg+0xa5a/0xbb0 [ 37.438759] [] ? mutex_lock_nested+0x2ed/0x450 [ 37.438761] [] ? audit_receive+0x1f/0xa0 [ 37.438762] [] ? audit_receive+0x1f/0xa0 [ 37.438764] [] audit_receive+0x52/0xa0 [ 37.438766] [] netlink_unicast+0xf2/0x1c0 [ 37.438767] [] netlink_sendmsg+0x3e7/0x620 [ 37.438771] [] sock_sendmsg+0x38/0x50 [ 37.438772] [] SYSC_sendto+0xf6/0x170 [ 37.438775] [] ? 
context_tracking_exit+0x1d/0x30 [ 37.438778] [] ? enter_from_user_mode+0x1f/0x50 [ 37.438780] [] ? syscall_trace_enter_phase1+0xcb/0x130 [ 37.438781] [] ? trace_hardirqs_on_thunk+0x17/0x19 [ 37.438784] [] SyS_sendto+0xe/0x10 [ 37.438786] [] entry_SYSCALL_64_fastpath+0x16/0x7a
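For reference, a minimal sketch of the "simple fix" mentioned above for the allocation half of this splat; the function shape is only an assumption about nouveau_fence_new(), and the gfp selection is the point. It deliberately does nothing about the fence wait further down the same path, which, as noted, still cannot complete with interrupts disabled.

/* Assumed shape of the allocation site in nouveau_fence_new(). */
static struct nouveau_fence *fence_alloc_sketch(void)
{
        /*
         * Pick the flags from the calling context so kzalloc() stops
         * complaining when fbcon calls in with IRQs off.  Caveats:
         * in_atomic() is not a reliable test on !CONFIG_PREEMPT kernels,
         * and this only silences the might_sleep() splat -- the
         * subsequent fence wait is still broken in this context.
         */
        gfp_t gfp = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL;

        return kzalloc(sizeof(struct nouveau_fence), gfp);
}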
BUG caused by "Use new drm_fb_helper functions" series
On 02/01/2016 09:20 PM, Archit Taneja wrote: > Hi Peter, > > On 02/02/2016 02:07 AM, Peter Hurley wrote: >> Hi Archit, >> >> Just booting 4.4-rc5+, I got this splat [1] >> At first glance, this appears to be a simple fix. > > Thanks for sharing this. > >> >> However, I'm concerned that fbcon functions, which may be called with >> interrupts disabled, are now hooked up to fbdev functions which may assume >> interrupts are not disabled (as is the case with cfb_imageblit()). > > In the case when CONFIG_FB is enabled, drm_fb_helper_cfb_imageblit > helper simply wraps around cfg_imageblit, so I don't see how we'd have > any difference in behaviour. Sorry; terrible attribution on my part. This bug clearly has nothing to do with this series. But a better look has me wondering how all these gpus are syncing the framebuffer for direct access via cfb_imageblit and friends. I only see nouveau and intel gma even trying. >> For example, in the splat below, it's a simple fix to make the splat go >> away with GFP_ATOMIC allocation. However, the following fence wait is _never_ >> going to trigger with interrupts disabled on UP. >> >> FWIW, I've been running almost exclusively debug kernel builds so I'm not >> sure why this hasn't triggered many times before, but it hasn't. > > We could revert the patch > "drm/nouveau: Use new drm_fb_helper functions" and see if we still hit this > issue. > > Thanks, > Archit > >> Regards, >> Peter Hurley >> >> >> [1] BUG splat >> >> [ 37.438494] BUG: sleeping function called from invalid context at >> /home/peter/src/kernels/mainline/mm/slub.c:1287 >> [ 37.438495] in_atomic(): 1, irqs_disabled(): 1, pid: 2276, name: auditd >> [ 37.438497] 1 lock held by auditd/2276: >> [ 37.438507] #0: (audit_cmd_mutex){+.+.+.}, at: [] >> audit_receive+0x1f/0xa0 >> [ 37.438507] irq event stamp: 1689 >> [ 37.438511] hardirqs last enabled at (1689): [] >> vprintk_emit+0x236/0x620 >> [ 37.438513] hardirqs last disabled at (1688): [] >> vprintk_emit+0xd4/0x620 >> [ 37.438518] softirqs last enabled at (1652): [] >> netlink_poll+0x138/0x1c0 >> [ 37.438520] softirqs last disabled at (1650): [] >> netlink_poll+0xf7/0x1c0 >> [ 37.438522] CPU: 7 PID: 2276 Comm: auditd Not tainted >> 4.4.0-rc5+wip-xeon+debug #rc5+wip >> [ 37.438523] Hardware name: Dell Inc. Precision WorkStation T5400 >> /0RW203, BIOS A11 04/30/2012 >> [ 37.438526] 81ce5cc8 8802a87c3590 813fb6c5 >> 8802ac768000 >> [ 37.438528] 8802a87c35b8 810a6fb9 81ce5cc8 >> 0507 >> [ 37.438530] 8802a87c35e0 810a70b9 >> 024080c0 >> [ 37.438531] Call Trace: >> [ 37.438535] [] dump_stack+0x4e/0x79 >> [ 37.438538] [] ___might_sleep+0x149/0x200 >> [ 37.438540] [] __might_sleep+0x49/0x80 >> [ 37.438544] [] kmem_cache_alloc_trace+0x20d/0x2e0 >> [ 37.438600] [] ? nouveau_fence_new+0x3b/0x90 [nouveau] >> [ 37.438624] [] nouveau_fence_new+0x3b/0x90 [nouveau] >> [ 37.438649] [] nouveau_channel_idle+0x42/0xb0 [nouveau] >> [ 37.438673] [] nouveau_fbcon_sync+0x7f/0xb0 [nouveau] >> [ 37.438677] [] cfb_imageblit+0x9a/0x4d0 >> [ 37.438681] [] ? trace_hardirqs_off_caller+0x1f/0xc0 >> [ 37.438693] [] drm_fb_helper_cfb_imageblit+0xe/0x10 >> [drm_kms_helper] >> [ 37.438717] [] nouveau_fbcon_imageblit+0x51/0xd0 >> [nouveau] >> [ 37.438719] [] bit_putcs+0x2dc/0x530 >> [ 37.438721] [] ? trace_hardirqs_off+0xd/0x10 >> [ 37.438725] [] ? get_color.isra.15+0x34/0x130 >> [ 37.438727] [] fbcon_putcs+0x128/0x160 >> [ 37.438728] [] ? 
bit_cursor+0x5e0/0x5e0 >> [ 37.438730] [] fbcon_redraw.isra.25+0x16b/0x1d0 >> [ 37.438731] [] fbcon_scroll+0x1ea/0xce0 >> [ 37.438734] [] scrup+0x14a/0x160 >> [ 37.438736] [] lf+0x80/0x90 >> [ 37.438737] [] vt_console_print+0x2a7/0x3e0 >> [ 37.438739] [] >> call_console_drivers.constprop.24+0x144/0x1d0 >> [ 37.438741] [] console_unlock+0x463/0x540 >> [ 37.438742] [] vprintk_emit+0x35a/0x620 >> [ 37.438744] [] vprintk_default+0x29/0x40 >> [ 37.438748] [] printk+0x4d/0x4f >> [ 37.438750] [] audit_printk_skb+0x62/0x70 >> [ 37.438751] [] audit_log_end+0x1d4/0x2d0 >> [ 37.438752] [] ? audit_log_end+0x30/0x2d0 >> [ 37.438754] [] audit_log_config_change+0x89/0xa0 >> [ 37.438756] [] audit_receive_msg+0xa5a/0xbb0 >> [ 37.438759]
[Intel-gfx] [PATCH] drm: Assert correct locking for drm_send_vblank_event
On 09/12/2014 01:25 PM, Daniel Vetter wrote: > On Fri, Sep 12, 2014 at 01:03:51PM -0400, Peter Hurley wrote: >> On 09/12/2014 12:04 PM, Chris Wilson wrote: >>> On Fri, Sep 12, 2014 at 05:34:56PM +0200, Daniel Vetter wrote: >>>> On Fri, Sep 12, 2014 at 04:23:29PM +0100, Chris Wilson wrote: >>>>> On Fri, Sep 12, 2014 at 03:40:56PM +0200, Daniel Vetter wrote: >>>>>> The comment says that the caller must hold the dev->event_lock >>>>>> spinlock, so let's enforce this. >>>>>> >>>>>> A quick audit over all driver shows that except for the one place in >>>>>> i915 which motivated this all callers fullfill this requirement >>>>>> already. >>>>> >>>>> Replace the rogue WARN_ON_SMP(!spin_is_locked(&dev->event_lock)) in >>>>> send_vblank_event() as well then. >>>> >>>> Meh, I've missed that one, that's actually better I think. I'll drop my >>>> patch here. >>> >>> I thought assert_spin_lock was the preferred form? >> >> Actually, lockdep_assert_held() is the preferred form. >> >> See https://lkml.org/lkml/2014/9/3/171 > > Which unfortunately doesn't warn for all the normal users which are not > insane enough to enable lockdep and so is totally useless to validate a > driver that runs on metric piles of different chips (with a resulting > combinatorial explosion of code-paths because hw designers are creative). > And we rely a lot on random drive-by testers to report such issues. I know. When I wrote [in that thread linked above]: On Wed, Sep 03, 2014 at 10:50:01AM -0400, Peter Hurley wrote: > So a lockdep-only assert is unlikely to draw attention to existing bugs, > especially in established drivers. here's the replies I got: Peter Zijlstra wrote: > By the same logic lockdep will not find locking errors in established > drivers. and On 09/04/2014 01:14 AM, Ingo Molnar wrote: > Indeed, this patch is ill-advised in several ways: > > - it extends an API variant that we want to phase > > - emits a warning even if say lockdep has already emitted a > warning and locking state is not guaranteed to be consistent. > > - makes the kernel more expensive once fully debugged, in that > non-fatal checks are unconditional. :/ Regards, Peter Hurley
[Intel-gfx] [PATCH] drm: Assert correct locking for drm_send_vblank_event
On 09/12/2014 12:04 PM, Chris Wilson wrote: > On Fri, Sep 12, 2014 at 05:34:56PM +0200, Daniel Vetter wrote: >> On Fri, Sep 12, 2014 at 04:23:29PM +0100, Chris Wilson wrote: >>> On Fri, Sep 12, 2014 at 03:40:56PM +0200, Daniel Vetter wrote: >>>> The comment says that the caller must hold the dev->event_lock >>>> spinlock, so let's enforce this. >>>> >>>> A quick audit over all driver shows that except for the one place in >>>> i915 which motivated this all callers fullfill this requirement >>>> already. >>> >>> Replace the rogue WARN_ON_SMP(!spin_is_locked(&dev->event_lock)) in >>> send_vblank_event() as well then. >> >> Meh, I've missed that one, that's actually better I think. I'll drop my >> patch here. > > I thought assert_spin_lock was the preferred form? Actually, lockdep_assert_held() is the preferred form. See https://lkml.org/lkml/2014/9/3/171 Regards, Peter Hurley
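For anyone following along, the three assertion styles being compared in this thread look like this (dev->event_lock is the lock from the patch under discussion; in practice one would pick exactly one of them):

static void send_vblank_event_assert_sketch(struct drm_device *dev)
{
        /* lockdep-only: free when lockdep is off, but then checks nothing */
        lockdep_assert_held(&dev->event_lock);

        /* the form being replaced: runtime WARN, compiled out on UP where
         * spin_is_locked() is meaningless */
        WARN_ON_SMP(!spin_is_locked(&dev->event_lock));

        /* Chris's suggestion: unconditional check that BUG()s if the
         * spinlock is not locked */
        assert_spin_locked(&dev->event_lock);
}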
page allocator bug in 3.16?
After several days uptime with a 3.16 kernel (generally running Thunderbird, emacs, kernel builds, several Chrome tabs on multiple desktop workspaces) I've been seeing some really extreme slowdowns. Mostly the slowdowns are associated with gpu-related tasks, like opening new emacs windows, switching workspaces, laughing at internet gifs, etc. Because this x86_64 desktop is nouveau-based, I didn't pursue it right away -- 3.15 is the first time suspend has worked reliably. This week I started looking into what the slowdown was and discovered it's happening during dma allocation through swiotlb (the cpus can do intel iommu but I don't use it because it's not the default for most users). I'm still working on a bisection but each step takes 8+ hours to validate and even then I'm no longer sure I still have the 'bad' commit in the bisection. [edit: yup, I started over] I just discovered a smattering of these in my logs and only on 3.16-rc+ kernels: Sep 25 07:57:59 thor kernel: [28786.001300] alloc_contig_range test_pages_isolated(2bf560, 2bf562) failed This dual-Xeon box has 10GB and sysrq Show Memory isn't showing heavy fragmentation [1]. Besides Mel's page allocator changes in 3.16, another suspect commit is: commit b13b1d2d8692b437203de7a404c6b809d2cc4d99 Author: Shaohua Li Date: Tue Apr 8 15:58:09 2014 +0800 x86/mm: In the PTE swapout page reclaim case clear the accessed bit instead of flushing the TLB Specifically, this statement: It could cause incorrect page aging and the (mistaken) reclaim of hot pages, but the chance of that should be relatively low. I'm wondering if this could cause worse-case behavior with TTM? I'm testing a revert of this on mainline 3.16-final now, with no results yet. Thoughts? Regards, Peter Hurley [1] SysRq : Show Memory Mem-Info: Node 0 DMA per-cpu: CPU0: hi:0, btch: 1 usd: 0 CPU1: hi:0, btch: 1 usd: 0 CPU2: hi:0, btch: 1 usd: 0 CPU3: hi:0, btch: 1 usd: 0 CPU4: hi:0, btch: 1 usd: 0 CPU5: hi:0, btch: 1 usd: 0 CPU6: hi:0, btch: 1 usd: 0 CPU7: hi:0, btch: 1 usd: 0 Node 0 DMA32 per-cpu: CPU0: hi: 186, btch: 31 usd: 18 CPU1: hi: 186, btch: 31 usd: 82 CPU2: hi: 186, btch: 31 usd: 46 CPU3: hi: 186, btch: 31 usd: 30 CPU4: hi: 186, btch: 31 usd: 18 CPU5: hi: 186, btch: 31 usd: 43 CPU6: hi: 186, btch: 31 usd: 157 CPU7: hi: 186, btch: 31 usd: 26 Node 0 Normal per-cpu: CPU0: hi: 186, btch: 31 usd: 25 CPU1: hi: 186, btch: 31 usd: 33 CPU2: hi: 186, btch: 31 usd: 28 CPU3: hi: 186, btch: 31 usd: 46 CPU4: hi: 186, btch: 31 usd: 23 CPU5: hi: 186, btch: 31 usd: 8 CPU6: hi: 186, btch: 31 usd: 112 CPU7: hi: 186, btch: 31 usd: 18 active_anon:382833 inactive_anon:12103 isolated_anon:0 active_file:1156997 inactive_file:733988 isolated_file:0 unevictable:15 dirty:35833 writeback:0 unstable:0 free:129383 slab_reclaimable:95038 slab_unreclaimable:11095 mapped:81924 shmem:12509 pagetables:9039 bounce:0 free_cma:0 Node 0 DMA free:15860kB min:104kB low:128kB high:156kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15960kB managed:15876kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
yes lowmem_reserve[]: 0 2974 9980 9980 Node 0 DMA32 free:166712kB min:20108kB low:25132kB high:30160kB active_anon:475548kB inactive_anon:15204kB active_file:1368716kB inactive_file:865832kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3127336kB managed:3048188kB mlocked:0kB dirty:38228kB writeback:0kB mapped:94340kB shmem:15436kB slab_reclaimable:116424kB slab_unreclaimable:12756kB kernel_stack:2512kB pagetables:11532kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 7006 7006 Node 0 Normal free:334960kB min:47368kB low:59208kB high:71052kB active_anon:1055784kB inactive_anon:33208kB active_file:3259272kB inactive_file:2070120kB unevictable:60kB isolated(anon):0kB isolated(file):0kB present:7340032kB managed:7174484kB mlocked:60kB dirty:105104kB writeback:0kB mapped:233356kB shmem:34600kB slab_reclaimable:263728kB slab_unreclaimable:31608kB kernel_stack:7344kB pagetables:24624kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 0 0 Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15860kB Node 0 DMA32: 209*4kB (UEM) 394*8kB (UEM) 303*16kB (UEM) 60*32kB (UEM) 314*64kB (UEM) 117*128kB (UEM) 9*256kB (EM) 3*512kB (UEM) 2*1
page allocator bug in 3.16?
On 09/25/2014 02:55 PM, Peter Hurley wrote: > After several days uptime with a 3.16 kernel (generally running > Thunderbird, emacs, kernel builds, several Chrome tabs on multiple > desktop workspaces) I've been seeing some really extreme slowdowns. > > Mostly the slowdowns are associated with gpu-related tasks, like > opening new emacs windows, switching workspaces, laughing at internet > gifs, etc. Because this x86_64 desktop is nouveau-based, I didn't pursue > it right away -- 3.15 is the first time suspend has worked reliably. > > This week I started looking into what the slowdown was and discovered > it's happening during dma allocation through swiotlb (the cpus can do > intel iommu but I don't use it because it's not the default for most users). > > I'm still working on a bisection but each step takes 8+ hours to > validate and even then I'm no longer sure I still have the 'bad' > commit in the bisection. [edit: yup, I started over] > > I just discovered a smattering of these in my logs and only on 3.16-rc+ > kernels: > Sep 25 07:57:59 thor kernel: [28786.001300] alloc_contig_range > test_pages_isolated(2bf560, 2bf562) failed > > This dual-Xeon box has 10GB and sysrq Show Memory isn't showing heavy > fragmentation [1]. It's swapping, which is crazy because there's 7+GB of file cache [1] which should be dropped before swapping. The alloc_contig_range() failure precedes the swapping but not immediately (44 mins. earlier). How I reproduce this is to simply do a full distro kernel build. Skipping the TLB flush is not the problem; the results below are from 3.16-final with that commit reverted. The slowdown is really obvious because workspace switching redraw takes multiple seconds to complete (all-cpu perf record of that below [2]) Regards, Peter Hurley [1] SysRq : Show Memory Mem-Info: Node 0 DMA per-cpu: CPU0: hi:0, btch: 1 usd: 0 CPU1: hi:0, btch: 1 usd: 0 CPU2: hi:0, btch: 1 usd: 0 CPU3: hi:0, btch: 1 usd: 0 CPU4: hi:0, btch: 1 usd: 0 CPU5: hi:0, btch: 1 usd: 0 CPU6: hi:0, btch: 1 usd: 0 CPU7: hi:0, btch: 1 usd: 0 Node 0 DMA32 per-cpu: CPU0: hi: 186, btch: 31 usd: 71 CPU1: hi: 186, btch: 31 usd: 166 CPU2: hi: 186, btch: 31 usd: 183 CPU3: hi: 186, btch: 31 usd: 109 CPU4: hi: 186, btch: 31 usd: 106 CPU5: hi: 186, btch: 31 usd: 161 CPU6: hi: 186, btch: 31 usd: 120 CPU7: hi: 186, btch: 31 usd: 54 Node 0 Normal per-cpu: CPU0: hi: 186, btch: 31 usd: 159 CPU1: hi: 186, btch: 31 usd: 66 CPU2: hi: 186, btch: 31 usd: 178 CPU3: hi: 186, btch: 31 usd: 173 CPU4: hi: 186, btch: 31 usd: 91 CPU5: hi: 186, btch: 31 usd: 57 CPU6: hi: 186, btch: 31 usd: 58 CPU7: hi: 186, btch: 31 usd: 158 active_anon:170368 inactive_anon:173964 isolated_anon:0 active_file:982209 inactive_file:973911 isolated_file:0 unevictable:15 dirty:15 writeback:1 unstable:0 free:96067 slab_reclaimable:107401 slab_unreclaimable:12572 mapped:58271 shmem:10857 pagetables:9898 bounce:0 free_cma:18 Node 0 DMA free:15860kB min:104kB low:128kB high:156kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15960kB managed:15876kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
yes lowmem_reserve[]: 0 2974 9980 9980 Node 0 DMA32 free:117740kB min:20108kB low:25132kB high:30160kB active_anon:205232kB inactive_anon:196308kB active_file:1186764kB inactive_file:1173760kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3127336kB managed:3048212kB mlocked:0kB dirty:24kB writeback:4kB mapped:71600kB shmem:8776kB slab_reclaimable:129132kB slab_unreclaimable:13468kB kernel_stack:2864kB pagetables:11536kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 7006 7006 Node 0 Normal free:250668kB min:47368kB low:59208kB high:71052kB active_anon:476240kB inactive_anon:499548kB active_file:2742072kB inactive_file:2721884kB unevictable:60kB isolated(anon):0kB isolated(file):0kB present:7340032kB managed:7174484kB mlocked:60kB dirty:36kB writeback:0kB mapped:161484kB shmem:34652kB slab_reclaimable:300472kB slab_unreclaimable:36804kB kernel_stack:7232kB pagetables:28056kB unstable:0kB bounce:0kB free_cma:72kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 0 0 Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15860kB Node 0 DMA32: 4099*4kB (UEM) 4372*8kB (UEM) 668*16kB (UEM)
page allocator bug in 3.16?
On 09/25/2014 04:33 PM, Alex Deucher wrote: > On Thu, Sep 25, 2014 at 2:55 PM, Peter Hurley > wrote: >> After several days uptime with a 3.16 kernel (generally running >> Thunderbird, emacs, kernel builds, several Chrome tabs on multiple >> desktop workspaces) I've been seeing some really extreme slowdowns. >> >> Mostly the slowdowns are associated with gpu-related tasks, like >> opening new emacs windows, switching workspaces, laughing at internet >> gifs, etc. Because this x86_64 desktop is nouveau-based, I didn't pursue >> it right away -- 3.15 is the first time suspend has worked reliably. >> >> This week I started looking into what the slowdown was and discovered >> it's happening during dma allocation through swiotlb (the cpus can do >> intel iommu but I don't use it because it's not the default for most users). >> >> I'm still working on a bisection but each step takes 8+ hours to >> validate and even then I'm no longer sure I still have the 'bad' >> commit in the bisection. [edit: yup, I started over] >> >> I just discovered a smattering of these in my logs and only on 3.16-rc+ >> kernels: >> Sep 25 07:57:59 thor kernel: [28786.001300] alloc_contig_range >> test_pages_isolated(2bf560, 2bf562) failed >> >> This dual-Xeon box has 10GB and sysrq Show Memory isn't showing heavy >> fragmentation [1]. >> >> Besides Mel's page allocator changes in 3.16, another suspect commit is: >> >> commit b13b1d2d8692b437203de7a404c6b809d2cc4d99 >> Author: Shaohua Li >> Date: Tue Apr 8 15:58:09 2014 +0800 >> >> x86/mm: In the PTE swapout page reclaim case clear the accessed bit >> instead of flushing the TLB >> >> Specifically, this statement: >> >> It could cause incorrect page aging and the (mistaken) reclaim of >> hot pages, but the chance of that should be relatively low. >> >> I'm wondering if this could cause worse-case behavior with TTM? I'm >> testing a revert of this on mainline 3.16-final now, with no results yet. >> >> Thoughts? > > You may also be seeing this: > https://lkml.org/lkml/2014/8/8/445 Thanks Alex. That is indeed the problem. Still reading the email thread to find out where the patches are that fix this. Although it doesn't make much sense to me that nouveau sets up a 1GB GART and then uses TTM which is trying to shove all the DMA through a 16MB CMA window (which turns out to be the base Ubuntu config). Regards, Peter Hurley
page allocator bug in 3.16?
On 09/25/2014 03:35 PM, Chuck Ebbert wrote: > There are six ttm patches queued for 3.16.4: > > drm-ttm-choose-a-pool-to-shrink-correctly-in-ttm_dma_pool_shrink_scan.patch > drm-ttm-fix-handling-of-ttm_pl_flag_topdown-v2.patch > drm-ttm-fix-possible-division-by-0-in-ttm_dma_pool_shrink_scan.patch > drm-ttm-fix-possible-stack-overflow-by-recursive-shrinker-calls.patch > drm-ttm-pass-gfp-flags-in-order-to-avoid-deadlock.patch > drm-ttm-use-mutex_trylock-to-avoid-deadlock-inside-shrinker-functions.patch Thanks for info, Chuck. Unfortunately, none of these fix TTM dma allocation doing CMA dma allocation, which is the root problem. Regards, Peter Hurley
page allocator bug in 3.16?
[ +cc Leann Ogasawara, Marek Szyprowski, Kyungmin Park, Arnd Bergmann ] On 09/26/2014 08:40 AM, Rik van Riel wrote: > On 09/26/2014 08:28 AM, Rob Clark wrote: >> On Fri, Sep 26, 2014 at 6:45 AM, Thomas Hellstrom >> wrote: >>> On 09/26/2014 12:40 PM, Chuck Ebbert wrote: >>>> On Fri, 26 Sep 2014 09:15:57 +0200 Thomas Hellstrom >>>> wrote: >>>> >>>>> On 09/26/2014 01:52 AM, Peter Hurley wrote: >>>>>> On 09/25/2014 03:35 PM, Chuck Ebbert wrote: >>>>>>> There are six ttm patches queued for 3.16.4: >>>>>>> >>>>>>> drm-ttm-choose-a-pool-to-shrink-correctly-in-ttm_dma_pool_shrink_scan.patch >>>>>>> >>>>>>> > drm-ttm-fix-handling-of-ttm_pl_flag_topdown-v2.patch >>>>>>> drm-ttm-fix-possible-division-by-0-in-ttm_dma_pool_shrink_scan.patch >>>>>>> >>>>>>> > drm-ttm-fix-possible-stack-overflow-by-recursive-shrinker-calls.patch >>>>>>> drm-ttm-pass-gfp-flags-in-order-to-avoid-deadlock.patch >>>>>>> drm-ttm-use-mutex_trylock-to-avoid-deadlock-inside-shrinker-functions.patch >>>>>> >>>>>>> > Thanks for info, Chuck. >>>>>> >>>>>> Unfortunately, none of these fix TTM dma allocation doing >>>>>> CMA dma allocation, which is the root problem. >>>>>> >>>>>> Regards, Peter Hurley >>>>> The problem is not really in TTM but in CMA, There was a guy >>>>> offering to fix this in the CMA code but I guess he didn't >>>>> probably because he didn't receive any feedback. >>>>> >>>> Yeah, the "solution" to this problem seems to be "don't enable >>>> CMA on x86". Maybe it should even be disabled in the config >>>> system. >>> Or, as previously suggested, don't use CMA for order 0 (single >>> page) allocations >> >> On devices that actually need CMA pools to arrange for memory to be >> in certain ranges, I think you probably do want to have order 0 >> pages come from the CMA pool. >> >> Seems like disabling CMA on x86 (where it should be unneeded) is >> the better way, IMO > > CMA has its uses on x86. For example, CMA is used to allocate 1GB huge > pages. > > There may also be people with devices that do not scatter-gather, and > need a large physically contiguous buffer, though there should be > relatively few of those on x86. > > I suspect it makes most sense to do DMA allocations up to PAGE_ORDER > through the normal allocator on x86, and only invoking CMA for larger > allocations. The code that uses CMA to satisfy DMA allocations on x86 is specific to the x86 arch and was added in 2011 as a means of _testing_ CMA in KVM: commit 0a2b9a6ea93650b8a00f9fd5ee8fdd25671e2df6 Author: Marek Szyprowski Date: Thu Dec 29 13:09:51 2011 +0100 X86: integrate CMA with DMA-mapping subsystem This patch adds support for CMA to dma-mapping subsystem for x86 architecture that uses common pci-dma/pci-nommu implementation. This allows to test CMA on KVM/QEMU and a lot of common x86 boxes. Signed-off-by: Marek Szyprowski Signed-off-by: Kyungmin Park CC: Michal Nazarewicz Acked-by: Arnd Bergmann (no x86 maintainer acks?). Unfortunately, this code is enabled whenever CMA is enabled, rather than as a separate test configuration. So, while enabling CMA may have other purposes on x86, using it for x86 swiotlb and nommu dma allocations is not one of the them. And Ubuntu should not be enabling CONFIG_DMA_CMA for their i386 and amd64 configurations, as this is trying to drive _all_ dma mapping allocations through a _very_ small window (which is killing GPU performance). Regards, Peter Hurley
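To illustrate why enabling CONFIG_DMA_CMA has this effect, here is a simplified sketch of the allocation order the commit above wires into the x86 coherent-DMA path. The helper name is hypothetical and the details are from memory of the 3.16-era code (dma_alloc_from_contiguous() took dev/count/order then); the point is only that the small CMA area is tried first, so it sits in front of every order-0 TTM DMA page.

#include <linux/device.h>
#include <linux/dma-contiguous.h>
#include <linux/gfp.h>

static struct page *coherent_alloc_sketch(struct device *dev, size_t size,
                                          gfp_t gfp)
{
        unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
        struct page *page = NULL;

        if (!(gfp & GFP_ATOMIC))        /* CMA allocation can sleep (migration) */
                page = dma_alloc_from_contiguous(dev, count, get_order(size));
        if (!page)                      /* fall back to the page allocator */
                page = alloc_pages_node(dev_to_node(dev), gfp, get_order(size));

        return page;
}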
page allocator bug in 3.16?
On 09/26/2014 11:12 AM, Leann Ogasawara wrote: > On Fri, Sep 26, 2014 at 7:10 AM, Peter Hurley > wrote: >> [ +cc Leann Ogasawara, Marek Szyprowski, Kyungmin Park, Arnd Bergmann ] >> >> On 09/26/2014 08:40 AM, Rik van Riel wrote: >>> On 09/26/2014 08:28 AM, Rob Clark wrote: >>>> On Fri, Sep 26, 2014 at 6:45 AM, Thomas Hellstrom >>>> wrote: >>>>> On 09/26/2014 12:40 PM, Chuck Ebbert wrote: >>>>>> On Fri, 26 Sep 2014 09:15:57 +0200 Thomas Hellstrom >>>>>> wrote: >>>>>> >>>>>>> On 09/26/2014 01:52 AM, Peter Hurley wrote: >>>>>>>> On 09/25/2014 03:35 PM, Chuck Ebbert wrote: >>>>>>>>> There are six ttm patches queued for 3.16.4: >>>>>>>>> >>>>>>>>> drm-ttm-choose-a-pool-to-shrink-correctly-in-ttm_dma_pool_shrink_scan.patch >>>>>>>>> >>>>>>>>> >>> drm-ttm-fix-handling-of-ttm_pl_flag_topdown-v2.patch >>>>>>>>> drm-ttm-fix-possible-division-by-0-in-ttm_dma_pool_shrink_scan.patch >>>>>>>>> >>>>>>>>> >>> drm-ttm-fix-possible-stack-overflow-by-recursive-shrinker-calls.patch >>>>>>>>> drm-ttm-pass-gfp-flags-in-order-to-avoid-deadlock.patch >>>>>>>>> drm-ttm-use-mutex_trylock-to-avoid-deadlock-inside-shrinker-functions.patch >>>>>>>> >>>>>>>>> >>> Thanks for info, Chuck. >>>>>>>> >>>>>>>> Unfortunately, none of these fix TTM dma allocation doing >>>>>>>> CMA dma allocation, which is the root problem. >>>>>>>> >>>>>>>> Regards, Peter Hurley >>>>>>> The problem is not really in TTM but in CMA, There was a guy >>>>>>> offering to fix this in the CMA code but I guess he didn't >>>>>>> probably because he didn't receive any feedback. >>>>>>> >>>>>> Yeah, the "solution" to this problem seems to be "don't enable >>>>>> CMA on x86". Maybe it should even be disabled in the config >>>>>> system. >>>>> Or, as previously suggested, don't use CMA for order 0 (single >>>>> page) allocations >>>> >>>> On devices that actually need CMA pools to arrange for memory to be >>>> in certain ranges, I think you probably do want to have order 0 >>>> pages come from the CMA pool. >>>> >>>> Seems like disabling CMA on x86 (where it should be unneeded) is >>>> the better way, IMO >>> >>> CMA has its uses on x86. For example, CMA is used to allocate 1GB huge >>> pages. >>> >>> There may also be people with devices that do not scatter-gather, and >>> need a large physically contiguous buffer, though there should be >>> relatively few of those on x86. >>> >>> I suspect it makes most sense to do DMA allocations up to PAGE_ORDER >>> through the normal allocator on x86, and only invoking CMA for larger >>> allocations. >> >> The code that uses CMA to satisfy DMA allocations on x86 is >> specific to the x86 arch and was added in 2011 as a means of _testing_ >> CMA in KVM: >> >> commit 0a2b9a6ea93650b8a00f9fd5ee8fdd25671e2df6 >> Author: Marek Szyprowski >> Date: Thu Dec 29 13:09:51 2011 +0100 >> >> X86: integrate CMA with DMA-mapping subsystem >> >> This patch adds support for CMA to dma-mapping subsystem for x86 >> architecture that uses common pci-dma/pci-nommu implementation. This >> allows to test CMA on KVM/QEMU and a lot of common x86 boxes. >> >> Signed-off-by: Marek Szyprowski >> Signed-off-by: Kyungmin Park >> CC: Michal Nazarewicz >> Acked-by: Arnd Bergmann >> >> (no x86 maintainer acks?). >> >> Unfortunately, this code is enabled whenever CMA is enabled, rather >> than as a separate test configuration. >> >> So, while enabling CMA may have other purposes on x86, using it for >> x86 swiotlb and nommu dma allocations is not one of the them. 
>> >> And Ubuntu should not be enabling CONFIG_DMA_CMA for their i386 >> and amd64 configurations, as this is trying to drive _all_ dma mapping >> allocations through a _very_ small window (which is killing GPU >> performance). > > Thanks for the note Peter. We do have this disabled for our upcoming > Ubuntu 14.10 release. It is however still enabled in the previous 14.04 > release. We have been tracking this in > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1362261 but users > able to reproduce performance impacts in 14.10 were unable to reproduce > in 14.04 which is why we hadn't yet disabled it there. Leann, Thanks for that important clue. The missing piece specific to 3.16+ is these patches which impact every iommu config: Akinobu Mita (5): x86: make dma_alloc_coherent() return zeroed memory if CMA is enabled x86: enable DMA CMA with swiotlb intel-iommu: integrate DMA CMA memblock: introduce memblock_alloc_range() cma: add placement specifier for "cma=" kernel parameter These patches take the pre-existing nommu CMA test configuration and hook it up to all the x86 iommus, effectively reducing 10GB of DMA-able memory to 64MB, and hooks it all up to an allocator that's not nearly as effective as the page allocator. All to enable DMA allocation below 4GB which is already supported with the GFP_DMA32 flag to dma_alloc_coherent(). Regards, Peter Hurley
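As a minimal sketch of the GFP_DMA32 point above: a driver that merely needs its coherent buffers below 4GB can already express that through the DMA mask machinery, with no CMA region involved. 'my_dev' stands in for whatever struct device the driver owns.

#include <linux/dma-mapping.h>

static void *alloc_below_4g_sketch(struct device *my_dev, size_t size,
                                   dma_addr_t *handle)
{
        /* constrain coherent allocations to 32-bit bus addresses */
        if (dma_set_coherent_mask(my_dev, DMA_BIT_MASK(32)))
                return NULL;

        /* the DMA core honours the mask (e.g. via GFP_DMA32) as needed */
        return dma_alloc_coherent(my_dev, size, handle, GFP_KERNEL);
}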
[Nouveau] [PATCH] drm/nouveau: fix vblank deadlock
On 08/12/2013 07:50 AM, Maarten Lankhorst wrote: > This fixes a deadlock inversion when vblank is enabled/disabled by drm. > &dev->vblank_time_lock is always taken when the vblank state is toggled, > which caused a deadlock when &event->lock was also taken during > event_get/put. > > Solve the race by requiring that lock to change enable/disable state, > and always keeping vblank on the event list. Core drm ignores unwanted > vblanks, so extra calls to drm_handle_vblank are harmless. I don't feel this is the appropriate solution to the lock inversion between vblank_time_lock and event->lock. Preferably drm core should correct the interface layer bug; ie., calling into a sub-driver holding a lock _and_ requiring the sub-driver to call a drm helper function which claims the same lock is bad design. The console lock suffers from the same design flaw and is a constant problem. Alternatively, the event trigger could be lockless; ie., the event list could be an RCU list instead. In this way, the event->lock does not need to be claimed, and thus no lock inversion is possible. The main drawback here is that currently the event->lock enforces non-overlapping lifetimes between the event handler and the event. Untangling object lifetimes in nouveau is a non-trivial exercise. Regards, Peter Hurley > Signed-off-by: Maarten Lankhorst > --- > diff --git a/drivers/gpu/drm/nouveau/core/core/event.c > b/drivers/gpu/drm/nouveau/core/core/event.c > index 7eb81c1..78bff7c 100644 > --- a/drivers/gpu/drm/nouveau/core/core/event.c > +++ b/drivers/gpu/drm/nouveau/core/core/event.c > @@ -23,43 +23,64 @@ > #include > #include > > -static void > -nouveau_event_put_locked(struct nouveau_event *event, int index, > - struct nouveau_eventh *handler) > -{ > - if (!--event->index[index].refs) { > - if (event->disable) > - event->disable(event, index); > - } > - list_del(&handler->head); > -} > - > void > nouveau_event_put(struct nouveau_event *event, int index, > struct nouveau_eventh *handler) > { > unsigned long flags; > > + if (index >= event->index_nr) > + return; > + > spin_lock_irqsave(&event->lock, flags); > - if (index < event->index_nr) > - nouveau_event_put_locked(event, index, handler); > + list_del(&handler->head); > + > + if (event->toggle_lock) > + spin_lock(event->toggle_lock); > + nouveau_event_disable_locked(event, index, 1); > + if (event->toggle_lock) > + spin_unlock(event->toggle_lock); > + > spin_unlock_irqrestore(&event->lock, flags); > } > > void > +nouveau_event_enable_locked(struct nouveau_event *event, int index) > +{ > + if (index >= event->index_nr) > + return; > + > + if (!event->index[index].refs++ && event->enable) > + event->enable(event, index); > +} > + > +void > +nouveau_event_disable_locked(struct nouveau_event *event, int index, int > refs) > +{ > + if (index >= event->index_nr) > + return; > + > + event->index[index].refs -= refs; > + if (!event->index[index].refs && event->disable) > + event->disable(event, index); > +} > + > +void > nouveau_event_get(struct nouveau_event *event, int index, > struct nouveau_eventh *handler) > { > unsigned long flags; > > + if (index >= event->index_nr) > + return; > + > spin_lock_irqsave(&event->lock, flags); > - if (index < event->index_nr) { > - list_add(&handler->head, &event->index[index].list); > - if (!event->index[index].refs++) { > - if (event->enable) > - event->enable(event, index); > - } > - } > + list_add(&handler->head, &event->index[index].list); > + if (event->toggle_lock) > + spin_lock(event->toggle_lock); > + 
nouveau_event_enable_locked(event, index); > + if (event->toggle_lock) > + spin_unlock(event->toggle_lock); > spin_unlock_irqrestore(&event->lock, flags); > } > > @@ -68,6 +89,7 @@ nouveau_event_trigger(struct nouveau_event *event, int > index) > { > struct nouveau_eventh *handler, *temp; > unsigned long flags; > + int refs = 0; > > if (index >= event->index_nr) > return; > @@ -75,9 +97,17 @@ nouveau_event_
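Sketch of the lockless trigger suggested above (and what the "Convert event handler list to RCU" series further down implements): if the handler list is RCU-protected, the trigger path can walk it without taking event->lock, removing the inversion against vblank_time_lock. Structure and field names follow the nouveau code quoted in this thread; it assumes handlers are added/removed with list_add_rcu()/list_del_rcu() under event->lock and freed only after a grace period, and it deliberately omits handling of the NVKM_EVENT_DROP return.

static void nouveau_event_trigger_rcu_sketch(struct nouveau_event *event,
                                             int index)
{
        struct nouveau_eventh *handler;

        if (index >= event->index_nr)
                return;

        rcu_read_lock();
        list_for_each_entry_rcu(handler, &event->index[index].list, head)
                handler->func(handler, index);
        rcu_read_unlock();
}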
[PATCH] drm: Use correct spinlock flavor in drm_vblank_get()
The irq flags state is already established by the outer spin_lock_irqsave(); re-disabling irqs is redundant. Signed-off-by: Peter Hurley --- drivers/gpu/drm/drm_irq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index f92da0a..c7d4af5 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -952,13 +952,13 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) */ int drm_vblank_get(struct drm_device *dev, int crtc) { - unsigned long irqflags, irqflags2; + unsigned long irqflags; int ret = 0; spin_lock_irqsave(&dev->vbl_lock, irqflags); /* Going from 0->1 means we have to enable interrupts again */ if (atomic_add_return(1, &dev->vblank_refcount[crtc]) == 1) { - spin_lock_irqsave(&dev->vblank_time_lock, irqflags2); + spin_lock(&dev->vblank_time_lock); if (!dev->vblank_enabled[crtc]) { /* Enable vblank irqs under vblank_time_lock protection. * All vblank count & timestamp updates are held off @@ -976,7 +976,7 @@ int drm_vblank_get(struct drm_device *dev, int crtc) drm_update_vblank_count(dev, crtc); } } - spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags2); + spin_unlock(&dev->vblank_time_lock); } else { if (!dev->vblank_enabled[crtc]) { atomic_dec(&dev->vblank_refcount[crtc]); -- 1.8.1.2
[PATCH 0/9] drm/nouveau: Cleanup event/handler design
This series was originally motivated by a deadlock, introduced in commit 1d7c71a3e2f77336df536855b0efd2dc5bdeb41b 'drm/nouveau/disp: port vblank handling to event interface', due to inverted lock order between nouveau_drm_vblank_enable() and nouveau_drm_vblank_handler() (the complete lockdep report is included in the patch 4/5 changelog). Because this series fixes the vblank event deadlock, it is a competing solution to Maarten Lankhorst's 'drm/nouveau: fix vblank deadlock'. This series fixes the vblank event deadlock by converting the event handler list to RCU. This solution allows the event trigger to be lockless, and thus avoiding the lock inversion. Typical hurdles to RCU conversion include: 1) ensuring the object lifetime exceeds the lockless access; 2) preventing premature object reuse; and 3) verifying that stale object use not present logical errors. Because object reuse is not safe in RCU-based operations, the nouveau_event_get/_put interface is migrated from add/remove semantics to enable/disable semantics with a separate install/remove step (which also serves to document the handler lifetime). This also corrects an unsafe interface design where handlers can mistakenly be reused while in use. The nouveau driver currently supports 4 events -- gpio, uevent, cevent, and vblank. Every event is created by and stored within its respective subdev/engine object -- gpio in the GPIO subdev, uevent and cevent in the FIFO engine, and vblank in the DISP engine. Thus event lifetime extends until the subdev is destructed during devobj teardown. Event handler lifetime varies and is detailed in the table below (apologies for the wide-format): Event Handler function Container Until - --- -- gpio nouveau_connector_hotplug struct nouveau_connector nouveau_connector_destroy uevent nouveau_fence_wait_uevent_handler local stack object nouveau_fence_wait_uevent returns cevent none n/a n/a vblank nouveau_drm_vblank_handler struct nouveau_drm nouveau_drm_remove vblank nv50_software_vblsem_release struct nouveau_software_chan _nouveau_engctx_dtor (call stack originates with nouveau_abi16_chan_free ioctl) vblank nvc0_software_vblsem_release struct nouveau_software_chan same as above RCU lifetime considerations for handlers: Event HandlerLifetime - - gpio nouveau_connector_hotplug kfree_rcu(nv_connector) uevent nouveau_fence_wait_uevent_handler explicit use of nouveau_event_hander_create/_destroy cevent none n/a vblank nouveau_drm_vblank_handler synchronize_rcu() in nouveau_drm_unload vblank nv50_software_vblsem_release synchronize_rcu() in container dtor vblank nvc0_software_vblsem_release synchronize_rcu() in container dtor synchronize_rcu() is used for nouveau_object-based containers for which kfree_rcu() is not suitable/possible. Stale event handler execution is not a concern for the existing handlers because either: 1) the handler is responsible for disabling itself (via returning NVKM_EVENT_DROP), or 2) the existing handler can already be stale, as is the case with nouveau_connector_hotplug, which only schedules a work item, and nouveau_drm_vblank_handler, which the drm core expects may be stale. 
Peter Hurley (9): drm/nouveau: Add priv field for event handlers drm/nouveau: Move event index check from critical section drm/nouveau: Allocate local event handlers drm/nouveau: Allow asymmetric nouveau_event_get/_put drm/nouveau: Add install/remove semantics for event handlers drm/nouveau: Convert event handler list to RCU drm/nouveau: Fold nouveau_event_put_locked into caller drm/nouveau: Simplify event interface drm/nouveau: Simplify event handler interface drivers/gpu/drm/nouveau/core/core/event.c | 121 + .../gpu/drm/nouveau/core/engine/software/nv50.c| 32 -- .../gpu/drm/nouveau/core/engine/software/nvc0.c| 32 -- drivers/gpu/drm/nouveau/core/include/core/event.h | 27 - .../gpu/drm/nouveau/core/include/engine/software.h | 2 +- drivers/gpu/drm/nouveau/nouveau_connector.c| 16 ++- drivers/gpu/drm/nouveau/nouveau_display.c | 16 +-- drivers/gpu/drm/nouveau/nouveau_drm.c | 35 +++--- drivers/gpu/drm/nouveau/nouveau_fence.c| 27 ++--- 9 files changed, 220 insertions(+), 88 deletions(-) -- 1.8.1.2
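As a hedged usage sketch of the handler lifecycle this series establishes -- install once, enable/disable freely, remove at teardown -- using the interface names from the patches below; 'container', 'my_priv' and 'my_handler_fn' are illustrative only:

/* illustrative handler; returning NVKM_EVENT_KEEP leaves it active */
static int my_handler_fn(struct nouveau_eventh *handler, int index)
{
        /* handler->priv carries the caller's context (patch 1/9) */
        return NVKM_EVENT_KEEP;
}

/* init: install the handler embedded in a long-lived container */
nouveau_event_handler_install(pdisp->vblank, head, my_handler_fn,
                              my_priv, &container->handler);

/* enable/disable as often as needed; asymmetric calls are now safe */
nouveau_event_get(pdisp->vblank, head, &container->handler);
nouveau_event_put(pdisp->vblank, head, &container->handler);

/* teardown: remove, then free the container only after an RCU grace period */
nouveau_event_handler_remove(pdisp->vblank, head, &container->handler);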
[PATCH 1/9] drm/nouveau: Add priv field for event handlers
Provide private field for event handlers exclusive use. Convert nouveau_fence_wait_uevent() and nouveau_fence_wait_uevent_handler(); drop struct nouveau_fence_uevent. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/include/core/event.h | 1 + drivers/gpu/drm/nouveau/nouveau_fence.c | 20 +++- 2 files changed, 8 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/include/core/event.h b/drivers/gpu/drm/nouveau/core/include/core/event.h index 9e09440..ad4d8c2 100644 --- a/drivers/gpu/drm/nouveau/core/include/core/event.h +++ b/drivers/gpu/drm/nouveau/core/include/core/event.h @@ -7,6 +7,7 @@ struct nouveau_eventh { struct list_head head; + void *priv; int (*func)(struct nouveau_eventh *, int index); }; diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c index be31499..c2e3167 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -165,17 +165,11 @@ nouveau_fence_done(struct nouveau_fence *fence) return !fence->channel; } -struct nouveau_fence_uevent { - struct nouveau_eventh handler; - struct nouveau_fence_priv *priv; -}; - static int -nouveau_fence_wait_uevent_handler(struct nouveau_eventh *event, int index) +nouveau_fence_wait_uevent_handler(struct nouveau_eventh *handler, int index) { - struct nouveau_fence_uevent *uevent = - container_of(event, struct nouveau_fence_uevent, handler); - wake_up_all(&uevent->priv->waiting); + struct nouveau_fence_priv *priv = handler->priv; + wake_up_all(&priv->waiting); return NVKM_EVENT_KEEP; } @@ -186,13 +180,13 @@ nouveau_fence_wait_uevent(struct nouveau_fence *fence, bool intr) struct nouveau_channel *chan = fence->channel; struct nouveau_fifo *pfifo = nouveau_fifo(chan->drm->device); struct nouveau_fence_priv *priv = chan->drm->fence; - struct nouveau_fence_uevent uevent = { - .handler.func = nouveau_fence_wait_uevent_handler, + struct nouveau_eventh handler = { + .func = nouveau_fence_wait_uevent_handler, .priv = priv, }; int ret = 0; - nouveau_event_get(pfifo->uevent, 0, &uevent.handler); + nouveau_event_get(pfifo->uevent, 0, &handler); if (fence->timeout) { unsigned long timeout = fence->timeout - jiffies; @@ -224,7 +218,7 @@ nouveau_fence_wait_uevent(struct nouveau_fence *fence, bool intr) } } - nouveau_event_put(pfifo->uevent, 0, &uevent.handler); + nouveau_event_put(pfifo->uevent, 0, &handler); if (unlikely(ret < 0)) return ret; -- 1.8.1.2
[PATCH 2/9] drm/nouveau: Move event index check from critical section
The index_nr field is constant for the lifetime of the event, so serialized access is unnecessary. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 19 +++ 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index 7eb81c1..e69c463 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -40,9 +40,11 @@ nouveau_event_put(struct nouveau_event *event, int index, { unsigned long flags; + if (index >= event->index_nr) + return; + spin_lock_irqsave(&event->lock, flags); - if (index < event->index_nr) - nouveau_event_put_locked(event, index, handler); + nouveau_event_put_locked(event, index, handler); spin_unlock_irqrestore(&event->lock, flags); } @@ -52,13 +54,14 @@ nouveau_event_get(struct nouveau_event *event, int index, { unsigned long flags; + if (index >= event->index_nr) + return; + spin_lock_irqsave(&event->lock, flags); - if (index < event->index_nr) { - list_add(&handler->head, &event->index[index].list); - if (!event->index[index].refs++) { - if (event->enable) - event->enable(event, index); - } + list_add(&handler->head, &event->index[index].list); + if (!event->index[index].refs++) { + if (event->enable) + event->enable(event, index); } spin_unlock_irqrestore(&event->lock, flags); } -- 1.8.1.2
[PATCH 3/9] drm/nouveau: Allocate local event handlers
Prepare for transition to RCU-based event handler list; since RCU list traversal may have stale pointers, local storage may go out of scope before handler completes. Introduce nouveau_event_handler_create/_destroy which provides suitable semantics for multiple, temporary event handlers. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 24 +++ drivers/gpu/drm/nouveau/core/include/core/event.h | 6 ++ drivers/gpu/drm/nouveau/nouveau_fence.c | 15 +++--- 3 files changed, 38 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index e69c463..1a8d685 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -23,6 +23,30 @@ #include #include +int +nouveau_event_handler_create(struct nouveau_event *event, int index, +int (*func)(struct nouveau_eventh*, int), +void *priv, struct nouveau_eventh **phandler) +{ + struct nouveau_eventh *handler; + + handler = *phandler = kzalloc(sizeof(*handler), GFP_KERNEL); + if (!handler) + return -ENOMEM; + handler->func = func; + handler->priv = priv; + nouveau_event_get(event, index, handler); + return 0; +} + +void +nouveau_event_handler_destroy(struct nouveau_event *event, int index, + struct nouveau_eventh *handler) +{ + nouveau_event_put(event, index, handler); + kfree(handler); +} + static void nouveau_event_put_locked(struct nouveau_event *event, int index, struct nouveau_eventh *handler) diff --git a/drivers/gpu/drm/nouveau/core/include/core/event.h b/drivers/gpu/drm/nouveau/core/include/core/event.h index ad4d8c2..bdf1a0a 100644 --- a/drivers/gpu/drm/nouveau/core/include/core/event.h +++ b/drivers/gpu/drm/nouveau/core/include/core/event.h @@ -34,4 +34,10 @@ void nouveau_event_get(struct nouveau_event *, int index, void nouveau_event_put(struct nouveau_event *, int index, struct nouveau_eventh *); +int nouveau_event_handler_create(struct nouveau_event *, int index, +int (*func)(struct nouveau_eventh*, int), +void *priv, struct nouveau_eventh **); +void nouveau_event_handler_destroy(struct nouveau_event *, int index, + struct nouveau_eventh *); + #endif diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c index c2e3167..6dde483 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -180,13 +180,14 @@ nouveau_fence_wait_uevent(struct nouveau_fence *fence, bool intr) struct nouveau_channel *chan = fence->channel; struct nouveau_fifo *pfifo = nouveau_fifo(chan->drm->device); struct nouveau_fence_priv *priv = chan->drm->fence; - struct nouveau_eventh handler = { - .func = nouveau_fence_wait_uevent_handler, - .priv = priv, - }; - int ret = 0; + struct nouveau_eventh *handler; + int ret; - nouveau_event_get(pfifo->uevent, 0, &handler); + ret = nouveau_event_handler_create(pfifo->uevent, 0, + nouveau_fence_wait_uevent_handler, + priv, &handler); + if (ret) + return ret; if (fence->timeout) { unsigned long timeout = fence->timeout - jiffies; @@ -218,7 +219,7 @@ nouveau_fence_wait_uevent(struct nouveau_fence *fence, bool intr) } } - nouveau_event_put(pfifo->uevent, 0, &handler); + nouveau_event_handler_destroy(pfifo->uevent, 0, handler); if (unlikely(ret < 0)) return ret; -- 1.8.1.2
[PATCH 4/9] drm/nouveau: Allow asymmetric nouveau_event_get/_put
Most nouveau event handlers have storage in 'static' containers (structures with lifetimes nearly equivalent to the drm_device), but are dangerously reused via nouveau_event_get/_put. For example, if nouveau_event_get is called more than once for a given handler, the event handler list will be corrupted. Migrate nouveau_event_get/_put from add/remove semantics to enable/disable semantics. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 20 drivers/gpu/drm/nouveau/core/include/core/event.h | 4 drivers/gpu/drm/nouveau/nouveau_drm.c | 8 ++-- 3 files changed, 18 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index 1a8d685..0a65ede 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -51,11 +51,13 @@ static void nouveau_event_put_locked(struct nouveau_event *event, int index, struct nouveau_eventh *handler) { - if (!--event->index[index].refs) { - if (event->disable) - event->disable(event, index); + if (__test_and_clear_bit(NVKM_EVENT_ENABLE, &handler->flags)) { + if (!--event->index[index].refs) { + if (event->disable) + event->disable(event, index); + } + list_del(&handler->head); } - list_del(&handler->head); } void @@ -82,10 +84,12 @@ nouveau_event_get(struct nouveau_event *event, int index, return; spin_lock_irqsave(&event->lock, flags); - list_add(&handler->head, &event->index[index].list); - if (!event->index[index].refs++) { - if (event->enable) - event->enable(event, index); + if (!__test_and_set_bit(NVKM_EVENT_ENABLE, &handler->flags)) { + list_add(&handler->head, &event->index[index].list); + if (!event->index[index].refs++) { + if (event->enable) + event->enable(event, index); + } } spin_unlock_irqrestore(&event->lock, flags); } diff --git a/drivers/gpu/drm/nouveau/core/include/core/event.h b/drivers/gpu/drm/nouveau/core/include/core/event.h index bdf1a0a..3e704d5 100644 --- a/drivers/gpu/drm/nouveau/core/include/core/event.h +++ b/drivers/gpu/drm/nouveau/core/include/core/event.h @@ -5,8 +5,12 @@ #define NVKM_EVENT_DROP 0 #define NVKM_EVENT_KEEP 1 +/* nouveau_eventh.flags bit #s */ +#define NVKM_EVENT_ENABLE 0 + struct nouveau_eventh { struct list_head head; + unsigned long flags; void *priv; int (*func)(struct nouveau_eventh *, int index); }; diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index b29d04b..ccea2c4 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -88,7 +88,6 @@ nouveau_drm_vblank_enable(struct drm_device *dev, int head) if (WARN_ON_ONCE(head > ARRAY_SIZE(drm->vblank))) return -EIO; - WARN_ON_ONCE(drm->vblank[head].func); drm->vblank[head].func = nouveau_drm_vblank_handler; nouveau_event_get(pdisp->vblank, head, &drm->vblank[head]); return 0; @@ -99,11 +98,8 @@ nouveau_drm_vblank_disable(struct drm_device *dev, int head) { struct nouveau_drm *drm = nouveau_drm(dev); struct nouveau_disp *pdisp = nouveau_disp(drm->device); - if (drm->vblank[head].func) - nouveau_event_put(pdisp->vblank, head, &drm->vblank[head]); - else - WARN_ON_ONCE(1); - drm->vblank[head].func = NULL; + + nouveau_event_put(pdisp->vblank, head, &drm->vblank[head]); } static u64 -- 1.8.1.2
[PATCH 5/9] drm/nouveau: Add install/remove semantics for event handlers
Complete migration of nouveau_event_get/_put from add/remove semantics to enable/disable semantics. Introduce nouveau_event_handler_install/_remove interface to add/remove non-local-scope event handlers (ie., those stored in parent containers). This change in semantics makes explicit the handler lifetime, and distinguishes "one-of" event handlers (such as gpio) from "many temporary" event handlers (such as uevent). Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 63 +++--- .../gpu/drm/nouveau/core/engine/software/nv50.c| 31 +-- .../gpu/drm/nouveau/core/engine/software/nvc0.c| 31 +-- drivers/gpu/drm/nouveau/core/include/core/event.h | 6 +++ .../gpu/drm/nouveau/core/include/engine/software.h | 2 +- drivers/gpu/drm/nouveau/nouveau_connector.c| 10 +++- drivers/gpu/drm/nouveau/nouveau_drm.c | 17 +- 7 files changed, 140 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index 0a65ede..4cd1ebe 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -23,19 +23,60 @@ #include #include +void +nouveau_event_handler_install(struct nouveau_event *event, int index, + int (*func)(struct nouveau_eventh*, int), + void *priv, struct nouveau_eventh *handler) +{ + unsigned long flags; + + if (index >= event->index_nr) + return; + + handler->func = func; + handler->priv = priv; + + spin_lock_irqsave(&event->lock, flags); + list_add(&handler->head, &event->index[index].list); + spin_unlock_irqrestore(&event->lock, flags); +} + +void +nouveau_event_handler_remove(struct nouveau_event *event, int index, +struct nouveau_eventh *handler) +{ + unsigned long flags; + + if (index >= event->index_nr) + return; + + spin_lock_irqsave(&event->lock, flags); + list_del(&handler->head); + spin_unlock_irqrestore(&event->lock, flags); +} + int nouveau_event_handler_create(struct nouveau_event *event, int index, int (*func)(struct nouveau_eventh*, int), void *priv, struct nouveau_eventh **phandler) { struct nouveau_eventh *handler; + unsigned long flags; handler = *phandler = kzalloc(sizeof(*handler), GFP_KERNEL); if (!handler) return -ENOMEM; handler->func = func; handler->priv = priv; - nouveau_event_get(event, index, handler); + __set_bit(NVKM_EVENT_ENABLE, &handler->flags); + + spin_lock_irqsave(&event->lock, flags); + list_add(&handler->head, &event->index[index].list); + if (!event->index[index].refs++) { + if (event->enable) + event->enable(event, index); + } + spin_unlock_irqrestore(&event->lock, flags); return 0; } @@ -43,7 +84,18 @@ void nouveau_event_handler_destroy(struct nouveau_event *event, int index, struct nouveau_eventh *handler) { - nouveau_event_put(event, index, handler); + unsigned long flags; + + if (index >= event->index_nr) + return; + + spin_lock_irqsave(&event->lock, flags); + if (!--event->index[index].refs) { + if (event->disable) + event->disable(event, index); + } + list_del(&handler->head); + spin_unlock_irqrestore(&event->lock, flags); kfree(handler); } @@ -56,7 +108,6 @@ nouveau_event_put_locked(struct nouveau_event *event, int index, if (event->disable) event->disable(event, index); } - list_del(&handler->head); } } @@ -85,7 +136,6 @@ nouveau_event_get(struct nouveau_event *event, int index, spin_lock_irqsave(&event->lock, flags); if (!__test_and_set_bit(NVKM_EVENT_ENABLE, &handler->flags)) { - list_add(&handler->head, &event->index[index].list); if (!event->index[index].refs++) { if (event->enable) event->enable(event, index); @@ -105,8 
+155,9 @@ nouveau_event_trigger(struct nouveau_event *event, int index) spin_lock_irqsave(&event->lock, flags); list_for_each_entry_safe(handler, temp, &event->index[index].list, head) { - if (handler->func(handler, index) == NVKM_EVENT_DROP) { - nouveau_event_put_locked(event, i
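Side by side, the two handler lifetimes now supported look roughly as follows. The pairings are illustrative (the remove/teardown call sites are not all visible in the posted hunks); the function names and arguments are taken from the diffs in this series:

    /* 1) "One-off" handler embedded in a long-lived container (software
     *    context, connector, drm device): installed once, removed at
     *    container teardown. */
    nouveau_event_handler_install(disp->vblank, i,
                                  nv50_software_vblsem_release, NULL,
                                  &chan->base.vblank.event[i]);
    /* ... container lives on; the handler is enabled/disabled as needed ... */
    nouveau_event_handler_remove(disp->vblank, i, &chan->base.vblank.event[i]);

    /* 2) Temporary handler (uevent fence wait): allocated around a single
     *    wait and freed afterwards. */
    nouveau_event_handler_create(pfifo->uevent, 0,
                                 nouveau_fence_wait_uevent_handler,
                                 priv, &handler);
    /* ... wait ... */
    nouveau_event_handler_destroy(pfifo->uevent, 0, handler);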
[PATCH 6/9] drm/nouveau: Convert event handler list to RCU
Lockdep report [1] correctly identifies a potential deadlock between nouveau_drm_vblank_enable() and nouveau_drm_vblank_handler() due to inverted lock order of event->lock with drm core's dev->vblank_time_lock. Fix vblank event deadlock by converting event handler list to RCU. List updates remain serialized by event->lock. List traversal & handler execution is lockless which prevents inverted lock order problems with the drm core. Safe deletion of the event handler is accomplished with synchronize_rcu() for those handlers stored in nouveau_object-based containers (nouveau_drm & nv50_/nvc0_software_context); otherwise, with kfree_rcu (for nouveau_connector & uevent temporary handlers). [1] == [ INFO: possible circular locking dependency detected ] 3.10.0-0+tip-xeon+lockdep #0+tip Not tainted --- swapper/7/0 is trying to acquire lock: (&(&dev->vblank_time_lock)->rlock){-.}, at: [] drm_handle_vblank+0x69/0x400 [drm] but task is already holding lock: (&(&event->lock)->rlock){-.}, at: [] nouveau_event_trigger+0x4d/0xd0 [nouveau] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&(&event->lock)->rlock){-.}: [] lock_acquire+0x92/0x1f0 [] _raw_spin_lock_irqsave+0x46/0x60 [] nouveau_event_get+0x27/0xa0 [nouveau] [] nouveau_drm_vblank_enable+0x8d/0xf0 [nouveau] [] drm_vblank_get+0xf8/0x2c0 [drm] [] drm_wait_vblank+0x84/0x6f0 [drm] [] drm_ioctl+0x559/0x690 [drm] [] do_vfs_ioctl+0x97/0x590 [] SyS_ioctl+0x91/0xb0 [] system_call_fastpath+0x16/0x1b -> #0 (&(&dev->vblank_time_lock)->rlock){-.}: [] __lock_acquire+0x1c43/0x1d30 [] lock_acquire+0x92/0x1f0 [] _raw_spin_lock_irqsave+0x46/0x60 [] drm_handle_vblank+0x69/0x400 [drm] [] nouveau_drm_vblank_handler+0x27/0x30 [nouveau] [] nouveau_event_trigger+0x90/0xd0 [nouveau] [] nv50_disp_intr+0xdd/0x230 [nouveau] [] nouveau_mc_intr+0xa1/0x100 [nouveau] [] handle_irq_event_percpu+0x6c/0x3d0 [] handle_irq_event+0x48/0x70 [] handle_fasteoi_irq+0x5a/0x100 [] handle_irq+0x22/0x40 [] do_IRQ+0x5a/0xd0 [] ret_from_intr+0x0/0x13 [] arch_cpu_idle+0x26/0x30 [] cpu_startup_entry+0xce/0x410 [] start_secondary+0x255/0x25c other info that might help us debug this: Possible unsafe locking scenario: CPU0CPU1 lock(&(&event->lock)->rlock); lock(&(&dev->vblank_time_lock)->rlock); lock(&(&event->lock)->rlock); lock(&(&dev->vblank_time_lock)->rlock); *** DEADLOCK *** 1 lock held by swapper/7/0: #0: (&(&event->lock)->rlock){-.}, at: [] nouveau_event_trigger+0x4d/0xd0 [nouveau] stack backtrace: CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.10.0-0+tip-xeon+lockdep #0+tip Hardware name: Dell Inc. Precision WorkStation T5400 /0RW203, BIOS A11 04/30/2012 821ccf60 8802bfdc3b08 81755f39 8802bfdc3b58 8174f526 0002 8802bfdc3be8 8802b330a7f8 8802b330a820 8802b330a7f8 0001 Call Trace: [] dump_stack+0x19/0x1b [] print_circular_bug+0x1fb/0x20c [] __lock_acquire+0x1c43/0x1d30 [] ? trace_hardirqs_off+0xd/0x10 [] ? flat_send_IPI_mask+0x88/0xa0 [] lock_acquire+0x92/0x1f0 [] ? drm_handle_vblank+0x69/0x400 [drm] [] _raw_spin_lock_irqsave+0x46/0x60 [] ? drm_handle_vblank+0x69/0x400 [drm] [] drm_handle_vblank+0x69/0x400 [drm] [] nouveau_drm_vblank_handler+0x27/0x30 [nouveau] [] nouveau_event_trigger+0x90/0xd0 [nouveau] [] nv50_disp_intr+0xdd/0x230 [nouveau] [] ? _raw_spin_unlock_irqrestore+0x42/0x80 [] ? __hrtimer_start_range_ns+0x16c/0x530 [] nouveau_mc_intr+0xa1/0x100 [nouveau] [] handle_irq_event_percpu+0x6c/0x3d0 [] handle_irq_event+0x48/0x70 [] ? 
handle_fasteoi_irq+0x1e/0x100 [] handle_fasteoi_irq+0x5a/0x100 [] handle_irq+0x22/0x40 [] do_IRQ+0x5a/0xd0 [] common_interrupt+0x6f/0x6f [] ? default_idle+0x1d/0x290 [] ? default_idle+0x1b/0x290 [] arch_cpu_idle+0x26/0x30 [] cpu_startup_entry+0xce/0x410 [] ? clockevents_register_device+0xdc/0x140 [] start_secondary+0x255/0x25c Reported-by: Dave Jones Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 22 +++--- .../gpu/drm/nouveau/core/engine/software/nv50.c| 1 + .../gpu/drm/nouveau/core/engine/software/nvc0.c| 1 + drivers/gpu/drm/nouveau/core/include/core/event.h | 1 + drivers/gpu/drm/nouveau/nouveau_connector.c| 2 +- drivers/gpu/drm/nouveau/nouveau_drm.c | 1 + 6 files changed, 16 insertion
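With the conversion, the trigger path becomes a plain RCU traversal; this is the shape it has by the end of the series (see the patch 8/9 hunk below), shown here schematically:

    /* Reader side -- nouveau_event_trigger(): no event->lock is taken
     * while the handlers run, which breaks the inversion with the drm
     * core's vblank_time_lock. */
    rcu_read_lock();
    list_for_each_entry_rcu(handler, &event->index[index].list, head) {
            if (test_bit(NVKM_EVENT_ENABLE, &handler->flags)) {
                    if (handler->func(handler, index) == NVKM_EVENT_DROP)
                            nouveau_event_put(event, index, handler);
            }
    }
    rcu_read_unlock();

    /* Writer side, still serialized by event->lock: unlink the handler
     * (list_del_rcu() or equivalent), then wait out readers before the
     * memory is reused -- synchronize_rcu() for handlers embedded in
     * nouveau_object containers, kfree_rcu() for the temporary
     * uevent/connector handlers, as described in the changelog above. */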
[PATCH 7/9] drm/nouveau: Fold nouveau_event_put_locked into caller
nouveau_event_put_locked() only has 1 call site; fold into caller. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 19 ++- 1 file changed, 6 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index ce0a0ef..45bcb37 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -100,18 +100,6 @@ nouveau_event_handler_destroy(struct nouveau_event *event, int index, kfree_rcu(handler, rcu); } -static void -nouveau_event_put_locked(struct nouveau_event *event, int index, -struct nouveau_eventh *handler) -{ - if (__test_and_clear_bit(NVKM_EVENT_ENABLE, &handler->flags)) { - if (!--event->index[index].refs) { - if (event->disable) - event->disable(event, index); - } - } -} - void nouveau_event_put(struct nouveau_event *event, int index, struct nouveau_eventh *handler) @@ -122,7 +110,12 @@ nouveau_event_put(struct nouveau_event *event, int index, return; spin_lock_irqsave(&event->lock, flags); - nouveau_event_put_locked(event, index, handler); + if (__test_and_clear_bit(NVKM_EVENT_ENABLE, &handler->flags)) { + if (!--event->index[index].refs) { + if (event->disable) + event->disable(event, index); + } + } spin_unlock_irqrestore(&event->lock, flags); } -- 1.8.1.2
[PATCH 8/9] drm/nouveau: Simplify event interface
Store event back-pointer and index within struct event_handler; remove superfluous parameters when event_handler is supplied. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 36 +- .../gpu/drm/nouveau/core/engine/software/nv50.c| 11 ++- .../gpu/drm/nouveau/core/engine/software/nvc0.c| 11 ++- drivers/gpu/drm/nouveau/core/include/core/event.h | 15 + drivers/gpu/drm/nouveau/nouveau_connector.c| 5 +-- drivers/gpu/drm/nouveau/nouveau_display.c | 16 +++--- drivers/gpu/drm/nouveau/nouveau_drm.c | 10 ++ drivers/gpu/drm/nouveau/nouveau_fence.c| 2 +- 8 files changed, 44 insertions(+), 62 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index 45bcb37..b7d8ae1 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -34,6 +34,9 @@ nouveau_event_handler_install(struct nouveau_event *event, int index, if (index >= event->index_nr) return; + handler->event = event; + handler->index = index; + handler->func = func; handler->priv = priv; @@ -43,12 +46,12 @@ nouveau_event_handler_install(struct nouveau_event *event, int index, } void -nouveau_event_handler_remove(struct nouveau_event *event, int index, -struct nouveau_eventh *handler) +nouveau_event_handler_remove(struct nouveau_eventh *handler) { + struct nouveau_event *event = handler->event; unsigned long flags; - if (index >= event->index_nr) + if (!event) return; spin_lock_irqsave(&event->lock, flags); @@ -67,6 +70,10 @@ nouveau_event_handler_create(struct nouveau_event *event, int index, handler = *phandler = kzalloc(sizeof(*handler), GFP_KERNEL); if (!handler) return -ENOMEM; + + handler->event = event; + handler->index = index; + handler->func = func; handler->priv = priv; __set_bit(NVKM_EVENT_ENABLE, &handler->flags); @@ -82,13 +89,12 @@ nouveau_event_handler_create(struct nouveau_event *event, int index, } void -nouveau_event_handler_destroy(struct nouveau_event *event, int index, - struct nouveau_eventh *handler) +nouveau_event_handler_destroy(struct nouveau_eventh *handler) { + struct nouveau_event *event = handler->event; + int index = handler->index; unsigned long flags; - if (index >= event->index_nr) - return; spin_lock_irqsave(&event->lock, flags); if (!--event->index[index].refs) { @@ -101,12 +107,13 @@ nouveau_event_handler_destroy(struct nouveau_event *event, int index, } void -nouveau_event_put(struct nouveau_event *event, int index, - struct nouveau_eventh *handler) +nouveau_event_put(struct nouveau_eventh *handler) { + struct nouveau_event *event = handler->event; + int index = handler->index; unsigned long flags; - if (index >= event->index_nr) + if (!event) return; spin_lock_irqsave(&event->lock, flags); @@ -120,12 +127,13 @@ nouveau_event_put(struct nouveau_event *event, int index, } void -nouveau_event_get(struct nouveau_event *event, int index, - struct nouveau_eventh *handler) +nouveau_event_get(struct nouveau_eventh *handler) { + struct nouveau_event *event = handler->event; + int index = handler->index; unsigned long flags; - if (index >= event->index_nr) + if (!event) return; spin_lock_irqsave(&event->lock, flags); @@ -150,7 +158,7 @@ nouveau_event_trigger(struct nouveau_event *event, int index) list_for_each_entry_rcu(handler, &event->index[index].list, head) { if (test_bit(NVKM_EVENT_ENABLE, &handler->flags)) { if (handler->func(handler, index) == NVKM_EVENT_DROP) - nouveau_event_put(event, index, handler); + nouveau_event_put(handler); } } rcu_read_unlock(); diff --git 
a/drivers/gpu/drm/nouveau/core/engine/software/nv50.c b/drivers/gpu/drm/nouveau/core/engine/software/nv50.c index 87aeee1..e969f0c 100644 --- a/drivers/gpu/drm/nouveau/core/engine/software/nv50.c +++ b/drivers/gpu/drm/nouveau/core/engine/software/nv50.c @@ -92,12 +92,11 @@ nv50_software_mthd_vblsem_release(struct nouveau_object *object, u32 mthd, void *args, u32 size) { struct nv50_software_chan *chan = (void *)nv_engctx(object->parent); - struct nouveau_disp *disp = nouveau_disp(object); u32 crtc = *(u32 *)args; if (crtc > 1)
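The payoff at the call sites is that only the handler needs to be passed around once it has been created or installed. Condensed, illustrative before/after (the vblank lines are representative of the nouveau_drm.c changes in this patch, not literal hunks):

    /* Before this patch: the event and index travel with every call. */
    nouveau_event_get(pdisp->vblank, head, &drm->vblank[head]);
    nouveau_event_put(pdisp->vblank, head, &drm->vblank[head]);

    /* After: the handler carries its own event back-pointer and index. */
    nouveau_event_get(&drm->vblank[head]);
    nouveau_event_put(&drm->vblank[head]);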
[PATCH 9/9] drm/nouveau: Simplify event handler interface
Remove index parameter; access index via handler->index instead. Dissociate handler from related container; use handler->priv to access container. Signed-off-by: Peter Hurley --- drivers/gpu/drm/nouveau/core/core/event.c | 6 +++--- drivers/gpu/drm/nouveau/core/engine/software/nv50.c | 7 +++ drivers/gpu/drm/nouveau/core/engine/software/nvc0.c | 7 +++ drivers/gpu/drm/nouveau/core/include/core/event.h | 6 +++--- drivers/gpu/drm/nouveau/nouveau_connector.c | 9 + drivers/gpu/drm/nouveau/nouveau_drm.c | 9 + drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +- 7 files changed, 23 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/core/event.c b/drivers/gpu/drm/nouveau/core/core/event.c index b7d8ae1..1240fef 100644 --- a/drivers/gpu/drm/nouveau/core/core/event.c +++ b/drivers/gpu/drm/nouveau/core/core/event.c @@ -26,7 +26,7 @@ void nouveau_event_handler_install(struct nouveau_event *event, int index, - int (*func)(struct nouveau_eventh*, int), + int (*func)(struct nouveau_eventh*), void *priv, struct nouveau_eventh *handler) { unsigned long flags; @@ -61,7 +61,7 @@ nouveau_event_handler_remove(struct nouveau_eventh *handler) int nouveau_event_handler_create(struct nouveau_event *event, int index, -int (*func)(struct nouveau_eventh*, int), +int (*func)(struct nouveau_eventh*), void *priv, struct nouveau_eventh **phandler) { struct nouveau_eventh *handler; @@ -157,7 +157,7 @@ nouveau_event_trigger(struct nouveau_event *event, int index) rcu_read_lock(); list_for_each_entry_rcu(handler, &event->index[index].list, head) { if (test_bit(NVKM_EVENT_ENABLE, &handler->flags)) { - if (handler->func(handler, index) == NVKM_EVENT_DROP) + if (handler->func(handler) == NVKM_EVENT_DROP) nouveau_event_put(handler); } } diff --git a/drivers/gpu/drm/nouveau/core/engine/software/nv50.c b/drivers/gpu/drm/nouveau/core/engine/software/nv50.c index e969f0c..bf6f23b 100644 --- a/drivers/gpu/drm/nouveau/core/engine/software/nv50.c +++ b/drivers/gpu/drm/nouveau/core/engine/software/nv50.c @@ -131,10 +131,9 @@ nv50_software_sclass[] = { **/ static int -nv50_software_vblsem_release(struct nouveau_eventh *event, int head) +nv50_software_vblsem_release(struct nouveau_eventh *handler) { - struct nouveau_software_chan *chan = - container_of(event, struct nouveau_software_chan, vblank.event[head]); + struct nouveau_software_chan *chan = handler->priv; struct nv50_software_priv *priv = (void *)nv_object(chan)->engine; struct nouveau_bar *bar = nouveau_bar(priv); @@ -172,7 +171,7 @@ nv50_software_context_ctor(struct nouveau_object *parent, for (i = 0; i < ARRAY_SIZE(chan->base.vblank.event); i++) { nouveau_event_handler_install(disp->vblank, i, nv50_software_vblsem_release, - NULL, + chan, &chan->base.vblank.event[i]); } return 0; diff --git a/drivers/gpu/drm/nouveau/core/engine/software/nvc0.c b/drivers/gpu/drm/nouveau/core/engine/software/nvc0.c index d06658a..1a2a7a8 100644 --- a/drivers/gpu/drm/nouveau/core/engine/software/nvc0.c +++ b/drivers/gpu/drm/nouveau/core/engine/software/nvc0.c @@ -143,10 +143,9 @@ nvc0_software_sclass[] = { **/ static int -nvc0_software_vblsem_release(struct nouveau_eventh *event, int head) +nvc0_software_vblsem_release(struct nouveau_eventh *handler) { - struct nouveau_software_chan *chan = - container_of(event, struct nouveau_software_chan, vblank.event[head]); + struct nouveau_software_chan *chan = handler->priv; struct nvc0_software_priv *priv = (void *)nv_object(chan)->engine; struct nouveau_bar *bar = nouveau_bar(priv); @@ -178,7 +177,7 @@ nvc0_software_context_ctor(struct 
nouveau_object *parent, for (i = 0; i < ARRAY_SIZE(chan->base.vblank.event); i++) { nouveau_event_handler_install(disp->vblank, i, nvc0_software_vblsem_release, - NULL, + chan, &chan->base.vblank.event[i]); } return 0; diff --git a/drivers/gpu/drm/nouveau/core/include/c
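For the vblank semaphore handlers this removes the container_of() arithmetic that depended on the head index being passed in. Condensed from the nv50 hunk above, bodies elided:

    /* Before: recover the channel from the handler's position in an
     * array, which only works if the head index is also passed in. */
    static int nv50_software_vblsem_release(struct nouveau_eventh *event, int head)
    {
            struct nouveau_software_chan *chan =
                    container_of(event, struct nouveau_software_chan,
                                 vblank.event[head]);
            /* ... body as in the hunk above ... */
    }

    /* After: the channel is handed over explicitly via ->priv at install
     * time; the head index, if needed, is available as handler->index. */
    static int nv50_software_vblsem_release(struct nouveau_eventh *handler)
    {
            struct nouveau_software_chan *chan = handler->priv;
            /* ... body as in the hunk above ... */
    }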
[PATCH 0/9] drm/nouveau: Cleanup event/handler design
On 08/28/2013 03:15 AM, Ben Skeggs wrote:
> On Wed, Aug 28, 2013 at 6:12 AM, Peter Hurley wrote:
>> This series was originally motivated by a deadlock, introduced in
>> commit 1d7c71a3e2f77336df536855b0efd2dc5bdeb41b
>> 'drm/nouveau/disp: port vblank handling to event interface',
>> due to inverted lock order between nouveau_drm_vblank_enable()
>> and nouveau_drm_vblank_handler() (the complete
>> lockdep report is included in the patch 4/5 changelog).
>
> Hey Peter,
>
> Thanks for the patch series. I've only taken a cursory glance through
> the patches thus far, as they're definitely too intrusive for -fixes
> at this point. I will take a proper look through the series within
> the next week or so, I have some work pending which may possibly make
> some of these changes unnecessary, though from what I can tell,
> there's other good bits here we'll want.
>
> In a previous mail on the locking issue, you mentioned the drm core
> doing the "wrong thing" here too, did you have any further thoughts on
> correcting that issue too?

I'm working on the premise that drm_handle_vblank() can be lockless;
that would minimize the API disruption.

I still have a fair bit of analysis to do, but some early observations:

1) The vbl_time_lock spinlock is unnecessary -- drm_handle_vblank()
could be protected with vbl_lock, since everywhere else vbl_time_lock
is held, vbl_lock is already held.

2) The _vblank_count[crtc] doesn't need to be atomic. All the code paths
that update _vblank_count[] are already spinlocked. Even if the code
weren't spinlocked, the atomic_inc/_adds aren't necessary; only the
memory barriers and the read/write order of the vblank count/timestamp
tuple are relevant.

3) Careful enabling of drm_handle_vblank() is sufficient to solve the
concurrency between drm_handle_vblank() and drm_vblank_get() without
needing a spinlock for exclusion. drm_handle_vblank() would need to
account for the possibility of having missed a vblank interrupt between
the reading of the hw vblank counter and the enabling of
drm_handle_vblank().

4) I'm still thinking about how/whether to exclude drm_handle_vblank()
and vblank_disable_and_save(). One thought is to replace the timeout
timer with delayed work instead, so that vblank_disable_and_save() could
wait for drm_handle_vblank() to finish after disabling it. [I'd also
need to verify that the drm drivers other than intel which use
drm_vblank_off() do so from non-atomic context.]

I realize that I'm mostly referring to the hw counter/timestamp flavor
of vblank handling; that's primarily because it requires more rigorous
handling and has race conditions that don't apply to the software-only
counter.

Regards,
Peter Hurley
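Point 2 above is the familiar ordered-publish pattern. A minimal sketch of the count/timestamp ordering being described (illustrative only, not the drm core code; MAX_CRTC and the variable names are placeholders):

    #define MAX_CRTC 8                      /* placeholder bound */

    static u32 vbl_count[MAX_CRTC];
    static s64 vbl_time[MAX_CRTC];          /* timestamp, ns */

    /* Writer (vblank interrupt): make the timestamp visible before the
     * count that refers to it. */
    static void publish_vblank(int crtc, s64 now)
    {
            vbl_time[crtc] = now;
            smp_wmb();
            vbl_count[crtc]++;
    }

    /* Reader: pair the barriers and retry if an update raced with us,
     * so the returned count/timestamp pair is never torn. */
    static u32 read_vblank(int crtc, s64 *ts)
    {
            u32 c;

            do {
                    c = vbl_count[crtc];
                    smp_rmb();
                    *ts = vbl_time[crtc];
                    smp_rmb();
            } while (c != vbl_count[crtc]);

            return c;
    }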
Re: 3.8-rc2: lockdep warning in nouveau driver
On Wed, 2013-01-09 at 12:45 +0100, Arend van Spriel wrote: > Maybe this one is already known, but I did not find a post about it. So > here it is. > > Regards, > Arend [snip] > [9.589986] = > [9.595365] [ INFO: possible recursive locking detected ] > [9.600745] 3.8.0-rc2-wl-testing-lockdep-2-ga524cf0 #1 Not tainted > [9.607248] - > [9.612626] modprobe/163 is trying to acquire lock: > [9.617486] (&subdev->mutex){+.+.+.}, at: [] > nv50_fb_vram_new+0x92/0x230 [nouveau] > [9.626052] > [9.626052] but task is already holding lock: > [9.631865] (&subdev->mutex){+.+.+.}, at: [] > nv50_disp_data_ctor+0x55/0xc0 [nouveau] > [9.640593] > [9.640593] other info that might help us debug this: > [9.647096] Possible unsafe locking scenario: > [9.647096] > [9.652995]CPU0 > [9.655430] > [9.657867] lock(&subdev->mutex); > [9.661365] lock(&subdev->mutex); > [9.664863] > [9.664863] *** DEADLOCK *** > [9.664863] > [9.670762] May be due to missing lock nesting notation Same. [5.995881] = [5.995886] [ INFO: possible recursive locking detected ] [5.995892] 3.8.0-next-20130125+ttypatch-xeon+lockdep #20130125+ttypatch Not tainted [5.995898] - [5.995904] modprobe/275 is trying to acquire lock: [5.995909] (&subdev->mutex){+.+.+.}, at: [] nouveau_instobj_create_+0x48/0x90 [nouveau] [5.995955] [5.995955] but task is already holding lock: [5.995961] (&subdev->mutex){+.+.+.}, at: [] nv50_disp_data_ctor+0x65/0xd0 [nouveau] [5.995989] [5.995989] other info that might help us debug this: [5.995995] Possible unsafe locking scenario: [5.995995] [5.996001]CPU0 [5.996004] [5.996005] lock(&subdev->mutex); [5.996005] lock(&subdev->mutex); [5.996005] [5.996005] *** DEADLOCK *** Regards, Peter Hurley ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/nouveau: add lockdep annotations
Hi Maarten On Mon, 2013-02-04 at 22:59 +0100, Maarten Lankhorst wrote: > Op 04-02-13 22:30, Marcin Slusarz schreef: > > 1) Lockdep thinks all nouveau subdevs belong to the same class and can be > > locked in arbitrary order, which is not true (at least in general case). > > Tell it to distinguish subdevs by (o)class type. > Apart from this specific case, is there any other reason why we require being > able to nest 2 subdev locks? > > Add a NVOBJ_FLAG_CREATE_EXCLUSIVE flag to nouveau_engctx_create and return > -EBUSY if there is any existing object. Problem solved, and lockdep is still > generic. > > > 2) DRM client can be locked under user client lock - tell lockdep to put DRM > > client lock in a separate class. > Can you copy paste a typical lockdep splat for this? Also this should be a > separate patch. :-) PS - Deep call chain. Has anyone looked into max stack depth yet? [5.995881] = [5.995886] [ INFO: possible recursive locking detected ] [5.995892] 3.8.0-next-20130125+ttypatch-xeon+lockdep #20130125+ttypatch Not tainted [5.995898] - [5.995904] modprobe/275 is trying to acquire lock: [5.995909] (&subdev->mutex){+.+.+.}, at: [] nouveau_instobj_create_+0x48/0x90 [nouveau] [5.995955] [5.995955] but task is already holding lock: [5.995961] (&subdev->mutex){+.+.+.}, at: [] nv50_disp_data_ctor+0x65/0xd0 [nouveau] [5.995989] [5.995989] other info that might help us debug this: [5.995995] Possible unsafe locking scenario: [5.995995] [5.996001]CPU0 [5.996004] [5.996005] lock(&subdev->mutex); [5.996005] lock(&subdev->mutex); [5.996005] [5.996005] *** DEADLOCK *** [5.996005] [5.996005] May be due to missing lock nesting notation [5.996005] [5.996005] 4 locks held by modprobe/275: [5.996005] #0: (&__lockdep_no_validate__){..}, at: [] __driver_attach+0x5b/0xb0 [5.996005] #1: (&__lockdep_no_validate__){..}, at: [] __driver_attach+0x69/0xb0 [5.996005] #2: (drm_global_mutex){+.+.+.}, at: [] drm_get_pci_dev+0xc6/0x2d0 [drm] [5.996005] #3: (&subdev->mutex){+.+.+.}, at: [] nv50_disp_data_ctor+0x65/0xd0 [nouveau] [5.996005] [5.996005] stack backtrace: [5.996005] Pid: 275, comm: modprobe Not tainted 3.8.0-next-20130125+ttypatch-xeon+lockdep #20130125+ttypatch [5.996005] Call Trace: [5.996005] [] __lock_acquire+0x687/0x1a70 [5.996005] [] ? mark_held_locks+0x9b/0x100 [5.996005] [] ? trace_hardirqs_on_caller+0x10d/0x1a0 [5.996005] [] ? trace_hardirqs_on+0xd/0x10 [5.996005] [] ? _raw_write_unlock_irqrestore+0x4a/0x90 [5.996005] [] lock_acquire+0x98/0x150 [5.996005] [] ? nouveau_instobj_create_+0x48/0x90 [nouveau] [5.996005] [] ? nouveau_instobj_create_+0x48/0x90 [nouveau] [5.996005] [] mutex_lock_nested+0x50/0x390 [5.996005] [] ? nouveau_instobj_create_+0x48/0x90 [nouveau] [5.996005] [] ? nouveau_object_ref+0x46/0xd0 [nouveau] [5.996005] [] ? nouveau_object_create_+0x65/0xa0 [nouveau] [5.996005] [] nouveau_instobj_create_+0x48/0x90 [nouveau] [5.996005] [] nv50_instobj_ctor+0x51/0xf0 [nouveau] [5.996005] [] nouveau_object_ctor+0x38/0xc0 [nouveau] [5.996005] [] nv50_instmem_alloc+0x26/0x30 [nouveau] [5.996005] [] nouveau_gpuobj_create_+0x247/0x2f0 [nouveau] [5.996005] [] ? _raw_spin_unlock_irqrestore+0x6d/0x90 [5.996005] [] ? trace_hardirqs_on_caller+0x10d/0x1a0 [5.996005] [] nouveau_engctx_create_+0x26c/0x2b0 [nouveau] [5.996005] [] nv50_disp_data_ctor+0xc1/0xd0 [nouveau] [5.996005] [] nouveau_object_ctor+0x38/0xc0 [nouveau] [5.996005] [] nouveau_object_new+0x112/0x240 [nouveau] [5.996005] [] nv50_display_create+0x1a5/0x890 [nouveau] [5.996005] [] ? 
__cancel_work_timer+0x8b/0xe0 [5.996005] [] nouveau_display_create+0x3cb/0x670 [nouveau] [5.996005] [] nouveau_drm_load+0x26f/0x590 [nouveau] [5.996005] [] ? device_register+0x1e/0x30 [5.996005] [] ? drm_sysfs_device_add+0x86/0xb0 [drm] [5.996005] [] drm_get_pci_dev+0x186/0x2d0 [drm] [5.996005] [] nouveau_drm_probe+0x26a/0x2c0 [nouveau] [5.996005] [] ? _raw_spin_unlock_irqrestore+0x4a/0x90 [5.996005] [] local_pci_probe+0x4b/0x80 [5.996005] [] pci_device_probe+0x111/0x120 [5.996005] [] driver_probe_device+0x8b/0x3a0 [5.996005] [] __driver_attach+0xab/0xb0 [5.996005] [] ? driver_probe_device+0x3a0/0x3a0 [5.996005] [] bus_for_each_dev+0x55/0x90 [5.996005] [] driver_attach+0x1e/0x20 [5.996005] [] bus_add_driver+0x121/0x2b0 [5.996005] [] ? 0xa01acfff [5.996005] [] driver_register+0x77/0x170 [5.996005] [] ? 0xa01acfff [5.996005] [] __pci_register_driver+0x64/0x70 [5.996005] [] drm_pci_init+0x11a/0x130 [drm] [5.996005] [] ? 0xa01acfff [5.996005] [] ?
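For reference, the two standard ways to quiet this class of report when the nesting really is safe are a nesting annotation at the inner acquisition or distinct lockdep classes per subdev type. Schematically (this is the generic lockdep idiom, not the patch under discussion; the subdev pointers and key name are placeholders):

    /* Option 1: annotate the inner acquisition as an intended nesting. */
    mutex_lock(&parent_subdev->mutex);
    mutex_lock_nested(&child_subdev->mutex, SINGLE_DEPTH_NESTING);

    /* Option 2: give different subdev types distinct lockdep classes so
     * that nesting across types no longer looks like self-recursion. */
    static struct lock_class_key instmem_mutex_key;     /* hypothetical */

    mutex_init(&subdev->mutex);
    lockdep_set_class(&subdev->mutex, &instmem_mutex_key);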
[next-20130204] nouveau: lockdep warning (not the same as the subdev lockdep warning)
Got this lockdep warning straightaway during boot: [7.435890] = [7.435891] [ INFO: possible recursive locking detected ] [7.435893] 3.8.0-next-20130204+pcipatch-xeon+lockdep #20130204+pcipatch Not tainted [7.435893] - [7.435895] modprobe/255 is trying to acquire lock: [7.435942] (&dmac->lock){+.+...}, at: [] evo_wait+0x43/0xf0 [nouveau] [7.435942] [7.435942] but task is already holding lock: [7.435964] (&dmac->lock){+.+...}, at: [] evo_wait+0x43/0xf0 [nouveau] [7.435964] [7.435964] other info that might help us debug this: [7.435965] Possible unsafe locking scenario: [7.435965] [7.435966]CPU0 [7.435967] [7.435968] lock(&dmac->lock); [7.435970] lock(&dmac->lock); [7.435970] [7.435970] *** DEADLOCK *** [7.435970] [7.435971] May be due to missing lock nesting notation [7.435971] [7.435973] 9 locks held by modprobe/255: [7.435979] #0: (&__lockdep_no_validate__){..}, at: [] __driver_attach+0x5b/0xb0 [7.435982] #1: (&__lockdep_no_validate__){..}, at: [] __driver_attach+0x69/0xb0 [7.436001] #2: (drm_global_mutex){+.+.+.}, at: [] drm_get_pci_dev+0xbd/0x2a0 [drm] [7.436012] #3: (registration_lock){+.+.+.}, at: [] register_framebuffer+0x25/0x2f0 [7.436016] #4: (&fb_info->lock){+.+.+.}, at: [] lock_fb_info+0x26/0x60 [7.436023] #5: ((fb_notifier_list).rwsem){.+.+.+}, at: [] __blocking_notifier_call_chain+0x56/0xc0 [7.436035] #6: (&dev->mode_config.mutex){+.+.+.}, at: [] drm_modeset_lock_all+0x2a/0x70 [drm] [7.436046] #7: (&crtc->mutex){+.+.+.}, at: [] drm_modeset_lock_all+0x54/0x70 [drm] [7.436068] #8: (&dmac->lock){+.+...}, at: [] evo_wait+0x43/0xf0 [nouveau] [7.436068] [7.436068] stack backtrace: [7.436070] Pid: 255, comm: modprobe Not tainted 3.8.0-next-20130204+pcipatch-xeon+lockdep #20130204+pcipatch [7.436072] Call Trace: [7.436077] [] __lock_acquire+0x697/0x1a80 [7.436080] [] lock_acquire+0xa1/0x200 [7.436100] [] ? evo_wait+0x43/0xf0 [nouveau] [7.436121] [] ? evo_wait+0x43/0xf0 [nouveau] [7.436125] [] mutex_lock_nested+0x63/0x3a0 [7.436145] [] ? evo_wait+0x43/0xf0 [nouveau] [7.436148] [] ? mutex_lock_nested+0x29e/0x3a0 [7.436168] [] ? evo_wait+0x43/0xf0 [nouveau] [7.436170] [] ? mark_held_locks+0xaf/0x110 [7.436190] [] evo_wait+0x43/0xf0 [nouveau] [7.436212] [] evo_sync+0x5c/0xf0 [nouveau] [7.436233] [] nv50_display_flip_next+0x5ad/0x5c0 [nouveau] Regards, Peter Hurley ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: kernel BUG at mm/slub.c:3409, 3.8.0-rc7
[+cc Marcin Slusarz, dri-devel] On Tue, 2013-02-19 at 00:21 +, Christoph Lameter wrote: > The problem is that the subsystem attempted to call kfree with a pointer > that was not obtained via a slab allocation. > > On Sat, 16 Feb 2013, Denys Fedoryshchenko wrote: > > > Hi > > > > Worked for a while on 3.8.0-rc7, generally it is fine, then suddenly laptop > > stopped responding to keyboard and mouse. > > Sure it can be memory corruption by some other module, but maybe not. Worth > > to > > report i guess. > > After reboot checked logs and found this: > > > > Feb 16 00:40:17 localhost kernel: [23260.079253] [ cut here > > ] > > Feb 16 00:40:17 localhost kernel: [23260.079257] kernel BUG at > > mm/slub.c:3409! > > Feb 16 00:40:17 localhost kernel: [23260.079259] invalid opcode: [#1] > > SMP > > Feb 16 00:40:17 localhost kernel: [23260.079262] Modules linked in: > > ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 > > nf_defrag_ipv4 > > xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter > > ip_tables tun bridge stp llc nouveau snd_hda_codec_hdmi coretemp kvm_intel Was there an allocation failure earlier in the log? Might be this nouveau bug (linux-next at the time was 3.8 now): https://bugs.freedesktop.org/show_bug.cgi?id=58087 I think this was fixed but neither bug report has a cross reference :( The original report is here: https://bugzilla.kernel.org/show_bug.cgi?id=51291 Pekka, Can you please re-assign the bugzilla #51291 above to DRI? Thanks. Regards, Peter Hurley ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
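The BUG Christoph describes fires when kfree() is handed a pointer that kmalloc() never returned -- typically an embedded or static object, or a pointer into the middle of an allocation. A contrived example of the bug class (not nouveau code):

    #include <linux/slab.h>

    struct foo {
            int data;
    };

    static struct foo static_foo;           /* not from kmalloc() */

    static void broken_free(struct foo *f)
    {
            /* kfree(&static_foo), or kfree() on an offset into another
             * allocation, trips the slub sanity check and hits the
             * BUG at mm/slub.c reported above. */
            kfree(f);
    }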
Page allocation failures (was Re: WARNING: at drivers/gpu/drm/nouveau/core/core/mm.c:242)
On Tue, 2013-02-19 at 00:43 +0100, Jiri Slaby wrote: > On 02/19/2013 12:23 AM, Marcin Slusarz wrote: > > On Mon, Feb 18, 2013 at 11:27:43AM +0100, Jiri Slaby wrote: > >> Hi, > >> > >> we have a report of WARNING from 3.7.6 in nouveau at > >> drivers/gpu/drm/nouveau/core/core/mm.c:242 here: > >> https://bugzilla.novell.com/show_bug.cgi?id=802347#c11 > >> > >> There is an order 4 allocation failure in nouveau_drm_open -> > >> nouveau_vm_create, i.e. this one failed: > >> vm->pgt = kcalloc(vm->lpde - vm->fpde + 1, sizeof(*vm->pgt), GFP_KERNEL); Hi Jiri, I had the order 4 allocation failure and nouveau crash back in November on next-20121129. Bugzillas here: Allocation failure: https://bugzilla.kernel.org/show_bug.cgi?id=51301 Nouveau bug: https://bugzilla.kernel.org/show_bug.cgi?id=51291 https://bugs.freedesktop.org/show_bug.cgi?id=58087 IMO, the 32k allocation failure is the more serious bug. Check out the slab info from your report: Feb 06 13:16:15 desdemona kernel: Node 0 DMA32: 13378*4kB 5026*8kB 1823*16kB 135*32kB 5*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 129576kB Feb 06 13:16:15 desdemona kernel: Node 0 Normal: 1946*4kB 831*8kB 3*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 16496kB The pages are there: why did the allocation fail? I think this is related to all that kswapd mess. In my case, the machine really was OOM -- which made no sense. Completely out of page blocks larger than 32k on a 10gb machine with a bunch of emacs and terminal windows open for 3 days, just doing code, build, code, build, code, build? Regards, Peter Hurley ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
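As a reminder of what the order means here: an order-n request asks the page allocator for 2^n physically contiguous pages, so with 4 KiB pages a 32 KiB request is order 3 and a 64 KiB request is order 4. Sketch, reusing the kcalloc arguments quoted above (get_order() is the kernel helper that does this mapping):

    /* order 3 covers up to 32 KiB, order 4 up to 64 KiB (4 KiB pages) */
    size_t bytes = (vm->lpde - vm->fpde + 1) * sizeof(*vm->pgt);
    int order = get_order(bytes);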
[bisected][3.9.0-rc3] NULL ptr dereference from nv50_disp_intr()
On vanilla 3.9.0-rc3, I get this 100% repeatable oops after login when the user X session is coming up: BUG: unable to handle kernel NULL pointer dereference at 0001 IP: [<0001>] 0x0 PGD 0 Oops: 0010 [#1] PREEMPT SMP Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables .. CPU 3 Pid: 0, comm: swapper/3 Not tainted 3.9.0-rc3-xeon #rc3 Dell Inc. Precision WorkStation T5400 /0RW203 RIP: 0010:[<0001>] [<0001>] 0x0 RSP: 0018:8802afcc3d80 EFLAGS: 00010087 RAX: 88029f6e5808 RBX: 0001 RCX: RDX: 0096 RSI: 0001 RDI: 88029f6e5808 RBP: 8802afcc3dc8 R08: R09: 0004 R10: 002c R11: 88029e559a98 R12: 8802a376cb78 R13: 88029f6e57e0 R14: 88029f6e57f8 R15: 88029f6e5808 FS: () GS:8802afcc() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 0001 CR3: 00029fa67000 CR4: 07e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process swapper/3 (pid: 0, threadinfo 8802a355e000, task 8802a3535c40) Stack: a0159d8a 0082 88029f6e5820 0001 88029f71aa00 0400 0400 8802afcc3e38 a01843b5 8802afcc3df8 Call Trace: [] ? nouveau_event_trigger+0xaa/0xe0 [nouveau] [] nv50_disp_intr+0xc5/0x200 [nouveau] [] ? _raw_spin_unlock_irqrestore+0x2c/0x50 [] ? notifier_call_chain+0x4d/0x70 [] nouveau_mc_intr+0xb5/0x110 [nouveau] [] nouveau_irq_handler+0x6f/0x80 [nouveau] [] handle_irq_event_percpu+0x75/0x260 [] handle_irq_event+0x48/0x70 [] handle_fasteoi_irq+0x5a/0x100 [] handle_irq+0x22/0x40 [] do_IRQ+0x5a/0xd0 [] common_interrupt+0x6d/0x6d [] ? native_safe_halt+0x6/0x10 [] default_idle+0x3d/0x170 [] cpu_idle+0x116/0x130 [] start_secondary+0x251/0x258 Code: Bad RIP value. RIP [<0001>] 0x0 RSP CR2: 0001 ---[ end trace 907323cb8ce6f301 ]--- git bisect from 3.8.0 (good) to 3.9.0-rc3 (bad) blames (bisect log attached): 1d7c71a3e2f77336df536855b0efd2dc5bdeb41b is the first bad commit commit 1d7c71a3e2f77336df536855b0efd2dc5bdeb41b Author: Ben Skeggs Date: Thu Jan 31 09:23:34 2013 +1000 drm/nouveau/disp: port vblank handling to event interface This removes the nastiness with the interactions between display and software engines when handling vblank semaphore release interrupts. Now, all the semantics are handled in one place (sw) \o/. Signed-off-by: Ben Skeggs :04 04 fbd44f8566271415fd2775ab4b6346efef7e82fe a0730be0f35feaa1476b1447b1d65c4b3b3c0686 M drivers On this hardware: nouveau [ DEVICE][:02:00.0] BOOT0 : 0x084e00a2 nouveau [ DEVICE][:02:00.0] Chipset: G84 (NV84) nouveau [ DEVICE][:02:00.0] Family : NV50 nouveau [ VBIOS][:02:00.0] checking PRAMIN for image... nouveau [ VBIOS][:02:00.0] ... 
appears to be valid nouveau [ VBIOS][:02:00.0] using image from PRAMIN nouveau [ VBIOS][:02:00.0] BIT signature found nouveau [ VBIOS][:02:00.0] version 60.84.63.00.11 nouveau [ PFB][:02:00.0] RAM type: DDR2 nouveau [ PFB][:02:00.0] RAM size: 256 MiB nouveau [ PFB][:02:00.0]ZCOMP: 1892 tags nouveau [ DRM] VRAM: 256 MiB nouveau [ DRM] GART: 512 MiB nouveau [ DRM] BIT BIOS found nouveau [ DRM] Bios version 60.84.63.00 nouveau [ DRM] TMDS table version 2.0 nouveau [ DRM] DCB version 4.0 nouveau [ DRM] DCB outp 00: 02000300 0028 nouveau [ DRM] DCB outp 01: 01000302 0030 nouveau [ DRM] DCB outp 02: 04011310 0028 nouveau [ DRM] DCB outp 03: 02011312 0030 nouveau [ DRM] DCB conn 00: 1030 nouveau [ DRM] DCB conn 01: 2130 nouveau [ DRM] 2 available performance level(s) nouveau [ DRM] 0: core 208MHz shader 416MHz memory 100MHz voltage 1200mV fanspeed 100% nouveau [ DRM] 1: core 460MHz shader 920MHz memory 400MHz voltage 1200mV fanspeed 100% nouveau [ DRM] c: core 459MHz shader 918MHz memory 399MHz voltage 1200mV nouveau [ DRM] MM: using CRYPT for buffer copies nouveau [ DRM] allocated 1680x1050 fb: 0x6, bo 88029ef50400 fbcon: nouveaufb (fb0) is primary device nouveau :02:00.0: fb0: nouveaufb frame buffer device nouveau :02:00.0: registered panic notifier [drm] Initialized nouveau 1.1.0 20120801 for :02:00.0 on minor 0 02:00.0 VGA compatible controller: NVIDIA Corporation G84 [Quadro FX 570] (rev a1) (prog-if 00 [VGA controller]) Subsystem: NVIDIA Corporation Device 0474 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
Re: [Nouveau] nouveau lockdep splat
[ adding Ben Skeggs and Dave Airlie ] On Tue, 2013-03-19 at 21:24 +0100, Borislav Petkov wrote: > On Tue, Mar 05, 2013 at 05:30:52PM +0100, Lucas Stach wrote: > > Dropping Tegra ML, it's not the place where Nouveau mails should go. > > Adding Nouveau ML and Maarten, who probably knows Lockdep+Nouveau best. > > Ok, > > with the hope of having the right people on CC now (finally, thanks > Lucas :-)), here's the same splat on -rc3. Someone better take a look > soonish, please: Also happens in next (on nv50 hardware). > [0.541078] [drm] No driver support for vblank timestamp query. > [0.541272] nouveau [ DRM] 3 available performance level(s) > [0.541276] nouveau [ DRM] 0: core 135MHz shader 270MHz memory 135MHz > voltage 900mV > [0.541280] nouveau [ DRM] 1: core 405MHz shader 810MHz memory 405MHz > voltage 900mV > [0.541284] nouveau [ DRM] 3: core 520MHz shader 1230MHz memory > 790MHz voltage 900mV > [0.541287] nouveau [ DRM] c: core 405MHz shader 810MHz memory 405MHz > voltage 900mV > [0.559846] nouveau [ DRM] MM: using COPY for buffer copies > [0.625371] nouveau [ DRM] allocated 1920x1080 fb: 0x7, bo > 88043b54f000 > [0.625441] fbcon: nouveaufb (fb0) is primary device > [0.62] > [0.625556] = > [0.625556] [ INFO: possible recursive locking detected ] > [0.625557] 3.9.0-rc3+ #25 Not tainted > [0.625557] - > [0.625558] swapper/0/1 is trying to acquire lock: > [0.625562] (&dmac->lock){+.+...}, at: [] > evo_wait+0x43/0xf0 > [0.625562] > [0.625562] but task is already holding lock: > [0.625564] (&dmac->lock){+.+...}, at: [] > evo_wait+0x43/0xf0 > [0.625565] > [0.625565] other info that might help us debug this: > [0.625565] Possible unsafe locking scenario: > [0.625565] > [0.625565]CPU0 > [0.625565] > [0.625566] lock(&dmac->lock); > [0.625567] lock(&dmac->lock); > [0.625567] > [0.625567] *** DEADLOCK *** > [0.625567] > [0.625567] May be due to missing lock nesting notation > [0.625567] > [0.625568] 10 locks held by swapper/0/1: > [0.625570] #0: (&__lockdep_no_validate__){..}, at: > [] __driver_attach+0x5b/0xb0 > [0.625572] #1: (&__lockdep_no_validate__){..}, at: > [] __driver_attach+0x69/0xb0 > [0.625575] #2: (drm_global_mutex){+.+.+.}, at: [] > drm_get_pci_dev+0xc6/0x2d0 > [0.625578] #3: (registration_lock){+.+.+.}, at: [] > register_framebuffer+0x25/0x310 > [0.625581] #4: (&fb_info->lock){+.+.+.}, at: [] > lock_fb_info+0x26/0x60 > [0.625583] #5: (console_lock){+.+.+.}, at: [] > register_framebuffer+0x1ba/0x310 > [0.625585] #6: ((fb_notifier_list).rwsem){.+.+.+}, at: > [] __blocking_notifier_call_chain+0x42/0x80 > [0.625587] #7: (&dev->mode_config.mutex){+.+.+.}, at: > [] drm_modeset_lock_all+0x2a/0x70 > [0.625589] #8: (&crtc->mutex){+.+.+.}, at: [] > drm_modeset_lock_all+0x54/0x70 > [0.625591] #9: (&dmac->lock){+.+...}, at: [] > evo_wait+0x43/0xf0 > [0.625591] > [0.625591] stack backtrace: > [0.625592] Pid: 1, comm: swapper/0 Not tainted 3.9.0-rc3+ #25 > [0.625593] Call Trace: > [0.625595] [] __lock_acquire+0x76b/0x1c20 > [0.625597] [] ? dcb_table+0x1ac/0x2a0 > [0.625599] [] lock_acquire+0x8a/0x120 > [0.625600] [] ? evo_wait+0x43/0xf0 > [0.625602] [] ? mutex_lock_nested+0x292/0x330 > [0.625603] [] mutex_lock_nested+0x6e/0x330 > [0.625605] [] ? evo_wait+0x43/0xf0 > [0.625606] [] ? mark_held_locks+0x9b/0x100 > [0.625607] [] evo_wait+0x43/0xf0 > [0.625609] [] nv50_display_flip_next+0x713/0x7a0 > [0.625611] [] ? mutex_unlock+0xe/0x10 > [0.625612] [] ? 
evo_kick+0x37/0x40 > [0.625613] [] nv50_crtc_commit+0x10e/0x230 > [0.625615] [] drm_crtc_helper_set_mode+0x365/0x510 > [0.625617] [] drm_crtc_helper_set_config+0xa4e/0xb70 > [0.625618] [] drm_mode_set_config_internal+0x31/0x70 > [0.625619] [] drm_fb_helper_set_par+0x71/0xf0 > [0.625621] [] fbcon_init+0x514/0x5a0 > [0.625623] [] visual_init+0xbc/0x120 > [0.625624] [] do_bind_con_driver+0x163/0x320 > [0.625625] [] do_take_over_console+0x61/0x70 > [0.625627] [] do_fbcon_takeover+0x63/0xc0 > [0.625628] [] fbcon_event_notify+0x5fd/0x700 > [0.625629] [] notifier_call_chain+0x4d/0x70 > [0.625630] [] __blocking_notifier_call_chain+0x58/0x80 > [0.625631] [] blocking_notifier_call_chain+0x16/0x20 > [0.625633] [] fb_notifier_call_chain+0x1b/0x20 > [0.625634] [] register_framebuffer+0x1c8/0x310 > [0.625635] [] drm_fb_helper_initial_config+0x371/0x520 > [0.625637] [] ? > drm_fb_helper_single_add_all_connectors+0x47/0xf0 > [0.625639] [] ? kmem_cache_alloc_trace+0xee/0x150 > [0.625641] [] nouveau_fbcon_init+0x10e/0x160 > [
Re: [bisected][3.9.0-rc3] NULL ptr dereference from nv50_disp_intr()
On Tue, 2013-03-19 at 11:13 -0400, Peter Hurley wrote: > On vanilla 3.9.0-rc3, I get this 100% repeatable oops after login when > the user X session is coming up: Perhaps I wasn't clear that this happens on every boot and is a regression from 3.8 I'd be happy to help resolve this but time is of the essence; it would be a shame to have to revert all of this for 3.9 Regards, Peter Hurley > BUG: unable to handle kernel NULL pointer dereference at 0001 > IP: [<0001>] 0x0 > PGD 0 > Oops: 0010 [#1] PREEMPT SMP > Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables > .. > CPU 3 > Pid: 0, comm: swapper/3 Not tainted 3.9.0-rc3-xeon #rc3 Dell Inc. Precision > WorkStation T5400 /0RW203 > RIP: 0010:[<0001>] [<0001>] 0x0 > RSP: 0018:8802afcc3d80 EFLAGS: 00010087 > RAX: 88029f6e5808 RBX: 0001 RCX: > RDX: 0096 RSI: 0001 RDI: 88029f6e5808 > RBP: 8802afcc3dc8 R08: R09: 0004 > R10: 002c R11: 88029e559a98 R12: 8802a376cb78 > R13: 88029f6e57e0 R14: 88029f6e57f8 R15: 88029f6e5808 > FS: () GS:8802afcc() knlGS: > CS: 0010 DS: ES: CR0: 8005003b > CR2: 0001 CR3: 00029fa67000 CR4: 07e0 > DR0: DR1: DR2: > DR3: DR6: 0ff0 DR7: 0400 > Process swapper/3 (pid: 0, threadinfo 8802a355e000, task 8802a3535c40) > Stack: > a0159d8a 0082 88029f6e5820 0001 > 88029f71aa00 0400 > 0400 8802afcc3e38 a01843b5 8802afcc3df8 > Call Trace: > > [] ? nouveau_event_trigger+0xaa/0xe0 [nouveau] > [] nv50_disp_intr+0xc5/0x200 [nouveau] > [] ? _raw_spin_unlock_irqrestore+0x2c/0x50 > [] ? notifier_call_chain+0x4d/0x70 > [] nouveau_mc_intr+0xb5/0x110 [nouveau] > [] nouveau_irq_handler+0x6f/0x80 [nouveau] > [] handle_irq_event_percpu+0x75/0x260 > [] handle_irq_event+0x48/0x70 > [] handle_fasteoi_irq+0x5a/0x100 > [] handle_irq+0x22/0x40 > [] do_IRQ+0x5a/0xd0 > [] common_interrupt+0x6d/0x6d > > [] ? native_safe_halt+0x6/0x10 > [] default_idle+0x3d/0x170 > [] cpu_idle+0x116/0x130 > [] start_secondary+0x251/0x258 > Code: Bad RIP value. > RIP [<0001>] 0x0 > RSP > CR2: 0001 > ---[ end trace 907323cb8ce6f301 ]--- > > > > git bisect from 3.8.0 (good) to 3.9.0-rc3 (bad) blames (bisect log > attached): > > 1d7c71a3e2f77336df536855b0efd2dc5bdeb41b is the first bad commit > commit 1d7c71a3e2f77336df536855b0efd2dc5bdeb41b > Author: Ben Skeggs > Date: Thu Jan 31 09:23:34 2013 +1000 > > drm/nouveau/disp: port vblank handling to event interface > > This removes the nastiness with the interactions between display and > software engines when handling vblank semaphore release interrupts. > > Now, all the semantics are handled in one place (sw) \o/. > > Signed-off-by: Ben Skeggs > > :04 04 fbd44f8566271415fd2775ab4b6346efef7e82fe > a0730be0f35feaa1476b1447b1d65c4b3b3c0686 Mdrivers > > > On this hardware: > nouveau [ DEVICE][:02:00.0] BOOT0 : 0x084e00a2 > nouveau [ DEVICE][:02:00.0] Chipset: G84 (NV84) > nouveau [ DEVICE][:02:00.0] Family : NV50 > nouveau [ VBIOS][:02:00.0] checking PRAMIN for image... > nouveau [ VBIOS][:02:00.0] ... 
appears to be valid > nouveau [ VBIOS][:02:00.0] using image from PRAMIN > nouveau [ VBIOS][:02:00.0] BIT signature found > nouveau [ VBIOS][:02:00.0] version 60.84.63.00.11 > nouveau [ PFB][:02:00.0] RAM type: DDR2 > nouveau [ PFB][:02:00.0] RAM size: 256 MiB > nouveau [ PFB][:02:00.0]ZCOMP: 1892 tags > nouveau [ DRM] VRAM: 256 MiB > nouveau [ DRM] GART: 512 MiB > nouveau [ DRM] BIT BIOS found > nouveau [ DRM] Bios version 60.84.63.00 > nouveau [ DRM] TMDS table version 2.0 > nouveau [ DRM] DCB version 4.0 > nouveau [ DRM] DCB outp 00: 02000300 0028 > nouveau [ DRM] DCB outp 01: 01000302 0030 > nouveau [ DRM] DCB outp 02: 04011310 0028 > nouveau [ DRM] DCB outp 03: 02011312 0030 > nouveau [ DRM] DCB conn 00: 1030 > nouveau [ DRM] DCB conn 01: 2130 > nouveau [ DRM] 2 available performance level(s) > nouveau [ DRM] 0: core 208MHz shader 416MHz memory 100MHz voltage 1200mV > fanspeed 100% > nouveau [ DRM] 1: core 46
Re: [PATCH] drm/nouveau: fix NULL ptr dereference from nv50_disp_intr()
On Sun, 2013-03-24 at 12:56 +0100, Maarten Lankhorst wrote:
> Op 23-03-13 12:47, Peter Hurley schreef:
> > On Tue, 2013-03-19 at 11:13 -0400, Peter Hurley wrote:
> >> On vanilla 3.9.0-rc3, I get this 100% repeatable oops after login when
> >> the user X session is coming up:
> > Perhaps I wasn't clear that this happens on every boot and is a
> > regression from 3.8
> >
> > I'd be happy to help resolve this but time is of the essence; it would
> > be a shame to have to revert all of this for 3.9
>
> Well it broke on my system too, so it was easy to fix.
>
> I didn't even need gdm to trigger it!
>
> >8
> This fixes regression caused by 1d7c71a3e2f7 (drm/nouveau/disp: port
> vblank handling to event interface),

Thanks Maarten!

But am I the only one running multi-head nouveau on linux-next and
early RCs? That's a scary thought.

Is there a test bench for validating nouveau?

Regards,
Peter Hurley
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [Nouveau] [PATCH v2] drm/nouveau: wait for vblank on page flipping
On Thu, 2013-03-28 at 16:16 +0100, Maarten Lankhorst wrote: > Signed-off-by: Maarten Lankhorst > --- > Oops, fixed to apply this time.. > > diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c > b/drivers/gpu/drm/nouveau/nouveau_display.c > index 4610c3a..020542e 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_display.c > +++ b/drivers/gpu/drm/nouveau/nouveau_display.c > @@ -593,7 +597,7 @@ nouveau_crtc_page_flip(struct drm_crtc *crtc, struct > drm_framebuffer *fb, > > /* Emit a page flip */ > if (nv_device(drm->device)->card_type >= NV_50) { > - ret = nv50_display_flip_next(crtc, fb, chan, 0); > + ret = nv50_display_flip_next(crtc, fb, chan, 1); Why would this work? > if (ret) { > mutex_unlock(&chan->cli->mutex); > goto fail_unreserve; ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
appears to be valid > nouveau [ VBIOS][:02:00.0] using image from PRAMIN > nouveau [ VBIOS][:02:00.0] BIT signature found > nouveau [ VBIOS][:02:00.0] version 60.84.63.00.11 > nouveau [ PFB][:02:00.0] RAM type: DDR2 > nouveau [ PFB][:02:00.0] RAM size: 256 MiB > nouveau [ PFB][:02:00.0]ZCOMP: 1892 tags > nouveau [ DRM] VRAM: 256 MiB > nouveau [ DRM] GART: 512 MiB > nouveau [ DRM] BIT BIOS found > nouveau [ DRM] Bios version 60.84.63.00 > nouveau [ DRM] TMDS table version 2.0 > nouveau [ DRM] DCB version 4.0 > nouveau [ DRM] DCB outp 00: 02000300 0028 > nouveau [ DRM] DCB outp 01: 01000302 0030 > nouveau [ DRM] DCB outp 02: 04011310 0028 > nouveau [ DRM] DCB outp 03: 02011312 0030 > nouveau [ DRM] DCB conn 00: 1030 > nouveau [ DRM] DCB conn 01: 2130 > nouveau [ DRM] 2 available performance level(s) > nouveau [ DRM] 0: core 208MHz shader 416MHz memory 100MHz voltage 1200mV > fanspeed 100% > nouveau [ DRM] 1: core 46
[PATCH] drm/nouveau: fix NULL ptr dereference from nv50_disp_intr()
On Sun, 2013-03-24 at 12:56 +0100, Maarten Lankhorst wrote: > Op 23-03-13 12:47, Peter Hurley schreef: > > On Tue, 2013-03-19 at 11:13 -0400, Peter Hurley wrote: > >> On vanilla 3.9.0-rc3, I get this 100% repeatable oops after login when > >> the user X session is coming up: > > Perhaps I wasn't clear that this happens on every boot and is a > > regression from 3.8 > > > > I'd be happy to help resolve this but time is of the essence; it would > > be a shame to have to revert all of this for 3.9 > > Well it broke on my system too, so it was easy to fix. > > I didn't even need gdm to trigger it! > > >8 > This fixes regression caused by 1d7c71a3e2f7 (drm/nouveau/disp: port vblank > handling to event interface), Thanks Maarten! But am I the only one running multi-head nouveau on linux-next and early RCs? That's a scary thought. Is there a test bench for validating nouveau? Regards, Peter Hurley
[Nouveau] [PATCH v2] drm/nouveau: wait for vblank on page flipping
On Thu, 2013-03-28 at 16:16 +0100, Maarten Lankhorst wrote: > Signed-off-by: Maarten Lankhorst > --- > Oops, fixed to apply this time.. > > diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c > b/drivers/gpu/drm/nouveau/nouveau_display.c > index 4610c3a..020542e 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_display.c > +++ b/drivers/gpu/drm/nouveau/nouveau_display.c > @@ -593,7 +597,7 @@ nouveau_crtc_page_flip(struct drm_crtc *crtc, struct > drm_framebuffer *fb, > > /* Emit a page flip */ > if (nv_device(drm->device)->card_type >= NV_50) { > - ret = nv50_display_flip_next(crtc, fb, chan, 0); > + ret = nv50_display_flip_next(crtc, fb, chan, 1); Why would this work? > if (ret) { > mutex_unlock(&chan->cli->mutex); > goto fail_unreserve;
[Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers
On 05/04/2015 12:52 AM, Mario Kleiner wrote: > On 04/16/2015 03:03 PM, Daniel Vetter wrote: >> On Thu, Apr 16, 2015 at 08:30:55AM -0400, Peter Hurley wrote: >>> On 04/15/2015 01:31 PM, Daniel Vetter wrote: >>>> On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote: >>>>> Hi Daniel, >>>>> >>>>> On 04/15/2015 03:17 AM, Daniel Vetter wrote: >>>>>> This was a bit too much cargo-culted, so lets make it solid: >>>>>> - vblank->count doesn't need to be an atomic, writes are always done >>>>>>under the protection of dev->vblank_time_lock. Switch to an unsigned >>>>>>long instead and update comments. Note that atomic_read is just a >>>>>>normal read of a volatile variable, so no need to audit all the >>>>>>read-side access specifically. >>>>>> >>>>>> - The barriers for the vblank counter seqlock weren't complete: The >>>>>>read-side was missing the first barrier between the counter read and >>>>>>the timestamp read, it only had a barrier between the ts and the >>>>>>counter read. We need both. >>>>>> >>>>>> - Barriers weren't properly documented. Since barriers only work if >>>>>>you have them on boths sides of the transaction it's prudent to >>>>>>reference where the other side is. To avoid duplicating the >>>>>>write-side comment 3 times extract a little store_vblank() helper. >>>>>>In that helper also assert that we do indeed hold >>>>>>dev->vblank_time_lock, since in some cases the lock is acquired a >>>>>>few functions up in the callchain. >>>>>> >>>>>> Spotted while reviewing a patch from Chris Wilson to add a fastpath to >>>>>> the vblank_wait ioctl. >>>>>> >>>>>> Cc: Chris Wilson >>>>>> Cc: Mario Kleiner >>>>>> Cc: Ville Syrjälä >>>>>> Cc: Michel Dänzer >>>>>> Signed-off-by: Daniel Vetter >>>>>> --- >>>>>> drivers/gpu/drm/drm_irq.c | 92 >>>>>> --- >>>>>> include/drm/drmP.h| 8 +++-- >>>>>> 2 files changed, 54 insertions(+), 46 deletions(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c >>>>>> index c8a34476570a..23bfbc61a494 100644 >>>>>> --- a/drivers/gpu/drm/drm_irq.c >>>>>> +++ b/drivers/gpu/drm/drm_irq.c >>>>>> @@ -74,6 +74,33 @@ module_param_named(vblankoffdelay, >>>>>> drm_vblank_offdelay, int, 0600); >>>>>> module_param_named(timestamp_precision_usec, drm_timestamp_precision, >>>>>> int, 0600); >>>>>> module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, >>>>>> 0600); >>>>>> >>>>>> +static void store_vblank(struct drm_device *dev, int crtc, >>>>>> + unsigned vblank_count_inc, >>>>>> + struct timeval *t_vblank) >>>>>> +{ >>>>>> +struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; >>>>>> +u32 tslot; >>>>>> + >>>>>> +assert_spin_locked(&dev->vblank_time_lock); >>>>>> + >>>>>> +if (t_vblank) { >>>>>> +tslot = vblank->count + vblank_count_inc; >>>>>> +vblanktimestamp(dev, crtc, tslot) = *t_vblank; >>>>>> +} >>>>>> + >>>>>> +/* >>>>>> + * vblank timestamp updates are protected on the write side with >>>>>> + * vblank_time_lock, but on the read side done locklessly using a >>>>>> + * sequence-lock on the vblank counter. Ensure correct ordering >>>>>> using >>>>>> + * memory barrriers. We need the barrier both before and also after >>>>>> the >>>>>> + * counter update to synchronize with the next timestamp write. >>>>>> + * The read-side barriers for this are in drm_vblank_count_and_time. >>>>>> + */ >>>>>> +smp_wmb(); >>>>>> +vblank->count += vblank_c
[Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers
On 05/05/2015 11:42 AM, Daniel Vetter wrote: > On Tue, May 05, 2015 at 10:36:24AM -0400, Peter Hurley wrote: >> On 05/04/2015 12:52 AM, Mario Kleiner wrote: >>> On 04/16/2015 03:03 PM, Daniel Vetter wrote: >>>> On Thu, Apr 16, 2015 at 08:30:55AM -0400, Peter Hurley wrote: >>>>> On 04/15/2015 01:31 PM, Daniel Vetter wrote: >>>>>> On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote: >>>>>>> Hi Daniel, >>>>>>> >>>>>>> On 04/15/2015 03:17 AM, Daniel Vetter wrote: >>>>>>>> This was a bit too much cargo-culted, so lets make it solid: >>>>>>>> - vblank->count doesn't need to be an atomic, writes are always done >>>>>>>>under the protection of dev->vblank_time_lock. Switch to an unsigned >>>>>>>>long instead and update comments. Note that atomic_read is just a >>>>>>>>normal read of a volatile variable, so no need to audit all the >>>>>>>>read-side access specifically. >>>>>>>> >>>>>>>> - The barriers for the vblank counter seqlock weren't complete: The >>>>>>>>read-side was missing the first barrier between the counter read and >>>>>>>>the timestamp read, it only had a barrier between the ts and the >>>>>>>>counter read. We need both. >>>>>>>> >>>>>>>> - Barriers weren't properly documented. Since barriers only work if >>>>>>>>you have them on boths sides of the transaction it's prudent to >>>>>>>>reference where the other side is. To avoid duplicating the >>>>>>>>write-side comment 3 times extract a little store_vblank() helper. >>>>>>>>In that helper also assert that we do indeed hold >>>>>>>>dev->vblank_time_lock, since in some cases the lock is acquired a >>>>>>>>few functions up in the callchain. >>>>>>>> >>>>>>>> Spotted while reviewing a patch from Chris Wilson to add a fastpath to >>>>>>>> the vblank_wait ioctl. >>>>>>>> >>>>>>>> Cc: Chris Wilson >>>>>>>> Cc: Mario Kleiner >>>>>>>> Cc: Ville Syrjälä >>>>>>>> Cc: Michel Dänzer >>>>>>>> Signed-off-by: Daniel Vetter >>>>>>>> --- >>>>>>>> drivers/gpu/drm/drm_irq.c | 92 >>>>>>>> --- >>>>>>>> include/drm/drmP.h| 8 +++-- >>>>>>>> 2 files changed, 54 insertions(+), 46 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c >>>>>>>> index c8a34476570a..23bfbc61a494 100644 >>>>>>>> --- a/drivers/gpu/drm/drm_irq.c >>>>>>>> +++ b/drivers/gpu/drm/drm_irq.c >>>>>>>> @@ -74,6 +74,33 @@ module_param_named(vblankoffdelay, >>>>>>>> drm_vblank_offdelay, int, 0600); >>>>>>>> module_param_named(timestamp_precision_usec, >>>>>>>> drm_timestamp_precision, int, 0600); >>>>>>>> module_param_named(timestamp_monotonic, drm_timestamp_monotonic, >>>>>>>> int, 0600); >>>>>>>> >>>>>>>> +static void store_vblank(struct drm_device *dev, int crtc, >>>>>>>> + unsigned vblank_count_inc, >>>>>>>> + struct timeval *t_vblank) >>>>>>>> +{ >>>>>>>> +struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; >>>>>>>> +u32 tslot; >>>>>>>> + >>>>>>>> +assert_spin_locked(&dev->vblank_time_lock); >>>>>>>> + >>>>>>>> +if (t_vblank) { >>>>>>>> +tslot = vblank->count + vblank_count_inc; >>>>>>>> +vblanktimestamp(dev, crtc, tslot) = *t_vblank; >>>>>>>> +} >>>>>>>> + >>>>>>>> +/* >>>>>>>> + * vblank timestamp updates are protected on the write side with >>
[Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers
On 05/05/2015 11:57 AM, Peter Hurley wrote: > On 05/05/2015 11:42 AM, Daniel Vetter wrote: >> I'm also somewhat confused about how you to a line across both cpus for >> barriers because barriers only have cpu-local effects (which is why we >> always need a barrier on both ends of a transaction). I'm sorry if my barrier notation confuses you; I find that it clearly identifies matching pairs. Also, there is a distinction between "can be visible" and "must be visible"; the loads and stores themselves are not cpu-local. Regards, Peter Hurley
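The pairing requirement can be illustrated with the usual two-CPU flag-and-payload example: each barrier only orders its own CPU's accesses, but the paired combination is what turns "the payload can be visible" into "the payload must be visible". Variable names below are illustrative only, not drm code:

#include <linux/kernel.h>
#include <linux/compiler.h>
#include <asm/barrier.h>

static int payload;
static int ready;

/* CPU0: publish the payload, then the flag. */
static void writer(void)
{
        payload = 42;
        smp_wmb();              /* pairs with the smp_rmb() in reader() */
        WRITE_ONCE(ready, 1);
}

/* CPU1: only if it sees the flag is the payload guaranteed visible. */
static void reader(void)
{
        if (READ_ONCE(ready)) {
                smp_rmb();      /* paired barrier on the read side */
                pr_info("payload=%d\n", payload);
        }
}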
[Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers
On 05/06/2015 04:56 AM, Daniel Vetter wrote: > On Tue, May 05, 2015 at 11:57:42AM -0400, Peter Hurley wrote: >> On 05/05/2015 11:42 AM, Daniel Vetter wrote: >>> On Tue, May 05, 2015 at 10:36:24AM -0400, Peter Hurley wrote: >>>> On 05/04/2015 12:52 AM, Mario Kleiner wrote: >>>>> On 04/16/2015 03:03 PM, Daniel Vetter wrote: >>>>>> On Thu, Apr 16, 2015 at 08:30:55AM -0400, Peter Hurley wrote: >>>>>>> On 04/15/2015 01:31 PM, Daniel Vetter wrote: >>>>>>>> On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote: >>>>>>>>> Hi Daniel, >>>>>>>>> >>>>>>>>> On 04/15/2015 03:17 AM, Daniel Vetter wrote: >>>>>>>>>> This was a bit too much cargo-culted, so lets make it solid: >>>>>>>>>> - vblank->count doesn't need to be an atomic, writes are always done >>>>>>>>>>under the protection of dev->vblank_time_lock. Switch to an >>>>>>>>>> unsigned >>>>>>>>>>long instead and update comments. Note that atomic_read is just a >>>>>>>>>>normal read of a volatile variable, so no need to audit all the >>>>>>>>>>read-side access specifically. >>>>>>>>>> >>>>>>>>>> - The barriers for the vblank counter seqlock weren't complete: The >>>>>>>>>>read-side was missing the first barrier between the counter read >>>>>>>>>> and >>>>>>>>>>the timestamp read, it only had a barrier between the ts and the >>>>>>>>>>counter read. We need both. >>>>>>>>>> >>>>>>>>>> - Barriers weren't properly documented. Since barriers only work if >>>>>>>>>>you have them on boths sides of the transaction it's prudent to >>>>>>>>>>reference where the other side is. To avoid duplicating the >>>>>>>>>>write-side comment 3 times extract a little store_vblank() helper. >>>>>>>>>>In that helper also assert that we do indeed hold >>>>>>>>>>dev->vblank_time_lock, since in some cases the lock is acquired a >>>>>>>>>>few functions up in the callchain. >>>>>>>>>> >>>>>>>>>> Spotted while reviewing a patch from Chris Wilson to add a fastpath >>>>>>>>>> to >>>>>>>>>> the vblank_wait ioctl. >>>>>>>>>> >>>>>>>>>> Cc: Chris Wilson >>>>>>>>>> Cc: Mario Kleiner >>>>>>>>>> Cc: Ville Syrjälä >>>>>>>>>> Cc: Michel Dänzer >>>>>>>>>> Signed-off-by: Daniel Vetter >>>>>>>>>> --- >>>>>>>>>> drivers/gpu/drm/drm_irq.c | 92 >>>>>>>>>> --- >>>>>>>>>> include/drm/drmP.h| 8 +++-- >>>>>>>>>> 2 files changed, 54 insertions(+), 46 deletions(-) >>>>>>>>>> >>>>>>>>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c >>>>>>>>>> index c8a34476570a..23bfbc61a494 100644 >>>>>>>>>> --- a/drivers/gpu/drm/drm_irq.c >>>>>>>>>> +++ b/drivers/gpu/drm/drm_irq.c >>>>>>>>>> @@ -74,6 +74,33 @@ module_param_named(vblankoffdelay, >>>>>>>>>> drm_vblank_offdelay, int, 0600); >>>>>>>>>> module_param_named(timestamp_precision_usec, >>>>>>>>>> drm_timestamp_precision, int, 0600); >>>>>>>>>> module_param_named(timestamp_monotonic, drm_timestamp_monotonic, >>>>>>>>>> int, 0600); >>>>>>>>>> >>>>>>>>>> +static void store_vblank(struct drm_device *dev, int crtc, >>>>>>>>>> + unsigned vblank_count_inc, >>>>>>>>>> + struct timeval *t_vblank) >>>>>>>>>> +{ >>>>>>>>>> +stru
[PATCH] drm/nouveau/core: deinline nv_mask()
On 05/07/2015 04:49 AM, Denys Vlasenko wrote: > Function compiles to 89 bytes of machine code. > 466 callsites with this .config: > http://busybox.net/~vda/kernel_config > Size reduction: Much of the cruft is related to calling iowriteX. Ben, Isn't subdev io always mmio? (iow, never to the 64k i/o space) >text data bss dec hex filename > 82432426 22255384 20627456 125315266 77828c2 vmlinux.before > 82426986 22255416 20627456 125309858 77813a2 vmlinux > > Signed-off-by: Denys Vlasenko > CC: Stefan Huehner > CC: Ben Skeggs > CC: David Airlie > CC: dri-devel at lists.freedesktop.org > CC: linux-kernel at vger.kernel.org > --- > drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h | 9 ++--- > drivers/gpu/drm/nouveau/nvkm/core/subdev.c | 8 > 2 files changed, 10 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h > b/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h > index 6fdc391..261b7ff 100644 > --- a/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h > +++ b/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h > @@ -109,11 +109,6 @@ nv_wr32(void *obj, u32 addr, u32 data) > iowrite32_native(data, subdev->mmio + addr); > } > > -static inline u32 > -nv_mask(void *obj, u32 addr, u32 mask, u32 data) > -{ > - u32 temp = nv_rd32(obj, addr); > - nv_wr32(obj, addr, (temp & ~mask) | data); > - return temp; > -} > +u32 > +nv_mask(void *obj, u32 addr, u32 mask, u32 data); > #endif > diff --git a/drivers/gpu/drm/nouveau/nvkm/core/subdev.c > b/drivers/gpu/drm/nouveau/nvkm/core/subdev.c > index c5fb3a79..88331ea 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/core/subdev.c > +++ b/drivers/gpu/drm/nouveau/nvkm/core/subdev.c > @@ -25,6 +25,14 @@ > #include > #include > > +u32 > +nv_mask(void *obj, u32 addr, u32 mask, u32 data) > +{ > + u32 temp = nv_rd32(obj, addr); > + nv_wr32(obj, addr, (temp & ~mask) | data); > + return temp; > +} > + > struct nvkm_subdev * > nvkm_subdev(void *obj, int idx) > { >
[PATCH] drm/nouveau/gem: tolerate a buffer specified multiple times
On 07/30/2015 10:12 AM, Ilia Mirkin wrote: > Is this happening with libdrm 2.4.60? If so, that's a known > (user-side) issue and should be fixed by using any version but that > one. What's the freedesktop bugzilla # for reference? Regards, Peter Hurley > On Thu, Jul 30, 2015 at 6:28 AM, Bryan O'Donoghue > wrote: >> Ubuntu is shipping Chrome Version 44.0.2403.125 (64-bit). With this version >> of the browser and current tip-of-tree 86ea07ca846a I get the following >> error message followed by a lock-up of X. >> >> nouveau E[chrome[2737]] multiple instances of buffer 33 on validation list >> nouveau E[chrome[2737]] validate_init >> nouveau E[chrome[2737]] validate: -22 >> nouveau E[chrome[2737]] multiple instances of buffer 18 on validation list >> nouveau E[chrome[2737]] validate_init >> nouveau E[chrome[2737]] validate: -22 >> nouveau E[ PFIFO][:01:00.0] PFIFO: read fault at >> 0x0003e21000 [PAGE_NOT_PRESENT] from (unknown enum >> 0x)/GPC0/(unknown enum 0x000f) on channel 0x007f80c000 >> [unknown] >> >> This patch suggests a fix for this with the kernel simply tolerating an >> application such as chrome requesting the same buffer more than once. >> >> With the version of chrome given above, you can elicit this behaviour by >> clicking on the bookmarks drop down. This will open another window on-top >> of the current window. Minus the fix included here, this will lead to hard >> lockup of all windows on the desktop. >> >> Chrome Version 44.0.2403.125 (64-bit) >> Linux 4.2.0-rc4+ 86ea07ca846a >> >> People are suggesting running chrome with -disable-gpu however it is >> possible to run Chrome in it's default mode, so long as we tolerate the >> above behaviour. >> >> http://tinyurl.com/orvbzf3 >> >> Signed-off-by: Bryan O'Donoghue >> --- >> drivers/gpu/drm/nouveau/nouveau_gem.c | 4 +--- >> 1 file changed, 1 insertion(+), 3 deletions(-) >> >> diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c >> b/drivers/gpu/drm/nouveau/nouveau_gem.c >> index af1ee51..a9694faad 100644 >> --- a/drivers/gpu/drm/nouveau/nouveau_gem.c >> +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c >> @@ -401,9 +401,7 @@ retry: >> if (nvbo->reserved_by && nvbo->reserved_by == file_priv) { >> NV_PRINTK(error, cli, "multiple instances of buffer >> %d on " >> "validation list\n", b->handle); >> - drm_gem_object_unreference_unlocked(gem); >> - ret = -EINVAL; >> - break; >> + continue; >> } >> >> ret = ttm_bo_reserve(&nvbo->bo, true, false, true, >> &op->ticket); >> -- >> 1.9.1
[PATCH] drm/nouveau/gem: tolerate a buffer specified multiple times
[ +cc Debian maintainer ] On 07/30/2015 11:26 AM, Emil Velikov wrote: > On 30 July 2015 at 16:02, Ilia Mirkin wrote: >> On Thu, Jul 30, 2015 at 10:56 AM, Bryan O'Donoghue >> wrote: >>> On 30/07/15 15:52, Bryan O'Donoghue wrote: >>>> >>>> On 30/07/15 15:49, Peter Hurley wrote: >>>>> >>>>> On 07/30/2015 10:12 AM, Ilia Mirkin wrote: >>>>>> >>>>>> Is this happening with libdrm 2.4.60? If so, that's a known >>>>>> (user-side) issue and should be fixed by using any version but that >>>>>> one. >>>>> >>>>> >>>>> What's the freedesktop bugzilla # for reference? >>>>> >>>>> Regards, >>>>> Peter Hurley >>>> >>>> >>>> I believe it's this one >>>> >>>> https://bugs.freedesktop.org/show_bug.cgi?id=89842#c19 >>>> >>> >>> Not really a world of choice on ubuntu to fix it though... >>> >>> deckard at aineko:~/Development/projectara$ apt-show-versions libdrm2 >>> libdrm2:amd64/trusty-updates 2.4.60-2~ubuntu14.04.1 uptodate >>> libdrm2:i386/trusty-updates 2.4.60-2~ubuntu14.04.1 uptodate >>> >>> :( >> >> That's unfortunate. I know next to nothing about debian/ubuntu or how >> they do versions or how to even build packages for them. But they're >> big distros, presumably they have support teams of some sort, perhaps >> they can help you. >> >> Assuming that switching away does resolve the issue for you, perhaps >> you can also recommend that they avoid shipping that version, or >> include this nouveau fix in it: >> >> http://cgit.freedesktop.org/mesa/drm/commit/?id=812e8fe6ce46d733c30207ee26c788c61f546294 >> > Fwiw debian has been tracking this as #789759, and they are shipping > 2.4.62 which includes the fix. Unfortunately the LTS version of Ubuntu (trusty) was updated to 2.4.60 several days ago without this fix. I repackaged libdrm 2.4.60 with only the bug fix above and confirm the patch above fixes the observed behavior in freedesktop bug# 89842/ debian bug# 789759. I pushed the repackage to Launchpad PPA @ ppa:phurley/libdrm Hopefully the Debian maintainer grabs this fix and updates the official distribution version soon. Regards, Peter Hurley
[PATCH] vt_buffer: drop console buffer copying optimisations
On 01/28/2015 11:11 PM, Dave Airlie wrote: > These two copy to/from VGA memory, however on the Silicon > Motion SMI750 VGA card on a 64-bit system cause console corruption. > > This is due to the hw being buggy and not handling a 64-bit transaction > correctly. > > We could try and create a 32-bit version of these routines, > but I'm not sure the optimisation is worth much today. > > Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1132826 Restricted link. > Tested-by: Huawei engineering. > Signed-off-by: Dave Airlie > --- > > Linus, this came up a while back I finally got some confirmation > that it fixes those servers. > > include/linux/vt_buffer.h | 4 > 1 file changed, 4 deletions(-) > > diff --git a/include/linux/vt_buffer.h b/include/linux/vt_buffer.h > index 057db7d..f38c10b 100644 > --- a/include/linux/vt_buffer.h > +++ b/include/linux/vt_buffer.h > @@ -21,10 +21,6 @@ > #ifndef VT_BUF_HAVE_RW > #define scr_writew(val, addr) (*(addr) = (val)) > #define scr_readw(addr) (*(addr)) > -#define scr_memcpyw(d, s, c) memcpy(d, s, c) > -#define scr_memmovew(d, s, c) memmove(d, s, c) > -#define VT_BUF_HAVE_MEMCPYW > -#define VT_BUF_HAVE_MEMMOVEW > #endif > > #ifndef VT_BUF_HAVE_MEMSETW > --
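With VT_BUF_HAVE_MEMCPYW/MEMMOVEW no longer defined for the generic case, console copies go through vt_buffer.h's fallback helpers, whose relevant property here is that every access is a 16-bit scr_writew()/scr_readw() the SMI750 can cope with. Roughly this shape (a sketch of the fallback, not a verbatim copy of the header, which varies by kernel version):

/* Copy the screen buffer one u16 cell at a time; 'count' is in bytes,
 * as with memcpy(), so halve it to get the number of cells. */
static inline void scr_memcpyw_sketch(u16 *d, const u16 *s, unsigned int count)
{
        count /= 2;
        while (count--)
                scr_writew(scr_readw(s++), d++);
}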
[PATCH] drm/vblank: Fixup and document timestamp update/read barriers
Hi Daniel, On 04/15/2015 03:17 AM, Daniel Vetter wrote: > This was a bit too much cargo-culted, so lets make it solid: > - vblank->count doesn't need to be an atomic, writes are always done > under the protection of dev->vblank_time_lock. Switch to an unsigned > long instead and update comments. Note that atomic_read is just a > normal read of a volatile variable, so no need to audit all the > read-side access specifically. > > - The barriers for the vblank counter seqlock weren't complete: The > read-side was missing the first barrier between the counter read and > the timestamp read, it only had a barrier between the ts and the > counter read. We need both. > > - Barriers weren't properly documented. Since barriers only work if > you have them on boths sides of the transaction it's prudent to > reference where the other side is. To avoid duplicating the > write-side comment 3 times extract a little store_vblank() helper. > In that helper also assert that we do indeed hold > dev->vblank_time_lock, since in some cases the lock is acquired a > few functions up in the callchain. > > Spotted while reviewing a patch from Chris Wilson to add a fastpath to > the vblank_wait ioctl. > > Cc: Chris Wilson > Cc: Mario Kleiner > Cc: Ville Syrjälä > Cc: Michel Dänzer > Signed-off-by: Daniel Vetter > --- > drivers/gpu/drm/drm_irq.c | 92 > --- > include/drm/drmP.h| 8 +++-- > 2 files changed, 54 insertions(+), 46 deletions(-) > > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c > index c8a34476570a..23bfbc61a494 100644 > --- a/drivers/gpu/drm/drm_irq.c > +++ b/drivers/gpu/drm/drm_irq.c > @@ -74,6 +74,33 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, > int, 0600); > module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, > 0600); > module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600); > > +static void store_vblank(struct drm_device *dev, int crtc, > + unsigned vblank_count_inc, > + struct timeval *t_vblank) > +{ > + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; > + u32 tslot; > + > + assert_spin_locked(&dev->vblank_time_lock); > + > + if (t_vblank) { > + tslot = vblank->count + vblank_count_inc; > + vblanktimestamp(dev, crtc, tslot) = *t_vblank; > + } > + > + /* > + * vblank timestamp updates are protected on the write side with > + * vblank_time_lock, but on the read side done locklessly using a > + * sequence-lock on the vblank counter. Ensure correct ordering using > + * memory barrriers. We need the barrier both before and also after the > + * counter update to synchronize with the next timestamp write. > + * The read-side barriers for this are in drm_vblank_count_and_time. > + */ > + smp_wmb(); > + vblank->count += vblank_count_inc; > + smp_wmb(); The comment and the code are each self-contradictory. If vblank->count writes are always protected by vblank_time_lock (something I did not verify but that the comment above asserts), then the trailing write barrier is not required (and the assertion that it is in the comment is incorrect). A spin unlock operation is always a write barrier. 
Regards, Peter Hurley > +} > + > /** > * drm_update_vblank_count - update the master vblank counter > * @dev: DRM device > @@ -93,7 +120,7 @@ module_param_named(timestamp_monotonic, > drm_timestamp_monotonic, int, 0600); > static void drm_update_vblank_count(struct drm_device *dev, int crtc) > { > struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; > - u32 cur_vblank, diff, tslot; > + u32 cur_vblank, diff; > bool rc; > struct timeval t_vblank; > > @@ -129,18 +156,12 @@ static void drm_update_vblank_count(struct drm_device > *dev, int crtc) > if (diff == 0) > return; > > - /* Reinitialize corresponding vblank timestamp if high-precision query > - * available. Skip this step if query unsupported or failed. Will > - * reinitialize delayed at next vblank interrupt in that case. > + /* > + * Only reinitialize corresponding vblank timestamp if high-precision > query > + * available and didn't fail. Will reinitialize delayed at next vblank > + * interrupt in that case. >*/ > - if (rc) { > - tslot = atomic_read(&vblank->count) + diff; > - vblanktimestamp(dev, crtc, tslot) = t_vblank; > - } > - > - s
[PATCH] drm/vblank: Fixup and document timestamp update/read barriers
On 04/15/2015 05:26 PM, Mario Kleiner wrote: > A couple of questions to educate me and one review comment. > > On 04/15/2015 07:34 PM, Daniel Vetter wrote: >> This was a bit too much cargo-culted, so lets make it solid: >> - vblank->count doesn't need to be an atomic, writes are always done >>under the protection of dev->vblank_time_lock. Switch to an unsigned >>long instead and update comments. Note that atomic_read is just a >>normal read of a volatile variable, so no need to audit all the >>read-side access specifically. >> >> - The barriers for the vblank counter seqlock weren't complete: The >>read-side was missing the first barrier between the counter read and >>the timestamp read, it only had a barrier between the ts and the >>counter read. We need both. >> >> - Barriers weren't properly documented. Since barriers only work if >>you have them on boths sides of the transaction it's prudent to >>reference where the other side is. To avoid duplicating the >>write-side comment 3 times extract a little store_vblank() helper. >>In that helper also assert that we do indeed hold >>dev->vblank_time_lock, since in some cases the lock is acquired a >>few functions up in the callchain. >> >> Spotted while reviewing a patch from Chris Wilson to add a fastpath to >> the vblank_wait ioctl. >> >> v2: Add comment to better explain how store_vblank works, suggested by >> Chris. >> >> v3: Peter noticed that as-is the 2nd smp_wmb is redundant with the >> implicit barrier in the spin_unlock. But that can only be proven by >> auditing all callers and my point in extracting this little helper was >> to localize all the locking into just one place. Hence I think that >> additional optimization is too risky. >> >> Cc: Chris Wilson >> Cc: Mario Kleiner >> Cc: Ville Syrjälä >> Cc: Michel Dänzer >> Cc: Peter Hurley >> Signed-off-by: Daniel Vetter >> --- >> drivers/gpu/drm/drm_irq.c | 95 >> +-- >> include/drm/drmP.h| 8 +++- >> 2 files changed, 57 insertions(+), 46 deletions(-) >> >> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c >> index c8a34476570a..8694b77d0002 100644 >> --- a/drivers/gpu/drm/drm_irq.c >> +++ b/drivers/gpu/drm/drm_irq.c >> @@ -74,6 +74,36 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, >> int, 0600); >> module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, >> 0600); >> module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, >> 0600); >> >> +static void store_vblank(struct drm_device *dev, int crtc, >> + unsigned vblank_count_inc, >> + struct timeval *t_vblank) >> +{ >> +struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; >> +u32 tslot; >> + >> +assert_spin_locked(&dev->vblank_time_lock); >> + >> +if (t_vblank) { >> +/* All writers hold the spinlock, but readers are serialized by >> + * the latching of vblank->count below. >> + */ >> +tslot = vblank->count + vblank_count_inc; >> +vblanktimestamp(dev, crtc, tslot) = *t_vblank; >> +} >> + >> +/* >> + * vblank timestamp updates are protected on the write side with >> + * vblank_time_lock, but on the read side done locklessly using a >> + * sequence-lock on the vblank counter. Ensure correct ordering using >> + * memory barrriers. We need the barrier both before and also after the >> + * counter update to synchronize with the next timestamp write. >> + * The read-side barriers for this are in drm_vblank_count_and_time. 
>> + */ >> +smp_wmb(); >> +vblank->count += vblank_count_inc; >> +smp_wmb(); >> +} >> + >> /** >>* drm_update_vblank_count - update the master vblank counter >>* @dev: DRM device >> @@ -93,7 +123,7 @@ module_param_named(timestamp_monotonic, >> drm_timestamp_monotonic, int, 0600); >> static void drm_update_vblank_count(struct drm_device *dev, int crtc) >> { >> struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; >> -u32 cur_vblank, diff, tslot; >> +u32 cur_vblank, diff; >> bool rc; >> struct timeval t_vblank; >> >> @@ -129,18 +159,12 @@ static void drm_update_vblank_count(struct drm_device >> *dev, int crtc) >&g
[PATCH] drm/vblank: Fixup and document timestamp update/read barriers
On 04/16/2015 02:39 AM, Mario Kleiner wrote: > On 04/16/2015 03:29 AM, Peter Hurley wrote: >> On 04/15/2015 05:26 PM, Mario Kleiner wrote: >>> A couple of questions to educate me and one review comment. >>> >>> On 04/15/2015 07:34 PM, Daniel Vetter wrote: >>>> This was a bit too much cargo-culted, so lets make it solid: >>>> - vblank->count doesn't need to be an atomic, writes are always done >>>> under the protection of dev->vblank_time_lock. Switch to an unsigned >>>> long instead and update comments. Note that atomic_read is just a >>>> normal read of a volatile variable, so no need to audit all the >>>> read-side access specifically. >>>> >>>> - The barriers for the vblank counter seqlock weren't complete: The >>>> read-side was missing the first barrier between the counter read and >>>> the timestamp read, it only had a barrier between the ts and the >>>> counter read. We need both. >>>> >>>> - Barriers weren't properly documented. Since barriers only work if >>>> you have them on boths sides of the transaction it's prudent to >>>> reference where the other side is. To avoid duplicating the >>>> write-side comment 3 times extract a little store_vblank() helper. >>>> In that helper also assert that we do indeed hold >>>> dev->vblank_time_lock, since in some cases the lock is acquired a >>>> few functions up in the callchain. >>>> >>>> Spotted while reviewing a patch from Chris Wilson to add a fastpath to >>>> the vblank_wait ioctl. >>>> >>>> v2: Add comment to better explain how store_vblank works, suggested by >>>> Chris. >>>> >>>> v3: Peter noticed that as-is the 2nd smp_wmb is redundant with the >>>> implicit barrier in the spin_unlock. But that can only be proven by >>>> auditing all callers and my point in extracting this little helper was >>>> to localize all the locking into just one place. Hence I think that >>>> additional optimization is too risky. >>>> >>>> Cc: Chris Wilson >>>> Cc: Mario Kleiner >>>> Cc: Ville Syrjälä >>>> Cc: Michel Dänzer >>>> Cc: Peter Hurley >>>> Signed-off-by: Daniel Vetter >>>> --- >>>>drivers/gpu/drm/drm_irq.c | 95 >>>> +-- >>>>include/drm/drmP.h| 8 +++- >>>>2 files changed, 57 insertions(+), 46 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c >>>> index c8a34476570a..8694b77d0002 100644 >>>> --- a/drivers/gpu/drm/drm_irq.c >>>> +++ b/drivers/gpu/drm/drm_irq.c >>>> @@ -74,6 +74,36 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, >>>> int, 0600); >>>>module_param_named(timestamp_precision_usec, drm_timestamp_precision, >>>> int, 0600); >>>>module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, >>>> 0600); >>>> >>>> +static void store_vblank(struct drm_device *dev, int crtc, >>>> + unsigned vblank_count_inc, >>>> + struct timeval *t_vblank) >>>> +{ >>>> +struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; >>>> +u32 tslot; >>>> + >>>> +assert_spin_locked(&dev->vblank_time_lock); >>>> + >>>> +if (t_vblank) { >>>> +/* All writers hold the spinlock, but readers are serialized by >>>> + * the latching of vblank->count below. >>>> + */ >>>> +tslot = vblank->count + vblank_count_inc; >>>> +vblanktimestamp(dev, crtc, tslot) = *t_vblank; >>>> +} >>>> + >>>> +/* >>>> + * vblank timestamp updates are protected on the write side with >>>> + * vblank_time_lock, but on the read side done locklessly using a >>>> + * sequence-lock on the vblank counter. Ensure correct ordering using >>>> + * memory barrriers. 
We need the barrier both before and also after >>>> the >>>> + * counter update to synchronize with the next timestamp write. >>>> + * The read-side barriers for this are in drm_vblank_count_and_time. >>>> + */ >>>>
[Intel-gfx] [PATCH] drm/vblank: Fixup and document timestamp update/read barriers
On 04/15/2015 01:31 PM, Daniel Vetter wrote: > On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote: >> Hi Daniel, >> >> On 04/15/2015 03:17 AM, Daniel Vetter wrote: >>> This was a bit too much cargo-culted, so lets make it solid: >>> - vblank->count doesn't need to be an atomic, writes are always done >>> under the protection of dev->vblank_time_lock. Switch to an unsigned >>> long instead and update comments. Note that atomic_read is just a >>> normal read of a volatile variable, so no need to audit all the >>> read-side access specifically. >>> >>> - The barriers for the vblank counter seqlock weren't complete: The >>> read-side was missing the first barrier between the counter read and >>> the timestamp read, it only had a barrier between the ts and the >>> counter read. We need both. >>> >>> - Barriers weren't properly documented. Since barriers only work if >>> you have them on boths sides of the transaction it's prudent to >>> reference where the other side is. To avoid duplicating the >>> write-side comment 3 times extract a little store_vblank() helper. >>> In that helper also assert that we do indeed hold >>> dev->vblank_time_lock, since in some cases the lock is acquired a >>> few functions up in the callchain. >>> >>> Spotted while reviewing a patch from Chris Wilson to add a fastpath to >>> the vblank_wait ioctl. >>> >>> Cc: Chris Wilson >>> Cc: Mario Kleiner >>> Cc: Ville Syrjälä >>> Cc: Michel Dänzer >>> Signed-off-by: Daniel Vetter >>> --- >>> drivers/gpu/drm/drm_irq.c | 92 >>> --- >>> include/drm/drmP.h| 8 +++-- >>> 2 files changed, 54 insertions(+), 46 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c >>> index c8a34476570a..23bfbc61a494 100644 >>> --- a/drivers/gpu/drm/drm_irq.c >>> +++ b/drivers/gpu/drm/drm_irq.c >>> @@ -74,6 +74,33 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, >>> int, 0600); >>> module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, >>> 0600); >>> module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, >>> 0600); >>> >>> +static void store_vblank(struct drm_device *dev, int crtc, >>> +unsigned vblank_count_inc, >>> +struct timeval *t_vblank) >>> +{ >>> + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; >>> + u32 tslot; >>> + >>> + assert_spin_locked(&dev->vblank_time_lock); >>> + >>> + if (t_vblank) { >>> + tslot = vblank->count + vblank_count_inc; >>> + vblanktimestamp(dev, crtc, tslot) = *t_vblank; >>> + } >>> + >>> + /* >>> +* vblank timestamp updates are protected on the write side with >>> +* vblank_time_lock, but on the read side done locklessly using a >>> +* sequence-lock on the vblank counter. Ensure correct ordering using >>> +* memory barrriers. We need the barrier both before and also after the >>> +* counter update to synchronize with the next timestamp write. >>> +* The read-side barriers for this are in drm_vblank_count_and_time. >>> +*/ >>> + smp_wmb(); >>> + vblank->count += vblank_count_inc; >>> + smp_wmb(); >> >> The comment and the code are each self-contradictory. >> >> If vblank->count writes are always protected by vblank_time_lock (something I >> did not verify but that the comment above asserts), then the trailing write >> barrier is not required (and the assertion that it is in the comment is >> incorrect). >> >> A spin unlock operation is always a write barrier. > > Hm yeah. Otoh to me that's bordering on "code too clever for my own good". > That the spinlock is held I can assure. 
That no one goes around and does > multiple vblank updates (because somehow that code raced with the hw > itself) I can't easily assure with a simple assert or something similar. > It's not the case right now, but that can changes. The algorithm would be broken if multiple updates for the same vblank count were allowed; that's why it checks to see if the vblank count has not advanced before storing a new timestamp. Otherwise, the read side would not be able to determine that the timestamp is valid by double-checking that the vblank count has not changed. And besides, even if the code looped without dropping the spinlock, the correct write order would still be observed because it would still be executing on the same cpu. My objection to the write memory barrier is not about optimization; it's about correct code. Regards, Peter Hurley
[PATCH] drm/vblank: Fixup and document timestamp update/read barriers
On 04/16/2015 02:39 AM, Mario Kleiner wrote: > On 04/16/2015 03:29 AM, Peter Hurley wrote: >> On 04/15/2015 05:26 PM, Mario Kleiner wrote: >> Because the time scales for these events don't require that level of >> resolution; consider how much code has to get executed between a >> hardware vblank irq triggering and the vblank counter being updated. >> >> Realistically, the only relevant requirement is that the timestamp >> match the counter. >> > > Yes that is the really important part. A msec delay would possibly matter for > some timing sensitive apps like mine - some more exotic displays run at 200 > Hz, and some apps need to synchronize to the vblank not strictly for > graphics. But i assume potential delays here are more on the order of a few > microseconds if some pending loads from the cache would get reordered for > overall efficiency? I'd be surprised if the delay were as much as 1 us. The latency to return to userspace significantly dwarfs any observable effects having missed the vblank count update by 1 instruction. Regards, Peter Hurley
bitfield structures
On 10/16/2014 10:14 AM, Alex Deucher wrote: > Are there any strong objections to these sorts of structures? You may want to blacklist certain compiler version/arch combinations, or get the affected arches to do it. gcc up to 4.7.1 on ia64 and ppc64 generates 64-bit wide RMW cycles on bitfields, regardless of the specified type or actual field width. The enlarged write overwrites adjacent fields. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080 Regards, Peter Hurley
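A minimal illustration of the failure mode (the layout and field names are invented for the example): on the affected compilers, a store to the 8-bit field below can be emitted as a 64-bit load/modify/store covering the whole 8-byte struct, so a concurrent update to the adjacent non-bitfield member can be silently overwritten.

#include <linux/types.h>

struct hw_state {
        u32 mode  : 8;          /* updated by one CPU ... */
        u32 level : 8;
        u32 pad   : 16;
        u32 seqno;              /* ... while another CPU updates this */
};

static void cpu_a(struct hw_state *s)
{
        /* With gcc up to 4.7.1 on ia64/ppc64 this may compile to a 64-bit
         * read-modify-write that spans seqno as well. */
        s->mode = 3;
}

static void cpu_b(struct hw_state *s)
{
        s->seqno++;             /* this increment can be lost */
}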
[bisected][3.9.0-rc3] NULL ptr dereference from nv50_disp_intr()
On vanilla 3.9.0-rc3, I get this 100% repeatable oops after login when the user X session is coming up: BUG: unable to handle kernel NULL pointer dereference at 0001 IP: [<0001>] 0x0 PGD 0 Oops: 0010 [#1] PREEMPT SMP Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables .. CPU 3 Pid: 0, comm: swapper/3 Not tainted 3.9.0-rc3-xeon #rc3 Dell Inc. Precision WorkStation T5400 /0RW203 RIP: 0010:[<0001>] [<0001>] 0x0 RSP: 0018:8802afcc3d80 EFLAGS: 00010087 RAX: 88029f6e5808 RBX: 0001 RCX: RDX: 0096 RSI: 0001 RDI: 88029f6e5808 RBP: 8802afcc3dc8 R08: R09: 0004 R10: 002c R11: 88029e559a98 R12: 8802a376cb78 R13: 88029f6e57e0 R14: 88029f6e57f8 R15: 88029f6e5808 FS: () GS:8802afcc() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 0001 CR3: 00029fa67000 CR4: 07e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process swapper/3 (pid: 0, threadinfo 8802a355e000, task 8802a3535c40) Stack: a0159d8a 0082 88029f6e5820 0001 88029f71aa00 0400 0400 8802afcc3e38 a01843b5 8802afcc3df8 Call Trace: [] ? nouveau_event_trigger+0xaa/0xe0 [nouveau] [] nv50_disp_intr+0xc5/0x200 [nouveau] [] ? _raw_spin_unlock_irqrestore+0x2c/0x50 [] ? notifier_call_chain+0x4d/0x70 [] nouveau_mc_intr+0xb5/0x110 [nouveau] [] nouveau_irq_handler+0x6f/0x80 [nouveau] [] handle_irq_event_percpu+0x75/0x260 [] handle_irq_event+0x48/0x70 [] handle_fasteoi_irq+0x5a/0x100 [] handle_irq+0x22/0x40 [] do_IRQ+0x5a/0xd0 [] common_interrupt+0x6d/0x6d [] ? native_safe_halt+0x6/0x10 [] default_idle+0x3d/0x170 [] cpu_idle+0x116/0x130 [] start_secondary+0x251/0x258 Code: Bad RIP value. RIP [<0001>] 0x0 RSP CR2: 0001 ---[ end trace 907323cb8ce6f301 ]--- git bisect from 3.8.0 (good) to 3.9.0-rc3 (bad) blames (bisect log attached): 1d7c71a3e2f77336df536855b0efd2dc5bdeb41b is the first bad commit commit 1d7c71a3e2f77336df536855b0efd2dc5bdeb41b Author: Ben Skeggs Date: Thu Jan 31 09:23:34 2013 +1000 drm/nouveau/disp: port vblank handling to event interface This removes the nastiness with the interactions between display and software engines when handling vblank semaphore release interrupts. Now, all the semantics are handled in one place (sw) \o/. Signed-off-by: Ben Skeggs :04 04 fbd44f8566271415fd2775ab4b6346efef7e82fe a0730be0f35feaa1476b1447b1d65c4b3b3c0686 M drivers On this hardware: nouveau [ DEVICE][:02:00.0] BOOT0 : 0x084e00a2 nouveau [ DEVICE][:02:00.0] Chipset: G84 (NV84) nouveau [ DEVICE][:02:00.0] Family : NV50 nouveau [ VBIOS][:02:00.0] checking PRAMIN for image... nouveau [ VBIOS][:02:00.0] ... 
appears to be valid nouveau [ VBIOS][:02:00.0] using image from PRAMIN nouveau [ VBIOS][:02:00.0] BIT signature found nouveau [ VBIOS][:02:00.0] version 60.84.63.00.11 nouveau [ PFB][:02:00.0] RAM type: DDR2 nouveau [ PFB][:02:00.0] RAM size: 256 MiB nouveau [ PFB][:02:00.0]ZCOMP: 1892 tags nouveau [ DRM] VRAM: 256 MiB nouveau [ DRM] GART: 512 MiB nouveau [ DRM] BIT BIOS found nouveau [ DRM] Bios version 60.84.63.00 nouveau [ DRM] TMDS table version 2.0 nouveau [ DRM] DCB version 4.0 nouveau [ DRM] DCB outp 00: 02000300 0028 nouveau [ DRM] DCB outp 01: 01000302 0030 nouveau [ DRM] DCB outp 02: 04011310 0028 nouveau [ DRM] DCB outp 03: 02011312 0030 nouveau [ DRM] DCB conn 00: 1030 nouveau [ DRM] DCB conn 01: 2130 nouveau [ DRM] 2 available performance level(s) nouveau [ DRM] 0: core 208MHz shader 416MHz memory 100MHz voltage 1200mV fanspeed 100% nouveau [ DRM] 1: core 460MHz shader 920MHz memory 400MHz voltage 1200mV fanspeed 100% nouveau [ DRM] c: core 459MHz shader 918MHz memory 399MHz voltage 1200mV nouveau [ DRM] MM: using CRYPT for buffer copies nouveau [ DRM] allocated 1680x1050 fb: 0x6, bo 88029ef50400 fbcon: nouveaufb (fb0) is primary device nouveau :02:00.0: fb0: nouveaufb frame buffer device nouveau :02:00.0: registered panic notifier [drm] Initialized nouveau 1.1.0 20120801 for :02:00.0 on minor 0 02:00.0 VGA compatible controller: NVIDIA Corporation G84 [Quadro FX 570] (rev a1) (prog-if 00 [VGA controller]) Subsystem: NVIDIA Corporation Device 0474 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Capabilitie
[Nouveau] nouveau lockdep splat
[ adding Ben Skeggs and Dave Airlie ] On Tue, 2013-03-19 at 21:24 +0100, Borislav Petkov wrote: > On Tue, Mar 05, 2013 at 05:30:52PM +0100, Lucas Stach wrote: > > Dropping Tegra ML, it's not the place where Nouveau mails should go. > > Adding Nouveau ML and Maarten, who probably knows Lockdep+Nouveau best. > > Ok, > > with the hope of having the right people on CC now (finally, thanks > Lucas :-)), here's the same splat on -rc3. Someone better take a look > soonish, please: Also happens in next (on nv50 hardware). > [0.541078] [drm] No driver support for vblank timestamp query. > [0.541272] nouveau [ DRM] 3 available performance level(s) > [0.541276] nouveau [ DRM] 0: core 135MHz shader 270MHz memory 135MHz > voltage 900mV > [0.541280] nouveau [ DRM] 1: core 405MHz shader 810MHz memory 405MHz > voltage 900mV > [0.541284] nouveau [ DRM] 3: core 520MHz shader 1230MHz memory > 790MHz voltage 900mV > [0.541287] nouveau [ DRM] c: core 405MHz shader 810MHz memory 405MHz > voltage 900mV > [0.559846] nouveau [ DRM] MM: using COPY for buffer copies > [0.625371] nouveau [ DRM] allocated 1920x1080 fb: 0x7, bo > 88043b54f000 > [0.625441] fbcon: nouveaufb (fb0) is primary device > [0.62] > [0.625556] = > [0.625556] [ INFO: possible recursive locking detected ] > [0.625557] 3.9.0-rc3+ #25 Not tainted > [0.625557] - > [0.625558] swapper/0/1 is trying to acquire lock: > [0.625562] (&dmac->lock){+.+...}, at: [] > evo_wait+0x43/0xf0 > [0.625562] > [0.625562] but task is already holding lock: > [0.625564] (&dmac->lock){+.+...}, at: [] > evo_wait+0x43/0xf0 > [0.625565] > [0.625565] other info that might help us debug this: > [0.625565] Possible unsafe locking scenario: > [0.625565] > [0.625565]CPU0 > [0.625565] > [0.625566] lock(&dmac->lock); > [0.625567] lock(&dmac->lock); > [0.625567] > [0.625567] *** DEADLOCK *** > [0.625567] > [0.625567] May be due to missing lock nesting notation > [0.625567] > [0.625568] 10 locks held by swapper/0/1: > [0.625570] #0: (&__lockdep_no_validate__){..}, at: > [] __driver_attach+0x5b/0xb0 > [0.625572] #1: (&__lockdep_no_validate__){..}, at: > [] __driver_attach+0x69/0xb0 > [0.625575] #2: (drm_global_mutex){+.+.+.}, at: [] > drm_get_pci_dev+0xc6/0x2d0 > [0.625578] #3: (registration_lock){+.+.+.}, at: [] > register_framebuffer+0x25/0x310 > [0.625581] #4: (&fb_info->lock){+.+.+.}, at: [] > lock_fb_info+0x26/0x60 > [0.625583] #5: (console_lock){+.+.+.}, at: [] > register_framebuffer+0x1ba/0x310 > [0.625585] #6: ((fb_notifier_list).rwsem){.+.+.+}, at: > [] __blocking_notifier_call_chain+0x42/0x80 > [0.625587] #7: (&dev->mode_config.mutex){+.+.+.}, at: > [] drm_modeset_lock_all+0x2a/0x70 > [0.625589] #8: (&crtc->mutex){+.+.+.}, at: [] > drm_modeset_lock_all+0x54/0x70 > [0.625591] #9: (&dmac->lock){+.+...}, at: [] > evo_wait+0x43/0xf0 > [0.625591] > [0.625591] stack backtrace: > [0.625592] Pid: 1, comm: swapper/0 Not tainted 3.9.0-rc3+ #25 > [0.625593] Call Trace: > [0.625595] [] __lock_acquire+0x76b/0x1c20 > [0.625597] [] ? dcb_table+0x1ac/0x2a0 > [0.625599] [] lock_acquire+0x8a/0x120 > [0.625600] [] ? evo_wait+0x43/0xf0 > [0.625602] [] ? mutex_lock_nested+0x292/0x330 > [0.625603] [] mutex_lock_nested+0x6e/0x330 > [0.625605] [] ? evo_wait+0x43/0xf0 > [0.625606] [] ? mark_held_locks+0x9b/0x100 > [0.625607] [] evo_wait+0x43/0xf0 > [0.625609] [] nv50_display_flip_next+0x713/0x7a0 > [0.625611] [] ? mutex_unlock+0xe/0x10 > [0.625612] [] ? 
evo_kick+0x37/0x40 > [0.625613] [] nv50_crtc_commit+0x10e/0x230 > [0.625615] [] drm_crtc_helper_set_mode+0x365/0x510 > [0.625617] [] drm_crtc_helper_set_config+0xa4e/0xb70 > [0.625618] [] drm_mode_set_config_internal+0x31/0x70 > [0.625619] [] drm_fb_helper_set_par+0x71/0xf0 > [0.625621] [] fbcon_init+0x514/0x5a0 > [0.625623] [] visual_init+0xbc/0x120 > [0.625624] [] do_bind_con_driver+0x163/0x320 > [0.625625] [] do_take_over_console+0x61/0x70 > [0.625627] [] do_fbcon_takeover+0x63/0xc0 > [0.625628] [] fbcon_event_notify+0x5fd/0x700 > [0.625629] [] notifier_call_chain+0x4d/0x70 > [0.625630] [] __blocking_notifier_call_chain+0x58/0x80 > [0.625631] [] blocking_notifier_call_chain+0x16/0x20 > [0.625633] [] fb_notifier_call_chain+0x1b/0x20 > [0.625634] [] register_framebuffer+0x1c8/0x310 > [0.625635] [] drm_fb_helper_initial_config+0x371/0x520 > [0.625637] [] ? > drm_fb_helper_single_add_all_connectors+0x47/0xf0 > [0.625639] [] ? kmem_cache_alloc_trace+0xee/0x150 > [0.625641] [] nouveau_fbcon_init+0x10e/0x160 > [