date:20130528

[git pull] drm fixes

2013-05-28 Thread Dave Airlie


Hi Linus,

This is mostly exynos and intel fixes, along with some vblank patches from
Rob that I lost a few months ago and that make wayland work better on lots
of GPUs, plus a qxl Kconfig fix.

Dave.

The following changes since commit b91fd4d5aad0c1124654341814067ca3f59490fc:

  Merge tag 'pci-v3.10-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci (2013-05-23 13:50:53 -0700)

are available in the git repository at:


  git://people.freedesktop.org/~airlied/linux.git drm-fixes

for you to fetch changes up to c89b65e7fffef745bdd36c372aa0dea778fecbab:

  qxl: fix Kconfig deps - select FB_DEFERRED_IO (2013-05-28 17:03:37 +1000)


Andrew Jones (1):
  qxl: fix Kconfig deps - select FB_DEFERRED_IO

Chris Wilson (1):
  drm/i915: Propagate errors back from fb set-base

Dave Airlie (3):
  Merge remote-tracking branch 'pfdo/drm-fixes' into drm-next
  Merge branch 'exynos-drm-fixes' of git://git.kernel.org/.../daeinki/drm-exynos into drm-next
  Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-next

Imre Deak (5):
  drm/i915: force full modeset if the connector is in DPMS OFF mode
  drm/i915: add msecs_to_jiffies_timeout to guarantee minimum duration
  drm/i915: use msecs_to_jiffies_timeout instead of open coding the same
  drm/i915: avoid premature timeouts in __wait_seqno()
  drm/i915: avoid premature DP AUX timeouts

Inki Dae (1):
  drm/exynos: wait for the completion of pending page flip

Lars-Peter Clausen (1):
  drm/exynos: exynos_hdmi: Pass correct pointer to free_irq()

Rob Clark (6):
  drm/nouveau: use drm_send_vblank_event() helper
  drm/radeon: use drm_send_vblank_event() helper
  drm/shmob: use drm_send_vblank_event() helper
  drm/imx: use drm_send_vblank_event() helper
  drm/exynos: page flip fixes
  drm/exynos: use drm_send_vblank_event() helper

Rodrigo Vivi (1):
  drm/i915: Adding more reserved PCI IDs for Haswell.

Sachin Kamat (2):
  drm/exynos: exynos_drm_fbdev: Fix incorrect usage of IS_ERR_OR_NULL
  drm/exynos: exynos_drm_ipp: Fix incorrect usage of IS_ERR_OR_NULL

Seung-Woo Kim (4):
  drm/exynos: cleanup device pointer usages
  drm/exynos: fix build warnings from ipp fimc
  drm/exynos: remove unnecessary devm_kfree
  drm/exynos: replace request_threaded_irq with devm function

 drivers/gpu/drm/exynos/exynos_drm_crtc.c    | 27 ++--
 drivers/gpu/drm/exynos/exynos_drm_fbdev.c   |  2 +-
 drivers/gpu/drm/exynos/exynos_drm_fimc.c    | 12 +++
 drivers/gpu/drm/exynos/exynos_drm_fimd.c    | 10 +++---
 drivers/gpu/drm/exynos/exynos_drm_g2d.c     |  6 ++--
 drivers/gpu/drm/exynos/exynos_drm_gsc.c     | 12 ++-
 drivers/gpu/drm/exynos/exynos_drm_hdmi.c    |  2 +-
 drivers/gpu/drm/exynos/exynos_drm_ipp.c     | 18 +--
 drivers/gpu/drm/exynos/exynos_drm_rotator.c | 13 ++--
 drivers/gpu/drm/exynos/exynos_drm_vidi.c    |  4 +--
 drivers/gpu/drm/exynos/exynos_hdmi.c        | 21 +
 drivers/gpu/drm/exynos/exynos_mixer.c       | 14 -
 drivers/gpu/drm/i915/i915_drv.c             | 46 ---
 drivers/gpu/drm/i915/i915_drv.h             | 15 +
 drivers/gpu/drm/i915/i915_gem.c             |  2 +-
 drivers/gpu/drm/i915/intel_display.c        | 49 +++--
 drivers/gpu/drm/i915/intel_dp.c             |  2 +-
 drivers/gpu/drm/i915/intel_i2c.c            |  5 +--
 drivers/gpu/drm/nouveau/nouveau_display.c   | 13 ++--
 drivers/gpu/drm/qxl/Kconfig                 |  1 +
 drivers/gpu/drm/radeon/radeon_display.c     | 13 ++--
 drivers/gpu/drm/shmobile/shmob_drm_crtc.c   | 19 +++
 drivers/staging/imx-drm/ipuv3-crtc.c        | 21 ++---
 23 files changed, 163 insertions(+), 164 deletions(-)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Maarten Lankhorst

Hey,

On 28-05-13 04:49, Inki Dae wrote:
>
>> -Original Message-
>> From: Maarten Lankhorst [mailto:maarten.lankho...@canonical.com]
>> Sent: Tuesday, May 28, 2013 12:23 AM
>> To: Inki Dae
>> Cc: 'Daniel Vetter'; 'Rob Clark'; 'linux-fbdev'; 'YoungJun Cho'; 'Kyungmin
>> Park'; 'myungjoo.ham'; 'DRI mailing list';
>> linux-arm-ker...@lists.infradead.org; linux-me...@vger.kernel.org
>> Subject: Re: Introduce a new helper framework for buffer synchronization
>>
>> Hey,
>>
>> On 27-05-13 12:38, Inki Dae wrote:
>>> Hi all,
>>>
>>> I have removed the previous branch and added a new one with more cleanup.
>>> This time, the fence helper no longer includes the user-side interfaces
>>> or the cache operation code. We are not yet sure that coupling those two
>>> things in the kernel (cache synchronization, and fence-based
>>> synchronization of buffer access between CPU and CPU, CPU and DMA, and
>>> DMA and DMA) is a good idea, and the existing user-side code also has
>>> problems with badly behaved or crashing userspace. So this can be
>>> discussed further later.
>>>
>>> Below is the new branch:
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/?h=dma-fence-helper
>>>
>>> The fence helper code:
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
>>>
>>> And example code for a device driver:
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
>>>
>>> I think the time is not yet ripe for an RFC posting: the existing dma
>>> fence and reservation work may need more review and additional work. So
>>> I'd be glad if somebody could give opinions and advice before the RFC
>>> posting.
>>>
>> NAK.
>>
>> For examples of how to handle locking properly, see
>> Documentation/ww-mutex-design.txt in my recent tree.
>> I could list what I believe is wrong with your implementation, but the real
>> problem is that the approach you're taking is wrong.
> I just removed the ticket stubs to show you my approach as simply as
> possible, and I wanted to show that we could use a buffer synchronization
> mechanism without ticket stubs.
The tickets have been removed in favor of a ww_context. Moving it in as a base
primitive allows more locking abuse to be detected, and makes some other things
easier too.

> A question: can ww mutexes be used for all devices? I guess this depends
> on x86 GPUs: such a GPU has VRAM, which means a different memory domain.
> And could you tell me why a shared fence should have only eight objects? I
> think we could need more than eight objects for read access. Anyway, I
> don't fully understand this yet, so I may be missing something.
Yes, ww mutexes are not limited in any way to x86. They're a locking mechanism.
Once you have acquired the ww mutexes for all buffer objects, all that says is
that at that point in time you hold the locks of all bo's exclusively.

After locking everything you can read the fence pointers safely, queue waits,
and set a new fence pointer on all reservation_objects. You only need a single
fence on all those objects, so 8 is plenty. Nonetheless this was a limitation
of my earlier design, and I'll dynamically allocate fence_shared in the future.
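
The sequence described above (lock every buffer object, read the old
fences, install one new fence on all of them, unlock) can be sketched in
userspace C. This is only an illustration under stated assumptions, not
the ww_mutex API: it uses pthread mutexes and avoids deadlock by locking
in address order, which is a simpler technique swapped in for the
wound/wait logic. The names struct bo, struct fence and
bo_lock_all_and_fence() are hypothetical, and the caller is assumed to
pass each buffer only once.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's reservation_object or fence. */
struct fence {
	int signaled;
};

struct bo {
	pthread_mutex_t lock;
	struct fence *fence;	/* last fence attached to this buffer */
};

/* Order buffer pointers by address so every caller locks the same way. */
static int bo_cmp(const void *a, const void *b)
{
	uintptr_t x = (uintptr_t)*(struct bo *const *)a;
	uintptr_t y = (uintptr_t)*(struct bo *const *)b;

	return (x > y) - (x < y);
}

/*
 * Lock every buffer in address order (used here instead of wound/wait),
 * count the unsignaled old fences a real driver would have to wait for,
 * attach one new fence to all buffers, then unlock.
 * Assumes bos[] contains no duplicates. Returns the pending count.
 */
static size_t bo_lock_all_and_fence(struct bo **bos, size_t n,
				    struct fence *new_fence)
{
	size_t i, pending = 0;

	qsort(bos, n, sizeof(*bos), bo_cmp);

	for (i = 0; i < n; i++)
		pthread_mutex_lock(&bos[i]->lock);

	/* All locks held: the fence pointers are stable and safe to read. */
	for (i = 0; i < n; i++) {
		if (bos[i]->fence && !bos[i]->fence->signaled)
			pending++;
		bos[i]->fence = new_fence;
	}

	for (i = n; i-- > 0; )
		pthread_mutex_unlock(&bos[i]->lock);

	return pending;
}
```

A single new fence is shared by all buffers in the submission, which is
the point made above about not needing one fence per object.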

~Maarten


[PATCH 1/2] drm/exynos: fix WINDOWS_NR checking to vidi driver

2013-05-28 Thread Inki Dae

This patch fixes the check that the index into the win_data array
is within the valid range.

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 drivers/gpu/drm/exynos/exynos_drm_vidi.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_vidi.c b/drivers/gpu/drm/exynos/exynos_drm_vidi.c
index 24376c1..11a016d 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_vidi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_vidi.c
@@ -282,7 +282,7 @@ static void vidi_win_mode_set(struct device *dev,
 	if (win == DEFAULT_ZPOS)
 		win = ctx->default_win;
 
-	if (win < 0 || win > WINDOWS_NR)
+	if (win < 0 || win >= WINDOWS_NR)
 		return;
 
 	offset = overlay->fb_x * (overlay->bpp >> 3);
@@ -332,7 +332,7 @@ static void vidi_win_commit(struct device *dev, int zpos)
 	if (win == DEFAULT_ZPOS)
 		win = ctx->default_win;
 
-	if (win < 0 || win > WINDOWS_NR)
+	if (win < 0 || win >= WINDOWS_NR)
 		return;
 
 	win_data = &ctx->win_data[win];
@@ -356,7 +356,7 @@ static void vidi_win_disable(struct device *dev, int zpos)
 	if (win == DEFAULT_ZPOS)
 		win = ctx->default_win;
 
-	if (win < 0 || win > WINDOWS_NR)
+	if (win < 0 || win >= WINDOWS_NR)
 		return;
 
 	win_data = &ctx->win_data[win];
-- 
1.7.5.4
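
As a standalone illustration of the off-by-one that the patch above
addresses: an array with WINDOWS_NR entries has valid indices 0 through
WINDOWS_NR - 1, so an index equal to WINDOWS_NR must be rejected. The
value of WINDOWS_NR and the helper name win_index_valid() below are
assumptions made for this sketch only.

```c
#include <stdbool.h>

#define WINDOWS_NR 3	/* assumed value, for illustration only */

/* Valid indices into an array of WINDOWS_NR entries are 0..WINDOWS_NR-1. */
static bool win_index_valid(int win)
{
	return win >= 0 && win < WINDOWS_NR;
}
```

With the old "win > WINDOWS_NR" test, win == WINDOWS_NR would pass the
check and index one element past the end of the array.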


drm/exynos: make overlay data to be updated to valid hw

2013-05-28 Thread Inki Dae

This patch makes sure that overlay data is updated only on
enabled, real hardware when a framebuffer is released.
To do this, it checks whether the crtc and encoder are
valid, and then waits for signal synchronization only
on the valid encoder.

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 drivers/gpu/drm/exynos/exynos_drm_encoder.c |    9 ++++++---
 drivers/gpu/drm/exynos/exynos_drm_encoder.h |    2 +-
 drivers/gpu/drm/exynos/exynos_drm_fb.c      |   13 +++++++++++--
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_encoder.c b/drivers/gpu/drm/exynos/exynos_drm_encoder.c
index c63721f..9a6e3fd 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_encoder.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_encoder.c
@@ -220,18 +220,21 @@ static void exynos_drm_encoder_commit(struct drm_encoder *encoder)
 	exynos_encoder->dpms = DRM_MODE_DPMS_ON;
 }
 
-void exynos_drm_encoder_complete_scanout(struct drm_framebuffer *fb)
+void exynos_drm_encoder_complete_scanout(struct drm_crtc *crtc)
 {
 	struct exynos_drm_encoder *exynos_encoder;
 	struct exynos_drm_manager_ops *ops;
-	struct drm_device *dev = fb->dev;
+	struct drm_device *dev = crtc->dev;
 	struct drm_encoder *encoder;
 
 	/*
 	 * make sure that overlay data are updated to real hardware
-	 * for all encoders.
+	 * for valid encoders.
 	 */
 	list_for_each_entry(encoder, &dev->mode_config.encoder_list, head) {
+		if (encoder->crtc != crtc)
+			continue;
+
 		exynos_encoder = to_exynos_encoder(encoder);
 		ops = exynos_encoder->manager->ops;
 
diff --git a/drivers/gpu/drm/exynos/exynos_drm_encoder.h b/drivers/gpu/drm/exynos/exynos_drm_encoder.h
index 89e2fb0..e8dee1c 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_encoder.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_encoder.h
@@ -32,6 +32,6 @@ void exynos_drm_encoder_plane_mode_set(struct drm_encoder *encoder, void *data);
 void exynos_drm_encoder_plane_commit(struct drm_encoder *encoder, void *data);
 void exynos_drm_encoder_plane_enable(struct drm_encoder *encoder, void *data);
 void exynos_drm_encoder_plane_disable(struct drm_encoder *encoder, void *data);
-void exynos_drm_encoder_complete_scanout(struct drm_framebuffer *fb);
+void exynos_drm_encoder_complete_scanout(struct drm_crtc *crtc);
 
 #endif
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fb.c b/drivers/gpu/drm/exynos/exynos_drm_fb.c
index 0e04f4e..1fc7ae6 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fb.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fb.c
@@ -68,12 +68,21 @@ static int check_fb_gem_memory_type(struct drm_device *drm_dev,
 static void exynos_drm_fb_destroy(struct drm_framebuffer *fb)
 {
 	struct exynos_drm_fb *exynos_fb = to_exynos_fb(fb);
+	struct drm_device *dev = fb->dev;
+	struct drm_crtc *crtc;
 	unsigned int i;
 
 	DRM_DEBUG_KMS("%s\n", __FILE__);
 
-	/* make sure that overlay data are updated before relesing fb. */
-	exynos_drm_encoder_complete_scanout(fb);
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+		if (crtc->fb == fb) {
+			/*
+			 * make sure that overlay data are updated before
+			 * relesing fb.
+			 */
+			exynos_drm_encoder_complete_scanout(crtc);
+		}
+	}
 
 	drm_framebuffer_cleanup(fb);
 
-- 
1.7.5.4


[Bug 65068] New: AtomBIOS stuck after suspend/resume cycle whilst GPU turned off

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=65068

          Priority: medium
            Bug ID: 65068
          Assignee: dri-devel@lists.freedesktop.org
           Summary: AtomBIOS stuck after suspend/resume cycle whilst GPU
                    turned off
          Severity: normal
    Classification: Unclassified
                OS: All
          Reporter: austin.l...@gmail.com
          Hardware: Other
            Status: NEW
           Version: XOrg CVS
         Component: DRM/Radeon
           Product: DRI

Created attachment 79884
  --> https://bugs.freedesktop.org/attachment.cgi?id=79884&action=edit
dmesg output when trying to switch back to radeon gpu.

I have two GPUs in my system:

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Whistler [Radeon HD 6600M/6700M/7600M Series]

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core
Processor Family Integrated Graphics Controller (rev 09)

This is a macbookpro8,2 and hence the gmuxer is controlled by the apple-gmux
driver.

If I suspend the system to ram whilst on the integrated gpu (i.e. the intel
gpu), then after resume switch back to the radeon, I get a GPU hang.

I've attached the dmesg output that I get when I try this.

I'm using linux 3.10-rc3.  I don't have X running when doing this
(vgaswitcheroo won't allow this).

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 63935] TURKS [drm:r600_uvd_init] ERROR UVD not responding, trying to reset the VCPU!!!

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=63935

--- Comment #51 from Christian König  ---
(In reply to comment #50)
> Also patched SUMO2 patch from
> http://lists.freedesktop.org/archives/dri-devel/2013-May/038894.html
> 
> '''Still no success!'''
> 
> Still got [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset
> the VCPU!!!

Well, is this a Mac you are trying to get working? This bug report is about
Macs booting in EFI mode. If you have issues with another system, please open
another bug report, even if you have the same symptoms.

Christian.


[Bug 64850] Second screen black on Pitcairn PRO

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=64850

--- Comment #18 from Jakob Nixdorf  ---
New update: It seems to have nothing to do with the connectors that are used.
I just got my mini-DisplayPort to DVI adapter and tested it. No combination
works (HDMI+mDP, DVI+mDP, HDMI+DVI); the second screen is always black.


Re: [PATCH] drm/tegra: add support for runtime pm

2013-05-28 Thread Thierry Reding

On Tue, May 28, 2013 at 08:45:03AM +0300, Terje Bergström wrote:
> On 27.05.2013 18:45, Thierry Reding wrote:
> > On Mon, May 27, 2013 at 07:19:28PM +0530, Mayuresh Kulkarni wrote:
> >> +#ifdef CONFIG_PM_RUNTIME
> >> +static int host1x_runtime_suspend(struct device *dev)
> >> +{
> >> +  struct host1x *host;
> >> +
> >> +  host = dev_get_drvdata(dev);
> >> +  if (IS_ERR_OR_NULL(host))
> > 
> > I think a simple
> > 
> > 	if (!host)
> > 		return -EINVAL;
> > 
> > would be enough here. The driver-data of the device should never be an
> > ERR_PTR()-encoded value, but either a valid pointer to a host1x object
> > or NULL.
> 
> True, we should avoid IS_ERR_OR_NULL() like the plague. We always know if
> the called API returns NULL on error or an error code. In the case of an
> error code we should just propagate that.

Yes, that's the case in general. In this specific case the value
obtained by dev_get_drvdata() should either be a valid pointer or NULL,
never an error code. We can easily make sure by only setting the data
(using platform_set_drvdata()) when the pointer is valid.

Thinking about it some more, I don't think we can ever get NULL here. A
device's .runtime_suspend() cannot be called when the device has been
removed, right? That's the only case where the value returned might be
NULL. It would be NULL too if host1x wasn't initialized yet, but that's
already dealt with by the proper ordering in .probe().
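
To make the NULL versus error-pointer distinction discussed above
concrete, here is a small userspace re-creation of the kernel's ERR_PTR
convention. It is a sketch for illustration, not the kernel headers; the
two helper functions get_drvdata_or_null() and create_or_err() are
hypothetical and only model the two kinds of API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace re-creation of the kernel's ERR_PTR convention (sketch). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup that reports "nothing stored" with NULL only. */
static void *get_drvdata_or_null(void *stored)
{
	return stored;
}

/* Hypothetical constructor that reports failure with an error pointer. */
static void *create_or_err(bool fail)
{
	static int object;

	return fail ? ERR_PTR(-ENOMEM) : (void *)&object;
}
```

A caller of the first kind of function only needs "if (!ptr)"; a caller
of the second kind only needs IS_ERR() and then propagates PTR_ERR().
Neither needs a combined check.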

> > Same comments apply here. Also I think it might be a good idea to split
> > the host1x and gr2d changes into separate patches.
> 
> That's a bit tricky, but doable. We just need to enable it for 2D first,
> and then host1x to keep bisectability.

Right, there's a dependency. But I'd still prefer to have them separate.
Unless it gets really messy.

> >>  static void action_submit_complete(struct host1x_waitlist *waiter)
> >>  {
> >> +  int completed = waiter->count;
> >>struct host1x_channel *channel = waiter->data;
> >>  
> >> +  /* disable clocks for all the submits that got completed in this lot */
> >> +  while (completed--)
> >> +  pm_runtime_put(channel->dev);
> >> +
> >>host1x_cdma_update(&channel->cdma);
> >>  
> >> -  /*  Add nr_completed to trace */
> >> +  /* Add nr_completed to trace */
> >>trace_host1x_channel_submit_complete(dev_name(channel->dev),
> >> waiter->count, waiter->thresh);
> >> -
> >>  }
> > 
> > This feels hackish. But I can't see any better place to do this. Terje,
> > Arto: any ideas how we can do this in a cleaner way? If there's nothing
> > better then maybe moving the code into a separate function, say
> > host1x_waitlist_complete(), might make this less awkward?
> 
> Yeah, it's a bit awkward. action_submit_complete() actually does handle
> completion of multiple jobs, and we do one pm_runtime_get() per job.
> 
> We could do pm_runtime_put() in host1x_cdma_update(). It anyway goes
> through each job that is completed, so while freeing the job it could as
> well call runtime PM. That way we could even remove the waiter->count
> variable altogether as it's not needed anymore.

That sounds a lot better. We could add a helper (host1x_job_finish()
perhaps) with the following from update_cdma_locked():

	/* Unpin the memory */
	host1x_job_unpin(job);

	/* Pop push buffer slots */
	if (job->num_slots) {
		struct push_buffer *pb = &cdma->push_buffer;
		host1x_pushbuffer_pop(pb, job->num_slots);
		if (cdma->event == CDMA_EVENT_PUSH_BUFFER_SPACE)
			signal = true;
	}

	list_del(&job->list);

And add pm_runtime_put() (as well as potentially other stuff) in there.
That'll prevent update_cdma_locked() from growing too much. It isn't
too bad right now, so maybe a helper isn't warranted yet, but I don't
think it'll hurt.
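
The balance being discussed (one reference taken per submitted job, one
dropped when that job is retired) can be modelled in plain C. This is a
userspace sketch only: the counter stands in for the runtime PM usage
count, and struct device_sim, struct job_sim, job_submit(), job_finish()
and cdma_update() are hypothetical names, not host1x code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a device usage counter and a job queue. */
struct device_sim {
	int usage_count;	/* models the runtime PM usage counter */
};

struct job_sim {
	int done;		/* set once the hardware finished the job */
	int freed;		/* set once the job has been retired */
};

/* One reference is taken per submitted job ... */
static void job_submit(struct device_sim *dev, struct job_sim *job)
{
	dev->usage_count++;	/* stands in for pm_runtime_get() */
	job->done = 0;
	job->freed = 0;
}

/* ... and dropped exactly once when that job is retired. */
static void job_finish(struct device_sim *dev, struct job_sim *job)
{
	job->freed = 1;
	dev->usage_count--;	/* stands in for pm_runtime_put() */
}

/* Walk the queue and retire every completed job; returns how many. */
static size_t cdma_update(struct device_sim *dev, struct job_sim *jobs,
			  size_t n)
{
	size_t i, retired = 0;

	for (i = 0; i < n; i++) {
		if (jobs[i].done && !jobs[i].freed) {
			job_finish(dev, &jobs[i]);
			retired++;
		}
	}
	return retired;
}
```

Because the put happens while each job is retired, no separate count of
completed jobs has to be carried to the completion handler.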

> The not-so-beautiful aspect is that we do pm_runtime_get() in
> host1x_channel.c and pm_runtime_put() in host1x_cdma.c. For code
> readability it'd be great to have them in the same file. I actually get
> questions every now and then downstream because these operations are
> done in different files.

With the above helper in place, we could move host1x_job_submit() to
job.c instead and have all the code in one file.

Thierry



Re: [PULL] drm-intel-fixes

2013-05-28 Thread Daniel Vetter

On Thu, May 23, 2013 at 02:03:09PM +0200, Daniel Vetter wrote:
> Hi Dave,
> 
> A few fixes, nothing shocking:
> - More Haswell pci ids. Includes a pile of marketing spare ids (which
>   despite the spare moniker show up all over the place).
> - Fix a regression in handling modeset failures, resulting in black
>   screens on 3 pipe setups when we've run out of pch plls (Chris).
> - Fix up the setcrtc semantics to unconditionally enable the outputs.
>   Judging from git digging that has (kinda) always been the case and neatly
>   fixes a few long-standing (i.e. forever) bug reports (Imre).
> - jiffies_timeout + 1 patches from Imre. They partially fix spurious
>   wait_event failures in the interrupt-driven dp aux/i2c code. The other
>   part is a core patch for the wait_event macros going in through -mm. A
>   few patches more than strictly required since Imre is pushing for a
>   general solution in 3.11.
> 
> Cheers, Daniel

Updated pull request (same sha1 but with a tag) so that I can pile new
patches on top (there's one I want to test for a few days first ...).
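
For readers wondering why the "jiffies_timeout + 1" patches in the quoted
summary add one tick: the current jiffy is already partly elapsed when a
wait starts, so waiting j jiffies can last almost one tick less than
requested. The sketch below models this in userspace. HZ_SIM and both
function names are assumptions for illustration, not the i915 code.

```c
#include <limits.h>

#define HZ_SIM 100UL	/* assumed tick rate: one jiffy = 10 ms */

/* Round a millisecond count up to whole jiffies (overflow ignored). */
static unsigned long msecs_to_jiffies_sim(unsigned int ms)
{
	return (ms * HZ_SIM + 999UL) / 1000UL;
}

/*
 * Add one jiffy: the current jiffy is already partly elapsed, so a wait
 * of j jiffies may last almost one full tick less than requested.
 */
static unsigned long msecs_to_jiffies_timeout_sim(unsigned int ms)
{
	unsigned long j = msecs_to_jiffies_sim(ms);

	return j == ULONG_MAX ? j : j + 1;
}
```

With a 10 ms tick, a 10 ms request becomes 2 jiffies, which guarantees
at least the requested duration at the cost of up to one extra tick.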

Cheers, Daniel


The following changes since commit c7788792a5e7b0d5d7f96d0766b4cb6112d47d75:

  Linux 3.10-rc2 (2013-05-20 14:37:38 -0700)

are available in the git repository at:

  git://people.freedesktop.org/~danvet/drm-intel tags/drm-intel-fixes-2013-05-28

for you to fetch changes up to 3598706b52cb45ba0a9e8aa99ce5ac59140f2b8b:

  drm/i915: avoid premature DP AUX timeouts (2013-05-22 13:51:26 +0200)


Chris Wilson (1):
  drm/i915: Propagate errors back from fb set-base

Imre Deak (5):
  drm/i915: force full modeset if the connector is in DPMS OFF mode
  drm/i915: add msecs_to_jiffies_timeout to guarantee minimum duration
  drm/i915: use msecs_to_jiffies_timeout instead of open coding the same
  drm/i915: avoid premature timeouts in __wait_seqno()
  drm/i915: avoid premature DP AUX timeouts

Rodrigo Vivi (1):
  drm/i915: Adding more reserved PCI IDs for Haswell.

 drivers/gpu/drm/i915/i915_drv.c  |   46 +++
 drivers/gpu/drm/i915/i915_drv.h  |   15 +++
 drivers/gpu/drm/i915/i915_gem.c  |2 +-
 drivers/gpu/drm/i915/intel_display.c |   49 ++
 drivers/gpu/drm/i915/intel_dp.c  |2 +-
 drivers/gpu/drm/i915/intel_i2c.c |5 ++--
 6 files changed, 87 insertions(+), 32 deletions(-)
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[Bug 52174] radeonsi enable GLSL 1.3 by default

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=52174

Michel Dänzer  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #1 from Michel Dänzer  ---
It wasn't actually ready yet, but now it is. :)

commit cdad129f9cda038fb29cde94645d2fcab7207c50
Author: Michel Dänzer 
Date:   Fri May 24 16:49:42 2013 +0200

radeonsi: Enable GLSL 1.30


[Bug 62889] ColorTiling results in glitches on Radeon HD 7970 + Glamor

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=62889

Michel Dänzer  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #23 from Michel Dänzer  ---
With no answer to comment #22, I assume the Steam issue was due to picking up
stale 32-bit binaries.


Re: [PATCH -next] drm/i915: fix error return code in init_pipe_control()

2013-05-28 Thread Daniel Vetter

On Tue, May 28, 2013 at 05:51:44PM +0800, Wei Yongjun wrote:
> From: Wei Yongjun 
> 
> Fix to return -ENOMEM in the kmap() error handling case
> instead of 0, as done elsewhere in this function.
> 
> Signed-off-by: Wei Yongjun 
Queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

Re: [PATCH -next] drm/i915: fix error return code in init_pipe_control()

2013-05-28 Thread Chris Wilson

On Tue, May 28, 2013 at 05:51:44PM +0800, Wei Yongjun wrote:
> From: Wei Yongjun 
> 
> Fix to return -ENOMEM in the kmap() error handling case
> instead of 0, as done elsewhere in this function.

kmap() can fail?

It is either translated to page_address() or kmap_high() (on x86),
neither of which may return NULL. However, only kmap_atomic() is
documented as being guaranteed to return a valid value. If we could
have a similar definitive statement for kmap(), we could then clean up
quite a bit of redundant error handling.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Re: Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Daniel Vetter

On Tue, May 28, 2013 at 12:56:57PM +0900, Inki Dae wrote:
> 
> 
> > -Original Message-
> > From: linux-fbdev-ow...@vger.kernel.org
> > [mailto:linux-fbdev-ow...@vger.kernel.org] On Behalf Of Rob Clark
> > Sent: Tuesday, May 28, 2013 12:48 AM
> > To: Inki Dae
> > Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho; Kyungmin
> > Park; myungjoo.ham; DRI mailing list; linux-arm-ker...@lists.infradead.org;
> > linux-me...@vger.kernel.org
> > Subject: Re: Introduce a new helper framework for buffer synchronization
> > 
> > On Mon, May 27, 2013 at 6:38 AM, Inki Dae  wrote:
> > > Hi all,
> > >
> > > I have removed the previous branch and added a new one with more
> > > cleanup. This time, the fence helper no longer includes the user-side
> > > interfaces or the cache operation code. We are not yet sure that
> > > coupling those two things in the kernel (cache synchronization, and
> > > fence-based synchronization of buffer access between CPU and CPU, CPU
> > > and DMA, and DMA and DMA) is a good idea, and the existing user-side
> > > code also has problems with badly behaved or crashing userspace. So
> > > this can be discussed further later.
> > >
> > > Below is the new branch:
> > >
> > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/?h=dma-fence-helper
> > >
> > > The fence helper code:
> > >
> > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> > >
> > > And example code for a device driver:
> > >
> > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> > >
> > > I think the time is not yet ripe for an RFC posting: the existing dma
> > > fence and reservation work may need more review and additional work.
> > > So I'd be glad if somebody could give opinions and advice before the
> > > RFC posting.
> > 
> > thoughts from a *really* quick, pre-coffee, first look:
> > * any sort of helper to simplify single-buffer sort of use-cases (v4l)
> > probably wouldn't want to bake in assumption that seqno_fence is used.
> > * I guess g2d is probably not actually a simple use case, since I
> > expect you can submit blits involving multiple buffers :-P
> 
> I don't think so. One or more buffers can be used: seqno_fence also has
> only one buffer. Actually, we have already applied this approach to most
> devices: multimedia, GPU and display controller. And this approach shows
> better performance: reduced power consumption compared with the
> traditional way. The g2d example is just to show you how to apply my
> approach to a device driver.

Note that seqno_fence is an implementation pattern for a certain type of
direct hw->hw synchronization which uses a shared dma_buf to exchange the
sync cookie. The dma_buf attached to the seqno_fence has _nothing_ to do
with the dma_buf the fence actually coordinates access to.

I think that confusion is a large reason why Maarten & I don't
understand what you want to achieve with your fence helpers. Currently
they're using the seqno_fence, but totally not in a way the seqno_fence
was meant to be used.

Note that with the current code there is only a pointer from dma_bufs to
the fence. The fence itself has _no_ pointer to the dma_buf it syncs. This
shouldn't be a problem since the fence fastpath for already signalled
fences is completely barrier&lock free (it's just a load+bit-test), and
fences are meant to be embedded into whatever dma tracking structure you
already have, so no overhead there. The only ugly part is the fence
refcounting, but I don't think we can drop that.
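
The "load + bit-test" fast path mentioned above can be sketched with C11
atomics. This is a userspace illustration under stated assumptions, not
the fence code from Maarten's patches: struct fence_sim, the flag bit
position and both function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define FENCE_SIGNALED_BIT 0UL	/* hypothetical flag position */

struct fence_sim {
	atomic_ulong flags;
};

/* Fast path: a single relaxed load plus a bit test, no lock taken. */
static bool fence_sim_is_signaled(struct fence_sim *f)
{
	unsigned long flags = atomic_load_explicit(&f->flags,
						   memory_order_relaxed);

	return (flags & (1UL << FENCE_SIGNALED_BIT)) != 0;
}

/* Signalling sets the bit once; it is never cleared again. */
static void fence_sim_signal(struct fence_sim *f)
{
	atomic_fetch_or_explicit(&f->flags, 1UL << FENCE_SIGNALED_BIT,
				 memory_order_release);
}
```

Because the structure is just a flags word, it can be embedded in
whatever tracking structure a driver already has, as described above.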

Note that you completely reinvent this part of Maarten's fence patches by
adding new r/w_complete completions to the reservation object, which
completely replaces the fence stuff.

Also note that a list of reservation entries is again meant to be used
only when submitting the dma to the gpu. With your patches you seem to
hang onto that list until dma completes. This has the ugly side-effect
that you need to allocate these reservation entries with kmalloc, whereas
if you just use them in the execbuf ioctl to construct the dma you can
usually embed it into something else you need already anyway. At least
i915 and ttm based drivers can work that way.

Furthermore fences are specifically constructed as frankenstein-monsters
between completion/waitqueues and callbacks. All the different use-cases
need the different aspects:
- busy/idle checks and bo retiring need the completion semantics
- callbacks (in interrupt context) are used for hybrid hw->irq handler->hw
  sync approaches

> 
> > * otherwise, you probably don't want to depend on dmabuf, which is why
> > reservation/fence is split out the way it is..  you want to be able to
> > use a single reservation/fence mechanism within your driver without
> > having to care about which buffers are exported to dmabuf's

Re: [PATCH 2/6] gpu: host1x: Fix syncpoint wait return value

2013-05-28 Thread Thierry Reding

On Mon, May 27, 2013 at 09:55:46AM +0300, Arto Merilainen wrote:
> On 05/26/2013 01:12 PM, Thierry Reding wrote:
> >On Fri, May 17, 2013 at 02:49:44PM +0300, Arto Merilainen wrote:
[...]
> >Thinking about it, maybe it would be good to have two separate error
> >codes. Keeping -EAGAIN for the case where a zero timeout was passed
> >doesn't sound too bad to differentiate it from the case where a non-
> >zero timeout was passed and it actually timed out. What do you think?
> 
> I agree, in this case it would not look bad at all. However, user
> space libraries may loop until the ioctl return code is something
> other than -EAGAIN or -EINTR. In particular, the function drmIoctl() in
> libdrm does this, which is why I noted this issue in the first
> place.
> 
> If user space uses zero timeout to just check if a syncpoint value
> has already passed the library continues looping until the syncpoint
> value actually passes. Of course, we could just modify the ioctl
> interface to "cast" this return code to something else but that does
> not seem correct.

That doesn't sound right. Maybe drmIoctl() needs fixing instead. Looking
at the history, drmIoctl() was introduced to automatically loop if a
signal was received (commit 8b9ab108ec1f2ba2b503f713769c4946849b3cb2).
However the ioctl(3p) manpage doesn't mention that ioctl() returns
EAGAIN in case it is interrupted by a signal.
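
The looping behaviour under discussion can be reproduced in isolation.
The sketch below wraps an ioctl-like callback in the retry condition
described above (retry while the call fails with EINTR or EAGAIN) and
pairs it with a fake zero-timeout wait. The callback type and the names
retrying_ioctl() and fake_syncpt_wait() are hypothetical and exist only
so the loop can be exercised without a DRM device.

```c
#include <errno.h>

/* Hypothetical ioctl-like callback so the loop can be exercised alone. */
typedef int (*ioctl_fn)(void *state);

/*
 * Retry while the call fails with EINTR or EAGAIN, the loop condition
 * discussed above. Returns the final result; *calls counts attempts.
 */
static int retrying_ioctl(ioctl_fn fn, void *state, unsigned int *calls)
{
	int ret;

	*calls = 0;
	do {
		ret = fn(state);
		(*calls)++;
	} while (ret == -1 && (errno == EINTR || errno == EAGAIN));

	return ret;
}

/*
 * Models a zero-timeout syncpoint check: fails with EAGAIN until the
 * remaining-ticks counter in *state reaches zero.
 */
static int fake_syncpt_wait(void *state)
{
	unsigned int *remaining = state;

	if (*remaining > 0) {
		(*remaining)--;
		errno = EAGAIN;
		return -1;
	}
	return 0;
}
```

A caller that only wanted to poll once ends up spinning until the
syncpoint passes, which is the problem Arto describes.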

I'm adding Keith as author of that commit and the xorg-devel mailing
list on Cc to get some more eyes on this.

Thierry



[PATCH -next] drm/i915: fix error return code in init_pipe_control()

2013-05-28 Thread Wei Yongjun

From: Wei Yongjun 

Fix to return -ENOMEM in the kmap() error handling case
instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 5698fae..9b97cf6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -464,9 +464,11 @@ init_pipe_control(struct intel_ring_buffer *ring)
goto err_unref;
 
pc->gtt_offset = obj->gtt_offset;
-   pc->cpu_page =  kmap(sg_page(obj->pages->sgl));
-   if (pc->cpu_page == NULL)
+   pc->cpu_page = kmap(sg_page(obj->pages->sgl));
+   if (pc->cpu_page == NULL) {
+   ret = -ENOMEM;
goto err_unpin;
+   }
 
DRM_DEBUG_DRIVER("%s pipe control offset: 0x%08x\n",
 ring->name, pc->gtt_offset);
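
The bug class this patch fixes is easy to reproduce outside the kernel. The
sketch below is a hypothetical user-space analogue, not the i915 code:
struct pipe_ctl, init_pipe_control_sketch() and the fail_map switch are
made up for illustration, and malloc() stands in for the object
allocation. The point is that 'ret' still holds 0 from an earlier
successful step, so a 'goto' to the error labels must set it first.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the pipe control state. */
struct pipe_ctl {
	void *obj;
	void *cpu_page;
};

/* fail_map != 0 simulates the mapping step returning NULL. */
static int init_pipe_control_sketch(struct pipe_ctl *pc, int fail_map)
{
	int ret;

	pc->obj = malloc(64);
	if (pc->obj == NULL) {
		ret = -ENOMEM;
		goto err;
	}

	/* An earlier step (think: pinning) succeeded and left ret == 0. */
	ret = 0;

	pc->cpu_page = fail_map ? NULL : pc->obj;
	if (pc->cpu_page == NULL) {
		/* Without this assignment the stale 0 would be returned. */
		ret = -ENOMEM;
		goto err_free;
	}

	return 0;

err_free:
	free(pc->obj);
	pc->obj = NULL;
err:
	return ret;
}
```

A caller that only checks the return value would otherwise treat the
failed mapping as success and go on to use a NULL cpu_page.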


[Bug 58901] New: "trying to bind memory to uninitialized GART" error at resume from suspend to memory

2013-05-28 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=58901

   Summary: "trying to bind memory to uninitialized GART" error at
resume from suspend to memory
   Product: Drivers
   Version: 2.5
Kernel Version: 3.9.3
  Platform: All
OS/Version: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Video(DRI - non Intel)
AssignedTo: drivers_video-...@kernel-bugs.osdl.org
ReportedBy: casteyde.christ...@free.fr
Regression: Yes


Acer Aspire 7750G
Core i7-2630QM, 6 GB
AMD Radeon HD6650M, no Intel graphics
Slackware64-current

Since kernel 3.9.x, my laptop cannot resume from suspend to memory: X is
completely frozen and there is no way out other than switching off/restarting.

My rc scripts save dmesg at shutdown so I managed to get the following kernel
logs:

usb 1-1.4: reset high-speed USB device number 4 using ehci-pci
PM: resume of devices complete after 1013.038 msecs
Restarting tasks ... done.
video LNXVIDEO:01: Restoring backlight state
ata1.00: configured for UDMA/133
ata1: EH complete
EXT4-fs (sda2): re-mounted. Opts: discard,commit=0
EXT4-fs (sda3): re-mounted. Opts: discard,commit=0
eth0: deauthenticated from  (Reason: 6)
cfg80211: Calling CRDA to update world regulatory domain
cfg80211: World regulatory domain updated:
cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:   (2402000 KHz - 2472000 KHz @ 4 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2457000 KHz - 2482000 KHz @ 2 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2474000 KHz - 2494000 KHz @ 2 KHz), (300 mBi, 2000 mBm)
cfg80211:   (517 KHz - 525 KHz @ 4 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5735000 KHz - 5835000 KHz @ 4 KHz), (300 mBi, 2000 mBm)
eth0: authenticate with 
eth0: send auth to 
eth0: authenticated
ath9k :03:00.0 eth0: disabling HT as WMM/QoS is not supported by the AP
ath9k :03:00.0 eth0: disabling VHT as WMM/QoS is not supported by the AP
eth0: associate with  (try 1/3)
eth0: RX AssocResp from  (capab=0x411 status=0 aid=2)
eth0: associated
[ cut here ]
WARNING: at drivers/gpu/drm/radeon/radeon_gart.c:280
radeon_gart_bind+0xe1/0xf0()
Hardware name: Aspire 7750G
trying to bind memory to uninitialized GART !
Modules linked in:
Pid: 2305, comm: X Not tainted 3.9.3 #6
Call Trace:
 [] ? radeon_gart_bind+0xe1/0xf0
 [] warn_slowpath_common+0x6b/0xa0
 [] warn_slowpath_fmt+0x47/0x50
 [] radeon_gart_bind+0xe1/0xf0
 [] radeon_ttm_backend_bind+0x32/0x90
 [] ttm_tt_bind+0x47/0x60
 [] ttm_bo_handle_move_mem+0x54f/0x5e0
 [] ? ttm_bo_mem_space+0x161/0x340
 [] ttm_bo_move_buffer+0x11f/0x140
 [] ttm_bo_validate+0x92/0x110
 [] ttm_bo_init+0x2a9/0x3c0
 [] radeon_bo_create+0x176/0x1d0
 [] ? radeon_bo_clear_va+0x50/0x50
 [] radeon_gem_object_create+0x9b/0x160
 [] radeon_gem_create_ioctl+0x5b/0x130
 [] drm_ioctl+0x4d1/0x580
 [] ? radeon_gem_pwrite_ioctl+0x30/0x30
 [] do_vfs_ioctl+0x2e5/0x4d0
 [] sys_ioctl+0x40/0x80
 [] ? sys_read+0x6c/0x90
 [] system_call_fastpath+0x16/0x1b
---[ end trace 44b14b5d0d1cf7ab ]---
[drm:radeon_ttm_backend_bind] *ERROR* failed to bind 1175 pages at 0x0240D000
[drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (4812800,
2, 4096, -22)
BUG: unable to handle kernel NULL pointer dereference at 0008
IP: [] ttm_dma_populate+0x6a3/0x960
PGD 1c61c0067 PUD 1c4df8067 PMD 0 
Oops: 0002 [#1] PREEMPT SMP 
Modules linked in:
CPU 6 
Pid: 2305, comm: X Tainted: GW3.9.3 #6 Acer Aspire 7750G/JE70_HR
RIP: 0010:[]  []
ttm_dma_populate+0x6a3/0x960
RSP: 0018:8801c4c6b9c0  EFLAGS: 00010093
RAX: 88019b74c100 RBX: 0202 RCX: 88019b74c180
RDX:  RSI: 8801c7202928 RDI: 8801c7202914
RBP: 8801c4c6ba88 R08: 000146c0 R09: 8801cf5946c0
R10: ea0007190a80 R11: 8801c8802600 R12: 88019b74c100
R13: 8801c642a320 R14: 8801c7202900 R15: 0004
FS:  7fe093e7c8c0() GS:8801cf58() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0008 CR3: 0001c619f000 CR4: 000407e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process X (pid: 2305, threadinfo 8801c4c6a000, task 8801c712da20)
Stack:
 8801c7202964 8801c762c098 8801c7202928 8801c7202971
 88019fb67558 8801c699bc00  ffc0192a
 88019fb67500 8801c7202914 4004 88010004
Call Trace:
 [] ? do_select+0x5fa/0x670
 [] ? kmem_cache_alloc+0x9a/0xa0
 [] ? __kmalloc+0xd0/0xe0
 [] radeon_ttm_tt_populate+0x1c7/0x220
 [] ? radeon_ttm_tt_create+0x6a/0xb0
 [] ttm_tt_bind+0x36/0x60
 [] ttm_bo_handle_move_mem+0x54f/0x5e0
 [] ? ttm_bo_mem_space+0x161/0x340
 [] ttm_bo_move_buffer+0x11f/0x140
 [] ttm_bo_validate+0x92/0x110
 [] ttm_bo_init+0x2a9/0x3c

Re: drm/exynos: make overlay data to be updated to valid hw

2013-05-28 Thread Inki Dae

2013/5/28 Inki Dae 

> This patch makes sure that overlay data are updated
> to real hardware enabled when framebuffer is released.
> For this, this patch checks if crtc and encoder are
> valid or not, and then makes it waiting for signal
> synchronization to only valid encoder.
>
> Signed-off-by: Inki Dae 
> Signed-off-by: Kyungmin Park 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_encoder.c |9 ++---
>  drivers/gpu/drm/exynos/exynos_drm_encoder.h |2 +-
>  drivers/gpu/drm/exynos/exynos_drm_fb.c  |   13 +++--
>  3 files changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_encoder.c
> b/drivers/gpu/drm/exynos/exynos_drm_encoder.c
> index c63721f..9a6e3fd 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_encoder.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_encoder.c
> @@ -220,18 +220,21 @@ static void exynos_drm_encoder_commit(struct
> drm_encoder *encoder)
> exynos_encoder->dpms = DRM_MODE_DPMS_ON;
>  }
>
> -void exynos_drm_encoder_complete_scanout(struct drm_framebuffer *fb)
> +void exynos_drm_encoder_complete_scanout(struct drm_crtc *crtc)
>  {
> struct exynos_drm_encoder *exynos_encoder;
> struct exynos_drm_manager_ops *ops;
> -   struct drm_device *dev = fb->dev;
> +   struct drm_device *dev = crtc->dev;
> struct drm_encoder *encoder;
>
> /*
>  * make sure that overlay data are updated to real hardware
> -* for all encoders.
> +* for valid encoders.
>  */
> list_for_each_entry(encoder, &dev->mode_config.encoder_list, head)
> {
> +   if (encoder->crtc != crtc)
> +   continue;
> +
> exynos_encoder = to_exynos_encoder(encoder);
> ops = exynos_encoder->manager->ops;
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_encoder.h
> b/drivers/gpu/drm/exynos/exynos_drm_encoder.h
> index 89e2fb0..e8dee1c 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_encoder.h
> +++ b/drivers/gpu/drm/exynos/exynos_drm_encoder.h
> @@ -32,6 +32,6 @@ void exynos_drm_encoder_plane_mode_set(struct
> drm_encoder *encoder, void *data);
>  void exynos_drm_encoder_plane_commit(struct drm_encoder *encoder, void
> *data);
>  void exynos_drm_encoder_plane_enable(struct drm_encoder *encoder, void
> *data);
>  void exynos_drm_encoder_plane_disable(struct drm_encoder *encoder, void
> *data);
> -void exynos_drm_encoder_complete_scanout(struct drm_framebuffer *fb);
> +void exynos_drm_encoder_complete_scanout(struct drm_crtc *crtc);
>
>  #endif
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fb.c
> b/drivers/gpu/drm/exynos/exynos_drm_fb.c
> index 0e04f4e..1fc7ae6 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fb.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fb.c
> @@ -68,12 +68,21 @@ static int check_fb_gem_memory_type(struct drm_device
> *drm_dev,
>  static void exynos_drm_fb_destroy(struct drm_framebuffer *fb)
>  {
> struct exynos_drm_fb *exynos_fb = to_exynos_fb(fb);
> +   struct drm_device *dev = fb->dev;
> +   struct drm_crtc *crtc;
> unsigned int i;
>
> DRM_DEBUG_KMS("%s\n", __FILE__);
>
> -   /* make sure that overlay data are updated before relesing fb. */
> -   exynos_drm_encoder_complete_scanout(fb);
> +   list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
> +   if (crtc->fb == fb) {
>

Sorry, crtc->fb could already point to the new fb, in which case this
condition will always fail. I will repost this patch once that is fixed.

Thanks,
Inki Dae

> +   /*
> +* make sure that overlay data are updated before
> +* relesing fb.
> +*/
> +   exynos_drm_encoder_complete_scanout(crtc);
> +   }
> +   }
>
> drm_framebuffer_cleanup(fb);
>
> --
> 1.7.5.4
>

Re: Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Rob Clark

On Mon, May 27, 2013 at 11:56 PM, Inki Dae  wrote:
>
>
>> -Original Message-
>> From: linux-fbdev-ow...@vger.kernel.org [mailto:linux-fbdev-
>> ow...@vger.kernel.org] On Behalf Of Rob Clark
>> Sent: Tuesday, May 28, 2013 12:48 AM
>> To: Inki Dae
>> Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho; Kyungmin
>> Park; myungjoo.ham; DRI mailing list;
> linux-arm-ker...@lists.infradead.org;
>> linux-me...@vger.kernel.org
>> Subject: Re: Introduce a new helper framework for buffer synchronization
>>
>> On Mon, May 27, 2013 at 6:38 AM, Inki Dae  wrote:
>> > Hi all,
>> >
>> > I have been removed previous branch and added new one with more cleanup.
>> > This time, the fence helper doesn't include user side interfaces and
>> cache
>> > operation relevant codes anymore because not only we are not sure that
>> > coupling those two things, synchronizing caches and buffer access
>> between
>> > CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel side is
>> a
>> > good idea yet but also existing codes for user side have problems with
>> badly
>> > behaved or crashing userspace. So this could be more discussed later.
>> >
>> > The below is a new branch,
>> >
>> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
>> exynos.git/?h=dma-f
>> > ence-helper
>> >
>> > And fence helper codes,
>> >
>> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
>> exynos.git/commit/?
>> > h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
>> >
>> > And example codes for device driver,
>> >
>> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
>> exynos.git/commit/?
>> > h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
>> >
>> > I think the time is not yet ripe for RFC posting: maybe existing dma
>> fence
>> > and reservation need more review and addition work. So I'd glad for
>> somebody
>> > giving other opinions and advices in advance before RFC posting.
>>
>> thoughts from a *really* quick, pre-coffee, first look:
>> * any sort of helper to simplify single-buffer sort of use-cases (v4l)
>> probably wouldn't want to bake in assumption that seqno_fence is used.
>> * I guess g2d is probably not actually a simple use case, since I
>> expect you can submit blits involving multiple buffers :-P
>
> I don't think so. One or more buffers can be used: seqno_fence also has
> only one buffer. Actually, we have already applied this approach to most
> devices; multimedia, gpu and display controller. And this approach shows
> more performance; reduced power consumption against traditional way. And g2d
> example is just to show you how to apply my approach to device driver.

no, you need the ww-mutex / reservation stuff any time you have
multiple independent devices (or rings/contexts for hw that can
support multiple contexts) which can do operations with multiple
buffers.  So you could conceivably hit this w/ gpu + g2d if multiple
buffers were shared between the two.  vram migration and such
'desktop stuff' might make the problem worse, but just because you
don't have vram doesn't mean you don't have a problem with multiple
buffers.
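
To make the multi-buffer problem concrete, here is a minimal
single-threaded sketch of two devices locking the same two buffers in
opposite order, resolved with the ticket rule from the wait/wound
documentation in Maarten's ww_mutex series (the older ticket waits, the
younger one backs off). Everything here is hypothetical illustration and
not kernel API: struct task, struct buffer, buffer_lock() and the use of
-EBUSY to mean "the caller would sleep here" are assumptions of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct task {
	unsigned long ticket;		/* lower ticket = older task */
};

struct buffer {
	const struct task *owner;	/* NULL when unlocked */
};

/*
 * Returns 0 when the lock is taken, -EBUSY when the caller is older than
 * the holder and would simply wait, -EDEADLK when the caller is younger
 * and has to drop every lock it already holds.
 */
static int buffer_lock(struct buffer *b, const struct task *t)
{
	if (b->owner == NULL) {
		b->owner = t;
		return 0;
	}
	if (b->owner == t)
		return -EALREADY;

	return (t->ticket < b->owner->ticket) ? -EBUSY : -EDEADLK;
}

static void buffer_unlock(struct buffer *b)
{
	b->owner = NULL;
}

/* gpu locks X then Y, g2d locks Y then X; returns the winner's ticket. */
static unsigned long run_inversion(void)
{
	struct task gpu = { 1 }, g2d = { 2 };
	struct buffer x = { NULL }, y = { NULL };

	buffer_lock(&x, &gpu);
	buffer_lock(&y, &g2d);

	if (buffer_lock(&y, &gpu) != -EBUSY)	/* older task: would wait */
		return 0;
	if (buffer_lock(&x, &g2d) != -EDEADLK)	/* younger task: back off */
		return 0;

	buffer_unlock(&y);			/* g2d drops what it holds */
	if (buffer_lock(&y, &gpu) != 0)		/* gpu can now make progress */
		return 0;

	return gpu.ticket;
}
```

Without the ticket comparison both tasks would wait on each other forever,
and nothing in the scenario depends on VRAM or buffer migration.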

>> * otherwise, you probably don't want to depend on dmabuf, which is why
>> reservation/fence is split out the way it is..  you want to be able to
>> use a single reservation/fence mechanism within your driver without
>> having to care about which buffers are exported to dmabuf's and which
>> are not.  Creating a dmabuf for every GEM bo is too heavyweight.
>
> Right. But I think we should deal with this separately. Actually, we are
> trying to use reservation for gpu pipe line synchronization such as sgx sync
> object and this approach is used without dmabuf. In other words, some device
> can use only reservation for such pipe line synchronization and at the same
> time, fence helper or similar thing with dmabuf for buffer synchronization.

it is probably easier to approach from the reverse direction.. ie, get
reservation/synchronization right first, and then dmabuf.  (Well, that
isn't really a problem because Maarten's reservation/fence patches
support dmabuf from the beginning.)

BR,
-R

>>
>> I'm not entirely sure if reservation/fence could/should be made any
>> simpler for multi-buffer users.  Probably the best thing to do is just
>> get reservation/fence rolled out in a few drivers and see if some
>> common patterns emerge.
>>
>> BR,
>> -R
>>
>> >
>> > Thanks,
>> > Inki Dae
>> >
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

RE: Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Inki Dae


Hi Daniel,

Thank you so much; that was very useful. :) Sorry, but could you give me
more comments on my comments below? There are still some things that
confuse me. :(


> -Original Message-
> From: Daniel Vetter [mailto:daniel.vet...@ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Tuesday, May 28, 2013 7:33 PM
> To: Inki Dae
> Cc: 'Rob Clark'; 'Maarten Lankhorst'; 'Daniel Vetter'; 'linux-fbdev';
> 'YoungJun Cho'; 'Kyungmin Park'; 'myungjoo.ham'; 'DRI mailing list';
> linux-arm-ker...@lists.infradead.org; linux-me...@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Tue, May 28, 2013 at 12:56:57PM +0900, Inki Dae wrote:
> >
> >
> > > -Original Message-
> > > From: linux-fbdev-ow...@vger.kernel.org [mailto:linux-fbdev-
> > > ow...@vger.kernel.org] On Behalf Of Rob Clark
> > > Sent: Tuesday, May 28, 2013 12:48 AM
> > > To: Inki Dae
> > > Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho;
> Kyungmin
> > > Park; myungjoo.ham; DRI mailing list;
> > linux-arm-ker...@lists.infradead.org;
> > > linux-me...@vger.kernel.org
> > > Subject: Re: Introduce a new helper framework for buffer
> synchronization
> > >
> > > On Mon, May 27, 2013 at 6:38 AM, Inki Dae 
wrote:
> > > > Hi all,
> > > >
> > > > I have been removed previous branch and added new one with more
> cleanup.
> > > > This time, the fence helper doesn't include user side interfaces and
> > > cache
> > > > operation relevant codes anymore because not only we are not sure
> that
> > > > coupling those two things, synchronizing caches and buffer access
> > > between
> > > > CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel
> side is
> > > a
> > > > good idea yet but also existing codes for user side have problems
> with
> > > badly
> > > > behaved or crashing userspace. So this could be more discussed
later.
> > > >
> > > > The below is a new branch,
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> > > exynos.git/?h=dma-f
> > > > ence-helper
> > > >
> > > > And fence helper codes,
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> > > exynos.git/commit/?
> > > > h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> > > >
> > > > And example codes for device driver,
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> > > exynos.git/commit/?
> > > > h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> > > >
> > > > I think the time is not yet ripe for RFC posting: maybe existing dma
> > > fence
> > > > and reservation need more review and addition work. So I'd glad for
> > > somebody
> > > > giving other opinions and advices in advance before RFC posting.
> > >
> > > thoughts from a *really* quick, pre-coffee, first look:
> > > * any sort of helper to simplify single-buffer sort of use-cases (v4l)
> > > probably wouldn't want to bake in assumption that seqno_fence is used.
> > > * I guess g2d is probably not actually a simple use case, since I
> > > expect you can submit blits involving multiple buffers :-P
> >
> > I don't think so. One or more buffers can be used: seqno_fence also has
> > only one buffer. Actually, we have already applied this approach to most
> > devices; multimedia, gpu and display controller. And this approach shows
> > more performance; reduced power consumption against traditional way. And
> g2d
> > example is just to show you how to apply my approach to device driver.
> 
> Note that seqno_fence is an implementation pattern for a certain type of
> direct hw->hw synchronization which uses a shared dma_buf to exchange the
> sync cookie.

I'm afraid I don't understand hw->hw synchronization. Does it mean that
the device has a hardware feature which performs buffer synchronization
internally? And what is the sync cookie?

> The dma_buf attached to the seqno_fence has _nothing_ to do
> with the dma_buf the fence actually coordinates access to.
> 
> I think that confusion is a large reason for why Maarten&I don't
> understand what you want to achieve with your fence helpers. Currently
> they're using the seqno_fence, but totally not in a way the seqno_fence
> was meant to be used.
> 
> Note that with the current code there is only a pointer from dma_bufs to
> the fence. The fence itself has _no_ pointer to the dma_buf it syncs. This
> shouldn't be a problem since the fence fastpath for already signalled
> fences is completely barrier&lock free (it's just a load+bit-test), and
> fences are meant to be embedded into whatever dma tracking structure you
> already have, so no overhead there. The only ugly part is the fence
> refcounting, but I don't think we can drop that.
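
The "load+bit-test" fastpath and the embedding mentioned above can be
sketched with C11 atomics. This is only an illustration under assumed
names: struct sketch_fence, struct sketch_dma_job and the helpers are
hypothetical and are not the fence API from Maarten's patches.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_FENCE_SIGNALED_BIT 0

/* Hypothetical fence: one flags word, no pointer back to any dma_buf. */
struct sketch_fence {
	atomic_ulong flags;
};

/* The fence is embedded in whatever tracking structure already exists. */
struct sketch_dma_job {
	int id;
	struct sketch_fence done;
};

/* Fastpath: a single relaxed load plus a bit test, no lock taken. */
static bool sketch_fence_is_signaled(struct sketch_fence *f)
{
	unsigned long flags;

	flags = atomic_load_explicit(&f->flags, memory_order_relaxed);
	return (flags & (1UL << SKETCH_FENCE_SIGNALED_BIT)) != 0;
}

/* Signalling side: set the bit once the hardware has finished. */
static void sketch_fence_signal(struct sketch_fence *f)
{
	atomic_fetch_or_explicit(&f->flags,
				 1UL << SKETCH_FENCE_SIGNALED_BIT,
				 memory_order_release);
}
```

A waiter that finds the bit clear would fall back to a slow path (sleeping
or enabling an interrupt), which this sketch leaves out.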

Below is the proposed approach:
a DMA device has to create a fence before accessing a shared buffer, and
then check whether any other DMA devices are accessing that buffer; if so,
the DMA device should be blocked, and then it sets the fence to
reservation o

[PATCH v4 0/4] add mutex wait/wound/style style locks

2013-05-28 Thread Maarten Lankhorst

Version 4 already?

Small api changes since v3:
- Remove ww_mutex_unlock_single and ww_mutex_lock_single.
- Rename ww_mutex_trylock_single to ww_mutex_trylock.
- Remove separate implementations of ww_mutex_lock_slow*, normal
  functions can be used. Inline versions still exist for extra
  debugging, and to annotate.
- Cleanup unneeded memory barriers, add comment to the remaining
  smp_mb().

Thanks to Daniel Vetter, Rob Clark and Peter Zijlstra for their feedback.
---

Daniel Vetter (1):
  mutex: w/w mutex slowpath debugging

Maarten Lankhorst (3):
  arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded
    or not.
  mutex: add support for wound/wait style locks, v5
  mutex: Add ww tests to lib/locking-selftest.c. v4


 Documentation/ww-mutex-design.txt |  344 +++
 arch/ia64/include/asm/mutex.h |   10 -
 arch/powerpc/include/asm/mutex.h  |   10 -
 arch/sh/include/asm/mutex-llsc.h  |4 
 arch/x86/include/asm/mutex_32.h   |   11 -
 arch/x86/include/asm/mutex_64.h   |   11 -
 include/asm-generic/mutex-dec.h   |   10 -
 include/asm-generic/mutex-null.h  |2 
 include/asm-generic/mutex-xchg.h  |   10 -
 include/linux/mutex-debug.h   |1 
 include/linux/mutex.h |  363 +
 kernel/mutex.c|  384 ---
 lib/Kconfig.debug |   13 +
 lib/debug_locks.c |2 
 lib/locking-selftest.c|  410 +++--
 15 files changed, 1492 insertions(+), 93 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt

-- 
~Maarten

[PATCH v4 2/4] mutex: add support for wound/wait style locks, v5

2013-05-28 Thread Maarten Lankhorst

Changes since RFC patch v1:
 - Updated to use atomic_long instead of atomic, since the reservation_id was a
   long.
 - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
 - removed mutex_locked_set_reservation_id (or w/e it was called)
Changes since RFC patch v2:
 - remove use of __mutex_lock_retval_arg, add warnings when using wrong
   combination of mutex_(,reserve_)lock/unlock.
Changes since v1:
 - Add __always_inline to __mutex_lock_common, otherwise reservation paths can
   be triggered from normal locks, because __builtin_constant_p might evaluate
   to false for the constant 0 in that case. Tests for this have been added in
   the next patch.
 - Updated documentation slightly.
Changes since v2:
 - Renamed everything to ww_mutex. (mlankhorst)
 - Added ww_acquire_ctx and ww_class. (mlankhorst)
 - Added a lot of checks for wrong api usage. (mlankhorst)
 - Documentation updates. (danvet)
Changes since v3:
 - Small documentation fixes (robclark)
 - Memory barrier fix (danvet)
Changes since v4:
 - Remove ww_mutex_unlock_single and ww_mutex_lock_single.
 - Rename ww_mutex_trylock_single to ww_mutex_trylock.
 - Remove separate implementations of ww_mutex_lock_slow*, normal
   functions can be used. Inline versions still exist for extra
   debugging.
 - Cleanup unneeded memory barriers, add comment to the remaining
   smp_mb().

Signed-off-by: Maarten Lankhorst 
Signed-off-by: Daniel Vetter 
Signed-off-by: Rob Clark 
---
 Documentation/ww-mutex-design.txt |  344 
 include/linux/mutex-debug.h   |1 
 include/linux/mutex.h |  355 +
 kernel/mutex.c|  318 +++--
 lib/debug_locks.c |2 
 5 files changed, 1003 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt

diff --git a/Documentation/ww-mutex-design.txt 
b/Documentation/ww-mutex-design.txt
new file mode 100644
index 000..8bd1761
--- /dev/null
+++ b/Documentation/ww-mutex-design.txt
@@ -0,0 +1,344 @@
+Wait/Wound Deadlock-Proof Mutex Design
+==
+
+Please read mutex-design.txt first, as it applies to wait/wound mutexes too.
+
+Motivation for WW-Mutexes
+-
+
+GPU's do operations that commonly involve many buffers.  Those buffers
+can be shared across contexts/processes, exist in different memory
+domains (for example VRAM vs system memory), and so on.  And with
+PRIME / dmabuf, they can even be shared across devices.  So there are
+a handful of situations where the driver needs to wait for buffers to
+become ready.  If you think about this in terms of waiting on a buffer
+mutex for it to become available, this presents a problem because
+there is no way to guarantee that buffers appear in a execbuf/batch in
+the same order in all contexts.  That is directly under control of
+userspace, and a result of the sequence of GL calls that an application
+makes. Which results in the potential for deadlock.  The problem gets
+more complex when you consider that the kernel may need to migrate the
+buffer(s) into VRAM before the GPU operates on the buffer(s), which
+may in turn require evicting some other buffers (and you don't want to
+evict other buffers which are already queued up to the GPU), but for a
+simplified understanding of the problem you can ignore this.
+
+The algorithm that TTM came up with for dealing with this problem is quite
+simple.  For each group of buffers (execbuf) that need to be locked, the caller
+would be assigned a unique reservation id/ticket, from a global counter.  In
+case of deadlock while locking all the buffers associated with a execbuf, the
+one with the lowest reservation ticket (i.e. the oldest task) wins, and the one
+with the higher reservation id (i.e. the younger task) unlocks all of the
+buffers that it has already locked, and then tries again.
+
+In the RDBMS literature this deadlock handling approach is called wait/wound:
+The older tasks waits until it can acquire the contended lock. The younger 
tasks
+needs to back off and drop all the locks it is currently holding, i.e. the
+younger task is wounded.
+
+Concepts
+
+
+Compared to normal mutexes two additional concepts/objects show up in the lock
+interface for w/w mutexes:
+
+Acquire context: To ensure eventual forward progress it is important that a task
+trying to acquire locks doesn't grab a new reservation id, but keeps the one it
+acquired when starting the lock acquisition. This ticket is stored in the
+acquire context. Furthermore the acquire context keeps track of debugging state
+to catch w/w mutex interface abuse.
+
+W/w class: In contrast to normal mutexes the lock class needs to be explicit for
+w/w mutexes, since it is required to initialize the acquire context.
+
+Furthermore there are three different classes of w/w lock acquire functions:
+
+* Normal lock acquisition with a context, using ww_mute

[PATCH v4 3/4] mutex: Add ww tests to lib/locking-selftest.c. v4

2013-05-28 Thread Maarten Lankhorst

This stresses the lockdep code in some ways specifically useful to
ww_mutexes. It adds checks for most of the common locking errors.

Changes since v1:
 - Add tests to verify reservation_id is untouched.
 - Use L() and U() macros where possible.

Changes since v2:
 - Use the ww_mutex api directly.
 - Use macros for most of the code.
Changes since v3:
 - Rework tests for the api changes.

Signed-off-by: Maarten Lankhorst 
---
 lib/locking-selftest.c |  405 ++--
 1 file changed, 386 insertions(+), 19 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index c3eb261..b18f1d3 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -26,6 +26,8 @@
  */
 static unsigned int debug_locks_verbose;
 
+static DEFINE_WW_CLASS(ww_lockdep);
+
 static int __init setup_debug_locks_verbose(char *str)
 {
get_option(&str, &debug_locks_verbose);
@@ -42,6 +44,10 @@ __setup("debug_locks_verbose=", setup_debug_locks_verbose);
 #define LOCKTYPE_RWLOCK0x2
 #define LOCKTYPE_MUTEX 0x4
 #define LOCKTYPE_RWSEM 0x8
+#define LOCKTYPE_WW0x10
+
+static struct ww_acquire_ctx t, t2;
+static struct ww_mutex o, o2;
 
 /*
  * Normal standalone locks, for the circular and irq-context
@@ -193,6 +199,16 @@ static void init_shared_classes(void)
 #define RSU(x) up_read(&rwsem_##x)
 #define RWSI(x)init_rwsem(&rwsem_##x)
 
+#define WWAI(x)ww_acquire_init(x, &ww_lockdep)
+#define WWAD(x)ww_acquire_done(x)
+#define WWAF(x)ww_acquire_fini(x)
+
+#define WWL(x, c)  ww_mutex_lock(x, c)
+#define WWT(x) ww_mutex_trylock(x)
+#define WWL1(x)ww_mutex_lock(x, NULL)
+#define WWU(x) ww_mutex_unlock(x)
+
+
 #define LOCK_UNLOCK_2(x,y) LOCK(x); LOCK(y); UNLOCK(y); UNLOCK(x)
 
 /*
@@ -894,11 +910,13 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 # define I_RWLOCK(x)   lockdep_reset_lock(&rwlock_##x.dep_map)
 # define I_MUTEX(x)lockdep_reset_lock(&mutex_##x.dep_map)
 # define I_RWSEM(x)lockdep_reset_lock(&rwsem_##x.dep_map)
+# define I_WW(x)   lockdep_reset_lock(&x.dep_map)
 #else
 # define I_SPINLOCK(x)
 # define I_RWLOCK(x)
 # define I_MUTEX(x)
 # define I_RWSEM(x)
+# define I_WW(x)
 #endif
 
 #define I1(x)  \
@@ -920,11 +938,20 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 static void reset_locks(void)
 {
local_irq_disable();
+   lockdep_free_key_range(&ww_lockdep.acquire_key, 1);
+   lockdep_free_key_range(&ww_lockdep.mutex_key, 1);
+
I1(A); I1(B); I1(C); I1(D);
I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
+   I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base);
lockdep_reset();
I2(A); I2(B); I2(C); I2(D);
init_shared_classes();
+
+   ww_mutex_init(&o, &ww_lockdep); ww_mutex_init(&o2, &ww_lockdep);
+   memset(&t, 0, sizeof(t)); memset(&t2, 0, sizeof(t2));
+   memset(&ww_lockdep.acquire_key, 0, sizeof(ww_lockdep.acquire_key));
+   memset(&ww_lockdep.mutex_key, 0, sizeof(ww_lockdep.mutex_key));
local_irq_enable();
 }
 
@@ -938,7 +965,6 @@ static int unexpected_testcase_failures;
 static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
 {
unsigned long saved_preempt_count = preempt_count();
-   int expected_failure = 0;
 
WARN_ON(irqs_disabled());
 
@@ -946,26 +972,16 @@ static void dotest(void (*testcase_fn)(void), int 
expected, int lockclass_mask)
/*
 * Filter out expected failures:
 */
+   if (debug_locks != expected) {
 #ifndef CONFIG_PROVE_LOCKING
-   if ((lockclass_mask & LOCKTYPE_SPIN) && debug_locks != expected)
-   expected_failure = 1;
-   if ((lockclass_mask & LOCKTYPE_RWLOCK) && debug_locks != expected)
-   expected_failure = 1;
-   if ((lockclass_mask & LOCKTYPE_MUTEX) && debug_locks != expected)
-   expected_failure = 1;
-   if ((lockclass_mask & LOCKTYPE_RWSEM) && debug_locks != expected)
-   expected_failure = 1;
+   expected_testcase_failures++;
+   printk("failed|");
+#else
+   unexpected_testcase_failures++;
+   printk("FAILED|");
+
+   dump_stack();
 #endif
-   if (debug_locks != expected) {
-   if (expected_failure) {
-   expected_testcase_failures++;
-   printk("failed|");
-   } else {
-   unexpected_testcase_failures++;
-
-   printk("FAILED|");
-   dump_stack();
-   }
} else {
testcase_successes++;
printk("  ok  |");
@@ -1108,6 +1124,355 @@ static inline void print_testname(const char *testname)
DO_TESTCASE_6IRW(desc, name, 312);

[PATCH v4 4/4] mutex: w/w mutex slowpath debugging

2013-05-28 Thread Maarten Lankhorst

From: Daniel Vetter 

Injects EDEADLK conditions at pseudo-random interval, with exponential
backoff up to UINT_MAX (to ensure that every lock operation still
completes in a reasonable time).

This way we can test the wound slowpath even for ww mutex users where
contention is never expected, and the ww deadlock avoidance algorithm
is only needed for correctness against malicious userspace. An example
would be protecting kernel modesetting properties, which thanks to
single-threaded X isn't really expected to contend, ever.

I've looked into using the CONFIG_FAULT_INJECTION infrastructure, but
decided against it for two reasons:

- EDEADLK handling is mandatory for ww mutex users and should never
  affect the outcome of a syscall. This is in contrast to -ENOMEM
  injection. So fine configurability isn't required.

- The fault injection framework only allows setting a simple
  probability for failure. Now the probability that a ww mutex acquire
  stage with N locks will never complete (due to too many injected
  EDEADLK backoffs) is zero. But the expected number of ww_mutex_lock
  operations for the completely uncontended case would be O(exp(N)).
  The per-acquire ctx exponential backoff solution chosen here only
  results in O(log N) overhead due to injection and so O(log N * N)
  lock operations. This way we can fail with high probability (and so
  have good test coverage even for fancy backoff and lock acquisition
  paths) without running into pathological cases.
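
The backoff schedule can be tried out in user space. The sketch below
reuses the arithmetic from the kernel/mutex.c hunk of this patch (initial
countdown of stamp & 0xf, interval growing by tmp*2 + tmp + tmp/2 and
clamped at UINT_MAX); struct inject_ctx and the helper names are
hypothetical, and "injecting" is reduced to returning 1 instead of
unlocking and returning -EDEADLK.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical user-space copy of the two debug fields in the ctx. */
struct inject_ctx {
	unsigned interval;
	unsigned countdown;
};

static void inject_init(struct inject_ctx *c, unsigned stamp)
{
	c->interval = 1;
	c->countdown = stamp & 0xf;
}

/* Returns 1 when this lock operation would get an injected -EDEADLK. */
static int inject_should_fail(struct inject_ctx *c)
{
	if (c->countdown-- == 0) {
		unsigned tmp = c->interval;

		if (tmp > UINT_MAX / 4)
			tmp = UINT_MAX;
		else
			tmp = tmp * 2 + tmp + tmp / 2;	/* roughly x3.5 */

		c->interval = tmp;
		c->countdown = tmp;
		return 1;
	}

	return 0;
}

/* Count the injections seen over 'nops' successful lock operations. */
static unsigned count_injections(unsigned stamp, unsigned nops)
{
	struct inject_ctx c;
	unsigned i, n = 0;

	inject_init(&c, stamp);
	for (i = 0; i < nops; i++)
		n += inject_should_fail(&c);

	return n;
}
```

With a stamp of 0 the injections land on lock operations 1, 5, 16 and 52,
so 100 operations see only 4 of them; the gaps grow geometrically, which
is where the logarithmic overhead comes from.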

Note that EDEADLK will only ever be injected when we managed to
acquire the lock. This prevents any behaviour changes for users which
rely on the EALREADY semantics.

v2: Drop the cargo-culted __sched (I should read docs next time
around) and annotate the non-debug case with inline to prevent gcc
from doing something horrible.

v3: Rebase on top of Maarten's latest patches.

v4: Actually make this stuff compile, I'd misplaced the hunk in the
wrong #ifdef block.

v5: Simplify ww_mutex_deadlock_injection definition, and fix
lib/locking-selftest.c warnings. Fix lib/Kconfig.debug definition
to work correctly. (mlankhorst)

v6:
Do not inject -EDEADLK when ctx->acquired == 0, because
the _slow paths are merged now. (mlankhorst)

Cc: Steven Rostedt 
Signed-off-by: Daniel Vetter 
Signed-off-by: Maarten Lankhorst 
---
 include/linux/mutex.h  |8 
 kernel/mutex.c |   44 +---
 lib/Kconfig.debug  |   13 +
 lib/locking-selftest.c |5 +
 4 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index f3ad181..2ff9178 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -95,6 +95,10 @@ struct ww_acquire_ctx {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map dep_map;
 #endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+   unsigned deadlock_inject_interval;
+   unsigned deadlock_inject_countdown;
+#endif
 };
 
 struct ww_mutex {
@@ -280,6 +284,10 @@ static inline void ww_acquire_init(struct ww_acquire_ctx *ctx,
 &ww_class->acquire_key, 0);
mutex_acquire(&ctx->dep_map, 0, 0, _RET_IP_);
 #endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+   ctx->deadlock_inject_interval = 1;
+   ctx->deadlock_inject_countdown = ctx->stamp & 0xf;
+#endif
 }
 
 /**
diff --git a/kernel/mutex.c b/kernel/mutex.c
index 75fc7c4..e40004b 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -508,22 +508,60 @@ mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
 
 EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
 
+static inline int
+ww_mutex_deadlock_injection(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+   unsigned tmp;
+
+   if (ctx->deadlock_inject_countdown-- == 0) {
+       tmp = ctx->deadlock_inject_interval;
+       if (tmp > UINT_MAX/4)
+           tmp = UINT_MAX;
+       else
+           tmp = tmp*2 + tmp + tmp/2;
+
+       ctx->deadlock_inject_interval = tmp;
+       ctx->deadlock_inject_countdown = tmp;
+       ctx->contending_lock = lock;
+
+       ww_mutex_unlock(lock);
+
+       return -EDEADLK;
+   }
+#endif
+
+   return 0;
+}
 
 int __sched
 __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
+   int ret;
+
might_sleep();
-   return __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
+   ret =  __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
   0, &ctx->dep_map, _RET_IP_, ctx);
+   if (!ret && ctx->acquired > 0)
+       return ww_mutex_deadlock_injection(lock, ctx);
+
+   return ret;
 }
 EXPORT_SYMBOL_GPL(__ww_mutex_lock);
 
 int __sched
 __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
+   int ret;
+
might_sleep();
-   return __mutex_lock_common(&lock->base,

RE: Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Inki Dae



> -----Original Message-----
> From: linux-fbdev-ow...@vger.kernel.org
> [mailto:linux-fbdev-ow...@vger.kernel.org] On Behalf Of Rob Clark
> Sent: Tuesday, May 28, 2013 10:49 PM
> To: Inki Dae
> Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho;
> Kyungmin Park; myungjoo.ham; DRI mailing list;
> linux-arm-ker...@lists.infradead.org; linux-me...@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
>
> On Mon, May 27, 2013 at 11:56 PM, Inki Dae  wrote:
> >
> >
> >> -----Original Message-----
> >> From: linux-fbdev-ow...@vger.kernel.org
> >> [mailto:linux-fbdev-ow...@vger.kernel.org] On Behalf Of Rob Clark
> >> Sent: Tuesday, May 28, 2013 12:48 AM
> >> To: Inki Dae
> >> Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho;
> >> Kyungmin Park; myungjoo.ham; DRI mailing list;
> >> linux-arm-ker...@lists.infradead.org; linux-me...@vger.kernel.org
> >> Subject: Re: Introduce a new helper framework for buffer synchronization
> >>
> >> On Mon, May 27, 2013 at 6:38 AM, Inki Dae  wrote:
> >> > Hi all,
> >> >
> >> > I have removed the previous branch and added a new one with more
> >> > cleanup. This time, the fence helper no longer includes the user
> >> > side interfaces or the cache operation code. We are not yet sure
> >> > that coupling those two things in the kernel (synchronizing caches,
> >> > and synchronizing buffer access between CPU and CPU, CPU and DMA,
> >> > and DMA and DMA with fences) is a good idea, and the existing user
> >> > side code has problems with badly behaved or crashing userspace. So
> >> > this can be discussed further later.
> >> >
> >> > Below is the new branch:
> >> >
> >> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/?h=dma-fence-helper
> >> >
> >> > The fence helper code:
> >> >
> >> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> >> >
> >> > And example code for a device driver:
> >> >
> >> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> >> >
> >> > I think the time is not yet ripe for an RFC posting: the existing dma
> >> > fence and reservation work may need more review and additional work.
> >> > So I'd be glad if somebody could give opinions and advice in advance,
> >> > before the RFC posting.
> >>
> >> thoughts from a *really* quick, pre-coffee, first look:
> >> * any sort of helper to simplify single-buffer sort of use-cases (v4l)
> >> probably wouldn't want to bake in assumption that seqno_fence is used.
> >> * I guess g2d is probably not actually a simple use case, since I
> >> expect you can submit blits involving multiple buffers :-P
> >
> > I don't think so. One or more buffers can be used: seqno_fence also has
> > only one buffer. Actually, we have already applied this approach to most
> > devices: multimedia, gpu and display controller. And this approach shows
> > better performance and reduced power consumption compared with the
> > traditional way. The g2d example is just to show you how to apply my
> > approach to a device driver.
> 
> no, you need the ww-mutex / reservation stuff any time you have
> multiple independent devices (or rings/contexts for hw that can
> support multiple contexts) which can do operations with multiple
> buffers.

I think I already used reservation stuff any time in that way except
ww-mutex. And I'm not sure that embedded system really needs ww-mutex. If
there is any case, could you tell me the case? I really need more advice
and understanding :)

Thanks,
Inki Dae

> So you could conceivably hit this w/ gpu + g2d if multiple
> buffers were shared between the two.  vram migration and such
> 'desktop stuff' might make the problem worse, but just because you
> don't have vram doesn't mean you don't have a problem with multiple
> buffers.
> 
> >> * otherwise, you probably don't want to depend on dmabuf, which is why
> >> reservation/fence is split out the way it is..  you want to be able to
> >> use a single reservation/fence mechanism within your driver without
> >> having to care about which buffers are exported to dmabuf's and which
> >> are not.  Creating a dmabuf for every GEM bo is too heavyweight.
> >
> > Right. But I think we should deal with this separately. Actually, we are
> > trying to use reservation for gpu pipeline synchronization such as the
> > sgx sync object, and this approach is used without dmabuf. In other
> > words, some devices can use only reservation for such pipeline
> > synchronization and, at the same time, the fence helper or something
> > similar with dmabuf for buffer synchronization.
> 
> it is probably easier to approach from the reverse direction.. ie, get
> reservation/synchronization right first, and then dmabuf.  (Well, that
> isn't really a problem because Maarten's reservation

[PATCH v4 1/4] arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.

2013-05-28 Thread Maarten Lankhorst

This will allow me to call functions that have multiple arguments if the
fastpath fails. This is required to support ticket mutexes, because they
need to be able to pass an extra argument to the fail function.

Originally I duplicated the functions by adding
__mutex_fastpath_lock_retval_arg. This ended up being just a duplication
of the existing function, so a way to test whether the fastpath succeeded
ended up being better.

This also cleaned up the reservation mutex patch somewhat: it can call
atomic_set instead of atomic_xchg, and it is easier to detect if the wrong
unlock function was previously used.
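
A minimal userspace sketch of the resulting calling convention, using C11
atomics rather than the per-architecture helpers. The names
model_fastpath_lock_retval(), model_slowpath() and model_lock(), the
slow_calls counter standing in for the extra argument, and the -EINTR
return value are all assumptions for illustration; a real slowpath would
sleep until the lock is free.

```c
#include <errno.h>
#include <stdatomic.h>

/*
 * Fastpath only: count is 1 when unlocked. Moving it from 1 to 0 takes
 * the lock; any other starting value means the fastpath failed.
 */
static inline int model_fastpath_lock_retval(atomic_int *count)
{
	if (atomic_fetch_sub(count, 1) != 1)
		return -1;
	return 0;
}

/* Hypothetical slowpath that needs more than just the counter. */
static int model_slowpath(atomic_int *count, int *slow_calls)
{
	(void)count;
	(*slow_calls)++;
	return -EINTR;	/* pretend the wait was interrupted */
}

/* The caller, not the fastpath helper, now decides how to call the slowpath. */
static int model_lock(atomic_int *count, int *slow_calls)
{
	if (model_fastpath_lock_retval(count) == 0)
		return 0;
	return model_slowpath(count, slow_calls);
}
```

Because the helper no longer takes a fail_fn pointer, the slowpath
signature is free to grow extra parameters without touching the
architecture headers again.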

Changes since v1, pointed out by Francesco Lavra:
- fix a small comment issue in mutex_32.h
- fix the __mutex_fastpath_lock_retval macro for mutex-null.h

Signed-off-by: Maarten Lankhorst 
---
 arch/ia64/include/asm/mutex.h|   10 --
 arch/powerpc/include/asm/mutex.h |   10 --
 arch/sh/include/asm/mutex-llsc.h |4 ++--
 arch/x86/include/asm/mutex_32.h  |   11 ---
 arch/x86/include/asm/mutex_64.h  |   11 ---
 include/asm-generic/mutex-dec.h  |   10 --
 include/asm-generic/mutex-null.h |2 +-
 include/asm-generic/mutex-xchg.h |   10 --
 kernel/mutex.c   |   32 ++--
 9 files changed, 41 insertions(+), 59 deletions(-)

diff --git a/arch/ia64/include/asm/mutex.h b/arch/ia64/include/asm/mutex.h
index bed73a6..f41e66d 100644
--- a/arch/ia64/include/asm/mutex.h
+++ b/arch/ia64/include/asm/mutex.h
@@ -29,17 +29,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
  *  __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  *  @count: pointer of type atomic_t
- *  @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
if (unlikely(ia64_fetchadd4_acq(count, -1) != 1))
-   return fail_fn(count);
+   return -1;
return 0;
 }
 
diff --git a/arch/powerpc/include/asm/mutex.h b/arch/powerpc/include/asm/mutex.h
index 5399f7e..127ab23 100644
--- a/arch/powerpc/include/asm/mutex.h
+++ b/arch/powerpc/include/asm/mutex.h
@@ -82,17 +82,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
  *  __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  *  @count: pointer of type atomic_t
- *  @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
if (unlikely(__mutex_dec_return_lock(count) < 0))
-   return fail_fn(count);
+   return -1;
return 0;
 }
 
diff --git a/arch/sh/include/asm/mutex-llsc.h b/arch/sh/include/asm/mutex-llsc.h
index 090358a..dad29b6 100644
--- a/arch/sh/include/asm/mutex-llsc.h
+++ b/arch/sh/include/asm/mutex-llsc.h
@@ -37,7 +37,7 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
 }
 
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
int __done, __res;
 
@@ -51,7 +51,7 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
: "t");
 
if (unlikely(!__done || __res != 0))
-   __res = fail_fn(count);
+   __res = -1;
 
return __res;
 }
diff --git a/arch/x86/include/asm/mutex_32.h b/arch/x86/include/asm/mutex_32.h
index 03f90c8..0208c3c 100644
--- a/arch/x86/include/asm/mutex_32.h
+++ b/arch/x86/include/asm/mutex_32.h
@@ -42,17 +42,14 @@ do {
\
  *  __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  *  @count: pointer of type atomic_t
- *  @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if it
- * wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function

[Bug 65085] New: [radeonsi LLVM] Segfault during OpenCL kernel compilation

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=65085

          Priority: medium
            Bug ID: 65085
          Assignee: dri-devel@lists.freedesktop.org
           Summary: [radeonsi LLVM] Segfault during OpenCL kernel compilation
          Severity: normal
    Classification: Unclassified
                OS: All
          Reporter: niels_...@salscheider-online.de
          Hardware: Other
            Status: NEW
           Version: git
         Component: Drivers/Gallium/radeonsi
           Product: Mesa

Created attachment 79901
  --> https://bugs.freedesktop.org/attachment.cgi?id=79901&action=edit
Kernel that causes the segfault

I get a segmentation fault in LLVM with Tom Stellard's recent radeonsi compute
patches
(http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130520/175743.html)
with the attached kernel.

I can work around the issue by executing the if-block in the kernel
unconditionally, or with the attached patch to LLVM.
Desc.OpInfo[0].RegClass equals -1 when the segmentation fault occurs.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 65085] [radeonsi LLVM] Segfault during OpenCL kernel compilation

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=65085

--- Comment #1 from Niels Ole Salscheider  ---
Created attachment 79903
  --> https://bugs.freedesktop.org/attachment.cgi?id=79903&action=edit
Patch to work around the issue


[Bug 65085] [radeonsi LLVM] Segfault during OpenCL kernel compilation

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=65085

--- Comment #2 from Niels Ole Salscheider  ---
Created attachment 79904
  --> https://bugs.freedesktop.org/attachment.cgi?id=79904&action=edit
Full backtrace


[Bug 58901] "trying to bind memory to uninitialized GART" error at resume from suspend to memory

2013-05-28 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=58901

--- Comment #1 from Michel Dänzer 2013-05-28 16:31:21 ---
Can you bisect between 3.8.x and 3.9.x?

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.

Re: Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Daniel Vetter

On Tue, May 28, 2013 at 4:50 PM, Inki Dae  wrote:
> I think I already used reservation stuff any time in that way except
> ww-mutex. And I'm not sure that embedded system really needs ww-mutex. If
> there is any case,
> could you tell me the case? I really need more advice and understanding :)

If you have only one driver, you can get away without ww_mutex.
drm/i915 does it, all buffer state is protected by dev->struct_mutex.
But as soon as you have multiple drivers sharing buffers with dma_buf
things will blow up.

Yep, current prime is broken and can lead to deadlocks.

In practice it doesn't (yet) matter since only the X server does the
sharing dance, and that one's single-threaded. Now you can claim that
since you have all buffers pinned in embedded gfx anyway, you don't
care. But both in desktop gfx and embedded gfx the real fun starts
once you put fences into the mix and link them up with buffers, then
every command submission risks that deadlock. Furthermore you can get
unlucky and construct a circle of fences waiting on one another (though
only if the fence signalling fires off the next batchbuffer
asynchronously).

To prevent such deadlocks you _absolutely_ need to lock _all_ buffers
that take part in a command submission at once. To do that you either
need a global lock (ugh) or ww_mutexes.
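
To make the "lock everything, back off on -EDEADLK" pattern concrete,
here is a single-threaded userspace model of it. This is only a sketch
under stated assumptions: struct buf_lock, model_lock(),
model_lock_slow(), model_unlock() and lock_all() are hypothetical
stand-ins for a buffer's ww_mutex and for ww_mutex_lock(),
ww_mutex_lock_slow() and ww_mutex_unlock(); the deadlk_budget field
simply scripts how often a lock reports -EDEADLK, where the real
primitives would involve another context and sleeping.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer's ww_mutex. */
struct buf_lock {
	bool held;
	int deadlk_budget;	/* how many times model_lock() reports -EDEADLK */
};

/* Stands in for ww_mutex_lock(): may ask the caller to back off. */
static int model_lock(struct buf_lock *l)
{
	if (l->deadlk_budget > 0) {
		l->deadlk_budget--;
		return -EDEADLK;
	}
	l->held = true;
	return 0;
}

/* Stands in for ww_mutex_lock_slow(): the real one sleeps, this one just succeeds. */
static void model_lock_slow(struct buf_lock *l)
{
	l->held = true;
}

static void model_unlock(struct buf_lock *l)
{
	l->held = false;
}

/* Lock all n buffers; returns the number of backoffs. May reorder locks[]. */
static int lock_all(struct buf_lock **locks, size_t n)
{
	size_t start = 0;	/* locks[0..start) are already held */
	int backoffs = 0;

	for (;;) {
		struct buf_lock *contended;
		size_t i, j;

		for (i = start; i < n; i++)
			if (model_lock(locks[i]) == -EDEADLK)
				break;
		if (i == n)
			return backoffs;	/* every buffer is locked */

		/* Back off: drop every lock taken so far in this attempt. */
		for (j = 0; j < i; j++)
			model_unlock(locks[j]);

		/* Take the contended lock first, then retry the rest. */
		contended = locks[i];
		locks[i] = locks[0];
		locks[0] = contended;
		model_lock_slow(contended);
		start = 1;
		backoffs++;
	}
}
```

The important property is that a failed attempt never keeps holding a
partial set of buffers while waiting for another one.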

So ww_mutexes are the fundamental ingredient of all this, if you don't
see why you need them then everything piled on top is broken. I think
until you've understood why exactly we need ww_mutexes there's not
much point in discussing the finer issues of fences, reservation
objects and how to integrate it with dma_bufs exactly.

I'll try to clarify the motivating example in the ww_mutex
documentation a bit, but I dunno how else I could explain this ...

Yours, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[Bug 65091] New: power_profile not working for HD5650

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=65091

          Priority: medium
            Bug ID: 65091
          Assignee: dri-devel@lists.freedesktop.org
           Summary: power_profile not working for HD5650
          Severity: major
    Classification: Unclassified
                OS: Linux (All)
          Reporter: nevehan...@gmail.com
          Hardware: x86-64 (AMD64)
            Status: NEW
           Version: git
         Component: Drivers/DRI/R600
           Product: Mesa

I have an HD5650 with Intel i3 Ironlake (1st gen) switchable graphics.
My problem is that changing power_profile in radeon has no effect.
I have enabled KMS, and I use vgaswitcheroo to disable i915 and switch X to radeon.
Changing any of the radeon options on the kernel command line has no effect either.

The clocks always stay the same:

For auto/low/mid/high profile
# cat /sys/kernel/debug/dri/0/radeon_pm_info
default engine clock: 10 kHz
current engine clock: 0 kHz
default memory clock: 157000 kHz
current memory clock: 156930 kHz
voltage: 1000 mV
PCIE lanes: 16

It should be engine: 55 kHz and memory: 80 kHz (values taken from
Windows).

I wanted to know if you could provide a fix for it; I'm happy to help by
providing any information needed.


Re: [REGRESSION] system does not resume from ram due to commit "drm/nv50/fifo: prevent races between clients updating playlists"

2013-05-28 Thread Sven Joachim

On 2013-05-26 23:09 +0200, Maarten Maathuis wrote:

> My NV96 does not resume from suspend to ram (the screen stays black, magic
> sysrq keys do work) with the current linus git kernel, i bisected it to the
> following commit.
>
> drm/nv50/fifo: prevent races between clients updating playlists
> b5096566f6e1ee2b88324772f020ae9bc0cfa9a0
>
> It's not obvious to me how this causes problems, but reverting this commit
> does solve my problem.

Same here on my NV86.

Cheers,
   Sven

Re: [PATCH v4 3/4] mutex: Add ww tests to lib/locking-selftest.c. v4

2013-05-28 Thread Daniel Vetter

On Tue, May 28, 2013 at 04:48:45PM +0200, Maarten Lankhorst wrote:
> This stresses the lockdep code in some ways specifically useful to
> ww_mutexes. It adds checks for most of the common locking errors.
> 
> Changes since v1:
>  - Add tests to verify reservation_id is untouched.
>  - Use L() and U() macros where possible.
> 
> Changes since v2:
>  - Use the ww_mutex api directly.
>  - Use macros for most of the code.
> Changes since v3:
>  - Rework tests for the api changes.
> 
> Signed-off-by: Maarten Lankhorst 
> ---
>  lib/locking-selftest.c |  405 ++--
>  1 file changed, 386 insertions(+), 19 deletions(-)
> 
> diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
> index c3eb261..b18f1d3 100644
> --- a/lib/locking-selftest.c
> +++ b/lib/locking-selftest.c
> @@ -26,6 +26,8 @@
>   */
>  static unsigned int debug_locks_verbose;
>  
> +static DEFINE_WW_CLASS(ww_lockdep);
> +
>  static int __init setup_debug_locks_verbose(char *str)
>  {
>   get_option(&str, &debug_locks_verbose);
> @@ -42,6 +44,10 @@ __setup("debug_locks_verbose=", setup_debug_locks_verbose);
>  #define LOCKTYPE_RWLOCK  0x2
>  #define LOCKTYPE_MUTEX   0x4
>  #define LOCKTYPE_RWSEM   0x8
> +#define LOCKTYPE_WW  0x10
> +
> +static struct ww_acquire_ctx t, t2;
> +static struct ww_mutex o, o2;
>  
>  /*
>   * Normal standalone locks, for the circular and irq-context
> @@ -193,6 +199,16 @@ static void init_shared_classes(void)
>  #define RSU(x)   up_read(&rwsem_##x)
>  #define RWSI(x)  init_rwsem(&rwsem_##x)
>  
> +#define WWAI(x)  ww_acquire_init(x, &ww_lockdep)
> +#define WWAD(x)  ww_acquire_done(x)
> +#define WWAF(x)  ww_acquire_fini(x)
> +
> +#define WWL(x, c)ww_mutex_lock(x, c)
> +#define WWT(x)   ww_mutex_trylock(x)
> +#define WWL1(x)  ww_mutex_lock(x, NULL)
> +#define WWU(x)   ww_mutex_unlock(x)
> +
> +
>  #define LOCK_UNLOCK_2(x,y)   LOCK(x); LOCK(y); UNLOCK(y); UNLOCK(x)
>  
>  /*
> @@ -894,11 +910,13 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
>  # define I_RWLOCK(x) lockdep_reset_lock(&rwlock_##x.dep_map)
>  # define I_MUTEX(x)  lockdep_reset_lock(&mutex_##x.dep_map)
>  # define I_RWSEM(x)  lockdep_reset_lock(&rwsem_##x.dep_map)
> +# define I_WW(x) lockdep_reset_lock(&x.dep_map)
>  #else
>  # define I_SPINLOCK(x)
>  # define I_RWLOCK(x)
>  # define I_MUTEX(x)
>  # define I_RWSEM(x)
> +# define I_WW(x)
>  #endif
>  
>  #define I1(x)\
> @@ -920,11 +938,20 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
>  static void reset_locks(void)
>  {
>   local_irq_disable();
> + lockdep_free_key_range(&ww_lockdep.acquire_key, 1);
> + lockdep_free_key_range(&ww_lockdep.mutex_key, 1);
> +
>   I1(A); I1(B); I1(C); I1(D);
>   I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
> + I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base);
>   lockdep_reset();
>   I2(A); I2(B); I2(C); I2(D);
>   init_shared_classes();
> +
> + ww_mutex_init(&o, &ww_lockdep); ww_mutex_init(&o2, &ww_lockdep);
> + memset(&t, 0, sizeof(t)); memset(&t2, 0, sizeof(t2));
> + memset(&ww_lockdep.acquire_key, 0, sizeof(ww_lockdep.acquire_key));
> + memset(&ww_lockdep.mutex_key, 0, sizeof(ww_lockdep.mutex_key));
>   local_irq_enable();
>  }
>  
> @@ -938,7 +965,6 @@ static int unexpected_testcase_failures;
>  static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
>  {
>   unsigned long saved_preempt_count = preempt_count();
> - int expected_failure = 0;
>  
>   WARN_ON(irqs_disabled());
>  
> @@ -946,26 +972,16 @@ static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
>   /*
>* Filter out expected failures:
>*/
> + if (debug_locks != expected) {
>  #ifndef CONFIG_PROVE_LOCKING
> - if ((lockclass_mask & LOCKTYPE_SPIN) && debug_locks != expected)
> - expected_failure = 1;
> - if ((lockclass_mask & LOCKTYPE_RWLOCK) && debug_locks != expected)
> - expected_failure = 1;
> - if ((lockclass_mask & LOCKTYPE_MUTEX) && debug_locks != expected)
> - expected_failure = 1;
> - if ((lockclass_mask & LOCKTYPE_RWSEM) && debug_locks != expected)
> - expected_failure = 1;
> + expected_testcase_failures++;
> + printk("failed|");
> +#else
> + unexpected_testcase_failures++;
> + printk("FAILED|");
> +
> + dump_stack();
>  #endif
> - if (debug_locks != expected) {
> - if (expected_failure) {
> - expected_testcase_failures++;
> - printk("failed|");
> - } else {
> - unexpected_testcase_failures++;
> -
> - printk("FAILED|");
> -

[Bug 65095] New: BARTS [drm:r600_uvd_init] ERROR UVD not responding, trying to reset the VCPU!!!

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=65095

          Priority: medium
            Bug ID: 65095
          Assignee: dri-devel@lists.freedesktop.org
           Summary: BARTS [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
          Severity: major
    Classification: Unclassified
                OS: Linux (All)
          Reporter: spamjunkea...@gmail.com
          Hardware: x86-64 (AMD64)
            Status: NEW
           Version: XOrg CVS
         Component: DRM/Radeon
           Product: DRI

UVD is not working on BARTS (HD6850) with Linux 3.10-rc2.

I get
[drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
errors in the log.

I believe it's the same issue as bug 63935
(https://bugs.freedesktop.org/show_bug.cgi?id=63935).

Thanks.


[PULL] drm-intel-next for 3.11

2013-05-28 Thread Daniel Vetter

Hi Dave,

So I've figured it's time to open up drm-next with a nice pile of intel
patches. And there seems to be some other stuff pending on dri-devel
already, too ;-)

Highlights (copy-pasted from my testing cycle mails):
- fbc support for Haswell (Rodrigo)
- streamlined workaround comments, including an igt tool to grep for
  them (Damien)
- sdvo and TV out cleanups, including a fixup for sdvo multifunction devices
- refactor our eDP mess a bit (Imre)
- don't register the hdmi connector on haswell when desktop eDP is present
- vlv support is no longer preliminary!
- more vlv fixes from Jesse for stolen and dpll handling
- more flexible power well checking infrastructure from Paulo
- a few gtt patches from Ben
- a bit of OCD cleanups for transcoder #defines and an assorted pile
  of smaller things.
- fixes for the gmch modeset sequence
- a bit of OCD around plane/pipe usage (Ville)
- vlv turbo support (Jesse)
- tons of vlv modeset fixes (Jesse et al.)
- vlv pte write fixes (Kenneth Graunke)
- hpd filtering to avoid costly probes on unaffected outputs (Egbert Eich)
- intel dev_info cleanups and refactorings (Damien)
- vlv rc6 support (Jesse)
- random pile of fixes around non-24bpp modes handling
- asle/opregion cleanups and locking fixes (Jani)
- dp dpll refactoring
- improvements for reduced_clock computation on g4x/ilk+
- pfit state refactored to use pipe_config (Jesse)
- lots more computed modeset state moved to pipe_config, including readout
  and cross-check support
- fdi auto-dithering for ivb B/C links, using the neat pipe_config
  improvements
- drm_rect helpers plus sprite clipping fixes (Ville)
- hw context refcounting (Mika + Ben)

Note that the merge with Linus' tree was a bit messy, so I've also pushed
out a 2nd tag, drm-intel-next-2013-05-20-merged, which has the backmerge
that is already in my queue. Pull request for the merged tree below. Just
drop the -merged suffix if you want to have some fun ;-)

Cheers, Daniel


The following changes since commit c7788792a5e7b0d5d7f96d0766b4cb6112d47d75:

  Linux 3.10-rc2 (2013-05-20 14:37:38 -0700)

are available in the git repository at:

  git://people.freedesktop.org/~danvet/drm-intel tags/drm-intel-next-2013-05-20-merged

for you to fetch changes up to e1b73cba13a0cc68dd4f746eced15bd6bb24cda4:

  Merge tag 'v3.10-rc2' into drm-intel-next-queued (2013-05-21 09:52:16 +0200)



Ben Widawsky (3):
  drm/i915: Assert mutex_is_locked on context lookup
  drm/i915: BUG_ON bad PPGTT offset
  drm/i915: Extract PDE writes

Chris Wilson (2):
  drm/i915: Only print the info message about incresing stolen size for FBC once
  drm/i915: put context upon switching

Damien Lespiau (12):
  drm/i915: Remove mention of Haswell in DDI code
  drm/i915: Turn DEV_INFO_FLAGS into a foreach style macro
  drm/i915: Replace the line of %s by a DEV_INFO_FOR_EACH_FLAG() invocation
  drm/i915: Use DEV_INFO_FOR_EACH_FLAG() to declare flags as well
  drm/i915: Turn HAS_DDI() into a device_info flag
  drm/i915: Introduce HAS_FPGA_DBG_UNCLAIMED()
  drm/i915: Turn HAS_FPGA_DBG_UNCLAIMED into a device_info flag
  drm/i915: Ivybridge is the odd one when it comes to pipe scalers
  drm/i915: Add platform information to implemented workarounds
  drm/i915: Add references to some workaround we implement
  drm/i915: Compute WR PLL dividers dynamically
  drm/i915: Add missing platform tags to FBC workaround comments

Daniel Vetter (56):
  drm/i915: don't enable the plane too early in i9xx_crtc_mode_set
  drm/i915: drop redundant vblank waits
  drm/i915: add pipe asserts for the crtc enable sequence
  drm/i915: add i9xx pfit pipe asserts
  drm/i915: move debug output back to the right place
  drm/i915: fix VLV limits
  drm/i915: magic VLV PLL registers in the dpio sideband
  drm/i915: disable interrupts earlier in the driver unload code
  drm/i915: Disable high-bpc on pre-1.4 EDID screens
  drm/i915: Fixup non-24bpp support for VGA screens on Haswell
  drm/i915: consolidate pch pll computations a bit
  drm/i915: shovel compute clock into crtc->config.dpll on ilk
  drm/i915: move dp clock computations to encoder->compute_config
  drm/i915: use pipe_config for lvds dithering
  drm/i915: don't force matching p1 for g4x/ilk+ reduced pll settings
  drm/i915: remove redundant has_pch_encoder check
  drm/i915: simplify config->pixel_multiplier handling
  drm/i915: put the right cpu_transcoder into pipe_config for hw state readout
  drm/i915: force bpp for eDP panels
  drm/i915: drop adjusted_mode from *_set_pipeconf functions
  drm/i915: implement high-bpc + pipeconf-dither support for g4x/vlv
  drm/i915: allow high-bpc modes on DP
  drm/i915: move intel_crtc->fdi_lanes to pipe_config
  drm/i915: hw state readout support for pipe_config->fdi_lanes
  drm/i915: split up fdi_

Re: [PATCH 2/6] gpu: host1x: Fix syncpoint wait return value

2013-05-28 Thread Keith Packard

Thierry Reding  writes:


> That doesn't sound right. Maybe drmIoctl() needs fixing instead. Looking
> at the history, drmIoctl() was introduced to automatically loop if a
> signal was received (commit 8b9ab108ec1f2ba2b503f713769c4946849b3cb2).
> However the ioctl(3p) manpage doesn't mention that ioctl() returns
> EAGAIN in case it is interrupted by a signal.

EAGAIN is being returned when the GPU is wedged, to ask the application
to re-submit the request, which will presumably be held until the GPU
is un-wedged.
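
For context, the looping behaviour under discussion can be sketched as
below. This is an illustration of a retry wrapper in the style the thread
attributes to drmIoctl(), not a copy of libdrm. The ioctl_fn callback
type, retrying_ioctl() and the fake_wedged_ioctl() test double are
hypothetical names, used so the loop can be exercised without a device.

```c
#include <errno.h>

/* Hypothetical callback type so the loop can run without a real device node. */
typedef int (*ioctl_fn)(int fd, unsigned long request, void *arg);

/* Retry while the call is interrupted (EINTR) or asked to resubmit (EAGAIN). */
static int retrying_ioctl(ioctl_fn do_ioctl, int fd, unsigned long request,
			  void *arg)
{
	int ret;

	do {
		ret = do_ioctl(fd, request, arg);
	} while (ret == -1 && (errno == EINTR || errno == EAGAIN));

	return ret;
}

/* Test double: arg points at a counter of how many times to fail with EAGAIN. */
static int fake_wedged_ioctl(int fd, unsigned long request, void *arg)
{
	int *failures_left = arg;

	(void)fd;
	(void)request;
	if (*failures_left > 0) {
		(*failures_left)--;
		errno = EAGAIN;
		return -1;
	}
	return 0;
}
```

A wrapper like this only makes sense if the kernel side blocks the
resubmitted request while the GPU is wedged; otherwise the loop would
spin.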

-- 
keith.pack...@intel.com



Re: [PATCH v4 3/4] mutex: Add ww tests to lib/locking-selftest.c. v4

2013-05-28 Thread Maarten Lankhorst

On 28-05-13 21:18, Daniel Vetter wrote:
> On Tue, May 28, 2013 at 04:48:45PM +0200, Maarten Lankhorst wrote:
>> This stresses the lockdep code in some ways specifically useful to
>> ww_mutexes. It adds checks for most of the common locking errors.
>>
>> Changes since v1:
>>  - Add tests to verify reservation_id is untouched.
>>  - Use L() and U() macros where possible.
>>
>> Changes since v2:
>>  - Use the ww_mutex api directly.
>>  - Use macros for most of the code.
>> Changes since v3:
>>  - Rework tests for the api changes.
>>
>> 
>>
>> +static void ww_test_normal(void)
>> +{
>> +	int ret;
>> +
>> +	WWAI(&t);
>> +
>> +	/*
>> +	 * test if ww_id is kept identical if not
>> +	 * called with any of the ww_* locking calls
>> +	 */
>> +
>> +	/* mutex_lock (and indirectly, mutex_lock_nested) */
>> +	o.ctx = (void *)~0UL;
>> +	mutex_lock(&o.base);
>> +	mutex_unlock(&o.base);
>> +	WARN_ON(o.ctx != (void *)~0UL);
>> +
>> +	/* mutex_lock_interruptible (and *_nested) */
>> +	o.ctx = (void *)~0UL;
>> +	ret = mutex_lock_interruptible(&o.base);
>> +	if (!ret)
>> +		mutex_unlock(&o.base);
>> +	else
>> +		WARN_ON(1);
>> +	WARN_ON(o.ctx != (void *)~0UL);
>> +
>> +	/* mutex_lock_killable (and *_nested) */
>> +	o.ctx = (void *)~0UL;
>> +	ret = mutex_lock_killable(&o.base);
>> +	if (!ret)
>> +		mutex_unlock(&o.base);
>> +	else
>> +		WARN_ON(1);
>> +	WARN_ON(o.ctx != (void *)~0UL);
>> +
>> +	/* trylock, succeeding */
>> +	o.ctx = (void *)~0UL;
>> +	ret = mutex_trylock(&o.base);
>> +	WARN_ON(!ret);
>> +	if (ret)
>> +		mutex_unlock(&o.base);
>> +	else
>> +		WARN_ON(1);
>> +	WARN_ON(o.ctx != (void *)~0UL);
>> +
>> +	/* trylock, failing */
>> +	o.ctx = (void *)~0UL;
>> +	mutex_lock(&o.base);
>> +	ret = mutex_trylock(&o.base);
>> +	WARN_ON(ret);
>> +	mutex_unlock(&o.base);
>> +	WARN_ON(o.ctx != (void *)~0UL);
>> +
>> +	/* nest_lock */
>> +	o.ctx = (void *)~0UL;
>> +	mutex_lock_nest_lock(&o.base, &t);
>> +	mutex_unlock(&o.base);
>> +	WARN_ON(o.ctx != (void *)~0UL);
>> +}
> Since we don't really allow this any more (instead allow ww_mutex_lock
> without context) do we need this test here really?
Yes. This test verifies all the normal locking paths are not affected by the 
ww_ctx changes.

>> +
>> +static void ww_test_two_contexts(void)
>> +{
>> +	WWAI(&t);
>> +	WWAI(&t2);
>> +}
>> +
>> +static void ww_test_context_unlock_twice(void)
>> +{
>> +	WWAI(&t);
>> +	WWAD(&t);
>> +	WWAF(&t);
>> +	WWAF(&t);
>> +}
>> +
>> +static void ww_test_object_unlock_twice(void)
>> +{
>> +	WWL1(&o);
>> +	WWU(&o);
>> +	WWU(&o);
>> +}
>> +
>> +static void ww_test_spin_nest_unlocked(void)
>> +{
>> +	raw_spin_lock_nest_lock(&lock_A, &o.base);
>> +	U(A);
>> +}
> I don't quite see the point of this one here ...
It's a lockdep test that was missing. o.base is not locked. So lock_A is being 
nested into an unlocked lock, resulting in a lockdep error.

>> +
>> +static void ww_test_unneeded_slow(void)
>> +{
>> +	int ret;
>> +
>> +	WWAI(&t);
>> +
>> +	ww_mutex_lock_slow(&o, &t);
>> +}
> I think checking the _slow debug stuff would be neat, i.e.
> - fail/success tests for properly unlocking all held locks
> - fail/success tests for lock_slow acquiring the right lock.
>
> Otherwise I didn't spot anything that seems missing in these self-tests
> here.
>
Yes, it would be nice. Doing so is left as an exercise for the reviewer, who
failed to raise this point sooner. ;-)

~Maarten

[Bug 65095] BARTS [drm:r600_uvd_init] ERROR UVD not responding, trying to reset the VCPU!!!

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=65095

Erdem U. Altınyurt  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

-- 
You are receiving this mail because:
You are the assignee for the bug.

RE: Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Inki Dae



> -Original Message-
> From: daniel.vet...@ffwll.ch [mailto:daniel.vet...@ffwll.ch] On Behalf Of
> Daniel Vetter
> Sent: Wednesday, May 29, 2013 1:50 AM
> To: Inki Dae
> Cc: Rob Clark; Maarten Lankhorst; linux-fbdev; YoungJun Cho; Kyungmin Park;
> myungjoo.ham; DRI mailing list; linux-arm-ker...@lists.infradead.org;
> linux-me...@vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Tue, May 28, 2013 at 4:50 PM, Inki Dae  wrote:
> > I think I already used reservation stuff any time in that way except
> > ww-mutex. And I'm not sure that embedded system really needs ww-mutex. If
> > there is any case, could you tell me the case? I really need more advice
> > and understanding :)
> 
> If you have only one driver, you can get away without ww_mutex.
> drm/i915 does it, all buffer state is protected by dev->struct_mutex.
> But as soon as you have multiple drivers sharing buffers with dma_buf
> things will blow up.
> 
> Yep, current prime is broken and can lead to deadlocks.
> 
> In practice it doesn't (yet) matter since only the X server does the
> sharing dance, and that one's single-threaded. Now you can claim that
> since you have all buffers pinned in embedded gfx anyway, you don't
> care. But both in desktop gfx and embedded gfx the real fun starts
> once you put fences into the mix and link them up with buffers, then
> every command submission risks that deadlock. Furthermore you can get
> unlucky and construct a circle of fences waiting on each other (though
> only if the fence signalling fires off the next batchbuffer
> asynchronously).

In our case, we have never hit a deadlock so far, but one is still possible
when a process shares two buffers with another process, as below:
Process A has committed buffer A and waits for buffer B,
Process B has committed buffer B and waits for buffer A.

That is a deadlock, and it seems you are saying we can resolve this deadlock
issue with ww-mutexes. It also seems that we could replace our block-wakeup
mechanism with a mutex lock for better performance.
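
The A/B scenario above can be sketched in userspace with plain pthread
mutexes. This is only an analogy under stated assumptions, not the kernel
ww_mutex API: lock_both and unlock_both are hypothetical helpers, and the
back-off is a simple trylock-and-retry. It has no acquire context deciding
which party must back off, which is what ww_mutex adds.

```c
#include <pthread.h>
#include <sched.h>

/* Take both locks without risking an A-B / B-A deadlock: if the second
 * lock is contended, drop the first one and retry, starting with the lock
 * that was contended. */
static void lock_both(pthread_mutex_t *first, pthread_mutex_t *second)
{
	for (;;) {
		pthread_mutex_t *tmp;

		pthread_mutex_lock(first);
		if (pthread_mutex_trylock(second) == 0)
			return;	/* both locks are held now */

		/* Back off so the other party can make progress. */
		pthread_mutex_unlock(first);
		sched_yield();

		/* Next round, block on the lock we failed to get. */
		tmp = first;
		first = second;
		second = tmp;
	}
}

static void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
	pthread_mutex_unlock(a);
	pthread_mutex_unlock(b);
}
```

Process A would call lock_both(&buf_a, &buf_b) and process B
lock_both(&buf_b, &buf_a); neither ever sleeps while holding only one of
the two locks.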

> 
> To prevent such deadlocks you _absolutely_ need to lock _all_ buffers
> that take part in a command submission at once. To do that you either
> need a global lock (ugh) or ww_mutexes.
> 
> So ww_mutexes are the fundamental ingredient of all this, if you don't
> see why you need them then everything piled on top is broken. I think
> until you've understood why exactly we need ww_mutexes there's not
> much point in discussing the finer issues of fences, reservation
> objects and how to integrate it with dma_bufs exactly.
> 
> I'll try to clarify the motivating example in the ww_mutex
> documentation a bit, but I dunno how else I could explain this ...
> 

I don't really want you to waste your time on me. I will try applying
ww-mutexes (v4) to the proposed framework to understand them better.

Thanks for your advice. :)
Inki Dae

> Yours, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch


[Bug 64471] Radeon HD6570 lockup in Brütal Legend with HyperZ

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=64471

Alex Deucher  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Radeon HD6570 lockup in     |Radeon HD6570 lockup in
                   |Brütal Legend               |Brütal Legend with HyperZ
            Product|DRI                         |Mesa
            Version|XOrg CVS                    |git
          Component|DRM/Radeon                  |Drivers/Gallium/r600

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 64471] Radeon HD6570 lockup in Brütal Legend with HyperZ

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=64471

Alex Deucher  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |thomas.lindr...@gmail.com

--- Comment #6 from Alex Deucher  ---
*** Bug 64933 has been marked as a duplicate of this bug. ***

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 64933] Hyperz related gpu lockup on git mesa in Brütal Legend

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=64933

Alex Deucher  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #1 from Alex Deucher  ---


*** This bug has been marked as a duplicate of bug 64471 ***

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 64983] X3 Terran Conflict displays strange colors.

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=64983

Alex Deucher  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #79788|text/plain                  |image/jpeg
          mime type|                            |

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 65068] vgaswitcheroo doesn't deal with powered off dGPU on resume

2013-05-28 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=65068

Alex Deucher  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|AtomBIOS stuck after        |vgaswitcheroo doesn't deal
                   |suspend/resume cycle whilst |with powered off dGPU on
                   |GPU turned off              |resume

--- Comment #1 from Alex Deucher  ---
It looks like vgaswitcheroo doesn't properly enable the dGPU on resume, so the
driver tries to resume disabled hardware.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 63935] TURKS [drm:r600_uvd_init] ERROR UVD not responding, trying to reset the VCPU!!!

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=63935

--- Comment #49 from Erdem U. Altınyurt  ---
Patch doesn't fix the issue on radeon HD 6850 (BARTS 0x1002:0x6739
0x174B:0xE174)
kernel 3.10-rc3 + SUMO_uvs.patch
(https://bugs.freedesktop.org/attachment.cgi?id=79663)

Still got
[drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
errors. :(

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 63935] TURKS [drm:r600_uvd_init] ERROR UVD not responding, trying to reset the VCPU!!!

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=63935

--- Comment #50 from Erdem U. Altınyurt  ---
Also applied the SUMO2 patch from
http://lists.freedesktop.org/archives/dri-devel/2013-May/038894.html

'''Still no success!'''

Still got [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the
VCPU!!!

-- 
You are receiving this mail because:
You are the assignee for the bug.

Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Inki Dae



> -Original Message-
> From: Maarten Lankhorst [mailto:maarten.lankhorst at canonical.com]
> Sent: Tuesday, May 28, 2013 12:23 AM
> To: Inki Dae
> Cc: 'Daniel Vetter'; 'Rob Clark'; 'linux-fbdev'; 'YoungJun Cho'; 'Kyungmin
> Park'; 'myungjoo.ham'; 'DRI mailing list'; linux-arm-
> kernel at lists.infradead.org; linux-media at vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> Hey,
> 
> Op 27-05-13 12:38, Inki Dae schreef:
> > Hi all,
> >
> > I have been removed previous branch and added new one with more cleanup.
> > This time, the fence helper doesn't include user side interfaces and
> cache
> > operation relevant codes anymore because not only we are not sure that
> > coupling those two things, synchronizing caches and buffer access
> between
> > CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel side is
> a
> > good idea yet but also existing codes for user side have problems with
> badly
> > behaved or crashing userspace. So this could be more discussed later.
> >
> > The below is a new branch,
> >
> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/?h=dma-f
> > ence-helper
> >
> > And fence helper codes,
> >
> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/commit/?
> > h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> >
> > And example codes for device driver,
> >
> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/commit/?
> > h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> >
> > I think the time is not yet ripe for RFC posting: maybe existing dma
> fence
> > and reservation need more review and addition work. So I'd glad for
> somebody
> > giving other opinions and advices in advance before RFC posting.
> >
> NAK.
> 
> For examples for how to handle locking properly, see Documentation/ww-
> mutex-design.txt in my recent tree.
> I could list what I believe is wrong with your implementation, but real
> problem is that the approach you're taking is wrong.

I just removed the ticket stubs to show you my approach as simply as
possible, and I just wanted to show that we could use a buffer synchronization
mechanism without ticket stubs.

A question: can WW-Mutexes be used for all devices? I guess this depends on
x86 GPUs: such a GPU has VRAM, which means a different memory domain.
And could you tell me why a shared fence should have only eight objects? I
think we could need more than eight objects for read access. Anyway, I don't
think I fully understand this yet, so I may be missing a point.

Thanks,
Inki Dae

> 

> ~Maarten

Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Inki Dae



> -Original Message-
> From: linux-fbdev-owner at vger.kernel.org [mailto:linux-fbdev-
> owner at vger.kernel.org] On Behalf Of Rob Clark
> Sent: Tuesday, May 28, 2013 12:48 AM
> To: Inki Dae
> Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho; Kyungmin
> Park; myungjoo.ham; DRI mailing list;
> linux-arm-kernel at lists.infradead.org;
> linux-media at vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Mon, May 27, 2013 at 6:38 AM, Inki Dae  wrote:
> > Hi all,
> >
> > I have been removed previous branch and added new one with more cleanup.
> > This time, the fence helper doesn't include user side interfaces and
> cache
> > operation relevant codes anymore because not only we are not sure that
> > coupling those two things, synchronizing caches and buffer access
> between
> > CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel side is
> a
> > good idea yet but also existing codes for user side have problems with
> badly
> > behaved or crashing userspace. So this could be more discussed later.
> >
> > The below is a new branch,
> >
> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/?h=dma-f
> > ence-helper
> >
> > And fence helper codes,
> >
> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/commit/?
> > h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> >
> > And example codes for device driver,
> >
> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/commit/?
> > h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> >
> > I think the time is not yet ripe for RFC posting: maybe existing dma
> fence
> > and reservation need more review and addition work. So I'd glad for
> somebody
> > giving other opinions and advices in advance before RFC posting.
> 
> thoughts from a *really* quick, pre-coffee, first look:
> * any sort of helper to simplify single-buffer sort of use-cases (v4l)
> probably wouldn't want to bake in assumption that seqno_fence is used.
> * I guess g2d is probably not actually a simple use case, since I
> expect you can submit blits involving multiple buffers :-P

I don't think so. One or more buffers can be used: seqno_fence also has
only one buffer. Actually, we have already applied this approach to most
devices: multimedia, GPU and display controller. This approach performs
better, with reduced power consumption compared to the traditional way. The
g2d example is just there to show you how to apply my approach to a device
driver.

> * otherwise, you probably don't want to depend on dmabuf, which is why
> reservation/fence is split out the way it is..  you want to be able to
> use a single reservation/fence mechanism within your driver without
> having to care about which buffers are exported to dmabuf's and which
> are not.  Creating a dmabuf for every GEM bo is too heavyweight.

Right. But I think we should deal with this separately. Actually, we are
trying to use reservation for GPU pipeline synchronization such as the sgx
sync object, and this approach is used without dmabuf. In other words, some
devices can use only reservation for such pipeline synchronization and, at the
same time, the fence helper or something similar with dmabuf for buffer
synchronization.

> 
> I'm not entirely sure if reservation/fence could/should be made any
> simpler for multi-buffer users.  Probably the best thing to do is just
> get reservation/fence rolled out in a few drivers and see if some
> common patterns emerge.
> 
> BR,
> -R
> 
> >
> > Thanks,
> > Inki Dae
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fbdev" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Inki Dae



> -Original Message-
> From: daniel.vetter at ffwll.ch [mailto:daniel.vetter at ffwll.ch] On Behalf 
> Of
> Daniel Vetter
> Sent: Tuesday, May 28, 2013 1:02 AM
> To: Rob Clark
> Cc: Inki Dae; Maarten Lankhorst; linux-fbdev; YoungJun Cho; Kyungmin Park;
> myungjoo.ham; DRI mailing list; linux-arm-kernel at lists.infradead.org;
> linux-media at vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Mon, May 27, 2013 at 5:47 PM, Rob Clark  wrote:
> > On Mon, May 27, 2013 at 6:38 AM, Inki Dae  wrote:
> >> Hi all,
> >>
> >> I have been removed previous branch and added new one with more
> >> cleanup.
> >> This time, the fence helper doesn't include user side interfaces and
> cache
> >> operation relevant codes anymore because not only we are not sure that
> >> coupling those two things, synchronizing caches and buffer access
> between
> >> CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel side
> is a
> >> good idea yet but also existing codes for user side have problems with
> badly
> >> behaved or crashing userspace. So this could be more discussed later.
> >>
> >> The below is a new branch,
> >>
> >> https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/?h=dma-f
> >> ence-helper
> >>
> >> And fence helper codes,
> >>
> >> https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/commit/?
> >> h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> >>
> >> And example codes for device driver,
> >>
> >> https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> exynos.git/commit/?
> >> h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> >>
> >> I think the time is not yet ripe for RFC posting: maybe existing dma
> fence
> >> and reservation need more review and addition work. So I'd glad for
> somebody
> >> giving other opinions and advices in advance before RFC posting.
> >
> > thoughts from a *really* quick, pre-coffee, first look:
> > * any sort of helper to simplify single-buffer sort of use-cases (v4l)
> > probably wouldn't want to bake in assumption that seqno_fence is used.
> 
> Yeah, which is why Maarten&I discussed ideas already for what needs to
> be improved in the current dma-buf interface code to make this Just
> Work. At least as long as a driver doesn't want to add new fences,
> which would be especially useful for all kinds of gpu access.
> 
> > * I guess g2d is probably not actually a simple use case, since I
> > expect you can submit blits involving multiple buffers :-P
> 
> Yeah, on a quick read the current fence helper code seems to be a bit
> limited in scope.
> 
> > * otherwise, you probably don't want to depend on dmabuf, which is why
> > reservation/fence is split out the way it is..  you want to be able to
> > use a single reservation/fence mechanism within your driver without
> > having to care about which buffers are exported to dmabuf's and which
> > are not.  Creating a dmabuf for every GEM bo is too heavyweight.
> 
> That's pretty much the reason that reservations are free-standing from
> dma_bufs. The idea is to embed them into the gem/ttm/v4l buffer
> object. Maarten also has some helpers to keep track of multi-buffer
> ww_mutex locking and fence attaching in his reservation helpers, but I
> think we should wait with those until we have drivers using them.
> 
> For now I think the priority should be to get the basic stuff in and
> ttm as the first user established. Then we can go nuts later on.
> 
> > I'm not entirely sure if reservation/fence could/should be made any
> > simpler for multi-buffer users.  Probably the best thing to do is just
> > get reservation/fence rolled out in a few drivers and see if some
> > common patterns emerge.
> 
> I think we can make the 1 buffer per dma op (i.e. 1:1
> dma_buf->reservation : fence mapping) work fairly simple in dma_buf
> with maybe a dma_buf_attachment_start_dma/end_dma helpers. But there's
> also still the open that currently there's no way to flush cpu caches
> for dma access without unmapping the attachement (or resorting to


That is what I was trying to do by adding user interfaces to dmabuf: coupling
cache synchronization and buffer access between CPU and CPU, CPU and DMA, and
DMA and DMA with fences on the kernel side. We need something to do between
mapping and unmapping an attachment.

> trick which might not work), so we have a few gaping holes in the
> interface already anyway.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

[PATCH] drm/tegra: add support for runtime pm

2013-05-28 Thread Terje Bergström

On 27.05.2013 18:45, Thierry Reding wrote:
> On Mon, May 27, 2013 at 07:19:28PM +0530, Mayuresh Kulkarni wrote:
>> +#ifdef CONFIG_PM_RUNTIME
>> +static int host1x_runtime_suspend(struct device *dev)
>> +{
>> +struct host1x *host;
>> +
>> +host = dev_get_drvdata(dev);
>> +if (IS_ERR_OR_NULL(host))
> 
> I think a simple
> 
> 	if (!host)
> 		return -EINVAL;
> 
> would be enough here. The driver-data of the device should never be an
> ERR_PTR()-encoded value, but either a valid pointer to a host1x object
> or NULL.

True, we should avoid IS_ERR_OR_NULL() like the plague. We always know whether
the called API returns NULL on error or an error code. In the case of an
error code we should just propagate it.
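
To illustrate the two conventions, here is a minimal userspace sketch. The
ERR_PTR/PTR_ERR/IS_ERR helpers are simplified re-implementations of the
kernel macros, and find_object, open_object, use_find and use_open are
hypothetical names made up for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel's err.h helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int the_object = 42;

/* Convention A (hypothetical API): returns NULL on failure. */
static int *find_object(int id)
{
	return id == 1 ? &the_object : NULL;
}

/* Convention B (hypothetical API): returns an ERR_PTR() value, never NULL. */
static int *open_object(int id)
{
	return id == 1 ? &the_object : ERR_PTR(-ENOENT);
}

static int use_find(int id)
{
	int *obj = find_object(id);

	if (!obj)		/* a plain NULL check is all that is needed */
		return -EINVAL;
	return *obj;
}

static int use_open(int id)
{
	int *obj = open_object(id);

	if (IS_ERR(obj))	/* propagate the encoded error code */
		return (int)PTR_ERR(obj);
	return *obj;
}
```

Each caller checks exactly the failure form its callee can produce, so
neither needs a combined IS_ERR_OR_NULL() style test.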

> Same comments apply here. Also I think it might be a good idea to split
> the host1x and gr2d changes into separate patches.

That's a bit tricky, but doable. We just need to enable it for 2D first,
and then host1x to keep bisectability.

>>  static void action_submit_complete(struct host1x_waitlist *waiter)
>>  {
>> +	int completed = waiter->count;
>>  	struct host1x_channel *channel = waiter->data;
>>  
>> +	/* disable clocks for all the submits that got completed in this lot */
>> +	while (completed--)
>> +		pm_runtime_put(channel->dev);
>> +
>>  	host1x_cdma_update(&channel->cdma);
>>  
>> -	/*  Add nr_completed to trace */
>> +	/* Add nr_completed to trace */
>>  	trace_host1x_channel_submit_complete(dev_name(channel->dev),
>>  			waiter->count, waiter->thresh);
>> -
>>  }
> 
> This feels hackish. But I can't see any better place to do this. Terje,
> Arto: any ideas how we can do this in a cleaner way? If there's nothing
> better then maybe moving the code into a separate function, say
> host1x_waitlist_complete(), might make this less awkward?

Yeah, it's a bit awkward. action_submit_complete() actually does handle
completion of multiple jobs, and we do one pm_runtime_get() per job.

We could do pm_runtime_put() in host1x_cdma_update(). It anyway goes
through each job that is completed, so while freeing the job it could as
well call runtime PM. That way we could even remove the waiter->count
variable altogether as it's not needed anymore.
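
A rough userspace model of that idea, under stated assumptions: the
fake_* names are hypothetical, and a plain counter stands in for the
runtime PM usage count. It only shows the balancing, one get per submitted
job and one put per job retired while walking the queue, not the real
host1x code.

```c
#include <stddef.h>

/* Stand-in for a device with a runtime PM usage counter. */
struct fake_dev {
	int usage_count;
};

struct fake_job {
	int done;	/* hardware has finished the job */
	int freed;	/* job has been retired by the update walk */
};

static void fake_pm_get(struct fake_dev *dev)
{
	dev->usage_count++;
}

static void fake_pm_put(struct fake_dev *dev)
{
	dev->usage_count--;
}

/* One get per submitted job. */
static void fake_submit(struct fake_dev *dev, struct fake_job *job)
{
	job->done = 0;
	job->freed = 0;
	fake_pm_get(dev);
}

/* Walk the queue in order and retire completed jobs; the put happens at
 * the point where each job is freed, so no separate count is needed. */
static size_t fake_cdma_update(struct fake_dev *dev, struct fake_job *jobs,
			       size_t n)
{
	size_t retired = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (jobs[i].freed)
			continue;
		if (!jobs[i].done)
			break;	/* later jobs cannot be complete yet */
		jobs[i].freed = 1;
		fake_pm_put(dev);
		retired++;
	}
	return retired;
}
```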

The not-so-beautiful aspect is that we do pm_runtime_get() in
host1x_channel.c and pm_runtime_put() in host1x_cdma.c. For code
readability it'd be great to have them in the same file. I actually get
questions every now and then in downstream because these operations are
done in different files.

Terje

[git pull] drm fixes

2013-05-28 Thread Dave Airlie


Hi Linus,

this is mostly exynos and intel fixes, along with some vblank patches I 
lost from Rob a few months ago that make wayland work better on lots of 
GPUs, also a qxl kconfig fix.

Dave.

The following changes since commit b91fd4d5aad0c1124654341814067ca3f59490fc:

  Merge tag 'pci-v3.10-fixes-2' of 
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci (2013-05-23 13:50:53 
-0700)

are available in the git repository at:


  git://people.freedesktop.org/~airlied/linux.git drm-fixes

for you to fetch changes up to c89b65e7fffef745bdd36c372aa0dea778fecbab:

  qxl: fix Kconfig deps - select FB_DEFERRED_IO (2013-05-28 17:03:37 +1000)


Andrew Jones (1):
  qxl: fix Kconfig deps - select FB_DEFERRED_IO

Chris Wilson (1):
  drm/i915: Propagate errors back from fb set-base

Dave Airlie (3):
  Merge remote-tracking branch 'pfdo/drm-fixes' into drm-next
  Merge branch 'exynos-drm-fixes' of 
git://git.kernel.org/.../daeinki/drm-exynos into drm-next
  Merge branch 'drm-intel-fixes' of 
git://people.freedesktop.org/~danvet/drm-intel into drm-next

Imre Deak (5):
  drm/i915: force full modeset if the connector is in DPMS OFF mode
  drm/i915: add msecs_to_jiffies_timeout to guarantee minimum duration
  drm/i915: use msecs_to_jiffies_timeout instead of open coding the same
  drm/i915: avoid premature timeouts in __wait_seqno()
  drm/i915: avoid premature DP AUX timeouts

Inki Dae (1):
  drm/exynos: wait for the completion of pending page flip

Lars-Peter Clausen (1):
  drm/exynos: exynos_hdmi: Pass correct pointer to free_irq()

Rob Clark (6):
  drm/nouveau: use drm_send_vblank_event() helper
  drm/radeon: use drm_send_vblank_event() helper
  drm/shmob: use drm_send_vblank_event() helper
  drm/imx: use drm_send_vblank_event() helper
  drm/exynos: page flip fixes
  drm/exynos: use drm_send_vblank_event() helper

Rodrigo Vivi (1):
  drm/i915: Adding more reserved PCI IDs for Haswell.

Sachin Kamat (2):
  drm/exynos: exynos_drm_fbdev: Fix incorrect usage of IS_ERR_OR_NULL
  drm/exynos: exynos_drm_ipp: Fix incorrect usage of IS_ERR_OR_NULL

Seung-Woo Kim (4):
  drm/exynos: cleanup device pointer usages
  drm/exynos: fix build warnings from ipp fimc
  drm/exynos: remove unnecessary devm_kfree
  drm/exynos: replace request_threaded_irq with devm function

 drivers/gpu/drm/exynos/exynos_drm_crtc.c| 27 ++--
 drivers/gpu/drm/exynos/exynos_drm_fbdev.c   |  2 +-
 drivers/gpu/drm/exynos/exynos_drm_fimc.c| 12 +++
 drivers/gpu/drm/exynos/exynos_drm_fimd.c| 10 +++---
 drivers/gpu/drm/exynos/exynos_drm_g2d.c |  6 ++--
 drivers/gpu/drm/exynos/exynos_drm_gsc.c | 12 ++-
 drivers/gpu/drm/exynos/exynos_drm_hdmi.c|  2 +-
 drivers/gpu/drm/exynos/exynos_drm_ipp.c | 18 +--
 drivers/gpu/drm/exynos/exynos_drm_rotator.c | 13 ++--
 drivers/gpu/drm/exynos/exynos_drm_vidi.c|  4 +--
 drivers/gpu/drm/exynos/exynos_hdmi.c| 21 +
 drivers/gpu/drm/exynos/exynos_mixer.c   | 14 -
 drivers/gpu/drm/i915/i915_drv.c | 46 ---
 drivers/gpu/drm/i915/i915_drv.h | 15 +
 drivers/gpu/drm/i915/i915_gem.c |  2 +-
 drivers/gpu/drm/i915/intel_display.c| 49 +++--
 drivers/gpu/drm/i915/intel_dp.c |  2 +-
 drivers/gpu/drm/i915/intel_i2c.c|  5 +--
 drivers/gpu/drm/nouveau/nouveau_display.c   | 13 ++--
 drivers/gpu/drm/qxl/Kconfig |  1 +
 drivers/gpu/drm/radeon/radeon_display.c | 13 ++--
 drivers/gpu/drm/shmobile/shmob_drm_crtc.c   | 19 +++
 drivers/staging/imx-drm/ipuv3-crtc.c| 21 ++---
 23 files changed, 163 insertions(+), 164 deletions(-)

Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Maarten Lankhorst

Hey,

Op 28-05-13 04:49, Inki Dae schreef:
>
>> -Original Message-
>> From: Maarten Lankhorst [mailto:maarten.lankhorst at canonical.com]
>> Sent: Tuesday, May 28, 2013 12:23 AM
>> To: Inki Dae
>> Cc: 'Daniel Vetter'; 'Rob Clark'; 'linux-fbdev'; 'YoungJun Cho'; 'Kyungmin
>> Park'; 'myungjoo.ham'; 'DRI mailing list'; linux-arm-
>> kernel at lists.infradead.org; linux-media at vger.kernel.org
>> Subject: Re: Introduce a new helper framework for buffer synchronization
>>
>> Hey,
>>
>> Op 27-05-13 12:38, Inki Dae schreef:
>>> Hi all,
>>>
>>> I have been removed previous branch and added new one with more cleanup.
>>> This time, the fence helper doesn't include user side interfaces and
>> cache
>>> operation relevant codes anymore because not only we are not sure that
>>> coupling those two things, synchronizing caches and buffer access
>> between
>>> CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel side is
>> a
>>> good idea yet but also existing codes for user side have problems with
>> badly
>>> behaved or crashing userspace. So this could be more discussed later.
>>>
>>> The below is a new branch,
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
>> exynos.git/?h=dma-f
>>> ence-helper
>>>
>>> And fence helper codes,
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
>> exynos.git/commit/?
>>> h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
>>>
>>> And example codes for device driver,
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
>> exynos.git/commit/?
>>> h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
>>>
>>> I think the time is not yet ripe for RFC posting: maybe existing dma
>> fence
>>> and reservation need more review and addition work. So I'd glad for
>> somebody
>>> giving other opinions and advices in advance before RFC posting.
>>>
>> NAK.
>>
>> For examples for how to handle locking properly, see Documentation/ww-
>> mutex-design.txt in my recent tree.
>> I could list what I believe is wrong with your implementation, but real
>> problem is that the approach you're taking is wrong.
> I just removed ticket stubs to show my approach you guys as simple as
> possible, and I just wanted to show that we could use buffer synchronization
> mechanism without ticket stubs.
The tickets have been removed in favor of a ww_context. Moving it in as a base
primitive allows more locking abuse to be detected, and makes some other things
easier too.

> Question, WW-Mutexes could be used for all devices? I guess this has
> dependence on x86 gpu: gpu has VRAM and it means different memory domain.
> And could you tell my why shared fence should have only eight objects? I
> think we could need more than eight objects for read access. Anyway I think
> I don't surely understand yet so there might be my missing point.
Yes, ww mutexes are not limited in any way to x86. They're a locking mechanism.
When you have acquired the ww mutexes for all buffer objects, all that says is
that at that point in time you hold the locks of all bo's exclusively.

After locking everything you can read the fence pointers safely, queue waits,
and set a new fence pointer on all reservation_objects. You only need a single
fence on all those objects, so 8 is plenty. Nonetheless this was a limitation
of my earlier design, and I'll dynamically allocate fence_shared in the future.

~Maarten

[PATCH 1/2] drm/exynos: fix WINDOWS_NR checking to vidi driver

2013-05-28 Thread Inki Dae

This patch fixes the range check on the win_data array index:
win must be less than WINDOWS_NR, so reject win >= WINDOWS_NR.

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 drivers/gpu/drm/exynos/exynos_drm_vidi.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_vidi.c b/drivers/gpu/drm/exynos/exynos_drm_vidi.c
index 24376c1..11a016d 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_vidi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_vidi.c
@@ -282,7 +282,7 @@ static void vidi_win_mode_set(struct device *dev,
 	if (win == DEFAULT_ZPOS)
 		win = ctx->default_win;

-	if (win < 0 || win > WINDOWS_NR)
+	if (win < 0 || win >= WINDOWS_NR)
 		return;

 	offset = overlay->fb_x * (overlay->bpp >> 3);
@@ -332,7 +332,7 @@ static void vidi_win_commit(struct device *dev, int zpos)
 	if (win == DEFAULT_ZPOS)
 		win = ctx->default_win;

-	if (win < 0 || win > WINDOWS_NR)
+	if (win < 0 || win >= WINDOWS_NR)
 		return;

 	win_data = &ctx->win_data[win];
@@ -356,7 +356,7 @@ static void vidi_win_disable(struct device *dev, int zpos)
 	if (win == DEFAULT_ZPOS)
 		win = ctx->default_win;

-	if (win < 0 || win > WINDOWS_NR)
+	if (win < 0 || win >= WINDOWS_NR)
 		return;

 	win_data = &ctx->win_data[win];
-- 
1.7.5.4
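
For illustration, the off-by-one addressed by the patch above can be reproduced in a small userspace sketch. The value chosen for WINDOWS_NR and the helper names are assumptions made for this example only; they are not taken from the driver.

```c
#include <stdbool.h>

/* Hypothetical array size for the example; the driver defines its own. */
#define WINDOWS_NR 3

/* Old check: lets win == WINDOWS_NR through, one past the last valid slot. */
static bool win_in_range_old(int win)
{
	return !(win < 0 || win > WINDOWS_NR);
}

/* Fixed check: valid indices are 0 .. WINDOWS_NR - 1. */
static bool win_in_range_fixed(int win)
{
	return !(win < 0 || win >= WINDOWS_NR);
}
```

With an array declared as win_data[WINDOWS_NR], the old check would accept an index equal to WINDOWS_NR and the later array access would read past the end; the fixed check rejects it.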

drm/exynos: make overlay data to be updated to valid hw

2013-05-28 Thread Inki Dae

This patch makes sure that overlay data is updated to the hardware
that is actually enabled when a framebuffer is released.
To do so, it checks which crtc is using the framebuffer and
waits for signal synchronization only on the encoders
attached to that crtc.

Signed-off-by: Inki Dae 
Signed-off-by: Kyungmin Park 
---
 drivers/gpu/drm/exynos/exynos_drm_encoder.c |9 ++---
 drivers/gpu/drm/exynos/exynos_drm_encoder.h |2 +-
 drivers/gpu/drm/exynos/exynos_drm_fb.c  |   13 +++--
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_encoder.c b/drivers/gpu/drm/exynos/exynos_drm_encoder.c
index c63721f..9a6e3fd 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_encoder.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_encoder.c
@@ -220,18 +220,21 @@ static void exynos_drm_encoder_commit(struct drm_encoder *encoder)
 	exynos_encoder->dpms = DRM_MODE_DPMS_ON;
 }
 
-void exynos_drm_encoder_complete_scanout(struct drm_framebuffer *fb)
+void exynos_drm_encoder_complete_scanout(struct drm_crtc *crtc)
 {
 	struct exynos_drm_encoder *exynos_encoder;
 	struct exynos_drm_manager_ops *ops;
-	struct drm_device *dev = fb->dev;
+	struct drm_device *dev = crtc->dev;
 	struct drm_encoder *encoder;
 
 	/*
 	 * make sure that overlay data are updated to real hardware
-	 * for all encoders.
+	 * for valid encoders.
 	 */
 	list_for_each_entry(encoder, &dev->mode_config.encoder_list, head) {
+		if (encoder->crtc != crtc)
+			continue;
+
 		exynos_encoder = to_exynos_encoder(encoder);
 		ops = exynos_encoder->manager->ops;
 
diff --git a/drivers/gpu/drm/exynos/exynos_drm_encoder.h b/drivers/gpu/drm/exynos/exynos_drm_encoder.h
index 89e2fb0..e8dee1c 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_encoder.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_encoder.h
@@ -32,6 +32,6 @@ void exynos_drm_encoder_plane_mode_set(struct drm_encoder *encoder, void *data);
 void exynos_drm_encoder_plane_commit(struct drm_encoder *encoder, void *data);
 void exynos_drm_encoder_plane_enable(struct drm_encoder *encoder, void *data);
 void exynos_drm_encoder_plane_disable(struct drm_encoder *encoder, void *data);
-void exynos_drm_encoder_complete_scanout(struct drm_framebuffer *fb);
+void exynos_drm_encoder_complete_scanout(struct drm_crtc *crtc);
 
 #endif
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fb.c b/drivers/gpu/drm/exynos/exynos_drm_fb.c
index 0e04f4e..1fc7ae6 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fb.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fb.c
@@ -68,12 +68,21 @@ static int check_fb_gem_memory_type(struct drm_device *drm_dev,
 static void exynos_drm_fb_destroy(struct drm_framebuffer *fb)
 {
 	struct exynos_drm_fb *exynos_fb = to_exynos_fb(fb);
+	struct drm_device *dev = fb->dev;
+	struct drm_crtc *crtc;
 	unsigned int i;
 
 	DRM_DEBUG_KMS("%s\n", __FILE__);
 
-	/* make sure that overlay data are updated before relesing fb. */
-	exynos_drm_encoder_complete_scanout(fb);
+	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+		if (crtc->fb == fb) {
+			/*
+			 * make sure that overlay data are updated before
+			 * relesing fb.
+			 */
+			exynos_drm_encoder_complete_scanout(crtc);
+		}
+	}
 
 	drm_framebuffer_cleanup(fb);
 
-- 
1.7.5.4
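
The filtering idea in the patch above (wait only on encoders attached to one crtc rather than on every encoder of the device) can be sketched in plain C. This is a toy model, not driver code: the struct layouts, the array in place of the kernel list, and the waited flag standing in for the manager's wait operation are all assumptions made for the example.

```c
#include <stddef.h>

struct crtc {
	int id;
};

struct encoder {
	struct crtc *crtc;	/* crtc this encoder is attached to, or NULL */
	int waited;		/* set when we "wait" on this encoder */
};

/* Wait only on encoders driven by the given crtc; returns how many. */
static int complete_scanout(struct encoder *encs, size_t n, struct crtc *crtc)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		if (encs[i].crtc != crtc)
			continue;
		encs[i].waited = 1;	/* stand-in for the real wait op */
		count++;
	}
	return count;
}
```

Encoders bound to another crtc, or to none at all, are skipped, which is what the added `encoder->crtc != crtc` test achieves in the patch.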

[Bug 65068] New: AtomBIOS stuck after suspend/resume cycle whilst GPU turned off

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=65068

          Priority: medium
            Bug ID: 65068
          Assignee: dri-devel at lists.freedesktop.org
           Summary: AtomBIOS stuck after suspend/resume cycle whilst GPU
                    turned off
          Severity: normal
    Classification: Unclassified
                OS: All
          Reporter: austin.lund at gmail.com
          Hardware: Other
            Status: NEW
           Version: XOrg CVS
         Component: DRM/Radeon
           Product: DRI

Created attachment 79884
  --> https://bugs.freedesktop.org/attachment.cgi?id=79884&action=edit
dmesg output when trying to switch back to radeon gpu.

I have two GPUs in my system:

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Whistler [Radeon HD 6600M/6700M/7600M Series]

00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core
Processor Family Integrated Graphics Controller (rev 09)

This is a macbookpro8,2 and hence the gmuxer is controlled by the apple-gmux
driver.

If I suspend the system to ram whilst on the integrated gpu (i.e. the intel
gpu), then after resume switch back to the radeon, I get a GPU hang.

I've attached the dmesg output that I get when I try this.

I'm using linux 3.10-rc3.  I don't have X running when doing this
(vgaswitcheroo won't allow this).

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 63935] TURKS [drm:r600_uvd_init] ERROR UVD not responding, trying to reset the VCPU!!!

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=63935

--- Comment #51 from Christian König ---
(In reply to comment #50)
> Also patched SUMO2 patch from
> http://lists.freedesktop.org/archives/dri-devel/2013-May/038894.html
> 
> '''Still no success!'''
> 
> Still got [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset
> the VCPU!!!

Well is this a Mac you are trying to get working? This bugreport is about Macs
booting in EFI mode, if you have issues with another system please open another
bugreport even if you have the same symptoms.

Christian.


[Bug 64850] Second screen black on Pitcairn PRO

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=64850

--- Comment #18 from Jakob Nixdorf  ---
New update: It seems to have nothing to do with the connectors that are used.
I just got my mini-DisplayPort to DVI adapter and tested it. No combination
works (HDMI+mDP, DVI+mDP, HDMI+DVI); the second screen is always black.


[PATCH] drm/tegra: add support for runtime pm

2013-05-28 Thread Thierry Reding

On Tue, May 28, 2013 at 08:45:03AM +0300, Terje Bergström wrote:
> On 27.05.2013 18:45, Thierry Reding wrote:
> > On Mon, May 27, 2013 at 07:19:28PM +0530, Mayuresh Kulkarni wrote:
> >> +#ifdef CONFIG_PM_RUNTIME
> >> +static int host1x_runtime_suspend(struct device *dev)
> >> +{
> >> +  struct host1x *host;
> >> +
> >> +  host = dev_get_drvdata(dev);
> >> +  if (IS_ERR_OR_NULL(host))
> > 
> > I think a simple
> > 
> > if (!host)
> > return -EINVAL;
> > 
> > would be enough here. The driver-data of the device should never be an
> > ERR_PTR()-encoded value, but either a valid pointer to a host1x object
> > or NULL.
> 
> True, we should avoid IS_ERR_OR_NULL() like the plague. We always know if
> the called API returns a NULL on error or an error code. In case of
> error code we should just propagate that.

Yes, that's the case in general. In this specific case the value
obtained by dev_get_drvdata() should either be a valid pointer or NULL,
never an error code. We can easily make sure by only setting the data
(using platform_set_drvdata()) when the pointer is valid.

Thinking about it some more, I don't think we can ever get NULL here. A
device's .runtime_suspend() cannot be called when the device has been
removed, right? That's the only case where the value returned might be
NULL. It would be NULL too if host1x wasn't initialized yet, but that's
already dealt with by the proper ordering in .probe().
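
The distinction discussed above (an API that returns NULL on failure versus one that returns an encoded error code) can be shown with a small userspace sketch. The ERR_PTR(), PTR_ERR() and IS_ERR() definitions below are simplified stand-ins written for this example, and struct host together with all function names is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct host {
	int clk_enabled;
};

/* Convention 1: returns a valid pointer or NULL, never an error code. */
static struct host *get_host_or_null(struct host *stored)
{
	return stored;
}

/* Convention 2: returns a valid pointer or an ERR_PTR()-encoded errno. */
static struct host *create_host(struct host *storage, int fail)
{
	if (fail)
		return ERR_PTR(-ENOMEM);
	storage->clk_enabled = 0;
	return storage;
}

/* Caller of convention 1: a plain NULL check is enough. */
static int suspend(struct host *stored)
{
	struct host *host = get_host_or_null(stored);

	if (!host)
		return -EINVAL;
	host->clk_enabled = 0;
	return 0;
}

/* Caller of convention 2: propagate the encoded error unchanged. */
static int probe(struct host *storage, int fail)
{
	struct host *host = create_host(storage, fail);

	if (IS_ERR(host))
		return (int)PTR_ERR(host);
	host->clk_enabled = 1;
	return 0;
}
```

Each caller tests for exactly the failure form its callee can produce, so a combined IS_ERR_OR_NULL() style check is not needed in either case.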

> > Same comments apply here. Also I think it might be a good idea to split
> > the host1x and gr2d changes into separate patches.
> 
> That's a bit tricky, but doable. We just need to enable it for 2D first,
> and then host1x to keep bisectability.

Right, there's a dependency. But I'd still prefer to have them separate.
Unless it gets really messy.

> >>  static void action_submit_complete(struct host1x_waitlist *waiter)
> >>  {
> >> +  int completed = waiter->count;
> >>struct host1x_channel *channel = waiter->data;
> >>  
> >> +  /* disable clocks for all the submits that got completed in this lot */
> >> +  while (completed--)
> >> +  pm_runtime_put(channel->dev);
> >> +
> >>host1x_cdma_update(&channel->cdma);
> >>  
> >> -  /*  Add nr_completed to trace */
> >> +  /* Add nr_completed to trace */
> >>trace_host1x_channel_submit_complete(dev_name(channel->dev),
> >> waiter->count, waiter->thresh);
> >> -
> >>  }
> > 
> > This feels hackish. But I can't see any better place to do this. Terje,
> > Arto: any ideas how we can do this in a cleaner way? If there's nothing
> > better then maybe moving the code into a separate function, say
> > host1x_waitlist_complete(), might make this less awkward?
> 
> Yeah, it's a bit awkward. action_submit_complete() actually does handle
> completion of multiple jobs, and we do one pm_runtime_get() per job.
> 
> We could do pm_runtime_put() in host1x_cdma_update(). It anyway goes
> through each job that is completed, so while freeing the job it could as
> well call runtime PM. That way we could even remove the waiter->count
> variable altogether as it's not needed anymore.

That sounds a lot better. We could add a helper (host1x_job_finish()
perhaps) with the following from update_cdma_locked():

/* Unpin the memory */
host1x_job_unpin(job);

/* Pop push buffer slots */
if (job->num_slots) {
struct push_buffer *pb = &cdma->push_buffer;
host1x_pushbuffer_pop(pb, job->num_slots);
if (cdma->event == CDMA_EVENT_PUSH_BUFFER_SPACE)
signal = true;
}

list_del(&job->list);

And add pm_runtime_put() (as well as potentially other stuff) in there.
That'll prevent update_cdma_locked() from growing too much. It isn't
too bad right now, so maybe a helper isn't warranted yet, but I don't
think it'll hurt.
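
A rough userspace model of the idea above, one reference taken per submitted job and the matching put done in a per-job finish helper, might look like the following. Everything here is a toy: pm_get() and pm_put() merely count, and job_submit(), job_finish() and update_queue() are hypothetical names, not the host1x functions.

```c
#include <stddef.h>

/* Toy stand-in for a device's runtime PM usage counter. */
struct device {
	int usage_count;
};

static void pm_get(struct device *dev)
{
	dev->usage_count++;
}

static void pm_put(struct device *dev)
{
	dev->usage_count--;
}

struct job {
	struct device *dev;
	int done;	/* set once the hardware has completed the job */
	int finished;	/* set once job_finish() has run */
};

/* Submission takes one reference per job. */
static void job_submit(struct job *job, struct device *dev)
{
	job->dev = dev;
	job->done = 0;
	job->finished = 0;
	pm_get(dev);
}

/* Hypothetical helper: per-job teardown, including the matching put. */
static void job_finish(struct job *job)
{
	/* unpinning memory and popping push buffer slots would go here */
	job->finished = 1;
	pm_put(job->dev);
}

/* Walk the queue in order and finish every job that has completed. */
static size_t update_queue(struct job *jobs, size_t n)
{
	size_t finished = 0;

	for (size_t i = 0; i < n; i++) {
		if (jobs[i].finished)
			continue;
		if (!jobs[i].done)
			break;	/* in-order completion: stop at first pending job */
		job_finish(&jobs[i]);
		finished++;
	}
	return finished;
}
```

Because the put happens while each completed job is being retired, no separate count of completed submits has to be carried along.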

> The not-so-beautiful aspect is that we do pm_runtime_get() in
> host1x_channel.c and pm_runtime_put() in host1x_cdma.c. For code
> readability it'd be great to have them in the same file. I actually get
> questions every now and then in downstream because of doing
> these operations in different files.

With the above helper in place, we could move host1x_job_submit() to
job.c instead and have all the code in one file.

Thierry

[PULL] drm-intel-fixes

2013-05-28 Thread Daniel Vetter

On Thu, May 23, 2013 at 02:03:09PM +0200, Daniel Vetter wrote:
> Hi Dave,
> 
> A few fixes, nothing shocking:
> - More Haswell pci ids. Includes a pile of marketing spare ids (which
>   despite the spare moniker show up all over the place).
> - Fix a regression in handling modeset failures, resulting in black
>   screens on 3 pipe setups when we've run out of pch plls (Chris).
> - Fix up the setcrtc semantics to unconditionally enable the outputs.
>   Judging from git digging that has (kinda) always been the case and neatly
>   fixes a few long-standing (i.e. forever) bug reports (Imre).
> - jiffies_timeout + 1 patches from Imre. They partially fix spurious
>   wait_event failures in the interrupt-driven dp aux/i2c code. The other
>   part is a core patch for the wait_event macros going in through -mm. A
>   few patches more than strictly required since Imre is pushing for a
>   general solution in 3.11.
> 
> Cheers, Daniel

Update pull request (same sha1 but with a tag) so that I can pile new
patches on top (there's one I want to give some testing for a few days
first ...).

Cheers, Daniel


The following changes since commit c7788792a5e7b0d5d7f96d0766b4cb6112d47d75:

  Linux 3.10-rc2 (2013-05-20 14:37:38 -0700)

are available in the git repository at:

  git://people.freedesktop.org/~danvet/drm-intel tags/drm-intel-fixes-2013-05-28

for you to fetch changes up to 3598706b52cb45ba0a9e8aa99ce5ac59140f2b8b:

  drm/i915: avoid premature DP AUX timeouts (2013-05-22 13:51:26 +0200)


Chris Wilson (1):
  drm/i915: Propagate errors back from fb set-base

Imre Deak (5):
  drm/i915: force full modeset if the connector is in DPMS OFF mode
  drm/i915: add msecs_to_jiffies_timeout to guarantee minimum duration
  drm/i915: use msecs_to_jiffies_timeout instead of open coding the same
  drm/i915: avoid premature timeouts in __wait_seqno()
  drm/i915: avoid premature DP AUX timeouts

Rodrigo Vivi (1):
  drm/i915: Adding more reserved PCI IDs for Haswell.

 drivers/gpu/drm/i915/i915_drv.c  |   46 +++
 drivers/gpu/drm/i915/i915_drv.h  |   15 +++
 drivers/gpu/drm/i915/i915_gem.c  |2 +-
 drivers/gpu/drm/i915/intel_display.c |   49 ++
 drivers/gpu/drm/i915/intel_dp.c  |2 +-
 drivers/gpu/drm/i915/intel_i2c.c |5 ++--
 6 files changed, 87 insertions(+), 32 deletions(-)
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
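
As background to the "jiffies_timeout + 1" patches in this pull request: a timeout counted in timer ticks can expire early, because the first tick may arrive almost immediately after the wait starts. The userspace sketch below models that arithmetic. The tick rate, the function names and the worst-case model are assumptions made for illustration; this is not the i915 code.

```c
#include <limits.h>

/* Assumed tick rate for the example: 250 ticks per second (4 ms per tick). */
#define HZ 250UL
#define MAX_TICK_OFFSET ((unsigned long)((LONG_MAX >> 1) - 1))

/* Plain conversion, rounding up to whole ticks. */
static unsigned long msecs_to_ticks(unsigned int ms)
{
	return ((unsigned long)ms * HZ + 999UL) / 1000UL;
}

/* Conversion with one extra tick, clamped to the maximum offset. */
static unsigned long msecs_to_ticks_timeout(unsigned int ms)
{
	unsigned long j = msecs_to_ticks(ms);

	return j + 1 < MAX_TICK_OFFSET ? j + 1 : MAX_TICK_OFFSET;
}

/*
 * Worst case: the wait starts just before a tick edge, so a timeout of
 * `ticks` only covers (ticks - 1) full tick periods of real time.
 */
static unsigned long min_wait_ms(unsigned long ticks)
{
	return ticks ? (ticks - 1) * 1000UL / HZ : 0;
}
```

With these assumptions a 10 ms request converts to 3 ticks, which in the worst case covers only 8 ms; adding one tick raises the guaranteed minimum to 12 ms.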

[Bug 52174] radeonsi enable GLSL 1.3 by default

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=52174

Michel Dänzer changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #1 from Michel Dänzer ---
It wasn't actually ready yet, but now it is. :)

commit cdad129f9cda038fb29cde94645d2fcab7207c50
Author: Michel Dänzer
Date:   Fri May 24 16:49:42 2013 +0200

    radeonsi: Enable GLSL 1.30


[Bug 62889] ColorTiling results in glitches on Radeon HD 7970 + Glamor

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=62889

Michel Dänzer changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #23 from Michel Dänzer ---
With no answer to comment #22, I assume the Steam issue was due to picking up
stale 32-bit binaries.


[PATCH -next] drm/i915: fix error return code in init_pipe_control()

2013-05-28 Thread Daniel Vetter

On Tue, May 28, 2013 at 05:51:44PM +0800, Wei Yongjun wrote:
> From: Wei Yongjun 
> 
> Fix to return -ENOMEM in the kmap() error handling case
> instead of 0, as done elsewhere in this function.
> 
> Signed-off-by: Wei Yongjun 
Queued for -next, thanks for the patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[PATCH -next] drm/i915: fix error return code in init_pipe_control()

2013-05-28 Thread Chris Wilson

On Tue, May 28, 2013 at 05:51:44PM +0800, Wei Yongjun wrote:
> From: Wei Yongjun 
> 
> Fix to return -ENOMEM in the kmap() error handling case
> instead of 0, as done elsewhere in this function.

kmap() can fail?

It is either translated to page_address() or kmap_high() (on x86),
neither of which may return NULL. However, only kmap_atomic() is
documented as being guaranteed to return a valid value. If we could
have a similar definitive statement for kmap(), we can then cleanup
quite a bit of redundant error handling.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Daniel Vetter

On Tue, May 28, 2013 at 12:56:57PM +0900, Inki Dae wrote:
> 
> 
> > > -----Original Message-----
> > > From: linux-fbdev-owner at vger.kernel.org
> > > [mailto:linux-fbdev-owner at vger.kernel.org] On Behalf Of Rob Clark
> > > Sent: Tuesday, May 28, 2013 12:48 AM
> > > To: Inki Dae
> > > Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho; Kyungmin
> > > Park; myungjoo.ham; DRI mailing list;
> > > linux-arm-kernel at lists.infradead.org; linux-media at vger.kernel.org
> > > Subject: Re: Introduce a new helper framework for buffer synchronization
> > 
> > On Mon, May 27, 2013 at 6:38 AM, Inki Dae  wrote:
> > > Hi all,
> > >
> > > > I have removed the previous branch and added a new one with more cleanup.
> > > > This time, the fence helper doesn't include user side interfaces and
> > > > cache operation relevant codes anymore, because not only are we not yet
> > > > sure that coupling those two things (synchronizing caches and buffer
> > > > access between CPU and CPU, CPU and DMA, and DMA and DMA with fences) in
> > > > kernel side is a good idea, but also existing codes for user side have
> > > > problems with badly behaved or crashing userspace. So this could be more
> > > > discussed later.
> > > >
> > > > The below is a new branch,
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/?h=dma-fence-helper
> > > >
> > > > And fence helper codes,
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> > > >
> > > > And example codes for device driver,
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> > > >
> > > > I think the time is not yet ripe for RFC posting: maybe existing dma
> > > > fence and reservation need more review and addition work. So I'd be glad
> > > > for somebody giving other opinions and advices in advance before RFC
> > > > posting.
> > 
> > thoughts from a *really* quick, pre-coffee, first look:
> > * any sort of helper to simplify single-buffer sort of use-cases (v4l)
> > probably wouldn't want to bake in assumption that seqno_fence is used.
> > * I guess g2d is probably not actually a simple use case, since I
> > expect you can submit blits involving multiple buffers :-P
> 
> I don't think so. One and more buffers can be used: seqno_fence also has
> only one buffer. Actually, we have already applied this approach to most
> devices; multimedia, gpu and display controller. And this approach shows
> more performance; reduced power consumption against traditional way. And g2d
> example is just to show you how to apply my approach to device driver.

Note that seqno_fence is an implementation pattern for a certain type of
direct hw->hw synchronization which uses a shared dma_buf to exchange the
sync cookie. The dma_buf attached to the seqno_fence has _nothing_ to do
with the dma_buf the fence actually coordinates access to.

I think that confusion is a large reason why Maarten & I don't
understand what you want to achieve with your fence helpers. Currently
they're using the seqno_fence, but totally not in a way the seqno_fence
was meant to be used.

Note that with the current code there is only a pointer from dma_bufs to
the fence. The fence itself has _no_ pointer to the dma_buf it syncs. This
shouldn't be a problem since the fence fastpath for already signalled
fences is completely barrier&lock free (it's just a load+bit-test), and
fences are meant to be embedded into whatever dma tracking structure you
already have, so no overhead there. The only ugly part is the fence
refcounting, but I don't think we can drop that.

Note that you completely reinvent this part of Maarten's fence patches by
adding new r/w_complete completions to the reservation object, which
completely replaces the fence stuff.

Also note that a list of reservation entries is again meant to be used
only when submitting the dma to the gpu. With your patches you seem to
hang onto that list until dma completes. This has the ugly side-effect
that you need to allocate these reservation entries with kmalloc, whereas
if you just use them in the execbuf ioctl to construct the dma you can
usually embed it into something else you need already anyway. At least
i915 and ttm based drivers can work that way.

Furthermore fences are specifically constructed as frankenstein-monsters
between completion/waitqueues and callbacks. All the different use-cases
need the different aspects:
- busy/idle checks and bo retiring need the completion semantics
- callbacks (in interrupt context) are used for hybrid hw->irq handler->hw
  sync approaches
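
To make the two aspects listed above concrete, here is a deliberately small userspace sketch of a fence that offers both a lock-free "is it signaled yet" check and callbacks run at signal time. It is not the proposed kernel fence API: the struct, the fixed-size callback table, the error codes and the lack of locking around callback registration are all simplifications assumed for this example.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define FENCE_MAX_CB 4

typedef void (*fence_cb_fn)(void *data);

struct fence {
	atomic_bool signaled;
	size_t ncb;
	struct {
		fence_cb_fn fn;
		void *data;
	} cb[FENCE_MAX_CB];
};

static void fence_init(struct fence *f)
{
	atomic_init(&f->signaled, false);
	f->ncb = 0;
}

/* Fast path for busy/idle checks: a single atomic load, no lock taken. */
static bool fence_is_signaled(struct fence *f)
{
	return atomic_load_explicit(&f->signaled, memory_order_acquire);
}

/*
 * Register a callback to run when the fence signals.
 * Returns -ENOENT if the fence has already signaled, -ENOSPC if the
 * table is full. Not thread-safe; a real implementation needs a lock here.
 */
static int fence_add_callback(struct fence *f, fence_cb_fn fn, void *data)
{
	if (fence_is_signaled(f))
		return -ENOENT;
	if (f->ncb == FENCE_MAX_CB)
		return -ENOSPC;
	f->cb[f->ncb].fn = fn;
	f->cb[f->ncb].data = data;
	f->ncb++;
	return 0;
}

/* Signal once: set the flag, then run the callbacks (as an irq handler might). */
static void fence_signal(struct fence *f)
{
	if (atomic_exchange_explicit(&f->signaled, true, memory_order_acq_rel))
		return;	/* already signaled, nothing more to do */
	for (size_t i = 0; i < f->ncb; i++)
		f->cb[i].fn(f->cb[i].data);
	f->ncb = 0;
}

/* Example callback: bump a counter, standing in for kicking the next hw job. */
static void count_cb(void *data)
{
	(*(int *)data)++;
}
```

A busy/idle check only ever calls fence_is_signaled(), while a hybrid hw->irq handler->hw scheme would register a callback and let fence_signal() start the follow-up work.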

> 
> > * otherwise, you probably don't want to depend on dmabuf, which is why
> > reservation/fence is split out the way it is..  you want to be able to
> > use a single reservation/fence mechanism within your driver without
> > having to care about which buffers are exported

[PATCH 2/6] gpu: host1x: Fix syncpoint wait return value

2013-05-28 Thread Thierry Reding

On Mon, May 27, 2013 at 09:55:46AM +0300, Arto Merilainen wrote:
> On 05/26/2013 01:12 PM, Thierry Reding wrote:
> >* PGP Signed by an unknown key
> >
> >On Fri, May 17, 2013 at 02:49:44PM +0300, Arto Merilainen wrote:
[...]
> >Thinking about it, maybe it would be good to have two separate error
> >codes. Keeping -EAGAIN for the case where a zero timeout was passed
> >doesn't sound too bad to differentiate it from the case where a non-
> >zero timeout was passed and it actually timed out. What do you think?
> 
> I agree, in this case it would not look bad at all. However, user
> space libraries may loop until the ioctl return code is something
> else than -EAGAIN or -EINTR. Especially function drmIoctl() in
> libdrm does this which is why I noted this issue in the first
> place.
> 
> If user space uses zero timeout to just check if a syncpoint value
> has already passed the library continues looping until the syncpoint
> value actually passes. Of course, we could just modify the ioctl
> interface to "cast" this return code to something else but that does
> not seem correct.

That doesn't sound right. Maybe drmIoctl() needs fixing instead. Looking
at the history, drmIoctl() was introduced to automatically loop if a
signal was received (commit 8b9ab108ec1f2ba2b503f713769c4946849b3cb2).
However the ioctl(3p) manpage doesn't mention that ioctl() returns
EAGAIN in case it is interrupted by a signal.

I'm adding Keith as author of that commit and the xorg-devel mailing
list on Cc to get some more eyes on this.

Thierry
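
The interaction described in this thread can be modelled in userspace: a wrapper that retries on EINTR and EAGAIN turns a "check and return immediately" call into a busy loop if that call reports "not yet" as EAGAIN. In the sketch below the wrapper follows the retry policy discussed above, while the function-pointer stand-in for ioctl(), struct fake_syncpt and fake_wait_zero_timeout() are invented for the example.

```c
#include <errno.h>

/* Stand-in for ioctl(): returns 0, or -1 with errno set. */
typedef int (*ioctl_fn)(void *arg);

/* Retry on EINTR and EAGAIN, counting how often the call was issued. */
static int retrying_ioctl(ioctl_fn fn, void *arg, int *calls)
{
	int ret;

	do {
		ret = fn(arg);
		(*calls)++;
	} while (ret == -1 && (errno == EINTR || errno == EAGAIN));
	return ret;
}

/* Fake syncpoint state for the example. */
struct fake_syncpt {
	int current;
	int threshold;
};

/* Fake zero-timeout wait: reports EAGAIN until the value has passed. */
static int fake_wait_zero_timeout(void *arg)
{
	struct fake_syncpt *sp = arg;

	if (sp->current >= sp->threshold)
		return 0;
	sp->current++;	/* pretend the hardware makes progress meanwhile */
	errno = EAGAIN;
	return -1;
}
```

A caller that only wanted to poll once ends up issuing the call repeatedly until the syncpoint value passes, which is the looping behaviour Arto describes.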

[PATCH -next] drm/i915: fix error return code in init_pipe_control()

2013-05-28 Thread Wei Yongjun

From: Wei Yongjun 

Fix to return -ENOMEM in the kmap() error handling case
instead of 0, as done elsewhere in this function.

Signed-off-by: Wei Yongjun 
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 5698fae..9b97cf6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -464,9 +464,11 @@ init_pipe_control(struct intel_ring_buffer *ring)
 		goto err_unref;
 
 	pc->gtt_offset = obj->gtt_offset;
-	pc->cpu_page =  kmap(sg_page(obj->pages->sgl));
-	if (pc->cpu_page == NULL)
+	pc->cpu_page = kmap(sg_page(obj->pages->sgl));
+	if (pc->cpu_page == NULL) {
+		ret = -ENOMEM;
 		goto err_unpin;
+	}
 
 	DRM_DEBUG_DRIVER("%s pipe control offset: 0x%08x\n",
 			 ring->name, pc->gtt_offset);
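
The pattern this patch corrects, a goto-based error path that returns whatever value ret last held, can be shown in a short userspace sketch. The struct, fake_map() and init_pc() are hypothetical stand-ins, and whether the real mapping call can actually fail is a separate question raised elsewhere in this thread.

```c
#include <errno.h>
#include <stddef.h>

struct pipe_control {
	void *cpu_page;
	int pinned;
};

/* Stand-in for a mapping call that may return NULL. */
static void *fake_map(void *page, int fail)
{
	return fail ? NULL : page;
}

static int init_pc(struct pipe_control *pc, void *page, int fail_map)
{
	int ret;

	pc->pinned = 1;	/* an earlier step succeeded ... */
	ret = 0;	/* ... so ret holds 0 at this point */

	pc->cpu_page = fake_map(page, fail_map);
	if (pc->cpu_page == NULL) {
		ret = -ENOMEM;	/* without this, the error path returns 0 */
		goto err_unpin;
	}
	return 0;

err_unpin:
	pc->pinned = 0;
	return ret;
}
```

Setting ret immediately before the jump keeps the cleanup label generic while still reporting the failure to the caller.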

[Bug 58901] New: "trying to bind memory to uninitialized GART" error at resume from suspend to memory

2013-05-28 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=58901

           Summary: "trying to bind memory to uninitialized GART" error at
                    resume from suspend to memory
           Product: Drivers
           Version: 2.5
    Kernel Version: 3.9.3
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI - non Intel)
        AssignedTo: drivers_video-dri at kernel-bugs.osdl.org
        ReportedBy: casteyde.christian at free.fr
        Regression: Yes


Acer Aspire 7750G
Core i7-2630QM, 6 GB
AMD Radeon HD6650M, no Intel graphics
Slackware64-current

Since kernel 3.9.x, my laptop cannot resume from suspend to memory: X is
completely frozen and there is no other way to switch off/restart.

My rc scripts save dmesg at shutdown so I managed to get the following kernel
logs:

usb 1-1.4: reset high-speed USB device number 4 using ehci-pci
PM: resume of devices complete after 1013.038 msecs
Restarting tasks ... done.
video LNXVIDEO:01: Restoring backlight state
ata1.00: configured for UDMA/133
ata1: EH complete
EXT4-fs (sda2): re-mounted. Opts: discard,commit=0
EXT4-fs (sda3): re-mounted. Opts: discard,commit=0
eth0: deauthenticated from  (Reason: 6)
cfg80211: Calling CRDA to update world regulatory domain
cfg80211: World regulatory domain updated:
cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:   (2402000 KHz - 2472000 KHz @ 4 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2457000 KHz - 2482000 KHz @ 2 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2474000 KHz - 2494000 KHz @ 2 KHz), (300 mBi, 2000 mBm)
cfg80211:   (517 KHz - 525 KHz @ 4 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5735000 KHz - 5835000 KHz @ 4 KHz), (300 mBi, 2000 mBm)
eth0: authenticate with 
eth0: send auth to 
eth0: authenticated
ath9k :03:00.0 eth0: disabling HT as WMM/QoS is not supported by the AP
ath9k :03:00.0 eth0: disabling VHT as WMM/QoS is not supported by the AP
eth0: associate with  (try 1/3)
eth0: RX AssocResp from  (capab=0x411 status=0 aid=2)
eth0: associated
------------[ cut here ]------------
WARNING: at drivers/gpu/drm/radeon/radeon_gart.c:280
radeon_gart_bind+0xe1/0xf0()
Hardware name: Aspire 7750G
trying to bind memory to uninitialized GART !
Modules linked in:
Pid: 2305, comm: X Not tainted 3.9.3 #6
Call Trace:
 [] ? radeon_gart_bind+0xe1/0xf0
 [] warn_slowpath_common+0x6b/0xa0
 [] warn_slowpath_fmt+0x47/0x50
 [] radeon_gart_bind+0xe1/0xf0
 [] radeon_ttm_backend_bind+0x32/0x90
 [] ttm_tt_bind+0x47/0x60
 [] ttm_bo_handle_move_mem+0x54f/0x5e0
 [] ? ttm_bo_mem_space+0x161/0x340
 [] ttm_bo_move_buffer+0x11f/0x140
 [] ttm_bo_validate+0x92/0x110
 [] ttm_bo_init+0x2a9/0x3c0
 [] radeon_bo_create+0x176/0x1d0
 [] ? radeon_bo_clear_va+0x50/0x50
 [] radeon_gem_object_create+0x9b/0x160
 [] radeon_gem_create_ioctl+0x5b/0x130
 [] drm_ioctl+0x4d1/0x580
 [] ? radeon_gem_pwrite_ioctl+0x30/0x30
 [] do_vfs_ioctl+0x2e5/0x4d0
 [] sys_ioctl+0x40/0x80
 [] ? sys_read+0x6c/0x90
 [] system_call_fastpath+0x16/0x1b
---[ end trace 44b14b5d0d1cf7ab ]---
[drm:radeon_ttm_backend_bind] *ERROR* failed to bind 1175 pages at 0x0240D000
[drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (4812800,
2, 4096, -22)
BUG: unable to handle kernel NULL pointer dereference at 0008
IP: [] ttm_dma_populate+0x6a3/0x960
PGD 1c61c0067 PUD 1c4df8067 PMD 0 
Oops: 0002 [#1] PREEMPT SMP 
Modules linked in:
CPU 6 
Pid: 2305, comm: X Tainted: GW3.9.3 #6 Acer Aspire 7750G/JE70_HR
RIP: 0010:[]  []
ttm_dma_populate+0x6a3/0x960
RSP: 0018:8801c4c6b9c0  EFLAGS: 00010093
RAX: 88019b74c100 RBX: 0202 RCX: 88019b74c180
RDX:  RSI: 8801c7202928 RDI: 8801c7202914
RBP: 8801c4c6ba88 R08: 000146c0 R09: 8801cf5946c0
R10: ea0007190a80 R11: 8801c8802600 R12: 88019b74c100
R13: 8801c642a320 R14: 8801c7202900 R15: 0004
FS:  7fe093e7c8c0() GS:8801cf58() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0008 CR3: 0001c619f000 CR4: 000407e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process X (pid: 2305, threadinfo 8801c4c6a000, task 8801c712da20)
Stack:
 8801c7202964 8801c762c098 8801c7202928 8801c7202971
 88019fb67558 8801c699bc00  ffc0192a
 88019fb67500 8801c7202914 4004 88010004
Call Trace:
 [] ? do_select+0x5fa/0x670
 [] ? kmem_cache_alloc+0x9a/0xa0
 [] ? __kmalloc+0xd0/0xe0
 [] radeon_ttm_tt_populate+0x1c7/0x220
 [] ? radeon_ttm_tt_create+0x6a/0xb0
 [] ttm_tt_bind+0x36/0x60
 [] ttm_bo_handle_move_mem+0x54f/0x5e0
 [] ? ttm_bo_mem_space+0x161/0x340
 [] ttm_bo_move_buffer+0x11f/0x140
 [] ttm_bo_validate+0x92/0x110
 [] ttm_bo_init+0x2a

drm/exynos: make overlay data to be updated to valid hw

2013-05-28 Thread Inki Dae

2013/5/28 Inki Dae 

> This patch makes sure that overlay data are updated
> to real hardware enabled when framebuffer is released.
> For this, this patch checks if crtc and encoder are
> valid or not, and then makes it waiting for signal
> synchronization to only valid encoder.
>
> Signed-off-by: Inki Dae 
> Signed-off-by: Kyungmin Park 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_encoder.c |9 ++---
>  drivers/gpu/drm/exynos/exynos_drm_encoder.h |2 +-
>  drivers/gpu/drm/exynos/exynos_drm_fb.c  |   13 +++--
>  3 files changed, 18 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_encoder.c
> b/drivers/gpu/drm/exynos/exynos_drm_encoder.c
> index c63721f..9a6e3fd 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_encoder.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_encoder.c
> @@ -220,18 +220,21 @@ static void exynos_drm_encoder_commit(struct
> drm_encoder *encoder)
> exynos_encoder->dpms = DRM_MODE_DPMS_ON;
>  }
>
> -void exynos_drm_encoder_complete_scanout(struct drm_framebuffer *fb)
> +void exynos_drm_encoder_complete_scanout(struct drm_crtc *crtc)
>  {
> struct exynos_drm_encoder *exynos_encoder;
> struct exynos_drm_manager_ops *ops;
> -   struct drm_device *dev = fb->dev;
> +   struct drm_device *dev = crtc->dev;
> struct drm_encoder *encoder;
>
> /*
>  * make sure that overlay data are updated to real hardware
> -* for all encoders.
> +* for valid encoders.
>  */
> list_for_each_entry(encoder, &dev->mode_config.encoder_list, head)
> {
> +   if (encoder->crtc != crtc)
> +   continue;
> +
> exynos_encoder = to_exynos_encoder(encoder);
> ops = exynos_encoder->manager->ops;
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_encoder.h
> b/drivers/gpu/drm/exynos/exynos_drm_encoder.h
> index 89e2fb0..e8dee1c 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_encoder.h
> +++ b/drivers/gpu/drm/exynos/exynos_drm_encoder.h
> @@ -32,6 +32,6 @@ void exynos_drm_encoder_plane_mode_set(struct
> drm_encoder *encoder, void *data);
>  void exynos_drm_encoder_plane_commit(struct drm_encoder *encoder, void
> *data);
>  void exynos_drm_encoder_plane_enable(struct drm_encoder *encoder, void
> *data);
>  void exynos_drm_encoder_plane_disable(struct drm_encoder *encoder, void
> *data);
> -void exynos_drm_encoder_complete_scanout(struct drm_framebuffer *fb);
> +void exynos_drm_encoder_complete_scanout(struct drm_crtc *crtc);
>
>  #endif
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fb.c
> b/drivers/gpu/drm/exynos/exynos_drm_fb.c
> index 0e04f4e..1fc7ae6 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fb.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fb.c
> @@ -68,12 +68,21 @@ static int check_fb_gem_memory_type(struct drm_device
> *drm_dev,
>  static void exynos_drm_fb_destroy(struct drm_framebuffer *fb)
>  {
> struct exynos_drm_fb *exynos_fb = to_exynos_fb(fb);
> +   struct drm_device *dev = fb->dev;
> +   struct drm_crtc *crtc;
> unsigned int i;
>
> DRM_DEBUG_KMS("%s\n", __FILE__);
>
> -   /* make sure that overlay data are updated before relesing fb. */
> -   exynos_drm_encoder_complete_scanout(fb);
> +   list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
> +   if (crtc->fb == fb) {
>

Sorry, crtc->fb could already be the new fb, and in that case this condition
will always fail. I will post this patch again once it is fixed.

Thanks,
Inki Dae

> +   /*
> +* make sure that overlay data are updated before
> +* relesing fb.
> +*/
> +   exynos_drm_encoder_complete_scanout(crtc);
> +   }
> +   }
>
> drm_framebuffer_cleanup(fb);
>
> --
> 1.7.5.4
>

Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Rob Clark

On Mon, May 27, 2013 at 11:56 PM, Inki Dae  wrote:
>
>
>> -Original Message-
>> From: linux-fbdev-owner at vger.kernel.org [mailto:linux-fbdev-
>> owner at vger.kernel.org] On Behalf Of Rob Clark
>> Sent: Tuesday, May 28, 2013 12:48 AM
>> To: Inki Dae
>> Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho; Kyungmin
>> Park; myungjoo.ham; DRI mailing list;
> linux-arm-kernel at lists.infradead.org;
>> linux-media at vger.kernel.org
>> Subject: Re: Introduce a new helper framework for buffer synchronization
>>
>> On Mon, May 27, 2013 at 6:38 AM, Inki Dae  wrote:
>> > Hi all,
>> >
>> > I have been removed previous branch and added new one with more cleanup.
>> > This time, the fence helper doesn't include user side interfaces and
>> cache
>> > operation relevant codes anymore because not only we are not sure that
>> > coupling those two things, synchronizing caches and buffer access
>> between
>> > CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel side is
>> a
>> > good idea yet but also existing codes for user side have problems with
>> badly
>> > behaved or crashing userspace. So this could be more discussed later.
>> >
>> > The below is a new branch,
>> >
>> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
>> exynos.git/?h=dma-f
>> > ence-helper
>> >
>> > And fence helper codes,
>> >
>> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
>> exynos.git/commit/?
>> > h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
>> >
>> > And example codes for device driver,
>> >
>> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
>> exynos.git/commit/?
>> > h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
>> >
>> > I think the time is not yet ripe for RFC posting: maybe existing dma
>> fence
>> > and reservation need more review and addition work. So I'd glad for
>> somebody
>> > giving other opinions and advices in advance before RFC posting.
>>
>> thoughts from a *really* quick, pre-coffee, first look:
>> * any sort of helper to simplify single-buffer sort of use-cases (v4l)
>> probably wouldn't want to bake in assumption that seqno_fence is used.
>> * I guess g2d is probably not actually a simple use case, since I
>> expect you can submit blits involving multiple buffers :-P
>
> I don't think so. One and more buffers can be used: seqno_fence also has
> only one buffer. Actually, we have already applied this approach to most
> devices; multimedia, gpu and display controller. And this approach shows
> more performance; reduced power consumption against traditional way. And g2d
> example is just to show you how to apply my approach to device driver.

no, you need the ww-mutex / reservation stuff any time you have
multiple independent devices (or rings/contexts for hw that can
support multiple contexts) which can do operations with multiple
buffers.  So you could conceivably hit this w/ gpu + g2d if multiple
buffers were shared between the two.  vram migration and such
'desktop stuff' might make the problem worse, but just because you
don't have vram doesn't mean you don't have a problem with multiple
buffers.

>> * otherwise, you probably don't want to depend on dmabuf, which is why
>> reservation/fence is split out the way it is..  you want to be able to
>> use a single reservation/fence mechanism within your driver without
>> having to care about which buffers are exported to dmabuf's and which
>> are not.  Creating a dmabuf for every GEM bo is too heavyweight.
>
> Right. But I think we should dealt with this separately. Actually, we are
> trying to use reservation for gpu pipe line synchronization such as sgx sync
> object and this approach is used without dmabuf. In order words, some device
> can use only reservation for such pipe line synchronization and at the same
> time, fence helper or similar thing with dmabuf for buffer synchronization.

it is probably easier to approach from the reverse direction.. ie, get
reservation/synchronization right first, and then dmabuf.  (Well, that
isn't really a problem because Maarten's reservation/fence patches
support dmabuf from the beginning.)

BR,
-R

>>
>> I'm not entirely sure if reservation/fence could/should be made any
>> simpler for multi-buffer users.  Probably the best thing to do is just
>> get reservation/fence rolled out in a few drivers and see if some
>> common patterns emerge.
>>
>> BR,
>> -R
>>
>> >
>> > Thanks,
>> > Inki Dae
>> >
>

Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Inki Dae


Hi Daniel,

Thank you so much; this is very useful. :) Sorry, but could you give me more
feedback on my comments below? There are still things that confuse me. :(


> -Original Message-
> From: Daniel Vetter [mailto:daniel.vetter at ffwll.ch] On Behalf Of Daniel
> Vetter
> Sent: Tuesday, May 28, 2013 7:33 PM
> To: Inki Dae
> Cc: 'Rob Clark'; 'Maarten Lankhorst'; 'Daniel Vetter'; 'linux-fbdev';
> 'YoungJun Cho'; 'Kyungmin Park'; 'myungjoo.ham'; 'DRI mailing list';
> linux-arm-kernel at lists.infradead.org; linux-media at vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Tue, May 28, 2013 at 12:56:57PM +0900, Inki Dae wrote:
> >
> >
> > > -Original Message-
> > > From: linux-fbdev-owner at vger.kernel.org [mailto:linux-fbdev-
> > > owner at vger.kernel.org] On Behalf Of Rob Clark
> > > Sent: Tuesday, May 28, 2013 12:48 AM
> > > To: Inki Dae
> > > Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho;
> Kyungmin
> > > Park; myungjoo.ham; DRI mailing list;
> > linux-arm-kernel at lists.infradead.org;
> > > linux-media at vger.kernel.org
> > > Subject: Re: Introduce a new helper framework for buffer
> synchronization
> > >
> > > On Mon, May 27, 2013 at 6:38 AM, Inki Dae 
wrote:
> > > > Hi all,
> > > >
> > > > I have been removed previous branch and added new one with more
> cleanup.
> > > > This time, the fence helper doesn't include user side interfaces and
> > > cache
> > > > operation relevant codes anymore because not only we are not sure
> that
> > > > coupling those two things, synchronizing caches and buffer access
> > > between
> > > > CPU and CPU, CPU and DMA, and DMA and DMA with fences, in kernel
> side is
> > > a
> > > > good idea yet but also existing codes for user side have problems
> with
> > > badly
> > > > behaved or crashing userspace. So this could be more discussed
later.
> > > >
> > > > The below is a new branch,
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> > > exynos.git/?h=dma-f
> > > > ence-helper
> > > >
> > > > And fence helper codes,
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> > > exynos.git/commit/?
> > > > h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> > > >
> > > > And example codes for device driver,
> > > >
> > > > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-
> > > exynos.git/commit/?
> > > > h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> > > >
> > > > I think the time is not yet ripe for RFC posting: maybe existing dma
> > > fence
> > > > and reservation need more review and addition work. So I'd glad for
> > > somebody
> > > > giving other opinions and advices in advance before RFC posting.
> > >
> > > thoughts from a *really* quick, pre-coffee, first look:
> > > * any sort of helper to simplify single-buffer sort of use-cases (v4l)
> > > probably wouldn't want to bake in assumption that seqno_fence is used.
> > > * I guess g2d is probably not actually a simple use case, since I
> > > expect you can submit blits involving multiple buffers :-P
> >
> > I don't think so. One and more buffers can be used: seqno_fence also has
> > only one buffer. Actually, we have already applied this approach to most
> > devices; multimedia, gpu and display controller. And this approach shows
> > more performance; reduced power consumption against traditional way. And
> g2d
> > example is just to show you how to apply my approach to device driver.
> 
> Note that seqno_fence is an implementation pattern for a certain type of
> direct hw->hw synchronization which uses a shared dma_buf to exchange the
> sync cookie.

I'm afraid I don't understand hw->hw synchronization. Does it mean that the
device has a hardware feature that handles buffer synchronization
internally? And what is the sync cookie?

> The dma_buf attached to the seqno_fence has _nothing_ to do
> with the dma_buf the fence actually coordinates access to.
> 
> I think that confusing is a large reason for why Maarten&I don't
> understand what you want to achieve with your fence helpers. Currently
> they're using the seqno_fence, but totally not in a way the seqno_fence
> was meant to be used.
> 
> Note that with the current code there is only a pointer from dma_bufs to
> the fence. The fence itself has _no_ pointer to the dma_buf it syncs. This
> shouldn't be a problem since the fence fastpath for already signalled
> fences is completely barrier&lock free (it's just a load+bit-test), and
> fences are meant to be embedded into whatever dma tracking structure you
> already have, so no overhead there. The only ugly part is the fence
> refcounting, but I don't think we can drop that.

Below is the proposed approach:
a DMA device has to create a fence before accessing a shared buffer, and then
check whether any other DMA device is accessing that buffer; if so, the DMA
device should block, and then it sets the f

[PATCH v4 0/4] add mutex wait/wound/style style locks

2013-05-28 Thread Maarten Lankhorst

Version 4 already?

Small api changes since v3:
- Remove ww_mutex_unlock_single and ww_mutex_lock_single.
- Rename ww_mutex_trylock_single to ww_mutex_trylock.
- Remove separate implementations of ww_mutex_lock_slow*, normal
  functions can be used. Inline versions still exist for extra
  debugging, and to annotate.
- Cleanup unneeded memory barriers, add comment to the remaining
  smp_mb().

Thanks to Daniel Vetter, Rob Clark and Peter Zijlstra for their feedback.
---

Daniel Vetter (1):
  mutex: w/w mutex slowpath debugging

Maarten Lankhorst (3):
  arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.
  mutex: add support for wound/wait style locks, v5
  mutex: Add ww tests to lib/locking-selftest.c. v4


 Documentation/ww-mutex-design.txt |  344 +++
 arch/ia64/include/asm/mutex.h |   10 -
 arch/powerpc/include/asm/mutex.h  |   10 -
 arch/sh/include/asm/mutex-llsc.h  |4 
 arch/x86/include/asm/mutex_32.h   |   11 -
 arch/x86/include/asm/mutex_64.h   |   11 -
 include/asm-generic/mutex-dec.h   |   10 -
 include/asm-generic/mutex-null.h  |2 
 include/asm-generic/mutex-xchg.h  |   10 -
 include/linux/mutex-debug.h   |1 
 include/linux/mutex.h |  363 +
 kernel/mutex.c|  384 ---
 lib/Kconfig.debug |   13 +
 lib/debug_locks.c |2 
 lib/locking-selftest.c|  410 +++--
 15 files changed, 1492 insertions(+), 93 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt

-- 
~Maarten

[PATCH v4 1/4] arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.

2013-05-28 Thread Maarten Lankhorst

This will allow me to call functions that have multiple arguments if fastpath
fails. This is required to support ticket mutexes, because they need to be
able to pass an extra argument to the fail function.

Originally I duplicated the functions, by adding
__mutex_fastpath_lock_retval_arg. This ended up being just a duplication of
the existing function, so a way to test if fastpath was called ended up being
better.

This also cleaned up the reservation mutex patch some by being able to call an
atomic_set instead of atomic_xchg, and making it easier to detect if the wrong
unlock function was previously used.

Changes since v1, pointed out by Francesco Lavra:
- fix a small comment issue in mutex_32.h
- fix the __mutex_fastpath_lock_retval macro for mutex-null.h

Signed-off-by: Maarten Lankhorst 
---
 arch/ia64/include/asm/mutex.h|   10 --
 arch/powerpc/include/asm/mutex.h |   10 --
 arch/sh/include/asm/mutex-llsc.h |4 ++--
 arch/x86/include/asm/mutex_32.h  |   11 ---
 arch/x86/include/asm/mutex_64.h  |   11 ---
 include/asm-generic/mutex-dec.h  |   10 --
 include/asm-generic/mutex-null.h |2 +-
 include/asm-generic/mutex-xchg.h |   10 --
 kernel/mutex.c   |   32 ++--
 9 files changed, 41 insertions(+), 59 deletions(-)
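
The calling convention this patch moves to can be illustrated with a minimal
user-space sketch. It is not the kernel code: the sketch_* names, the
slowpath_arg field and the use of C11 atomics in place of atomic_t are all
assumptions made for illustration. The point is only that the fastpath now
reports success or failure, so the caller is free to invoke a slowpath that
takes any number of arguments.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutex: count is 1 when unlocked, 0 when
 * locked, and negative when locked with possible waiters. */
struct sketch_mutex {
	atomic_int count;
	void *slowpath_arg;	/* last extra argument seen by the slowpath */
};

/* Mirrors the new contract: 0 if the fastpath took the lock, -1 otherwise.
 * No fail_fn callback is involved any more. */
static inline int sketch_fastpath_lock_retval(atomic_int *count)
{
	if (atomic_fetch_sub(count, 1) != 1)
		return -1;
	return 0;
}

/* Hypothetical slowpath with an extra argument. A real slowpath would
 * sleep until the lock is free; this one only records its argument and
 * reports contention. */
static int sketch_lock_slowpath(struct sketch_mutex *m, void *arg)
{
	m->slowpath_arg = arg;
	return -1;
}

/* The caller decides what to run when the fastpath fails. */
static int sketch_lock(struct sketch_mutex *m, void *arg)
{
	assert(m != NULL);
	if (sketch_fastpath_lock_retval(&m->count) == 0)
		return 0;
	return sketch_lock_slowpath(m, arg);
}
```

With the old callback style, the extra argument would have needed a second
fastpath variant; here it is simply forwarded by the caller.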

diff --git a/arch/ia64/include/asm/mutex.h b/arch/ia64/include/asm/mutex.h
index bed73a6..f41e66d 100644
--- a/arch/ia64/include/asm/mutex.h
+++ b/arch/ia64/include/asm/mutex.h
@@ -29,17 +29,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
  *  __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  *  @count: pointer of type atomic_t
- *  @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
if (unlikely(ia64_fetchadd4_acq(count, -1) != 1))
-   return fail_fn(count);
+   return -1;
return 0;
 }

diff --git a/arch/powerpc/include/asm/mutex.h b/arch/powerpc/include/asm/mutex.h
index 5399f7e..127ab23 100644
--- a/arch/powerpc/include/asm/mutex.h
+++ b/arch/powerpc/include/asm/mutex.h
@@ -82,17 +82,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
  *  __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  *  @count: pointer of type atomic_t
- *  @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
if (unlikely(__mutex_dec_return_lock(count) < 0))
-   return fail_fn(count);
+   return -1;
return 0;
 }

diff --git a/arch/sh/include/asm/mutex-llsc.h b/arch/sh/include/asm/mutex-llsc.h
index 090358a..dad29b6 100644
--- a/arch/sh/include/asm/mutex-llsc.h
+++ b/arch/sh/include/asm/mutex-llsc.h
@@ -37,7 +37,7 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
 }

 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
int __done, __res;

@@ -51,7 +51,7 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
: "t");

if (unlikely(!__done || __res != 0))
-   __res = fail_fn(count);
+   __res = -1;

return __res;
 }
diff --git a/arch/x86/include/asm/mutex_32.h b/arch/x86/include/asm/mutex_32.h
index 03f90c8..0208c3c 100644
--- a/arch/x86/include/asm/mutex_32.h
+++ b/arch/x86/include/asm/mutex_32.h
@@ -42,17 +42,14 @@ do {
\
  *  __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  *  @count: pointer of type atomic_t
- *  @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call <fail_fn> if it
- * wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function retur

[PATCH v4 2/4] mutex: add support for wound/wait style locks, v5

2013-05-28 Thread Maarten Lankhorst

Changes since RFC patch v1:
 - Updated to use atomic_long instead of atomic, since the reservation_id was
   a long.
 - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
 - removed mutex_locked_set_reservation_id (or w/e it was called)
Changes since RFC patch v2:
 - remove use of __mutex_lock_retval_arg, add warnings when using wrong
   combination of mutex_(,reserve_)lock/unlock.
Changes since v1:
 - Add __always_inline to __mutex_lock_common, otherwise reservation paths can
   be triggered from normal locks, because __builtin_constant_p might evaluate
   to false for the constant 0 in that case. Tests for this have been added in
   the next patch.
 - Updated documentation slightly.
Changes since v2:
 - Renamed everything to ww_mutex. (mlankhorst)
 - Added ww_acquire_ctx and ww_class. (mlankhorst)
 - Added a lot of checks for wrong api usage. (mlankhorst)
 - Documentation updates. (danvet)
Changes since v3:
 - Small documentation fixes (robclark)
 - Memory barrier fix (danvet)
Changes since v4:
 - Remove ww_mutex_unlock_single and ww_mutex_lock_single.
 - Rename ww_mutex_trylock_single to ww_mutex_trylock.
 - Remove separate implementations of ww_mutex_lock_slow*, normal
   functions can be used. Inline versions still exist for extra
   debugging.
 - Cleanup unneeded memory barriers, add comment to the remaining
   smp_mb().

Signed-off-by: Maarten Lankhorst 
Signed-off-by: Daniel Vetter 
Signed-off-by: Rob Clark 
---
 Documentation/ww-mutex-design.txt |  344 
 include/linux/mutex-debug.h   |1 
 include/linux/mutex.h |  355 +
 kernel/mutex.c|  318 +++--
 lib/debug_locks.c |2 
 5 files changed, 1003 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt
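
The back-off rule described in the new Documentation/ww-mutex-design.txt
(the older ticket waits, the younger ticket drops everything it holds and
retries) can be sketched in plain C. This is a single-threaded model with
hypothetical sketch_* names, not the ww_mutex API from this patch; in
particular SKETCH_WOULD_WAIT merely stands in for "the caller would sleep
here", and the fixed-size held[] array is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_HELD   8
#define SKETCH_WOULD_WAIT 1	/* hypothetical: caller is older and may wait */

struct sketch_lock;

/* Hypothetical acquire context: the ticket plus the locks taken so far. */
struct sketch_ctx {
	unsigned long stamp;		/* lower value = older task */
	size_t held_count;
	struct sketch_lock *held[SKETCH_MAX_HELD];
};

struct sketch_lock {
	struct sketch_ctx *owner;	/* NULL when unlocked */
};

/* Try to take one lock on behalf of ctx.
 * Returns 0 on success, -EALREADY if ctx already holds it, -EDEADLK if the
 * holder is older (the caller is wounded and must back off), or
 * SKETCH_WOULD_WAIT if the holder is younger (the caller may wait). */
static int sketch_lock_acquire(struct sketch_lock *lock, struct sketch_ctx *ctx)
{
	assert(lock != NULL && ctx != NULL);
	if (lock->owner == ctx)
		return -EALREADY;
	if (lock->owner != NULL) {
		if (ctx->stamp > lock->owner->stamp)
			return -EDEADLK;
		return SKETCH_WOULD_WAIT;
	}
	if (ctx->held_count == SKETCH_MAX_HELD)
		return -ENOMEM;
	lock->owner = ctx;
	ctx->held[ctx->held_count++] = lock;
	return 0;
}

/* The wounded task releases every lock it holds before retrying. It keeps
 * its stamp, so it eventually becomes the oldest contender. */
static void sketch_backoff(struct sketch_ctx *ctx)
{
	assert(ctx != NULL);
	while (ctx->held_count > 0)
		ctx->held[--ctx->held_count]->owner = NULL;
}
```

Two contexts that take locks A and B in opposite order show the effect: the
younger one gets -EDEADLK on its second lock, backs off, and the older one
can then take both.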

diff --git a/Documentation/ww-mutex-design.txt b/Documentation/ww-mutex-design.txt
new file mode 100644
index 000..8bd1761
--- /dev/null
+++ b/Documentation/ww-mutex-design.txt
@@ -0,0 +1,344 @@
+Wait/Wound Deadlock-Proof Mutex Design
+======================================
+
+Please read mutex-design.txt first, as it applies to wait/wound mutexes too.
+
+Motivation for WW-Mutexes
+-------------------------
+
+GPU's do operations that commonly involve many buffers.  Those buffers
+can be shared across contexts/processes, exist in different memory
+domains (for example VRAM vs system memory), and so on.  And with
+PRIME / dmabuf, they can even be shared across devices.  So there are
+a handful of situations where the driver needs to wait for buffers to
+become ready.  If you think about this in terms of waiting on a buffer
+mutex for it to become available, this presents a problem because
+there is no way to guarantee that buffers appear in a execbuf/batch in
+the same order in all contexts.  That is directly under control of
+userspace, and a result of the sequence of GL calls that an application
+makes. Which results in the potential for deadlock.  The problem gets
+more complex when you consider that the kernel may need to migrate the
+buffer(s) into VRAM before the GPU operates on the buffer(s), which
+may in turn require evicting some other buffers (and you don't want to
+evict other buffers which are already queued up to the GPU), but for a
+simplified understanding of the problem you can ignore this.
+
+The algorithm that TTM came up with for dealing with this problem is quite
+simple.  For each group of buffers (execbuf) that need to be locked, the caller
+would be assigned a unique reservation id/ticket, from a global counter.  In
+case of deadlock while locking all the buffers associated with a execbuf, the
+one with the lowest reservation ticket (i.e. the oldest task) wins, and the one
+with the higher reservation id (i.e. the younger task) unlocks all of the
+buffers that it has already locked, and then tries again.
+
+In the RDBMS literature this deadlock handling approach is called wait/wound:
+The older tasks waits until it can acquire the contended lock. The younger tasks
+needs to back off and drop all the locks it is currently holding, i.e. the
+younger task is wounded.
+
+Concepts
+--------
+
+Compared to normal mutexes two additional concepts/objects show up in the lock
+interface for w/w mutexes:
+
+Acquire context: To ensure eventual forward progress it is important the a task
+trying to acquire locks doesn't grab a new reservation id, but keeps the one it
+acquired when starting the lock acquisition. This ticket is stored in the
+acquire context. Furthermore the acquire context keeps track of debugging state
+to catch w/w mutex interface abuse.
+
+W/w class: In contrast to normal mutexes the lock class needs to be explicit for
+w/w mutexes, since it is required to initialize the acquire context.
+
+Furthermore there are three different class of w/w lock acquire functions:
+
+* Normal lock acquisition with a context, using ww_mute

[PATCH v4 3/4] mutex: Add ww tests to lib/locking-selftest.c. v4

2013-05-28 Thread Maarten Lankhorst

This stresses the lockdep code in some ways specifically useful to
ww_mutexes. It adds checks for most of the common locking errors.

Changes since v1:
 - Add tests to verify reservation_id is untouched.
 - Use L() and U() macros where possible.

Changes since v2:
 - Use the ww_mutex api directly.
 - Use macros for most of the code.
Changes since v3:
 - Rework tests for the api changes.

Signed-off-by: Maarten Lankhorst 
---
 lib/locking-selftest.c |  405 ++--
 1 file changed, 386 insertions(+), 19 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index c3eb261..b18f1d3 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -26,6 +26,8 @@
  */
 static unsigned int debug_locks_verbose;

+static DEFINE_WW_CLASS(ww_lockdep);
+
 static int __init setup_debug_locks_verbose(char *str)
 {
get_option(&str, &debug_locks_verbose);
@@ -42,6 +44,10 @@ __setup("debug_locks_verbose=", setup_debug_locks_verbose);
 #define LOCKTYPE_RWLOCK0x2
 #define LOCKTYPE_MUTEX 0x4
 #define LOCKTYPE_RWSEM 0x8
+#define LOCKTYPE_WW0x10
+
+static struct ww_acquire_ctx t, t2;
+static struct ww_mutex o, o2;

 /*
  * Normal standalone locks, for the circular and irq-context
@@ -193,6 +199,16 @@ static void init_shared_classes(void)
 #define RSU(x) up_read(&rwsem_##x)
 #define RWSI(x)init_rwsem(&rwsem_##x)

+#define WWAI(x)ww_acquire_init(x, &ww_lockdep)
+#define WWAD(x)ww_acquire_done(x)
+#define WWAF(x)ww_acquire_fini(x)
+
+#define WWL(x, c)  ww_mutex_lock(x, c)
+#define WWT(x) ww_mutex_trylock(x)
+#define WWL1(x)ww_mutex_lock(x, NULL)
+#define WWU(x) ww_mutex_unlock(x)
+
+
 #define LOCK_UNLOCK_2(x,y) LOCK(x); LOCK(y); UNLOCK(y); UNLOCK(x)

 /*
@@ -894,11 +910,13 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 # define I_RWLOCK(x)   lockdep_reset_lock(&rwlock_##x.dep_map)
 # define I_MUTEX(x)lockdep_reset_lock(&mutex_##x.dep_map)
 # define I_RWSEM(x)lockdep_reset_lock(&rwsem_##x.dep_map)
+# define I_WW(x)   lockdep_reset_lock(&x.dep_map)
 #else
 # define I_SPINLOCK(x)
 # define I_RWLOCK(x)
 # define I_MUTEX(x)
 # define I_RWSEM(x)
+# define I_WW(x)
 #endif

 #define I1(x)  \
@@ -920,11 +938,20 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 static void reset_locks(void)
 {
local_irq_disable();
+   lockdep_free_key_range(&ww_lockdep.acquire_key, 1);
+   lockdep_free_key_range(&ww_lockdep.mutex_key, 1);
+
I1(A); I1(B); I1(C); I1(D);
I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
+   I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base);
lockdep_reset();
I2(A); I2(B); I2(C); I2(D);
init_shared_classes();
+
+   ww_mutex_init(&o, &ww_lockdep); ww_mutex_init(&o2, &ww_lockdep);
+   memset(&t, 0, sizeof(t)); memset(&t2, 0, sizeof(t2));
+   memset(&ww_lockdep.acquire_key, 0, sizeof(ww_lockdep.acquire_key));
+   memset(&ww_lockdep.mutex_key, 0, sizeof(ww_lockdep.mutex_key));
local_irq_enable();
 }

@@ -938,7 +965,6 @@ static int unexpected_testcase_failures;
 static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
 {
unsigned long saved_preempt_count = preempt_count();
-   int expected_failure = 0;

WARN_ON(irqs_disabled());

@@ -946,26 +972,16 @@ static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
/*
 * Filter out expected failures:
 */
+   if (debug_locks != expected) {
 #ifndef CONFIG_PROVE_LOCKING
-   if ((lockclass_mask & LOCKTYPE_SPIN) && debug_locks != expected)
-   expected_failure = 1;
-   if ((lockclass_mask & LOCKTYPE_RWLOCK) && debug_locks != expected)
-   expected_failure = 1;
-   if ((lockclass_mask & LOCKTYPE_MUTEX) && debug_locks != expected)
-   expected_failure = 1;
-   if ((lockclass_mask & LOCKTYPE_RWSEM) && debug_locks != expected)
-   expected_failure = 1;
+   expected_testcase_failures++;
+   printk("failed|");
+#else
+   unexpected_testcase_failures++;
+   printk("FAILED|");
+
+   dump_stack();
 #endif
-   if (debug_locks != expected) {
-   if (expected_failure) {
-   expected_testcase_failures++;
-   printk("failed|");
-   } else {
-   unexpected_testcase_failures++;
-
-   printk("FAILED|");
-   dump_stack();
-   }
} else {
testcase_successes++;
printk("  ok  |");
@@ -1108,6 +1124,355 @@ static inline void print_testname(const char *testname)
DO_TESTCASE_6IRW(desc, name, 312);

[PATCH v4 4/4] mutex: w/w mutex slowpath debugging

2013-05-28 Thread Maarten Lankhorst

From: Daniel Vetter 

Injects EDEADLK conditions at pseudo-random interval, with exponential
backoff up to UINT_MAX (to ensure that every lock operation still
completes in a reasonable time).

This way we can test the wound slowpath even for ww mutex users where
contention is never expected, and the ww deadlock avoidance algorithm
is only needed for correctness against malicious userspace. An example
would be protecting kernel modesetting properties, which thanks to
single-threaded X isn't really expected to contend, ever.

I've looked into using the CONFIG_FAULT_INJECTION infrastructure, but
decided against it for two reasons:

- EDEADLK handling is mandatory for ww mutex users and should never
  affect the outcome of a syscall. This is in contrast to -ENOMEM
  injection. So fine configurability isn't required.

- The fault injection framework only allows setting a simple
  probability for failure. Now the probability that a ww mutex acquire
  stage with N locks will never complete (due to too many injected
  EDEADLK backoffs) is zero. But the expected number of ww_mutex_lock
  operations for the completely uncontended case would be O(exp(N)).
  The per-acquire ctx exponential backoff solution chosen here only
  results in O(log N) overhead due to injection and so O(log N * N)
  lock operations. This way we can fail with high probability (and so
  have good test coverage even for fancy backoff and lock acquisition
  paths) without running into pathological cases.

Note that EDEADLK will only ever be injected when we managed to
acquire the lock. This prevents any behaviour changes for users which
rely on the EALREADY semantics.

v2: Drop the cargo-culted __sched (I should read docs next time
around) and annotate the non-debug case with inline to prevent gcc
from doing something horrible.

v3: Rebase on top of Maarten's latest patches.

v4: Actually make this stuff compile, I had misplaced the hunk in the
wrong #ifdef block.

v5: Simplify ww_mutex_deadlock_injection definition, and fix
lib/locking-selftest.c warnings. Fix lib/Kconfig.debug definition
to work correctly. (mlankhorst)

v6:
Do not inject -EDEADLK when ctx->acquired == 0, because
the _slow paths are merged now. (mlankhorst)

Cc: Steven Rostedt 
Signed-off-by: Daniel Vetter 
Signed-off-by: Maarten Lankhorst 
---
 include/linux/mutex.h  |8 
 kernel/mutex.c |   44 +---
 lib/Kconfig.debug  |   13 +
 lib/locking-selftest.c |5 +
 4 files changed, 67 insertions(+), 3 deletions(-)
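
To see why the injection overhead stays logarithmic, the countdown
arithmetic from ww_mutex_deadlock_injection() below can be replayed in user
space. This is only a sketch: the sketch_* names are hypothetical, the lock
and unlock side is left out, and the initial countdown is assumed to be 0
(in the patch it is ctx->stamp & 0xf).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical mirror of the two debug fields added to ww_acquire_ctx. */
struct sketch_inject {
	unsigned interval;
	unsigned countdown;
};

/* Returns 1 when a fake -EDEADLK would be injected for this lock
 * operation. Each injection multiplies the interval by roughly 3.5,
 * saturating at UINT_MAX, exactly as in the hunk below. */
static int sketch_should_inject(struct sketch_inject *s)
{
	assert(s != NULL);
	if (s->countdown-- == 0) {
		unsigned tmp = s->interval;

		if (tmp > UINT_MAX / 4)
			tmp = UINT_MAX;
		else
			tmp = tmp * 2 + tmp + tmp / 2;

		s->interval = tmp;
		s->countdown = tmp;
		return 1;
	}
	return 0;
}

/* Counts the injections seen over lock_ops successful lock operations on
 * a fresh context (interval 1, assumed countdown 0). */
static unsigned sketch_count_injections(unsigned lock_ops)
{
	struct sketch_inject s = { .interval = 1, .countdown = 0 };
	unsigned injected = 0;

	for (unsigned i = 0; i < lock_ops; i++)
		injected += sketch_should_inject(&s);
	return injected;
}
```

Under these assumptions the intervals grow as 3, 10, 35, 122, so the
injections fall on lock operations 1, 5, 16, 52 and 175; one hundred lock
operations therefore see only four injected back-offs.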

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index f3ad181..2ff9178 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -95,6 +95,10 @@ struct ww_acquire_ctx {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map dep_map;
 #endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+   unsigned deadlock_inject_interval;
+   unsigned deadlock_inject_countdown;
+#endif
 };

 struct ww_mutex {
@@ -280,6 +284,10 @@ static inline void ww_acquire_init(struct ww_acquire_ctx *ctx,
 &ww_class->acquire_key, 0);
mutex_acquire(&ctx->dep_map, 0, 0, _RET_IP_);
 #endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+   ctx->deadlock_inject_interval = 1;
+   ctx->deadlock_inject_countdown = ctx->stamp & 0xf;
+#endif
 }

 /**
diff --git a/kernel/mutex.c b/kernel/mutex.c
index 75fc7c4..e40004b 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -508,22 +508,60 @@ mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)

 EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);

+static inline int
+ww_mutex_deadlock_injection(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+   unsigned tmp;
+
+   if (ctx->deadlock_inject_countdown-- == 0) {
+   tmp = ctx->deadlock_inject_interval;
+   if (tmp > UINT_MAX/4)
+   tmp = UINT_MAX;
+   else
+   tmp = tmp*2 + tmp + tmp/2;
+
+   ctx->deadlock_inject_interval = tmp;
+   ctx->deadlock_inject_countdown = tmp;
+   ctx->contending_lock = lock;
+
+   ww_mutex_unlock(lock);
+
+   return -EDEADLK;
+   }
+#endif
+
+   return 0;
+}

 int __sched
 __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
+   int ret;
+
might_sleep();
-   return __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
+   ret =  __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
   0, &ctx->dep_map, _RET_IP_, ctx);
+   if (!ret && ctx->acquired > 0)
+   return ww_mutex_deadlock_injection(lock, ctx);
+
+   return ret;
 }
 EXPORT_SYMBOL_GPL(__ww_mutex_lock);

 int __sched
 __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
+   int ret;
+
might_sleep();
-   return __mutex_lock_common(&lock->base, TASK_I

Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Inki Dae



> -Original Message-
> From: linux-fbdev-owner at vger.kernel.org [mailto:linux-fbdev-
> owner at vger.kernel.org] On Behalf Of Rob Clark
> Sent: Tuesday, May 28, 2013 10:49 PM
> To: Inki Dae
> Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho; Kyungmin
> Park; myungjoo.ham; DRI mailing list;
linux-arm-kernel at lists.infradead.org;
> linux-media at vger.kernel.org
> Subject: Re: Introduce a new helper framework for buffer synchronization
> 
> On Mon, May 27, 2013 at 11:56 PM, Inki Dae  wrote:
> >
> >
> >> -Original Message-
> >> From: linux-fbdev-owner at vger.kernel.org [mailto:linux-fbdev-
> >> owner at vger.kernel.org] On Behalf Of Rob Clark
> >> Sent: Tuesday, May 28, 2013 12:48 AM
> >> To: Inki Dae
> >> Cc: Maarten Lankhorst; Daniel Vetter; linux-fbdev; YoungJun Cho;
> Kyungmin
> >> Park; myungjoo.ham; DRI mailing list;
> > linux-arm-kernel at lists.infradead.org;
> >> linux-media at vger.kernel.org
> >> Subject: Re: Introduce a new helper framework for buffer
> synchronization
> >>
> >> On Mon, May 27, 2013 at 6:38 AM, Inki Dae  wrote:
> >> > Hi all,
> >> >
> >> > I have removed the previous branch and added a new one with more cleanup.
> >> > This time, the fence helper no longer includes the user-side interfaces
> >> > or the cache-operation code. We are not yet sure that coupling those two
> >> > things in the kernel (synchronizing caches, and synchronizing buffer
> >> > access between CPU and CPU, CPU and DMA, and DMA and DMA with fences)
> >> > is a good idea, and the existing user-side code also has problems with
> >> > badly behaved or crashing userspace. So this can be discussed further
> >> > later.
> >> >
> >> > The new branch is below:
> >> >
> >> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/?h=dma-fence-helper
> >> >
> >> > The fence helper code:
> >> >
> >> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005
> >> >
> >> > And example code for a device driver:
> >> >
> >> > https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae
> >> >
> >> > I think the time is not yet ripe for an RFC posting: the existing dma
> >> > fence and reservation work may need more review and additional work. So
> >> > I'd be glad if somebody could give opinions and advice in advance of the
> >> > RFC posting.
> >>
> >> thoughts from a *really* quick, pre-coffee, first look:
> >> * any sort of helper to simplify single-buffer sort of use-cases (v4l)
> >> probably wouldn't want to bake in assumption that seqno_fence is used.
> >> * I guess g2d is probably not actually a simple use case, since I
> >> expect you can submit blits involving multiple buffers :-P
> >
> > I don't think so. One or more buffers can be used: seqno_fence also has
> > only one buffer. Actually, we have already applied this approach to most
> > devices: multimedia, GPU and display controller. And this approach shows
> > better performance and reduced power consumption compared with the
> > traditional way. The g2d example is just to show you how to apply my
> > approach to a device driver.
> 
> no, you need the ww-mutex / reservation stuff any time you have
> multiple independent devices (or rings/contexts for hw that can
> support multiple contexts) which can do operations with multiple
> buffers.

I think I already use the reservation code that way everywhere, except for
ww-mutex. And I'm not sure that embedded systems really need ww-mutex. If
there is such a case, could you tell me what it is? I really need more
advice and understanding :)

Thanks,
Inki Dae

> So you could conceivably hit this w/ gpu + g2d if multiple
> buffers were shared between the two.  vram migration and such
> 'desktop stuff' might make the problem worse, but just because you
> don't have vram doesn't mean you don't have a problem with multiple
> buffers.
> 
> >> * otherwise, you probably don't want to depend on dmabuf, which is why
> >> reservation/fence is split out the way it is..  you want to be able to
> >> use a single reservation/fence mechanism within your driver without
> >> having to care about which buffers are exported to dmabuf's and which
> >> are not.  Creating a dmabuf for every GEM bo is too heavyweight.
> >
> > Right. But I think we should deal with this separately. Actually, we are
> > trying to use reservation for gpu pipeline synchronization such as the sgx
> > sync object, and this approach is used without dmabuf. In other words, some
> > devices can use only reservation for such pipeline synchronization and, at
> > the same time, the fence helper or a similar thing with dmabuf for buffer
> > synchronization.
> 
> it is probably easier to approach from the reverse direction.. ie, get
> reservation/synchronization right first, and then dmabuf.  (Well, that
> isn't really a problem becau

[Bug 65085] New: [radeonsi LLVM] Segfault during OpenCL kernel compilation

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=65085

          Priority: medium
            Bug ID: 65085
          Assignee: dri-devel at lists.freedesktop.org
           Summary: [radeonsi LLVM] Segfault during OpenCL kernel compilation
          Severity: normal
    Classification: Unclassified
                OS: All
          Reporter: niels_ole at salscheider-online.de
          Hardware: Other
            Status: NEW
           Version: git
         Component: Drivers/Gallium/radeonsi
           Product: Mesa

Created attachment 79901
  --> https://bugs.freedesktop.org/attachment.cgi?id=79901&action=edit
Kernel that causes the segfault

I get a segmentation fault in LLVM with Tom Stellard's recent radeonsi compute
patches
(http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130520/175743.html)
with the attached kernel.

I can work around the issue either by executing the if-block in the kernel
unconditionally or by applying the attached patch to LLVM.
Desc.OpInfo[0].RegClass equals -1 when the segmentation fault occurs.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 65085] [radeonsi LLVM] Segfault during OpenCL kernel compilation

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=65085

--- Comment #1 from Niels Ole Salscheider ---
Created attachment 79903
  --> https://bugs.freedesktop.org/attachment.cgi?id=79903&action=edit
Patch to work around the issue

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 65085] [radeonsi LLVM] Segfault during OpenCL kernel compilation

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=65085

--- Comment #2 from Niels Ole Salscheider ---
Created attachment 79904
  --> https://bugs.freedesktop.org/attachment.cgi?id=79904&action=edit
Full backtrace

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 58901] "trying to bind memory to uninitialized GART" error at resume from suspend to memory

2013-05-28 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=58901





--- Comment #1 from Michel Dänzer 2013-05-28 16:31:21 ---
Can you bisect between 3.8.x and 3.9.x?

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.

Introduce a new helper framework for buffer synchronization

2013-05-28 Thread Daniel Vetter

On Tue, May 28, 2013 at 4:50 PM, Inki Dae wrote:
> I think I already use the reservation code that way everywhere, except for
> ww-mutex. And I'm not sure that embedded systems really need ww-mutex. If
> there is such a case, could you tell me what it is? I really need more
> advice and understanding :)

If you have only one driver, you can get away without ww_mutex.
drm/i915 does it, all buffer state is protected by dev->struct_mutex.
But as soon as you have multiple drivers sharing buffers with dma_buf
things will blow up.

Yep, current prime is broken and can lead to deadlocks.

In practice it doesn't (yet) matter since only the X server does the
sharing dance, and that one's single-threaded. Now you can claim that
since you have all buffers pinned in embedded gfx anyway, you don't
care. But both in desktop gfx and embedded gfx the real fun starts
once you put fences into the mix and link them up with buffers, then
every command submission risks that deadlock. Furthermore you can get
unlucky and construct a circle of fences waiting on each other (though
only if the fence signalling fires off the next batchbuffer
asynchronously).

To prevent such deadlocks you _absolutely_ need to lock _all_ buffers
that take part in a command submission at once. To do that you either
need a global lock (ugh) or ww_mutexes.

So ww_mutexes are the fundamental ingredient of all this, if you don't
see why you need them then everything piled on top is broken. I think
until you've understood why exactly we need ww_mutexes there's not
much point in discussing the finer issues of fences, reservation
objects and how to integrate it with dma_bufs exactly.

I'll try to clarify the motivating example in the ww_mutex
documentation a bit, but I dunno how else I could explain this ...
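
For illustration only, here is a minimal userspace sketch of the "lock every
buffer of a submission at once, and back off on contention" idea, written with
pthreads rather than the kernel API. The names struct buf, lock_all() and
unlock_all() are made up for this example, and it assumes each buffer appears
only once in the list. Real ww_mutexes additionally use the acquire context to
decide which task has to back off, which this sketch does not model.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared buffer object. */
struct buf {
	pthread_mutex_t lock;
};

/*
 * Acquire every lock in bufs[0..n-1] without relying on a global lock
 * order. On contention, drop everything held so far, sleep on the
 * contended lock while holding nothing else, then retry with that
 * lock already held. Assumes the list contains no duplicates.
 */
static void lock_all(struct buf **bufs, size_t n)
{
	size_t i;
	size_t held = n;	/* index of the lock taken by blocking; n = none */

retry:
	for (i = 0; i < n; i++) {
		size_t j;

		if (i == held)
			continue;	/* already held from the slow path */
		if (pthread_mutex_trylock(&bufs[i]->lock) == 0)
			continue;

		/* Back off: release everything acquired so far. */
		for (j = 0; j < i; j++)
			pthread_mutex_unlock(&bufs[j]->lock);
		if (held != n && held > i)
			pthread_mutex_unlock(&bufs[held]->lock);

		/* Slow path: block on the contended lock, then start over. */
		pthread_mutex_lock(&bufs[i]->lock);
		held = i;
		goto retry;
	}
}

/* Release every lock taken by lock_all(). */
static void unlock_all(struct buf **bufs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		pthread_mutex_unlock(&bufs[i]->lock);
}
```

A caller would bracket each command submission with lock_all(bufs, n) and
unlock_all(bufs, n). Because a thread only ever sleeps while holding no other
lock, two submissions that list overlapping buffers in different orders cannot
deadlock each other in this sketch, although they may have to retry.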

Yours, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[Bug 65091] New: power_profile not working for HD5650

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=65091

          Priority: medium
            Bug ID: 65091
          Assignee: dri-devel at lists.freedesktop.org
           Summary: power_profile not working for HD5650
          Severity: major
    Classification: Unclassified
                OS: Linux (All)
          Reporter: nevehanter at gmail.com
          Hardware: x86-64 (AMD64)
            Status: NEW
           Version: git
         Component: Drivers/DRI/R600
           Product: Mesa

I have an HD5650 with Intel i3 Ironlake (1st gen) switchable graphics.
My problem is that changing power_profile in radeon has no effect.
I have enabled KMS and use vgaswitcheroo to disable i915 and switch X to radeon.
Changing any of the radeon kernel parameters has no effect either.

The clocks always stay the same:

For auto/low/mid/high profile
# cat /sys/kernel/debug/dri/0/radeon_pm_info
default engine clock: 10 kHz
current engine clock: 0 kHz
default memory clock: 157000 kHz
current memory clock: 156930 kHz
voltage: 1000 mV
PCIE lanes: 16

It should be engine: 55 kHz and memory: 80 kHz (values taken from
Windows).

I would like to know whether you could provide a fix for it; I am happy to
help by providing any information needed.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[REGRESSION] system does not resume from ram due to commit "drm/nv50/fifo: prevent races between clients updating playlists"

2013-05-28 Thread Sven Joachim

On 2013-05-26 23:09 +0200, Maarten Maathuis wrote:

> My NV96 does not resume from suspend to ram (the screen stays black, magic
> sysrq keys do work) with the current Linus git kernel; I bisected it to the
> following commit.
>
> drm/nv50/fifo: prevent races between clients updating playlists
> b5096566f6e1ee2b88324772f020ae9bc0cfa9a0
>
> It's not obvious to me how this causes problems, but reverting this commit
> does solve my problem.

Same here on my NV86.

Cheers,
   Sven

[PATCH v4 3/4] mutex: Add ww tests to lib/locking-selftest.c. v4

2013-05-28 Thread Daniel Vetter

On Tue, May 28, 2013 at 04:48:45PM +0200, Maarten Lankhorst wrote:
> This stresses the lockdep code in some ways specifically useful to
> ww_mutexes. It adds checks for most of the common locking errors.
> 
> Changes since v1:
>  - Add tests to verify reservation_id is untouched.
>  - Use L() and U() macros where possible.
> 
> Changes since v2:
>  - Use the ww_mutex api directly.
>  - Use macros for most of the code.
> Changes since v3:
>  - Rework tests for the api changes.
> 
> Signed-off-by: Maarten Lankhorst 
> ---
>  lib/locking-selftest.c |  405 ++--
>  1 file changed, 386 insertions(+), 19 deletions(-)
> 
> diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
> index c3eb261..b18f1d3 100644
> --- a/lib/locking-selftest.c
> +++ b/lib/locking-selftest.c
> @@ -26,6 +26,8 @@
>   */
>  static unsigned int debug_locks_verbose;
>  
> +static DEFINE_WW_CLASS(ww_lockdep);
> +
>  static int __init setup_debug_locks_verbose(char *str)
>  {
>   get_option(&str, &debug_locks_verbose);
> @@ -42,6 +44,10 @@ __setup("debug_locks_verbose=", setup_debug_locks_verbose);
>  #define LOCKTYPE_RWLOCK  0x2
>  #define LOCKTYPE_MUTEX   0x4
>  #define LOCKTYPE_RWSEM   0x8
> +#define LOCKTYPE_WW  0x10
> +
> +static struct ww_acquire_ctx t, t2;
> +static struct ww_mutex o, o2;
>  
>  /*
>   * Normal standalone locks, for the circular and irq-context
> @@ -193,6 +199,16 @@ static void init_shared_classes(void)
>  #define RSU(x)   up_read(&rwsem_##x)
>  #define RWSI(x)  init_rwsem(&rwsem_##x)
>  
> +#define WWAI(x)  ww_acquire_init(x, &ww_lockdep)
> +#define WWAD(x)  ww_acquire_done(x)
> +#define WWAF(x)  ww_acquire_fini(x)
> +
> +#define WWL(x, c)  ww_mutex_lock(x, c)
> +#define WWT(x)   ww_mutex_trylock(x)
> +#define WWL1(x)  ww_mutex_lock(x, NULL)
> +#define WWU(x)   ww_mutex_unlock(x)
> +
> +
>  #define LOCK_UNLOCK_2(x,y)   LOCK(x); LOCK(y); UNLOCK(y); UNLOCK(x)
>  
>  /*
> @@ -894,11 +910,13 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
>  # define I_RWLOCK(x) lockdep_reset_lock(&rwlock_##x.dep_map)
>  # define I_MUTEX(x)  lockdep_reset_lock(&mutex_##x.dep_map)
>  # define I_RWSEM(x)  lockdep_reset_lock(&rwsem_##x.dep_map)
> +# define I_WW(x) lockdep_reset_lock(&x.dep_map)
>  #else
>  # define I_SPINLOCK(x)
>  # define I_RWLOCK(x)
>  # define I_MUTEX(x)
>  # define I_RWSEM(x)
> +# define I_WW(x)
>  #endif
>  
>  #define I1(x)\
> @@ -920,11 +938,20 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
>  static void reset_locks(void)
>  {
>   local_irq_disable();
> + lockdep_free_key_range(&ww_lockdep.acquire_key, 1);
> + lockdep_free_key_range(&ww_lockdep.mutex_key, 1);
> +
>   I1(A); I1(B); I1(C); I1(D);
>   I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
> + I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base);
>   lockdep_reset();
>   I2(A); I2(B); I2(C); I2(D);
>   init_shared_classes();
> +
> + ww_mutex_init(&o, &ww_lockdep); ww_mutex_init(&o2, &ww_lockdep);
> + memset(&t, 0, sizeof(t)); memset(&t2, 0, sizeof(t2));
> + memset(&ww_lockdep.acquire_key, 0, sizeof(ww_lockdep.acquire_key));
> + memset(&ww_lockdep.mutex_key, 0, sizeof(ww_lockdep.mutex_key));
>   local_irq_enable();
>  }
>  
> @@ -938,7 +965,6 @@ static int unexpected_testcase_failures;
>  static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
>  {
>   unsigned long saved_preempt_count = preempt_count();
> - int expected_failure = 0;
>  
>   WARN_ON(irqs_disabled());
>  
> @@ -946,26 +972,16 @@ static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
>   /*
>* Filter out expected failures:
>*/
> + if (debug_locks != expected) {
>  #ifndef CONFIG_PROVE_LOCKING
> - if ((lockclass_mask & LOCKTYPE_SPIN) && debug_locks != expected)
> - expected_failure = 1;
> - if ((lockclass_mask & LOCKTYPE_RWLOCK) && debug_locks != expected)
> - expected_failure = 1;
> - if ((lockclass_mask & LOCKTYPE_MUTEX) && debug_locks != expected)
> - expected_failure = 1;
> - if ((lockclass_mask & LOCKTYPE_RWSEM) && debug_locks != expected)
> - expected_failure = 1;
> + expected_testcase_failures++;
> + printk("failed|");
> +#else
> + unexpected_testcase_failures++;
> + printk("FAILED|");
> +
> + dump_stack();
>  #endif
> - if (debug_locks != expected) {
> - if (expected_failure) {
> - expected_testcase_failures++;
> - printk("failed|");
> - } else {
> - unexpected_testcase_failures++;
> -
> - printk("FAILED|");
> -

[Bug 65095] New: BARTS [drm:r600_uvd_init] ERROR UVD not responding, trying to reset the VCPU!!!

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=65095

          Priority: medium
            Bug ID: 65095
          Assignee: dri-devel at lists.freedesktop.org
           Summary: BARTS [drm:r600_uvd_init] *ERROR* UVD not responding,
                    trying to reset the VCPU!!!
          Severity: major
    Classification: Unclassified
                OS: Linux (All)
          Reporter: spamjunkeater at gmail.com
          Hardware: x86-64 (AMD64)
            Status: NEW
           Version: XOrg CVS
         Component: DRM/Radeon
           Product: DRI

UVD is not working on BARTS (HD6850) with Linux 3.10-rc2.

I get
[drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
errors in the log.

I believe it is the same issue as bug 63935
(https://bugs.freedesktop.org/show_bug.cgi?id=63935).

Thanks.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[PULL] drm-intel-next for 3.11

2013-05-28 Thread Daniel Vetter

Hi Dave,

So I've figured it's time to open up drm-next with a nice pile of intel
patches. And there seems to be some other stuff pending on dri-devel
already, too ;-)

Highlights (copy-pasted from my testing cycle mails):
- fbc support for Haswell (Rodrigo)
- streamlined workaround comments, including an igt tool to grep for
  them (Damien)
- sdvo and TV out cleanups, including a fixup for sdvo multifunction devices
- refactor our eDP mess a bit (Imre)
- don't register the hdmi connector on haswell when desktop eDP is present
- vlv support is no longer preliminary!
- more vlv fixes from Jesse for stolen and dpll handling
- more flexible power well checking infrastructure from Paulo
- a few gtt patches from Ben
- a bit of OCD cleanups for transcoder #defines and an assorted pile
  of smaller things.
- fixes for the gmch modeset sequence
- a bit of OCD around plane/pipe usage (Ville)
- vlv turbo support (Jesse)
- tons of vlv modeset fixes (Jesse et al.)
- vlv pte write fixes (Kenneth Graunke)
- hpd filtering to avoid costly probes on unaffected outputs (Egbert Eich)
- intel dev_info cleanups and refactorings (Damien)
- vlv rc6 support (Jesse)
- random pile of fixes around non-24bpp modes handling
- asle/opregion cleanups and locking fixes (Jani)
- dp dpll refactoring
- improvements for reduced_clock computation on g4x/ilk+
- pfit state refactored to use pipe_config (Jesse)
- lots more computed modeset state moved to pipe_config, including readout
  and cross-check support
- fdi auto-dithering for ivb B/C links, using the neat pipe_config
  improvements
- drm_rect helpers plus sprite clipping fixes (Ville)
- hw context refcounting (Mika + Ben)

Note that the merge with Linus' tree was a bit messy so I've also pushed
out a 2nd tag drm-intel-next-2013-05-20-merged which has the backmerge
which is already in my queue. Pull request for the merged tree below. Just
drop the -merged suffix if you want to have some fun ;-)

Cheers, Daniel


The following changes since commit c7788792a5e7b0d5d7f96d0766b4cb6112d47d75:

  Linux 3.10-rc2 (2013-05-20 14:37:38 -0700)

are available in the git repository at:

  git://people.freedesktop.org/~danvet/drm-intel tags/drm-intel-next-2013-05-20-merged

for you to fetch changes up to e1b73cba13a0cc68dd4f746eced15bd6bb24cda4:

  Merge tag 'v3.10-rc2' into drm-intel-next-queued (2013-05-21 09:52:16 +0200)



Ben Widawsky (3):
  drm/i915: Assert mutex_is_locked on context lookup
  drm/i915: BUG_ON bad PPGTT offset
  drm/i915: Extract PDE writes

Chris Wilson (2):
  drm/i915: Only print the info message about incresing stolen size for FBC once
  drm/i915: put context upon switching

Damien Lespiau (12):
  drm/i915: Remove mention of Haswell in DDI code
  drm/i915: Turn DEV_INFO_FLAGS into a foreach style macro
  drm/i915: Replace the line of %s by a DEV_INFO_FOR_EACH_FLAG() invocation
  drm/i915: Use DEV_INFO_FOR_EACH_FLAG() to declare flags as well
  drm/i915: Turn HAS_DDI() into a device_info flag
  drm/i915: Introduce HAS_FPGA_DBG_UNCLAIMED()
  drm/i915: Turn HAS_FPGA_DBG_UNCLAIMED into a device_info flag
  drm/i915: Ivybridge is the odd one when it comes to pipe scalers
  drm/i915: Add platform information to implemented workarounds
  drm/i915: Add references to some workaround we implement
  drm/i915: Compute WR PLL dividers dynamically
  drm/i915: Add missing platform tags to FBC workaround comments

Daniel Vetter (56):
  drm/i915: don't enable the plane too early in i9xx_crtc_mode_set
  drm/i915: drop redundant vblank waits
  drm/i915: add pipe asserts for the crtc enable sequence
  drm/i915: add i9xx pfit pipe asserts
  drm/i915: move debug output back to the right place
  drm/i915: fix VLV limits
  drm/i915: magic VLV PLL registers in the dpio sideband
  drm/i915: disable interrupts earlier in the driver unload code
  drm/i915: Disable high-bpc on pre-1.4 EDID screens
  drm/i915: Fixup non-24bpp support for VGA screens on Haswell
  drm/i915: consolidate pch pll computations a bit
  drm/i915: shovel compute clock into crtc->config.dpll on ilk
  drm/i915: move dp clock computations to encoder->compute_config
  drm/i915: use pipe_config for lvds dithering
  drm/i915: don't force matching p1 for g4x/ilk+ reduced pll settings
  drm/i915: remove redundant has_pch_encoder check
  drm/i915: simplify config->pixel_multiplier handling
  drm/i915: put the right cpu_transcoder into pipe_config for hw state readout
  drm/i915: force bpp for eDP panels
  drm/i915: drop adjusted_mode from *_set_pipeconf functions
  drm/i915: implement high-bpc + pipeconf-dither support for g4x/vlv
  drm/i915: allow high-bpc modes on DP
  drm/i915: move intel_crtc->fdi_lanes to pipe_config
  drm/i915: hw state readout support for pipe_config->fdi_lanes
  drm/i915: split up fdi_

[PATCH 2/6] gpu: host1x: Fix syncpoint wait return value

2013-05-28 Thread Keith Packard

Thierry Reding writes:


> That doesn't sound right. Maybe drmIoctl() needs fixing instead. Looking
> at the history, drmIoctl() was introduced to automatically loop if a
> signal was received (commit 8b9ab108ec1f2ba2b503f713769c4946849b3cb2).
> However the ioctl(3p) manpage doesn't mention that ioctl() returns
> EAGAIN in case it is interrupted by a signal.

EAGAIN is being returned when the GPU is wedged to ask the application
to re-submit the request, which will presumably be held until the GPU
is un-wedged.
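
A minimal sketch of the retry pattern under discussion, assuming a wrapper
that re-issues a request while it fails with EINTR or EAGAIN. The names
request_fn, retry_request() and flaky_request() are hypothetical and stand in
for ioctl(fd, request, arg); this is not a copy of libdrm's code.

```c
#include <errno.h>

/* Hypothetical callback standing in for ioctl(fd, request, arg). */
typedef int (*request_fn)(void *arg);

/*
 * Re-issue the request while it fails with EINTR (interrupted by a
 * signal) or EAGAIN (asked to try again); return any other result.
 */
static int retry_request(request_fn fn, void *arg)
{
	int ret;

	do {
		ret = fn(arg);
	} while (ret == -1 && (errno == EINTR || errno == EAGAIN));

	return ret;
}

/* Demo request: fails with EAGAIN until the counter reaches zero. */
static int flaky_request(void *arg)
{
	int *remaining_failures = arg;

	if (*remaining_failures > 0) {
		(*remaining_failures)--;
		errno = EAGAIN;
		return -1;
	}
	return 0;
}
```

With such a wrapper, an application never sees EAGAIN from the request itself:
the loop only ends on success or on some other error.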

-- 
keith.packard at intel.com

[PATCH v4 3/4] mutex: Add ww tests to lib/locking-selftest.c. v4

2013-05-28 Thread Maarten Lankhorst

On 28-05-13 21:18, Daniel Vetter wrote:
> On Tue, May 28, 2013 at 04:48:45PM +0200, Maarten Lankhorst wrote:
>> This stresses the lockdep code in some ways specifically useful to
>> ww_mutexes. It adds checks for most of the common locking errors.
>>
>> Changes since v1:
>>  - Add tests to verify reservation_id is untouched.
>>  - Use L() and U() macros where possible.
>>
>> Changes since v2:
>>  - Use the ww_mutex api directly.
>>  - Use macros for most of the code.
>> Changes since v3:
>>  - Rework tests for the api changes.
>>
>> 
>>
>> +static void ww_test_normal(void)
>> +{
>> +	int ret;
>> +
>> +	WWAI(&t);
>> +
>> +	/*
>> +	 * test if ww_id is kept identical if not
>> +	 * called with any of the ww_* locking calls
>> +	 */
>> +
>> +	/* mutex_lock (and indirectly, mutex_lock_nested) */
>> +	o.ctx = (void *)~0UL;
>> +	mutex_lock(&o.base);
>> +	mutex_unlock(&o.base);
>> +	WARN_ON(o.ctx != (void *)~0UL);
>> +
>> +	/* mutex_lock_interruptible (and *_nested) */
>> +	o.ctx = (void *)~0UL;
>> +	ret = mutex_lock_interruptible(&o.base);
>> +	if (!ret)
>> +		mutex_unlock(&o.base);
>> +	else
>> +		WARN_ON(1);
>> +	WARN_ON(o.ctx != (void *)~0UL);
>> +
>> +	/* mutex_lock_killable (and *_nested) */
>> +	o.ctx = (void *)~0UL;
>> +	ret = mutex_lock_killable(&o.base);
>> +	if (!ret)
>> +		mutex_unlock(&o.base);
>> +	else
>> +		WARN_ON(1);
>> +	WARN_ON(o.ctx != (void *)~0UL);
>> +
>> +	/* trylock, succeeding */
>> +	o.ctx = (void *)~0UL;
>> +	ret = mutex_trylock(&o.base);
>> +	WARN_ON(!ret);
>> +	if (ret)
>> +		mutex_unlock(&o.base);
>> +	else
>> +		WARN_ON(1);
>> +	WARN_ON(o.ctx != (void *)~0UL);
>> +
>> +	/* trylock, failing */
>> +	o.ctx = (void *)~0UL;
>> +	mutex_lock(&o.base);
>> +	ret = mutex_trylock(&o.base);
>> +	WARN_ON(ret);
>> +	mutex_unlock(&o.base);
>> +	WARN_ON(o.ctx != (void *)~0UL);
>> +
>> +	/* nest_lock */
>> +	o.ctx = (void *)~0UL;
>> +	mutex_lock_nest_lock(&o.base, &t);
>> +	mutex_unlock(&o.base);
>> +	WARN_ON(o.ctx != (void *)~0UL);
>> +}
> Since we don't really allow this any more (instead allow ww_mutex_lock
> without context) do we need this test here really?
Yes. This test verifies all the normal locking paths are not affected by the 
ww_ctx changes.

>> +
>> +static void ww_test_two_contexts(void)
>> +{
>> +	WWAI(&t);
>> +	WWAI(&t2);
>> +}
>> +
>> +static void ww_test_context_unlock_twice(void)
>> +{
>> +	WWAI(&t);
>> +	WWAD(&t);
>> +	WWAF(&t);
>> +	WWAF(&t);
>> +}
>> +
>> +static void ww_test_object_unlock_twice(void)
>> +{
>> +	WWL1(&o);
>> +	WWU(&o);
>> +	WWU(&o);
>> +}
>> +
>> +static void ww_test_spin_nest_unlocked(void)
>> +{
>> +	raw_spin_lock_nest_lock(&lock_A, &o.base);
>> +	U(A);
>> +}
> I don't quite see the point of this one here ...
It's a lockdep test that was missing. o.base is not locked, so lock_A is being
nested into an unlocked lock, which results in a lockdep error.

>> +
>> +static void ww_test_unneeded_slow(void)
>> +{
>> +	int ret;
>> +
>> +	WWAI(&t);
>> +
>> +	ww_mutex_lock_slow(&o, &t);
>> +}
> I think checking the _slow debug stuff would be neat, i.e.
> - fail/success tests for properly unlocking all held locks
> - fail/success tests for lock_slow acquiring the right lock.
>
> Otherwise I didn't spot anything that seems missing in these self-tests
> here.
>
Yes, it would be nice; doing so is left as an exercise for the reviewer, who
failed to raise this point sooner. ;-)

~Maarten

[Bug 65095] BARTS [drm:r600_uvd_init] ERROR UVD not responding, trying to reset the VCPU!!!

2013-05-28 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=65095

Erdem U. Altınyurt changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

-- 
You are receiving this mail because:
You are the assignee for the bug.
