date:20160112

[Bug 93663] Stuck on screen blanking/dmps/monitor turned off/on

2016-01-12 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=93663

Michel Dänzer changed:

       What|Removed                       |Added
-----------+------------------------------+----------------------------------
  Component|Driver/AMDgpu                 |DRM/AMDgpu
   Assignee|xorg-driver-ati at lists.x.org|dri-devel at lists.freedesktop.org
    Product|xorg                          |DRI
 QA Contact|xorg-team at lists.x.org      |

--- Comment #3 from Michel Dänzer ---
Unless it works with the modesetting driver, it's probably a kernel driver
issue.

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<http://lists.freedesktop.org/archives/dri-devel/attachments/20160112/f85f4a1c/attachment-0001.html>

[Bug 93658] Distortions on the right of the monitor

2016-01-12 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=93658

Michel Dänzer changed:

       What|Removed                       |Added
-----------+------------------------------+----------------------------------
  Component|Driver/AMDgpu                 |DRM/AMDgpu
   Assignee|xorg-driver-ati at lists.x.org|dri-devel at lists.freedesktop.org
    Product|xorg                          |DRI
 QA Contact|xorg-team at lists.x.org      |


[Bug 93653] Crash while using GALLIUM_HUD

2016-01-12 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=93653

Michel Dänzer changed:

       What|Removed                           |Added
-----------+----------------------------------+---------------------------------
  Component|DRM/Radeon                        |Mesa core
    Version|XOrg git                          |git
   Assignee|dri-devel at lists.freedesktop.org|mesa-dev at lists.freedesktop.org
    Product|DRI                               |Mesa
 QA Contact|                                  |mesa-dev at lists.freedesktop.org


[Bug 92923] SGPR spilling

2016-01-12 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=92923

--- Comment #11 from Nicolai Hähnle ---
With the branch at http://cgit.freedesktop.org/~nh/mesa/log/?h=pub-invalidate
the in-game part of your trace plays back at around 11 FPS on a Carrizo, which
is an integrated GPU (laptop). It may be worth a shot.


[PATCH 13/22] drm/exynos: Remove event cancelling from postclose

2016-01-12 Thread Inki Dae

Hi Daniel,

It seems your patch is exactly the same as the one I posted before:
http://www.spinics.net/lists/dri-devel/msg97922.html

Anyway, it's ok if this patch can go to mainline.

Acked-by: Inki Dae 

On 2016-01-12 06:41, Daniel Vetter wrote:
> The core takes care of this now. And since kfree(NULL) is ok we can
> simplify the function even further now.
> 
> Cc: Inki Dae 
> Acked-by: Daniel Stone 
> Reviewed-by: Alex Deucher 
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/exynos/exynos_drm_drv.c | 14 --
>  1 file changed, 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c 
> b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> index 9756797a15a5..868ab9f54f17 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> @@ -335,20 +335,6 @@ static void exynos_drm_preclose(struct drm_device *dev,
>  
>  static void exynos_drm_postclose(struct drm_device *dev, struct drm_file 
> *file)
>  {
> - struct drm_pending_event *e, *et;
> - unsigned long flags;
> -
> - if (!file->driver_priv)
> - return;
> -
> - spin_lock_irqsave(&dev->event_lock, flags);
> - /* Release all events handled by page flip handler but not freed. */
> - list_for_each_entry_safe(e, et, &file->event_list, link) {
> - list_del(&e->link);
> - e->destroy(e);
> - }
> - spin_unlock_irqrestore(&dev->event_lock, flags);
> -
>   kfree(file->driver_priv);
>   file->driver_priv = NULL;
>  }
>
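
The commit message above leans on kfree(NULL) being a no-op, which is why the early return on a NULL file->driver_priv can go along with the event loop. A minimal userspace sketch of the same pattern follows. It assumes free() as a stand-in for kfree(); struct file_ctx and ctx_postclose() are hypothetical names and not part of the exynos driver.

```c
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct drm_file. */
struct file_ctx {
    void *driver_priv;
};

/*
 * Mirrors the shape of the simplified postclose hook: no NULL check is
 * needed because free(NULL), like kfree(NULL), is defined to do nothing.
 */
static void ctx_postclose(struct file_ctx *file)
{
    free(file->driver_priv);
    file->driver_priv = NULL; /* makes a second call harmless as well */
}
```

Calling ctx_postclose() on a context whose driver_priv was never allocated, or calling it twice, is safe in this sketch.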

[PATCH 5/5] drm: Enable markdown^Wasciidoc for gpu.tmpl

2016-01-12 Thread Jani Nikula

On Tue, 12 Jan 2016, Jonathan Corbet  wrote:
> In my mind, there's clearly no good that can come from (further) delaying
> something that works in favor of an "it would be nice" that may never
> even exist.  So I'm currently thinking that I'll pull this into the docs
> tree once the merge window is done, with the plan to push it for 4.6.
> Then we can see if anybody screams.

Must... resist... urge to bikeshed about the choice of markup...

> The build-time increase is painful in the extreme - about a factor of
> three for a -j1 build, and that's with only one file using the feature.
> It feels wrong, somehow, for the docs build to take longer than building
> the kernel itself.  Can we do something about that?

"Holy big-O, batman. Asciidoc appears to be quadractically slow." [1]

Fortunately the same quote led me to asciidoctor [2], which was maybe
twice as fast as asciidoc. An improvement, but could be much better.

BR,
Jani.


[1] https://twitter.com/marijnjh/status/473935469676216321
[2] http://asciidoctor.org/docs/asciidoc-asciidoctor-diffs/


-- 
Jani Nikula, Intel Open Source Technology Center

[PATCH v3] drm/exynos: fix kernel panic issue at drm releasing

2016-01-12 Thread Inki Dae

Hi Daniel,

On 2016-01-12 04:00, Daniel Stone wrote:
> Hi Inki,
> 
> On 8 January 2016 at 08:46, Inki Dae  wrote:
>> Changelog v3:
>> - initialize only device specific things. Each page flip event object
>>   is created by DRM core so DRM core should release the object including
>>   incrementing event space.
> 
> I'm a bit confused here; we no longer call event->base.destroy(),
> because you say that the DRM core should release it. But how does the
> DRM core know to release the event? From the core point of view, the
> event disappears into the driver, and it is no longer tracked.

DRM core would need something to track the events. I think basically, whoever
created an object should also destroy it.

> 
> As Daniel says though, later versions handle all this in the core in a
> much more clean way, so we can remove these from the drivers then.

So I think it's not reasonable for a specific driver to destroy an object
created by the core, even though there is a memory leak. However, the memory
leak is more critical than having temporary code.
Ok, I will merge this patch with additional comments saying that the object
will be destroyed by the core later.

Thanks,
Inki Dae

> 
> Cheers,
> Daniel
> 
>

[Bug 93663] Stuck on screen blanking/dmps/monitor turned off/on

2016-01-12 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=93663

--- Comment #4 from jody.frankowski at gmail.com ---
I thought it might be, because I also noticed that if I boot with the monitor
turned off, and then turn it on, the screen will never turn on.


[PATCH 5/5] drm: Enable markdown^Wasciidoc for gpu.tmpl

2016-01-12 Thread Daniel Vetter

On Mon, Jan 11, 2016 at 06:12:12PM -0700, Jonathan Corbet wrote:
> On Sat, 12 Dec 2015 12:13:45 +0100
> Daniel Vetter  wrote:
> 
> > I just figured there's no way this could get in, and I'd
> > much rather improve the docs themselves than trying to convince core
> > kernel folks that this might be useful.
> 
> So I'm not quite sure why you figured that; I never said it, certainly.

To clarify, this wasn't really my impression of your stance, but of the
overall room opinion when we had the discussion at KS. And then my main
goal here is to write great docs for drm (we have about 3k lines more docs
in 4.5 already), so that's why I dropped the ball on upstreaming. It
seemed unlikely to succeed, at least without some really serious effort at
convincing everyone, all while the drm docs for atomic haven't been in
good shape yet. Since then we had a few contributors of new atomic drivers
note on irc already that "oh cool, this is documented now". Overall it really
just boils down to what I see as the most important things for drm ;-)

> I've been messing with it a bit, seems to work.  I do still wish we could
> consider alternatives, especially those that might simplify the toolchain
> rather than complicating it.  But it's clear that I'm not succeeding in
> finding time to actually explore that idea; the contents of $EXCUSES are
> good, but the end result is the same.  And the patch fairy just isn't
> coming through for me on this one.
> 
> In my mind, there's clearly no good that can come from (further) delaying
> something that works in favor of an "it would be nice" that may never
> even exist.  So I'm currently thinking that I'll pull this into the docs
> tree once the merge window is done, with the plan to push it for 4.6.
> Then we can see if anybody screams.
> 
> That gives a couple of weeks for an updated patch set, should you have
> one.
> 
> The build-time increase is painful in the extreme - about a factor of
> three for a -j1 build, and that's with only one file using the feature.
> It feels wrong, somehow, for the docs build to take longer than building
> the kernel itself.  Can we do something about that?
> 
>  - How many of the comments actually use asciidoc features?  Might there
>be some possibility of detecting those in kernel-doc and skipping the
>callout to asciidoc when it's not needed?

I think that amounts to writing a partial parser (we use asciidoc for
tables, lists, links, formatting, code snippets by now already, someone
even thought of using the asciiart->png feature it has but it's not yet
wired up). I don't think it's feasible.

>  - Pandoc seems to do asciidoc.  I still don't like the idea of depending
>on it for this to work, but having the *option* to use it is fine.  If
>it's really that much faster (yes, Python startup is painful) then
>maybe providing the option is worth it.

Hm, Dave asked me to convert to use python-based asciidoc instead of
haskell-based pandoc.

>  - All over the kernel we've seen that batching improves performance.  It
>would take a bit of work, but I bet kernel-doc could put together all
>the snippets from one file, pass them through a single asciidoc
>invocation, then split the results back apart.  That would probably
>eliminate the performance hit entirely.
> 
> None of that is a condition for pulling this stuff in, but can it be
> looked into?

Besides what Jani mentioned, that asciidoctor should be a drop-in replacement
if installed, it also seems possible to parallelize the call-out to
kernel-doc from docproc.c without too much effort. I hoped Jani would get
around to implementing the asciidoctor support, and I'm hoping I can snipe
away some free time sometime in the next few months to look at docproc.c more
seriously. This would kinda be a cool intern project, but atm we throw
them all at improving testing infrastructure ...

Anyway I'm of course still open to getting this upstream, and I think a few
things should be polished (like the speed-up). But right now bandwidth on
my side isn't too plentiful. Maybe we should aim to have a few better
ideas (perhaps even for all of the docs stuff) for next KS and respin that
discussion?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
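
The batching idea quoted above (collect all snippets from one file, run a single converter invocation, then split the result) can be sketched as follows. This is only an illustration under stated assumptions: BATCH_SEP, batch_join() and batch_split() are hypothetical names, the real kernel-doc and docproc.c are not structured this way, and the sketch assumes the converter passes the separator line through unchanged, which has not been verified for asciidoc.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical marker, assumed to survive the converter untouched. */
#define BATCH_SEP "\n@@KDOC-SPLIT@@\n"

/*
 * Join n snippets into buf, separated by BATCH_SEP, so that one
 * converter run can process them all. Returns 0 on success and -1
 * if buf is too small.
 */
static int batch_join(const char *const *snips, size_t n,
                      char *buf, size_t size)
{
    size_t used = 0;

    if (size == 0)
        return -1;
    buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        int w = snprintf(buf + used, size - used, "%s%s",
                         i ? BATCH_SEP : "", snips[i]);

        if (w < 0 || (size_t)w >= size - used)
            return -1; /* output was truncated */
        used += (size_t)w;
    }
    return 0;
}

/*
 * Split the (converted) buffer back into pieces in place. Stores up to
 * max pointers in out and returns how many pieces were found.
 */
static size_t batch_split(char *buf, char **out, size_t max)
{
    size_t n = 0;
    char *p = buf;

    while (n < max) {
        char *sep = strstr(p, BATCH_SEP);

        out[n++] = p;
        if (!sep)
            break;
        *sep = '\0'; /* terminate this piece */
        p = sep + strlen(BATCH_SEP);
    }
    return n;
}
```

The design trade-off is the one raised in the thread: one process start-up per file instead of one per comment, at the cost of having to pick a separator the markup tool will not rewrite.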

[PATCH 14/22] drm/imx: Unconfuse preclose logic

2016-01-12 Thread Philipp Zabel

On Monday, 11.01.2016, at 22:41 +0100, Daniel Vetter wrote:
> So this one is special, since it tries to prevent races when userspace
> crashes simply by disabling the vblank machinery. Well except that imx
> always has vblanks enabled, and the disable_vblank hook actually just
> tries to cancel a pending pageflip. Without any locking whatsoever. Of
> course this is wrong, since it'll result in the hw not actually
> displaying what drm thinks is the current frontbuffer.
> 
> Well, the core takes care of the disappearing DRM fd now, so we
> can nuke all this confused code without ill side-effects.
> 
> Someone else needs to audit the locking for ->newfb and
> ->page_flip_event and fix it up. Common approach is to reuse
> dev->event_lock for this.
> 
> Cc: Sascha Hauer 
> Cc: Philipp Zabel 
> Acked-by: Daniel Stone 
> Reviewed-by: Alex Deucher 
> Signed-off-by: Daniel Vetter 

Acked-by: Philipp Zabel 

regards
Philipp

[PATCH v2 2/3] drm/exynos: use generic code for managing zpos plane property

2016-01-12 Thread Marek Szyprowski

Hello,

On 2016-01-11 16:13, Daniel Vetter wrote:
> On Mon, Jan 11, 2016 at 12:03:04PM +0100, Marek Szyprowski wrote:
>> This patch replaces zpos property handling custom code in Exynos DRM
>> driver with calls to generic DRM code.
>>
>> Signed-off-by: Marek Szyprowski 
>> ---
>>   drivers/gpu/drm/exynos/exynos_drm_drv.h   |  1 -
>>   drivers/gpu/drm/exynos/exynos_drm_plane.c | 66 
>> +++
>>   drivers/gpu/drm/exynos/exynos_mixer.c | 19 +++--
>>   3 files changed, 30 insertions(+), 56 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.h 
>> b/drivers/gpu/drm/exynos/exynos_drm_drv.h
>> index 17b5ded72ff1..244ae6c4482c 100644
>> --- a/drivers/gpu/drm/exynos/exynos_drm_drv.h
>> +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.h
>> @@ -217,7 +217,6 @@ struct exynos_drm_private {
>>   * this array is used to be aware of which crtc did it request vblank.
>>   */
>>  struct drm_crtc *crtc[MAX_CRTC];
>> -struct drm_property *plane_zpos_property;
>>   
>>  unsigned long da_start;
>>  unsigned long da_space_size;
>> diff --git a/drivers/gpu/drm/exynos/exynos_drm_plane.c 
>> b/drivers/gpu/drm/exynos/exynos_drm_plane.c
>> index d86227236f55..ba46bc3de796 100644
>> --- a/drivers/gpu/drm/exynos/exynos_drm_plane.c
>> +++ b/drivers/gpu/drm/exynos/exynos_drm_plane.c
>> @@ -137,9 +137,9 @@ static void exynos_drm_plane_reset(struct drm_plane 
>> *plane)
>>   
>>  exynos_state = kzalloc(sizeof(*exynos_state), GFP_KERNEL);
>>  if (exynos_state) {
>> -exynos_state->zpos = exynos_plane->config->zpos;
>>  plane->state = &exynos_state->base;
>>  plane->state->plane = plane;
>> +plane->state->zpos = exynos_plane->config->zpos;
>>  }
>>   }
>>   
>> @@ -155,7 +155,6 @@ exynos_drm_plane_duplicate_state(struct drm_plane *plane)
>>  return NULL;
>>   
>>  __drm_atomic_helper_plane_duplicate_state(plane, &copy->base);
>> -copy->zpos = exynos_state->zpos;
>>  return &copy->base;
>>   }
>>   
>> @@ -168,43 +167,6 @@ static void exynos_drm_plane_destroy_state(struct 
>> drm_plane *plane,
>>  kfree(old_exynos_state);
>>   }
>>   
>> -static int exynos_drm_plane_atomic_set_property(struct drm_plane *plane,
>> -struct drm_plane_state *state,
>> -struct drm_property *property,
>> -uint64_t val)
>> -{
>> -struct exynos_drm_plane *exynos_plane = to_exynos_plane(plane);
>> -struct exynos_drm_plane_state *exynos_state =
>> -to_exynos_plane_state(state);
>> -struct exynos_drm_private *dev_priv = plane->dev->dev_private;
>> -const struct exynos_drm_plane_config *config = exynos_plane->config;
>> -
>> -if (property == dev_priv->plane_zpos_property &&
>> -(config->capabilities & EXYNOS_DRM_PLANE_CAP_ZPOS))
>> -exynos_state->zpos = val;
>> -else
>> -return -EINVAL;
>> -
>> -return 0;
>> -}
>> -
>> -static int exynos_drm_plane_atomic_get_property(struct drm_plane *plane,
>> -  const struct drm_plane_state *state,
>> -  struct drm_property *property,
>> -  uint64_t *val)
>> -{
>> -const struct exynos_drm_plane_state *exynos_state =
>> -container_of(state, const struct exynos_drm_plane_state, base);
>> -struct exynos_drm_private *dev_priv = plane->dev->dev_private;
>> -
>> -if (property == dev_priv->plane_zpos_property)
>> -*val = exynos_state->zpos;
>> -else
>> -return -EINVAL;
>> -
>> -return 0;
>> -}
>> -
>>   static struct drm_plane_funcs exynos_plane_funcs = {
>>  .update_plane   = drm_atomic_helper_update_plane,
>>  .disable_plane  = drm_atomic_helper_disable_plane,
>> @@ -213,8 +175,6 @@ static struct drm_plane_funcs exynos_plane_funcs = {
>>  .reset  = exynos_drm_plane_reset,
>>  .atomic_duplicate_state = exynos_drm_plane_duplicate_state,
>>  .atomic_destroy_state = exynos_drm_plane_destroy_state,
>> -.atomic_set_property = exynos_drm_plane_atomic_set_property,
>> -.atomic_get_property = exynos_drm_plane_atomic_get_property,
>>   };
>>   
>>   static int
>> @@ -302,20 +262,21 @@ static const struct drm_plane_helper_funcs 
>> plane_helper_funcs = {
>>   };
>>   
>>   static void exynos_plane_attach_zpos_property(struct drm_plane *plane,
>> -  unsigned int zpos)
>> +  unsigned int zpos, bool immutable)
>>   {
>>  struct drm_device *dev = plane->dev;
>> -struct exynos_drm_private *dev_priv = dev->dev_private;
>>  struct drm_property *prop;
>>   
>> -prop = dev_priv->plane_zpos_property;
>> -if (!prop) {
>> -prop = drm_property_create_range(dev, 0, "zpos",
>> -

[PATCH v3] drm/exynos: fix kernel panic issue at drm releasing

2016-01-12 Thread Daniel Stone

Hi Inki,

On 12 January 2016 at 06:25, Inki Dae  wrote:
> On 2016-01-12 04:00, Daniel Stone wrote:
>> On 8 January 2016 at 08:46, Inki Dae  wrote:
>>> Changelog v3:
>>> - initialize only device specific things. Each page flip event object
>>>   is created by DRM core so DRM core should release the object including
>>>   incrementing event space.
>>
>> I'm a bit confused here; we no longer call event->base.destroy(),
>> because you say that the DRM core should release it. But how does the
>> DRM core know to release the event? From the core point of view, the
>> event disappears into the driver, and it is no longer tracked.
>
> DRM core would need something to track the events. I think basically, someone 
> who created one object should also destroy the object.

You're right, but this doesn't exist until Daniel Vetter's rather
larger patchset which is still pending merge.

>> As Daniel says though, later versions handle all this in the core in a
>> much more clean way, so we can remove these from the drivers then.
>
> So I think it's not reasonable for specific driver to destroy the object 
> created by core although there is a memory leak. However, the memory leak 
> would be more critical than temporary codes.
> Ok, I will merge this patch with more comments which will say the object will 
> be destroyed by core part later.

Also, by stealing the event out of crtc_state->event and moving it to
exynos_crtc->event, you can argue that we have quite explicitly removed
the responsibility from the core. ;)

Thanks for handling this!

Cheers,
Daniel

[PATCH 08/22] drm/gma500: Remove empty preclose hook

2016-01-12 Thread Patrik Jakobsson

On Mon, Jan 11, 2016 at 10:41 PM, Daniel Vetter  
wrote:
> I'm auditing them all, empty ones just confuse ...
>
> Cc: Patrik Jakobsson 
> Acked-by: Daniel Stone 
> Reviewed-by: Alex Deucher 
> Signed-off-by: Daniel Vetter 

Acked-by: Patrik Jakobsson 

> ---
>  drivers/gpu/drm/gma500/psb_drv.c | 9 -
>  1 file changed, 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/gma500/psb_drv.c 
> b/drivers/gpu/drm/gma500/psb_drv.c
> index 92e7e5795398..4e1c6850520e 100644
> --- a/drivers/gpu/drm/gma500/psb_drv.c
> +++ b/drivers/gpu/drm/gma500/psb_drv.c
> @@ -442,14 +442,6 @@ static long psb_unlocked_ioctl(struct file *filp, 
> unsigned int cmd,
> /* FIXME: do we need to wrap the other side of this */
>  }
>
> -/*
> - * When a client dies:
> - *- Check for and clean up flipped page state
> - */
> -static void psb_driver_preclose(struct drm_device *dev, struct drm_file 
> *priv)
> -{
> -}
> -
>  static int psb_pci_probe(struct pci_dev *pdev, const struct pci_device_id 
> *ent)
>  {
> return drm_get_pci_dev(pdev, ent, &driver);
> @@ -495,7 +487,6 @@ static struct drm_driver driver = {
> .load = psb_driver_load,
> .unload = psb_driver_unload,
> .lastclose = psb_driver_lastclose,
> -   .preclose = psb_driver_preclose,
> .set_busid = drm_pci_set_busid,
>
> .num_ioctls = ARRAY_SIZE(psb_ioctls),
> --
> 2.6.4
>

[PATCH v2] drm: Release driver references to handle before making it available again

2016-01-12 Thread Ville Syrjälä

On Mon, Jan 11, 2016 at 08:44:03PM +, Chris Wilson wrote:
> On Mon, Jan 11, 2016 at 07:51:03PM +0200, Ville Syrjälä wrote:
> > On Fri, Jan 08, 2016 at 11:27:05PM +, Chris Wilson wrote:
> > > When userspace closes a handle, we remove it from the file->object_idr
> > > and then tell the driver to drop its references to that file/handle.
> > > However, as the file/handle is already available again for reuse, it may
> > > be reallocated back to userspace and active on a new object before the
> > > driver has had a chance to drop the old file/handle references.
> > 
> > Hmm. What's the problem with another object starting to reuse the same
> > handle while we're still deleting the old one? So far I didn't spot
> > anything in the code that would go boom if there's another object
> > already around with the same handle.
> 
> Imagine a driver storing a hashtable to contract the handle->object->vma
> lookup into just handle->vma, like the old idea of replacing the
> object_idr with an ida plus hashtable of objects. (This saves the double
> step lookup that caused a regression with the ppgtt work, and the linear
> walk of object->vma_list which is a major slowdown in various full-ppgtt
> OpenGL tests). In such a scheme, where the driver has a parallel lut to
> the core, the driver needs to be notified before the handle is then
> accessible again to userspace. Or in any other scenario where the driver
> is using the handle, as would be implied by having the open/close
> callbacks.

I see. Yeah, that makes sense. I was just a bit confused since I
couldn't find any real problem in the tree currently.

-- 
Ville Syrjälä
Intel OTC

[PATCH v2] drm: Release driver references to handle before making it available again

2016-01-12 Thread Ville Syrjälä

On Fri, Jan 08, 2016 at 11:27:05PM +, Chris Wilson wrote:
> When userspace closes a handle, we remove it from the file->object_idr
> and then tell the driver to drop its references to that file/handle.
> However, as the file/handle is already available again for reuse, it may
> be reallocated back to userspace and active on a new object before the
> driver has had a chance to drop the old file/handle references.
> 
> Whilst calling back into the driver, we have to drop the
> file->table_lock spinlock and so to prevent reusing the closed handle we
> mark that handle as stale in the idr, perform the callback and then
> remove the handle. We set the stale handle to point to the NULL object,
> then any idr_find() whilst the driver is removing the handle will return
> NULL, just as if the handle is already removed from idr.
> 
> v2: Use NULL rather than an ERR_PTR to avoid having to adjust callers.
> idr_alloc() tracks existing handles using an internal bitmap, so we are
> free to use the NULL object as our stale identifier.
> 
> Signed-off-by: Chris Wilson 
> Cc: dri-devel at lists.freedesktop.org
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Rob Clark 
> Cc: Ville Syrjälä 
> Cc: Thierry Reding 
> ---
>  drivers/gpu/drm/drm_gem.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 2e8c77e71e1f..d1909d1a1eb4 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -294,18 +294,21 @@ drm_gem_handle_delete(struct drm_file *filp, u32 handle)
>   spin_lock(&filp->table_lock);
>  
>   /* Check if we currently have a reference on the object */
> - obj = idr_find(&filp->object_idr, handle);
> - if (obj == NULL) {
> + obj = idr_replace(&filp->object_idr, NULL, handle);
> + if (IS_ERR(obj)) {
>   spin_unlock(&filp->table_lock);
>   return -EINVAL;
>   }
>   dev = obj->dev;
> + spin_unlock(&filp->table_lock);

Could shrink the spinlocked section to be just the idr_replace()
call I suppose, and thus avoid the spin_unlock() in the error path.

Otherwise makes sense so
Reviewed-by: Ville Syrjälä 

>  
>   /* Release reference and decrement refcount. */
> + drm_gem_object_release_handle(handle, obj, filp);
> +
> + spin_lock(&filp->table_lock);
>   idr_remove(&filp->object_idr, handle);
>   spin_unlock(&filp->table_lock);
>  
> - drm_gem_object_release_handle(handle, obj, filp);
>   return 0;
>  }
>  EXPORT_SYMBOL(drm_gem_handle_delete);
> -- 
> 2.7.0.rc3

-- 
Ville Syrjälä
Intel OTC

[PATCH v2 2/3] drm/exynos: use generic code for managing zpos plane property

2016-01-12 Thread Daniel Vetter

On Tue, Jan 12, 2016 at 10:34:27AM +0100, Marek Szyprowski wrote:
> Hello,
> 
> On 2016-01-11 16:13, Daniel Vetter wrote:
> >On Mon, Jan 11, 2016 at 12:03:04PM +0100, Marek Szyprowski wrote:
> >>This patch replaces zpos property handling custom code in Exynos DRM
> >>driver with calls to generic DRM code.
> >>
> >>Signed-off-by: Marek Szyprowski 
> >>---
> >>  drivers/gpu/drm/exynos/exynos_drm_drv.h   |  1 -
> >>  drivers/gpu/drm/exynos/exynos_drm_plane.c | 66 
> >> +++
> >>  drivers/gpu/drm/exynos/exynos_mixer.c | 19 +++--
> >>  3 files changed, 30 insertions(+), 56 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.h 
> >>b/drivers/gpu/drm/exynos/exynos_drm_drv.h
> >>index 17b5ded72ff1..244ae6c4482c 100644
> >>--- a/drivers/gpu/drm/exynos/exynos_drm_drv.h
> >>+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.h
> >>@@ -217,7 +217,6 @@ struct exynos_drm_private {
> >> * this array is used to be aware of which crtc did it request vblank.
> >> */
> >>struct drm_crtc *crtc[MAX_CRTC];
> >>-   struct drm_property *plane_zpos_property;
> >>unsigned long da_start;
> >>unsigned long da_space_size;
> >>diff --git a/drivers/gpu/drm/exynos/exynos_drm_plane.c 
> >>b/drivers/gpu/drm/exynos/exynos_drm_plane.c
> >>index d86227236f55..ba46bc3de796 100644
> >>--- a/drivers/gpu/drm/exynos/exynos_drm_plane.c
> >>+++ b/drivers/gpu/drm/exynos/exynos_drm_plane.c
> >>@@ -137,9 +137,9 @@ static void exynos_drm_plane_reset(struct drm_plane 
> >>*plane)
> >>exynos_state = kzalloc(sizeof(*exynos_state), GFP_KERNEL);
> >>if (exynos_state) {
> >>-   exynos_state->zpos = exynos_plane->config->zpos;
> >>plane->state = &exynos_state->base;
> >>plane->state->plane = plane;
> >>+   plane->state->zpos = exynos_plane->config->zpos;
> >>}
> >>  }
> >>@@ -155,7 +155,6 @@ exynos_drm_plane_duplicate_state(struct drm_plane 
> >>*plane)
> >>return NULL;
> >>__drm_atomic_helper_plane_duplicate_state(plane, &copy->base);
> >>-   copy->zpos = exynos_state->zpos;
> >>return &copy->base;
> >>  }
> >>@@ -168,43 +167,6 @@ static void exynos_drm_plane_destroy_state(struct 
> >>drm_plane *plane,
> >>kfree(old_exynos_state);
> >>  }
> >>-static int exynos_drm_plane_atomic_set_property(struct drm_plane *plane,
> >>-   struct drm_plane_state *state,
> >>-   struct drm_property *property,
> >>-   uint64_t val)
> >>-{
> >>-   struct exynos_drm_plane *exynos_plane = to_exynos_plane(plane);
> >>-   struct exynos_drm_plane_state *exynos_state =
> >>-   to_exynos_plane_state(state);
> >>-   struct exynos_drm_private *dev_priv = plane->dev->dev_private;
> >>-   const struct exynos_drm_plane_config *config = exynos_plane->config;
> >>-
> >>-   if (property == dev_priv->plane_zpos_property &&
> >>-   (config->capabilities & EXYNOS_DRM_PLANE_CAP_ZPOS))
> >>-   exynos_state->zpos = val;
> >>-   else
> >>-   return -EINVAL;
> >>-
> >>-   return 0;
> >>-}
> >>-
> >>-static int exynos_drm_plane_atomic_get_property(struct drm_plane *plane,
> >>- const struct drm_plane_state *state,
> >>- struct drm_property *property,
> >>- uint64_t *val)
> >>-{
> >>-   const struct exynos_drm_plane_state *exynos_state =
> >>-   container_of(state, const struct exynos_drm_plane_state, base);
> >>-   struct exynos_drm_private *dev_priv = plane->dev->dev_private;
> >>-
> >>-   if (property == dev_priv->plane_zpos_property)
> >>-   *val = exynos_state->zpos;
> >>-   else
> >>-   return -EINVAL;
> >>-
> >>-   return 0;
> >>-}
> >>-
> >>  static struct drm_plane_funcs exynos_plane_funcs = {
> >>.update_plane   = drm_atomic_helper_update_plane,
> >>.disable_plane  = drm_atomic_helper_disable_plane,
> >>@@ -213,8 +175,6 @@ static struct drm_plane_funcs exynos_plane_funcs = {
> >>.reset  = exynos_drm_plane_reset,
> >>.atomic_duplicate_state = exynos_drm_plane_duplicate_state,
> >>.atomic_destroy_state = exynos_drm_plane_destroy_state,
> >>-   .atomic_set_property = exynos_drm_plane_atomic_set_property,
> >>-   .atomic_get_property = exynos_drm_plane_atomic_get_property,
> >>  };
> >>  static int
> >>@@ -302,20 +262,21 @@ static const struct drm_plane_helper_funcs 
> >>plane_helper_funcs = {
> >>  };
> >>  static void exynos_plane_attach_zpos_property(struct drm_plane *plane,
> >>- unsigned int zpos)
> >>+ unsigned int zpos, bool immutable)
> >>  {
> >>struct drm_device *dev = plane->dev;
> >>-   struct exynos_drm_private *dev_priv = dev->dev_private;
> >>struct drm_property *prop;
> >>-   prop = dev_priv->plane_zpos_property;
> >>-   if (!prop) {
>

[PATCH v2] drm: Release driver references to handle before making it available again

2016-01-12 Thread Chris Wilson

On Tue, Jan 12, 2016 at 12:19:12PM +0200, Ville Syrjälä wrote:
> On Fri, Jan 08, 2016 at 11:27:05PM +, Chris Wilson wrote:
> > When userspace closes a handle, we remove it from the file->object_idr
> > and then tell the driver to drop its references to that file/handle.
> > However, as the file/handle is already available again for reuse, it may
> > be reallocated back to userspace and active on a new object before the
> > driver has had a chance to drop the old file/handle references.
> > 
> > Whilst calling back into the driver, we have to drop the
> > file->table_lock spinlock and so to prevent reusing the closed handle we
> > mark that handle as stale in the idr, perform the callback and then
> > remove the handle. We set the stale handle to point to the NULL object,
> > then any idr_find() whilst the driver is removing the handle will return
> > NULL, just as if the handle is already removed from idr.
> > 
> > v2: Use NULL rather than an ERR_PTR to avoid having to adjust callers.
> > idr_alloc() tracks existing handles using an internal bitmap, so we are
> > free to use the NULL object as our stale identifier.
> > 
> > Signed-off-by: Chris Wilson 
> > Cc: dri-devel at lists.freedesktop.org
> > Cc: David Airlie 
> > Cc: Daniel Vetter 
> > Cc: Rob Clark 
> > Cc: Ville Syrjälä 
> > Cc: Thierry Reding 
> > ---
> >  drivers/gpu/drm/drm_gem.c | 9 ++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> > index 2e8c77e71e1f..d1909d1a1eb4 100644
> > --- a/drivers/gpu/drm/drm_gem.c
> > +++ b/drivers/gpu/drm/drm_gem.c
> > @@ -294,18 +294,21 @@ drm_gem_handle_delete(struct drm_file *filp, u32 
> > handle)
> > spin_lock(&filp->table_lock);
> >  
> > /* Check if we currently have a reference on the object */
> > -   obj = idr_find(&filp->object_idr, handle);
> > -   if (obj == NULL) {
> > +   obj = idr_replace(&filp->object_idr, NULL, handle);
> > +   if (IS_ERR(obj)) {
> > spin_unlock(&filp->table_lock);
> > return -EINVAL;
> > }
> > dev = obj->dev;
> > +   spin_unlock(&filp->table_lock);
> 
> Could shrink the spinlocked section to be just the idr_replace()
> call I suppose, and thus avoid the spin_unlock() in the error path.

Indeed, missed that. I also missed in v2 that the IS_ERR(obj) test needed
to become IS_ERR_OR_NULL(obj) to catch the concurrent deletion.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
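
The stale-handle scheme discussed in this thread can be illustrated with a small single-threaded model. It is a sketch under assumptions, not the drm_gem.c code: struct handle_table is a hypothetical stand-in for the idr (an allocation bitmap plus a pointer per slot), locking is left out and only marked in comments, and probe_release() is an invented callback that plays the role of activity happening while the driver drops its references.

```c
#include <stddef.h>

#define MAX_HANDLES 8

/* Hypothetical stand-in for the per-file idr. */
struct handle_table {
    unsigned char used[MAX_HANDLES]; /* allocation bitmap, 1 = taken */
    void *obj[MAX_HANDLES];          /* NULL while a handle is stale */
};

/* Return the lowest free handle, or -1 if the table is full. */
static int table_alloc(struct handle_table *t, void *obj)
{
    for (int h = 0; h < MAX_HANDLES; h++) {
        if (!t->used[h]) {
            t->used[h] = 1;
            t->obj[h] = obj;
            return h;
        }
    }
    return -1;
}

/* Look up a handle; stale and unallocated handles both yield NULL. */
static void *table_find(const struct handle_table *t, int h)
{
    if (h < 0 || h >= MAX_HANDLES || !t->used[h])
        return NULL;
    return t->obj[h];
}

typedef void (*release_fn)(struct handle_table *t, int h, void *obj,
                           void *data);

/* Delete a handle in three steps: mark stale, call back, remove. */
static int handle_delete(struct handle_table *t, int h,
                         release_fn release, void *data)
{
    void *obj;

    /* table lock would be taken here */
    if (h < 0 || h >= MAX_HANDLES || !t->used[h])
        return -1; /* unknown handle */
    obj = t->obj[h];
    if (obj == NULL)
        return -1; /* already stale: a deletion is in progress */
    t->obj[h] = NULL; /* stale, but the bitmap still reserves it */
    /* table lock would be dropped here */

    if (release)
        release(t, h, obj, data);

    /* table lock would be retaken here */
    t->used[h] = 0; /* only now can the handle be reused */
    return 0;
}

/* Records what the table looks like while the callback runs. */
struct probe {
    void *new_obj;      /* object to allocate during the callback */
    void *found_during; /* result of looking up the stale handle */
    int handle_during;  /* handle handed out during the callback */
};

static void probe_release(struct handle_table *t, int h, void *obj,
                          void *data)
{
    struct probe *p = data;

    (void)obj;
    p->found_during = table_find(t, h);
    p->handle_during = table_alloc(t, p->new_obj);
}
```

In this model a lookup of the stale handle returns NULL and a concurrent allocation receives a different handle, which matches the behaviour the commit message describes. The explicit check for an already-NULL slot corresponds to the IS_ERR_OR_NULL(obj) correction mentioned in the reply above.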

[PATCH] etnaviv: etnaviv_drv: Remove owner assignment from platform_driver

2016-01-12 Thread Lucas Stach

On Friday, 08.01.2016, at 11:52 -0200, Fabio Estevam wrote:
> This platform_driver does not need to set an owner as it will be
> populated by the driver core.
> 
> Generated by scripts/coccinelle/api/platform_no_drv_owner.cocci.
> 
> Signed-off-by: Fabio Estevam 

Thanks, I've picked this up. As it's not a critical fix, I'll send this
out when some more things have accumulated.

Regards,
Lucas
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_drv.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> index 5c89ebb..e885898 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
> @@ -668,7 +668,6 @@ static struct platform_driver etnaviv_platform_driver = {
>   .probe  = etnaviv_pdev_probe,
>   .remove = etnaviv_pdev_remove,
>   .driver = {
> - .owner  = THIS_MODULE,
>   .name   = "etnaviv",
>   .of_match_table = dt_match,
>   },

-- 
Pengutronix e.K. | Lucas Stach |
Industrial Linux Solutions   | http://www.pengutronix.de/  |

[PATCH] drm/i915: Assign crtc correctly in load detection.

2016-01-12 Thread Maarten Lankhorst

drm_atomic_set_crtc_for_connector should be used,
and crtc->primary->crtc is assigned by atomic_commit.

Rely on the helpers for setting this correctly, so
connector_mask gets updated too.

Signed-off-by: Maarten Lankhorst 
---
Should this be applied to topic/drm-misc since atomic connector_masks are added 
there?

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index bc2ec444925e..6b25a90d1e0a 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10553,7 +10553,9 @@ retry:
goto fail;
}

-   connector_state->crtc = crtc;
+   ret = drm_atomic_set_crtc_for_connector(connector_state, crtc);
+   if (ret)
+   goto fail;

crtc_state = intel_atomic_get_crtc_state(state, intel_crtc);
if (IS_ERR(crtc_state)) {
@@ -10597,7 +10599,6 @@ retry:
old->release_fb->funcs->destroy(old->release_fb);
goto fail;
}
-   crtc->primary->crtc = crtc;

/* let the connector get through one full cycle before testing */
intel_wait_for_vblank(dev, intel_crtc->pipe);

[PATCH 07/22] drm/armada: Remove NULL open/pre/postclose hooks

2016-01-12 Thread Daniel Vetter

On Tue, Jan 12, 2016 at 11:51:58AM +, Russell King - ARM Linux wrote:
> On Mon, Jan 11, 2016 at 10:41:01PM +0100, Daniel Vetter wrote:
> > The compiler will do this, but the void hits when grepping all the
> > hooks for a subsystem wide audit are slightly annoying. So remove them
> > for next time around.
> 
> I'll try to remember to queue this after -rc1, though a reminder
> after -rc1 would be useful.

I've planned to take the entire series in through drm-misc after -rc1,
since at least conceptually it's all the same topic. Would that be ok with
you too?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

[PATCH 07/22] drm/armada: Remove NULL open/pre/postclose hooks

2016-01-12 Thread Russell King - ARM Linux

On Mon, Jan 11, 2016 at 10:41:01PM +0100, Daniel Vetter wrote:
> The compiler will do this, but the void hits when grepping all the
> hooks for a subsystem wide audit are slightly annoying. So remove them
> for next time around.

I'll try to remember to queue this after -rc1, though a reminder
after -rc1 would be useful.

Thanks.

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

[PATCH] drm/i915: Assign crtc correctly in load detection.

2016-01-12 Thread Daniel Vetter

On Tue, Jan 12, 2016 at 12:35:59PM +0100, Maarten Lankhorst wrote:
> drm_atomic_set_crtc_for_connector should be used,
> and crtc->primary->crtc is assigned by atomic_commit.
> 
> Rely on the helpers for setting this correctly, so
> connector_mask gets updated too.
> 
> Signed-off-by: Maarten Lankhorst 

Reviewed-by: Daniel Vetter 
> ---
> Should this be applied to topic/drm-misc since atomic connector_masks are 
> added there?
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index bc2ec444925e..6b25a90d1e0a 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -10553,7 +10553,9 @@ retry:
>   goto fail;
>   }
>  
> - connector_state->crtc = crtc;
> + ret = drm_atomic_set_crtc_for_connector(connector_state, crtc);
> + if (ret)
> + goto fail;
>  
>   crtc_state = intel_atomic_get_crtc_state(state, intel_crtc);
>   if (IS_ERR(crtc_state)) {
> @@ -10597,7 +10599,6 @@ retry:
>   old->release_fb->funcs->destroy(old->release_fb);
>   goto fail;
>   }
> - crtc->primary->crtc = crtc;
>  
>   /* let the connector get through one full cycle before testing */
>   intel_wait_for_vblank(dev, intel_crtc->pipe);
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

[PATCH] drm/i915: Assign crtc correctly in load detection.

2016-01-12 Thread Maarten Lankhorst

On 12-01-16 at 13:34, Daniel Vetter wrote:
> On Tue, Jan 12, 2016 at 12:35:59PM +0100, Maarten Lankhorst wrote:
>> drm_atomic_set_crtc_for_connector should be used,
>> and crtc->primary->crtc is assigned by atomic_commit.
>>
>> Rely on the helpers for setting this correctly, so
>> connector_mask gets updated too.
>>
>> Signed-off-by: Maarten Lankhorst 
> Reviewed-by: Daniel Vetter 

After examining the code some more I think this fix is incomplete.

It also needs to do the same on release, and if you set i915.nuclear_pageflip
you'll get a WARN since mode_blob is not set.
Fixing this will break release_load_detect, which doesn't unset it.
Would the code below work?

Cc'ing Ville since he may be able to test it.

--- >8 ---

drm/i915: Use atomic state to obtain load detection crtc.

Signed-off-by: Maarten Lankhorst 
---
diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index bc2ec444925e..9eb1f4e263c6 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10409,6 +10409,7 @@ mode_fits_in_fbdev(struct drm_device *dev,
if (obj->base.size < mode->vdisplay * fb->pitches[0])
return NULL;

+   drm_framebuffer_reference(fb);
return fb;
 #else
return NULL;
@@ -10474,6 +10475,9 @@ bool intel_get_load_detect_pipe(struct drm_connector 
*connector,
  encoder->base.id, encoder->name);

 retry:
+   old->old_pipe_config = NULL;
+   old->old_plane_state = NULL;
+
ret = drm_modeset_lock(&config->connection_mutex, ctx);
if (ret)
goto fail;
@@ -10489,24 +10493,15 @@ retry:
 */

/* See if we already have a CRTC for this connector */
-   if (encoder->crtc) {
-   crtc = encoder->crtc;
+   if (connector->state->crtc) {
+   crtc = connector->state->crtc;

ret = drm_modeset_lock(&crtc->mutex, ctx);
if (ret)
goto fail;
-   ret = drm_modeset_lock(&crtc->primary->mutex, ctx);
-   if (ret)
-   goto fail;
-
-   old->dpms_mode = connector->dpms;
-   old->load_detect_temp = false;

/* Make sure the crtc and connector are running */
-   if (connector->dpms != DRM_MODE_DPMS_ON)
-   connector->funcs->dpms(connector, DRM_MODE_DPMS_ON);
-
-   return true;
+   goto found;
}

/* Find an unused one (if possible) */
@@ -10514,8 +10509,15 @@ retry:
i++;
if (!(encoder->possible_crtcs & (1 << i)))
continue;
-   if (possible_crtc->state->enable)
+
+   ret = drm_modeset_lock(&crtc->mutex, ctx);
+   if (ret)
+   goto fail;
+
+   if (possible_crtc->state->enable) {
+   drm_modeset_unlock(&crtc->mutex);
continue;
+   }

crtc = possible_crtc;
break;
@@ -10529,17 +10531,19 @@ retry:
goto fail;
}

-   ret = drm_modeset_lock(&crtc->mutex, ctx);
-   if (ret)
-   goto fail;
+found:
+   intel_crtc = to_intel_crtc(crtc);
+
ret = drm_modeset_lock(&crtc->primary->mutex, ctx);
if (ret)
goto fail;

-   intel_crtc = to_intel_crtc(crtc);
-   old->dpms_mode = connector->dpms;
-   old->load_detect_temp = true;
-   old->release_fb = NULL;
+   old->old_pipe_config = intel_crtc_duplicate_state(crtc);
+   old->old_plane_state = intel_plane_duplicate_state(crtc->primary);
+   if (!old->old_pipe_config || !old->old_plane_state) {
+   ret = -ENOMEM;
+   goto fail;
+   }

state = drm_atomic_state_alloc(dev);
if (!state)
@@ -10553,7 +10557,9 @@ retry:
goto fail;
}

-   connector_state->crtc = crtc;
+   ret = drm_atomic_set_crtc_for_connector(connector_state, crtc);
+   if (ret)
+   goto fail;

crtc_state = intel_atomic_get_crtc_state(state, intel_crtc);
if (IS_ERR(crtc_state)) {
@@ -10577,7 +10583,6 @@ retry:
if (fb == NULL) {
DRM_DEBUG_KMS("creating tmp fb for load-detection\n");
fb = intel_framebuffer_create_for_mode(dev, mode, 24, 32);
-   old->release_fb = fb;
} else
DRM_DEBUG_KMS("reusing fbdev for load-detection framebuffer\n");
if (IS_ERR(fb)) {
@@ -10589,15 +10594,16 @@ retry:
if (ret)
goto fail;

-   drm_mode_copy(&crtc_state->base.mode, mode);
+   drm_framebuffer_unreference(fb);
+
+   ret = drm_atomic_set_mode_for_crtc(&crtc_state->base, mode);
+   if (ret)
+   goto fail;

if (drm_atomic_commit(state)) {
DRM_DEBUG_KMS("failed to set mode on load-detect pipe\n");

[PATCH v3 1/3] drm: add generic zpos property

2016-01-12 Thread Marek Szyprowski

This patch adds support for a generic plane zpos property with
well-defined semantics:
- added zpos properties to drm core and plane state structures
- added helpers for normalizing zpos properties of given set of planes
- well-defined semantics: planes are sorted by zpos value and then by plane
  id if the zpos values are equal

Normalized zpos values are calculated automatically once the generic
mutable zpos property has been initialized. Drivers can simply use
plane_state->normalized_zpos in their atomic_check and/or plane_update
callbacks without any additional calls to DRM core.

Signed-off-by: Marek Szyprowski 
---
 Documentation/DocBook/gpu.tmpl  |  14 -
 drivers/gpu/drm/drm_atomic.c|   4 ++
 drivers/gpu/drm/drm_atomic_helper.c | 116 
 drivers/gpu/drm/drm_crtc.c  |  53 
 include/drm/drm_crtc.h  |  14 +
 5 files changed, 199 insertions(+), 2 deletions(-)

diff --git a/Documentation/DocBook/gpu.tmpl b/Documentation/DocBook/gpu.tmpl
index 6c6e81a9eaf4..f6b7236141b6 100644
--- a/Documentation/DocBook/gpu.tmpl
+++ b/Documentation/DocBook/gpu.tmpl
@@ -2004,7 +2004,7 @@ void intel_crt_init(struct drm_device *dev)
Description/Restrictions


-   DRM
+   DRM
Generic
“rotation”
BITMASK
@@ -2256,7 +2256,7 @@ void intel_crt_init(struct drm_device *dev)
property to suggest an Y offset for a connector


-   Optional
+   Optional
“scaling mode”
ENUM
{ "None", "Full", "Center", "Full aspect" }
@@ -2280,6 +2280,16 @@ void intel_crt_init(struct drm_device *dev)
TBD


+"zpos" 
+   RANGE
+   Min=0, Max=255
+   Plane
+   Plane's 'z' position during blending (0 for 
background, 255 for frontmost).
+   If two planes assigned to same CRTC have equal zpos values, the 
plane with higher plane
+   id is treated as closer to front. Can be IMMUTABLE if driver 
doesn't support changing
+   planes' order.
+   
+   
i915
Generic
"Broadcast RGB"
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 6a21e5c378c1..97bb069cb6a3 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -614,6 +614,8 @@ int drm_atomic_plane_set_property(struct drm_plane *plane,
state->src_h = val;
} else if (property == config->rotation_property) {
state->rotation = val;
+   } else if (property == config->zpos_property) {
+   state->zpos = val;
} else if (plane->funcs->atomic_set_property) {
return plane->funcs->atomic_set_property(plane, state,
property, val);
@@ -670,6 +672,8 @@ drm_atomic_plane_get_property(struct drm_plane *plane,
*val = state->src_h;
} else if (property == config->rotation_property) {
*val = state->rotation;
+   } else if (property == config->zpos_property) {
+   *val = state->zpos;
} else if (plane->funcs->atomic_get_property) {
return plane->funcs->atomic_get_property(plane, state, 
property, val);
} else {
diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c
index d0d4b2ff7c21..257946fac94b 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 

 /**
  * DOC: overview
@@ -507,6 +508,117 @@ drm_atomic_helper_check_modeset(struct drm_device *dev,
 }
 EXPORT_SYMBOL(drm_atomic_helper_check_modeset);

+static int drm_atomic_state_zpos_cmp(const void *a, const void *b)
+{
+   const struct drm_plane_state *sa = *(struct drm_plane_state **)a;
+   const struct drm_plane_state *sb = *(struct drm_plane_state **)b;
+   int zpos_a = (sa->zpos << 16) + sa->plane->base.id;
+   int zpos_b = (sb->zpos << 16) + sb->plane->base.id;
+
+   return zpos_a - zpos_b;
+}
+
+/**
+ * drm_atomic_helper_crtc_normalize_zpos - calculate normalized zpos values
+ * @crtc: crtc with planes, which have to be considered for normalization
+ * @crtc_state: new atomic state to apply
+ *
+ * This function checks new states of all planes assigned to given crtc and
+ * calculates normalized zpos value for them. Planes are compared first by 
their
+ * zpos values, then by plane id (if zpos equals). Plane with lowest zpos value
+ * is at the bottom. The plane_state->normalized_zpos is then filled with unique
+ * values from 0 to number of active planes in crtc minus one.
+ *
+ * RETURNS
+ * Zero for success or -errno
+ */
+int drm_atomic_helper_crtc_normalize_zpos(struct drm_crtc *crtc,
+ struct drm_crtc_state *crtc_state)
+{
+   struct drm_atomic_state *state = crtc_state->state;
+   struct drm_device *dev = crtc

[PATCH v3 0/3] drm/exynos: introduce generic zpos property

2016-01-12 Thread Marek Szyprowski

Hello all,

This patch series is a continuation of the rework of blending support in
the Exynos DRM driver. Some background can be found here:
http://www.spinics.net/lists/dri-devel/msg96969.html

Daniel Vetter suggested that the zpos property should be made generic, with
well-defined semantics. This patchset is my proposal for such a generic
zpos property:
- added zpos properties to drm core and plane state structures,
- added helpers for normalizing zpos properties of a given set of planes,
- well-defined semantics: planes are sorted by zpos value and then by plane
  id if the zpos values are equal.
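
To illustrate the ordering rule above, here is a small userspace sketch. It is not the helper from patch 1/3: struct plane_state, zpos_cmp() and normalize_zpos() are simplified, hypothetical stand-ins. It sorts the planes of one CRTC by zpos, breaks ties by plane id, and then assigns consecutive normalized values from back to front.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the per-plane atomic state. */
struct plane_state {
	unsigned int plane_id;         /* unique object id of the plane */
	unsigned int zpos;             /* userspace-visible value, 0..255 */
	unsigned int normalized_zpos;  /* filled in by normalize_zpos() */
};

/* Order by zpos first, then by plane id when the zpos values are equal. */
static int zpos_cmp(const void *a, const void *b)
{
	const struct plane_state *sa = *(const struct plane_state *const *)a;
	const struct plane_state *sb = *(const struct plane_state *const *)b;

	if (sa->zpos != sb->zpos)
		return sa->zpos < sb->zpos ? -1 : 1;
	if (sa->plane_id != sb->plane_id)
		return sa->plane_id < sb->plane_id ? -1 : 1;
	return 0;
}

/* Sort the planes of one CRTC and hand out 0..n-1 from back to front. */
static void normalize_zpos(struct plane_state **planes, size_t n)
{
	size_t i;

	qsort(planes, n, sizeof(*planes), zpos_cmp);
	for (i = 0; i < n; i++)
		planes[i]->normalized_zpos = (unsigned int)i;
}
```

The sketch compares the two keys explicitly instead of packing zpos and plane id into a single integer. That choice is only made here to avoid assuming anything about the range of plane ids; the resulting order is the one described in the list above.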

Patches are based on top of latest exynos-drm-next branch.

Best regards
Marek Szyprowski
Samsung R&D Institute Poland

Changelog:

v3:
- at Daniel Vetter's request, moved the whole normalization process into DRM
  core; drivers can simply use plane_state->normalized_zpos in their
  atomic_check/update callbacks with no additional changes needed
- updated documentation

v2: http://www.spinics.net/lists/dri-devel/msg98093.html
- dropped 2 fixes for Exynos DRM, which got merged in the meantime
- added more comments and kernel docs for core functions as suggested
  by Daniel Vetter
- reworked initialization of zpos properties (moved assigning the property
  class to common code), now the code in the driver is even simpler
- while reworking the initialization code for the zpos property, made the same
  change to the generic rotation property

v1: http://www.spinics.net/lists/dri-devel/msg97709.html
- initial version

Patch summary:

Marek Szyprowski (3):
  drm: add generic zpos property
  drm/exynos: use generic code for managing zpos plane property
  drm: simplify initialization of rotation property

 Documentation/DocBook/gpu.tmpl  |  14 ++-
 drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c |  10 +-
 drivers/gpu/drm/drm_atomic.c|   4 +
 drivers/gpu/drm/drm_atomic_helper.c | 116 
 drivers/gpu/drm/drm_crtc.c  |  82 +++--
 drivers/gpu/drm/exynos/exynos_drm_drv.h |   2 -
 drivers/gpu/drm/exynos/exynos_drm_plane.c   |  66 +++---
 drivers/gpu/drm/exynos/exynos_mixer.c   |   6 +-
 drivers/gpu/drm/i915/intel_display.c|   6 +-
 drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c   |   3 +-
 drivers/gpu/drm/omapdrm/omap_drv.c  |   3 +-
 include/drm/drm_crtc.h  |  18 +++-
 12 files changed, 250 insertions(+), 80 deletions(-)

-- 
1.9.2

[PATCH v3 3/3] drm: simplify initialization of rotation property

2016-01-12 Thread Marek Szyprowski

This patch simplifies initialization of the generic rotation property and
aligns the code with the recently introduced function for initializing the
generic zpos property. It also adds missing documentation.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c | 10 -
 drivers/gpu/drm/drm_crtc.c  | 29 -
 drivers/gpu/drm/i915/intel_display.c|  6 ++---
 drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c   |  3 +--
 drivers/gpu/drm/omapdrm/omap_drv.c  |  3 +--
 include/drm/drm_crtc.h  |  4 ++--
 6 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c 
b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
index 1ffe9c329c46..4f9606cdf0f2 100644
--- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
+++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
@@ -967,12 +967,10 @@ atmel_hlcdc_plane_create_properties(struct drm_device 
*dev)
if (!props->alpha)
return ERR_PTR(-ENOMEM);

-   dev->mode_config.rotation_property =
-   drm_mode_create_rotation_property(dev,
- BIT(DRM_ROTATE_0) |
- BIT(DRM_ROTATE_90) |
- BIT(DRM_ROTATE_180) |
- BIT(DRM_ROTATE_270));
+   drm_mode_create_rotation_property(dev, BIT(DRM_ROTATE_0) |
+  BIT(DRM_ROTATE_90) |
+  BIT(DRM_ROTATE_180) |
+  BIT(DRM_ROTATE_270));
if (!dev->mode_config.rotation_property)
return ERR_PTR(-ENOMEM);

diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index 54a21e7c1ca5..99faccb63ce3 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -5861,10 +5861,23 @@ void drm_mode_config_cleanup(struct drm_device *dev)
 }
 EXPORT_SYMBOL(drm_mode_config_cleanup);

-struct drm_property *drm_mode_create_rotation_property(struct drm_device *dev,
-  unsigned int 
supported_rotations)
+/**
+ * drm_mode_create_rotation_property - create generic rotation property
+ * @dev: DRM device
+ * @supported_rotations: bitmask of supported rotation modes
+ *
+ * This function initializes generic rotation property and enables support
+ * for it in drm core. Drivers can then attach this property to planes to 
enable
+ * support for different rotation modes.
+ *
+ * Returns:
+ * Zero on success, negative errno on failure.
+ */
+int drm_mode_create_rotation_property(struct drm_device *dev,
+ unsigned int supported_rotations)
 {
-   static const struct drm_prop_enum_list props[] = {
+   struct drm_property *prop;
+   static const struct drm_prop_enum_list values[] = {
{ DRM_ROTATE_0,   "rotate-0" },
{ DRM_ROTATE_90,  "rotate-90" },
{ DRM_ROTATE_180, "rotate-180" },
@@ -5873,9 +5886,13 @@ struct drm_property 
*drm_mode_create_rotation_property(struct drm_device *dev,
{ DRM_REFLECT_Y,  "reflect-y" },
};

-   return drm_property_create_bitmask(dev, 0, "rotation",
-  props, ARRAY_SIZE(props),
-  supported_rotations);
+   prop = drm_property_create_bitmask(dev, 0, "rotation", values,
+   ARRAY_SIZE(values), supported_rotations);
+   if (!prop)
+   return -ENOMEM;
+
+   dev->mode_config.rotation_property = prop;
+   return 0;
 }
 EXPORT_SYMBOL(drm_mode_create_rotation_property);

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 02f6ccb848a9..5b7ba46491a0 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -14042,8 +14042,7 @@ void intel_create_rotation_property(struct drm_device 
*dev, struct intel_plane *
if (INTEL_INFO(dev)->gen >= 9)
flags |= BIT(DRM_ROTATE_90) | BIT(DRM_ROTATE_270);

-   dev->mode_config.rotation_property =
-   drm_mode_create_rotation_property(dev, flags);
+   drm_mode_create_rotation_property(dev, flags);
}
if (dev->mode_config.rotation_property)
drm_object_attach_property(&plane->base.base,
@@ -14179,8 +14178,7 @@ static struct drm_plane 
*intel_cursor_plane_create(struct drm_device *dev,

if (INTEL_INFO(dev)->gen >= 4) {
if (!dev->mode_config.rotation_property)
-   dev->mode_config.rotation_property =
-   drm_mode_create_rotation_property(dev,
+

[PATCH v3 2/3] drm/exynos: use generic code for managing zpos plane property

2016-01-12 Thread Marek Szyprowski

This patch replaces the custom zpos property handling code in the Exynos DRM
driver with calls to generic DRM code.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/exynos/exynos_drm_drv.h   |  2 -
 drivers/gpu/drm/exynos/exynos_drm_plane.c | 66 +++
 drivers/gpu/drm/exynos/exynos_mixer.c |  6 ++-
 3 files changed, 18 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.h 
b/drivers/gpu/drm/exynos/exynos_drm_drv.h
index 17b5ded72ff1..816537886e4e 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.h
@@ -64,7 +64,6 @@ struct exynos_drm_plane_state {
struct exynos_drm_rect src;
unsigned int h_ratio;
unsigned int v_ratio;
-   unsigned int zpos;
 };

 static inline struct exynos_drm_plane_state *
@@ -217,7 +216,6 @@ struct exynos_drm_private {
 * this array is used to be aware of which crtc did it request vblank.
 */
struct drm_crtc *crtc[MAX_CRTC];
-   struct drm_property *plane_zpos_property;

unsigned long da_start;
unsigned long da_space_size;
diff --git a/drivers/gpu/drm/exynos/exynos_drm_plane.c 
b/drivers/gpu/drm/exynos/exynos_drm_plane.c
index d86227236f55..a434d3a6bb90 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_plane.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_plane.c
@@ -137,9 +137,9 @@ static void exynos_drm_plane_reset(struct drm_plane *plane)

exynos_state = kzalloc(sizeof(*exynos_state), GFP_KERNEL);
if (exynos_state) {
-   exynos_state->zpos = exynos_plane->config->zpos;
plane->state = &exynos_state->base;
plane->state->plane = plane;
+   plane->state->zpos = exynos_plane->config->zpos;
}
 }

@@ -155,7 +155,6 @@ exynos_drm_plane_duplicate_state(struct drm_plane *plane)
return NULL;

__drm_atomic_helper_plane_duplicate_state(plane, &copy->base);
-   copy->zpos = exynos_state->zpos;
return &copy->base;
 }

@@ -168,43 +167,6 @@ static void exynos_drm_plane_destroy_state(struct 
drm_plane *plane,
kfree(old_exynos_state);
 }

-static int exynos_drm_plane_atomic_set_property(struct drm_plane *plane,
-   struct drm_plane_state *state,
-   struct drm_property *property,
-   uint64_t val)
-{
-   struct exynos_drm_plane *exynos_plane = to_exynos_plane(plane);
-   struct exynos_drm_plane_state *exynos_state =
-   to_exynos_plane_state(state);
-   struct exynos_drm_private *dev_priv = plane->dev->dev_private;
-   const struct exynos_drm_plane_config *config = exynos_plane->config;
-
-   if (property == dev_priv->plane_zpos_property &&
-   (config->capabilities & EXYNOS_DRM_PLANE_CAP_ZPOS))
-   exynos_state->zpos = val;
-   else
-   return -EINVAL;
-
-   return 0;
-}
-
-static int exynos_drm_plane_atomic_get_property(struct drm_plane *plane,
- const struct drm_plane_state *state,
- struct drm_property *property,
- uint64_t *val)
-{
-   const struct exynos_drm_plane_state *exynos_state =
-   container_of(state, const struct exynos_drm_plane_state, base);
-   struct exynos_drm_private *dev_priv = plane->dev->dev_private;
-
-   if (property == dev_priv->plane_zpos_property)
-   *val = exynos_state->zpos;
-   else
-   return -EINVAL;
-
-   return 0;
-}
-
 static struct drm_plane_funcs exynos_plane_funcs = {
.update_plane   = drm_atomic_helper_update_plane,
.disable_plane  = drm_atomic_helper_disable_plane,
@@ -213,8 +175,6 @@ static struct drm_plane_funcs exynos_plane_funcs = {
.reset  = exynos_drm_plane_reset,
.atomic_duplicate_state = exynos_drm_plane_duplicate_state,
.atomic_destroy_state = exynos_drm_plane_destroy_state,
-   .atomic_set_property = exynos_drm_plane_atomic_set_property,
-   .atomic_get_property = exynos_drm_plane_atomic_get_property,
 };

 static int
@@ -302,20 +262,21 @@ static const struct drm_plane_helper_funcs 
plane_helper_funcs = {
 };

 static void exynos_plane_attach_zpos_property(struct drm_plane *plane,
- unsigned int zpos)
+ unsigned int zpos, bool immutable)
 {
struct drm_device *dev = plane->dev;
-   struct exynos_drm_private *dev_priv = dev->dev_private;
struct drm_property *prop;

-   prop = dev_priv->plane_zpos_property;
-   if (!prop) {
-   prop = drm_property_create_range(dev, 0, "zpos",
-0, MAX_PLANE - 1);
-   if (!prop)
-   return

[PATCH 16/22] drm/omap: Nuke close hooks

2016-01-12 Thread Tomi Valkeinen

On 11/01/16 23:41, Daniel Vetter wrote:
> Again since the core takes care of this we can remove them. While at
> it also remove the postclose hook, it's empty.
> 
> v2: Laurent pointed me at even more code to delete.
> 
> Cc: Laurent Pinchart 
> Cc: Tomi Valkeinen 
> Acked-by: Daniel Stone 
> Reviewed-by: Alex Deucher 
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/omapdrm/omap_crtc.c | 13 +---
>  drivers/gpu/drm/omapdrm/omap_drv.c  | 41 
> -
>  drivers/gpu/drm/omapdrm/omap_drv.h  |  1 -
>  3 files changed, 1 insertion(+), 54 deletions(-)

Acked-by: Tomi Valkeinen 

 Tomi

[PATCH 20/22] drm/tilcdc: Nuke preclose hook

2016-01-12 Thread Tomi Valkeinen


On 11/01/16 23:41, Daniel Vetter wrote:
> Again since the drm core takes care of event unlinking/disarming this
> is now just needless code.
> 
> v2: Fixup misplaced hunks.
> 
> Cc: Rob Clark 
> Acked-by: Daniel Stone 
> Reviewed-by: Alex Deucher  (v1)
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/tilcdc/tilcdc_crtc.c | 20 
>  drivers/gpu/drm/tilcdc/tilcdc_drv.c  |  8 
>  drivers/gpu/drm/tilcdc/tilcdc_drv.h  |  1 -
>  3 files changed, 29 deletions(-)
> 
> diff --git a/drivers/gpu/drm/tilcdc/tilcdc_crtc.c 
> b/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
> index 7d07733bdc86..4802da8e6d6f 100644
> --- a/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
> +++ b/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
> @@ -662,26 +662,6 @@ irqreturn_t tilcdc_crtc_irq(struct drm_crtc *crtc)
>   return IRQ_HANDLED;
>  }
>  
> -void tilcdc_crtc_cancel_page_flip(struct drm_crtc *crtc, struct drm_file 
> *file)
> -{
> - struct tilcdc_crtc *tilcdc_crtc = to_tilcdc_crtc(crtc);
> - struct drm_pending_vblank_event *event;
> - struct drm_device *dev = crtc->dev;
> - unsigned long flags;
> -
> - /* Destroy the pending vertical blanking event associated with the
> -  * pending page flip, if any, and disable vertical blanking interrupts.
> -  */
> - spin_lock_irqsave(&dev->event_lock, flags);
> - event = tilcdc_crtc->event;
> - if (event && event->base.file_priv == file) {
> - tilcdc_crtc->event = NULL;
> - event->base.destroy(&event->base);
> - drm_vblank_put(dev, 0);
> - }
> - spin_unlock_irqrestore(&dev->event_lock, flags);
> -}
> -

Hmm, looks fine, but when I was comparing the omapdrm change and this
one, I see tilcdc doing drm_vblank_put() in the removed code but omapdrm
doesn't.

The other patches that nuke preclose hooks also contain vblank_put. Will
there be a vblank_put call missing here, or will there be an extra
vblank_put call happening somewhere on omapdrm?

 Tomi

[Bug 92923] SGPR spilling

2016-01-12 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=92923

--- Comment #12 from Nicolai Hähnle  ---
Created attachment 120987
  --> https://bugs.freedesktop.org/attachment.cgi?id=120987&action=edit
always add GTT to buffers' initial domain

How much VRAM do you have? Try running the game with
GALLIUM_HUD=num-bytes-moved, with and without the attached patch.

[PATCH 20/22] drm/tilcdc: Nuke preclose hook

2016-01-12 Thread Daniel Vetter

On Tue, Jan 12, 2016 at 04:19:39PM +0200, Tomi Valkeinen wrote:
> 
> On 11/01/16 23:41, Daniel Vetter wrote:
> > Again since the drm core takes care of event unlinking/disarming this
> > is now just needless code.
> > 
> > v2: Fixup misplaced hunks.
> > 
> > Cc: Rob Clark 
> > Acked-by: Daniel Stone 
> > Reviewed-by: Alex Deucher  (v1)
> > Signed-off-by: Daniel Vetter 
> > ---
> >  drivers/gpu/drm/tilcdc/tilcdc_crtc.c | 20 
> >  drivers/gpu/drm/tilcdc/tilcdc_drv.c  |  8 
> >  drivers/gpu/drm/tilcdc/tilcdc_drv.h  |  1 -
> >  3 files changed, 29 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/tilcdc/tilcdc_crtc.c 
> > b/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
> > index 7d07733bdc86..4802da8e6d6f 100644
> > --- a/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
> > +++ b/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
> > @@ -662,26 +662,6 @@ irqreturn_t tilcdc_crtc_irq(struct drm_crtc *crtc)
> > return IRQ_HANDLED;
> >  }
> >  
> > -void tilcdc_crtc_cancel_page_flip(struct drm_crtc *crtc, struct drm_file 
> > *file)
> > -{
> > -   struct tilcdc_crtc *tilcdc_crtc = to_tilcdc_crtc(crtc);
> > -   struct drm_pending_vblank_event *event;
> > -   struct drm_device *dev = crtc->dev;
> > -   unsigned long flags;
> > -
> > -   /* Destroy the pending vertical blanking event associated with the
> > -* pending page flip, if any, and disable vertical blanking interrupts.
> > -*/
> > -   spin_lock_irqsave(&dev->event_lock, flags);
> > -   event = tilcdc_crtc->event;
> > -   if (event && event->base.file_priv == file) {
> > -   tilcdc_crtc->event = NULL;
> > -   event->base.destroy(&event->base);
> > -   drm_vblank_put(dev, 0);
> > -   }
> > -   spin_unlock_irqrestore(&dev->event_lock, flags);
> > -}
> > -
> 
> Hmm, looks fine, but when I was comparing the omapdrm change and this
> one, I see tilcdc doing drm_vblank_put() in the removed code but omapdrm
> doesn't.
> 
> The other patches that nuke preclose hooks also contain vblank_put. Will
> there be a vblank_put call missing here, or will there be an extra
> vblank_put call happening somewhere on omapdrm?

Different approaches to the same problem:

- omap just unlinks the event from fpriv and still processes it normally.
  But then before sending it out it checks whether the fpriv is still
  there or not and either sends it, or deletes the event directly. This
  way the vblank_put is always called from the worker/irq handler as part
  of the event processing.

  This is the same approach I implemented in core with this series.

- tilcdc (and most other drivers) entirely destroy the event in the
  preclose hook, which means they must also release any other resources
  acquired as part of that event. Therefore they have the vblank_put here.
  But the vblank_put is obviously also in the normal event processing
  paths, so with the new approach of only unlinking it we can handle this
  without any special cases in the driver.
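
The "unlink only" approach can be sketched in userspace roughly as follows. This is an illustration, not the DRM core code: struct file_ctx, struct pending_event, struct device_ctx and the three helpers are hypothetical, and a plain counter stands in for the vblank reference taken when the event is armed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct file_ctx { int delivered; };      /* hypothetical per-open-file state */
struct device_ctx { int vblank_refs; };  /* stand-in for vblank get/put */

struct pending_event {
	struct file_ctx *file;  /* NULL once the owning file has been closed */
	bool completed;
};

/* Arming an event takes one vblank reference. */
static void event_arm(struct device_ctx *dev, struct pending_event *e,
		      struct file_ctx *f)
{
	e->file = f;
	e->completed = false;
	dev->vblank_refs++;
}

/* File close: only unlink. The event and its vblank reference stay alive. */
static void file_close_unlink(struct pending_event *e, const struct file_ctx *f)
{
	if (e->file == f)
		e->file = NULL;
}

/*
 * Normal completion path (irq handler or worker). This is the single place
 * that drops the vblank reference, whether or not anyone is still listening.
 */
static void event_complete(struct device_ctx *dev, struct pending_event *e)
{
	assert(!e->completed);
	if (e->file)
		e->file->delivered++;  /* deliver to the still-open file */
	/* otherwise the event is simply discarded */
	e->completed = true;
	dev->vblank_refs--;
}
```

Because the reference is dropped in exactly one place, closing a file never needs a matching put of its own, which is why the driver-side preclose code becomes unnecessary under this scheme.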

I hope this explains what's going on. Since you're about driver maintainer
no. 3 with the same question: Can you pls review the kerneldoc and make
sure it explains this well? I tried to improve it already a bit after
Laurent's/Thomas' questions.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

[PATCH v9 00/14] MT8173 DRM support

2016-01-12 Thread Philipp Zabel

Hi,

this MT8173 DRM update fixes a DPI suspend/resume refcounting bug and
cleans up the HDMI driver a bit further. The audio clock regeneration
configuration now just uses the N values recommended by the spec and
calculates CTS. A new patch enables the RENDER driver feature and adds
GEM creation and mapping IOCTLs.
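
As background for the N/CTS rework, the usual relationship between the two values can be sketched as below. This is not the driver's code: hdmi_recommended_n() and hdmi_calc_cts() are hypothetical helpers, the N table only covers the three base sample rates with their general-case values (some TMDS clocks have their own recommended N in the specification's tables), and the formula is the standard audio clock regeneration relation CTS = f_TMDS * N / (128 * fs).

```c
#include <assert.h>
#include <stdint.h>

/*
 * General-case N values for the three base sample rates.
 * Returns 0 for rates this sketch does not cover.
 */
static uint32_t hdmi_recommended_n(uint32_t sample_rate_hz)
{
	switch (sample_rate_hz) {
	case 32000:
		return 4096;
	case 44100:
		return 6272;
	case 48000:
		return 6144;
	default:
		return 0;
	}
}

/* CTS = f_TMDS * N / (128 * fs), rounded down; 64-bit math avoids overflow. */
static uint32_t hdmi_calc_cts(uint32_t tmds_clock_hz, uint32_t n,
			      uint32_t sample_rate_hz)
{
	assert(sample_rate_hz != 0);
	return (uint32_t)(((uint64_t)tmds_clock_hz * n) /
			  (128ULL * sample_rate_hz));
}
```

For example, under these assumptions a 148.5 MHz TMDS clock with 48 kHz audio uses N = 6144, and the helper yields CTS = 148500.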

Changes since v8:
 - Fixed a DPI enable/disable and suspend/resume power count problem
 - Reworked N, CTS setup (use recommended N from spec, calculate CTS)
 - Dropped deep color support
 - Dropped unused tables
 - Changed some noisy dev_info to dev_dbg
 - Improved mode_valid return value
 - Added RENDER driver feature and MTK_GEM_CREATE/MAP_OFFSET IOCTLs

The following patches are needed to cleanly apply the device tree changes on
top of v4.4:

61aee9342514 ("arm64: dts: mt8173: add MT8173 display PWM driver support node")
from https://github.com/mbgg/linux-mediatek.git v4.4-next/arm64

https://patchwork.kernel.org/patch/7880431/ ("dts: mt8173: Add iommu/smi nodes for mt8173")

And to build:

https://patchwork.kernel.org/patch/7880301/ ("dt-bindings: mediatek: Add smi dts binding")
https://patchwork.kernel.org/patch/7880321/ ("memory: mediatek: Add SMI driver")

regards
Philipp

CK Hu (6):
  dt-bindings: drm/mediatek: Add Mediatek display subsystem dts binding
  drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.
  drm/mediatek: Add DSI sub driver
  arm64: dts: mt8173: Add display subsystem related nodes
  arm64: dts: mt8173: Add HDMI related nodes
  drm/mediatek: Add interface to allocate Mediatek GEM buffer.

Jie Qiu (3):
  drm/mediatek: Add DPI sub driver
  drm/mediatek: Add HDMI support
  drm/mediatek: enable hdmi output control bit

Philipp Zabel (5):
  dt-bindings: drm/mediatek: Add Mediatek HDMI dts binding
  clk: mediatek: make dpi0_sel propagate rate changes
  clk: mediatek: Add hdmi_ref HDMI PHY PLL reference clock output
  dt-bindings: hdmi-connector: add DDC I2C bus phandle documentation
  clk: mediatek: remove hdmitx_dig_cts from TOP clocks

 .../bindings/display/connector/hdmi-connector.txt  |   1 +
 .../bindings/display/mediatek/mediatek,disp.txt| 203 +
 .../bindings/display/mediatek/mediatek,dpi.txt |  35 +
 .../bindings/display/mediatek/mediatek,dsi.txt |  60 ++
 .../bindings/display/mediatek/mediatek,hdmi.txt| 148 
 arch/arm64/boot/dts/mediatek/mt8173.dtsi   | 312 
 drivers/clk/mediatek/clk-mt8173.c  |   8 +-
 drivers/clk/mediatek/clk-mtk.h |   7 +-
 drivers/gpu/drm/Kconfig|   2 +
 drivers/gpu/drm/Makefile   |   1 +
 drivers/gpu/drm/mediatek/Kconfig   |  22 +
 drivers/gpu/drm/mediatek/Makefile  |  22 +
 drivers/gpu/drm/mediatek/mtk_cec.c | 245 ++
 drivers/gpu/drm/mediatek/mtk_cec.h |  25 +
 drivers/gpu/drm/mediatek/mtk_disp_ovl.c| 301 
 drivers/gpu/drm/mediatek/mtk_dpi.c | 757 ++
 drivers/gpu/drm/mediatek/mtk_dpi.h |  85 +++
 drivers/gpu/drm/mediatek/mtk_dpi_regs.h| 228 ++
 drivers/gpu/drm/mediatek/mtk_drm_crtc.c| 603 +++
 drivers/gpu/drm/mediatek/mtk_drm_crtc.h|  32 +
 drivers/gpu/drm/mediatek/mtk_drm_ddp.c | 355 +
 drivers/gpu/drm/mediatek/mtk_drm_ddp.h |  41 +
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c| 275 +++
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h| 148 
 drivers/gpu/drm/mediatek/mtk_drm_drv.c | 592 ++
 drivers/gpu/drm/mediatek/mtk_drm_drv.h |  55 ++
 drivers/gpu/drm/mediatek/mtk_drm_fb.c  | 165 
 drivers/gpu/drm/mediatek/mtk_drm_fb.h  |  29 +
 drivers/gpu/drm/mediatek/mtk_drm_gem.c | 266 +++
 drivers/gpu/drm/mediatek/mtk_drm_gem.h |  67 ++
 drivers/gpu/drm/mediatek/mtk_drm_hdmi_drv.c| 609 +++
 drivers/gpu/drm/mediatek/mtk_drm_plane.c   | 242 ++
 drivers/gpu/drm/mediatek/mtk_drm_plane.h   |  59 ++
 drivers/gpu/drm/mediatek/mtk_dsi.c | 847 +
 drivers/gpu/drm/mediatek/mtk_dsi.h |  58 ++
 drivers/gpu/drm/mediatek/mtk_hdmi.c| 479 
 drivers/gpu/drm/mediatek/mtk_hdmi.h| 223 ++
 drivers/gpu/drm/mediatek/mtk_hdmi_ddc_drv.c| 362 +
 drivers/gpu/drm/mediatek/mtk_hdmi_hw.c | 663 
 drivers/gpu/drm/mediatek/mtk_hdmi_hw.h |  73 ++
 drivers/gpu/drm/mediatek/mtk_hdmi_regs.h   | 222 ++
 drivers/gpu/drm/mediatek/mtk_mipi_tx.c | 487 
 drivers/gpu/drm/mediatek/mtk_mt8173_hdmi_phy.c | 506 
 include/dt-bindings/clock/mt8173-clk.h |   3 +-
 include/uapi/drm/mediatek_drm.h|  59 ++
 45 files changed, 9977 insertions(+), 5 deletions(-)
 create mode 100644

[PATCH v9 01/14] dt-bindings: drm/mediatek: Add Mediatek display subsystem dts binding

2016-01-12 Thread Philipp Zabel

From: CK Hu 

Add device tree binding documentation for the display subsystem in
Mediatek MT8173 SoCs.

Signed-off-by: CK Hu 
Signed-off-by: Philipp Zabel 
Acked-by: Rob Herring 
---
 .../bindings/display/mediatek/mediatek,disp.txt| 203 +
 .../bindings/display/mediatek/mediatek,dpi.txt |  35 
 .../bindings/display/mediatek/mediatek,dsi.txt |  60 ++
 3 files changed, 298 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
 create mode 100644 Documentation/devicetree/bindings/display/mediatek/mediatek,dpi.txt
 create mode 100644 Documentation/devicetree/bindings/display/mediatek/mediatek,dsi.txt

diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
new file mode 100644
index 0000000..db6e77e
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
@@ -0,0 +1,203 @@
+Mediatek display subsystem
+==========================
+
+The Mediatek display subsystem consists of various DISP function blocks in the
+MMSYS register space. The connections between them can be configured by output
+and input selectors in the MMSYS_CONFIG register space. Pixel clock and start
+of frame signal are distributed to the other function blocks by a DISP_MUTEX
+function block.
+
+All DISP device tree nodes must be siblings to the central MMSYS_CONFIG node.
+For a description of the MMSYS_CONFIG binding, see
+Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.txt.
+
+DISP function blocks
+
+
+A display stream starts at a source function block that reads pixel data from
+memory and ends with a sink function block that drives pixels on a display
+interface, or writes pixels back to memory. All DISP function blocks have
+their own register space, interrupt, and clock gate. The blocks that can
+access memory additionally have to list the IOMMU and local arbiter they are
+connected to.
+
+For a description of the display interface sink function blocks, see
+Documentation/devicetree/bindings/display/mediatek/mediatek,dsi.txt and
+Documentation/devicetree/bindings/display/mediatek/mediatek,dpi.txt.
+
+Required properties (all function blocks):
+- compatible: "mediatek,-disp-", one of
+   "mediatek,-disp-ovl"   - overlay (4 layers, blending, csc)
+   "mediatek,-disp-rdma"  - read DMA / line buffer
+   "mediatek,-disp-wdma"  - write DMA
+   "mediatek,-disp-color" - color processor
+   "mediatek,-disp-aal"   - adaptive ambient light controller
+   "mediatek,-disp-gamma" - gamma correction
+   "mediatek,-disp-merge" - merge streams from two RDMA sources
+   "mediatek,-disp-split" - split stream to two encoders
+   "mediatek,-disp-ufoe"  - data compression engine
+   "mediatek,-dsi"- DSI controller, see mediatek,dsi.txt
+   "mediatek,-dpi"- DPI controller, see mediatek,dpi.txt
+   "mediatek,-disp-mutex" - display mutex
+   "mediatek,-disp-od"- overdrive
+- reg: Physical base address and length of the function block register space
+- interrupts: The interrupt signal from the function block (required, except for
+  merge and split function blocks).
+- clocks: device clocks
+  See Documentation/devicetree/bindings/clock/clock-bindings.txt for details.
+  For most function blocks this is just a single clock input. Only the DSI and
+  DPI controller nodes have multiple clock inputs. These are documented in
+  mediatek,dsi.txt and mediatek,dpi.txt, respectively.
+
+Required properties (DMA function blocks):
+- compatible: Should be one of
+   "mediatek,-disp-ovl"
+   "mediatek,-disp-rdma"
+   "mediatek,-disp-wdma"
+- larb: Should contain a phandle pointing to the local arbiter device as defined
+  in Documentation/devicetree/bindings/soc/mediatek/mediatek,smi-larb.txt
+- iommus: Should point to the respective IOMMU block with master port as
+  argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.txt
+  for details.
+
+Examples:
+
+mmsys: clock-controller@14000000 {
+   compatible = "mediatek,mt8173-mmsys", "syscon";
+   reg = <0 0x14000000 0 0x1000>;
+   power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
+   #clock-cells = <1>;
+};
+
+ovl0: ovl@1400c000 {
+   compatible = "mediatek,mt8173-disp-ovl";
+   reg = <0 0x1400c000 0 0x1000>;
+   interrupts = ;
+   power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
+   clocks = <&mmsys CLK_MM_DISP_OVL0>;
+   iommus = <&iommu M4U_PORT_DISP_OVL0>;
+   mediatek,larb = <&larb0>;
+};
+
+ovl1: ovl@1400d000 {
+   compatible = "mediatek,mt8173-disp-ovl";
+   reg = <0 0x1400d000 0 0x1000>;
+   interrupts = ;
+   power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
+   clocks = <&mmsys CLK_MM_DISP_OVL1>;
+   iommus = <&iommu M4U_PORT_DISP_OVL1>;
+   mediatek,larb = <&larb4>;
+};
+
+rdma0: rdma

[PATCH v9 03/14] drm/mediatek: Add DSI sub driver

2016-01-12 Thread Philipp Zabel

From: CK Hu 

This patch adds a DRM encoder/connector driver for the MIPI DSI function
block of the Mediatek display subsystem and a PHY driver for the MIPI TX
D-PHY control module.

Signed-off-by: Jitao Shi 
Signed-off-by: Philipp Zabel 
---
 drivers/gpu/drm/mediatek/Kconfig   |   3 +
 drivers/gpu/drm/mediatek/Makefile  |   4 +-
 drivers/gpu/drm/mediatek/mtk_drm_drv.c |   2 +
 drivers/gpu/drm/mediatek/mtk_drm_drv.h |   2 +
 drivers/gpu/drm/mediatek/mtk_dsi.c | 847 +
 drivers/gpu/drm/mediatek/mtk_dsi.h |  58 +++
 drivers/gpu/drm/mediatek/mtk_mipi_tx.c | 487 +++
 7 files changed, 1402 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/mediatek/mtk_dsi.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_dsi.h
 create mode 100644 drivers/gpu/drm/mediatek/mtk_mipi_tx.c

diff --git a/drivers/gpu/drm/mediatek/Kconfig b/drivers/gpu/drm/mediatek/Kconfig
index 8dad892..b7e0404 100644
--- a/drivers/gpu/drm/mediatek/Kconfig
+++ b/drivers/gpu/drm/mediatek/Kconfig
@@ -3,6 +3,9 @@ config DRM_MEDIATEK
depends on DRM
depends on ARCH_MEDIATEK || (ARM && COMPILE_TEST)
select DRM_KMS_HELPER
+   select DRM_MIPI_DSI
+   select DRM_PANEL
+   select DRM_PANEL_SIMPLE
select IOMMU_DMA
select MTK_SMI
help
diff --git a/drivers/gpu/drm/mediatek/Makefile b/drivers/gpu/drm/mediatek/Makefile
index c7cc41a..e1a40f4 100644
--- a/drivers/gpu/drm/mediatek/Makefile
+++ b/drivers/gpu/drm/mediatek/Makefile
@@ -5,6 +5,8 @@ mediatek-drm-y := mtk_disp_ovl.o \
  mtk_drm_drv.o \
  mtk_drm_fb.o \
  mtk_drm_gem.o \
- mtk_drm_plane.o
+ mtk_drm_plane.o \
+ mtk_dsi.o \
+ mtk_mipi_tx.o

 obj-$(CONFIG_DRM_MEDIATEK) += mediatek-drm.o
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index 9db22b4..39267f9 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -536,6 +536,8 @@ static struct platform_driver mtk_drm_platform_driver = {
 static struct platform_driver * const mtk_drm_drivers[] = {
&mtk_drm_platform_driver,
&mtk_disp_ovl_driver,
+   &mtk_dsi_driver,
+   &mtk_mipi_tx_driver,
 };

 static int __init mtk_drm_init(void)
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.h b/drivers/gpu/drm/mediatek/mtk_drm_drv.h
index 75e1b7d..e86c19e 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.h
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.h
@@ -48,5 +48,7 @@ struct mtk_drm_private {
 };

 extern struct platform_driver mtk_disp_ovl_driver;
+extern struct platform_driver mtk_dsi_driver;
+extern struct platform_driver mtk_mipi_tx_driver;

 #endif /* MTK_DRM_DRV_H */
diff --git a/drivers/gpu/drm/mediatek/mtk_dsi.c b/drivers/gpu/drm/mediatek/mtk_dsi.c
new file mode 100644
index 0000000..6ab5a31
--- /dev/null
+++ b/drivers/gpu/drm/mediatek/mtk_dsi.c
@@ -0,0 +1,847 @@
+/*
+ * Copyright (c) 2015 MediaTek Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mtk_dsi.h"
+
+#define DSI_VIDEO_FIFO_DEPTH   (1920 / 4)
+#define DSI_HOST_FIFO_DEPTH64
+
+#define DSI_START  0x00
+
+#define DSI_CON_CTRL   0x10
+#define DSI_RESET  BIT(0)
+#define DSI_EN BIT(1)
+
+#define DSI_MODE_CTRL  0x14
+#define MODE   (3)
+#define CMD_MODE   0
+#define SYNC_PULSE_MODE1
+#define SYNC_EVENT_MODE2
+#define BURST_MODE 3
+#define FRM_MODE   BIT(16)
+#define MIX_MODE   BIT(17)
+
+#define DSI_TXRX_CTRL  0x18
+#define VC_NUM (2 << 0)
+#define LANE_NUM   (0xf << 2)
+#define DIS_EOTBIT(6)
+#define NULL_ENBIT(7)
+#define TE_FREERUN BIT(8)
+#define EXT_TE_EN  BIT(9)
+#define EXT_TE_EDGEBIT(10)
+#define MAX_RTN_SIZE   (0xf << 12)
+#define HSTX_CKLP_EN   BIT(16)
+
+#define DSI_PSCTRL 0x1c
+#define DSI_PS_WC  0x3fff
+#define DSI_PS_SEL (3 << 16)
+#define PACKED_PS_16BIT_RGB565 (0 << 16)
+#define LOOSELY_PS_18BIT_

[PATCH v9 02/14] drm/mediatek: Add DRM Driver for Mediatek SoC MT8173.

2016-01-12 Thread Philipp Zabel

From: CK Hu 

This patch adds an initial DRM driver for the Mediatek MT8173 DISP
subsystem. It currently supports two fixed output streams from the
OVL0/OVL1 sources to the DSI0/DPI0 sinks, respectively.

Signed-off-by: CK Hu 
Signed-off-by: YT Shen 
Signed-off-by: Daniel Kurtz 
Signed-off-by: Philipp Zabel 
---
 drivers/gpu/drm/Kconfig |   2 +
 drivers/gpu/drm/Makefile|   1 +
 drivers/gpu/drm/mediatek/Kconfig|  12 +
 drivers/gpu/drm/mediatek/Makefile   |  10 +
 drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 301 ++
 drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 603 
 drivers/gpu/drm/mediatek/mtk_drm_crtc.h |  32 ++
 drivers/gpu/drm/mediatek/mtk_drm_ddp.c  | 355 
 drivers/gpu/drm/mediatek/mtk_drm_ddp.h  |  41 ++
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 275 +
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h | 148 +++
 drivers/gpu/drm/mediatek/mtk_drm_drv.c  | 577 ++
 drivers/gpu/drm/mediatek/mtk_drm_drv.h  |  52 +++
 drivers/gpu/drm/mediatek/mtk_drm_fb.c   | 165 
 drivers/gpu/drm/mediatek/mtk_drm_fb.h   |  29 ++
 drivers/gpu/drm/mediatek/mtk_drm_gem.c  | 227 +++
 drivers/gpu/drm/mediatek/mtk_drm_gem.h  |  55 +++
 drivers/gpu/drm/mediatek/mtk_drm_plane.c| 242 +++
 drivers/gpu/drm/mediatek/mtk_drm_plane.h|  59 +++
 19 files changed, 3186 insertions(+)
 create mode 100644 drivers/gpu/drm/mediatek/Kconfig
 create mode 100644 drivers/gpu/drm/mediatek/Makefile
 create mode 100644 drivers/gpu/drm/mediatek/mtk_disp_ovl.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_crtc.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_crtc.h
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_ddp.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_ddp.h
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_drv.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_drv.h
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_fb.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_fb.h
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_gem.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_gem.h
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_plane.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_plane.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index c4bf9a1..8fdb0c2 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -266,3 +266,5 @@ source "drivers/gpu/drm/amd/amdkfd/Kconfig"
 source "drivers/gpu/drm/imx/Kconfig"

 source "drivers/gpu/drm/vc4/Kconfig"
+
+source "drivers/gpu/drm/mediatek/Kconfig"
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 1e9ff4c..607a49f 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -71,6 +71,7 @@ obj-$(CONFIG_DRM_MSM) += msm/
 obj-$(CONFIG_DRM_TEGRA) += tegra/
 obj-$(CONFIG_DRM_STI) += sti/
 obj-$(CONFIG_DRM_IMX) += imx/
+obj-$(CONFIG_DRM_MEDIATEK) += mediatek/
 obj-y  += i2c/
 obj-y  += panel/
 obj-y  += bridge/
diff --git a/drivers/gpu/drm/mediatek/Kconfig b/drivers/gpu/drm/mediatek/Kconfig
new file mode 100644
index 0000000..8dad892
--- /dev/null
+++ b/drivers/gpu/drm/mediatek/Kconfig
@@ -0,0 +1,12 @@
+config DRM_MEDIATEK
+   tristate "DRM Support for Mediatek SoCs"
+   depends on DRM
+   depends on ARCH_MEDIATEK || (ARM && COMPILE_TEST)
+   select DRM_KMS_HELPER
+   select IOMMU_DMA
+   select MTK_SMI
+   help
+ Choose this option if you have a Mediatek SoC.
+ The module will be called mediatek-drm
+ This driver provides kernel mode setting and
+ buffer management to userspace.
diff --git a/drivers/gpu/drm/mediatek/Makefile b/drivers/gpu/drm/mediatek/Makefile
new file mode 100644
index 0000000..c7cc41a
--- /dev/null
+++ b/drivers/gpu/drm/mediatek/Makefile
@@ -0,0 +1,10 @@
+mediatek-drm-y := mtk_disp_ovl.o \
+ mtk_drm_crtc.o \
+ mtk_drm_ddp.o \
+ mtk_drm_ddp_comp.o \
+ mtk_drm_drv.o \
+ mtk_drm_fb.o \
+ mtk_drm_gem.o \
+ mtk_drm_plane.o
+
+obj-$(CONFIG_DRM_MEDIATEK) += mediatek-drm.o
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
new file mode 100644
index 0000000..455e62e
--- /dev/null
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
@@ -0,0 +1,301 @@
+/*
+ * Copyright (c) 2015 MediaTek Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WAR

[PATCH v9 04/14] drm/mediatek: Add DPI sub driver

2016-01-12 Thread Philipp Zabel

From: Jie Qiu 

Add DPI connector/encoder to support HDMI output via the
attached HDMI bridge.

Signed-off-by: Jie Qiu 
Signed-off-by: Philipp Zabel 
---
Changes since v8:
- Fixed a DPI enable/disable and suspend/resume power count problem
---
 drivers/gpu/drm/mediatek/Makefile   |   3 +-
 drivers/gpu/drm/mediatek/mtk_dpi.c  | 757 
 drivers/gpu/drm/mediatek/mtk_dpi.h  |  85 
 drivers/gpu/drm/mediatek/mtk_dpi_regs.h | 228 ++
 drivers/gpu/drm/mediatek/mtk_drm_drv.c  |   1 +
 drivers/gpu/drm/mediatek/mtk_drm_drv.h  |   1 +
 6 files changed, 1074 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/mediatek/mtk_dpi.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_dpi.h
 create mode 100644 drivers/gpu/drm/mediatek/mtk_dpi_regs.h

diff --git a/drivers/gpu/drm/mediatek/Makefile b/drivers/gpu/drm/mediatek/Makefile
index e1a40f4..218071c 100644
--- a/drivers/gpu/drm/mediatek/Makefile
+++ b/drivers/gpu/drm/mediatek/Makefile
@@ -7,6 +7,7 @@ mediatek-drm-y := mtk_disp_ovl.o \
  mtk_drm_gem.o \
  mtk_drm_plane.o \
  mtk_dsi.o \
- mtk_mipi_tx.o
+ mtk_mipi_tx.o \
+ mtk_dpi.o

 obj-$(CONFIG_DRM_MEDIATEK) += mediatek-drm.o
diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c b/drivers/gpu/drm/mediatek/mtk_dpi.c
new file mode 100644
index 0000000..e5011cc
--- /dev/null
+++ b/drivers/gpu/drm/mediatek/mtk_dpi.c
@@ -0,0 +1,757 @@
+/*
+ * Copyright (c) 2014 MediaTek Inc.
+ * Author: Jie Qiu 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mtk_dpi.h"
+#include "mtk_dpi_regs.h"
+
+enum mtk_dpi_polarity {
+   MTK_DPI_POLARITY_RISING,
+   MTK_DPI_POLARITY_FALLING,
+};
+
+enum mtk_dpi_power_ctl {
+   DPI_POWER_START = BIT(0),
+   DPI_POWER_ENABLE = BIT(1),
+   DPI_POWER_RESUME = BIT(2),
+};
+
+struct mtk_dpi_polarities {
+   enum mtk_dpi_polarity de_pol;
+   enum mtk_dpi_polarity ck_pol;
+   enum mtk_dpi_polarity hsync_pol;
+   enum mtk_dpi_polarity vsync_pol;
+};
+
+struct mtk_dpi_sync_param {
+   u32 sync_width;
+   u32 front_porch;
+   u32 back_porch;
+   bool shift_half_line;
+};
+
+struct mtk_dpi_yc_limit {
+   u16 y_top;
+   u16 y_bottom;
+   u16 c_top;
+   u16 c_bottom;
+};
+
+static void mtk_dpi_mask(struct mtk_dpi *dpi, u32 offset, u32 val, u32 mask)
+{
+   u32 tmp = readl(dpi->regs + offset) & ~mask;
+
+   tmp |= (val & mask);
+   writel(tmp, dpi->regs + offset);
+}
+
+static void mtk_dpi_sw_reset(struct mtk_dpi *dpi, bool reset)
+{
+   mtk_dpi_mask(dpi, DPI_RET, reset ? RST : 0, RST);
+}
+
+static void mtk_dpi_enable(struct mtk_dpi *dpi)
+{
+   mtk_dpi_mask(dpi, DPI_EN, EN, EN);
+}
+
+static void mtk_dpi_disable(struct mtk_dpi *dpi)
+{
+   mtk_dpi_mask(dpi, DPI_EN, 0, EN);
+}
+
+static void mtk_dpi_config_hsync(struct mtk_dpi *dpi,
+struct mtk_dpi_sync_param *sync)
+{
+   mtk_dpi_mask(dpi, DPI_TGEN_HWIDTH,
+ sync->sync_width << HPW, HPW_MASK);
+   mtk_dpi_mask(dpi, DPI_TGEN_HPORCH,
+ sync->back_porch << HBP, HBP_MASK);
+   mtk_dpi_mask(dpi, DPI_TGEN_HPORCH, sync->front_porch << HFP,
+ HFP_MASK);
+}
+
+static void mtk_dpi_config_vsync(struct mtk_dpi *dpi,
+struct mtk_dpi_sync_param *sync,
+u32 width_addr, u32 porch_addr)
+{
+   mtk_dpi_mask(dpi, width_addr,
+sync->sync_width << VSYNC_WIDTH_SHIFT,
+VSYNC_WIDTH_MASK);
+   mtk_dpi_mask(dpi, width_addr,
+sync->shift_half_line << VSYNC_HALF_LINE_SHIFT,
+VSYNC_HALF_LINE_MASK);
+   mtk_dpi_mask(dpi, porch_addr,
+sync->back_porch << VSYNC_BACK_PORCH_SHIFT,
+VSYNC_BACK_PORCH_MASK);
+   mtk_dpi_mask(dpi, porch_addr,
+sync->front_porch << VSYNC_FRONT_PORCH_SHIFT,
+VSYNC_FRONT_PORCH_MASK);
+}
+
+static void mtk_dpi_config_vsync_lodd(struct mtk_dpi *dpi,
+ struct mtk_dpi_sync_param *sync)
+{
+   mtk_dpi_config_vsync(dpi, sync, DPI_TGEN_VWIDTH, DPI_TGEN_VPORCH);
+}
+
+static void mtk_dpi_config_vsync_leven(struct mtk_dpi *dpi,
+  struct mtk_dpi_sync_param *sync)
+{
+   mtk_dpi_config_vsync(d

[PATCH v9 05/14] dt-bindings: drm/mediatek: Add Mediatek HDMI dts binding

2016-01-12 Thread Philipp Zabel

Add the device tree binding documentation for Mediatek HDMI,
HDMI PHY and HDMI DDC devices.

Signed-off-by: Philipp Zabel 
Acked-by: Rob Herring 
---
 .../bindings/display/mediatek/mediatek,hdmi.txt| 148 +
 1 file changed, 148 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/display/mediatek/mediatek,hdmi.txt

diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,hdmi.txt b/Documentation/devicetree/bindings/display/mediatek/mediatek,hdmi.txt
new file mode 100644
index 0000000..7b12424
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,hdmi.txt
@@ -0,0 +1,148 @@
+Mediatek HDMI Encoder
+=====================
+
+The Mediatek HDMI encoder can generate HDMI 1.4a or MHL 2.0 signals from
+its parallel input.
+
+Required properties:
+- compatible: Should be "mediatek,-hdmi".
+- reg: Physical base address and length of the controller's registers
+- interrupts: The interrupt signal from the function block.
+- clocks: device clocks
+  See Documentation/devicetree/bindings/clock/clock-bindings.txt for details.
+- clock-names: must contain "pixel", "pll", "bclk", and "spdif".
+- phys: phandle link to the HDMI PHY node.
+  See Documentation/devicetree/bindings/phy/phy-bindings.txt for details.
+- phy-names: must contain "hdmi"
+- mediatek,syscon-hdmi: phandle link and register offset to the system
+  configuration registers. For mt8173 this must be offset 0x900 into the
+  MMSYS_CONFIG region: <&mmsys 0x900>.
+- ports: A node containing input and output port nodes with endpoint
+  definitions as documented in Documentation/devicetree/bindings/graph.txt.
+- port@0: The input port in the ports node should be connected to a DPI output
+  port.
+- port@1: The output port in the ports node should be connected to the input
+  port of a connector node that contains a ddc-i2c-bus property, or to the
+  input port of an attached bridge chip, such as a SlimPort transmitter.
+
+HDMI CEC
+
+
+The HDMI CEC controller handles hotplug detection and CEC communication.
+
+Required properties:
+- compatible: Should be "mediatek,-cec"
+- reg: Physical base address and length of the controller's registers
+- interrupts: The interrupt signal from the function block.
+- clocks: device clock
+
+HDMI DDC
+
+
+The HDMI DDC i2c controller is used to interface with the HDMI DDC pins.
+Mediatek's I2C controller is used to interface with I2C devices.
+
+Required properties:
+- compatible: Should be "mediatek,-hdmi-ddc"
+- reg: Physical base address and length of the controller's registers
+- clocks: device clock
+- clock-names: Should be "ddc-i2c".
+
+HDMI PHY
+
+
+The HDMI PHY serializes the HDMI encoder's three channel 10-bit parallel
+output and drives the HDMI pads.
+
+Required properties:
+- compatible: "mediatek,-hdmi-phy"
+- reg: Physical base address and length of the module's registers
+- clocks: PLL reference clock
+- clock-names: must contain "pll_ref"
+- clock-output-names: must be "hdmitx_dig_cts" on mt8173
+- #phy-cells: must be <0>
+- #clock-cells: must be <0>
+
+Optional properties:
+- mediatek,ibias: TX DRV bias current for <1.65Gbps, defaults to 0xa
+- mediatek,ibias_up: TX DRV bias current for >1.65Gbps, defaults to 0x1c
+
+Example:
+
+cec: cec@10013000 {
+   compatible = "mediatek,mt8173-cec";
+   reg = <0 0x10013000 0 0xbc>;
+   interrupts = ;
+   clocks = <&infracfg CLK_INFRA_CEC>;
+};
+
+hdmi_phy: hdmi-phy@10209100 {
+   compatible = "mediatek,mt8173-hdmi-phy";
+   reg = <0 0x10209100 0 0x24>;
+   clocks = <&apmixedsys CLK_APMIXED_HDMI_REF>;
+   clock-names = "pll_ref";
+   clock-output-names = "hdmitx_dig_cts";
+   mediatek,ibias = <0xa>;
+   mediatek,ibias_up = <0x1c>;
+   #clock-cells = <0>;
+   #phy-cells = <0>;
+};
+
+hdmi_ddc0: i2c@11012000 {
+   compatible = "mediatek,mt8173-hdmi-ddc";
+   reg = <0 0x11012000 0 0x1c>;
+   interrupts = ;
+   clocks = <&pericfg CLK_PERI_I2C5>;
+   clock-names = "ddc-i2c";
+};
+
+hdmi0: hdmi@14025000 {
+   compatible = "mediatek,mt8173-hdmi";
+   reg = <0 0x14025000 0 0x400>;
+   interrupts = ;
+   clocks = <&mmsys CLK_MM_HDMI_PIXEL>,
+<&mmsys CLK_MM_HDMI_PLLCK>,
+<&mmsys CLK_MM_HDMI_AUDIO>,
+<&mmsys CLK_MM_HDMI_SPDIF>;
+   clock-names = "pixel", "pll", "bclk", "spdif";
+   pinctrl-names = "default";
+   pinctrl-0 = <&hdmi_pin>;
+   phys = <&hdmi_phy>;
+   phy-names = "hdmi";
+   mediatek,syscon-hdmi = <&mmsys 0x900>;
+   assigned-clocks = <&topckgen CLK_TOP_HDMI_SEL>;
+   assigned-clock-parents = <&hdmi_phy>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0>;
+
+   hdmi0_in: endpoint {
+   remote-endpoint = <&dpi0_out>;
+

[PATCH v9 06/14] drm/mediatek: Add HDMI support

2016-01-12 Thread Philipp Zabel

From: Jie Qiu 

This patch adds drivers for the HDMI bridge connected to the DPI0
display subsystem function block, for the HDMI DDC block, and for
the HDMI PHY to support HDMI output.

Signed-off-by: Jie Qiu 
Signed-off-by: Philipp Zabel 
---
Changes since v8:
 - Reworked N, CTS setup (use recommended N from spec, calculate CTS)
 - Dropped deep color support
 - Dropped unused divider tables
 - Changed some noisy dev_info to dev_dbg
 - Improved mode_valid return value
---
 drivers/gpu/drm/mediatek/Kconfig   |   7 +
 drivers/gpu/drm/mediatek/Makefile  |   9 +
 drivers/gpu/drm/mediatek/mtk_cec.c | 245 ++
 drivers/gpu/drm/mediatek/mtk_cec.h |  25 +
 drivers/gpu/drm/mediatek/mtk_drm_drv.c |   1 +
 drivers/gpu/drm/mediatek/mtk_drm_hdmi_drv.c| 609 +++
 drivers/gpu/drm/mediatek/mtk_hdmi.c| 479 ++
 drivers/gpu/drm/mediatek/mtk_hdmi.h| 223 +
 drivers/gpu/drm/mediatek/mtk_hdmi_ddc_drv.c| 362 ++
 drivers/gpu/drm/mediatek/mtk_hdmi_hw.c | 652 +
 drivers/gpu/drm/mediatek/mtk_hdmi_hw.h |  73 +++
 drivers/gpu/drm/mediatek/mtk_hdmi_regs.h   | 221 +
 drivers/gpu/drm/mediatek/mtk_mt8173_hdmi_phy.c | 506 +++
 13 files changed, 3412 insertions(+)
 create mode 100644 drivers/gpu/drm/mediatek/mtk_cec.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_cec.h
 create mode 100644 drivers/gpu/drm/mediatek/mtk_drm_hdmi_drv.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_hdmi.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_hdmi.h
 create mode 100644 drivers/gpu/drm/mediatek/mtk_hdmi_ddc_drv.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_hdmi_hw.c
 create mode 100644 drivers/gpu/drm/mediatek/mtk_hdmi_hw.h
 create mode 100644 drivers/gpu/drm/mediatek/mtk_hdmi_regs.h
 create mode 100644 drivers/gpu/drm/mediatek/mtk_mt8173_hdmi_phy.c

diff --git a/drivers/gpu/drm/mediatek/Kconfig b/drivers/gpu/drm/mediatek/Kconfig
index b7e0404..829ab66 100644
--- a/drivers/gpu/drm/mediatek/Kconfig
+++ b/drivers/gpu/drm/mediatek/Kconfig
@@ -13,3 +13,10 @@ config DRM_MEDIATEK
  The module will be called mediatek-drm
  This driver provides kernel mode setting and
  buffer management to userspace.
+
+config DRM_MEDIATEK_HDMI
+   tristate "DRM HDMI Support for Mediatek SoCs"
+   depends on DRM_MEDIATEK
+   select GENERIC_PHY
+   help
+ DRM/KMS HDMI driver for Mediatek SoCs
diff --git a/drivers/gpu/drm/mediatek/Makefile b/drivers/gpu/drm/mediatek/Makefile
index 218071c..2a81eeb 100644
--- a/drivers/gpu/drm/mediatek/Makefile
+++ b/drivers/gpu/drm/mediatek/Makefile
@@ -11,3 +11,12 @@ mediatek-drm-y := mtk_disp_ovl.o \
  mtk_dpi.o

 obj-$(CONFIG_DRM_MEDIATEK) += mediatek-drm.o
+
+mediatek-drm-hdmi-objs := mtk_cec.o \
+ mtk_drm_hdmi_drv.o \
+ mtk_hdmi.o \
+ mtk_hdmi_ddc_drv.o \
+ mtk_hdmi_hw.o \
+ mtk_mt8173_hdmi_phy.o
+
+obj-$(CONFIG_DRM_MEDIATEK_HDMI) += mediatek-drm-hdmi.o
diff --git a/drivers/gpu/drm/mediatek/mtk_cec.c b/drivers/gpu/drm/mediatek/mtk_cec.c
new file mode 100644
index 0000000..cba3647
--- /dev/null
+++ b/drivers/gpu/drm/mediatek/mtk_cec.c
@@ -0,0 +1,245 @@
+/*
+ * Copyright (c) 2014 MediaTek Inc.
+ * Author: Jie Qiu 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mtk_cec.h"
+
+#define TR_CONFIG  0x00
+#define CLEAR_CEC_IRQ  BIT(15)
+
+#define CEC_CKGEN  0x04
+#define CEC_32K_PDNBIT(19)
+#define PDNBIT(16)
+
+#define RX_EVENT   0x54
+#define HDMI_PORD  BIT(25)
+#define HDMI_HTPLG BIT(24)
+#define HDMI_PORD_INT_EN   BIT(9)
+#define HDMI_HTPLG_INT_EN  BIT(8)
+
+#define RX_GEN_WD  0x58
+#define HDMI_PORD_INT_32K_STATUS   BIT(26)
+#define RX_RISC_INT_32K_STATUS BIT(25)
+#define HDMI_HTPLG_INT_32K_STATUS  BIT(24)
+#define HDMI_PORD_INT_32K_CLR  BIT(18)
+#define RX_INT_32K_CLR BIT(17)
+#define HDMI_HTPLG_INT_32K_CLR BIT(16)
+#define HDMI_PORD_INT_32K_STA_MASK BIT(10)
+#define RX_RISC_INT_32K_STA_MASK   BIT(9)
+#define HDMI_HTPLG_INT_32K_STA_MASKBIT(8)
+#define HDMI_PORD_INT_32K_EN   BIT(2)
+#define RX_INT_32K_EN  BIT(1)
+#defi

[PATCH v9 07/14] drm/mediatek: enable hdmi output control bit

2016-01-12 Thread Philipp Zabel

From: Jie Qiu 

The MT8173 HDMI hardware has an output control bit to enable or disable
HDMI output. For security reasons, this bit can only be changed in ARM
supervisor mode, and currently the only way to enter that mode is through
the ARM Trusted Firmware (ATF). ATF therefore provides an API that the
HDMI driver calls to set this control bit and enable HDMI output.

Signed-off-by: Jie Qiu 
Signed-off-by: Philipp Zabel 
---
 drivers/gpu/drm/mediatek/mtk_hdmi_hw.c   | 11 +++
 drivers/gpu/drm/mediatek/mtk_hdmi_regs.h |  1 +
 2 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/mediatek/mtk_hdmi_hw.c b/drivers/gpu/drm/mediatek/mtk_hdmi_hw.c
index 99c7ffc..054afc6 100644
--- a/drivers/gpu/drm/mediatek/mtk_hdmi_hw.c
+++ b/drivers/gpu/drm/mediatek/mtk_hdmi_hw.c
@@ -19,8 +19,15 @@
 #include 
 #include 
 #include 
+#include 
 #include 

+static int (*invoke_psci_fn)(u64, u64, u64, u64);
+typedef int (*psci_initcall_t)(const struct device_node *);
+
+asmlinkage int __invoke_psci_fn_hvc(u64, u64, u64, u64);
+asmlinkage int __invoke_psci_fn_smc(u64, u64, u64, u64);
+
 static u32 mtk_hdmi_read(struct mtk_hdmi *hdmi, u32 offset)
 {
return readl(hdmi->regs + offset);
@@ -50,6 +57,10 @@ void mtk_hdmi_hw_vid_black(struct mtk_hdmi *hdmi,

 void mtk_hdmi_hw_make_reg_writable(struct mtk_hdmi *hdmi, bool enable)
 {
+   invoke_psci_fn = __invoke_psci_fn_smc;
+   invoke_psci_fn(MTK_SIP_SET_AUTHORIZED_SECURE_REG,
+  0x14000904, 0x8000, 0);
+
regmap_update_bits(hdmi->sys_regmap, hdmi->sys_offset + HDMI_SYS_CFG20,
   HDMI_PCLK_FREE_RUN, enable ? HDMI_PCLK_FREE_RUN : 0);
regmap_update_bits(hdmi->sys_regmap, hdmi->sys_offset + HDMI_SYS_CFG1C,
diff --git a/drivers/gpu/drm/mediatek/mtk_hdmi_regs.h 
b/drivers/gpu/drm/mediatek/mtk_hdmi_regs.h
index de7ee22..8d7d60a 100644
--- a/drivers/gpu/drm/mediatek/mtk_hdmi_regs.h
+++ b/drivers/gpu/drm/mediatek/mtk_hdmi_regs.h
@@ -218,4 +218,5 @@
 #define MHL_SYNC_AUTO_EN   BIT(30)
 #define HDMI_PCLK_FREE_RUN BIT(31)

+#define MTK_SIP_SET_AUTHORIZED_SECURE_REG 0x8201
 #endif
-- 
2.6.4

[PATCH v9 08/14] arm64: dts: mt8173: Add display subsystem related nodes

2016-01-12 Thread Philipp Zabel

From: CK Hu 

This patch adds the device nodes for the DISP function blocks
comprising the display subsystem.

Signed-off-by: CK Hu 
Signed-off-by: Cawa Cheng 
Signed-off-by: Jie Qiu 
Signed-off-by: Daniel Kurtz 
Signed-off-by: Philipp Zabel 
---
 arch/arm64/boot/dts/mediatek/mt8173.dtsi | 237 +++
 1 file changed, 237 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8173.dtsi 
b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
index 4901f13..68c1cb2 100644
--- a/arch/arm64/boot/dts/mediatek/mt8173.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
@@ -25,6 +25,23 @@
#address-cells = <2>;
#size-cells = <2>;

+   aliases {
+   ovl0 = &ovl0;
+   ovl1 = &ovl1;
+   rdma0 = &rdma0;
+   rdma1 = &rdma1;
+   rdma2 = &rdma2;
+   wdma0 = &wdma0;
+   wdma1 = &wdma1;
+   color0 = &color0;
+   color1 = &color1;
+   split0 = &split0;
+   split1 = &split1;
+   dpi0 = &dpi0;
+   dsi0 = &dsi0;
+   dsi1 = &dsi1;
+   };
+
cpus {
#address-cells = <1>;
#size-cells = <0>;
@@ -285,6 +302,24 @@
#clock-cells = <1>;
};

+   mipi_tx0: mipi-dphy@10215000 {
+   compatible = "mediatek,mt8173-mipi-tx";
+   reg = <0 0x10215000 0 0x1000>;
+   clocks = <&clk26m>;
+   clock-output-names = "mipi_tx0_pll";
+   #clock-cells = <0>;
+   #phy-cells = <0>;
+   };
+
+   mipi_tx1: mipi-dphy@10216000 {
+   compatible = "mediatek,mt8173-mipi-tx";
+   reg = <0 0x10216000 0 0x1000>;
+   clocks = <&clk26m>;
+   clock-output-names = "mipi_tx1_pll";
+   #clock-cells = <0>;
+   #phy-cells = <0>;
+   };
+
gic: interrupt-controller at 1022 {
compatible = "arm,gic-400";
#interrupt-cells = <3>;
@@ -431,6 +466,14 @@
status = "disabled";
};

+   hdmiddc0: i2c@11012000 {
+   compatible = "mediatek,mt8173-hdmi-ddc";
+   interrupts = ;
+   reg = <0 0x11012000 0 0x1C>;
+   clocks = <&pericfg CLK_PERI_I2C5>;
+   clock-names = "ddc-i2c";
+   };
+
i2c6: i2c@11013000 {
compatible = "mediatek,mt8173-i2c";
reg = <0 0x11013000 0 0x70>,
@@ -525,7 +568,187 @@
mmsys: clock-controller at 1400 {
compatible = "mediatek,mt8173-mmsys", "syscon";
reg = <0 0x1400 0 0x1000>;
+   power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
#clock-cells = <1>;
+
+   /* FIXME - remove iommus here */
+   iommus = <&iommu M4U_PORT_DISP_OVL0>,
+<&iommu M4U_PORT_DISP_OVL1>;
+   };
+
+   ovl0: ovl@1400c000 {
+   compatible = "mediatek,mt8173-disp-ovl";
+   reg = <0 0x1400c000 0 0x1000>;
+   interrupts = ;
+   power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
+   clocks = <&mmsys CLK_MM_DISP_OVL0>;
+   iommus = <&iommu M4U_PORT_DISP_OVL0>;
+   mediatek,larb = <&larb0>;
+   };
+
+   ovl1: ovl@1400d000 {
+   compatible = "mediatek,mt8173-disp-ovl";
+   reg = <0 0x1400d000 0 0x1000>;
+   interrupts = ;
+   power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
+   clocks = <&mmsys CLK_MM_DISP_OVL1>;
+   iommus = <&iommu M4U_PORT_DISP_OVL1>;
+   mediatek,larb = <&larb4>;
+   };
+
+   rdma0: rdma@1400e000 {
+   compatible = "mediatek,mt8173-disp-rdma";
+   reg = <0 0x1400e000 0 0x1000>;
+   interrupts = ;
+   power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>;
+   clocks = <&mmsys CLK_MM_DISP_RDMA0>;
+   iommus = <&iommu M4U_PORT_DISP_RDMA0>;
+   mediatek,larb = <&larb0>;
+   };
+
+   rdma1: rdma@1400f000 {
+   compatible = "mediatek,mt8173-disp-rdma";
+   reg = <0 0x1400f000 0 0x1000>;
+   interrupts = ;
+   power-domains = <&scpsys MT8173_POWER_DOMAIN_M

[PATCH v9 09/14] arm64: dts: mt8173: Add HDMI related nodes

2016-01-12 Thread Philipp Zabel

From: CK Hu 

This patch adds the device nodes for the HDMI encoder, HDMI PHY,
and HDMI CEC modules.

Signed-off-by: CK Hu 
Signed-off-by: Cawa Cheng 
Signed-off-by: Jie Qiu 
Signed-off-by: Daniel Kurtz 
Signed-off-by: Philipp Zabel 
---
 arch/arm64/boot/dts/mediatek/mt8173.dtsi | 75 
 1 file changed, 75 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8173.dtsi 
b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
index 68c1cb2..eb5210e 100644
--- a/arch/arm64/boot/dts/mediatek/mt8173.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8173.dtsi
@@ -198,6 +198,30 @@
 ,
 ;

+   hdmi_pin: xxx {
+
+   /*hdmi htplg pin*/
+   pins1 {
+   pinmux = 
;
+   input-enable;
+   bias-pull-down;
+   };
+
+   /*hdmi flt 5v pin*/
+   pins2 {
+   pinmux = 
;
+   input-enable;
+   bias-pull-up;
+   };
+
+   /*hdmi 5v pin*/
+   pins3 {
+   pinmux = 
;
+   output-enable;
+   bias-pull-up;
+   };
+   };
+
i2c0_pins_a: i2c0 {
pins1 {
pinmux = 
,
@@ -276,6 +300,13 @@
clock-names = "spi", "wrap";
};

+   cec: cec@10013000 {
+   compatible = "mediatek,mt8173-cec";
+   reg = <0 0x10013000 0 0xbc>;
+   interrupts = ;
+   clocks = <&infracfg CLK_INFRA_CEC>;
+   };
+
sysirq: intpol-controller@10200620 {
compatible = "mediatek,mt8173-sysirq",
 "mediatek,mt6577-sysirq";
@@ -302,6 +333,18 @@
#clock-cells = <1>;
};

+   hdmi_phy: hdmi-phy@10209100 {
+   compatible = "mediatek,mt8173-hdmi-phy";
+   reg = <0 0x10209100 0 0x24>;
+   clocks = <&apmixedsys CLK_APMIXED_HDMI_REF>;
+   clock-names = "pll_ref";
+   clock-output-names = "hdmitx_dig_cts";
+   mediatek,ibias = <0xa>;
+   mediatek,ibias_up = <0x1c>;
+   #clock-cells = <0>;
+   #phy-cells = <0>;
+   };
+
mipi_tx0: mipi-dphy@10215000 {
compatible = "mediatek,mt8173-mipi-tx";
reg = <0 0x10215000 0 0x1000>;
@@ -806,6 +849,38 @@
clock-names = "apb", "smi";
};

+   hdmi0: hdmi@14025000 {
+   compatible = "mediatek,mt8173-hdmi";
+   reg = <0 0x14025000 0 0x400>;
+   interrupts = ;
+   clocks = <&mmsys CLK_MM_HDMI_PIXEL>,
+<&mmsys CLK_MM_HDMI_PLLCK>,
+<&mmsys CLK_MM_HDMI_AUDIO>,
+<&mmsys CLK_MM_HDMI_SPDIF>;
+   clock-names = "pixel", "pll", "bclk", "spdif";
+   pinctrl-names = "default";
+   pinctrl-0 = <&hdmi_pin>;
+   phys = <&hdmi_phy>;
+   phy-names = "hdmi";
+   mediatek,syscon-hdmi = <&mmsys 0x900>;
+   assigned-clocks = <&topckgen CLK_TOP_HDMI_SEL>;
+   assigned-clock-parents = <&hdmi_phy>;
+   status = "disabled";
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0>;
+
+   hdmi0_in: endpoint {
+   remote-endpoint = <&dpi0_out>;
+   };
+   };
+   };
+   };
+
larb4: larb@14027000 {
compatible = "mediatek,mt8173-smi-larb";
reg = <0 0x14027000 0 0x1000>;
-- 
2.6.4

[PATCH v9 10/14] clk: mediatek: make dpi0_sel propagate rate changes

2016-01-12 Thread Philipp Zabel

This mux is supposed to select a fitting divider after the PLL
is already set to the correct rate.

Signed-off-by: Philipp Zabel 
Acked-by: James Liao 
---
 drivers/clk/mediatek/clk-mt8173.c | 2 +-
 drivers/clk/mediatek/clk-mtk.h| 7 +--
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/clk/mediatek/clk-mt8173.c 
b/drivers/clk/mediatek/clk-mt8173.c
index 227e356..682b275 100644
--- a/drivers/clk/mediatek/clk-mt8173.c
+++ b/drivers/clk/mediatek/clk-mt8173.c
@@ -558,7 +558,7 @@ static const struct mtk_composite top_muxes[] __initconst = 
{
MUX_GATE(CLK_TOP_ATB_SEL, "atb_sel", atb_parents, 0x0090, 16, 2, 23),
MUX_GATE(CLK_TOP_VENC_LT_SEL, "venclt_sel", venc_lt_parents, 0x0090, 
24, 4, 31),
/* CLK_CFG_6 */
-   MUX_GATE(CLK_TOP_DPI0_SEL, "dpi0_sel", dpi0_parents, 0x00a0, 0, 3, 7),
+   MUX_GATE_FLAGS(CLK_TOP_DPI0_SEL, "dpi0_sel", dpi0_parents, 0x00a0, 0, 
3, 7, 0),
MUX_GATE(CLK_TOP_IRDA_SEL, "irda_sel", irda_parents, 0x00a0, 8, 2, 15),
MUX_GATE(CLK_TOP_CCI400_SEL, "cci400_sel", cci400_parents, 0x00a0, 16, 
3, 23),
MUX_GATE(CLK_TOP_AUD_1_SEL, "aud_1_sel", aud_1_parents, 0x00a0, 24, 2, 
31),
diff --git a/drivers/clk/mediatek/clk-mtk.h b/drivers/clk/mediatek/clk-mtk.h
index 32d2e45..b607996 100644
--- a/drivers/clk/mediatek/clk-mtk.h
+++ b/drivers/clk/mediatek/clk-mtk.h
@@ -83,7 +83,7 @@ struct mtk_composite {
signed char num_parents;
 };

-#define MUX_GATE(_id, _name, _parents, _reg, _shift, _width, _gate) {  \
+#define MUX_GATE_FLAGS(_id, _name, _parents, _reg, _shift, _width, _gate, 
_flags) {\
.id = _id,  \
.name = _name,  \
.mux_reg = _reg,\
@@ -94,9 +94,12 @@ struct mtk_composite {
.divider_shift = -1,\
.parent_names = _parents,   \
.num_parents = ARRAY_SIZE(_parents),\
-   .flags = CLK_SET_RATE_PARENT,   \
+   .flags = _flags,\
}

+#define MUX_GATE(_id, _name, _parents, _reg, _shift, _width, _gate)\
+   MUX_GATE_FLAGS(_id, _name, _parents, _reg, _shift, _width, _gate, 
CLK_SET_RATE_PARENT)
+
 #define MUX(_id, _name, _parents, _reg, _shift, _width) {  \
.id = _id,  \
.name = _name,  \
-- 
2.6.4
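
For readers following the macro change in the diff above, here is a minimal
sketch of the pattern in plain C. The general macro takes the flags
explicitly and the convenience macro keeps the old default, so a single
entry such as dpi0_sel can opt out of forwarding rate requests to its
parent. All demo_/DEMO_ names and the reduced struct are illustrative
assumptions, not the driver's real definitions; the flag value mirrors the
kernel's CLK_SET_RATE_PARENT bit but is only a stand-in here.

```c
#include <assert.h>

/* Stand-in for the kernel's CLK_SET_RATE_PARENT flag (assumption). */
#define DEMO_CLK_SET_RATE_PARENT (1UL << 2)

/* Reduced, hypothetical version of struct mtk_composite. */
struct demo_composite {
	int id;
	const char *name;
	unsigned long flags;
};

/* General form: the caller chooses the flags explicitly. */
#define DEMO_MUX_GATE_FLAGS(_id, _name, _flags) {	\
		.id = _id,				\
		.name = _name,				\
		.flags = _flags,			\
	}

/* Convenience form: keeps the previous default behaviour. */
#define DEMO_MUX_GATE(_id, _name)				\
	DEMO_MUX_GATE_FLAGS(_id, _name, DEMO_CLK_SET_RATE_PARENT)

static const struct demo_composite demo_muxes[] = {
	DEMO_MUX_GATE(0, "irda_sel"),		/* default: forwards to parent */
	DEMO_MUX_GATE_FLAGS(1, "dpi0_sel", 0),	/* opts out of forwarding */
};

/* Returns 1 if a rate request on this mux is passed on to its parent. */
static int demo_propagates_to_parent(const struct demo_composite *c)
{
	assert(c);
	return (c->flags & DEMO_CLK_SET_RATE_PARENT) != 0;
}
```

With this layout, existing table entries stay untouched and only the entry
that needs different behaviour switches to the _FLAGS form.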

[PATCH v9 11/14] clk: mediatek: Add hdmi_ref HDMI PHY PLL reference clock output

2016-01-12 Thread Philipp Zabel

The configurable hdmi_ref output of the PLL block is derived from
the tvdpll_594m clock signal via a configurable PLL post-divider.
It is used as the PLL reference input to the HDMI PHY module.

Signed-off-by: Philipp Zabel 
Acked-by: James Liao 
---
 drivers/clk/mediatek/clk-mt8173.c  | 5 +
 include/dt-bindings/clock/mt8173-clk.h | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/clk/mediatek/clk-mt8173.c 
b/drivers/clk/mediatek/clk-mt8173.c
index 682b275..3ae0b88 100644
--- a/drivers/clk/mediatek/clk-mt8173.c
+++ b/drivers/clk/mediatek/clk-mt8173.c
@@ -1091,6 +1091,11 @@ static void __init mtk_apmixedsys_init(struct 
device_node *node)
clk_data->clks[cku->id] = clk;
}

+   clk = clk_register_divider(NULL, "hdmi_ref", "tvdpll_594m", 0,
+  base + 0x40, 16, 3, CLK_DIVIDER_POWER_OF_TWO,
+  NULL);
+   clk_data->clks[CLK_APMIXED_HDMI_REF] = clk;
+
r = of_clk_add_provider(node, of_clk_src_onecell_get, clk_data);
if (r)
pr_err("%s(): could not register clock provider: %d\n",
diff --git a/include/dt-bindings/clock/mt8173-clk.h 
b/include/dt-bindings/clock/mt8173-clk.h
index 7956ba1..6094bf7 100644
--- a/include/dt-bindings/clock/mt8173-clk.h
+++ b/include/dt-bindings/clock/mt8173-clk.h
@@ -176,7 +176,8 @@
 #define CLK_APMIXED_LVDSPLL13
 #define CLK_APMIXED_MSDCPLL2   14
 #define CLK_APMIXED_REF2USB_TX 15
-#define CLK_APMIXED_NR_CLK 16
+#define CLK_APMIXED_HDMI_REF   16
+#define CLK_APMIXED_NR_CLK 17

 /* INFRA_SYS */

-- 
2.6.4
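
As a rough illustration of the post-divider described in the commit message
above, the sketch below computes the hdmi_ref rate from a register value,
assuming the usual power-of-two rule (divisor = 1 << field). The shift of 16
and width of 3 are taken from the clk_register_divider() call in the diff;
the 594 MHz parent rate is an assumption inferred from the clock name
tvdpll_594m, and the demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_DIV_SHIFT 16	/* field position, from the diff */
#define DEMO_DIV_WIDTH 3	/* field width, from the diff */

/*
 * Extract the divider field from the register value and apply the
 * power-of-two rule: rate = parent_rate / (1 << field).
 */
static unsigned long demo_hdmi_ref_rate(unsigned long parent_rate, uint32_t reg)
{
	uint32_t field = (reg >> DEMO_DIV_SHIFT) & ((1u << DEMO_DIV_WIDTH) - 1);

	assert(field < 8);	/* a 3-bit field can only hold 0..7 */
	return parent_rate / (1ul << field);
}
```

For example, with a 594000000 Hz parent, a field value of 2 yields
148500000 Hz, and bits outside the 3-bit field do not affect the result.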

[PATCH v9 13/14] clk: mediatek: remove hdmitx_dig_cts from TOP clocks

2016-01-12 Thread Philipp Zabel

The hdmitx_dig_cts clock signal is not a child of tvdpll_445p5m,
but is routed out of the HDMI PHY module.

Signed-off-by: Philipp Zabel 
---
 drivers/clk/mediatek/clk-mt8173.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/clk/mediatek/clk-mt8173.c 
b/drivers/clk/mediatek/clk-mt8173.c
index 3ae0b88..e0d9994 100644
--- a/drivers/clk/mediatek/clk-mt8173.c
+++ b/drivers/clk/mediatek/clk-mt8173.c
@@ -61,7 +61,6 @@ static const struct mtk_fixed_factor top_divs[] __initconst = 
{
FACTOR(CLK_TOP_CLKRTC_INT, "clkrtc_int", "clk26m", 1, 793),
FACTOR(CLK_TOP_FPC, "fpc_ck", "clk26m", 1, 1),

-   FACTOR(CLK_TOP_HDMITX_DIG_CTS, "hdmitx_dig_cts", "tvdpll_445p5m", 1, 3),
FACTOR(CLK_TOP_HDMITXPLL_D2, "hdmitxpll_d2", "hdmitx_dig_cts", 1, 2),
FACTOR(CLK_TOP_HDMITXPLL_D3, "hdmitxpll_d3", "hdmitx_dig_cts", 1, 3),

-- 
2.6.4

[PATCH v9 12/14] dt-bindings: hdmi-connector: add DDC I2C bus phandle documentation

2016-01-12 Thread Philipp Zabel

Add an optional ddc-i2c-bus phandle property that points to
an I2C master controller that handles the connector DDC pins.

Signed-off-by: Philipp Zabel 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/display/connector/hdmi-connector.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git 
a/Documentation/devicetree/bindings/display/connector/hdmi-connector.txt 
b/Documentation/devicetree/bindings/display/connector/hdmi-connector.txt
index acd5668..508aee4 100644
--- a/Documentation/devicetree/bindings/display/connector/hdmi-connector.txt
+++ b/Documentation/devicetree/bindings/display/connector/hdmi-connector.txt
@@ -8,6 +8,7 @@ Required properties:
 Optional properties:
 - label: a symbolic name for the connector
 - hpd-gpios: HPD GPIO number
+- ddc-i2c-bus: phandle link to the I2C controller used for DDC EDID probing

 Required nodes:
 - Video port for HDMI input
-- 
2.6.4

[PATCH v9 14/14] drm/mediatek: Add interface to allocate Mediatek GEM buffer.

2016-01-12 Thread Philipp Zabel

From: CK Hu 

Add an interface to allocate MediaTek GEM buffers and allow the IOCTLs
to be used by render nodes.
This patch also sets the RENDER driver feature.

Signed-off-by: CK Hu 
Signed-off-by: Nicolas Boichat 
Signed-off-by: Philipp Zabel 
---
 drivers/gpu/drm/mediatek/mtk_drm_drv.c | 13 +++-
 drivers/gpu/drm/mediatek/mtk_drm_gem.c | 39 ++
 drivers/gpu/drm/mediatek/mtk_drm_gem.h | 12 +++
 include/uapi/drm/mediatek_drm.h| 59 ++
 4 files changed, 122 insertions(+), 1 deletion(-)
 create mode 100644 include/uapi/drm/mediatek_drm.h

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c 
b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index fdb27e9..1f776a9 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "mtk_cec.h"
 #include "mtk_drm_crtc.h"
@@ -222,6 +223,14 @@ static const struct vm_operations_struct 
mtk_drm_gem_vm_ops = {
.close = drm_gem_vm_close,
 };

+static const struct drm_ioctl_desc mtk_ioctls[] = {
+   DRM_IOCTL_DEF_DRV(MTK_GEM_CREATE, mtk_gem_create_ioctl,
+ DRM_UNLOCKED | DRM_AUTH | DRM_RENDER_ALLOW),
+   DRM_IOCTL_DEF_DRV(MTK_GEM_MAP_OFFSET,
+ mtk_gem_map_offset_ioctl,
+ DRM_UNLOCKED | DRM_AUTH | DRM_RENDER_ALLOW),
+};
+
 static const struct file_operations mtk_drm_fops = {
.owner = THIS_MODULE,
.open = drm_open,
@@ -237,7 +246,7 @@ static const struct file_operations mtk_drm_fops = {

 static struct drm_driver mtk_drm_driver = {
.driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_PRIME |
-  DRIVER_ATOMIC,
+  DRIVER_ATOMIC | DRIVER_RENDER,
.unload = mtk_drm_unload,
.set_busid = drm_platform_set_busid,

@@ -257,6 +266,8 @@ static struct drm_driver mtk_drm_driver = {
.gem_prime_import = drm_gem_prime_import,
.gem_prime_get_sg_table = mtk_gem_prime_get_sg_table,
.gem_prime_mmap = mtk_drm_gem_mmap_buf,
+   .ioctls = mtk_ioctls,
+   .num_ioctls = ARRAY_SIZE(mtk_ioctls),
.fops = &mtk_drm_fops,

.name = DRIVER_NAME,
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c 
b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
index 96cc980..f726d55 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
@@ -13,6 +13,7 @@

 #include 
 #include 
+#include 

 #include "mtk_drm_gem.h"

@@ -225,3 +226,41 @@ struct sg_table *mtk_gem_prime_get_sg_table(struct 
drm_gem_object *obj)

return sgt;
 }
+
+int mtk_gem_map_offset_ioctl(struct drm_device *drm, void *data,
+struct drm_file *file_priv)
+{
+   struct drm_mtk_gem_map_off *args = data;
+
+   return mtk_drm_gem_dumb_map_offset(file_priv, drm, args->handle,
+  &args->offset);
+}
+
+int mtk_gem_create_ioctl(struct drm_device *dev, void *data,
+struct drm_file *file_priv)
+{
+   struct mtk_drm_gem_obj *mtk_gem;
+   struct drm_mtk_gem_create *args = data;
+   int ret;
+
+   mtk_gem = mtk_drm_gem_create(dev, args->size, false);
+   if (IS_ERR(mtk_gem))
+   return PTR_ERR(mtk_gem);
+
+   /*
+* allocate a id of idr table where the obj is registered
+* and handle has the id what user can see.
+*/
+   ret = drm_gem_handle_create(file_priv, &mtk_gem->base, &args->handle);
+   if (ret)
+   goto err_handle_create;
+
+   /* drop reference from allocate - handle holds it now. */
+   drm_gem_object_unreference_unlocked(&mtk_gem->base);
+
+   return 0;
+
+err_handle_create:
+   mtk_drm_gem_free_object(&mtk_gem->base);
+   return ret;
+}
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.h 
b/drivers/gpu/drm/mediatek/mtk_drm_gem.h
index 9bdeeb3..28b8fa7 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.h
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.h
@@ -52,4 +52,16 @@ int mtk_drm_gem_mmap_buf(struct drm_gem_object *obj,
struct vm_area_struct *vma);
 struct sg_table *mtk_gem_prime_get_sg_table(struct drm_gem_object *obj);

+/*
+ * request gem object creation and buffer allocation as the size
+ * that it is calculated with framebuffer information such as width,
+ * height and bpp.
+ */
+int mtk_gem_create_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file_priv);
+
+/* get buffer offset to map to user space. */
+int mtk_gem_map_offset_ioctl(struct drm_device *dev, void *data,
+   struct drm_file *file_priv);
+
 #endif
diff --git a/include/uapi/drm/mediatek_drm.h b/include/uapi/drm/mediatek_drm.h
new file mode 100644
index 0000000..19ea357
--- /dev/null
+++ b/include/uapi/drm/mediatek_drm.h
@@ -0,0 +1,59 @@
+/*
+ * Copyright (c) 2015 MediaTek Inc.
+ *
+ * This program is free software; you can

[PATCH v3 0/3] drm/exynos: introduce generic zpos property

2016-01-12 Thread Daniel Vetter

On Tue, Jan 12, 2016 at 02:39:17PM +0100, Marek Szyprowski wrote:
> Hello all,
> 
> This patch series is a continuation of rework of blending support in
> Exynos DRM driver. Some background can be found here:
> http://www.spinics.net/lists/dri-devel/msg96969.html
> 
> Daniel Vetter suggested that zpos property should be made generic, with
> well-defined semantics. This patchset is my proposal for such generic
> zpos property:
> - added zpos properties to drm core and plane state structures,
> - added helpers for normalizing zpos properties of given set of planes,
> - well-defined semantics: planes are sorted by zpos value, and then by
>   plane id when zpos values are equal.
> 
> Patches are based on top of latest exynos-drm-next branch.
> 
> Best regards
> Marek Szyprowski
> Samsung R&D Institute Poland
> 
> Changelog:
> 
> v3:
> - on request of Daniel Vetter, moved all normalization process to DRM
>   core, drivers can simply use plane_state->normalized_zpos in their
>   atomic_check/update callbacks with no additional changes needed
> - updated documentation
> 
> v2: http://www.spinics.net/lists/dri-devel/msg98093.html
> - dropped 2 fixes for Exynos DRM, which got merged in meantime
> - added more comments and kernel docs for core functions as suggested
>   by Daniel Vetter
> - reworked initialization of zpos properties (moved assigning the property
>   class to common code), now the code in the driver is even simpler
> - while reworking the initialization of the zpos property code, did the
>   same change to the generic rotation property
> 
> v1: http://www.spinics.net/lists/dri-devel/msg97709.html
> - initial version

Yeah I think that looks overall rather neat now. Probably best if someone
from the exynos team reviews it all in detail, and then we could pull it
in through drm-misc.

Cheers, Daniel

> 
> Patch summary:
> 
> Marek Szyprowski (3):
>   drm: add generic zpos property
>   drm/exynos: use generic code for managing zpos plane property
>   drm: simplify initialization of rotation property
> 
>  Documentation/DocBook/gpu.tmpl  |  14 ++-
>  drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c |  10 +-
>  drivers/gpu/drm/drm_atomic.c|   4 +
>  drivers/gpu/drm/drm_atomic_helper.c | 116 
> 
>  drivers/gpu/drm/drm_crtc.c  |  82 +++--
>  drivers/gpu/drm/exynos/exynos_drm_drv.h |   2 -
>  drivers/gpu/drm/exynos/exynos_drm_plane.c   |  66 +++---
>  drivers/gpu/drm/exynos/exynos_mixer.c   |   6 +-
>  drivers/gpu/drm/i915/intel_display.c|   6 +-
>  drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c   |   3 +-
>  drivers/gpu/drm/omapdrm/omap_drv.c  |   3 +-
>  include/drm/drm_crtc.h  |  18 +++-
>  12 files changed, 250 insertions(+), 80 deletions(-)
> 
> -- 
> 1.9.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
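
The ordering rule from the cover letter quoted above (planes sorted by zpos,
then by plane id when zpos values are equal) can be sketched in plain C as
follows. This is only an illustration of the stated semantics, not the code
from the series: the struct, its fields other than normalized_zpos, and the
demo_ function names are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced plane state. */
struct demo_plane_state {
	unsigned int plane_id;
	unsigned int zpos;		/* value requested by userspace */
	unsigned int normalized_zpos;	/* dense 0..n-1 ordering */
};

/* Order by zpos first, then by plane id to break ties. */
static int demo_cmp_planes(const void *a, const void *b)
{
	const struct demo_plane_state *pa = *(const struct demo_plane_state *const *)a;
	const struct demo_plane_state *pb = *(const struct demo_plane_state *const *)b;

	if (pa->zpos != pb->zpos)
		return pa->zpos < pb->zpos ? -1 : 1;
	if (pa->plane_id != pb->plane_id)
		return pa->plane_id < pb->plane_id ? -1 : 1;
	return 0;
}

/* Assign normalized_zpos 0..n-1; returns 0 on success, -1 on failure. */
static int demo_normalize_zpos(struct demo_plane_state *planes, size_t n)
{
	struct demo_plane_state **order;
	size_t i;

	if (n == 0)
		return 0;
	assert(planes);

	order = malloc(n * sizeof(*order));
	if (!order)
		return -1;

	for (i = 0; i < n; i++)
		order[i] = &planes[i];
	qsort(order, n, sizeof(*order), demo_cmp_planes);
	for (i = 0; i < n; i++)
		order[i]->normalized_zpos = (unsigned int)i;

	free(order);
	return 0;
}
```

Sorting an array of pointers leaves the caller's array in place, so a driver
could keep indexing planes as before and read only the normalized value.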

[Intel-gfx] RPM wakelock ref not held during HW access

2016-01-12 Thread Daniel Vetter

On Wed, Jan 13, 2016 at 12:06:07AM +0900, Sergey Senozhatsky wrote:
> Hello,
> 
> -mmots 4.4.0-mm1-dbg-00602-g776bd09

A patch to shut this up (RPM is still disabled by default for a reason) is
on its way into 4.5/-next.

Thanks anyway for the report.
-Daniel
> 
> 
> [ 5331.509087] WARNING: CPU: 0 PID: 359 at 
> drivers/gpu/drm/i915/intel_drv.h:1446 gen6_read32+0x7b/0x253 [i915]()
> [ 5331.509091] RPM wakelock ref not held during HW access
> [ 5331.509093] Modules linked in:
> [ 5331.509182] CPU: 0 PID: 359 Comm: Xorg Not tainted 
> 4.4.0-mm1-dbg-00602-g776bd09-dirty #34
> [ 5331.509186]   88041bddfac0 811eb860 
> 88041bddfb08
> [ 5331.509194]  88041bddfaf8 81040fc2 a05442a9 
> 88041aab
> [ 5331.509200]  00064000 88041afcb001 0001 
> 88041bddfb60
> [ 5331.509207] Call Trace:
> [ 5331.509218]  [] dump_stack+0x4b/0x63
> [ 5331.509227]  [] warn_slowpath_common+0x99/0xb2
> [ 5331.509277]  [] ? gen6_read32+0x7b/0x253 [i915]
> [ 5331.509283]  [] warn_slowpath_fmt+0x48/0x50
> [ 5331.509331]  [] gen6_read32+0x7b/0x253 [i915]
> [ 5331.509338]  [] ? mutex_unlock+0xe/0x10
> [ 5331.509391]  [] intel_ddi_get_hw_state+0x5e/0x159 [i915]
> [ 5331.509443]  [] 
> intel_ddi_connector_get_hw_state+0x5c/0xdf [i915]
> [ 5331.509494]  [] intel_atomic_commit+0x8e0/0x1250 [i915]
> [ 5331.509528]  [] ? drm_atomic_check_only+0x293/0x564 [drm]
> [ 5331.509558]  [] drm_atomic_commit+0x4d/0x52 [drm]
> [ 5331.509572]  [] 
> drm_atomic_helper_connector_dpms+0x116/0x17e [drm_kms_helper]
> [ 5331.509599]  [] 
> drm_mode_obj_set_property_ioctl+0xef/0x17a [drm]
> [ 5331.509626]  [] 
> drm_mode_connector_property_set_ioctl+0x30/0x32 [drm]
> [ 5331.509642]  [] drm_ioctl+0x26d/0x3a8 [drm]
> [ 5331.509668]  [] ? 
> drm_mode_obj_set_property_ioctl+0x17a/0x17a [drm]
> [ 5331.509675]  [] ? lock_acquire+0x101/0x188
> [ 5331.509681]  [] ? __fget+0x5/0x19d
> [ 5331.509685]  [] ? __lock_is_held+0x3c/0x57
> [ 5331.509691]  [] vfs_ioctl+0x18/0x34
> [ 5331.509695]  [] do_vfs_ioctl+0x572/0x5f1
> [ 5331.509699]  [] ? __fget_light+0x62/0x71
> [ 5331.509704]  [] SyS_ioctl+0x43/0x61
> [ 5331.509711]  [] entry_SYSCALL_64_fastpath+0x12/0x6f
> [ 5331.509715] ---[ end trace 70d4fd86a0395d92 ]---
> 
>   -ss
> ___
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

[Bug 91880] Radeonsi on Grenada cards (r9 390) exceptionally unstable and poorly performing

2016-01-12 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=91880

--- Comment #48 from Julian  ---
I have the same issue: on an R9 390, running with DPM=1 eventually ends in a
freeze. Using DPM=0 works fine but is obviously suboptimal.

I could provide another Xorg log/vbios/dmesg, but they seem to be quite similar
to the thread starter's.

Two things to add:

* I've tried with Ubuntu's default Mesa version (which is 11.0.8 I think) and
the newest (11.1) and the newer release seems to cause the freeze much faster.
On 11.0.8 I'd be able to use the system for hours until it froze. 11.1 never
lasted for more than 20 minutes.

* A quick way to force the freeze to happen is to use google maps. Half a
minute of scrolling around and zooming in/out is enough to make it happen.


[PATCH v9 14/14] drm/mediatek: Add interface to allocate Mediatek GEM buffer.

2016-01-12 Thread Frank Binns

Hi Philipp,

Comments below.

On 12/01/16 15:15, Philipp Zabel wrote:
> From: CK Hu 
>
> Add an interface to allocate Mediatek GEM buffers, allow the IOCTLs
> to be used by render nodes.
> This patch also sets the RENDER driver feature.
>
> Signed-off-by: CK Hu 
> Signed-off-by: Nicolas Boichat 
> Signed-off-by: Philipp Zabel 
> ---
>  drivers/gpu/drm/mediatek/mtk_drm_drv.c | 13 +++-
>  drivers/gpu/drm/mediatek/mtk_drm_gem.c | 39 ++
>  drivers/gpu/drm/mediatek/mtk_drm_gem.h | 12 +++
>  include/uapi/drm/mediatek_drm.h| 59 
> ++
>  4 files changed, 122 insertions(+), 1 deletion(-)
>  create mode 100644 include/uapi/drm/mediatek_drm.h
>
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c 
> b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> index fdb27e9..1f776a9 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "mtk_cec.h"
>  #include "mtk_drm_crtc.h"
> @@ -222,6 +223,14 @@ static const struct vm_operations_struct 
> mtk_drm_gem_vm_ops = {
>   .close = drm_gem_vm_close,
>  };
>  
> +static const struct drm_ioctl_desc mtk_ioctls[] = {
> + DRM_IOCTL_DEF_DRV(MTK_GEM_CREATE, mtk_gem_create_ioctl,
> +   DRM_UNLOCKED | DRM_AUTH | DRM_RENDER_ALLOW),
> + DRM_IOCTL_DEF_DRV(MTK_GEM_MAP_OFFSET,
> +   mtk_gem_map_offset_ioctl,
> +   DRM_UNLOCKED | DRM_AUTH | DRM_RENDER_ALLOW),
> +};
> +
>  static const struct file_operations mtk_drm_fops = {
>   .owner = THIS_MODULE,
>   .open = drm_open,
> @@ -237,7 +246,7 @@ static const struct file_operations mtk_drm_fops = {
>  
>  static struct drm_driver mtk_drm_driver = {
>   .driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_PRIME |
> -DRIVER_ATOMIC,
> +DRIVER_ATOMIC | DRIVER_RENDER,
>   .unload = mtk_drm_unload,
>   .set_busid = drm_platform_set_busid,
>  
> @@ -257,6 +266,8 @@ static struct drm_driver mtk_drm_driver = {
>   .gem_prime_import = drm_gem_prime_import,
>   .gem_prime_get_sg_table = mtk_gem_prime_get_sg_table,
>   .gem_prime_mmap = mtk_drm_gem_mmap_buf,
> + .ioctls = mtk_ioctls,
> + .num_ioctls = ARRAY_SIZE(mtk_ioctls),
>   .fops = &mtk_drm_fops,
>  
>   .name = DRIVER_NAME,
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c 
> b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
> index 96cc980..f726d55 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
> @@ -13,6 +13,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  #include "mtk_drm_gem.h"
>  
> @@ -225,3 +226,41 @@ struct sg_table *mtk_gem_prime_get_sg_table(struct 
> drm_gem_object *obj)
>  
>   return sgt;
>  }
> +
> +int mtk_gem_map_offset_ioctl(struct drm_device *drm, void *data,
> +  struct drm_file *file_priv)
> +{
> + struct drm_mtk_gem_map_off *args = data;
> +
You should validate args->pad here.

> + return mtk_drm_gem_dumb_map_offset(file_priv, drm, args->handle,
> +&args->offset);
> +}
> +
> +int mtk_gem_create_ioctl(struct drm_device *dev, void *data,
> +  struct drm_file *file_priv)
> +{
> + struct mtk_drm_gem_obj *mtk_gem;
> + struct drm_mtk_gem_create *args = data;
> + int ret;
> +
You should validate args->flags here.
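
A minimal sketch of the kind of check meant by the two validation comments,
assuming no flags are defined yet and that pad is a reserved field. The
struct layouts and demo_ names below are hypothetical; only the field names
(size, flags, handle, pad, offset) are taken from this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical argument layouts for the two ioctls. */
struct demo_gem_create {
	uint64_t size;
	uint32_t flags;
	uint32_t handle;
};

struct demo_gem_map_off {
	uint32_t handle;
	uint32_t pad;
	uint64_t offset;
};

/* Reject any flag bits, since none are defined in this sketch. */
static int demo_check_create_args(const struct demo_gem_create *args)
{
	assert(args);
	if (args->flags != 0)
		return -EINVAL;
	return 0;
}

/* Reject a non-zero reserved field. */
static int demo_check_map_off_args(const struct demo_gem_map_off *args)
{
	assert(args);
	if (args->pad != 0)
		return -EINVAL;
	return 0;
}
```

Rejecting non-zero values now means the fields can be given a meaning later
without breaking userspace that happened to pass garbage in them.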

> + mtk_gem = mtk_drm_gem_create(dev, args->size, false);
> + if (IS_ERR(mtk_gem))
> + return PTR_ERR(mtk_gem);
> +
> + /*
> +  * allocate a id of idr table where the obj is registered
> +  * and handle has the id what user can see.
> +  */
This comment doesn't seem that useful.

> + ret = drm_gem_handle_create(file_priv, &mtk_gem->base, &args->handle);
> + if (ret)
> + goto err_handle_create;
> +
> + /* drop reference from allocate - handle holds it now. */
> + drm_gem_object_unreference_unlocked(&mtk_gem->base);
> +
> + return 0;
> +
> +err_handle_create:
> + mtk_drm_gem_free_object(&mtk_gem->base);
> + return ret;
> +}
> diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.h 
> b/drivers/gpu/drm/mediatek/mtk_drm_gem.h
> index 9bdeeb3..28b8fa7 100644
> --- a/drivers/gpu/drm/mediatek/mtk_drm_gem.h
> +++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.h
> @@ -52,4 +52,16 @@ int mtk_drm_gem_mmap_buf(struct drm_gem_object *obj,
>   struct vm_area_struct *vma);
>  struct sg_table *mtk_gem_prime_get_sg_table(struct drm_gem_object *obj);
>  
> +/*
> + * request gem object creation and buffer allocation as the size
> + * that it is calculated with framebuffer information such as width,
> + * height and bpp.
> + */
> +int mtk_gem_create_ioctl(struct drm_device *dev, void *data,
> + struct drm_file *file_priv);
> +
> +/* get buffer offset to map to user space. */
> +int mtk_gem_map_of

[PATCH 1/2] drm/dp: Add definition for Display Control DPCD Registers capability size

2016-01-12 Thread Yetunde Adebisi

This size is used when reading the Display Control capability registers
on the sink device.

cc: Jani Nikula 
cc: dri-devel at lists.freedesktop.org
Signed-off-by: Yetunde Adebisi 
---
 include/drm/drm_dp_helper.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
index 1252108..92d9a52 100644
--- a/include/drm/drm_dp_helper.h
+++ b/include/drm/drm_dp_helper.h
@@ -621,6 +621,7 @@ u8 drm_dp_get_adjust_request_pre_emphasis(const u8 
link_status[DP_LINK_STATUS_SI
 #define DP_BRANCH_OUI_HEADER_SIZE  0xc
 #define DP_RECEIVER_CAP_SIZE   0xf
 #define EDP_PSR_RECEIVER_CAP_SIZE  2
+#define EDP_DISPLAY_CTL_CAP_SIZE   3

 void drm_dp_link_train_clock_recovery_delay(const u8 
dpcd[DP_RECEIVER_CAP_SIZE]);
 void drm_dp_link_train_channel_eq_delay(const u8 dpcd[DP_RECEIVER_CAP_SIZE]);
-- 
1.9.3

[PATCH] x86: Add an explicit barrier() to clflushopt()

2016-01-12 Thread Chris Wilson

On Mon, Jan 11, 2016 at 09:05:06PM +0000, Chris Wilson wrote:
> I can narrow down the principal buggy path by doing the clflush(vend-1)
> in the callers at least.

That leads to the suspect path being a read back of a cache line from
main memory that was just written to by the GPU. Writes to memory before
using them on the GPU do not seem to be affected (or at least we have
sufficient flushing in sending the commands to the GPU that we don't
notice anything wrong).

And back to the oddity.

Instead of doing:

	clflush_cache_range(vaddr + offset, size);
	clflush(vaddr+offset+size-1);
	mb();
	memcpy(user, vaddr+offset, size);

what also worked was:

	clflush_cache_range(vaddr + offset, size);
	clflush(vaddr);
	mb();
	memcpy(user, vaddr+offset, size);

(size is definitely non-zero, offset is offset_in_page(), vaddr is from
kmap_atomic()).

i.e.

void clflush_cache_range(void *vaddr, unsigned int size)
{
	const unsigned long clflush_size = boot_cpu_data.x86_clflush_size;
	void *p = (void *)((unsigned long)vaddr & ~(clflush_size - 1));
	void *vend = vaddr + size;

	if (p >= vend)
		return;

	mb();

	for (; p < vend; p += clflush_size)
		clflushopt(p);

	clflushopt(vaddr);

	mb();
}

I have also confirmed that this doesn't just happen for single
cachelines (i.e. where the earlier clflush(vend-1) and this clflush(vaddr)
would be equivalent).

At the moment I am more inclined to think this is serialising the clflush()
(since a clflush to the same cacheline is regarded as ordered with respect
to the earlier clflush, iirc) as opposed to the writes from the GPU not
landing in time.

Am I completely going mad?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
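
To make the address arithmetic in the message above concrete, here is a small
userspace sketch of the rounding and loop bounds used by the quoted
clflush_cache_range(). It only computes which cache lines the loop would
visit and issues no flush instructions. The demo_ names are hypothetical, and
the 64-byte line size in the usage note is an assumption; the kernel reads
the real value from boot_cpu_data.x86_clflush_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round an address down to the start of its cache line. */
static uintptr_t demo_line_start(uintptr_t addr, uintptr_t line_size)
{
	/* The mask trick only works for power-of-two line sizes. */
	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
	return addr & ~(line_size - 1);
}

/* Count the cache lines the flush loop visits for [addr, addr + size). */
static size_t demo_lines_flushed(uintptr_t addr, size_t size,
				 uintptr_t line_size)
{
	uintptr_t p = demo_line_start(addr, line_size);
	uintptr_t end = addr + size;
	size_t n = 0;

	for (; p < end; p += line_size)
		n++;
	return n;
}

/* Returns 1 if the first and last byte share a cache line (size > 0). */
static int demo_single_line(uintptr_t addr, size_t size, uintptr_t line_size)
{
	assert(size > 0);
	return demo_line_start(addr, line_size) ==
	       demo_line_start(addr + size - 1, line_size);
}
```

With 64-byte lines, a 0x20-byte range starting at 0x1030 spans two lines, so
flushing the first byte and flushing the last byte touch different lines.
That is the case in which the two extra-flush variants are not equivalent.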

[PATCH] x86: Add an explicit barrier() to clflushopt()

2016-01-12 Thread Linus Torvalds

On Tue, Jan 12, 2016 at 8:37 AM, Chris Wilson wrote:
> On Mon, Jan 11, 2016 at 09:05:06PM +0000, Chris Wilson wrote:
>> I can narrow down the principal buggy path by doing the clflush(vend-1)
>> in the callers at least.
>
> That leads to the suspect path being a read back of a cache line from
> main memory that was just written to by the GPU.

How do you know it was written by the GPU?

Maybe it's a memory ordering issue on the GPU. Say it writes something
to memory, then sets the "I'm done" flag (or whatever you check), but
because of ordering on the GPU the "I'm done" flag becomes visible before
the data does.

So the reason you see the old content may just be that the GPU writes
are still buffered on the GPU. And you adding a clflushopt on the same
address just changes the timing enough that you don't see the memory
ordering any more (or it's just much harder to see, it might still be
there).

Maybe the reason you only see the problem with the last cacheline is
simply that the "last" cacheline is also the last that was written by
the GPU, and it's still in the GPU write buffers.

Also, did you ever print out the value of clflush_size? Maybe we just
got it wrong and it's bogus data.

Linus

[PATCH] x86: Add an explicit barrier() to clflushopt()

2016-01-12 Thread H. Peter Anvin

On January 11, 2016 3:28:01 AM PST, Chris Wilson wrote:
>On Sat, Jan 09, 2016 at 02:36:03PM -0800, Andy Lutomirski wrote:
>> On Sat, Jan 9, 2016 at 12:01 AM, Chris Wilson wrote:
>> > On Thu, Jan 07, 2016 at 02:32:23PM -0800, H. Peter Anvin wrote:
>> >> On 01/07/16 14:29, H. Peter Anvin wrote:
>> >> >
>> >> > I would be very interested in knowing if replacing the final
>> >> > clflushopt with a clflush would resolve your problems (in which
>> >> > case the last mb() shouldn't be necessary either.)
>> >> >
>> >>
>> >> Nevermind.  CLFLUSH is not ordered with regards to CLFLUSHOPT to
>> >> the same cache line.
>> >>
>> >> Could you add a sync_cpu(); call to the end (can replace the final
>> >> mb()) and see if that helps your case?
>> >
>> > s/sync_cpu()/sync_core()/
>> >
>> > No. I still see failures on Baytrail and Braswell (Pineview is not
>> > affected) with the final mb() replaced with sync_core(). I can
>> > reproduce failures on Pineview by tweaking the clflush_cache_range()
>> > parameters, so I am fairly confident that it is validating the
>> > current code.
>> >
>> > iirc sync_core() is cpuid, a heavy serialising instruction, an
>> > alternative to mfence.  Is there anything else that I can infer
>> > about the nature of my bug from this result?
>> 
>> No clue, but I don't know much about the underlying architecture.
>> 
>> Can you try clflush_cache_ranging one cacheline less and then
>> manually doing clflushopt; mb on the last cache line, just to make
>> sure that the helper is really doing the right thing?  You could also
>> try clflush instead of clflushopt to see if that makes a difference.
>
>I had looked at increasing the range over which clflush_cache_range()
>runs (using roundup/rounddown by cache lines), but it took something
>like +/- 256 bytes to pass all the tests. And also did
>s/clflushopt/clflush/ to confirm that made no difference.
>
>Bizarrely,
>
>diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
>index 6000ad7..cf074400 100644
>--- a/arch/x86/mm/pageattr.c
>+++ b/arch/x86/mm/pageattr.c
>@@ -141,6 +141,7 @@ void clflush_cache_range(void *vaddr, unsigned int size)
>for (; p < vend; p += clflush_size)
>clflushopt(p);
> 
>+   clflushopt(vend-1);
>mb();
> }
> EXPORT_SYMBOL_GPL(clflush_cache_range);
>
>works like a charm.
>-Chris

That clflushopt touches a cache line already touched and therefore serializes 
with it.

[PATCH 5/5] drm: Enable markdown^Wasciidoc for gpu.tmpl

2016-01-12 Thread Daniel Vetter

On Tue, Jan 12, 2016 at 11:06:17AM +, Graham Whaley wrote:
> On Tue, 2016-01-12 at 09:34 +0100, Daniel Vetter wrote:
> > On Mon, Jan 11, 2016 at 06:12:12PM -0700, Jonathan Corbet wrote:
> > > On Sat, 12 Dec 2015 12:13:45 +0100
> > > Daniel Vetter  wrote:
> > > 
> > > > I just figured there's no way this could get in, and I'd
> > > > much rather improve the docs themselves than trying to convince
> > > > core
> > > > kernel folks that this might be useful.
> > > 
> > > So I'm not quite sure why you figured that; I never said it,
> > > certainly.
> > 
> > To clarify this wasn't really my impression of your stance, but of
> > the
> > overall room opinion when we had the discussion at KS. And then my
> > main
> > goal here is to write great docs for drm (we have about 3k lines more
> > docs
> > in 4.5 already), so that's why I dropped the ball on upstreaming. It
> > seemed unlikely to succeed, at least without some really serious
> > effort at
> > convincing everyone, all while the drm docs for atomic haven't been
> > in
> > good shape yet. Since then we had a few contributors of new atomic
> > drivers
> > note on irc already that "oh cool, this is documented now". Overall
> > really
> > just boils down to what I see as the most important things for drm ;-)
> > 
> > > I've been messing with it a bit, seems to work.  I do still wish we
> > > could
> > > consider alternatives, especially those that might simplify the
> > > toolchain
> > > rather than complicating it.  But it's clear that I'm not
> > > succeeding in
> > > finding time to actually explore that idea; the contents of
> > > $EXCUSES are
> > > good, but the end result is the same.  And the patch fairy just
> > > isn't
> > > coming through for me on this one.
> > > 
> > > In my mind, there's clearly no good that can come from (further)
> > > delaying
> > > something that works in favor of an "it would be nice" that may
> > > never
> > > even exist.  So I'm currently thinking that I'll pull this into the
> > > docs
> > > tree once the merge window is done, with the plan to push it for
> > > 4.6.
> > > Then we can see if anybody screams.
> > > 
> > > That gives a couple of weeks for an updated patch set, should you
> > > have
> > > one.
> > > 
> > > The build-time increase is painful in the extreme - about a factor
> > > of
> > > three for a -j1 build, and that's with only one file using the
> > > feature.
> > > It feels wrong, somehow, for the docs build to take longer than
> > > building
> > > the kernel itself.  Can we do something about that?
> > > 
> > >  - How many of the comments actually use asciidoc features?  Might
> > > there
> > >be some possibility of detecting those in kernel-doc and
> > > skipping the
> > >callout to asciidoc when it's not needed?
> > 
> > I think that amounts to writing a partial parser (we use asciidoc for
> > tables, lists, links, formatting, code snippets by now already,
> > someone
> > even thought of using the asciiart->png feature it has but it's not
> > yet
> > wired up). I don't think it's feasible.
> > 
> > >  - Pandoc seems to do asciidoc.  I still don't like the idea of
> > > depending
> > >on it for this to work, but having the *option* to use it is
> > > fine.  If
> > >it's really that much faster (yes, Python startup is painful)
> > > then
> > >maybe providing the option is worth it.
> > 
> > Hm, Dave asked me to convert to use python-based asciidoc instead of
> > haskell-based pandoc.
> > 
> > >  - All over the kernel we've seen that batching improves
> > > performance.  It
> > >would take a bit of work, but I bet kernel-doc could put
> > > together all
> > >the snippets from one file, pass them through a single asciidoc
> > >invocation, then split the results back apart.  That would
> > > probably
> > >eliminate the performance hit entirely.
> > > 
> > > None of that is a condition for pulling this stuff in, but can it
> > > be
> > > looked into?
> > 
> > Besides what Jani mentioned, that asciidoctor should be a drop-in
> > replacement
> > if installed it also seems possible to parallelize the call-out to
> > kernel-doc from docproc.c without too much effort. I hoped Jani would
> > get
> > around to implement the asciidoctor support, and I'm hoping I can
> > snipe
> > away some free time sometime in the next few months to look at docproc.c
> > more
> > seriously. This would kinda be a cool intern project, but atm we
> > throw
> > them all at improving testing infrastructure ...
> > 
> > Anyway I'm of course still open to get this upstream, and I think a
> > few
> > things should be polished (like the speed-up). But right now
> > bandwidth on
> > my side isn't too plentiful. Maybe we should aim to have a few better
> > ideas (perhaps even for all of the docs stuff) for next KS and respin
> > that
> > discussion?
> 
> I was just about to reply to the thread looking at the
> linux.conf.au schedule it would seem that you are both attending and
> presenting, and there appea

[PATCH 5/5] drm: Enable markdown^Wasciidoc for gpu.tmpl

2016-01-12 Thread Graham Whaley

On Tue, 2016-01-12 at 09:34 +0100, Daniel Vetter wrote:
> On Mon, Jan 11, 2016 at 06:12:12PM -0700, Jonathan Corbet wrote:
> > On Sat, 12 Dec 2015 12:13:45 +0100
> > Daniel Vetter  wrote:
> > 
> > > I just figured there's no way this could get in, and I'd
> > > much rather improve the docs themselves than trying to convince
> > > core
> > > kernel folks that this might be useful.
> > 
> > So I'm not quite sure why you figured that; I never said it,
> > certainly.
> 
> To clarify this wasn't really my impression of your stance, but of
> the
> overall room opinion when we had the discussion at KS. And then my
> main
> goal here is to write great docs for drm (we have about 3k lines more
> docs
> in 4.5 already), so that's why I dropped the ball on upstreaming. It
> seemed unlikely to succeed, at least without some really serious
> effort at
> convincing everyone, all while the drm docs for atomic haven't been
> in
> good shape yet. Since then we had a few contributors of new atomic
> drivers
> note on irc already that "oh cool, this is documented now". Overall
> really
> just boils down to what I see as the most important things for drm ;-)
> 
> > I've been messing with it a bit, seems to work.  I do still wish we
> > could
> > consider alternatives, especially those that might simplify the
> > toolchain
> > rather than complicating it.  But it's clear that I'm not
> > succeeding in
> > finding time to actually explore that idea; the contents of
> > $EXCUSES are
> > good, but the end result is the same.  And the patch fairy just
> > isn't
> > coming through for me on this one.
> > 
> > In my mind, there's clearly no good that can come from (further)
> > delaying
> > something that works in favor of an "it would be nice" that may
> > never
> > even exist.  So I'm currently thinking that I'll pull this into the
> > docs
> > tree once the merge window is done, with the plan to push it for
> > 4.6.
> > Then we can see if anybody screams.
> > 
> > That gives a couple of weeks for an updated patch set, should you
> > have
> > one.
> > 
> > The build-time increase is painful in the extreme - about a factor
> > of
> > three for a -j1 build, and that's with only one file using the
> > feature.
> > It feels wrong, somehow, for the docs build to take longer than
> > building
> > the kernel itself.  Can we do something about that?
> > 
> >  - How many of the comments actually use asciidoc features?  Might
> > there
> >be some possibility of detecting those in kernel-doc and
> > skipping the
> >callout to asciidoc when it's not needed?
> 
> I think that amounts to writing a partial parser (we use asciidoc for
> tables, lists, links, formatting, code snippets by now already,
> someone
> even thought of using the asciiart->png feature it has but it's not
> yet
> wired up). I don't think it's feasible.
> 
> >  - Pandoc seems to do asciidoc.  I still don't like the idea of
> > depending
> >on it for this to work, but having the *option* to use it is
> > fine.  If
> >it's really that much faster (yes, Python startup is painful)
> > then
> >maybe providing the option is worth it.
> 
> Hm, Dave asked me to convert to use python-based asciidoc instead of
> haskell-based pandoc.
> 
> >  - All over the kernel we've seen that batching improves
> > performance.  It
> >would take a bit of work, but I bet kernel-doc could put
> > together all
> >the snippets from one file, pass them through a single asciidoc
> >invocation, then split the results back apart.  That would
> > probably
> >eliminate the performance hit entirely.
> > 
> > None of that is a condition for pulling this stuff in, but can it
> > be
> > looked into?
> 
> Besides what Jani mentioned, that asciidoctor should be a drop-in
> replacement
> if installed it also seems possible to parallelize the call-out to
> kernel-doc from docproc.c without too much effort. I hoped Jani would
> get
> around to implement the asciidoctor support, and I'm hoping I can
> snipe
> away some free time sometime in the next few months to look at docproc.c
> more
> seriously. This would kinda be a cool intern project, but atm we
> throw
> them all at improving testing infrastructure ...
> 
> Anyway I'm of course still open to get this upstream, and I think a
> few
> things should be polished (like the speed-up). But right now
> bandwidth on
> my side isn't too plentiful. Maybe we should aim to have a few better
> ideas (perhaps even for all of the docs stuff) for next KS and respin
> that
> discussion?

I was just about to reply to the thread. Looking at the linux.conf.au
schedule, it would seem that you are both attending and presenting, and
there appears to be some sort of Documentation mini-summit on the Monday
as well (not sure if that is the place for a discussion though). I will
be at LCA for the Wed-Fri. You may not have to wait until the next KS?

 Graham
> 
> Thanks, Daniel

[PATCH] drm/panel: simple: Add support for Sharp LQ101K1LY04

2016-01-12 Thread Joshua Clayton

Sharp LQ101K1LY04 is a 10 inch WXGA (1280x800) lvds panel

---
 drivers/gpu/drm/panel/panel-simple.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
index f97b73e..9207b5d 100644
--- a/drivers/gpu/drm/panel/panel-simple.c
+++ b/drivers/gpu/drm/panel/panel-simple.c
@@ -708,6 +708,29 @@ static const struct panel_desc giantplus_gpg482739qs5 = {
.bus_format = MEDIA_BUS_FMT_RGB888_1X24,
 };

+static const struct display_timing sharp_lq101k1ly04_timing = {
+   .pixelclock = { 6000, 6500, 8000 },
+   .hactive = { 1280, 1280, 1280 },
+   .hfront_porch = { 20, 20, 20 },
+   .hback_porch = { 20, 20, 20 },
+   .hsync_len = { 10, 10, 10 },
+   .vactive = { 800, 800, 800 },
+   .vfront_porch = { 4, 4, 4 },
+   .vback_porch = { 4, 4, 4 },
+   .vsync_len = { 4, 4, 4 },
+   .flags = DISPLAY_FLAGS_PIXDATA_POSEDGE,
+};
+static const struct panel_desc sharp_lq101k1ly04 = {
+   .timings = &sharp_lq101k1ly04_timing,
+   .num_timings = 1,
+   .bpc = 8,
+   .size = {
+   .width = 217,
+   .height = 136,
+   },
+   .bus_format = MEDIA_BUS_FMT_RGB888_1X7X4_JEIDA,
+};
+
 static const struct display_timing hannstar_hsd070pww1_timing = {
.pixelclock = { 6430, 7110, 8200 },
.hactive = { 1280, 1280, 1280 },
@@ -1146,6 +1169,9 @@ static const struct of_device_id platform_of_match[] = {
.compatible = "hannstar,hsd070pww1",
.data = &hannstar_hsd070pww1,
}, {
+   .compatible = "sharp,lq101k1ly04",
+   .data = &sharp_lq101k1ly04,
+   }, {
.compatible = "hannstar,hsd100pxn1",
.data = &hannstar_hsd100pxn1,
}, {
-- 
2.5.0
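
As a quick sanity check on display timings of this shape, the refresh rate follows from the pixel clock divided by the total pixels per frame (active area plus porches and sync, in both directions). The sketch below is not part of the patch. struct timing_sketch and refresh_millihz() are hypothetical names, and the 65,000,000 Hz pixel clock is an assumption about the typical value, since the archived copy above shows the clock entries in shortened form.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one column of a display_timing. */
struct timing_sketch {
	uint32_t pixelclock_hz;
	uint32_t hactive, hfront_porch, hback_porch, hsync_len;
	uint32_t vactive, vfront_porch, vback_porch, vsync_len;
};

/* Refresh rate in millihertz: pixel clock / (htotal * vtotal). */
static uint32_t refresh_millihz(const struct timing_sketch *t)
{
	uint64_t htotal = (uint64_t)t->hactive + t->hfront_porch +
			  t->hback_porch + t->hsync_len;
	uint64_t vtotal = (uint64_t)t->vactive + t->vfront_porch +
			  t->vback_porch + t->vsync_len;

	assert(htotal != 0 && vtotal != 0);
	return (uint32_t)((uint64_t)t->pixelclock_hz * 1000 / (htotal * vtotal));
}

/* Porch and sync values copied from the patch; the clock is ASSUMED. */
static const struct timing_sketch lq101k1ly04_typ = {
	.pixelclock_hz = 65000000,	/* assumption, see lead-in */
	.hactive = 1280, .hfront_porch = 20, .hback_porch = 20, .hsync_len = 10,
	.vactive = 800, .vfront_porch = 4, .vback_porch = 4, .vsync_len = 4,
};
```

Under that assumption htotal is 1330 and vtotal is 812, so the typical timing works out to roughly 60 Hz, which is what one would expect for a panel of this class.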

[PATCH] drm/panel: simple: Add support for Sharp LQ101K1LY04

2016-01-12 Thread Joshua Clayton

On Tue, 12 Jan 2016 08:05:46 -0800
Joshua Clayton  wrote:

Bah. I mistakenly thought that send-email would add my SoB
if I passed "-s" to git-send-email.

> Sharp LQ101K1LY04 is a 10 inch WXGA (1280x800) lvds panel
> 

Please add:
Signed-off-by: Joshua Clayton 
> ---
>  drivers/gpu/drm/panel/panel-simple.c | 26 ++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panel/panel-simple.c
> b/drivers/gpu/drm/panel/panel-simple.c index f97b73e..9207b5d 100644
> --- a/drivers/gpu/drm/panel/panel-simple.c
> +++ b/drivers/gpu/drm/panel/panel-simple.c
> @@ -708,6 +708,29 @@ static const struct panel_desc
> giantplus_gpg482739qs5 = { .bus_format = MEDIA_BUS_FMT_RGB888_1X24,
>  };
>  
> +static const struct display_timing sharp_lq101k1ly04_timing = {
> + .pixelclock = { 6000, 6500, 8000 },
> + .hactive = { 1280, 1280, 1280 },
> + .hfront_porch = { 20, 20, 20 },
> + .hback_porch = { 20, 20, 20 },
> + .hsync_len = { 10, 10, 10 },
> + .vactive = { 800, 800, 800 },
> + .vfront_porch = { 4, 4, 4 },
> + .vback_porch = { 4, 4, 4 },
> + .vsync_len = { 4, 4, 4 },
> + .flags = DISPLAY_FLAGS_PIXDATA_POSEDGE,
> +};
> +static const struct panel_desc sharp_lq101k1ly04 = {
> + .timings = &sharp_lq101k1ly04_timing,
> + .num_timings = 1,
> + .bpc = 8,
> + .size = {
> + .width = 217,
> + .height = 136,
> + },
> + .bus_format = MEDIA_BUS_FMT_RGB888_1X7X4_JEIDA,
> +};
> +
>  static const struct display_timing hannstar_hsd070pww1_timing = {
>   .pixelclock = { 6430, 7110, 8200 },
>   .hactive = { 1280, 1280, 1280 },
> @@ -1146,6 +1169,9 @@ static const struct of_device_id
> platform_of_match[] = { .compatible = "hannstar,hsd070pww1",
>   .data = &hannstar_hsd070pww1,
>   }, {
> + .compatible = "sharp,lq101k1ly04",
> + .data = &sharp_lq101k1ly04,
> + }, {
>   .compatible = "hannstar,hsd100pxn1",
>   .data = &hannstar_hsd100pxn1,
>   }, {

[PATCH] drm/rockchip: vop: fix mask when updating interrupts

2016-01-12 Thread John Keeping

Commit dbb3d94 (drm/rockchip: vop: move interrupt registers into
vop_data) introduced new macros for updating the interrupt control
registers but these always use the mask from the register definition
without refining it for the particular bits that are being changed.

This means that whenever we enable/disable a particular interrupt we end
up disabling all of the others as a side effect.

Signed-off-by: John Keeping 
---
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index 46c2a8d..fd37054 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -43,8 +43,8 @@

 #define REG_SET(x, base, reg, v, mode) \
__REG_SET_##mode(x, base + reg.offset, reg.mask, reg.shift, v)
-#define REG_SET_MASK(x, base, reg, v, mode) \
-   __REG_SET_##mode(x, base + reg.offset, reg.mask, reg.shift, v)
+#define REG_SET_MASK(x, base, reg, mask, v, mode) \
+   __REG_SET_##mode(x, base + reg.offset, mask, reg.shift, v)

 #define VOP_WIN_SET(x, win, name, v) \
REG_SET(x, win->base, win->phy->name, v, RELAXED)
@@ -58,16 +58,18 @@
 #define VOP_INTR_GET(vop, name) \
vop_read_reg(vop, 0, &vop->data->ctrl->name)

-#define VOP_INTR_SET(vop, name, v) \
-   REG_SET(vop, 0, vop->data->intr->name, v, NORMAL)
+#define VOP_INTR_SET(vop, name, mask, v) \
+   REG_SET_MASK(vop, 0, vop->data->intr->name, mask, v, NORMAL)
 #define VOP_INTR_SET_TYPE(vop, name, type, v) \
do { \
-   int i, reg = 0; \
+   int i, reg = 0, mask = 0; \
for (i = 0; i < vop->data->intr->nintrs; i++) { \
-   if (vop->data->intr->intrs[i] & type) \
+   if (vop->data->intr->intrs[i] & type) { \
reg |= (v) << i; \
+   mask |= 1 << i; \
+   } \
} \
-   VOP_INTR_SET(vop, name, reg); \
+   VOP_INTR_SET(vop, name, mask, reg); \
} while (0)
 #define VOP_INTR_GET_TYPE(vop, name, type) \
vop_get_intr_type(vop, &vop->data->intr->name, type)
-- 
2.7.0.rc3.140.g520a093
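
The effect described in the commit message can be reproduced with a plain read-modify-write helper. The sketch below is a simplified user-space model and not the driver's code. reg_update() is a hypothetical name, and it assumes the usual "clear the masked bits, then OR in the new value" semantics for a masked register write.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a masked register update: bits covered by
 * (mask << shift) are replaced by v, all other bits of old are kept. */
static uint32_t reg_update(uint32_t old, uint32_t mask, unsigned int shift,
			   uint32_t v)
{
	assert(shift < 32);
	return (old & ~(mask << shift)) | ((v & mask) << shift);
}

/* Interrupts 0 and 2 are already enabled; we want to enable interrupt 1. */
static uint32_t enable_with_full_mask(void)
{
	/* Full register mask: every other interrupt bit gets cleared. */
	return reg_update(0x5, 0xf, 0, 0x2);
}

static uint32_t enable_with_refined_mask(void)
{
	/* Mask limited to the bit being changed: other bits survive. */
	return reg_update(0x5, 0x2, 0, 0x2);
}
```

With the full mask the result is 0x2, so interrupts 0 and 2 are lost. With the refined mask it is 0x7. Building up mask alongside reg in VOP_INTR_SET_TYPE corresponds to the second case.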

[PATCH] drm/panel: simple: Add support for Sharp LQ101K1LY04

2016-01-12 Thread Lucas Stach

Hi Joshua,

Am Dienstag, den 12.01.2016, 08:05 -0800 schrieb Joshua Clayton:
> Sharp LQ101K1LY04 is a 10 inch WXGA (1280x800) lvds panel
> 

> ---
>  drivers/gpu/drm/panel/panel-simple.c | 26 ++

Missing documentation for the DT binding.

>  1 file changed, 26 insertions(+)
> 
> diff --git a/drivers/gpu/drm/panel/panel-simple.c
> b/drivers/gpu/drm/panel/panel-simple.c
> index f97b73e..9207b5d 100644
> --- a/drivers/gpu/drm/panel/panel-simple.c
> +++ b/drivers/gpu/drm/panel/panel-simple.c
> @@ -708,6 +708,29 @@ static const struct panel_desc
> giantplus_gpg482739qs5 = {
>  .bus_format = MEDIA_BUS_FMT_RGB888_1X24,
>  };
>  
> +static const struct display_timing sharp_lq101k1ly04_timing = {
> + .pixelclock = { 6000, 6500, 8000 },
> + .hactive = { 1280, 1280, 1280 },
> + .hfront_porch = { 20, 20, 20 },
> + .hback_porch = { 20, 20, 20 },
> + .hsync_len = { 10, 10, 10 },
> + .vactive = { 800, 800, 800 },
> + .vfront_porch = { 4, 4, 4 },
> + .vback_porch = { 4, 4, 4 },
> + .vsync_len = { 4, 4, 4 },
> + .flags = DISPLAY_FLAGS_PIXDATA_POSEDGE,
> +};
> +static const struct panel_desc sharp_lq101k1ly04 = {
> + .timings = &sharp_lq101k1ly04_timing,
> + .num_timings = 1,
> + .bpc = 8,
> + .size = {
> + .width = 217,
> + .height = 136,
> + },
> + .bus_format = MEDIA_BUS_FMT_RGB888_1X7X4_JEIDA,
> +};
> +
This hunk isn't added at the correct place. Please keep the
alphabetical sorting.

>  static const struct display_timing hannstar_hsd070pww1_timing = {
>  .pixelclock = { 6430, 7110, 8200 },
>  .hactive = { 1280, 1280, 1280 },
> @@ -1146,6 +1169,9 @@ static const struct of_device_id
> platform_of_match[] = {
>  .compatible = "hannstar,hsd070pww1",
>  .data = &hannstar_hsd070pww1,
>  }, {
> + .compatible = "sharp,lq101k1ly04",
> + .data = &sharp_lq101k1ly04,
> + }, {

Wrong insertion place again.

>  .compatible = "hannstar,hsd100pxn1",
>  .data = &hannstar_hsd100pxn1,
>  }, {

[Bug 92923] SGPR spilling

2016-01-12 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=92923

--- Comment #13 from pat  ---
2048M VRAM

Test 1:
Version:  OpenGL 3.0 [3.0 Mesa 11.2.0-devel (git-6f898f7)]
Patched with both patches you provided.

With patch: the graph spikes up to 45.5 MB during the loading screen, later it
stays at 0 bytes. When I turn around it goes up into the KB range but not MB.
Still around 3 FPS, up to 11 when looking at the sky.


[Bug 92923] SGPR spilling

2016-01-12 Thread bugzilla-dae...@freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=92923

--- Comment #14 from pat  ---
Test 1: no noticeable difference between patched and not.

Test 2: your repo/branch (git-016eba7), no noticeable improvement (including
the GTT patch). Graph also shows minimal bytes moved.


[PATCH] x86: Add an explicit barrier() to clflushopt()

2016-01-12 Thread Chris Wilson

On Tue, Jan 12, 2016 at 09:05:19AM -0800, Linus Torvalds wrote:
> On Tue, Jan 12, 2016 at 8:37 AM, Chris Wilson wrote:
> > On Mon, Jan 11, 2016 at 09:05:06PM +, Chris Wilson wrote:
> >> I can narrow down the principal buggy path by doing the clflush(vend-1)
> >> in the callers at least.
> >
> > That leads to the suspect path being a read back of a cache line from
> > main memory that was just written to by the GPU.
> 
> How do you know it was written by the GPU?

Test construction: write some data, copy it on the GPU, read it back.
Repeat for various data, sequences of copy (differing GPU instructions,
intermediates etc), and accessors.

> Maybe it's a memory ordering issue on the GPU. Say it writes something
> to memory, then sets the "I'm done" flag (or whatever you check), but
> because of ordering on the GPU the "I'm done" flag is visible before.

That is a continual worry. To try and assuage that fear, I sent 8x
flush gpu writes between the end of the copy and setting the "I'm done"
flag. The definition of the GPU flush is that it both flushes all
previous writes before it completes and only after it completes does it
do the post-sync write (before moving onto the next command). The spec
is always a bit hazy on what order the memory writes will be visible on
the CPU though.

Sending the 8x GPU flushes before marking "I'm done" did not fix the
corruption.

> So the reason you see the old content may just be that the GPU writes
> are still buffered on the GPU. And you adding a clflushopt on the same
> address just changes the timing enough that you don't see the memory
> ordering any more (or it's just much harder to see, it might still be
> there).

Indeed. So I replaced the post-clflush_cache_range() clflush() with a
udelay(10) instead, and the corruption vanished. Putting the udelay(10)
before the clflush_cache_range() does not fix the corruption.

> Maybe the reason you only see the problem with the last cacheline is
> simply that the "last" cacheline is also the last that was written by
> the GPU, and it's still in the GPU write buffers.

Exactly the fear.

> Also, did you ever print out the value of clflush_size? Maybe we just
> got it wrong and it's bogus data.

It's 64 bytes as expected. And fudging it to any other value quickly
explodes :)

Since:

/* lots of GPU flushes + GPU/CPU sync point */
udelay(10);
clflush_cache_range(vaddr, size);
memcpy(user, vaddr, size);

fails, but

/* lots of GPU flushes + GPU/CPU sync point */
clflush_cache_range(vaddr, size);
udelay(10);
memcpy(user, vaddr, size);

passes, I'm inclined to point the finger at the mb() following the
clflush_cache_range().
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

[PATCH] configure.ac: disable annoying warning -Wmissing-field-initializers

2016-01-12 Thread Marek Olšák

From: Marek Olšák

It warns for all "{}" initializers. Well, I want us to use {}.
---
 configure.ac | 3 ++-
 intel/intel_decode.c | 2 --
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/configure.ac b/configure.ac
index c8c4ace..057a846 100644
--- a/configure.ac
+++ b/configure.ac
@@ -174,7 +174,8 @@ MAYBE_WARN="-Wall -Wextra \
 -Wstrict-aliasing=2 -Winit-self \
 -Wdeclaration-after-statement -Wold-style-definition \
 -Wno-unused-parameter \
--Wno-attributes -Wno-long-long -Winline -Wshadow"
+-Wno-attributes -Wno-long-long -Winline -Wshadow \
+-Wno-missing-field-initializers"

 # invalidate cached value if MAYBE_WARN has changed
 if test "x$libdrm_cv_warn_maybe" != "x$MAYBE_WARN"; then
diff --git a/intel/intel_decode.c b/intel/intel_decode.c
index e7aef74..287c342 100644
--- a/intel/intel_decode.c
+++ b/intel/intel_decode.c
@@ -38,8 +38,6 @@
 #include "intel_chipset.h"
 #include "intel_bufmgr.h"

-/* The compiler throws ~90 warnings. Do not spam the build, until we fix them. */
-#pragma GCC diagnostic ignored "-Wmissing-field-initializers"

 /* Struct for tracking drm_intel_decode state. */
 struct drm_intel_decode {
-- 
2.1.4

[PATCH 00/10] libdrm amdgpu patches

2016-01-12 Thread Marek Olšák

Hi,

These are libdrm_amdgpu patches harvested from an internal branch.

The first patch is a revert I had to make to fix the build. Yeah, 
sequence_mutex should be renamed to a more appropriate name. That can be done 
as a follow-up.

One notable change is the addition of DRM_IOCTL_AMDGPU_WAIT_FENCES. I hope the 
kernel contains (or will contain) the changes too, so that I don't push 
something that doesn't exist in the kernel.

Please let me know if these are okay to push.

Thanks,

Chunming Zhou (3):
  amdgpu: add semaphore support
  tests/amdgpu: add semaphore test
  amdgpu: validate user memory for userptr

Junwei Zhang (3):
  amdgpu: add the interface of waiting multiple fences
  amdgpu/tests: add multi-fence test in base test
  amdgpu: list each entry safely for sw semaphore when submit ib

Marek Olšák (1):
  Revert "amdgpu: remove sequence mutex"

Michel Dänzer (1):
  amdgpu: Cast pointer to uintptr_t for assignment to unsigned integer

monk.liu (2):
  amdgpu: drop address patching logics
  amdgpu: cs_wait_fences now can return the first signaled fence index

 amdgpu/amdgpu.h|  88 +
 amdgpu/amdgpu_bo.c |  14 ++-
 amdgpu/amdgpu_cs.c | 253 
--
 amdgpu/amdgpu_internal.h   |  15 +++
 include/drm/amdgpu_drm.h   |  28 +
 tests/amdgpu/basic_tests.c | 233 

 6 files changed, 616 insertions(+), 15 deletions(-)

Marek

[PATCH 01/10] Revert "amdgpu: remove sequence mutex"

2016-01-12 Thread Marek Olšák

From: Marek Olšák

This reverts commit f6f25d67a9c0d26be9b8021a45f2acf3a4042ade.

Required by the new semaphore patches.
---
 amdgpu/amdgpu_cs.c   | 10 ++
 amdgpu/amdgpu_internal.h |  3 +++
 2 files changed, 13 insertions(+)

diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index 6747158..511d53f 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -66,6 +66,10 @@ int amdgpu_cs_ctx_create(amdgpu_device_handle dev,

gpu_context->dev = dev;

+   r = pthread_mutex_init(&gpu_context->sequence_mutex, NULL);
+   if (r)
+   goto error;
+
/* Create the context */
memset(&args, 0, sizeof(args));
args.in.op = AMDGPU_CTX_OP_ALLOC_CTX;
@@ -79,6 +83,7 @@ int amdgpu_cs_ctx_create(amdgpu_device_handle dev,
return 0;

 error:
+   pthread_mutex_destroy(&gpu_context->sequence_mutex);
free(gpu_context);
return r;
 }
@@ -99,6 +104,8 @@ int amdgpu_cs_ctx_free(amdgpu_context_handle context)
if (NULL == context)
return -EINVAL;

+   pthread_mutex_destroy(&context->sequence_mutex);
+
/* now deal with kernel side */
memset(&args, 0, sizeof(args));
args.in.op = AMDGPU_CTX_OP_FREE_CTX;
@@ -196,6 +203,8 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle context,
chunk_data[i].ib_data.flags = ib->flags;
}

+   pthread_mutex_lock(&context->sequence_mutex);
+
if (user_fence) {
i = cs.in.num_chunks++;

@@ -248,6 +257,7 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle context,
ibs_request->seq_no = cs.out.handle;

 error_unlock:
+   pthread_mutex_unlock(&context->sequence_mutex);
free(dependencies);
return r;
 }
diff --git a/amdgpu/amdgpu_internal.h b/amdgpu/amdgpu_internal.h
index 7dd5c1c..5d86603 100644
--- a/amdgpu/amdgpu_internal.h
+++ b/amdgpu/amdgpu_internal.h
@@ -111,6 +111,9 @@ struct amdgpu_bo_list {

 struct amdgpu_context {
struct amdgpu_device *dev;
+   /** Mutex for accessing fences and to maintain command submissions
+   in good sequence. */
+   pthread_mutex_t sequence_mutex;
/* context id*/
uint32_t id;
 };
-- 
2.1.4
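
The lifecycle the revert restores (init in ctx_create, lock around submission, destroy in ctx_free) follows the usual pthread mutex pattern. The sketch below is a stand-alone illustration and not libdrm code. struct ctx_sketch and its functions are hypothetical names, and the sequence counter merely stands in for whatever state the real submission path protects.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a context that serialises submissions. */
struct ctx_sketch {
	pthread_mutex_t sequence_mutex;
	uint64_t next_seq;
};

/* Returns 0 on success, or a positive errno value from pthread_mutex_init(). */
static int ctx_sketch_init(struct ctx_sketch *ctx)
{
	assert(ctx != NULL);
	ctx->next_seq = 1;
	return pthread_mutex_init(&ctx->sequence_mutex, NULL);
}

/* Hands out sequence numbers in order, one submitter at a time. */
static uint64_t ctx_sketch_submit(struct ctx_sketch *ctx)
{
	uint64_t seq;

	pthread_mutex_lock(&ctx->sequence_mutex);
	seq = ctx->next_seq++;
	pthread_mutex_unlock(&ctx->sequence_mutex);
	return seq;
}

/* Only call this on a context whose init succeeded. */
static void ctx_sketch_fini(struct ctx_sketch *ctx)
{
	pthread_mutex_destroy(&ctx->sequence_mutex);
}
```

A caller would pair every successful ctx_sketch_init() with exactly one ctx_sketch_fini(), and call ctx_sketch_submit() only in between.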

[PATCH 02/10] amdgpu: add the interface of waiting multiple fences

2016-01-12 Thread Marek Olšák

From: Junwei Zhang 

Signed-off-by: Junwei Zhang 
Reviewed-by: Christian König
Reviewed-by: Jammy Zhou 
---
 amdgpu/amdgpu.h  | 22 +++
 amdgpu/amdgpu_cs.c   | 71 
 include/drm/amdgpu_drm.h | 27 ++
 3 files changed, 120 insertions(+)

diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index e44d802..9ae6ca3 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -902,6 +902,28 @@ int amdgpu_cs_query_fence_status(struct amdgpu_cs_fence *fence,
 uint64_t flags,
 uint32_t *expired);

+/**
+ *  Wait for multiple fences
+ *
+ * \param   fences  - \c [in] The fence array to wait
+ * \param   fence_count - \c [in] The fence count
+ * \param   wait_all- \c [in] If true, wait all fences to be signaled,
+ *otherwise, wait at least one fence
+ * \param   timeout_ns  - \c [in] The timeout to wait, in nanoseconds
+ * \param   status  - \c [out] '1' for signaled, '0' for timeout
+ *
+ * \return  0 on success
+ *  <0 - Negative POSIX Error code
+ *
+ * \noteCurrently it supports only one amdgpu_device. All fences come from
+ *  the same amdgpu_device with the same fd.
+*/
+int amdgpu_cs_wait_fences(struct amdgpu_cs_fence *fences,
+ uint32_t fence_count,
+ bool wait_all,
+ uint64_t timeout_ns,
+ uint32_t *status);
+
 /*
  * Query / Info API
  *
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index 511d53f..d5e4ea0 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -379,3 +379,74 @@ int amdgpu_cs_query_fence_status(struct amdgpu_cs_fence *fence,
return r;
 }

+static int amdgpu_ioctl_wait_fences(struct amdgpu_cs_fence *fences,
+   uint32_t fence_count,
+   bool wait_all,
+   uint64_t timeout_ns,
+   uint32_t *status)
+{
+   struct drm_amdgpu_fence *drm_fences;
+   amdgpu_device_handle dev = fences[0].context->dev;
+   union drm_amdgpu_wait_fences args;
+   int r;
+   uint32_t i;
+
+   drm_fences = alloca(sizeof(struct drm_amdgpu_fence) * fence_count);
+   for (i = 0; i < fence_count; i++) {
+   drm_fences[i].ctx_id = fences[i].context->id;
+   drm_fences[i].ip_type = fences[i].ip_type;
+   drm_fences[i].ip_instance = fences[i].ip_instance;
+   drm_fences[i].ring = fences[i].ring;
+   drm_fences[i].seq_no = fences[i].fence;
+   }
+
+   memset(&args, 0, sizeof(args));
+   args.in.fences = (uint64_t)(uintptr_t)drm_fences;
+   args.in.fence_count = fence_count;
+   args.in.wait_all = wait_all;
+   args.in.timeout_ns = amdgpu_cs_calculate_timeout(timeout_ns);
+
+   r = drmIoctl(dev->fd, DRM_IOCTL_AMDGPU_WAIT_FENCES, &args);
+   if (r)
+   return -errno;
+
+   *status = args.out.status;
+   return 0;
+}
+
+int amdgpu_cs_wait_fences(struct amdgpu_cs_fence *fences,
+ uint32_t fence_count,
+ bool wait_all,
+ uint64_t timeout_ns,
+ uint32_t *status)
+{
+   uint32_t ioctl_status = 0;
+   uint32_t i;
+   int r;
+
+   /* Sanity check */
+   if (NULL == fences)
+   return -EINVAL;
+   if (NULL == status)
+   return -EINVAL;
+   if (fence_count <= 0)
+   return -EINVAL;
+   for (i = 0; i < fence_count; i++) {
+   if (NULL == fences[i].context)
+   return -EINVAL;
+   if (fences[i].ip_type >= AMDGPU_HW_IP_NUM)
+   return -EINVAL;
+   if (fences[i].ring >= AMDGPU_CS_MAX_RINGS)
+   return -EINVAL;
+   }
+
+   *status = 0;
+
+   r = amdgpu_ioctl_wait_fences(fences, fence_count, wait_all, timeout_ns,
+   &ioctl_status);
+
+   if (!r)
+   *status = ioctl_status;
+
+   return r;
+}
diff --git a/include/drm/amdgpu_drm.h b/include/drm/amdgpu_drm.h
index fbdd118..2cbea72 100644
--- a/include/drm/amdgpu_drm.h
+++ b/include/drm/amdgpu_drm.h
@@ -46,6 +46,7 @@
 #define DRM_AMDGPU_WAIT_CS 0x09
 #define DRM_AMDGPU_GEM_OP  0x10
 #define DRM_AMDGPU_GEM_USERPTR 0x11
+#define DRM_AMDGPU_WAIT_FENCES 0x12

 #define DRM_IOCTL_AMDGPU_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
 #define DRM_IOCTL_AMDGPU_GEM_MMAP  DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
@@ -59,6 +60,7 @@
 #define DRM_IOCTL_AMDGPU_WAIT_CS   DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_WAIT_CS, union drm_amdgpu_wait_cs)
 #define DRM_IOCTL_AMDGPU_GEM_OP

[PATCH 05/10] amdgpu: Cast pointer to uintptr_t for assignment to unsigned integer

2016-01-12 Thread Marek Olšák

From: Michel Dänzer

  CC   amdgpu_bo.lo
../../amdgpu/amdgpu_bo.c: In function 'amdgpu_create_bo_from_user_mem':
../../amdgpu/amdgpu_bo.c:539:12: warning: assignment makes integer from pointer without a cast [-Wint-conversion]
  args.addr = cpu;
^
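
The warning above appears because `args.addr` is an unsigned integer field while `cpu` is a pointer. A minimal sketch of the conversion through `uintptr_t` follows. `struct fake_userptr_args`, `ptr_to_u64()` and `u64_to_ptr()` are hypothetical names used only for illustration; they are not part of libdrm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an ioctl argument with a 64-bit address field. */
struct fake_userptr_args {
	uint64_t addr;
};

/* Convert a pointer to a 64-bit integer without triggering
 * -Wint-conversion. Going through uintptr_t keeps the cast explicit
 * and avoids a size-mismatch warning on 32-bit targets. */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Inverse conversion, meaningful only for values produced by ptr_to_u64(). */
static void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

Assigning `args.addr = ptr_to_u64(cpu);` in this sketch has the same effect as the `(uintptr_t)cpu` cast in the patch.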

Reviewed-by: Jammy Zhou 
---
 amdgpu/amdgpu_bo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
index 61db58c..2ae1c18 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -538,7 +538,7 @@ int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,
struct amdgpu_bo *bo;
struct drm_amdgpu_gem_userptr args;

-   args.addr = cpu;
+   args.addr = (uintptr_t)cpu;
args.flags = AMDGPU_GEM_USERPTR_ANONONLY | AMDGPU_GEM_USERPTR_REGISTER;
args.size = size;
r = drmCommandWriteRead(dev->fd, DRM_AMDGPU_GEM_USERPTR,
-- 
2.1.4

[PATCH 03/10] amdgpu: drop address patching logics

2016-01-12 Thread Marek Olšák

From: "monk.liu" 

we don't support non-page-aligned cpu pointer anymore
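
With the rounding removed from the library, a caller presumably has to pass a page-aligned pointer itself. The following is a rough sketch of how a caller might obtain and check such a buffer, assuming a POSIX system. The helper names (`is_page_aligned`, `round_up_to_page`, `alloc_page_aligned`) are made up for this example and do not exist in libdrm.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Return nonzero if p is aligned to the system page size. */
static int is_page_aligned(const void *p)
{
	uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);

	return ((uintptr_t)p & (page - 1)) == 0;
}

/* Round size up to a whole number of pages. */
static size_t round_up_to_page(size_t size)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	return (size + page - 1) / page * page;
}

/* Allocate a page-aligned buffer of at least `size` bytes.
 * Returns NULL on failure; release the buffer with free(). */
static void *alloc_page_aligned(size_t size)
{
	void *p = NULL;
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	if (posix_memalign(&p, page, round_up_to_page(size)) != 0)
		return NULL;
	return p;
}
```

A buffer obtained this way could then be handed to amdgpu_create_bo_from_user_mem(); whether the kernel imposes further requirements is not stated in this thread.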

Signed-off-by: monk.liu 
Reviewed-by: Christian König
---
 amdgpu/amdgpu_bo.c | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
index 1a5a401..61db58c 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -537,17 +537,8 @@ int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,
int r;
struct amdgpu_bo *bo;
struct drm_amdgpu_gem_userptr args;
-   uintptr_t cpu0;
-   uint32_t ps, off;

-   memset(&args, 0, sizeof(args));
-   ps = getpagesize();
-
-   cpu0 = ROUND_DOWN((uintptr_t)cpu, ps);
-   off = (uintptr_t)cpu - cpu0;
-   size = ROUND_UP(size + off, ps);
-
-   args.addr = cpu0;
+   args.addr = cpu;
args.flags = AMDGPU_GEM_USERPTR_ANONONLY | AMDGPU_GEM_USERPTR_REGISTER;
args.size = size;
r = drmCommandWriteRead(dev->fd, DRM_AMDGPU_GEM_USERPTR,
-- 
2.1.4

[PATCH 08/10] amdgpu: validate user memory for userptr

2016-01-12 Thread Marek Olšák

From: Chunming Zhou 

Signed-off-by: Chunming Zhou 
Reviewed-by: Christian König
---
 amdgpu/amdgpu_bo.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
index 2ae1c18..d30fd1e 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -539,7 +539,8 @@ int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,
struct drm_amdgpu_gem_userptr args;

args.addr = (uintptr_t)cpu;
-   args.flags = AMDGPU_GEM_USERPTR_ANONONLY | AMDGPU_GEM_USERPTR_REGISTER;
+   args.flags = AMDGPU_GEM_USERPTR_ANONONLY | AMDGPU_GEM_USERPTR_REGISTER |
+   AMDGPU_GEM_USERPTR_VALIDATE;
args.size = size;
r = drmCommandWriteRead(dev->fd, DRM_AMDGPU_GEM_USERPTR,
&args, sizeof(args));
-- 
2.1.4

[PATCH 06/10] amdgpu: add semaphore support

2016-01-12 Thread Marek Olšák

From: Chunming Zhou 

The semaphore is a binary semaphore. The workflow is:
1. create sem
2. signal sem
3. wait on sem; the sem is reset once it has been signalled
4. destroy sem
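
The four steps above can be modelled with a small, self-contained toy. This is only a sketch of binary-semaphore state transitions, not the libdrm implementation: the `toy_sem_*` names are invented, nothing here blocks or talks to a GPU, and rejecting a second signal before a wait is an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy model of a binary semaphore: it is either signalled or not. */
struct toy_sem {
	bool signalled;
};

/* Step 1: create an unsignalled semaphore (NULL on allocation failure). */
static struct toy_sem *toy_sem_create(void)
{
	return calloc(1, sizeof(struct toy_sem));
}

/* Step 2: signal. Returns 0 on success, -1 if already signalled
 * (an assumption of this toy model). */
static int toy_sem_signal(struct toy_sem *sem)
{
	if (sem == NULL || sem->signalled)
		return -1;
	sem->signalled = true;
	return 0;
}

/* Step 3: wait. Succeeds only if signalled, and resets the semaphore
 * so that it can be signalled again. */
static int toy_sem_wait(struct toy_sem *sem)
{
	if (sem == NULL || !sem->signalled)
		return -1;
	sem->signalled = false;
	return 0;
}

/* Step 4: destroy. */
static void toy_sem_destroy(struct toy_sem *sem)
{
	free(sem);
}
```

In the patch itself, signal and wait additionally take a context, IP type, IP instance and ring, so the semaphore orders work between submissions rather than between CPU threads.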

Signed-off-by: Chunming Zhou 
Reviewed-by: Jammy Zhou 
Reviewed-by: Christian König
---
 amdgpu/amdgpu.h  |  65 +++
 amdgpu/amdgpu_cs.c   | 166 +--
 amdgpu/amdgpu_internal.h |  12 
 3 files changed, 239 insertions(+), 4 deletions(-)

diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index 9ae6ca3..8822a0c 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -124,6 +124,11 @@ typedef struct amdgpu_bo_list *amdgpu_bo_list_handle;
  */
 typedef struct amdgpu_va *amdgpu_va_handle;

+/**
+ * Define handle for semaphore
+ */
+typedef struct amdgpu_semaphore *amdgpu_semaphore_handle;
+
 /*--*/
 /* -- Structures -- */
 /*--*/
@@ -1202,4 +1207,64 @@ int amdgpu_bo_va_op(amdgpu_bo_handle bo,
uint64_t flags,
uint32_t ops);

+/**
+ *  create semaphore
+ *
+ * \param   sem   - \c [out] semaphore handle
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+*/
+int amdgpu_cs_create_semaphore(amdgpu_semaphore_handle *sem);
+
+/**
+ *  signal semaphore
+ *
+ * \param   context- \c [in] GPU Context
+ * \param   ip_type- \c [in] Hardware IP block type = AMDGPU_HW_IP_*
+ * \param   ip_instance- \c [in] Index of the IP block of the same type
+ * \param   ring   - \c [in] Specify ring index of the IP
+ * \param   sem   - \c [in] semaphore handle
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+*/
+int amdgpu_cs_signal_semaphore(amdgpu_context_handle ctx,
+  uint32_t ip_type,
+  uint32_t ip_instance,
+  uint32_t ring,
+  amdgpu_semaphore_handle sem);
+
+/**
+ *  wait semaphore
+ *
+ * \param   context- \c [in] GPU Context
+ * \param   ip_type- \c [in] Hardware IP block type = AMDGPU_HW_IP_*
+ * \param   ip_instance- \c [in] Index of the IP block of the same type
+ * \param   ring   - \c [in] Specify ring index of the IP
+ * \param   sem   - \c [in] semaphore handle
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+*/
+int amdgpu_cs_wait_semaphore(amdgpu_context_handle ctx,
+uint32_t ip_type,
+uint32_t ip_instance,
+uint32_t ring,
+amdgpu_semaphore_handle sem);
+
+/**
+ *  destroy semaphore
+ *
+ * \param   sem- \c [in] semaphore handle
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+*/
+int amdgpu_cs_destroy_semaphore(amdgpu_semaphore_handle sem);
+
 #endif /* #ifdef _AMDGPU_H_ */
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index d5e4ea0..d033f8e 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -40,6 +40,9 @@
 #include "amdgpu_drm.h"
 #include "amdgpu_internal.h"

+static int amdgpu_cs_unreference_sem(amdgpu_semaphore_handle sem);
+static int amdgpu_cs_reset_sem(amdgpu_semaphore_handle sem);
+
 /**
  * Create command submission context
  *
@@ -53,6 +56,7 @@ int amdgpu_cs_ctx_create(amdgpu_device_handle dev,
 {
struct amdgpu_context *gpu_context;
union drm_amdgpu_ctx args;
+   int i, j, k;
int r;

if (NULL == dev)
@@ -78,6 +82,10 @@ int amdgpu_cs_ctx_create(amdgpu_device_handle dev,
goto error;

gpu_context->id = args.out.alloc.ctx_id;
+   for (i = 0; i < AMDGPU_HW_IP_NUM; i++)
+   for (j = 0; j < AMDGPU_HW_IP_INSTANCE_MAX_COUNT; j++)
+   for (k = 0; k < AMDGPU_CS_MAX_RINGS; k++)
+   list_inithead(&gpu_context->sem_list[i][j][k]);
*context = (amdgpu_context_handle)gpu_context;

return 0;
@@ -99,6 +107,7 @@ error:
 int amdgpu_cs_ctx_free(amdgpu_context_handle context)
 {
union drm_amdgpu_ctx args;
+   int i, j, k;
int r;

if (NULL == context)
@@ -112,7 +121,18 @@ int amdgpu_cs_ctx_free(amdgpu_context_handle context)
args.in.ctx_id = context->id;
r = drmCommandWriteRead(context->dev->fd, DRM_AMDGPU_CTX,
&args, sizeof(args));
-
+   for (i = 0; i < AMDGPU_HW_IP_NUM; i++) {
+   for (j = 0; j < AMDGPU_HW_IP_INSTANCE_MAX_COUNT; j++) {
+   for (k = 0; k < AMDGPU_CS_MAX_RINGS; k++) {
+   amdgpu_semaphore_handle sem;
+   LIST_FOR_EACH_ENTRY(sem, &context->sem_list[i][j][k], list) {
+

[PATCH 07/10] tests/amdgpu: add semaphore test

2016-01-12 Thread Marek Olšák

From: Chunming Zhou 

Signed-off-by: Chunming Zhou 
Reviewed-by: Jammy Zhou 
Reviewed-by: Christian König
---
 tests/amdgpu/basic_tests.c | 133 +
 1 file changed, 133 insertions(+)

diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c
index a666d32..56db935 100644
--- a/tests/amdgpu/basic_tests.c
+++ b/tests/amdgpu/basic_tests.c
@@ -48,6 +48,7 @@ static void amdgpu_command_submission_compute(void);
 static void amdgpu_command_submission_sdma(void);
 static void amdgpu_command_submission_multi_fence(void);
 static void amdgpu_userptr_test(void);
+static void amdgpu_semaphore_test(void);

 CU_TestInfo basic_tests[] = {
{ "Query Info Test",  amdgpu_query_info_test },
@@ -57,6 +58,7 @@ CU_TestInfo basic_tests[] = {
{ "Command submission Test (Compute)", amdgpu_command_submission_compute },
{ "Command submission Test (SDMA)", amdgpu_command_submission_sdma },
{ "Command submission Test (Multi-fence)", amdgpu_command_submission_multi_fence },
+   { "SW semaphore Test",  amdgpu_semaphore_test },
CU_TEST_INFO_NULL,
 };
 #define BUFFER_SIZE (8 * 1024)
@@ -79,6 +81,9 @@ CU_TestInfo basic_tests[] = {
 #defineSDMA_OPCODE_COPY  1
 #   define SDMA_COPY_SUB_OPCODE_LINEAR0

+#define GFX_COMPUTE_NOP  0x1000
+#define SDMA_NOP  0x0
+
 int suite_basic_tests_init(void)
 {
int r;
@@ -335,6 +340,134 @@ static void amdgpu_command_submission_gfx(void)
amdgpu_command_submission_gfx_shared_ib();
 }

+static void amdgpu_semaphore_test(void)
+{
+   amdgpu_context_handle context_handle[2];
+   amdgpu_semaphore_handle sem;
+   amdgpu_bo_handle ib_result_handle[2];
+   void *ib_result_cpu[2];
+   uint64_t ib_result_mc_address[2];
+   struct amdgpu_cs_request ibs_request[2] = {0};
+   struct amdgpu_cs_ib_info ib_info[2] = {0};
+   struct amdgpu_cs_fence fence_status = {0};
+   uint32_t *ptr;
+   uint32_t expired;
+   amdgpu_bo_list_handle bo_list[2];
+   amdgpu_va_handle va_handle[2];
+   int r, i;
+
+   r = amdgpu_cs_create_semaphore(&sem);
+   CU_ASSERT_EQUAL(r, 0);
+   for (i = 0; i < 2; i++) {
+   r = amdgpu_cs_ctx_create(device_handle, &context_handle[i]);
+   CU_ASSERT_EQUAL(r, 0);
+
+   r = amdgpu_bo_alloc_and_map(device_handle, 4096, 4096,
+   AMDGPU_GEM_DOMAIN_GTT, 0,
+   &ib_result_handle[i], &ib_result_cpu[i],
+   &ib_result_mc_address[i], &va_handle[i]);
+   CU_ASSERT_EQUAL(r, 0);
+
+   r = amdgpu_get_bo_list(device_handle, ib_result_handle[i],
+  NULL, &bo_list[i]);
+   CU_ASSERT_EQUAL(r, 0);
+   }
+
+   /* 1. same context different engine */
+   ptr = ib_result_cpu[0];
+   ptr[0] = SDMA_NOP;
+   ib_info[0].ib_mc_address = ib_result_mc_address[0];
+   ib_info[0].size = 1;
+
+   ibs_request[0].ip_type = AMDGPU_HW_IP_DMA;
+   ibs_request[0].number_of_ibs = 1;
+   ibs_request[0].ibs = &ib_info[0];
+   ibs_request[0].resources = bo_list[0];
+   ibs_request[0].fence_info.handle = NULL;
+   r = amdgpu_cs_submit(context_handle[0], 0,&ibs_request[0], 1);
+   CU_ASSERT_EQUAL(r, 0);
+   r = amdgpu_cs_signal_semaphore(context_handle[0], AMDGPU_HW_IP_DMA, 0, 0, sem);
+   CU_ASSERT_EQUAL(r, 0);
+
+   r = amdgpu_cs_wait_semaphore(context_handle[0], AMDGPU_HW_IP_GFX, 0, 0, sem);
+   CU_ASSERT_EQUAL(r, 0);
+   ptr = ib_result_cpu[1];
+   ptr[0] = GFX_COMPUTE_NOP;
+   ib_info[1].ib_mc_address = ib_result_mc_address[1];
+   ib_info[1].size = 1;
+
+   ibs_request[1].ip_type = AMDGPU_HW_IP_GFX;
+   ibs_request[1].number_of_ibs = 1;
+   ibs_request[1].ibs = &ib_info[1];
+   ibs_request[1].resources = bo_list[1];
+   ibs_request[1].fence_info.handle = NULL;
+
+   r = amdgpu_cs_submit(context_handle[0], 0,&ibs_request[1], 1);
+   CU_ASSERT_EQUAL(r, 0);
+
+   fence_status.context = context_handle[0];
+   fence_status.ip_type = AMDGPU_HW_IP_GFX;
+   fence_status.fence = ibs_request[1].seq_no;
+   r = amdgpu_cs_query_fence_status(&fence_status,
+5, 0, &expired);
+   CU_ASSERT_EQUAL(r, 0);
+   CU_ASSERT_EQUAL(expired, true);
+
+   /* 2. same engine different context */
+   ptr = ib_result_cpu[0];
+   ptr[0] = GFX_COMPUTE_NOP;
+   ib_info[0].ib_mc_address = ib_result_mc_address[0];
+   ib_info[0].size = 1;
+
+   ibs_request[0].ip_type = AMDGPU_HW_IP_GFX;
+   ibs_request[0].number_of_ibs = 1;
+   ibs_request[0].ibs = &ib_info[0];
+   ibs_request[0].resources = bo_list[0];
+   ibs_request[0].fence_info.handle = NULL;
+   r = amdgpu_cs_submit(context_handle[

[PATCH 09/10] amdgpu: cs_wait_fences now can return the first signaled fence index

2016-01-12 Thread Marek Olšák

From: "monk.liu" 

Signed-off-by: monk.liu 
---
 amdgpu/amdgpu.h|  3 ++-
 amdgpu/amdgpu_cs.c | 12 +---
 include/drm/amdgpu_drm.h   |  3 ++-
 tests/amdgpu/basic_tests.c |  2 +-
 4 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index 8822a0c..d4be7fc 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -916,6 +916,7 @@ int amdgpu_cs_query_fence_status(struct amdgpu_cs_fence *fence,
  *otherwise, wait at least one fence
  * \param   timeout_ns  - \c [in] The timeout to wait, in nanoseconds
  * \param   status  - \c [out] '1' for signaled, '0' for timeout
+ * \param   first   - \c [out] the index of the first signaled fence from @fences
  *
  * \return  0 on success
  *  <0 - Negative POSIX Error code
@@ -927,7 +928,7 @@ int amdgpu_cs_wait_fences(struct amdgpu_cs_fence *fences,
  uint32_t fence_count,
  bool wait_all,
  uint64_t timeout_ns,
- uint32_t *status);
+ uint32_t *status, uint32_t *first);

 /*
  * Query / Info API
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index d033f8e..5c7a3a3 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -439,7 +439,8 @@ static int amdgpu_ioctl_wait_fences(struct amdgpu_cs_fence *fences,
uint32_t fence_count,
bool wait_all,
uint64_t timeout_ns,
-   uint32_t *status)
+   uint32_t *status,
+   uint32_t *first)
 {
struct drm_amdgpu_fence *drm_fences;
amdgpu_device_handle dev = fences[0].context->dev;
@@ -467,6 +468,10 @@ static int amdgpu_ioctl_wait_fences(struct amdgpu_cs_fence *fences,
return -errno;

*status = args.out.status;
+
+   if (first)
+   *first = args.out.first_signaled;
+
return 0;
 }

@@ -474,7 +479,8 @@ int amdgpu_cs_wait_fences(struct amdgpu_cs_fence *fences,
  uint32_t fence_count,
  bool wait_all,
  uint64_t timeout_ns,
- uint32_t *status)
+ uint32_t *status,
+ uint32_t *first)
 {
uint32_t ioctl_status = 0;
uint32_t i;
@@ -499,7 +505,7 @@ int amdgpu_cs_wait_fences(struct amdgpu_cs_fence *fences,
*status = 0;

r = amdgpu_ioctl_wait_fences(fences, fence_count, wait_all, timeout_ns,
-   &ioctl_status);
+   &ioctl_status, first);

if (!r)
*status = ioctl_status;
diff --git a/include/drm/amdgpu_drm.h b/include/drm/amdgpu_drm.h
index 2cbea72..194e1f9 100644
--- a/include/drm/amdgpu_drm.h
+++ b/include/drm/amdgpu_drm.h
@@ -316,7 +316,8 @@ struct drm_amdgpu_wait_fences_in {
 };

 struct drm_amdgpu_wait_fences_out {
-   uint64_t status;
+   uint32_t status;
+   uint32_t first_signaled;
 };

 union drm_amdgpu_wait_fences {
diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c
index 56db935..47cd1db 100644
--- a/tests/amdgpu/basic_tests.c
+++ b/tests/amdgpu/basic_tests.c
@@ -974,7 +974,7 @@ static void amdgpu_command_submission_multi_fence_wait_all(bool wait_all)

r = amdgpu_cs_wait_fences(fence_status, ib_cs_num, wait_all,
AMDGPU_TIMEOUT_INFINITE,
-   &expired);
+   &expired, NULL);
CU_ASSERT_EQUAL(r, 0);

r = amdgpu_bo_unmap_and_free(ib_result_handle, va_handle,
-- 
2.1.4

[PATCH 10/10] amdgpu: list each entry safely for sw semaphore when submit ib

2016-01-12 Thread Marek Olšák

From: Junwei Zhang 

Signed-off-by: Junwei Zhang 
Reviewed-by: Michel Dänzer
Reviewed-by: David Zhou
Reviewed-by: Christian König
---
 amdgpu/amdgpu_cs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index 5c7a3a3..82fa805 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -179,7 +179,7 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle context,
struct drm_amdgpu_cs_chunk_dep *dependencies = NULL;
struct drm_amdgpu_cs_chunk_dep *sem_dependencies = NULL;
struct list_head *sem_list;
-   amdgpu_semaphore_handle sem;
+   amdgpu_semaphore_handle sem, tmp;
uint32_t i, size, sem_count = 0;
bool user_fence;
int r = 0;
@@ -282,7 +282,7 @@ static int amdgpu_cs_submit_one(amdgpu_context_handle context,
goto error_unlock;
}
sem_count = 0;
-   LIST_FOR_EACH_ENTRY(sem, sem_list, list) {
+   LIST_FOR_EACH_ENTRY_SAFE(sem, tmp, sem_list, list) {
struct amdgpu_cs_fence *info = &sem->signal_fence;
struct drm_amdgpu_cs_chunk_dep *dep = &sem_dependencies[sem_count++];
dep->ip_type = info->ip_type;
-- 
2.1.4

[PATCH 04/10] amdgpu/tests: add multi-fence test in base test

2016-01-12 Thread Marek Olšák

From: Junwei Zhang 

Signed-off-by: Junwei Zhang 
Reviewed-by: Jammy Zhou 
---
 tests/amdgpu/basic_tests.c | 100 +
 1 file changed, 100 insertions(+)

diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c
index e489e6e..a666d32 100644
--- a/tests/amdgpu/basic_tests.c
+++ b/tests/amdgpu/basic_tests.c
@@ -46,6 +46,7 @@ static void amdgpu_memory_alloc(void);
 static void amdgpu_command_submission_gfx(void);
 static void amdgpu_command_submission_compute(void);
 static void amdgpu_command_submission_sdma(void);
+static void amdgpu_command_submission_multi_fence(void);
 static void amdgpu_userptr_test(void);

 CU_TestInfo basic_tests[] = {
@@ -55,6 +56,7 @@ CU_TestInfo basic_tests[] = {
{ "Command submission Test (GFX)",  amdgpu_command_submission_gfx },
{ "Command submission Test (Compute)", amdgpu_command_submission_compute },
{ "Command submission Test (SDMA)", amdgpu_command_submission_sdma },
+   { "Command submission Test (Multi-fence)", amdgpu_command_submission_multi_fence },
CU_TEST_INFO_NULL,
 };
 #define BUFFER_SIZE (8 * 1024)
@@ -765,6 +767,104 @@ static void amdgpu_command_submission_sdma(void)
amdgpu_command_submission_sdma_copy_linear();
 }

+static void amdgpu_command_submission_multi_fence_wait_all(bool wait_all)
+{
+   amdgpu_context_handle context_handle;
+   amdgpu_bo_handle ib_result_handle, ib_result_ce_handle;
+   void *ib_result_cpu, *ib_result_ce_cpu;
+   uint64_t ib_result_mc_address, ib_result_ce_mc_address;
+   struct amdgpu_cs_request ibs_request[2] = {0};
+   struct amdgpu_cs_ib_info ib_info[2];
+   struct amdgpu_cs_fence fence_status[2] = {0};
+   uint32_t *ptr;
+   uint32_t expired;
+   amdgpu_bo_list_handle bo_list;
+   amdgpu_va_handle va_handle, va_handle_ce;
+   int r;
+   int i, ib_cs_num = 2;
+
+   r = amdgpu_cs_ctx_create(device_handle, &context_handle);
+   CU_ASSERT_EQUAL(r, 0);
+
+   r = amdgpu_bo_alloc_and_map(device_handle, 4096, 4096,
+   AMDGPU_GEM_DOMAIN_GTT, 0,
+   &ib_result_handle, &ib_result_cpu,
+   &ib_result_mc_address, &va_handle);
+   CU_ASSERT_EQUAL(r, 0);
+
+   r = amdgpu_bo_alloc_and_map(device_handle, 4096, 4096,
+   AMDGPU_GEM_DOMAIN_GTT, 0,
+   &ib_result_ce_handle, &ib_result_ce_cpu,
+   &ib_result_ce_mc_address, &va_handle_ce);
+   CU_ASSERT_EQUAL(r, 0);
+
+   r = amdgpu_get_bo_list(device_handle, ib_result_handle,
+  ib_result_ce_handle, &bo_list);
+   CU_ASSERT_EQUAL(r, 0);
+
+   memset(ib_info, 0, 2 * sizeof(struct amdgpu_cs_ib_info));
+
+   /* IT_SET_CE_DE_COUNTERS */
+   ptr = ib_result_ce_cpu;
+   ptr[0] = 0xc0008900;
+   ptr[1] = 0;
+   ptr[2] = 0xc0008400;
+   ptr[3] = 1;
+   ib_info[0].ib_mc_address = ib_result_ce_mc_address;
+   ib_info[0].size = 4;
+   ib_info[0].flags = AMDGPU_IB_FLAG_CE;
+
+   /* IT_WAIT_ON_CE_COUNTER */
+   ptr = ib_result_cpu;
+   ptr[0] = 0xc0008600;
+   ptr[1] = 0x0001;
+   ib_info[1].ib_mc_address = ib_result_mc_address;
+   ib_info[1].size = 2;
+
+   for (i = 0; i < ib_cs_num; i++) {
+   ibs_request[i].ip_type = AMDGPU_HW_IP_GFX;
+   ibs_request[i].number_of_ibs = 2;
+   ibs_request[i].ibs = ib_info;
+   ibs_request[i].resources = bo_list;
+   ibs_request[i].fence_info.handle = NULL;
+   }
+
+   r = amdgpu_cs_submit(context_handle, 0,ibs_request, ib_cs_num);
+
+   CU_ASSERT_EQUAL(r, 0);
+
+   for (i = 0; i < ib_cs_num; i++) {
+   fence_status[i].context = context_handle;
+   fence_status[i].ip_type = AMDGPU_HW_IP_GFX;
+   fence_status[i].fence = ibs_request[i].seq_no;
+   }
+
+   r = amdgpu_cs_wait_fences(fence_status, ib_cs_num, wait_all,
+   AMDGPU_TIMEOUT_INFINITE,
+   &expired);
+   CU_ASSERT_EQUAL(r, 0);
+
+   r = amdgpu_bo_unmap_and_free(ib_result_handle, va_handle,
+ib_result_mc_address, 4096);
+   CU_ASSERT_EQUAL(r, 0);
+
+   r = amdgpu_bo_unmap_and_free(ib_result_ce_handle, va_handle_ce,
+ib_result_ce_mc_address, 4096);
+   CU_ASSERT_EQUAL(r, 0);
+
+   r = amdgpu_bo_list_destroy(bo_list);
+   CU_ASSERT_EQUAL(r, 0);
+
+   r = amdgpu_cs_ctx_free(context_handle);
+   CU_ASSERT_EQUAL(r, 0);
+}
+
+static void amdgpu_command_submission_multi_fence(void)
+{
+   amdgpu_command_submission_multi_fence_wait_all(true);
+   amdgpu_command_submission_multi_fence_wait_all(false);
+}
+
 static void amdgpu_userptr_test(void)
 {
int i, r, j;
-- 
2.1.4

[PATCH 00/10] libdrm amdgpu patches

2016-01-12 Thread Alex Deucher

On Tue, Jan 12, 2016 at 4:23 PM, Marek Olšák wrote:
> Hi,
>
> These are libdrm_amdgpu patches harvested from an internal branch.
>
> The first patch is a revert I had to make to fix the build. Yeah, 
> sequence_mutex should be renamed to a more appropriate name. That can be done 
> as a follow-up.
>
> One notable change is the addition of DRM_IOCTL_AMDGPU_WAIT_FENCES. I hope 
> the kernel contains (or will contain) the changes too, so that I don't push 
> something that doesn't exist in the kernel.

We haven't pushed DRM_IOCTL_AMDGPU_WAIT_FENCES upstream yet so I would
hold off on any changes that depend on that.

Alex

>
> Please let me know if these are okay to push.
>
> Thanks,
>
> Chunming Zhou (3):
>   amdgpu: add semaphore support
>   tests/amdgpu: add semaphore test
>   amdgpu: validate user memory for userptr
>
> Junwei Zhang (3):
>   amdgpu: add the interface of waiting multiple fences
>   amdgpu/tests: add multi-fence test in base test
>   amdgpu: list each entry safely for sw semaphore when submit ib
>
> Marek Olšák (1):
>   Revert "amdgpu: remove sequence mutex"
>
> Michel Dänzer (1):
>   amdgpu: Cast pointer to uintptr_t for assignment to unsigned integer
>
> monk.liu (2):
>   amdgpu: drop address patching logics
>   amdgpu: cs_wait_fences now can return the first signaled fence index
>
>  amdgpu/amdgpu.h|  88 +
>  amdgpu/amdgpu_bo.c |  14 ++-
>  amdgpu/amdgpu_cs.c | 253 --
>  amdgpu/amdgpu_internal.h   |  15 +++
>  include/drm/amdgpu_drm.h   |  28 +
>  tests/amdgpu/basic_tests.c | 233
>  6 files changed, 616 insertions(+), 15 deletions(-)
>
> Marek
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH v9 14/14] drm/mediatek: Add interface to allocate Mediatek GEM buffer.

2016-01-12 Thread Rob Herring

On Tue, Jan 12, 2016 at 9:15 AM, Philipp Zabel wrote:
> From: CK Hu 
>
> Add an interface to allocate Mediatek GEM buffers, allow the IOCTLs
> to be used by render nodes.
> This patch also sets the RENDER driver feature.

But it should not be a render node unless it has a GPU AFAIK. Then
again, I still don't understand the madness of every driver defining
their own buffer ioctls either. The only line remotely h/w specific
here is mtk_drm_gem_create call.

Rob

[PATCH] x86: Add an explicit barrier() to clflushopt()

2016-01-12 Thread Linus Torvalds

On Tue, Jan 12, 2016 at 1:13 PM, Chris Wilson wrote:
>
> That is a continual worry. To try and assuage that fear, I sent 8x
> flush gpu writes between the end of the copy and setting the "I'm done"
> flag. The definition of the GPU flush is that it both flushes all
> previous writes before it completes and only after it completes does it
> do the post-sync write (before moving onto the next command). The spec
> is always a bit hazy on what order the memory writes will be visible on
> the CPU though.
>
> Sending the 8x GPU flushes before marking "I'm done" did not fix the
> corruption.

Ok. So assuming the GPU flushes are supposed to work, it should be all good.

>> So the reason you see the old content may just be that the GPU writes
>> are still buffered on the GPU. And you adding a clflushopt on the same
>> address just changes the timing enough that you don't see the memory
>> ordering any more (or it's just much harder to see, it might still be
>> there).
>
> Indeed. So I replaced the post-clflush_cache_range() clflush() with a
> udelay(10) instead, and the corruption vanished. Putting the udelay(10)
> before the clflush_cache_range() does not fix the corruption.

Odd.

> passes, I'm inclined to point the finger at the mb() following the
> clflush_cache_range().

We have an entirely unrelated discussion about the value of "mfence"
as a memory barrier.

Mind trying to just make the memory barrier (in
arch/x86/include/asm/barrier.h) be a locked op instead?

The docs say "Executions of the CLFLUSHOPT instruction are ordered
with respect to fence instructions and to locked read-modify-write
instructions; ..", so the mfence should be plenty good enough. But
nobody sane uses mfence for memory ordering (that's the other
discussion we're having), since a locked rmw instruction is faster.

So maybe it's a CPU bug. I'd still consider a GPU memory ordering bug
*way* more likely (the CPU core tends to be better validated in my
experience), but since you're trying odd things anyway, try changing
the "mfence" to "lock; addl $0,0(%%rsp)" instead.

I doubt it makes any difference, but ..

Linus
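
The two barrier variants discussed in the message above can be sketched in user space as follows. This is only an illustration assuming GCC or Clang inline assembly on x86-64, with a compiler builtin as a fallback elsewhere. The function names are invented here, and this is not the kernel's definition of mb().

```c
#include <assert.h>

/* Full memory barrier using MFENCE on x86-64; other targets fall
 * back to the compiler's full-barrier builtin. */
static inline void barrier_mfence(void)
{
#if defined(__x86_64__)
	__asm__ __volatile__("mfence" ::: "memory");
#else
	__sync_synchronize();
#endif
}

/* Full memory barrier using a locked read-modify-write. Adding 0 to
 * the word at the top of the stack leaves memory unchanged, but the
 * LOCK prefix makes the instruction act as a barrier. */
static inline void barrier_locked_add(void)
{
#if defined(__x86_64__)
	__asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
#else
	__sync_synchronize();
#endif
}
```

Both functions also act as compiler barriers through the "memory" clobber. Whether swapping one for the other changes the observed corruption is exactly the open question in the thread.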

[PATCH v9 14/14] drm/mediatek: Add interface to allocate Mediatek GEM buffer.

2016-01-12 Thread Daniel Vetter

On Tue, Jan 12, 2016 at 11:02 PM, Rob Herring  wrote:
> On Tue, Jan 12, 2016 at 9:15 AM, Philipp Zabel wrote:
>> From: CK Hu 
>>
>> Add an interface to allocate Mediatek GEM buffers, allow the IOCTLs
>> to be used by render nodes.
>> This patch also sets the RENDER driver feature.
>
> But it should not be a render node unless it has a GPU AFAIK. Then
> again, I still don't understand the madness of every driver defining
> their own buffer ioctls either. The only line remotely h/w specific
> here is mtk_drm_gem_create call.

Supporting gem_create/mmap_offset alone is indeed pointless without some
real support for gpu workloads. For plain display drivers the dumb
buffer api, plus prime/dma-buf import should be plenty enough.

The usual reason for doing this is some blob driver for opengl that
can't be open-source, which is a big no-go for upstream.

Imo best to just rip this out, consider it nacked without full-blown
userspace and whatever else is needing these buffers.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[PATCH] drm/panel: simple: Add support for Sharp LQ101K1LY04

2016-01-12 Thread Joshua Clayton

Hi Lucas,
Thanks for the review. 

On Tue, 12 Jan 2016 19:45:30 +0100
Lucas Stach  wrote:

> >  drivers/gpu/drm/panel/panel-simple.c | 26 ++
> 
> Missing documentation for the DT binding.
>

Thanks, will add. 

...

> > +   .bus_format = MEDIA_BUS_FMT_RGB888_1X7X4_JEIDA,
> > +};
> > +
> This hunk isn't added at the correct place. Please keep the
> alphabetical sorting.
OK. Makes sense. I'll reorder these.
> 
> >  static const struct display_timing hannstar_hsd070pww1_timing = {
> >    .pixelclock = { 6430, 7110, 8200 },
> >    .hactive = { 1280, 1280, 1280 },
> > @@ -1146,6 +1169,9 @@ static const struct of_device_id
> > platform_of_match[] = {
> >    .compatible = "hannstar,hsd070pww1",
> >    .data = &hannstar_hsd070pww1,
> >    }, {
> > +   .compatible = "sharp,lq101k1ly04",
> > +   .data = &sharp_lq101k1ly04,
> > +   }, {
> 
> Wrong insertion place again.
OK. Will reorder alphabetically also. 

Joshua

[PATCH 1/2] drm/panel: simple: Add support for Sharp LQ101K1LY04

2016-01-12 Thread Joshua Clayton

Add simple-panel support for the Sharp LQ101K1LY04, which is
a 10 inch WXGA (1280x800) LVDS panel.

Signed-off-by: Joshua Clayton 
---
 drivers/gpu/drm/panel/panel-simple.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/panel/panel-simple.c b/drivers/gpu/drm/panel/panel-simple.c
index f97b73e..e01eacb 100644
--- a/drivers/gpu/drm/panel/panel-simple.c
+++ b/drivers/gpu/drm/panel/panel-simple.c
@@ -1073,6 +1073,30 @@ static const struct panel_desc samsung_ltn140at29_301 = {
},
 };

+static const struct display_timing sharp_lq101k1ly04_timing = {
+   .pixelclock = { 6000, 6500, 8000 },
+   .hactive = { 1280, 1280, 1280 },
+   .hfront_porch = { 20, 20, 20 },
+   .hback_porch = { 20, 20, 20 },
+   .hsync_len = { 10, 10, 10 },
+   .vactive = { 800, 800, 800 },
+   .vfront_porch = { 4, 4, 4 },
+   .vback_porch = { 4, 4, 4 },
+   .vsync_len = { 4, 4, 4 },
+   .flags = DISPLAY_FLAGS_PIXDATA_POSEDGE,
+};
+
+static const struct panel_desc sharp_lq101k1ly04 = {
+   .timings = &sharp_lq101k1ly04_timing,
+   .num_timings = 1,
+   .bpc = 8,
+   .size = {
+   .width = 217,
+   .height = 136,
+   },
+   .bus_format = MEDIA_BUS_FMT_RGB888_1X7X4_JEIDA,
+};
+
 static const struct drm_display_mode shelly_sca07010_bfn_lnn_mode = {
.clock = 33300,
.hdisplay = 800,
@@ -1188,6 +1212,9 @@ static const struct of_device_id platform_of_match[] = {
.compatible = "samsung,ltn140at29-301",
.data = &samsung_ltn140at29_301,
}, {
+   .compatible = "sharp,lq101k1ly04",
+   .data = &sharp_lq101k1ly04,
+   }, {
.compatible = "shelly,sca07010-bfn-lnn",
.data = &shelly_sca07010_bfn_lnn,
}, {
-- 
2.5.0

[PATCH 2/2] drm/panel: simple: Add Documentation for Sharp LQ101K1LY04

2016-01-12 Thread Joshua Clayton

Document basic simple-panel support for Sharp LQ101K1LY04

Signed-off-by: Joshua Clayton 
---
 .../devicetree/bindings/display/panel/sharp,lq101k1ly04.txt| 7 +++
 1 file changed, 7 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/display/panel/sharp,lq101k1ly04.txt

diff --git a/Documentation/devicetree/bindings/display/panel/sharp,lq101k1ly04.txt b/Documentation/devicetree/bindings/display/panel/sharp,lq101k1ly04.txt
new file mode 100644
index 000..4aff25b
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/panel/sharp,lq101k1ly04.txt
@@ -0,0 +1,7 @@
+Sharp Display Corp. LQ101K1LY04 10.07" WXGA TFT LCD panel
+
+Required properties:
+- compatible: should be "sharp,lq101k1ly04"
+
+This binding is compatible with the simple-panel binding, which is specified
+in simple-panel.txt in this directory.
-- 
2.5.0

[PATCH] x86: Add an explicit barrier() to clflushopt()

2016-01-12 Thread Linus Torvalds

On Tue, Jan 12, 2016 at 4:55 PM, Chris Wilson wrote:
>
> The double clflush() remains a mystery.

Actually, I think it's explainable.

It's wrong to do the clflush *after* the GPU has done the write, which
seems to be what you are doing.

Why?

If the GPU really isn't cache coherent, what can happen is:

 - the CPU has the line cached

 - the GPU writes the data

 - you do the clflushopt to invalidate the cacheline

 - you expect to see the GPU data.

Right?

Wrong. The above is complete crap.

Why?

Very simple reason: the CPU may have had the cacheline dirty at some
level in its caches, so when you did the clflushopt, it didn't just
invalidate the CPU cacheline, it wrote it back to memory. And in the
process over-wrote the data that the GPU had written.

Now you can say "but the CPU never wrote to the cacheline, so it's not
dirty in the CPU caches". That may or may not be true. The CPU may
have written to it quite a long time ago.

So if you are doing a GPU write, and you want to see the data that the
GPU wrote, you had better do the clflushopt long *before* the GPU ever
writes to memory.

Your pattern of doing "flush and read" is simply fundamentally buggy.
There are only two valid CPU flushing patterns:

 - write and flush (to make the writes visible to the GPU)

 - flush before starting GPU accesses, and then read

At no point can "flush and read" be right.
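
To illustrate the argument above, here is a toy C model of a single write-back cache line in front of memory that a non-coherent device writes directly. It is only a sketch: the struct, the function names and the values are hypothetical and are not taken from i915 or from any kernel code. It assumes the flush behaves as described above (write back if dirty, then invalidate).

```c
/* Toy model of one write-back cache line and non-coherent memory.
 * All names here are hypothetical; this is not kernel or i915 code. */
#include <assert.h>
#include <stdbool.h>

struct line {
    int  mem;     /* value in main memory */
    int  cached;  /* value held in the CPU cache */
    bool valid;   /* the CPU cache holds a copy */
    bool dirty;   /* the cached copy is newer than memory */
};

/* CPU store: allocates the line and marks it dirty (write-back). */
static void cpu_write(struct line *l, int v)
{
    l->cached = v;
    l->valid = true;
    l->dirty = true;
}

/* CPU load: hits the cache if valid, otherwise fills from memory. */
static int cpu_read(struct line *l)
{
    if (!l->valid) {
        l->cached = l->mem;
        l->valid = true;
        l->dirty = false;
    }
    return l->cached;
}

/* clflush-like operation: write back if dirty, then invalidate. */
static void cpu_flush(struct line *l)
{
    assert(!l->dirty || l->valid);  /* a dirty line must be valid */
    if (l->dirty)
        l->mem = l->cached;
    l->valid = false;
    l->dirty = false;
}

/* Non-coherent device write: goes to memory, the cache is not snooped. */
static void gpu_write(struct line *l, int v)
{
    l->mem = v;
}

/* Broken pattern: "flush and read" after the device has written. */
int flush_after_gpu_write(void)
{
    struct line l = {0};
    cpu_write(&l, 1);    /* an old CPU store leaves the line dirty */
    gpu_write(&l, 42);
    cpu_flush(&l);       /* writes back 1 over the GPU's 42 */
    return cpu_read(&l);
}

/* Valid pattern: flush before the device starts, then read. */
int flush_before_gpu_write(void)
{
    struct line l = {0};
    cpu_write(&l, 1);
    cpu_flush(&l);       /* dirty data leaves the cache first */
    gpu_write(&l, 42);
    return cpu_read(&l);
}
```

In this model flush_after_gpu_write() returns 1, because the write-back clobbers the device's data, while flush_before_gpu_write() returns 42.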

Now, I haven't actually seen your code, so I'm just going by your
high-level description of where the CPU flush and CPU read were done,
but it *sounds* like you did that invalid "flush and read" behavior.

 Linus

[PATCH] x86: Add an explicit barrier() to clflushopt()

2016-01-12 Thread Andy Lutomirski

On Tue, Jan 12, 2016 at 6:06 PM, Linus Torvalds wrote:
> On Tue, Jan 12, 2016 at 4:55 PM, Chris Wilson wrote:
>>
>> The double clflush() remains a mystery.
>
> Actually, I think it's explainable.
>
> It's wrong to do the clflush *after* the GPU has done the write, which
> seems to be what you are doing.
>
> Why?
>
> If the GPU really isn't cache coherent, what can happen is:
>
>  - the CPU has the line cached
>
>  - the GPU writes the data
>
>  - you do the clflushopt to invalidate the cacheline
>
>  - you expect to see the GPU data.
>
> Right?
>
> Wrong. The above is complete crap.
>
> Why?
>
> Very simple reason: the CPU may have had the cacheline dirty at some
> level in its caches, so when you did the clflushopt, it didn't just
> invalidate the CPU cacheline, it wrote it back to memory. And in the
> process over-wrote the data that the GPU had written.
>
> Now you can say "but the CPU never wrote to the cacheline, so it's not
> dirty in the CPU caches". That may or may not be true. The CPU may
> have written to it quite a long time ago.
>
> So if you are doing a GPU write, and you want to see the data that the
> GPU wrote, you had better do the clflushopt long *before* the GPU ever
> writes to memory.
>
> Your pattern of doing "flush and read" is simply fundamentally buggy.
> There are only two valid CPU flushing patterns:
>
>  - write and flush (to make the writes visible to the GPU)
>
>  - flush before starting GPU accesses, and then read
>
> At no point can "flush and read" be right.
>
> Now, I haven't actually seen your code, so I'm just going by your
> high-level description of where the CPU flush and CPU read were done,
> but it *sounds* like you did that invalid "flush and read" behavior.

Since barriers are on my mind: how strong a barrier is needed to
prevent cache fills from being speculated across the barrier?

Concretely, if I do:

clflush A
clflush B
load A
load B

the architecture guarantees that (unless store forwarding happens) the
value I see for B is at least as new as the value I see for A *with
respect to other accesses within the coherency domain*.  But the GPU
isn't in the coherency domain at all.

Is:

clflush A
clflush B
load A
MFENCE
load B

good enough?  If it is, and if

clflush A
clflush B
load A
LOCK whatever
load B

is *not*, then this might account for the performance difference.

In any event, it seems to me that what i915 is trying to do isn't
really intended to be supported for WB memory.  i915 really wants to
force a read from main memory and to simultaneously prevent the CPU
from writing back to main memory.  Ick.  I'd assume that:

clflush A
clflush B
load A
serializing instruction here
load B

is good enough, as long as you make sure that the GPU does its writes
after the clflushes make it all the way out to main memory (which
might require a second serializing instruction in the case of
clflushopt), but this still relies on the hardware prefetcher not
prefetching B too early, which it's permitted to do even in the
absence of any explicit access at all.

Presumably this is good enough on any implementation:

clflush A
clflush B
load A
clflush B
load B

But that will be really, really slow.  And you're still screwed if the
hardware is permitted to arbitrarily change cache lines from S to M.

In other words, I'm not really convinced that x86 was ever intended to
have well-defined behavior if something outside the coherency domain
writes to a page of memory while that page is mapped WB.  Of course,
I'm also not sure how to reliably switch a page from WB to any other
memory type short of remapping it and doing CLFLUSH after remapping.

SDM Volume 3 11.12.4 seems to agree with me.

Could the driver be changed to use WC or UC and to use MOVNTDQA on
supported CPUs to get the performance back?  It sounds like i915 is
effectively doing PIO here, and reasonably modern CPUs have a nice set
of fast PIO instructions.

--Andy

[PATCH] x86: Add an explicit barrier() to clflushopt()

2016-01-12 Thread Linus Torvalds

On Tue, Jan 12, 2016 at 6:42 PM, Andy Lutomirski  wrote:
>
> Since barriers are on my mind: how strong a barrier is needed to
> prevent cache fills from being speculated across the barrier?

I don't think there are *any* architectural guarantees.

I suspect that a real serializing instruction should do it. But I
don't think even that is guaranteed.

Non-coherent IO is crazy. I really thought Intel had learnt their
lesson, and finally made all the GPUs coherent. I'm afraid to even
ask why Chris is actually working on some sh*t that requires clflush.

In general, you should probably do something nasty like

 - flush before starting IO that generates data (to make sure you have
no dirty cachelines that will write back and mess up)

 - start the IO, wait for it to complete

 - flush after finishing IO that generates the data (to make sure you
have no speculative clean cachelines with stale data)

 - read the data now.
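
The four steps above can be sketched with a toy C model of one cache line. This is an illustration under stated assumptions, not real driver code: the names are hypothetical, the device write bypasses the cache entirely, and a speculative prefetch is assumed to pull the line back in between the first flush and the end of the IO.

```c
/* Toy model: why both the flush before IO and the flush after IO matter.
 * All names are hypothetical; nothing here is an existing kernel API. */
#include <assert.h>
#include <stdbool.h>

struct line {
    int  mem;     /* value in main memory */
    int  cached;  /* value held in the CPU cache */
    bool valid;   /* the CPU cache holds a copy */
    bool dirty;   /* the cached copy is newer than memory */
};

/* CPU store: allocates the line and marks it dirty. */
static void cpu_write(struct line *l, int v)
{
    l->cached = v;
    l->valid = true;
    l->dirty = true;
}

/* Cache fill, used for demand loads and for speculative prefetches. */
static void cpu_fill(struct line *l)
{
    if (!l->valid) {
        l->cached = l->mem;
        l->valid = true;
        l->dirty = false;
    }
}

/* CPU load: fill on a miss, then return the cached value. */
static int cpu_read(struct line *l)
{
    cpu_fill(l);
    return l->cached;
}

/* clflush-like operation: write back if dirty, then invalidate. */
static void cpu_flush(struct line *l)
{
    assert(!l->dirty || l->valid);  /* a dirty line must be valid */
    if (l->dirty)
        l->mem = l->cached;
    l->valid = false;
    l->dirty = false;
}

/* Non-coherent IO write: lands in memory, the cache is not snooped. */
static void dev_write(struct line *l, int v)
{
    l->mem = v;
}

/* Runs the sequence with each flush optionally left out. */
int read_io_result(bool flush_before, bool flush_after)
{
    struct line l = {0};

    cpu_write(&l, 1);        /* earlier CPU store leaves a dirty line */
    if (flush_before)
        cpu_flush(&l);       /* step 1: no dirty line can write back later */
    cpu_fill(&l);            /* assumed speculative prefetch during the IO */
    dev_write(&l, 42);       /* step 2: IO runs and completes */
    if (flush_after)
        cpu_flush(&l);       /* step 3: drop the clean but stale copy */
    return cpu_read(&l);     /* step 4: read the data */
}
```

With both flushes, read_io_result(true, true) returns 42. Leaving out the second flush returns the stale prefetched 1, and leaving out the first returns 1 because the dirty line is written back over the IO data.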

Of course, what people actually end up doing to avoid all this is to
mark the memory noncacheable.

And finally, the *correct* thing is to not have crap hardware, and
have IO be cache coherent. Things that don't do that are shit. Really.

 Linus
