[PATCH 1/1] drm/amd/amdgpu: get maximum and used UVD handles (v3)

2016-12-16 Thread Grazvydas Ignotas
On Thu, Dec 15, 2016 at 4:12 PM, Christian König wrote:
>
> Regarding which error code to return I think that Emil has the right idea
> here.
>
> Returning -EINVAL usually means that userspace provided an invalid value,
> but in this case it doesn't matter which value the UMD provides; all of them
> would be invalid because, starting with Polaris, the hardware/firmware simply
> doesn't work this way any more.
>
> So using -ENODEV or maybe -ENODATA indeed sounds like the right thing to do
> here.

What about ERANGE then, "Math result not representable" aka infinity?
To me ENODEV is more like "the GPU you are asking about is not there"
and ENODATA "the information you ask for is not known", although the latter
still somewhat makes sense.
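
For reference, a minimal sketch of the query path under discussion, with the
error code made explicit (illustrative only: uvd_handle_query() and
amdgpu_uvd_count_handles() are hypothetical names, not the actual amdgpu
patch):

    /* Hypothetical sketch; the real patch wires this into the AMDGPU_INFO ioctl. */
    static int uvd_handle_query(struct amdgpu_device *adev,
                                u32 *max_handles, u32 *used_handles)
    {
            /*
             * Starting with Polaris the firmware no longer tracks UVD session
             * handles, so no answer exists regardless of the input -- hence
             * the -ENODEV vs. -ENODATA vs. -ERANGE debate above.
             */
            if (adev->asic_type >= CHIP_POLARIS10)
                    return -ENODATA;

            *max_handles = adev->uvd.max_handles;
            *used_handles = amdgpu_uvd_count_handles(adev); /* hypothetical helper */
            return 0;
    }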

Gražvydas


[Intel-gfx] [PATCH] drm: Convert all helpers to drm_connector_list_iter

2016-12-16 Thread kbuild test robot
Hi Daniel,

[auto build test ERROR on drm/drm-next]
[also build test ERROR on next-20161215]
[cannot apply to v4.9]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Daniel-Vetter/drm-Convert-all-helpers-to-drm_connector_list_iter/20161216-061508
base:   git://people.freedesktop.org/~airlied/linux.git drm-next
config: i386-randconfig-x005-201650 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/drm_crtc_helper.c: In function 'drm_helper_encoder_in_use':
   drivers/gpu/drm/drm_crtc_helper.c:91:33: error: storage size of 'conn_iter' isn't known
     struct drm_connector_list_iter conn_iter;
   drivers/gpu/drm/drm_crtc_helper.c:104:2: error: implicit declaration of function 'drm_connector_list_iter_get' [-Werror=implicit-function-declaration]
     drm_connector_list_iter_get(dev, &conn_iter);
   drivers/gpu/drm/drm_crtc_helper.c:105:2: error: implicit declaration of function 'drm_for_each_connector_iter' [-Werror=implicit-function-declaration]
     drm_for_each_connector_iter(connector, &conn_iter) {
   drivers/gpu/drm/drm_crtc_helper.c:105:53: error: expected ';' before '{' token
     drm_for_each_connector_iter(connector, &conn_iter) {
   drivers/gpu/drm/drm_crtc_helper.c:91:33: warning: unused variable 'conn_iter' [-Wunused-variable]
     struct drm_connector_list_iter conn_iter;
   drivers/gpu/drm/drm_crtc_helper.c: In function 'drm_crtc_helper_disable':
   drivers/gpu/drm/drm_crtc_helper.c:446:34: error: storage size of 'conn_iter' isn't known
     struct drm_connector_list_iter conn_iter;
   drivers/gpu/drm/drm_crtc_helper.c:452:54: error: expected ';' before '{' token
     drm_for_each_connector_iter(connector, &conn_iter) {
   drivers/gpu/drm/drm_crtc_helper.c:446:34: warning: unused variable 'conn_iter' [-Wunused-variable]
     struct drm_connector_list_iter conn_iter;
   drivers/gpu/drm/drm_crtc_helper.c: In function 'drm_crtc_helper_set_config':
   drivers/gpu/drm/drm_crtc_helper.c:521:33: error: storage size of 'conn_iter' isn't known
     struct drm_connector_list_iter conn_iter;
   drivers/gpu/drm/drm_crtc_helper.c:588:3: error: expected ';' before 'save_connector_encoders'
     save_connector_encoders[count++] = connector->encoder;
   drivers/gpu/drm/drm_crtc_helper.c:589:2: error: implicit declaration of function 'drm_connector_list_iter_put' [-Werror=implicit-function-declaration]
     drm_connector_list_iter_put(&conn_iter);
   drivers/gpu/drm/drm_crtc_helper.c:633:53: error: expected ';' before '{' token
     drm_for_each_connector_iter(connector, &conn_iter) {
   drivers/gpu/drm/drm_crtc_helper.c:675:53: error: expected ';' before '{' token
     drm_for_each_connector_iter(connector, &conn_iter) {
   drivers/gpu/drm/drm_crtc_helper.c:767:3: error: expected ';' before 'connector'
     connector->encoder = save_connector_encoders[count++];
   drivers/gpu/drm/drm_crtc_helper.c:521:33: warning: unused variable 'conn_iter' [-Wunused-variable]
     struct drm_connector_list_iter conn_iter;
   drivers/gpu/drm/drm_crtc_helper.c:517:49: warning: unused variable 'new_encoder' [-Wunused-variable]
     struct drm_encoder **save_connector_encoders, *new_encoder, *encoder;
   drivers/gpu/drm/drm_crtc_helper.c:516:41: warning: unused variable 'new_crtc' [-Wunused-variable]
     struct drm_crtc **save_encoder_crtcs, *new_crtc;
   drivers/gpu/drm/drm_crtc_helper.c: In function 'drm_helper_choose_encoder_dpms':
   drivers/gpu/drm/drm_crtc_helper.c:795:33: error: storage size of 'conn_iter' isn't known
     struct drm_connector_list_iter conn_iter;
>> drivers/gpu/drm/drm_cr

[Intel-gfx] [PATCH] drm: Convert all helpers to drm_connector_list_iter

2016-12-16 Thread kbuild test robot
Hi Daniel,

[auto build test ERROR on drm/drm-next]
[also build test ERROR on next-20161215]
[cannot apply to v4.9]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Daniel-Vetter/drm-Convert-all-helpers-to-drm_connector_list_iter/20161216-061508
base:   git://people.freedesktop.org/~airlied/linux.git drm-next
config: i386-randconfig-x003-201650 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All error/warnings (new ones prefixed by >>):

   drivers/gpu/drm/drm_crtc_helper.c: In function 'drm_helper_encoder_in_use':
>> drivers/gpu/drm/drm_crtc_helper.c:91:33: error: storage size of 'conn_iter' isn't known
     struct drm_connector_list_iter conn_iter;
>> drivers/gpu/drm/drm_crtc_helper.c:104:2: error: implicit declaration of function 'drm_connector_list_iter_get' [-Werror=implicit-function-declaration]
     drm_connector_list_iter_get(dev, &conn_iter);
>> drivers/gpu/drm/drm_crtc_helper.c:105:2: error: implicit declaration of function 'drm_for_each_connector_iter' [-Werror=implicit-function-declaration]
     drm_for_each_connector_iter(connector, &conn_iter) {
>> drivers/gpu/drm/drm_crtc_helper.c:105:53: error: expected ';' before '{' token
     drm_for_each_connector_iter(connector, &conn_iter) {
   drivers/gpu/drm/drm_crtc_helper.c:91:33: warning: unused variable 'conn_iter' [-Wunused-variable]
     struct drm_connector_list_iter conn_iter;
   drivers/gpu/drm/drm_crtc_helper.c: In function 'drm_crtc_helper_disable':
   drivers/gpu/drm/drm_crtc_helper.c:446:34: error: storage size of 'conn_iter' isn't known
     struct drm_connector_list_iter conn_iter;
   drivers/gpu/drm/drm_crtc_helper.c:452:54: error: expected ';' before '{' token
     drm_for_each_connector_iter(connector, &conn_iter) {
   drivers/gpu/drm/drm_crtc_helper.c:446:34: warning: unused variable 'conn_iter' [-Wunused-variable]
     struct drm_connector_list_iter conn_iter;
   drivers/gpu/drm/drm_crtc_helper.c: In function 'drm_crtc_helper_set_config':
   drivers/gpu/drm/drm_crtc_helper.c:521:33: error: storage size of 'conn_iter' isn't known
     struct drm_connector_list_iter conn_iter;
>> drivers/gpu/drm/drm_crtc_helper.c:588:3: error: expected ';' before 'save_connector_encoders'
     save_connector_encoders[count++] = connector->encoder;
>> drivers/gpu/drm/drm_crtc_helper.c:589:2: error: implicit declaration of function 'drm_connector_list_iter_put' [-Werror=implicit-function-declaration]
     drm_connector_list_iter_put(&conn_iter);
   drivers/gpu/drm/drm_crtc_helper.c:633:53: error: expected ';' before '{' token
     drm_for_each_connector_iter(connector, &conn_iter) {
   drivers/gpu/drm/drm_crtc_helper.c:675:53: error: expected ';' before '{' token
     drm_for_each_connector_iter(connector, &conn_iter) {
>> drivers/gpu/drm/drm_crtc_helper.c:767:3: error: expected ';' before 'connector'
     connector->encoder = save_connector_encoders[count++];
   drivers/gpu/drm/drm_crtc_helper.c:521:33: warning: unused variable 'conn_iter' [-Wunused-variable]
     struct drm_connector_list_iter conn_iter;
   drivers/gpu/drm/drm_crtc_helper.c:517:49: warning: unused variable 'new_encoder' [-Wunused-variable]
     struct drm_encoder **save_connector_encoders, *new_encoder, *encoder;
   drivers/gpu/drm/drm_crtc_helper.c:516:41: warning: unused variable 'new_crtc' [-Wunused-variable]
     struct drm_crtc **save_encoder_crtcs, *new_crtc;
   drivers/gpu/drm/drm_crtc_helper.c: In function 'drm_helper_choose_encoder_dpms':
   drivers/gpu/drm/drm_crtc_helper.c:795:33: error: storage size of 'conn_iter'

[RFC v2 00/11] vb2: Handle user cache hints, allow drivers to choose cache coherency

2016-12-16 Thread Laurent Pinchart
Hello,

This is a rebased version of the vb2 cache hints support patch series posted
by Sakari more than a year ago. The patches have been modified as needed by
the upstream changes and received the occasional small odd fix but are 
otherwise not modified. Please see the individual commit messages for more
information.

The videobuf2 memory managers use the DMA mapping API to handle cache
synchronization transparently for drivers on systems that require it. As
cache operations are expensive, system performance can be impacted. Cache
synchronization can't be skipped altogether if we want to retain correct
behaviour, but optimizations are possible in cases related to buffer sharing
between multiple devices without CPU access to the memory.

The first optimization covers cases where the memory never needs to be
accessed by the CPU (neither in kernelspace nor in userspace). In those cases,
as no CPU memory mappings exist, cache synchronization can be skipped. The
situation could be detected in the kernel as we have enough information to
determine whether CPU mappings for kernelspace or userspace exist (in the
first case because drivers should request them explicitly, in the second case
because the mmap() handler hasn't been invoked). This optimization is not
implemented currently but should at least be prototyped as it could improve
performance automatically in a large number of cases.

The second class of optimizations covers cases where the memory sometimes needs
to be accessed by the CPU. In those cases memory mappings must be created and
caches handled, but cache synchronization could be skipped for buffers that are
not touched by the CPU.

By default, the following cache synchronization operations need to be performed
as part of the buffer management ioctls. For simplicity, mentions of QBUF below
apply to both VIDIOC_QBUF and VIDIOC_PREPARE_BUF.

        | QBUF       | DQBUF
--------+------------+----------------
CAPTURE | Invalidate | Invalidate (*)
OUTPUT  | Clean      | -

(*) for systems using speculative pre-fetching only

The following cases can be optimized.

1. CAPTURE, the CPU has not written to the buffer before QBUF

   Cache invalidation can be skipped at QBUF time, but becomes required at
   DQBUF time on all systems, regardless of whether they use speculative
   prefetching.

2. CAPTURE, the CPU will not read from the buffer after DQBUF

   Cache invalidation can be skipped at DQBUF time.

3. CAPTURE, combination of (1) and (2)

   Cache invalidation can be skipped at both QBUF and DQBUF time.

4. OUTPUT, the CPU has not written to the buffer before QBUF

   Cache clean can be skipped at QBUF time.


The kernel can't detect those situations automatically and thus requires
hints from userspace to decide whether cache synchronization can be skipped.
It should be noted that those hints might not be honoured. In particular, if
userspace hints that it hasn't touched the buffer with the CPU, drivers might
need to perform memory accesses themselves (adding JPEG or MPEG headers to
buffers is a common case where CPU access could be needed in the kernel), in
which case the userspace hints will be ignored.

Getting the hints wrong will result in data corruption. Userspace applications
are allowed to shoot themselves in the foot, but drivers are responsible for
deciding whether data corruption can pose a risk to the system in general. For
instance, if the device could be made to crash, or behave in a way that would
jeopardize system security, reliability or performance, when fed with invalid
data, cache synchronization shall not be skipped solely based on possibly
incorrect userspace hints.

The V4L2 API defines two flags, V4L2_BUF_FLAG_NO_CACHE_INVALIDATE and
V4L2_BUF_FLAG_NO_CACHE_CLEAN, that can be used to provide cache-related hints
to the kernel. However, no kernel has ever implemented support for those flags,
which are thus most likely unused.

A single flag is enough to cover all the optimization cases described above,
provided we keep track of the flag being set at QBUF time to force cache
invalidation at DQBUF time for case (1) if the flag isn't set at DQBUF time.
This patch series thus cleans up the userspace API and merges both flags into
a single one.
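
As a concrete illustration, below is a minimal userspace sketch of case (3)
above: an MMAP CAPTURE buffer that the CPU neither writes before QBUF nor
reads after DQBUF. Note that V4L2_BUF_FLAG_NO_CACHE_SYNC only exists with
this series applied, and error handling is omitted:

    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/videodev2.h>

    /* Queue and dequeue one MMAP capture buffer that the CPU never touches. */
    static void cycle_untouched_buffer(int fd, unsigned int index)
    {
            struct v4l2_buffer buf;

            memset(&buf, 0, sizeof(buf));
            buf.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
            buf.memory = V4L2_MEMORY_MMAP;
            buf.index  = index;
            /* The CPU has not written to the buffer: skip the QBUF-time sync. */
            buf.flags  = V4L2_BUF_FLAG_NO_CACHE_SYNC;
            ioctl(fd, VIDIOC_QBUF, &buf);

            /* The CPU will not read the captured data either, so hint again at
             * DQBUF time to skip the invalidation there as well. */
            memset(&buf, 0, sizeof(buf));
            buf.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
            buf.memory = V4L2_MEMORY_MMAP;
            buf.flags  = V4L2_BUF_FLAG_NO_CACHE_SYNC;
            ioctl(fd, VIDIOC_DQBUF, &buf);
    }

If the CPU does end up reading the frame after all, the hint simply has to be
left out at DQBUF time, which forces the invalidation described in case (1).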

One potential issue with case (1) is that cache invalidation at DQBUF time for
CAPTURE buffers isn't fully under the control of videobuf2. We can instruct
the DMA mapping API to skip cache handling, but we can't force it to
invalidate the cache in the sync_for_cpu operation on systems without
speculative prefetching. Luckily, on ARM32 the current implementation always
invalidates the cache in __dma_page_dev_to_cpu() for CAPTURE buffers, so we are
safe for now. However, this is documented by a FIXME comment that might lead
to someone fixing the implementation in the future. I believe we will have to
address the problem at the DMA mapping level; the userspace hint API shouldn't
be affected.

This RFC patch set achieves two m

[RFC v2 04/11] v4l: Unify cache management hint buffer flags

2016-12-16 Thread Laurent Pinchart
From: Sakari Ailus 

The V4L2_BUF_FLAG_NO_CACHE_INVALIDATE and V4L2_BUF_FLAG_NO_CACHE_CLEAN
buffer flags are currently not used by the kernel. Replace the definitions
by a single V4L2_BUF_FLAG_NO_CACHE_SYNC flag to be used by further
patches.

Different cache architectures should not be visible to user space, which can
make no meaningful use of the differences anyway. In case a device can make
use of non-coherent memory accesses, the necessary cache operations depend on
the CPU architecture and the buffer type, not on the requests of the user. The
cache operation itself may be skipped at the user's request, which was the
purpose of the two flags.

On ARM the invalidate and clean are separate operations whereas on
x86(-64) the two are a single operation (flush). Whether the hardware uses
the buffer for reading (V4L2_BUF_TYPE_*_OUTPUT*) or writing
(V4L2_BUF_TYPE_*CAPTURE*) already defines the required cache operation
(clean and invalidate, respectively). No user input is required.

Signed-off-by: Sakari Ailus 
Acked-by: Hans Verkuil 
---
 Documentation/media/uapi/v4l/buffer.rst| 24 --
 .../media/uapi/v4l/vidioc-prepare-buf.rst  |  5 ++---
 include/trace/events/v4l2.h|  3 +--
 include/uapi/linux/videodev2.h |  7 +--
 4 files changed, 17 insertions(+), 22 deletions(-)

diff --git a/Documentation/media/uapi/v4l/buffer.rst 
b/Documentation/media/uapi/v4l/buffer.rst
index ac58966ccb9b..601c3e96464a 100644
--- a/Documentation/media/uapi/v4l/buffer.rst
+++ b/Documentation/media/uapi/v4l/buffer.rst
@@ -437,23 +437,17 @@ Buffer Flags
:ref:`VIDIOC_PREPARE_BUF `,
:ref:`VIDIOC_QBUF` or
:ref:`VIDIOC_DQBUF ` ioctl is called.
-* .. _`V4L2-BUF-FLAG-NO-CACHE-INVALIDATE`:
+* .. _`V4L2-BUF-FLAG-NO-CACHE-SYNC`:

-  - ``V4L2_BUF_FLAG_NO_CACHE_INVALIDATE``
+  - ``V4L2_BUF_FLAG_NO_CACHE_SYNC``
   - 0x0800
-  - Caches do not have to be invalidated for this buffer. Typically
-   applications shall use this flag if the data captured in the
-   buffer is not going to be touched by the CPU, instead the buffer
-   will, probably, be passed on to a DMA-capable hardware unit for
-   further processing or output.
-* .. _`V4L2-BUF-FLAG-NO-CACHE-CLEAN`:
-
-  - ``V4L2_BUF_FLAG_NO_CACHE_CLEAN``
-  - 0x1000
-  - Caches do not have to be cleaned for this buffer. Typically
-   applications shall use this flag for output buffers if the data in
-   this buffer has not been created by the CPU but by some
-   DMA-capable unit, in which case caches have not been used.
+  - Do not perform CPU cache synchronisation operations when the buffer is
+   queued or dequeued. The user is responsible for the correct use of
+   this flag. It should be only used when the buffer is not accessed
+   using the CPU, e.g. the buffer is written to by a hardware block and
+   then read by another one, in which case the flag should be set in both
+   :ref:`VIDIOC_QBUF` and :ref:`VIDIOC_DQBUF` ioctls. The flag has no
+   effect on some devices / architectures.
 * .. _`V4L2-BUF-FLAG-LAST`:

   - ``V4L2_BUF_FLAG_LAST``
diff --git a/Documentation/media/uapi/v4l/vidioc-prepare-buf.rst 
b/Documentation/media/uapi/v4l/vidioc-prepare-buf.rst
index bdcfd9fe550d..80aeb7e403f3 100644
--- a/Documentation/media/uapi/v4l/vidioc-prepare-buf.rst
+++ b/Documentation/media/uapi/v4l/vidioc-prepare-buf.rst
@@ -36,9 +36,8 @@ pass ownership of the buffer to the driver before actually 
enqueuing it,
 using the :ref:`VIDIOC_QBUF` ioctl, and to prepare it for future I/O. Such
 preparations may include cache invalidation or cleaning. Performing them
 in advance saves time during the actual I/O. In case such cache
-operations are not required, the application can use one of
-``V4L2_BUF_FLAG_NO_CACHE_INVALIDATE`` and
-``V4L2_BUF_FLAG_NO_CACHE_CLEAN`` flags to skip the respective step.
+operations are not required, the application can use the
+``V4L2_BUF_FLAG_NO_CACHE_SYNC`` flag to skip the cache synchronization step.

 The struct :c:type:`v4l2_buffer` structure is specified in
 :ref:`buffer`.
diff --git a/include/trace/events/v4l2.h b/include/trace/events/v4l2.h
index ee7754c6e4a1..fb9ad7b0 100644
--- a/include/trace/events/v4l2.h
+++ b/include/trace/events/v4l2.h
@@ -80,8 +80,7 @@ SHOW_FIELD
{ V4L2_BUF_FLAG_ERROR,   "ERROR" },   \
{ V4L2_BUF_FLAG_TIMECODE,"TIMECODE" },\
{ V4L2_BUF_FLAG_PREPARED,"PREPARED" },\
-   { V4L2_BUF_FLAG_NO_CACHE_INVALIDATE, "NO_CACHE_INVALIDATE" }, \
-   { V4L2_BUF_FLAG_NO_CACHE_CLEAN,  "NO_CACHE_CLEAN" },  \
+   { V4L2_BUF_FLAG_NO_CACHE_SYNC,   "NO_CACHE_SYNC" },   \
{ V4L2_BUF_FLAG_TIMESTAMP_MASK,  "TIMESTAMP_MASK" },  \
{ V4L2_BUF_FLAG_TI

[RFC v2 08/11] vb2: dma-contig: Don't warn on failure in obtaining scatterlist

2016-12-16 Thread Laurent Pinchart
From: Sakari Ailus 

vb2_dc_get_base_sgt(), which obtains the scatterlist, already prints
information on why it could not be obtained.

Also, remove the useless warning about a failed kmalloc().

Signed-off-by: Sakari Ailus 
Reviewed-by: Laurent Pinchart 
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c 
b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 2a00d12ffee2..d59f107f0457 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -370,10 +370,8 @@ static struct sg_table *vb2_dc_get_base_sgt(struct 
vb2_dc_buf *buf)
struct sg_table *sgt;

sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
-   if (!sgt) {
-   dev_err(buf->dev, "failed to alloc sg table\n");
+   if (!sgt)
return NULL;
-   }

ret = dma_get_sgtable_attrs(buf->dev, sgt, buf->cookie, buf->dma_addr,
buf->size, buf->attrs);
@@ -400,7 +398,7 @@ static struct dma_buf *vb2_dc_get_dmabuf(void *buf_priv, 
unsigned long flags)
if (!buf->dma_sgt)
buf->dma_sgt = vb2_dc_get_base_sgt(buf);

-   if (WARN_ON(!buf->dma_sgt))
+   if (!buf->dma_sgt)
return NULL;

dbuf = dma_buf_export(&exp_info);
-- 
Regards,

Laurent Pinchart



[RFC v2 05/11] v4l2-core: Don't sync cache for a buffer if so requested

2016-12-16 Thread Laurent Pinchart
From: Samu Onkalo 

The user may request that the driver (vb2) skip the cache maintenance
operations when the buffer does not need cache synchronisation, e.g. when
the buffer is passed between hardware blocks without being touched by the
CPU.

Also document that the prepare and finish vb2_mem_ops might not get called
every time the buffer ownership changes between the kernel and user
space.

Signed-off-by: Samu Onkalo 
Signed-off-by: Sakari Ailus 
---
Changes since v1:

- Add a no_cache_sync argument to vb2 core prepare/qbuf/dqbuf functions
  to get round the inability to access v4l2_buffer flags from vb2 core.
---
 drivers/media/v4l2-core/videobuf2-core.c | 101 +--
 drivers/media/v4l2-core/videobuf2-v4l2.c |  14 -
 include/media/videobuf2-core.h   |  23 ---
 3 files changed, 97 insertions(+), 41 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c 
b/drivers/media/v4l2-core/videobuf2-core.c
index 15a83f338072..e5371ef213b0 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -189,6 +189,28 @@ static void __vb2_queue_cancel(struct vb2_queue *q);
 static void __enqueue_in_driver(struct vb2_buffer *vb);

 /**
+ * __mem_prepare_planes() - call prepare mem op for all planes of the buffer
+ */
+static void __mem_prepare_planes(struct vb2_buffer *vb)
+{
+   unsigned int plane;
+
+   for (plane = 0; plane < vb->num_planes; ++plane)
+   call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
+}
+
+/**
+ * __mem_finish_planes() - call finish mem op for all planes of the buffer
+ */
+static void __mem_finish_planes(struct vb2_buffer *vb)
+{
+   unsigned int plane;
+
+   for (plane = 0; plane < vb->num_planes; ++plane)
+   call_void_memop(vb, finish, vb->planes[plane].mem_priv);
+}
+
+/**
  * __vb2_buf_mem_alloc() - allocate video memory for the given buffer
  */
 static int __vb2_buf_mem_alloc(struct vb2_buffer *vb)
@@ -953,20 +975,29 @@ EXPORT_SYMBOL_GPL(vb2_discard_done);
 /**
  * __prepare_mmap() - prepare an MMAP buffer
  */
-static int __prepare_mmap(struct vb2_buffer *vb, const void *pb)
+static int __prepare_mmap(struct vb2_buffer *vb, const void *pb,
+ bool no_cache_sync)
 {
-   int ret = 0;
+   int ret;

-   if (pb)
+   if (pb) {
ret = call_bufop(vb->vb2_queue, fill_vb2_buffer,
 vb, pb, vb->planes);
-   return ret ? ret : call_vb_qop(vb, buf_prepare, vb);
+   if (ret)
+   return ret;
+   }
+
+   if (!no_cache_sync)
+   __mem_prepare_planes(vb);
+
+   return call_vb_qop(vb, buf_prepare, vb);
 }

 /**
  * __prepare_userptr() - prepare a USERPTR buffer
  */
-static int __prepare_userptr(struct vb2_buffer *vb, const void *pb)
+static int __prepare_userptr(struct vb2_buffer *vb, const void *pb,
+bool no_cache_sync)
 {
struct vb2_plane planes[VB2_MAX_PLANES];
struct vb2_queue *q = vb->vb2_queue;
@@ -1056,6 +1087,11 @@ static int __prepare_userptr(struct vb2_buffer *vb, 
const void *pb)
dprintk(1, "buffer initialization failed\n");
goto err;
}
+
+   /* This is new buffer memory --- always synchronise cache. */
+   __mem_prepare_planes(vb);
+   } else if (!no_cache_sync) {
+   __mem_prepare_planes(vb);
}

ret = call_vb_qop(vb, buf_prepare, vb);
@@ -1083,7 +1119,8 @@ static int __prepare_userptr(struct vb2_buffer *vb, const 
void *pb)
 /**
  * __prepare_dmabuf() - prepare a DMABUF buffer
  */
-static int __prepare_dmabuf(struct vb2_buffer *vb, const void *pb)
+static int __prepare_dmabuf(struct vb2_buffer *vb, const void *pb,
+   bool no_cache_sync)
 {
struct vb2_plane planes[VB2_MAX_PLANES];
struct vb2_queue *q = vb->vb2_queue;
@@ -1197,6 +1234,11 @@ static int __prepare_dmabuf(struct vb2_buffer *vb, const 
void *pb)
dprintk(1, "buffer initialization failed\n");
goto err;
}
+
+   /* This is new buffer memory --- always synchronise cache. */
+   __mem_prepare_planes(vb);
+   } else if (!no_cache_sync) {
+   __mem_prepare_planes(vb);
}

ret = call_vb_qop(vb, buf_prepare, vb);
@@ -1229,10 +1271,10 @@ static void __enqueue_in_driver(struct vb2_buffer *vb)
call_void_vb_qop(vb, buf_queue, vb);
 }

-static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
+static int __buf_prepare(struct vb2_buffer *vb, const void *pb,
+bool no_cache_sync)
 {
struct vb2_queue *q = vb->vb2_queue;
-   unsigned int plane;
int ret;

if (q->error) {
@@ -1244,13 +1286,13 @@ static int __buf_prepare(struct vb2_buffer *vb, const 
void *pb)


[RFC v2 06/11] vb2: Improve struct vb2_mem_ops documentation; alloc and put are for MMAP

2016-12-16 Thread Laurent Pinchart
From: Sakari Ailus 

The alloc() and put() ops are for MMAP buffers only. Document it.

Signed-off-by: Sakari Ailus 
Acked-by: Hans Verkuil 
Reviewed-by: Laurent Pinchart 
---
Changes since v1:

- Fixed typo in documentation
---
 include/media/videobuf2-core.h | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index bfad0588bb2b..15084b90e2d5 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -46,16 +46,16 @@ struct vb2_threadio_data;

 /**
  * struct vb2_mem_ops - memory handling/memory allocator operations
- * @alloc: allocate video memory and, optionally, allocator private data,
- * return ERR_PTR() on failure or a pointer to allocator private,
- * per-buffer data on success; the returned private structure
- * will then be passed as @buf_priv argument to other ops in this
- * structure. Additional gfp_flags to use when allocating the
- * are also passed to this operation. These flags are from the
- * gfp_flags field of vb2_queue.
- * @put:   inform the allocator that the buffer will no longer be used;
- * usually will result in the allocator freeing the buffer (if
- * no other users of this buffer are present); the @buf_priv
+ * @alloc: allocate video memory for an MMAP buffer and, optionally,
+ * allocator private data, return ERR_PTR() on failure or a pointer
+ * to allocator private, per-buffer data on success; the returned
+ * private structure will then be passed as @buf_priv argument to
+ * other ops in this structure. Additional gfp_flags to use when
+ * allocating the memory are also passed to this operation. These
+ * flags are from the gfp_flags field of vb2_queue.
+ * @put:   inform the allocator that the MMAP buffer will no longer be
+ * used; usually will result in the allocator freeing the buffer
+ * (if no other users of this buffer are present); the @buf_priv
  * argument is the allocator private per-buffer structure
  * previously returned from the alloc callback.
  * @get_dmabuf: acquire userspace memory for a hardware operation; used for
-- 
Regards,

Laurent Pinchart



[RFC v2 03/11] vb2: Move cache synchronisation from buffer done to dqbuf handler

2016-12-16 Thread Laurent Pinchart
From: Sakari Ailus 

Cache synchronisation may be a time-consuming operation and is thus best not
performed in an interrupt, which is a typical context for vb2_buffer_done()
calls. It may consume up to tens of milliseconds on some machines, depending
on the buffer size.

Signed-off-by: Sakari Ailus 
---
Changes since v1:

- Don't rename the 'i' loop counter to 'plane'
---
 drivers/media/v4l2-core/videobuf2-core.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c 
b/drivers/media/v4l2-core/videobuf2-core.c
index 8ba48703b189..15a83f338072 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -889,7 +889,6 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum 
vb2_buffer_state state)
 {
struct vb2_queue *q = vb->vb2_queue;
unsigned long flags;
-   unsigned int plane;

if (WARN_ON(vb->state != VB2_BUF_STATE_ACTIVE))
return;
@@ -910,10 +909,6 @@ void vb2_buffer_done(struct vb2_buffer *vb, enum 
vb2_buffer_state state)
dprintk(4, "done processing on buffer %d, state: %d\n",
vb->index, state);

-   /* sync buffers */
-   for (plane = 0; plane < vb->num_planes; ++plane)
-   call_void_memop(vb, finish, vb->planes[plane].mem_priv);
-
spin_lock_irqsave(&q->done_lock, flags);
if (state == VB2_BUF_STATE_QUEUED ||
state == VB2_BUF_STATE_REQUEUEING) {
@@ -1571,6 +1566,10 @@ static void __vb2_dqbuf(struct vb2_buffer *vb)

vb->state = VB2_BUF_STATE_DEQUEUED;

+   /* sync buffers */
+   for (i = 0; i < vb->num_planes; ++i)
+   call_void_memop(vb, finish, vb->planes[i].mem_priv);
+
/* unmap DMABUF buffer */
if (q->memory == VB2_MEMORY_DMABUF)
for (i = 0; i < vb->num_planes; ++i) {
-- 
Regards,

Laurent Pinchart



[RFC v2 07/11] vb2: dma-contig: Remove redundant sgt_base field

2016-12-16 Thread Laurent Pinchart
From: Sakari Ailus 

The struct vb2_dc_buf contains two struct sg_table fields: sgt_base and
dma_sgt. The former is used by DMA-BUF buffers whereas the latter is used
by USERPTR.

Unify the two, leaving dma_sgt.

MMAP buffers do not need cache flushing since they have been allocated
using dma_alloc_coherent().

Signed-off-by: Sakari Ailus 
---
Changes since v1:

- Test for MMAP or DMABUF type through the vec field instead of the now
  gone vma field.
- Move the vec field to a USERPTR section in struct vb2_dc_buf, where
  the vma field was located.
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 25 +
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c 
b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index fb6a177be461..2a00d12ffee2 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -30,12 +30,13 @@ struct vb2_dc_buf {
unsigned long   attrs;
enum dma_data_direction dma_dir;
struct sg_table *dma_sgt;
-   struct frame_vector *vec;

/* MMAP related */
struct vb2_vmarea_handler   handler;
atomic_trefcount;
-   struct sg_table *sgt_base;
+
+   /* USERPTR related */
+   struct frame_vector *vec;

/* DMABUF related */
struct dma_buf_attachment   *db_attach;
@@ -95,7 +96,7 @@ static void vb2_dc_prepare(void *buf_priv)
struct sg_table *sgt = buf->dma_sgt;

/* DMABUF exporter will flush the cache for us */
-   if (!sgt || buf->db_attach)
+   if (!buf->vec)
return;

dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
@@ -108,7 +109,7 @@ static void vb2_dc_finish(void *buf_priv)
struct sg_table *sgt = buf->dma_sgt;

/* DMABUF exporter will flush the cache for us */
-   if (!sgt || buf->db_attach)
+   if (!buf->vec)
return;

dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
@@ -125,9 +126,9 @@ static void vb2_dc_put(void *buf_priv)
if (!atomic_dec_and_test(&buf->refcount))
return;

-   if (buf->sgt_base) {
-   sg_free_table(buf->sgt_base);
-   kfree(buf->sgt_base);
+   if (buf->dma_sgt) {
+   sg_free_table(buf->dma_sgt);
+   kfree(buf->dma_sgt);
}
dma_free_attrs(buf->dev, buf->size, buf->cookie, buf->dma_addr,
   buf->attrs);
@@ -239,13 +240,13 @@ static int vb2_dc_dmabuf_ops_attach(struct dma_buf *dbuf, 
struct device *dev,
/* Copy the buf->base_sgt scatter list to the attachment, as we can't
 * map the same scatter list to multiple attachments at the same time.
 */
-   ret = sg_alloc_table(sgt, buf->sgt_base->orig_nents, GFP_KERNEL);
+   ret = sg_alloc_table(sgt, buf->dma_sgt->orig_nents, GFP_KERNEL);
if (ret) {
kfree(attach);
return -ENOMEM;
}

-   rd = buf->sgt_base->sgl;
+   rd = buf->dma_sgt->sgl;
wr = sgt->sgl;
for (i = 0; i < sgt->orig_nents; ++i) {
sg_set_page(wr, sg_page(rd), rd->length, rd->offset);
@@ -396,10 +397,10 @@ static struct dma_buf *vb2_dc_get_dmabuf(void *buf_priv, 
unsigned long flags)
exp_info.flags = flags;
exp_info.priv = buf;

-   if (!buf->sgt_base)
-   buf->sgt_base = vb2_dc_get_base_sgt(buf);
+   if (!buf->dma_sgt)
+   buf->dma_sgt = vb2_dc_get_base_sgt(buf);

-   if (WARN_ON(!buf->sgt_base))
+   if (WARN_ON(!buf->dma_sgt))
return NULL;

dbuf = dma_buf_export(&exp_info);
-- 
Regards,

Laurent Pinchart



[RFC v2 09/11] vb2: dma-contig: Move vb2_dc_get_base_sgt() up

2016-12-16 Thread Laurent Pinchart
From: Sakari Ailus 

Just move the function up. It will soon be needed earlier in the file than it was previously.

Signed-off-by: Sakari Ailus 
Reviewed-by: Laurent Pinchart 
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 40 +-
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c 
b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index d59f107f0457..d503647ea522 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -62,6 +62,26 @@ static unsigned long vb2_dc_get_contiguous_size(struct 
sg_table *sgt)
return size;
 }

+static struct sg_table *vb2_dc_get_base_sgt(struct vb2_dc_buf *buf)
+{
+   int ret;
+   struct sg_table *sgt;
+
+   sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
+   if (!sgt)
+   return NULL;
+
+   ret = dma_get_sgtable_attrs(buf->dev, sgt, buf->cookie, buf->dma_addr,
+   buf->size, buf->attrs);
+   if (ret < 0) {
+   dev_err(buf->dev, "failed to get scatterlist from DMA API\n");
+   kfree(sgt);
+   return NULL;
+   }
+
+   return sgt;
+}
+
 /*/
 /* callbacks for all buffers */
 /*/
@@ -364,26 +384,6 @@ static struct dma_buf_ops vb2_dc_dmabuf_ops = {
.release = vb2_dc_dmabuf_ops_release,
 };

-static struct sg_table *vb2_dc_get_base_sgt(struct vb2_dc_buf *buf)
-{
-   int ret;
-   struct sg_table *sgt;
-
-   sgt = kmalloc(sizeof(*sgt), GFP_KERNEL);
-   if (!sgt)
-   return NULL;
-
-   ret = dma_get_sgtable_attrs(buf->dev, sgt, buf->cookie, buf->dma_addr,
-   buf->size, buf->attrs);
-   if (ret < 0) {
-   dev_err(buf->dev, "failed to get scatterlist from DMA API\n");
-   kfree(sgt);
-   return NULL;
-   }
-
-   return sgt;
-}
-
 static struct dma_buf *vb2_dc_get_dmabuf(void *buf_priv, unsigned long flags)
 {
struct vb2_dc_buf *buf = buf_priv;
-- 
Regards,

Laurent Pinchart



[RFC v2 01/11] vb2: Rename confusingly named internal buffer preparation functions

2016-12-16 Thread Laurent Pinchart
From: Sakari Ailus 

Rename __qbuf_*() functions which are specific to a buffer type as
__prepare_*() which matches with what they do. The naming was there for
historical reasons; the purpose of the functions was changed without
renaming them.

Signed-off-by: Sakari Ailus 
Acked-by: Hans Verkuil 
Reviewed-by: Laurent Pinchart 
---
 drivers/media/v4l2-core/videobuf2-core.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c 
b/drivers/media/v4l2-core/videobuf2-core.c
index 7c1d390ea438..5379c8718010 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -956,9 +956,9 @@ void vb2_discard_done(struct vb2_queue *q)
 EXPORT_SYMBOL_GPL(vb2_discard_done);

 /**
- * __qbuf_mmap() - handle qbuf of an MMAP buffer
+ * __prepare_mmap() - prepare an MMAP buffer
  */
-static int __qbuf_mmap(struct vb2_buffer *vb, const void *pb)
+static int __prepare_mmap(struct vb2_buffer *vb, const void *pb)
 {
int ret = 0;

@@ -969,9 +969,9 @@ static int __qbuf_mmap(struct vb2_buffer *vb, const void 
*pb)
 }

 /**
- * __qbuf_userptr() - handle qbuf of a USERPTR buffer
+ * __prepare_userptr() - prepare a USERPTR buffer
  */
-static int __qbuf_userptr(struct vb2_buffer *vb, const void *pb)
+static int __prepare_userptr(struct vb2_buffer *vb, const void *pb)
 {
struct vb2_plane planes[VB2_MAX_PLANES];
struct vb2_queue *q = vb->vb2_queue;
@@ -1086,9 +1086,9 @@ static int __qbuf_userptr(struct vb2_buffer *vb, const 
void *pb)
 }

 /**
- * __qbuf_dmabuf() - handle qbuf of a DMABUF buffer
+ * __prepare_dmabuf() - prepare a DMABUF buffer
  */
-static int __qbuf_dmabuf(struct vb2_buffer *vb, const void *pb)
+static int __prepare_dmabuf(struct vb2_buffer *vb, const void *pb)
 {
struct vb2_plane planes[VB2_MAX_PLANES];
struct vb2_queue *q = vb->vb2_queue;
@@ -1253,13 +1253,13 @@ static int __buf_prepare(struct vb2_buffer *vb, const 
void *pb)

switch (q->memory) {
case VB2_MEMORY_MMAP:
-   ret = __qbuf_mmap(vb, pb);
+   ret = __prepare_mmap(vb, pb);
break;
case VB2_MEMORY_USERPTR:
-   ret = __qbuf_userptr(vb, pb);
+   ret = __prepare_userptr(vb, pb);
break;
case VB2_MEMORY_DMABUF:
-   ret = __qbuf_dmabuf(vb, pb);
+   ret = __prepare_dmabuf(vb, pb);
break;
default:
WARN(1, "Invalid queue type\n");
-- 
Regards,

Laurent Pinchart



[RFC v2 11/11] vb2: dma-contig: Add WARN_ON_ONCE() to check for potential bugs

2016-12-16 Thread Laurent Pinchart
From: Sakari Ailus 

The scatterlist should always be present when the cache would need to be
flushed. Each buffer type has its own means to provide that. Add
WARN_ON_ONCE() to check that the scatterlist exists.

Signed-off-by: Sakari Ailus 
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c 
b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index a0e88ad93f07..9409f458cf89 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -122,6 +122,9 @@ static void vb2_dc_prepare(void *buf_priv)
if (!(buf->attrs & DMA_ATTR_NON_CONSISTENT))
return;

+   if (WARN_ON_ONCE(!sgt))
+   return;
+
dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
   buf->dma_dir);
 }
@@ -138,6 +141,9 @@ static void vb2_dc_finish(void *buf_priv)
if (!(buf->attrs & DMA_ATTR_NON_CONSISTENT))
return;

+   if (WARN_ON_ONCE(!sgt))
+   return;
+
dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
 }

-- 
Regards,

Laurent Pinchart



[RFC v2 10/11] vb2: dma-contig: Let drivers decide DMA attrs of MMAP and USERPTR bufs

2016-12-16 Thread Laurent Pinchart
From: Sakari Ailus 

The desirable DMA attributes are not generic for all devices using
Videobuf2 contiguous DMA ops. Let the drivers decide.

This change also results in MMAP buffers always having an sg_table
(dma_sgt field).

Also arrange the header files alphabetically.

As a result, the DMA-BUF exporter must also provide ops for synchronising
the cache. This adds begin_cpu_access and end_cpu_access ops to
vb2_dc_dmabuf_ops.
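
For illustration, a hedged driver-side sketch of opting in to this behaviour
with the series applied. The mydrv_* names are hypothetical, and it assumes
the attrs that reach vb2_dc_alloc() are taken from the queue's dma_attrs
field:

    #include <linux/dma-mapping.h>
    #include <media/videobuf2-v4l2.h>
    #include <media/videobuf2-dma-contig.h>

    static int mydrv_init_queue(struct mydrv_dev *mydev)
    {
            struct vb2_queue *q = &mydev->queue;

            q->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
            q->io_modes = VB2_MMAP | VB2_DMABUF;
            q->dev = mydev->dev;
            q->drv_priv = mydev;
            q->buf_struct_size = sizeof(struct mydrv_buffer);
            q->ops = &mydrv_vb2_ops;
            q->mem_ops = &vb2_dma_contig_memops;
            q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;

            /*
             * Request non-coherent memory: MMAP buffers then get an sg_table
             * and go through the explicit prepare/finish cache synchronisation
             * paths (and the new begin/end_cpu_access dmabuf ops) instead of
             * being treated as coherent.
             */
            q->dma_attrs = DMA_ATTR_NON_CONSISTENT;

            return vb2_queue_init(q);
    }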

Signed-off-by: Sakari Ailus 
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 66 ++
 1 file changed, 56 insertions(+), 10 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c 
b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index d503647ea522..a0e88ad93f07 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -11,11 +11,11 @@
  */

 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
-#include 

 #include 
 #include 
@@ -115,8 +115,11 @@ static void vb2_dc_prepare(void *buf_priv)
struct vb2_dc_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;

-   /* DMABUF exporter will flush the cache for us */
-   if (!buf->vec)
+   /*
+* DMABUF exporter will flush the cache for us; only USERPTR
+* and MMAP buffers with non-coherent memory will be flushed.
+*/
+   if (!(buf->attrs & DMA_ATTR_NON_CONSISTENT))
return;

dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
@@ -128,8 +131,11 @@ static void vb2_dc_finish(void *buf_priv)
struct vb2_dc_buf *buf = buf_priv;
struct sg_table *sgt = buf->dma_sgt;

-   /* DMABUF exporter will flush the cache for us */
-   if (!buf->vec)
+   /*
+* DMABUF exporter will flush the cache for us; only USERPTR
+* and MMAP buffers with non-coherent memory will be flushed.
+*/
+   if (!(buf->attrs & DMA_ATTR_NON_CONSISTENT))
return;

dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
@@ -172,13 +178,22 @@ static void *vb2_dc_alloc(struct device *dev, unsigned 
long attrs,
if (attrs)
buf->attrs = attrs;
buf->cookie = dma_alloc_attrs(dev, size, &buf->dma_addr,
-   GFP_KERNEL | gfp_flags, buf->attrs);
+GFP_KERNEL | gfp_flags, buf->attrs);
if (!buf->cookie) {
-   dev_err(dev, "dma_alloc_coherent of size %ld failed\n", size);
+   dev_err(dev, "dma_alloc_attrs of size %ld failed\n", size);
kfree(buf);
return ERR_PTR(-ENOMEM);
}

+   if (buf->attrs & DMA_ATTR_NON_CONSISTENT) {
+   buf->dma_sgt = vb2_dc_get_base_sgt(buf);
+   if (!buf->dma_sgt) {
+   dma_free_attrs(dev, size, buf->cookie, buf->dma_addr,
+  buf->attrs);
+   return ERR_PTR(-ENOMEM);
+   }
+   }
+
if ((buf->attrs & DMA_ATTR_NO_KERNEL_MAPPING) == 0)
buf->vaddr = buf->cookie;

@@ -359,6 +374,34 @@ static void *vb2_dc_dmabuf_ops_kmap(struct dma_buf *dbuf, 
unsigned long pgnum)
return buf->vaddr ? buf->vaddr + pgnum * PAGE_SIZE : NULL;
 }

+static int vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
+ enum dma_data_direction direction)
+{
+   struct vb2_dc_buf *buf = dbuf->priv;
+   struct sg_table *sgt = buf->dma_sgt;
+
+   if (!(buf->attrs & DMA_ATTR_NON_CONSISTENT))
+   return 0;
+
+   dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
+
+   return 0;
+}
+
+static int vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
+   enum dma_data_direction direction)
+{
+   struct vb2_dc_buf *buf = dbuf->priv;
+   struct sg_table *sgt = buf->dma_sgt;
+
+   if (!(buf->attrs & DMA_ATTR_NON_CONSISTENT))
+   return 0;
+
+   dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
+
+   return 0;
+}
+
 static void *vb2_dc_dmabuf_ops_vmap(struct dma_buf *dbuf)
 {
struct vb2_dc_buf *buf = dbuf->priv;
@@ -379,6 +422,8 @@ static struct dma_buf_ops vb2_dc_dmabuf_ops = {
.unmap_dma_buf = vb2_dc_dmabuf_ops_unmap,
.kmap = vb2_dc_dmabuf_ops_kmap,
.kmap_atomic = vb2_dc_dmabuf_ops_kmap,
+   .begin_cpu_access = vb2_dc_dmabuf_ops_begin_cpu_access,
+   .end_cpu_access = vb2_dc_dmabuf_ops_end_cpu_access,
.vmap = vb2_dc_dmabuf_ops_vmap,
.mmap = vb2_dc_dmabuf_ops_mmap,
.release = vb2_dc_dmabuf_ops_release,
@@ -424,11 +469,12 @@ static void vb2_dc_put_userptr(void *buf_priv)

if (sgt) {
/*
-* No need to sync to CPU, it's already synced to the CPU
-* since the finish() memop will have been called before th

[RFC v2 02/11] vb2: Move buffer cache synchronisation to prepare from queue

2016-12-16 Thread Laurent Pinchart
From: Sakari Ailus 

The buffer cache should be synchronised in buffer preparation, not when
the buffer is queued to the device. Fix this.

Mmap buffers do not need cache synchronisation since they are always
coherent.

Signed-off-by: Sakari Ailus 
---
 drivers/media/v4l2-core/videobuf2-core.c | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-core.c 
b/drivers/media/v4l2-core/videobuf2-core.c
index 5379c8718010..8ba48703b189 100644
--- a/drivers/media/v4l2-core/videobuf2-core.c
+++ b/drivers/media/v4l2-core/videobuf2-core.c
@@ -1225,23 +1225,19 @@ static int __prepare_dmabuf(struct vb2_buffer *vb, 
const void *pb)
 static void __enqueue_in_driver(struct vb2_buffer *vb)
 {
struct vb2_queue *q = vb->vb2_queue;
-   unsigned int plane;

vb->state = VB2_BUF_STATE_ACTIVE;
atomic_inc(&q->owned_by_drv_count);

trace_vb2_buf_queue(q, vb);

-   /* sync buffers */
-   for (plane = 0; plane < vb->num_planes; ++plane)
-   call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
-
call_void_vb_qop(vb, buf_queue, vb);
 }

 static int __buf_prepare(struct vb2_buffer *vb, const void *pb)
 {
struct vb2_queue *q = vb->vb2_queue;
+   unsigned int plane;
int ret;

if (q->error) {
@@ -1266,11 +1262,19 @@ static int __buf_prepare(struct vb2_buffer *vb, const 
void *pb)
ret = -EINVAL;
}

-   if (ret)
+   if (ret) {
dprintk(1, "buffer preparation failed: %d\n", ret);
-   vb->state = ret ? VB2_BUF_STATE_DEQUEUED : VB2_BUF_STATE_PREPARED;
+   vb->state = VB2_BUF_STATE_DEQUEUED;
+   return ret;
+   }

-   return ret;
+   /* sync buffers */
+   for (plane = 0; plane < vb->num_planes; ++plane)
+   call_void_memop(vb, prepare, vb->planes[plane].mem_priv);
+
+   vb->state = VB2_BUF_STATE_PREPARED;
+
+   return 0;
 }

 int vb2_core_prepare_buf(struct vb2_queue *q, unsigned int index, void *pb)
-- 
Regards,

Laurent Pinchart



[PATCH V2] drm/i915: relax uncritical udelay_range()

2016-12-16 Thread Nicholas Mc Guire
usleep_range(1, 2) is inefficient and, as discussions with Jani Nikula showed,
unnecessary here. This replaces the tight setting with a relaxed delay of
min=20 and max=50, which helps the hrtimer subsystem optimize timer handling.

Fixes: commit be4fc046bed3 ("drm/i915: add VLV DSI PLL Calculations") 
Link: http://lkml.org/lkml/2016/12/15/147
Signed-off-by: Nicholas Mc Guire 
---

V2: use relaxed usleep_range() rather than udelay
fix documentation of changed timings

Problem found by coccinelle:

Patch was compile tested with: x86_64_defconfig (implies CONFIG_DRM_I915)

Patch is against 4.9.0 (localversion-next is next-20161215)

 drivers/gpu/drm/i915/intel_dsi_pll.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dsi_pll.c 
b/drivers/gpu/drm/i915/intel_dsi_pll.c
index 56eff60..d210bc4 100644
--- a/drivers/gpu/drm/i915/intel_dsi_pll.c
+++ b/drivers/gpu/drm/i915/intel_dsi_pll.c
@@ -156,8 +156,10 @@ static void vlv_enable_dsi_pll(struct intel_encoder 
*encoder,
vlv_cck_write(dev_priv, CCK_REG_DSI_PLL_CONTROL,
  config->dsi_pll.ctrl & ~DSI_PLL_VCO_EN);

-   /* wait at least 0.5 us after ungating before enabling VCO */
-   usleep_range(1, 10);
+   /* wait at least 0.5 us after ungating before enabling VCO,
+* allow hrtimer subsystem optimization by relaxing timing
+*/
+   usleep_range(10, 50);

vlv_cck_write(dev_priv, CCK_REG_DSI_PLL_CONTROL, config->dsi_pll.ctrl);

-- 
2.1.4



[PATCH V2] drm/i915: relax uncritical udelay_range() settings

2016-12-16 Thread Nicholas Mc Guire
usleep_range(2, 3) is inefficient and, as discussions with Jani Nikula showed,
unnecessary here. This replaces the tight setting with a relaxed delay of
min=20 and max=50, which helps the hrtimer subsystem optimize timer handling.

Link: http://lkml.org/lkml/2016/12/15/127
Fixes: commit 37ab0810c9b7 ("drm/i915/bxt: DSI enable for BXT")
Signed-off-by: Nicholas Mc Guire 
---

V2: use relaxed usleep_range() rather than udelay
fix documentation of changed timings

Problem found by coccinelle:

Patch was compile tested with: x86_64_defconfig (implies CONFIG_DRM_I915)

Patch is against 4.9.0 (localversion-next is next-20161215)

 drivers/gpu/drm/i915/intel_dsi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_dsi.c b/drivers/gpu/drm/i915/intel_dsi.c
index 5b72c50..92b96fa 100644
--- a/drivers/gpu/drm/i915/intel_dsi.c
+++ b/drivers/gpu/drm/i915/intel_dsi.c
@@ -379,7 +379,8 @@ static void bxt_dsi_device_ready(struct intel_encoder 
*encoder)
val &= ~ULPS_STATE_MASK;
val |= (ULPS_STATE_ENTER | DEVICE_READY);
I915_WRITE(MIPI_DEVICE_READY(port), val);
-   usleep_range(2, 3);
+   /* at least 2us - relaxed for hrtimer subsystem optimization */
+   usleep_range(10, 50);

/* 3. Exit ULPS */
val = I915_READ(MIPI_DEVICE_READY(port));
-- 
2.1.4



[PATCH 3/5] drm/amd/amdgpu: add amdgpu_bo_gpu_accessible helper function

2016-12-16 Thread zhoucm1


On 2016-12-16 01:10, Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
>
> Signed-off-by: Nicolai Hähnle 
Reviewed-by: Chunming Zhou 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 9 +
>   1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 4306b2f..15a723a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -107,20 +107,29 @@ static inline unsigned 
> amdgpu_bo_gpu_page_alignment(struct amdgpu_bo *bo)
>* amdgpu_bo_mmap_offset - return mmap offset of bo
>* @bo: amdgpu object for which we query the offset
>*
>* Returns mmap offset of the object.
>*/
>   static inline u64 amdgpu_bo_mmap_offset(struct amdgpu_bo *bo)
>   {
>   return drm_vma_node_offset_addr(&bo->tbo.vma_node);
>   }
>   
> +/**
> + * amdgpu_bo_gpu_accessible - return whether the bo is currently in memory 
> that
> + * is accessible to the GPU.
> + */
> +static inline bool amdgpu_bo_gpu_accessible(struct amdgpu_bo *bo)
> +{
> + return bo->tbo.mem.mem_type != TTM_PL_SYSTEM;
> +}
> +
>   int amdgpu_bo_create(struct amdgpu_device *adev,
>   unsigned long size, int byte_align,
>   bool kernel, u32 domain, u64 flags,
>   struct sg_table *sg,
>   struct reservation_object *resv,
>   struct amdgpu_bo **bo_ptr);
>   int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
>   unsigned long size, int byte_align,
>   bool kernel, u32 domain, u64 flags,
>   struct sg_table *sg,



[PATCH 2/5] drm/amd/amdgpu: move eviction counting to amdgpu_bo_move_notify

2016-12-16 Thread zhoucm1


On 2016-12-16 01:10, Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
>
> This catches evictions of shadow page tables from the GART. Since shadow
> page tables are always stored in system memory, amdgpu_bo_move is never
> called for them.
>
> This fixes a crash during command submission that occurs when only a shadow
> page table and no other BOs were evicted since the last submission.
>
> Fixes: 1baa439fb2f4e586 ("drm/amdgpu: allocate shadow for pd/pt bo V2")
> Signed-off-by: Nicolai Hähnle 
Acked-by: Chunming Zhou 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 4 
>   2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index c29db99..d94cdef 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -855,20 +855,24 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
>   struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
>   struct amdgpu_bo *abo;
>   struct ttm_mem_reg *old_mem = &bo->mem;
>   
>   if (!amdgpu_ttm_bo_is_amdgpu_bo(bo))
>   return;
>   
>   abo = container_of(bo, struct amdgpu_bo, tbo);
>   amdgpu_vm_bo_invalidate(adev, abo);
>   
> + /* remember the eviction */
> + if (evict)
> + atomic64_inc(&adev->num_evictions);
> +
>   /* update statistics */
>   if (!new_mem)
>   return;
>   
>   /* move_notify is called before move happens */
>   amdgpu_update_memory_usage(adev, &bo->mem, new_mem);
>   
>   trace_amdgpu_ttm_bo_move(abo, new_mem->mem_type, old_mem->mem_type);
>   }
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 8f18b8e..80924c2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -460,24 +460,20 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo,
>   struct ttm_mem_reg *old_mem = &bo->mem;
>   int r;
>   
>   /* Can't move a pinned BO */
>   abo = container_of(bo, struct amdgpu_bo, tbo);
>   if (WARN_ON_ONCE(abo->pin_count > 0))
>   return -EINVAL;
>   
>   adev = amdgpu_ttm_adev(bo->bdev);
>   
> - /* remember the eviction */
> - if (evict)
> - atomic64_inc(&adev->num_evictions);
> -
>   if (old_mem->mem_type == TTM_PL_SYSTEM && bo->ttm == NULL) {
>   amdgpu_move_null(bo, new_mem);
>   return 0;
>   }
>   if ((old_mem->mem_type == TTM_PL_TT &&
>new_mem->mem_type == TTM_PL_SYSTEM) ||
>   (old_mem->mem_type == TTM_PL_SYSTEM &&
>new_mem->mem_type == TTM_PL_TT)) {
>   /* bind is enough */
>   amdgpu_move_null(bo, new_mem);



[PATCH v2] drm/amd/amdgpu: add check that shadow page tables are GPU-accessible

2016-12-16 Thread zhoucm1


On 2016-12-16 01:59, Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
>
> Skip amdgpu_gem_va_update_vm otherwise. Also clean up the check for the
> non-shadow page tables using the new helper function.
>
> This fixes a crash with the stack trace:
>
> amdgpu_gem_va_update_vm
> -> amdgpu_vm_update_page_directory
>   -> amdgpu_ttm_bind
>-> amdgpu_gtt_mgr_alloc
>
> v2: actually check bo->shadow instead of just checking bo twice
>
> Signed-off-by: Nicolai Hähnle 
Reviewed-by: Chunming Zhou 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 9 ++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 4e1eb05..9bd1b4e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -464,26 +464,29 @@ int amdgpu_gem_metadata_ioctl(struct drm_device *dev, 
> void *data,
>   
>   unreserve:
>   amdgpu_bo_unreserve(robj);
>   out:
>   drm_gem_object_unreference_unlocked(gobj);
>   return r;
>   }
>   
>   static int amdgpu_gem_va_check(void *param, struct amdgpu_bo *bo)
>   {
> - unsigned domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
> -
>   /* if anything is swapped out don't swap it in here,
>  just abort and wait for the next CS */
> + if (!amdgpu_bo_gpu_accessible(bo))
> + return -ERESTARTSYS;
> +
> + if (bo->shadow && !amdgpu_bo_gpu_accessible(bo->shadow))
> + return -ERESTARTSYS;
>   
> - return domain == AMDGPU_GEM_DOMAIN_CPU ? -ERESTARTSYS : 0;
> + return 0;
>   }
>   
>   /**
>* amdgpu_gem_va_update_vm -update the bo_va in its VM
>*
>* @adev: amdgpu_device pointer
>* @bo_va: bo_va to update
>*
>* Update the bo_va directly after setting it's address. Errors are not
>* vital here, so they are not reported back to userspace.



[PATCH 4/5] drm/amd/amdgpu: add check that shadow page directory is GPU-accessible

2016-12-16 Thread zhoucm1


On 2016-12-16 01:10, Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
>
> Skip amdgpu_gem_va_update_vm when the shadow page directory is swapped out.
> Clean up the check for non-shadow BOs as well using the new helper function.
>
> This fixes a crash with the stack trace:
>
> amdgpu_gem_va_update_vm
> -> amdgpu_vm_update_page_directory
>   -> amdgpu_ttm_bind
>-> amdgpu_gtt_mgr_alloc
>
> Signed-off-by: Nicolai Hähnle 
Reviewed-by: Chunming Zhou 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 11 ---
>   1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index cd62f6f..4e1eb05 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -489,44 +489,49 @@ static int amdgpu_gem_va_check(void *param, struct 
> amdgpu_bo *bo)
>* vital here, so they are not reported back to userspace.
>*/
>   static void amdgpu_gem_va_update_vm(struct amdgpu_device *adev,
>   struct amdgpu_bo_va *bo_va,
>   uint32_t operation)
>   {
>   struct ttm_validate_buffer tv, *entry;
>   struct amdgpu_bo_list_entry vm_pd;
>   struct ww_acquire_ctx ticket;
>   struct list_head list, duplicates;
> - unsigned domain;
>   int r;
>   
>   INIT_LIST_HEAD(&list);
>   INIT_LIST_HEAD(&duplicates);
>   
>   tv.bo = &bo_va->bo->tbo;
>   tv.shared = true;
>   list_add(&tv.head, &list);
>   
>   amdgpu_vm_get_pd_bo(bo_va->vm, &list, &vm_pd);
>   
>   /* Provide duplicates to avoid -EALREADY */
>   r = ttm_eu_reserve_buffers(&ticket, &list, true, &duplicates);
>   if (r)
>   goto error_print;
>   
>   list_for_each_entry(entry, &list, head) {
> - domain = amdgpu_mem_type_to_domain(entry->bo->mem.mem_type);
> + struct amdgpu_bo *bo =
> + container_of(entry->bo, struct amdgpu_bo, tbo);
> +
>   /* if anything is swapped out don't swap it in here,
>  just abort and wait for the next CS */
> - if (domain == AMDGPU_GEM_DOMAIN_CPU)
> + if (!amdgpu_bo_gpu_accessible(bo))
> + goto error_unreserve;
> +
> + if (bo->shadow && !amdgpu_bo_gpu_accessible(bo->shadow))
>   goto error_unreserve;
>   }
> +
>   r = amdgpu_vm_validate_pt_bos(adev, bo_va->vm, amdgpu_gem_va_check,
> NULL);
>   if (r)
>   goto error_unreserve;
>   
>   r = amdgpu_vm_update_page_directory(adev, bo_va->vm);
>   if (r)
>   goto error_unreserve;
>   
>   r = amdgpu_vm_clear_freed(adev, bo_va->vm);



[PATCH 1/5] drm/ttm: add evict parameter to ttm_bo_driver::move_notify

2016-12-16 Thread zhoucm1


On 2016-12-16 01:10, Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
>
> Ensure that the driver can listen to evictions even when they don't take the
> path through ttm_bo_driver::move.
>
> This is crucial for amdgpu, which relies on an eviction counter to skip
> re-binding page tables when possible.
>
> Signed-off-by: Nicolai Hähnle 
Acked-by: Chunming Zhou 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  3 ++-
>   drivers/gpu/drm/nouveau/nouveau_bo.c   |  3 ++-
>   drivers/gpu/drm/qxl/qxl_ttm.c  |  1 +
>   drivers/gpu/drm/radeon/radeon_object.c |  1 +
>   drivers/gpu/drm/radeon/radeon_object.h |  1 +
>   drivers/gpu/drm/ttm/ttm_bo.c   |  8 
>   drivers/gpu/drm/virtio/virtgpu_ttm.c   |  1 +
>   drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c |  1 +
>   include/drm/ttm/ttm_bo_driver.h| 10 --
>   10 files changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index bf79b73..c29db99 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -842,20 +842,21 @@ int amdgpu_bo_get_metadata(struct amdgpu_bo *bo, void 
> *buffer,
>   
>   if (metadata_size)
>   *metadata_size = bo->metadata_size;
>   if (flags)
>   *flags = bo->metadata_flags;
>   
>   return 0;
>   }
>   
>   void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
> +bool evict,
>  struct ttm_mem_reg *new_mem)
>   {
>   struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
>   struct amdgpu_bo *abo;
>   struct ttm_mem_reg *old_mem = &bo->mem;
>   
>   if (!amdgpu_ttm_bo_is_amdgpu_bo(bo))
>   return;
>   
>   abo = container_of(bo, struct amdgpu_bo, tbo);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 5cbf59e..4306b2f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -148,21 +148,22 @@ void amdgpu_bo_fini(struct amdgpu_device *adev);
>   int amdgpu_bo_fbdev_mmap(struct amdgpu_bo *bo,
>   struct vm_area_struct *vma);
>   int amdgpu_bo_set_tiling_flags(struct amdgpu_bo *bo, u64 tiling_flags);
>   void amdgpu_bo_get_tiling_flags(struct amdgpu_bo *bo, u64 *tiling_flags);
>   int amdgpu_bo_set_metadata (struct amdgpu_bo *bo, void *metadata,
>   uint32_t metadata_size, uint64_t flags);
>   int amdgpu_bo_get_metadata(struct amdgpu_bo *bo, void *buffer,
>  size_t buffer_size, uint32_t *metadata_size,
>  uint64_t *flags);
>   void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
> -   struct ttm_mem_reg *new_mem);
> +bool evict,
> +struct ttm_mem_reg *new_mem);
>   int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo);
>   void amdgpu_bo_fence(struct amdgpu_bo *bo, struct dma_fence *fence,
>bool shared);
>   u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo);
>   int amdgpu_bo_backup_to_shadow(struct amdgpu_device *adev,
>  struct amdgpu_ring *ring,
>  struct amdgpu_bo *bo,
>  struct reservation_object *resv,
>  struct dma_fence **fence, bool direct);
>   int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
> b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index e0c0007..6fa1521 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -1187,21 +1187,22 @@ nouveau_bo_move_flips(struct ttm_buffer_object *bo, 
> bool evict, bool intr,
>   ret = nouveau_bo_move_m2mf(bo, true, intr, no_wait_gpu, new_mem);
>   if (ret)
>   goto out;
>   
>   out:
>   ttm_bo_mem_put(bo, &tmp_mem);
>   return ret;
>   }
>   
>   static void
> -nouveau_bo_move_ntfy(struct ttm_buffer_object *bo, struct ttm_mem_reg 
> *new_mem)
> +nouveau_bo_move_ntfy(struct ttm_buffer_object *bo, bool evict,
> +  struct ttm_mem_reg *new_mem)
>   {
>   struct nouveau_bo *nvbo = nouveau_bo(bo);
>   struct nvkm_vma *vma;
>   
>   /* ttm can now (stupidly) pass the driver bos it didn't create... */
>   if (bo->destroy != nouveau_bo_del_ttm)
>   return;
>   
>   list_for_each_entry(vma, &nvbo->vma_list, head) {
>   if (new_mem && new_mem->mem_type != TTM_PL_SYSTEM &&
> diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c b/drivers/gpu/drm/qxl/qxl_ttm.c
> index 1176133..f3939a9 100644
> --- a/drivers/gpu/drm/qxl/qxl_ttm.c
> +++ b/drivers/gpu/drm/qxl/qxl_ttm.c
> @@ -360,20 +360,21 @@ static in

[PATCH 5/5] drm/amd/amdgpu: add check that shadow page tables are GPU-accessible

2016-12-16 Thread zhoucm1


On 2016-12-16 01:10, Nicolai Hähnle wrote:
> From: Nicolai Hähnle 
>
> Skip amdgpu_gem_va_update_vm otherwise. Also clean up the check for the
> non-shadow page tables using the new helper function.
>
> This fixes a crash with the stack trace:
>
> amdgpu_gem_va_update_vm
> -> amdgpu_vm_update_page_directory
>   -> amdgpu_ttm_bind
>-> amdgpu_gtt_mgr_alloc
>
> Signed-off-by: Nicolai Hähnle 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 9 ++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 4e1eb05..d91c80b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -464,26 +464,29 @@ int amdgpu_gem_metadata_ioctl(struct drm_device *dev, 
> void *data,
>   
>   unreserve:
>   amdgpu_bo_unreserve(robj);
>   out:
>   drm_gem_object_unreference_unlocked(gobj);
>   return r;
>   }
>   
>   static int amdgpu_gem_va_check(void *param, struct amdgpu_bo *bo)
>   {
> - unsigned domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
> -
>   /* if anything is swapped out don't swap it in here,
>  just abort and wait for the next CS */
> + if (!amdgpu_bo_gpu_accessible(bo))
> + return -ERESTARTSYS;
> +
> + if (bo->shadow && !amdgpu_bo_gpu_accessible(bo))

this should be 'if (bo->shadow && !amdgpu_bo_gpu_accessible(bo->shadow))'.

Regards,
David Zhou

> + return -ERESTARTSYS;
>   
> - return domain == AMDGPU_GEM_DOMAIN_CPU ? -ERESTARTSYS : 0;
> + return 0;
>   }
>   
>   /**
>* amdgpu_gem_va_update_vm -update the bo_va in its VM
>*
>* @adev: amdgpu_device pointer
>* @bo_va: bo_va to update
>*
>* Update the bo_va directly after setting it's address. Errors are not
>* vital here, so they are not reported back to userspace.



[Intel-gfx] [PATCH] drm: Convert all helpers to drm_connector_list_iter

2016-12-16 Thread Daniel Vetter
Hi Kbuild folks

So yeah this doesn't apply because it's just 1 patch resent out of a
big patch series, in-reply-to the patch it replaces. So applying this
alone and telling me (and all the mailing lists) that it doesn't apply
isn't all that useful.

And it shouldn't be too hard to detect this, since the fdo patchwork
instance does catch most of these partial resends successfully and
correctly, and will retest the entire patch series.
-Daniel


On Thu, Dec 15, 2016 at 11:59 PM, kbuild test robot  wrote:
> Hi Daniel,
>
> [auto build test ERROR on drm/drm-next]
> [also build test ERROR on next-20161215]
> [cannot apply to v4.9]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
>
> url:
> https://github.com/0day-ci/linux/commits/Daniel-Vetter/drm-Convert-all-helpers-to-drm_connector_list_iter/20161216-061508
> base:   git://people.freedesktop.org/~airlied/linux.git drm-next
> config: i386-randconfig-x003-201650 (attached as .config)
> compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=i386
>
> All error/warnings (new ones prefixed by >>):
>
>drivers/gpu/drm/drm_crtc_helper.c: In function 'drm_helper_encoder_in_use':
>>> drivers/gpu/drm/drm_crtc_helper.c:91:33: error: storage size of 'conn_iter' 
>>> isn't known
>  struct drm_connector_list_iter conn_iter;
> ^
>>> drivers/gpu/drm/drm_crtc_helper.c:104:2: error: implicit declaration of 
>>> function 'drm_connector_list_iter_get' 
>>> [-Werror=implicit-function-declaration]
>  drm_connector_list_iter_get(dev, &conn_iter);
>  ^~~
>>> drivers/gpu/drm/drm_crtc_helper.c:105:2: error: implicit declaration of 
>>> function 'drm_for_each_connector_iter' 
>>> [-Werror=implicit-function-declaration]
>  drm_for_each_connector_iter(connector, &conn_iter) {
>  ^~~
>>> drivers/gpu/drm/drm_crtc_helper.c:105:53: error: expected ';' before '{' 
>>> token
>  drm_for_each_connector_iter(connector, &conn_iter) {
> ^
>drivers/gpu/drm/drm_crtc_helper.c:91:33: warning: unused variable 
> 'conn_iter' [-Wunused-variable]
>  struct drm_connector_list_iter conn_iter;
> ^
>drivers/gpu/drm/drm_crtc_helper.c: In function 'drm_crtc_helper_disable':
>drivers/gpu/drm/drm_crtc_helper.c:446:34: error: storage size of 
> 'conn_iter' isn't known
>   struct drm_connector_list_iter conn_iter;
>  ^
>drivers/gpu/drm/drm_crtc_helper.c:452:54: error: expected ';' before '{' 
> token
>   drm_for_each_connector_iter(connector, &conn_iter) {
>  ^
>drivers/gpu/drm/drm_crtc_helper.c:446:34: warning: unused variable 
> 'conn_iter' [-Wunused-variable]
>   struct drm_connector_list_iter conn_iter;
>  ^
>drivers/gpu/drm/drm_crtc_helper.c: In function 
> 'drm_crtc_helper_set_config':
>drivers/gpu/drm/drm_crtc_helper.c:521:33: error: storage size of 
> 'conn_iter' isn't known
>  struct drm_connector_list_iter conn_iter;
> ^
>>> drivers/gpu/drm/drm_crtc_helper.c:588:3: error: expected ';' before 
>>> 'save_connector_encoders'
>   save_connector_encoders[count++] = connector->encoder;
>   ^~~
>>> drivers/gpu/drm/drm_crtc_helper.c:589:2: error: implicit declaration of 
>>> function 'drm_connector_list_iter_put' 
>>> [-Werror=implicit-function-declaration]
>  drm_connector_list_iter_put(&conn_iter);
>  ^~~
>drivers/gpu/drm/drm_crtc_helper.c:633:53: error: expected ';' before '{' 
> token
>  drm_for_each_connector_iter(connector, &conn_iter) {
> ^
>drivers/gpu/drm/drm_crtc_helper.c:675:53: error: expected ';' before '{' 
> token
>  drm_for_each_connector_iter(connector, &conn_iter) {
> ^
>>> drivers/gpu/drm/drm_crtc_helper.c:767:3: error: expected ';' before 
>>> 'connector'
>   conne

[PATCH] kref: prefer atomic_inc_not_zero to atomic_add_unless

2016-12-16 Thread Daniel Vetter
On Thu, Dec 15, 2016 at 06:01:10AM +0100, Jason A. Donenfeld wrote:
> On most platforms, there exists this ifdef:
> 
>  #define atomic_inc_not_zero(v) atomic_add_unless((v), 1, 0)
> 
> This makes this patch functionally useless. However, on PPC, there is
> actually an explicit definition of atomic_inc_not_zero with its own
> assembly that is slightly more optimized than atomic_add_unless. So,
> this patch changes kref to use atomic_inc_not_zero instead, for PPC and
> any future platforms that might provide an explicit implementation.
> 
> This also puts this usage of kref more in line with a verbatim reading
> of the examples in Paul McKenney's paper [1] in the section titled "2.4
> Atomic Counting With Check and Release Memory Barrier", which uses
> atomic_inc_not_zero.
> 
> [1] http://open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2167.pdf
> 
> Signed-off-by: Jason A. Donenfeld 
> Reviewed-by: Thomas Hellstrom 

Applied to drm-misc, but for 4.11 since 4.10 is bugfixes only already.

Thanks, Daniel

> ---
> This was reviewed favorably 14 months ago but never picked up.
> I'm resubmitting it now in hopes that you can finally queue it
> up for 4.10.
> 
>  include/linux/kref.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/kref.h b/include/linux/kref.h
> index e15828fd71f1..62f0a84ae94e 100644
> --- a/include/linux/kref.h
> +++ b/include/linux/kref.h
> @@ -133,6 +133,6 @@ static inline int kref_put_mutex(struct kref *kref,
>   */
>  static inline int __must_check kref_get_unless_zero(struct kref *kref)
>  {
> - return atomic_add_unless(&kref->refcount, 1, 0);
> + return atomic_inc_not_zero(&kref->refcount);
>  }
>  #endif /* _KREF_H_ */
> -- 
> 2.11.0
> 
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
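
As a usage note, kref_get_unless_zero() is the lookup-side primitive of the pattern in McKenney's paper: take a reference only while the count is still non-zero. A minimal sketch with made-up object and list names:

/* Hypothetical lookup under a lock or RCU: only return the object if a
 * reference could be taken before the count dropped to zero.
 */
struct foo {
        struct kref refcount;
        struct list_head node;
};

static struct foo *foo_find_get(struct list_head *head)
{
        struct foo *f;

        list_for_each_entry(f, head, node) {
                if (kref_get_unless_zero(&f->refcount))
                        return f;       /* reference successfully taken */
        }

        return NULL;    /* every candidate was already being torn down */
}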


[PATCH] kref: prefer atomic_inc_not_zero to atomic_add_unless

2016-12-16 Thread Daniel Vetter
On Thu, Dec 15, 2016 at 11:10:49AM -0800, Greg KH wrote:
> On Thu, Dec 15, 2016 at 07:55:54PM +0100, Jason A. Donenfeld wrote:
> > On most platforms, there exists this ifdef:
> > 
> >  #define atomic_inc_not_zero(v) atomic_add_unless((v), 1, 0)
> > 
> > This makes this patch functionally useless. However, on PPC, there is
> > actually an explicit definition of atomic_inc_not_zero with its own
> > assembly that is slightly more optimized than atomic_add_unless. So,
> > this patch changes kref to use atomic_inc_not_zero instead, for PPC and
> > any future platforms that might provide an explicit implementation.
> > 
> > This also puts this usage of kref more in line with a verbatim reading
> > of the examples in Paul McKenney's paper [1] in the section titled "2.4
> > Atomic Counting With Check and Release Memory Barrier", which uses
> > atomic_inc_not_zero.
> > 
> > [1] http://open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2167.pdf
> > 
> > Signed-off-by: Jason A. Donenfeld 
> > Reviewed-by: Thomas Hellstrom 
> > Reviewed-by: Christoph Hellwig 
> > ---
> > Sorry to submit this again, but people keep reviewing it saying it's fine,
> > but then point to somebody else to actually merge this. At the end of the
> > chain of fingerpointing is usually Greg. "Just have Greg do it." At this
> > point I'm confused, but it's certainly been sufficiently reviewed and
> > accepted. So can one of you just respond saying "I'll take it!"
> 
> Well, the crazies over in drm land were the ones that merged this new
> api, so they should be the ones responsible for it.  But that was way
> back in 2012, odds are they don't remember it given the lunacy that is
> their subsystem...

We do, it's just that I couldn't find Jason's patch when Thomas reviewed
it and asked for a resend, and it took Jason a while to do that ...

Maybe we even remember this api way too well, we're constantly adding new
users of it in drm ;-)

> I'll take it after 4.10-rc1 is out, thanks.

Oh, here's another resubmission of this patch. I've already applied this
to my 4.11 queue, will show up in linux-next as soon as -rc1 is out.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[kbuild-all] [Intel-gfx] [PATCH] drm: Convert all helpers to drm_connector_list_iter

2016-12-16 Thread Fengguang Wu
Hi Daniel,

On Fri, Dec 16, 2016 at 08:29:43AM +0100, Daniel Vetter wrote:
>Hi Kbuild folks
>
>So yeah this doesn't apply because it's just 1 patch resent out of a
>big patch series, in-reply-to the patch it replaces. So applying this
>alone and telling me (and all the mailing lists) that it doesn't apply
>isn't all that useful.
>
>And it shouldn't be too hard to detect this, since the fdo patchwork
>instance does catch most of these partial resends successfully and
>correctly, and will retest the entire patch series.

Good point! CC Xiaolong. This scenario seems to happen frequently enough
on LKML to be worth the effort of adding auto-detect logic for.

Thanks,
Fengguang

>On Thu, Dec 15, 2016 at 11:59 PM, kbuild test robot  wrote:
>> Hi Daniel,
>>
>> [auto build test ERROR on drm/drm-next]
>> [also build test ERROR on next-20161215]
>> [cannot apply to v4.9]
>> [if your patch is applied to the wrong git tree, please drop us a note to 
>> help improve the system]
>>
>> url:    
>> https://github.com/0day-ci/linux/commits/Daniel-Vetter/drm-Convert-all-helpers-to-drm_connector_list_iter/20161216-061508
>> base:   git://people.freedesktop.org/~airlied/linux.git drm-next
>> config: i386-randconfig-x003-201650 (attached as .config)
>> compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
>> reproduce:
>> # save the attached .config to linux build tree
>> make ARCH=i386
>>
>> All error/warnings (new ones prefixed by >>):
>>
>>drivers/gpu/drm/drm_crtc_helper.c: In function 
>> 'drm_helper_encoder_in_use':
>>>> drivers/gpu/drm/drm_crtc_helper.c:91:33: error: storage size of 
>>>> 'conn_iter' isn't known
>>  struct drm_connector_list_iter conn_iter;
>> ^
>>>> drivers/gpu/drm/drm_crtc_helper.c:104:2: error: implicit declaration of 
>>>> function 'drm_connector_list_iter_get' 
>>>> [-Werror=implicit-function-declaration]
>>  drm_connector_list_iter_get(dev, &conn_iter);
>>  ^~~
>>>> drivers/gpu/drm/drm_crtc_helper.c:105:2: error: implicit declaration of 
>>>> function 'drm_for_each_connector_iter' 
>>>> [-Werror=implicit-function-declaration]
>>  drm_for_each_connector_iter(connector, &conn_iter) {
>>  ^~~
>>>> drivers/gpu/drm/drm_crtc_helper.c:105:53: error: expected ';' before '{' 
>>>> token
>>  drm_for_each_connector_iter(connector, &conn_iter) {
>> ^
>>drivers/gpu/drm/drm_crtc_helper.c:91:33: warning: unused variable 
>> 'conn_iter' [-Wunused-variable]
>>  struct drm_connector_list_iter conn_iter;
>> ^
>>drivers/gpu/drm/drm_crtc_helper.c: In function 'drm_crtc_helper_disable':
>>drivers/gpu/drm/drm_crtc_helper.c:446:34: error: storage size of 
>> 'conn_iter' isn't known
>>   struct drm_connector_list_iter conn_iter;
>>  ^
>>drivers/gpu/drm/drm_crtc_helper.c:452:54: error: expected ';' before '{' 
>> token
>>   drm_for_each_connector_iter(connector, &conn_iter) {
>>  ^
>>drivers/gpu/drm/drm_crtc_helper.c:446:34: warning: unused variable 
>> 'conn_iter' [-Wunused-variable]
>>   struct drm_connector_list_iter conn_iter;
>>  ^
>>drivers/gpu/drm/drm_crtc_helper.c: In function 
>> 'drm_crtc_helper_set_config':
>>drivers/gpu/drm/drm_crtc_helper.c:521:33: error: storage size of 
>> 'conn_iter' isn't known
>>  struct drm_connector_list_iter conn_iter;
>> ^
>>>> drivers/gpu/drm/drm_crtc_helper.c:588:3: error: expected ';' before 
>>>> 'save_connector_encoders'
>>   save_connector_encoders[count++] = connector->encoder;
>>   ^~~
>>>> drivers/gpu/drm/drm_crtc_helper.c:589:2: error: implicit declaration of 
>>>> function 'drm_connector_list_iter_put' 
>>>> [-Werror=implicit-function-declaration]
>>  drm_connector_list_iter_put(&conn_iter);
>>  ^~~
>>drivers/gpu/drm/drm_crtc_helper.c:633:53: error: expected ';' befo

drm_mm range manager fixes, take 2

2016-12-16 Thread Chris Wilson
The goal of this series is to fix top-down allocations so that they are
actually allocated from the top and aligned correctly, to introduce
bottom-up allocations, and to speed up searches and tighten evictions.

More polish on the test cases to reduce code duplication and to improve
expectation checking. And a little more polish on the patches leading up
to the fixes.
-Chris


[PATCH v2 02/40] drm/i915: Simplify i915_gtt_color_adjust()

2016-12-16 Thread Chris Wilson
If we remember that node_list is a circular list containing the fake
head_node, we can use a simple list_next_entry() and drop the NULL check;
the existing allocated check already filters out the fake head_node.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4543d7fa7fc2..4d82f38a000b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2707,10 +2707,8 @@ static void i915_gtt_color_adjust(struct drm_mm_node 
*node,
if (node->color != color)
*start += 4096;

-   node = list_first_entry_or_null(&node->node_list,
-   struct drm_mm_node,
-   node_list);
-   if (node && node->allocated && node->color != color)
+   node = list_next_entry(node, node_list);
+   if (node->allocated && node->color != color)
*end -= 4096;
 }

-- 
2.11.0



[PATCH v2 03/40] drm: Add drm_mm_for_each_node_safe()

2016-12-16 Thread Chris Wilson
A complement to drm_mm_for_each_node(), wraps list_for_each_entry_safe()
for walking the list of nodes safe against removal.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/drm_mm.c |  9 -
 include/drm/drm_mm.h | 19 ---
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index ca1e344f318d..6e0735539545 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -138,7 +138,7 @@ static void show_leaks(struct drm_mm *mm)
if (!buf)
return;

-   list_for_each_entry(node, &mm->head_node.node_list, node_list) {
+   list_for_each_entry(node, __drm_mm_nodes(mm), node_list) {
struct stack_trace trace = {
.entries = entries,
.max_entries = STACKDEPTH
@@ -320,8 +320,7 @@ int drm_mm_reserve_node(struct drm_mm *mm, struct 
drm_mm_node *node)
if (hole->start < end)
return -ENOSPC;
} else {
-   hole = list_entry(&mm->head_node.node_list,
- typeof(*hole), node_list);
+   hole = list_entry(__drm_mm_nodes(mm), typeof(*hole), node_list);
}

hole = list_last_entry(&hole->node_list, typeof(*hole), node_list);
@@ -884,7 +883,7 @@ EXPORT_SYMBOL(drm_mm_scan_remove_block);
  */
 bool drm_mm_clean(struct drm_mm * mm)
 {
-   struct list_head *head = &mm->head_node.node_list;
+   struct list_head *head = __drm_mm_nodes(mm);

return (head->next->next == head);
 }
@@ -930,7 +929,7 @@ EXPORT_SYMBOL(drm_mm_init);
  */
 void drm_mm_takedown(struct drm_mm *mm)
 {
-   if (WARN(!list_empty(&mm->head_node.node_list),
+   if (WARN(!list_empty(__drm_mm_nodes(mm)),
 "Memory manager not clean during takedown.\n"))
show_leaks(mm);

diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 0b8371795aeb..0cc1b78c9ec2 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -179,6 +179,8 @@ static inline u64 drm_mm_hole_node_end(struct drm_mm_node 
*hole_node)
return __drm_mm_hole_node_end(hole_node);
 }

+#define __drm_mm_nodes(mm) (&(mm)->head_node.node_list)
+
 /**
  * drm_mm_for_each_node - iterator to walk over all allocated nodes
  * @entry: drm_mm_node structure to assign to in each iteration step
@@ -187,9 +189,20 @@ static inline u64 drm_mm_hole_node_end(struct drm_mm_node 
*hole_node)
  * This iterator walks over all nodes in the range allocator. It is implemented
>   * with list_for_each, so not safe against removal of elements.
  */
-#define drm_mm_for_each_node(entry, mm) list_for_each_entry(entry, \
-   &(mm)->head_node.node_list, \
-   node_list)
+#define drm_mm_for_each_node(entry, mm) \
+   list_for_each_entry(entry, __drm_mm_nodes(mm), node_list)
+
+/**
+ * drm_mm_for_each_node_safe - iterator to walk over all allocated nodes
+ * @entry: drm_mm_node structure to assign to in each iteration step
+ * @next: drm_mm_node structure to store the next step
+ * @mm: drm_mm allocator to walk
+ *
+ * This iterator walks over all nodes in the range allocator. It is implemented
+ * with list_for_each_safe, so it is safe against removal of elements.
+ */
+#define drm_mm_for_each_node_safe(entry, next, mm) \
+   list_for_each_entry_safe(entry, next, __drm_mm_nodes(mm), node_list)

 #define __drm_mm_for_each_hole(entry, mm, hole_start, hole_end, backwards) \
for (entry = list_entry((backwards) ? (mm)->hole_stack.prev : 
(mm)->hole_stack.next, struct drm_mm_node, hole_stack); \
-- 
2.11.0
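
A quick illustration of where the safe variant matters: tearing down every node while walking the list (the wrapper function name is made up):

/* drm_mm_remove_node() unlinks the node being visited, so the plain
 * iterator would follow a stale link; the _safe variant caches the
 * next pointer first.
 */
static void drop_all_nodes(struct drm_mm *mm)
{
        struct drm_mm_node *node, *next;

        drm_mm_for_each_node_safe(node, next, mm)
                drm_mm_remove_node(node);
}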



[PATCH v2 08/40] drm: Add a simple prime number generator

2016-12-16 Thread Chris Wilson
Prime numbers are interesting for testing components that use multiplies
and divides, such as the struct drm_mm alignment computations.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/Kconfig |   4 +
 drivers/gpu/drm/Makefile|   1 +
 drivers/gpu/drm/lib/drm_prime_numbers.c | 175 
 drivers/gpu/drm/lib/drm_prime_numbers.h |  10 ++
 4 files changed, 190 insertions(+)
 create mode 100644 drivers/gpu/drm/lib/drm_prime_numbers.c
 create mode 100644 drivers/gpu/drm/lib/drm_prime_numbers.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 2e6ae95459e4..93895898d596 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -53,6 +53,7 @@ config DRM_DEBUG_MM_SELFTEST
depends on DRM
depends on DEBUG_KERNEL
select DRM_LIB_RANDOM
+   select DRM_LIB_PRIMES
default n
help
  This option provides a kernel module that can be used to test
@@ -340,3 +341,6 @@ config DRM_LIB_RANDOM
bool
default n

+config DRM_LIB_PRIMES
+   bool
+   default n
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 0fa16275fdae..bbd390fa8914 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -19,6 +19,7 @@ drm-y   :=drm_auth.o drm_bufs.o drm_cache.o \
drm_dumb_buffers.o drm_mode_config.o

 drm-$(CONFIG_DRM_LIB_RANDOM) += lib/drm_random.o
+obj-$(CONFIG_DRM_LIB_PRIMES) += lib/drm_prime_numbers.o
 obj-$(CONFIG_DRM_DEBUG_MM_SELFTEST) += selftests/test-drm_mm.o

 drm-$(CONFIG_COMPAT) += drm_ioc32.o
diff --git a/drivers/gpu/drm/lib/drm_prime_numbers.c 
b/drivers/gpu/drm/lib/drm_prime_numbers.c
new file mode 100644
index ..839563d9b787
--- /dev/null
+++ b/drivers/gpu/drm/lib/drm_prime_numbers.c
@@ -0,0 +1,175 @@
+#include 
+#include 
+#include 
+
+#include "drm_prime_numbers.h"
+
+static DEFINE_MUTEX(lock);
+
+static struct primes {
+   struct rcu_head rcu;
+   unsigned long last, sz;
+   unsigned long primes[];
+} __rcu *primes;
+
+static bool slow_is_prime_number(unsigned long x)
+{
+   unsigned long y = int_sqrt(x) + 1;
+
+   while (y > 1) {
+   if ((x % y) == 0)
+   break;
+   y--;
+   }
+
+   return y == 1;
+}
+
+static unsigned long slow_next_prime_number(unsigned long x)
+{
+   for (;;) {
+   if (slow_is_prime_number(++x))
+   return x;
+   }
+}
+
+static unsigned long mark_multiples(unsigned long x,
+   unsigned long *p,
+   unsigned long start,
+   unsigned long end)
+{
+   unsigned long m;
+
+   m = 2 * x;
+   if (m < start)
+   m = (start / x + 1) * x;
+
+   while (m < end) {
+   __clear_bit(m, p);
+   m += x;
+   }
+
+   return x;
+}
+
+static struct primes *expand(unsigned long x)
+{
+   unsigned long sz, y, prev;
+   struct primes *p, *new;
+
+   sz = x * x;
+   if (sz < x)
+   return NULL;
+
+   mutex_lock(&lock);
+   p = rcu_dereference_protected(primes, lockdep_is_held(&lock));
+   if (p && x < p->last)
+   goto unlock;
+
+   sz = round_up(sz, BITS_PER_LONG);
+   new = kmalloc(sizeof(*new) + sz / sizeof(long), GFP_KERNEL);
+   if (!new) {
+   p = NULL;
+   goto unlock;
+   }
+
+   /* Where memory permits, track the primes using the
+* Sieve of Eratosthenes.
+*/
+   if (p) {
+   prev = p->sz;
+   memcpy(new->primes, p->primes, prev / BITS_PER_LONG);
+   } else {
+   prev = 0;
+   }
+   memset(new->primes + prev / BITS_PER_LONG,
+  0xff, (sz - prev) / sizeof(long));
+   for (y = 2UL; y < sz; y = find_next_bit(new->primes, sz, y + 1))
+   new->last = mark_multiples(y, new->primes, prev, sz);
+   new->sz = sz;
+
+   rcu_assign_pointer(primes, new);
+   if (p)
+   kfree_rcu(p, rcu);
+   p = new;
+
+unlock:
+   mutex_unlock(&lock);
+   return p;
+}
+
+unsigned long drm_next_prime_number(unsigned long x)
+{
+   struct primes *p;
+
+   if (x < 2)
+   return 2;
+
+   rcu_read_lock();
+   p = rcu_dereference(primes);
+   if (!p || x >= p->last) {
+   rcu_read_unlock();
+
+   p = expand(x);
+   if (!p)
+   return slow_next_prime_number(x);
+
+   rcu_read_lock();
+   }
+
+   x = find_next_bit(p->primes, p->last, x + 1);
+   rcu_read_unlock();
+
+   return x;
+}
+EXPORT_SYMBOL(drm_next_prime_number);
+
+bool drm_is_prime_number(unsigned long x)
+{
+   struct primes *p;
+   bool result;
+
+   switch (x) {
+   case 0:
+   return false;
+   case 1:
+   case 2:
+   
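
A sketch of how a test might walk the generated primes (the loop shape is assumed; the two exported helpers are the ones added above):

/* Visit every prime below 'max' and use it as a test value. */
unsigned long prime;

for (prime = 2; prime < max; prime = drm_next_prime_number(prime)) {
        if (WARN_ON(!drm_is_prime_number(prime)))
                break;

        /* ... exercise drm_mm with 'prime' as a size or alignment ... */
}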

[PATCH v2 04/40] drm: Constify the drm_mm API

2016-12-16 Thread Chris Wilson
Mark up the pointers as constant through the API where appropriate.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/drm_mm.c| 24 
 drivers/gpu/drm/i915/i915_gem_gtt.c |  2 +-
 include/drm/drm_mm.h| 27 +--
 3 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 6e0735539545..7573661302a4 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -174,9 +174,9 @@ INTERVAL_TREE_DEFINE(struct drm_mm_node, rb,
 START, LAST, static inline, drm_mm_interval_tree)

 struct drm_mm_node *
-__drm_mm_interval_first(struct drm_mm *mm, u64 start, u64 last)
+__drm_mm_interval_first(const struct drm_mm *mm, u64 start, u64 last)
 {
-   return drm_mm_interval_tree_iter_first(&mm->interval_tree,
+   return drm_mm_interval_tree_iter_first((struct rb_root 
*)&mm->interval_tree,
   start, last);
 }
 EXPORT_SYMBOL(__drm_mm_interval_first);
@@ -881,9 +881,9 @@ EXPORT_SYMBOL(drm_mm_scan_remove_block);
  * True if the allocator is completely free, false if there's still a node
  * allocated in it.
  */
-bool drm_mm_clean(struct drm_mm * mm)
+bool drm_mm_clean(const struct drm_mm *mm)
 {
-   struct list_head *head = __drm_mm_nodes(mm);
+   const struct list_head *head = __drm_mm_nodes(mm);

return (head->next->next == head);
 }
@@ -897,7 +897,7 @@ EXPORT_SYMBOL(drm_mm_clean);
  *
  * Note that @mm must be cleared to 0 before calling this function.
  */
-void drm_mm_init(struct drm_mm * mm, u64 start, u64 size)
+void drm_mm_init(struct drm_mm *mm, u64 start, u64 size)
 {
INIT_LIST_HEAD(&mm->hole_stack);
mm->scanned_blocks = 0;
@@ -936,8 +936,8 @@ void drm_mm_takedown(struct drm_mm *mm)
 }
 EXPORT_SYMBOL(drm_mm_takedown);

-static u64 drm_mm_debug_hole(struct drm_mm_node *entry,
-const char *prefix)
+static u64 drm_mm_debug_hole(const struct drm_mm_node *entry,
+const char *prefix)
 {
u64 hole_start, hole_end, hole_size;

@@ -958,9 +958,9 @@ static u64 drm_mm_debug_hole(struct drm_mm_node *entry,
  * @mm: drm_mm allocator to dump
  * @prefix: prefix to use for dumping to dmesg
  */
-void drm_mm_debug_table(struct drm_mm *mm, const char *prefix)
+void drm_mm_debug_table(const struct drm_mm *mm, const char *prefix)
 {
-   struct drm_mm_node *entry;
+   const struct drm_mm_node *entry;
u64 total_used = 0, total_free = 0, total = 0;

total_free += drm_mm_debug_hole(&mm->head_node, prefix);
@@ -979,7 +979,7 @@ void drm_mm_debug_table(struct drm_mm *mm, const char 
*prefix)
 EXPORT_SYMBOL(drm_mm_debug_table);

 #if defined(CONFIG_DEBUG_FS)
-static u64 drm_mm_dump_hole(struct seq_file *m, struct drm_mm_node *entry)
+static u64 drm_mm_dump_hole(struct seq_file *m, const struct drm_mm_node 
*entry)
 {
u64 hole_start, hole_end, hole_size;

@@ -1000,9 +1000,9 @@ static u64 drm_mm_dump_hole(struct seq_file *m, struct 
drm_mm_node *entry)
  * @m: seq_file to dump to
  * @mm: drm_mm allocator to dump
  */
-int drm_mm_dump_table(struct seq_file *m, struct drm_mm *mm)
+int drm_mm_dump_table(struct seq_file *m, const struct drm_mm *mm)
 {
-   struct drm_mm_node *entry;
+   const struct drm_mm_node *entry;
u64 total_used = 0, total_free = 0, total = 0;

total_free += drm_mm_dump_hole(m, &mm->head_node);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4d82f38a000b..97fd66e4e3d0 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2699,7 +2699,7 @@ void i915_gem_gtt_finish_pages(struct drm_i915_gem_object 
*obj,
dma_unmap_sg(kdev, pages->sgl, pages->nents, PCI_DMA_BIDIRECTIONAL);
 }

-static void i915_gtt_color_adjust(struct drm_mm_node *node,
+static void i915_gtt_color_adjust(const struct drm_mm_node *node,
  unsigned long color,
  u64 *start,
  u64 *end)
diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 0cc1b78c9ec2..5c7f15875b6a 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -102,7 +102,8 @@ struct drm_mm {
u64 scan_end;
struct drm_mm_node *prev_scanned_node;

-   void (*color_adjust)(struct drm_mm_node *node, unsigned long color,
+   void (*color_adjust)(const struct drm_mm_node *node,
+unsigned long color,
 u64 *start, u64 *end);
 };

@@ -116,7 +117,7 @@ struct drm_mm {
  * Returns:
  * True if the @node is allocated.
  */
-static inline bool drm_mm_node_allocated(struct drm_mm_node *node)
+static inline bool drm_mm_node_allocated(const struct drm_mm_node *node)
 {
return node->allocated;
 }
@@ -131,12 +132,12 @@ static inline bool drm_mm_node_allocated(struct 

[PATCH v2 20/40] drm: kselftest for drm_mm and color eviction

2016-12-16 Thread Chris Wilson
Check that after applying the driver's color adjustment, eviction
scanning finds a suitable hole.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |   1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 159 +++
 2 files changed, 160 insertions(+)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index ff44f39a1826..0a3a7e32e5f7 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -19,3 +19,4 @@ selftest(evict, igt_evict)
 selftest(evict_range, igt_evict_range)
 selftest(topdown, igt_topdown)
 selftest(color, igt_color)
+selftest(color_evict, igt_color_evict)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index 6b3af04eb582..0a051abc8f13 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -1830,6 +1830,165 @@ static int igt_color(void *ignored)
return ret;
 }

+static int evict_color(struct drm_mm *mm,
+  struct evict_node *nodes,
+  unsigned int *order,
+  unsigned int count,
+  unsigned int size,
+  unsigned int alignment,
+  unsigned long color,
+  const struct insert_mode *mode)
+{
+   LIST_HEAD(evict_list);
+   struct evict_node *e;
+   struct drm_mm_node tmp;
+   int err;
+
+   drm_mm_init_scan(mm, size, alignment, color);
+   if (!evict_nodes(mm,
+nodes, order, count,
+&evict_list))
+   return -EINVAL;
+
+   memset(&tmp, 0, sizeof(tmp));
+   err = drm_mm_insert_node_generic(mm, &tmp, size, alignment, color,
+mode->search_flags,
+mode->create_flags);
+   if (err) {
+   pr_err("Failed to insert into eviction hole: size=%d, align=%d, 
color=%lu, err=%d\n",
+  size, alignment, color, err);
+   show_scan(mm);
+   show_holes(mm, 3);
+   return err;
+   }
+
+   if (colors_abutt(&tmp))
+   err = -EINVAL;
+
+   if (!assert_node(&tmp, mm, size, alignment, color)) {
+   pr_err("Inserted did not fit the eviction hole: size=%lld [%d], 
align=%d [rem=%lld], start=%llx\n",
+  tmp.size, size,
+  alignment, misaligned(&tmp, alignment), tmp.start);
+   err = -EINVAL;
+   }
+
+   drm_mm_remove_node(&tmp);
+   if (err)
+   return err;
+
+   list_for_each_entry(e, &evict_list, link) {
+   err = drm_mm_reserve_node(mm, &e->node);
+   if (err) {
+   pr_err("Failed to reinsert node after eviction: 
start=%llx\n",
+  e->node.start);
+   return err;
+   }
+   }
+
+   return 0;
+}
+
+static int igt_color_evict(void *ignored)
+{
+   RND_STATE(prng, random_seed);
+   const unsigned int total_size = min(8192u, max_iterations);
+   const struct insert_mode *mode;
+   unsigned long color = 0;
+   struct drm_mm mm;
+   struct evict_node *nodes;
+   struct drm_mm_node *node, *next;
+   unsigned int *order, n;
+   int ret, err;
+
+   /* Check that the drm_mm_scan also honours color adjustment when
+* choosing its victims to create a hole. Our color_adjust does not
+* allow two nodes to be placed together without an intervening hole
+* enlarging the set of victims that must be evicted.
+*/
+
+   ret = -ENOMEM;
+   nodes = vzalloc(total_size * sizeof(*nodes));
+   if (!nodes)
+   goto err;
+
+   order = drm_random_order(total_size, &prng);
+   if (!order)
+   goto err_nodes;
+
+   ret = -EINVAL;
+   drm_mm_init(&mm, 0, 2*total_size - 1);
+   mm.color_adjust = separate_adjacent_colors;
+   for (n = 0; n < total_size; n++) {
+   err = drm_mm_insert_node_generic(&mm, &nodes[n].node,
+1, 0, color++,
+DRM_MM_SEARCH_DEFAULT,
+DRM_MM_CREATE_DEFAULT);
+   if (err) {
+   pr_err("insert failed, step %d\n", n);
+   ret = err;
+   goto out;
+   }
+   }
+
+   for (mode = evict_modes; mode->name; mode++) {
+   for (n = 1; n <= total_size; n <<= 1) {
+   drm_random_reorder(order, total_size, &prng);
+   err = evict_color(&mm,
+ nodes, order, total_size,
+ n, 1, color++,
+  
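
For context, the color_adjust callback described in the comment above, one that never lets two differently colored nodes abut, would look roughly like the i915 helper from earlier in this series with a one-unit guard band (a sketch, not the exact selftest code):

/* Shrink the usable hole by one unit next to any neighbour of a
 * different color, forcing an intervening hole between them.
 */
static void separate_adjacent_colors(const struct drm_mm_node *node,
                                     unsigned long color,
                                     u64 *start, u64 *end)
{
        if (node->allocated && node->color != color)
                ++*start;

        node = list_next_entry(node, node_list);
        if (node->allocated && node->color != color)
                --*end;
}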

[PATCH v2 14/40] drm: kselftest for drm_mm_insert_node_in_range()

2016-12-16 Thread Chris Wilson
Exercise drm_mm_insert_node_in_range() and check that we only allocate
from the specified range.

v2: Use all allocation flags

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |   1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 246 +++
 2 files changed, 247 insertions(+)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index dca726baa65d..92b2c1cb10fa 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -11,3 +11,4 @@ selftest(debug, igt_debug)
 selftest(reserve, igt_reserve)
 selftest(insert, igt_insert)
 selftest(replace, igt_replace)
+selftest(insert_range, igt_insert_range)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index dca146478b06..94cd0741c5b6 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -724,6 +724,252 @@ static int igt_replace(void *ignored)
return 0;
 }

+static bool expect_insert_in_range_fail(struct drm_mm *mm,
+   u64 size,
+   u64 range_start,
+   u64 range_end)
+{
+   struct drm_mm_node tmp = {};
+   int err;
+
+   err = drm_mm_insert_node_in_range_generic(mm, &tmp,
+ size, 0, 0,
+ range_start, range_end,
+ DRM_MM_SEARCH_DEFAULT,
+ DRM_MM_CREATE_DEFAULT);
+   if (err != -ENOSPC)  {
+   if (!err) {
+   pr_err("impossible insert succeeded, node %llx + %llu, 
range [%llx, %llx]\n",
+  tmp.start, tmp.size, range_start, range_end);
+   drm_mm_remove_node(&tmp);
+   } else {
+   pr_err("impossible insert failed with wrong error %d 
[expected %d], size %llu, range [%llx, %llx]\n",
+  err, -ENOSPC, size, range_start, range_end);
+   }
+   return false;
+   }
+
+   return true;
+}
+
+static bool assert_contiguous_in_range(struct drm_mm *mm,
+  u64 size,
+  u64 start,
+  u64 end)
+{
+   struct drm_mm_node *node;
+   unsigned int n;
+
+   if (!expect_insert_in_range_fail(mm, size, start, end))
+   return false;
+
+   n = div64_u64(start + size - 1, size);
+   drm_mm_for_each_node(node, mm) {
+   if (node->start < start || node->start + node->size > end) {
+   pr_err("node %d out of range, address [%llx + %llu], 
range [%llx, %llx]\n",
+  n, node->start, node->start + node->size, start, 
end);
+   return false;
+   }
+
+   if (node->start != n * size) {
+   pr_err("node %d out of order, expected start %llx, 
found %llx\n",
+  n, n * size, node->start);
+   return false;
+   }
+
+   if (node->size != size) {
+   pr_err("node %d has wrong size, expected size %llx, 
found %llx\n",
+  n, size, node->size);
+   return false;
+   }
+
+   if (node->hole_follows && drm_mm_hole_node_end(node) < end) {
+   pr_err("node %d is followed by a hole!\n", n);
+   return false;
+   }
+
+   n++;
+   }
+
+   drm_mm_for_each_node_in_range(node, mm, 0, start) {
+   if (node) {
+   pr_err("node before start: node=%llx+%llu, 
start=%llx\n",
+  node->start, node->size, start);
+   return false;
+   }
+   }
+
+   drm_mm_for_each_node_in_range(node, mm, end, U64_MAX) {
+   if (node) {
+   pr_err("node after end: node=%llx+%llu, end=%llx\n",
+  node->start, node->size, end);
+   return false;
+   }
+   }
+
+   return true;
+}
+
+static int __igt_insert_range(unsigned int count, u64 size, u64 start, u64 end)
+{
+   const struct insert_mode *mode;
+   struct drm_mm mm;
+   struct drm_mm_node *nodes, *node, *next;
+   unsigned int n, start_n, end_n;
+   int ret, err;
+
+   DRM_MM_BUG_ON(!count);
+   DRM_MM_BUG_ON(!size);
+   DRM_MM_BUG_ON(end <= start);
+
+   /* Very similar to __igt_insert(), but now instead of populating the
+* full range of the drm_mm, we try to fill a small portion of it.
+*/
+
+   ret = -ENOMEM;
+   n

[PATCH v2 21/40] drm: kselftest for drm_mm and restricted color eviction

2016-12-16 Thread Chris Wilson
Check that after applying the driver's color adjustment, restricted
eviction scanning finds a suitable hole.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |   1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 119 ++-
 2 files changed, 116 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index 0a3a7e32e5f7..6a4575fdc1c0 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -20,3 +20,4 @@ selftest(evict_range, igt_evict_range)
 selftest(topdown, igt_topdown)
 selftest(color, igt_color)
 selftest(color_evict, igt_color_evict)
+selftest(color_evict_range, igt_color_evict_range)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index 0a051abc8f13..43575074eca5 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -1831,6 +1831,7 @@ static int igt_color(void *ignored)
 }

 static int evict_color(struct drm_mm *mm,
+  u64 range_start, u64 range_end,
   struct evict_node *nodes,
   unsigned int *order,
   unsigned int count,
@@ -1844,7 +1845,9 @@ static int evict_color(struct drm_mm *mm,
struct drm_mm_node tmp;
int err;

-   drm_mm_init_scan(mm, size, alignment, color);
+   drm_mm_init_scan_with_range(mm,
+   size, alignment, color,
+   range_start, range_end);
if (!evict_nodes(mm,
 nodes, order, count,
 &evict_list))
@@ -1862,6 +1865,12 @@ static int evict_color(struct drm_mm *mm,
return err;
}

+   if (tmp.start < range_start || tmp.start + tmp.size > range_end) {
+   pr_err("Inserted [address=%llu + %llu] did not fit into the 
request range [%llu, %llu]\n",
+  tmp.start, tmp.size, range_start, range_end);
+   err = -EINVAL;
+   }
+
if (colors_abutt(&tmp))
err = -EINVAL;

@@ -1933,7 +1942,7 @@ static int igt_color_evict(void *ignored)
for (mode = evict_modes; mode->name; mode++) {
for (n = 1; n <= total_size; n <<= 1) {
drm_random_reorder(order, total_size, &prng);
-   err = evict_color(&mm,
+   err = evict_color(&mm, 0, U64_MAX,
  nodes, order, total_size,
  n, 1, color++,
  mode);
@@ -1946,7 +1955,7 @@ static int igt_color_evict(void *ignored)

for (n = 1; n < total_size; n <<= 1) {
drm_random_reorder(order, total_size, &prng);
-   err = evict_color(&mm,
+   err = evict_color(&mm, 0, U64_MAX,
  nodes, order, total_size,
  total_size/2, n, color++,
  mode);
@@ -1963,7 +1972,7 @@ static int igt_color_evict(void *ignored)
DRM_MM_BUG_ON(!nsize);

drm_random_reorder(order, total_size, &prng);
-   err = evict_color(&mm,
+   err = evict_color(&mm, 0, U64_MAX,
  nodes, order, total_size,
  nsize, n, color++,
  mode);
@@ -1989,6 +1998,108 @@ static int igt_color_evict(void *ignored)
return ret;
 }

+static int igt_color_evict_range(void *ignored)
+{
+   RND_STATE(prng, random_seed);
+   const unsigned int total_size = 8192;
+   const unsigned int range_size = total_size / 2;
+   const unsigned int range_start = total_size / 4;
+   const unsigned int range_end = range_start + range_size;
+   const struct insert_mode *mode;
+   unsigned long color = 0;
+   struct drm_mm mm;
+   struct evict_node *nodes;
+   struct drm_mm_node *node, *next;
+   unsigned int *order, n;
+   int ret, err;
+
+   /* Like igt_color_evict(), but limited to small portion of the full
+* drm_mm range.
+*/
+
+   ret = -ENOMEM;
+   nodes = vzalloc(total_size * sizeof(*nodes));
+   if (!nodes)
+   goto err;
+
+   order = drm_random_order(total_size, &prng);
+   if (!order)
+   goto err_nodes;
+
+   ret = -EINVAL;
+   drm_mm_init(&mm, 0, 2*total_size - 1);
+   mm.color_adjust = separate_adjacent_colors;
+   for (n = 0; n < total_size; n++) {
+   err = drm_mm_insert_node_generic(&mm, &nodes[n].node,
+1, 0, color++,
+  

[PATCH v2 23/40] drm: Promote drm_mm alignment to u64

2016-12-16 Thread Chris Wilson
In places (e.g. i915.ko), the alignment is exported to userspace as u64
and there now exists hardware for which we can indeed utilize a u64
alignment. As such, we need to keep 64bit integers throughout when
handling alignment.

Testcase: igt/drm_mm/align64
Testcase: igt/gem_exec_alignment
Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: Christian König 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/drm_mm.c| 37 +++--
 drivers/gpu/drm/selftests/test-drm_mm.c |  4 ++--
 include/drm/drm_mm.h| 16 +++---
 3 files changed, 27 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 2b76167ef39f..2d02ab0925a9 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -93,12 +93,12 @@

 static struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
u64 size,
-   unsigned alignment,
+   u64 alignment,
unsigned long color,
enum drm_mm_search_flags flags);
 static struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct 
drm_mm *mm,
u64 size,
-   unsigned alignment,
+   u64 alignment,
unsigned long color,
u64 start,
u64 end,
@@ -227,7 +227,7 @@ static void drm_mm_interval_tree_add_node(struct 
drm_mm_node *hole_node,

 static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
 struct drm_mm_node *node,
-u64 size, unsigned alignment,
+u64 size, u64 alignment,
 unsigned long color,
 enum drm_mm_allocator_flags flags)
 {
@@ -246,10 +246,9 @@ static void drm_mm_insert_helper(struct drm_mm_node 
*hole_node,
adj_start = adj_end - size;

if (alignment) {
-   u64 tmp = adj_start;
-   unsigned rem;
+   u64 rem;

-   rem = do_div(tmp, alignment);
+   div64_u64_rem(adj_start, alignment, &rem);
if (rem) {
if (flags & DRM_MM_CREATE_TOP)
adj_start -= rem;
@@ -376,7 +375,7 @@ EXPORT_SYMBOL(drm_mm_reserve_node);
  * 0 on success, -ENOSPC if there's no suitable hole.
  */
 int drm_mm_insert_node_generic(struct drm_mm *mm, struct drm_mm_node *node,
-  u64 size, unsigned alignment,
+  u64 size, u64 alignment,
   unsigned long color,
   enum drm_mm_search_flags sflags,
   enum drm_mm_allocator_flags aflags)
@@ -398,7 +397,7 @@ EXPORT_SYMBOL(drm_mm_insert_node_generic);

 static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
   struct drm_mm_node *node,
-  u64 size, unsigned alignment,
+  u64 size, u64 alignment,
   unsigned long color,
   u64 start, u64 end,
   enum drm_mm_allocator_flags flags)
@@ -423,10 +422,9 @@ static void drm_mm_insert_helper_range(struct drm_mm_node 
*hole_node,
adj_start = adj_end - size;

if (alignment) {
-   u64 tmp = adj_start;
-   unsigned rem;
+   u64 rem;

-   rem = do_div(tmp, alignment);
+   div64_u64_rem(adj_start, alignment, &rem);
if (rem) {
if (flags & DRM_MM_CREATE_TOP)
adj_start -= rem;
@@ -482,7 +480,7 @@ static void drm_mm_insert_helper_range(struct drm_mm_node 
*hole_node,
  * 0 on success, -ENOSPC if there's no suitable hole.
  */
 int drm_mm_insert_node_in_range_generic(struct drm_mm *mm, struct drm_mm_node 
*node,
-   u64 size, unsigned alignment,
+   u64 size, u64 alignment,
unsigned long color,
u64 start, u64 end,
enum drm_mm_search_flags sflags,
@@ -548,16 +546,15 @@ void drm_mm_remove_node(struct drm_mm_node *node)
 }
 EXPORT_SYMBOL(drm_mm_remove_node);

-static int check_free_hole(u64 start, u64 end, u64 size, unsigned alignment)
+static int check_free_hole(u64 start, u64 end, u64 size, u64 align
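
The mechanical part of the change is swapping do_div(), which divides a u64 only by a 32-bit divisor, for div64_u64_rem(), which accepts a full u64 divisor. Side by side:

/* Before: the divisor is limited to 32 bits, so a genuine u64
 * alignment cannot be handled.
 */
u64 tmp = adj_start;
unsigned rem = do_div(tmp, alignment);

/* After: full 64-bit divisor, remainder returned via pointer. */
u64 rem64;
div64_u64_rem(adj_start, alignment, &rem64);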

[PATCH v2 32/40] drm: Compute tight evictions for drm_mm_scan

2016-12-16 Thread Chris Wilson
Compute the minimal required hole during scan and only evict those nodes
that overlap. This enables us to reduce the number of nodes we need to
evict to the bare minimum.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/drm_mm.c| 60 +++--
 drivers/gpu/drm/etnaviv/etnaviv_mmu.c   |  2 +-
 drivers/gpu/drm/i915/i915_gem_evict.c   |  3 +-
 drivers/gpu/drm/selftests/test-drm_mm.c | 10 +++---
 include/drm/drm_mm.h| 22 ++--
 5 files changed, 71 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 956782e7b092..ff1e62c066e8 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -718,10 +718,10 @@ EXPORT_SYMBOL(drm_mm_replace_node);
  * @color: opaque tag value to use for the allocation
  * @start: start of the allowed range for the allocation
  * @end: end of the allowed range for the allocation
+ * @flags: flags to specify how the allocation will be performed afterwards
  *
  * This simply sets up the scanning routines with the parameters for the 
desired
- * hole. Note that there's no need to specify allocation flags, since they only
- * change the place a node is allocated from within a suitable hole.
+ * hole.
  *
  * Warning:
  * As long as the scan list is non-empty, no other operations than
@@ -733,7 +733,8 @@ void drm_mm_scan_init_with_range(struct drm_mm_scan *scan,
 u64 alignment,
 unsigned long color,
 u64 start,
-u64 end)
+u64 end,
+unsigned int flags)
 {
DRM_MM_BUG_ON(start >= end);
DRM_MM_BUG_ON(size == 0 || size > end - start);
@@ -744,6 +745,7 @@ void drm_mm_scan_init_with_range(struct drm_mm_scan *scan,
scan->color = color;
scan->alignment = alignment;
scan->size = size;
+   scan->flags = flags;

DRM_MM_BUG_ON(end <= start);
scan->range_start = start;
@@ -778,7 +780,7 @@ bool drm_mm_scan_add_block(struct drm_mm_scan *scan,
DRM_MM_BUG_ON(node->mm != mm);
DRM_MM_BUG_ON(!node->allocated);
DRM_MM_BUG_ON(node->scanned_block);
-   node->scanned_block = 1;
+   node->scanned_block = true;
mm->scan_active++;

hole = list_prev_entry(node, node_list);
@@ -800,15 +802,53 @@ bool drm_mm_scan_add_block(struct drm_mm_scan *scan,

adj_start = max(col_start, scan->range_start);
adj_end = min(col_end, scan->range_end);
+   if (adj_end <= adj_start || adj_end - adj_start < scan->size)
+   return false;
+
+   if (scan->flags == DRM_MM_CREATE_TOP)
+   adj_start = adj_end - scan->size;
+
+   if (scan->alignment) {
+   u64 rem;
+
+   div64_u64_rem(adj_start, scan->alignment, &rem);
+   if (rem) {
+   adj_start -= rem;
+   if (scan->flags != DRM_MM_CREATE_TOP)
+   adj_start += scan->alignment;
+   if (adj_start < max(col_start, scan->range_start) ||
+   min(col_end, scan->range_end) - adj_start < 
scan->size)
+   return false;
+
+   if (adj_end <= adj_start ||
+   adj_end - adj_start < scan->size)
+   return false;
+   }
+   }

-   if (check_free_hole(adj_start, adj_end,
-   scan->size, scan->alignment)) {
+   if (mm->color_adjust) {
+   /* If allocations need adjusting due to neighbouring colours,
+* we do not have enough information to decide if we need
+* to evict nodes on either side of [adj_start, adj_end].
+* What almost works is
+* hit_start = adj_start + (hole_start - col_start);
+* hit_end = adj_start + scan->size + (hole_end - col_end);
+* but because the decision is only made on the final hole,
+* we may underestimate the required adjustments for an
+* interior allocation.
+*/
scan->hit_start = hole_start;
scan->hit_end = hole_end;
-   return true;
+   } else {
+   scan->hit_start = adj_start;
+   scan->hit_end = adj_start + scan->size;
}

-   return false;
+   DRM_MM_BUG_ON(scan->hit_start >= scan->hit_end);
+   DRM_MM_BUG_ON(scan->hit_start < hole_start);
+   DRM_MM_BUG_ON(scan->hit_end > hole_end);
+
+   return true;
 }
 EXPORT_SYMBOL(drm_mm_scan_add_block);

@@ -836,7 +876,7 @@ bool drm_mm_scan_remove_block(struct drm_mm_scan *scan,

DRM_MM_BUG_ON(node->mm != scan->mm);
DRM_MM_BUG_ON(!node->scanned_block);
-   node->scanned_block = 

[PATCH v2 07/40] drm: Add a simple generator of random permutations

2016-12-16 Thread Chris Wilson
When testing, we want a random yet reproducible order in which to
process elements. Here we create an array which is a random permutation
(using the Tausworthe PRNG) of the order in which to execute.

v2: Tidier code by David Herrmann

Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: David Herrmann 
---
 drivers/gpu/drm/Kconfig  |  6 ++
 drivers/gpu/drm/Makefile |  1 +
 drivers/gpu/drm/lib/drm_random.c | 41 
 drivers/gpu/drm/lib/drm_random.h | 21 
 4 files changed, 69 insertions(+)
 create mode 100644 drivers/gpu/drm/lib/drm_random.c
 create mode 100644 drivers/gpu/drm/lib/drm_random.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index d1363d21d3d1..2e6ae95459e4 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -52,6 +52,7 @@ config DRM_DEBUG_MM_SELFTEST
tristate "kselftests for DRM range manager (struct drm_mm)"
depends on DRM
depends on DEBUG_KERNEL
+   select DRM_LIB_RANDOM
default n
help
  This option provides a kernel module that can be used to test
@@ -334,3 +335,8 @@ config DRM_SAVAGE
  chipset. If M is selected the module will be called savage.

 endif # DRM_LEGACY
+
+config DRM_LIB_RANDOM
+   bool
+   default n
+
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index c8aed3688b20..0fa16275fdae 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -18,6 +18,7 @@ drm-y   :=drm_auth.o drm_bufs.o drm_cache.o \
drm_plane.o drm_color_mgmt.o drm_print.o \
drm_dumb_buffers.o drm_mode_config.o

+drm-$(CONFIG_DRM_LIB_RANDOM) += lib/drm_random.o
 obj-$(CONFIG_DRM_DEBUG_MM_SELFTEST) += selftests/test-drm_mm.o

 drm-$(CONFIG_COMPAT) += drm_ioc32.o
diff --git a/drivers/gpu/drm/lib/drm_random.c b/drivers/gpu/drm/lib/drm_random.c
new file mode 100644
index ..507e72721b6f
--- /dev/null
+++ b/drivers/gpu/drm/lib/drm_random.c
@@ -0,0 +1,41 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "drm_random.h"
+
+static inline u32 prandom_u32_max_state(u32 ep_ro, struct rnd_state *state)
+{
+   return upper_32_bits((u64)prandom_u32_state(state) * ep_ro);
+}
+
+void drm_random_reorder(unsigned int *order, unsigned int count,
+   struct rnd_state *state)
+{
+   unsigned int i, j;
+
+   for (i = 0; i < count; ++i) {
+   BUILD_BUG_ON(sizeof(unsigned int) > sizeof(u32));
+   j = prandom_u32_max_state(count, state);
+   swap(order[i], order[j]);
+   }
+}
+EXPORT_SYMBOL(drm_random_reorder);
+
+unsigned int *drm_random_order(unsigned int count, struct rnd_state *state)
+{
+   unsigned int *order, i;
+
+   order = kmalloc_array(count, sizeof(*order), GFP_TEMPORARY);
+   if (!order)
+   return order;
+
+   for (i = 0; i < count; i++)
+   order[i] = i;
+
+   drm_random_reorder(order, count, state);
+   return order;
+}
+EXPORT_SYMBOL(drm_random_order);
diff --git a/drivers/gpu/drm/lib/drm_random.h b/drivers/gpu/drm/lib/drm_random.h
new file mode 100644
index ..931c8f4a261d
--- /dev/null
+++ b/drivers/gpu/drm/lib/drm_random.h
@@ -0,0 +1,21 @@
+#ifndef __DRM_RANDOM_H__
+#define __DRM_RANDOM_H__
+
+#include 
+
+#define RND_STATE_INITIALIZER(seed__) ({   \
+   struct rnd_state state__;   \
+   prandom_seed_state(&state__, (seed__)); \
+   state__;\
+})
+
+#define RND_STATE(name__, seed__) \
+   struct rnd_state name__ = RND_STATE_INITIALIZER(seed__)
+
+unsigned int *drm_random_order(unsigned int count,
+  struct rnd_state *state);
+void drm_random_reorder(unsigned int *order,
+   unsigned int count,
+   struct rnd_state *state);
+
+#endif /* __DRM_RANDOM_H__ */
-- 
2.11.0
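
Typical consumption in a selftest looks like the sketch below; seeding the PRNG with a fixed value makes the shuffle reproducible across runs (the per-element work is a placeholder):

/* Reproducible random visiting order for 'count' elements. */
RND_STATE(prng, 0x12345678);    /* any fixed seed reproduces the order */
unsigned int *order, i;

order = drm_random_order(count, &prng);
if (!order)
        return -ENOMEM;

for (i = 0; i < count; i++)
        exercise_element(order[i]);     /* placeholder for the real work */

kfree(order);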



[PATCH v2 33/40] drm: Optimise power-of-two alignments in drm_mm_scan_add_block()

2016-12-16 Thread Chris Wilson
For power-of-two alignments, we can avoid the 64bit divide and do a
simple bitwise AND instead.

v2: s/alignment_mask/remainder_mask/

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/drm_mm.c | 9 -
 include/drm/drm_mm.h | 1 +
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index ff1e62c066e8..ffa439b2bd2c 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -742,8 +742,12 @@ void drm_mm_scan_init_with_range(struct drm_mm_scan *scan,

scan->mm = mm;

+   if (alignment <= 1)
+   alignment = 0;
+
scan->color = color;
scan->alignment = alignment;
+   scan->remainder_mask = is_power_of_2(alignment) ? alignment - 1 : 0;
scan->size = size;
scan->flags = flags;

@@ -811,7 +815,10 @@ bool drm_mm_scan_add_block(struct drm_mm_scan *scan,
if (scan->alignment) {
u64 rem;

-   div64_u64_rem(adj_start, scan->alignment, &rem);
+   if (likely(scan->remainder_mask))
+   rem = adj_start & scan->remainder_mask;
+   else
+   div64_u64_rem(adj_start, scan->alignment, &rem);
if (rem) {
adj_start -= rem;
if (scan->flags != DRM_MM_CREATE_TOP)
diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 639d00eb8f28..6ee87e1455bf 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -110,6 +110,7 @@ struct drm_mm_scan {

u64 size;
u64 alignment;
+   u64 remainder_mask;

u64 range_start;
u64 range_end;
-- 
2.11.0
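
The underlying identity is x % p == x & (p - 1) whenever p is a power of two, which is why a precomputed mask can stand in for the divide. A tiny sketch of the two paths:

/* Precompute once at scan init ... */
u64 remainder_mask = is_power_of_2(alignment) ? alignment - 1 : 0;

/* ... then, per block, the remainder is a single AND. */
u64 rem;

if (remainder_mask)
        rem = adj_start & remainder_mask;
else
        div64_u64_rem(adj_start, alignment, &rem);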



[PATCH v2 09/40] drm: kselftest for drm_mm_init()

2016-12-16 Thread Chris Wilson
Simple first test to just exercise initialisation of struct drm_mm.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |   1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 114 +++
 2 files changed, 115 insertions(+)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index 1610e0a63a5b..844dd29db540 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -6,3 +6,4 @@
  * Tests are executed in order by igt/drm_mm
  */
 selftest(sanitycheck, igt_sanitycheck) /* keep first (selfcheck for igt) */
+selftest(init, igt_init)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index 4c061baccf28..ccef8e249d37 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -24,6 +24,120 @@ static int igt_sanitycheck(void *ignored)
return 0;
 }

+static bool assert_no_holes(const struct drm_mm *mm)
+{
+   struct drm_mm_node *hole;
+   u64 hole_start, hole_end;
+   unsigned long count;
+
+   count = 0;
+   drm_mm_for_each_hole(hole, mm, hole_start, hole_end)
+   count++;
+   if (count) {
+   pr_err("Expected to find no holes (after reserve), found %lu 
instead\n", count);
+   return false;
+   }
+
+   drm_mm_for_each_node(hole, mm) {
+   if (hole->hole_follows) {
+   pr_err("Hole follows node, expected none!\n");
+   return false;
+   }
+   }
+
+   return true;
+}
+
+static bool assert_one_hole(const struct drm_mm *mm, u64 start, u64 end)
+{
+   struct drm_mm_node *hole;
+   u64 hole_start, hole_end;
+   unsigned long count;
+   bool ok = true;
+
+   if (end <= start)
+   return true;
+
+   count = 0;
+   drm_mm_for_each_hole(hole, mm, hole_start, hole_end) {
+   if (start != hole_start || end != hole_end) {
+   if (ok)
+   pr_err("empty mm has incorrect hole, found 
(%llx, %llx), expect (%llx, %llx)\n",
+  hole_start, hole_end,
+  start, end);
+   ok = false;
+   }
+   count++;
+   }
+   if (count != 1) {
+   pr_err("Expected to find one hole, found %lu instead\n", count);
+   ok = false;
+   }
+
+   return ok;
+}
+
+static int igt_init(void *ignored)
+{
+   const unsigned int size = 4096;
+   struct drm_mm mm;
+   struct drm_mm_node tmp;
+   int ret = -EINVAL;
+
+   /* Start with some simple checks on initialising the struct drm_mm */
+   memset(&mm, 0, sizeof(mm));
+   if (drm_mm_initialized(&mm)) {
+   pr_err("zeroed mm claims to be initialized\n");
+   return ret;
+   }
+
+   memset(&mm, 0xff, sizeof(mm));
+   drm_mm_init(&mm, 0, size);
+   if (!drm_mm_initialized(&mm)) {
+   pr_err("mm claims not to be initialized\n");
+   goto out;
+   }
+
+   if (!drm_mm_clean(&mm)) {
+   pr_err("mm not empty on creation\n");
+   goto out;
+   }
+
+   /* After creation, it should all be one massive hole */
+   if (!assert_one_hole(&mm, 0, size)) {
+   ret = -EINVAL;
+   goto out;
+   }
+
+   memset(&tmp, 0, sizeof(tmp));
+   tmp.start = 0;
+   tmp.size = size;
+   ret = drm_mm_reserve_node(&mm, &tmp);
+   if (ret) {
+   pr_err("failed to reserve whole drm_mm\n");
+   goto out;
+   }
+
+   /* After filling the range entirely, there should be no holes */
+   if (!assert_no_holes(&mm)) {
+   ret = -EINVAL;
+   goto out;
+   }
+
+   /* And then after emptying it again, the massive hole should be back */
+   drm_mm_remove_node(&tmp);
+   if (!assert_one_hole(&mm, 0, size)) {
+   ret = -EINVAL;
+   goto out;
+   }
+
+out:
+   if (ret)
+   drm_mm_debug_table(&mm, __func__);
+   drm_mm_takedown(&mm);
+   return ret;
+}
+
 #include "drm_selftest.c"

 static int __init test_drm_mm_init(void)
-- 
2.11.0



[PATCH v2 15/40] drm: kselftest for drm_mm and alignment

2016-12-16 Thread Chris Wilson
Check that we can request alignment to any power-of-two or prime using a
plain drm_mm_insert_node(), and also handle a reasonable selection of
primes.

v2: Exercise all allocation flags

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |   3 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 110 +++
 2 files changed, 113 insertions(+)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index 92b2c1cb10fa..a7a3763f8b20 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -12,3 +12,6 @@ selftest(reserve, igt_reserve)
 selftest(insert, igt_insert)
 selftest(replace, igt_replace)
 selftest(insert_range, igt_insert_range)
+selftest(align, igt_align)
+selftest(align32, igt_align32)
+selftest(align64, igt_align64)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index 94cd0741c5b6..dc3aee222158 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -970,6 +970,116 @@ static int igt_insert_range(void *ignored)
return 0;
 }

+static int igt_align(void *ignored)
+{
+   const struct insert_mode *mode;
+   const unsigned int max_count = min(8192u, max_prime);
+   struct drm_mm mm;
+   struct drm_mm_node *nodes, *node, *next;
+   unsigned int prime;
+   int ret = -EINVAL, err;
+
+   /* For each of the possible insertion modes, we pick a few
+* arbitrary alignments and check that the inserted node
+* meets our requirements.
+*/
+
+   nodes = vzalloc(max_count * sizeof(*nodes));
+   if (!nodes)
+   goto err;
+
+   drm_mm_init(&mm, 1, U64_MAX - 2);
+
+   for (mode = insert_modes; mode->name; mode++) {
+   unsigned int i = 0;
+
+   drm_for_each_prime(prime, max_count) {
+   u64 size = drm_next_prime_number(prime);
+
+   err = drm_mm_insert_node_generic(&mm,
+&nodes[i],
+size, prime, i,
+mode->search_flags,
+mode->create_flags);
+   if (err || !assert_node(&nodes[i], &mm, size, prime, 
i)) {
+   pr_err("%s insert failed with alignment=%d",
+  mode->name, prime);
+   ret = err ?: -EINVAL;
+   goto out;
+   }
+
+   i++;
+   }
+
+   drm_mm_for_each_node_safe(node, next, &mm)
+   drm_mm_remove_node(node);
+   DRM_MM_BUG_ON(!drm_mm_clean(&mm));
+   }
+
+   ret = 0;
+out:
+   drm_mm_for_each_node_safe(node, next, &mm)
+   drm_mm_remove_node(node);
+   drm_mm_takedown(&mm);
+   vfree(nodes);
+err:
+   return ret;
+}
+
+static int igt_align_pot(int max)
+{
+   struct drm_mm mm;
+   struct drm_mm_node *node, *next;
+   int bit;
+   int ret = -EINVAL;
+
+   /* Check that we can align to the full u64 address space */
+
+   drm_mm_init(&mm, 1, U64_MAX - 1);
+
+   for (bit = max - 1; bit; bit--) {
+   u64 align, size;
+   int err;
+
+   node = kzalloc(sizeof(*node), GFP_KERNEL);
+   if (!node) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   align = BIT_ULL(bit);
+   size = BIT_ULL(bit-1) + 1;
+   err = drm_mm_insert_node_generic(&mm, node, size, align, bit,
+DRM_MM_SEARCH_DEFAULT,
+DRM_MM_CREATE_DEFAULT);
+   if (err || !assert_node(node, &mm, size, align, bit)) {
+   pr_err("insert failed with alignment=%llx [%d]",
+  align, bit);
+   ret = err ?: -EINVAL;
+   goto out;
+   }
+   }
+
+   ret = 0;
+out:
+   drm_mm_for_each_node_safe(node, next, &mm) {
+   drm_mm_remove_node(node);
+   kfree(node);
+   }
+   drm_mm_takedown(&mm);
+   return ret;
+}
+
+static int igt_align32(void *ignored)
+{
+   return igt_align_pot(32);
+}
+
+static int igt_align64(void *ignored)
+{
+   return igt_align_pot(64);
+}
+
 #include "drm_selftest.c"

 static int __init test_drm_mm_init(void)
-- 
2.11.0



[PATCH v2 25/40] drm: Detect overflow in drm_mm_reserve_node()

2016-12-16 Thread Chris Wilson
Protect ourselves from a caller passing in node.start + node.size that
will overflow and trick us into reserving that node.
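
The failure mode is the unsigned wrap of node.start + node.size. A hedged
userspace illustration of the check (the helper name is mine, not from the
patch):

    #include <stdint.h>
    #include <stdio.h>

    /* Return 0 if [start, start + size) is representable, -1 if it wraps
     * (or if size is zero, which this check also rejects). */
    static int check_node_range(uint64_t start, uint64_t size)
    {
            uint64_t end = start + size;

            return end <= start ? -1 : 0;
    }

    int main(void)
    {
            printf("%d\n", check_node_range(4096, 4096));         /* 0  */
            printf("%d\n", check_node_range(UINT64_MAX - 1, 16)); /* -1 */
            printf("%d\n", check_node_range(8192, 0));            /* -1 */
            return 0;
    }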

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/drm_mm.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index ead164093ac7..6450690f5578 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -308,10 +308,9 @@ int drm_mm_reserve_node(struct drm_mm *mm, struct 
drm_mm_node *node)
u64 hole_start, hole_end;
u64 adj_start, adj_end;

-   if (WARN_ON(node->size == 0))
-   return -EINVAL;
-
end = node->start + node->size;
+   if (unlikely(end <= node->start))
+   return -ENOSPC;

/* Find the relevant hole to add our node to */
hole = drm_mm_interval_tree_iter_first(&mm->interval_tree,
-- 
2.11.0



[PATCH v2 12/40] drm: kselftest for drm_mm_insert_node()

2016-12-16 Thread Chris Wilson
Exercise drm_mm_insert_node(), check that we can't overfill a range and
that the lists are correct after reserving/removing.

v2: Extract helpers for the repeated tests
v3: Iterate over all allocation flags

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |   1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 220 +++
 2 files changed, 221 insertions(+)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index 693d85677e7f..727c6d7255e0 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -9,3 +9,4 @@ selftest(sanitycheck, igt_sanitycheck) /* keep first (selfcheck 
for igt) */
 selftest(init, igt_init)
 selftest(debug, igt_debug)
 selftest(reserve, igt_reserve)
+selftest(insert, igt_insert)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index 2e40f94cb9d3..9621820a1cda 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -131,6 +131,48 @@ static bool assert_continuous(const struct drm_mm *mm, u64 
size)
return true;
 }

+static u64 misaligned(struct drm_mm_node *node, u64 alignment)
+{
+   u64 rem;
+
+   if (!alignment)
+   return 0;
+
+   div64_u64_rem(node->start, alignment, &rem);
+   return rem;
+}
+
+static bool assert_node(struct drm_mm_node *node, struct drm_mm *mm,
+   u64 size, u64 alignment, unsigned long color)
+{
+   bool ok = true;
+
+   if (!drm_mm_node_allocated(node) || node->mm != mm) {
+   pr_err("node not allocated\n");
+   ok = false;
+   }
+
+   if (node->size != size) {
+   pr_err("node has wrong size, found %llu, expected %llu\n",
+  node->size, size);
+   ok = false;
+   }
+
+   if (misaligned(node, alignment)) {
+   pr_err("node is misalinged, start %llx rem %llu, expected 
alignment %llu\n",
+  node->start, misaligned(node, alignment), alignment);
+   ok = false;
+   }
+
+   if (node->color != color) {
+   pr_err("node has wrong color, found %lu, expected %lu\n",
+  node->color, color);
+   ok = false;
+   }
+
+   return ok;
+}
+
 static int igt_init(void *ignored)
 {
const unsigned int size = 4096;
@@ -452,6 +494,184 @@ static int igt_reserve(void *ignored)
return 0;
 }

+static bool expect_insert_fail(struct drm_mm *mm, u64 size)
+{
+   struct drm_mm_node tmp = {};
+   int err;
+
+   err = drm_mm_insert_node(mm, &tmp, size, 0, DRM_MM_SEARCH_DEFAULT);
+   if (err != -ENOSPC)  {
+   if (!err) {
+   pr_err("impossible insert succeeded, node %llu + 
%llu\n",
+  tmp.start, tmp.size);
+   drm_mm_remove_node(&tmp);
+   } else {
+   pr_err("impossible insert failed with wrong error %d 
[expected %d], size %llu\n",
+  err, -ENOSPC, size);
+   }
+   return false;
+   }
+
+   return true;
+}
+
+static const struct insert_mode {
+   const char *name;
+   unsigned int search_flags;
+   unsigned int create_flags;
+} insert_modes[] = {
+   { "default", DRM_MM_SEARCH_DEFAULT, DRM_MM_CREATE_DEFAULT },
+   { "top-down", DRM_MM_SEARCH_BELOW, DRM_MM_CREATE_TOP },
+   { "best", DRM_MM_SEARCH_BEST, DRM_MM_CREATE_DEFAULT },
+   {}
+};
+
+static int __igt_insert(unsigned int count, u64 size)
+{
+   RND_STATE(prng, random_seed);
+   const struct insert_mode *mode;
+   struct drm_mm mm;
+   struct drm_mm_node *nodes, *node, *next;
+   unsigned int *order, n, m, o = 0;
+   int ret, err;
+
+   /* Fill a range with lots of nodes, check it doesn't fail too early */
+
+   DRM_MM_BUG_ON(!count);
+   DRM_MM_BUG_ON(!size);
+
+   ret = -ENOMEM;
+   nodes = vzalloc(count * sizeof(*nodes));
+   if (!nodes)
+   goto err;
+
+   order = drm_random_order(count, &prng);
+   if (!order)
+   goto err_nodes;
+
+   ret = -EINVAL;
+   drm_mm_init(&mm, 0, count * size);
+
+   for (mode = insert_modes; mode->name; mode++) {
+   for (n = 0; n < count; n++) {
+   node = &nodes[n];
+   err = drm_mm_insert_node_generic(&mm, node, size, 0, n,
+mode->search_flags,
+mode->create_flags);
+   if (err || !assert_node(node, &mm, size, 0, n)) {
+   pr_err("%s insert failed, size %llu step %d\n",
+  mode->name, size, n);
+   ret = err ?: -EIN

[PATCH v2 24/40] drm: Fix kerneldoc for drm_mm_scan_remove_block()

2016-12-16 Thread Chris Wilson
The nodes must be removed in the *reverse* order. This is correct in the
overview, but backwards in the function description. Whilst here add
Intel's copyright statement and tweak some formatting.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/drm_mm.c | 34 ++
 include/drm/drm_mm.h | 19 +--
 2 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 2d02ab0925a9..ead164093ac7 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -1,6 +1,7 @@
 /**
  *
  * Copyright 2006 Tungsten Graphics, Inc., Bismarck, ND., USA.
+ * Copyright 2016 Intel Corporation
  * All Rights Reserved.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
@@ -31,9 +32,9 @@
  * class implementation for more advanced memory managers.
  *
  * Note that the algorithm used is quite simple and there might be substantial
- * performance gains if a smarter free list is implemented. Currently it is 
just an
- * unordered stack of free regions. This could easily be improved if an RB-tree
- * is used instead. At least if we expect heavy fragmentation.
+ * performance gains if a smarter free list is implemented. Currently it is
+ * just an unordered stack of free regions. This could easily be improved if
+ * an RB-tree is used instead. At least if we expect heavy fragmentation.
  *
  * Aligned allocations can also see improvement.
  *
@@ -67,7 +68,7 @@
  * where an object needs to be created which exactly matches the firmware's
  * scanout target. As long as the range is still free it can be inserted 
anytime
  * after the allocator is initialized, which helps with avoiding looped
- * depencies in the driver load sequence.
+ * dependencies in the driver load sequence.
  *
  * drm_mm maintains a stack of most recently freed holes, which of all
  * simplistic datastructures seems to be a fairly decent approach to clustering
@@ -78,14 +79,14 @@
  *
  * drm_mm supports a few features: Alignment and range restrictions can be
  * supplied. Further more every &drm_mm_node has a color value (which is just 
an
- * opaqua unsigned long) which in conjunction with a driver callback can be 
used
+ * opaque unsigned long) which in conjunction with a driver callback can be 
used
  * to implement sophisticated placement restrictions. The i915 DRM driver uses
  * this to implement guard pages between incompatible caching domains in the
  * graphics TT.
  *
- * Two behaviors are supported for searching and allocating: bottom-up and 
top-down.
- * The default is bottom-up. Top-down allocation can be used if the memory area
- * has different restrictions, or just to reduce fragmentation.
+ * Two behaviors are supported for searching and allocating: bottom-up and
+ * top-down. The default is bottom-up. Top-down allocation can be used if the
+ * memory area has different restrictions, or just to reduce fragmentation.
  *
  * Finally iteration helpers to walk all nodes and all holes are provided as 
are
  * some basic allocator dumpers for debugging.
@@ -510,7 +511,7 @@ EXPORT_SYMBOL(drm_mm_insert_node_in_range_generic);
  *
  * This just removes a node from its drm_mm allocator. The node does not need 
to
  * be cleared again before it can be re-inserted into this or any other drm_mm
- * allocator. It is a bug to call this function on a un-allocated node.
+ * allocator. It is a bug to call this function on a unallocated node.
  */
 void drm_mm_remove_node(struct drm_mm_node *node)
 {
@@ -689,16 +690,16 @@ EXPORT_SYMBOL(drm_mm_replace_node);
  * efficient when we simply start to select all objects from the tail of an LRU
  * until there's a suitable hole: Especially for big objects or nodes that
  * otherwise have special allocation constraints there's a good chance we evict
- * lots of (smaller) objects unecessarily.
+ * lots of (smaller) objects unnecessarily.
  *
  * The DRM range allocator supports this use-case through the scanning
  * interfaces. First a scan operation needs to be initialized with
- * drm_mm_init_scan() or drm_mm_init_scan_with_range(). The the driver adds
+ * drm_mm_init_scan() or drm_mm_init_scan_with_range(). The driver adds
  * objects to the roaster (probably by walking an LRU list, but this can be
  * freely implemented) until a suitable hole is found or there's no further
- * evitable object.
+ * evictable object.
  *
- * The the driver must walk through all objects again in exactly the reverse
+ * The driver must walk through all objects again in exactly the reverse
  * order to restore the allocator state. Note that while the allocator is used
  * in the scan mode no other operation is allowed.
  *
@@ -838,9 +839,10 @@ EXPORT_SYMBOL(drm_mm_scan_add_block);
  * drm_mm_scan_remove_block - remove a node from the scan list
  * @node: drm_mm_node to remove
  *
- * Nodes _must_ be removed in the exact same order

[PATCH v2 35/40] drm: Apply tight eviction scanning to color_adjust

2016-12-16 Thread Chris Wilson
Using mm->color_adjust makes the eviction scanner much trickier since we
don't know the actual neighbours of the target hole until after it is
created (after scanning is complete). To work out whether we need to
evict the neighbours because they impact upon the hole, we then have to
check the hole afterwards - requiring an extra step in the user of the
eviction scanner when they apply color_adjust.
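
The resulting driver-side pattern is roughly the sketch below (modelled on
the i915 hunk in this patch; obj, lru, scan_link, evict_one() and
node_to_obj() are hypothetical driver-side stand-ins, and the scan-init
signature shown is the one used elsewhere in this series):

    struct drm_mm_scan scan;
    struct drm_mm_node *mm_node;
    LIST_HEAD(scanned);   /* filled newest-first: already in reverse add order */
    LIST_HEAD(evict);

    drm_mm_scan_init_with_range(&scan, mm, size, alignment, color,
                                start, end);

    /* roll candidates from an LRU into the scan until a hole is found */
    list_for_each_entry(obj, lru, lru_link) {
            list_add(&obj->scan_link, &scanned);
            if (drm_mm_scan_add_block(&scan, &obj->node))
                    break;
    }

    /* every block added must be removed again before the allocator may be
     * used; the blocks reported as hits are the ones to evict */
    list_for_each_entry_safe(obj, tmp, &scanned, scan_link) {
            if (drm_mm_scan_remove_block(&scan, &obj->node))
                    list_move(&obj->scan_link, &evict);
            else
                    list_del(&obj->scan_link);
    }

    /* evict the scanner's selection ... */
    list_for_each_entry_safe(obj, tmp, &evict, scan_link)
            evict_one(obj);                 /* ends in drm_mm_remove_node() */

    /* ... and, new in this patch, any coloured neighbours that still
     * overlap the hole that was just created */
    while ((mm_node = drm_mm_scan_color_evict(&scan)) != NULL)
            evict_one(node_to_obj(mm_node));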

v2: Massage kerneldoc.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/drm_mm.c| 76 ++---
 drivers/gpu/drm/i915/i915_gem_evict.c   |  7 +++
 drivers/gpu/drm/selftests/test-drm_mm.c | 20 -
 include/drm/drm_mm.h|  1 +
 4 files changed, 77 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 91c89ae09b26..22db356e3ebc 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -692,19 +692,21 @@ EXPORT_SYMBOL(drm_mm_replace_node);
  * The DRM range allocator supports this use-case through the scanning
  * interfaces. First a scan operation needs to be initialized with
  * drm_mm_scan_init() or drm_mm_scan_init_with_range(). The driver adds
- * objects to the roaster (probably by walking an LRU list, but this can be
- * freely implemented) until a suitable hole is found or there's no further
- * evictable object.
+ * objects to the roster (probably by walking an LRU list, but this can be
+ * freely implemented) (using drm_mm_scan_add_block()) until a suitable hole
+ * is found or there are no further evictable objects.
  *
  * The driver must walk through all objects again in exactly the reverse
  * order to restore the allocator state. Note that while the allocator is used
  * in the scan mode no other operation is allowed.
  *
- * Finally the driver evicts all objects selected in the scan. Adding and
- * removing an object is O(1), and since freeing a node is also O(1) the 
overall
- * complexity is O(scanned_objects). So like the free stack which needs to be
- * walked before a scan operation even begins this is linear in the number of
- * objects. It doesn't seem to hurt badly.
+ * Finally the driver evicts all objects selected (drm_mm_scan_remove_block()
+ * reported true) in the scan, and any overlapping nodes after color adjustment
+ * (drm_mm_scan_evict_color()). Adding and removing an object is O(1), and
+ * since freeing a node is also O(1) the overall complexity is
+ * O(scanned_objects). So like the free stack which needs to be walked before a
+ * scan operation even begins this is linear in the number of objects. It
+ * doesn't seem to hurt too badly.
  */

 /**
@@ -829,23 +831,8 @@ bool drm_mm_scan_add_block(struct drm_mm_scan *scan,
}
}

-   if (mm->color_adjust) {
-   /* If allocations need adjusting due to neighbouring colours,
-* we do not have enough information to decide if we need
-* to evict nodes on either side of [adj_start, adj_end].
-* What almost works is
-* hit_start = adj_start + (hole_start - col_start);
-* hit_end = adj_start + scan->size + (hole_end - col_end);
-* but because the decision is only made on the final hole,
-* we may underestimate the required adjustments for an
-* interior allocation.
-*/
-   scan->hit_start = hole_start;
-   scan->hit_end = hole_end;
-   } else {
-   scan->hit_start = adj_start;
-   scan->hit_end = adj_start + scan->size;
-   }
+   scan->hit_start = adj_start;
+   scan->hit_end = adj_start + scan->size;

DRM_MM_BUG_ON(scan->hit_start >= scan->hit_end);
DRM_MM_BUG_ON(scan->hit_start < hole_start);
@@ -903,6 +890,45 @@ bool drm_mm_scan_remove_block(struct drm_mm_scan *scan,
 EXPORT_SYMBOL(drm_mm_scan_remove_block);

 /**
+ * drm_mm_scan_color_evict - evict overlapping nodes on either side of hole
+ * @scan: drm_mm scan with target hole
+ *
+ * After completing an eviction scan and removing the selected nodes, we may
+ * need to remove a few more nodes from either side of the target hole if
+ * mm.color_adjust is being used.
+ *
+ * Returns:
+ * A node to evict, or NULL if there are no overlapping nodes.
+ */
+struct drm_mm_node *drm_mm_scan_color_evict(struct drm_mm_scan *scan)
+{
+   struct drm_mm *mm = scan->mm;
+   struct drm_mm_node *hole;
+   u64 hole_start, hole_end;
+
+   DRM_MM_BUG_ON(list_empty(&mm->hole_stack));
+
+   if (!mm->color_adjust)
+   return NULL;
+
+   hole = list_first_entry(&mm->hole_stack, typeof(*hole), hole_stack);
+   hole_start = __drm_mm_hole_node_start(hole);
+   hole_end = __drm_mm_hole_node_end(hole);
+
+   DRM_MM_BUG_ON(hole_start > scan->hit_start);
+   DRM_MM_BUG_ON(hole_end < scan->hit_end);
+
+   mm->color_adjust(hole, scan->color, &hole_start, &hole_end);
+   if (ho

[PATCH v2 40/40] drm: kselftest for drm_mm and bottom-up allocation

2016-12-16 Thread Chris Wilson
Check that if we request bottom-up allocation from drm_mm_insert_node()
we receive the next available hole from the bottom.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |   1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 102 +++
 2 files changed, 103 insertions(+)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index 6a4575fdc1c0..37bbdac52896 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -17,6 +17,7 @@ selftest(align32, igt_align32)
 selftest(align64, igt_align64)
 selftest(evict, igt_evict)
 selftest(evict_range, igt_evict_range)
+selftest(bottomup, igt_bottomup)
 selftest(topdown, igt_topdown)
 selftest(color, igt_color)
 selftest(color_evict, igt_color_evict)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index 97468b8b33a0..081b5a1c565f 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -1653,6 +1653,108 @@ static int igt_topdown(void *ignored)
return ret;
 }

+static int igt_bottomup(void *ignored)
+{
+   RND_STATE(prng, random_seed);
+   const unsigned int count = 8192;
+   unsigned int size;
+   unsigned long *bitmap;
+   struct drm_mm mm;
+   struct drm_mm_node *nodes, *node, *next;
+   unsigned int *order, n, m, o = 0;
+   int ret, err;
+
+   /* Like igt_topdown, but instead of searching for the last hole,
+* we search for the first.
+*/
+
+   ret = -ENOMEM;
+   nodes = vzalloc(count * sizeof(*nodes));
+   if (!nodes)
+   goto err;
+
+   bitmap = kzalloc(count / BITS_PER_LONG * sizeof(unsigned long),
+GFP_TEMPORARY);
+   if (!bitmap)
+   goto err_nodes;
+
+   order = drm_random_order(count, &prng);
+   if (!order)
+   goto err_bitmap;
+
+   ret = -EINVAL;
+   for (size = 1; size <= 64; size <<= 1) {
+   drm_mm_init(&mm, 0, size*count);
+   for (n = 0; n < count; n++) {
+   err = drm_mm_insert_node_generic(&mm, &nodes[n],
+size, 0, n,
+DRM_MM_INSERT_LOW);
+   if (err || !assert_node(&nodes[n], &mm, size, 0, n)) {
+   pr_err("bottomup insert failed, size %u step 
%d\n", size, n);
+   ret = err ?: -EINVAL;
+   goto out;
+   }
+
+   if (!assert_one_hole(&mm, size*(n + 1), size*count))
+   goto out;
+   }
+
+   if (!assert_continuous(&mm, size))
+   goto out;
+
+   drm_random_reorder(order, count, &prng);
+   drm_for_each_prime(n, min(count, max_prime)) {
+   for (m = 0; m < n; m++) {
+   node = &nodes[order[(o + m) % count]];
+   drm_mm_remove_node(node);
+   __set_bit(node_index(node), bitmap);
+   }
+
+   for (m = 0; m < n; m++) {
+   unsigned int first;
+
+   node = &nodes[order[(o + m) % count]];
+   err = drm_mm_insert_node_generic(&mm, node, 
size, 0, 0,
+
DRM_MM_INSERT_LOW);
+   if (err || !assert_node(node, &mm, size, 0, 0)) 
{
+   pr_err("insert failed, step %d/%d\n", 
m, n);
+   ret = err ?: -EINVAL;
+   goto out;
+   }
+
+   first = find_first_bit(bitmap, count);
+   if (node_index(node) != first) {
+   pr_err("node %d/%d not inserted into 
bottom hole, expected %d, found %d\n",
+  m, n, first, node_index(node));
+   goto out;
+   }
+   __clear_bit(first, bitmap);
+   }
+
+   DRM_MM_BUG_ON(find_first_bit(bitmap, count) != count);
+
+   o += n;
+   }
+
+   drm_mm_for_each_node_safe(node, next, &mm)
+   drm_mm_remove_node(node);
+   DRM_MM_BUG_ON(!drm_mm_clean(&mm));
+   }
+
+   ret = 0;
+out:
+   drm_mm_for_each_node_safe(node, next, &mm)
+   drm_mm_remove_node(node);
+   drm_mm_takedown(&mm);
+   kfree(order);
+err_bitma

[PATCH v2 05/40] drm: Compile time enabling for asserts in drm_mm

2016-12-16 Thread Chris Wilson
Use CONFIG_DRM_DEBUG_MM to conditionally enable the internal consistency
and validation checks using BUG_ON. Ideally these paths should all be
exercised by CI selftests (with the asserts enabled).
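
The shape of the helper is presumably along these lines (a hedged sketch of
the macro this patch adds to include/drm/drm_mm.h; the exact fallback
expansion may differ):

    #ifdef CONFIG_DRM_DEBUG_MM
    #define DRM_MM_BUG_ON(expr) BUG_ON(expr)
    #else
    #define DRM_MM_BUG_ON(expr) BUILD_BUG_ON_INVALID(expr)
    #endif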

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/drm_mm.c | 45 +++--
 include/drm/drm_mm.h |  8 +++-
 2 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 7573661302a4..2b76167ef39f 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -237,7 +237,7 @@ static void drm_mm_insert_helper(struct drm_mm_node 
*hole_node,
u64 adj_start = hole_start;
u64 adj_end = hole_end;

-   BUG_ON(node->allocated);
+   DRM_MM_BUG_ON(node->allocated);

if (mm->color_adjust)
mm->color_adjust(hole_node, color, &adj_start, &adj_end);
@@ -258,8 +258,8 @@ static void drm_mm_insert_helper(struct drm_mm_node 
*hole_node,
}
}

-   BUG_ON(adj_start < hole_start);
-   BUG_ON(adj_end > hole_end);
+   DRM_MM_BUG_ON(adj_start < hole_start);
+   DRM_MM_BUG_ON(adj_end > hole_end);

if (adj_start == hole_start) {
hole_node->hole_follows = 0;
@@ -276,7 +276,7 @@ static void drm_mm_insert_helper(struct drm_mm_node 
*hole_node,

drm_mm_interval_tree_add_node(hole_node, node);

-   BUG_ON(node->start + node->size > adj_end);
+   DRM_MM_BUG_ON(node->start + node->size > adj_end);

node->hole_follows = 0;
if (__drm_mm_hole_node_start(node) < hole_end) {
@@ -409,7 +409,7 @@ static void drm_mm_insert_helper_range(struct drm_mm_node 
*hole_node,
u64 adj_start = hole_start;
u64 adj_end = hole_end;

-   BUG_ON(!hole_node->hole_follows || node->allocated);
+   DRM_MM_BUG_ON(!hole_node->hole_follows || node->allocated);

if (adj_start < start)
adj_start = start;
@@ -450,10 +450,10 @@ static void drm_mm_insert_helper_range(struct drm_mm_node 
*hole_node,

drm_mm_interval_tree_add_node(hole_node, node);

-   BUG_ON(node->start < start);
-   BUG_ON(node->start < adj_start);
-   BUG_ON(node->start + node->size > adj_end);
-   BUG_ON(node->start + node->size > end);
+   DRM_MM_BUG_ON(node->start < start);
+   DRM_MM_BUG_ON(node->start < adj_start);
+   DRM_MM_BUG_ON(node->start + node->size > adj_end);
+   DRM_MM_BUG_ON(node->start + node->size > end);

node->hole_follows = 0;
if (__drm_mm_hole_node_start(node) < hole_end) {
@@ -519,22 +519,21 @@ void drm_mm_remove_node(struct drm_mm_node *node)
struct drm_mm *mm = node->mm;
struct drm_mm_node *prev_node;

-   if (WARN_ON(!node->allocated))
-   return;
-
-   BUG_ON(node->scanned_block || node->scanned_prev_free
-  || node->scanned_next_free);
+   DRM_MM_BUG_ON(!node->allocated);
+   DRM_MM_BUG_ON(node->scanned_block ||
+ node->scanned_prev_free ||
+ node->scanned_next_free);

prev_node =
list_entry(node->node_list.prev, struct drm_mm_node, node_list);

if (node->hole_follows) {
-   BUG_ON(__drm_mm_hole_node_start(node) ==
-  __drm_mm_hole_node_end(node));
+   DRM_MM_BUG_ON(__drm_mm_hole_node_start(node) ==
+ __drm_mm_hole_node_end(node));
list_del(&node->hole_stack);
} else
-   BUG_ON(__drm_mm_hole_node_start(node) !=
-  __drm_mm_hole_node_end(node));
+   DRM_MM_BUG_ON(__drm_mm_hole_node_start(node) !=
+ __drm_mm_hole_node_end(node));


if (!prev_node->hole_follows) {
@@ -578,7 +577,7 @@ static struct drm_mm_node *drm_mm_search_free_generic(const 
struct drm_mm *mm,
u64 adj_end;
u64 best_size;

-   BUG_ON(mm->scanned_blocks);
+   DRM_MM_BUG_ON(mm->scanned_blocks);

best = NULL;
best_size = ~0UL;
@@ -622,7 +621,7 @@ static struct drm_mm_node 
*drm_mm_search_free_in_range_generic(const struct drm_
u64 adj_end;
u64 best_size;

-   BUG_ON(mm->scanned_blocks);
+   DRM_MM_BUG_ON(mm->scanned_blocks);

best = NULL;
best_size = ~0UL;
@@ -668,6 +667,8 @@ static struct drm_mm_node 
*drm_mm_search_free_in_range_generic(const struct drm_
  */
 void drm_mm_replace_node(struct drm_mm_node *old, struct drm_mm_node *new)
 {
+   DRM_MM_BUG_ON(!old->allocated);
+
list_replace(&old->node_list, &new->node_list);
list_replace(&old->hole_stack, &new->hole_stack);
rb_replace_node(&old->rb, &new->rb, &old->mm->interval_tree);
@@ -798,7 +799,7 @@ bool drm_mm_scan_add_block(struct drm_mm_node *node)

mm->scanned_blocks++;

-   BUG_ON(node->scanned_block);
+   DRM_MM_BUG_ON(node->scanned_block);
node->

[PATCH v2 27/40] drm: Add asserts to catch overflow in drm_mm_init() and drm_mm_init_scan()

2016-12-16 Thread Chris Wilson
A simple assert to ensure that we don't overflow start + size when
initialising the drm_mm, or its scanner.

In future, we may want to switch to tracking the value of ranges (rather
than size) so that we can cover the full u64, for example like resource
tracking.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/drm_mm.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 14a5ef505f1b..57267845b7d4 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -729,6 +729,8 @@ void drm_mm_init_scan(struct drm_mm *mm,
  u64 alignment,
  unsigned long color)
 {
+   DRM_MM_BUG_ON(size == 0);
+
mm->scan_color = color;
mm->scan_alignment = alignment;
mm->scan_size = size;
@@ -764,6 +766,9 @@ void drm_mm_init_scan_with_range(struct drm_mm *mm,
 u64 start,
 u64 end)
 {
+   DRM_MM_BUG_ON(start >= end);
+   DRM_MM_BUG_ON(size == 0 || size > end - start);
+
mm->scan_color = color;
mm->scan_alignment = alignment;
mm->scan_size = size;
@@ -882,6 +887,8 @@ EXPORT_SYMBOL(drm_mm_scan_remove_block);
  */
 void drm_mm_init(struct drm_mm *mm, u64 start, u64 size)
 {
+   DRM_MM_BUG_ON(start + size <= start);
+
INIT_LIST_HEAD(&mm->hole_stack);
mm->scanned_blocks = 0;

-- 
2.11.0



[PATCH v2 17/40] drm: kselftest for drm_mm and range restricted eviction

2016-12-16 Thread Chris Wilson
Check that we can add arbitrary blocks to a restricted eviction scanner in
order to find the first minimal hole that matches our request.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |   1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 113 ++-
 2 files changed, 110 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index a31b4458c7eb..965aca65c160 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -16,3 +16,4 @@ selftest(align, igt_align)
 selftest(align32, igt_align32)
 selftest(align64, igt_align64)
 selftest(evict, igt_evict)
+selftest(evict_range, igt_evict_range)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index 4881752d6424..6c4679304358 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -1249,6 +1249,7 @@ static bool evict_everything(struct drm_mm *mm,
 }

 static int evict_something(struct drm_mm *mm,
+  u64 range_start, u64 range_end,
   struct evict_node *nodes,
   unsigned int *order,
   unsigned int count,
@@ -1261,7 +1262,9 @@ static int evict_something(struct drm_mm *mm,
struct drm_mm_node tmp;
int err;

-   drm_mm_init_scan(mm, size, alignment, 0);
+   drm_mm_init_scan_with_range(mm,
+   size, alignment, 0,
+   range_start, range_end);
if (!evict_nodes(mm,
 nodes, order, count,
 &evict_list))
@@ -1279,6 +1282,12 @@ static int evict_something(struct drm_mm *mm,
return err;
}

+   if (tmp.start < range_start || tmp.start + tmp.size > range_end) {
+   pr_err("Inserted [address=%llu + %llu] did not fit into the 
request range [%llu, %llu]\n",
+  tmp.start, tmp.size, range_start, range_end);
+   err = -EINVAL;
+   }
+
if (!assert_node(&tmp, mm, size, alignment, 0) || tmp.hole_follows) {
pr_err("Inserted did not fill the eviction hole: size=%lld 
[%d], align=%d [rem=%lld], start=%llx, hole-follows?=%d\n",
   tmp.size, size,
@@ -1360,7 +1369,7 @@ static int igt_evict(void *ignored)
for (mode = evict_modes; mode->name; mode++) {
for (n = 1; n <= size; n <<= 1) {
drm_random_reorder(order, size, &prng);
-   err = evict_something(&mm,
+   err = evict_something(&mm, 0, U64_MAX,
  nodes, order, size,
  n, 1,
  mode);
@@ -1374,7 +1383,7 @@ static int igt_evict(void *ignored)

for (n = 1; n < size; n <<= 1) {
drm_random_reorder(order, size, &prng);
-   err = evict_something(&mm,
+   err = evict_something(&mm, 0, U64_MAX,
  nodes, order, size,
  size/2, n,
  mode);
@@ -1392,7 +1401,7 @@ static int igt_evict(void *ignored)
DRM_MM_BUG_ON(!nsize);

drm_random_reorder(order, size, &prng);
-   err = evict_something(&mm,
+   err = evict_something(&mm, 0, U64_MAX,
  nodes, order, size,
  nsize, n,
  mode);
@@ -1417,6 +1426,102 @@ static int igt_evict(void *ignored)
return ret;
 }

+static int igt_evict_range(void *ignored)
+{
+   RND_STATE(prng, random_seed);
+   const unsigned int size = 8192;
+   const unsigned int range_size = size / 2;
+   const unsigned int range_start = size / 4;
+   const unsigned int range_end = range_start + range_size;
+   const struct insert_mode *mode;
+   struct drm_mm mm;
+   struct evict_node *nodes;
+   struct drm_mm_node *node, *next;
+   unsigned int *order, n;
+   int ret, err;
+
+   /* Like igt_evict() but now we are limiting the search to a
+* small portion of the full drm_mm.
+*/
+
+   ret = -ENOMEM;
+   nodes = vzalloc(size * sizeof(*nodes));
+   if (!nodes)
+   goto err;
+
+   order = drm_random_order(size, &prng);
+   if (!order)
+   goto err_nodes;
+
+   ret = -EINVAL;
+   drm_mm_init(&mm, 0, size);
+   for (n = 0; n < size; n++) {
+   err = drm_mm_insert_node(&mm, &nodes[n].node, 1, 0,
+

[PATCH v2 13/40] drm: kselftest for drm_mm_replace_node()

2016-12-16 Thread Chris Wilson
Reuse drm_mm_insert_node() with a temporary node to exercise
drm_mm_replace_node(). We use the previous test in order to exercise the
various lists following replacement.

v2: Check that we copy across the important (user) details of the node.
For the internal details (such as lists and hole tracking) we rely on the
exercise itself to flush out errors.
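
The pattern under test, condensed (a sketch using the APIs exercised below;
error handling trimmed, and the surrounding variables are assumed):

    struct drm_mm_node tmp = {}, *node = &nodes[n];
    int err;

    /* allocate into a placeholder node first ... */
    err = drm_mm_insert_node_generic(mm, &tmp, size, 0, n,
                                     DRM_MM_SEARCH_DEFAULT,
                                     DRM_MM_CREATE_DEFAULT);
    if (err)
            return err;

    /* ... then hand the allocation over to the long-lived node: tmp is no
     * longer allocated afterwards, and node inherits start/size/colour */
    drm_mm_replace_node(&tmp, node);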

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |  1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 64 +---
 2 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index 727c6d7255e0..dca726baa65d 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -10,3 +10,4 @@ selftest(init, igt_init)
 selftest(debug, igt_debug)
 selftest(reserve, igt_reserve)
 selftest(insert, igt_insert)
+selftest(replace, igt_replace)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index 9621820a1cda..dca146478b06 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -526,7 +526,7 @@ static const struct insert_mode {
{}
 };

-static int __igt_insert(unsigned int count, u64 size)
+static int __igt_insert(unsigned int count, u64 size, bool replace)
 {
RND_STATE(prng, random_seed);
const struct insert_mode *mode;
@@ -541,7 +541,7 @@ static int __igt_insert(unsigned int count, u64 size)
DRM_MM_BUG_ON(!size);

ret = -ENOMEM;
-   nodes = vzalloc(count * sizeof(*nodes));
+   nodes = vmalloc(count * sizeof(*nodes));
if (!nodes)
goto err;

@@ -554,7 +554,10 @@ static int __igt_insert(unsigned int count, u64 size)

for (mode = insert_modes; mode->name; mode++) {
for (n = 0; n < count; n++) {
-   node = &nodes[n];
+   struct drm_mm_node tmp;
+
+   node = replace ? &tmp : &nodes[n];
+   memset(node, 0, sizeof(*node));
err = drm_mm_insert_node_generic(&mm, node, size, 0, n,
 mode->search_flags,
 mode->create_flags);
@@ -564,6 +567,28 @@ static int __igt_insert(unsigned int count, u64 size)
ret = err ?: -EINVAL;
goto out;
}
+
+   if (replace) {
+   drm_mm_replace_node(&tmp, &nodes[n]);
+   if (drm_mm_node_allocated(&tmp)) {
+   pr_err("replaced old-node still 
allocated! step %d\n",
+  n);
+   goto out;
+   }
+
+   if (!assert_node(&nodes[n], &mm, size, 0, n)) {
+   pr_err("replaced node did not inherit 
parameters, size %llu step %d\n",
+  size, n);
+   goto out;
+   }
+
+   if (tmp.start != nodes[n].start) {
+   pr_err("replaced node mismatch location 
expected [%llx + %llx], found [%llx + %llx]\n",
+  tmp.start, size,
+  nodes[n].start, nodes[n].size);
+   goto out;
+   }
+   }
}

/* After random insertion the nodes should be in order */
@@ -656,17 +681,44 @@ static int igt_insert(void *ignored)
drm_for_each_prime(n, 54) {
u64 size = BIT_ULL(n);

-   ret = __igt_insert(count, size - 1);
+   ret = __igt_insert(count, size - 1, false);
if (ret)
return ret;

-   ret = __igt_insert(count, size);
+   ret = __igt_insert(count, size, false);
+   if (ret)
+   return ret;
+
+   ret = __igt_insert(count, size + 1, false);
+   }
+
+   return 0;
+}
+
+static int igt_replace(void *ignored)
+{
+   const unsigned int count = min_t(unsigned int, BIT(10), max_iterations);
+   unsigned int n;
+   int ret;
+
+   /* Reuse igt_insert to exercise replacement by inserting a dummy node,
+* then replacing it with the intended node. We want to check that
+* the tree is intact and all the information we need is carried
+* across to the target node.
+*/
+
+   drm_for_each_prime(n, 54) {
+   u64 size = BIT_ULL(n);
+
+   ret = _

[PATCH v2 39/40] drm: Improve drm_mm search (and fix topdown allocation) with rbtrees

2016-12-16 Thread Chris Wilson
The drm_mm range manager claimed to support top-down insertion, but it
was neither searching for the top-most hole that could fit the
allocation request nor fitting the request to the hole correctly.

In order to search the range efficiently, we create a secondary index
for the holes using either their size or their address. This index
allows us to find the smallest hole or the hole at the bottom or top of
the range efficiently, whilst keeping the hole stack to rapidly service
evictions.
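
With the new index in place, callers pick the placement policy with a single
mode value; a hedged sketch mirroring the amdgpu conversion in the diff
below (place, mem, mgr and node come from the surrounding TTM code):

    enum drm_mm_insert_mode mode = DRM_MM_INSERT_BEST; /* smallest fitting hole */

    if (place->flags & TTM_PL_FLAG_TOPDOWN)
            mode = DRM_MM_INSERT_HIGH;  /* hole at the top of the range;
                                         * DRM_MM_INSERT_LOW is the
                                         * bottom-up counterpart */

    r = drm_mm_insert_node_in_range(&mgr->mm, node,
                                    mem->num_pages, mem->page_alignment, 0,
                                    fpfn, lpfn, mode);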

v2: Search for holes both high and low. Rename flags to mode.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c  |  16 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c |  20 +-
 drivers/gpu/drm/armada/armada_gem.c  |   4 +-
 drivers/gpu/drm/drm_mm.c | 507 +++
 drivers/gpu/drm/drm_vma_manager.c|   3 +-
 drivers/gpu/drm/etnaviv/etnaviv_mmu.c|   8 +-
 drivers/gpu/drm/i915/i915_gem.c  |  10 +-
 drivers/gpu/drm/i915/i915_gem_evict.c|   9 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |   5 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c  |  39 +--
 drivers/gpu/drm/i915/i915_gem_stolen.c   |   6 +-
 drivers/gpu/drm/msm/msm_gem.c|   3 +-
 drivers/gpu/drm/msm/msm_gem_vma.c|   3 +-
 drivers/gpu/drm/selftests/test-drm_mm.c  |  97 ++---
 drivers/gpu/drm/sis/sis_mm.c |   6 +-
 drivers/gpu/drm/tegra/gem.c  |   4 +-
 drivers/gpu/drm/ttm/ttm_bo_manager.c |  18 +-
 drivers/gpu/drm/vc4/vc4_crtc.c   |   2 +-
 drivers/gpu/drm/vc4/vc4_hvs.c|   3 +-
 drivers/gpu/drm/vc4/vc4_plane.c  |   6 +-
 drivers/gpu/drm/via/via_mm.c |   4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c   |  10 +-
 include/drm/drm_mm.h | 135 +++
 23 files changed, 442 insertions(+), 476 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index 00f46b0e076d..d841fcb2e709 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -97,8 +97,7 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man,
 {
struct amdgpu_gtt_mgr *mgr = man->priv;
struct drm_mm_node *node = mem->mm_node;
-   enum drm_mm_search_flags sflags = DRM_MM_SEARCH_BEST;
-   enum drm_mm_allocator_flags aflags = DRM_MM_CREATE_DEFAULT;
+   enum drm_mm_insert_mode mode;
unsigned long fpfn, lpfn;
int r;

@@ -115,15 +114,14 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man,
else
lpfn = man->size;

-   if (place && place->flags & TTM_PL_FLAG_TOPDOWN) {
-   sflags = DRM_MM_SEARCH_BELOW;
-   aflags = DRM_MM_CREATE_TOP;
-   }
+   mode = DRM_MM_INSERT_BEST;
+   if (place && place->mode & TTM_PL_FLAG_TOPDOWN)
+   mode = DRM_MM_INSERT_HIGH;

spin_lock(&mgr->lock);
-   r = drm_mm_insert_node_in_range_generic(&mgr->mm, node, mem->num_pages,
-   mem->page_alignment, 0,
-   fpfn, lpfn, sflags, aflags);
+   r = drm_mm_insert_node_in_range(&mgr->mm, node,
+   mem->num_pages, mem->page_alignment, 0,
+   fpfn, lpfn, mode);
spin_unlock(&mgr->lock);

if (!r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index d710226a0fff..5f106ad815ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -97,8 +97,7 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager 
*man,
struct amdgpu_vram_mgr *mgr = man->priv;
struct drm_mm *mm = &mgr->mm;
struct drm_mm_node *nodes;
-   enum drm_mm_search_flags sflags = DRM_MM_SEARCH_DEFAULT;
-   enum drm_mm_allocator_flags aflags = DRM_MM_CREATE_DEFAULT;
+   enum drm_mm_insert_mode mode;
unsigned long lpfn, num_nodes, pages_per_node, pages_left;
unsigned i;
int r;
@@ -121,10 +120,9 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager 
*man,
if (!nodes)
return -ENOMEM;

-   if (place->flags & TTM_PL_FLAG_TOPDOWN) {
-   sflags = DRM_MM_SEARCH_BELOW;
-   aflags = DRM_MM_CREATE_TOP;
-   }
+   mode = DRM_MM_INSERT_BEST;
+   if (place->flags & TTM_PL_FLAG_TOPDOWN)
+   mode = DRM_MM_INSERT_HIGH;

pages_left = mem->num_pages;

@@ -135,13 +133,11 @@ static int amdgpu_vram_mgr_new(struct 
ttm_mem_type_manager *man,

if (pages == pages_per_node)
alignment = pages_per_node;
-   else
-   sflags |= DRM_MM_SEARCH_BEST;

-   r = drm_mm_insert_node_in

[PATCH v2 36/40] drm: Wrap drm_mm_node.hole_follows

2016-12-16 Thread Chris Wilson
Insulate users from changes to the internal hole tracking within
struct drm_mm_node by using an accessor for hole_follows.
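
The accessor itself is a trivial inline wrapper, presumably along the lines
of:

    static inline bool drm_mm_hole_follows(const struct drm_mm_node *node)
    {
            return node->hole_follows;
    }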

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/drm_mm.c| 12 ++--
 drivers/gpu/drm/i915/i915_vma.c |  4 ++--
 drivers/gpu/drm/selftests/test-drm_mm.c | 18 ++
 include/drm/drm_mm.h| 22 +++---
 4 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 22db356e3ebc..da9f98690e97 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -323,7 +323,7 @@ int drm_mm_reserve_node(struct drm_mm *mm, struct 
drm_mm_node *node)
}

hole = list_last_entry(&hole->node_list, typeof(*hole), node_list);
-   if (!hole->hole_follows)
+   if (!drm_mm_hole_follows(hole))
return -ENOSPC;

adj_start = hole_start = __drm_mm_hole_node_start(hole);
@@ -408,7 +408,7 @@ static void drm_mm_insert_helper_range(struct drm_mm_node 
*hole_node,
u64 adj_start = hole_start;
u64 adj_end = hole_end;

-   DRM_MM_BUG_ON(!hole_node->hole_follows || node->allocated);
+   DRM_MM_BUG_ON(!drm_mm_hole_follows(hole_node) || node->allocated);

if (adj_start < start)
adj_start = start;
@@ -523,16 +523,16 @@ void drm_mm_remove_node(struct drm_mm_node *node)
prev_node =
list_entry(node->node_list.prev, struct drm_mm_node, node_list);

-   if (node->hole_follows) {
+   if (drm_mm_hole_follows(node)) {
DRM_MM_BUG_ON(__drm_mm_hole_node_start(node) ==
  __drm_mm_hole_node_end(node));
list_del(&node->hole_stack);
-   } else
+   } else {
DRM_MM_BUG_ON(__drm_mm_hole_node_start(node) !=
  __drm_mm_hole_node_end(node));
+   }

-
-   if (!prev_node->hole_follows) {
+   if (!drm_mm_hole_follows(prev_node)) {
prev_node->hole_follows = 1;
list_add(&prev_node->hole_stack, &mm->hole_stack);
} else
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index f709c9b76358..34374c4133b5 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -322,11 +322,11 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, 
unsigned long cache_level)
GEM_BUG_ON(list_empty(&node->node_list));

other = list_prev_entry(node, node_list);
-   if (color_differs(other, cache_level) && !other->hole_follows)
+   if (color_differs(other, cache_level) && !drm_mm_hole_follows(other))
return false;

other = list_next_entry(node, node_list);
-   if (color_differs(other, cache_level) && !node->hole_follows)
+   if (color_differs(other, cache_level) && !drm_mm_hole_follows(node))
return false;

return true;
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index 9899e8364350..a9ed018aac12 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -42,7 +42,7 @@ static bool assert_no_holes(const struct drm_mm *mm)
}

drm_mm_for_each_node(hole, mm) {
-   if (hole->hole_follows) {
+   if (drm_mm_hole_follows(hole)) {
pr_err("Hole follows node, expected none!\n");
return false;
}
@@ -104,7 +104,7 @@ static bool assert_continuous(const struct drm_mm *mm, u64 
size)
return false;
}

-   if (node->hole_follows) {
+   if (drm_mm_hole_follows(node)) {
pr_err("node[%ld] is followed by a hole!\n", n);
return false;
}
@@ -787,7 +787,8 @@ static bool assert_contiguous_in_range(struct drm_mm *mm,
return false;
}

-   if (node->hole_follows && drm_mm_hole_node_end(node) < end) {
+   if (drm_mm_hole_follows(node) &&
+   drm_mm_hole_node_end(node) < end) {
pr_err("node %d is followed by a hole!\n", n);
return false;
}
@@ -1307,11 +1308,12 @@ static int evict_something(struct drm_mm *mm,
err = -EINVAL;
}

-   if (!assert_node(&tmp, mm, size, alignment, 0) || tmp.hole_follows) {
+   if (!assert_node(&tmp, mm, size, alignment, 0) ||
+   drm_mm_hole_follows(&tmp)) {
pr_err("Inserted did not fill the eviction hole: size=%lld 
[%d], align=%d [rem=%lld], start=%llx, hole-follows?=%d\n",
   tmp.size, size,
   alignment, misaligned(&tmp, alignment),
-  tmp.start, tmp.hole_follows);
+  tmp.start, drm_mm_hole_follows(&tmp));
   

[PATCH v2 01/40] drm/i915: Use the MRU stack search after evicting

2016-12-16 Thread Chris Wilson
When we evict from the GTT to make room for an object, the hole we
create is put onto the MRU stack inside the drm_mm range manager. On the
next search pass, we can speed up a PIN_HIGH allocation by referencing
that stack for the new hole.

v2: Pull together the 3 identical implementations (ahem, a couple were
outdated) into a common routine for allocating a node and evicting as
necessary.
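
The consolidated helper (i915_gem_gtt_insert(), as called in the hunks
below) roughly wraps the old open-coded insert/evict/retry pattern; this is
a sketch of the shape only, with the PIN_HIGH/PIN_MAPPABLE handling and the
exact search flags omitted:

    int i915_gem_gtt_insert(struct i915_address_space *vm,
                            struct drm_mm_node *node,
                            u64 size, u64 alignment, unsigned long color,
                            u64 start, u64 end, unsigned int flags)
    {
            int err;

            /* try a direct insertion first, reusing the MRU hole stack */
            err = drm_mm_insert_node_in_range_generic(&vm->mm, node,
                                                      size, alignment, color,
                                                      start, end,
                                                      DRM_MM_SEARCH_DEFAULT,
                                                      DRM_MM_CREATE_DEFAULT);
            if (err != -ENOSPC)
                    return err;

            /* make room, then retry the insertion once */
            err = i915_gem_evict_something(vm, size, alignment, color,
                                           start, end, flags);
            if (err)
                    return err;

            return drm_mm_insert_node_in_range_generic(&vm->mm, node,
                                                       size, alignment, color,
                                                       start, end,
                                                       DRM_MM_SEARCH_DEFAULT,
                                                       DRM_MM_CREATE_DEFAULT);
    }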

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen  #v1
---
 drivers/gpu/drm/i915/gvt/aperture_gm.c | 33 +---
 drivers/gpu/drm/i915/i915_gem_gtt.c| 72 --
 drivers/gpu/drm/i915/i915_gem_gtt.h|  5 +++
 drivers/gpu/drm/i915/i915_vma.c| 40 ++-
 4 files changed, 70 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/aperture_gm.c 
b/drivers/gpu/drm/i915/gvt/aperture_gm.c
index 7d33b607bc89..1bb7a5b80d47 100644
--- a/drivers/gpu/drm/i915/gvt/aperture_gm.c
+++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c
@@ -48,47 +48,34 @@ static int alloc_gm(struct intel_vgpu *vgpu, bool high_gm)
 {
struct intel_gvt *gvt = vgpu->gvt;
struct drm_i915_private *dev_priv = gvt->dev_priv;
-   u32 alloc_flag, search_flag;
+   unsigned int flags;
u64 start, end, size;
struct drm_mm_node *node;
-   int retried = 0;
int ret;

if (high_gm) {
-   search_flag = DRM_MM_SEARCH_BELOW;
-   alloc_flag = DRM_MM_CREATE_TOP;
node = &vgpu->gm.high_gm_node;
size = vgpu_hidden_sz(vgpu);
start = gvt_hidden_gmadr_base(gvt);
end = gvt_hidden_gmadr_end(gvt);
+   flags = PIN_HIGH;
} else {
-   search_flag = DRM_MM_SEARCH_DEFAULT;
-   alloc_flag = DRM_MM_CREATE_DEFAULT;
node = &vgpu->gm.low_gm_node;
size = vgpu_aperture_sz(vgpu);
start = gvt_aperture_gmadr_base(gvt);
end = gvt_aperture_gmadr_end(gvt);
+   flags = PIN_MAPPABLE;
}

mutex_lock(&dev_priv->drm.struct_mutex);
-search_again:
-   ret = drm_mm_insert_node_in_range_generic(&dev_priv->ggtt.base.mm,
- node, size, 4096,
- I915_COLOR_UNEVICTABLE,
- start, end, search_flag,
- alloc_flag);
-   if (ret) {
-   ret = i915_gem_evict_something(&dev_priv->ggtt.base,
-  size, 4096,
-  I915_COLOR_UNEVICTABLE,
-  start, end, 0);
-   if (ret == 0 && ++retried < 3)
-   goto search_again;
-
-   gvt_err("fail to alloc %s gm space from host, retried %d\n",
-   high_gm ? "high" : "low", retried);
-   }
+   ret = i915_gem_gtt_insert(&dev_priv->ggtt.base, node,
+ size, 4096, I915_COLOR_UNEVICTABLE,
+ start, end, flags);
mutex_unlock(&dev_priv->drm.struct_mutex);
+   if (ret)
+   gvt_err("fail to alloc %s gm space from host\n",
+   high_gm ? "high" : "low");
+
return ret;
 }

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index ef00d36680c9..4543d7fa7fc2 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2056,7 +2056,6 @@ static int gen6_ppgtt_allocate_page_directories(struct 
i915_hw_ppgtt *ppgtt)
struct i915_address_space *vm = &ppgtt->base;
struct drm_i915_private *dev_priv = ppgtt->base.i915;
struct i915_ggtt *ggtt = &dev_priv->ggtt;
-   bool retried = false;
int ret;

/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
@@ -2069,29 +2068,14 @@ static int gen6_ppgtt_allocate_page_directories(struct 
i915_hw_ppgtt *ppgtt)
if (ret)
return ret;

-alloc:
-   ret = drm_mm_insert_node_in_range_generic(&ggtt->base.mm, &ppgtt->node,
- GEN6_PD_SIZE, GEN6_PD_ALIGN,
- I915_COLOR_UNEVICTABLE,
- 0, ggtt->base.total,
- DRM_MM_TOPDOWN);
-   if (ret == -ENOSPC && !retried) {
-   ret = i915_gem_evict_something(&ggtt->base,
-  GEN6_PD_SIZE, GEN6_PD_ALIGN,
-  I915_COLOR_UNEVICTABLE,
-  0, ggtt->base.total,
-  0);
-   if (ret)
-   goto err_out;
-
-   retried = true;
-

[PATCH v2 18/40] drm: kselftest for drm_mm and top-down allocation

2016-12-16 Thread Chris Wilson
Check that if we request top-down allocation from drm_mm_insert_node()
we receive the next available hole from the top.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |   1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 123 +++
 2 files changed, 124 insertions(+)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index 965aca65c160..cd508e3d6538 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -17,3 +17,4 @@ selftest(align32, igt_align32)
 selftest(align64, igt_align64)
 selftest(evict, igt_evict)
 selftest(evict_range, igt_evict_range)
+selftest(topdown, igt_topdown)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index 6c4679304358..fc76d787cfd4 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -1522,6 +1522,129 @@ static int igt_evict_range(void *ignored)
return ret;
 }

+static unsigned int node_index(const struct drm_mm_node *node)
+{
+   return div64_u64(node->start, node->size);
+}
+
+static int igt_topdown(void *ignored)
+{
+   RND_STATE(prng, random_seed);
+   const unsigned int count = 8192;
+   unsigned int size;
+   unsigned long *bitmap = NULL;
+   struct drm_mm mm;
+   struct drm_mm_node *nodes, *node, *next;
+   unsigned int *order, n, m, o = 0;
+   int ret, err;
+
+   /* When allocating top-down, we expect to be returned a node
+* from a suitable hole at the top of the drm_mm. We check that
+* the returned node does match the highest available slot.
+*/
+
+   ret = -ENOMEM;
+   nodes = vzalloc(count * sizeof(*nodes));
+   if (!nodes)
+   goto err;
+
+   bitmap = kzalloc(count / BITS_PER_LONG * sizeof(unsigned long),
+GFP_TEMPORARY);
+   if (!bitmap)
+   goto err_nodes;
+
+   order = drm_random_order(count, &prng);
+   if (!order)
+   goto err_bitmap;
+
+   ret = -EINVAL;
+   for (size = 1; size <= 64; size <<= 1) {
+   drm_mm_init(&mm, 0, size*count);
+   for (n = 0; n < count; n++) {
+   err = drm_mm_insert_node_generic(&mm,
+&nodes[n], size, 0, n,
+DRM_MM_SEARCH_BELOW,
+DRM_MM_CREATE_TOP);
+   if (err || !assert_node(&nodes[n], &mm, size, 0, n)) {
+   pr_err("insert failed, size %u step %d\n", 
size, n);
+   ret = err ?: -EINVAL;
+   goto out;
+   }
+
+   if (nodes[n].hole_follows) {
+   pr_err("hole after topdown insert %d, 
start=%llx\n, size=%u",
+  n, nodes[n].start, size);
+   goto out;
+   }
+
+   if (!assert_one_hole(&mm, 0, size*(count - n - 1)))
+   goto out;
+   }
+
+   if (!assert_continuous(&mm, size))
+   goto out;
+
+   drm_random_reorder(order, count, &prng);
+   drm_for_each_prime(n, min(count, max_prime)) {
+   for (m = 0; m < n; m++) {
+   node = &nodes[order[(o + m) % count]];
+   drm_mm_remove_node(node);
+   __set_bit(node_index(node), bitmap);
+   }
+
+   for (m = 0; m < n; m++) {
+   unsigned int last;
+
+   node = &nodes[order[(o + m) % count]];
+   err = drm_mm_insert_node_generic(&mm, node, 
size, 0, 0,
+
DRM_MM_SEARCH_BELOW,
+
DRM_MM_CREATE_TOP);
+   if (err || !assert_node(node, &mm, size, 0, 0)) 
{
+   pr_err("insert failed, step %d/%d\n", 
m, n);
+   ret = err ?: -EINVAL;
+   goto out;
+   }
+
+   if (node->hole_follows) {
+   pr_err("hole after topdown insert 
%d/%d, start=%llx\n",
+  m, n, node->start);
+   goto out;
+   }
+
+   last = find_last_bit(bitmap, count);
+   if (node_index

[PATCH v2 30/40] drm: Unconditionally do the range check in drm_mm_scan_add_block()

2016-12-16 Thread Chris Wilson
Doing the check is trivial (low cost in comparison to overall eviction)
and helps simplify the code.
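
The simplification works because an unrestricted scan is just the range
[0, U64_MAX]; presumably drm_mm_scan_init() becomes a thin wrapper along
these lines, making the clamp in drm_mm_scan_add_block() a no-op for it:

    static inline void drm_mm_scan_init(struct drm_mm_scan *scan,
                                        struct drm_mm *mm,
                                        u64 size, u64 alignment,
                                        unsigned long color)
    {
            drm_mm_scan_init_with_range(scan, mm, size, alignment, color,
                                        0, U64_MAX);
    }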

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/drm_mm.c  | 53 +++
 drivers/gpu/drm/i915/i915_gem_evict.c | 10 ++-
 include/drm/drm_mm.h  | 33 ++
 3 files changed, 34 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 6bb61f2212f8..04b1d36c4ebc 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -710,46 +710,6 @@ EXPORT_SYMBOL(drm_mm_replace_node);
  */

 /**
- * drm_mm_scan_init - initialize lru scanning
- * @scan: scan state
- * @mm: drm_mm to scan
- * @size: size of the allocation
- * @alignment: alignment of the allocation
- * @color: opaque tag value to use for the allocation
- *
- * This simply sets up the scanning routines with the parameters for the 
desired
- * hole. Note that there's no need to specify allocation flags, since they only
- * change the place a node is allocated from within a suitable hole.
- *
- * Warning:
- * As long as the scan list is non-empty, no other operations than
- * adding/removing nodes to/from the scan list are allowed.
- */
-void drm_mm_scan_init(struct drm_mm_scan *scan,
- struct drm_mm *mm,
- u64 size,
- u64 alignment,
- unsigned long color)
-{
-   DRM_MM_BUG_ON(size == 0);
-   DRM_MM_BUG_ON(mm->scan_active);
-
-   scan->mm = mm;
-
-   scan->color = color;
-   scan->alignment = alignment;
-   scan->size = size;
-
-   scan->check_range = 0;
-
-   scan->hit_start = U64_MAX;
-   scan->hit_end = 0;
-
-   scan->prev_scanned_node = NULL;
-}
-EXPORT_SYMBOL(drm_mm_scan_init);
-
-/**
  * drm_mm_scan_init_with_range - initialize range-restricted lru scanning
  * @scan: scan state
  * @mm: drm_mm to scan
@@ -788,7 +748,6 @@ void drm_mm_scan_init_with_range(struct drm_mm_scan *scan,
DRM_MM_BUG_ON(end <= start);
scan->range_start = start;
scan->range_end = end;
-   scan->check_range = 1;

scan->hit_start = U64_MAX;
scan->hit_end = 0;
@@ -830,15 +789,11 @@ bool drm_mm_scan_add_block(struct drm_mm_scan *scan,
node->node_list.next = &scan->prev_scanned_node->node_list;
scan->prev_scanned_node = node;

-   adj_start = hole_start = drm_mm_hole_node_start(hole);
-   adj_end = hole_end = drm_mm_hole_node_end(hole);
+   hole_start = drm_mm_hole_node_start(hole);
+   hole_end = drm_mm_hole_node_end(hole);

-   if (scan->check_range) {
-   if (adj_start < scan->range_start)
-   adj_start = scan->range_start;
-   if (adj_end > scan->range_end)
-   adj_end = scan->range_end;
-   }
+   adj_start = max(hole_start, scan->range_start);
+   adj_end = min(hole_end, scan->range_end);

if (mm->color_adjust)
mm->color_adjust(hole, scan->color, &adj_start, &adj_end);
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c 
b/drivers/gpu/drm/i915/i915_gem_evict.c
index 6db0d73c0aa7..77ded288534b 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -126,13 +126,9 @@ i915_gem_evict_something(struct i915_address_space *vm,
 * On each list, the oldest objects lie at the HEAD with the freshest
 * object on the TAIL.
 */
-   if (start != 0 || end != vm->total) {
-   drm_mm_scan_init_with_range(&scan, &vm->mm, min_size,
-   alignment, cache_level,
-   start, end);
-   } else
-   drm_mm_scan_init(&scan, &vm->mm, min_size,
-alignment, cache_level);
+   drm_mm_scan_init_with_range(&scan, &vm->mm,
+   min_size, alignment, cache_level,
+   start, end);

/* Retire before we search the active list. Although we have
 * reasonable accuracy in our retirement lists, we may have
diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 4db57bef98e0..8539832fac0f 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -120,7 +120,6 @@ struct drm_mm_scan {
struct drm_mm_node *prev_scanned_node;

unsigned long color;
-   bool check_range : 1;
 };

 /**
@@ -375,11 +374,6 @@ __drm_mm_interval_first(const struct drm_mm *mm, u64 
start, u64 last);
 node__ && node__->start < (end__); \
 node__ = list_next_entry(node__, node_list))

-void drm_mm_scan_init(struct drm_mm_scan *scan,
- struct drm_mm *mm,
- u64 size,
- u64 alignment,
- unsigned long color);
 void drm_mm_scan_init_with_range(struct drm_mm_scan 

[PATCH v2 22/40] drm/i915: Build DRM range manager selftests for CI

2016-12-16 Thread Chris Wilson
Build the struct drm_mm selftests so that we can trivially run them
within our CI.

"Enable debug, become developer." - Joonas Lahtinen

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/Kconfig.debug | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index 597648c7a645..598551dbf62c 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -24,6 +24,7 @@ config DRM_I915_DEBUG
 select X86_MSR # used by igt/pm_rpm
 select DRM_VGEM # used by igt/prime_vgem (dmabuf interop checks)
 select DRM_DEBUG_MM if DRM=y
+   select DRM_DEBUG_MM_SELFTEST
select DRM_I915_SW_FENCE_DEBUG_OBJECTS
 default n
 help
-- 
2.11.0



[PATCH v2 31/40] drm: Fix application of color vs range restriction when scanning drm_mm

2016-12-16 Thread Chris Wilson
The range restriction should be applied after the color adjustment, or
else we may inadvertently apply the color adjustment to the restricted
hole (and not against its neighbours).

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/drm_mm.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 04b1d36c4ebc..956782e7b092 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -772,6 +772,7 @@ bool drm_mm_scan_add_block(struct drm_mm_scan *scan,
struct drm_mm *mm = scan->mm;
struct drm_mm_node *hole;
u64 hole_start, hole_end;
+   u64 col_start, col_end;
u64 adj_start, adj_end;

DRM_MM_BUG_ON(node->mm != mm);
@@ -789,14 +790,16 @@ bool drm_mm_scan_add_block(struct drm_mm_scan *scan,
node->node_list.next = &scan->prev_scanned_node->node_list;
scan->prev_scanned_node = node;

-   hole_start = drm_mm_hole_node_start(hole);
-   hole_end = drm_mm_hole_node_end(hole);
-
-   adj_start = max(hole_start, scan->range_start);
-   adj_end = min(hole_end, scan->range_end);
+   hole_start = __drm_mm_hole_node_start(hole);
+   hole_end = __drm_mm_hole_node_end(hole);

+   col_start = hole_start;
+   col_end = hole_end;
if (mm->color_adjust)
-   mm->color_adjust(hole, scan->color, &adj_start, &adj_end);
+   mm->color_adjust(hole, scan->color, &col_start, &col_end);
+
+   adj_start = max(col_start, scan->range_start);
+   adj_end = min(col_end, scan->range_end);

if (check_free_hole(adj_start, adj_end,
scan->size, scan->alignment)) {
-- 
2.11.0



[PATCH v2 10/40] drm: kselftest for drm_mm_debug()

2016-12-16 Thread Chris Wilson
Simple test to just exercise calling the debug dumper on the drm_mm.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |  1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 35 
 2 files changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index 844dd29db540..0265f09e92fa 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -7,3 +7,4 @@
  */
 selftest(sanitycheck, igt_sanitycheck) /* keep first (selfcheck for igt) */
 selftest(init, igt_init)
+selftest(debug, igt_debug)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index ccef8e249d37..262e44f8f9fb 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -138,6 +138,41 @@ static int igt_init(void *ignored)
return ret;
 }

+static int igt_debug(void *ignored)
+{
+   struct drm_mm mm;
+   struct drm_mm_node nodes[2];
+   int ret;
+
+   /* Create a small drm_mm with a couple of nodes and a few holes, and
+* check that the debug iterator doesn't explode over a trivial drm_mm.
+*/
+
+   drm_mm_init(&mm, 0, 4096);
+
+   memset(nodes, 0, sizeof(nodes));
+   nodes[0].start = 512;
+   nodes[0].size = 1024;
+   ret = drm_mm_reserve_node(&mm, &nodes[0]);
+   if (ret) {
+   pr_err("failed to reserve node[0] {start=%lld, size=%lld)\n",
+  nodes[0].start, nodes[0].size);
+   return ret;
+   }
+
+   nodes[1].size = 1024;
+   nodes[1].start = 4096 - 512 - nodes[1].size;
+   ret = drm_mm_reserve_node(&mm, &nodes[1]);
+   if (ret) {
+   pr_err("failed to reserve node[1] {start=%lld, size=%lld)\n",
+  nodes[1].start, nodes[1].size);
+   return ret;
+   }
+
+   drm_mm_debug_table(&mm, __func__);
+   return 0;
+}
+
 #include "drm_selftest.c"

 static int __init test_drm_mm_init(void)
-- 
2.11.0



[PATCH v2 19/40] drm: kselftest for drm_mm and color adjustment

2016-12-16 Thread Chris Wilson
Check that, after applying the driver's color adjustment, the node still
fits correctly and its alignment is still honoured.

v2: s/no_color_touching/separate_adjacent_colors/
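
For context, a color_adjust callback may only shrink the hole it is handed.
A hypothetical driver callback (not the one used by this test) could keep a
one-page guard between differently coloured neighbours:

	static void keep_guard_page(const struct drm_mm_node *node,
				    unsigned long color,
				    u64 *start, u64 *end)
	{
		/* @node is the allocated node preceding the hole */
		if (node->allocated && node->color != color)
			*start += 4096;

		/* and its list successor is the node following the hole */
		node = list_next_entry(node, node_list);
		if (node->allocated && node->color != color)
			*end -= 4096;
	}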

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |   1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 185 +++
 2 files changed, 186 insertions(+)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index cd508e3d6538..ff44f39a1826 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -18,3 +18,4 @@ selftest(align64, igt_align64)
 selftest(evict, igt_evict)
 selftest(evict_range, igt_evict_range)
 selftest(topdown, igt_topdown)
+selftest(color, igt_color)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index fc76d787cfd4..6b3af04eb582 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -1645,6 +1645,191 @@ static int igt_topdown(void *ignored)
return ret;
 }

+static void separate_adjacent_colors(const struct drm_mm_node *node,
+unsigned long color,
+u64 *start,
+u64 *end)
+{
+   if (node->allocated && node->color != color)
+   ++*start;
+
+   node = list_next_entry(node, node_list);
+   if (node->allocated && node->color != color)
+   --*end;
+}
+
+static bool colors_abutt(const struct drm_mm_node *node)
+{
+   if (!node->hole_follows &&
+   list_next_entry(node, node_list)->allocated) {
+   pr_err("colors abutt; %ld [%llx + %llx] is next to %ld [%llx + 
%llx]!\n",
+  node->color, node->start, node->size,
+  list_next_entry(node, node_list)->color,
+  list_next_entry(node, node_list)->start,
+  list_next_entry(node, node_list)->size);
+   return true;
+   }
+
+   return false;
+}
+
+static int igt_color(void *ignored)
+{
+   const unsigned int count = min(4096u, max_iterations);
+   const struct insert_mode *mode;
+   struct drm_mm mm;
+   struct drm_mm_node *node, *nn;
+   unsigned int n;
+   int ret = -EINVAL, err;
+
+   /* Color adjustment complicates everything. First we just check
+* that when we insert a node we apply any color_adjustment callback.
+* The callback we use should ensure that there is a gap between
+* any two nodes, and so after each insertion we check that those
+* holes are inserted and that they are preserved.
+*/
+
+   drm_mm_init(&mm, 0, U64_MAX);
+
+   for (n = 1; n <= count; n++) {
+   node = kzalloc(sizeof(*node), GFP_KERNEL);
+   if (!node) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   err = drm_mm_insert_node_generic(&mm, node, n, 0, n,
+DRM_MM_SEARCH_DEFAULT,
+DRM_MM_CREATE_DEFAULT);
+   if (err || !assert_node(node, &mm, n, 0, n)) {
+   pr_err("insert failed, step %d\n", n);
+   kfree(node);
+   ret = err ?: -EINVAL;
+   goto out;
+   }
+   }
+
+   drm_mm_for_each_node_safe(node, nn, &mm) {
+   if (node->color != node->size) {
+   pr_err("invalid color stored: expected %lld, found 
%ld\n",
+  node->size, node->color);
+
+   goto out;
+   }
+
+   drm_mm_remove_node(node);
+   kfree(node);
+   }
+
+   /* Now, let's start experimenting with applying a color callback */
+   mm.color_adjust = separate_adjacent_colors;
+   for (mode = insert_modes; mode->name; mode++) {
+   u64 last;
+
+   node = kzalloc(sizeof(*node), GFP_KERNEL);
+   if (!node) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   node->size = 1 + 2*count;
+   node->color = node->size;
+
+   err = drm_mm_reserve_node(&mm, node);
+   if (err) {
+   pr_err("initial reserve failed!\n");
+   goto out;
+   }
+
+   last = node->start + node->size;
+
+   for (n = 1; n <= count; n++) {
+   int rem;
+
+   node = kzalloc(sizeof(*node), GFP_KERNEL);
+   if (!node) {
+   ret = -ENOMEM;
+   goto out;
+   }
+
+   node->start = last;
+ 

[PATCH v2 37/40] drm: Apply range restriction after color adjustment when allocation

2016-12-16 Thread Chris Wilson
mm->color_adjust() compares the hole with its neighbouring nodes. They
only abut before we restrict the hole, so we have to apply color_adjust
before we apply the range restriction.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/drm_mm.c | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index da9f98690e97..062a2a82efd6 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -410,14 +410,12 @@ static void drm_mm_insert_helper_range(struct drm_mm_node 
*hole_node,

DRM_MM_BUG_ON(!drm_mm_hole_follows(hole_node) || node->allocated);

-   if (adj_start < start)
-   adj_start = start;
-   if (adj_end > end)
-   adj_end = end;
-
if (mm->color_adjust)
mm->color_adjust(hole_node, color, &adj_start, &adj_end);

+   adj_start = max(adj_start, start);
+   adj_end = min(adj_end, end);
+
if (flags & DRM_MM_CREATE_TOP)
adj_start = adj_end - size;

@@ -625,17 +623,15 @@ static struct drm_mm_node 
*drm_mm_search_free_in_range_generic(const struct drm_
   flags & DRM_MM_SEARCH_BELOW) {
u64 hole_size = adj_end - adj_start;

-   if (adj_start < start)
-   adj_start = start;
-   if (adj_end > end)
-   adj_end = end;
-
if (mm->color_adjust) {
mm->color_adjust(entry, color, &adj_start, &adj_end);
if (adj_end <= adj_start)
continue;
}

+   adj_start = max(adj_start, start);
+   adj_end = min(adj_end, end);
+
if (!check_free_hole(adj_start, adj_end, size, alignment))
continue;

-- 
2.11.0



[PATCH v2 34/40] drm: Simplify drm_mm scan-list manipulation

2016-12-16 Thread Chris Wilson
Since we mandate that drm_mm_scan_remove_block() is called in strict
reverse order of drm_mm_scan_add_block(), we can further simplify the list
manipulations when generating the temporary scan-hole.

v2: Highlight the games being played with the lists to track the scan
holes without allocation.
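
For reference, the caller pattern this relies on looks roughly like the
following sketch (the driver-side struct and its lru_link/scan_link list
members are made up for illustration):

	struct drm_mm_scan scan;
	struct evict_entry *e;	/* hypothetical driver bookkeeping */
	LIST_HEAD(candidates);

	drm_mm_scan_init_with_range(&scan, &mm, size, align, color, start, end);

	/* add eviction candidates, e.g. in LRU order */
	list_for_each_entry(e, &lru, lru_link) {
		list_add(&e->scan_link, &candidates);	/* prepend: newest first */
		if (drm_mm_scan_add_block(&scan, &e->node))
			break;				/* found a large enough hole */
	}

	/* walking &candidates forwards now visits the blocks in exact reverse
	 * order of addition, which is what lets remove_block() restore the
	 * node_list links without extra bookkeeping */
	list_for_each_entry(e, &candidates, scan_link) {
		if (drm_mm_scan_remove_block(&scan, &e->node))
			mark_for_eviction(e);	/* hypothetical helper */
	}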

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/drm_mm.c | 35 ++-
 include/drm/drm_mm.h |  7 +--
 2 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index ffa439b2bd2c..91c89ae09b26 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -518,9 +518,7 @@ void drm_mm_remove_node(struct drm_mm_node *node)
struct drm_mm_node *prev_node;

DRM_MM_BUG_ON(!node->allocated);
-   DRM_MM_BUG_ON(node->scanned_block ||
- node->scanned_prev_free ||
- node->scanned_next_free);
+   DRM_MM_BUG_ON(node->scanned_block);

prev_node =
list_entry(node->node_list.prev, struct drm_mm_node, node_list);
@@ -757,8 +755,6 @@ void drm_mm_scan_init_with_range(struct drm_mm_scan *scan,

scan->hit_start = U64_MAX;
scan->hit_end = 0;
-
-   scan->prev_scanned_node = NULL;
 }
 EXPORT_SYMBOL(drm_mm_scan_init_with_range);

@@ -787,14 +783,14 @@ bool drm_mm_scan_add_block(struct drm_mm_scan *scan,
node->scanned_block = true;
mm->scan_active++;

+   /* Remove this block from the node_list so that we enlarge the hole
+* (distance between the end of our previous node and the start of
+* our next), without poisoning the link so that we can restore it
+* later in drm_mm_scan_remove_block().
+*/
hole = list_prev_entry(node, node_list);
-
-   node->scanned_preceeds_hole = hole->hole_follows;
-   hole->hole_follows = 1;
-   list_del(&node->node_list);
-   node->node_list.prev = &hole->node_list;
-   node->node_list.next = &scan->prev_scanned_node->node_list;
-   scan->prev_scanned_node = node;
+   DRM_MM_BUG_ON(list_next_entry(hole, node_list) != node);
+   __list_del_entry(&node->node_list);

hole_start = __drm_mm_hole_node_start(hole);
hole_end = __drm_mm_hole_node_end(hole);
@@ -888,9 +884,17 @@ bool drm_mm_scan_remove_block(struct drm_mm_scan *scan,
DRM_MM_BUG_ON(!node->mm->scan_active);
node->mm->scan_active--;

+   /* During drm_mm_scan_add_block() we decoupled this node leaving
+* its pointers intact. Now that the caller is walking back along
+* the eviction list we can restore this block into its rightful
+* place on the full node_list. To confirm that the caller is walking
+* backwards correctly we check that prev_node->next == node->next,
+* i.e. both believe the same node should be on the other side of the
+* hole.
+*/
prev_node = list_prev_entry(node, node_list);
-
-   prev_node->hole_follows = node->scanned_preceeds_hole;
+   DRM_MM_BUG_ON(list_next_entry(prev_node, node_list) !=
+ list_next_entry(node, node_list));
list_add(&node->node_list, &prev_node->node_list);

return (node->start + node->size > scan->hit_start &&
@@ -917,9 +921,6 @@ void drm_mm_init(struct drm_mm *mm, u64 start, u64 size)
INIT_LIST_HEAD(&mm->head_node.node_list);
mm->head_node.allocated = 0;
mm->head_node.hole_follows = 1;
-   mm->head_node.scanned_block = 0;
-   mm->head_node.scanned_prev_free = 0;
-   mm->head_node.scanned_next_free = 0;
mm->head_node.mm = mm;
mm->head_node.start = start + size;
mm->head_node.size = start - mm->head_node.start;
diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 6ee87e1455bf..a1532eb0ffbc 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -74,11 +74,8 @@ struct drm_mm_node {
struct list_head hole_stack;
struct rb_node rb;
unsigned hole_follows : 1;
-   unsigned scanned_block : 1;
-   unsigned scanned_prev_free : 1;
-   unsigned scanned_next_free : 1;
-   unsigned scanned_preceeds_hole : 1;
unsigned allocated : 1;
+   bool scanned_block : 1;
unsigned long color;
u64 start;
u64 size;
@@ -118,8 +115,6 @@ struct drm_mm_scan {
u64 hit_start;
u64 hit_end;

-   struct drm_mm_node *prev_scanned_node;
-
unsigned long color;
unsigned int flags;
 };
-- 
2.11.0



[PATCH v2 26/40] drm: Simplify drm_mm_clean()

2016-12-16 Thread Chris Wilson
Since commit ea7b1dd44867 ("drm: mm: track free areas implicitly"),
to test whether there are any nodes allocated within the range manager,
we merely have to ask whether the node_list is empty.
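
A minimal usage sketch (the surrounding driver code is invented):

	/* teardown path of a hypothetical driver */
	WARN_ON(!drm_mm_clean(&mgr->mm));	/* all nodes must be gone by now */
	drm_mm_takedown(&mgr->mm);		/* itself WARNs and dumps leaks */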

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/drm_mm.c | 19 +--
 include/drm/drm_mm.h | 14 +-
 2 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 6450690f5578..14a5ef505f1b 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -873,22 +873,6 @@ bool drm_mm_scan_remove_block(struct drm_mm_node *node)
 EXPORT_SYMBOL(drm_mm_scan_remove_block);

 /**
- * drm_mm_clean - checks whether an allocator is clean
- * @mm: drm_mm allocator to check
- *
- * Returns:
- * True if the allocator is completely free, false if there's still a node
- * allocated in it.
- */
-bool drm_mm_clean(const struct drm_mm *mm)
-{
-   const struct list_head *head = __drm_mm_nodes(mm);
-
-   return (head->next->next == head);
-}
-EXPORT_SYMBOL(drm_mm_clean);
-
-/**
  * drm_mm_init - initialize a drm-mm allocator
  * @mm: the drm_mm structure to initialize
  * @start: start of the range managed by @mm
@@ -928,10 +912,9 @@ EXPORT_SYMBOL(drm_mm_init);
  */
 void drm_mm_takedown(struct drm_mm *mm)
 {
-   if (WARN(!list_empty(__drm_mm_nodes(mm)),
+   if (WARN(!drm_mm_clean(mm),
 "Memory manager not clean during takedown.\n"))
show_leaks(mm);
-
 }
 EXPORT_SYMBOL(drm_mm_takedown);

diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 47ca6502ef29..2f28a806d015 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -330,7 +330,19 @@ void drm_mm_remove_node(struct drm_mm_node *node);
 void drm_mm_replace_node(struct drm_mm_node *old, struct drm_mm_node *new);
 void drm_mm_init(struct drm_mm *mm, u64 start, u64 size);
 void drm_mm_takedown(struct drm_mm *mm);
-bool drm_mm_clean(const struct drm_mm *mm);
+
+/**
+ * drm_mm_clean - checks whether an allocator is clean
+ * @mm: drm_mm allocator to check
+ *
+ * Returns:
+ * True if the allocator is completely free, false if there's still a node
+ * allocated in it.
+ */
+static inline bool drm_mm_clean(const struct drm_mm *mm)
+{
+   return list_empty(__drm_mm_nodes(mm));
+}

 struct drm_mm_node *
 __drm_mm_interval_first(const struct drm_mm *mm, u64 start, u64 last);
-- 
2.11.0



[PATCH v2 16/40] drm: kselftest for drm_mm and eviction

2016-12-16 Thread Chris Wilson
Check that we add arbitrary blocks to the eviction scanner in order to
find the first minimal hole that matches our request.

v2: Refactor out some common eviction code for later

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |   1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 337 +++
 2 files changed, 338 insertions(+)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index a7a3763f8b20..a31b4458c7eb 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -15,3 +15,4 @@ selftest(insert_range, igt_insert_range)
 selftest(align, igt_align)
 selftest(align32, igt_align32)
 selftest(align64, igt_align64)
+selftest(evict, igt_evict)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index dc3aee222158..4881752d6424 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -524,6 +524,10 @@ static const struct insert_mode {
{ "top-down", DRM_MM_SEARCH_BELOW, DRM_MM_CREATE_TOP },
{ "best", DRM_MM_SEARCH_BEST, DRM_MM_CREATE_DEFAULT },
{}
+}, evict_modes[] = {
+   { "default", DRM_MM_SEARCH_DEFAULT, DRM_MM_CREATE_DEFAULT },
+   { "top-down", DRM_MM_SEARCH_BELOW, DRM_MM_CREATE_TOP },
+   {}
 };

 static int __igt_insert(unsigned int count, u64 size, bool replace)
@@ -1080,6 +1084,339 @@ static int igt_align64(void *ignored)
return igt_align_pot(64);
 }

+static void show_scan(const struct drm_mm *scan)
+{
+   pr_info("scan: hit [%llx, %llx], size=%lld, align=%d, color=%ld\n",
+   scan->scan_hit_start, scan->scan_hit_end,
+   scan->scan_size, scan->scan_alignment, scan->scan_color);
+}
+
+static void show_holes(const struct drm_mm *mm, int count)
+{
+   u64 hole_start, hole_end;
+   struct drm_mm_node *hole;
+
+   drm_mm_for_each_hole(hole, mm, hole_start, hole_end) {
+   struct drm_mm_node *next = list_next_entry(hole, node_list);
+   const char *node1 = NULL, *node2 = NULL;
+
+   if (hole->allocated)
+   node1 = kasprintf(GFP_KERNEL,
+ "[%llx + %lld, color=%ld], ",
+ hole->start, hole->size, hole->color);
+
+   if (next->allocated)
+   node2 = kasprintf(GFP_KERNEL,
+ ", [%llx + %lld, color=%ld]",
+ next->start, next->size, next->color);
+
+   pr_info("%sHole [%llx - %llx, size %lld]%s\n",
+   node1,
+   hole_start, hole_end, hole_end - hole_start,
+   node2);
+
+   kfree(node2);
+   kfree(node1);
+
+   if (!--count)
+   break;
+   }
+}
+
+struct evict_node {
+   struct drm_mm_node node;
+   struct list_head link;
+};
+
+static bool evict_nodes(struct drm_mm *mm,
+   struct evict_node *nodes,
+   unsigned int *order,
+   unsigned int count,
+   struct list_head *evict_list)
+{
+   struct evict_node *e, *en;
+   unsigned int i;
+
+   for (i = 0; i < count; i++) {
+   e = &nodes[order ? order[i] : i];
+   list_add(&e->link, evict_list);
+   if (drm_mm_scan_add_block(&e->node))
+   break;
+   }
+   list_for_each_entry_safe(e, en, evict_list, link) {
+   if (!drm_mm_scan_remove_block(&e->node))
+   list_del(&e->link);
+   }
+   if (list_empty(evict_list)) {
+   pr_err("Failed to find eviction: size=%lld [avail=%d], align=%d 
(color=%lu)\n",
+  mm->scan_size, count,
+  mm->scan_alignment,
+  mm->scan_color);
+   return false;
+   }
+
+   list_for_each_entry(e, evict_list, link)
+   drm_mm_remove_node(&e->node);
+
+   return true;
+}
+
+static bool evict_nothing(struct drm_mm *mm,
+ unsigned int total_size,
+ struct evict_node *nodes)
+{
+   LIST_HEAD(evict_list);
+   struct evict_node *e;
+   struct drm_mm_node *node;
+   unsigned int n;
+
+   drm_mm_init_scan(mm, 1, 0, 0);
+   for (n = 0; n < total_size; n++) {
+   e = &nodes[n];
+   list_add(&e->link, &evict_list);
+   drm_mm_scan_add_block(&e->node);
+   }
+   list_for_each_entry(e, &evict_list, link)
+   drm_mm_scan_remove_block(&e->node);
+
+   for (n = 0; n < total_size; n++) {
+   e = &nodes[n];
+
+   if (!drm_mm_node_allocated(&e->node)) {
+

[PATCH v2 06/40] drm: Add some kselftests for the DRM range manager (struct drm_mm)

2016-12-16 Thread Chris Wilson
First we introduce a smattering of infrastructure for writing selftests.
The idea is that we have a test module that exercises a particular
portion of the exported API, and that module provides a set of tests
that can either be run as an ensemble via kselftest or individually via
an igt harness (in this case igt/drm_mm). To accommodate selecting
individual tests, we export a boolean parameter to control selection of
each test - that is hidden inside a bunch of reusable boilerplate macros
to keep writing the tests simple.

v2: Choose a random random_seed unless one is specified by the user.
v3: More parameters to control max_iterations and max_prime of the
tests.
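
The boilerplate is a classic x-macro include; roughly (the exact field and
parameter names here are illustrative, not lifted from the patch; TESTS is
defined by the test module, e.g. "drm_mm_selftests.h"):

	#define selftest(name, func) static bool igt__##name = true; \
		module_param_named(igt__##name, igt__##name, bool, 0400);
	#include TESTS			/* one bool parameter per test */
	#undef selftest

	#define selftest(name, func) { .name = #name, .func = func, .enabled = &igt__##name },
	static const struct subtest {
		const char *name;
		int (*func)(void *data);
		bool *enabled;
	} subtests[] = {
	#include TESTS			/* one table entry per test */
	};
	#undef selftest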

Testcase: igt/drm_mm
Signed-off-by: Chris Wilson 
Acked-by: Christian König 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/Kconfig   |  13 +++
 drivers/gpu/drm/Makefile  |   2 +
 drivers/gpu/drm/selftests/drm_mm_selftests.h  |   8 ++
 drivers/gpu/drm/selftests/drm_selftest.c  | 109 ++
 drivers/gpu/drm/selftests/drm_selftest.h  |  41 ++
 drivers/gpu/drm/selftests/test-drm_mm.c   |  55 +
 tools/testing/selftests/drivers/gpu/drm_mm.sh |  15 
 7 files changed, 243 insertions(+)
 create mode 100644 drivers/gpu/drm/selftests/drm_mm_selftests.h
 create mode 100644 drivers/gpu/drm/selftests/drm_selftest.c
 create mode 100644 drivers/gpu/drm/selftests/drm_selftest.h
 create mode 100644 drivers/gpu/drm/selftests/test-drm_mm.c
 create mode 100755 tools/testing/selftests/drivers/gpu/drm_mm.sh

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index ebfe8404c25f..d1363d21d3d1 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -48,6 +48,19 @@ config DRM_DEBUG_MM

  If in doubt, say "N".

+config DRM_DEBUG_MM_SELFTEST
+   tristate "kselftests for DRM range manager (struct drm_mm)"
+   depends on DRM
+   depends on DEBUG_KERNEL
+   default n
+   help
+ This option provides a kernel module that can be used to test
+ the DRM range manager (drm_mm) and its API. This option is not
+ useful for distributions or general kernels, but only for kernel
+ developers working on DRM and associated drivers.
+
+ If in doubt, say "N".
+
 config DRM_KMS_HELPER
tristate
depends on DRM
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index b9ae4280de9d..c8aed3688b20 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -18,6 +18,8 @@ drm-y   :=drm_auth.o drm_bufs.o drm_cache.o \
drm_plane.o drm_color_mgmt.o drm_print.o \
drm_dumb_buffers.o drm_mode_config.o

+obj-$(CONFIG_DRM_DEBUG_MM_SELFTEST) += selftests/test-drm_mm.o
+
 drm-$(CONFIG_COMPAT) += drm_ioc32.o
 drm-$(CONFIG_DRM_GEM_CMA_HELPER) += drm_gem_cma_helper.o
 drm-$(CONFIG_PCI) += ati_pcigart.o
diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
new file mode 100644
index ..1610e0a63a5b
--- /dev/null
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -0,0 +1,8 @@
+/* List each unit test as selftest(name, function)
+ *
+ * The name is used as both an enum and expanded as igt__name to create
+ * a module parameter. It must be unique and legal for a C identifier.
+ *
+ * Tests are executed in order by igt/drm_mm
+ */
+selftest(sanitycheck, igt_sanitycheck) /* keep first (selfcheck for igt) */
diff --git a/drivers/gpu/drm/selftests/drm_selftest.c 
b/drivers/gpu/drm/selftests/drm_selftest.c
new file mode 100644
index ..844d4625931e
--- /dev/null
+++ b/drivers/gpu/drm/selftests/drm_selftest.c
@@ -0,0 +1,109 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#define selftest(name, func) __idx_##name,
+enum {
+#include TESTS
+};
+#undef selftest
+

[PATCH v2 11/40] drm: kselftest for drm_mm_reserve_node()

2016-12-16 Thread Chris Wilson
Exercise drm_mm_reserve_node(), check that we can't reserve an already
occupied range and that the lists are correct after reserving/removing.

v2: Check for invalid node reservation.
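
The API under test, in its simplest form (a sketch with invented values):

	struct drm_mm_node node = {};
	int err;

	node.start = 512;
	node.size = 1024;
	err = drm_mm_reserve_node(&mm, &node);	/* 0 on success */
	/* a second node overlapping [512, 1536) must then fail with -ENOSPC */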

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/selftests/drm_mm_selftests.h |   1 +
 drivers/gpu/drm/selftests/test-drm_mm.c  | 279 +++
 2 files changed, 280 insertions(+)

diff --git a/drivers/gpu/drm/selftests/drm_mm_selftests.h 
b/drivers/gpu/drm/selftests/drm_mm_selftests.h
index 0265f09e92fa..693d85677e7f 100644
--- a/drivers/gpu/drm/selftests/drm_mm_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_mm_selftests.h
@@ -8,3 +8,4 @@
 selftest(sanitycheck, igt_sanitycheck) /* keep first (selfcheck for igt) */
 selftest(init, igt_init)
 selftest(debug, igt_debug)
+selftest(reserve, igt_reserve)
diff --git a/drivers/gpu/drm/selftests/test-drm_mm.c 
b/drivers/gpu/drm/selftests/test-drm_mm.c
index 262e44f8f9fb..2e40f94cb9d3 100644
--- a/drivers/gpu/drm/selftests/test-drm_mm.c
+++ b/drivers/gpu/drm/selftests/test-drm_mm.c
@@ -11,6 +11,9 @@

 #include 

+#include "../lib/drm_random.h"
+#include "../lib/drm_prime_numbers.h"
+
 #define TESTS "drm_mm_selftests.h"
 #include "drm_selftest.h"

@@ -77,6 +80,57 @@ static bool assert_one_hole(const struct drm_mm *mm, u64 
start, u64 end)
return ok;
 }

+static bool assert_continuous(const struct drm_mm *mm, u64 size)
+{
+   struct drm_mm_node *node, *check, *found;
+   unsigned long n;
+   u64 addr;
+
+   if (!assert_no_holes(mm))
+   return false;
+
+   n = 0;
+   addr = 0;
+   drm_mm_for_each_node(node, mm) {
+   if (node->start != addr) {
+   pr_err("node[%ld] list out of order, expected %llx 
found %llx\n",
+  n, addr, node->start);
+   return false;
+   }
+
+   if (node->size != size) {
+   pr_err("node[%ld].size incorrect, expected %llx, found 
%llx\n",
+  n, size, node->size);
+   return false;
+   }
+
+   if (node->hole_follows) {
+   pr_err("node[%ld] is followed by a hole!\n", n);
+   return false;
+   }
+
+   found = NULL;
+   drm_mm_for_each_node_in_range(check, mm, addr, addr + size) {
+   if (node != check) {
+   pr_err("lookup return wrong node, expected 
start %llx, found %llx\n",
+  node->start, check->start);
+   return false;
+   }
+   found = check;
+   }
+   if (!found) {
+   pr_err("lookup failed for node %llx + %llx\n",
+  addr, size);
+   return false;
+   }
+
+   addr += size;
+   n++;
+   }
+
+   return true;
+}
+
 static int igt_init(void *ignored)
 {
const unsigned int size = 4096;
@@ -173,6 +227,231 @@ static int igt_debug(void *ignored)
return 0;
 }

+static struct drm_mm_node *set_node(struct drm_mm_node *node,
+   u64 start, u64 size)
+{
+   node->start = start;
+   node->size = size;
+   return node;
+}
+
+static bool expect_reserve_fail(struct drm_mm *mm, struct drm_mm_node *node)
+{
+   int err;
+
+   err = drm_mm_reserve_node(mm, node);
+   if (err != -ENOSPC) {
+   if (!err) {
+   pr_err("impossible reserve succeeded, node %llu + 
%llu\n",
+  node->start, node->size);
+   drm_mm_remove_node(node);
+   } else {
+   pr_err("impossible reserve failed with wrong error %d 
[expected %d], node %llu + %llu\n",
+  err, -ENOSPC, node->start, node->size);
+   }
+   return false;
+   }
+
+   return true;
+}
+
+static bool check_reserve_boundaries(struct drm_mm *mm,
+unsigned int count,
+u64 size)
+{
+   const struct boundary {
+   u64 start, size;
+   const char *name;
+   } boundaries[] = {
+#define B(st, sz) { (st), (sz), "{ " #st ", " #sz "}" }
+   B(0, 0),
+   B(-size, 0),
+   B(size, 0),
+   B(size * count, 0),
+   B(-size, size),
+   B(-size, -size),
+   B(-size, 2*size),
+   B(0, -size),
+   B(size, -size),
+   B(count*size, size),
+   B(count*size, -size),
+   B(count*size, count*size),
+   B(count*size, -count*size),
+   B(count*size, -(count+1)*size),
+   B((count+1)*size, size),
+

[PATCH v2 29/40] drm: Rename prev_node to hole in drm_mm_scan_add_block()

2016-12-16 Thread Chris Wilson
Acknowledging that we were building up the hole was more useful to me
when reading the code than knowing the relationship between this node
and the previous node.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/drm_mm.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 0f1396b0d39a..6bb61f2212f8 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -811,7 +811,7 @@ bool drm_mm_scan_add_block(struct drm_mm_scan *scan,
   struct drm_mm_node *node)
 {
struct drm_mm *mm = scan->mm;
-   struct drm_mm_node *prev_node;
+   struct drm_mm_node *hole;
u64 hole_start, hole_end;
u64 adj_start, adj_end;

@@ -821,17 +821,17 @@ bool drm_mm_scan_add_block(struct drm_mm_scan *scan,
node->scanned_block = 1;
mm->scan_active++;

-   prev_node = list_prev_entry(node, node_list);
+   hole = list_prev_entry(node, node_list);

-   node->scanned_preceeds_hole = prev_node->hole_follows;
-   prev_node->hole_follows = 1;
+   node->scanned_preceeds_hole = hole->hole_follows;
+   hole->hole_follows = 1;
list_del(&node->node_list);
-   node->node_list.prev = &prev_node->node_list;
+   node->node_list.prev = &hole->node_list;
node->node_list.next = &scan->prev_scanned_node->node_list;
scan->prev_scanned_node = node;

-   adj_start = hole_start = drm_mm_hole_node_start(prev_node);
-   adj_end = hole_end = drm_mm_hole_node_end(prev_node);
+   adj_start = hole_start = drm_mm_hole_node_start(hole);
+   adj_end = hole_end = drm_mm_hole_node_end(hole);

if (scan->check_range) {
if (adj_start < scan->range_start)
@@ -841,7 +841,7 @@ bool drm_mm_scan_add_block(struct drm_mm_scan *scan,
}

if (mm->color_adjust)
-   mm->color_adjust(prev_node, scan->color, &adj_start, &adj_end);
+   mm->color_adjust(hole, scan->color, &adj_start, &adj_end);

if (check_free_hole(adj_start, adj_end,
scan->size, scan->alignment)) {
-- 
2.11.0



[PATCH v2 38/40] drm: Use drm_mm_insert_node_in_range_generic() for everyone

2016-12-16 Thread Chris Wilson
Remove a superfluous helper as drm_mm_insert_node is equivalent to
insert_node_in_range with a range of (0, U64_MAX).
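
In other words, the generic insert can become a thin wrapper around the
ranged version; roughly (assumed shape, see the diff for the actual change):

	static inline int drm_mm_insert_node_generic(struct drm_mm *mm,
						     struct drm_mm_node *node,
						     u64 size, u64 alignment,
						     unsigned long color,
						     enum drm_mm_search_flags sflags,
						     enum drm_mm_allocator_flags aflags)
	{
		return drm_mm_insert_node_in_range_generic(mm, node,
							   size, alignment, color,
							   0, U64_MAX,
							   sflags, aflags);
	}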

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/drm_mm.c | 166 ---
 include/drm/drm_mm.h |  90 +++--
 2 files changed, 67 insertions(+), 189 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 062a2a82efd6..c6fce14178cc 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -92,11 +92,6 @@
  * some basic allocator dumpers for debugging.
  */

-static struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
-   u64 size,
-   u64 alignment,
-   unsigned long color,
-   enum drm_mm_search_flags flags);
 static struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct 
drm_mm *mm,
u64 size,
u64 alignment,
@@ -230,6 +225,7 @@ static void drm_mm_insert_helper(struct drm_mm_node 
*hole_node,
 struct drm_mm_node *node,
 u64 size, u64 alignment,
 unsigned long color,
+u64 range_start, u64 range_end,
 enum drm_mm_allocator_flags flags)
 {
struct drm_mm *mm = hole_node->mm;
@@ -238,11 +234,14 @@ static void drm_mm_insert_helper(struct drm_mm_node 
*hole_node,
u64 adj_start = hole_start;
u64 adj_end = hole_end;

-   DRM_MM_BUG_ON(node->allocated);
+   DRM_MM_BUG_ON(!drm_mm_hole_follows(hole_node) || node->allocated);

if (mm->color_adjust)
mm->color_adjust(hole_node, color, &adj_start, &adj_end);

+   adj_start = max(adj_start, range_start);
+   adj_end = min(adj_end, range_end);
+
if (flags & DRM_MM_CREATE_TOP)
adj_start = adj_end - size;

@@ -258,9 +257,6 @@ static void drm_mm_insert_helper(struct drm_mm_node 
*hole_node,
}
}

-   DRM_MM_BUG_ON(adj_start < hole_start);
-   DRM_MM_BUG_ON(adj_end > hole_end);
-
if (adj_start == hole_start) {
hole_node->hole_follows = 0;
list_del(&hole_node->hole_stack);
@@ -276,7 +272,10 @@ static void drm_mm_insert_helper(struct drm_mm_node 
*hole_node,

drm_mm_interval_tree_add_node(hole_node, node);

+   DRM_MM_BUG_ON(node->start < range_start);
+   DRM_MM_BUG_ON(node->start < adj_start);
DRM_MM_BUG_ON(node->start + node->size > adj_end);
+   DRM_MM_BUG_ON(node->start + node->size > range_end);

node->hole_follows = 0;
if (__drm_mm_hole_node_start(node) < hole_end) {
@@ -360,107 +359,6 @@ int drm_mm_reserve_node(struct drm_mm *mm, struct 
drm_mm_node *node)
 EXPORT_SYMBOL(drm_mm_reserve_node);

 /**
- * drm_mm_insert_node_generic - search for space and insert @node
- * @mm: drm_mm to allocate from
- * @node: preallocate node to insert
- * @size: size of the allocation
- * @alignment: alignment of the allocation
- * @color: opaque tag value to use for this node
- * @sflags: flags to fine-tune the allocation search
- * @aflags: flags to fine-tune the allocation behavior
- *
- * The preallocated node must be cleared to 0.
- *
- * Returns:
- * 0 on success, -ENOSPC if there's no suitable hole.
- */
-int drm_mm_insert_node_generic(struct drm_mm *mm, struct drm_mm_node *node,
-  u64 size, u64 alignment,
-  unsigned long color,
-  enum drm_mm_search_flags sflags,
-  enum drm_mm_allocator_flags aflags)
-{
-   struct drm_mm_node *hole_node;
-
-   if (WARN_ON(size == 0))
-   return -EINVAL;
-
-   hole_node = drm_mm_search_free_generic(mm, size, alignment,
-  color, sflags);
-   if (!hole_node)
-   return -ENOSPC;
-
-   drm_mm_insert_helper(hole_node, node, size, alignment, color, aflags);
-   return 0;
-}
-EXPORT_SYMBOL(drm_mm_insert_node_generic);
-
-static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
-  struct drm_mm_node *node,
-  u64 size, u64 alignment,
-  unsigned long color,
-  u64 start, u64 end,
-  enum drm_mm_allocator_flags flags)
-{
-   struct drm_mm *mm = hole_node->mm;
-   u64 hole_start = drm_mm_hole_node_start(hole_node);
-   u64 hole_end = drm_mm_hole_node_end(hole_node);
-   u64 adj_start = hole_start;
-   u64 adj_end = hole_end;
-
-   DRM_MM_BUG_ON(!drm_mm_hole_follows(ho

[PATCH v2 28/40] drm: Extract struct drm_mm_scan from struct drm_mm

2016-12-16 Thread Chris Wilson
The scan state occupies a large proportion of the struct drm_mm and is
rarely used and only contains temporary state. That makes it suitable for
moving into its own struct and onto the stack of the callers.
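
After this change a caller owns the scan state on its own stack instead of
poking at fields inside struct drm_mm; roughly (a sketch modelled on the
i915 eviction caller touched by this patch):

	struct drm_mm_scan scan;

	drm_mm_scan_init_with_range(&scan, &vm->mm,
				    min_size, alignment, cache_level,
				    start, end);
	/* ... drm_mm_scan_add_block(&scan, &vma->node) for each candidate,
	 * then drm_mm_scan_remove_block(&scan, ...) in strict reverse order ... */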

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/drm_mm.c| 124 ++--
 drivers/gpu/drm/etnaviv/etnaviv_mmu.c   |   7 +-
 drivers/gpu/drm/i915/i915_gem_evict.c   |  19 +++--
 drivers/gpu/drm/selftests/test-drm_mm.c |  45 ++--
 include/drm/drm_mm.h|  43 +++
 5 files changed, 138 insertions(+), 100 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 57267845b7d4..0f1396b0d39a 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -574,7 +574,7 @@ static struct drm_mm_node *drm_mm_search_free_generic(const 
struct drm_mm *mm,
u64 adj_end;
u64 best_size;

-   DRM_MM_BUG_ON(mm->scanned_blocks);
+   DRM_MM_BUG_ON(mm->scan_active);

best = NULL;
best_size = ~0UL;
@@ -618,7 +618,7 @@ static struct drm_mm_node 
*drm_mm_search_free_in_range_generic(const struct drm_
u64 adj_end;
u64 best_size;

-   DRM_MM_BUG_ON(mm->scanned_blocks);
+   DRM_MM_BUG_ON(mm->scan_active);

best = NULL;
best_size = ~0UL;
@@ -693,7 +693,7 @@ EXPORT_SYMBOL(drm_mm_replace_node);
  *
  * The DRM range allocator supports this use-case through the scanning
  * interfaces. First a scan operation needs to be initialized with
- * drm_mm_init_scan() or drm_mm_init_scan_with_range(). The driver adds
+ * drm_mm_scan_init() or drm_mm_scan_init_with_range(). The driver adds
  * objects to the roaster (probably by walking an LRU list, but this can be
  * freely implemented) until a suitable hole is found or there's no further
  * evictable object.
@@ -710,7 +710,8 @@ EXPORT_SYMBOL(drm_mm_replace_node);
  */

 /**
- * drm_mm_init_scan - initialize lru scanning
+ * drm_mm_scan_init - initialize lru scanning
+ * @scan: scan state
  * @mm: drm_mm to scan
  * @size: size of the allocation
  * @alignment: alignment of the allocation
@@ -724,26 +725,33 @@ EXPORT_SYMBOL(drm_mm_replace_node);
  * As long as the scan list is non-empty, no other operations than
  * adding/removing nodes to/from the scan list are allowed.
  */
-void drm_mm_init_scan(struct drm_mm *mm,
+void drm_mm_scan_init(struct drm_mm_scan *scan,
+ struct drm_mm *mm,
  u64 size,
  u64 alignment,
  unsigned long color)
 {
DRM_MM_BUG_ON(size == 0);
+   DRM_MM_BUG_ON(mm->scan_active);

-   mm->scan_color = color;
-   mm->scan_alignment = alignment;
-   mm->scan_size = size;
-   mm->scanned_blocks = 0;
-   mm->scan_hit_start = 0;
-   mm->scan_hit_end = 0;
-   mm->scan_check_range = 0;
-   mm->prev_scanned_node = NULL;
+   scan->mm = mm;
+
+   scan->color = color;
+   scan->alignment = alignment;
+   scan->size = size;
+
+   scan->check_range = 0;
+
+   scan->hit_start = U64_MAX;
+   scan->hit_end = 0;
+
+   scan->prev_scanned_node = NULL;
 }
-EXPORT_SYMBOL(drm_mm_init_scan);
+EXPORT_SYMBOL(drm_mm_scan_init);

 /**
- * drm_mm_init_scan - initialize range-restricted lru scanning
+ * drm_mm_scan_init_with_range - initialize range-restricted lru scanning
+ * @scan: scan state
  * @mm: drm_mm to scan
  * @size: size of the allocation
  * @alignment: alignment of the allocation
@@ -759,7 +767,8 @@ EXPORT_SYMBOL(drm_mm_init_scan);
  * As long as the scan list is non-empty, no other operations than
  * adding/removing nodes to/from the scan list are allowed.
  */
-void drm_mm_init_scan_with_range(struct drm_mm *mm,
+void drm_mm_scan_init_with_range(struct drm_mm_scan *scan,
+struct drm_mm *mm,
 u64 size,
 u64 alignment,
 unsigned long color,
@@ -768,19 +777,25 @@ void drm_mm_init_scan_with_range(struct drm_mm *mm,
 {
DRM_MM_BUG_ON(start >= end);
DRM_MM_BUG_ON(size == 0 || size > end - start);
+   DRM_MM_BUG_ON(mm->scan_active);
+
+   scan->mm = mm;
+
+   scan->color = color;
+   scan->alignment = alignment;
+   scan->size = size;
+
+   DRM_MM_BUG_ON(end <= start);
+   scan->range_start = start;
+   scan->range_end = end;
+   scan->check_range = 1;

-   mm->scan_color = color;
-   mm->scan_alignment = alignment;
-   mm->scan_size = size;
-   mm->scanned_blocks = 0;
-   mm->scan_hit_start = 0;
-   mm->scan_hit_end = 0;
-   mm->scan_start = start;
-   mm->scan_end = end;
-   mm->scan_check_range = 1;
-   mm->prev_scanned_node = NULL;
+   scan->hit_start = U64_MAX;
+   scan->hit_end = 0;
+
+   scan->prev_scanned_node = NULL;
 }
-EXPORT_SYMBOL(drm_mm_init_scan_with_range);
+EXPORT_SYMBOL(drm_mm_scan_

[Intel-gfx] [bug report] drm/i915: Small compaction of the engine init code

2016-12-16 Thread Tvrtko Ursulin

On 15/12/2016 20:54, Chris Wilson wrote:
> On Thu, Dec 15, 2016 at 11:44:13PM +0300, Dan Carpenter wrote:
>> Hello Tvrtko Ursulin,
>>
>> The patch a19d6ff29a82: "drm/i915: Small compaction of the engine
>> init code" from Jun 23, 2016, leads to the following static checker
>> warning:
>>
>>  drivers/gpu/drm/i915/intel_lrc.c:1973 logical_render_ring_init()
>>  warn: passing freed memory 'engine'
>>
>> drivers/gpu/drm/i915/intel_lrc.c
>>   1970
>>   1971  ret = logical_ring_init(engine);
>>   1972  if (ret) {
>>   1973  lrc_destroy_wa_ctx_obj(engine);
>>
>> The problem is that logical_ring_init() frees "engine" on the error
>> path so this is a use after free.
>
> And calls lrc_destroy_wa_ctx_obj() in the process. So we can
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
> b/drivers/gpu/drm/i915/intel_lrc.c
> index 067394b0a769..1c1bad8ae7b0 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1940,12 +1940,7 @@ int logical_render_ring_init(struct intel_engine_cs 
> *engine)
>   ret);
> }
>
> -   ret = logical_ring_init(engine);
> -   if (ret) {
> -   lrc_destroy_wa_ctx_obj(engine);
> -   }
> -
> -   return ret;
> +   return logical_ring_init(engine);
>  }
>
>  int logical_xcs_ring_init(struct intel_engine_cs *engine)
>

I've marked this as TODO for later.

Interesting that it was detected only now. Even more so, the referenced 
commit just moved the code around.

I think that the real issue was introduced around "drm/i915/gen8: 
Re-order init pipe_control in lrc mode".

Regards,

Tvrtko



[PATCH v4 3/5] drm: bridge: Link encoder and bridge in core code

2016-12-16 Thread Archit Taneja
Hi,

On 12/14/2016 03:29 PM, Laurent Pinchart wrote:
> Instead of linking encoders and bridges in every driver (and getting it
> wrong half of the time, as many drivers forget to set the drm_bridge
> encoder pointer), do so in core code. The drm_bridge_attach() function
> needs the encoder and optional previous bridge to perform that task,
> update all the callers.
>
> Signed-off-by: Laurent Pinchart 
> Acked-by: Stefan Agner  # For DCU
> Acked-by: Boris Brezillon  # For 
> atmel-hlcdc
> Acked-by: Vincent Abriou  # For STI

This one needs acks for arcpgu, tilcdc, mediatek and imx. The changes in
those drivers look good to me, though. Will push it in a day or so unless anyone
has any comments on it.

Thanks,
Archit

> ---
>  drivers/gpu/drm/arc/arcpgu_hdmi.c  |  5 +--
>  drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_output.c   |  4 +-
>  drivers/gpu/drm/bridge/analogix/analogix_dp_core.c |  4 +-
>  drivers/gpu/drm/bridge/dw-hdmi.c   |  3 +-
>  drivers/gpu/drm/drm_bridge.c   | 46 
> --
>  drivers/gpu/drm/drm_simple_kms_helper.c|  4 +-
>  drivers/gpu/drm/exynos/exynos_dp.c |  5 +--
>  drivers/gpu/drm/exynos/exynos_drm_dsi.c|  6 +--
>  drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c  |  5 +--
>  drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c   |  5 +--
>  drivers/gpu/drm/imx/imx-ldb.c  |  6 +--
>  drivers/gpu/drm/imx/parallel-display.c |  4 +-
>  drivers/gpu/drm/mediatek/mtk_dpi.c |  8 ++--
>  drivers/gpu/drm/mediatek/mtk_dsi.c | 24 ++-
>  drivers/gpu/drm/mediatek/mtk_hdmi.c| 11 +++---
>  drivers/gpu/drm/msm/dsi/dsi_manager.c  | 17 +---
>  drivers/gpu/drm/msm/edp/edp_bridge.c   |  2 +-
>  drivers/gpu/drm/msm/hdmi/hdmi_bridge.c |  2 +-
>  drivers/gpu/drm/rcar-du/rcar_du_hdmienc.c  |  5 +--
>  drivers/gpu/drm/sti/sti_dvo.c  |  3 +-
>  drivers/gpu/drm/sti/sti_hda.c  |  3 +-
>  drivers/gpu/drm/sti/sti_hdmi.c |  3 +-
>  drivers/gpu/drm/sun4i/sun4i_rgb.c  | 13 +++---
>  drivers/gpu/drm/tilcdc/tilcdc_external.c   |  4 +-
>  include/drm/drm_bridge.h   |  3 +-
>  25 files changed, 85 insertions(+), 110 deletions(-)
>
> diff --git a/drivers/gpu/drm/arc/arcpgu_hdmi.c 
> b/drivers/gpu/drm/arc/arcpgu_hdmi.c
> index b69c66b4897e..0ce7f398bcff 100644
> --- a/drivers/gpu/drm/arc/arcpgu_hdmi.c
> +++ b/drivers/gpu/drm/arc/arcpgu_hdmi.c
> @@ -47,10 +47,7 @@ int arcpgu_drm_hdmi_init(struct drm_device *drm, struct 
> device_node *np)
>   return ret;
>
>   /* Link drm_bridge to encoder */
> - bridge->encoder = encoder;
> - encoder->bridge = bridge;
> -
> - ret = drm_bridge_attach(drm, bridge);
> + ret = drm_bridge_attach(encoder, bridge, NULL);
>   if (ret)
>   drm_encoder_cleanup(encoder);
>
> diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_output.c 
> b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_output.c
> index 6119b5085501..e7799b6ee829 100644
> --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_output.c
> +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_output.c
> @@ -230,9 +230,7 @@ static int atmel_hlcdc_attach_endpoint(struct drm_device 
> *dev,
>   of_node_put(np);
>
>   if (bridge) {
> - output->encoder.bridge = bridge;
> - bridge->encoder = &output->encoder;
> - ret = drm_bridge_attach(dev, bridge);
> + ret = drm_bridge_attach(&output->encoder, bridge, NULL);
>   if (!ret)
>   return 0;
>   }
> diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c 
> b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> index eb9bf8786c24..b7494c8d43fe 100644
> --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> @@ -1227,12 +1227,10 @@ static int analogix_dp_create_bridge(struct 
> drm_device *drm_dev,
>
>   dp->bridge = bridge;
>
> - dp->encoder->bridge = bridge;
>   bridge->driver_private = dp;
> - bridge->encoder = dp->encoder;
>   bridge->funcs = &analogix_dp_bridge_funcs;
>
> - ret = drm_bridge_attach(drm_dev, bridge);
> + ret = drm_bridge_attach(dp->encoder, bridge, NULL);
>   if (ret) {
>   DRM_ERROR("failed to attach drm bridge\n");
>   return -EINVAL;
> diff --git a/drivers/gpu/drm/bridge/dw-hdmi.c 
> b/drivers/gpu/drm/bridge/dw-hdmi.c
> index 235ce7d1583d..f5009ae39b89 100644
> --- a/drivers/gpu/drm/bridge/dw-hdmi.c
> +++ b/drivers/gpu/drm/bridge/dw-hdmi.c
> @@ -1841,13 +1841,12 @@ static int dw_hdmi_register(struct drm_device *drm, 
> struct dw_hdmi *hdmi)
>   hdmi->bridge = bridge;
>   bridge->driver_private = hdmi;
>   bridge->funcs = &dw_hdmi_bridge_funcs;
> - ret = drm_bridge_attach(drm, bridge);
>

[Intel-gfx] [bug report] drm/i915: Small compaction of the engine init code

2016-12-16 Thread Tvrtko Ursulin

On 16/12/2016 08:02, Tvrtko Ursulin wrote:
>
> On 15/12/2016 20:54, Chris Wilson wrote:
>> On Thu, Dec 15, 2016 at 11:44:13PM +0300, Dan Carpenter wrote:
>>> Hello Tvrtko Ursulin,
>>>
>>> The patch a19d6ff29a82: "drm/i915: Small compaction of the engine
>>> init code" from Jun 23, 2016, leads to the following static checker
>>> warning:
>>>
>>> drivers/gpu/drm/i915/intel_lrc.c:1973 logical_render_ring_init()
>>> warn: passing freed memory 'engine'
>>>
>>> drivers/gpu/drm/i915/intel_lrc.c
>>>   1970
>>>   1971  ret = logical_ring_init(engine);
>>>   1972  if (ret) {
>>>   1973  lrc_destroy_wa_ctx_obj(engine);
>>>
>>> The problem is that logical_ring_init() frees "engine" on the error
>>> path so this is a use after free.
>>
>> And calls lrc_destroy_wa_ctx_obj() in the process. So we can
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index 067394b0a769..1c1bad8ae7b0 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -1940,12 +1940,7 @@ int logical_render_ring_init(struct
>> intel_engine_cs *engine)
>>   ret);
>> }
>>
>> -   ret = logical_ring_init(engine);
>> -   if (ret) {
>> -   lrc_destroy_wa_ctx_obj(engine);
>> -   }
>> -
>> -   return ret;
>> +   return logical_ring_init(engine);
>>  }
>>
>>  int logical_xcs_ring_init(struct intel_engine_cs *engine)
>>
>
> I've marked this as TODO for later.
>
> Interesting that it was detected only now. Even more so, the referenced
> commit just moved the code around.
>
> I think that the real issue was introduced around "drm/i915/gen8:
> Re-order init pipe_control in lrc mode".

Actually it was probably the dynamic engine allocation patch.

Regards,

Tvrtko


[PATCH v2 01/40] drm/i915: Use the MRU stack search after evicting

2016-12-16 Thread Joonas Lahtinen
On pe, 2016-12-16 at 07:46 +, Chris Wilson wrote:
> When we evict from the GTT to make room for an object, the hole we
> create is put onto the MRU stack inside the drm_mm range manager. On the
> next search pass, we can speed up a PIN_HIGH allocation by referencing
> that stack for the new hole.
> 
> v2: Pull together the 3 identical implementations (ahem, a couple were
> outdated) into a common routine for allocating a node and evicting as
> necessary.
> 
> Signed-off-by: Chris Wilson 
> Reviewed-by: Joonas Lahtinen  #v1

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


[PATCH v2 03/40] drm: Add drm_mm_for_each_node_safe()

2016-12-16 Thread Joonas Lahtinen
On pe, 2016-12-16 at 07:46 +, Chris Wilson wrote:
> A complement to drm_mm_for_each_node(), wraps list_for_each_entry_safe()
> for walking the list of nodes safe against removal.
> 

Most of the diff is about __drm_mm_nodes(mm), which could be split into its
own patch so the R-b's can be kept.
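
For reference, the new iterator is used like list_for_each_entry_safe(),
e.g. for teardown (a sketch mirroring the selftest code elsewhere in this
series):

	struct drm_mm_node *node, *next;

	drm_mm_for_each_node_safe(node, next, &mm) {
		drm_mm_remove_node(node);
		kfree(node);
	}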

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


[PATCH v4 3/5] drm: bridge: Link encoder and bridge in core code

2016-12-16 Thread Jyri Sarha
On 12/14/16 11:59, Laurent Pinchart wrote:
> Instead of linking encoders and bridges in every driver (and getting it
> wrong half of the time, as many drivers forget to set the drm_bridge
> encoder pointer), do so in core code. The drm_bridge_attach() function
> needs the encoder and optional previous bridge to perform that task,
> update all the callers.
> 
> Signed-off-by: Laurent Pinchart 
> Acked-by: Stefan Agner  # For DCU
> Acked-by: Boris Brezillon  # For 
> atmel-hlcdc
> Acked-by: Vincent Abriou  # For STI

Acked-by: Jyri Sarha  # For tilcdc

> ---
>  drivers/gpu/drm/arc/arcpgu_hdmi.c  |  5 +--
>  drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_output.c   |  4 +-
>  drivers/gpu/drm/bridge/analogix/analogix_dp_core.c |  4 +-
>  drivers/gpu/drm/bridge/dw-hdmi.c   |  3 +-
>  drivers/gpu/drm/drm_bridge.c   | 46 
> --
>  drivers/gpu/drm/drm_simple_kms_helper.c|  4 +-
>  drivers/gpu/drm/exynos/exynos_dp.c |  5 +--
>  drivers/gpu/drm/exynos/exynos_drm_dsi.c|  6 +--
>  drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c  |  5 +--
>  drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c   |  5 +--
>  drivers/gpu/drm/imx/imx-ldb.c  |  6 +--
>  drivers/gpu/drm/imx/parallel-display.c |  4 +-
>  drivers/gpu/drm/mediatek/mtk_dpi.c |  8 ++--
>  drivers/gpu/drm/mediatek/mtk_dsi.c | 24 ++-
>  drivers/gpu/drm/mediatek/mtk_hdmi.c| 11 +++---
>  drivers/gpu/drm/msm/dsi/dsi_manager.c  | 17 +---
>  drivers/gpu/drm/msm/edp/edp_bridge.c   |  2 +-
>  drivers/gpu/drm/msm/hdmi/hdmi_bridge.c |  2 +-
>  drivers/gpu/drm/rcar-du/rcar_du_hdmienc.c  |  5 +--
>  drivers/gpu/drm/sti/sti_dvo.c  |  3 +-
>  drivers/gpu/drm/sti/sti_hda.c  |  3 +-
>  drivers/gpu/drm/sti/sti_hdmi.c |  3 +-
>  drivers/gpu/drm/sun4i/sun4i_rgb.c  | 13 +++---
>  drivers/gpu/drm/tilcdc/tilcdc_external.c   |  4 +-
...
> diff --git a/drivers/gpu/drm/tilcdc/tilcdc_external.c 
> b/drivers/gpu/drm/tilcdc/tilcdc_external.c
> index c67d7cd7d57e..b0dd5e8634ae 100644
> --- a/drivers/gpu/drm/tilcdc/tilcdc_external.c
> +++ b/drivers/gpu/drm/tilcdc/tilcdc_external.c
> @@ -167,10 +167,8 @@ int tilcdc_attach_bridge(struct drm_device *ddev, struct 
> drm_bridge *bridge)
>   int ret;
>  
>   priv->external_encoder->possible_crtcs = BIT(0);
> - priv->external_encoder->bridge = bridge;
> - bridge->encoder = priv->external_encoder;
>  
> - ret = drm_bridge_attach(ddev, bridge);
> + ret = drm_bridge_attach(priv->external_encoder, bridge, NULL);
>   if (ret) {
>   dev_err(ddev->dev, "drm_bridge_attach() failed %d\n", ret);
>   return ret;
> diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h
> index 530a1d6e8cde..94e5ee96b3b5 100644
> --- a/include/drm/drm_bridge.h
> +++ b/include/drm/drm_bridge.h
> @@ -201,7 +201,8 @@ struct drm_bridge {
>  int drm_bridge_add(struct drm_bridge *bridge);
>  void drm_bridge_remove(struct drm_bridge *bridge);
>  struct drm_bridge *of_drm_find_bridge(struct device_node *np);
> -int drm_bridge_attach(struct drm_device *dev, struct drm_bridge *bridge);
> +int drm_bridge_attach(struct drm_encoder *encoder, struct drm_bridge *bridge,
> +   struct drm_bridge *previous);
>  void drm_bridge_detach(struct drm_bridge *bridge);
>  
>  bool drm_bridge_mode_fixup(struct drm_bridge *bridge,
> 



[PATCH V2] drm/i915: relax uncritical udelay_range()

2016-12-16 Thread Jani Nikula
On Fri, 16 Dec 2016, Nicholas Mc Guire  wrote:
> usleep_range(1, 10) is inefficient and, as discussions with Jani Nikula
> showed, unnecessary here. This replaces the tight setting with a relaxed
> delay of min=10 and max=50, which helps the hrtimer subsystem optimize
> timer handling.
>
> Fixes: commit be4fc046bed3 ("drm/i915: add VLV DSI PLL Calculations") 
> Link: http://lkml.org/lkml/2016/12/15/147
> Signed-off-by: Nicholas Mc Guire 

Pushed to drm-intel-next-queued, thanks for the patch.

BR,
Jani.

> ---
>
> V2: use relaxed usleep_range() rather than udelay
> fix documentation of changed timings
>
> Problem found by coccinelle:
>
> Patch was compile tested with: x86_64_defconfig (implies CONFIG_DRM_I915)
>
> Patch is against 4.9.0 (localversion-next is next-20161215)
>
>  drivers/gpu/drm/i915/intel_dsi_pll.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_dsi_pll.c 
> b/drivers/gpu/drm/i915/intel_dsi_pll.c
> index 56eff60..d210bc4 100644
> --- a/drivers/gpu/drm/i915/intel_dsi_pll.c
> +++ b/drivers/gpu/drm/i915/intel_dsi_pll.c
> @@ -156,8 +156,10 @@ static void vlv_enable_dsi_pll(struct intel_encoder 
> *encoder,
>   vlv_cck_write(dev_priv, CCK_REG_DSI_PLL_CONTROL,
> config->dsi_pll.ctrl & ~DSI_PLL_VCO_EN);
>  
> - /* wait at least 0.5 us after ungating before enabling VCO */
> - usleep_range(1, 10);
> + /* wait at least 0.5 us after ungating before enabling VCO,
> +  * allow hrtimer subsystem optimization by relaxing timing
> +  */
> + usleep_range(10, 50);
>  
>   vlv_cck_write(dev_priv, CCK_REG_DSI_PLL_CONTROL, config->dsi_pll.ctrl);

-- 
Jani Nikula, Intel Open Source Technology Center


[Bug 98869] Electronic Super Joy graphic artefacts (regression)

2016-12-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=98869

cosiekvfj at o2.pl changed:

 What                           |Removed |Added

 Attachment #128445 is obsolete |0       |1

--- Comment #10 from cosiekvfj at o2.pl ---
Created attachment 128497
  --> https://bugs.freedesktop.org/attachment.cgi?id=128497&action=edit
apitrace

apitrace: loaded into /usr/bin/apitrace32
apitrace: unloaded from /usr/bin/apitrace32
apitrace: loaded into
/home/kacper/.local/share/Steam/steamapps/common/ElectronicSuperJoy/ElectronicSuperJoy
Set current directory to
/home/kacper/.local/share/Steam/steamapps/common/ElectronicSuperJoy
Found path:
/home/kacper/.local/share/Steam/steamapps/common/ElectronicSuperJoy/ElectronicSuperJoy
Mono path[0] =
'/home/kacper/.local/share/Steam/steamapps/common/ElectronicSuperJoy/ElectronicSuperJoy_Data/Managed'
Mono path[1] =
'/home/kacper/.local/share/Steam/steamapps/common/ElectronicSuperJoy/ElectronicSuperJoy_Data/Mono'
Mono config path =
'/home/kacper/.local/share/Steam/steamapps/common/ElectronicSuperJoy/ElectronicSuperJoy_Data/Mono/etc'
apitrace: tracing to
/home/kacper/.local/share/Steam/steamapps/common/ElectronicSuperJoy/ElectronicSuperJoy.trace
apitrace: redirecting dlopen("libGL.so.1", 0x102)
apitrace: attempting to read configuration file:
/home/kacper/.config/apitrace/gltrace.conf
apitrace: warning: glVertexPointer: call will be faked due to pointer to user
memory
(https://github.com/apitrace/apitrace/blob/master/docs/BUGS.markdown#tracing)
apitrace: warning: glTexCoordPointer: call will be faked due to pointer to user
memory
(https://github.com/apitrace/apitrace/blob/master/docs/BUGS.markdown#tracing)
apitrace: warning: glColorPointer: call will be faked due to pointer to user
memory
(https://github.com/apitrace/apitrace/blob/master/docs/BUGS.markdown#tracing)
apitrace: unloaded from
/home/kacper/.local/share/Steam/steamapps/common/ElectronicSuperJoy/ElectronicSuperJoy



[PATCH v2 08/40] drm: Add a simple prime number generator

2016-12-16 Thread Lukas Wunner
On Fri, Dec 16, 2016 at 07:46:46AM +, Chris Wilson wrote:
> Prime numbers are interesting for testing components that use multiplies
> and divides, such as testing struct drm_mm alignment computations.
> 
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/Kconfig |   4 +
>  drivers/gpu/drm/Makefile|   1 +
>  drivers/gpu/drm/lib/drm_prime_numbers.c | 175 
> 
>  drivers/gpu/drm/lib/drm_prime_numbers.h |  10 ++
>  4 files changed, 190 insertions(+)
>  create mode 100644 drivers/gpu/drm/lib/drm_prime_numbers.c
>  create mode 100644 drivers/gpu/drm/lib/drm_prime_numbers.h

Hm, why not put this in lib/ ?  Don't see anything DRM-specific here
at first glance and this might be useful to others.  Or others might
come up with improvements and they'll be more likely to discover it
outside of DRM.

Same for the random permutations patch.

Thanks,

Lukas

> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 2e6ae95459e4..93895898d596 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -53,6 +53,7 @@ config DRM_DEBUG_MM_SELFTEST
>   depends on DRM
>   depends on DEBUG_KERNEL
>   select DRM_LIB_RANDOM
> + select DRM_LIB_PRIMES
>   default n
>   help
> This option provides a kernel module that can be used to test
> @@ -340,3 +341,6 @@ config DRM_LIB_RANDOM
>   bool
>   default n
>  
> +config DRM_LIB_PRIMES
> + bool
> + default n
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 0fa16275fdae..bbd390fa8914 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -19,6 +19,7 @@ drm-y   :=  drm_auth.o drm_bufs.o drm_cache.o \
>   drm_dumb_buffers.o drm_mode_config.o
>  
>  drm-$(CONFIG_DRM_LIB_RANDOM) += lib/drm_random.o
> +obj-$(CONFIG_DRM_LIB_PRIMES) += lib/drm_prime_numbers.o
>  obj-$(CONFIG_DRM_DEBUG_MM_SELFTEST) += selftests/test-drm_mm.o
>  
>  drm-$(CONFIG_COMPAT) += drm_ioc32.o
> diff --git a/drivers/gpu/drm/lib/drm_prime_numbers.c 
> b/drivers/gpu/drm/lib/drm_prime_numbers.c
> new file mode 100644
> index ..839563d9b787
> --- /dev/null
> +++ b/drivers/gpu/drm/lib/drm_prime_numbers.c
> @@ -0,0 +1,175 @@
> +#include 
> +#include 
> +#include 
> +
> +#include "drm_prime_numbers.h"
> +
> +static DEFINE_MUTEX(lock);
> +
> +static struct primes {
> + struct rcu_head rcu;
> + unsigned long last, sz;
> + unsigned long primes[];
> +} __rcu *primes;
> +
> +static bool slow_is_prime_number(unsigned long x)
> +{
> + unsigned long y = int_sqrt(x) + 1;
> +
> + while (y > 1) {
> + if ((x % y) == 0)
> + break;
> + y--;
> + }
> +
> + return y == 1;
> +}
> +
> +static unsigned long slow_next_prime_number(unsigned long x)
> +{
> + for (;;) {
> + if (slow_is_prime_number(++x))
> + return x;
> + }
> +}
> +
> +static unsigned long mark_multiples(unsigned long x,
> + unsigned long *p,
> + unsigned long start,
> + unsigned long end)
> +{
> + unsigned long m;
> +
> + m = 2 * x;
> + if (m < start)
> + m = (start / x + 1) * x;
> +
> + while (m < end) {
> + __clear_bit(m, p);
> + m += x;
> + }
> +
> + return x;
> +}
> +
> +static struct primes *expand(unsigned long x)
> +{
> + unsigned long sz, y, prev;
> + struct primes *p, *new;
> +
> + sz = x * x;
> + if (sz < x)
> + return NULL;
> +
> + mutex_lock(&lock);
> + p = rcu_dereference_protected(primes, lockdep_is_held(&lock));
> + if (p && x < p->last)
> + goto unlock;
> +
> + sz = round_up(sz, BITS_PER_LONG);
> + new = kmalloc(sizeof(*new) + sz / sizeof(long), GFP_KERNEL);
> + if (!new) {
> + p = NULL;
> + goto unlock;
> + }
> +
> + /* Where memory permits, track the primes using the
> +  * Sieve of Eratosthenes.
> +  */
> + if (p) {
> + prev = p->sz;
> + memcpy(new->primes, p->primes, prev / BITS_PER_LONG);
> + } else {
> + prev = 0;
> + }
> + memset(new->primes + prev / BITS_PER_LONG,
> +0xff, (sz - prev) / sizeof(long));
> + for (y = 2UL; y < sz; y = find_next_bit(new->primes, sz, y + 1))
> + new->last = mark_multiples(y, new->primes, prev, sz);
> + new->sz = sz;
> +
> + rcu_assign_pointer(primes, new);
> + if (p)
> + kfree_rcu(p, rcu);
> + p = new;
> +
> +unlock:
> + mutex_unlock(&lock);
> + return p;
> +}
> +
> +unsigned long drm_next_prime_number(unsigned long x)
> +{
> + struct primes *p;
> +
> + if (x < 2)
> + return 2;
> +
> + rcu_read_lock();
> + p = rcu_dereference(primes);
> + if (!p || x >= p->last) {
> + rcu_read_unlo
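
The interface above is easiest to see with a short usage sketch (an
assumption about how a caller might walk primes with drm_next_prime_number();
the check_alignment() hook below is hypothetical and not part of the patch):

	/* Visit every prime below some bound; drm_next_prime_number(p)
	 * returns the smallest prime greater than p.
	 */
	unsigned long prime;

	for (prime = 2; prime < 4096; prime = drm_next_prime_number(prime))
		check_alignment(prime);	/* hypothetical test hook */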

[PATCH v2 07/40] drm: Add a simple generator of random permutations

2016-12-16 Thread Joonas Lahtinen
On pe, 2016-12-16 at 07:46 +, Chris Wilson wrote:
> When testing, we want a random but yet reproducible order in which to
> process elements. Here we create an array which is a random (using the
> Tausworthe PRNG) permutation of the order in which to execute.
> 
> v2: Tidier code by David Herrmann
> 
> Signed-off-by: Chris Wilson 
> Cc: Joonas Lahtinen 
> Cc: David Herrmann 



> @@ -0,0 +1,41 @@
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "drm_random.h"
> +
> +static inline u32 prandom_u32_max_state(u32 ep_ro, struct rnd_state *state)
> +{
> + return upper_32_bits((u64)prandom_u32_state(state) * ep_ro);
> +}
> +

To be submitted upstream. If you prefix the function here, there won't
be a conflict when the upstream part gets merged.

> +++ b/drivers/gpu/drm/lib/drm_random.h
> @@ -0,0 +1,21 @@
> +#ifndef __DRM_RANDOM_H__
> +#define __DRM_RANDOM_H__
> +
> +#include 
> +
> +#define RND_STATE_INITIALIZER(seed__) ({ \
> + struct rnd_state state__;   \
> + prandom_seed_state(&state__, (seed__)); \
> + state__;\
> +})
> +
> +#define RND_STATE(name__, seed__) \
> + struct rnd_state name__ = RND_STATE_INITIALIZER(seed__)

For upstream submission too. Same comment as above.

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
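
For reference, the permutation the patch describes can be sketched as a
Fisher-Yates shuffle built on the prandom_u32_max_state() helper quoted
above (an illustrative sketch only, not the hunk from the patch itself):

	/* Fill order[] with 0..count-1 and shuffle it in place using the
	 * seeded Tausworthe state, so the order is random yet reproducible.
	 */
	static void shuffle_order(unsigned int *order, unsigned int count,
				  struct rnd_state *state)
	{
		unsigned int i;

		for (i = 0; i < count; i++)
			order[i] = i;

		if (count < 2)
			return;

		for (i = count - 1; i > 0; i--) {
			/* j is uniform in [0, i] */
			unsigned int j = prandom_u32_max_state(i + 1, state);

			swap(order[i], order[j]);
		}
	}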


[PATCH v2 09/40] drm: kselftest for drm_mm_init()

2016-12-16 Thread Joonas Lahtinen
On pe, 2016-12-16 at 07:46 +, Chris Wilson wrote:
> Simple first test to just exercise initialisation of struct drm_mm.
> 
> Signed-off-by: Chris Wilson 

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


[PATCH v2 08/40] drm: Add a simple prime number generator

2016-12-16 Thread Chris Wilson
On Fri, Dec 16, 2016 at 10:31:17AM +0100, Lukas Wunner wrote:
> On Fri, Dec 16, 2016 at 07:46:46AM +, Chris Wilson wrote:
> > Prime numbers are interesting for testing components that use multiplies
> > and divides, such as testing struct drm_mm alignment computations.
> > 
> > Signed-off-by: Chris Wilson 
> > ---
> >  drivers/gpu/drm/Kconfig |   4 +
> >  drivers/gpu/drm/Makefile|   1 +
> >  drivers/gpu/drm/lib/drm_prime_numbers.c | 175 
> > 
> >  drivers/gpu/drm/lib/drm_prime_numbers.h |  10 ++
> >  4 files changed, 190 insertions(+)
> >  create mode 100644 drivers/gpu/drm/lib/drm_prime_numbers.c
> >  create mode 100644 drivers/gpu/drm/lib/drm_prime_numbers.h
> 
> Hm, why not put this in lib/ ?  Don't see anything DRM-specific here
> at first glance and this might be useful to others.  Or others might
> come up with improvements and they'll be more likely to discover it
> outside of DRM.

Because that is a 3+ month cycle before I can then apply the testcases,
and without the testcases do you want the bugfixes?

If I put it in drm/lib then lift it, I can use it immediately and drop
the local copy once merged.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[Bug 98869] Electronic Super Joy graphic artefacts (regression)

2016-12-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=98869

--- Comment #11 from cosiekvfj at o2.pl ---
"skl shows correct rendering" What does that mean?

I also tested app with LIBGL_ALWAYS_SOFTWARE=1 and then rendering is correct.

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<https://lists.freedesktop.org/archives/dri-devel/attachments/20161216/f1096310/attachment.html>


[PATCH v2 10/40] drm: kselftest for drm_mm_debug()

2016-12-16 Thread Joonas Lahtinen
On pe, 2016-12-16 at 07:46 +, Chris Wilson wrote:
> Simple test to just exercise calling the debug dumper on the drm_mm.
> 
> Signed-off-by: Chris Wilson 

This is rather meta already. Not entirely sure how good of a selftest
this is when we do not validate the generated output, or do you validate it
at the runner side?

Code itself is (but I'm unsure of the usefulness as is);

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


[Bug 98869] Electronic Super Joy graphic artefacts (regression)

2016-12-16 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=98869

--- Comment #12 from cosiekvfj at o2.pl ---
Comment on attachment 128497
  --> https://bugs.freedesktop.org/attachment.cgi?id=128497
apitrace

Longer apitrace

-- 
You are receiving this mail because:
You are the assignee for the bug.
-- next part --
An HTML attachment was scrubbed...
URL: 
<https://lists.freedesktop.org/archives/dri-devel/attachments/20161216/4475bcd8/attachment.html>


Radeon X200M device suspend problem

2016-12-16 Thread Michel Dänzer
On 16/12/16 01:29 AM, Dmitriy Kryuk wrote:
> I have a laptop with a Radeon X200M card in it. I use Radeon DRM driver
> for graphics, and it makes the system hang with display off when trying
> to suspend (either to disk or to RAM). Using /sys/power/pm_test
> interface revealed that it freezes when suspending devices.
> 
> I have tried both Debian repository kernel
> (https://packages.debian.org/stable/linux-image-3.16.0-4-686-pae) and a
> custom-built vanilla 3.18.45 kernel with this driver both built-in and
> included as a module. The problem reproduces the same way. It stops to
> reproduce if I delete the module radeon.ko or otherwise prevent it from
> loading. The problem didn't appear before I started using DRM.
> 
> dmitriy at laptop:~$ dmesg | grep -i radeon
> [   18.520307] [drm] radeon kernel modesetting enabled.
> [   18.521610] radeon :01:05.0: VRAM: 128M 0x7800 -
> 0x7FFF (128M used)
> [   18.521667] radeon :01:05.0: GTT: 512M 0x8000 -
> 0x9FFF
> [   18.526251] [drm] radeon: 128M of VRAM memory ready
> [   18.526303] [drm] radeon: 512M of GTT memory ready.
> [   18.558510] [drm] radeon: 1 quad pipes, 1 z pipes initialized.
> [   18.558685] radeon :01:05.0: WB enabled
> [   18.558739] radeon :01:05.0: fence driver on ring 0 use gpu addr
> 0x8000 and cpu addr 0xf48a6000
> [   18.558896] radeon :01:05.0: radeon: MSI limited to 32-bit
> [   18.558983] [drm] radeon: irq initialized.
> [   18.696262] [drm] radeon: ring at 0x80001000
> [   18.697892] [drm] Radeon Display Connectors
> [   18.736547] radeon :01:05.0: fb0: radeondrmfb frame buffer device
> [   18.736601] radeon :01:05.0: registered panic notifier
> [   18.736660] [drm] Initialized radeon 2.40.0 20080528 for :01:05.0
> on minor 0
> dmitriy at laptop:~$ lspci | grep -i radeon
> 01:05.0 VGA compatible controller: Advanced Micro Devices, Inc.
> [AMD/ATI] RC410M [Mobility Radeon Xpress 200M]
> 
> dmitriy at laptop:~$ lsmod
> Module  Size  Used by
> fbcon  42796  71
> bitblit12545  1 fbcon
> softcursor 12333  1 bitblit
> tileblit   12517  1 fbcon
> ath9k_htc  50765  0
> radeon   1438969  2
> ath9k_common   21530  1 ath9k_htc
> ath9k_hw  369801  2 ath9k_common,ath9k_htc
> ath21707  3 ath9k_common,ath9k_htc,ath9k_hw
> cfbfillrect12474  1 radeon
> cfbimgblt  12335  1 radeon
> cfbcopyarea12334  1 radeon
> i2c_algo_bit   12640  1 radeon
> drm_kms_helper 71528  1 radeon
> ttm67750  1 radeon
> drm   207864  5 ttm,drm_kms_helper,radeon
> 
> What additional information can I collect and how? What other kernel and
> driver versions can I try to see if the problem is already solved?

Please try a current kernel version; ideally 4.9, but definitely much
newer than 3.18.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


[PATCH v2 10/40] drm: kselftest for drm_mm_debug()

2016-12-16 Thread Chris Wilson
On Fri, Dec 16, 2016 at 11:44:39AM +0200, Joonas Lahtinen wrote:
> On pe, 2016-12-16 at 07:46 +, Chris Wilson wrote:
> > Simple test to just exercise calling the debug dumper on the drm_mm.
> > 
> > Signed-off-by: Chris Wilson 
> 
> This is rather meta already. Not entirely sure how good of a selftest
> this is when we do not validate the generated output, or do you validate it
> at the runner side?

No, it is just to ensure we have coverage of that function. All it does is
make sure it doesn't explode, under very tame circumstances.

I thought of doing a few mocks to capture the output and decided that
was asking a little too much for a debug function. Still don't have
coverage for the debugfs dumper...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[PATCH v2 08/40] drm: Add a simple prime number generator

2016-12-16 Thread Lukas Wunner
On Fri, Dec 16, 2016 at 09:43:54AM +, Chris Wilson wrote:
> On Fri, Dec 16, 2016 at 10:31:17AM +0100, Lukas Wunner wrote:
> > On Fri, Dec 16, 2016 at 07:46:46AM +, Chris Wilson wrote:
> > > Prime numbers are interesting for testing components that use multiplies
> > > and divides, such as testing struct drm_mm alignment computations.
> > > 
> > > Signed-off-by: Chris Wilson 
> > > ---
> > >  drivers/gpu/drm/Kconfig |   4 +
> > >  drivers/gpu/drm/Makefile|   1 +
> > >  drivers/gpu/drm/lib/drm_prime_numbers.c | 175 
> > > 
> > >  drivers/gpu/drm/lib/drm_prime_numbers.h |  10 ++
> > >  4 files changed, 190 insertions(+)
> > >  create mode 100644 drivers/gpu/drm/lib/drm_prime_numbers.c
> > >  create mode 100644 drivers/gpu/drm/lib/drm_prime_numbers.h
> > 
> > Hm, why not put this in lib/ ?  Don't see anything DRM-specific here
> > at first glance and this might be useful to others.  Or others might
> > come up with improvements and they'll be more likely to discover it
> > outside of DRM.
> 
> Because that is a 3+ month cycle before I can then apply the testcases,
> and without the testscases do you want the bugfixes?

Do patches for lib/ have to go through a different tree?
Don't think so, I've seen e.g. changes to lib/ucs2_string.c
go through the EFI tree.  It seems to me lib/ is sort of
free for all.


> If I put it in drm/lib then lift it, I can use it immediately and drop
> the local copy once merged.

That is also workable of course.  Anyway, it was just a suggestion.

Thanks,

Lukas


[PATCH v2 24/40] drm: Fix kerneldoc for drm_mm_scan_remove_block()

2016-12-16 Thread Joonas Lahtinen
On pe, 2016-12-16 at 07:47 +, Chris Wilson wrote:
> The nodes must be removed in the *reverse* order. This is correct in the
> overview, but backwards in the function description. Whilst here add
> Intel's copyright statement and tweak some formatting.
> 
> Signed-off-by: Chris Wilson 

It's like a Christmas gift for Daniel.

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


[PATCH v2 25/40] drm: Detect overflow in drm_mm_reserve_node()

2016-12-16 Thread Joonas Lahtinen
On pe, 2016-12-16 at 07:47 +, Chris Wilson wrote:
> Protect ourselves from a caller passing in node.start + node.size that
> will overflow and trick us into reserving that node.
> 
> Signed-off-by: Chris Wilson 

I was about to suggest an additional check (but didn't). A combined
check is much more clever.

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
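
The combined check being praised is presumably of this general shape (a
sketch of the idea only; the actual hunk is not quoted in this thread):

	/* end wraps around on overflow and then compares as smaller than
	 * start, so one pair of unsigned comparisons rejects both an
	 * overflowing node and one that simply doesn't fit the hole.
	 */
	u64 end = node->start + node->size;

	if (end <= node->start || end > hole_end)
		return -ENOSPC;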


[PATCH v2 08/40] drm: Add a simple prime number generator

2016-12-16 Thread Chris Wilson
On Fri, Dec 16, 2016 at 11:08:10AM +0100, Lukas Wunner wrote:
> On Fri, Dec 16, 2016 at 09:43:54AM +, Chris Wilson wrote:
> > On Fri, Dec 16, 2016 at 10:31:17AM +0100, Lukas Wunner wrote:
> > > On Fri, Dec 16, 2016 at 07:46:46AM +, Chris Wilson wrote:
> > > > Prime numbers are interesting for testing components that use multiplies
> > > > and divides, such as testing struct drm_mm alignment computations.
> > > > 
> > > > Signed-off-by: Chris Wilson 
> > > > ---
> > > >  drivers/gpu/drm/Kconfig |   4 +
> > > >  drivers/gpu/drm/Makefile|   1 +
> > > >  drivers/gpu/drm/lib/drm_prime_numbers.c | 175 
> > > > 
> > > >  drivers/gpu/drm/lib/drm_prime_numbers.h |  10 ++
> > > >  4 files changed, 190 insertions(+)
> > > >  create mode 100644 drivers/gpu/drm/lib/drm_prime_numbers.c
> > > >  create mode 100644 drivers/gpu/drm/lib/drm_prime_numbers.h
> > > 
> > > Hm, why not put this in lib/ ?  Don't see anything DRM-specific here
> > > at first glance and this might be useful to others.  Or others might
> > > come up with improvements and they'll be more likely to discover it
> > > outside of DRM.
> > 
> > Because that is a 3+ month cycle before I can then apply the testcases,
> and without the testcases do you want the bugfixes?
> 
> Do patches for lib/ have to go through a different tree?
> Don't think so, I've seen e.g. changes to lib/ucs2_string.c
> go through the EFI tree.  It seems to me lib/ is sort of
> free for all.

Hmm, I was expecting to shepherd them through say Andrew Morton.
lib/random32.c is maintained by David Miller, so definitely would like
to present a simple set of pre-reviewed patches.

But it looks like we could create lib/prime_numbers.c without too much
consternation.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[PATCH 0/2] drm: link status property and DP link training failure handling

2016-12-16 Thread Jani Nikula
The two remaining patches from [1], rebased.

BR,
Jani.


[1] http://mid.mail-archive.com/1480984058-552-1-git-send-email-manasi.d.navare 
at intel.com


Manasi Navare (2):
  drm: Add a new connector atomic property for link status
  drm/i915: Implement Link Rate fallback on Link training failure

 drivers/gpu/drm/drm_atomic.c  | 16 +
 drivers/gpu/drm/drm_atomic_helper.c   | 15 
 drivers/gpu/drm/drm_connector.c   | 52 +++
 drivers/gpu/drm/i915/intel_dp.c   | 27 ++
 drivers/gpu/drm/i915/intel_dp_link_training.c | 22 ++--
 drivers/gpu/drm/i915/intel_drv.h  |  3 ++
 include/drm/drm_connector.h   | 19 ++
 include/drm/drm_mode_config.h |  5 +++
 include/uapi/drm/drm_mode.h   |  4 +++
 9 files changed, 161 insertions(+), 2 deletions(-)

-- 
2.1.4



[PATCH 1/2] drm: Add a new connector atomic property for link status

2016-12-16 Thread Jani Nikula
From: Manasi Navare 

At the time userspace does setcrtc, we've already promised the mode
would work. The promise is based on the theoretical capabilities of
the link, but it's possible we can't reach this in practice. The DP
spec describes how the link should be reduced, but we can't reduce
the link below the requirements of the mode. Black screen follows.

One idea would be to have setcrtc return a failure. However, it
already should not fail as the atomic checks have passed. It would
also conflict with the idea of making setcrtc asynchronous in the
future, returning before the actual mode setting and link training.

Another idea is to train the link "upfront" at hotplug time, before
pruning the mode list, so that we can do the pruning based on
practical not theoretical capabilities. However, the changes for link
training are pretty drastic, all for the sake of error handling and
DP compliance, when the most common happy day scenario is the current
approach of link training at mode setting time, using the optimal
parameters for the mode. It is also not certain all hardware could do
this without the pipe on; not even all our hardware can do this. Some
of this can be solved, but not trivially.

Both of the above ideas also fail to address link degradation *during*
operation.

The solution is to add a new "link-status" connector property in order
to address link training failure in a way that:
a) changes the current happy day scenario as little as possible, to
avoid regressions, b) can be implemented the same way by all drm
drivers, c) is still opt-in for the drivers and userspace, and opting
out doesn't regress the user experience, d) doesn't prevent drivers
from implementing better or alternate approaches, possibly without
userspace involvement. And, of course, handles all the issues presented.
In the usual happy day scenario, this is always "good". If something
fails during or after a mode set, the kernel driver can set the link
status to "bad" and issue a hotplug uevent for userspace to have it
re-check the valid modes through GET_CONNECTOR IOCTL, and try modeset
again. If the theoretical capabilities of the link can't be reached,
the mode list is trimmed based on that.
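
To illustrate the userspace side described above, a minimal check of the new
property could look roughly like this (a sketch using libdrm; only the
property name "link-status" and the convention that 0 means "Good" come from
this patch, everything else is an assumption and error handling is omitted):

	#include <stdbool.h>
	#include <string.h>
	#include <xf86drm.h>
	#include <xf86drmMode.h>

	/* Returns true if the connector currently reports a bad link, in
	 * which case the caller should re-fetch the (possibly pruned) mode
	 * list and try another modeset.
	 */
	static bool connector_link_is_bad(int fd, uint32_t connector_id)
	{
		drmModeObjectPropertiesPtr props;
		bool bad = false;
		uint32_t i;

		props = drmModeObjectGetProperties(fd, connector_id,
						   DRM_MODE_OBJECT_CONNECTOR);
		if (!props)
			return false;

		for (i = 0; i < props->count_props; i++) {
			drmModePropertyPtr prop =
				drmModeGetProperty(fd, props->props[i]);

			if (!prop)
				continue;
			if (!strcmp(prop->name, "link-status"))
				bad = props->prop_values[i] != 0; /* 0 == Good */
			drmModeFreeProperty(prop);
		}

		drmModeFreeObjectProperties(props);
		return bad;
	}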

v7 by Jani:
* Rebase, simplify set property while at it, checkpatch fix
v6:
* Fix a typo in kernel doc (Sean Paul)
v5:
* Clarify doc for silent rejection of atomic properties by driver (Daniel 
Vetter)
v4:
* Add comments in kernel-doc format (Daniel Vetter)
* Update the kernel-doc for link-status (Sean Paul)
v3:
* Fixed a build error (Jani Saarinen)
v2:
* Removed connector->link_status (Daniel Vetter)
* Set connector->state->link_status in 
drm_mode_connector_set_link_status_property
(Daniel Vetter)
* Set the connector_changed flag to true if connector->state->link_status 
changed.
* Reset link_status to GOOD in update_output_state (Daniel Vetter)
* Never allow userspace to set link status from Good To Bad (Daniel Vetter)

Reviewed-by: Sean Paul 
Reviewed-by: Daniel Vetter 
Reviewed-by: Jani Nikula 
Acked-by: Tony Cheng 
Acked-by: Harry Wentland 
Cc: Jani Nikula 
Cc: Daniel Vetter 
Cc: Ville Syrjala 
Cc: Chris Wilson 
Cc: Sean Paul 
Signed-off-by: Manasi Navare 
Signed-off-by: Jani Nikula 
---
 drivers/gpu/drm/drm_atomic.c| 16 
 drivers/gpu/drm/drm_atomic_helper.c | 15 +++
 drivers/gpu/drm/drm_connector.c | 52 +
 include/drm/drm_connector.h | 19 ++
 include/drm/drm_mode_config.h   |  5 
 include/uapi/drm/drm_mode.h |  4 +++
 6 files changed, 111 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index ff38592134f5..91fd8a9a7526 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -,6 +,20 @@ int drm_atomic_connector_set_property(struct 
drm_connector *connector,
state->tv.saturation = val;
} else if (property == config->tv_hue_property) {
state->tv.hue = val;
+   } else if (property == config->link_status_property) {
+   /* Never downgrade from GOOD to BAD on userspace's request here,
+* only hw issues can do that.
+*
+* For an atomic property the userspace doesn't need to be able
+* to understand all the properties, but needs to be able to
+* restore the state it wants on VT switch. So if the userspace
+* tries to change the link_status from GOOD to BAD, driver
+* silently rejects it and returns a 0. This prevents userspace
+* from accidentally breaking the display when it restores the
+* state.
+*/
+   if (state->link_status != DRM_LINK_STATUS_GOOD)
+   state->link_status = val;
} else if (connector->funcs->atomic_set_property) {
return connector->funcs->atomic_set_property(connector,
   

[PATCH 2/2] drm/i915: Implement Link Rate fallback on Link training failure

2016-12-16 Thread Jani Nikula
From: Manasi Navare 

If link training at a link rate optimal for a particular
mode fails during modeset's atomic commit phase, then we
let the modeset complete and then retry. We save the link rate
value at which link training failed, update the link status property
to "BAD" and use a lower link rate to prune the modes. It will redo
the modeset on the current mode at lower link rate or if the current
mode gets pruned due to lower link constraints then, it will send a
hotplug uevent for userspace to handle it.

This is also required to pass DP CTS tests 4.3.1.3, 4.3.1.4,
4.3.1.6.

v9:
* Use the trimmed max values of link rate/lane count based on
link train fallback (Daniel Vetter)
v8:
* Set link_status to BAD first and then call mode_valid (Jani Nikula)
v7:
Remove the redundant variable in previous patch itself
v6:
* Obtain link rate index from fallback_link_rate using
the helper intel_dp_link_rate_index (Jani Nikula)
* Include fallback within intel_dp_start_link_train (Jani Nikula)
v5:
* Move set link status to drm core (Daniel Vetter, Jani Nikula)
v4:
* Add fallback support for non DDI platforms too
* Set connector->link status inside set_link_status function
(Jani Nikula)
v3:
* Set link status property to BAD unconditionally (Jani Nikula)
* Dont use two separate variables link_train_failed and link_status
to indicate same thing (Jani Nikula)
v2:
* Squashed a few patches (Jani Nikula)

Acked-by: Tony Cheng 
Acked-by: Harry Wentland 
Cc: Jani Nikula 
Cc: Daniel Vetter 
Cc: Ville Syrjala 
Signed-off-by: Manasi Navare 
Reviewed-by: Jani Nikula 
Signed-off-by: Jani Nikula 
---
 drivers/gpu/drm/i915/intel_dp.c   | 27 +++
 drivers/gpu/drm/i915/intel_dp_link_training.c | 22 --
 drivers/gpu/drm/i915/intel_drv.h  |  3 +++
 3 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 45ebc9632633..97d1e03d22b8 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -5671,6 +5671,29 @@ static bool intel_edp_init_connector(struct intel_dp 
*intel_dp,
return false;
 }

+static void intel_dp_modeset_retry_work_fn(struct work_struct *work)
+{
+   struct intel_connector *intel_connector;
+   struct drm_connector *connector;
+
+   intel_connector = container_of(work, typeof(*intel_connector),
+  modeset_retry_work);
+   connector = &intel_connector->base;
+   DRM_DEBUG_KMS("[CONNECTOR:%d:%s]\n", connector->base.id,
+ connector->name);
+
+   /* Grab the locks before changing connector property*/
+   mutex_lock(&connector->dev->mode_config.mutex);
+   /* Set connector link status to BAD and send a Uevent to notify
+* userspace to do a modeset.
+*/
+   drm_mode_connector_set_link_status_property(connector,
+   DRM_MODE_LINK_STATUS_BAD);
+   mutex_unlock(&connector->dev->mode_config.mutex);
+   /* Send Hotplug uevent so userspace can reprobe */
+   drm_kms_helper_hotplug_event(connector->dev);
+}
+
 bool
 intel_dp_init_connector(struct intel_digital_port *intel_dig_port,
struct intel_connector *intel_connector)
@@ -5683,6 +5706,10 @@ intel_dp_init_connector(struct intel_digital_port 
*intel_dig_port,
enum port port = intel_dig_port->port;
int type;

+   /* Initialize the work for modeset in case of link train failure */
+   INIT_WORK(&intel_connector->modeset_retry_work,
+ intel_dp_modeset_retry_work_fn);
+
if (WARN(intel_dig_port->max_lanes < 1,
 "Not enough lanes (%d) for DP on port %c\n",
 intel_dig_port->max_lanes, port_name(port)))
diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c 
b/drivers/gpu/drm/i915/intel_dp_link_training.c
index 0048b520baf7..955b239e7c2d 100644
--- a/drivers/gpu/drm/i915/intel_dp_link_training.c
+++ b/drivers/gpu/drm/i915/intel_dp_link_training.c
@@ -313,6 +313,24 @@ void intel_dp_stop_link_train(struct intel_dp *intel_dp)
 void
 intel_dp_start_link_train(struct intel_dp *intel_dp)
 {
-   intel_dp_link_training_clock_recovery(intel_dp);
-   intel_dp_link_training_channel_equalization(intel_dp);
+   struct intel_connector *intel_connector = intel_dp->attached_connector;
+
+   if (!intel_dp_link_training_clock_recovery(intel_dp))
+   goto failure_handling;
+   if (!intel_dp_link_training_channel_equalization(intel_dp))
+   goto failure_handling;
+
+   DRM_DEBUG_KMS("Link Training Passed at Link Rate = %d, Lane count = %d",
+ intel_dp->link_rate, intel_dp->lane_count);
+   return;
+
+ failure_handling:
+   DRM_DEBUG_KMS("Link Training failed at link rate = %d, lane count = %d",
+ intel_dp->link_rate, intel_dp->lane_count);
+   if (!intel_dp_get_lin

[RFC v2 00/11] vb2: Handle user cache hints, allow drivers to choose cache coherency

2016-12-16 Thread Hans Verkuil
On 16/12/16 02:24, Laurent Pinchart wrote:
> Hello,
>
> This is a rebased version of the vb2 cache hints support patch series posted
> by Sakari more than a year ago. The patches have been modified as needed by
> the upstream changes and received the occasional small odd fix but are
> otherwise not modified. Please see the individual commit messages for more
> information.
>
> The videobuf2 memory managers use the DMA mapping API to handle cache
> synchronization on systems that require them transparently for drivers. As
> cache operations are expensive, system performances can be impacted. Cache
> synchronization can't be skipped altogether if we want to retain correct
> behaviour, but optimizations are possible in cases related to buffer sharing
> between multiple devices without CPU access to the memory.
>
> The first optimization covers cases where the memory never needs to be
> accessed by the CPU (neither in kernelspace nor in userspace). In those cases,
> as no CPU memory mappings exist, cache synchronization can be skipped. The
> situation could be detected in the kernel as we have enough information to
> determine whether CPU mappings for kernelspace or userspace exist (in the
> first case because drivers should request them explicitly, in the second case
> because the mmap() handler hasn't been invoked). This optimization is not
> implemented currently but should at least be prototyped as it could improve
> performances automatically in a large number of cases.
>
> The second class of optimizations covers cases where the memory sometimes needs
> to be accessed by the CPU. In those cases memory mapping must be created and
> cache handled, but cache synchronization could be skipped for buffers that are
> not touched by the CPU.
>
> By default the following cache synchronization operations need to be performed
> related to the buffer management ioctls. For simplicity, mentions of QBUF below
> apply to both VIDIOC_QBUF and VIDIOC_PREPARE_BUF.
>
>           | QBUF       | DQBUF
>   --------+------------+----------------
>   CAPTURE | Invalidate | Invalidate (*)
>   OUTPUT  | Clean      | -
>
> (*) for systems using speculative pre-fetching only
>
> The following cases can be optimized.
>
> 1. CAPTURE, the CPU has not written to the buffer before QBUF
>
>Cache invalidation can be skipped at QBUF time, but becomes required at
>DQBUF time on all systems, regardless of whether they use speculative
>prefetching.
>
> 2. CAPTURE, the CPU will not read from the buffer after DQBUF
>
>Cache invalidation can be skipped at DQBUF time.
>
> 3. CAPTURE, combination of (1) and (2)
>
>Cache invalidation can be skipped at both QBUF and DQBUF time.
>
> 4. OUTPUT, the CPU has not written to the buffer before QBUF
>
>Cache clean can be skipped at QBUF time.
>
>
> The kernel can't detect those situations automatically and thus requires
> hints from userspace to decide whether cache synchronization can be skipped.
> It should be noted that those hints might not be honoured. In particular, if
> userspace hints that it hasn't touched the buffer with the CPU, drivers might
> need to perform memory accesses themselves (adding JPEG or MPEG headers to
> buffers is a common case where CPU access could be needed in the kernel), in
> which case the userspace hints will be ignored.
>
> Getting the hints wrong will result in data corruption. Userspace applications
> are allowed to shoot themselves in the foot, but drivers are responsible for
> deciding whether data corruption can pose a risk to the system in general. For
> instance if the device could be made to crash, or behave in a way that would
> jeopardize system security, reliability or performances, when fed with invalid
> data, cache synchronization shall not be skipped solely due to possibly
> incorrect userspace hints.
>
> The V4L2 API defines two flags, V4L2-BUF-FLAG-NO-CACHE-INVALIDATE and
> V4L2_BUF_FLAG_NO_CACHE_SYNC, that can be used to provide cache-related hints
> to the kernel. However, no kernel has ever implemented support for those flags
> that are thus most likely unused.
>
> A single flag is enough to cover all the optimization cases described above,
> provided we keep track of the flag being set at QBUF time to force cache
> invalidation at DQBUF time for case (1) if the  flag isn't set at DQBUF time.
> This patch series thus cleans up the userspace API and merges both flags into
> a single one.
>
> One potential issue with case (1) is that cache invalidation at DQBUF time for
> CAPTURE buffers isn't fully under the control of videobuf2. We can instruct
> the DMA mapping API to skip cache handling, but we can't force it to
> invalidate the cache in the sync_for_cpu operation for non speculative
> prefetching systems. Luckily, on ARM32 the current implementation always
> invalidates the cache in __dma_page_dev_to_cpu() for CAPTURE buffers so we are
> safe for now. However, this is documented by a FIXME comment that mig
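
As an illustration of the userspace side of case (1) above, queueing a
capture buffer with a "the CPU has not touched this" hint could look roughly
like this (a sketch only; it uses the existing V4L2_BUF_FLAG_NO_CACHE_INVALIDATE
flag from videodev2.h, assumes MMAP streaming I/O is already set up, and
omits error handling):

	#include <string.h>
	#include <sys/ioctl.h>
	#include <linux/videodev2.h>

	static int queue_capture_buffer(int fd, unsigned int index)
	{
		struct v4l2_buffer buf;

		memset(&buf, 0, sizeof(buf));
		buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
		buf.memory = V4L2_MEMORY_MMAP;
		buf.index = index;
		/* Hint: the CPU has not written to this buffer, so cache
		 * invalidation may be skipped at QBUF time (case 1).
		 */
		buf.flags |= V4L2_BUF_FLAG_NO_CACHE_INVALIDATE;

		return ioctl(fd, VIDIOC_QBUF, &buf);
	}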

[PATCH v4 19/22] drm: omapdrm: Simplify IRQ wait implementation

2016-12-16 Thread Tomi Valkeinen
_irq_wait_init(struct drm_device *dev,
>   uint32_t irqmask, int count)
>  {
> + struct omap_drm_private *priv = dev->dev_private;
>   struct omap_irq_wait *wait = kzalloc(sizeof(*wait), GFP_KERNEL);

A separate improvement, but I think we could just drop the
kzalloc/kfree, and use a local omap_irq_wait variable, passed to these
funcs.

> - wait->irq.irq = wait_irq;
> - wait->irq.irqmask = irqmask;
> + unsigned long flags;
> +
> + wait->irqmask = irqmask;
>   wait->count = count;
> - omap_irq_register(dev, &wait->irq);
> +
> + spin_lock_irqsave(&list_lock, flags);
> + list_add(&wait->node, &priv->wait_list);
> + omap_irq_update(dev);
> + spin_unlock_irqrestore(&list_lock, flags);
> +
>   return wait;
>  }
>  
> @@ -101,11 +75,16 @@ int omap_irq_wait(struct drm_device *dev, struct 
> omap_irq_wait *wait,
>   unsigned long timeout)
>  {
>   int ret = wait_event_timeout(wait_event, (wait->count <= 0), timeout);
> - omap_irq_unregister(dev, &wait->irq);
> + unsigned long flags;
> +
> + spin_lock_irqsave(&list_lock, flags);
> + list_del(&wait->node);
> + omap_irq_update(dev);
> + spin_unlock_irqrestore(&list_lock, flags);
> +
>   kfree(wait);
> - if (ret == 0)
> - return -1;
> - return 0;
> +
> + return ret == 0 ? -1 : 0;
>  }
>  
>  /**
> @@ -213,7 +192,7 @@ static irqreturn_t omap_irq_handler(int irq, void *arg)
>  {
>   struct drm_device *dev = (struct drm_device *) arg;
>   struct omap_drm_private *priv = dev->dev_private;
> - struct omap_drm_irq *handler, *n;
> + struct omap_irq_wait *wait, *n;
>   unsigned long flags;
>   unsigned int id;
>   u32 irqstatus;
> @@ -241,12 +220,9 @@ static irqreturn_t omap_irq_handler(int irq, void *arg)
>   omap_irq_fifo_underflow(priv, irqstatus);
>  
>   spin_lock_irqsave(&list_lock, flags);
> - list_for_each_entry_safe(handler, n, &priv->irq_list, node) {
> - if (handler->irqmask & irqstatus) {
> - spin_unlock_irqrestore(&list_lock, flags);
> - handler->irq(handler);
> - spin_lock_irqsave(&list_lock, flags);
> - }
> + list_for_each_entry_safe(wait, n, &priv->wait_list, node) {
> + if (wait->irqmask & irqstatus)
> + omap_irq_wait_irq(wait);
>   }
>   spin_unlock_irqrestore(&list_lock, flags);
>  
> @@ -275,7 +251,7 @@ int omap_drm_irq_install(struct drm_device *dev)
>   unsigned int i;
>   int ret;
>  
> - INIT_LIST_HEAD(&priv->irq_list);
> + INIT_LIST_HEAD(&priv->wait_list);
>  
>   priv->irq_mask = DISPC_IRQ_OCP_ERR;
>  
> 

With the function rename change:

Reviewed-by: Tomi Valkeinen 

 Tomi

-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: 
<https://lists.freedesktop.org/archives/dri-devel/attachments/20161216/b8d02d52/attachment.sig>


[alsa-lib][PATCH] ASoC: hdmi-codec: use unsigned type to structure members with bit-field instead of signed type

2016-12-16 Thread Mark Brown
On Fri, Dec 16, 2016 at 06:26:54PM +0900, Takashi Sakamoto wrote:

> This is a fix for Linux 4.10-rc1.

This...

> CC: stable at vger.kernel.org
> Fixes: 09184118a8ab ("ASoC: hdmi-codec: Add hdmi-codec for external 
> HDMI-encoders")

...and this don't add up (though it is something that should go to
stable, good catch).  Please also try to keep your subject lines much
shorter.
-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: 
<https://lists.freedesktop.org/archives/dri-devel/attachments/20161216/8509a187/attachment.sig>


[PATCH v4 20/22] drm: omapdrm: Remove global variables

2016-12-16 Thread Tomi Valkeinen
On 14/12/16 02:27, Laurent Pinchart wrote:
> Move the list of pending IRQ wait instances to the omap_drm_private
> structure and the wait queue head to the IRQ wait structure.
> 
> Signed-off-by: Laurent Pinchart 
> ---
>  drivers/gpu/drm/omapdrm/omap_drv.h |  3 ++-
>  drivers/gpu/drm/omapdrm/omap_irq.c | 42 
> --
>  2 files changed, 24 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/omapdrm/omap_drv.h 
> b/drivers/gpu/drm/omapdrm/omap_drv.h
> index 8ef7e8963bd9..b20377efd01b 100644
> --- a/drivers/gpu/drm/omapdrm/omap_drv.h
> +++ b/drivers/gpu/drm/omapdrm/omap_drv.h
> @@ -88,7 +88,8 @@ struct omap_drm_private {
>   struct drm_property *zorder_prop;
>  
>   /* irq handling: */
> - struct list_head wait_list; /* list of omap_irq_wait */
> + spinlock_t wait_lock;   /* protects the wait_list */
> + struct list_head wait_list; /* list of omap_irq_wait */
>   uint32_t irq_mask;  /* enabled irqs in addition to 
> wait_list */
>  
>   /* atomic commit */
> diff --git a/drivers/gpu/drm/omapdrm/omap_irq.c 
> b/drivers/gpu/drm/omapdrm/omap_irq.c
> index f9510c13e1a2..7555b62f6c53 100644
> --- a/drivers/gpu/drm/omapdrm/omap_irq.c
> +++ b/drivers/gpu/drm/omapdrm/omap_irq.c
> @@ -19,22 +19,21 @@
>  
>  #include "omap_drv.h"
>  
> -static DEFINE_SPINLOCK(list_lock);
> -
>  struct omap_irq_wait {
>   struct list_head node;
> + wait_queue_head_t wq;
>   uint32_t irqmask;
>   int count;
>  };

The wait_queue_head_t + count combination looks suspiciously like
completion. But as there's no wait_for_completion_count(), I guess the
current implementation is better.

Reviewed-by: Tomi Valkeinen 

 Tomi

-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: 
<https://lists.freedesktop.org/archives/dri-devel/attachments/20161216/16a89e89/attachment-0001.sig>
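
Tomi's earlier suggestion for patch 19/22 -- dropping the kzalloc/kfree in
favour of a caller-provided wait structure -- could combine with the
per-wait wait queue added here roughly as follows (a sketch against the
structures visible in the quoted diffs, not a tested patch):

	/* Synchronous variant with the omap_irq_wait on the stack; the IRQ
	 * handler is assumed to wake wait->wq when it decrements count.
	 */
	static int omap_irq_wait_sync(struct drm_device *dev, uint32_t irqmask,
				      int count, unsigned long timeout)
	{
		struct omap_drm_private *priv = dev->dev_private;
		struct omap_irq_wait wait = {
			.irqmask = irqmask,
			.count = count,
		};
		unsigned long flags;
		int ret;

		init_waitqueue_head(&wait.wq);

		spin_lock_irqsave(&priv->wait_lock, flags);
		list_add(&wait.node, &priv->wait_list);
		omap_irq_update(dev);
		spin_unlock_irqrestore(&priv->wait_lock, flags);

		ret = wait_event_timeout(wait.wq, (wait.count <= 0), timeout);

		spin_lock_irqsave(&priv->wait_lock, flags);
		list_del(&wait.node);
		omap_irq_update(dev);
		spin_unlock_irqrestore(&priv->wait_lock, flags);

		return ret == 0 ? -1 : 0;
	}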


Applied "ASoC: hdmi-codec: use unsigned type to structure members with bit-field" to the asoc tree

2016-12-16 Thread Mark Brown
The patch

   ASoC: hdmi-codec: use unsigned type to structure members with bit-field

has been applied to the asoc tree at

   git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From 9e4d59ada4d602e78eee9fb5f898ce61fdddb446 Mon Sep 17 00:00:00 2001
From: Takashi Sakamoto 
Date: Fri, 16 Dec 2016 18:26:54 +0900
Subject: [PATCH] ASoC: hdmi-codec: use unsigned type to structure members with
 bit-field

This is a fix for Linux 4.10-rc1.

In C language specification, a bit-field is interpreted as a signed or
unsigned integer type consisting of the specified number of bits.

In GCC manual, the range of a signed bit field of N bits is from
-(2^N) / 2 to ((2^N) / 2) - 1
https://www.gnu.org/software/gnu-c-manual/gnu-c-manual.html#Bit-Fields

Therefore, when defined as a 1-bit bit-field of signed type, a variable can
represent only -1 and 0.
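
A minimal standalone illustration of the difference (assumption: compiled
with GCC, where a plain "int" bit-field is treated as signed; the value read
back from the signed field is implementation-defined):

	#include <stdio.h>

	struct s { int flag:1; };		/* signed 1-bit field */
	struct u { unsigned int flag:1; };	/* unsigned 1-bit field */

	int main(void)
	{
		struct s a = { .flag = 1 };	/* 1 does not fit in the signed field */
		struct u b = { .flag = 1 };

		/* With GCC this prints: signed: -1, unsigned: 1 */
		printf("signed: %d, unsigned: %d\n", a.flag, b.flag);
		return 0;
	}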

The snd-soc-hdmi-codec module includes a structure whose members are signed
bit-fields. Code in this module assigns 0 and 1 to these members, which
results in implementation-defined behaviour.

As of v4.10-rc1 merge window, outside of sound subsystem, this structure
is referred by below GPU modules.
 - tda998x
 - sti-drm
 - mediatek-drm-hdmi
 - msm

As far as I have reviewed the code relevant to this structure in those
modules, the members are used only in condition statements and printk
formats. My proposed change is slightly intrusive to the printk formats,
but this should be acceptable.

Overall, it is reasonable to use an unsigned type for the structure members.
This bug is detected by Sparse, static code analyzer with below warnings.

./include/sound/hdmi-codec.h:39:26: error: dubious one-bit signed bitfield
./include/sound/hdmi-codec.h:40:28: error: dubious one-bit signed bitfield
./include/sound/hdmi-codec.h:41:29: error: dubious one-bit signed bitfield
./include/sound/hdmi-codec.h:42:31: error: dubious one-bit signed bitfield

Fixes: 09184118a8ab ("ASoC: hdmi-codec: Add hdmi-codec for external 
HDMI-encoders")
Signed-off-by: Takashi Sakamoto 
Acked-by: Arnaud Pouliquen 
Signed-off-by: Mark Brown 
CC: stable at vger.kernel.org
---
 include/sound/hdmi-codec.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/sound/hdmi-codec.h b/include/sound/hdmi-codec.h
index 530c57bdefa0..915c4357945c 100644
--- a/include/sound/hdmi-codec.h
+++ b/include/sound/hdmi-codec.h
@@ -36,10 +36,10 @@ struct hdmi_codec_daifmt {
HDMI_AC97,
HDMI_SPDIF,
} fmt;
-   int bit_clk_inv:1;
-   int frame_clk_inv:1;
-   int bit_clk_master:1;
-   int frame_clk_master:1;
+   unsigned int bit_clk_inv:1;
+   unsigned int frame_clk_inv:1;
+   unsigned int bit_clk_master:1;
+   unsigned int frame_clk_master:1;
 };

 /*
-- 
2.11.0



[PATCH v4 22/22] drm: omapdrm: Perform initialization/cleanup at probe/remove time

2016-12-16 Thread Tomi Valkeinen
  {
>   file->driver_priv = NULL;
> @@ -806,8 +712,6 @@ static const struct file_operations omapdriver_fops = {
>  static struct drm_driver omap_drm_driver = {
>   .driver_features = DRIVER_MODESET | DRIVER_GEM  | DRIVER_PRIME |
>   DRIVER_ATOMIC,
> - .load = dev_load,
> - .unload = dev_unload,
>   .open = dev_open,
>   .lastclose = dev_lastclose,
>   .get_vblank_counter = drm_vblank_no_hw_counter,
> @@ -837,30 +741,128 @@ static struct drm_driver omap_drm_driver = {
>   .patchlevel = DRIVER_PATCHLEVEL,
>  };
>  
> -static int pdev_probe(struct platform_device *device)
> +static int pdev_probe(struct platform_device *pdev)
>  {
> - int r;
> + struct omap_drm_platform_data *pdata = pdev->dev.platform_data;
> + struct omap_drm_private *priv;
> + struct drm_device *ddev;
> + unsigned int i;
> + int ret;
> +
> + DBG("%s", pdev->name);
>  
>   if (omapdss_is_initialized() == false)
>   return -EPROBE_DEFER;
>  
>   omap_crtc_pre_init();
>  
> - r = omap_connect_dssdevs();
> - if (r) {
> + ret = omap_connect_dssdevs();
> + if (ret) {
>   omap_crtc_pre_uninit();
> - return r;
> + goto err_crtc_uninit;

This calls omap_crtc_pre_uninit here and in the err handler.

> + }
> +
> + /* Allocate and initialize the driver private structure. */
> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> + if (!priv) {
> + ret = -ENOMEM;
> + goto err_disconnect_dssdevs;
> + }
> +
> + priv->omaprev = pdata->omaprev;
> + priv->wq = alloc_ordered_workqueue("omapdrm", 0);
> +
> + init_waitqueue_head(&priv->commit.wait);
> + spin_lock_init(&priv->commit.lock);
> + spin_lock_init(&priv->list_lock);
> + INIT_LIST_HEAD(&priv->obj_list);
> +
> + /* Allocate and initialize the DRM device. */
> + ddev = drm_dev_alloc(&omap_drm_driver, &pdev->dev);
> + if (IS_ERR(ddev)) {
> + ret = PTR_ERR(ddev);
> + goto err_free_priv;
> + }
> +
> + ddev->dev_private = priv;
> + platform_set_drvdata(pdev, ddev);
> +
> + omap_gem_init(ddev);
> +
> + ret = omap_modeset_init(ddev);
> + if (ret) {
> + dev_err(&pdev->dev, "omap_modeset_init failed: ret=%d\n", ret);
> + goto err_free_drm_dev;
> + }
> +
> + /* Initialize vblank handling, start with all CRTCs disabled. */
> + ret = drm_vblank_init(ddev, priv->num_crtcs);
> + if (ret) {
> + dev_err(&pdev->dev, "could not init vblank\n");
> + goto err_cleanup_modeset;
>   }
>  
> - DBG("%s", device->name);
> - return drm_platform_init(&omap_drm_driver, device);
> + for (i = 0; i < priv->num_crtcs; i++)
> + drm_crtc_vblank_off(priv->crtcs[i]);
> +
> + priv->fbdev = omap_fbdev_init(ddev);
> +
> + drm_kms_helper_poll_init(ddev);
> +
> + /*
> +  * Register the DRM device with the core and the connectors with
> +  * sysfs.
> +  */
> + ret = drm_dev_register(ddev, 0);
> + if (ret)
> + goto err_cleanup_helpers;
> +
> + return 0;
> +
> +err_cleanup_helpers:
> + drm_kms_helper_poll_fini(ddev);
> + if (priv->fbdev)
> + omap_fbdev_free(ddev);
> + drm_vblank_cleanup(ddev);
> +err_cleanup_modeset:
> + drm_mode_config_cleanup(ddev);
> + omap_drm_irq_uninstall(ddev);
> +err_free_drm_dev:
> + omap_gem_deinit(ddev);
> + drm_dev_unref(ddev);
> +err_free_priv:
> + destroy_workqueue(priv->wq);
> + kfree(priv);
> +err_disconnect_dssdevs:
> + omap_disconnect_dssdevs();
> +err_crtc_uninit:
> + omap_crtc_pre_uninit();
> + return ret;
>  }
>  
> -static int pdev_remove(struct platform_device *device)
> +static int pdev_remove(struct platform_device *pdev)
>  {
> + struct drm_device *ddev = platform_get_drvdata(pdev);
> + struct omap_drm_private *priv = ddev->dev_private;
> +
>   DBG("");
>  
> - drm_put_dev(platform_get_drvdata(device));
> + drm_dev_unregister(ddev);
> +
> + drm_kms_helper_poll_fini(ddev);
> +
> + if (priv->fbdev)
> + omap_fbdev_free(ddev);
> +
> + drm_mode_config_cleanup(ddev);
> +
> + omap_drm_irq_uninstall(ddev);
> + omap_gem_deinit(ddev);
> +
> + drm_dev_unref(ddev);
> +
> + destroy_workqueue(priv->wq);
> + kfree(priv);
>  
>   omap_disconnect_dssdevs();
>   omap_crtc_pre_uninit();
> 

The old code calls drm_vblank_cleanup(), and the probe's error handling
calls that, but not remove. Is that correct?

 Tomi

-- next part --
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: 
<https://lists.freedesktop.org/archives/dri-devel/attachments/20161216/7cdff755/attachment.sig>


[PATCH v2 37/40] drm: Apply range restriction after color adjustment when allocation

2016-12-16 Thread Joonas Lahtinen
On pe, 2016-12-16 at 07:47 +, Chris Wilson wrote:
> mm->color_adjust() compares the hole with its neighbouring nodes. They
> only abut before we restrict the hole, so we have to apply color_adjust
> before we apply the range restriction.
> 
> Signed-off-by: Chris Wilson 

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation


[PATCH v2 38/40] drm: Use drm_mm_insert_node_in_range_generic() for everyone

2016-12-16 Thread Joonas Lahtinen
On pe, 2016-12-16 at 07:47 +, Chris Wilson wrote:
> Remove a superfluous helper as drm_mm_insert_node is equivalent to
> insert_node_in_range with a range of (0, U64_MAX).
> 
> Signed-off-by: Chris Wilson 

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

