Re: [PATCH] [Draft]: media: videobuf2-dma-heap: add a vendor defined memory runtine

2022-08-07 Thread Hsia-Jun Li




On 8/5/22 18:09, Tomasz Figa wrote:

CAUTION: Email originated externally, do not click links or open attachments 
unless you recognize the sender and know the content is safe.


On Tue, Aug 2, 2022 at 9:21 PM ayaka  wrote:


Sorry, the previous one contains html data.


On Aug 2, 2022, at 3:33 PM, Tomasz Figa  wrote:

On Mon, Aug 1, 2022 at 8:43 PM ayaka  wrote:

Sent from my iPad

On Aug 1, 2022, at 5:46 PM, Tomasz Figa  wrote:

CAUTION: Email originated externally, do not click links or open attachments 
unless you recognize the sender and know the content is safe.

On Mon, Aug 1, 2022 at 3:44 PM Hsia-Jun Li  wrote:

On 8/1/22 14:19, Tomasz Figa wrote:

Hello Tomasz

?Hi Randy,
On Mon, Aug 1, 2022 at 5:21 AM  wrote:

From: Randy Li 
This module is still at a early stage, I wrote this for showing what
APIs we need here.
Let me explain why we need such a module here.
If you won't allocate buffers from a V4L2 M2M device, this module
may not be very useful. I am sure the most of users won't know a
device would require them allocate buffers from a DMA-Heap then
import those buffers into a V4L2's queue.
Then the question goes back to why DMA-Heap. From the Android's
description, we know it is about the copyright's DRM.
When we allocate a buffer in a DMA-Heap, it may register that buffer
in the trusted execution environment so the firmware which is running
or could only be acccesed from there could use that buffer later.
The answer above leads to another thing which is not done in this
version, the DMA mapping. Although in some platforms, a DMA-Heap
responses a IOMMU device as well. For the genernal purpose, we would
be better assuming the device mapping should be done for each device
itself. The problem here we only know alloc_devs in those DMAbuf
methods, which are DMA-heaps in my design, the device from the queue
is not enough, a plane may requests another IOMMU device or table
for mapping.
Signed-off-by: Randy Li 
---
drivers/media/common/videobuf2/Kconfig|   6 +
drivers/media/common/videobuf2/Makefile   |   1 +
.../common/videobuf2/videobuf2-dma-heap.c | 350 ++
include/media/videobuf2-dma-heap.h|  30 ++
4 files changed, 387 insertions(+)
create mode 100644 drivers/media/common/videobuf2/videobuf2-dma-heap.c
create mode 100644 include/media/videobuf2-dma-heap.h

First of all, thanks for the series.
Possibly a stupid question, but why not just allocate the DMA-bufs
directly from the DMA-buf heap device in the userspace and just import
the buffers to the V4L2 device using V4L2_MEMORY_DMABUF?

Sometimes the allocation policy could be very complex, let's suppose a
multiple planes pixel format enabling with frame buffer compression.
Its luma, chroma data could be allocated from a pool which is delegated
for large buffers while its metadata would come from a pool which many
users could take some few slices from it(likes system pool).
Then when we have a new users knowing nothing about this platform, if we
just configure the alloc_devs in each queues well. The user won't need
to know those complex rules.
The real situation could be more complex, Samsung MFC's left and right
banks could be regarded as two pools, many devices would benefit from
this either from the allocation times or the security buffers policy.
In our design, when we need to do some security decoding(DRM video),
codecs2 would allocate buffers from the pool delegated for that. While
the non-DRM video, users could not care about this.

I'm a little bit surprised about this, because on Android all the
graphics buffers are allocated from the system IAllocator and imported
to the specific devices.

In the non-tunnel mode, yes it is. While the tunnel mode is completely vendor 
defined. Neither HWC nor codec2 cares about where the buffers coming from, you 
could do what ever you want.
Besides there are DRM video in GNU Linux platform, I heard the webkit has made 
huge effort here and Playready is one could work in non-Android Linux.

Would it make sense to instead extend the UAPI to expose enough
information about the allocation requirements to the userspace, so it
can allocate correctly?

Yes, it could. But as I said it would need the users to do more works.

My reasoning here is that it's not a driver's decision to allocate
from a DMA-buf heap (and which one) or not. It's the userspace which
knows that, based on the specific use case that it wants to fulfill.

Although I would like to let the users decide that, users just can’t do that 
which would violate the security rules in some platforms.
For example,  video codec and display device could only access a region of 
memory, any other device or trusted apps can’t access it. Users have to 
allocate the buffer from the pool the vendor decided.
So why not we offer a quick way that users don’t need to try and error.


In principle, I'm not against integrating DMA-buf heap with vb2,
however I see some problems I mentioned before:

1) How would the driver k

Re: New subsystem for acceleration devices

2022-08-07 Thread Oded Gabbay
On Sun, Aug 7, 2022 at 9:43 AM Oded Gabbay  wrote:
>
> On Fri, Aug 5, 2022 at 3:22 AM Jason Gunthorpe  wrote:
> >
> > On Thu, Aug 04, 2022 at 08:48:28PM +0300, Oded Gabbay wrote:
> >
> > > > The flip is true of DRM - DRM is pretty general. I bet I could
> > > > implement an RDMA device under DRM - but that doesn't mean it should
> > > > be done.
> > > >
> > > > My biggest concern is that this subsystem not turn into a back door
> > > > for a bunch of abuse of kernel APIs going forward. Though things
> > > > are
> > >
> > > How do you suggest to make sure it won't happen ?
> >
> > Well, if you launch the subsystem then it is your job to make sure it
> > doesn't happen - or endure the complaints :)
> Understood, I'll make sure there is no room for complaints.
> >
> > Accelerators have this nasty tendancy to become co-designed with their
> > SOCs in nasty intricate ways and then there is a desire to punch
> > through all the inconvenient layers to make it go faster.
> >
> > > > better now, we still see this in DRM where expediency or performance
> > > > justifies hacky shortcuts instead of good in-kernel architecture. At
> > > > least DRM has reliable and experienced review these days.
> > > Definitely. DRM has some parts that are really well written. For
> > > example, the whole char device handling with sysfs/debugfs and managed
> > > resources code.
> >
> > Arguably this should all be common code in the driver core/etc - what
> > value is a subsystem adding beyond that besides using it properly?
>
> I mainly see two things here:
>
> 1. If there is a subsystem which is responsible for creating and
> exposing the device character files, then there should be some code
> that connects between each device driver to that subsystem.
> i.e. There should be functions that each driver should call from its
> probe and release callback functions.
>
> Those functions should take care of the following:
> - Create metadata for the device, the device's minor(s) and the
> driver's ioctls table and driver's callback for file operations (both
> are common for all the driver's devices). Save all that metadata with
> proper locking.
> - Create the device char files themselves and supply file operations
> that will be called per each open/close/mmap/etc.
> - Keep track of all these objects' lifetime in regard to the device
> driver's lifetime, with proper handling for release.
> - Add common handling and entries of sysfs/debugfs for these devices
> with the ability for each device driver to add their own unique
> entries.
>
> 2. I think that you underestimate (due to your experience) the "using
> it properly" part... It is not so easy to do this properly for
> inexperienced kernel people. If we provide all the code I mentioned
> above, the device driver writer doesn't need to be aware of all these
> kernel APIs.
>
Two more points I thought of as examples for added value to be done in
common code:
1. Common code for handling dma-buf. Very similar to what was added a
year ago to rdma. This can be accompanied by a common ioctl to export
and import a dma-buf.
2. Common code to handle drivers that want to allow a single user at a
time to run open the device char file.

Oded


> >
> > It would be nice to at least identify something that could obviously
> > be common, like some kind of enumeration and metadata kind of stuff
> > (think like ethtool, devlink, rdma tool, nvemctl etc)
> Definitely. I think we can have at least one ioctl that will be common
> to all drivers from the start.
> A kind of information retrieval ioctl. There are many information
> points that I'm sure are common to most accelerators. We have an
> extensive information ioctl in the habanalabs driver and most of the
> information there is not habana specific imo.
> >
> > > I think that it is clear from my previous email what I intended to
> > > provide. A clean, simple framework for devices to register with, get
> > > services for the most basic stuff such as device char handling,
> > > sysfs/debugfs.
> >
> > This should all be trivially done by any driver using the core codes,
> > if you see gaps I'd ask why not improve the core code?
> >
> > > Later on, add more simple stuff such as memory manager
> > > and uapi for memory handling. I guess someone can say all that exists
> > > in drm, but like I said it exists in other subsystems as well.
> >
> > This makes sense to me, all accelerators need a way to set a memory
> > map, but on the other hand we are doing some very nifty stuff in this
> > area with iommufd and it might be very interesting to just have the
> > accelerator driver link to that API instead of building yet another
> > copy of pin_user_pages() code.. Especially with PASID likely becoming
> > part of any accelerator toolkit.
> Here I disagree with you. First of all, there are many relatively
> simple accelerators, especially in edge, where PASID is really not
> relevant.
> Second, even for the more sophisticated PCIe/CXL-based ones, PASID is
> not ma

Re: [PATCH v2 3/3] efi: earlycon: Add support for generic framebuffers and move to console subsystem

2022-08-07 Thread kernel test robot
Hi Markuss,

I love your patch! Yet something to improve:

[auto build test ERROR on tty/tty-testing]
[also build test ERROR on efi/next staging/staging-testing usb/usb-testing 
linus/master v5.19 next-20220805]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Markuss-Broks/Add-generic-framebuffer-support-to-EFI-earlycon-driver/20220807-003646
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git 
tty-testing
config: sh-allmodconfig 
(https://download.01.org/0day-ci/archive/20220807/202208072145.ywogkrw1-...@intel.com/config)
compiler: sh4-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/intel-lab-lkp/linux/commit/97dfc2aa69b065de769a191352afe2099c52fedb
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review 
Markuss-Broks/Add-generic-framebuffer-support-to-EFI-earlycon-driver/20220807-003646
git checkout 97dfc2aa69b065de769a191352afe2099c52fedb
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 
O=build_dir ARCH=sh SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/video/console/earlycon.c:7:10: fatal error: asm/early_ioremap.h: No 
>> such file or directory
   7 | #include 
 |  ^
   compilation terminated.


vim +7 drivers/video/console/earlycon.c

   > 7  #include 
 8  #include 
 9  #include 
10  #include 
11  #include 
12  #include 
13  #include 
14  #include 
15  #include 
16  #include 
17  #include 
18  

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp


[PATCH] i2c: qcom-geni: Fix GPI DMA buffer sync-back

2022-08-07 Thread Robin Reckmann
Fix i2c transfers using GPI DMA mode for all message types that do not set
the I2C_M_DMA_SAFE flag (e.g. SMBus "read byte").

In this case a bounce buffer is returned by i2c_get_dma_safe_msg_buf(),
and it has to synced back to the message after the transfer is done.

Add missing assignment of dma buffer in geni_i2c_gpi().

Set xferred in i2c_put_dma_safe_msg_buf() to true in case of no error to
ensure the sync-back of this dma buffer to the message.

Signed-off-by: Robin Reckmann 
---
 drivers/i2c/busses/i2c-qcom-geni.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-qcom-geni.c 
b/drivers/i2c/busses/i2c-qcom-geni.c
index 6ac402ea58fb..d3541e94794e 100644
--- a/drivers/i2c/busses/i2c-qcom-geni.c
+++ b/drivers/i2c/busses/i2c-qcom-geni.c
@@ -484,12 +484,12 @@ static void geni_i2c_gpi_unmap(struct geni_i2c_dev *gi2c, 
struct i2c_msg *msg,
 {
if (tx_buf) {
dma_unmap_single(gi2c->se.dev->parent, tx_addr, msg->len, 
DMA_TO_DEVICE);
-   i2c_put_dma_safe_msg_buf(tx_buf, msg, false);
+   i2c_put_dma_safe_msg_buf(tx_buf, msg, !gi2c->err);
}
 
if (rx_buf) {
dma_unmap_single(gi2c->se.dev->parent, rx_addr, msg->len, 
DMA_FROM_DEVICE);
-   i2c_put_dma_safe_msg_buf(rx_buf, msg, false);
+   i2c_put_dma_safe_msg_buf(rx_buf, msg, !gi2c->err);
}
 }
 
@@ -553,6 +553,7 @@ static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct 
i2c_msg *msg,
desc->callback_param = gi2c;
 
dmaengine_submit(desc);
+   *buf = dma_buf;
*dma_addr_p = addr;
 
return 0;
-- 
2.25.1



Re: [PATCH v2 1/3] drm/rockchip: define gamma registers for RK3399

2022-08-07 Thread Milan P . Stanić
Hi,

Tested it on gru-kevin with mainline kernel 5.19 and it works

On Tue, 2021-10-19 at 22:58, Hugh Cole-Baker wrote:
> The VOP on RK3399 has a different approach from previous versions for
> setting a gamma lookup table, using an update_gamma_lut register. As
> this differs from RK3288, give RK3399 its own set of "common" register
> definitions.
> 
> Signed-off-by: Hugh Cole-Baker 
Tested-by: "Milan P. Stanić" 
> ---
> 
> Changes from v1: no changes in this patch
> 
>  drivers/gpu/drm/rockchip/rockchip_drm_vop.h |  2 ++
>  drivers/gpu/drm/rockchip/rockchip_vop_reg.c | 24 +++--
>  drivers/gpu/drm/rockchip/rockchip_vop_reg.h |  1 +
>  3 files changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.h 
> b/drivers/gpu/drm/rockchip/rockchip_drm_vop.h
> index 857d97cdc67c..14179e89bd21 100644
> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.h
> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.h
> @@ -99,6 +99,8 @@ struct vop_common {
>   struct vop_reg dither_down_en;
>   struct vop_reg dither_up;
>   struct vop_reg dsp_lut_en;
> + struct vop_reg update_gamma_lut;
> + struct vop_reg lut_buffer_index;
>   struct vop_reg gate_en;
>   struct vop_reg mmu_en;
>   struct vop_reg out_mode;
> diff --git a/drivers/gpu/drm/rockchip/rockchip_vop_reg.c 
> b/drivers/gpu/drm/rockchip/rockchip_vop_reg.c
> index ca7cc82125cb..bfb7e130f09b 100644
> --- a/drivers/gpu/drm/rockchip/rockchip_vop_reg.c
> +++ b/drivers/gpu/drm/rockchip/rockchip_vop_reg.c
> @@ -865,6 +865,24 @@ static const struct vop_output rk3399_output = {
>   .mipi_dual_channel_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 3),
>  };
>  
> +static const struct vop_common rk3399_common = {
> + .standby = VOP_REG_SYNC(RK3399_SYS_CTRL, 0x1, 22),
> + .gate_en = VOP_REG(RK3399_SYS_CTRL, 0x1, 23),
> + .mmu_en = VOP_REG(RK3399_SYS_CTRL, 0x1, 20),
> + .dither_down_sel = VOP_REG(RK3399_DSP_CTRL1, 0x1, 4),
> + .dither_down_mode = VOP_REG(RK3399_DSP_CTRL1, 0x1, 3),
> + .dither_down_en = VOP_REG(RK3399_DSP_CTRL1, 0x1, 2),
> + .pre_dither_down = VOP_REG(RK3399_DSP_CTRL1, 0x1, 1),
> + .dither_up = VOP_REG(RK3399_DSP_CTRL1, 0x1, 6),
> + .dsp_lut_en = VOP_REG(RK3399_DSP_CTRL1, 0x1, 0),
> + .update_gamma_lut = VOP_REG(RK3399_DSP_CTRL1, 0x1, 7),
> + .lut_buffer_index = VOP_REG(RK3399_DBG_POST_REG1, 0x1, 1),
> + .data_blank = VOP_REG(RK3399_DSP_CTRL0, 0x1, 19),
> + .dsp_blank = VOP_REG(RK3399_DSP_CTRL0, 0x3, 18),
> + .out_mode = VOP_REG(RK3399_DSP_CTRL0, 0xf, 0),
> + .cfg_done = VOP_REG_SYNC(RK3399_REG_CFG_DONE, 0x1, 0),
> +};
> +
>  static const struct vop_yuv2yuv_phy rk3399_yuv2yuv_win01_data = {
>   .y2r_coefficients = {
>   VOP_REG(RK3399_WIN0_YUV2YUV_Y2R + 0, 0x, 0),
> @@ -944,7 +962,7 @@ static const struct vop_data rk3399_vop_big = {
>   .version = VOP_VERSION(3, 5),
>   .feature = VOP_FEATURE_OUTPUT_RGB10,
>   .intr = &rk3366_vop_intr,
> - .common = &rk3288_common,
> + .common = &rk3399_common,
>   .modeset = &rk3288_modeset,
>   .output = &rk3399_output,
>   .afbc = &rk3399_vop_afbc,
> @@ -952,6 +970,7 @@ static const struct vop_data rk3399_vop_big = {
>   .win = rk3399_vop_win_data,
>   .win_size = ARRAY_SIZE(rk3399_vop_win_data),
>   .win_yuv2yuv = rk3399_vop_big_win_yuv2yuv_data,
> + .lut_size = 1024,
>  };
>  
>  static const struct vop_win_data rk3399_vop_lit_win_data[] = {
> @@ -970,13 +989,14 @@ static const struct vop_win_yuv2yuv_data 
> rk3399_vop_lit_win_yuv2yuv_data[] = {
>  static const struct vop_data rk3399_vop_lit = {
>   .version = VOP_VERSION(3, 6),
>   .intr = &rk3366_vop_intr,
> - .common = &rk3288_common,
> + .common = &rk3399_common,
>   .modeset = &rk3288_modeset,
>   .output = &rk3399_output,
>   .misc = &rk3368_misc,
>   .win = rk3399_vop_lit_win_data,
>   .win_size = ARRAY_SIZE(rk3399_vop_lit_win_data),
>   .win_yuv2yuv = rk3399_vop_lit_win_yuv2yuv_data,
> + .lut_size = 256,
>  };
>  
>  static const struct vop_win_data rk3228_vop_win_data[] = {
> diff --git a/drivers/gpu/drm/rockchip/rockchip_vop_reg.h 
> b/drivers/gpu/drm/rockchip/rockchip_vop_reg.h
> index 0b3cd65ba5c1..406e981c75bd 100644
> --- a/drivers/gpu/drm/rockchip/rockchip_vop_reg.h
> +++ b/drivers/gpu/drm/rockchip/rockchip_vop_reg.h
> @@ -628,6 +628,7 @@
>  #define RK3399_YUV2YUV_WIN   0x02c0
>  #define RK3399_YUV2YUV_POST  0x02c4
>  #define RK3399_AUTO_GATING_EN0x02cc
> +#define RK3399_DBG_POST_REG1 0x036c
>  #define RK3399_WIN0_CSC_COE  0x03a0
>  #define RK3399_WIN1_CSC_COE  0x03c0
>  #define RK3399_WIN2_CSC_COE  0x03e0


Re: [PATCH] i2c: qcom-geni: Fix GPI DMA buffer sync-back

2022-08-07 Thread Caleb Connolly



On 07/08/2022 15:04, Robin Reckmann wrote:
> Fix i2c transfers using GPI DMA mode for all message types that do not set
> the I2C_M_DMA_SAFE flag (e.g. SMBus "read byte").
>
> In this case a bounce buffer is returned by i2c_get_dma_safe_msg_buf(),
> and it has to synced back to the message after the transfer is done.
>
> Add missing assignment of dma buffer in geni_i2c_gpi().
>
> Set xferred in i2c_put_dma_safe_msg_buf() to true in case of no error to
> ensure the sync-back of this dma buffer to the message.
Thanks for sending this, it fixes GPI DMA on the PocoPhone F1 and Pixel 3!
>
> Signed-off-by: Robin Reckmann Reviewed-by: Caleb 
> Connolly 
> ---
>   drivers/i2c/busses/i2c-qcom-geni.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c 
> b/drivers/i2c/busses/i2c-qcom-geni.c
> index 6ac402ea58fb..d3541e94794e 100644
> --- a/drivers/i2c/busses/i2c-qcom-geni.c
> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> @@ -484,12 +484,12 @@ static void geni_i2c_gpi_unmap(struct geni_i2c_dev 
> *gi2c, struct i2c_msg *msg,
>   {
>   if (tx_buf) {
>   dma_unmap_single(gi2c->se.dev->parent, tx_addr, msg->len, 
> DMA_TO_DEVICE);
> - i2c_put_dma_safe_msg_buf(tx_buf, msg, false);
> + i2c_put_dma_safe_msg_buf(tx_buf, msg, !gi2c->err);
>   }
>
>   if (rx_buf) {
>   dma_unmap_single(gi2c->se.dev->parent, rx_addr, msg->len, 
> DMA_FROM_DEVICE);
> - i2c_put_dma_safe_msg_buf(rx_buf, msg, false);
> + i2c_put_dma_safe_msg_buf(rx_buf, msg, !gi2c->err);
>   }
>   }
>
> @@ -553,6 +553,7 @@ static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct 
> i2c_msg *msg,
>   desc->callback_param = gi2c;
>
>   dmaengine_submit(desc);
> + *buf = dma_buf;
>   *dma_addr_p = addr;
>
>   return 0;
> --
> 2.25.1
>

--
Kind Regards,
Caleb



[PATCH v2 1/2] drm/msm: Move hangcheck timer restart

2022-08-07 Thread Rob Clark
From: Rob Clark 

Don't directly restart the hangcheck timer from the timer handler, but
instead start it after the recover_worker replays remaining jobs.

If the kthread is blocked for other reasons, there is no point to
immediately restart the timer.  Fixes a random symptom of the problem
fixed in the next patch.

v2: Keep the hangcheck timer restart in the timer handler in the case
where we aren't scheduling recover_worker

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gpu.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index fba85f894314..6762001d9945 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -328,6 +328,7 @@ find_submit(struct msm_ringbuffer *ring, uint32_t fence)
 }
 
 static void retire_submits(struct msm_gpu *gpu);
+static void hangcheck_timer_reset(struct msm_gpu *gpu);
 
 static void get_comm_cmdline(struct msm_gem_submit *submit, char **comm, char 
**cmd)
 {
@@ -420,6 +421,8 @@ static void recover_worker(struct kthread_work *work)
}
 
if (msm_gpu_active(gpu)) {
+   bool restart_hangcheck = false;
+
/* retire completed submits, plus the one that hung: */
retire_submits(gpu);
 
@@ -436,10 +439,15 @@ static void recover_worker(struct kthread_work *work)
unsigned long flags;
 
spin_lock_irqsave(&ring->submit_lock, flags);
-   list_for_each_entry(submit, &ring->submits, node)
+   list_for_each_entry(submit, &ring->submits, node) {
gpu->funcs->submit(gpu, submit);
+   restart_hangcheck = true;
+   }
spin_unlock_irqrestore(&ring->submit_lock, flags);
}
+
+   if (restart_hangcheck)
+   hangcheck_timer_reset(gpu);
}
 
mutex_unlock(&gpu->lock);
@@ -498,6 +506,7 @@ static void hangcheck_handler(struct timer_list *t)
struct drm_device *dev = gpu->dev;
struct msm_ringbuffer *ring = gpu->funcs->active_ring(gpu);
uint32_t fence = ring->memptrs->fence;
+   bool restart_hangcheck = true;
 
if (fence != ring->hangcheck_fence) {
/* some progress has been made.. ya! */
@@ -513,10 +522,16 @@ static void hangcheck_handler(struct timer_list *t)
gpu->name, ring->fctx->last_fence);
 
kthread_queue_work(gpu->worker, &gpu->recover_work);
+
+   /* If we do recovery, we want to defer restarting the hangcheck
+* timer until recovery completes and the remaining non-guilty
+* jobs are re-played.
+*/
+   restart_hangcheck = false;
}
 
/* if still more pending work, reset the hangcheck timer: */
-   if (fence_after(ring->fctx->last_fence, ring->hangcheck_fence))
+   if (restart_hangcheck && fence_after(ring->fctx->last_fence, 
ring->hangcheck_fence))
hangcheck_timer_reset(gpu);
 
/* workaround for missing irq: */
-- 
2.36.1



[PATCH v2 2/2] drm/msm/rd: Fix FIFO-full deadlock

2022-08-07 Thread Rob Clark
From: Rob Clark 

If the previous thing cat'ing $debugfs/rd left the FIFO full, then
subsequent open could deadlock in rd_write() (because open is blocked,
not giving a chance for read() to consume any data in the FIFO).  Also
it is generally a good idea to clear out old data from the FIFO.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_rd.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_rd.c b/drivers/gpu/drm/msm/msm_rd.c
index a92ffde53f0b..db2f847c8535 100644
--- a/drivers/gpu/drm/msm/msm_rd.c
+++ b/drivers/gpu/drm/msm/msm_rd.c
@@ -196,6 +196,9 @@ static int rd_open(struct inode *inode, struct file *file)
file->private_data = rd;
rd->open = true;
 
+   /* Reset fifo to clear any previously unread data: */
+   rd->fifo.head = rd->fifo.tail = 0;
+
/* the parsing tools need to know gpu-id to know which
 * register database to load.
 *
-- 
2.36.1



Re: [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

2022-08-07 Thread Christian König

Am 29.07.22 um 19:07 schrieb Rob Clark:

From: Rob Clark 

This is a fairly narrowly focused interface, providing a way for a VMM
in userspace to tell the guest kernel what pgprot settings to use when
mapping a buffer to guest userspace.

For buffers that get mapped into guest userspace, virglrenderer returns
a dma-buf fd to the VMM (crosvm or qemu).


Wow, wait a second. Who is giving whom the DMA-buf fd here?

My last status was that this design was illegal and couldn't be 
implemented because it requires internal knowledge only the exporting 
driver can have.


@Daniel has anything changed on that is or my status still valid?

Regards,
Christian.


   In addition to mapping the
pages into the guest VM, it needs to report to drm/virtio in the guest
the cache settings to use for guest userspace.  In particular, on some
architectures, creating aliased mappings with different cache attributes
is frowned upon, so it is important that the guest mappings have the
same cache attributes as any potential host mappings.

Signed-off-by: Rob Clark 
---
  drivers/dma-buf/dma-buf.c| 26 ++
  include/linux/dma-buf.h  |  7 +++
  include/uapi/linux/dma-buf.h | 28 
  3 files changed, 61 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 32f55640890c..d02d6c2a3b49 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -326,6 +326,29 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const 
char __user *buf)
return 0;
  }
  
+static long dma_buf_info(struct dma_buf *dmabuf, const void __user *uarg)

+{
+   struct dma_buf_info arg;
+
+   if (copy_from_user(&arg, uarg, sizeof(arg)))
+   return -EFAULT;
+
+   switch (arg.param) {
+   case DMA_BUF_INFO_VM_PROT:
+   if (!dmabuf->ops->mmap_info)
+   return -ENOSYS;
+   arg.value = dmabuf->ops->mmap_info(dmabuf);
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   if (copy_to_user(uarg, &arg, sizeof(arg)))
+   return -EFAULT;
+
+   return 0;
+}
+
  static long dma_buf_ioctl(struct file *file,
  unsigned int cmd, unsigned long arg)
  {
@@ -369,6 +392,9 @@ static long dma_buf_ioctl(struct file *file,
case DMA_BUF_SET_NAME_B:
return dma_buf_set_name(dmabuf, (const char __user *)arg);
  
+	case DMA_BUF_IOCTL_INFO:

+   return dma_buf_info(dmabuf, (const void __user *)arg);
+
default:
return -ENOTTY;
}
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 71731796c8c3..6f4de64a5937 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -283,6 +283,13 @@ struct dma_buf_ops {
 */
int (*mmap)(struct dma_buf *, struct vm_area_struct *vma);
  
+	/**

+* @mmap_info:
+*
+* Return mmapping info for the buffer.  See DMA_BUF_INFO_VM_PROT.
+*/
+   int (*mmap_info)(struct dma_buf *);
+
int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map);
void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map *map);
  };
diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
index b1523cb8ab30..a41adac0f46a 100644
--- a/include/uapi/linux/dma-buf.h
+++ b/include/uapi/linux/dma-buf.h
@@ -85,6 +85,32 @@ struct dma_buf_sync {
  
  #define DMA_BUF_NAME_LEN	32
  
+

+/**
+ * struct dma_buf_info - Query info about the buffer.
+ */
+struct dma_buf_info {
+
+#define DMA_BUF_INFO_VM_PROT  1
+#  define DMA_BUF_VM_PROT_WC  0
+#  define DMA_BUF_VM_PROT_CACHED  1
+
+   /**
+* @param: Which param to query
+*
+* DMA_BUF_INFO_BM_PROT:
+* Query the access permissions of userspace mmap's of this buffer.
+* Returns one of DMA_BUF_VM_PROT_x
+*/
+   __u32 param;
+   __u32 pad;
+
+   /**
+* @value: Return value of the query.
+*/
+   __u64 value;
+};
+
  #define DMA_BUF_BASE  'b'
  #define DMA_BUF_IOCTL_SYNC_IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
  
@@ -95,4 +121,6 @@ struct dma_buf_sync {

  #define DMA_BUF_SET_NAME_A_IOW(DMA_BUF_BASE, 1, __u32)
  #define DMA_BUF_SET_NAME_B_IOW(DMA_BUF_BASE, 1, __u64)
  
+#define DMA_BUF_IOCTL_INFO	_IOWR(DMA_BUF_BASE, 2, struct dma_buf_info)

+
  #endif




Re: [PATCH] dma-buf: Use dma_fence_unwrap_for_each when importing fences

2022-08-07 Thread Christian König




Am 02.08.22 um 23:01 schrieb Jason Ekstrand:

Ever since 68129f431faa ("dma-buf: warn about containers in dma_resv object"),
dma_resv_add_shared_fence will warn if you attempt to add a container fence.
While most drivers were fine, fences can also be added to a dma_resv via the
recently added DMA_BUF_IOCTL_IMPORT_SYNC_FILE.  Use dma_fence_unwrap_for_each
to add each fence one at a time.

Fixes: 594740497e99 ("dma-buf: Add an API for importing sync files (v10)")
Signed-off-by: Jason Ekstrand 
Reported-by: Sarah Walker 
Cc: Christian König 
---
  drivers/dma-buf/dma-buf.c | 23 +--
  1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 630133284e2b..8d5d45112f52 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -15,6 +15,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -391,8 +392,10 @@ static long dma_buf_import_sync_file(struct dma_buf 
*dmabuf,
 const void __user *user_data)
  {
struct dma_buf_import_sync_file arg;
-   struct dma_fence *fence;
+   struct dma_fence *fence, *f;
enum dma_resv_usage usage;
+   struct dma_fence_unwrap iter;
+   unsigned int num_fences;
int ret = 0;
  
  	if (copy_from_user(&arg, user_data, sizeof(arg)))

@@ -411,13 +414,21 @@ static long dma_buf_import_sync_file(struct dma_buf 
*dmabuf,
usage = (arg.flags & DMA_BUF_SYNC_WRITE) ? DMA_RESV_USAGE_WRITE :
   DMA_RESV_USAGE_READ;
  
-	dma_resv_lock(dmabuf->resv, NULL);

+   num_fences = 0;
+   dma_fence_unwrap_for_each(f, &iter, fence)
+   ++num_fences;
  
-	ret = dma_resv_reserve_fences(dmabuf->resv, 1);

-   if (!ret)
-   dma_resv_add_fence(dmabuf->resv, fence, usage);
+   if (num_fences > 0) {
+   dma_resv_lock(dmabuf->resv, NULL);
  
-	dma_resv_unlock(dmabuf->resv);

+   ret = dma_resv_reserve_fences(dmabuf->resv, num_fences);


That looks like it is misplaced.

You *must* only lock the reservation object once and then add all fences 
in one go.


Thinking now about it we probably had a bug around that before as well. 
Going to double check tomorrow.


Regards,
Christian.


+   if (!ret) {
+   dma_fence_unwrap_for_each(f, &iter, fence)
+   dma_resv_add_fence(dmabuf->resv, f, usage);
+   }
+
+   dma_resv_unlock(dmabuf->resv);
+   }
  
  	dma_fence_put(fence);
  




Re: [PATCH v2] drm/gem: Fix GEM handle release errors

2022-08-07 Thread Christian König

Am 03.08.22 um 10:32 schrieb Jeffy Chen:

Currently we are assuming a one to one mapping between dmabuf and handle
when releasing GEM handles.

But that is not always true, since we would create extra handles for the
GEM obj in cases like gem_open() and getfb{,2}().

A similar issue was reported at:
https://lore.kernel.org/all/20211105083308.392156-1-jay...@rock-chips.com/

Another problem is that the drm_gem_remove_prime_handles() now only
remove handle to the exported dmabuf (gem_obj->dma_buf), so the imported
ones would leak:
WARNING: CPU: 2 PID: 236 at drivers/gpu/drm/drm_prime.c:228 
drm_prime_destroy_file_private+0x18/0x24

Let's fix these by using handle to find the exact map to remove.


Well we are clearly something missing here. As far as I can see the 
current code is correct.


Creating multiple GEM handles for the same DMA-buf is possible, but illegal.

In other words when a GEM handle is exported as DMA-buf and imported 
again you should intentionally always get the same handle.


So this is pretty much a clear NAK to this patch since it shouldn't be 
necessary or something is seriously broken somewhere else.


Regards,
Christian.



Signed-off-by: Jeffy Chen 
---

Changes in v2:
Fix a typo of rbtree.

  drivers/gpu/drm/drm_gem.c  | 17 +
  drivers/gpu/drm/drm_internal.h |  4 ++--
  drivers/gpu/drm/drm_prime.c| 20 
  3 files changed, 15 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index eb0c2d041f13..ed39da383570 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -168,21 +168,6 @@ void drm_gem_private_object_init(struct drm_device *dev,
  }
  EXPORT_SYMBOL(drm_gem_private_object_init);
  
-static void

-drm_gem_remove_prime_handles(struct drm_gem_object *obj, struct drm_file *filp)
-{
-   /*
-* Note: obj->dma_buf can't disappear as long as we still hold a
-* handle reference in obj->handle_count.
-*/
-   mutex_lock(&filp->prime.lock);
-   if (obj->dma_buf) {
-   drm_prime_remove_buf_handle_locked(&filp->prime,
-  obj->dma_buf);
-   }
-   mutex_unlock(&filp->prime.lock);
-}
-
  /**
   * drm_gem_object_handle_free - release resources bound to userspace handles
   * @obj: GEM object to clean up.
@@ -253,7 +238,7 @@ drm_gem_object_release_handle(int id, void *ptr, void *data)
if (obj->funcs->close)
obj->funcs->close(obj, file_priv);
  
-	drm_gem_remove_prime_handles(obj, file_priv);

+   drm_prime_remove_buf_handle(&file_priv->prime, id);
drm_vma_node_revoke(&obj->vma_node, file_priv);
  
  	drm_gem_object_handle_put_unlocked(obj);

diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 1fbbc19f1ac0..7bb98e6a446d 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -74,8 +74,8 @@ int drm_prime_fd_to_handle_ioctl(struct drm_device *dev, void 
*data,
  
  void drm_prime_init_file_private(struct drm_prime_file_private *prime_fpriv);

  void drm_prime_destroy_file_private(struct drm_prime_file_private 
*prime_fpriv);
-void drm_prime_remove_buf_handle_locked(struct drm_prime_file_private 
*prime_fpriv,
-   struct dma_buf *dma_buf);
+void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
+uint32_t handle);
  
  /* drm_drv.c */

  struct drm_minor *drm_minor_acquire(unsigned int minor_id);
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index e3f09f18110c..bd5366b16381 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -190,29 +190,33 @@ static int drm_prime_lookup_buf_handle(struct 
drm_prime_file_private *prime_fpri
return -ENOENT;
  }
  
-void drm_prime_remove_buf_handle_locked(struct drm_prime_file_private *prime_fpriv,

-   struct dma_buf *dma_buf)
+void drm_prime_remove_buf_handle(struct drm_prime_file_private *prime_fpriv,
+uint32_t handle)
  {
struct rb_node *rb;
  
-	rb = prime_fpriv->dmabufs.rb_node;

+   mutex_lock(&prime_fpriv->lock);
+
+   rb = prime_fpriv->handles.rb_node;
while (rb) {
struct drm_prime_member *member;
  
-		member = rb_entry(rb, struct drm_prime_member, dmabuf_rb);

-   if (member->dma_buf == dma_buf) {
+   member = rb_entry(rb, struct drm_prime_member, handle_rb);
+   if (member->handle == handle) {
rb_erase(&member->handle_rb, &prime_fpriv->handles);
rb_erase(&member->dmabuf_rb, &prime_fpriv->dmabufs);
  
-			dma_buf_put(dma_buf);

+   dma_buf_put(member->dma_buf);
kfree(member);
-   return;
-   } else if (member->dma_buf < dma_buf) {
+ 

Re: [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

2022-08-07 Thread Rob Clark
On Sun, Aug 7, 2022 at 9:09 AM Christian König
 wrote:
>
> Am 29.07.22 um 19:07 schrieb Rob Clark:
> > From: Rob Clark 
> >
> > This is a fairly narrowly focused interface, providing a way for a VMM
> > in userspace to tell the guest kernel what pgprot settings to use when
> > mapping a buffer to guest userspace.
> >
> > For buffers that get mapped into guest userspace, virglrenderer returns
> > a dma-buf fd to the VMM (crosvm or qemu).
>
> Wow, wait a second. Who is giving whom the DMA-buf fd here?

Not sure I understand the question.. the dma-buf fd could come from
EGL_MESA_image_dma_buf_export, gbm, or similar.

> My last status was that this design was illegal and couldn't be
> implemented because it requires internal knowledge only the exporting
> driver can have.

This ioctl provides that information from the exporting driver so that
a VMM doesn't have to make assumptions ;-)

Currently crosvm assumes if (drivername == "i915") then it is a cached
mapping, otherwise it is wc.  I'm trying to find a way to fix this.
Suggestions welcome, but because of how mapping to a guest VM works, a
VMM is a somewhat special case where this information is needed in
userspace.

BR,
-R

> @Daniel has anything changed on that is or my status still valid?
>
> Regards,
> Christian.
>
> >In addition to mapping the
> > pages into the guest VM, it needs to report to drm/virtio in the guest
> > the cache settings to use for guest userspace.  In particular, on some
> > architectures, creating aliased mappings with different cache attributes
> > is frowned upon, so it is important that the guest mappings have the
> > same cache attributes as any potential host mappings.
> >
> > Signed-off-by: Rob Clark 
> > ---
> >   drivers/dma-buf/dma-buf.c| 26 ++
> >   include/linux/dma-buf.h  |  7 +++
> >   include/uapi/linux/dma-buf.h | 28 
> >   3 files changed, 61 insertions(+)
> >
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index 32f55640890c..d02d6c2a3b49 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -326,6 +326,29 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, 
> > const char __user *buf)
> >   return 0;
> >   }
> >
> > +static long dma_buf_info(struct dma_buf *dmabuf, const void __user *uarg)
> > +{
> > + struct dma_buf_info arg;
> > +
> > + if (copy_from_user(&arg, uarg, sizeof(arg)))
> > + return -EFAULT;
> > +
> > + switch (arg.param) {
> > + case DMA_BUF_INFO_VM_PROT:
> > + if (!dmabuf->ops->mmap_info)
> > + return -ENOSYS;
> > + arg.value = dmabuf->ops->mmap_info(dmabuf);
> > + break;
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > + if (copy_to_user(uarg, &arg, sizeof(arg)))
> > + return -EFAULT;
> > +
> > + return 0;
> > +}
> > +
> >   static long dma_buf_ioctl(struct file *file,
> > unsigned int cmd, unsigned long arg)
> >   {
> > @@ -369,6 +392,9 @@ static long dma_buf_ioctl(struct file *file,
> >   case DMA_BUF_SET_NAME_B:
> >   return dma_buf_set_name(dmabuf, (const char __user *)arg);
> >
> > + case DMA_BUF_IOCTL_INFO:
> > + return dma_buf_info(dmabuf, (const void __user *)arg);
> > +
> >   default:
> >   return -ENOTTY;
> >   }
> > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> > index 71731796c8c3..6f4de64a5937 100644
> > --- a/include/linux/dma-buf.h
> > +++ b/include/linux/dma-buf.h
> > @@ -283,6 +283,13 @@ struct dma_buf_ops {
> >*/
> >   int (*mmap)(struct dma_buf *, struct vm_area_struct *vma);
> >
> > + /**
> > +  * @mmap_info:
> > +  *
> > +  * Return mmapping info for the buffer.  See DMA_BUF_INFO_VM_PROT.
> > +  */
> > + int (*mmap_info)(struct dma_buf *);
> > +
> >   int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map);
> >   void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map *map);
> >   };
> > diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
> > index b1523cb8ab30..a41adac0f46a 100644
> > --- a/include/uapi/linux/dma-buf.h
> > +++ b/include/uapi/linux/dma-buf.h
> > @@ -85,6 +85,32 @@ struct dma_buf_sync {
> >
> >   #define DMA_BUF_NAME_LEN32
> >
> > +
> > +/**
> > + * struct dma_buf_info - Query info about the buffer.
> > + */
> > +struct dma_buf_info {
> > +
> > +#define DMA_BUF_INFO_VM_PROT  1
> > +#  define DMA_BUF_VM_PROT_WC  0
> > +#  define DMA_BUF_VM_PROT_CACHED  1
> > +
> > + /**
> > +  * @param: Which param to query
> > +  *
> > +  * DMA_BUF_INFO_BM_PROT:
> > +  * Query the access permissions of userspace mmap's of this 
> > buffer.
> > +  * Returns one of DMA_BUF_VM_PROT_x
> > +  */
> > + __u32 param;
> > + __u32 pad;
> > +
> > + /**
> > +  * @value: Return value of the query.
> > +  

Re: [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

2022-08-07 Thread Christian König

Am 07.08.22 um 19:02 schrieb Rob Clark:

On Sun, Aug 7, 2022 at 9:09 AM Christian König
 wrote:

Am 29.07.22 um 19:07 schrieb Rob Clark:

From: Rob Clark 

This is a fairly narrowly focused interface, providing a way for a VMM
in userspace to tell the guest kernel what pgprot settings to use when
mapping a buffer to guest userspace.

For buffers that get mapped into guest userspace, virglrenderer returns
a dma-buf fd to the VMM (crosvm or qemu).

Wow, wait a second. Who is giving whom the DMA-buf fd here?

Not sure I understand the question.. the dma-buf fd could come from
EGL_MESA_image_dma_buf_export, gbm, or similar.


My last status was that this design was illegal and couldn't be
implemented because it requires internal knowledge only the exporting
driver can have.

This ioctl provides that information from the exporting driver so that
a VMM doesn't have to make assumptions ;-)


And exactly that was NAKed the last time it came up. Only the exporting 
driver is allowed to mmap() the DMA-buf into the guest.


This way you also don't need to transport any caching information anywhere.


Currently crosvm assumes if (drivername == "i915") then it is a cached
mapping, otherwise it is wc.  I'm trying to find a way to fix this.
Suggestions welcome, but because of how mapping to a guest VM works, a
VMM is a somewhat special case where this information is needed in
userspace.


Ok that leaves me completely puzzled. How does that work in the first place?

In other words how does the mapping into the guest page tables happen?

Regards,
Christian.



BR,
-R


@Daniel has anything changed on that is or my status still valid?

Regards,
Christian.


In addition to mapping the
pages into the guest VM, it needs to report to drm/virtio in the guest
the cache settings to use for guest userspace.  In particular, on some
architectures, creating aliased mappings with different cache attributes
is frowned upon, so it is important that the guest mappings have the
same cache attributes as any potential host mappings.

Signed-off-by: Rob Clark 
---
   drivers/dma-buf/dma-buf.c| 26 ++
   include/linux/dma-buf.h  |  7 +++
   include/uapi/linux/dma-buf.h | 28 
   3 files changed, 61 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 32f55640890c..d02d6c2a3b49 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -326,6 +326,29 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const 
char __user *buf)
   return 0;
   }

+static long dma_buf_info(struct dma_buf *dmabuf, const void __user *uarg)
+{
+ struct dma_buf_info arg;
+
+ if (copy_from_user(&arg, uarg, sizeof(arg)))
+ return -EFAULT;
+
+ switch (arg.param) {
+ case DMA_BUF_INFO_VM_PROT:
+ if (!dmabuf->ops->mmap_info)
+ return -ENOSYS;
+ arg.value = dmabuf->ops->mmap_info(dmabuf);
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (copy_to_user(uarg, &arg, sizeof(arg)))
+ return -EFAULT;
+
+ return 0;
+}
+
   static long dma_buf_ioctl(struct file *file,
 unsigned int cmd, unsigned long arg)
   {
@@ -369,6 +392,9 @@ static long dma_buf_ioctl(struct file *file,
   case DMA_BUF_SET_NAME_B:
   return dma_buf_set_name(dmabuf, (const char __user *)arg);

+ case DMA_BUF_IOCTL_INFO:
+ return dma_buf_info(dmabuf, (const void __user *)arg);
+
   default:
   return -ENOTTY;
   }
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 71731796c8c3..6f4de64a5937 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -283,6 +283,13 @@ struct dma_buf_ops {
*/
   int (*mmap)(struct dma_buf *, struct vm_area_struct *vma);

+ /**
+  * @mmap_info:
+  *
+  * Return mmapping info for the buffer.  See DMA_BUF_INFO_VM_PROT.
+  */
+ int (*mmap_info)(struct dma_buf *);
+
   int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map);
   void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map *map);
   };
diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
index b1523cb8ab30..a41adac0f46a 100644
--- a/include/uapi/linux/dma-buf.h
+++ b/include/uapi/linux/dma-buf.h
@@ -85,6 +85,32 @@ struct dma_buf_sync {

   #define DMA_BUF_NAME_LEN32

+
+/**
+ * struct dma_buf_info - Query info about the buffer.
+ */
+struct dma_buf_info {
+
+#define DMA_BUF_INFO_VM_PROT  1
+#  define DMA_BUF_VM_PROT_WC  0
+#  define DMA_BUF_VM_PROT_CACHED  1
+
+ /**
+  * @param: Which param to query
+  *
+  * DMA_BUF_INFO_BM_PROT:
+  * Query the access permissions of userspace mmap's of this buffer.
+  * Returns one of DMA_BUF_VM_PROT_x
+  */
+ __u32 param;
+ __u32 pad;
+
+ /**
+  * @value: Return value of the query.
+  */
+ __u64 value;
+};
+
 

[PATCH] drm/msm: Add fault-injection support

2022-08-07 Thread Rob Clark
From: Rob Clark 

Intended as a way to trigger error paths in mesa.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_debugfs.c |  8 
 drivers/gpu/drm/msm/msm_drv.c | 15 +++
 drivers/gpu/drm/msm/msm_drv.h |  7 +++
 3 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_debugfs.c 
b/drivers/gpu/drm/msm/msm_debugfs.c
index ea2a20699cb4..a515ddcec007 100644
--- a/drivers/gpu/drm/msm/msm_debugfs.c
+++ b/drivers/gpu/drm/msm/msm_debugfs.c
@@ -7,6 +7,7 @@
 #ifdef CONFIG_DEBUG_FS
 
 #include 
+#include 
 
 #include 
 #include 
@@ -325,6 +326,13 @@ void msm_debugfs_init(struct drm_minor *minor)
 
if (priv->kms && priv->kms->funcs->debugfs_init)
priv->kms->funcs->debugfs_init(priv->kms, minor);
+
+#ifdef CONFIG_FAULT_INJECTION
+   fault_create_debugfs_attr("fail_gem_alloc", minor->debugfs_root,
+ &fail_gem_alloc);
+   fault_create_debugfs_attr("fail_gem_iova", minor->debugfs_root,
+ &fail_gem_iova);
+#endif
 }
 #endif
 
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 4979aa8187ec..6b1b483ddd59 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -6,6 +6,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -78,6 +79,11 @@ static bool modeset = true;
 MODULE_PARM_DESC(modeset, "Use kernel modesetting [KMS] (1=on (default), 
0=disable)");
 module_param(modeset, bool, 0600);
 
+#ifdef CONFIG_FAULT_INJECTION
+DECLARE_FAULT_ATTR(fail_gem_alloc);
+DECLARE_FAULT_ATTR(fail_gem_iova);
+#endif
+
 static irqreturn_t msm_irq(int irq, void *arg)
 {
struct drm_device *dev = arg;
@@ -701,6 +707,9 @@ static int msm_ioctl_gem_new(struct drm_device *dev, void 
*data,
flags |= MSM_BO_WC;
}
 
+   if (should_fail(&fail_gem_alloc, args->size))
+   return -ENOMEM;
+
return msm_gem_new_handle(dev, file, args->size,
args->flags, &args->handle, NULL);
 }
@@ -762,6 +771,9 @@ static int msm_ioctl_gem_info_iova(struct drm_device *dev,
if (!priv->gpu)
return -EINVAL;
 
+   if (should_fail(&fail_gem_iova, obj->size))
+   return -ENOMEM;
+
/*
 * Don't pin the memory here - just get an address so that userspace can
 * be productive
@@ -783,6 +795,9 @@ static int msm_ioctl_gem_info_set_iova(struct drm_device 
*dev,
if (priv->gpu->aspace == ctx->aspace)
return -EOPNOTSUPP;
 
+   if (should_fail(&fail_gem_iova, obj->size))
+   return -ENOMEM;
+
return msm_gem_set_iova(obj, ctx->aspace, iova);
 }
 
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index b4ace34ec889..e830f9609f2d 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -34,6 +34,13 @@
 #include 
 #include 
 
+#ifdef CONFIG_FAULT_INJECTION
+extern struct fault_attr fail_gem_alloc;
+extern struct fault_attr fail_gem_iova;
+#else
+#  define should_fail(attr, size) 0
+#endif
+
 struct msm_kms;
 struct msm_gpu;
 struct msm_mmu;
-- 
2.36.1



Re: [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

2022-08-07 Thread Rob Clark
On Sun, Aug 7, 2022 at 10:14 AM Christian König
 wrote:
>
> Am 07.08.22 um 19:02 schrieb Rob Clark:
> > On Sun, Aug 7, 2022 at 9:09 AM Christian König
> >  wrote:
> >> Am 29.07.22 um 19:07 schrieb Rob Clark:
> >>> From: Rob Clark 
> >>>
> >>> This is a fairly narrowly focused interface, providing a way for a VMM
> >>> in userspace to tell the guest kernel what pgprot settings to use when
> >>> mapping a buffer to guest userspace.
> >>>
> >>> For buffers that get mapped into guest userspace, virglrenderer returns
> >>> a dma-buf fd to the VMM (crosvm or qemu).
> >> Wow, wait a second. Who is giving whom the DMA-buf fd here?
> > Not sure I understand the question.. the dma-buf fd could come from
> > EGL_MESA_image_dma_buf_export, gbm, or similar.
> >
> >> My last status was that this design was illegal and couldn't be
> >> implemented because it requires internal knowledge only the exporting
> >> driver can have.
> > This ioctl provides that information from the exporting driver so that
> > a VMM doesn't have to make assumptions ;-)
>
> And exactly that was NAKed the last time it came up. Only the exporting
> driver is allowed to mmap() the DMA-buf into the guest.

except the exporting driver is in the host ;-)

> This way you also don't need to transport any caching information anywhere.
>
> > Currently crosvm assumes if (drivername == "i915") then it is a cached
> > mapping, otherwise it is wc.  I'm trying to find a way to fix this.
> > Suggestions welcome, but because of how mapping to a guest VM works, a
> > VMM is a somewhat special case where this information is needed in
> > userspace.
>
> Ok that leaves me completely puzzled. How does that work in the first place?
>
> In other words how does the mapping into the guest page tables happen?

There are multiple levels to this, but in short mapping to guest
userspace happens via drm/virtio (aka "virtio_gpu" or "virtgpu"), the
cache attributes are set via "map_info" attribute returned from the
host VMM (host userspace, hence the need for this ioctl).

In the host, the host kernel driver mmaps to host userspace (VMM).
Here the exporting driver is performing the mmap with correct cache
attributes.  The VMM uses KVM to map these pages into the guest so
they appear as physical pages to the guest kernel.  The guest kernel
(virtgpu) in turn maps them to guest userspace.

BR,
-R

>
> Regards,
> Christian.
>
> >
> > BR,
> > -R
> >
> >> @Daniel has anything changed on that is or my status still valid?
> >>
> >> Regards,
> >> Christian.
> >>
> >>> In addition to mapping the
> >>> pages into the guest VM, it needs to report to drm/virtio in the guest
> >>> the cache settings to use for guest userspace.  In particular, on some
> >>> architectures, creating aliased mappings with different cache attributes
> >>> is frowned upon, so it is important that the guest mappings have the
> >>> same cache attributes as any potential host mappings.
> >>>
> >>> Signed-off-by: Rob Clark 
> >>> ---
> >>>drivers/dma-buf/dma-buf.c| 26 ++
> >>>include/linux/dma-buf.h  |  7 +++
> >>>include/uapi/linux/dma-buf.h | 28 
> >>>3 files changed, 61 insertions(+)
> >>>
> >>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> >>> index 32f55640890c..d02d6c2a3b49 100644
> >>> --- a/drivers/dma-buf/dma-buf.c
> >>> +++ b/drivers/dma-buf/dma-buf.c
> >>> @@ -326,6 +326,29 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, 
> >>> const char __user *buf)
> >>>return 0;
> >>>}
> >>>
> >>> +static long dma_buf_info(struct dma_buf *dmabuf, const void __user *uarg)
> >>> +{
> >>> + struct dma_buf_info arg;
> >>> +
> >>> + if (copy_from_user(&arg, uarg, sizeof(arg)))
> >>> + return -EFAULT;
> >>> +
> >>> + switch (arg.param) {
> >>> + case DMA_BUF_INFO_VM_PROT:
> >>> + if (!dmabuf->ops->mmap_info)
> >>> + return -ENOSYS;
> >>> + arg.value = dmabuf->ops->mmap_info(dmabuf);
> >>> + break;
> >>> + default:
> >>> + return -EINVAL;
> >>> + }
> >>> +
> >>> + if (copy_to_user(uarg, &arg, sizeof(arg)))
> >>> + return -EFAULT;
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>>static long dma_buf_ioctl(struct file *file,
> >>>  unsigned int cmd, unsigned long arg)
> >>>{
> >>> @@ -369,6 +392,9 @@ static long dma_buf_ioctl(struct file *file,
> >>>case DMA_BUF_SET_NAME_B:
> >>>return dma_buf_set_name(dmabuf, (const char __user *)arg);
> >>>
> >>> + case DMA_BUF_IOCTL_INFO:
> >>> + return dma_buf_info(dmabuf, (const void __user *)arg);
> >>> +
> >>>default:
> >>>return -ENOTTY;
> >>>}
> >>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> >>> index 71731796c8c3..6f4de64a5937 100644
> >>> --- a/include/linux/dma-buf.h
> >>> +++ b/include/linux/dma-buf.h
> >>> @@ -283

RE: mainline build failure for x86_64 allmodconfig with clang

2022-08-07 Thread David Laight
From: Arnd Bergmann
> Sent: 05 August 2022 20:32
...
> One thing to try would be to condense a function call like
> 
> dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
> 
...
>  /* more arguments */);
> 
> into calling conventions that take a pointer to 'mode_lib->vba' and another
> one to 'v', so these are no longer passed on the stack individually.

Or, if it is only called once (I can't find the source)
force it to be inlined.

Or just shoot the software engineer who thinks 100 arguments
is sane. :-)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

2022-08-07 Thread Christian König

Am 07.08.22 um 19:35 schrieb Rob Clark:

On Sun, Aug 7, 2022 at 10:14 AM Christian König
 wrote:

Am 07.08.22 um 19:02 schrieb Rob Clark:

On Sun, Aug 7, 2022 at 9:09 AM Christian König
 wrote:

Am 29.07.22 um 19:07 schrieb Rob Clark:

From: Rob Clark 

This is a fairly narrowly focused interface, providing a way for a VMM
in userspace to tell the guest kernel what pgprot settings to use when
mapping a buffer to guest userspace.

For buffers that get mapped into guest userspace, virglrenderer returns
a dma-buf fd to the VMM (crosvm or qemu).

Wow, wait a second. Who is giving whom the DMA-buf fd here?

Not sure I understand the question.. the dma-buf fd could come from
EGL_MESA_image_dma_buf_export, gbm, or similar.


My last status was that this design was illegal and couldn't be
implemented because it requires internal knowledge only the exporting
driver can have.

This ioctl provides that information from the exporting driver so that
a VMM doesn't have to make assumptions ;-)

And exactly that was NAKed the last time it came up. Only the exporting
driver is allowed to mmap() the DMA-buf into the guest.

except the exporting driver is in the host ;-)


This way you also don't need to transport any caching information anywhere.


Currently crosvm assumes if (drivername == "i915") then it is a cached
mapping, otherwise it is wc.  I'm trying to find a way to fix this.
Suggestions welcome, but because of how mapping to a guest VM works, a
VMM is a somewhat special case where this information is needed in
userspace.

Ok that leaves me completely puzzled. How does that work in the first place?

In other words how does the mapping into the guest page tables happen?

There are multiple levels to this, but in short mapping to guest
userspace happens via drm/virtio (aka "virtio_gpu" or "virtgpu"), the
cache attributes are set via "map_info" attribute returned from the
host VMM (host userspace, hence the need for this ioctl).

In the host, the host kernel driver mmaps to host userspace (VMM).
Here the exporting driver is performing the mmap with correct cache
attributes.



The VMM uses KVM to map these pages into the guest so


And exactly that was declared completely illegal the last time it came 
up on the mailing list.


Daniel implemented a whole bunch of patches into the DMA-buf layer to 
make it impossible for KVM to do this.


I have absolutely no idea why that is now a topic again and why anybody 
is still using this approach.


@Daniel can you clarify.

Thanks,
Christian.


they appear as physical pages to the guest kernel.  The guest kernel
(virtgpu) in turn maps them to guest userspace.

BR,
-R


Regards,
Christian.


BR,
-R


@Daniel has anything changed on that is or my status still valid?

Regards,
Christian.


 In addition to mapping the
pages into the guest VM, it needs to report to drm/virtio in the guest
the cache settings to use for guest userspace.  In particular, on some
architectures, creating aliased mappings with different cache attributes
is frowned upon, so it is important that the guest mappings have the
same cache attributes as any potential host mappings.

Signed-off-by: Rob Clark 
---
drivers/dma-buf/dma-buf.c| 26 ++
include/linux/dma-buf.h  |  7 +++
include/uapi/linux/dma-buf.h | 28 
3 files changed, 61 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 32f55640890c..d02d6c2a3b49 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -326,6 +326,29 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const 
char __user *buf)
return 0;
}

+static long dma_buf_info(struct dma_buf *dmabuf, const void __user *uarg)
+{
+ struct dma_buf_info arg;
+
+ if (copy_from_user(&arg, uarg, sizeof(arg)))
+ return -EFAULT;
+
+ switch (arg.param) {
+ case DMA_BUF_INFO_VM_PROT:
+ if (!dmabuf->ops->mmap_info)
+ return -ENOSYS;
+ arg.value = dmabuf->ops->mmap_info(dmabuf);
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (copy_to_user(uarg, &arg, sizeof(arg)))
+ return -EFAULT;
+
+ return 0;
+}
+
static long dma_buf_ioctl(struct file *file,
  unsigned int cmd, unsigned long arg)
{
@@ -369,6 +392,9 @@ static long dma_buf_ioctl(struct file *file,
case DMA_BUF_SET_NAME_B:
return dma_buf_set_name(dmabuf, (const char __user *)arg);

+ case DMA_BUF_IOCTL_INFO:
+ return dma_buf_info(dmabuf, (const void __user *)arg);
+
default:
return -ENOTTY;
}
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 71731796c8c3..6f4de64a5937 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -283,6 +283,13 @@ struct dma_buf_ops {
 */
int (*mmap)(struct dma_buf *, struct vm_area_struct *vma);

+ /**

Re: mainline build failure for x86_64 allmodconfig with clang

2022-08-07 Thread Linus Torvalds
On Sun, Aug 7, 2022 at 10:36 AM David Laight  wrote:
>
> Or just shoot the software engineer who thinks 100 arguments
> is sane. :-)

I suspect the issue is that it's not primarily a software engineer who
wrote that code.

Hardware people writing code are about as scary as software engineers
with a soldering iron.

   Linus


Re: [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

2022-08-07 Thread Rob Clark
On Sun, Aug 7, 2022 at 10:38 AM Christian König
 wrote:
>
> Am 07.08.22 um 19:35 schrieb Rob Clark:
> > On Sun, Aug 7, 2022 at 10:14 AM Christian König
> >  wrote:
> >> Am 07.08.22 um 19:02 schrieb Rob Clark:
> >>> On Sun, Aug 7, 2022 at 9:09 AM Christian König
> >>>  wrote:
>  Am 29.07.22 um 19:07 schrieb Rob Clark:
> > From: Rob Clark 
> >
> > This is a fairly narrowly focused interface, providing a way for a VMM
> > in userspace to tell the guest kernel what pgprot settings to use when
> > mapping a buffer to guest userspace.
> >
> > For buffers that get mapped into guest userspace, virglrenderer returns
> > a dma-buf fd to the VMM (crosvm or qemu).
>  Wow, wait a second. Who is giving whom the DMA-buf fd here?
> >>> Not sure I understand the question.. the dma-buf fd could come from
> >>> EGL_MESA_image_dma_buf_export, gbm, or similar.
> >>>
>  My last status was that this design was illegal and couldn't be
>  implemented because it requires internal knowledge only the exporting
>  driver can have.
> >>> This ioctl provides that information from the exporting driver so that
> >>> a VMM doesn't have to make assumptions ;-)
> >> And exactly that was NAKed the last time it came up. Only the exporting
> >> driver is allowed to mmap() the DMA-buf into the guest.
> > except the exporting driver is in the host ;-)
> >
> >> This way you also don't need to transport any caching information anywhere.
> >>
> >>> Currently crosvm assumes if (drivername == "i915") then it is a cached
> >>> mapping, otherwise it is wc.  I'm trying to find a way to fix this.
> >>> Suggestions welcome, but because of how mapping to a guest VM works, a
> >>> VMM is a somewhat special case where this information is needed in
> >>> userspace.
> >> Ok that leaves me completely puzzled. How does that work in the first 
> >> place?
> >>
> >> In other words how does the mapping into the guest page tables happen?
> > There are multiple levels to this, but in short mapping to guest
> > userspace happens via drm/virtio (aka "virtio_gpu" or "virtgpu"), the
> > cache attributes are set via "map_info" attribute returned from the
> > host VMM (host userspace, hence the need for this ioctl).
> >
> > In the host, the host kernel driver mmaps to host userspace (VMM).
> > Here the exporting driver is performing the mmap with correct cache
> > attributes.
>
> > The VMM uses KVM to map these pages into the guest so
>
> And exactly that was declared completely illegal the last time it came
> up on the mailing list.
>
> Daniel implemented a whole bunch of patches into the DMA-buf layer to
> make it impossible for KVM to do this.

This issue isn't really with KVM, it is not making any CPU mappings
itself.  KVM is just making the pages available to the guest.  Like I
said the CPU mapping to the guest userspace is setup by virtgpu.  But
based on information that the host VMM provides.  This patch simply
provides a way for the host VMM to provide the correct information.

> I have absolutely no idea why that is now a topic again and why anybody
> is still using this approach.

Because this is how VMMs work.  And it is how the virtgpu device
spec[1] is designed.

[1] https://github.com/oasis-tcs/virtio-spec/blob/master/virtio-gpu.tex#L767

> @Daniel can you clarify.

Like I've said, I'd be happy to hear alternative suggestions.  But the
root problem is that it is not possible for the host kernel to
directly map into guest userspace.  So I don't really see an
alternative.  Even if it were possible for host kernel to directly map
to guest userspace, that ship has already sailed as far as virtio
device specification.

BR,
-R

> Thanks,
> Christian.
>
> > they appear as physical pages to the guest kernel.  The guest kernel
> > (virtgpu) in turn maps them to guest userspace.
> >
> > BR,
> > -R
> >
> >> Regards,
> >> Christian.
> >>
> >>> BR,
> >>> -R
> >>>
>  @Daniel has anything changed on that is or my status still valid?
> 
>  Regards,
>  Christian.
> 
> >  In addition to mapping the
> > pages into the guest VM, it needs to report to drm/virtio in the guest
> > the cache settings to use for guest userspace.  In particular, on some
> > architectures, creating aliased mappings with different cache attributes
> > is frowned upon, so it is important that the guest mappings have the
> > same cache attributes as any potential host mappings.
> >
> > Signed-off-by: Rob Clark 
> > ---
> > drivers/dma-buf/dma-buf.c| 26 ++
> > include/linux/dma-buf.h  |  7 +++
> > include/uapi/linux/dma-buf.h | 28 
> > 3 files changed, 61 insertions(+)
> >
> > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> > index 32f55640890c..d02d6c2a3b49 100644
> > --- a/drivers/dma-buf/dma-buf.c
> > +++ b/drivers/dma-buf/dma-buf.c
> > @@ -326,6 +326,29 @

Re: [GIT PULL] fbdev updates & fixes for v5.20-rc1

2022-08-07 Thread pr-tracker-bot
The pull request you sent on Sat, 6 Aug 2022 22:06:49 +0200:

> http://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev.git 
> tags/for-5.20/fbdev-1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

2022-08-07 Thread Christian König

Am 07.08.22 um 19:56 schrieb Rob Clark:

On Sun, Aug 7, 2022 at 10:38 AM Christian König
 wrote:

[SNIP]
And exactly that was declared completely illegal the last time it came
up on the mailing list.

Daniel implemented a whole bunch of patches into the DMA-buf layer to
make it impossible for KVM to do this.

This issue isn't really with KVM, it is not making any CPU mappings
itself.  KVM is just making the pages available to the guest.


Well I can only repeat myself: This is strictly illegal.

Please try this approach with CONFIG_DMABUF_DEBUG set. I'm pretty sure 
you will immediately run into a crash.


See this here as well 
https://elixir.bootlin.com/linux/v5.19/source/drivers/dma-buf/dma-buf.c#L653


Daniel intentionally added code to mangle the page pointers to make it 
impossible for KVM to do this.


If the virtio/virtgpu UAPI was build around the idea that this is 
possible then it is most likely fundamental broken.


Regards,
Christian.


Re: [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

2022-08-07 Thread Rob Clark
On Sun, Aug 7, 2022 at 11:05 AM Christian König
 wrote:
>
> Am 07.08.22 um 19:56 schrieb Rob Clark:
> > On Sun, Aug 7, 2022 at 10:38 AM Christian König
> >  wrote:
> >> [SNIP]
> >> And exactly that was declared completely illegal the last time it came
> >> up on the mailing list.
> >>
> >> Daniel implemented a whole bunch of patches into the DMA-buf layer to
> >> make it impossible for KVM to do this.
> > This issue isn't really with KVM, it is not making any CPU mappings
> > itself.  KVM is just making the pages available to the guest.
>
> Well I can only repeat myself: This is strictly illegal.
>
> Please try this approach with CONFIG_DMABUF_DEBUG set. I'm pretty sure
> you will immediately run into a crash.
>
> See this here as well
> https://elixir.bootlin.com/linux/v5.19/source/drivers/dma-buf/dma-buf.c#L653
>
> Daniel intentionally added code to mangle the page pointers to make it
> impossible for KVM to do this.

I don't believe KVM is using the sg table, so this isn't going to stop
anything ;-)

> If the virtio/virtgpu UAPI was build around the idea that this is
> possible then it is most likely fundamental broken.

How else can you envision mmap'ing to guest userspace working?  The
guest kernel is the one that controls the guest userspace pagetables,
not the host kernel.  I guess your complaint is about VMs in general,
but unfortunately I don't think you'll convince the rest of the
industry to abandon VMs ;-)

But more seriously, let's take a step back here.. what scenarios are
you seeing this being problematic for?  Then we can see how to come up
with solutions.  The current situation of host userspace VMM just
guessing isn't great.  And sticking our heads in the sand and
pretending VMs don't exist isn't great.  So what can we do?  I can
instead add a msm ioctl to return this info and solve the problem even
more narrowly for a single platform.  But then the problem still
remains on other platforms.

Slightly implicit in this is that mapping dma-bufs to the guest won't
work for anything that requires DMA_BUF_IOCTL_SYNC for coherency.. we
could add a possible return value for DMA_BUF_INFO_VM_PROT indicating
that the buffer does not support mapping to guest or CPU access
without DMA_BUF_IOCTL_SYNC.  Then at least the VMM can fail gracefully
instead of subtly.

BR,
-R


Re: [Freedreno] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

2022-08-07 Thread Akhil P Oommen

On 7/29/2022 10:37 PM, Rob Clark wrote:

From: Rob Clark 

This is a fairly narrowly focused interface, providing a way for a VMM
in userspace to tell the guest kernel what pgprot settings to use when
mapping a buffer to guest userspace.

For buffers that get mapped into guest userspace, virglrenderer returns
a dma-buf fd to the VMM (crosvm or qemu).  In addition to mapping the
pages into the guest VM, it needs to report to drm/virtio in the guest
the cache settings to use for guest userspace.  In particular, on some
architectures, creating aliased mappings with different cache attributes
is frowned upon, so it is important that the guest mappings have the
same cache attributes as any potential host mappings.

Signed-off-by: Rob Clark 
---
  drivers/dma-buf/dma-buf.c| 26 ++
  include/linux/dma-buf.h  |  7 +++
  include/uapi/linux/dma-buf.h | 28 
  3 files changed, 61 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 32f55640890c..d02d6c2a3b49 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -326,6 +326,29 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const 
char __user *buf)
return 0;
  }
  
+static long dma_buf_info(struct dma_buf *dmabuf, const void __user *uarg)

+{
+   struct dma_buf_info arg;
+
+   if (copy_from_user(&arg, uarg, sizeof(arg)))
+   return -EFAULT;
+
+   switch (arg.param) {
+   case DMA_BUF_INFO_VM_PROT:
+   if (!dmabuf->ops->mmap_info)
+   return -ENOSYS;
+   arg.value = dmabuf->ops->mmap_info(dmabuf);
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   if (copy_to_user(uarg, &arg, sizeof(arg)))
+   return -EFAULT;
+
+   return 0;
+}
+
  static long dma_buf_ioctl(struct file *file,
  unsigned int cmd, unsigned long arg)
  {
@@ -369,6 +392,9 @@ static long dma_buf_ioctl(struct file *file,
case DMA_BUF_SET_NAME_B:
return dma_buf_set_name(dmabuf, (const char __user *)arg);
  
+	case DMA_BUF_IOCTL_INFO:

+   return dma_buf_info(dmabuf, (const void __user *)arg);
+
default:
return -ENOTTY;
}
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 71731796c8c3..6f4de64a5937 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -283,6 +283,13 @@ struct dma_buf_ops {
 */
int (*mmap)(struct dma_buf *, struct vm_area_struct *vma);
  
+	/**

+* @mmap_info:
+*
+* Return mmapping info for the buffer.  See DMA_BUF_INFO_VM_PROT.
+*/
+   int (*mmap_info)(struct dma_buf *);
+
int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map);
void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map *map);
  };
diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
index b1523cb8ab30..a41adac0f46a 100644
--- a/include/uapi/linux/dma-buf.h
+++ b/include/uapi/linux/dma-buf.h
@@ -85,6 +85,32 @@ struct dma_buf_sync {
  
  #define DMA_BUF_NAME_LEN	32
  
+

+/**
+ * struct dma_buf_info - Query info about the buffer.
+ */
+struct dma_buf_info {
+
+#define DMA_BUF_INFO_VM_PROT  1
+#  define DMA_BUF_VM_PROT_WC  0
+#  define DMA_BUF_VM_PROT_CACHED  1
+
+   /**
+* @param: Which param to query
+*
+* DMA_BUF_INFO_BM_PROT:

Is there a typo here? BM -> VM ?

-Akhil.

+* Query the access permissions of userspace mmap's of this buffer.
+* Returns one of DMA_BUF_VM_PROT_x
+*/
+   __u32 param;
+   __u32 pad;
+
+   /**
+* @value: Return value of the query.
+*/
+   __u64 value;
+};
+
  #define DMA_BUF_BASE  'b'
  #define DMA_BUF_IOCTL_SYNC_IOW(DMA_BUF_BASE, 0, struct dma_buf_sync)
  
@@ -95,4 +121,6 @@ struct dma_buf_sync {

  #define DMA_BUF_SET_NAME_A_IOW(DMA_BUF_BASE, 1, __u32)
  #define DMA_BUF_SET_NAME_B_IOW(DMA_BUF_BASE, 1, __u64)
  
+#define DMA_BUF_IOCTL_INFO	_IOWR(DMA_BUF_BASE, 2, struct dma_buf_info)

+
  #endif




Re: [PATCH 1/2] drm/i915/ttm: remove calc_ctrl_surf_instr_size

2022-08-07 Thread Ramalingam C
On 2022-08-05 at 14:22:39 +0100, Matthew Auld wrote:
> We only ever need to emit one ccs block copy command.
Since max size we handle at a time is CHUNK_SZ, we will need only one
cmd.

Reviewed-by: Ramalingam C
> 
> Signed-off-by: Matthew Auld 
> Cc: Thomas Hellström 
> Cc: Ramalingam C 
> ---
>  drivers/gpu/drm/i915/gt/intel_migrate.c | 35 +++--
>  1 file changed, 3 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
> b/drivers/gpu/drm/i915/gt/intel_migrate.c
> index 9a0814422ba4..1bbed7aa436a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> @@ -511,44 +511,16 @@ static inline u32 *i915_flush_dw(u32 *cmd, u32 flags)
>   return cmd;
>  }
>  
> -static u32 calc_ctrl_surf_instr_size(struct drm_i915_private *i915, int size)
> -{
> - u32 num_cmds, num_blks, total_size;
> -
> - if (!GET_CCS_BYTES(i915, size))
> - return 0;
> -
> - /*
> -  * XY_CTRL_SURF_COPY_BLT transfers CCS in 256 byte
> -  * blocks. one XY_CTRL_SURF_COPY_BLT command can
> -  * transfer upto 1024 blocks.
> -  */
> - num_blks = DIV_ROUND_UP(GET_CCS_BYTES(i915, size),
> - NUM_CCS_BYTES_PER_BLOCK);
> - num_cmds = DIV_ROUND_UP(num_blks, NUM_CCS_BLKS_PER_XFER);
> - total_size = XY_CTRL_SURF_INSTR_SIZE * num_cmds;
> -
> - /*
> -  * Adding a flush before and after XY_CTRL_SURF_COPY_BLT
> -  */
> - total_size += 2 * MI_FLUSH_DW_SIZE;
> -
> - return total_size;
> -}
> -
>  static int emit_copy_ccs(struct i915_request *rq,
>u32 dst_offset, u8 dst_access,
>u32 src_offset, u8 src_access, int size)
>  {
>   struct drm_i915_private *i915 = rq->engine->i915;
>   int mocs = rq->engine->gt->mocs.uc_index << 1;
> - u32 num_ccs_blks, ccs_ring_size;
> + u32 num_ccs_blks;
>   u32 *cs;
>  
> - ccs_ring_size = calc_ctrl_surf_instr_size(i915, size);
> - WARN_ON(!ccs_ring_size);
> -
> - cs = intel_ring_begin(rq, round_up(ccs_ring_size, 2));
> + cs = intel_ring_begin(rq, 12);
>   if (IS_ERR(cs))
>   return PTR_ERR(cs);
>  
> @@ -583,8 +555,7 @@ static int emit_copy_ccs(struct i915_request *rq,
>   FIELD_PREP(XY_CTRL_SURF_MOCS_MASK, mocs);
>  
>   cs = i915_flush_dw(cs, MI_FLUSH_DW_LLC | MI_FLUSH_DW_CCS);
> - if (ccs_ring_size & 1)
> - *cs++ = MI_NOOP;
> + *cs++ = MI_NOOP;
>  
>   intel_ring_advance(rq, cs);
>  
> -- 
> 2.37.1
> 


Re: [PATCH 2/2] drm/i915/ttm: fix CCS handling

2022-08-07 Thread Ramalingam C
On 2022-08-05 at 14:22:40 +0100, Matthew Auld wrote:
> Crucible + recent Mesa seems to sometimes hit:
> 
> GEM_BUG_ON(num_ccs_blks > NUM_CCS_BLKS_PER_XFER)
> 
> And it looks like we can also trigger this with gem_lmem_swapping, if we
> modify the test to use slightly larger object sizes.
> 
> Looking closer it looks like we have the following issues in
> migrate_copy():
> 
>   - We are using plain integer in various places, which we can easily overflow
> with a large object.
> 
>   - We pass the entire object size (when the src is lmem) into
> emit_pte() and then try to copy it, which doesn't work, since we
> only have a few fixed sized windows in which to map the pages and
> perform the copy. With an object > 8M we therefore aren't properly
> copying the pages. And then with an object > 64M we trigger the
> GEM_BUG_ON(num_ccs_blks > NUM_CCS_BLKS_PER_XFER).
> 
> So it looks like our copy handling for any object > 8M (which is our
> CHUNK_SZ) is currently broken on DG2.
> 
> Fixes: da0595ae91da ("drm/i915/migrate: Evict and restore the flatccs capable 
> lmem obj")
> Testcase: igt@gem_lmem_swapping@basic-big
> Testcase: igt@gem_lmem_swapping@verify-ccs-big
> Signed-off-by: Matthew Auld 
> Cc: Thomas Hellström 
> Cc: Ramalingam C 
> ---
>  drivers/gpu/drm/i915/gt/intel_migrate.c | 44 -
>  1 file changed, 21 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
> b/drivers/gpu/drm/i915/gt/intel_migrate.c
> index 1bbed7aa436a..aaaf1906026c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> @@ -609,9 +609,9 @@ static int emit_copy(struct i915_request *rq,
>   return 0;
>  }
>  
> -static int scatter_list_length(struct scatterlist *sg)
> +static u64 scatter_list_length(struct scatterlist *sg)
>  {
> - int len = 0;
> + u64 len = 0;
>  
>   while (sg && sg_dma_len(sg)) {
>   len += sg_dma_len(sg);
> @@ -621,28 +621,26 @@ static int scatter_list_length(struct scatterlist *sg)
>   return len;
>  }
>  
> -static void
> +static int
>  calculate_chunk_sz(struct drm_i915_private *i915, bool src_is_lmem,
> -int *src_sz, u32 bytes_to_cpy, u32 ccs_bytes_to_cpy)
> +u64 bytes_to_cpy, u64 ccs_bytes_to_cpy)
>  {
> - if (ccs_bytes_to_cpy) {
> - if (!src_is_lmem)
> - /*
> -  * When CHUNK_SZ is passed all the pages upto CHUNK_SZ
> -  * will be taken for the blt. in Flat-ccs supported
> -  * platform Smem obj will have more pages than required
> -  * for main meory hence limit it to the required size
> -  * for main memory
> -  */
> - *src_sz = min_t(int, bytes_to_cpy, CHUNK_SZ);
> - } else { /* ccs handling is not required */
> - *src_sz = CHUNK_SZ;
> - }
> + if (ccs_bytes_to_cpy && !src_is_lmem)
Yes this is needed for ccs copy of an obj of >8M from lmem to smem.

Reviewed-by: Ramalingam C
> + /*
> +  * When CHUNK_SZ is passed all the pages upto CHUNK_SZ
> +  * will be taken for the blt. in Flat-ccs supported
> +  * platform Smem obj will have more pages than required
> +  * for main meory hence limit it to the required size
> +  * for main memory
> +  */
> + return min_t(u64, bytes_to_cpy, CHUNK_SZ);
> + else
> + return CHUNK_SZ;
>  }
>  
> -static void get_ccs_sg_sgt(struct sgt_dma *it, u32 bytes_to_cpy)
> +static void get_ccs_sg_sgt(struct sgt_dma *it, u64 bytes_to_cpy)
>  {
> - u32 len;
> + u64 len;
>  
>   do {
>   GEM_BUG_ON(!it->sg || !sg_dma_len(it->sg));
> @@ -673,12 +671,12 @@ intel_context_migrate_copy(struct intel_context *ce,
>  {
>   struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst), it_ccs;
>   struct drm_i915_private *i915 = ce->engine->i915;
> - u32 ccs_bytes_to_cpy = 0, bytes_to_cpy;
> + u64 ccs_bytes_to_cpy = 0, bytes_to_cpy;
>   enum i915_cache_level ccs_cache_level;
>   u32 src_offset, dst_offset;
>   u8 src_access, dst_access;
>   struct i915_request *rq;
> - int src_sz, dst_sz;
> + u64 src_sz, dst_sz;
>   bool ccs_is_src, overwrite_ccs;
>   int err;
>  
> @@ -761,8 +759,8 @@ intel_context_migrate_copy(struct intel_context *ce,
>   if (err)
>   goto out_rq;
>  
> - calculate_chunk_sz(i915, src_is_lmem, &src_sz,
> -bytes_to_cpy, ccs_bytes_to_cpy);
> + src_sz = calculate_chunk_sz(i915, src_is_lmem,
> + bytes_to_cpy, ccs_bytes_to_cpy);
>  
>   len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem,
>  src_offset, src_sz);
> -- 
> 2.37.1
> 


Re: [PATCH v2 1/3] drivers: serial: earlycon: Correct argument name

2022-08-07 Thread kernel test robot
Hi Markuss,

I love your patch! Yet something to improve:

[auto build test ERROR on tty/tty-testing]
[also build test ERROR on efi/next staging/staging-testing usb/usb-testing 
linus/master v5.19 next-20220805]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Markuss-Broks/Add-generic-framebuffer-support-to-EFI-earlycon-driver/20220807-003646
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git 
tty-testing
config: x86_64-randconfig-a012 
(https://download.01.org/0day-ci/archive/20220807/20220807.own8uzfx-...@intel.com/config)
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 
5f1c7e2cc5a3c07cbc2412e851a7283c1841f520)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/intel-lab-lkp/linux/commit/dedd7c138e9492439eeda05fa75a18bf19883a08
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review 
Markuss-Broks/Add-generic-framebuffer-support-to-EFI-earlycon-driver/20220807-003646
git checkout dedd7c138e9492439eeda05fa75a18bf19883a08
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/tty/serial/earlycon.c:246:12: error: conflicting types for 
>> 'of_setup_earlycon'
   int __init of_setup_earlycon(const struct earlycon_id *match,
  ^
   include/linux/serial_core.h:708:12: note: previous declaration is here
   extern int of_setup_earlycon(const struct earlycon_id *match,
  ^
   1 error generated.


vim +/of_setup_earlycon +246 drivers/tty/serial/earlycon.c

8477614d9f7c5c Peter Hurley   2016-01-16  245  
c90fe9c0394b06 Peter Hurley   2016-01-16 @246  int __init 
of_setup_earlycon(const struct earlycon_id *match,
dedd7c138e9492 Markuss Broks  2022-08-06  247
int offset,
4d118c9a866590 Peter Hurley   2016-01-16  248
const char *options)
b0b6abd34c1b50 Rob Herring2014-03-27  249  {
b0b6abd34c1b50 Rob Herring2014-03-27  250   int err;
b0b6abd34c1b50 Rob Herring2014-03-27  251   struct uart_port *port 
= &early_console_dev.port;
088da2a17619cf Peter Hurley   2016-01-16  252   const __be32 *val;
088da2a17619cf Peter Hurley   2016-01-16  253   bool big_endian;
c90fe9c0394b06 Peter Hurley   2016-01-16  254   u64 addr;
b0b6abd34c1b50 Rob Herring2014-03-27  255  
65e20e8cbbccaf Michael Walle  2022-06-28  256   if (early_con.flags & 
CON_ENABLED)
65e20e8cbbccaf Michael Walle  2022-06-28  257   return 
-EALREADY;
65e20e8cbbccaf Michael Walle  2022-06-28  258  
e1dd3bef6d03c9 Geert Uytterhoeven 2015-11-27  259   
spin_lock_init(&port->lock);
b0b6abd34c1b50 Rob Herring2014-03-27  260   port->iotype = UPIO_MEM;
dedd7c138e9492 Markuss Broks  2022-08-06  261   addr = 
of_flat_dt_translate_address(offset);
c90fe9c0394b06 Peter Hurley   2016-01-16  262   if (addr == 
OF_BAD_ADDR) {
c90fe9c0394b06 Peter Hurley   2016-01-16  263   pr_warn("[%s] 
bad address\n", match->name);
c90fe9c0394b06 Peter Hurley   2016-01-16  264   return -ENXIO;
c90fe9c0394b06 Peter Hurley   2016-01-16  265   }
b0b6abd34c1b50 Rob Herring2014-03-27  266   port->mapbase = addr;
b0b6abd34c1b50 Rob Herring2014-03-27  267  
dedd7c138e9492 Markuss Broks  2022-08-06  268   val = 
of_get_flat_dt_prop(offset, "reg-offset", NULL);
088da2a17619cf Peter Hurley   2016-01-16  269   if (val)
088da2a17619cf Peter Hurley   2016-01-16  270   port->mapbase 
+= be32_to_cpu(*val);
1f66dd36bb1843 Greentime Hu   2018-02-13  271   port->membase = 
earlycon_map(port->mapbase, SZ_4K);
1f66dd36bb1843 Greentime Hu   2018-02-13  272  
dedd7c138e9492 Markuss Broks  2022-08-06  273   val = 
of_get_flat_dt_prop(offset, "reg-shift", NULL);
088da2a17619cf Peter Hurley   2016-01-16  274   if (val)
088da2a17619cf Peter Hurley   2016-01-16  275   port->regshift 
= be32_to_cpu(*val);
dedd7c138e9492 Markuss Broks  2022-08-06  276   big_endian = 
of_get_flat_dt_prop(offset, "big-endian", NULL) != NULL ||
088da2a17619cf Peter Hurley   

Re: [PATCH 1/2] dt-bindings: display: bridge: icn6211: Add support for RGB/BGR swap

2022-08-07 Thread Marek Vasut

On 8/4/22 00:41, Rob Herring wrote:

On Tue, Aug 2, 2022 at 5:33 AM Marek Vasut  wrote:


On 8/1/22 18:32, Rob Herring wrote:

On Mon, Aug 01, 2022 at 03:19:00PM +0200, Marek Vasut wrote:

The ICN6211 is capable of swapping the output DPI RGB/BGR color channels,
document a DT property to select this swap in DT. This can be useful on
hardware where such swap happens.


We should ensure this series[1] works for you instead.


[...]


Rob

[1] https://lore.kernel.org/r/20220628181838.2031-3-max.oss...@gmail.com


I'm still not convinced that we should encode this pixel format value
directly into the DT instead of trying to describe the DPI bus color
channel width/order/shift in the DT instead. I think I mentioned that
before in one of the previous versions of that series.


I worry that it gets pretty verbose, but worth having the discussion.
In any case, let's have that discussion and not add yet another one
off property.


Done, I replied


Re: [PATCH v1 0/7] New DRM driver for Intel VPU

2022-08-07 Thread Dave Airlie
On Thu, 28 Jul 2022 at 23:17, Jacek Lawrynowicz
 wrote:
>
> Hi,
>
> This patchset contains a new Linux* Kernel Driver for Intel® VPUs.
>
> VPU stands for Versatile Processing Unit and it is an AI inference accelerator
> integrated with Intel non-server CPUs starting from 14th generation.
> VPU enables efficient execution of Deep Learning applications
> like object detection, classification etc.
>
> Driver is part of gpu/drm subsystem because VPU is similar in operation to
> an integrated GPU. Reusing drm driver init, ioctl handling, gem and prime
> helpers and drm_mm allows to minimize code duplication in the kernel.
>
> The whole driver is licensed under GPL-2.0-only except for two headers 
> imported
> from the firmware that are MIT licensed.
>
> User mode driver stack consists of Level Zero API driver and OpenVINO plugin.
> Both should be open-sourced by the end of Q3.
> The firmware for the VPU will be distributed as a closed source binary.


Thanks for the submission, this looks pretty good and well layed out,

just a few higher level things, I think I'd like this name intel-vpu
or ivpu or something, VPU is a pretty generic namespace usage.

I think adding some sort of TODO file with what is missing and what
future things need to happen would be useful to know when merging this
might be a good idea.

I'm kinda thinking with a rename we could merge this sooner into a
staging-lite model.

I think I'd like Christian/Maarten to maybe review the fencing/uapi,
to make sure nothing too much is wrong there. The submit/waitbo model
is getting a bit old, and using syncobjs might be useful to make it
more modern. Is this device meant to be used by multiple users at
once? Maybe we'd want scheduler integration for it as well (which I
think I saw mentioned somewhere in passing).

Dave.


Re: imx8mm lcdif->dsi->adv7535 no video, no errors

2022-08-07 Thread Adam Ford
On Fri, Aug 5, 2022 at 4:05 PM Adam Ford  wrote:
>
> On Fri, Aug 5, 2022 at 7:56 AM Adam Ford  wrote:
> >
> > On Fri, Aug 5, 2022 at 5:55 AM Adam Ford  wrote:
> > >
> > > On Fri, Aug 5, 2022 at 3:44 AM Biju Das  
> > > wrote:
> > > >
> > > > Hi Adam and all,
> > > >
> > > > > Subject: Re: imx8mm lcdif->dsi->adv7535 no video, no errors
> > > > >
> > > > > On Thu, Aug 4, 2022 at 9:52 AM Dave Stevenson
> > > > >  wrote:
> > > > > >
> > > > > > On Thu, 4 Aug 2022 at 13:51, Marco Felsch 
> > > > > wrote:
> > > > > > >
> > > > > > > Hi Dave,
> > > > > > >
> > > > > > > On 22-08-04, Dave Stevenson wrote:
> > > > > > > > Hi Marco
> > > > > > > >
> > > > > > > > On Thu, 4 Aug 2022 at 10:38, Marco Felsch
> > > > >  wrote:
> > > > > > > > >
> > > > > > > > > Hi Dave, Adam,
> > > > > > > > >
> > > > > > > > > On 22-08-03, Dave Stevenson wrote:
> > > > > > > > > > Hi Adam
> > > > > > > > > >
> > > > > > > > > > On Wed, 3 Aug 2022 at 12:03, Adam Ford 
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > ...
> > > > > > > > >
> > > > > > > > > > > > Did managed to get access to the ADV7535 programming
> > > > > > > > > > > > guide? This is the black box here. Let me check if I can
> > > > > > > > > > > > provide you a link with our repo so you can test our
> > > > > current DSIM state if you want.
> > > > > > > > > > >
> > > > > > > > > > > I do have access to the programming guide, but it's under
> > > > > > > > > > > NDA, but I'll try to answer questions if I can.
> > > > > > > > > >
> > > > > > > > > > Not meaning to butt in, but I have datasheets for ADV7533 
> > > > > > > > > > and
> > > > > > > > > > 7535 from previously looking at these chips.
> > > > > > > > >
> > > > > > > > > Thanks for stepping into :)
> > > > > > > > >
> > > > > > > > > > Mine fairly plainly states:
> > > > > > > > > > "The DSI receiver input supports DSI video mode operation
> > > > > > > > > > only, and specifically, only supports nonburst mode with 
> > > > > > > > > > sync
> > > > > pulses".
> > > > > > > > >
> > > > > > > > > I've read this also, and we are working in nonburst mode with
> > > > > > > > > sync pulses. I have no access to an MIPI-DSI analyzer 
> > > > > > > > > therefore
> > > > > > > > > I can't verify it.
> > > > > > > > >
> > > > > > > > > > Non-burst mode meaning that the DSI pixel rate MUST be the
> > > > > > > > > > same as the HDMI pixel rate.
> > > > > > > > >
> > > > > > > > > On DSI side you don't have a pixel-clock instead there is bit-
> > > > > clock.
> > > > > > > >
> > > > > > > > You have an effective pixel clock, with a fixed conversion for 
> > > > > > > > the
> > > > > > > > configuration.
> > > > > > > >
> > > > > > > > DSI bit-clock * number of lanes / bits_per_pixel = pixel rate.
> > > > > > > > 891Mbit/s * 4 lanes / 24bpp = 148.5 Mpixels/s
> > > > > > >
> > > > > > > Okay, I just checked the bandwidth which must equal.
> > > > > > >
> > > > > > > > As noted elsewhere, the DSI is DDR, so the clock lane itself is
> > > > > > > > only running at 891 / 2 = 445.5MHz.
> > > > > > > >
> > > > > > > > > > Section 6.1.1 "DSI Input Modes" of
> > > > > > > > > > adv7533_hardware_user_s_guide is even more explicit about 
> > > > > > > > > > the
> > > > > > > > > > requirement of DSI timing matching
> > > > > > > > >
> > > > > > > > > Is it possible to share the key points of the requirements?
> > > > > > > >
> > > > > > > > "Specifically the ADV7533 supports the Non-Burst Mode with 
> > > > > > > > syncs.
> > > > > > > > This mode requires real time data generation as a pulse packet
> > > > > > > > received becomes a pulse generated. Therefore this mode 
> > > > > > > > requires a
> > > > > > > > continuous stream of data with correct video timing to avoid any
> > > > > > > > visual artifacts."
> > > > > > > >
> > > > > > > > LP mode is supported on data lanes. Clock lane must remain in HS
> > > > > mode.
> > > > > > > >
> > > > > > > > "... the goal is to accurately convey DPI-type timing over DSI.
> > > > > > > > This includes matching DPI pixel-transmission rates, and widths 
> > > > > > > > of
> > > > > > > > timing events."
> > > > > > >
> > > > > > > Thanks for sharing.
> > > > > > >
> > > > > > > > > > The NXP kernel switching down to an hs_clk of 445.5MHz would
> > > > > > > > > > therefore be correct for 720p operation.
> > > > > > > > >
> > > > > > > > > It should be absolute no difference if you work on 891MHz 
> > > > > > > > > with 2
> > > > > > > > > lanes or on 445.5 MHz with 4 lanes. What must be ensured is 
> > > > > > > > > that
> > > > > > > > > you need the minimum required bandwidth which is roughly:
> > > > > > > > > 1280*720*24*60 = 1.327 GBps.
> > > > > > > >
> > > > > > > > Has someone changed the number of lanes in use? I'd missed that 
> > > > > > > > if
> > > > > > > > so, but I'll agree that 891MHz over 2 lanes should work for
> > > > > 720p60.
> > > > > > >
> > > > > > > The ADV driver is changing it autom. but this logic is somehow odd
> > > > > > > and there was already a approach

Re: [PATCH v2] drm/gem: Fix GEM handle release errors

2022-08-07 Thread Chen Jeffy

Hi Christian,

Thanks for your reply, and sorry i didn't make it clear.

On 8/8 星期一 0:52, Christian König wrote:

Am 03.08.22 um 10:32 schrieb Jeffy Chen:

Currently we are assuming a one to one mapping between dmabuf and handle
when releasing GEM handles.

But that is not always true, since we would create extra handles for the
GEM obj in cases like gem_open() and getfb{,2}().

A similar issue was reported at:
https://lore.kernel.org/all/20211105083308.392156-1-jay...@rock-chips.com/

Another problem is that the drm_gem_remove_prime_handles() now only
remove handle to the exported dmabuf (gem_obj->dma_buf), so the imported
ones would leak:
WARNING: CPU: 2 PID: 236 at drivers/gpu/drm/drm_prime.c:228 
drm_prime_destroy_file_private+0x18/0x24


Let's fix these by using handle to find the exact map to remove.


Well we are clearly something missing here. As far as I can see the 
current code is correct.


Creating multiple GEM handles for the same DMA-buf is possible, but 
illegal. >
In other words when a GEM handle is exported as DMA-buf and imported 
again you should intentionally always get the same handle.


These issue are not about having handles for importing an exported 
dma-buf case, but for having multiple handles to a GEM object(which 
means having multiple handles to a dma-buf).


I know the drm-prime is trying to make dma-buf and handle maps one to 
one, but the drm-gem is allowing to create extra handles for a GEM 
object, for example:

drm_gem_open_ioctl -> drm_gem_handle_create_tail
drm_mode_getfb2_ioctl -> drm_gem_handle_create
drm_mode_getfb -> fb->funcs->create_handle

So we are allowing GEM object to have multiple handles, and GEM object 
could have at most one dma-buf, doesn't that means that dma-buf could 
map to multiple handles?


Or should we rewrite the GEM framework to limit GEM object with uniq handle?


The other issue is that we are leaking dma-buf <-> handle map for the 
imported dma-buf, since the drm_gem_remove_prime_handles doesn't take 
care of obj->import_attach->dmabuf.


But of cause this can be fixed in other way:
+++ b/drivers/gpu/drm/drm_gem.c
@@ -180,6 +180,9 @@ drm_gem_remove_prime_handles(struct drm_gem_object 
*obj, struct drm_file *filp)

drm_prime_remove_buf_handle_locked(&filp->prime,
   obj->dma_buf);
}
+   if (obj->import_attach)
+   drm_prime_remove_buf_handle_locked(&filp->prime,
+ 
obj->import_attach->dmabuf);

mutex_unlock(&filp->prime.lock);
 }


So this is pretty much a clear NAK to this patch since it shouldn't be 
necessary or something is seriously broken somewhere else.


Regards,
Christian.



Signed-off-by: Jeffy Chen 
---

Changes in v2:
Fix a typo of rbtree.

  drivers/gpu/drm/drm_gem.c  | 17 +
  drivers/gpu/drm/drm_internal.h |  4 ++--
  drivers/gpu/drm/drm_prime.c    | 20 
  3 files changed, 15 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index eb0c2d041f13..ed39da383570 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -168,21 +168,6 @@ void drm_gem_private_object_init(struct 
drm_device *dev,

  }
  EXPORT_SYMBOL(drm_gem_private_object_init);
-static void
-drm_gem_remove_prime_handles(struct drm_gem_object *obj, struct 
drm_file *filp)

-{
-    /*
- * Note: obj->dma_buf can't disappear as long as we still hold a
- * handle reference in obj->handle_count.
- */
-    mutex_lock(&filp->prime.lock);
-    if (obj->dma_buf) {
-    drm_prime_remove_buf_handle_locked(&filp->prime,
-   obj->dma_buf);
-    }
-    mutex_unlock(&filp->prime.lock);
-}
-
  /**
   * drm_gem_object_handle_free - release resources bound to userspace 
handles

   * @obj: GEM object to clean up.
@@ -253,7 +238,7 @@ drm_gem_object_release_handle(int id, void *ptr, 
void *data)

  if (obj->funcs->close)
  obj->funcs->close(obj, file_priv);
-    drm_gem_remove_prime_handles(obj, file_priv);
+    drm_prime_remove_buf_handle(&file_priv->prime, id);
  drm_vma_node_revoke(&obj->vma_node, file_priv);
  drm_gem_object_handle_put_unlocked(obj);
diff --git a/drivers/gpu/drm/drm_internal.h 
b/drivers/gpu/drm/drm_internal.h

index 1fbbc19f1ac0..7bb98e6a446d 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -74,8 +74,8 @@ int drm_prime_fd_to_handle_ioctl(struct drm_device 
*dev, void *data,
  void drm_prime_init_file_private(struct drm_prime_file_private 
*prime_fpriv);
  void drm_prime_destroy_file_private(struct drm_prime_file_private 
*prime_fpriv);
-void drm_prime_remove_buf_handle_locked(struct drm_prime_file_private 
*prime_fpriv,

-    struct dma_buf *dma_buf);
+void drm_prime_remove_buf_handle(struct drm_prime_file_private 
*prime_fpriv,

+ uint32_t handle);
  /* drm_drv.c */
  struct drm_minor *drm_minor_acquire(unsigned int minor_id);
diff --git a/

Re: [PATCH] i2c: qcom-geni: Fix GPI DMA buffer sync-back

2022-08-07 Thread Wolfram Sang
On Sun, Aug 07, 2022 at 11:04:54PM +0900, Robin Reckmann wrote:
> Fix i2c transfers using GPI DMA mode for all message types that do not set
> the I2C_M_DMA_SAFE flag (e.g. SMBus "read byte").
> 
> In this case a bounce buffer is returned by i2c_get_dma_safe_msg_buf(),
> and it has to synced back to the message after the transfer is done.
> 
> Add missing assignment of dma buffer in geni_i2c_gpi().
> 
> Set xferred in i2c_put_dma_safe_msg_buf() to true in case of no error to
> ensure the sync-back of this dma buffer to the message.
> 
> Signed-off-by: Robin Reckmann 

Thank you! What would be a Fixes tag for this?



signature.asc
Description: PGP signature


Re: New subsystem for acceleration devices

2022-08-07 Thread Greg Kroah-Hartman
On Sun, Aug 07, 2022 at 02:25:33PM +0300, Oded Gabbay wrote:
> 2. Common code to handle drivers that want to allow a single user at a
> time to run open the device char file.

Note, that's an impossible request, and one that the kernel should never
worry about, so don't even try it.  Think about userspace doing an call
to dup() on an open char file descriptor and then passing that off
somewhere else.

thanks,

greg k-h


Re: How to test whether a buffer is in linear format

2022-08-07 Thread Sebastian Krzyszkowiak
On 8/6/22, Hoosier, Matt  wrote:
> Any idea what’s up with some compositors adding code to infer
> DRM_FORMAT_MOD_LINEAR semantics when the buffer’s modifiers are set to 0?
> Wlroots, for example, added this as a “safety net for drm drivers not
> announcing modifiers”.
>
> https://source.puri.sm/Librem5/wlroots/-/merge_requests/63

For the record, that's an old piece of code from a branch that hasn't
been used for a long time already, so don't pay attention to it. See
https://github.com/swaywm/wlroots/pull/2090 for details.

Cheers,
Sebastian


Re: [PATCH] amdgpu: add context creation flags in CS IOCTL

2022-08-07 Thread Somalapuram, Amaranath



On 8/2/2022 7:25 PM, Shashank Sharma wrote:

This patch adds:
- A new input parameter "flags" in the amdgpu_ctx_create2 call.
- Some new flags defining workload type hints.
- Some change in the caller function of amdgpu_ctx_create2, to
   accomodate this new parameter.

The idea is to pass the workload hints while context creation, so
that kernel GPU scheduler can pass this information to GPU FW, which in
turn can adjust the GPU characterstics as per the workload type.

Signed-off-by: Shashank Sharma 
Cc: Alex Deucher 
Cc: Marek Olsak 
Cc: Christian Koenig 
Cc: Amarnath Somalapuram 
---
  amdgpu/amdgpu.h  |  2 ++
  amdgpu/amdgpu_cs.c   |  5 -
  include/drm/amdgpu_drm.h | 10 +-
  3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index b118dd48..1ebb46e6 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -874,6 +874,7 @@ int amdgpu_bo_list_update(amdgpu_bo_list_handle handle,
   *
   * \param   dev  - \c [in] Device handle. See #amdgpu_device_initialize()
   * \param   priority - \c [in] Context creation flags. See 
AMDGPU_CTX_PRIORITY_*
+ * \param   flags- \c [in] Context flags. See AMDGPU_CTX_FLAGS_*
   * \param   context  - \c [out] GPU Context handle
   *
   * \return   0 on success\n
@@ -884,6 +885,7 @@ int amdgpu_bo_list_update(amdgpu_bo_list_handle handle,
  */
  int amdgpu_cs_ctx_create2(amdgpu_device_handle dev,
 uint32_t priority,
+uint32_t flags,
 amdgpu_context_handle *context);
  /**
   * Create GPU execution Context
diff --git a/amdgpu/amdgpu_cs.c b/amdgpu/amdgpu_cs.c
index fad484bf..d4723ea5 100644
--- a/amdgpu/amdgpu_cs.c
+++ b/amdgpu/amdgpu_cs.c
@@ -44,12 +44,14 @@ static int amdgpu_cs_reset_sem(amdgpu_semaphore_handle sem);
   *
   * \param   dev  - \c [in] Device handle. See #amdgpu_device_initialize()
   * \param   priority - \c [in] Context creation flags. See 
AMDGPU_CTX_PRIORITY_*
+ * \param   flags- \c [in] Context flags. See AMDGPU_CTX_FLAGS_*
   * \param   context  - \c [out] GPU Context handle
   *
   * \return  0 on success otherwise POSIX Error code
  */
  drm_public int amdgpu_cs_ctx_create2(amdgpu_device_handle dev,
 uint32_t priority,
+uint32_t flags,
 amdgpu_context_handle *context)
  {
struct amdgpu_context *gpu_context;
@@ -74,6 +76,7 @@ drm_public int amdgpu_cs_ctx_create2(amdgpu_device_handle dev,
memset(&args, 0, sizeof(args));
args.in.op = AMDGPU_CTX_OP_ALLOC_CTX;
args.in.priority = priority;
+   args.in.flags = flags;
  
  	r = drmCommandWriteRead(dev->fd, DRM_AMDGPU_CTX, &args, sizeof(args));

if (r)
@@ -97,7 +100,7 @@ error:
  drm_public int amdgpu_cs_ctx_create(amdgpu_device_handle dev,
amdgpu_context_handle *context)
  {
-   return amdgpu_cs_ctx_create2(dev, AMDGPU_CTX_PRIORITY_NORMAL, context);
+   return amdgpu_cs_ctx_create2(dev, AMDGPU_CTX_PRIORITY_NORMAL, 0, 
context);


How do we set workload hint from application, amdgpu_cs_ctx_create show 
have flag ?


Regards,
S.Amarnath

  }
  
  /**

diff --git a/include/drm/amdgpu_drm.h b/include/drm/amdgpu_drm.h
index 0cbd1540..d9fb1f20 100644
--- a/include/drm/amdgpu_drm.h
+++ b/include/drm/amdgpu_drm.h
@@ -238,10 +238,18 @@ union drm_amdgpu_bo_list {
  #define AMDGPU_CTX_PRIORITY_HIGH512
  #define AMDGPU_CTX_PRIORITY_VERY_HIGH   1023
  
+/* GPU context workload hint bitmask */

+#define AMDGPU_CTX_FLAGS_WORKLOAD_HINT_MASK0xFF
+#define AMDGPU_CTX_FLAGS_WORKLOAD_HINT_NONE0
+#define AMDGPU_CTX_FLAGS_WORKLOAD_HINT_3D  (1 << 1)
+#define AMDGPU_CTX_FLAGS_WORKLOAD_HINT_VIDEO   (1 << 2)
+#define AMDGPU_CTX_FLAGS_WORKLOAD_HINT_VR  (1 << 3)
+#define AMDGPU_CTX_FLAGS_WORKLOAD_HINT_COMPUTE (1 << 4)
+
  struct drm_amdgpu_ctx_in {
/** AMDGPU_CTX_OP_* */
__u32   op;
-   /** For future use, no flags defined so far */
+   /** AMDGPU_CTX_FLAGS_* */
__u32   flags;
__u32   ctx_id;
/** AMDGPU_CTX_PRIORITY_* */