Re: [PATCH] backlight: pwm_bl: Avoid open coded arithmetic in memory allocation

2022-02-07 Thread Uwe Kleine-König
On Sat, Feb 05, 2022 at 08:40:48AM +0100, Christophe JAILLET wrote:
> kmalloc_array()/kcalloc() should be used to avoid potential overflow when
> a multiplication is needed to compute the size of the requested memory.
> 
> So turn a kzalloc()+explicit size computation into an equivalent kcalloc().
> 
> Signed-off-by: Christophe JAILLET 

LGTM

Acked-by: Christophe JAILLET 

Thanks
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |
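
For readers skimming the archive, a minimal sketch of the pattern such
conversions target; the names below are made up and this is not the pwm_bl
hunk itself:

#include <linux/slab.h>

struct pwm_level { unsigned int duty; };        /* made-up example type */

/* Open-coded size: the multiplication may overflow if the count is huge. */
static struct pwm_level *alloc_levels_open_coded(size_t num_levels)
{
        return kzalloc(sizeof(struct pwm_level) * num_levels, GFP_KERNEL);
}

/* kcalloc() does the same zeroed allocation but checks the multiplication. */
static struct pwm_level *alloc_levels_checked(size_t num_levels)
{
        return kcalloc(num_levels, sizeof(struct pwm_level), GFP_KERNEL);
}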




Re: [PATCH] drm/radeon: Avoid open coded arithmetic in memory allocation

2022-02-07 Thread Christian König

Am 05.02.22 um 18:38 schrieb Christophe JAILLET:

kmalloc_array()/kcalloc() should be used to avoid potential overflow when
a multiplication is needed to compute the size of the requested memory.

So turn a kzalloc()+explicit size computation into an equivalent kcalloc().

Signed-off-by: Christophe JAILLET 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/radeon/radeon_atombios.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_atombios.c 
b/drivers/gpu/drm/radeon/radeon_atombios.c
index 28c4413f4dc8..7b9cc7a9f42f 100644
--- a/drivers/gpu/drm/radeon/radeon_atombios.c
+++ b/drivers/gpu/drm/radeon/radeon_atombios.c
@@ -897,13 +897,13 @@ bool 
radeon_get_atom_connector_info_from_supported_devices_table(struct
union atom_supported_devices *supported_devices;
int i, j, max_device;
struct bios_connector *bios_connectors;
-   size_t bc_size = sizeof(*bios_connectors) * ATOM_MAX_SUPPORTED_DEVICE;
struct radeon_router router;
  
  	router.ddc_valid = false;

router.cd_valid = false;
  
-	bios_connectors = kzalloc(bc_size, GFP_KERNEL);

+   bios_connectors = kcalloc(ATOM_MAX_SUPPORTED_DEVICE,
+ sizeof(*bios_connectors), GFP_KERNEL);
if (!bios_connectors)
return false;
  




Re: [PATCH] [RFC] drm: mxsfb: Implement LCDIF scanout CRC32 support

2022-02-07 Thread Marek Vasut

On 2/7/22 06:13, Liu Ying wrote:

Hi Marek,


Hi,


On Sun, 2022-02-06 at 19:56 +0100, Marek Vasut wrote:

The LCDIF controller as present in i.MX6SX/i.MX8M Mini/Nano has a CRC_STAT
register, which contains CRC32 of the frame as it was clocked out of the
DPI interface of the LCDIF. This is likely meant as a functional safety
register.

Unfortunatelly, there is zero documentation on how the CRC32 is calculated,
there is no documentation of the polynomial, the init value, nor on which
data is the checksum applied.

By applying brute-force on 8 pixel / 2 line frame, which is the minimum
size LCDIF would work with, it turns out the polynomial is CRC32_POLY_LE
0xedb88320 , init value is 0x , the input data are bitrev32()
of the entire frame and the resulting CRC has to be also bitrev32()ed.
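
For illustration, a sketch of that check using the in-kernel helpers. Two
assumptions that are not in the archived mail: the init value (scrubbed above)
is taken as all-ones, and "bitrev32() of the entire frame" is read as
bit-reversing each 32-bit word of the scanout buffer:

#include <linux/bitrev.h>
#include <linux/crc32.h>

static u32 lcdif_sw_frame_crc(const u32 *fb, size_t nwords)
{
        u32 crc = 0xffffffff;           /* assumed init value */
        size_t i;

        for (i = 0; i < nwords; i++) {
                u32 word = bitrev32(fb[i]);

                /* crc32_le() uses CRC32_POLY_LE 0xedb88320 */
                crc = crc32_le(crc, (const u8 *)&word, sizeof(word));
        }

        return bitrev32(crc);           /* the result is bit-reversed, too */
}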


No idea how the HW calculates the CRC value.
I didn't hear anyone internal tried this feature.


It would be nice if the datasheet could be improved.

There are many blank areas which are undocumented, this LCDIF CRC32 
feature, i.MX8M Mini Arteris NOC at 0x3270 , the ARM GPV NIC-301 at 
0x32{0,1,2,3,4,5,6,8}0 and their master/slave port mapping. The NOC 
and NICs were documented at least up to i.MX6QP and then that 
information disappeared from NXP datasheets. I think reconfiguring the 
NOC/NIC QoS would help mitigate this shift issue described below (*).


Do you know if there is some additional NOC/NIC documentation for i.MX8M 
Mini available ?



Doing this calculation in software for each frame is unrealistic due to
the CPU demand, implement at least a sysfs attribute which permits testing
the current frame on demand.


Why not using the existing debugfs CRC support implemented
in drivers/gpu/drm/drm_debugfs_crc.c?


I wasn't aware of that, thanks.
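
For context, the debugfs interface mentioned above is driven by two
drm_crtc_funcs hooks plus drm_crtc_add_crc_entry(); userspace selects a source
through .../crtc-0/crc/control and reads frames from crc/data. A rough sketch
with hypothetical lcdif_* names (the eventual mxsfb wiring may differ):

#include <linux/string.h>
#include <drm/drm_crtc.h>
#include <drm/drm_vblank.h>

static int lcdif_verify_crc_source(struct drm_crtc *crtc,
                                   const char *source, size_t *values_cnt)
{
        if (source && strcmp(source, "auto"))
                return -EINVAL;

        *values_cnt = 1;
        return 0;
}

static int lcdif_set_crc_source(struct drm_crtc *crtc, const char *source)
{
        /* enable or disable CRC_STAT capture in hardware here */
        return 0;
}

/* From the frame-done handler, hand the hardware CRC to the DRM core. */
static void lcdif_report_crc(struct drm_crtc *crtc, u32 crc)
{
        drm_crtc_add_crc_entry(crtc, true,
                               drm_crtc_accurate_vblank_count(crtc), &crc);
}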


Unfortunatelly, this functionality has another problem. On all of those SoCs,
it is possible to overload interconnect e.g. by concurrent USB and uSDHC
transfers, at which point the LCDIF LFIFO suffers an UNDERFLOW condition,
which results in the image being shifted to the right by exactly LFIFO size
pixels. On i.MX8M Mini, the LFIFO is 76x256 bits = 2432 Byte ~= 810 pixel
at 24bpp. In this case, the LCDIF does not assert UNDERFLOW_IRQ bit, the
frame CRC32 indicated in CRC_STAT register matches the CRC32 of the frame
in DRAM, the RECOVER_ON_UNDERFLOW bit has no effect, so if this mode of
failure occurs, the failure gets undetected and uncorrected.


Hmmm, interesting, no UNDERFLOW_IRQ bit asserted when LCDIF suffers an
UNDERFLOW condition?


Yes


Are you sure LCDIF really underflows?


Mostly sure.

This problem occurs also on i.MX6SX which has no DSIM.

The failure is triggered by many short writes into DRAM to different 
addresses (I was successful at triggering it by using i.MX8M Mini with 
ASIX 88772 USB ethernet adapter, running iperf3 on the device, iperf3 -c 
... -t 0 -R -P 16 on the PC). This effectively makes the CI HDRC behave 
as a DRAM thrashing AXI master, since it triggers a lot of short USB qTD 
READs from DRAM and a lot of short ethernet packet WRITEs to DRAM. And 
that either clogs the DRAM itself, or the NOC or DISPLAY/HSIO NIC-301,
and starves the LCDIF of data for long enough that this underflow
condition happens, the LFIFO underflows, and the shift appears. The
shift does not disappear by itself; it just stays there
until the LCDIF is reinitialized.


And it apparently also happens on iMXRT, where a suggestion was made to 
tweak the QoS settings of the interconnect (which cannot be tested on 
i.MX8M Mini, because neither of that documentation is available, see 
above (*)):

https://community.nxp.com/t5/i-MX-RT/iMXRT1052-LCD-Screen-shifted/td-p/1069978


If the shifted image is seen on a MIPI DSI display, could that be a
MIPI DSI or DPHY issue, like wrong horizontal parameter(s)?


No, it happens also on i.MX6SX without DSIM, so this is an LCDIF problem.


Re: [PATCH v3] drm/bridge: dw-hdmi: use safe format when first in bridge chain

2022-02-07 Thread Neil Armstrong

Hi Sam,

On 06/02/2022 22:33, Sam Ravnborg wrote:

Hi Neail,

On Fri, Feb 04, 2022 at 03:33:37PM +0100, Neil Armstrong wrote:

When the dw-hdmi bridge is in first place of the bridge chain, this
means there is no way to select an input format of the dw-hdmi HW
component.

Since introduction of display-connector, negotiation was broken since
the dw-hdmi negotiation code only worked when the dw-hdmi bridge was
in last position of the bridge chain or behind another bridge also
supporting input & output format negotiation.

Commit 7cd70656d128 ("drm/bridge: display-connector: implement bus fmts 
callbacks")
was introduced to make negotiation work again by making display-connector
act as a pass-through concerning input & output format negotiation.

But in the case where the dw-hdmi is single in the bridge chain, for
example on Renesas SoCs, with the display-connector bridge the dw-hdmi
is no more single, breaking output format.


I have forgotten all the details during my leave from drm, so I
may miss something obvious.
This fix looks like it papers over some general thingy with the
format negotiation.

We do not want to see the below in all display drivers, so
I assume the right fix is to stuff it in somewhere in the framework.


The main issue is that there is no rule about the encoder in a display driver
having a companion bridge to support format negotiation.

To solve this cleanly, the first bridge tied to an encoder should register
with some caps or flags.

For now very few bridges support negotiation, so the rules have yet to be
defined.

Since we are getting better support for DRM_BRIDGE_ATTACH_NO_CONNECTOR, which
clarifies the bridge chain, we should have more cards in our hand in the near
future.

Anyway, in the meantime there is no fix in the framework for this case.

Neil



Or do I miss something?

Sam




Reported-by: Biju Das 
Bisected-by: Kieran Bingham 
Tested-by: Kieran Bingham 
Fixes: 7cd70656d128 ("drm/bridge: display-connector: implement bus fmts 
callbacks").
Signed-off-by: Neil Armstrong 
Reviewed-by: Robert Foss 
---
Changes since v2:
- Add rob's r-b
- Fix invalid Fixes commit hash

Changes since v1:
- Remove bad fix in dw_hdmi_bridge_atomic_get_input_bus_fmts
- Fix typos in commit message

  drivers/gpu/drm/bridge/synopsys/dw-hdmi.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c 
b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
index 54d8fdad395f..97cdc61b57f6 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
@@ -2551,8 +2551,9 @@ static u32 
*dw_hdmi_bridge_atomic_get_output_bus_fmts(struct drm_bridge *bridge,
if (!output_fmts)
return NULL;
  
-	/* If dw-hdmi is the only bridge, avoid negociating with ourselves */

-   if (list_is_singular(&bridge->encoder->bridge_chain)) {
+   /* If dw-hdmi is the first or only bridge, avoid negociating with 
ourselves */
+   if (list_is_singular(&bridge->encoder->bridge_chain) ||
+   list_is_first(&bridge->chain_node, &bridge->encoder->bridge_chain)) 
{
*num_output_fmts = 1;
output_fmts[0] = MEDIA_BUS_FMT_FIXED;
  
--

2.25.1




Re: [PATCH 03/19] iosys-map: Add a few more helpers

2022-02-07 Thread Thomas Zimmermann

Hi

Am 04.02.22 um 20:44 schrieb Lucas De Marchi:
[...]

I only came up with such a macro after doing the rest of the patches and
noticing a pattern that is hard to debug otherwise. I expanded the
explanation in the doc above this macro.

Maybe something like:

#define IOSYS_MAP_INIT_OFFSET(map_, offset_) ({    \
 struct iosys_map copy = *(map_);    \
 iosys_map_incr(&copy, offset_);    \
 copy;    \
})

Hopefully the compiler elides the additional copy, but I need to check.


I would accept this implementation of the macro. Don't worry about 
possible extra copies.
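
A usage sketch of the proposed macro; apart from the iosys_map helpers, every
name below is made up. The point is handing a sub-range of a mapping to code
that never sees the outer layout:

#include <linux/types.h>
#include <linux/iosys-map.h>

struct ring_regs { u32 head; u32 tail; };       /* made-up inner layout */

/* This helper only knows the inner struct, not the outer buffer. */
static void ring_reset(struct iosys_map *ring)
{
        iosys_map_memset(ring, 0, 0, sizeof(struct ring_regs));
}

static void ring_setup(struct iosys_map *outer, size_t ring_offset)
{
        struct iosys_map ring = IOSYS_MAP_INIT_OFFSET(outer, ring_offset);

        ring_reset(&ring);
}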







However, you won't need the offset'ed iosys_map because the 
memcpy_to/from helpers now have the offset parameter.


I can't see how the offset would help. The idea is to use a shallow copy
of the map so another function or even compilation unit can be
designated to read/write part of the struct overlayed in the map... not
even have knowledge of the outer struct.


I totally see your point. I still don't think it's something you should 
do. These functions don't operate on data types, but on raw memory that 
has to be unpacked into memory that has a data type assigned. Types are 
concepts of C, the I/O memory only knows reads and writes of different 
sizes.









+
 /**
  * iosys_map_set_vaddr_iomem - Sets a iosys mapping structure to an 
address in I/O memory

  * @map:    The iosys_map structure
@@ -220,7 +260,7 @@ static inline void iosys_map_clear(struct 
iosys_map *map)

 }
 /**
- * iosys_map_memcpy_to_offset - Memcpy into offset of iosys_map
+ * iosys_map_memcpy_to - Memcpy into iosys_map


That's the fix for the other patch. :)


yep :-/




  * @dst:    The iosys_map structure
  * @dst_offset:    The offset from which to copy
  * @src:    The source buffer
@@ -239,6 +279,26 @@ static inline void iosys_map_memcpy_to(struct 
iosys_map *dst, size_t dst_offset,

 memcpy(dst->vaddr + dst_offset, src, len);
 }
+/**
+ * iosys_map_memcpy_from - Memcpy from iosys_map into system memory
+ * @dst:    Destination in system memory
+ * @src:    The iosys_map structure
+ * @src_offset:    The offset from which to copy
+ * @len:    The number of byte in src
+ *
+ * Copies data from a iosys_map with an offset. The dest buffer is in
+ * system memory. Depending on the mapping location, the helper 
picks the

+ * correct method of accessing the memory.
+ */
+static inline void iosys_map_memcpy_from(void *dst, const struct 
iosys_map *src,

+ size_t src_offset, size_t len)
+{
+    if (src->is_iomem)
+    memcpy_fromio(dst, src->vaddr_iomem + src_offset, len);
+    else
+    memcpy(dst, src->vaddr + src_offset, len);
+}
+
 /**
  * iosys_map_incr - Increments the address stored in a iosys mapping
  * @map:    The iosys_map structure
@@ -255,4 +315,96 @@ static inline void iosys_map_incr(struct 
iosys_map *map, size_t incr)

 map->vaddr += incr;
 }
+/**
+ * iosys_map_memset - Memset iosys_map
+ * @dst:    The iosys_map structure
+ * @offset:    Offset from dst where to start setting value
+ * @value:    The value to set
+ * @len:    The number of bytes to set in dst
+ *
+ * Set value in iosys_map. Depending on the buffer's location, the 
helper

+ * picks the correct method of accessing the memory.
+ */
+static inline void iosys_map_memset(struct iosys_map *dst, size_t 
offset,

+    int value, size_t len)
+{
+    if (dst->is_iomem)
+    memset_io(dst->vaddr_iomem + offset, value, len);
+    else
+    memset(dst->vaddr + offset, value, len);
+}


I've found that memset32() and memset64() can be significantly faster. If
ever needed, we can add variants here as well.



+
+/**
+ * iosys_map_rd - Read a C-type value from the iosys_map
+ *
+ * @map__:    The iosys_map structure
+ * @offset__:    The offset from which to read
+ * @type__:    Type of the value being read
+ *
+ * Read a C type value from iosys_map, handling possible un-aligned 
accesses to

+ * the mapping.
+ *
+ * Returns:
+ * The value read from the mapping.
+ */
+#define iosys_map_rd(map__, offset__, type__) ({    \
+    type__ val;    \
+    iosys_map_memcpy_from(&val, map__, offset__, sizeof(val));    \
+    val;    \
+})
+
+/**
+ * iosys_map_wr - Write a C-type value to the iosys_map
+ *
+ * @map__:    The iosys_map structure
+ * @offset__:    The offset from the mapping to write to
+ * @type__:    Type of the value being written
+ * @val__:    Value to write
+ *
+ * Write a C-type value to the iosys_map, handling possible 
un-aligned accesses

+ * to the mapping.
+ */
+#define iosys_map_wr(map__, offset__, type__, val__) ({    \
+    type__ val = (val__);    \
+    iosys_map_memcpy_to(map__, offset__, &val, sizeof(val));    \
+})
+
+/**
+ * iosys_map_rd_field - Read a member from a struct in the iosys_map
+ *
+ * @map__:    The iosys_map structure
+ * @struct_type__:    The struc

[RFC PATCH 5/6] ARM: dts: imx6sll: add EPDC

2022-02-07 Thread Andreas Kemnade
The commercial variant has a controller for e-Paper displays.

Signed-off-by: Andreas Kemnade 
---
 arch/arm/boot/dts/imx6sll.dtsi | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/arm/boot/dts/imx6sll.dtsi b/arch/arm/boot/dts/imx6sll.dtsi
index d4a000c3dde7..042e8a391b2f 100644
--- a/arch/arm/boot/dts/imx6sll.dtsi
+++ b/arch/arm/boot/dts/imx6sll.dtsi
@@ -643,6 +643,15 @@ pxp: pxp@20f {
clock-names = "axi";
};
 
+   epdc: epdc@20f4000 {
+   compatible = "fsl,imx6sll-epdc";
+   reg = <0x020f4000 0x4000>;
+   interrupts = ;
+   clocks = <&clks IMX6SLL_CLK_EPDC_AXI>, <&clks 
IMX6SLL_CLK_EPDC_PIX>;
+   clock-names = "axi", "pix";
+   status = "disabled";
+   };
+
lcdif: lcd-controller@20f8000 {
compatible = "fsl,imx6sll-lcdif", 
"fsl,imx28-lcdif";
reg = <0x020f8000 0x4000>;
-- 
2.30.2



[RFC PATCH 4/6] drm: mxc-epdc: Add update management

2022-02-07 Thread Andreas Kemnade
The EPDC can process several dirty rectangles at a time, so pick them up and
forward them to the controller. Only update processing that does not involve
the PXP is supported at the moment. Due to that, and to work with more
waveforms, there is some masking/shifting done. It was tested with the factory
waveforms of the Kobo Clara HD, Tolino Shine 3, and Tolino Shine 2HD.
Also the waveform called epdc_E060SCM.fw from NXP BSP works with the
i.MX6SL devices.

Signed-off-by: Andreas Kemnade 
---
 drivers/gpu/drm/mxc-epdc/Makefile   |2 +-
 drivers/gpu/drm/mxc-epdc/epdc_hw.c  |2 +
 drivers/gpu/drm/mxc-epdc/epdc_update.c  | 1210 +++
 drivers/gpu/drm/mxc-epdc/epdc_update.h  |9 +
 drivers/gpu/drm/mxc-epdc/mxc_epdc.h |   50 +
 drivers/gpu/drm/mxc-epdc/mxc_epdc_drv.c |   33 +-
 6 files changed, 1304 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_update.c
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_update.h

diff --git a/drivers/gpu/drm/mxc-epdc/Makefile 
b/drivers/gpu/drm/mxc-epdc/Makefile
index 0263ef2bf0db..a55e2bfe824a 100644
--- a/drivers/gpu/drm/mxc-epdc/Makefile
+++ b/drivers/gpu/drm/mxc-epdc/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-mxc_epdc_drm-y := mxc_epdc_drv.o epdc_hw.o epdc_waveform.o
+mxc_epdc_drm-y := mxc_epdc_drv.o epdc_hw.o epdc_update.o epdc_waveform.o
 
 obj-$(CONFIG_DRM_MXC_EPDC) += mxc_epdc_drm.o
 
diff --git a/drivers/gpu/drm/mxc-epdc/epdc_hw.c 
b/drivers/gpu/drm/mxc-epdc/epdc_hw.c
index a74cbd237e0d..22a065ac6992 100644
--- a/drivers/gpu/drm/mxc-epdc/epdc_hw.c
+++ b/drivers/gpu/drm/mxc-epdc/epdc_hw.c
@@ -20,6 +20,7 @@
 #include "mxc_epdc.h"
 #include "epdc_regs.h"
 #include "epdc_hw.h"
+#include "epdc_update.h"
 #include "epdc_waveform.h"
 
 void mxc_epdc_powerup(struct mxc_epdc *priv)
@@ -410,6 +411,7 @@ void mxc_epdc_init_sequence(struct mxc_epdc *priv, struct 
drm_display_mode *m)
 
priv->in_init = true;
mxc_epdc_powerup(priv);
+   mxc_epdc_draw_mode0(priv);
/* Force power down event */
priv->powering_down = true;
mxc_epdc_powerdown(priv);
diff --git a/drivers/gpu/drm/mxc-epdc/epdc_update.c 
b/drivers/gpu/drm/mxc-epdc/epdc_update.c
new file mode 100644
index ..0c061982aa0b
--- /dev/null
+++ b/drivers/gpu/drm/mxc-epdc/epdc_update.c
@@ -0,0 +1,1210 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright (C) 2022 Andreas Kemnade
+//
+/*
+ * based on the EPDC framebuffer driver
+ * Copyright (C) 2010-2016 Freescale Semiconductor, Inc.
+ * Copyright 2017 NXP
+ */
+
+#include 
+#include 
+#include 
+#include "mxc_epdc.h"
+#include "epdc_hw.h"
+#include "epdc_regs.h"
+#include "epdc_waveform.h"
+
+#define EPDC_V2_NUM_LUTS   64
+#define EPDC_V2_MAX_NUM_UPDATES 64
+#define INVALID_LUT (-1)
+#define DRY_RUN_NO_LUT   100
+
+#define MERGE_OK   0
+#define MERGE_FAIL  1
+#define MERGE_BLOCK 2
+
+struct update_desc_list {
+   struct list_head list;
+   struct mxcfb_update_data upd_data;/* Update parameters */
+   u32 update_order;   /* Numeric ordering value for update */
+};
+
+/* This structure represents a list node containing both
+ * a memory region allocated as an output buffer for the PxP
+ * update processing task, and the update description (mode, region, etc.)
+ */
+struct update_data_list {
+   struct list_head list;
+   struct update_desc_list *update_desc;
+   int lut_num;/* Assigned before update is processed into 
working buffer */
+   u64 collision_mask; /* Set when update creates collision */
+   /* Mask of the LUTs the update collides with */
+};
+
+/*
+ * Start Low-Level EPDC Functions
+ */
+
+static inline void epdc_lut_complete_intr(struct mxc_epdc *priv, u32 lut_num, 
bool enable)
+{
+   if (enable) {
+   if (lut_num < 32)
+   epdc_write(priv, EPDC_IRQ_MASK1_SET, BIT(lut_num));
+   else
+   epdc_write(priv, EPDC_IRQ_MASK2_SET, BIT(lut_num - 32));
+   } else {
+   if (lut_num < 32)
+   epdc_write(priv, EPDC_IRQ_MASK1_CLEAR, BIT(lut_num));
+   else
+   epdc_write(priv, EPDC_IRQ_MASK2_CLEAR, BIT(lut_num - 
32));
+   }
+}
+
+static inline void epdc_working_buf_intr(struct mxc_epdc *priv, bool enable)
+{
+   if (enable)
+   epdc_write(priv, EPDC_IRQ_MASK_SET, EPDC_IRQ_WB_CMPLT_IRQ);
+   else
+   epdc_write(priv, EPDC_IRQ_MASK_CLEAR, EPDC_IRQ_WB_CMPLT_IRQ);
+}
+
+static inline void epdc_clear_working_buf_irq(struct mxc_epdc *priv)
+{
+   epdc_write(priv, EPDC_IRQ_CLEAR,
+  EPDC_IRQ_WB_CMPLT_IRQ | EPDC_IRQ_LUT_COL_IRQ);
+}
+
+static inline void epdc_eof_intr(struct mxc_epdc *priv, bool enable)
+{
+   if (enable)
+   epdc_write(priv, EPDC_IRQ_MASK_SET, EPDC

Re: [syzbot] WARNING in __dma_map_sg_attrs

2022-02-07 Thread syzbot
syzbot has found a reproducer for the following issue on:

HEAD commit:0457e5153e0e Merge tag 'for-linus' of git://git.kernel.org..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11b2637c70
kernel config:  https://syzkaller.appspot.com/x/.config?x=6f043113811433a5
dashboard link: https://syzkaller.appspot.com/bug?extid=10e27961f4da37c443b2
compiler:   gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for 
Debian) 2.35.2
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=11c6554270
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1163f48070

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+10e27961f4da37c44...@syzkaller.appspotmail.com

[ cut here ]
WARNING: CPU: 1 PID: 3595 at kernel/dma/mapping.c:188 
__dma_map_sg_attrs+0x181/0x1f0 kernel/dma/mapping.c:188
Modules linked in:
CPU: 0 PID: 3595 Comm: syz-executor249 Not tainted 
5.17.0-rc2-syzkaller-00316-g0457e5153e0e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
RIP: 0010:__dma_map_sg_attrs+0x181/0x1f0 kernel/dma/mapping.c:188
Code: 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 75 71 4c 8b 3d c0 83 b5 
0d e9 db fe ff ff e8 b6 0f 13 00 0f 0b e8 af 0f 13 00 <0f> 0b 45 31 e4 e9 54 ff 
ff ff e8 a0 0f 13 00 49 8d 7f 50 48 b8 00
RSP: 0018:c90002a07d68 EFLAGS: 00010293
RAX:  RBX:  RCX: 
RDX: 88807e25e2c0 RSI: 81649e91 RDI: 88801b848408
RBP: 88801b848000 R08: 0002 R09: 88801d86c74f
R10: 81649d72 R11: 0001 R12: 0002
R13: 88801d86c680 R14: 0001 R15: 
FS:  56e30300() GS:8880b9d0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 20cc CR3: 1d74a000 CR4: 003506e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 
 dma_map_sgtable+0x70/0xf0 kernel/dma/mapping.c:264
 get_sg_table.isra.0+0xe0/0x160 drivers/dma-buf/udmabuf.c:72
 begin_cpu_udmabuf+0x130/0x1d0 drivers/dma-buf/udmabuf.c:126
 dma_buf_begin_cpu_access+0xfd/0x1d0 drivers/dma-buf/dma-buf.c:1164
 dma_buf_ioctl+0x259/0x2b0 drivers/dma-buf/dma-buf.c:363
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:874 [inline]
 __se_sys_ioctl fs/ioctl.c:860 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f62fcf530f9
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 
c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:7ffe3edab9b8 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX:  RCX: 7f62fcf530f9
RDX: 2200 RSI: 40086200 RDI: 0006
RBP: 7f62fcf170e0 R08:  R09: 
R10:  R11: 0246 R12: 7f62fcf17170
R13:  R14:  R15: 
 



[RFC PATCH 0/6] drm: EPDC driver for i.MX6

2022-02-07 Thread Andreas Kemnade
Add a driver for the Electrophoretic Display Controller found in the
i.MX6 SoCs.

In combination with a driver for an EPD PMIC (like the TPS65185 or the
SY7636A), it works with the EPDC found in i.MX6SLL based devices and the
EPDC found in i.MX6SL devices.

Support for waveforms might be limited: no 4-bit waveform was found that
works with the 6SLL, but the driver works with the vendor waveforms of the
Kobo Clara HD (6SLL) and the Tolino Shine 2/3 (6SL).
On the 6SL devices the epdc_E060SCM.fw also works, but not as brilliantly
as the vendor one.

It does not involve the PXP yet. The NXP/Freescale kernel fork uses that
for rotation and mysterious waveform handling. That is not planned to be
upstreamed in the first step.

Also it does not provide any special userspace API to fine-tune updates.
That is also IMHO something for a second step.

Andreas Kemnade (6):
  dt-bindings: display: imx: Add EPDC
  drm: Add skeleton for EPDC driver
  drm: mxc-epdc: Add display and waveform initialisation
  drm: mxc-epdc: Add update management
  ARM: dts: imx6sll: add EPDC
  arm: dts: imx6sl: Add EPDC

 .../bindings/display/imx/fsl,mxc-epdc.yaml|  159 +++
 arch/arm/boot/dts/imx6sl.dtsi |3 +
 arch/arm/boot/dts/imx6sll.dtsi|9 +
 drivers/gpu/drm/Kconfig   |2 +
 drivers/gpu/drm/Makefile  |1 +
 drivers/gpu/drm/mxc-epdc/Kconfig  |   15 +
 drivers/gpu/drm/mxc-epdc/Makefile |5 +
 drivers/gpu/drm/mxc-epdc/epdc_hw.c|  497 +++
 drivers/gpu/drm/mxc-epdc/epdc_hw.h|8 +
 drivers/gpu/drm/mxc-epdc/epdc_regs.h  |  442 ++
 drivers/gpu/drm/mxc-epdc/epdc_update.c| 1210 +
 drivers/gpu/drm/mxc-epdc/epdc_update.h|9 +
 drivers/gpu/drm/mxc-epdc/epdc_waveform.c  |  189 +++
 drivers/gpu/drm/mxc-epdc/epdc_waveform.h  |7 +
 drivers/gpu/drm/mxc-epdc/mxc_epdc.h   |  151 ++
 drivers/gpu/drm/mxc-epdc/mxc_epdc_drv.c   |  373 +
 16 files changed, 3080 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/display/imx/fsl,mxc-epdc.yaml
 create mode 100644 drivers/gpu/drm/mxc-epdc/Kconfig
 create mode 100644 drivers/gpu/drm/mxc-epdc/Makefile
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_hw.c
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_hw.h
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_regs.h
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_update.c
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_update.h
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_waveform.c
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_waveform.h
 create mode 100644 drivers/gpu/drm/mxc-epdc/mxc_epdc.h
 create mode 100644 drivers/gpu/drm/mxc-epdc/mxc_epdc_drv.c

-- 
2.30.2



[RFC PATCH 2/6] drm: Add skeleton for EPDC driver

2022-02-07 Thread Andreas Kemnade
This driver is for the EPD controller in the i.MX SoCs. Add a skeleton
and basic infrastructure for the driver.

Signed-off-by: Andreas Kemnade 
---
 drivers/gpu/drm/Kconfig |   2 +
 drivers/gpu/drm/Makefile|   1 +
 drivers/gpu/drm/mxc-epdc/Kconfig|  15 +
 drivers/gpu/drm/mxc-epdc/Makefile   |   5 +
 drivers/gpu/drm/mxc-epdc/epdc_regs.h| 442 
 drivers/gpu/drm/mxc-epdc/mxc_epdc.h |  20 ++
 drivers/gpu/drm/mxc-epdc/mxc_epdc_drv.c | 248 +
 7 files changed, 733 insertions(+)
 create mode 100644 drivers/gpu/drm/mxc-epdc/Kconfig
 create mode 100644 drivers/gpu/drm/mxc-epdc/Makefile
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_regs.h
 create mode 100644 drivers/gpu/drm/mxc-epdc/mxc_epdc.h
 create mode 100644 drivers/gpu/drm/mxc-epdc/mxc_epdc_drv.c

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index b1f22e457fd0..6b6b44ff7556 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -390,6 +390,8 @@ source "drivers/gpu/drm/gud/Kconfig"
 
 source "drivers/gpu/drm/sprd/Kconfig"
 
+source "drivers/gpu/drm/mxc-epdc/Kconfig"
+
 config DRM_HYPERV
tristate "DRM Support for Hyper-V synthetic video device"
depends on DRM && PCI && MMU && HYPERV
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 301a44dc18e3..e5eb9815cf9a 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -131,6 +131,7 @@ obj-$(CONFIG_DRM_PANFROST) += panfrost/
 obj-$(CONFIG_DRM_ASPEED_GFX) += aspeed/
 obj-$(CONFIG_DRM_MCDE) += mcde/
 obj-$(CONFIG_DRM_TIDSS) += tidss/
+obj-$(CONFIG_DRM_MXC_EPDC) += mxc-epdc/
 obj-y  += xlnx/
 obj-y  += gud/
 obj-$(CONFIG_DRM_HYPERV) += hyperv/
diff --git a/drivers/gpu/drm/mxc-epdc/Kconfig b/drivers/gpu/drm/mxc-epdc/Kconfig
new file mode 100644
index ..3f5744161cff
--- /dev/null
+++ b/drivers/gpu/drm/mxc-epdc/Kconfig
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: GPL-2.0
+config DRM_MXC_EPDC
+   tristate "i.MX EPD Controller"
+   depends on DRM && OF
+   depends on (COMPILE_TEST || ARCH_MXC)
+   select DRM_KMS_HELPER
+   select DRM_KMS_CMA_HELPER
+   select DMA_CMA if HAVE_DMA_CONTIGUOUS
+   select CMA if HAVE_DMA_CONTIGUOUS
+   help
+ Choose this option if you have an i.MX system with an EPDC.
+ It enables the usage of E-paper displays. A waveform is expected
+ to be present in /lib/firmware/imx/epdc/epdc.fw
+
+ If M is selected this module will be called mxc_epdc_drm.
diff --git a/drivers/gpu/drm/mxc-epdc/Makefile 
b/drivers/gpu/drm/mxc-epdc/Makefile
new file mode 100644
index ..a47ced72b7f6
--- /dev/null
+++ b/drivers/gpu/drm/mxc-epdc/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+mxc_epdc_drm-y := mxc_epdc_drv.o
+
+obj-$(CONFIG_DRM_MXC_EPDC) += mxc_epdc_drm.o
+
diff --git a/drivers/gpu/drm/mxc-epdc/epdc_regs.h 
b/drivers/gpu/drm/mxc-epdc/epdc_regs.h
new file mode 100644
index ..83445c56d911
--- /dev/null
+++ b/drivers/gpu/drm/mxc-epdc/epdc_regs.h
@@ -0,0 +1,442 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/* Copyright (C) 2010-2013 Freescale Semiconductor, Inc. */
+
+#ifndef __EPDC_REGS_INCLUDED__
+#define __EPDC_REGS_INCLUDED__
+
+/*
+ * Register addresses
+ */
+
+#define EPDC_CTRL  0x000
+#define EPDC_CTRL_SET  0x004
+#define EPDC_CTRL_CLEAR0x008
+#define EPDC_CTRL_TOGGLE   0x00C
+#define EPDC_WB_ADDR_TCE_V30x010
+#define EPDC_WVADDR0x020
+#define EPDC_WB_ADDR   0x030
+#define EPDC_RES   0x040
+#define EPDC_FORMAT0x050
+#define EPDC_FORMAT_SET0x054
+#define EPDC_FORMAT_CLEAR  0x058
+#define EPDC_FORMAT_TOGGLE 0x05C
+#define EPDC_WB_FIELD0 0x060
+#define EPDC_WB_FIELD0_SET 0x064
+#define EPDC_WB_FIELD0_CLEAR   0x068
+#define EPDC_WB_FIELD0_TOGGLE  0x06C
+#define EPDC_WB_FIELD1 0x070
+#define EPDC_WB_FIELD1_SET 0x074
+#define EPDC_WB_FIELD1_CLEAR   0x078
+#define EPDC_WB_FIELD1_TOGGLE  0x07C
+#define EPDC_WB_FIELD2 0x080
+#define EPDC_WB_FIELD2_SET 0x084
+#define EPDC_WB_FIELD2_CLEAR   0x088
+#define EPDC_WB_FIELD2_TOGGLE  0x08C
+#define EPDC_WB_FIELD3 0x090
+#define EPDC_WB_FIELD3_SET 0x094
+#define EPDC_WB_FIELD3_CLEAR   0x098
+#define EPDC_WB_FIELD3_TOGGLE  0x09C
+#define EPDC_FIFOCTRL  0x0A0
+#define EPDC_FIFOCTRL_SET  0x0A4
+#define EPDC_FIFOCTRL_CLEAR0x0A8
+#define EPDC_FIFOCTRL_TOGGLE   0x0AC
+#define EPDC_UPD_ADDR  0x100
+#define EPDC_UPD_STRIDE0x110
+#define EPDC_UPD_

Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier

2022-02-07 Thread Janne Grunau
On 2021-09-15 17:19:39 +0200, Thierry Reding wrote:
> On Tue, Sep 07, 2021 at 07:44:44PM +0200, Thierry Reding wrote:
> > On Tue, Sep 07, 2021 at 10:33:24AM -0500, Rob Herring wrote:
> > > On Fri, Sep 3, 2021 at 10:36 AM Thierry Reding  
> > > wrote:
> > > >
> > > > On Fri, Sep 03, 2021 at 09:36:33AM -0500, Rob Herring wrote:
> > > > > On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding 
> > > > >  wrote:
> > > > > >
> > > > > > On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > > > > > >
> > > > > > > Couldn't we keep this all in /reserved-memory? Just add an iova
> > > > > > > version of reg. Perhaps abuse 'assigned-address' for this 
> > > > > > > purpose. The
> > > > > > > issue I see would be handling reserved iova areas without a 
> > > > > > > physical
> > > > > > > area. That can be handled with just a iova and no reg. We already 
> > > > > > > have
> > > > > > > a no reg case.
> > > > > >
> > > > > > I had thought about that initially. One thing I'm worried about is 
> > > > > > that
> > > > > > every child node in /reserved-memory will effectively cause the 
> > > > > > memory
> > > > > > that it described to be reserved. But we don't want that for regions
> > > > > > that are "virtual only" (i.e. IOMMU reservations).
> > > > >
> > > > > By virtual only, you mean no physical mapping, just a region of
> > > > > virtual space, right? For that we'd have no 'reg' and therefore no
> > > > > (physical) reservation by the OS. It's similar to non-static regions.
> > > > > You need a specific handler for them. We'd probably want a compatible
> > > > > as well for these virtual reservations.
> > > >
> > > > Yeah, these would be purely used for reserving regions in the IOVA so
> > > > that they won't be used by the IOVA allocator. Typically these would be
> > > > used for cases where those addresses have some special meaning.
> > > >
> > > > Do we want something like:
> > > >
> > > > compatible = "iommu-reserved";
> > > >
> > > > for these? Or would that need to be:
> > > >
> > > > compatible = "linux,iommu-reserved";
> > > >
> > > > ? There seems to be a mix of vendor-prefix vs. non-vendor-prefix
> > > > compatible strings in the reserved-memory DT bindings directory.
> > > 
> > > I would not use 'linux,' here.
> > > 
> > > >
> > > > On the other hand, do we actually need the compatible string? Because we
> > > > don't really want to associate much extra information with this like we
> > > > do for example with "shared-dma-pool". The logic to handle this would
> > > > all be within the IOMMU framework. All we really need is for the
> > > > standard reservation code to skip nodes that don't have a reg property
> > > > so we don't reserve memory for "virtual-only" allocations.
> > > 
> > > It doesn't hurt to have one and I can imagine we might want to iterate
> > > over all the nodes. It's slightly easier and more common to iterate
> > > over compatible nodes rather than nodes with some property.
> > > 
> > > > > Are these being global in DT going to be a problem? Presumably we have
> > > > > a virtual space per IOMMU. We'd know which IOMMU based on a device's
> > > > > 'iommus' and 'memory-region' properties, but within /reserved-memory
> > > > > we wouldn't be able to distinguish overlapping addresses from separate
> > > > > address spaces. Or we could have 2 different IOVAs for 1 physical
> > > > > space. That could be solved with something like this:
> > > > >
> > > > > iommu-addresses = <&iommu1  >;
> > > >
> > > > The only case that would be problematic would be if we have overlapping
> > > > physical regions, because that will probably trip up the standard code.
> > > >
> > > > But this could also be worked around by looking at iommu-addresses. For
> > > > example, if we had something like this:
> > > >
> > > > reserved-memory {
> > > > fb_dc0: fb@8000 {
> > > > reg = <0x8000 0x0100>;
> > > > iommu-addresses = <0xa000 0x0100>;
> > > > };
> > > >
> > > > fb_dc1: fb@8000 {
> > > 
> > > You can't have 2 nodes with the same name (actually, you can, they
> > > just get merged together). Different names with the same unit-address
> > > is a dtc warning. I'd really like to make that a full blown
> > > overlapping region check.
> > 
> > Right... so this would be a lot easier to deal with using that earlier
> > proposal where the IOMMU regions were a separate thing and referencing
> > the reserved-memory nodes. In those cases we could just have the
> > physical reservation for the framebuffer once (so we don't get any
> > duplicates or overlaps) and then have each IOVA reservation reference
> > that to create the mapping.
> > 
> > > 
> > > > reg = <0x8000 0x0100>;
> > > > iommu-addresses = <0xb000 0x0100>;
> > > > };
> > > > };
> > > >
> > > > We could make the code identify

[RFC PATCH 6/6] arm: dts: imx6sl: Add EPDC

2022-02-07 Thread Andreas Kemnade
Extend definition of EPDC.

Signed-off-by: Andreas Kemnade 
---
 arch/arm/boot/dts/imx6sl.dtsi | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm/boot/dts/imx6sl.dtsi b/arch/arm/boot/dts/imx6sl.dtsi
index c7d907c5c352..919e86e4fc74 100644
--- a/arch/arm/boot/dts/imx6sl.dtsi
+++ b/arch/arm/boot/dts/imx6sl.dtsi
@@ -765,8 +765,11 @@ pxp: pxp@20f {
};
 
epdc: epdc@20f4000 {
+   compatible = "fsl,imx6sl-epdc";
reg = <0x020f4000 0x4000>;
interrupts = <0 97 IRQ_TYPE_LEVEL_HIGH>;
+   clocks = <&clks IMX6SL_CLK_EPDC_AXI>, <&clks 
IMX6SL_CLK_EPDC_PIX>;
+   clock-names = "axi", "pix";
};
 
lcdif: lcdif@20f8000 {
-- 
2.30.2



[RFC PATCH 3/6] drm: mxc-epdc: Add display and waveform initialisation

2022-02-07 Thread Andreas Kemnade
Adds display parameter initialisation, display power up/down and
waveform loading

Signed-off-by: Andreas Kemnade 
---
 drivers/gpu/drm/mxc-epdc/Makefile|   2 +-
 drivers/gpu/drm/mxc-epdc/epdc_hw.c   | 495 +++
 drivers/gpu/drm/mxc-epdc/epdc_hw.h   |   8 +
 drivers/gpu/drm/mxc-epdc/epdc_waveform.c | 189 +
 drivers/gpu/drm/mxc-epdc/epdc_waveform.h |   7 +
 drivers/gpu/drm/mxc-epdc/mxc_epdc.h  |  81 
 drivers/gpu/drm/mxc-epdc/mxc_epdc_drv.c  |  94 +
 7 files changed, 875 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_hw.c
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_hw.h
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_waveform.c
 create mode 100644 drivers/gpu/drm/mxc-epdc/epdc_waveform.h

diff --git a/drivers/gpu/drm/mxc-epdc/Makefile 
b/drivers/gpu/drm/mxc-epdc/Makefile
index a47ced72b7f6..0263ef2bf0db 100644
--- a/drivers/gpu/drm/mxc-epdc/Makefile
+++ b/drivers/gpu/drm/mxc-epdc/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-mxc_epdc_drm-y := mxc_epdc_drv.o
+mxc_epdc_drm-y := mxc_epdc_drv.o epdc_hw.o epdc_waveform.o
 
 obj-$(CONFIG_DRM_MXC_EPDC) += mxc_epdc_drm.o
 
diff --git a/drivers/gpu/drm/mxc-epdc/epdc_hw.c 
b/drivers/gpu/drm/mxc-epdc/epdc_hw.c
new file mode 100644
index ..a74cbd237e0d
--- /dev/null
+++ b/drivers/gpu/drm/mxc-epdc/epdc_hw.c
@@ -0,0 +1,495 @@
+// SPDX-License-Identifier: GPL-2.0+
+// Copyright (C) 2022 Andreas Kemnade
+//
+/*
+ * based on the EPDC framebuffer driver
+ * Copyright (C) 2010-2016 Freescale Semiconductor, Inc.
+ * Copyright 2017 NXP
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mxc_epdc.h"
+#include "epdc_regs.h"
+#include "epdc_hw.h"
+#include "epdc_waveform.h"
+
+void mxc_epdc_powerup(struct mxc_epdc *priv)
+{
+   int ret = 0;
+
+   mutex_lock(&priv->power_mutex);
+
+   /*
+* If power down request is pending, clear
+* powering_down to cancel the request.
+*/
+   if (priv->powering_down)
+   priv->powering_down = false;
+
+   if (priv->powered) {
+   mutex_unlock(&priv->power_mutex);
+   return;
+   }
+
+   dev_dbg(priv->drm.dev, "EPDC Powerup\n");
+
+   priv->updates_active = true;
+
+   /* Enable the v3p3 regulator */
+   ret = regulator_enable(priv->v3p3_regulator);
+   if (IS_ERR((void *)ret)) {
+   dev_err(priv->drm.dev,
+   "Unable to enable V3P3 regulator. err = 0x%x\n",
+   ret);
+   mutex_unlock(&priv->power_mutex);
+   return;
+   }
+
+   usleep_range(1000, 2000);
+
+   pm_runtime_get_sync(priv->drm.dev);
+
+   /* Enable clocks to EPDC */
+   clk_prepare_enable(priv->epdc_clk_axi);
+   clk_prepare_enable(priv->epdc_clk_pix);
+
+   epdc_write(priv, EPDC_CTRL_CLEAR, EPDC_CTRL_CLKGATE);
+
+   /* Enable power to the EPD panel */
+   ret = regulator_enable(priv->display_regulator);
+   if (IS_ERR((void *)ret)) {
+   dev_err(priv->drm.dev,
+   "Unable to enable DISPLAY regulator. err = 0x%x\n",
+   ret);
+   mutex_unlock(&priv->power_mutex);
+   return;
+   }
+
+   ret = regulator_enable(priv->vcom_regulator);
+   if (IS_ERR((void *)ret)) {
+   dev_err(priv->drm.dev,
+   "Unable to enable VCOM regulator. err = 0x%x\n",
+   ret);
+   mutex_unlock(&priv->power_mutex);
+   return;
+   }
+
+   priv->powered = true;
+
+   mutex_unlock(&priv->power_mutex);
+}
+
+void mxc_epdc_powerdown(struct mxc_epdc *priv)
+{
+   mutex_lock(&priv->power_mutex);
+
+   /* If powering_down has been cleared, a powerup
+* request is pre-empting this powerdown request.
+*/
+   if (!priv->powering_down
+   || (!priv->powered)) {
+   mutex_unlock(&priv->power_mutex);
+   return;
+   }
+
+   dev_dbg(priv->drm.dev, "EPDC Powerdown\n");
+
+   /* Disable power to the EPD panel */
+   regulator_disable(priv->vcom_regulator);
+   regulator_disable(priv->display_regulator);
+
+   /* Disable clocks to EPDC */
+   epdc_write(priv, EPDC_CTRL_SET, EPDC_CTRL_CLKGATE);
+   clk_disable_unprepare(priv->epdc_clk_pix);
+   clk_disable_unprepare(priv->epdc_clk_axi);
+
+   pm_runtime_put_sync_suspend(priv->drm.dev);
+
+   /* turn off the V3p3 */
+   regulator_disable(priv->v3p3_regulator);
+
+   priv->powered = false;
+   priv->powering_down = false;
+
+   if (priv->wait_for_powerdown) {
+   priv->wait_for_powerdown = false;
+   complete(&priv->powerdown_compl);
+   }
+
+   mutex_unlock(&priv->power_mutex);
+}
+
+static void epdc_set_horizontal_timing(struct mxc_epdc *priv, u32 horiz_

[RFC PATCH 1/6] dt-bindings: display: imx: Add EPDC

2022-02-07 Thread Andreas Kemnade
Add a binding for the Electrophoretic Display Controller found at least
in the i.MX6.
The timing subnode is placed directly here to avoid having display parameters
spread all over the place.

Supplies are organized the same way as in the fbdev driver in the
NXP/Freescale kernel forks. The regulators used for that purpose,
like the TPS65185, the SY7636A and the MAX17135, typically have a single bit
that starts a set of higher or negative voltage rails with well-defined
timing. VCOM can be handled separately, but can also be
incorporated into that single bit.

Signed-off-by: Andreas Kemnade 
---
 .../bindings/display/imx/fsl,mxc-epdc.yaml| 159 ++
 1 file changed, 159 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/display/imx/fsl,mxc-epdc.yaml

diff --git a/Documentation/devicetree/bindings/display/imx/fsl,mxc-epdc.yaml 
b/Documentation/devicetree/bindings/display/imx/fsl,mxc-epdc.yaml
new file mode 100644
index ..7e0795cc3f70
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/imx/fsl,mxc-epdc.yaml
@@ -0,0 +1,159 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/imx/fsl,mxc-epdc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale i.MX6 EPDC
+
+maintainers:
+  - Andreas Kemnade 
+
+description: |
+  The EPDC is a controller for handling electronic paper displays found in
+  i.MX6 SoCs.
+
+properties:
+  compatible:
+enum:
+  - fsl,imx6sl-epdc
+  - fsl,imx6sll-epdc
+
+  reg:
+maxItems: 1
+
+  clocks:
+items:
+  - description: Bus clock
+  - description: Pixel clock
+
+  clock-names:
+items:
+  - const: axi
+  - const: pix
+
+  interrupts:
+maxItems: 1
+
+  vscan-holdoff:
+$ref: /schemas/types.yaml#/definitions/uint32
+maxItems: 1
+
+  sdoed-width:
+$ref: /schemas/types.yaml#/definitions/uint32
+maxItems: 1
+
+  sdoed-delay:
+$ref: /schemas/types.yaml#/definitions/uint32
+maxItems: 1
+
+  sdoez-width:
+$ref: /schemas/types.yaml#/definitions/uint32
+maxItems: 1
+
+  sdoez-delay:
+$ref: /schemas/types.yaml#/definitions/uint32
+maxItems: 1
+
+  gdclk-hp-offs:
+$ref: /schemas/types.yaml#/definitions/uint32
+maxItems: 1
+
+  gdsp-offs:
+$ref: /schemas/types.yaml#/definitions/uint32
+maxItems: 1
+
+  gdoe-offs:
+$ref: /schemas/types.yaml#/definitions/uint32
+maxItems: 1
+
+  gdclk-offs:
+$ref: /schemas/types.yaml#/definitions/uint32
+maxItems: 1
+
+  num-ce:
+$ref: /schemas/types.yaml#/definitions/uint32
+maxItems: 1
+
+  timing:
+$ref: /display/panel/panel-timing.yaml#
+
+  DISPLAY-supply:
+description:
+  A couple of +/- voltages automatically powered on in a defintive order
+
+  VCOM-supply:
+description: compensation voltage
+
+  V3P3-supply:
+description: V3P3 supply
+
+  epd-thermal-zone:
+description:
+  Zone to get temperature of the EPD from, practically ambient temperature.
+
+
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - clock-names
+  - interrupts
+  - vscan-holdoff
+  - sdoed-width
+  - sdoed-delay
+  - sdoez-width
+  - sdoez-delay
+  - gdclk-hp-offs
+  - gdsp-offs
+  - gdoe-offs
+  - gdclk-offs
+  - num-ce
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+#include 
+
+epdc: epdc@20f4000 {
+compatible = "fsl,imx6sl-epdc";
+reg = <0x020f4000 0x4000>;
+interrupts = <0 97 IRQ_TYPE_LEVEL_HIGH>;
+clocks = <&clks IMX6SL_CLK_EPDC_AXI>, <&clks IMX6SL_CLK_EPDC_PIX>;
+clock-names = "axi", "pix";
+
+pinctrl-names = "default";
+pinctrl-0 = <&pinctrl_epdc0>;
+V3P3-supply = <&V3P3_reg>;
+VCOM-supply = <&VCOM_reg>;
+DISPLAY-supply = <&DISPLAY_reg>;
+epd-thermal-zone = "epd-thermal";
+
+vscan-holdoff = <4>;
+sdoed-width = <10>;
+sdoed-delay = <20>;
+sdoez-width = <10>;
+sdoez-delay = <20>;
+gdclk-hp-offs = <562>;
+gdsp-offs = <662>;
+gdoe-offs = <0>;
+gdclk-offs = <225>;
+num-ce = <3>;
+status = "okay";
+
+timing {
+clock-frequency = <8000>;
+hactive = <1448>;
+hback-porch = <16>;
+hfront-porch = <102>;
+hsync-len = <28>;
+vactive = <1072>;
+vback-porch = <4>;
+vfront-porch = <4>;
+vsync-len = <2>;
+};
+};
+...
-- 
2.30.2



Re: [PATCH] [RFC] drm: mxsfb: Implement LCDIF scanout CRC32 support

2022-02-07 Thread Liu Ying
On Mon, 2022-02-07 at 09:14 +0100, Marek Vasut wrote:
> On 2/7/22 06:13, Liu Ying wrote:
> > Hi Marek,
> 
> Hi,
> 
> > On Sun, 2022-02-06 at 19:56 +0100, Marek Vasut wrote:
> > > The LCDIF controller as present in i.MX6SX/i.MX8M Mini/Nano has a
> > > CRC_STAT
> > > register, which contains CRC32 of the frame as it was clocked out
> > > of the
> > > DPI interface of the LCDIF. This is likely meant as a functional
> > > safety
> > > register.
> > > 
> > > Unfortunatelly, there is zero documentation on how the CRC32 is
> > > calculated,
> > > there is no documentation of the polynomial, the init value, nor
> > > on which
> > > data is the checksum applied.
> > > 
> > > By applying brute-force on 8 pixel / 2 line frame, which is the
> > > minimum
> > > size LCDIF would work with, it turns out the polynomial is
> > > CRC32_POLY_LE
> > > 0xedb88320 , init value is 0x , the input data are
> > > bitrev32()
> > > of the entire frame and the resulting CRC has to be also
> > > bitrev32()ed.
> > 
> > No idea how the HW calculates the CRC value.
> > I didn't hear anyone internal tried this feature.
> 
> It would be nice if the datasheet could be improved.

Agreed.

> 
> There are many blank areas which are undocumented, this LCDIF CRC32 
> feature, i.MX8M Mini Arteris NOC at 0x3270 , the ARM GPV NIC-301
> at 
> 0x32{0,1,2,3,4,5,6,8}0 and their master/slave port mapping. The
> NOC 
> and NICs were documented at least up to i.MX6QP and then that 
> information disappeared from NXP datasheets. I think reconfiguring
> the 
> NOC/NIC QoS would help mitigate this shift issue described below (*).

I also think the QoS would help if it is configureable.

> 
> Do you know if there is some additional NOC/NIC documentation for
> i.MX8M 
> Mini available ?

No.

> 
> > > Doing this calculation in software for each frame is unrealistic
> > > due to
> > > the CPU demand, implement at least a sysfs attribute which
> > > permits testing
> > > the current frame on demand.
> > 
> > Why not using the existing debugfs CRC support implemented
> > in drivers/gpu/drm/drm_debugfs_crc.c?
> 
> I wasn't aware of that, thanks.

No problem.

> 
> > > Unfortunatelly, this functionality has another problem. On all of
> > > those SoCs,
> > > it is possible to overload interconnect e.g. by concurrent USB
> > > and uSDHC
> > > transfers, at which point the LCDIF LFIFO suffers an UNDERFLOW
> > > condition,
> > > which results in the image being shifted to the right by exactly
> > > LFIFO size
> > > pixels. On i.MX8M Mini, the LFIFO is 76x256 bits = 2432 Byte ~=
> > > 810 pixel
> > > at 24bpp. In this case, the LCDIF does not assert UNDERFLOW_IRQ
> > > bit, the
> > > frame CRC32 indicated in CRC_STAT register matches the CRC32 of
> > > the frame
> > > in DRAM, the RECOVER_ON_UNDERFLOW bit has no effect, so if this
> > > mode of
> > > failure occurs, the failure gets undetected and uncorrected.
> > 
> > Hmmm, interesting, no UNDERFLOW_IRQ bit asserted when LCDIF suffers
> > an
> > UNDERFLOW condition?
> 
> Yes

Did you ever see UNDERFLOW_IRQ bit asserted in any case?

Liu Ying



Re: [PATCH] backlight: pwm_bl: Avoid open coded arithmetic in memory allocation

2022-02-07 Thread Lee Jones
On Mon, 07 Feb 2022, Uwe Kleine-König wrote:

> On Sat, Feb 05, 2022 at 08:40:48AM +0100, Christophe JAILLET wrote:
> > kmalloc_array()/kcalloc() should be used to avoid potential overflow when
> > a multiplication is needed to compute the size of the requested memory.
> > 
> > So turn a kzalloc()+explicit size computation into an equivalent kcalloc().
> > 
> > Signed-off-by: Christophe JAILLET 
> 
> LGTM
> 
> Acked-by: Christophe JAILLET 
> 
> Thanks
> Uwe

I am totally confused!

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


[RFC 0/2] drm/i915/ttm: Evict and store of compressed object

2022-02-07 Thread Ramalingam C
On flat-ccs capable platforms we need to evict and restore the ccs data
along with the corresponding main memory.

This ccs data can only be accessed through the BLT engine via a special
cmd (XY_CTRL_SURF_COPY_BLT).

To support the above requirement on flat-ccs enabled i915 platforms, this
series adds a new param called ccs_pages_needed to ttm_tt_init(),
to increase ttm_tt->num_pages of system memory when the obj has an
lmem placement possibility.

This will be on top of the flat-ccs enabling series
https://patchwork.freedesktop.org/series/95686/

For more about flat-ccs feature please have a look at
https://patchwork.freedesktop.org/patch/471777/?series=95686&rev=5

Testing of the series is WIP; looking forward to early review of
the amendment to ttm_tt_init and the approach.

Ramalingam C (2):
  drm/i915/ttm: Add extra pages for handling ccs data
  drm/i915/migrate: Evict and restore the ccs data

 drivers/gpu/drm/drm_gem_vram_helper.c  |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c|  23 +-
 drivers/gpu/drm/i915/gt/intel_migrate.c| 283 +++--
 drivers/gpu/drm/qxl/qxl_ttm.c  |   2 +-
 drivers/gpu/drm/ttm/ttm_agp_backend.c  |   2 +-
 drivers/gpu/drm/ttm/ttm_tt.c   |  12 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c |   2 +-
 include/drm/ttm/ttm_tt.h   |   4 +-
 8 files changed, 191 insertions(+), 139 deletions(-)

-- 
2.20.1



[RFC 1/2] drm/i915/ttm: Add extra pages for handling ccs data

2022-02-07 Thread Ramalingam C
While evicting local memory data on a flat-ccs capable platform we also
need to evict the ccs data associated with that data. For this, we are
adding extra pages (DIV_ROUND_UP(size / 256, PAGE_SIZE)) into the ttm_tt,
as sketched below.

To achieve this we are adding a new param to ttm_tt_init() called
ccs_pages_needed, which will be added to ttm_tt->num_pages.
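
A worked example of that sizing, matching the DIV_ROUND_UP expression in the
patch below and assuming 4 KiB pages: a 64 MiB lmem object carries
64 MiB / 256 = 256 KiB of CCS, i.e. 64 extra ttm_tt pages. The helper name is
made up for the sketch:

#include <linux/kernel.h>       /* DIV_ROUND_UP */
#include <linux/mm.h>           /* PAGE_SIZE */

#define NUM_CCS_BYTES_PER_BLOCK 256     /* one CCS byte per 256 bytes of lmem */

static unsigned long ccs_pages(size_t obj_size)
{
        return DIV_ROUND_UP(DIV_ROUND_UP(obj_size, NUM_CCS_BYTES_PER_BLOCK),
                            PAGE_SIZE);
}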

Signed-off-by: Ramalingam C 
Suggested-by: Thomas Hellström 
---
 drivers/gpu/drm/drm_gem_vram_helper.c  |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c| 23 +-
 drivers/gpu/drm/qxl/qxl_ttm.c  |  2 +-
 drivers/gpu/drm/ttm/ttm_agp_backend.c  |  2 +-
 drivers/gpu/drm/ttm/ttm_tt.c   | 12 ++-
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c |  2 +-
 include/drm/ttm/ttm_tt.h   |  4 +++-
 7 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c 
b/drivers/gpu/drm/drm_gem_vram_helper.c
index 3f00192215d1..eef1f4dc7232 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -864,7 +864,7 @@ static struct ttm_tt *bo_driver_ttm_tt_create(struct 
ttm_buffer_object *bo,
if (!tt)
return NULL;
 
-   ret = ttm_tt_init(tt, bo, page_flags, ttm_cached);
+   ret = ttm_tt_init(tt, bo, page_flags, ttm_cached, 0);
if (ret < 0)
goto err_ttm_tt_init;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 84cae740b4a5..bb71aa6d66c0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -16,6 +16,7 @@
 #include "gem/i915_gem_ttm.h"
 #include "gem/i915_gem_ttm_move.h"
 #include "gem/i915_gem_ttm_pm.h"
+#include "gt/intel_gpu_commands.h"
 
 #define I915_TTM_PRIO_PURGE 0
 #define I915_TTM_PRIO_NO_PAGES  1
@@ -242,12 +243,27 @@ static const struct i915_refct_sgt_ops tt_rsgt_ops = {
.release = i915_ttm_tt_release
 };
 
+static inline bool
+i915_gem_object_has_lmem_placement(struct drm_i915_gem_object *obj)
+{
+   int i;
+
+   for (i = 0; i < obj->mm.n_placements; i++)
+   if (obj->mm.placements[i]->type == INTEL_MEMORY_LOCAL)
+   return true;
+
+   return false;
+}
+
 static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
 uint32_t page_flags)
 {
+   struct drm_i915_private *i915 = container_of(bo->bdev, typeof(*i915),
+bdev);
struct ttm_resource_manager *man =
ttm_manager_type(bo->bdev, bo->resource->mem_type);
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
+   unsigned long ccs_pages_needed = 0;
enum ttm_caching caching;
struct i915_ttm_tt *i915_tt;
int ret;
@@ -270,7 +286,12 @@ static struct ttm_tt *i915_ttm_tt_create(struct 
ttm_buffer_object *bo,
i915_tt->is_shmem = true;
}
 
-   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);
+   if (HAS_FLAT_CCS(i915) && i915_gem_object_has_lmem_placement(obj))
+   ccs_pages_needed = DIV_ROUND_UP(DIV_ROUND_UP(bo->base.size,
+  NUM_CCS_BYTES_PER_BLOCK), 
PAGE_SIZE);
+
+   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
+ caching, ccs_pages_needed);
if (ret)
goto err_free;
 
diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c b/drivers/gpu/drm/qxl/qxl_ttm.c
index b2e33d5ba5d0..52156b54498f 100644
--- a/drivers/gpu/drm/qxl/qxl_ttm.c
+++ b/drivers/gpu/drm/qxl/qxl_ttm.c
@@ -113,7 +113,7 @@ static struct ttm_tt *qxl_ttm_tt_create(struct 
ttm_buffer_object *bo,
ttm = kzalloc(sizeof(struct ttm_tt), GFP_KERNEL);
if (ttm == NULL)
return NULL;
-   if (ttm_tt_init(ttm, bo, page_flags, ttm_cached)) {
+   if (ttm_tt_init(ttm, bo, page_flags, ttm_cached, 0)) {
kfree(ttm);
return NULL;
}
diff --git a/drivers/gpu/drm/ttm/ttm_agp_backend.c 
b/drivers/gpu/drm/ttm/ttm_agp_backend.c
index 6ddc16f0fe2b..d27691f2e451 100644
--- a/drivers/gpu/drm/ttm/ttm_agp_backend.c
+++ b/drivers/gpu/drm/ttm/ttm_agp_backend.c
@@ -134,7 +134,7 @@ struct ttm_tt *ttm_agp_tt_create(struct ttm_buffer_object 
*bo,
agp_be->mem = NULL;
agp_be->bridge = bridge;
 
-   if (ttm_tt_init(&agp_be->ttm, bo, page_flags, ttm_write_combined)) {
+   if (ttm_tt_init(&agp_be->ttm, bo, page_flags, ttm_write_combined, 0)) {
kfree(agp_be);
return NULL;
}
diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 79c870a3bef8..80355465f717 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -134,9 +134,10 @@ void ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt 
*ttm)
 static void ttm_tt_init_fields(struct ttm_tt *ttm,
   struct t

[RFC 2/2] drm/i915/migrate: Evict and restore the ccs data

2022-02-07 Thread Ramalingam C
When we are swapping out a local memory obj on a flat-ccs capable platform,
we need to capture the ccs data too along with the main memory, and we need to
restore it when we are swapping the content back in.

Extracting and restoring the CCS data is done through a special cmd called
XY_CTRL_SURF_COPY_BLT.

Signed-off-by: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/intel_migrate.c | 283 +---
 1 file changed, 155 insertions(+), 128 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
b/drivers/gpu/drm/i915/gt/intel_migrate.c
index 5bdab0b3c735..e60ae6ff1847 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -449,14 +449,146 @@ static bool wa_1209644611_applies(int ver, u32 size)
return height % 4 == 3 && height <= 8;
 }
 
+/**
+ * DOC: Flat-CCS - Memory compression for Local memory
+ *
+ * On Xe-HP and later devices, we use dedicated compression control state (CCS)
+ * stored in local memory for each surface, to support the 3D and media
+ * compression formats.
+ *
+ * The memory required for the CCS of the entire local memory is 1/256 of the
+ * local memory size. So before the kernel boot, the required memory is 
reserved
+ * for the CCS data and a secure register will be programmed with the CCS base
+ * address.
+ *
+ * Flat CCS data needs to be cleared when a lmem object is allocated.
+ * And CCS data can be copied in and out of CCS region through
+ * XY_CTRL_SURF_COPY_BLT. CPU can't access the CCS data directly.
+ *
+ * When we exaust the lmem, if the object's placements support smem, then we 
can
+ * directly decompress the compressed lmem object into smem and start using it
+ * from smem itself.
+ *
+ * But when we need to swapout the compressed lmem object into a smem region
+ * though objects' placement doesn't support smem, then we copy the lmem 
content
+ * as it is into smem region along with ccs data (using XY_CTRL_SURF_COPY_BLT).
+ * When the object is referred, lmem content will be swaped in along with
+ * restoration of the CCS data (using XY_CTRL_SURF_COPY_BLT) at corresponding
+ * location.
+ *
+ *
+ * Flat-CCS Modifiers for different compression formats
+ * 
+ *
+ * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS - used to indicate the buffers of Flat 
CCS
+ * render compression formats. Though the general layout is same as
+ * I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS, new hashing/compression algorithm is
+ * used. Render compression uses 128 byte compression blocks
+ *
+ * I915_FORMAT_MOD_F_TILED_DG2_MC_CCS -used to indicate the buffers of Flat CCS
+ * media compression formats. Though the general layout is same as
+ * I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS, new hashing/compression algorithm is
+ * used. Media compression uses 256 byte compression blocks.
+ *
+ * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC - used to indicate the buffers of Flat
+ * CCS clear color render compression formats. Unified compression format for
+ * clear color render compression. The genral layout is a tiled layout using
+ * 4Kb tiles i.e Tile4 layout.
+ */
+
+static inline u32 *i915_flush_dw(u32 *cmd, u64 dst, u32 flags)
+{
+   /* Mask the 3 LSB to use the PPGTT address space */
+   *cmd++ = MI_FLUSH_DW | flags;
+   *cmd++ = lower_32_bits(dst);
+   *cmd++ = upper_32_bits(dst);
+
+   return cmd;
+}
+
+static u32 calc_ctrl_surf_instr_size(struct drm_i915_private *i915, int size)
+{
+   u32 num_cmds, num_blks, total_size;
+
+   if (!GET_CCS_SIZE(i915, size))
+   return 0;
+
+   /*
+* XY_CTRL_SURF_COPY_BLT transfers CCS in 256 byte
+* blocks. One XY_CTRL_SURF_COPY_BLT command can
+* transfer up to 1024 blocks.
+*/
+   num_blks = GET_CCS_SIZE(i915, size);
+   num_cmds = (num_blks + (NUM_CCS_BLKS_PER_XFER - 1)) >> 10;
+   total_size = (XY_CTRL_SURF_INSTR_SIZE) * num_cmds;
+
+   /*
+* We need to add a flush before and after
+* XY_CTRL_SURF_COPY_BLT
+*/
+   total_size += 2 * MI_FLUSH_DW_SIZE;
+   return total_size;
+}
+
+static u32 *_i915_ctrl_surf_copy_blt(u32 *cmd, u64 src_addr, u64 dst_addr,
+u8 src_mem_access, u8 dst_mem_access,
+int src_mocs, int dst_mocs,
+u16 num_ccs_blocks)
+{
+   int i = num_ccs_blocks;
+
+   /*
+* The XY_CTRL_SURF_COPY_BLT instruction is used to copy the CCS
+* data in and out of the CCS region.
+*
+* We can copy at most 1024 blocks of 256 bytes using one
+* XY_CTRL_SURF_COPY_BLT instruction.
+*
+* In case we need to copy more than 1024 blocks, we need to add
+* another instruction to the same batch buffer.
+*
+* 1024 blocks of 256 bytes of CCS represent a total 256KB of CCS.
+*
+* 256 KB of CCS represents 256 * 256 KB = 64 MB of LMEM.
+*/
+   d
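
A quick standalone sketch of the sizing arithmetic described above (not the
driver code; the constant names below are placeholders), assuming the 1/256
CCS ratio, 256-byte CCS blocks and the 1024-blocks-per-command limit quoted
in the comment:

#include <stdio.h>

#define CCS_RATIO       256     /* bytes of LMEM covered per byte of CCS */
#define CCS_BLOCK_BYTES 256     /* CCS bytes moved per block */
#define BLOCKS_PER_XFER 1024    /* blocks per XY_CTRL_SURF_COPY_BLT */

static unsigned long div_round_up(unsigned long n, unsigned long d)
{
	return (n + d - 1) / d;
}

int main(void)
{
	unsigned long obj_size = 64UL * 1024 * 1024;	/* 64 MiB lmem object */
	unsigned long ccs_bytes = div_round_up(obj_size, CCS_RATIO);
	unsigned long ccs_blocks = div_round_up(ccs_bytes, CCS_BLOCK_BYTES);
	unsigned long num_cmds = div_round_up(ccs_blocks, BLOCKS_PER_XFER);

	/* 64 MiB of LMEM -> 256 KiB of CCS -> 1024 blocks -> 1 command */
	printf("ccs_bytes=%lu blocks=%lu xy_ctrl_surf_copy_blt_cmds=%lu\n",
	       ccs_bytes, ccs_blocks, num_cmds);
	return 0;
}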

Re: [PATCH 5/5] drm/stm: ltdc: add support of ycbcr pixel formats

2022-02-07 Thread yannick Fertre

Hi Nathan,

On 2/2/22 17:54, Nathan Chancellor wrote:

Hi Yannick,

On Wed, Dec 15, 2021 at 10:48:43PM +0100, Yannick Fertre wrote:

This patch adds the following YCbCr input pixel formats on the latest
LTDC hardware version:

1 plane  (co-planar)  : YUYV, YVYU, UYVY, VYUY
2 planes (semi-planar): NV12, NV21
3 planes (full-planar): YU12=I420=DRM YUV420, YV12=DRM YVU420

Signed-off-by: Yannick Fertre 





+static inline void ltdc_set_ycbcr_config(struct drm_plane *plane, u32 
drm_pix_fmt)
+{
+   struct ltdc_device *ldev = plane_to_ltdc(plane);
+   struct drm_plane_state *state = plane->state;
+   u32 lofs = plane->index * LAY_OFS;
+   u32 val;
+
+   switch (drm_pix_fmt) {
+   case DRM_FORMAT_YUYV:
+   val = (YCM_I << 4) | LxPCR_YF | LxPCR_CBF;
+   break;
+   case DRM_FORMAT_YVYU:
+   val = (YCM_I << 4) | LxPCR_YF;
+   break;
+   case DRM_FORMAT_UYVY:
+   val = (YCM_I << 4) | LxPCR_CBF;
+   break;
+   case DRM_FORMAT_VYUY:
+   val = (YCM_I << 4);
+   break;
+   case DRM_FORMAT_NV12:
+   val = (YCM_SP << 4) | LxPCR_CBF;
+   break;
+   case DRM_FORMAT_NV21:
+   val = (YCM_SP << 4);
+   break;
+   case DRM_FORMAT_YUV420:
+   case DRM_FORMAT_YVU420:
+   val = (YCM_FP << 4);
+   break;
+   default:
+   /* RGB or not a YCbCr supported format */
+   break;
+   }
+
+   /* Enable limited range */
+   if (state->color_range == DRM_COLOR_YCBCR_LIMITED_RANGE)
+   val |= LxPCR_YREN;
+
+   /* enable ycbcr conversion */
+   val |= LxPCR_YCEN;
+
+   regmap_write(ldev->regmap, LTDC_L1PCR + lofs, val);
+}


This patch as commit 484e72d3146b ("drm/stm: ltdc: add support of ycbcr
pixel formats") in -next introduced the following clang warning:

drivers/gpu/drm/stm/ltdc.c:625:2: warning: variable 'val' is used uninitialized 
whenever switch default is taken [-Wsometimes-uninitialized]
 default:
 ^~~
drivers/gpu/drm/stm/ltdc.c:635:2: note: uninitialized use occurs here
 val |= LxPCR_YCEN;
 ^~~
drivers/gpu/drm/stm/ltdc.c:600:9: note: initialize the variable 'val' to 
silence this warning
 u32 val;
^
 = 0
1 warning generated.

Would it be okay to just return in the default case (maybe with a
message about an unsupported format?) or should there be another fix?

Cheers,



Thanks for your help.
It's okay to print a message for the unsupported format, with a return in the 
default case.

Do you want to create & push the patch?

Best regards
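
A rough sketch of the agreed direction (illustrative only, not the final
patch; the warning text is made up): bail out of ltdc_set_ycbcr_config() in
the default case so that 'val' is never used uninitialized:

 	default:
 		/* RGB or not a YCbCr supported format */
-		break;
+		DRM_WARN("Unsupported pixel format for YCbCr conversion\n");
+		return;
 	}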


Re: (subset) [PATCH] drm/vc4: crtc: Fix redundant variable assignment

2022-02-07 Thread Maxime Ripard
On Thu, 3 Feb 2022 16:11:51 +0100, Maxime Ripard wrote:
> The variable is assigned twice to the same value. Let's drop one.
> 
> 

Applied to drm/drm-misc (drm-misc-fixes).

Thanks!
Maxime


Re: [Intel-gfx] [RFC 1/2] drm/i915/ttm: Add extra pages for handling ccs data

2022-02-07 Thread Intel

Hi, Ram,


On 2/7/22 10:37, Ramalingam C wrote:

While evicting the local memory data on flat-ccs capable platforms, we
need to evict the ccs data associated with it.



  For this, we are
adding extra pages ((size / 256) >> PAGE_SHIFT) into the ttm_tt.

To achieve this, we are adding a new param, ccs_pages_needed, to ttm_tt_init(),
which will be added into ttm_tt->num_pages.


Please use imperative form above. Instead of "We are adding..", use "Add".




Signed-off-by: Ramalingam C 
Suggested-by: Thomas Hellstorm 

Hellstorm instead of Hellstrom might scare people off. :)

---
  drivers/gpu/drm/drm_gem_vram_helper.c  |  2 +-
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c| 23 +-
  drivers/gpu/drm/qxl/qxl_ttm.c  |  2 +-
  drivers/gpu/drm/ttm/ttm_agp_backend.c  |  2 +-
  drivers/gpu/drm/ttm/ttm_tt.c   | 12 ++-
  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c |  2 +-
  include/drm/ttm/ttm_tt.h   |  4 +++-
  7 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c 
b/drivers/gpu/drm/drm_gem_vram_helper.c
index 3f00192215d1..eef1f4dc7232 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -864,7 +864,7 @@ static struct ttm_tt *bo_driver_ttm_tt_create(struct 
ttm_buffer_object *bo,
if (!tt)
return NULL;
  
-	ret = ttm_tt_init(tt, bo, page_flags, ttm_cached);

+   ret = ttm_tt_init(tt, bo, page_flags, ttm_cached, 0);
if (ret < 0)
goto err_ttm_tt_init;
  
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c

index 84cae740b4a5..bb71aa6d66c0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -16,6 +16,7 @@
  #include "gem/i915_gem_ttm.h"
  #include "gem/i915_gem_ttm_move.h"
  #include "gem/i915_gem_ttm_pm.h"
+#include "gt/intel_gpu_commands.h"
  
  #define I915_TTM_PRIO_PURGE 0

  #define I915_TTM_PRIO_NO_PAGES  1
@@ -242,12 +243,27 @@ static const struct i915_refct_sgt_ops tt_rsgt_ops = {
.release = i915_ttm_tt_release
  };
  
+static inline bool

+i915_gem_object_has_lmem_placement(struct drm_i915_gem_object *obj)
+{
+   int i;
+
+   for (i = 0; i < obj->mm.n_placements; i++)
+   if (obj->mm.placements[i]->type == INTEL_MEMORY_LOCAL)
+   return true;
+
+   return false;
+}
+
  static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo,
 uint32_t page_flags)
  {
+   struct drm_i915_private *i915 = container_of(bo->bdev, typeof(*i915),
+bdev);
struct ttm_resource_manager *man =
ttm_manager_type(bo->bdev, bo->resource->mem_type);
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
+   unsigned long ccs_pages_needed = 0;
enum ttm_caching caching;
struct i915_ttm_tt *i915_tt;
int ret;
@@ -270,7 +286,12 @@ static struct ttm_tt *i915_ttm_tt_create(struct 
ttm_buffer_object *bo,
i915_tt->is_shmem = true;
}
  
-	ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching);

+   if (HAS_FLAT_CCS(i915) && i915_gem_object_has_lmem_placement(obj))
+   ccs_pages_needed = DIV_ROUND_UP(DIV_ROUND_UP(bo->base.size,
+  NUM_CCS_BYTES_PER_BLOCK), 
PAGE_SIZE);
+
+   ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags,
+ caching, ccs_pages_needed);


I'd suggest a patch that first adds the functionality to TTM, where even 
i915 passes in 0 here, and a follow-up patch for the i915 functionality 
where we add the ccs requirement.




if (ret)
goto err_free;
  
diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c b/drivers/gpu/drm/qxl/qxl_ttm.c

index b2e33d5ba5d0..52156b54498f 100644
--- a/drivers/gpu/drm/qxl/qxl_ttm.c
+++ b/drivers/gpu/drm/qxl/qxl_ttm.c
@@ -113,7 +113,7 @@ static struct ttm_tt *qxl_ttm_tt_create(struct 
ttm_buffer_object *bo,
ttm = kzalloc(sizeof(struct ttm_tt), GFP_KERNEL);
if (ttm == NULL)
return NULL;
-   if (ttm_tt_init(ttm, bo, page_flags, ttm_cached)) {
+   if (ttm_tt_init(ttm, bo, page_flags, ttm_cached, 0)) {
kfree(ttm);
return NULL;
}
diff --git a/drivers/gpu/drm/ttm/ttm_agp_backend.c 
b/drivers/gpu/drm/ttm/ttm_agp_backend.c
index 6ddc16f0fe2b..d27691f2e451 100644
--- a/drivers/gpu/drm/ttm/ttm_agp_backend.c
+++ b/drivers/gpu/drm/ttm/ttm_agp_backend.c
@@ -134,7 +134,7 @@ struct ttm_tt *ttm_agp_tt_create(struct ttm_buffer_object 
*bo,
agp_be->mem = NULL;
agp_be->bridge = bridge;
  
-	if (ttm_tt_init(&agp_be->ttm, bo, page_flags, ttm_write_combined)) {

+   if (ttm_tt_init(&agp_be->ttm, bo, page_flags, ttm_write_combined, 0)) {
kfree(agp_be);
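
(As a worked example of the sizing above, assuming NUM_CCS_BYTES_PER_BLOCK is
256 and 4 KiB pages: a 64 MiB lmem-placeable object needs
DIV_ROUND_UP(64 MiB, 256) = 256 KiB of CCS, i.e.
DIV_ROUND_UP(256 KiB, 4 KiB) = 64 extra ttm_tt pages.)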

Re: [PATCH] [RFC] drm: mxsfb: Implement LCDIF scanout CRC32 support

2022-02-07 Thread Marek Vasut

On 2/7/22 10:18, Liu Ying wrote:

Hi,


On Sun, 2022-02-06 at 19:56 +0100, Marek Vasut wrote:

The LCDIF controller as present in i.MX6SX/i.MX8M Mini/Nano has a
CRC_STAT
register, which contains CRC32 of the frame as it was clocked out
of the
DPI interface of the LCDIF. This is likely meant as a functional
safety
register.

Unfortunatelly, there is zero documentation on how the CRC32 is
calculated,
there is no documentation of the polynomial, the init value, nor
on which
data is the checksum applied.

By applying brute-force on 8 pixel / 2 line frame, which is the
minimum
size LCDIF would work with, it turns out the polynomial is
CRC32_POLY_LE
0xedb88320 , init value is 0x , the input data are
bitrev32()
of the entire frame and the resulting CRC has to be also
bitrev32()ed.


No idea how the HW calculates the CRC value.
I didn't hear anyone internal tried this feature.


It would be nice if the datasheet could be improved.


Agreed.



There are many blank areas which are undocumented, this LCDIF CRC32
feature, i.MX8M Mini Arteris NOC at 0x3270 , the ARM GPV NIC-301
at
0x32{0,1,2,3,4,5,6,8}0 and their master/slave port mapping. The
NOC
and NICs were documented at least up to i.MX6QP and then that
information disappeared from NXP datasheets. I think reconfiguring
the
NOC/NIC QoS would help mitigate this shift issue described below (*).


I also think the QoS would help if it is configurable.


It is programmable, it's just the port mapping which is undocumented.


Do you know if there is some additional NOC/NIC documentation for
i.MX8M
Mini available ?


No.


Can you ask someone internally in NXP maybe ?


Doing this calculation in software for each frame is unrealistic
due to
the CPU demand, implement at least a sysfs attribute which
permits testing
the current frame on demand.


Why not use the existing debugfs CRC support implemented
in drivers/gpu/drm/drm_debugfs_crc.c?


I wasn't aware of that, thanks.


No problem.




Unfortunatelly, this functionality has another problem. On all of
those SoCs,
it is possible to overload interconnect e.g. by concurrent USB
and uSDHC
transfers, at which point the LCDIF LFIFO suffers an UNDERFLOW
condition,
which results in the image being shifted to the right by exactly
LFIFO size
pixels. On i.MX8M Mini, the LFIFO is 76x256 bits = 2432 Byte ~=
810 pixel
at 24bpp. In this case, the LCDIF does not assert UNDERFLOW_IRQ
bit, the
frame CRC32 indicated in CRC_STAT register matches the CRC32 of
the frame
in DRAM, the RECOVER_ON_UNDERFLOW bit has no effect, so if this
mode of
failure occurs, the failure gets undetected and uncorrected.


Hmmm, interesting, no UNDERFLOW_IRQ bit asserted when LCDIF suffers
an
UNDERFLOW condition?


Yes


Did you ever see UNDERFLOW_IRQ bit asserted in any case?


I didn't see the UNDERFLOW_IRQ bit asserted during my tests, either with 
this IRQ enabled (UNDERFLOW_IRQ_EN=1) or with the IRQ disabled 
(UNDERFLOW_IRQ_EN=0) by reading the CTRL1 register in interrupt handler 
when CUR_FRAME_DONE_IRQ triggered the IRQ handler.


I did see a few auto-recoveries of the panel back into non-shifted 
image, that happened once in some 100-200 tests. Mostly the LCDIF does 
not recover automatically.
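
For reference, a standalone sketch of the brute-forced algorithm as described
in the commit message. This is only one plausible reading of it (the frame is
assumed to be consumed as 32-bit words in memory order on a little-endian
host, with no final XOR), so treat it as an illustration rather than a
verified model of the hardware:

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

static uint32_t bitrev32(uint32_t x)
{
	uint32_t r = 0;
	int i;

	for (i = 0; i < 32; i++)
		r |= ((x >> i) & 1u) << (31 - i);
	return r;
}

/* Reflected CRC-32, polynomial 0xedb88320, caller provides the init value. */
static uint32_t crc32_le(uint32_t crc, const uint8_t *p, size_t len)
{
	int i;

	while (len--) {
		crc ^= *p++;
		for (i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320u : 0);
	}
	return crc;
}

/* frame: scanout buffer as 32-bit words, nwords: frame size in words */
static uint32_t lcdif_frame_crc(const uint32_t *frame, size_t nwords)
{
	uint32_t crc = 0xffffffffu;
	size_t i;

	for (i = 0; i < nwords; i++) {
		uint32_t w = bitrev32(frame[i]);

		crc = crc32_le(crc, (const uint8_t *)&w, sizeof(w));
	}
	return bitrev32(crc);
}

int main(void)
{
	uint32_t frame[4] = { 0x00112233, 0x44556677, 0x8899aabb, 0xccddeeff };

	printf("crc = 0x%08x\n", lcdif_frame_crc(frame, 4));
	return 0;
}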


Re: [PATCH 1/2] drm/mm: Add an iterator to optimally walk over holes for an allocation (v2)

2022-02-07 Thread Tvrtko Ursulin



On 04/02/2022 01:19, Vivek Kasireddy wrote:

This iterator relies on drm_mm_first_hole() and drm_mm_next_hole()
functions to identify suitable holes for an allocation of a given
size by efficiently traversing the rbtree associated with the given
allocator.

It replaces the for loop in drm_mm_insert_node_in_range() and can
also be used by drm drivers to quickly identify holes of a certain
size within a given range.

v2: (Tvrtko)
- Prepend a double underscore for the newly exported first/next_hole
- s/each_best_hole/each_suitable_hole/g
- Mask out DRM_MM_INSERT_ONCE from the mode before calling
   first/next_hole and elsewhere.

Suggested-by: Tvrtko Ursulin 
Signed-off-by: Vivek Kasireddy 
---
  drivers/gpu/drm/drm_mm.c | 38 ++
  include/drm/drm_mm.h | 36 
  2 files changed, 54 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 8257f9d4f619..b6da1dffcfcb 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -352,10 +352,10 @@ static struct drm_mm_node *find_hole_addr(struct drm_mm 
*mm, u64 addr, u64 size)
return node;
  }
  
-static struct drm_mm_node *

-first_hole(struct drm_mm *mm,
-  u64 start, u64 end, u64 size,
-  enum drm_mm_insert_mode mode)
+struct drm_mm_node *
+__drm_mm_first_hole(struct drm_mm *mm,
+   u64 start, u64 end, u64 size,
+   enum drm_mm_insert_mode mode)
  {
switch (mode) {
default:
@@ -374,6 +374,7 @@ first_hole(struct drm_mm *mm,
hole_stack);
}
  }
+EXPORT_SYMBOL(__drm_mm_first_hole);
  
  /**

   * DECLARE_NEXT_HOLE_ADDR - macro to declare next hole functions
@@ -410,11 +411,11 @@ static struct drm_mm_node *name(struct drm_mm_node 
*entry, u64 size)  \
  DECLARE_NEXT_HOLE_ADDR(next_hole_high_addr, rb_left, rb_right)
  DECLARE_NEXT_HOLE_ADDR(next_hole_low_addr, rb_right, rb_left)
  
-static struct drm_mm_node *

-next_hole(struct drm_mm *mm,
- struct drm_mm_node *node,
- u64 size,
- enum drm_mm_insert_mode mode)
+struct drm_mm_node *
+__drm_mm_next_hole(struct drm_mm *mm,
+  struct drm_mm_node *node,
+  u64 size,
+  enum drm_mm_insert_mode mode)
  {
switch (mode) {
default:
@@ -432,6 +433,7 @@ next_hole(struct drm_mm *mm,
return &node->hole_stack == &mm->hole_stack ? NULL : node;
}
  }
+EXPORT_SYMBOL(__drm_mm_next_hole);
  
  /**

   * drm_mm_reserve_node - insert an pre-initialized node
@@ -520,7 +522,6 @@ int drm_mm_insert_node_in_range(struct drm_mm * const mm,
  {
struct drm_mm_node *hole;
u64 remainder_mask;
-   bool once;
  
  	DRM_MM_BUG_ON(range_start > range_end);
  
@@ -533,22 +534,19 @@ int drm_mm_insert_node_in_range(struct drm_mm * const mm,

if (alignment <= 1)
alignment = 0;
  
-	once = mode & DRM_MM_INSERT_ONCE;

-   mode &= ~DRM_MM_INSERT_ONCE;
-
remainder_mask = is_power_of_2(alignment) ? alignment - 1 : 0;
-   for (hole = first_hole(mm, range_start, range_end, size, mode);
-hole;
-hole = once ? NULL : next_hole(mm, hole, size, mode)) {
+   drm_mm_for_each_suitable_hole(hole, mm, range_start, range_end,
+ size, mode) {
u64 hole_start = __drm_mm_hole_node_start(hole);
u64 hole_end = hole_start + hole->hole_size;
u64 adj_start, adj_end;
u64 col_start, col_end;
+   enum drm_mm_insert_mode placement = mode & ~DRM_MM_INSERT_ONCE;


Could move outside the loop, but not sure if it matters much.

Could also call the masked out variable mode and the passed in one 
caller_mode, or something, and that way have a smaller diff. (Four 
following hunks wouldn't be there.)


  
-		if (mode == DRM_MM_INSERT_LOW && hole_start >= range_end)

+   if (placement == DRM_MM_INSERT_LOW && hole_start >= range_end)
break;
  
-		if (mode == DRM_MM_INSERT_HIGH && hole_end <= range_start)

+   if (placement == DRM_MM_INSERT_HIGH && hole_end <= range_start)
break;
  
  		col_start = hole_start;

@@ -562,7 +560,7 @@ int drm_mm_insert_node_in_range(struct drm_mm * const mm,
if (adj_end <= adj_start || adj_end - adj_start < size)
continue;
  
-		if (mode == DRM_MM_INSERT_HIGH)

+   if (placement == DRM_MM_INSERT_HIGH)
adj_start = adj_end - size;
  
  		if (alignment) {

@@ -574,7 +572,7 @@ int drm_mm_insert_node_in_range(struct drm_mm * const mm,
div64_u64_rem(adj_start, alignment, &rem);
if (rem) {
adj_start -= rem;
-   if (mode != DRM_MM_INSERT_HIGH)
+
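
For drivers, a rough usage sketch of the newly exported iterator, based only
on the signature visible in the hunk above; the helper below and its use of
DRM_MM_INSERT_BEST are illustrative:

static u64 first_suitable_hole_start(struct drm_mm *mm, u64 size)
{
	struct drm_mm_node *hole;

	drm_mm_for_each_suitable_hole(hole, mm, 0, U64_MAX, size,
				      DRM_MM_INSERT_BEST)
		return drm_mm_hole_node_start(hole);

	return 0;
}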

Re: [PATCH 1/3] i915/gvt: Introduce the mmio_table.c to support VFIO new mdev API

2022-02-07 Thread Jani Nikula
On Mon, 07 Feb 2022, Christoph Hellwig  wrote:
> On Mon, Feb 07, 2022 at 08:28:13AM +, Wang, Zhi A wrote:
>> 1) About having the mmio_table.h, I would like to keep the stuff in a
>> dedicated header as putting them in intel_gvt.h might need i915 guys
>> to maintain it.
>> 2) The other one is about if we should move the mmio_table.c into
>> i915 folder. I guess we need the some comments from Jani. In the
>> current version that I am testing, it's still in GVT folder. Guess we
>> can submit a patch to move it to i915 folder later if Jani is ok
>> about that.
>
> Yes, let's have Jani chime in on these.  They're basically one and the
> same issue.  This code will have to be built into the core i915
> driver even with my planned split, which is kind of the point of this
> exercise.  I think it makes sense to use the subdirectories as boundaries
> for where the code ends up and not to declare maintainership boundaries,
> but it will be up to the i915 and gvt maintainers to decide that.

Agreed. If there's going to be a gvt.ko, I think all of gvt/ should be
part of that module, nothing more, nothing less.

The gvt related files in i915/ should probably be named intel_gvt* or
something, ditto for function naming, and we'll probably want patches
touching them be Cc'd to intel-gfx list.

Joonas, Rodrigo, Tvrtko, thoughts?

BR,
Jani.

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH] drm/dp: Remove common Post Cursor2 register handling

2022-02-07 Thread Jani Nikula
On Fri, 04 Feb 2022, Kees Cook  wrote:
> Ping,
>
> This is a OOB read fix. Can someone please pick this up?

Daniel? Thierry?

As I said, I reviewed this but I'm not comfortable applying patches that
change the functionality of drivers I don't maintain.

BR,
Jani.


>
> -Kees
>
> On Wed, Jan 05, 2022 at 09:35:07AM -0800, Kees Cook wrote:
>> The link_status array was not large enough to read the Adjust Request
>> Post Cursor2 register, so remove the common helper function to avoid
>> an OOB read, found with a -Warray-bounds build:
>> 
>> drivers/gpu/drm/drm_dp_helper.c: In function 
>> 'drm_dp_get_adjust_request_post_cursor':
>> drivers/gpu/drm/drm_dp_helper.c:59:27: error: array subscript 10 is outside 
>> array bounds of 'const u8[6]' {aka 'const unsigned char[6]'} 
>> [-Werror=array-bounds]
>>59 | return link_status[r - DP_LANE0_1_STATUS];
>>   |~~~^~~
>> drivers/gpu/drm/drm_dp_helper.c:147:51: note: while referencing 'link_status'
>>   147 | u8 drm_dp_get_adjust_request_post_cursor(const u8 
>> link_status[DP_LINK_STATUS_SIZE],
>>   |  
>> ~^~~~
>> 
>> Replace the only user of the helper with an open-coded fetch and decode,
>> similar to drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c.
>> 
>> Fixes: 79465e0ffeb9 ("drm/dp: Add helper to get post-cursor adjustments")
>> Cc: Maarten Lankhorst 
>> Cc: Maxime Ripard 
>> Cc: Thomas Zimmermann 
>> Cc: David Airlie 
>> Cc: Daniel Vetter 
>> Cc: dri-devel@lists.freedesktop.org
>> Signed-off-by: Kees Cook 
>> ---
>> This is the alternative to:
>> https://lore.kernel.org/lkml/20211203084354.3105253-1-keesc...@chromium.org/
>> ---
>>  drivers/gpu/drm/drm_dp_helper.c | 10 --
>>  drivers/gpu/drm/tegra/dp.c  | 11 ++-
>>  include/drm/drm_dp_helper.h |  2 --
>>  3 files changed, 10 insertions(+), 13 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/drm_dp_helper.c 
>> b/drivers/gpu/drm/drm_dp_helper.c
>> index 23f9073bc473..c9528aa62c9c 100644
>> --- a/drivers/gpu/drm/drm_dp_helper.c
>> +++ b/drivers/gpu/drm/drm_dp_helper.c
>> @@ -144,16 +144,6 @@ u8 drm_dp_get_adjust_tx_ffe_preset(const u8 
>> link_status[DP_LINK_STATUS_SIZE],
>>  }
>>  EXPORT_SYMBOL(drm_dp_get_adjust_tx_ffe_preset);
>>  
>> -u8 drm_dp_get_adjust_request_post_cursor(const u8 
>> link_status[DP_LINK_STATUS_SIZE],
>> - unsigned int lane)
>> -{
>> -unsigned int offset = DP_ADJUST_REQUEST_POST_CURSOR2;
>> -u8 value = dp_link_status(link_status, offset);
>> -
>> -return (value >> (lane << 1)) & 0x3;
>> -}
>> -EXPORT_SYMBOL(drm_dp_get_adjust_request_post_cursor);
>> -
>>  static int __8b10b_clock_recovery_delay_us(const struct drm_dp_aux *aux, u8 
>> rd_interval)
>>  {
>>  if (rd_interval > 4)
>> diff --git a/drivers/gpu/drm/tegra/dp.c b/drivers/gpu/drm/tegra/dp.c
>> index 70dfb7d1dec5..f5535eb04c6b 100644
>> --- a/drivers/gpu/drm/tegra/dp.c
>> +++ b/drivers/gpu/drm/tegra/dp.c
>> @@ -549,6 +549,15 @@ static void drm_dp_link_get_adjustments(struct 
>> drm_dp_link *link,
>>  {
>>  struct drm_dp_link_train_set *adjust = &link->train.adjust;
>>  unsigned int i;
>> +u8 post_cursor;
>> +int err;
>> +
>> +err = drm_dp_dpcd_read(link->aux, DP_ADJUST_REQUEST_POST_CURSOR2,
>> +   &post_cursor, sizeof(post_cursor));
>> +if (err < 0) {
>> +DRM_ERROR("failed to read post_cursor2: %d\n", err);
>> +post_cursor = 0;
>> +}
>>  
>>  for (i = 0; i < link->lanes; i++) {
>>  adjust->voltage_swing[i] =
>> @@ -560,7 +569,7 @@ static void drm_dp_link_get_adjustments(struct 
>> drm_dp_link *link,
>>  DP_TRAIN_PRE_EMPHASIS_SHIFT;
>>  
>>  adjust->post_cursor[i] =
>> -drm_dp_get_adjust_request_post_cursor(status, i);
>> +(post_cursor >> (i << 1)) & 0x3;
>>  }
>>  }
>>  
>> diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
>> index 472dac376284..fdf3cf6ccc02 100644
>> --- a/include/drm/drm_dp_helper.h
>> +++ b/include/drm/drm_dp_helper.h
>> @@ -1528,8 +1528,6 @@ u8 drm_dp_get_adjust_request_pre_emphasis(const u8 
>> link_status[DP_LINK_STATUS_SI
>>int lane);
>>  u8 drm_dp_get_adjust_tx_ffe_preset(const u8 
>> link_status[DP_LINK_STATUS_SIZE],
>> int lane);
>> -u8 drm_dp_get_adjust_request_post_cursor(const u8 
>> link_status[DP_LINK_STATUS_SIZE],
>> - unsigned int lane);
>>  
>>  #define DP_BRANCH_OUI_HEADER_SIZE   0xc
>>  #define DP_RECEIVER_CAP_SIZE0xf
>> -- 
>> 2.30.2
>> 

-- 
Jani Nikula, Intel Open Source Graphics Center
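
(For readers decoding the open-coded replacement above: the
DP_ADJUST_REQUEST_POST_CURSOR2 byte packs one 2-bit POST_CURSOR2 value per
lane, lane 0 in bits 1:0 up to lane 3 in bits 7:6, which is what
"(post_cursor >> (i << 1)) & 0x3" extracts. For example, a readback of 0xe4
(0b11100100) decodes to post_cursor[0..3] = 0, 1, 2, 3.)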


Re: [PATCH v2] drm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDP

2022-02-07 Thread Naresh Kamboju
Hi Thomas,

On Thu, 3 Feb 2022 at 15:09, Thomas Zimmermann  wrote:
>
> As reported in [1], DRM_PANEL_EDP depends on DRM_DP_HELPER. Select
> the option to fix the build failure. The error message is shown
> below.
>
>   arm-linux-gnueabihf-ld: drivers/gpu/drm/panel/panel-edp.o: in function
> `panel_edp_probe': panel-edp.c:(.text+0xb74): undefined reference to
> `drm_panel_dp_aux_backlight'
>   make[1]: *** [/builds/linux/Makefile:1222: vmlinux] Error 1
>
> The issue has been reported before, when DisplayPort helpers were
> hidden behind the option CONFIG_DRM_KMS_HELPER. [2]
>
> v2:
> * fix and expand commit description (Arnd)
>
> Signed-off-by: Thomas Zimmermann 
> Fixes: adb9d5a2cc77 ("drm/dp: Move DisplayPort helpers into separate helper 
> module")
> Fixes: 5f04e7ce392d ("drm/panel-edp: Split eDP panels out of panel-simple")
> Reported-by: Naresh Kamboju 
> Reported-by: Linux Kernel Functional Testing 
> Link: 
> https://lore.kernel.org/dri-devel/CA+G9fYvN0NyaVkRQmA1O6rX7H8PPaZrUAD7=rdy33qy9ruu...@mail.gmail.com/
>  # [1]
> Link: 
> https://lore.kernel.org/all/2027062704.14671-1-rdun...@infradead.org/ # 
> [2]
> Cc: Thomas Zimmermann 
> Cc: Lyude Paul 
> Cc: Daniel Vetter 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: dri-devel@lists.freedesktop.org

Tested-by: Naresh Kamboju 
Tested-by: Linux Kernel Functional Testing 

This patch fixes the reported build problem.


> ---
>  drivers/gpu/drm/panel/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig
> index 434c2861bb40..0aec5a10b064 100644
> --- a/drivers/gpu/drm/panel/Kconfig
> +++ b/drivers/gpu/drm/panel/Kconfig
> @@ -106,6 +106,7 @@ config DRM_PANEL_EDP
> depends on PM
> select VIDEOMODE_HELPERS
> select DRM_DP_AUX_BUS
> +   select DRM_DP_HELPER
> help
>   DRM panel driver for dumb eDP panels that need at most a regulator 
> and
>   a GPIO to be powered up. Optionally a backlight can be attached so
> --
> 2.34.1
>


-- 
Linaro LKFT
https://lkft.linaro.org


Re: [RFC PATCH 1/3] drm: Extract amdgpu_sa.c as a generic suballocation helper

2022-02-07 Thread Maarten Lankhorst
Op 04-02-2022 om 19:29 schreef Christian König:
> Oh, that's on my TODO list for years!
>
> Am 04.02.22 um 18:48 schrieb Maarten Lankhorst:
>> Suballocating a buffer object is something that is not driver
>> generic, and is useful for other drivers as well.
>>
>> Signed-off-by: Maarten Lankhorst 
>> ---
>>   drivers/gpu/drm/Makefile   |   4 +-
>>   drivers/gpu/drm/drm_suballoc.c | 424 +
>>   include/drm/drm_suballoc.h |  78 ++
>>   3 files changed, 505 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/gpu/drm/drm_suballoc.c
>>   create mode 100644 include/drm/drm_suballoc.h
>>
>> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
>> index 8675c2af7ae1..b848bcf8790c 100644
>> --- a/drivers/gpu/drm/Makefile
>> +++ b/drivers/gpu/drm/Makefile
>> @@ -57,7 +57,9 @@ drm_kms_helper-y := drm_bridge_connector.o 
>> drm_crtc_helper.o \
>>   drm_scdc_helper.o drm_gem_atomic_helper.o \
>>   drm_gem_framebuffer_helper.o \
>>   drm_atomic_state_helper.o drm_damage_helper.o \
>> -    drm_format_helper.o drm_self_refresh_helper.o drm_rect.o
>> +    drm_format_helper.o drm_self_refresh_helper.o drm_rect.o \
>> +    drm_suballoc.o
>> +
>
> I think we should put that into a separate module like we now do with other 
> helpers as well.
Can easily be done, it will likely be a very small helper. The code itself is 
just under a page. I felt the overhead wasn't worth it, but will do so.
>>   drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
>>   drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o
>>   diff --git a/drivers/gpu/drm/drm_suballoc.c 
>> b/drivers/gpu/drm/drm_suballoc.c
>> new file mode 100644
>> index ..e0bb35367b71
>> --- /dev/null
>> +++ b/drivers/gpu/drm/drm_suballoc.c
>> @@ -0,0 +1,424 @@
>> +/*
>> + * Copyright 2011 Red Hat Inc.
>> + * All Rights Reserved.
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the
>> + * "Software"), to deal in the Software without restriction, including
>> + * without limitation the rights to use, copy, modify, merge, publish,
>> + * distribute, sub license, and/or sell copies of the Software, and to
>> + * permit persons to whom the Software is furnished to do so, subject to
>> + * the following conditions:
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS 
>> OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
>> + * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY 
>> CLAIM,
>> + * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
>> + * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
>> + * USE OR OTHER DEALINGS IN THE SOFTWARE.
>> + *
>> + * The above copyright notice and this permission notice (including the
>> + * next paragraph) shall be included in all copies or substantial portions
>> + * of the Software.
>> + *
>> + */
>> +/*
>> + * Authors:
>> + *    Jerome Glisse 
>> + */
>
> That is hopelessly outdated. IIRC I completely rewrote that stuff in ~2012.
If you rewrote it, can you give me an updated copyright header please?
>
>> +/* Algorithm:
>> + *
>> + * We store the last allocated bo in "hole", we always try to allocate
>> + * after the last allocated bo. Principle is that in a linear GPU ring
>> + * progression what is after last is the oldest bo we allocated and thus
>> + * the first one that should no longer be in use by the GPU.
>> + *
>> + * If it's not the case we skip over the bo after last to the closest
>> + * done bo if such one exist. If none exist and we are not asked to
>> + * block we report failure to allocate.
>> + *
>> + * If we are asked to block we wait on all the oldest fence of all
>> + * rings. We just wait for any of those fence to complete.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +static void drm_suballoc_remove_locked(struct drm_suballoc *sa);
>> +static void drm_suballoc_try_free(struct drm_suballoc_manager *sa_manager);
>> +
>> +/**
>> + * drm_suballoc_manager_init - Initialise the drm_suballoc_manager
>> + *
>> + * @sa_manager: pointer to the sa_manager
>> + * @size: number of bytes we want to suballocate
>> + * @align: alignment for each suballocated chunk
>> + *
>> + * Prepares the suballocation manager for suballocations.
>> + */
>> +void drm_suballoc_manager_init(struct drm_suballoc_manager *sa_manager,
>> +   u32 size, u32 align)
>> +{
>> +    u32 i;
>> +
>> +    if (!align)
>> +    align = 1;
>> +
>> +    /* alignment must be a power of 2 */
>> +    BUG_ON(align & (align - 1));
>
> When we move that I think we should cleanup the code once more, e.g. use 
> is_power_of_2() function here for example.

Yeah, I wa
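
A sketch of the is_power_of_2() cleanup suggested above (illustrative only;
it needs <linux/log2.h> and keeps the existing behaviour of rejecting
non-power-of-two alignments):

	/* alignment must be a power of 2 */
	BUG_ON(!is_power_of_2(align));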

[PATCH] drm/privacy-screen: Fix sphinx warning

2022-02-07 Thread Hans de Goede
Fix the following warning from "make htmldocs":

drivers/gpu/drm/drm_privacy_screen.c:392:
  warning: Function parameter or member 'data' not described in
  'drm_privacy_screen_register'

Fixes: 30598d925d46 ("drm/privacy_screen: Add drvdata in drm_privacy_screen")
Cc: Rajat Jain 
Reported-by: Stephen Rothwell 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/drm_privacy_screen.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/drm_privacy_screen.c 
b/drivers/gpu/drm/drm_privacy_screen.c
index 03b149cc455b..45c080134488 100644
--- a/drivers/gpu/drm/drm_privacy_screen.c
+++ b/drivers/gpu/drm/drm_privacy_screen.c
@@ -379,6 +379,7 @@ static void drm_privacy_screen_device_release(struct device 
*dev)
  * drm_privacy_screen_register - register a privacy-screen
  * @parent: parent-device for the privacy-screen
  * @ops: &struct drm_privacy_screen_ops pointer with ops for the privacy-screen
+ * @data: Private data owned by the privacy screen provider
  *
  * Create and register a privacy-screen.
  *
-- 
2.33.1



Re: [PATCH] drm/privacy-screen: Fix sphinx warning

2022-02-07 Thread Simon Ser
Reviewed-by: Simon Ser 


Re: [PATCH] drm/privacy-screen: Fix sphinx warning

2022-02-07 Thread Hans de Goede
Hi Simon,

On 2/7/22 12:37, Simon Ser wrote:
> Reviewed-by: Simon Ser 

Thank you, I also have this very similar patch pending (also a simple htmldocs 
warning fix).

Any chance you can also review this one? :

https://patchwork.freedesktop.org/patch/470957/

Regards,

Hans
 



Re: [RFC 0/2] drm/i915/ttm: Evict and store of compressed object

2022-02-07 Thread Christian König

Am 07.02.22 um 10:37 schrieb Ramalingam C:

On flat-ccs capable platforms we need to evict and restore the ccs data
along with the corresponding main memory.

This ccs data can only be accessed through the BLT engine, through a special
cmd ( )

To support the above requirement on flat-ccs enabled i915 platforms, this
series adds a new param called ccs_pages_needed to ttm_tt_init(),
to increase ttm_tt->num_pages of system memory when the obj has the
lmem placement possibility.


Well, the question is why isn't the buffer object allocated with the extra 
space in the first place?


Regards,
Christian.



This will be on top of the flat-ccs enabling series
https://patchwork.freedesktop.org/series/95686/

For more about flat-ccs feature please have a look at
https://patchwork.freedesktop.org/patch/471777/?series=95686&rev=5

Testing of the series is WIP; looking forward to early review of
the amendment to ttm_tt_init and the approach.

Ramalingam C (2):
   drm/i915/ttm: Add extra pages for handling ccs data
   drm/i915/migrate: Evict and restore the ccs data

  drivers/gpu/drm/drm_gem_vram_helper.c  |   2 +-
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c|  23 +-
  drivers/gpu/drm/i915/gt/intel_migrate.c| 283 +++--
  drivers/gpu/drm/qxl/qxl_ttm.c  |   2 +-
  drivers/gpu/drm/ttm/ttm_agp_backend.c  |   2 +-
  drivers/gpu/drm/ttm/ttm_tt.c   |  12 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c |   2 +-
  include/drm/ttm/ttm_tt.h   |   4 +-
  8 files changed, 191 insertions(+), 139 deletions(-)





Re: [PATCH] drm/privacy-screen: Fix sphinx warning

2022-02-07 Thread Simon Ser
On Wednesday, January 26th, 2022 at 16:11, Hans de Goede  
wrote:

> - * A pointer to the drm_privacy_screen's struct is passed as the void *data
> + * A pointer to the drm_privacy_screen's struct is passed as the void \*data

Maybe we can use @data here instead? It's used to refer to arguments or struct
members.

Alternatively, use double backquotes to format it as inline code blocks:

   ``void *data``


Re: [Intel-gfx] [PATCH 1/5] drm/i915/dg2: Add Wa_22011450934

2022-02-07 Thread Matthew Auld
On Fri, 28 Jan 2022 at 18:52, Ramalingam C  wrote:
>
> An indirect ctx wabb is implemented as per Wa_22011450934 to avoid rcs
> restore hang during context restore of a preempted context in GPGPU mode
>
> Signed-off-by: Ramalingam C 
> cc: Chris Wilson 
Acked-by: Matthew Auld 


Re: [Intel-gfx] [PATCH 1/5] drm/i915/dg2: Add Wa_22011450934

2022-02-07 Thread Matthew Auld

On 07/02/2022 11:48, Matthew Auld wrote:

On Fri, 28 Jan 2022 at 18:52, Ramalingam C  wrote:


An indirect ctx wabb is implemented as per Wa_22011450934 to avoid rcs
restore hang during context restore of a preempted context in GPGPU mode

Signed-off-by: Ramalingam C 
cc: Chris Wilson 

Acked-by: Matthew Auld 


Also, feel free to upgrade to r-b for this and patches 2-4.


Re: [PATCH 1/3] i915/gvt: Introduce the mmio_table.c to support VFIO new mdev API

2022-02-07 Thread Zhi Wang

On 2/7/22 05:48, Jani Nikula wrote:


On Mon, 07 Feb 2022, Christoph Hellwig  wrote:

On Mon, Feb 07, 2022 at 08:28:13AM +, Wang, Zhi A wrote:

1) About having the mmio_table.h, I would like to keep the stuff in a
dedicated header as putting them in intel_gvt.h might need i915 guys
to maintain it.
2) The other one is about if we should move the mmio_table.c into
i915 folder. I guess we need the some comments from Jani. In the
current version that I am testing, it's still in GVT folder. Guess we
can submit a patch to move it to i915 folder later if Jani is ok
about that.

Yes, let's have Jani chime in on these.  They're basically one and the
same issue.  This code will have to be built into the core i915
driver even with my planned split, which is kind of the point of this
exercise.  I think it makes sense to use the subdirectories as boundaries
for where the code ends up and not to declare maintainership boundaries,
but it will be up to the i915 and gvt maintainers to decide that.

Agreed. If there's going to be a gvt.ko, I think all of gvt/ should be
part of that module, nothing more, nothing less.

The gvt related files in i915/ should probably be named intel_gvt* or
something, ditto for function naming, and we'll probably want patches
touching them be Cc'd to intel-gfx list.

Joonas, Rodrigo, Tvrtko, thoughts?

BR,
Jani.


Hi Christoph and Jani:

Thanks for the comments. It would be nice if people can reach an 
agreement. I am OK with both of the options, and moving some files 
into different folders doesn't need me to do the full test run again. :)



Thanks,

Zhi.



Re: [PATCH v5 2/5] drm/i915/gt: Drop invalidate_csb_entries

2022-02-07 Thread Tvrtko Ursulin



On 04/02/2022 16:37, Michael Cheng wrote:

Drop invalidate_csb_entries and directly call drm_clflush_virt_range.
This allows for one less function call, and prevents compiler errors when
building for non-x86 architectures.

v2(Michael Cheng): Drop invalidate_csb_entries function and directly
   invoke drm_clflush_virt_range. Thanks to Tvrtko for the
   suggestion.

Signed-off-by: Michael Cheng 
---
  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 12 +++-
  1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 9bb7c863172f..7500c06562da 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -1646,12 +1646,6 @@ cancel_port_requests(struct intel_engine_execlists * 
const execlists,
return inactive;
  }
  
-static void invalidate_csb_entries(const u64 *first, const u64 *last)

-{
-   clflush((void *)first);
-   clflush((void *)last);
-}
-
  /*
   * Starting with Gen12, the status has a new format:
   *
@@ -1999,7 +1993,7 @@ process_csb(struct intel_engine_cs *engine, struct 
i915_request **inactive)
 * the wash as hardware, working or not, will need to do the
 * invalidation before.
 */
-   invalidate_csb_entries(&buf[0], &buf[num_entries - 1]);
+   drm_clflush_virt_range(&buf[0], num_entries * sizeof(buf[0]));
  
  	/*

 * We assume that any event reflects a change in context flow
@@ -2783,8 +2777,8 @@ static void reset_csb_pointers(struct intel_engine_cs 
*engine)
  
  	/* Check that the GPU does indeed update the CSB entries! */

memset(execlists->csb_status, -1, (reset_value + 1) * sizeof(u64));
-   invalidate_csb_entries(&execlists->csb_status[0],
-  &execlists->csb_status[reset_value]);
+   drm_clflush_virt_range(&execlists->csb_status[0],
+   sizeof(&execlists->csb_status[reset_value]));


Hm I thought we covered this already, should be:

drm_clflush_virt_range(&execlists->csb_status[0],
   execlists->csb_size * sizeof(execlists->csb_status[0]));

Regards,

Tvrtko

  
  	/* Once more for luck and our trusty paranoia */

ENGINE_WRITE(engine, RING_CONTEXT_STATUS_PTR,
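
(The problem with the hunk as posted is that
sizeof(&execlists->csb_status[reset_value]) is the size of a pointer, i.e.
8 bytes, not the size of the CSB array, so only a fraction of the entries
would be flushed - hence the suggested
execlists->csb_size * sizeof(execlists->csb_status[0]) length.)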


Re: [PATCH v5 0/5] Use drm_clflush* instead of clflush

2022-02-07 Thread Tvrtko Ursulin



On 04/02/2022 16:37, Michael Cheng wrote:

This patch series re-works a few i915 functions to use drm_clflush_virt_range
instead of calling clflush or clflushopt directly. This will prevent errors
when building for non-x86 architectures.

v2: s/PAGE_SIZE/sizeof(value) for Re-work intel_write_status_page and added
more patches to convert additional clflush/clflushopt to use drm_clflush*.
(Michael Cheng)

v3: Drop invalidate_csb_entries and directly invoke drm_clflush_virt_ran

v4: Remove extra memory barriers

v5: s/cache_clflush_range/drm_clflush_virt_range


Is anyone interested in this story noticing my open question? I will repeat:

How about we add i915_clflush_virt_range as static inline and by doing 
so avoid adding function calls to code paths which are impossible on Arm 
builds? Case in point relocations, probably execlists backend as well.


Downside would be effectively duplicating drm_clflush_virt_range code. 
But for me, (Also considering no other driver calls it so why it is 
there? Should it be deleted?), that would be okay.


Regards,

Tvrtko


Michael Cheng (5):
   drm/i915/gt: Re-work intel_write_status_page
   drm/i915/gt: Drop invalidate_csb_entries
   drm/i915/gt: Re-work reset_csb
   drm/i915/: Re-work clflush_write32
   drm/i915/gt: replace cache_clflush_range

  .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  8 +++-
  drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 12 ++--
  drivers/gpu/drm/i915/gt/intel_engine.h| 13 -
  .../drm/i915/gt/intel_execlists_submission.c  | 19 ++-
  drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +-
  drivers/gpu/drm/i915/gt/intel_ppgtt.c |  2 +-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  2 +-
  7 files changed, 22 insertions(+), 36 deletions(-)
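
For illustration, a minimal sketch of the static inline wrapper suggested
above - the name and placement are hypothetical, and it simply compiles the
flush away on architectures where these paths cannot be reached:

/* e.g. somewhere in i915; the name is illustrative only */
static inline void i915_clflush_virt_range(void *addr, unsigned long length)
{
	if (IS_ENABLED(CONFIG_X86))
		drm_clflush_virt_range(addr, length);
}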



Re: [PATCH 1/3] i915/gvt: Introduce the mmio_table.c to support VFIO new mdev API

2022-02-07 Thread Jani Nikula
On Mon, 07 Feb 2022, Christoph Hellwig  wrote:
> On Mon, Feb 07, 2022 at 06:57:13AM -0500, Zhi Wang wrote:
>> Hi Christoph and Jani:
>>
>> Thanks for the comments. It would be nice if people can reach an 
>> agreement. I am OK with both of the options, and moving some files into 
>> different folders doesn't need me to do the full test run again. :)
>
> The way I understood Jani he agrees that the mmio table, which needs to
> be part of the core i915 module should not be under the gvt/ subdiretory.
> I.e. it could be drivers/gpu/drm/i915/intel_gvt_mmio_table.c.  The
> declarations could then go either into drivers/gpu/drm/i915/intel_gvt.h
> or drivers/gpu/drm/i915/intel_gvt_mmio_table.h.

Correct.

Generally I prefer to have the declarations for stuff in intel_foo.c to
be placed in intel_foo.h, and named intel_foo_*.


BR,
Jani.

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH v2 1/4] drm/format-helper: Add drm_fb_{xrgb8888, gray8}_to_mono_reversed()

2022-02-07 Thread Thomas Zimmermann

Hi

Am 04.02.22 um 22:02 schrieb Ilia Mirkin:

On Fri, Feb 4, 2022 at 10:53 AM Thomas Zimmermann  wrote:


Hi

Am 04.02.22 um 14:43 schrieb Javier Martinez Canillas:

Add support to convert XR24 and 8-bit grayscale to reversed monochrome for
drivers that control monochromatic panels, that only have 1 bit per pixel.

The drm_fb_gray8_to_mono_reversed() helper was based on the function that
does the same in the drivers/gpu/drm/tiny/repaper.c driver.

Signed-off-by: Javier Martinez Canillas 
---

(no changes since v1)

   drivers/gpu/drm/drm_format_helper.c | 80 +
   include/drm/drm_format_helper.h |  7 +++
   2 files changed, 87 insertions(+)

diff --git a/drivers/gpu/drm/drm_format_helper.c 
b/drivers/gpu/drm/drm_format_helper.c
index 0f28dd2bdd72..cdce4b7c25d9 100644
--- a/drivers/gpu/drm/drm_format_helper.c
+++ b/drivers/gpu/drm/drm_format_helper.c
@@ -584,3 +584,83 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int 
dst_pitch, uint32_t dst_for
   return -EINVAL;
   }
   EXPORT_SYMBOL(drm_fb_blit_toio);
+
+static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, size_t 
pixels)
+{
+ unsigned int xb, i;
+
+ for (xb = 0; xb < pixels / 8; xb++) {


In practice, all mode widths are multiples of 8 because VGA mandated it.
So it's ok-ish to assume this here. You should probably at least print a
warning somewhere if (pixels % 8 != 0)


Not sure if it's relevant, but 1366x768 was a fairly popular laptop
resolution. There's even a dedicated drm_mode_fixup_1366x768 in
drm_edid.c. (Would it have killed them to add 2 more horizontal
pixels? Apparently.)


D'oh!

Do you know how the text console looks in this mode? Fonts still expect 
a multiple of 8.


Best regards
Thomas



Cheers,

   -ilia


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature
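
For the 1366-wide case raised above: 1366 % 8 == 6, so a line fills 170 whole
output bytes and leaves 6 pixels over. A sketch of how the line helper could
handle such a tail (illustrative only; the threshold and bit order here are
assumptions, not the merged helper):

static void gray8_to_mono_reversed_line(u8 *dst, const u8 *src, size_t pixels)
{
	size_t xb, i, rem = pixels % 8;
	u8 byte;

	for (xb = 0; xb < pixels / 8; xb++) {
		byte = 0;
		for (i = 0; i < 8; i++)
			if (src[xb * 8 + i] >> 7)	/* assumed threshold */
				byte |= BIT(i);
		dst[xb] = byte;
	}

	if (rem) {
		byte = 0;
		for (i = 0; i < rem; i++)
			if (src[xb * 8 + i] >> 7)
				byte |= BIT(i);
		dst[xb] = byte;		/* partial final byte, e.g. 6 pixels */
	}
}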


Re: [Intel-gfx] [PATCH v5 0/5] Use drm_clflush* instead of clflush

2022-02-07 Thread Jani Nikula
On Mon, 07 Feb 2022, Tvrtko Ursulin  wrote:
> On 04/02/2022 16:37, Michael Cheng wrote:
>> This patch series re-works a few i915 functions to use drm_clflush_virt_range
>> instead of calling clflush or clflushopt directly. This will prevent errors
>> when building for non-x86 architectures.
>> 
>> v2: s/PAGE_SIZE/sizeof(value) for Re-work intel_write_status_page and added
>> more patches to convert additional clflush/clflushopt to use drm_clflush*.
>> (Michael Cheng)
>> 
>> v3: Drop invalidate_csb_entries and directly invoke drm_clflush_virt_ran
>> 
>> v4: Remove extra memory barriers
>> 
>> v5: s/cache_clflush_range/drm_clflush_virt_range
>
> Is anyone interested in this story noticing my open question? I will repeat:
>
> How about we add i915_clflush_virt_range as static inline and by doing 
> so avoid adding function calls to code paths which are impossible on Arm 
> builds? Case in point relocations, probably execlists backend as well.
>
> Downside would be effectively duplicating drm_clflush_virt_range code. 
> But for me, (Also considering no other driver calls it so why it is 
> there? Should it be deleted?), that would be okay.

Keep it simple first, optimize later if necessary?

BR,
Jani.


>
> Regards,
>
> Tvrtko
>
>> Michael Cheng (5):
>>drm/i915/gt: Re-work intel_write_status_page
>>drm/i915/gt: Drop invalidate_csb_entries
>>drm/i915/gt: Re-work reset_csb
>>drm/i915/: Re-work clflush_write32
>>drm/i915/gt: replace cache_clflush_range
>> 
>>   .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  8 +++-
>>   drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 12 ++--
>>   drivers/gpu/drm/i915/gt/intel_engine.h| 13 -
>>   .../drm/i915/gt/intel_execlists_submission.c  | 19 ++-
>>   drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +-
>>   drivers/gpu/drm/i915/gt/intel_ppgtt.c |  2 +-
>>   .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  2 +-
>>   7 files changed, 22 insertions(+), 36 deletions(-)
>> 

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH] drm/privacy-screen: Fix sphinx warning

2022-02-07 Thread Hans de Goede
Hi,

Thank you for the review.

On 2/7/22 12:43, Simon Ser wrote:
> On Wednesday, January 26th, 2022 at 16:11, Hans de Goede 
>  wrote:
> 
>> - * A pointer to the drm_privacy_screen's struct is passed as the void *data
>> + * A pointer to the drm_privacy_screen's struct is passed as the void \*data
> 
> Maybe we can use @data here instead? It's used to refer to arguments or struct
> members.

It is not an argument to the function being described; nor is it a struct 
member,
it is the void *data in:

typedef int (*notifier_fn_t)(struct notifier_block *nb,
unsigned long action, void *data);


> Alternatively, use double backquotes to format it as inline code blocks:
> 
>``void *data``

So I'll prepare a v2 using this.

Regards,

Hans



Re: [PATCH] drm/privacy-screen: Fix sphinx warning

2022-02-07 Thread Simon Ser
On Monday, February 7th, 2022 at 13:55, Hans de Goede  
wrote:

> It is not an argument to the function being described

Ah right, makes sense then!


[PATCH v2 00/12] iio: buffer-dma: write() and new DMABUF based API

2022-02-07 Thread Paul Cercueil
Hi Jonathan,

This is the V2 of my patchset that introduces a new userspace interface
based on DMABUF objects to complement the fileio API, and adds write()
support to the existing fileio API.

Changes since v1:

- the patches that were merged in v1 have been (obviously) dropped from
  this patchset;
- the patch that was setting the write-combine cache setting has been
  dropped as well, as it was simply not useful.
- [01/12]: 
* Only remove the outgoing queue, and keep the incoming queue, as we
  want the buffer to start streaming data as soon as it is enabled.
* Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the
  same as IIO_BLOCK_STATE_DONE.
- [02/12]:
* Fix block->state not being reset in
  iio_dma_buffer_request_update() for output buffers.
* Only update block->bytes_used once and add a comment about why we
  update it.
* Add a comment about why we're setting a different state for output
  buffers in iio_dma_buffer_request_update()
* Remove useless cast to bool (!!) in iio_dma_buffer_io()
- [05/12]:
Only allow the new IOCTLs on the buffer FD created with
IIO_BUFFER_GET_FD_IOCTL().
- [12/12]:
* Explicitly state that the new interface is optional and is
  not implemented by all drivers.
* The IOCTLs can now only be called on the buffer FD returned by
  IIO_BUFFER_GET_FD_IOCTL.
* Move the page up a bit in the index since it is core stuff and not
  driver-specific.

The patches not listed here have not been modified since v1.

Cheers,
-Paul

Alexandru Ardelean (1):
  iio: buffer-dma: split iio_dma_buffer_fileio_free() function

Paul Cercueil (11):
  iio: buffer-dma: Get rid of outgoing queue
  iio: buffer-dma: Enable buffer write support
  iio: buffer-dmaengine: Support specifying buffer direction
  iio: buffer-dmaengine: Enable write support
  iio: core: Add new DMABUF interface infrastructure
  iio: buffer-dma: Use DMABUFs instead of custom solution
  iio: buffer-dma: Implement new DMABUF based userspace API
  iio: buffer-dmaengine: Support new DMABUF based userspace API
  iio: core: Add support for cyclic buffers
  iio: buffer-dmaengine: Add support for cyclic buffers
  Documentation: iio: Document high-speed DMABUF based API

 Documentation/driver-api/dma-buf.rst  |   2 +
 Documentation/iio/dmabuf_api.rst  |  94 +++
 Documentation/iio/index.rst   |   2 +
 drivers/iio/adc/adi-axi-adc.c |   3 +-
 drivers/iio/buffer/industrialio-buffer-dma.c  | 610 ++
 .../buffer/industrialio-buffer-dmaengine.c|  42 +-
 drivers/iio/industrialio-buffer.c |  60 ++
 include/linux/iio/buffer-dma.h|  38 +-
 include/linux/iio/buffer-dmaengine.h  |   5 +-
 include/linux/iio/buffer_impl.h   |   8 +
 include/uapi/linux/iio/buffer.h   |  30 +
 11 files changed, 749 insertions(+), 145 deletions(-)
 create mode 100644 Documentation/iio/dmabuf_api.rst

-- 
2.34.1



[PATCH v2 01/12] iio: buffer-dma: Get rid of outgoing queue

2022-02-07 Thread Paul Cercueil
The buffer-dma code was using two queues, incoming and outgoing, to
manage the state of the blocks in use.

While this totally works, it adds some complexity to the code,
especially since the code only manages 2 blocks. It is much easier to
just check each block's state manually, and keep a counter for the next
block to dequeue.

Since the new DMABUF based API wouldn't use the outgoing queue anyway,
getting rid of it now makes the upcoming changes simpler.

With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and can
be removed.

v2: - Only remove the outgoing queue, and keep the incoming queue, as we
  want the buffer to start streaming data as soon as it is enabled.
- Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the
  same as IIO_BLOCK_STATE_DONE.

Signed-off-by: Paul Cercueil 
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 44 ++--
 include/linux/iio/buffer-dma.h   |  7 ++--
 2 files changed, 26 insertions(+), 25 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c 
b/drivers/iio/buffer/industrialio-buffer-dma.c
index d348af8b9705..1fc91467d1aa 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -179,7 +179,7 @@ static struct iio_dma_buffer_block 
*iio_dma_buffer_alloc_block(
}
 
block->size = size;
-   block->state = IIO_BLOCK_STATE_DEQUEUED;
+   block->state = IIO_BLOCK_STATE_DONE;
block->queue = queue;
INIT_LIST_HEAD(&block->head);
kref_init(&block->kref);
@@ -191,16 +191,8 @@ static struct iio_dma_buffer_block 
*iio_dma_buffer_alloc_block(
 
 static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
 {
-   struct iio_dma_buffer_queue *queue = block->queue;
-
-   /*
-* The buffer has already been freed by the application, just drop the
-* reference.
-*/
-   if (block->state != IIO_BLOCK_STATE_DEAD) {
+   if (block->state != IIO_BLOCK_STATE_DEAD)
block->state = IIO_BLOCK_STATE_DONE;
-   list_add_tail(&block->head, &queue->outgoing);
-   }
 }
 
 /**
@@ -261,7 +253,6 @@ static bool iio_dma_block_reusable(struct 
iio_dma_buffer_block *block)
 * not support abort and has not given back the block yet.
 */
switch (block->state) {
-   case IIO_BLOCK_STATE_DEQUEUED:
case IIO_BLOCK_STATE_QUEUED:
case IIO_BLOCK_STATE_DONE:
return true;
@@ -317,7 +308,6 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 * dead. This means we can reset the lists without having to fear
 * corrution.
 */
-   INIT_LIST_HEAD(&queue->outgoing);
spin_unlock_irq(&queue->list_lock);
 
INIT_LIST_HEAD(&queue->incoming);
@@ -456,14 +446,20 @@ static struct iio_dma_buffer_block 
*iio_dma_buffer_dequeue(
struct iio_dma_buffer_queue *queue)
 {
struct iio_dma_buffer_block *block;
+   unsigned int idx;
 
spin_lock_irq(&queue->list_lock);
-   block = list_first_entry_or_null(&queue->outgoing, struct
-   iio_dma_buffer_block, head);
-   if (block != NULL) {
-   list_del(&block->head);
-   block->state = IIO_BLOCK_STATE_DEQUEUED;
+
+   idx = queue->fileio.next_dequeue;
+   block = queue->fileio.blocks[idx];
+
+   if (block->state == IIO_BLOCK_STATE_DONE) {
+   idx = (idx + 1) % ARRAY_SIZE(queue->fileio.blocks);
+   queue->fileio.next_dequeue = idx;
+   } else {
+   block = NULL;
}
+
spin_unlock_irq(&queue->list_lock);
 
return block;
@@ -539,6 +535,7 @@ size_t iio_dma_buffer_data_available(struct iio_buffer *buf)
struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buf);
struct iio_dma_buffer_block *block;
size_t data_available = 0;
+   unsigned int i;
 
/*
 * For counting the available bytes we'll use the size of the block not
@@ -552,8 +549,15 @@ size_t iio_dma_buffer_data_available(struct iio_buffer 
*buf)
data_available += queue->fileio.active_block->size;
 
spin_lock_irq(&queue->list_lock);
-   list_for_each_entry(block, &queue->outgoing, head)
-   data_available += block->size;
+
+   for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
+   block = queue->fileio.blocks[i];
+
+   if (block != queue->fileio.active_block
+   && block->state == IIO_BLOCK_STATE_DONE)
+   data_available += block->size;
+   }
+
spin_unlock_irq(&queue->list_lock);
mutex_unlock(&queue->lock);
 
@@ -617,7 +621,6 @@ int iio_dma_buffer_init(struct iio_dma_buffer_queue *queue,
queue->ops = ops;
 
INIT_LIST_HEAD(&queue->incoming);
-   INIT_LIST_HEAD(&queue->outgoing);
 
mutex_init(&queue->lock);
spin_lock_init(&queue->list_

[PATCH v2 02/12] iio: buffer-dma: Enable buffer write support

2022-02-07 Thread Paul Cercueil
Adding write support to the buffer-dma code is easy - the write()
function basically needs to do the exact same thing as the read()
function: dequeue a block, read or write the data, enqueue the block
when entirely processed.

Therefore, the iio_buffer_dma_read() and the new iio_buffer_dma_write()
now both call a function iio_buffer_dma_io(), which will perform this
task.

The .space_available() callback can return the exact same value as the
.data_available() callback for input buffers, since in both cases we
count the exact same thing (the number of bytes in each available
block).

Note that we preemptively reset block->bytes_used to the buffer's size
in iio_dma_buffer_request_update(), as in the future the
iio_dma_buffer_enqueue() function won't reset it.

v2: - Fix block->state not being reset in
  iio_dma_buffer_request_update() for output buffers.
- Only update block->bytes_used once and add a comment about why we
  update it.
- Add a comment about why we're setting a different state for output
  buffers in iio_dma_buffer_request_update()
- Remove useless cast to bool (!!) in iio_dma_buffer_io()

Signed-off-by: Paul Cercueil 
Reviewed-by: Alexandru Ardelean 
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 88 
 include/linux/iio/buffer-dma.h   |  7 ++
 2 files changed, 79 insertions(+), 16 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c 
b/drivers/iio/buffer/industrialio-buffer-dma.c
index 1fc91467d1aa..a9f1b673374f 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -195,6 +195,18 @@ static void _iio_dma_buffer_block_done(struct 
iio_dma_buffer_block *block)
block->state = IIO_BLOCK_STATE_DONE;
 }
 
+static void iio_dma_buffer_queue_wake(struct iio_dma_buffer_queue *queue)
+{
+   __poll_t flags;
+
+   if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN)
+   flags = EPOLLIN | EPOLLRDNORM;
+   else
+   flags = EPOLLOUT | EPOLLWRNORM;
+
+   wake_up_interruptible_poll(&queue->buffer.pollq, flags);
+}
+
 /**
  * iio_dma_buffer_block_done() - Indicate that a block has been completed
  * @block: The completed block
@@ -212,7 +224,7 @@ void iio_dma_buffer_block_done(struct iio_dma_buffer_block 
*block)
spin_unlock_irqrestore(&queue->list_lock, flags);
 
iio_buffer_block_put_atomic(block);
-   wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | EPOLLRDNORM);
+   iio_dma_buffer_queue_wake(queue);
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
 
@@ -241,7 +253,7 @@ void iio_dma_buffer_block_list_abort(struct 
iio_dma_buffer_queue *queue,
}
spin_unlock_irqrestore(&queue->list_lock, flags);
 
-   wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | EPOLLRDNORM);
+   iio_dma_buffer_queue_wake(queue);
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_block_list_abort);
 
@@ -335,8 +347,24 @@ int iio_dma_buffer_request_update(struct iio_buffer 
*buffer)
queue->fileio.blocks[i] = block;
}
 
-   block->state = IIO_BLOCK_STATE_QUEUED;
-   list_add_tail(&block->head, &queue->incoming);
+   /*
+* block->bytes_used may have been modified previously, e.g. by
+* iio_dma_buffer_block_list_abort(). Reset it here to the
+* block's so that iio_dma_buffer_io() will work.
+*/
+   block->bytes_used = block->size;
+
+   /*
+* If it's an input buffer, mark the block as queued, and
+* iio_dma_buffer_enable() will submit it. Otherwise mark it as
+* done, which means it's ready to be dequeued.
+*/
+   if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) {
+   block->state = IIO_BLOCK_STATE_QUEUED;
+   list_add_tail(&block->head, &queue->incoming);
+   } else {
+   block->state = IIO_BLOCK_STATE_DONE;
+   }
}
 
 out_unlock:
@@ -465,20 +493,12 @@ static struct iio_dma_buffer_block 
*iio_dma_buffer_dequeue(
return block;
 }
 
-/**
- * iio_dma_buffer_read() - DMA buffer read callback
- * @buffer: Buffer to read form
- * @n: Number of bytes to read
- * @user_buffer: Userspace buffer to copy the data to
- *
- * Should be used as the read callback for iio_buffer_access_ops
- * struct for DMA buffers.
- */
-int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
-   char __user *user_buffer)
+static int iio_dma_buffer_io(struct iio_buffer *buffer,
+size_t n, char __user *user_buffer, bool is_write)
 {
struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer);
struct iio_dma_buffer_block *block;
+   void *addr;
int ret;
 
if (n < buffer->bytes_per_datum)
@@ -501,8 +521,13 @@ int i

[PATCH v2 03/12] iio: buffer-dmaengine: Support specifying buffer direction

2022-02-07 Thread Paul Cercueil
Update the devm_iio_dmaengine_buffer_setup() function to support
specifying the buffer direction.

Update the iio_dmaengine_buffer_submit() function to handle input
buffers as well as output buffers.
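
For an output (DAC-style) driver, the call would then look roughly like
this (illustrative only; "tx" is an assumed DMA channel name for such a
device):

	ret = devm_iio_dmaengine_buffer_setup(indio_dev->dev.parent,
					      indio_dev, "tx",
					      IIO_BUFFER_DIRECTION_OUT);
	if (ret)
		return ret;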

Signed-off-by: Paul Cercueil 
Reviewed-by: Alexandru Ardelean 
---
 drivers/iio/adc/adi-axi-adc.c |  3 ++-
 .../buffer/industrialio-buffer-dmaengine.c| 24 +++
 include/linux/iio/buffer-dmaengine.h  |  5 +++-
 3 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/drivers/iio/adc/adi-axi-adc.c b/drivers/iio/adc/adi-axi-adc.c
index a73e3c2d212f..0a6f2c32b1b9 100644
--- a/drivers/iio/adc/adi-axi-adc.c
+++ b/drivers/iio/adc/adi-axi-adc.c
@@ -113,7 +113,8 @@ static int adi_axi_adc_config_dma_buffer(struct device *dev,
dma_name = "rx";
 
return devm_iio_dmaengine_buffer_setup(indio_dev->dev.parent,
-  indio_dev, dma_name);
+  indio_dev, dma_name,
+  IIO_BUFFER_DIRECTION_IN);
 }
 
 static int adi_axi_adc_read_raw(struct iio_dev *indio_dev,
diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c 
b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
index f8ce26a24c57..ac26b04aa4a9 100644
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
+++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
@@ -64,14 +64,25 @@ static int iio_dmaengine_buffer_submit_block(struct 
iio_dma_buffer_queue *queue,
struct dmaengine_buffer *dmaengine_buffer =
iio_buffer_to_dmaengine_buffer(&queue->buffer);
struct dma_async_tx_descriptor *desc;
+   enum dma_transfer_direction dma_dir;
+   size_t max_size;
dma_cookie_t cookie;
 
-   block->bytes_used = min(block->size, dmaengine_buffer->max_size);
-   block->bytes_used = round_down(block->bytes_used,
-   dmaengine_buffer->align);
+   max_size = min(block->size, dmaengine_buffer->max_size);
+   max_size = round_down(max_size, dmaengine_buffer->align);
+
+   if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) {
+   block->bytes_used = max_size;
+   dma_dir = DMA_DEV_TO_MEM;
+   } else {
+   dma_dir = DMA_MEM_TO_DEV;
+   }
+
+   if (!block->bytes_used || block->bytes_used > max_size)
+   return -EINVAL;
 
desc = dmaengine_prep_slave_single(dmaengine_buffer->chan,
-   block->phys_addr, block->bytes_used, DMA_DEV_TO_MEM,
+   block->phys_addr, block->bytes_used, dma_dir,
DMA_PREP_INTERRUPT);
if (!desc)
return -ENOMEM;
@@ -275,7 +286,8 @@ static struct iio_buffer 
*devm_iio_dmaengine_buffer_alloc(struct device *dev,
  */
 int devm_iio_dmaengine_buffer_setup(struct device *dev,
struct iio_dev *indio_dev,
-   const char *channel)
+   const char *channel,
+   enum iio_buffer_direction dir)
 {
struct iio_buffer *buffer;
 
@@ -286,6 +298,8 @@ int devm_iio_dmaengine_buffer_setup(struct device *dev,
 
indio_dev->modes |= INDIO_BUFFER_HARDWARE;
 
+   buffer->direction = dir;
+
return iio_device_attach_buffer(indio_dev, buffer);
 }
 EXPORT_SYMBOL_GPL(devm_iio_dmaengine_buffer_setup);
diff --git a/include/linux/iio/buffer-dmaengine.h 
b/include/linux/iio/buffer-dmaengine.h
index 5c355be89814..538d0479cdd6 100644
--- a/include/linux/iio/buffer-dmaengine.h
+++ b/include/linux/iio/buffer-dmaengine.h
@@ -7,11 +7,14 @@
 #ifndef __IIO_DMAENGINE_H__
 #define __IIO_DMAENGINE_H__
 
+#include 
+
 struct iio_dev;
 struct device;
 
 int devm_iio_dmaengine_buffer_setup(struct device *dev,
struct iio_dev *indio_dev,
-   const char *channel);
+   const char *channel,
+   enum iio_buffer_direction dir);
 
 #endif
-- 
2.34.1



[PATCH v2 04/12] iio: buffer-dmaengine: Enable write support

2022-02-07 Thread Paul Cercueil
Use the iio_dma_buffer_write() and iio_dma_buffer_space_available()
functions provided by the buffer-dma core, to enable write support in
the buffer-dmaengine code.

Signed-off-by: Paul Cercueil 
Reviewed-by: Alexandru Ardelean 
---
 drivers/iio/buffer/industrialio-buffer-dmaengine.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c 
b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
index ac26b04aa4a9..5cde8fd81c7f 100644
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
+++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
@@ -123,12 +123,14 @@ static void iio_dmaengine_buffer_release(struct 
iio_buffer *buf)
 
 static const struct iio_buffer_access_funcs iio_dmaengine_buffer_ops = {
.read = iio_dma_buffer_read,
+   .write = iio_dma_buffer_write,
.set_bytes_per_datum = iio_dma_buffer_set_bytes_per_datum,
.set_length = iio_dma_buffer_set_length,
.request_update = iio_dma_buffer_request_update,
.enable = iio_dma_buffer_enable,
.disable = iio_dma_buffer_disable,
.data_available = iio_dma_buffer_data_available,
+   .space_available = iio_dma_buffer_space_available,
.release = iio_dmaengine_buffer_release,
 
.modes = INDIO_BUFFER_HARDWARE,
-- 
2.34.1



[PATCH v2 05/12] iio: core: Add new DMABUF interface infrastructure

2022-02-07 Thread Paul Cercueil
Add the necessary infrastructure to the IIO core to support a new
optional DMABUF based interface.

The advantage of this new DMABUF based interface vs. the read()
interface, is that it avoids an extra copy of the data between the
kernel and userspace. This is particularly useful for high-speed
devices which produce several megabytes or even gigabytes of data per
second.

The data in this new DMABUF interface is managed at the granularity of
DMABUF objects. Reducing the granularity from byte level to block level
is done to reduce the userspace-kernelspace synchronization overhead
since performing syscalls for each byte at a few Mbps is just not
feasible.

This of course leads to a slightly increased latency. For this reason an
application can choose the size of the DMABUFs as well as how many it
allocates. E.g. two DMABUFs would be a traditional double buffering
scheme. But using a higher number might be necessary to avoid
underflow/overflow situations in the presence of scheduling latencies.

As part of the interface, 2 new IOCTLs have been added:

IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *):
 Each call will allocate a new DMABUF object. The return value (if not
 a negative errno value as error) will be the file descriptor of the new
 DMABUF.

IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *):
 Place the DMABUF object into the queue pending for hardware processing.

These two IOCTLs have to be performed on the IIO buffer's file
descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.

To access the data stored in a block by userspace the block must be
mapped to the process's memory. This is done by calling mmap() on the
DMABUF's file descriptor.

Before accessing the data through the map, you must use the
DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
DMA_BUF_SYNC_START flag, to make sure that the data is available.
This call may block until the hardware is done with this block. Once
you are done reading or writing the data, you must use this ioctl again
with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the
kernel's queue.

If you need to know when the hardware is done with a DMABUF, you can
poll its file descriptor for the EPOLLOUT event.

Finally, to destroy a DMABUF object, simply call close() on its file
descriptor.

A typical workflow for the new interface is:

  for block in blocks:
DMABUF_ALLOC block
mmap block

  enable buffer

  while !done
for block in blocks:
  DMABUF_ENQUEUE block

  DMABUF_SYNC_START block
  process data
  DMABUF_SYNC_END block

  disable buffer

  for block in blocks:
close block
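
A minimal C sketch of that workflow, for illustration only: the ioctl names
come from this series, but the iio_dmabuf_alloc_req/iio_dmabuf field names
(.size, .fd, .flags) are assumed to match the uapi header added here, the
device path and sizes are arbitrary, and error handling is mostly omitted.

#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/dma-buf.h>
#include <linux/iio/buffer.h>

#define NUM_BLOCKS	4
#define BLOCK_SIZE	(1024 * 1024)

int main(void)
{
	int dev_fd, buf_fd, block_fd[NUM_BLOCKS], i;
	void *map[NUM_BLOCKS];

	dev_fd = open("/dev/iio:device0", O_RDWR);
	if (dev_fd < 0 ||
	    ioctl(dev_fd, IIO_BUFFER_GET_FD_IOCTL, &buf_fd) < 0)
		return 1;

	for (i = 0; i < NUM_BLOCKS; i++) {
		struct iio_dmabuf_alloc_req req = { .size = BLOCK_SIZE };

		block_fd[i] = ioctl(buf_fd, IIO_BUFFER_DMABUF_ALLOC_IOCTL, &req);
		map[i] = mmap(NULL, BLOCK_SIZE, PROT_READ | PROT_WRITE,
			      MAP_SHARED, block_fd[i], 0);
	}

	/* ... enable the buffer here, e.g. through the usual sysfs knob ... */

	for (i = 0; i < NUM_BLOCKS; i++) {
		struct iio_dmabuf iio_buf = { .fd = block_fd[i] };
		struct dma_buf_sync sync;

		ioctl(buf_fd, IIO_BUFFER_DMABUF_ENQUEUE_IOCTL, &iio_buf);

		/* May block until the hardware is done with this block. */
		sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_RW;
		ioctl(block_fd[i], DMA_BUF_IOCTL_SYNC, &sync);
		/* ... read or fill map[i] here ... */
		sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_RW;
		ioctl(block_fd[i], DMA_BUF_IOCTL_SYNC, &sync);
	}

	/* ... disable the buffer, then tear everything down ... */
	for (i = 0; i < NUM_BLOCKS; i++) {
		munmap(map[i], BLOCK_SIZE);
		close(block_fd[i]);
	}
	close(buf_fd);
	close(dev_fd);

	return 0;
}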

v2: Only allow the new IOCTLs on the buffer FD created with
IIO_BUFFER_GET_FD_IOCTL().

Signed-off-by: Paul Cercueil 
---
 drivers/iio/industrialio-buffer.c | 55 +++
 include/linux/iio/buffer_impl.h   |  8 +
 include/uapi/linux/iio/buffer.h   | 29 
 3 files changed, 92 insertions(+)

diff --git a/drivers/iio/industrialio-buffer.c 
b/drivers/iio/industrialio-buffer.c
index 94eb9f6cf128..72f333a519bc 100644
--- a/drivers/iio/industrialio-buffer.c
+++ b/drivers/iio/industrialio-buffer.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -1520,11 +1521,65 @@ static int iio_buffer_chrdev_release(struct inode 
*inode, struct file *filep)
return 0;
 }
 
+static int iio_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
+struct iio_dmabuf __user *user_buf)
+{
+   struct iio_dmabuf dmabuf;
+
+   if (!buffer->access->enqueue_dmabuf)
+   return -EPERM;
+
+   if (copy_from_user(&dmabuf, user_buf, sizeof(dmabuf)))
+   return -EFAULT;
+
+   if (dmabuf.flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS)
+   return -EINVAL;
+
+   return buffer->access->enqueue_dmabuf(buffer, &dmabuf);
+}
+
+static int iio_buffer_alloc_dmabuf(struct iio_buffer *buffer,
+  struct iio_dmabuf_alloc_req __user *user_req)
+{
+   struct iio_dmabuf_alloc_req req;
+
+   if (!buffer->access->alloc_dmabuf)
+   return -EPERM;
+
+   if (copy_from_user(&req, user_req, sizeof(req)))
+   return -EFAULT;
+
+   if (req.resv)
+   return -EINVAL;
+
+   return buffer->access->alloc_dmabuf(buffer, &req);
+}
+
+static long iio_buffer_chrdev_ioctl(struct file *filp,
+   unsigned int cmd, unsigned long arg)
+{
+   struct iio_dev_buffer_pair *ib = filp->private_data;
+   struct iio_buffer *buffer = ib->buffer;
+   void __user *_arg = (void __user *)arg;
+
+   switch (cmd) {
+   case IIO_BUFFER_DMABUF_ALLOC_IOCTL:
+   return iio_buffer_alloc_dmabuf(buffer, _arg);
+   case IIO_BUFFER_DMABUF_ENQUEUE_IOCTL:
+   /* TODO: support non-blocking enqueue operation */
+   return iio_buffer_enqueue_dmabuf(buffer, _arg);
+   defaul

[PATCH v2 06/12] iio: buffer-dma: split iio_dma_buffer_fileio_free() function

2022-02-07 Thread Paul Cercueil
From: Alexandru Ardelean 

Part of the logic in iio_dma_buffer_exit() is required for the change
to add mmap support to IIO buffers.
This change splits the logic into a separate function, which will be
re-used later.

Signed-off-by: Alexandru Ardelean 
Cc: Alexandru Ardelean 
Signed-off-by: Paul Cercueil 
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 43 +++-
 1 file changed, 24 insertions(+), 19 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c 
b/drivers/iio/buffer/industrialio-buffer-dma.c
index a9f1b673374f..15ea7bc3ac08 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -374,6 +374,29 @@ int iio_dma_buffer_request_update(struct iio_buffer 
*buffer)
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_request_update);
 
+static void iio_dma_buffer_fileio_free(struct iio_dma_buffer_queue *queue)
+{
+   unsigned int i;
+
+   spin_lock_irq(&queue->list_lock);
+   for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
+   if (!queue->fileio.blocks[i])
+   continue;
+   queue->fileio.blocks[i]->state = IIO_BLOCK_STATE_DEAD;
+   }
+   spin_unlock_irq(&queue->list_lock);
+
+   INIT_LIST_HEAD(&queue->incoming);
+
+   for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
+   if (!queue->fileio.blocks[i])
+   continue;
+   iio_buffer_block_put(queue->fileio.blocks[i]);
+   queue->fileio.blocks[i] = NULL;
+   }
+   queue->fileio.active_block = NULL;
+}
+
 static void iio_dma_buffer_submit_block(struct iio_dma_buffer_queue *queue,
struct iio_dma_buffer_block *block)
 {
@@ -694,27 +717,9 @@ EXPORT_SYMBOL_GPL(iio_dma_buffer_init);
  */
 void iio_dma_buffer_exit(struct iio_dma_buffer_queue *queue)
 {
-   unsigned int i;
-
mutex_lock(&queue->lock);
 
-   spin_lock_irq(&queue->list_lock);
-   for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
-   if (!queue->fileio.blocks[i])
-   continue;
-   queue->fileio.blocks[i]->state = IIO_BLOCK_STATE_DEAD;
-   }
-   spin_unlock_irq(&queue->list_lock);
-
-   INIT_LIST_HEAD(&queue->incoming);
-
-   for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
-   if (!queue->fileio.blocks[i])
-   continue;
-   iio_buffer_block_put(queue->fileio.blocks[i]);
-   queue->fileio.blocks[i] = NULL;
-   }
-   queue->fileio.active_block = NULL;
+   iio_dma_buffer_fileio_free(queue);
queue->ops = NULL;
 
mutex_unlock(&queue->lock);
-- 
2.34.1



[PATCH v2 07/12] iio: buffer-dma: Use DMABUFs instead of custom solution

2022-02-07 Thread Paul Cercueil
Enhance the current fileio code by using DMABUF objects instead of
custom buffers.

This adds more code than it removes, but:
- a lot of the complexity can be dropped, e.g. custom kref and
  iio_buffer_block_put_atomic() are not needed anymore;
- it will be much easier to introduce an API to export these DMABUF
  objects to userspace in a following patch.

Signed-off-by: Paul Cercueil 
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 192 ---
 include/linux/iio/buffer-dma.h   |   8 +-
 2 files changed, 122 insertions(+), 78 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c 
b/drivers/iio/buffer/industrialio-buffer-dma.c
index 15ea7bc3ac08..54e6000cd2ee 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -90,103 +91,145 @@
  * callback is called from within the custom callback.
  */
 
-static void iio_buffer_block_release(struct kref *kref)
-{
-   struct iio_dma_buffer_block *block = container_of(kref,
-   struct iio_dma_buffer_block, kref);
-
-   WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
-
-   dma_free_coherent(block->queue->dev, PAGE_ALIGN(block->size),
-   block->vaddr, block->phys_addr);
-
-   iio_buffer_put(&block->queue->buffer);
-   kfree(block);
-}
-
-static void iio_buffer_block_get(struct iio_dma_buffer_block *block)
-{
-   kref_get(&block->kref);
-}
-
-static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
-{
-   kref_put(&block->kref, iio_buffer_block_release);
-}
-
-/*
- * dma_free_coherent can sleep, hence we need to take some special care to be
- * able to drop a reference from an atomic context.
- */
-static LIST_HEAD(iio_dma_buffer_dead_blocks);
-static DEFINE_SPINLOCK(iio_dma_buffer_dead_blocks_lock);
-
-static void iio_dma_buffer_cleanup_worker(struct work_struct *work)
-{
-   struct iio_dma_buffer_block *block, *_block;
-   LIST_HEAD(block_list);
-
-   spin_lock_irq(&iio_dma_buffer_dead_blocks_lock);
-   list_splice_tail_init(&iio_dma_buffer_dead_blocks, &block_list);
-   spin_unlock_irq(&iio_dma_buffer_dead_blocks_lock);
-
-   list_for_each_entry_safe(block, _block, &block_list, head)
-   iio_buffer_block_release(&block->kref);
-}
-static DECLARE_WORK(iio_dma_buffer_cleanup_work, 
iio_dma_buffer_cleanup_worker);
-
-static void iio_buffer_block_release_atomic(struct kref *kref)
-{
+struct iio_buffer_dma_buf_attachment {
+   struct scatterlist sgl;
+   struct sg_table sg_table;
struct iio_dma_buffer_block *block;
-   unsigned long flags;
-
-   block = container_of(kref, struct iio_dma_buffer_block, kref);
-
-   spin_lock_irqsave(&iio_dma_buffer_dead_blocks_lock, flags);
-   list_add_tail(&block->head, &iio_dma_buffer_dead_blocks);
-   spin_unlock_irqrestore(&iio_dma_buffer_dead_blocks_lock, flags);
-
-   schedule_work(&iio_dma_buffer_cleanup_work);
-}
-
-/*
- * Version of iio_buffer_block_put() that can be called from atomic context
- */
-static void iio_buffer_block_put_atomic(struct iio_dma_buffer_block *block)
-{
-   kref_put(&block->kref, iio_buffer_block_release_atomic);
-}
+};
 
 static struct iio_dma_buffer_queue *iio_buffer_to_queue(struct iio_buffer *buf)
 {
return container_of(buf, struct iio_dma_buffer_queue, buffer);
 }
 
+static struct iio_buffer_dma_buf_attachment *
+to_iio_buffer_dma_buf_attachment(struct sg_table *table)
+{
+   return container_of(table, struct iio_buffer_dma_buf_attachment, 
sg_table);
+}
+
+static void iio_buffer_block_get(struct iio_dma_buffer_block *block)
+{
+   get_dma_buf(block->dmabuf);
+}
+
+static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
+{
+   dma_buf_put(block->dmabuf);
+}
+
+static int iio_buffer_dma_buf_attach(struct dma_buf *dbuf,
+struct dma_buf_attachment *at)
+{
+   at->priv = dbuf->priv;
+
+   return 0;
+}
+
+static struct sg_table *iio_buffer_dma_buf_map(struct dma_buf_attachment *at,
+  enum dma_data_direction dma_dir)
+{
+   struct iio_dma_buffer_block *block = at->priv;
+   struct iio_buffer_dma_buf_attachment *dba;
+   int ret;
+
+   dba = kzalloc(sizeof(*dba), GFP_KERNEL);
+   if (!dba)
+   return ERR_PTR(-ENOMEM);
+
+   sg_init_one(&dba->sgl, block->vaddr, PAGE_ALIGN(block->size));
+   dba->sg_table.sgl = &dba->sgl;
+   dba->sg_table.nents = 1;
+   dba->block = block;
+
+   ret = dma_map_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
+   if (ret) {
+   kfree(dba);
+   return ERR_PTR(ret);
+   }
+
+   return &dba->sg_table;
+}
+
+static void iio_buffer_dma_buf_unmap(struct dma_buf_attachment *at,
+struct sg_table *sg_t

[PATCH v2 08/12] iio: buffer-dma: Implement new DMABUF based userspace API

2022-02-07 Thread Paul Cercueil
Implement the two functions iio_dma_buffer_alloc_dmabuf() and
iio_dma_buffer_enqueue_dmabuf(), as well as all the necessary bits to
enable userspace access to the DMABUF objects.

These two functions are exported as GPL symbols so that IIO buffer
implementations can support the new DMABUF based userspace API.

Signed-off-by: Paul Cercueil 
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 260 ++-
 include/linux/iio/buffer-dma.h   |  13 +
 2 files changed, 266 insertions(+), 7 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c 
b/drivers/iio/buffer/industrialio-buffer-dma.c
index 54e6000cd2ee..b9c3b01c5ea0 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -15,7 +15,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 
 /*
@@ -97,6 +99,18 @@ struct iio_buffer_dma_buf_attachment {
struct iio_dma_buffer_block *block;
 };
 
+struct iio_buffer_dma_fence {
+   struct dma_fence base;
+   struct iio_dma_buffer_block *block;
+   spinlock_t lock;
+};
+
+static struct iio_buffer_dma_fence *
+to_iio_buffer_dma_fence(struct dma_fence *fence)
+{
+   return container_of(fence, struct iio_buffer_dma_fence, base);
+}
+
 static struct iio_dma_buffer_queue *iio_buffer_to_queue(struct iio_buffer *buf)
 {
return container_of(buf, struct iio_dma_buffer_queue, buffer);
@@ -118,6 +132,48 @@ static void iio_buffer_block_put(struct 
iio_dma_buffer_block *block)
dma_buf_put(block->dmabuf);
 }
 
+static const char *
+iio_buffer_dma_fence_get_driver_name(struct dma_fence *fence)
+{
+   struct iio_buffer_dma_fence *iio_fence = to_iio_buffer_dma_fence(fence);
+
+   return dev_name(iio_fence->block->queue->dev);
+}
+
+static void iio_buffer_dma_fence_release(struct dma_fence *fence)
+{
+   struct iio_buffer_dma_fence *iio_fence = to_iio_buffer_dma_fence(fence);
+
+   kfree(iio_fence);
+}
+
+static const struct dma_fence_ops iio_buffer_dma_fence_ops = {
+   .get_driver_name= iio_buffer_dma_fence_get_driver_name,
+   .get_timeline_name  = iio_buffer_dma_fence_get_driver_name,
+   .release= iio_buffer_dma_fence_release,
+};
+
+static struct dma_fence *
+iio_dma_buffer_create_dma_fence(struct iio_dma_buffer_block *block)
+{
+   struct iio_buffer_dma_fence *fence;
+   u64 ctx;
+
+   fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+   if (!fence)
+   return ERR_PTR(-ENOMEM);
+
+   fence->block = block;
+   spin_lock_init(&fence->lock);
+
+   ctx = dma_fence_context_alloc(1);
+
+   dma_fence_init(&fence->base, &iio_buffer_dma_fence_ops,
+  &fence->lock, ctx, 0);
+
+   return &fence->base;
+}
+
 static int iio_buffer_dma_buf_attach(struct dma_buf *dbuf,
 struct dma_buf_attachment *at)
 {
@@ -162,10 +218,26 @@ static void iio_buffer_dma_buf_unmap(struct 
dma_buf_attachment *at,
kfree(dba);
 }
 
+static int iio_buffer_dma_buf_mmap(struct dma_buf *dbuf,
+  struct vm_area_struct *vma)
+{
+   struct iio_dma_buffer_block *block = dbuf->priv;
+   struct device *dev = block->queue->dev;
+
+   vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
+
+   if (vma->vm_ops->open)
+   vma->vm_ops->open(vma);
+
+   return dma_mmap_coherent(dev, vma, block->vaddr, block->phys_addr,
+vma->vm_end - vma->vm_start);
+}
+
 static void iio_buffer_dma_buf_release(struct dma_buf *dbuf)
 {
struct iio_dma_buffer_block *block = dbuf->priv;
struct iio_dma_buffer_queue *queue = block->queue;
+   bool is_fileio = block->fileio;
 
WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
 
@@ -175,6 +247,9 @@ static void iio_buffer_dma_buf_release(struct dma_buf *dbuf)
  block->vaddr, block->phys_addr);
kfree(block);
 
+   queue->num_blocks--;
+   if (is_fileio)
+   queue->num_fileio_blocks--;
mutex_unlock(&queue->lock);
iio_buffer_put(&queue->buffer);
 }
@@ -183,11 +258,12 @@ static const struct dma_buf_ops iio_dma_buffer_dmabuf_ops 
= {
.attach = iio_buffer_dma_buf_attach,
.map_dma_buf= iio_buffer_dma_buf_map,
.unmap_dma_buf  = iio_buffer_dma_buf_unmap,
+   .mmap   = iio_buffer_dma_buf_mmap,
.release= iio_buffer_dma_buf_release,
 };
 
 static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
-   struct iio_dma_buffer_queue *queue, size_t size)
+   struct iio_dma_buffer_queue *queue, size_t size, bool fileio)
 {
struct iio_dma_buffer_block *block;
DEFINE_DMA_BUF_EXPORT_INFO(einfo);
@@ -218,10 +294,15 @@ static struct iio_dma_buffer_block 
*iio_dma_buffer_alloc_block(
block->size = size;
block->state = IIO_BLOCK_STATE_

[PATCH v2 09/12] iio: buffer-dmaengine: Support new DMABUF based userspace API

2022-02-07 Thread Paul Cercueil
Use the functions provided by the buffer-dma core to implement the
DMABUF userspace API in the buffer-dmaengine IIO buffer implementation.

Signed-off-by: Paul Cercueil 
---
 drivers/iio/buffer/industrialio-buffer-dmaengine.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c 
b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
index 5cde8fd81c7f..57a8b2e4ba3c 100644
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
+++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
@@ -133,6 +133,9 @@ static const struct iio_buffer_access_funcs 
iio_dmaengine_buffer_ops = {
.space_available = iio_dma_buffer_space_available,
.release = iio_dmaengine_buffer_release,
 
+   .alloc_dmabuf = iio_dma_buffer_alloc_dmabuf,
+   .enqueue_dmabuf = iio_dma_buffer_enqueue_dmabuf,
+
.modes = INDIO_BUFFER_HARDWARE,
.flags = INDIO_BUFFER_FLAG_FIXED_WATERMARK,
 };
-- 
2.34.1



[PATCH v2 10/12] iio: core: Add support for cyclic buffers

2022-02-07 Thread Paul Cercueil
Introduce a new flag IIO_BUFFER_DMABUF_CYCLIC in the "flags" field of
the iio_dmabuf uapi structure.

When set, the DMABUF enqueued with the enqueue ioctl will be endlessly
repeated on the TX output, until the buffer is disabled.
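
For illustration, enqueueing a DMABUF as a cyclic TX buffer from userspace
would then look roughly like this (the struct iio_dmabuf field names are
assumed to match the uapi header from this series; buf_fd is the buffer's
file descriptor and dmabuf_fd one returned by the ALLOC ioctl):

	struct iio_dmabuf dmabuf = {
		.fd		= dmabuf_fd,
		.flags		= IIO_BUFFER_DMABUF_CYCLIC,
		.bytes_used	= 0,	/* 0: repeat the whole buffer */
	};

	ret = ioctl(buf_fd, IIO_BUFFER_DMABUF_ENQUEUE_IOCTL, &dmabuf);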

Signed-off-by: Paul Cercueil 
Reviewed-by: Alexandru Ardelean 
---
 drivers/iio/industrialio-buffer.c | 5 +
 include/uapi/linux/iio/buffer.h   | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/iio/industrialio-buffer.c 
b/drivers/iio/industrialio-buffer.c
index 72f333a519bc..85331cedaad8 100644
--- a/drivers/iio/industrialio-buffer.c
+++ b/drivers/iio/industrialio-buffer.c
@@ -1535,6 +1535,11 @@ static int iio_buffer_enqueue_dmabuf(struct iio_buffer 
*buffer,
if (dmabuf.flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS)
return -EINVAL;
 
+   /* Cyclic flag is only supported on output buffers */
+   if ((dmabuf.flags & IIO_BUFFER_DMABUF_CYCLIC) &&
+   buffer->direction != IIO_BUFFER_DIRECTION_OUT)
+   return -EINVAL;
+
return buffer->access->enqueue_dmabuf(buffer, &dmabuf);
 }
 
diff --git a/include/uapi/linux/iio/buffer.h b/include/uapi/linux/iio/buffer.h
index e4621b926262..2d541d038c02 100644
--- a/include/uapi/linux/iio/buffer.h
+++ b/include/uapi/linux/iio/buffer.h
@@ -7,7 +7,8 @@
 
 #include 
 
-#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS  0x
+#define IIO_BUFFER_DMABUF_CYCLIC   (1 << 0)
+#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS  0x0001
 
 /**
  * struct iio_dmabuf_alloc_req - Descriptor for allocating IIO DMABUFs
-- 
2.34.1



[PATCH v2 11/12] iio: buffer-dmaengine: Add support for cyclic buffers

2022-02-07 Thread Paul Cercueil
Handle the IIO_BUFFER_DMABUF_CYCLIC flag to support cyclic buffers.

Signed-off-by: Paul Cercueil 
Reviewed-by: Alexandru Ardelean 
---
 drivers/iio/buffer/industrialio-buffer-dma.c  |  1 +
 .../iio/buffer/industrialio-buffer-dmaengine.c| 15 ---
 include/linux/iio/buffer-dma.h|  3 +++
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c 
b/drivers/iio/buffer/industrialio-buffer-dma.c
index b9c3b01c5ea0..6185af2f33f0 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -901,6 +901,7 @@ int iio_dma_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
}
 
dma_block->bytes_used = iio_dmabuf->bytes_used ?: dma_block->size;
+   dma_block->cyclic = iio_dmabuf->flags & IIO_BUFFER_DMABUF_CYCLIC;
 
switch (dma_block->state) {
case IIO_BLOCK_STATE_QUEUED:
diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c 
b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
index 57a8b2e4ba3c..952e2160a11e 100644
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
+++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
@@ -81,9 +81,18 @@ static int iio_dmaengine_buffer_submit_block(struct 
iio_dma_buffer_queue *queue,
if (!block->bytes_used || block->bytes_used > max_size)
return -EINVAL;
 
-   desc = dmaengine_prep_slave_single(dmaengine_buffer->chan,
-   block->phys_addr, block->bytes_used, dma_dir,
-   DMA_PREP_INTERRUPT);
+   if (block->cyclic) {
+   desc = dmaengine_prep_dma_cyclic(dmaengine_buffer->chan,
+block->phys_addr,
+block->size,
+block->bytes_used,
+dma_dir, 0);
+   } else {
+   desc = dmaengine_prep_slave_single(dmaengine_buffer->chan,
+  block->phys_addr,
+  block->bytes_used, dma_dir,
+  DMA_PREP_INTERRUPT);
+   }
if (!desc)
return -ENOMEM;
 
diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
index 5bd687132355..3a5d9169e573 100644
--- a/include/linux/iio/buffer-dma.h
+++ b/include/linux/iio/buffer-dma.h
@@ -40,6 +40,7 @@ enum iio_block_state {
  * @phys_addr: Physical address of the blocks memory
  * @queue: Parent DMA buffer queue
  * @state: Current state of the block
+ * @cyclic: True if this is a cyclic buffer
  * @fileio: True if this buffer is used for fileio mode
  * @dmabuf: Underlying DMABUF object
  */
@@ -63,6 +64,8 @@ struct iio_dma_buffer_block {
 */
enum iio_block_state state;
 
+   bool cyclic;
+
bool fileio;
struct dma_buf *dmabuf;
 };
-- 
2.34.1



[PATCH v2 12/12] Documentation: iio: Document high-speed DMABUF based API

2022-02-07 Thread Paul Cercueil
Document the new DMABUF based API.

v2: - Explicitly state that the new interface is optional and is
  not implemented by all drivers.
- The IOCTLs can now only be called on the buffer FD returned by
  IIO_BUFFER_GET_FD_IOCTL.
- Move the page up a bit in the index since it is core stuff and not
  driver-specific.

Signed-off-by: Paul Cercueil 
---
 Documentation/driver-api/dma-buf.rst |  2 +
 Documentation/iio/dmabuf_api.rst | 94 
 Documentation/iio/index.rst  |  2 +
 3 files changed, 98 insertions(+)
 create mode 100644 Documentation/iio/dmabuf_api.rst

diff --git a/Documentation/driver-api/dma-buf.rst 
b/Documentation/driver-api/dma-buf.rst
index 2cd7db82d9fe..d3c9b58d2706 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -1,3 +1,5 @@
+.. _dma-buf:
+
 Buffer Sharing and Synchronization
 ==================================
 
diff --git a/Documentation/iio/dmabuf_api.rst b/Documentation/iio/dmabuf_api.rst
new file mode 100644
index ..43bb2c1b9fdc
--- /dev/null
+++ b/Documentation/iio/dmabuf_api.rst
@@ -0,0 +1,94 @@
+===================================
+High-speed DMABUF interface for IIO
+===================================
+
+1. Overview
+===========
+
+The Industrial I/O subsystem supports access to buffers through a file-based
+interface, with read() and write() access calls through the IIO device's dev
+node.
+
+It additionally supports a DMABUF based interface, where the userspace
+application can allocate and append DMABUF objects to the buffer's queue.
+This interface is however optional and is not available in all drivers.
+
+The advantage of this DMABUF based interface vs. the read()
+interface, is that it avoids an extra copy of the data between the
+kernel and userspace. This is particularly useful for high-speed
+devices which produce several megabytes or even gigabytes of data per
+second.
+
+The data in this DMABUF interface is managed at the granularity of
+DMABUF objects. Reducing the granularity from byte level to block level
+is done to reduce the userspace-kernelspace synchronization overhead
+since performing syscalls for each byte at a few Mbps is just not
+feasible.
+
+This of course leads to a slightly increased latency. For this reason an
+application can choose the size of the DMABUFs as well as how many it
+allocates. E.g. two DMABUFs would be a traditional double buffering
+scheme. But using a higher number might be necessary to avoid
+underflow/overflow situations in the presence of scheduling latencies.
+
+2. User API
+===========
+
+``IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *)``
+
+
+Each call will allocate a new DMABUF object. The return value (if not
+a negative errno value as error) will be the file descriptor of the new
+DMABUF.
+
+``IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *)``
+
+
+Place the DMABUF object into the queue pending for hardware processing.
+
+These two IOCTLs have to be performed on the IIO buffer's file
+descriptor, obtained using the `IIO_BUFFER_GET_FD_IOCTL` ioctl.
+
+3. Usage
+
+
+To access the data stored in a block by userspace the block must be
+mapped to the process's memory. This is done by calling mmap() on the
+DMABUF's file descriptor.
+
+Before accessing the data through the map, you must use the
+DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the
+DMA_BUF_SYNC_START flag, to make sure that the data is available.
+This call may block until the hardware is done with this block. Once
+you are done reading or writing the data, you must use this ioctl again
+with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the
+kernel's queue.
+
+If you need to know when the hardware is done with a DMABUF, you can
+poll its file descriptor for the EPOLLOUT event.
+
+Finally, to destroy a DMABUF object, simply call close() on its file
+descriptor.
+
+For more information about manipulating DMABUF objects, see: :ref:`dma-buf`.
+
+A typical workflow for the new interface is:
+
+for block in blocks:
+  DMABUF_ALLOC block
+  mmap block
+
+enable buffer
+
+while !done
+  for block in blocks:
+DMABUF_ENQUEUE block
+
+DMABUF_SYNC_START block
+process data
+DMABUF_SYNC_END block
+
+disable buffer
+
+for block in blocks:
+  close block
diff --git a/Documentation/iio/index.rst b/Documentation/iio/index.rst
index 58b7a4ebac51..669deb67ddee 100644
--- a/Documentation/iio/index.rst
+++ b/Documentation/iio/index.rst
@@ -9,4 +9,6 @@ Industrial I/O
 
iio_configfs
 
+   dmabuf_api
+
ep93xx_adc
-- 
2.34.1



[PATCH v2] drm/privacy-screen: Fix sphinx warning

2022-02-07 Thread Hans de Goede
Fix the following warning from "make htmldocs":

drivers/gpu/drm/drm_privacy_screen.c:270:
 WARNING: Inline emphasis start-string without end-string.

Fixes: 8a12b170558a ("drm/privacy-screen: Add notifier support (v2)")
Reported-by: Stephen Rothwell 
Signed-off-by: Hans de Goede 
---
Changes in v2:
- Use double backtick quotes around the "void *data" instead of using
  a backslash to escape the *
---
 drivers/gpu/drm/drm_privacy_screen.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_privacy_screen.c 
b/drivers/gpu/drm/drm_privacy_screen.c
index beaf99e9120a..b688841c18e4 100644
--- a/drivers/gpu/drm/drm_privacy_screen.c
+++ b/drivers/gpu/drm/drm_privacy_screen.c
@@ -269,7 +269,7 @@ EXPORT_SYMBOL(drm_privacy_screen_get_state);
  *
  * The notifier is called with no locks held. The new hw_state and sw_state
  * can be retrieved using the drm_privacy_screen_get_state() function.
- * A pointer to the drm_privacy_screen's struct is passed as the void *data
+ * A pointer to the drm_privacy_screen's struct is passed as the ``void *data``
  * argument of the notifier_block's notifier_call.
  *
  * The notifier will NOT be called when changes are made through
-- 
2.33.1



Re: [PATCH 6/8] drm/ast: Initialize encoder and connector for VGA in helper function

2022-02-07 Thread Thomas Zimmermann

Hi

Am 03.02.22 um 18:43 schrieb Javier Martinez Canillas:

On 1/11/22 13:00, Thomas Zimmermann wrote:

Move encoder and connector initialization into a single helper and
put all related mode-setting structures into a single place. Done in
preparation of moving transmitter code into separate helpers. No
functional changes.

Signed-off-by: Thomas Zimmermann 


Reviewed-by: Javier Martinez Canillas 

[snip]


-   encoder->possible_crtcs = 1;


[snip]


+   encoder->possible_crtcs = drm_crtc_mask(crtc);


This is a somewhat unrelated change. It's OK because it is fairly simple
but I would probably still do the cleanups as separate patches.


I'll put this change into a separate patch.

Best regards
Thomas



Best regards,


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH] drm/panel: panel-simple: Fix proper bpc for AM-1280800N3TZQW-T00H

2022-02-07 Thread Jagan Teki
Hi Sam,

On Mon, Dec 20, 2021 at 1:45 PM Sam Ravnborg  wrote:
>
> Hi Jagan,
>
> On Sun, Dec 19, 2021 at 10:10:10PM +0530, Jagan Teki wrote:
> > Hi Sam,
> >
> > On Thu, Nov 11, 2021 at 3:11 PM Jagan Teki  
> > wrote:
> > >
> > > The AM-1280800N3TZQW-T00H panel supports 8 bpc, not 6 bpc, as per
> > > recent testing on the i.MX8MM platform.
> > >
> > > Fix it.
> > >
> > > Fixes: bca684e69c4c ("drm/panel: simple: Add AM-1280800N3TZQW-T00H")
> > > Signed-off-by: Jagan Teki 
> > > ---
> > >  drivers/gpu/drm/panel/panel-simple.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/panel/panel-simple.c 
> > > b/drivers/gpu/drm/panel/panel-simple.c
> > > index eb475a3a774b..cf3f21f649cb 100644
> > > --- a/drivers/gpu/drm/panel/panel-simple.c
> > > +++ b/drivers/gpu/drm/panel/panel-simple.c
> > > @@ -719,7 +719,7 @@ static const struct drm_display_mode 
> > > ampire_am_1280800n3tzqw_t00h_mode = {
> > >  static const struct panel_desc ampire_am_1280800n3tzqw_t00h = {
> > > .modes = &ampire_am_1280800n3tzqw_t00h_mode,
> > > .num_modes = 1,
> > > -   .bpc = 6,
> > > +   .bpc = 8,
> >
> > Any response on this?
>
> I am too busy with other stuff at the moment to spend time on Linux
> stuff, but expect to re-surface sometime after xmas.

Any further comments?

Thanks,
Jagan.


Re: [PATCH v2] drm/privacy-screen: Fix sphinx warning

2022-02-07 Thread Simon Ser
Reviewed-by: Simon Ser 


Re: [PATCH 8/8] drm/ast: Move SIL164-based connector code into separate helpers

2022-02-07 Thread Thomas Zimmermann

Hi

Am 03.02.22 um 18:57 schrieb Javier Martinez Canillas:

On 1/11/22 13:00, Thomas Zimmermann wrote:

Add helpers for initializing SIL164-based connectors. These used to be
handled by the VGA connector code. But SIL164 provides output via DVI-I,
so set the encoder and connector types accordingly.

If a SIL164 chip has been detected, ast will now create a DVI-I
connector instead of a VGA connector.

Signed-off-by: Thomas Zimmermann 
---
  drivers/gpu/drm/ast/ast_drv.h  | 15 ++
  drivers/gpu/drm/ast/ast_mode.c | 99 +-
  2 files changed, 112 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index 420d19d8459e..c3a582372649 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -140,6 +140,17 @@ to_ast_vga_connector(struct drm_connector *connector)
return container_of(connector, struct ast_vga_connector, base);
  }
  


[snip]


+static int ast_sil164_connector_init(struct drm_device *dev,
+struct ast_sil164_connector 
*ast_sil164_connector)
+{
+   struct drm_connector *connector = &ast_sil164_connector->base;
+   int ret;
+
+   ast_sil164_connector->i2c = ast_i2c_create(dev);
+   if (!ast_sil164_connector->i2c)
+   drm_err(dev, "failed to add ddc bus for connector\n");
+
+   if (ast_sil164_connector->i2c)
+   ret = drm_connector_init_with_ddc(dev, connector, 
&ast_sil164_connector_funcs,
+ DRM_MODE_CONNECTOR_DVII,
+ 
&ast_sil164_connector->i2c->adapter);
+   else
+   ret = drm_connector_init(dev, connector, 
&ast_sil164_connector_funcs,
+DRM_MODE_CONNECTOR_DVII);
+   if (ret)


I think you want a kfree(i2c) here before returning.

And where is the struct ast_i2c_chan freed if the function succeeds ?


The memory and data structure are managed with
drmm_add_action_or_reset(). They will be released together with the DRM
driver (either on success or failure). See ast_i2c_create() for the details.
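
For reference, the managed-release pattern being described looks roughly
like this (a simplified sketch; the callback name below is made up and the
real code in ast_i2c_create() may differ in detail):

	static void ast_i2c_release(struct drm_device *dev, void *res)
	{
		struct ast_i2c_chan *i2c = res;

		i2c_del_adapter(&i2c->adapter);
		kfree(i2c);
	}

	/* in ast_i2c_create(): tie the allocation to the drm_device lifetime */
	if (drmm_add_action_or_reset(dev, ast_i2c_release, i2c))
		return NULL;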


Best regards
Thomas



With that,

Reviewed-by: Javier Martinez Canillas 

Best regards,


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH v2] drm/privacy-screen: Fix sphinx warning

2022-02-07 Thread Hans de Goede
Hi,

On 2/7/22 14:05, Simon Ser wrote:
> Reviewed-by: Simon Ser 

Thanks, I've pushed this one to drm-misc-fixes now; and the other one you reviewed
to drm-misc-next.

Regards,

Hans



Re: [RFC 0/2] drm/i915/ttm: Evict and store of compressed object

2022-02-07 Thread Hellstrom, Thomas
Hi, Christian,

On Mon, 2022-02-07 at 12:41 +0100, Christian König wrote:
> Am 07.02.22 um 10:37 schrieb Ramalingam C:
> > On flat-ccs capable platforms we need to evict and restore the ccs data
> > along with the corresponding main memory.
> > 
> > This ccs data can only be accessed through the BLT engine through a special
> > cmd ( )
> > 
> > To support above requirement of flat-ccs enabled i915 platforms this
> > series adds new param called ccs_pages_needed to the ttm_tt_init(),
> > to increase the ttm_tt->num_pages of system memory when the obj has
> > the
> > lmem placement possibility.
> 
> Well question is why isn't the buffer object allocated with the extra
> space in the first place?

That wastes precious VRAM. The extra space is needed only when the bo
is evicted.

We've had a previous short disussion on this here:
https://lists.freedesktop.org/archives/dri-devel/2021-August/321161.html

Thanks,
Thomas


> 
> Regards,
> Christian.
> 
> > 
> > This will be on top of the flat-ccs enabling series
> > https://patchwork.freedesktop.org/series/95686/
> > 
> > For more about flat-ccs feature please have a look at
> > https://patchwork.freedesktop.org/patch/471777/?series=95686&rev=5
> > 
> > Testing of the series is WIP and looking forward for the early review
> > on
> > the amendment to ttm_tt_init and the approach.
> > 
> > Ramalingam C (2):
> >    drm/i915/ttm: Add extra pages for handling ccs data
> >    drm/i915/migrate: Evict and restore the ccs data
> > 
> >   drivers/gpu/drm/drm_gem_vram_helper.c  |   2 +-
> >   drivers/gpu/drm/i915/gem/i915_gem_ttm.c    |  23 +-
> >   drivers/gpu/drm/i915/gt/intel_migrate.c    | 283 +++---
> > ---
> >   drivers/gpu/drm/qxl/qxl_ttm.c  |   2 +-
> >   drivers/gpu/drm/ttm/ttm_agp_backend.c  |   2 +-
> >   drivers/gpu/drm/ttm/ttm_tt.c   |  12 +-
> >   drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c |   2 +-
> >   include/drm/ttm/ttm_tt.h   |   4 +-
> >   8 files changed, 191 insertions(+), 139 deletions(-)
> > 
> 

--
Intel Sweden AB
Registered Office: Isafjordsgatan 30B, 164 40 Kista, Stockholm, Sweden
Registration Number: 556189-6027



Re: [RFC 0/2] drm/i915/ttm: Evict and store of compressed object

2022-02-07 Thread Ramalingam C
On 2022-02-07 at 12:41:59 +0100, Christian König wrote:
> Am 07.02.22 um 10:37 schrieb Ramalingam C:
> > On flat-ccs capable platforms we need to evict and restore the ccs data
> > along with the corresponding main memory.
> > 
> > This ccs data can only be accessed through the BLT engine through a special
> > cmd ( )
> > 
> > To support above requirement of flat-ccs enabled i915 platforms this
> > series adds new param called ccs_pages_needed to the ttm_tt_init(),
> > to increase the ttm_tt->num_pages of system memory when the obj has the
> > lmem placement possibility.
> 
> Well question is why isn't the buffer object allocated with the extra space
> in the first place?
Hi Christian,

On Xe-HP and later devices, we use dedicated compression control state (CCS)
stored in local memory for each surface, to support the 3D and media
compression formats.

The memory required for the CCS of the entire local memory is 1/256 of the
local memory size. So before the kernel boots, the required memory is reserved
for the CCS data and a secure register is programmed with the CCS base
address.

So when we allocate an object in local memory we don't need to explicitly
allocate the space for the ccs data. But when we evict the obj into smem,
to hold the compression related data along with the obj we need smem
space of obj_size + (obj_size/256).

Hence when we create smem for an obj with an lmem placement possibility,
we create it with the extra space.
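
A back-of-the-envelope sketch of the extra system-memory pages implied by
that 1/256 ratio (illustrative only; the helper name and macro are made up
here):

	/* 1 byte of CCS data covers 256 bytes of the object. */
	#define NUM_BYTES_PER_CCS_BYTE	256

	static unsigned long ccs_pages_needed(size_t obj_size)
	{
		return DIV_ROUND_UP(DIV_ROUND_UP(obj_size, NUM_BYTES_PER_CCS_BYTE),
				    PAGE_SIZE);
	}

For example, a 64 MiB lmem object needs an extra 256 KiB (64 pages with 4 KiB
pages) of smem on top of its own backing store when it is evicted.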

Ram.
> 
> Regards,
> Christian.
> 
> > 
> > This will be on top of the flat-ccs enabling series
> > https://patchwork.freedesktop.org/series/95686/
> > 
> > For more about flat-ccs feature please have a look at
> > https://patchwork.freedesktop.org/patch/471777/?series=95686&rev=5
> > 
> > Testing of the series is WIP and looking forward for the early review on
> > the amendment to ttm_tt_init and the approach.
> > 
> > Ramalingam C (2):
> >drm/i915/ttm: Add extra pages for handling ccs data
> >drm/i915/migrate: Evict and restore the ccs data
> > 
> >   drivers/gpu/drm/drm_gem_vram_helper.c  |   2 +-
> >   drivers/gpu/drm/i915/gem/i915_gem_ttm.c|  23 +-
> >   drivers/gpu/drm/i915/gt/intel_migrate.c| 283 +++--
> >   drivers/gpu/drm/qxl/qxl_ttm.c  |   2 +-
> >   drivers/gpu/drm/ttm/ttm_agp_backend.c  |   2 +-
> >   drivers/gpu/drm/ttm/ttm_tt.c   |  12 +-
> >   drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c |   2 +-
> >   include/drm/ttm/ttm_tt.h   |   4 +-
> >   8 files changed, 191 insertions(+), 139 deletions(-)
> > 
> 


Re: [PATCH 3/8] mm: remove pointless includes from

2022-02-07 Thread Jason Gunthorpe
On Mon, Feb 07, 2022 at 07:32:44AM +0100, Christoph Hellwig wrote:
> hmm.h pulls in the world for no good reason at all.  Remove the
> includes and push a few of them into the users instead.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 1 +
>  drivers/gpu/drm/nouveau/nouveau_dmem.c   | 1 +
>  include/linux/hmm.h  | 9 ++---
>  lib/test_hmm.c   | 2 ++
>  4 files changed, 6 insertions(+), 7 deletions(-)

Reviewed-by: Jason Gunthorpe 

Jason


[PATCH v2 1/9] drm/ast: Fail if connector initialization fails

2022-02-07 Thread Thomas Zimmermann
Update the connector code to fail if the connector could not be
initialized. The current code just ignored the error and failed
later when the connector was supposed to be used.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Javier Martinez Canillas 
---
 drivers/gpu/drm/ast/ast_mode.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index ab52efb15670..51cc6fef1b92 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1322,18 +1322,21 @@ static int ast_connector_init(struct drm_device *dev)
struct ast_connector *ast_connector = &ast->connector;
struct drm_connector *connector = &ast_connector->base;
struct drm_encoder *encoder = &ast->encoder;
+   int ret;
 
ast_connector->i2c = ast_i2c_create(dev);
if (!ast_connector->i2c)
drm_err(dev, "failed to add ddc bus for connector\n");
 
if (ast_connector->i2c)
-   drm_connector_init_with_ddc(dev, connector, 
&ast_connector_funcs,
-   DRM_MODE_CONNECTOR_VGA,
-   &ast_connector->i2c->adapter);
+   ret = drm_connector_init_with_ddc(dev, connector, 
&ast_connector_funcs,
+ DRM_MODE_CONNECTOR_VGA,
+ &ast_connector->i2c->adapter);
else
-   drm_connector_init(dev, connector, &ast_connector_funcs,
-  DRM_MODE_CONNECTOR_VGA);
+   ret = drm_connector_init(dev, connector, &ast_connector_funcs,
+DRM_MODE_CONNECTOR_VGA);
+   if (ret)
+   return ret;
 
drm_connector_helper_add(connector, &ast_connector_helper_funcs);
 
-- 
2.34.1



[PATCH v2 3/9] drm/ast: Remove AST_TX_ITE66121 constant

2022-02-07 Thread Thomas Zimmermann
The ITE66121 is an HDMI transmitter chip. There's no code for
detecting or programming the chip within ast. Remove the enum
constant.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Javier Martinez Canillas 
---
 drivers/gpu/drm/ast/ast_drv.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index 00bfa41ff7cb..6e77be1d06d3 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -69,7 +69,6 @@ enum ast_chip {
 enum ast_tx_chip {
AST_TX_NONE,
AST_TX_SIL164,
-   AST_TX_ITE66121,
AST_TX_DP501,
 };
 
-- 
2.34.1



[PATCH v2 0/9] drm/ast: Untangle connector helpers

2022-02-07 Thread Thomas Zimmermann
The ast driver supports different transmitter chips to support DVI
and HDMI connectors. It has all been packed into the same helper
functions and exported as a VGA connector.

Introduce a separate set of connector helpers for each transmitter
chip, and thus connector type. Also provide the correct encoder for
each connector.

This change mostly affects the connector's get_modes helper, where
VGA-, DVI- and HDMI-related code was lumped into the same function.
It's now all separate. While at it, also rework and cleanup the
initialization of the related data structures.

Tested on AST 2100 and AST 2300 hardware. I don't have hardware with
special transmitter chips (SIL164, DP501), so I could only test the VGA
code.

v2:
* move encoder's preferred-CRTC bitmask setup into separate patch
  (Javier)

Thomas Zimmermann (9):
  drm/ast: Fail if connector initialization fails
  drm/ast: Move connector mode_valid function to CRTC
  drm/ast: Remove AST_TX_ITE66121 constant
  drm/ast: Remove unused value dp501_maxclk
  drm/ast: Rename struct ast_connector to struct ast_vga_connector
  drm/ast: Initialize encoder and connector for VGA in helper function
  drm/ast: Read encoder possible-CRTC mask from drm_crtc_mask()
  drm/ast: Move DP501-based connector code into separate helpers
  drm/ast: Move SIL164-based connector code into separate helpers

 drivers/gpu/drm/ast/ast_dp501.c |  58 -
 drivers/gpu/drm/ast/ast_drv.h   |  37 ++-
 drivers/gpu/drm/ast/ast_mode.c  | 413 +++-
 3 files changed, 331 insertions(+), 177 deletions(-)


base-commit: 0bb81b5d6db5f689b67f9d8b35323235c45e890f
-- 
2.34.1



[PATCH v2 2/9] drm/ast: Move connector mode_valid function to CRTC

2022-02-07 Thread Thomas Zimmermann
The tests in ast_mode_valid() verify the correct resolution for the
supplied mode. This is a limitation of the CRTC, so move the function
to the CRTC helpers. No functional changes.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Javier Martinez Canillas 
---
 drivers/gpu/drm/ast/ast_mode.c | 129 +
 1 file changed, 66 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index 51cc6fef1b92..ab0a86cecbba 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1005,6 +1005,71 @@ static void ast_crtc_dpms(struct drm_crtc *crtc, int 
mode)
}
 }
 
+static enum drm_mode_status
+ast_crtc_helper_mode_valid(struct drm_crtc *crtc, const struct 
drm_display_mode *mode)
+{
+   struct ast_private *ast = to_ast_private(crtc->dev);
+   enum drm_mode_status status;
+   uint32_t jtemp;
+
+   if (ast->support_wide_screen) {
+   if ((mode->hdisplay == 1680) && (mode->vdisplay == 1050))
+   return MODE_OK;
+   if ((mode->hdisplay == 1280) && (mode->vdisplay == 800))
+   return MODE_OK;
+   if ((mode->hdisplay == 1440) && (mode->vdisplay == 900))
+   return MODE_OK;
+   if ((mode->hdisplay == 1360) && (mode->vdisplay == 768))
+   return MODE_OK;
+   if ((mode->hdisplay == 1600) && (mode->vdisplay == 900))
+   return MODE_OK;
+
+   if ((ast->chip == AST2100) || (ast->chip == AST2200) ||
+   (ast->chip == AST2300) || (ast->chip == AST2400) ||
+   (ast->chip == AST2500)) {
+   if ((mode->hdisplay == 1920) && (mode->vdisplay == 
1080))
+   return MODE_OK;
+
+   if ((mode->hdisplay == 1920) && (mode->vdisplay == 
1200)) {
+   jtemp = ast_get_index_reg_mask(ast, 
AST_IO_CRTC_PORT, 0xd1, 0xff);
+   if (jtemp & 0x01)
+   return MODE_NOMODE;
+   else
+   return MODE_OK;
+   }
+   }
+   }
+
+   status = MODE_NOMODE;
+
+   switch (mode->hdisplay) {
+   case 640:
+   if (mode->vdisplay == 480)
+   status = MODE_OK;
+   break;
+   case 800:
+   if (mode->vdisplay == 600)
+   status = MODE_OK;
+   break;
+   case 1024:
+   if (mode->vdisplay == 768)
+   status = MODE_OK;
+   break;
+   case 1280:
+   if (mode->vdisplay == 1024)
+   status = MODE_OK;
+   break;
+   case 1600:
+   if (mode->vdisplay == 1200)
+   status = MODE_OK;
+   break;
+   default:
+   break;
+   }
+
+   return status;
+}
+
 static int ast_crtc_helper_atomic_check(struct drm_crtc *crtc,
struct drm_atomic_state *state)
 {
@@ -1107,6 +1172,7 @@ ast_crtc_helper_atomic_disable(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs ast_crtc_helper_funcs = {
+   .mode_valid = ast_crtc_helper_mode_valid,
.atomic_check = ast_crtc_helper_atomic_check,
.atomic_flush = ast_crtc_helper_atomic_flush,
.atomic_enable = ast_crtc_helper_atomic_enable,
@@ -1241,71 +1307,8 @@ static int ast_get_modes(struct drm_connector *connector)
return 0;
 }
 
-static enum drm_mode_status ast_mode_valid(struct drm_connector *connector,
- struct drm_display_mode *mode)
-{
-   struct ast_private *ast = to_ast_private(connector->dev);
-   int flags = MODE_NOMODE;
-   uint32_t jtemp;
-
-   if (ast->support_wide_screen) {
-   if ((mode->hdisplay == 1680) && (mode->vdisplay == 1050))
-   return MODE_OK;
-   if ((mode->hdisplay == 1280) && (mode->vdisplay == 800))
-   return MODE_OK;
-   if ((mode->hdisplay == 1440) && (mode->vdisplay == 900))
-   return MODE_OK;
-   if ((mode->hdisplay == 1360) && (mode->vdisplay == 768))
-   return MODE_OK;
-   if ((mode->hdisplay == 1600) && (mode->vdisplay == 900))
-   return MODE_OK;
-
-   if ((ast->chip == AST2100) || (ast->chip == AST2200) ||
-   (ast->chip == AST2300) || (ast->chip == AST2400) ||
-   (ast->chip == AST2500)) {
-   if ((mode->hdisplay == 1920) && (mode->vdisplay == 
1080))
-   return MODE_OK;
-
-   if ((mode->hdisplay == 1920) && (mode->vdisplay == 
1200)) {
-   jtemp = 

[PATCH v2 4/9] drm/ast: Remove unused value dp501_maxclk

2022-02-07 Thread Thomas Zimmermann
Remove reading the link-rate. The value is maintained by the connector
code but never used.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Javier Martinez Canillas 
---
 drivers/gpu/drm/ast/ast_dp501.c | 58 -
 drivers/gpu/drm/ast/ast_drv.h   |  1 -
 drivers/gpu/drm/ast/ast_mode.c  |  7 ++--
 3 files changed, 3 insertions(+), 63 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_dp501.c b/drivers/gpu/drm/ast/ast_dp501.c
index cd93c44f2662..204c926a18ea 100644
--- a/drivers/gpu/drm/ast/ast_dp501.c
+++ b/drivers/gpu/drm/ast/ast_dp501.c
@@ -272,64 +272,6 @@ static bool ast_launch_m68k(struct drm_device *dev)
return true;
 }
 
-u8 ast_get_dp501_max_clk(struct drm_device *dev)
-{
-   struct ast_private *ast = to_ast_private(dev);
-   u32 boot_address, offset, data;
-   u8 linkcap[4], linkrate, linklanes, maxclk = 0xff;
-   u32 *plinkcap;
-
-   if (ast->config_mode == ast_use_p2a) {
-   boot_address = get_fw_base(ast);
-
-   /* validate FW version */
-   offset = AST_DP501_GBL_VERSION;
-   data = ast_mindwm(ast, boot_address + offset);
-   if ((data & AST_DP501_FW_VERSION_MASK) != 
AST_DP501_FW_VERSION_1) /* version: 1x */
-   return maxclk;
-
-   /* Read Link Capability */
-   offset  = AST_DP501_LINKRATE;
-   plinkcap = (u32 *)linkcap;
-   *plinkcap  = ast_mindwm(ast, boot_address + offset);
-   if (linkcap[2] == 0) {
-   linkrate = linkcap[0];
-   linklanes = linkcap[1];
-   data = (linkrate == 0x0a) ? (90 * linklanes) : (54 * 
linklanes);
-   if (data > 0xff)
-   data = 0xff;
-   maxclk = (u8)data;
-   }
-   } else {
-   if (!ast->dp501_fw_buf)
-   return AST_DP501_DEFAULT_DCLK;  /* 1024x768 as default 
*/
-
-   /* dummy read */
-   offset = 0x;
-   data = readl(ast->dp501_fw_buf + offset);
-
-   /* validate FW version */
-   offset = AST_DP501_GBL_VERSION;
-   data = readl(ast->dp501_fw_buf + offset);
-   if ((data & AST_DP501_FW_VERSION_MASK) != 
AST_DP501_FW_VERSION_1) /* version: 1x */
-   return maxclk;
-
-   /* Read Link Capability */
-   offset = AST_DP501_LINKRATE;
-   plinkcap = (u32 *)linkcap;
-   *plinkcap = readl(ast->dp501_fw_buf + offset);
-   if (linkcap[2] == 0) {
-   linkrate = linkcap[0];
-   linklanes = linkcap[1];
-   data = (linkrate == 0x0a) ? (90 * linklanes) : (54 * 
linklanes);
-   if (data > 0xff)
-   data = 0xff;
-   maxclk = (u8)data;
-   }
-   }
-   return maxclk;
-}
-
 bool ast_dp501_read_edid(struct drm_device *dev, u8 *ediddata)
 {
struct ast_private *ast = to_ast_private(dev);
diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index 6e77be1d06d3..479bb120dd05 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -171,7 +171,6 @@ struct ast_private {
} config_mode;
 
enum ast_tx_chip tx_chip_type;
-   u8 dp501_maxclk;
u8 *dp501_fw_addr;
const struct firmware *dp501_fw;/* dp501 fw */
 };
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index ab0a86cecbba..a70158b2e29f 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1284,16 +1284,15 @@ static int ast_get_modes(struct drm_connector 
*connector)
int ret;
 
if (ast->tx_chip_type == AST_TX_DP501) {
-   ast->dp501_maxclk = 0xff;
edid = kmalloc(128, GFP_KERNEL);
if (!edid)
return -ENOMEM;
 
flags = ast_dp501_read_edid(connector->dev, (u8 *)edid);
-   if (flags)
-   ast->dp501_maxclk = 
ast_get_dp501_max_clk(connector->dev);
-   else
+   if (!flags) {
kfree(edid);
+   edid = NULL;
+   }
}
if (!flags && ast_connector->i2c)
edid = drm_get_edid(connector, &ast_connector->i2c->adapter);
-- 
2.34.1



[PATCH v2 5/9] drm/ast: Rename struct ast_connector to struct ast_vga_connector

2022-02-07 Thread Thomas Zimmermann
Prepare for introducing other connectors besides VGA. No functional
changes.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Javier Martinez Canillas 
---
 drivers/gpu/drm/ast/ast_drv.h  | 10 
 drivers/gpu/drm/ast/ast_mode.c | 45 +-
 2 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index 479bb120dd05..e1cb31acdaac 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -129,15 +129,15 @@ struct ast_i2c_chan {
struct i2c_algo_bit_data bit;
 };
 
-struct ast_connector {
+struct ast_vga_connector {
struct drm_connector base;
struct ast_i2c_chan *i2c;
 };
 
-static inline struct ast_connector *
-to_ast_connector(struct drm_connector *connector)
+static inline struct ast_vga_connector *
+to_ast_vga_connector(struct drm_connector *connector)
 {
-   return container_of(connector, struct ast_connector, base);
+   return container_of(connector, struct ast_vga_connector, base);
 }
 
 /*
@@ -161,7 +161,7 @@ struct ast_private {
struct ast_cursor_plane cursor_plane;
struct drm_crtc crtc;
struct drm_encoder encoder;
-   struct ast_connector connector;
+   struct ast_vga_connector connector;
 
bool support_wide_screen;
enum {
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index a70158b2e29f..384879b27ccc 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1272,12 +1272,12 @@ static int ast_encoder_init(struct drm_device *dev)
 }
 
 /*
- * Connector
+ * VGA Connector
  */
 
-static int ast_get_modes(struct drm_connector *connector)
+static int ast_vga_connector_helper_get_modes(struct drm_connector *connector)
 {
-   struct ast_connector *ast_connector = to_ast_connector(connector);
+   struct ast_vga_connector *ast_vga_connector = 
to_ast_vga_connector(connector);
struct ast_private *ast = to_ast_private(connector->dev);
struct edid *edid = NULL;
bool flags = false;
@@ -1294,23 +1294,23 @@ static int ast_get_modes(struct drm_connector 
*connector)
edid = NULL;
}
}
-   if (!flags && ast_connector->i2c)
-   edid = drm_get_edid(connector, &ast_connector->i2c->adapter);
+   if (!flags && ast_vga_connector->i2c)
+   edid = drm_get_edid(connector, 
&ast_vga_connector->i2c->adapter);
if (edid) {
-   drm_connector_update_edid_property(&ast_connector->base, edid);
+   drm_connector_update_edid_property(connector, edid);
ret = drm_add_edid_modes(connector, edid);
kfree(edid);
return ret;
}
-   drm_connector_update_edid_property(&ast_connector->base, NULL);
+   drm_connector_update_edid_property(connector, NULL);
return 0;
 }
 
-static const struct drm_connector_helper_funcs ast_connector_helper_funcs = {
-   .get_modes = ast_get_modes,
+static const struct drm_connector_helper_funcs ast_vga_connector_helper_funcs 
= {
+   .get_modes = ast_vga_connector_helper_get_modes,
 };
 
-static const struct drm_connector_funcs ast_connector_funcs = {
+static const struct drm_connector_funcs ast_vga_connector_funcs = {
.reset = drm_atomic_helper_connector_reset,
.fill_modes = drm_helper_probe_single_connector_modes,
.destroy = drm_connector_cleanup,
@@ -1318,29 +1318,29 @@ static const struct drm_connector_funcs 
ast_connector_funcs = {
.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
 };
 
-static int ast_connector_init(struct drm_device *dev)
+static int ast_vga_connector_init(struct drm_device *dev)
 {
struct ast_private *ast = to_ast_private(dev);
-   struct ast_connector *ast_connector = &ast->connector;
-   struct drm_connector *connector = &ast_connector->base;
+   struct ast_vga_connector *ast_vga_connector = &ast->connector;
+   struct drm_connector *connector = &ast_vga_connector->base;
struct drm_encoder *encoder = &ast->encoder;
int ret;
 
-   ast_connector->i2c = ast_i2c_create(dev);
-   if (!ast_connector->i2c)
+   ast_vga_connector->i2c = ast_i2c_create(dev);
+   if (!ast_vga_connector->i2c)
drm_err(dev, "failed to add ddc bus for connector\n");
 
-   if (ast_connector->i2c)
-   ret = drm_connector_init_with_ddc(dev, connector, 
&ast_connector_funcs,
+   if (ast_vga_connector->i2c)
+   ret = drm_connector_init_with_ddc(dev, connector, 
&ast_vga_connector_funcs,
  DRM_MODE_CONNECTOR_VGA,
- &ast_connector->i2c->adapter);
+ 
&ast_vga_connector->i2c->adapter);
else
-   ret = drm_connector_init(dev, connector, &ast_con

[PATCH v2 8/9] drm/ast: Move DP501-based connector code into separate helpers

2022-02-07 Thread Thomas Zimmermann
Add helpers for DP501-based connectors. DP501 provides output via
DisplayPort. This used to be handled by the VGA connector code.

If a DP501 chip has been detected, ast will now create a DisplayPort
connector instead of a VGA connector.

Remove the DP501 code from ast_vga_connector_helper_get_modes(). Also
remove the call to drm_connector_update_edid_property(), which is
performed by drm_get_edid().

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Javier Martinez Canillas 
---
 drivers/gpu/drm/ast/ast_drv.h  |   4 ++
 drivers/gpu/drm/ast/ast_mode.c | 128 +++--
 2 files changed, 109 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index cda50fb887ed..420d19d8459e 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -165,6 +165,10 @@ struct ast_private {
struct drm_encoder encoder;
struct ast_vga_connector vga_connector;
} vga;
+   struct {
+   struct drm_encoder encoder;
+   struct drm_connector connector;
+   } dp501;
} output;
 
bool support_wide_screen;
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index 09995a3d8c43..12dbf5b229e6 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -40,6 +40,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1259,30 +1260,22 @@ static int ast_crtc_init(struct drm_device *dev)
 static int ast_vga_connector_helper_get_modes(struct drm_connector *connector)
 {
struct ast_vga_connector *ast_vga_connector = 
to_ast_vga_connector(connector);
-   struct ast_private *ast = to_ast_private(connector->dev);
-   struct edid *edid = NULL;
-   bool flags = false;
-   int ret;
+   struct edid *edid;
+   int count;
 
-   if (ast->tx_chip_type == AST_TX_DP501) {
-   edid = kmalloc(128, GFP_KERNEL);
-   if (!edid)
-   return -ENOMEM;
+   if (!ast_vga_connector->i2c)
+   goto err_drm_connector_update_edid_property;
 
-   flags = ast_dp501_read_edid(connector->dev, (u8 *)edid);
-   if (!flags) {
-   kfree(edid);
-   edid = NULL;
-   }
-   }
-   if (!flags && ast_vga_connector->i2c)
-   edid = drm_get_edid(connector, 
&ast_vga_connector->i2c->adapter);
-   if (edid) {
-   drm_connector_update_edid_property(connector, edid);
-   ret = drm_add_edid_modes(connector, edid);
-   kfree(edid);
-   return ret;
-   }
+   edid = drm_get_edid(connector, &ast_vga_connector->i2c->adapter);
+   if (!edid)
+   goto err_drm_connector_update_edid_property;
+
+   count = drm_add_edid_modes(connector, edid);
+   kfree(edid);
+
+   return count;
+
+err_drm_connector_update_edid_property:
drm_connector_update_edid_property(connector, NULL);
return 0;
 }
@@ -1354,6 +1347,92 @@ static int ast_vga_output_init(struct ast_private *ast)
return 0;
 }
 
+/*
+ * DP501 Connector
+ */
+
+static int ast_dp501_connector_helper_get_modes(struct drm_connector 
*connector)
+{
+   void *edid;
+   bool succ;
+   int count;
+
+   edid = kmalloc(EDID_LENGTH, GFP_KERNEL);
+   if (!edid)
+   goto err_drm_connector_update_edid_property;
+
+   succ = ast_dp501_read_edid(connector->dev, edid);
+   if (!succ)
+   goto err_kfree;
+
+   drm_connector_update_edid_property(connector, edid);
+   count = drm_add_edid_modes(connector, edid);
+   kfree(edid);
+
+   return count;
+
+err_kfree:
+   kfree(edid);
+err_drm_connector_update_edid_property:
+   drm_connector_update_edid_property(connector, NULL);
+   return 0;
+}
+
+static const struct drm_connector_helper_funcs 
ast_dp501_connector_helper_funcs = {
+   .get_modes = ast_dp501_connector_helper_get_modes,
+};
+
+static const struct drm_connector_funcs ast_dp501_connector_funcs = {
+   .reset = drm_atomic_helper_connector_reset,
+   .fill_modes = drm_helper_probe_single_connector_modes,
+   .destroy = drm_connector_cleanup,
+   .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
+   .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
+};
+
+static int ast_dp501_connector_init(struct drm_device *dev, struct 
drm_connector *connector)
+{
+   int ret;
+
+   ret = drm_connector_init(dev, connector, &ast_dp501_connector_funcs,
+DRM_MODE_CONNECTOR_DisplayPort);
+   if (ret)
+   return ret;
+
+   drm_connector_helper_add(connector, &ast_dp501_connector_helper_funcs);
+
+   connector->interlace_allowed = 0;
+   connector->doublescan_allowed = 0;
+

[PATCH v2 6/9] drm/ast: Initialize encoder and connector for VGA in helper function

2022-02-07 Thread Thomas Zimmermann
Move encoder and connector initialization into a single helper and
put all related mode-setting structures into a single place. Done in
preparation for moving transmitter code into separate helpers. No
functional changes.

v2:
* move encoder CRTC bitmask fix into separate patch (Javier)

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Javier Martinez Canillas 
---
 drivers/gpu/drm/ast/ast_drv.h  |  8 +++--
 drivers/gpu/drm/ast/ast_mode.c | 62 --
 2 files changed, 42 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index e1cb31acdaac..cda50fb887ed 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -160,8 +160,12 @@ struct ast_private {
struct drm_plane primary_plane;
struct ast_cursor_plane cursor_plane;
struct drm_crtc crtc;
-   struct drm_encoder encoder;
-   struct ast_vga_connector connector;
+   union {
+   struct {
+   struct drm_encoder encoder;
+   struct ast_vga_connector vga_connector;
+   } vga;
+   } output;
 
bool support_wide_screen;
enum {
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index 384879b27ccc..bd01aea90784 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1252,25 +1252,6 @@ static int ast_crtc_init(struct drm_device *dev)
return 0;
 }
 
-/*
- * Encoder
- */
-
-static int ast_encoder_init(struct drm_device *dev)
-{
-   struct ast_private *ast = to_ast_private(dev);
-   struct drm_encoder *encoder = &ast->encoder;
-   int ret;
-
-   ret = drm_simple_encoder_init(dev, encoder, DRM_MODE_ENCODER_DAC);
-   if (ret)
-   return ret;
-
-   encoder->possible_crtcs = 1;
-
-   return 0;
-}
-
 /*
  * VGA Connector
  */
@@ -1318,12 +1299,10 @@ static const struct drm_connector_funcs 
ast_vga_connector_funcs = {
.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
 };
 
-static int ast_vga_connector_init(struct drm_device *dev)
+static int ast_vga_connector_init(struct drm_device *dev,
+ struct ast_vga_connector *ast_vga_connector)
 {
-   struct ast_private *ast = to_ast_private(dev);
-   struct ast_vga_connector *ast_vga_connector = &ast->connector;
struct drm_connector *connector = &ast_vga_connector->base;
-   struct drm_encoder *encoder = &ast->encoder;
int ret;
 
ast_vga_connector->i2c = ast_i2c_create(dev);
@@ -1347,7 +1326,30 @@ static int ast_vga_connector_init(struct drm_device *dev)
 
connector->polled = DRM_CONNECTOR_POLL_CONNECT;
 
-   drm_connector_attach_encoder(connector, encoder);
+   return 0;
+}
+
+static int ast_vga_output_init(struct ast_private *ast)
+{
+   struct drm_device *dev = &ast->base;
+   struct drm_crtc *crtc = &ast->crtc;
+   struct drm_encoder *encoder = &ast->output.vga.encoder;
+   struct ast_vga_connector *ast_vga_connector = 
&ast->output.vga.vga_connector;
+   struct drm_connector *connector = &ast_vga_connector->base;
+   int ret;
+
+   ret = drm_simple_encoder_init(dev, encoder, DRM_MODE_ENCODER_DAC);
+   if (ret)
+   return ret;
+   encoder->possible_crtcs = 1;
+
+   ret = ast_vga_connector_init(dev, ast_vga_connector);
+   if (ret)
+   return ret;
+
+   ret = drm_connector_attach_encoder(connector, encoder);
+   if (ret)
+   return ret;
 
return 0;
 }
@@ -1408,8 +1410,16 @@ int ast_mode_config_init(struct ast_private *ast)
return ret;
 
ast_crtc_init(dev);
-   ast_encoder_init(dev);
-   ast_vga_connector_init(dev);
+
+   switch (ast->tx_chip_type) {
+   case AST_TX_NONE:
+   case AST_TX_SIL164:
+   case AST_TX_DP501:
+   ret = ast_vga_output_init(ast);
+   break;
+   }
+   if (ret)
+   return ret;
 
drm_mode_config_reset(dev);
 
-- 
2.34.1



[PATCH v2 9/9] drm/ast: Move SIL164-based connector code into separate helpers

2022-02-07 Thread Thomas Zimmermann
Add helpers for initializing SIL164-based connectors. These used to be
handled by the VGA connector code. But SIL164 provides output via DVI-I,
so set the encoder and connector types accordingly.

If a SIL164 chip has been detected, ast will now create a DVI-I
connector instead of a VGA connector.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Javier Martinez Canillas 
---
 drivers/gpu/drm/ast/ast_drv.h  | 15 ++
 drivers/gpu/drm/ast/ast_mode.c | 99 +-
 2 files changed, 112 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index 420d19d8459e..c3a582372649 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -140,6 +140,17 @@ to_ast_vga_connector(struct drm_connector *connector)
return container_of(connector, struct ast_vga_connector, base);
 }
 
+struct ast_sil164_connector {
+   struct drm_connector base;
+   struct ast_i2c_chan *i2c;
+};
+
+static inline struct ast_sil164_connector *
+to_ast_sil164_connector(struct drm_connector *connector)
+{
+   return container_of(connector, struct ast_sil164_connector, base);
+}
+
 /*
  * Device
  */
@@ -165,6 +176,10 @@ struct ast_private {
struct drm_encoder encoder;
struct ast_vga_connector vga_connector;
} vga;
+   struct {
+   struct drm_encoder encoder;
+   struct ast_sil164_connector sil164_connector;
+   } sil164;
struct {
struct drm_encoder encoder;
struct drm_connector connector;
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index 12dbf5b229e6..6f4aa8e8b0ab 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1347,6 +1347,100 @@ static int ast_vga_output_init(struct ast_private *ast)
return 0;
 }
 
+/*
+ * SIL164 Connector
+ */
+
+static int ast_sil164_connector_helper_get_modes(struct drm_connector 
*connector)
+{
+   struct ast_sil164_connector *ast_sil164_connector = 
to_ast_sil164_connector(connector);
+   struct edid *edid;
+   int count;
+
+   if (!ast_sil164_connector->i2c)
+   goto err_drm_connector_update_edid_property;
+
+   edid = drm_get_edid(connector, &ast_sil164_connector->i2c->adapter);
+   if (!edid)
+   goto err_drm_connector_update_edid_property;
+
+   count = drm_add_edid_modes(connector, edid);
+   kfree(edid);
+
+   return count;
+
+err_drm_connector_update_edid_property:
+   drm_connector_update_edid_property(connector, NULL);
+   return 0;
+}
+
+static const struct drm_connector_helper_funcs 
ast_sil164_connector_helper_funcs = {
+   .get_modes = ast_sil164_connector_helper_get_modes,
+};
+
+static const struct drm_connector_funcs ast_sil164_connector_funcs = {
+   .reset = drm_atomic_helper_connector_reset,
+   .fill_modes = drm_helper_probe_single_connector_modes,
+   .destroy = drm_connector_cleanup,
+   .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
+   .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
+};
+
+static int ast_sil164_connector_init(struct drm_device *dev,
+struct ast_sil164_connector 
*ast_sil164_connector)
+{
+   struct drm_connector *connector = &ast_sil164_connector->base;
+   int ret;
+
+   ast_sil164_connector->i2c = ast_i2c_create(dev);
+   if (!ast_sil164_connector->i2c)
+   drm_err(dev, "failed to add ddc bus for connector\n");
+
+   if (ast_sil164_connector->i2c)
+   ret = drm_connector_init_with_ddc(dev, connector, 
&ast_sil164_connector_funcs,
+ DRM_MODE_CONNECTOR_DVII,
+ 
&ast_sil164_connector->i2c->adapter);
+   else
+   ret = drm_connector_init(dev, connector, 
&ast_sil164_connector_funcs,
+DRM_MODE_CONNECTOR_DVII);
+   if (ret)
+   return ret;
+
+   drm_connector_helper_add(connector, &ast_sil164_connector_helper_funcs);
+
+   connector->interlace_allowed = 0;
+   connector->doublescan_allowed = 0;
+
+   connector->polled = DRM_CONNECTOR_POLL_CONNECT;
+
+   return 0;
+}
+
+static int ast_sil164_output_init(struct ast_private *ast)
+{
+   struct drm_device *dev = &ast->base;
+   struct drm_crtc *crtc = &ast->crtc;
+   struct drm_encoder *encoder = &ast->output.sil164.encoder;
+   struct ast_sil164_connector *ast_sil164_connector = 
&ast->output.sil164.sil164_connector;
+   struct drm_connector *connector = &ast_sil164_connector->base;
+   int ret;
+
+   ret = drm_simple_encoder_init(dev, encoder, DRM_MODE_ENCODER_TMDS);
+   if (ret)
+   return ret;
+   encoder->poss

[PATCH v2 7/9] drm/ast: Read encoder possible-CRTC mask from drm_crtc_mask()

2022-02-07 Thread Thomas Zimmermann
Read the encoder's possible-CRTC mask from the involved CRTC instead
of hard-coding it.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/ast/ast_mode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index bd01aea90784..09995a3d8c43 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -1341,7 +1341,7 @@ static int ast_vga_output_init(struct ast_private *ast)
ret = drm_simple_encoder_init(dev, encoder, DRM_MODE_ENCODER_DAC);
if (ret)
return ret;
-   encoder->possible_crtcs = 1;
+   encoder->possible_crtcs = drm_crtc_mask(crtc);
 
ret = ast_vga_connector_init(dev, ast_vga_connector);
if (ret)
-- 
2.34.1



Re: [PATCH v2 7/9] drm/ast: Read encoder possible-CRTC mask from drm_crtc_mask()

2022-02-07 Thread Javier Martinez Canillas
On 2/7/22 15:15, Thomas Zimmermann wrote:
> Read the encoder's possible-CRTC mask from the involved CRTC instead
> of hard-coding it.
> 
> Signed-off-by: Thomas Zimmermann 
> ---

Reviewed-by: Javier Martinez Canillas 

Best regards,
-- 
Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [RFC 0/2] drm/i915/ttm: Evict and store of compressed object

2022-02-07 Thread Christian König

Am 07.02.22 um 14:53 schrieb Ramalingam C:

On 2022-02-07 at 12:41:59 +0100, Christian König wrote:

Am 07.02.22 um 10:37 schrieb Ramalingam C:

On flat-ccs capable platforms we need to evict and restore the ccs data
along with the corresponding main memory.

This ccs data can only be accessed through the BLT engine via a special
cmd ( )

To support the above requirement of flat-ccs enabled i915 platforms, this
series adds a new param called ccs_pages_needed to ttm_tt_init(),
to increase the ttm_tt->num_pages of system memory when the obj has the
lmem placement possibility.

Well question is why isn't the buffer object allocated with the extra space
in the first place?

Hi Christian,

On Xe-HP and later devices, we use dedicated compression control state (CCS)
stored in local memory for each surface, to support the 3D and media
compression formats.

The memory required for the CCS of the entire local memory is 1/256 of the
local memory size. So before the kernel boots, the required memory is reserved
for the CCS data and a secure register is programmed with the CCS base
address.

So when we allocate an object in local memory we don't need to explicitly
allocate space for the ccs data. But when we evict the obj into smem, we
need smem space of obj_size + (obj_size/256) to hold the compression
related data along with the obj.

Hence when we create the smem backing for an obj with an lmem placement
possibility, we create it with the extra space.
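
As a rough sketch of the arithmetic (purely illustrative, not code from this
series; it assumes PAGE_SIZE-granular backing and the 1/256 ratio described
above), the extra system-memory page count works out to something like:

    /*
     * Illustrative helper, not an actual symbol from this series:
     * number of extra system-memory pages needed to hold the CCS
     * data of an obj_size-byte object, given the 1:256
     * main-memory-to-CCS ratio.
     */
    static unsigned long ccs_extra_pages(unsigned long obj_size)
    {
            unsigned long ccs_bytes = DIV_ROUND_UP(obj_size, 256);

            return DIV_ROUND_UP(ccs_bytes, PAGE_SIZE);
    }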


Exactly that's what I've been missing in the cover letter and/or commit 
messages, comments etc..


Over all sounds like a valid explanation to me, just one comment on the 
code/naming:



  int ttm_tt_init(struct ttm_tt *ttm, struct ttm_buffer_object *bo,
-   uint32_t page_flags, enum ttm_caching caching)
+   uint32_t page_flags, enum ttm_caching caching,
+   unsigned long ccs_pages)


Please don't try to leak any i915 specific stuff into common TTM code.

For example use the wording extra_pages instead of ccs_pages here.

Apart from that looks good to me,
Christian.



  Ram.

Regards,
Christian.


This will be on top of the flat-ccs enabling series
https://patchwork.freedesktop.org/series/95686/

For more about flat-ccs feature please have a look at
https://patchwork.freedesktop.org/patch/471777/?series=95686&rev=5

Testing of the series is WIP and I am looking forward to early review of
the amendment to ttm_tt_init and the approach.

Ramalingam C (2):
drm/i915/ttm: Add extra pages for handling ccs data
drm/i915/migrate: Evict and restore the ccs data

   drivers/gpu/drm/drm_gem_vram_helper.c  |   2 +-
   drivers/gpu/drm/i915/gem/i915_gem_ttm.c|  23 +-
   drivers/gpu/drm/i915/gt/intel_migrate.c| 283 +++--
   drivers/gpu/drm/qxl/qxl_ttm.c  |   2 +-
   drivers/gpu/drm/ttm/ttm_agp_backend.c  |   2 +-
   drivers/gpu/drm/ttm/ttm_tt.c   |  12 +-
   drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c |   2 +-
   include/drm/ttm/ttm_tt.h   |   4 +-
   8 files changed, 191 insertions(+), 139 deletions(-)





Re: [Intel-gfx] [PATCH 1/3] i915/gvt: Introduce the mmio_table.c to support VFIO new mdev API

2022-02-07 Thread kernel test robot
Hi Zhi,

I love your patch! Perhaps something to improve:

[auto build test WARNING on drm-tip/drm-tip]
[also build test WARNING on next-20220207]
[cannot apply to drm-intel/for-linux-next hch-configfs/for-next v5.17-rc3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Zhi-Wang/i915-gvt-Introduce-the-mmio_table-c-to-support-VFIO-new-mdev-API/20220127-200727
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
config: x86_64-rhel-8.3-kselftests 
(https://download.01.org/0day-ci/archive/20220207/202202072226.kzm2qm5q-...@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.4-dirty
# 
https://github.com/0day-ci/linux/commit/533f92651a7a56481a053f1e04dc5a5ec024ffb9
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Zhi-Wang/i915-gvt-Introduce-the-mmio_table-c-to-support-VFIO-new-mdev-API/20220127-200727
git checkout 533f92651a7a56481a053f1e04dc5a5ec024ffb9
# save the config file to linux build tree
mkdir build_dir
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir 
ARCH=x86_64 SHELL=/bin/bash drivers/gpu/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


sparse warnings: (new ones prefixed by >>)
>> drivers/gpu/drm/i915/gvt/handlers.c:45:6: sparse: sparse: symbol 
>> 'intel_gvt_match_device' was not declared. Should it be static?

vim +/intel_gvt_match_device +45 drivers/gpu/drm/i915/gvt/handlers.c

12d14cc43b3470 Zhi Wang 2016-08-30  44  
12d14cc43b3470 Zhi Wang 2016-08-30 @45  bool intel_gvt_match_device(struct 
intel_gvt *gvt,
12d14cc43b3470 Zhi Wang 2016-08-30  46  unsigned long device)
12d14cc43b3470 Zhi Wang 2016-08-30  47  {
533f92651a7a56 Zhi Wang 2022-01-27  48  return 
intel_gvt_get_device_type(gvt->gt->i915) & device;
12d14cc43b3470 Zhi Wang 2016-08-30  49  }
12d14cc43b3470 Zhi Wang 2016-08-30  50  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


Re: [RFC 0/2] drm/i915/ttm: Evict and store of compressed object

2022-02-07 Thread C, Ramalingam
On 2022-02-07 at 15:37:09 +0100, Christian König wrote:
> Am 07.02.22 um 14:53 schrieb Ramalingam C:
> > On 2022-02-07 at 12:41:59 +0100, Christian König wrote:
> > > Am 07.02.22 um 10:37 schrieb Ramalingam C:
> > > > On flat-ccs capable platform we need to evict and resore the ccs data
> > > > along with the corresponding main memory.
> > > > 
> > > > This ccs data can only be access through BLT engine through a special
> > > > cmd ( )
> > > > 
> > > > To support above requirement of flat-ccs enabled i915 platforms this
> > > > series adds new param called ccs_pages_needed to the ttm_tt_init(),
> > > > to increase the ttm_tt->num_pages of system memory when the obj has the
> > > > lmem placement possibility.
> > > Well question is why isn't the buffer object allocated with the extra 
> > > space
> > > in the first place?
> > Hi Christian,
> > 
> > On Xe-HP and later devices, we use dedicated compression control state (CCS)
> > stored in local memory for each surface, to support the 3D and media
> > compression formats.
> > 
> > The memory required for the CCS of the entire local memory is 1/256 of the
> > local memory size. So before the kernel boot, the required memory is 
> > reserved
> > for the CCS data and a secure register will be programmed with the CCS base
> > address
> > 
> > So when we allocate a object in local memory we dont need to explicitly
> > allocate the space for ccs data. But when we evict the obj into the smem
> >   to hold the compression related data along with the obj we need smem
> >   space of obj_size + (obj_size/256).
> > 
> >   Hence when we create smem for an obj with lmem placement possibility we
> >   create with the extra space.
> 
> Exactly that's what I've been missing in the cover letter and/or commit
> messages, comments etc..
> 
> Over all sounds like a valid explanation to me, just one comment on the
> code/naming:
> 
> >   int ttm_tt_init(struct ttm_tt *ttm, struct ttm_buffer_object *bo,
> > -   uint32_t page_flags, enum ttm_caching caching)
> > +   uint32_t page_flags, enum ttm_caching caching,
> > +   unsigned long ccs_pages)
> 
> Please don't try to leak any i915 specific stuff into common TTM code.
> 
> For example use the wording extra_pages instead of ccs_pages here.
> 
> Apart from that looks good to me,

Thank you. I will address the comments on naming.

Ram
> Christian.
> 
> > 
> >   Ram.
> > > Regards,
> > > Christian.
> > > 
> > > > This will be on top of the flat-ccs enabling series
> > > > https://patchwork.freedesktop.org/series/95686/
> > > > 
> > > > For more about flat-ccs feature please have a look at
> > > > https://patchwork.freedesktop.org/patch/471777/?series=95686&rev=5
> > > > 
> > > > Testing of the series is WIP and looking forward for the early review on
> > > > the amendment to ttm_tt_init and the approach.
> > > > 
> > > > Ramalingam C (2):
> > > > drm/i915/ttm: Add extra pages for handling ccs data
> > > > drm/i915/migrate: Evict and restore the ccs data
> > > > 
> > > >drivers/gpu/drm/drm_gem_vram_helper.c  |   2 +-
> > > >drivers/gpu/drm/i915/gem/i915_gem_ttm.c|  23 +-
> > > >drivers/gpu/drm/i915/gt/intel_migrate.c| 283 
> > > > +++--
> > > >drivers/gpu/drm/qxl/qxl_ttm.c  |   2 +-
> > > >drivers/gpu/drm/ttm/ttm_agp_backend.c  |   2 +-
> > > >drivers/gpu/drm/ttm/ttm_tt.c   |  12 +-
> > > >drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c |   2 +-
> > > >include/drm/ttm/ttm_tt.h   |   4 +-
> > > >8 files changed, 191 insertions(+), 139 deletions(-)
> > > > 
> 


Re: [RFC 2/2] drm/i915/migrate: Evict and restore the ccs data

2022-02-07 Thread Hellstrom, Thomas
Hi, Ram,

A couple of quick questions before starting a more detailed review:

1) Does this also support migration of compressed data LMEM->LMEM?
What about inter-tile?

2) Do we need to block faulting of compressed data in the fault handler
as a follow-up patch?

/Thomas


On Mon, 2022-02-07 at 15:07 +0530, Ramalingam C wrote:
> When we are swapping out the local memory obj on flat-ccs capable
> platform,
> we need to capture the ccs data too along with main meory and we need
> to
> restore it when we are swapping in the content.
> 
> Extracting and restoring the CCS data is done through a special cmd
> called
> XY_CTRL_SURF_COPY_BLT
> 
> Signed-off-by: Ramalingam C 
> ---
>  drivers/gpu/drm/i915/gt/intel_migrate.c | 283 +-
> --
>  1 file changed, 155 insertions(+), 128 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c
> b/drivers/gpu/drm/i915/gt/intel_migrate.c
> index 5bdab0b3c735..e60ae6ff1847 100644
> --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> @@ -449,14 +449,146 @@ static bool wa_1209644611_applies(int ver, u32
> size)
> return height % 4 == 3 && height <= 8;
>  }
>  
> +/**
> + * DOC: Flat-CCS - Memory compression for Local memory
> + *
> + * On Xe-HP and later devices, we use dedicated compression control
> state (CCS)
> + * stored in local memory for each surface, to support the 3D and
> media
> + * compression formats.
> + *
> + * The memory required for the CCS of the entire local memory is
> 1/256 of the
> + * local memory size. So before the kernel boot, the required memory
> is reserved
> + * for the CCS data and a secure register will be programmed with
> the CCS base
> + * address.
> + *
> + * Flat CCS data needs to be cleared when a lmem object is
> allocated.
> + * And CCS data can be copied in and out of CCS region through
> + * XY_CTRL_SURF_COPY_BLT. CPU can't access the CCS data directly.
> + *
> + * When we exaust the lmem, if the object's placements support smem,
> then we can
> + * directly decompress the compressed lmem object into smem and
> start using it
> + * from smem itself.
> + *
> + * But when we need to swapout the compressed lmem object into a
> smem region
> + * though objects' placement doesn't support smem, then we copy the
> lmem content
> + * as it is into smem region along with ccs data (using
> XY_CTRL_SURF_COPY_BLT).
> + * When the object is referred, lmem content will be swaped in along
> with
> + * restoration of the CCS data (using XY_CTRL_SURF_COPY_BLT) at
> corresponding
> + * location.
> + *
> + *
> + * Flat-CCS Modifiers for different compression formats
> + * 
> + *
> + * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS - used to indicate the buffers
> of Flat CCS
> + * render compression formats. Though the general layout is same as
> + * I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS, new hashing/compression
> algorithm is
> + * used. Render compression uses 128 byte compression blocks
> + *
> + * I915_FORMAT_MOD_F_TILED_DG2_MC_CCS -used to indicate the buffers
> of Flat CCS
> + * media compression formats. Though the general layout is same as
> + * I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS, new hashing/compression
> algorithm is
> + * used. Media compression uses 256 byte compression blocks.
> + *
> + * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC - used to indicate the
> buffers of Flat
> + * CCS clear color render compression formats. Unified compression
> format for
> + * clear color render compression. The genral layout is a tiled
> layout using
> + * 4Kb tiles i.e Tile4 layout.
> + */
> +
> +static inline u32 *i915_flush_dw(u32 *cmd, u64 dst, u32 flags)
> +{
> +   /* Mask the 3 LSB to use the PPGTT address space */
> +   *cmd++ = MI_FLUSH_DW | flags;
> +   *cmd++ = lower_32_bits(dst);
> +   *cmd++ = upper_32_bits(dst);
> +
> +   return cmd;
> +}
> +
> +static u32 calc_ctrl_surf_instr_size(struct drm_i915_private *i915,
> int size)
> +{
> +   u32 num_cmds, num_blks, total_size;
> +
> +   if (!GET_CCS_SIZE(i915, size))
> +   return 0;
> +
> +   /*
> +    * XY_CTRL_SURF_COPY_BLT transfers CCS in 256 byte
> +    * blocks. one XY_CTRL_SURF_COPY_BLT command can
> +    * trnasfer upto 1024 blocks.
> +    */
> +   num_blks = GET_CCS_SIZE(i915, size);
> +   num_cmds = (num_blks + (NUM_CCS_BLKS_PER_XFER - 1)) >> 10;
> +   total_size = (XY_CTRL_SURF_INSTR_SIZE) * num_cmds;
> +
> +   /*
> +    * We need to add a flush before and after
> +    * XY_CTRL_SURF_COPY_BLT
> +    */
> +   total_size += 2 * MI_FLUSH_DW_SIZE;
> +   return total_size;
> +}
> +
> +static u32 *_i915_ctrl_surf_copy_blt(u32 *cmd, u64 src_addr, u64
> dst_addr,
> +    u8 src_mem_access, u8
> dst_mem_access,
> +    int src_mocs, int dst_mocs,
> +    u16 num_ccs_blocks)
> +{
> +   int i 

Re: [RFC 2/2] drm/i915/migrate: Evict and restore the ccs data

2022-02-07 Thread Ramalingam C
On 2022-02-07 at 20:25:42 +0530, Hellstrom, Thomas wrote:
> Hi, Ram,
> 
> A couple of quick questions before starting a more detailed review:
> 
> 1) Does this also support migrating of compressed data LMEM->LMEM?
> What-about inter-tile?
Honestly, this series mainly focused on eviction of lmem into smem and
restoration of the same.

To cover migration, we need to handle this differently from eviction.
Because when we migrate the compressed content we need to be able to use
it from the new placement; we can't keep the ccs data separately.

Migration of lmem->smem needs decompression incorporated.
Migration of lmem_m->lmem_n needs to maintain the
compressed/decompressed state as it is.

So we need to pass the information up to emit_copy to differentiate
eviction and migration.

If you don't have an objection, I would like to take up migration once we
have the eviction of lmem in place.
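
Just to make the distinction concrete, one purely illustrative way to plumb
that intent down to emit_copy (names are hypothetical, not part of this
series) could be:

    /* Hypothetical sketch only, not code from this series. */
    enum i915_copy_intent {
            I915_COPY_EVICT,     /* keep raw lmem bytes, stash CCS data alongside */
            I915_COPY_MIGRATE,   /* content must remain directly usable at the target */
    };

emit_copy() would then choose between a plain copy plus XY_CTRL_SURF_COPY_BLT
(eviction) and a decompressing or compression-preserving copy (migration)
based on that flag.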

> 
> 2) Do we need to block faulting of compressed data in the fault handler
> as a follow-up patch?

In the case of evicted compressed data we don't need to treat it differently
from evicted normal data. So I don't think this needs special
treatment. Sorry if I don't understand your question.

Ram
> 
> /Thomas
> 
> 
> On Mon, 2022-02-07 at 15:07 +0530, Ramalingam C wrote:
> > When we are swapping out the local memory obj on flat-ccs capable
> > platform,
> > we need to capture the ccs data too along with main meory and we need
> > to
> > restore it when we are swapping in the content.
> >
> > Extracting and restoring the CCS data is done through a special cmd
> > called
> > XY_CTRL_SURF_COPY_BLT
> >
> > Signed-off-by: Ramalingam C 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_migrate.c | 283 +-
> > --
> >  1 file changed, 155 insertions(+), 128 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c
> > b/drivers/gpu/drm/i915/gt/intel_migrate.c
> > index 5bdab0b3c735..e60ae6ff1847 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> > @@ -449,14 +449,146 @@ static bool wa_1209644611_applies(int ver, u32
> > size)
> > return height % 4 == 3 && height <= 8;
> >  }
> >
> > +/**
> > + * DOC: Flat-CCS - Memory compression for Local memory
> > + *
> > + * On Xe-HP and later devices, we use dedicated compression control
> > state (CCS)
> > + * stored in local memory for each surface, to support the 3D and
> > media
> > + * compression formats.
> > + *
> > + * The memory required for the CCS of the entire local memory is
> > 1/256 of the
> > + * local memory size. So before the kernel boot, the required memory
> > is reserved
> > + * for the CCS data and a secure register will be programmed with
> > the CCS base
> > + * address.
> > + *
> > + * Flat CCS data needs to be cleared when a lmem object is
> > allocated.
> > + * And CCS data can be copied in and out of CCS region through
> > + * XY_CTRL_SURF_COPY_BLT. CPU can't access the CCS data directly.
> > + *
> > + * When we exaust the lmem, if the object's placements support smem,
> > then we can
> > + * directly decompress the compressed lmem object into smem and
> > start using it
> > + * from smem itself.
> > + *
> > + * But when we need to swapout the compressed lmem object into a
> > smem region
> > + * though objects' placement doesn't support smem, then we copy the
> > lmem content
> > + * as it is into smem region along with ccs data (using
> > XY_CTRL_SURF_COPY_BLT).
> > + * When the object is referred, lmem content will be swaped in along
> > with
> > + * restoration of the CCS data (using XY_CTRL_SURF_COPY_BLT) at
> > corresponding
> > + * location.
> > + *
> > + *
> > + * Flat-CCS Modifiers for different compression formats
> > + * 
> > + *
> > + * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS - used to indicate the buffers
> > of Flat CCS
> > + * render compression formats. Though the general layout is same as
> > + * I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS, new hashing/compression
> > algorithm is
> > + * used. Render compression uses 128 byte compression blocks
> > + *
> > + * I915_FORMAT_MOD_F_TILED_DG2_MC_CCS -used to indicate the buffers
> > of Flat CCS
> > + * media compression formats. Though the general layout is same as
> > + * I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS, new hashing/compression
> > algorithm is
> > + * used. Media compression uses 256 byte compression blocks.
> > + *
> > + * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC - used to indicate the
> > buffers of Flat
> > + * CCS clear color render compression formats. Unified compression
> > format for
> > + * clear color render compression. The genral layout is a tiled
> > layout using
> > + * 4Kb tiles i.e Tile4 layout.
> > + */
> > +
> > +static inline u32 *i915_flush_dw(u32 *cmd, u64 dst, u32 flags)
> > +{
> > +   /* Mask the 3 LSB to use the PPGTT address space */
> > +   *cmd++ = MI_FLUSH_DW | flags;
> > +   *cmd++ = lower_32_bits(dst);
> > +   *cmd++ = upper_32_bit

[PATCH] gpu: ipu-v3: Fix dev_dbg frequency output

2022-02-07 Thread Mark Jonas
From: Leo Ruan 

This commit corrects the printing of the IPU clock error percentage when
it is between -0.1% and -0.9%. For example, if the requested pixel clock
is 27.2 MHz but only 27.0 MHz can be achieved, the deviation is -0.8%.
But the fixed-point math had a flaw and calculated an error of 0.2%.

Before:
  Clocks: IPU 27000Hz DI 24716667Hz Needed 2720Hz
  IPU clock can give 2700 with divider 10, error 0.2%
  Want 2720Hz IPU 27000Hz DI 24716667Hz using IPU, 2700Hz

After:
  Clocks: IPU 27000Hz DI 24716667Hz Needed 2720Hz
  IPU clock can give 2700 with divider 10, error -0.8%
  Want 2720Hz IPU 27000Hz DI 24716667Hz using IPU, 2700Hz
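
Worked through for the example above, as a sanity check of the arithmetic
(assuming the rate/pixelclock values in Hz that follow from the 27.0 MHz and
27.2 MHz figures, and the usual C integer semantics):
error = 27000000 / (27200000 / 1000) = 992. The old code printed
(992 - 1000) / 10 = 0, because integer division truncates toward zero, and
992 % 10 = 2, i.e. "0.2%". The new code prints '-' since 992 < 1000, then
abs(992 - 1000) / 10 = 0 and abs(992 - 1000) % 10 = 8, i.e. "-0.8%".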

Signed-off-by: Leo Ruan 
Signed-off-by: Mark Jonas 
---
 drivers/gpu/ipu-v3/ipu-di.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/ipu-v3/ipu-di.c b/drivers/gpu/ipu-v3/ipu-di.c
index b4a31d506fcc..74eca68891ad 100644
--- a/drivers/gpu/ipu-v3/ipu-di.c
+++ b/drivers/gpu/ipu-v3/ipu-di.c
@@ -451,8 +451,9 @@ static void ipu_di_config_clock(struct ipu_di *di,
 
error = rate / (sig->mode.pixelclock / 1000);
 
-   dev_dbg(di->ipu->dev, "  IPU clock can give %lu with divider 
%u, error %d.%u%%\n",
-   rate, div, (signed)(error - 1000) / 10, error % 10);
+   dev_dbg(di->ipu->dev, "  IPU clock can give %lu with divider 
%u, error %c%d.%d%%\n",
+   rate, div, error < 1000 ? '-' : '+',
+   abs(error - 1000) / 10, abs(error - 1000) % 10);
 
/* Allow a 1% error */
if (error < 1010 && error >= 990) {
-- 
2.17.1



Re: [RFC 2/2] drm/i915/migrate: Evict and restore the ccs data

2022-02-07 Thread Hellstrom, Thomas
On Mon, 2022-02-07 at 20:44 +0530, Ramalingam C wrote:
> On 2022-02-07 at 20:25:42 +0530, Hellstrom, Thomas wrote:
> > Hi, Ram,
> > 
> > A couple of quick questions before starting a more detailed review:
> > 
> > 1) Does this also support migrating of compressed data LMEM->LMEM?
> > What-about inter-tile?
> Honestly this series mainly facused on eviction of lmem into smem and
> restoration of same.
> 
> To cover migration, we need to handle this differently from eviction.
> Becasue when we migrate the compressed content we need to be able to
> use
> that from that new placement. can't keep the ccs data separately.
> 
> Migration of lmem->smem needs decompression incorportated.
> Migration of lmem_m->lmem_n needs to maintain the
> compressed/decompressed state as it is.
> 
> So we need to pass the information upto emit_copy to differentiate
> eviction and migration
> 
> If you dont have objection I would like to take the migration once we
> have the eviction of lmem in place.

Sure NP. I was thinking that in the final solution we might also need
to think about the possibility that we might evict to another lmem
region, although I figure that won't be enabled until we support multi-
tile.

> 
> > 
> > 2) Do we need to block faulting of compressed data in the fault
> > handler
> > as a follow-up patch?
> 
> In case of evicted compressed data we dont need to treat it
> differently
> from the evicted normal data. So I dont think this needs a special
> treatment. Sorry if i dont understand your question.

My question wasn't directly related to eviction actually, but does
user-space need to have mmap access to compressed data? If not, block
it?

Thanks,
Thomas



> 
> Ram
> > 
> > /Thomas
> > 
> > 
> > On Mon, 2022-02-07 at 15:07 +0530, Ramalingam C wrote:
> > > When we are swapping out the local memory obj on flat-ccs capable
> > > platform,
> > > we need to capture the ccs data too along with main meory and we
> > > need
> > > to
> > > restore it when we are swapping in the content.
> > > 
> > > Extracting and restoring the CCS data is done through a special
> > > cmd
> > > called
> > > XY_CTRL_SURF_COPY_BLT
> > > 
> > > Signed-off-by: Ramalingam C 
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_migrate.c | 283 +-
> > > 
> > > --
> > >  1 file changed, 155 insertions(+), 128 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c
> > > b/drivers/gpu/drm/i915/gt/intel_migrate.c
> > > index 5bdab0b3c735..e60ae6ff1847 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_migrate.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
> > > @@ -449,14 +449,146 @@ static bool wa_1209644611_applies(int ver,
> > > u32
> > > size)
> > >     return height % 4 == 3 && height <= 8;
> > >  }
> > > 
> > > +/**
> > > + * DOC: Flat-CCS - Memory compression for Local memory
> > > + *
> > > + * On Xe-HP and later devices, we use dedicated compression
> > > control
> > > state (CCS)
> > > + * stored in local memory for each surface, to support the 3D
> > > and
> > > media
> > > + * compression formats.
> > > + *
> > > + * The memory required for the CCS of the entire local memory is
> > > 1/256 of the
> > > + * local memory size. So before the kernel boot, the required
> > > memory
> > > is reserved
> > > + * for the CCS data and a secure register will be programmed
> > > with
> > > the CCS base
> > > + * address.
> > > + *
> > > + * Flat CCS data needs to be cleared when a lmem object is
> > > allocated.
> > > + * And CCS data can be copied in and out of CCS region through
> > > + * XY_CTRL_SURF_COPY_BLT. CPU can't access the CCS data
> > > directly.
> > > + *
> > > + * When we exaust the lmem, if the object's placements support
> > > smem,
> > > then we can
> > > + * directly decompress the compressed lmem object into smem and
> > > start using it
> > > + * from smem itself.
> > > + *
> > > + * But when we need to swapout the compressed lmem object into a
> > > smem region
> > > + * though objects' placement doesn't support smem, then we copy
> > > the
> > > lmem content
> > > + * as it is into smem region along with ccs data (using
> > > XY_CTRL_SURF_COPY_BLT).
> > > + * When the object is referred, lmem content will be swaped in
> > > along
> > > with
> > > + * restoration of the CCS data (using XY_CTRL_SURF_COPY_BLT) at
> > > corresponding
> > > + * location.
> > > + *
> > > + *
> > > + * Flat-CCS Modifiers for different compression formats
> > > + * 
> > > + *
> > > + * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS - used to indicate the
> > > buffers
> > > of Flat CCS
> > > + * render compression formats. Though the general layout is same
> > > as
> > > + * I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS, new hashing/compression
> > > algorithm is
> > > + * used. Render compression uses 128 byte compression blocks
> > > + *
> > > + * I915_FORMAT_MOD_F_TILED_DG2_MC_CCS -used to indicate the
> > > buffers
> > > of Flat CCS
> > > + * media compression for

[PATCH 00/30] My patch queue

2022-02-07 Thread Maxim Levitsky
This is a set of various patches that are stuck in my patch queue.

The KVM_REQ_GET_NESTED_STATE_PAGES patch is mostly an RFC, but it does seem
to work for me.

The read-only APIC ID patch is also somewhat of an RFC.

Some of these patches are preparation for support for nested AVIC,
which I have almost finished developing and will start testing very soon.

Best regards,
Maxim Levitsky

Maxim Levitsky (30):
  KVM: x86: SVM: don't passthrough SMAP/SMEP/PKE bits in !NPT &&
!gCR0.PG case
  KVM: x86: nSVM: fix potential NULL derefernce on nested migration
  KVM: x86: nSVM: mark vmcb01 as dirty when restoring SMM saved state
  KVM: x86: nSVM/nVMX: set nested_run_pending on VM entry which is a
result of RSM
  KVM: x86: nSVM: expose clean bit support to the guest
  KVM: x86: mark syntethic SMM vmexit as SVM_EXIT_SW
  KVM: x86: nSVM: deal with L1 hypervisor that intercepts interrupts but
lets L2 control them
  KVM: x86: lapic: don't touch irr_pending in kvm_apic_update_apicv when
inhibiting it
  KVM: x86: SVM: move avic definitions from AMD's spec to svm.h
  KVM: x86: SVM: fix race between interrupt delivery and AVIC inhibition
  KVM: x86: SVM: use vmcb01 in avic_init_vmcb
  KVM: x86: SVM: allow AVIC to co-exist with a nested guest running
  KVM: x86: lapic: don't allow to change APIC ID when apic acceleration
is enabled
  KVM: x86: lapic: don't allow to change local apic id when using older
x2apic api
  KVM: x86: SVM: remove avic's broken code that updated APIC ID
  KVM: x86: SVM: allow to force AVIC to be enabled
  KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set
  KVM: x86: mmu: add strict mmu mode
  KVM: x86: mmu: add gfn_in_memslot helper
  KVM: x86: mmu: allow to enable write tracking externally
  x86: KVMGT: use kvm_page_track_write_tracking_enable
  KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running
  KVM: x86: nSVM: implement nested LBR virtualization
  KVM: x86: nSVM: implement nested VMLOAD/VMSAVE
  KVM: x86: nSVM: support PAUSE filter threshold and count when
cpu_pm=on
  KVM: x86: nSVM: implement nested vGIF
  KVM: x86: add force_intercept_exceptions_mask
  KVM: SVM: implement force_intercept_exceptions_mask
  KVM: VMX: implement force_intercept_exceptions_mask
  KVM: x86: get rid of KVM_REQ_GET_NESTED_STATE_PAGES

 arch/x86/include/asm/kvm-x86-ops.h|   1 +
 arch/x86/include/asm/kvm_host.h   |  24 +-
 arch/x86/include/asm/kvm_page_track.h |   1 +
 arch/x86/include/asm/msr-index.h  |   1 +
 arch/x86/include/asm/svm.h|  36 +++
 arch/x86/include/uapi/asm/kvm.h   |   1 +
 arch/x86/kvm/Kconfig  |   3 -
 arch/x86/kvm/hyperv.c |   4 +
 arch/x86/kvm/lapic.c  |  53 ++--
 arch/x86/kvm/mmu.h|   8 +-
 arch/x86/kvm/mmu/mmu.c|  31 ++-
 arch/x86/kvm/mmu/page_track.c |  10 +-
 arch/x86/kvm/svm/avic.c   | 135 +++---
 arch/x86/kvm/svm/nested.c | 167 +++-
 arch/x86/kvm/svm/svm.c| 375 ++
 arch/x86/kvm/svm/svm.h|  60 +++--
 arch/x86/kvm/svm/svm_onhyperv.c   |   1 +
 arch/x86/kvm/vmx/nested.c | 107 +++-
 arch/x86/kvm/vmx/vmcs.h   |   6 +
 arch/x86/kvm/vmx/vmx.c|  48 +++-
 arch/x86/kvm/x86.c|  42 ++-
 arch/x86/kvm/x86.h|   5 +
 drivers/gpu/drm/i915/Kconfig  |   1 -
 drivers/gpu/drm/i915/gvt/kvmgt.c  |   5 +
 include/linux/kvm_host.h  |  10 +-
 25 files changed, 764 insertions(+), 371 deletions(-)

-- 
2.26.3




[PATCH 01/30] KVM: x86: SVM: don't passthrough SMAP/SMEP/PKE bits in !NPT && !gCR0.PG case

2022-02-07 Thread Maxim Levitsky
When the guest doesn't enable paging, and NPT/EPT is disabled, we
use the guest's paging CR3 as KVM's shadow paging pointer and
we are technically in direct mode as if we were to use NPT/EPT.

In direct mode we create SPTEs with user mode permissions
because usually in direct mode the NPT/EPT doesn't
need to restrict access based on guest CPL
(there are MBE/GMET extensions for that but KVM doesn't use them).

In this special "use guest paging as direct" mode however,
if CR4.SMAP/CR4.SMEP are enabled, the CPU will
fault on each access and KVM will enter an endless loop of page faults.

Since page protection doesn't have any meaning in the !PG case,
just don't pass these bits through.

The fix is the same as was done for VMX in commit:
commit 656ec4a4928a ("KVM: VMX: fix SMEP and SMAP without EPT")

This fixes the boot of Windows 10 without NPT for good.
(Without this patch, the BSP boots, but the APs were stuck in an endless
loop of page faults, causing the VM to boot with only 1 CPU.)

Signed-off-by: Maxim Levitsky 
Cc: sta...@vger.kernel.org
---
 arch/x86/kvm/svm/svm.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 975be872cd1a3..995c203a62fd9 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1596,6 +1596,7 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 {
struct vcpu_svm *svm = to_svm(vcpu);
u64 hcr0 = cr0;
+   bool old_paging = is_paging(vcpu);
 
 #ifdef CONFIG_X86_64
if (vcpu->arch.efer & EFER_LME && !vcpu->arch.guest_state_protected) {
@@ -1612,8 +1613,11 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long 
cr0)
 #endif
vcpu->arch.cr0 = cr0;
 
-   if (!npt_enabled)
+   if (!npt_enabled) {
hcr0 |= X86_CR0_PG | X86_CR0_WP;
+   if (old_paging != is_paging(vcpu))
+   svm_set_cr4(vcpu, kvm_read_cr4(vcpu));
+   }
 
/*
 * re-enable caching here because the QEMU bios
@@ -1657,8 +1661,12 @@ void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long 
cr4)
svm_flush_tlb_current(vcpu);
 
vcpu->arch.cr4 = cr4;
-   if (!npt_enabled)
+   if (!npt_enabled) {
cr4 |= X86_CR4_PAE;
+
+   if (!is_paging(vcpu))
+   cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
+   }
cr4 |= host_cr4_mce;
to_svm(vcpu)->vmcb->save.cr4 = cr4;
vmcb_mark_dirty(to_svm(vcpu)->vmcb, VMCB_CR);
-- 
2.26.3



[PATCH 02/30] KVM: x86: nSVM: fix potential NULL dereference on nested migration

2022-02-07 Thread Maxim Levitsky
It turns out that due to review feedback and/or rebases
I accidentally moved the call to nested_svm_load_cr3 too early,
before the NPT is enabled, which is very wrong to do.

KVM can't even access guest memory at that point as nested NPT
is needed for that, and of course it won't initialize the walk_mmu,
which is the main issue the patch was addressing.

Fix this for real.

Fixes: 232f75d3b4b5 ("KVM: nSVM: call nested_svm_load_cr3 on nested state load")
Cc: sta...@vger.kernel.org

Signed-off-by: Maxim Levitsky 
---
 arch/x86/kvm/svm/nested.c | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 1218b5a342fc8..39d280e7e80ef 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1457,18 +1457,6 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
!__nested_vmcb_check_save(vcpu, &save_cached))
goto out_free;
 
-   /*
-* While the nested guest CR3 is already checked and set by
-* KVM_SET_SREGS, it was set when nested state was yet loaded,
-* thus MMU might not be initialized correctly.
-* Set it again to fix this.
-*/
-
-   ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3,
- nested_npt_enabled(svm), false);
-   if (WARN_ON_ONCE(ret))
-   goto out_free;
-
 
/*
 * All checks done, we can enter guest mode. Userspace provides
@@ -1494,6 +1482,20 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 
svm_switch_vmcb(svm, &svm->nested.vmcb02);
nested_vmcb02_prepare_control(svm);
+
+   /*
+* While the nested guest CR3 is already checked and set by
+* KVM_SET_SREGS, it was set when nested state was yet loaded,
+* thus MMU might not be initialized correctly.
+* Set it again to fix this.
+*/
+
+   ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3,
+ nested_npt_enabled(svm), false);
+   if (WARN_ON_ONCE(ret))
+   goto out_free;
+
+
kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
ret = 0;
 out_free:
-- 
2.26.3



[PATCH 03/30] KVM: x86: nSVM: mark vmcb01 as dirty when restoring SMM saved state

2022-02-07 Thread Maxim Levitsky
While restoring the SMM state usually makes KVM enter
the nested guest and thus a different vmcb (vmcb02 vs vmcb01),
KVM should still mark it as dirty, since hardware
can in theory cache multiple vmcbs.

Failure to do so, combined with the lack of setting
nested_run_pending (which is fixed in the next patch),
might make KVM re-enter vmcb01, which was just exited from,
with a completely different set of guest state registers
(SMM vs non-SMM) and without the proper dirty bits set,
which results in the CPU reusing a stale IDTR pointer,
which leads to a guest shutdown on any interrupt.

On real hardware this usually doesn't happen,
but when running nested, L0's KVM does check and
honour a few dirty bits, causing this issue to happen.

This patch fixes the boot of a Hyper-V and SMM enabled
Windows VM running nested on KVM.

Signed-off-by: Maxim Levitsky 
Cc: sta...@vger.kernel.org
---
 arch/x86/kvm/svm/svm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 995c203a62fd9..3f1d11e652123 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4267,6 +4267,8 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const 
char *smstate)
 * Enter the nested guest now
 */
 
+   vmcb_mark_all_dirty(svm->vmcb01.ptr);
+
vmcb12 = map.hva;
nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
-- 
2.26.3



[PATCH 04/30] KVM: x86: nSVM/nVMX: set nested_run_pending on VM entry which is a result of RSM

2022-02-07 Thread Maxim Levitsky
While RSM-induced VM entries are not full VM entries,
they still need to be followed by an actual VM entry to complete them,
unlike setting the nested state.

This patch fixes the boot of a Hyper-V and SMM enabled
Windows VM running nested on KVM, which fails due
to this issue combined with the lack of dirty bit setting.

Signed-off-by: Maxim Levitsky 
Cc: sta...@vger.kernel.org
---
 arch/x86/kvm/svm/svm.c | 5 +
 arch/x86/kvm/vmx/vmx.c | 1 +
 2 files changed, 6 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3f1d11e652123..71bfa52121622 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4274,6 +4274,11 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const 
char *smstate)
nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, false);
 
+   if (ret)
+   goto unmap_save;
+
+   svm->nested.nested_run_pending = 1;
+
 unmap_save:
kvm_vcpu_unmap(vcpu, &map_save, true);
 unmap_map:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8ac5a6fa77203..fc9c4eca90a78 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7659,6 +7659,7 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const 
char *smstate)
if (ret)
return ret;
 
+   vmx->nested.nested_run_pending = 1;
vmx->nested.smm.guest_mode = false;
}
return 0;
-- 
2.26.3



[PATCH 05/30] KVM: x86: nSVM: expose clean bit support to the guest

2022-02-07 Thread Maxim Levitsky
KVM already honours a few clean bits, thus it makes sense
to let the nested guest know about it.

Note that KVM also doesn't check if the hardware supports
clean bits, and therefore nested KVM was
already setting clean bits and L0 KVM
was already honouring them.


Signed-off-by: Maxim Levitsky 
---
 arch/x86/kvm/svm/svm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 71bfa52121622..8013be9edf27c 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4663,6 +4663,7 @@ static __init void svm_set_cpu_caps(void)
/* CPUID 0x8001 and 0x800A (SVM features) */
if (nested) {
kvm_cpu_cap_set(X86_FEATURE_SVM);
+   kvm_cpu_cap_set(X86_FEATURE_VMCBCLEAN);
 
if (nrips)
kvm_cpu_cap_set(X86_FEATURE_NRIPS);
-- 
2.26.3



[PATCH 06/30] KVM: x86: mark synthetic SMM vmexit as SVM_EXIT_SW

2022-02-07 Thread Maxim Levitsky
Use a dummy unused vmexit reason to mark the 'VM exit'
that is happening when we exit to handle SMM,
which is not a real VM exit.

This makes it a bit easier to read the KVM trace,
and avoids other potential problems.

Signed-off-by: Maxim Levitsky 
---
 arch/x86/kvm/svm/svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 8013be9edf27c..9a4e299ed5673 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4194,7 +4194,7 @@ static int svm_enter_smm(struct kvm_vcpu *vcpu, char 
*smstate)
svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP];
svm->vmcb->save.rip = vcpu->arch.regs[VCPU_REGS_RIP];
 
-   ret = nested_svm_vmexit(svm);
+   ret = nested_svm_simple_vmexit(svm, SVM_EXIT_SW);
if (ret)
return ret;
 
-- 
2.26.3



[PATCH 07/30] KVM: x86: nSVM: deal with L1 hypervisor that intercepts interrupts but lets L2 control them

2022-02-07 Thread Maxim Levitsky
Fix a corner case in which the L1 hypervisor intercepts
interrupts (INTERCEPT_INTR) and either doesn't set
virtual interrupt masking (V_INTR_MASKING) or enters a
nested guest with EFLAGS.IF disabled prior to the entry.

In this case, despite the fact that L1 intercepts the interrupts,
KVM still needs to set up an interrupt window to wait before
injecting the INTR vmexit.

Currently KVM instead enters an endless loop of 'req_immediate_exit'.

Exactly the same issue also happens for SMIs and NMIs.
Fix this as well.

Note that on VMX this case is impossible, as there is only the
'vmexit on external interrupts' execution control, which is either set,
in which case both the host's and the guest's EFLAGS.IF
are ignored, or not set, in which case no VM exits are delivered.


Signed-off-by: Maxim Levitsky 
---
 arch/x86/kvm/svm/svm.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9a4e299ed5673..22e614008cf59 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3372,11 +3372,13 @@ static int svm_nmi_allowed(struct kvm_vcpu *vcpu, bool for_injection)
if (svm->nested.nested_run_pending)
return -EBUSY;
 
+   if (svm_nmi_blocked(vcpu))
+   return 0;
+
/* An NMI must not be injected into L2 if it's supposed to VM-Exit.  */
if (for_injection && is_guest_mode(vcpu) && nested_exit_on_nmi(svm))
return -EBUSY;
-
-   return !svm_nmi_blocked(vcpu);
+   return 1;
 }
 
 static bool svm_get_nmi_mask(struct kvm_vcpu *vcpu)
@@ -3428,9 +3430,13 @@ bool svm_interrupt_blocked(struct kvm_vcpu *vcpu)
 static int svm_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection)
 {
struct vcpu_svm *svm = to_svm(vcpu);
+
if (svm->nested.nested_run_pending)
return -EBUSY;
 
+   if (svm_interrupt_blocked(vcpu))
+   return 0;
+
/*
 * An IRQ must not be injected into L2 if it's supposed to VM-Exit,
 * e.g. if the IRQ arrived asynchronously after checking nested events.
@@ -3438,7 +3444,7 @@ static int svm_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection)
if (for_injection && is_guest_mode(vcpu) && nested_exit_on_intr(svm))
return -EBUSY;
 
-   return !svm_interrupt_blocked(vcpu);
+   return 1;
 }
 
 static void svm_enable_irq_window(struct kvm_vcpu *vcpu)
@@ -4169,11 +4175,14 @@ static int svm_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection)
if (svm->nested.nested_run_pending)
return -EBUSY;
 
+   if (svm_smi_blocked(vcpu))
+   return 0;
+
/* An SMI must not be injected into L2 if it's supposed to VM-Exit.  */
if (for_injection && is_guest_mode(vcpu) && nested_exit_on_smi(svm))
return -EBUSY;
 
-   return !svm_smi_blocked(vcpu);
+   return 1;
 }
 
 static int svm_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
-- 
2.26.3
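
For context, a simplified sketch of how the common x86 code interprets these
return values (abridged; see inject_pending_event() in arch/x86/kvm/x86.c):

/*
 *   ret < 0 (-EBUSY): injection cannot even be attempted now (e.g. a
 *                     nested VMRUN is pending); KVM requests an
 *                     immediate exit and retries.
 *   ret == 0        : the event is blocked (EFLAGS.IF clear, GIF clear,
 *                     ...); KVM opens an interrupt/NMI/SMI window and
 *                     waits instead of retrying.
 *   ret > 0         : the event can be injected right away.
 *
 * Returning -EBUSY for a blocked-but-intercepted interrupt, as before
 * this patch, is what produced the endless 'req_immediate_exit' loop.
 */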



[PATCH 08/30] KVM: x86: lapic: don't touch irr_pending in kvm_apic_update_apicv when inhibiting it

2022-02-07 Thread Maxim Levitsky
kvm_apic_update_apicv is called while AVIC is still active, thus IRR bits
can be set by the CPU after it is called, and this doesn't cause
irr_pending to be set to true.

Also, the logic in avic_kick_target_vcpu doesn't expect a race with this
function, so to keep it simple, just leave irr_pending set to true and
let the next interrupt injection into the guest clear it.


Signed-off-by: Maxim Levitsky 
---
 arch/x86/kvm/lapic.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 0da7d0960fcb5..dd4e2888c244b 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2307,7 +2307,12 @@ void kvm_apic_update_apicv(struct kvm_vcpu *vcpu)
apic->irr_pending = true;
apic->isr_count = 1;
} else {
-   apic->irr_pending = (apic_search_irr(apic) != -1);
+   /*
+* Don't clear irr_pending, searching the IRR can race with
+* updates from the CPU as APICv is still active from hardware's
+* perspective.  The flag will be cleared as appropriate when
+* KVM injects the interrupt.
+*/
apic->isr_count = count_vectors(apic->regs + APIC_ISR);
}
 }
-- 
2.26.3



[PATCH 09/30] KVM: x86: SVM: move avic definitions from AMD's spec to svm.h

2022-02-07 Thread Maxim Levitsky
asm/svm.h is the correct place for all values that are defined in
the SVM spec, and that includes AVIC.

Also add some values from the spec that were not defined before
and will soon be useful.

Signed-off-by: Maxim Levitsky 
---
 arch/x86/include/asm/msr-index.h |  1 +
 arch/x86/include/asm/svm.h   | 36 
 arch/x86/kvm/svm/avic.c  | 22 +--
 arch/x86/kvm/svm/svm.h   | 11 --
 4 files changed, 38 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 01e2650b95859..552ff8a5ea023 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -476,6 +476,7 @@
 #define MSR_AMD64_ICIBSEXTDCTL 0xc001103c
 #define MSR_AMD64_IBSOPDATA4   0xc001103d
 #define MSR_AMD64_IBS_REG_COUNT_MAX8 /* includes MSR_AMD64_IBSBRTARGET */
+#define MSR_AMD64_SVM_AVIC_DOORBELL0xc001011b
 #define MSR_AMD64_VM_PAGE_FLUSH0xc001011e
 #define MSR_AMD64_SEV_ES_GHCB  0xc0010130
 #define MSR_AMD64_SEV  0xc0010131
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index b00dbc5fac2b2..bb2fb78523cee 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -220,6 +220,42 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
 #define SVM_NESTED_CTL_SEV_ENABLE  BIT(1)
 #define SVM_NESTED_CTL_SEV_ES_ENABLE   BIT(2)
 
+
+/* AVIC */
+#define AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK   (0xFF)
+#define AVIC_LOGICAL_ID_ENTRY_VALID_BIT    31
+#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK   (1 << 31)
+
+#define AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK   (0xFFULL)
+#define AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK   (0xFFFFFFFFFFULL << 12)
+#define AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK (1ULL << 62)
+#define AVIC_PHYSICAL_ID_ENTRY_VALID_MASK  (1ULL << 63)
+#define AVIC_PHYSICAL_ID_TABLE_SIZE_MASK   (0xFF)
+
+#define AVIC_DOORBELL_PHYSICAL_ID_MASK (0xFF)
+
+#define AVIC_UNACCEL_ACCESS_WRITE_MASK 1
+#define AVIC_UNACCEL_ACCESS_OFFSET_MASK    0xFF0
+#define AVIC_UNACCEL_ACCESS_VECTOR_MASK    0xFFFFFFFF
+
+enum avic_ipi_failure_cause {
+   AVIC_IPI_FAILURE_INVALID_INT_TYPE,
+   AVIC_IPI_FAILURE_TARGET_NOT_RUNNING,
+   AVIC_IPI_FAILURE_INVALID_TARGET,
+   AVIC_IPI_FAILURE_INVALID_BACKING_PAGE,
+};
+
+
+/*
+ * 0xff is broadcast, so the max index allowed for physical APIC ID
+ * table is 0xfe.  APIC IDs above 0xff are reserved.
+ */
+#define AVIC_MAX_PHYSICAL_ID_COUNT 0xff
+
+#define AVIC_HPA_MASK  ~((0xFFFULL << 52) | 0xFFF)
+#define VMCB_AVIC_APIC_BAR_MASK    0xFFFFFFFFFF000ULL
+
+
 struct vmcb_seg {
u16 selector;
u16 attrib;
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 99f907ec5aa8f..fabfc337e1c35 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -27,20 +27,6 @@
 #include "irq.h"
 #include "svm.h"
 
-#define SVM_AVIC_DOORBELL  0xc001011b
-
-#define AVIC_HPA_MASK  ~((0xFFFULL << 52) | 0xFFF)
-
-/*
- * 0xff is broadcast, so the max index allowed for physical APIC ID
- * table is 0xfe.  APIC IDs above 0xff are reserved.
- */
-#define AVIC_MAX_PHYSICAL_ID_COUNT 255
-
-#define AVIC_UNACCEL_ACCESS_WRITE_MASK 1
-#define AVIC_UNACCEL_ACCESS_OFFSET_MASK    0xFF0
-#define AVIC_UNACCEL_ACCESS_VECTOR_MASK    0xFFFFFFFF
-
 /* AVIC GATAG is encoded using VM and VCPU IDs */
 #define AVIC_VCPU_ID_BITS  8
 #define AVIC_VCPU_ID_MASK  ((1 << AVIC_VCPU_ID_BITS) - 1)
@@ -73,12 +59,6 @@ struct amd_svm_iommu_ir {
void *data; /* Storing pointer to struct amd_ir_data */
 };
 
-enum avic_ipi_failure_cause {
-   AVIC_IPI_FAILURE_INVALID_INT_TYPE,
-   AVIC_IPI_FAILURE_TARGET_NOT_RUNNING,
-   AVIC_IPI_FAILURE_INVALID_TARGET,
-   AVIC_IPI_FAILURE_INVALID_BACKING_PAGE,
-};
 
 /* Note:
  * This function is called from IOMMU driver to notify
@@ -702,7 +682,7 @@ int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
 * one is harmless).
 */
if (cpu != get_cpu())
-   wrmsrl(SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu));
+   wrmsrl(MSR_AMD64_SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu));
put_cpu();
} else {
/*
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 852b12aee03d7..6343558982c73 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -555,17 +555,6 @@ extern struct kvm_x86_nested_ops svm_nested_ops;
 
 /* avic.c */
 
-#define AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK   (0xFF)
-#define AVIC_LOGICAL_ID_ENTRY_VALID_BIT    31
-#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK   (1 << 31)
-
-#define AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_
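
For illustration, a small sketch (hypothetical helpers, not part of the patch)
showing how the physical APIC ID table entry masks added above decompose an
entry:

/* An AVIC physical APIC ID table entry packs the host APIC ID, the
 * backing page address and the valid / is-running flags into one u64. */
static bool avic_entry_usable(u64 entry)
{
	return (entry & AVIC_PHYSICAL_ID_ENTRY_VALID_MASK) &&
	       (entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);
}

static u64 avic_entry_backing_page_pa(u64 entry)
{
	return entry & AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK;
}

static u32 avic_entry_host_apic_id(u64 entry)
{
	return entry & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
}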

[PATCH 10/30] KVM: x86: SVM: fix race between interrupt delivery and AVIC inhibition

2022-02-07 Thread Maxim Levitsky
If svm_deliver_avic_intr is called just after the target vcpu's AVIC got
inhibited, it might read a stale value of vcpu->arch.apicv_active
which can lead to the target vCPU not noticing the interrupt.

To fix this use load-acquire/store-release so that, if the target vCPU
is IN_GUEST_MODE, we're guaranteed to see a previous disabling of the
AVIC.  If AVIC has been disabled in the meanwhile, proceed with the
KVM_REQ_EVENT-based delivery.

All this complicated logic is actually exactly how we can handle an
incomplete IPI vmexit; the only difference lies in who sets IRR, whether
KVM or the processor.

An incomplete IPI vmexit also has the same races as
svm_deliver_avic_intr.
Therefore use avic_kick_target_vcpus there as well.

Co-developed-by: Paolo Bonzini 
Signed-off-by: Paolo Bonzini 
Signed-off-by: Maxim Levitsky 
---
 arch/x86/kvm/svm/avic.c | 73 ++---
 arch/x86/kvm/svm/svm.c  | 65 
 arch/x86/kvm/svm/svm.h  |  3 ++
 arch/x86/kvm/x86.c  |  4 ++-
 4 files changed, 82 insertions(+), 63 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index fabfc337e1c35..4c2d622b3b9f0 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -269,6 +269,24 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
return 0;
 }
 
+
+void avic_ring_doorbell(struct kvm_vcpu *vcpu)
+{
+   /*
+* Note, the vCPU could get migrated to a different pCPU at any
+* point, which could result in signalling the wrong/previous
+* pCPU.  But if that happens the vCPU is guaranteed to do a
+* VMRUN (after being migrated) and thus will process pending
+* interrupts, i.e. a doorbell is not needed (and the spurious
+* one is harmless).
+*/
+   int cpu = READ_ONCE(vcpu->cpu);
+
+   if (cpu != get_cpu())
+   wrmsrl(MSR_AMD64_SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu));
+   put_cpu();
+}
+
 static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
   u32 icrl, u32 icrh)
 {
@@ -284,8 +302,13 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
kvm_for_each_vcpu(i, vcpu, kvm) {
if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
GET_APIC_DEST_FIELD(icrh),
-   icrl & APIC_DEST_MASK))
-   kvm_vcpu_wake_up(vcpu);
+   icrl & APIC_DEST_MASK)) {
+   vcpu->arch.apic->irr_pending = true;
+   svm_complete_interrupt_delivery(vcpu,
+   icrl & APIC_MODE_MASK,
+   icrl & APIC_INT_LEVELTRIG,
+   icrl & APIC_VECTOR_MASK);
+   }
}
 }
 
@@ -649,52 +672,6 @@ void avic_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
return;
 }
 
-int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
-{
-   if (!vcpu->arch.apicv_active)
-   return -1;
-
-   kvm_lapic_set_irr(vec, vcpu->arch.apic);
-
-   /*
-* Pairs with the smp_mb_*() after setting vcpu->guest_mode in
-* vcpu_enter_guest() to ensure the write to the vIRR is ordered before
-* the read of guest_mode, which guarantees that either VMRUN will see
-* and process the new vIRR entry, or that the below code will signal
-* the doorbell if the vCPU is already running in the guest.
-*/
-   smp_mb__after_atomic();
-
-   /*
-* Signal the doorbell to tell hardware to inject the IRQ if the vCPU
-* is in the guest.  If the vCPU is not in the guest, hardware will
-* automatically process AVIC interrupts at VMRUN.
-*/
-   if (vcpu->mode == IN_GUEST_MODE) {
-   int cpu = READ_ONCE(vcpu->cpu);
-
-   /*
-* Note, the vCPU could get migrated to a different pCPU at any
-* point, which could result in signalling the wrong/previous
-* pCPU.  But if that happens the vCPU is guaranteed to do a
-* VMRUN (after being migrated) and thus will process pending
-* interrupts, i.e. a doorbell is not needed (and the spurious
-* one is harmless).
-*/
-   if (cpu != get_cpu())
-   wrmsrl(MSR_AMD64_SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu));
-   put_cpu();
-   } else {
-   /*
-* Wake the vCPU if it was blocking.  KVM will then detect the
-* pending IRQ when checking if the vCPU has a wake event.
-*/
-   kvm_vcpu_wake_up(vcpu);
-   }
-
-   return 0;
-}
-
 bool avic_dy_apicv_has_pending_interrup
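
For background, a minimal sketch of the store-release/load-acquire guarantee
the commit message relies on, with made-up variable names (not the actual KVM
fields):

/* If the reader's acquire-load observes the value written by the
 * writer's store-release, it also observes every write the writer
 * performed before that release. */
static bool inhibited;		/* stand-in for "AVIC got inhibited" */
static int  vcpu_mode;		/* stand-in for vcpu->mode           */

static void writer_side(void)
{
	inhibited = true;			/* plain write          */
	smp_store_release(&vcpu_mode, 0);	/* publish with release */
}

static bool reader_side(void)
{
	if (smp_load_acquire(&vcpu_mode) == 0)	/* acquire                  */
		return inhibited;		/* guaranteed to read true  */
	return false;
}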

[PATCH 11/30] KVM: x86: SVM: use vmcb01 in avic_init_vmcb

2022-02-07 Thread Maxim Levitsky
Out of precaution, use vmcb01 when enabling host AVIC.

No functional change intended.

Signed-off-by: Maxim Levitsky 
---
 arch/x86/kvm/svm/avic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 4c2d622b3b9f0..c6072245f7fbb 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -167,7 +167,7 @@ int avic_vm_init(struct kvm *kvm)
 
 void avic_init_vmcb(struct vcpu_svm *svm)
 {
-   struct vmcb *vmcb = svm->vmcb;
+   struct vmcb *vmcb = svm->vmcb01.ptr;
struct kvm_svm *kvm_svm = to_kvm_svm(svm->vcpu.kvm);
phys_addr_t bpa = __sme_set(page_to_phys(svm->avic_backing_page));
	phys_addr_t lpa = __sme_set(page_to_phys(kvm_svm->avic_logical_id_table_page));
-- 
2.26.3



[PATCH 12/30] KVM: x86: SVM: allow AVIC to co-exist with a nested guest running

2022-02-07 Thread Maxim Levitsky
Inhibit the AVIC of the vCPU that is running nested for the duration of the
nested run, so that all interrupts arriving from both its vCPU siblings
and from KVM are delivered using normal IPIs and cause that vCPU to vmexit.

Note that unlike normal AVIC inhibition, there is no need to
update the AVIC mmio memslot, because the nested guest uses its
own set of paging tables.
That also means that AVIC doesn't need to be inhibited VM wide.

Note that in theory, when a nested guest doesn't intercept
physical interrupts, we could continue using AVIC to deliver them
to it, but we don't bother doing so for now. Plus, when nested AVIC
is implemented, the nested guest will likely use it, which will
not allow this optimization to be used anyway

(can't use real AVIC to support both L1 and L2 at the same time)

Signed-off-by: Maxim Levitsky 
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h|  8 +++-
 arch/x86/kvm/svm/avic.c|  7 ++-
 arch/x86/kvm/svm/nested.c  | 15 ++-
 arch/x86/kvm/svm/svm.c | 31 +++---
 arch/x86/kvm/svm/svm.h |  1 +
 arch/x86/kvm/x86.c | 18 +++--
 7 files changed, 61 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 9e37dc3d88636..c0d8f351dcbc0 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -125,6 +125,7 @@ KVM_X86_OP_NULL(migrate_timers)
 KVM_X86_OP(msr_filter_changed)
 KVM_X86_OP_NULL(complete_emulated_msr)
 KVM_X86_OP(vcpu_deliver_sipi_vector)
+KVM_X86_OP_NULL(vcpu_has_apicv_inhibit_condition);
 
 #undef KVM_X86_OP
 #undef KVM_X86_OP_NULL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c371ee7e45f78..256539c0481c5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1039,7 +1039,6 @@ struct kvm_x86_msr_filter {
 
 #define APICV_INHIBIT_REASON_DISABLE0
 #define APICV_INHIBIT_REASON_HYPERV 1
-#define APICV_INHIBIT_REASON_NESTED 2
 #define APICV_INHIBIT_REASON_IRQWIN 3
 #define APICV_INHIBIT_REASON_PIT_REINJ  4
 #define APICV_INHIBIT_REASON_X2APIC5
@@ -1494,6 +1493,12 @@ struct kvm_x86_ops {
int (*complete_emulated_msr)(struct kvm_vcpu *vcpu, int err);
 
void (*vcpu_deliver_sipi_vector)(struct kvm_vcpu *vcpu, u8 vector);
+
+   /*
+* Returns true if for some reason APICv (e.g guest mode)
+* must be inhibited on this vCPU
+*/
+   bool (*vcpu_has_apicv_inhibit_condition)(struct kvm_vcpu *vcpu);
 };
 
 struct kvm_x86_nested_ops {
@@ -1784,6 +1789,7 @@ gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva,
 
 bool kvm_apicv_activated(struct kvm *kvm);
 void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu);
+bool vcpu_has_apicv_inhibit_condition(struct kvm_vcpu *vcpu);
 void kvm_request_apicv_update(struct kvm *kvm, bool activate,
  unsigned long bit);
 
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index c6072245f7fbb..8f23e7d239097 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -677,6 +677,12 @@ bool avic_dy_apicv_has_pending_interrupt(struct kvm_vcpu 
*vcpu)
return false;
 }
 
+bool avic_has_vcpu_inhibit_condition(struct kvm_vcpu *vcpu)
+{
+   return is_guest_mode(vcpu);
+}
+
+
 static void svm_ir_list_del(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi)
 {
unsigned long flags;
@@ -888,7 +894,6 @@ bool avic_check_apicv_inhibit_reasons(ulong bit)
ulong supported = BIT(APICV_INHIBIT_REASON_DISABLE) |
  BIT(APICV_INHIBIT_REASON_ABSENT) |
  BIT(APICV_INHIBIT_REASON_HYPERV) |
- BIT(APICV_INHIBIT_REASON_NESTED) |
  BIT(APICV_INHIBIT_REASON_IRQWIN) |
  BIT(APICV_INHIBIT_REASON_PIT_REINJ) |
  BIT(APICV_INHIBIT_REASON_X2APIC) |
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 39d280e7e80ef..ac9159b0618c7 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -551,11 +551,6 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm)
 * exit_int_info, exit_int_info_err, next_rip, insn_len, insn_bytes.
 */
 
-   /*
-* Also covers avic_vapic_bar, avic_backing_page, avic_logical_id,
-* avic_physical_id.
-*/
-   WARN_ON(kvm_apicv_activated(svm->vcpu.kvm));
 
/* Copied from vmcb01.  msrpm_base can be overwritten later.  */
svm->vmcb->control.nested_ctl = svm->vmcb01.ptr->control.nested_ctl;
@@ -659,6 +654,9 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
 
svm_set_gif(svm, true);
 
+   if (kvm_vcpu_apicv_active(vcpu))
+   kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu);
+
return 0;
 }
 
@@ -923,6 +921,13 @@ int nested_sv
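
The rest of that hunk is not visible above; as a rough sketch (hypothetical
wrapper, not necessarily the patch's exact code), the common-code side of the
new optional callback boils down to:

/* A per-vCPU inhibit condition (for SVM: the vCPU is running a nested
 * guest) keeps APICv off for that vCPU only, without touching the
 * VM-wide inhibit reasons or the AVIC memslot. */
static bool vcpu_has_apicv_inhibit_condition_sketch(struct kvm_vcpu *vcpu)
{
	if (kvm_x86_ops.vcpu_has_apicv_inhibit_condition)
		return kvm_x86_ops.vcpu_has_apicv_inhibit_condition(vcpu);

	return false;
}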

[PATCH 13/30] KVM: x86: lapic: don't allow to change APIC ID when apic acceleration is enabled

2022-02-07 Thread Maxim Levitsky
No normal guest has any reason to change physical APIC IDs, and
allowing this introduces bugs into APIC acceleration code.

Signed-off-by: Maxim Levitsky 
---
 arch/x86/kvm/lapic.c | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index dd4e2888c244b..7ff695cab27b2 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2002,10 +2002,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 
switch (reg) {
case APIC_ID:   /* Local APIC ID */
-   if (!apic_x2apic_mode(apic))
-   kvm_apic_set_xapic_id(apic, val >> 24);
-   else
+   if (apic_x2apic_mode(apic)) {
ret = 1;
+   break;
+   }
+   /*
+* Don't allow setting APIC ID with any APIC acceleration
+* enabled to avoid unexpected issues
+*/
+   if (enable_apicv && ((val >> 24) != apic->vcpu->vcpu_id)) {
+   kvm_vm_bugged(apic->vcpu->kvm);
+   break;
+   }
+
+   kvm_apic_set_xapic_id(apic, val >> 24);
break;
 
case APIC_TASKPRI:
@@ -2572,10 +2582,16 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
 static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu,
struct kvm_lapic_state *s, bool set)
 {
-   if (apic_x2apic_mode(vcpu->arch.apic)) {
-   u32 *id = (u32 *)(s->regs + APIC_ID);
-   u32 *ldr = (u32 *)(s->regs + APIC_LDR);
+   u32 *id = (u32 *)(s->regs + APIC_ID);
+   u32 *ldr = (u32 *)(s->regs + APIC_LDR);
 
+   if (!apic_x2apic_mode(vcpu->arch.apic)) {
+   /* Don't allow setting APIC ID with any APIC acceleration
+* enabled to avoid unexpected issues
+*/
+   if (enable_apicv && (*id >> 24) != vcpu->vcpu_id)
+   return -EINVAL;
+   } else {
if (vcpu->kvm->arch.x2apic_format) {
if (*id != vcpu->vcpu_id)
return -EINVAL;
-- 
2.26.3



[PATCH 14/30] KVM: x86: lapic: don't allow to change local apic id when using older x2apic api

2022-02-07 Thread Maxim Levitsky
KVM allowed setting a non-boot APIC ID via the APIC state
when the older, non-x2APIC 32-bit APIC ID userspace API was used.

Signed-off-by: Maxim Levitsky 
---
 arch/x86/kvm/lapic.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 7ff695cab27b2..aeddd68d31181 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2592,15 +2592,15 @@ static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu,
if (enable_apicv && (*id >> 24) != vcpu->vcpu_id)
return -EINVAL;
} else {
-   if (vcpu->kvm->arch.x2apic_format) {
-   if (*id != vcpu->vcpu_id)
-   return -EINVAL;
-   } else {
-   if (set)
-   *id >>= 24;
-   else
-   *id <<= 24;
-   }
+
+   if (!vcpu->kvm->arch.x2apic_format && set)
+   *id >>= 24;
+
+   if (*id != vcpu->vcpu_id)
+   return -EINVAL;
+
+   if (!vcpu->kvm->arch.x2apic_format && !set)
+   *id <<= 24;
 
/* In x2APIC mode, the LDR is fixed and based on the id */
if (set)
-- 
2.26.3
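
For background, the >> 24 / << 24 conversions come from the xAPIC register
layout, where the APIC ID occupies bits 31:24 of the APIC_ID register, while
the plain ID is what the vCPU ID is compared against. A trivial sketch (helper
names are made up):

static inline u32 xapic_id_from_reg(u32 reg)
{
	return reg >> 24;	/* APIC_ID register layout -> plain 8-bit ID */
}

static inline u32 xapic_reg_from_id(u32 id)
{
	return id << 24;	/* plain ID -> APIC_ID register layout */
}

With the older userspace API (kvm->arch.x2apic_format not set), the LAPIC
state is exchanged in this shifted layout even for x2APIC-mode vCPUs, which is
why the set path shifts down before the vcpu_id comparison and the get path
shifts back up.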



[PATCH 15/30] KVM: x86: SVM: remove avic's broken code that updated APIC ID

2022-02-07 Thread Maxim Levitsky
Now that KVM doesn't allow changing the APIC ID when AVIC is
enabled, remove the buggy AVIC code that tried to handle that.

Signed-off-by: Maxim Levitsky 
---
 arch/x86/kvm/svm/avic.c | 35 ---
 1 file changed, 35 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 8f23e7d239097..768252b3dfee6 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -440,35 +440,6 @@ static int avic_handle_ldr_update(struct kvm_vcpu *vcpu)
return ret;
 }
 
-static int avic_handle_apic_id_update(struct kvm_vcpu *vcpu)
-{
-   u64 *old, *new;
-   struct vcpu_svm *svm = to_svm(vcpu);
-   u32 id = kvm_xapic_id(vcpu->arch.apic);
-
-   if (vcpu->vcpu_id == id)
-   return 0;
-
-   old = avic_get_physical_id_entry(vcpu, vcpu->vcpu_id);
-   new = avic_get_physical_id_entry(vcpu, id);
-   if (!new || !old)
-   return 1;
-
-   /* We need to move physical_id_entry to new offset */
-   *new = *old;
-   *old = 0ULL;
-   to_svm(vcpu)->avic_physical_id_cache = new;
-
-   /*
-* Also update the guest physical APIC ID in the logical
-* APIC ID table entry if already setup the LDR.
-*/
-   if (svm->ldr_reg)
-   avic_handle_ldr_update(vcpu);
-
-   return 0;
-}
-
 static void avic_handle_dfr_update(struct kvm_vcpu *vcpu)
 {
struct vcpu_svm *svm = to_svm(vcpu);
@@ -488,10 +459,6 @@ static int avic_unaccel_trap_write(struct vcpu_svm *svm)
AVIC_UNACCEL_ACCESS_OFFSET_MASK;
 
switch (offset) {
-   case APIC_ID:
-   if (avic_handle_apic_id_update(&svm->vcpu))
-   return 0;
-   break;
case APIC_LDR:
if (avic_handle_ldr_update(&svm->vcpu))
return 0;
@@ -584,8 +551,6 @@ int avic_init_vcpu(struct vcpu_svm *svm)
 
 void avic_apicv_post_state_restore(struct kvm_vcpu *vcpu)
 {
-   if (avic_handle_apic_id_update(vcpu) != 0)
-   return;
avic_handle_dfr_update(vcpu);
avic_handle_ldr_update(vcpu);
 }
-- 
2.26.3



[PATCH 16/30] KVM: x86: SVM: allow to force AVIC to be enabled

2022-02-07 Thread Maxim Levitsky
Apparently on some systems AVIC is disabled in CPUID but still usable.

Allow the user to override the CPUID check if they are willing to
take the risk.

Signed-off-by: Maxim Levitsky 
---
 arch/x86/kvm/svm/svm.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 85035324ed762..b88ca7f07a0fc 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -202,6 +202,9 @@ module_param(tsc_scaling, int, 0444);
 static bool avic;
 module_param(avic, bool, 0444);
 
+static bool force_avic;
+module_param_unsafe(force_avic, bool, 0444);
+
 bool __read_mostly dump_invalid_vmcb;
 module_param(dump_invalid_vmcb, bool, 0644);
 
@@ -4839,10 +4842,14 @@ static __init int svm_hardware_setup(void)
nrips = false;
}
 
-   enable_apicv = avic = avic && npt_enabled && boot_cpu_has(X86_FEATURE_AVIC);
+   enable_apicv = avic = avic && npt_enabled && (boot_cpu_has(X86_FEATURE_AVIC) || force_avic);
 
if (enable_apicv) {
-   pr_info("AVIC enabled\n");
+   if (!boot_cpu_has(X86_FEATURE_AVIC)) {
+   pr_warn("AVIC is not supported in CPUID but force 
enabled");
+   pr_warn("Your system might crash and burn");
+   } else
+   pr_info("AVIC enabled\n");
 
amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
} else {
-- 
2.26.3
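
As a usage note (an assumption about behaviour, not part of the patch itself):
with this change, loading the module with e.g.

    modprobe kvm_amd avic=1 force_avic=1

should enable AVIC even when CPUID does not advertise it; and because
force_avic is declared with module_param_unsafe(), setting it is expected to
taint the kernel in addition to printing the warnings above.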


