[RFC PATCH 0/3] Register read/write tracing with dynamic debug and pstore

2018-08-03 Thread Sai Prakash Ranjan
Hi,

This patch series adds a new tracing facility for register reads and writes 
called
Register Trace Buffer(RTB).

We also add pstore support through which we can save all register read/write 
logs into a
persistent ram buffer that can be dumped after reboot.

It can be used to determine from where register was read/written before 
unclocked
access or some kind of bus hang or an unexpected reset caused by some buggy 
driver
which happens a lot during initial development stages.

In addition to this, we provide dynamic debug support to filter out unwanted 
logs
and limit trace to only specific files or directories since there can be aweful
lot of register events and we will be interested only in specific drivers or 
subsystems
which we will be working on. Last few RTB entries will give us the hint for 
debugging.
With dynamic debug, we are also reducing the overhead of tracing considerably.

Also as a bonus, this tracing can be extended to include IRQ, printk, context 
switch
and lot other things with proper hooks. It can be very helpful for real case 
debug scenarios.

Below is a simple example of identifying cause for bus hang in qcom mdp tested 
on db410c.
This hang was intentionally introduced just to show the usecase of RTB.
The module used can be found here: 
https://github.com/saiprakash-ranjan/Bus-Hang which does
an unclocked access and will reset db410c and later logs can be viewed through 
pstore.

Note: I just copied bus_hang.c to drivers/soc/qcom and built it.

1) Set bootargs with dyndbg parameter as below:

   # dyndbg="file drivers/soc/qcom/* +p"

2) Bus hang by reading below debugfs entry with bus_hang module.

   # cat /sys/kernel/debug/hang/bus_hang

3) After restart, we can find the cause in last entry i.e. 
(bus_hang_mdp+0x98/0xb0)

   # cat /sys/fs/pstore/rtb-ramoops-0
   [LOGK_WRITEL ] ts:1373101930  data:0cd065a4
qcom_smsm_probe+0x51c/0x668
   [LOGK_WRITEL ] ts:1373311878  data:0cd06608
qcom_smsm_probe+0x51c/0x668
   [LOGK_READL  ] ts:18177142294  data:0ab85040   
bus_hang_mdp+0x98/0xb0

4) Offending register access found as below:

   # (gdb)
   # (gdb) list *(bus_hang_mdp+0x98)
   # 0x0867cdc8 is in bus_hang_mdp (drivers/soc/qcom/bus_hang.c:10).
   # 5   static int bus_hang_mdp(void *data, u64 *val)
   # 6   {
   # 7   void *p = ioremap(0x01a01000, SZ_4K);
   # 8   unsigned int a;
   # 9
   # 10  a = __raw_readl((void *)((unsigned long)p + 0x40));  <
   # 11
   # 12  *val = a;
   # 13
   # 14  return 0;
   # (gdb)

There will be a lot more real usecases where RTB can be used. Maybe we can test 
on other boards as well.

Patchwise one line description is given below:

This trace module is based on RTB driver in CAF.
Link: 
https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/kernel/trace/msm_rtb.c

Patch 1 provides the api called uncached_logk which can be used to log register 
accesses.

Patch 2 adds the pstore support for viewing the logs.

Patch 3 adds dynamic debug support to filter the register readl/writel access.

Sai Prakash Ranjan (3):
  tracing: Add support for logging data to uncached buffer
  pstore: Add register readl/writel tracing support
  dynamic_debug: Add support for dynamic register trace

 .../bindings/reserved-memory/ramoops.txt  |   7 +-
 arch/arm64/include/asm/io.h   |  93 ++
 fs/pstore/Kconfig |  12 ++
 fs/pstore/Makefile|   1 +
 fs/pstore/inode.c |  71 +++-
 fs/pstore/internal.h  |   8 +
 fs/pstore/platform.c  |   4 +
 fs/pstore/ram.c   |  42 -
 fs/pstore/rtb.c   |  45 +
 include/linux/dynamic_debug.h |  10 ++
 include/linux/pstore.h|   2 +
 include/linux/pstore_ram.h|   1 +
 include/linux/rtb.h   |  31 
 kernel/trace/Kconfig  |   8 +
 kernel/trace/Makefile |   2 +
 kernel/trace/trace_rtb.c  | 163 ++
 16 files changed, 495 insertions(+), 5 deletions(-)
 create mode 100644 fs/pstore/rtb.c
 create mode 100644 include/linux/rtb.h
 create mode 100644 kernel/trace/trace_rtb.c

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[RFC PATCH 1/3] tracing: Add support for logging data to uncached buffer

2018-08-03 Thread Sai Prakash Ranjan
Add RTB trace support to write data to a small uncached buffer.
When a system reset occurs, valuable data may still be remaining
in the cache (e.g. last printks) and this data will probably
be lost, giving an incomplete picture of what the system was last
doing. By logging useful information to this uncached region
(e.g. readl/writel, last printk), a better picture of what the
system was last doing can be obtained.

Dummy platform device is created to use dma api to allocate
uncached memory.

We add an additional property called rtb-size in ramoops device
tree node for pstore RTB buffer size and use the same in the
trace module. DT documentation has been modified to include this.

Information logged in this buffer include log type(readl/writel),
timestamp, extra data from the caller, caller ip and name.
This can be extented as needed to include more options(e.g. cpu)
for better debugging.

Also RTB panic notifier with high priority is used to make sure that RTB
is disabled right after a kernel panic so that log buffer could be
prevented from being flooded with some I/O operations.

This is based on RTB driver in CAF. Link below:
  * https://source.codeaurora.org/quic/la/kernel/msm-4.9
Modified to support pstore for viewing traces.

Signed-off-by: Sai Prakash Ranjan 
---
 .../bindings/reserved-memory/ramoops.txt  |   7 +-
 include/linux/rtb.h   |  24 +++
 kernel/trace/Kconfig  |   7 +
 kernel/trace/Makefile |   2 +
 kernel/trace/trace_rtb.c  | 160 ++
 5 files changed, 198 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/rtb.h
 create mode 100644 kernel/trace/trace_rtb.c

diff --git a/Documentation/devicetree/bindings/reserved-memory/ramoops.txt 
b/Documentation/devicetree/bindings/reserved-memory/ramoops.txt
index 0eba562fe5c6..f99019d1119b 100644
--- a/Documentation/devicetree/bindings/reserved-memory/ramoops.txt
+++ b/Documentation/devicetree/bindings/reserved-memory/ramoops.txt
@@ -14,8 +14,8 @@ Any remaining space will be used for a circular buffer of 
oops and panic
 records.  These records have a configurable size, with a size of 0 indicating
 that they should be disabled.
 
-At least one of "record-size", "console-size", "ftrace-size", or "pmsg-size"
-must be set non-zero, but are otherwise optional as listed below.
+At least one of "record-size", "console-size", "ftrace-size", "pmsg-size" or
+"rtb-size" must be set non-zero, but are otherwise optional as listed below.
 
 
 Required properties:
@@ -42,6 +42,9 @@ Optional properties:
 - pmsg-size: size in bytes of log buffer reserved for userspace messages
   (defaults to 0: disabled)
 
+- rtb-size: size in bytes of log buffer reserved for RTB buffer traces.
+  (defaults to 0: disabled)
+
 - unbuffered: if present, use unbuffered mappings to map the reserved region
   (defaults to buffered mappings)
 
diff --git a/include/linux/rtb.h b/include/linux/rtb.h
new file mode 100644
index ..a969bd020466
--- /dev/null
+++ b/include/linux/rtb.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _RTB_H
+#define _RTB_H
+
+struct rtb_layout {
+   const char *log_type;
+   u32 idx;
+   u64 caller;
+   u64 data;
+   u64 timestamp;
+} __attribute__ ((__packed__));
+
+#if defined(CONFIG_RTB)
+void uncached_logk(const char *log_type, void *data);
+int rtb_init(void);
+void rtb_exit(void);
+#else
+static inline void uncached_logk(const char *log_type,
+   void *data) { }
+static inline int rtb_init(void) { return 0; }
+static inline void rtb_exit(void) { }
+#endif
+
+#endif /* _RTB_H */
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index dcc0166d1997..9bbf7d1f60aa 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -722,6 +722,13 @@ config TRACING_EVENTS_GPIO
help
  Enable tracing events for gpio subsystem
 
+config RTB
+   bool "Register Trace Buffer"
+   help
+ Add support for logging different events to a small uncached
+ region. This is designed to aid in debugging reset cases where the
+ caches may not be flushed before the target resets.
+
 endif # FTRACE
 
 endif # TRACING_SUPPORT
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index e2538c7638d4..e27b2119abce 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -72,4 +72,6 @@ obj-$(CONFIG_UPROBE_EVENTS) += trace_uprobe.o
 
 obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o
 
+obj-$(CONFIG_RTB) += trace_rtb.o
+
 libftrace-y := ftrace.o
diff --git a/kernel/trace/trace_rtb.c b/kernel/trace/trace_rtb.c
new file mode 100644
index ..e8c24db71a2d
--- /dev/null
+++ b/kernel/trace/trace_rtb.c
@@ -0,0 +1,160 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018 The Linux Foundation. All rights r

[RFC PATCH 2/3] pstore: Add register readl/writel tracing support

2018-08-03 Thread Sai Prakash Ranjan
readl/writel are typically used for reading from memory
mapped registers, which can cause hangs if accessed
unclocked. Tracing these events can help in debugging
various issues faced during initial development.

We log this trace information in persistent ram buffer which
can be viewed after reset.

We use pstore_rtb_call() to write the RTB log to pstore.
RTB buffer size is taken from ramoops dt node with additional
property called rtb-size.

For reading the trace after mounting pstore, rtb-ramoops entry
can be seen in /sys/fs/pstore/ as in below sample output.

Sample output of tracing register reads/writes in drivers:

 # mount -t pstore pstore /sys/fs/pstore
 # tail /sys/fs/pstore/rtb-ramoops-0
 [LOGK_READL  ] ts:36468476204  data:0800d0fc
gic_check_gicv2+0x58/0x60
 [LOGK_WRITEL ] ts:36468477715  data:0800d000
gic_cpu_if_up+0xc4/0x110
 [LOGK_READL  ] ts:36468478548  data:0800d000
gic_cpu_if_up+0xf0/0x110
 [LOGK_WRITEL ] ts:36468480319  data:0800d000
gic_cpu_if_up+0xc4/0x110
 [LOGK_READL  ] ts:36468481048  data:0800d00c
gic_handle_irq+0xac/0x128
 [LOGK_WRITEL ] ts:36468482923  data:0800d010
gic_handle_irq+0x124/0x128
 [LOGK_READL  ] ts:36468483184  data:0800d00c
gic_handle_irq+0xac/0x128
 [LOGK_WRITEL ] ts:36468485215  data:0800d010
gic_handle_irq+0x124/0x128
 [LOGK_READL  ] ts:36468486309  data:0800d00c
gic_handle_irq+0xac/0x128
 [LOGK_WRITEL ] ts:36468488236  data:0800d010
gic_handle_irq+0x124/0x128

Output has below 5 fields:

 * Log type, Timestamp, Data from caller which is the address of
   read/write, Caller ip and Caller name.

Signed-off-by: Sai Prakash Ranjan 
---
 fs/pstore/Kconfig  | 12 +++
 fs/pstore/Makefile |  1 +
 fs/pstore/inode.c  | 71 +-
 fs/pstore/internal.h   |  8 +
 fs/pstore/platform.c   |  4 +++
 fs/pstore/ram.c| 42 --
 fs/pstore/rtb.c| 45 
 include/linux/pstore.h |  2 ++
 include/linux/pstore_ram.h |  1 +
 include/linux/rtb.h|  7 
 kernel/trace/trace_rtb.c   |  3 ++
 11 files changed, 193 insertions(+), 3 deletions(-)
 create mode 100644 fs/pstore/rtb.c

diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index 09c19ef91526..395c1ee55de0 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -113,6 +113,18 @@ config PSTORE_PMSG
 
  If unsure, say N.
 
+config PSTORE_RTB
+   bool "Log register operations like read/write"
+   depends on PSTORE && PSTORE!=m
+   depends on RTB
+   help
+ When this option is enabled, rtb driver will log all register
+ reads/writes into a persistent ram buffer that can be decoded
+ and dumped after reboot through pstore filesystem. It can be used
+ to debug readl/writel access.
+
+ If unsure, say N.
+
 config PSTORE_FTRACE
bool "Persistent function tracer"
depends on PSTORE
diff --git a/fs/pstore/Makefile b/fs/pstore/Makefile
index 967b5891f325..c772c9420f57 100644
--- a/fs/pstore/Makefile
+++ b/fs/pstore/Makefile
@@ -9,6 +9,7 @@ pstore-objs += inode.o platform.o
 pstore-$(CONFIG_PSTORE_FTRACE) += ftrace.o
 
 pstore-$(CONFIG_PSTORE_PMSG)   += pmsg.o
+pstore-$(CONFIG_PSTORE_RTB)+= rtb.o
 
 ramoops-objs += ram.o ram_core.o
 obj-$(CONFIG_PSTORE_RAM)   += ramoops.o
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 5fcb845b9fec..469e65ec3f07 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -57,6 +58,7 @@ struct pstore_ftrace_seq_data {
 };
 
 #define REC_SIZE sizeof(struct pstore_ftrace_record)
+#define REC_SIZE_RTB sizeof(struct rtb_layout)
 
 static void free_pstore_private(struct pstore_private *private)
 {
@@ -131,13 +133,73 @@ static const struct seq_operations pstore_ftrace_seq_ops 
= {
.show   = pstore_ftrace_seq_show,
 };
 
+static void *pstore_rtb_seq_start(struct seq_file *s, loff_t *pos)
+{
+   struct pstore_private *ps = s->private;
+   struct pstore_ftrace_seq_data *rdata;
+
+   rdata = kzalloc(sizeof(*rdata), GFP_KERNEL);
+   if (!rdata)
+   return NULL;
+
+   rdata->off = ps->total_size % REC_SIZE_RTB;
+   rdata->off += *pos * REC_SIZE_RTB;
+   if (rdata->off + REC_SIZE_RTB > ps->total_size) {
+   kfree(rdata);
+   return NULL;
+   }
+
+   return rdata;
+}
+
+static void pstore_rtb_seq_stop(struct seq_file *s, void *v)
+{
+   kfree(v);
+}
+
+static void *pstore_rtb_seq_next(struct seq_file *s, void *v, loff_t *pos)
+{
+   struct pstore_private *ps = s->private;
+   struct pstore_ftrace_seq_data *rdata = v;
+
+   rdata->off += REC_SIZE_RTB;
+   if (rdata->off + REC_SIZE_RTB > ps->total_size)
+

[RFC PATCH 3/3] dynamic_debug: Add support for dynamic register trace

2018-08-03 Thread Sai Prakash Ranjan
Introduce dynamic debug filtering mechanism to register
tracing as dynamic_rtb() which will reduce a lot of
overhead otherwise of tracing all the register reads/writes
in all files.

Now we can just specify the file name or any wildcard pattern
as any other dynamic debug facility in bootargs and dynamic rtb
will just trace them and the output can be seen in pstore.

TODO: Now we use same 'p' flag but will add a separate flag for register trace
later.

Example for tracing all register reads/writes in drivers/soc/qcom/* below:

  # dyndbg="file drivers/soc/qcom/* +p" in bootargs
  # reboot -f
  # mount -t pstore pstore /sys/fs/pstore
  # cat /sys/fs/pstore/rtb-ramoops-0
[LOGK_WRITEL ] ts:1373030419  data:0d5065a4
qcom_smsm_probe+0x51c/0x668
[LOGK_WRITEL ] ts:1373360576  data:0d506608
qcom_smsm_probe+0x51c/0x668

Also we add uncached_logk api to readl/writel definitions for arm64
as of now. This can be extended to arm as well later for tracing.

Signed-off-by: Sai Prakash Ranjan 
---
 arch/arm64/include/asm/io.h   | 93 +++
 include/linux/dynamic_debug.h | 10 
 kernel/trace/Kconfig  |  1 +
 3 files changed, 104 insertions(+)

diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 35b2e50f17fb..e5f68b1b00a0 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -22,6 +22,7 @@
 #ifdef __KERNEL__
 
 #include 
+#include 
 
 #include 
 #include 
@@ -36,6 +37,7 @@
 /*
  * Generic IO read/write.  These perform native-endian accesses.
  */
+#if !defined(CONFIG_RTB)
 #define __raw_writeb __raw_writeb
 static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
 {
@@ -104,6 +106,97 @@ static inline u64 __raw_readq(const volatile void __iomem 
*addr)
 : "=r" (val) : "r" (addr));
return val;
 }
+#else
+static inline void __raw_writeb_log(u8 val, volatile void __iomem *addr)
+{
+   asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
+}
+
+static inline void __raw_writew_log(u16 val, volatile void __iomem *addr)
+{
+   asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
+}
+
+static inline void __raw_writel_log(u32 val, volatile void __iomem *addr)
+{
+   asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
+}
+
+static inline void __raw_writeq_log(u64 val, volatile void __iomem *addr)
+{
+   asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
+}
+
+static inline u8 __raw_readb_log(const volatile void __iomem *addr)
+{
+   u8 val;
+
+   asm volatile(ALTERNATIVE("ldrb %w0, [%1]",
+"ldarb %w0, [%1]",
+ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE)
+: "=r" (val) : "r" (addr));
+   return val;
+}
+
+static inline u16 __raw_readw_log(const volatile void __iomem *addr)
+{
+   u16 val;
+
+   asm volatile(ALTERNATIVE("ldrh %w0, [%1]",
+"ldarh %w0, [%1]",
+ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE)
+: "=r" (val) : "r" (addr));
+   return val;
+}
+
+static inline u32 __raw_readl_log(const volatile void __iomem *addr)
+{
+   u32 val;
+
+   asm volatile(ALTERNATIVE("ldr %w0, [%1]",
+"ldar %w0, [%1]",
+ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE)
+: "=r" (val) : "r" (addr));
+   return val;
+}
+
+static inline u64 __raw_readq_log(const volatile void __iomem *addr)
+{
+   u64 val;
+
+   asm volatile(ALTERNATIVE("ldr %0, [%1]",
+"ldar %0, [%1]",
+ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE)
+: "=r" (val) : "r" (addr));
+   return val;
+}
+
+#define __raw_write_logged(v, a, _t) ({\
+   volatile void __iomem *_a = (a);\
+   void *_addr = (void __force *)(_a); \
+   dynamic_rtb("LOGK_WRITEL", _addr);  \
+   __raw_write##_t##_log((v), _a); \
+   })
+
+#define __raw_writeb(v, a) __raw_write_logged((v), a, b)
+#define __raw_writew(v, a) __raw_write_logged((v), a, w)
+#define __raw_writel(v, a) __raw_write_logged((v), a, l)
+#define __raw_writeq(v, a) __raw_write_logged((v), a, q)
+
+#define __raw_read_logged(a, _l, _t)({ \
+   _t __a; \
+   const volatile void __iomem *_a = (const volatile void __iomem *)(a);\
+   void *_addr = (void __force *)(_a); \
+   dynamic_rtb(&q

Re: [RFC PATCH v2 2/3] pstore: Add register read/write{b,w,l,q} tracing support

2018-08-28 Thread Sai Prakash Ranjan

On 8/27/2018 9:45 PM, Steven Rostedt wrote:

On Sat, 25 Aug 2018 12:54:07 +0530
Sai Prakash Ranjan  wrote:



Ftrace does not trace __raw{read,write}{b,l,w,q}() functions. I am not
sure why and how it is filtered out because I do not see any notrace
flag in those functions, maybe that whole directory is filtered out.
So adding this functionality to ftrace would mean removing the notrace
for these functions i.e., something like using
__raw{read,write}{b,l,w,q}() as the available filter functions. Also
pstore ftrace does not filter functions to trace I suppose?


It's not traced because it is inlined. Simply make the __raw_read
function a normal function and it will be traced. And then you could
use ftrace to read the function.

If this has to be per arch, you can register your callback with the
REGS flags, and pt_regs will be passed to your callback function you
attached to __raw_read*() as if you inserted a break point at that
location, and you can get any reg data you want there.




Thank you very much for the information Steven. Ok so we can get 
function parameters with pt_regs.




Coming to the reason as to why it would be good to keep this separate
from ftrace would be:
* Ftrace can get ip and parent ip, but suppose we need extra data (field
data) as in above sample output we would not be able to get through ftrace.


As mentioned above, you can get regs (and ftrace is being expanded now
to get parameters of functions).


You mean there is another way to get parameters other than regs?



* Although this patch is for tracing register read/write, we can easily
add more functionality since we have dynamic_rtb api which can be hooked
to functions and start tracing events(IRQ, Context ID) something similar
to tracepoints.
Initially thought of having tracepoints for logging register read/write
but I do not know if we can export tracepoint data to pstore since the
main usecase is to debug unknown resets and hangs.


I don't know why not? We have read/write tracepoints for
read/write_msr() calls in x86.

Anything can add a hook to the callback of the tracepoints, and use
that infrastructure instead of creating yet another dynamic code
modification infrastructure.

Thanks for pointing out to read/write_msr, I checked it and was able to 
implement something similar for arm64. But still can we export 
tracepoint data to pstore because we need to debug reset cases and for 
that pstore is of real importance?. If so then it would be great to have 
various events logged into pstore which can be a lot of help for debugging.


Also with the current dynamic filtering of read/write(PATCH 3/3), it is 
a lot easier to filter register read/write since we use dynamic debug 
framework which provides file, function and line level filtering 
capacity. Maybe if we can add something like this to trace events it 
would be better?


- Sai Prakash


Re: [RFC PATCH v2 2/3] pstore: Add register read/write{b,w,l,q} tracing support

2018-08-28 Thread Sai Prakash Ranjan

On 8/28/2018 9:32 PM, Steven Rostedt wrote:

On Tue, 28 Aug 2018 18:47:33 +0530
Sai Prakash Ranjan  wrote:


On 8/27/2018 9:45 PM, Steven Rostedt wrote:

On Sat, 25 Aug 2018 12:54:07 +0530
Sai Prakash Ranjan  wrote:

   

Ftrace does not trace __raw{read,write}{b,l,w,q}() functions. I am not
sure why and how it is filtered out because I do not see any notrace
flag in those functions, maybe that whole directory is filtered out.
So adding this functionality to ftrace would mean removing the notrace
for these functions i.e., something like using
__raw{read,write}{b,l,w,q}() as the available filter functions. Also
pstore ftrace does not filter functions to trace I suppose?


It's not traced because it is inlined. Simply make the __raw_read
function a normal function and it will be traced. And then you could
use ftrace to read the function.

If this has to be per arch, you can register your callback with the
REGS flags, and pt_regs will be passed to your callback function you
attached to __raw_read*() as if you inserted a break point at that
location, and you can get any reg data you want there.

  


Thank you very much for the information Steven. Ok so we can get
function parameters with pt_regs.


Yes.





Coming to the reason as to why it would be good to keep this separate
from ftrace would be:
* Ftrace can get ip and parent ip, but suppose we need extra data (field
data) as in above sample output we would not be able to get through ftrace.


As mentioned above, you can get regs (and ftrace is being expanded now
to get parameters of functions).
   

You mean there is another way to get parameters other than regs?


No, but you could register a callback function to be called when a
function is hit, and the pt_regs are passed to it. We are working on
getting parameters from the pt_regs (see this patch:
  http://lkml.kernel.org/r/152465885737.26224.2822487520472783854.stgit@devbox)


Cool, thanks for the link.




* Although this patch is for tracing register read/write, we can easily
add more functionality since we have dynamic_rtb api which can be hooked
to functions and start tracing events(IRQ, Context ID) something similar
to tracepoints.
Initially thought of having tracepoints for logging register read/write
but I do not know if we can export tracepoint data to pstore since the
main usecase is to debug unknown resets and hangs.


I don't know why not? We have read/write tracepoints for
read/write_msr() calls in x86.

Anything can add a hook to the callback of the tracepoints, and use
that infrastructure instead of creating yet another dynamic code
modification infrastructure.
   

Thanks for pointing out to read/write_msr, I checked it and was able to
implement something similar for arm64. But still can we export
tracepoint data to pstore because we need to debug reset cases and for
that pstore is of real importance?. If so then it would be great to have
various events logged into pstore which can be a lot of help for debugging.

Also with the current dynamic filtering of read/write(PATCH 3/3), it is
a lot easier to filter register read/write since we use dynamic debug
framework which provides file, function and line level filtering
capacity. Maybe if we can add something like this to trace events it
would be better?


I would recommend using the tracepoint infrastructure. Note,
tracepoints and trace events are two different things. Trace events use
tracepoints, and you use trace events to create tracepoints, thus they
are tightly coupled. But once a tracepoint exists, anything can connect
to them without needing to use the trace event.

Let's look at the read_msr trace event. Because it is in a header, to
avoid "include hell" we open code some of it:

static inline unsigned long long native_read_msr(unsigned int msr)
{
unsigned long long val;

val = __rdmsr(msr);

if (msr_tracepoint_active(__tracepoint_read_msr))
do_trace_read_msr(msr, val, 0);

return val;
}

Where:

#ifdef CONFIG_TRACEPOINTS
#define msr_tracepoint_active(t) static_key_false(&(t).key)
#else
#define msr_tracepoint_active(t) false
#endif

We have to open code the access to the tracepoint.key because msr.h is
used in a lot of critical headers, we couldn't use the normal
tracepoint.h header here.

The "static_key_false()" is a jump label that is just a nop. When the
static_key is enabled, the nop is converted to a static "jmp" to the
code that calls "do_trace_read_msr()". This is a function call to a
function defined in msr.c (where we can do proper includes), and all
that does is call the tracepoint "trace_read_msr()", which is also a
static key that, when enabled, will iterate over a list of functions it
should call with the defined parameters (msr, val, failed).

When defining the trace event for "read_msr", it creates the tracepoint
"trace_read_msr()" and we place it in this do_trace_read_m

Re: [RFC PATCH 3/3] dynamic_debug: Add support for dynamic register trace

2018-08-08 Thread Sai Prakash Ranjan

On 8/7/2018 10:27 PM, Will Deacon wrote:

On Fri, Aug 03, 2018 at 07:58:44PM +0530, Sai Prakash Ranjan wrote:

Introduce dynamic debug filtering mechanism to register
tracing as dynamic_rtb() which will reduce a lot of
overhead otherwise of tracing all the register reads/writes
in all files.

Now we can just specify the file name or any wildcard pattern
as any other dynamic debug facility in bootargs and dynamic rtb
will just trace them and the output can be seen in pstore.

TODO: Now we use same 'p' flag but will add a separate flag for register trace
later.

Example for tracing all register reads/writes in drivers/soc/qcom/* below:

   # dyndbg="file drivers/soc/qcom/* +p" in bootargs
   # reboot -f
   # mount -t pstore pstore /sys/fs/pstore
   # cat /sys/fs/pstore/rtb-ramoops-0
 [LOGK_WRITEL ] ts:1373030419  data:0d5065a4
qcom_smsm_probe+0x51c/0x668
 [LOGK_WRITEL ] ts:1373360576  data:0d506608
qcom_smsm_probe+0x51c/0x668

Also we add uncached_logk api to readl/writel definitions for arm64
as of now. This can be extended to arm as well later for tracing.

Signed-off-by: Sai Prakash Ranjan 
---
  arch/arm64/include/asm/io.h   | 93 +++


Putting all of this in the arch code, which basically duplicates everything,
feels very wrong to me. Perhaps take a look at the ongoing work for
instrumenting the atomics and take some inspiration from there?

Ideally, the architecture just needs to provide the low-level primivites
(which it already does) and the core can generate instruments versions if
required.



Hi Will,

Thanks for the review. Will look at instrumented atomics implementation 
and get back. Let me know if anything else can be improved.


 - Sai Prakash


[RFC PATCH v2 1/3] tracing: Add support for logging data to uncached buffer

2018-08-24 Thread Sai Prakash Ranjan
Add RTB trace support to write data to a small uncached buffer.
When a system reset occurs, valuable data may still be remaining
in the cache (e.g. last printks) and this data will probably
be lost, giving an incomplete picture of what the system was last
doing. By logging useful information to this uncached region
(e.g. read/write{b,w,l,q}, last printk), a better picture of what the
system was last doing can be obtained.

Dummy platform device is created to use dma api to allocate
uncached memory. Also initialize dma_mask and coherent_dma_mask after
commit 4d8bde883bfb ("OF: Don't set default coherent DMA mask").

We add an additional property called rtb-size in ramoops device
tree node for pstore RTB buffer size and use the same in the
trace module. DT documentation has been modified to include this.

Information logged in this buffer include log type(read/write{b,w,l,q}),
timestamp, extra data from the caller, caller ip and name.
This can be extented as needed to include more options(e.g. cpu)
for better debugging.

Also RTB panic notifier with high priority is used to make sure that RTB
is disabled right after a kernel panic so that log buffer could be
prevented from being flooded with some I/O operations.

This is based on RTB driver in CAF. Link below:
  * https://source.codeaurora.org/quic/la/kernel/msm-4.9
Modified to support pstore for viewing traces.

Signed-off-by: Sai Prakash Ranjan 
---
 .../bindings/reserved-memory/ramoops.txt  |   7 +-
 include/linux/rtb.h   |  24 +++
 kernel/trace/Kconfig  |   7 +
 kernel/trace/Makefile |   2 +
 kernel/trace/trace_rtb.c  | 143 ++
 5 files changed, 181 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/rtb.h
 create mode 100644 kernel/trace/trace_rtb.c

diff --git a/Documentation/devicetree/bindings/reserved-memory/ramoops.txt 
b/Documentation/devicetree/bindings/reserved-memory/ramoops.txt
index 0eba562fe5c6..f99019d1119b 100644
--- a/Documentation/devicetree/bindings/reserved-memory/ramoops.txt
+++ b/Documentation/devicetree/bindings/reserved-memory/ramoops.txt
@@ -14,8 +14,8 @@ Any remaining space will be used for a circular buffer of 
oops and panic
 records.  These records have a configurable size, with a size of 0 indicating
 that they should be disabled.
 
-At least one of "record-size", "console-size", "ftrace-size", or "pmsg-size"
-must be set non-zero, but are otherwise optional as listed below.
+At least one of "record-size", "console-size", "ftrace-size", "pmsg-size" or
+"rtb-size" must be set non-zero, but are otherwise optional as listed below.
 
 
 Required properties:
@@ -42,6 +42,9 @@ Optional properties:
 - pmsg-size: size in bytes of log buffer reserved for userspace messages
   (defaults to 0: disabled)
 
+- rtb-size: size in bytes of log buffer reserved for RTB buffer traces.
+  (defaults to 0: disabled)
+
 - unbuffered: if present, use unbuffered mappings to map the reserved region
   (defaults to buffered mappings)
 
diff --git a/include/linux/rtb.h b/include/linux/rtb.h
new file mode 100644
index ..a969bd020466
--- /dev/null
+++ b/include/linux/rtb.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _RTB_H
+#define _RTB_H
+
+struct rtb_layout {
+   const char *log_type;
+   u32 idx;
+   u64 caller;
+   u64 data;
+   u64 timestamp;
+} __attribute__ ((__packed__));
+
+#if defined(CONFIG_RTB)
+void uncached_logk(const char *log_type, void *data);
+int rtb_init(void);
+void rtb_exit(void);
+#else
+static inline void uncached_logk(const char *log_type,
+   void *data) { }
+static inline int rtb_init(void) { return 0; }
+static inline void rtb_exit(void) { }
+#endif
+
+#endif /* _RTB_H */
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index c042a455afc6..fd46c7e0016f 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -774,6 +774,13 @@ config TRACING_EVENTS_GPIO
help
  Enable tracing events for gpio subsystem
 
+config RTB
+   bool "Register Trace Buffer"
+   help
+ Add support for logging different events to a small uncached
+ region. This is designed to aid in debugging reset cases where the
+ caches may not be flushed before the target resets.
+
 endif # FTRACE
 
 endif # TRACING_SUPPORT
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 98d53b39a8ee..0b8167124354 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -78,4 +78,6 @@ obj-$(CONFIG_UPROBE_EVENTS) += trace_uprobe.o
 
 obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o
 
+obj-$(CONFIG_RTB) += trace_rtb.o
+
 libftrace-y := ftrace.o
diff --git a/kernel/trace/trace_rtb.c b/kernel/trace/trace_rtb.c
new file mode 100644
index ..3e0a85e7b504
--- /dev/null
+

[RFC PATCH v2 0/3] Register read/write tracing with dynamic debug and pstore

2018-08-24 Thread Sai Prakash Ranjan
Hi,

This patch series adds a new tracing facility for register reads and writes 
called
Register Trace Buffer(RTB).

We also add pstore support through which we can save all register read/write 
logs into a
persistent ram buffer that can be dumped after reboot.

It can be used to determine from where register was read/written before 
unclocked
access or some kind of bus hang or an unexpected reset caused by some buggy 
driver
which happens a lot during initial development stages.

In addition to this, we provide dynamic debug support to filter out unwanted 
logs
and limit trace to only specific files or directories since there can be aweful
lot of register events and we will be interested only in specific drivers or 
subsystems
which we will be working on. Last few RTB entries will give us the hint for 
debugging.
With dynamic debug, we are also reducing the overhead of tracing considerably.

Also as a bonus, this tracing can be extended to include IRQ, printk, context 
switch
and lot other things with proper hooks. It can be very helpful for real case 
debug scenarios.

Below is a simple example of identifying cause for bus hang in qcom mdp tested 
on db410c.
This hang was intentionally introduced just to show the usecase of RTB.
The module used can be found here: 
https://github.com/saiprakash-ranjan/Bus-Hang which does
an unclocked access and will reset db410c and later logs can be viewed through 
pstore.

Note: I just copied bus_hang.c to drivers/soc/qcom and built it.

1) Set bootargs with dyndbg parameter as below:

   # dyndbg="file drivers/soc/qcom/* +p"

2) Bus hang by reading below debugfs entry with bus_hang module.

   # cat /sys/kernel/debug/hang/bus_hang

3) After restart, we can find the cause in last entry i.e. 
(bus_hang_mdp+0x98/0xb0)

   # cat /sys/fs/pstore/rtb-ramoops-0
   [LOGK_WRITE] ts:1373101930  data:0cd065a4
qcom_smsm_probe+0x51c/0x668
   [LOGK_WRITE] ts:1373311878  data:0cd06608
qcom_smsm_probe+0x51c/0x668
   [LOGK_READ ] ts:18177142294  data:0ab85040   
bus_hang_mdp+0x98/0xb0

4) Offending register access found as below:

   # (gdb)
   # (gdb) list *(bus_hang_mdp+0x98)
   # 0x0867cdc8 is in bus_hang_mdp (drivers/soc/qcom/bus_hang.c:10).
   # 5   static int bus_hang_mdp(void *data, u64 *val)
   # 6   {
   # 7   void *p = ioremap(0x01a01000, SZ_4K);
   # 8   unsigned int a;
   # 9
   # 10  a = __raw_readl((void *)((unsigned long)p + 0x40));  <
   # 11
   # 12  *val = a;
   # 13
   # 14  return 0;
   # (gdb)

There will be a lot more real usecases where RTB can be used. Maybe we can test 
on other boards as well.

This trace module is based on RTB driver in CAF.
Link: 
https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/kernel/trace/msm_rtb.c

Patchwise one line description is given below:

Patch 1 provides the api called uncached_logk which is then called 
within dynamic_rtb for logging register accesses, i.e. (read/write{b,w,l,q})

Patch 2 adds the pstore support for displaying the logs after reset.

Patch 3 adds dynamic debug support to filter the register read/write{b,w,l,q} 
access.
Also this patch adds asm-generic/io-instrumented.h file for keeping 
instrumentation
away from arch code as suggested by Will Deacon.

v2:
 * Addressed Will's comment to keep instrumentation out of arch code and also
   remove duplicate code
 * Addressed Steven's comments regarding code cleanup
 * Fixed commit description to be more specific about register accesses i.e.,
   use read/write{b,l,w,q} instead of readl/writel since we will be tracing all

Sai Prakash Ranjan (3):
  tracing: Add support for logging data to uncached buffer
  pstore: Add register read/write{b,w,l,q} tracing support
  dynamic_debug: Add support for dynamic register trace

 .../bindings/reserved-memory/ramoops.txt  |   7 +-
 arch/arm64/include/asm/io.h   |  26 ++--
 fs/pstore/Kconfig |  12 ++
 fs/pstore/Makefile|   1 +
 fs/pstore/inode.c |  71 -
 fs/pstore/internal.h  |   8 +
 fs/pstore/platform.c  |   4 +
 fs/pstore/ram.c   |  42 -
 fs/pstore/rtb.c   |  45 ++
 include/asm-generic/io-instrumented.h |  32 
 include/linux/dynamic_debug.h |  13 ++
 include/linux/pstore.h|   2 +
 include/linux/pstore_ram.h|   1 +
 include/linux/rtb.h   |  31 
 kernel/trace/Kconfig  |   8 +
 kernel/trace/Makefile |   2 +
 kernel/trace/trace_rtb.c  | 146 ++
 17 files changed, 430 insertions(+), 21 deletions(-)
 create mode 100644 fs/pstore/rtb.c
 create mode 100644 include/asm-generic/io-instrumented.h
 cre

[RFC PATCH v2 2/3] pstore: Add register read/write{b,w,l,q} tracing support

2018-08-24 Thread Sai Prakash Ranjan
read/write{b,w,l,q} are typically used for reading from memory
mapped registers, which can cause hangs if accessed
unclocked. Tracing these events can help in debugging
various issues faced during initial development.

We log this trace information in persistent ram buffer which
can be viewed after reset.

We use pstore_rtb_call() to write the RTB log to pstore.
RTB buffer size is taken from ramoops dt node with additional
property called rtb-size.

For reading the trace after mounting pstore, rtb-ramoops entry
can be seen in /sys/fs/pstore/ as in below sample output.

Sample output of tracing register reads/writes in drivers:

 # mount -t pstore pstore /sys/fs/pstore
 # tail /sys/fs/pstore/rtb-ramoops-0
 [LOGK_READ ] ts:36468476204  data:0800d0fc
gic_check_gicv2+0x58/0x60
 [LOGK_WRITE] ts:36468477715  data:0800d000
gic_cpu_if_up+0xc4/0x110
 [LOGK_READ ] ts:36468478548  data:0800d000
gic_cpu_if_up+0xf0/0x110
 [LOGK_WRITE] ts:36468480319  data:0800d000
gic_cpu_if_up+0xc4/0x110
 [LOGK_READ ] ts:36468481048  data:0800d00c
gic_handle_irq+0xac/0x128
 [LOGK_WRITE] ts:36468482923  data:0800d010
gic_handle_irq+0x124/0x128
 [LOGK_READ ] ts:36468483184  data:0800d00c
gic_handle_irq+0xac/0x128
 [LOGK_WRITE] ts:36468485215  data:0800d010
gic_handle_irq+0x124/0x128
 [LOGK_READ ] ts:36468486309  data:0800d00c
gic_handle_irq+0xac/0x128
 [LOGK_WRITE] ts:36468488236  data:0800d010
gic_handle_irq+0x124/0x128

Output has below 5 fields:

 * Log type, Timestamp, Data from caller which is the address of
   read/write{b,w,l,q}, Caller ip and Caller name.

Signed-off-by: Sai Prakash Ranjan 
---
 fs/pstore/Kconfig  | 12 +++
 fs/pstore/Makefile |  1 +
 fs/pstore/inode.c  | 71 +-
 fs/pstore/internal.h   |  8 +
 fs/pstore/platform.c   |  4 +++
 fs/pstore/ram.c| 42 --
 fs/pstore/rtb.c| 45 
 include/linux/pstore.h |  2 ++
 include/linux/pstore_ram.h |  1 +
 include/linux/rtb.h|  7 
 kernel/trace/trace_rtb.c   |  3 ++
 11 files changed, 193 insertions(+), 3 deletions(-)
 create mode 100644 fs/pstore/rtb.c

diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index 503086f7f7c1..4f1ba1253dfd 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -124,6 +124,18 @@ config PSTORE_PMSG
 
  If unsure, say N.
 
+config PSTORE_RTB
+   bool "Log register operations like read/write"
+   depends on PSTORE && PSTORE!=m
+   depends on RTB
+   help
+ When this option is enabled, rtb driver will log all register
+ reads/writes into a persistent ram buffer that can be decoded
+ and dumped after reboot through pstore filesystem. It can be used
+ to debug readl/writel access.
+
+ If unsure, say N.
+
 config PSTORE_FTRACE
bool "Persistent function tracer"
depends on PSTORE
diff --git a/fs/pstore/Makefile b/fs/pstore/Makefile
index 967b5891f325..c772c9420f57 100644
--- a/fs/pstore/Makefile
+++ b/fs/pstore/Makefile
@@ -9,6 +9,7 @@ pstore-objs += inode.o platform.o
 pstore-$(CONFIG_PSTORE_FTRACE) += ftrace.o
 
 pstore-$(CONFIG_PSTORE_PMSG)   += pmsg.o
+pstore-$(CONFIG_PSTORE_RTB)+= rtb.o
 
 ramoops-objs += ram.o ram_core.o
 obj-$(CONFIG_PSTORE_RAM)   += ramoops.o
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 5fcb845b9fec..467fb29bfd68 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -57,6 +58,7 @@ struct pstore_ftrace_seq_data {
 };
 
 #define REC_SIZE sizeof(struct pstore_ftrace_record)
+#define REC_SIZE_RTB sizeof(struct rtb_layout)
 
 static void free_pstore_private(struct pstore_private *private)
 {
@@ -131,13 +133,73 @@ static const struct seq_operations pstore_ftrace_seq_ops 
= {
.show   = pstore_ftrace_seq_show,
 };
 
+static void *pstore_rtb_seq_start(struct seq_file *s, loff_t *pos)
+{
+   struct pstore_private *ps = s->private;
+   struct pstore_ftrace_seq_data *rdata;
+
+   rdata = kzalloc(sizeof(*rdata), GFP_KERNEL);
+   if (!rdata)
+   return NULL;
+
+   rdata->off = ps->total_size % REC_SIZE_RTB;
+   rdata->off += *pos * REC_SIZE_RTB;
+   if (rdata->off + REC_SIZE_RTB > ps->total_size) {
+   kfree(rdata);
+   return NULL;
+   }
+
+   return rdata;
+}
+
+static void pstore_rtb_seq_stop(struct seq_file *s, void *v)
+{
+   kfree(v);
+}
+
+static void *pstore_rtb_seq_next(struct seq_file *s, void *v, loff_t *pos)
+{
+   struct pstore_private *ps = s->private;
+   struct pstore_ftrace_seq_data *rdata = v;
+
+   rdata->off += REC_SIZE_RTB;
+   if (rdata->off + REC_SIZE_RTB > ps->total_size)
+   ret

[RFC PATCH v2 3/3] dynamic_debug: Add support for dynamic register trace

2018-08-24 Thread Sai Prakash Ranjan
Introduce dynamic debug filtering mechanism to register
tracing as dynamic_rtb() which will reduce a lot of
overhead otherwise of tracing all the register reads/writes
in all files.

Now we can just specify the file name or any wildcard pattern
as any other dynamic debug facility in bootargs and dynamic rtb
will just trace them and the output can be seen in pstore.

TODO: Now we use same 'p' flag but will add a separate flag for register trace
later.

Also add asm-generic/io-instrumented.h file for instrumentation of IO
operations such as read/write{b,w,l,q} as per Will's suggestion to not touch
arch code and let core generate instrumentation.
This can be extended to arm as well later for tracing.

Example for tracing all register reads/writes in drivers/soc/qcom/* below:

  # dyndbg="file drivers/soc/qcom/* +p" in bootargs
  # reboot -f
  # mount -t pstore pstore /sys/fs/pstore
  # cat /sys/fs/pstore/rtb-ramoops-0
[LOGK_WRITE] ts:1373030419  data:0d5065a4
qcom_smsm_probe+0x51c/0x668
[LOGK_WRITE] ts:1373360576  data:0d506608
qcom_smsm_probe+0x51c/0x668

Signed-off-by: Sai Prakash Ranjan 
---
 arch/arm64/include/asm/io.h   | 26 +-
 include/asm-generic/io-instrumented.h | 32 +++
 include/linux/dynamic_debug.h | 13 +++
 kernel/trace/Kconfig  |  1 +
 4 files changed, 56 insertions(+), 16 deletions(-)
 create mode 100644 include/asm-generic/io-instrumented.h

diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 35b2e50f17fb..aafd4b0be9f0 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -22,6 +22,7 @@
 #ifdef __KERNEL__
 
 #include 
+#include 
 
 #include 
 #include 
@@ -36,32 +37,27 @@
 /*
  * Generic IO read/write.  These perform native-endian accesses.
  */
-#define __raw_writeb __raw_writeb
-static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
+static inline void arch_raw_writeb(u8 val, volatile void __iomem *addr)
 {
asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
 }
 
-#define __raw_writew __raw_writew
-static inline void __raw_writew(u16 val, volatile void __iomem *addr)
+static inline void arch_raw_writew(u16 val, volatile void __iomem *addr)
 {
asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
 }
 
-#define __raw_writel __raw_writel
-static inline void __raw_writel(u32 val, volatile void __iomem *addr)
+static inline void arch_raw_writel(u32 val, volatile void __iomem *addr)
 {
asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
 }
 
-#define __raw_writeq __raw_writeq
-static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
+static inline void arch_raw_writeq(u64 val, volatile void __iomem *addr)
 {
asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
 }
 
-#define __raw_readb __raw_readb
-static inline u8 __raw_readb(const volatile void __iomem *addr)
+static inline u8 arch_raw_readb(const volatile void __iomem *addr)
 {
u8 val;
asm volatile(ALTERNATIVE("ldrb %w0, [%1]",
@@ -71,8 +67,7 @@ static inline u8 __raw_readb(const volatile void __iomem 
*addr)
return val;
 }
 
-#define __raw_readw __raw_readw
-static inline u16 __raw_readw(const volatile void __iomem *addr)
+static inline u16 arch_raw_readw(const volatile void __iomem *addr)
 {
u16 val;
 
@@ -83,8 +78,7 @@ static inline u16 __raw_readw(const volatile void __iomem 
*addr)
return val;
 }
 
-#define __raw_readl __raw_readl
-static inline u32 __raw_readl(const volatile void __iomem *addr)
+static inline u32 arch_raw_readl(const volatile void __iomem *addr)
 {
u32 val;
asm volatile(ALTERNATIVE("ldr %w0, [%1]",
@@ -94,8 +88,7 @@ static inline u32 __raw_readl(const volatile void __iomem 
*addr)
return val;
 }
 
-#define __raw_readq __raw_readq
-static inline u64 __raw_readq(const volatile void __iomem *addr)
+static inline u64 arch_raw_readq(const volatile void __iomem *addr)
 {
u64 val;
asm volatile(ALTERNATIVE("ldr %0, [%1]",
@@ -193,6 +186,7 @@ extern void __iomem *ioremap_cache(phys_addr_t phys_addr, 
size_t size);
 #define iowrite32be(v,p)   ({ __iowmb(); __raw_writel((__force 
__u32)cpu_to_be32(v), p); })
 #define iowrite64be(v,p)   ({ __iowmb(); __raw_writeq((__force 
__u64)cpu_to_be64(v), p); })
 
+#include 
 #include 
 
 /*
diff --git a/include/asm-generic/io-instrumented.h 
b/include/asm-generic/io-instrumented.h
new file mode 100644
index ..ce273742b98c
--- /dev/null
+++ b/include/asm-generic/io-instrumented.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_GENERIC_IO_INSTRUMENTED_H
+#define _ASM_GENERIC_IO_INSTRUMENTED_H
+
+#include 
+
+#define __raw_write(v, a, _t) ({

Re: [RFC PATCH v2 2/3] pstore: Add register read/write{b,w,l,q} tracing support

2018-08-25 Thread Sai Prakash Ranjan

On 8/24/2018 8:59 PM, Kees Cook wrote:

On Fri, Aug 24, 2018 at 7:45 AM, Sai Prakash Ranjan
 wrote:

read/write{b,w,l,q} are typically used for reading from memory
mapped registers, which can cause hangs if accessed
unclocked. Tracing these events can help in debugging
various issues faced during initial development.

We log this trace information in persistent ram buffer which
can be viewed after reset.

We use pstore_rtb_call() to write the RTB log to pstore.
RTB buffer size is taken from ramoops dt node with additional
property called rtb-size.

For reading the trace after mounting pstore, rtb-ramoops entry
can be seen in /sys/fs/pstore/ as in below sample output.

Sample output of tracing register reads/writes in drivers:

  # mount -t pstore pstore /sys/fs/pstore
  # tail /sys/fs/pstore/rtb-ramoops-0
  [LOGK_READ ] ts:36468476204  data:0800d0fc
gic_check_gicv2+0x58/0x60
  [LOGK_WRITE] ts:36468477715  data:0800d000
gic_cpu_if_up+0xc4/0x110
  [LOGK_READ ] ts:36468478548  data:0800d000
gic_cpu_if_up+0xf0/0x110
  [LOGK_WRITE] ts:36468480319  data:0800d000
gic_cpu_if_up+0xc4/0x110
  [LOGK_READ ] ts:36468481048  data:0800d00c
gic_handle_irq+0xac/0x128
  [LOGK_WRITE] ts:36468482923  data:0800d010
gic_handle_irq+0x124/0x128
  [LOGK_READ ] ts:36468483184  data:0800d00c
gic_handle_irq+0xac/0x128
  [LOGK_WRITE] ts:36468485215  data:0800d010
gic_handle_irq+0x124/0x128
  [LOGK_READ ] ts:36468486309  data:0800d00c
gic_handle_irq+0xac/0x128
  [LOGK_WRITE] ts:36468488236  data:0800d010
gic_handle_irq+0x124/0x128

Output has below 5 fields:

  * Log type, Timestamp, Data from caller which is the address of
read/write{b,w,l,q}, Caller ip and Caller name.

Signed-off-by: Sai Prakash Ranjan 


As this is a tracing-like method, could this instead be added to
ftrace? That would mean it could reuse all the ftrace tools and you'd
get pstore storage for free.



Ftrace does not trace __raw{read,write}{b,l,w,q}() functions. I am not 
sure why and how it is filtered out because I do not see any notrace 
flag in those functions, maybe that whole directory is filtered out.
So adding this functionality to ftrace would mean removing the notrace 
for these functions i.e., something like using 
__raw{read,write}{b,l,w,q}() as the available filter functions. Also 
pstore ftrace does not filter functions to trace I suppose?


Coming to the reason as to why it would be good to keep this separate 
from ftrace would be:
* Ftrace can get ip and parent ip, but suppose we need extra data (field 
data) as in above sample output we would not be able to get through ftrace.


* Although this patch is for tracing register read/write, we can easily
add more functionality since we have dynamic_rtb api which can be hooked 
to functions and start tracing events(IRQ, Context ID) something similar 
to tracepoints.
Initially thought of having tracepoints for logging register read/write 
but I do not know if we can export tracepoint data to pstore since the 
main usecase is to debug unknown resets and hangs.


* This can be something similar to mmiotrace in x86 and kept seperate 
from function tracer.


Thanks,
Sai Prakash


Re: [PATCH 3/3] soc: qcom: Add AOSS QMP genpd provider

2018-11-26 Thread Sai Prakash Ranjan

Hi Bjorn,

On 11/12/2018 1:35 PM, Bjorn Andersson wrote:

The AOSS QMP genpd provider implements control over power-related
resources related to low-power state associated with the remoteprocs in
the system as well as control over a set of clocks related to debug
hardware in the SoC.

Signed-off-by: Bjorn Andersson 
---
  drivers/soc/qcom/Kconfig   |   8 ++
  drivers/soc/qcom/Makefile  |   1 +
  drivers/soc/qcom/aoss-qmp-pd.c | 135 +
  3 files changed, 144 insertions(+)
  create mode 100644 drivers/soc/qcom/aoss-qmp-pd.c

diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index ba08fc00d7f5..e1eda3d59748 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -10,6 +10,14 @@ config QCOM_AOSS_QMP
  micro-controller in the AOSS, using QMP, to control certain resource
  that are not exposed through RPMh.
  
+config QCOM_AOSS_QMP_PD

+   tristate "Qualcomm AOSS Messaging Power Domain driver"
+   depends on QCOM_AOSS_QMP
+   help
+ This driver provides the means of controlling the AOSSs handling of
+ low-power state for resources related to the remoteproc subsystems as
+ well as controlling the debug clocks.
+
  config QCOM_COMMAND_DB
bool "Qualcomm Command DB"
depends on ARCH_QCOM || COMPILE_TEST
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index d0d7fdc94d9a..ebfa414a5b77 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -1,6 +1,7 @@
  # SPDX-License-Identifier: GPL-2.0
  CFLAGS_rpmh-rsc.o := -I$(src)
  obj-$(CONFIG_QCOM_AOSS_QMP) +=aoss-qmp.o
+obj-$(CONFIG_QCOM_AOSS_QMP_PD) += aoss-qmp-pd.o
  obj-$(CONFIG_QCOM_GENI_SE) += qcom-geni-se.o
  obj-$(CONFIG_QCOM_COMMAND_DB) += cmd-db.o
  obj-$(CONFIG_QCOM_GLINK_SSR) +=   glink_ssr.o
diff --git a/drivers/soc/qcom/aoss-qmp-pd.c b/drivers/soc/qcom/aoss-qmp-pd.c
new file mode 100644
index ..467d0db4abfa
--- /dev/null
+++ b/drivers/soc/qcom/aoss-qmp-pd.c
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018, Linaro Ltd
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+struct qmp_pd {
+   struct qmp *qmp;
+
+   struct generic_pm_domain pd;
+
+   const char *name;
+};
+
+#define to_qmp_pd_resource(res) container_of(res, struct qmp_pd, pd)
+
+struct qmp_pd_resource {
+   const char *name;
+   int (*on)(struct generic_pm_domain *domain);
+   int (*off)(struct generic_pm_domain *domain);
+};
+
+static int qmp_pd_clock_toggle(struct qmp_pd *res, bool enable)
+{
+   char buf[96];
+   size_t n;
+
+   n = snprintf(buf, sizeof(buf), "{class: clock, res: %s, val: %d}",
+res->name, !!enable);
+   return qmp_send(res->qmp, buf, n);
+}
+


I was trying to get QDSS working with these patches and found one issue
in qmp_send of qmp_pd_clock_toggle.

The third return value should be sizeof(buf) instead of n because n just
returns len as 33 and the below check in qmp send will always fail and
trigger WARN_ON's.

 if (WARN_ON(len % sizeof(u32))) {
 dev_err(qmp->dev, "message not 32-bit aligned\n");
 return -EINVAL;
 }

Also I observed that multiple "ucore will not ack channel" messages with
len being returned n instead of buf size.

One more thing is do we really require *WARN_ON and dev_err* both 
because it just spams the kernel logs, I think dev_err message is clear

enough to be able to understand the error condition.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 3/3] soc: qcom: Add AOSS QMP genpd provider

2018-11-27 Thread Sai Prakash Ranjan

Hi Bjorn,

On 11/12/2018 1:35 PM, Bjorn Andersson wrote:

The AOSS QMP genpd provider implements control over power-related
resources related to low-power state associated with the remoteprocs in
the system as well as control over a set of clocks related to debug
hardware in the SoC.

Signed-off-by: Bjorn Andersson 


One more issue is that amba bus probe fails and coresight(qdss) does not
work with these because of clocks being modeled as power-domain.

Below is the log snippet:

[4.580715] coresight-etm4x: probe of 704.etm failed with error -2
[4.588087] coresight-etm4x: probe of 714.etm failed with error -2
[4.595407] coresight-etm4x: probe of 724.etm failed with error -2
[4.602796] coresight-etm4x: probe of 734.etm failed with error -2
[4.610108] coresight-etm4x: probe of 744.etm failed with error -2
[4.617453] coresight-etm4x: probe of 754.etm failed with error -2
[4.624831] coresight-etm4x: probe of 764.etm failed with error -2
[4.632190] coresight-etm4x: probe of 774.etm failed with error -2

This is because Amba bus probe has amba_get_enable_pclk() which gets
apb_pclk and returns error if it can't get that clk.

Just for testing, I ignored amba_get_enable_pclk() in probe and
coresight seems to work fine.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-11-15 Thread Sai Prakash Ranjan

On 11/13/2018 3:14 PM, Srinivas Kandagatla wrote:

Hi Sai,



On 25/10/18 15:36, saiprakash.ran...@codeaurora.org wrote:

"If I disable dma node and LS-UART0, then I don't see any crash and
ftrace also works fine"

And one more observation is that even without ftrace cmdline, if I use
earlycon and disable dma, I face the same crash.

So basically this seems to be some kind of earlycon and dma issue and
not ftrace(I can be wrong).

So adding Srinivas for more info on this dma node.


Its Interesting that my old email conversations with SBoyd show that I 
have investigated this issue in early 2016!


My analysis so far:

This reason for such behavior is due the common iface clock
(GCC_BLSP1_AHB_CLK) across multiple drivers(serial ports, bam dma
and other low speed devices).
The code flow in DB410C is bit different, as the uart0 is first
attempted to set as console and then uart1, this ordering triggers
pm state change uart_change_pm(state, UART_PM_STATE_OFF) from serial
core while setting up uart0, this would go and disable all the
clocks for uart0.
As uart1 is not setup Yet, and earlycon is still active, any
attempts by earlycon to write to registers would trigger a system
reboot as the clock was just disabled by uart0 change_pm code.

This can even be triggered with any drivers like spi which uses same
clock I guess.

Hope it helps,

Either earlycon needs to reference the clocks or those clocks needs to
be marked always-on (but only with earlycon).



Also just for a note: apq8096-db820c.dtsi shows UART0 is disabled because
bootloader does not allow access to it. Could this also be the case 
for db410c?
No, this is not the case with DB410c. DB820c has added restrictions in 
TZ, I think new booloaders should have solved this issue.





Hi Srinivas,

Thanks a lot for pointing out the cause of crash.
I just tried setting GCC_BLSP1_AHB_CLK with flag CLK_IS_CRITICAL and the
crash disappears.

But I suppose setting CLK_IS_CRITICAL is not the solution?

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


[PATCH] boot_constraint: Add constraints for earlycon on dragonboard 410c

2018-11-16 Thread Sai Prakash Ranjan
iface clock is shared with other drivers, which may reconfigure
this before the serial driver comes up. This may lead to
crashes like the one below where GCC_BLSP1_AHB_CLK is same across
multiple drivers like bam dma.

<0>[3.164471] Internal error: synchronous external abort: 9610 [#1] 
PREEMPT SMP
<4>[3.164479] Modules linked in:
<4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 
4.19.0-rc8-8-ge033b9909fff-dirty #175
<4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
<4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO)
<4>[3.164514] pc : msm_read.isra.2+0x20/0x50
<4>[3.164520] lr : msm_read.isra.2+0x1c/0x50
<4>[3.164526] sp : 08033a50
<4>[3.164531] x29: 08033a50 x28: 09486018
<4>[3.164548] x27: 0001 x26: 7dfffe7ff070
<4>[3.164565] x25: 0034 x24: 09486000
<4>[3.164582] x23:  x22: 0978e190
<4>[3.164599] x21: 095e8228 x20: 0034
<4>[3.164616] x19: 7dfffe7ff008 x18: 
<4>[3.164632] x17:  x16: 
<4>[3.164649] x15: 094a96c8 x14: 8978e6bf
<4>[3.164666] x13: 0978e6cd x12: 0038
<4>[3.164683] x11: 094c6000 x10: 0c24
<4>[3.164699] x9 : 80003c89b400 x8 : 08033970
<4>[3.164716] x7 : 8eb04100 x6 : 000af304
<4>[3.164732] x5 : 0c40 x4 : 80003c06f000
<4>[3.164750] x3 : 80003c89b498 x2 : 
<4>[3.164766] x1 : 80003ca68000 x0 : 0800
<0>[3.164785] Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
<4>[3.164791] Call trace:
<4>[3.164797]  msm_read.isra.2+0x20/0x50
<4>[3.164804]  msm_reset_dm_count+0x44/0x80
<4>[3.164810]  __msm_console_write+0x1c8/0x1d0
<4>[3.164816]  msm_serial_early_write_dm+0x3c/0x50
<4>[3.164823]  console_unlock.part.6+0x468/0x528
<4>[3.164829]  vprintk_emit+0x210/0x218
<4>[3.164835]  vprintk_default+0x48/0x58
<4>[3.164841]  vprintk_func+0xf0/0x1c0
<4>[3.164847]  printk+0x74/0x94
<4>[3.164853]  sci_init+0x24/0x3c
<4>[3.164859]  do_one_initcall+0x54/0x248
<4>[3.164866]  kernel_init_freeable+0x210/0x378
<4>[3.164872]  kernel_init+0x18/0x118
<4>[3.164878]  ret_from_fork+0x10/0x1c
<0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260)

Link: 
https://lore.kernel.org/lkml/1cae8f10-55f5-20ce-9105-30af6f88b...@codeaurora.org/
Signed-off-by: Sai Prakash Ranjan 

---

This is purely dependent on boot constraint subsystem by Viresh.

Link: https://lore.kernel.org/lkml/cover.1519380923.git.viresh.ku...@linaro.org/
---
 drivers/soc/qcom/boot_constraint.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/soc/qcom/boot_constraint.c 
b/drivers/soc/qcom/boot_constraint.c
index ca01eb50d9a9..c4e580a118f0 100644
--- a/drivers/soc/qcom/boot_constraint.c
+++ b/drivers/soc/qcom/boot_constraint.c
@@ -49,6 +49,10 @@ static struct dev_boot_constraint_supply_info vddio_info = {
.name = "vddio"
 };
 
+static struct dev_boot_constraint_clk_info uart_iface_clk_info = {
+   .name = "iface",
+};
+
 static struct dev_boot_constraint constraints_mdss[] = {
{
.type = DEV_BOOT_CONSTRAINT_PM,
@@ -92,6 +96,13 @@ static struct dev_boot_constraint constraints_dsi[] = {
},
 };
 
+static struct dev_boot_constraint constraints_uart[] = {
+   {
+   .type = DEV_BOOT_CONSTRAINT_CLK,
+   .data = &uart_iface_clk_info,
+   },
+};
+
 static struct dev_boot_constraint_of constraints[] = {
{
.compat = "qcom,mdss",
@@ -105,6 +116,10 @@ static struct dev_boot_constraint_of constraints[] = {
.compat = "qcom,mdss-dsi-ctrl",
.constraints = constraints_dsi,
.count = ARRAY_SIZE(constraints_dsi),
+   }, {
+   .compat = "qcom,msm-uartdm-v1.4",
+   .constraints = constraints_uart,
+   .count = ARRAY_SIZE(constraints_uart),
},
 };
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-11-16 Thread Sai Prakash Ranjan

On 11/16/2018 9:09 AM, Viresh Kumar wrote:

On Thu, Nov 15, 2018 at 4:23 PM Srinivas Kandagatla
 wrote:


Yes, this is not the solution, but it proves that the hand-off between
booloaders and kernel is the issue.

In general there is wider issue with resources hand-off between
bootloader and kernel.

There has been some proposal in the past by Viresh for a new framework
called boot-constriants (https://lkml.org/lkml/2017/12/14/440) which am
not sure if its still actively looked at. But something similar should
be the way to address such issues.


It isn't dead code yet and I am waiting to gain few more use-cases
before I attempt
to convince Greg again :)

Here is the code..

git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux.git boot-constraint

--
viresh



Maybe you can take this earlycon issue as a usecase.
I have added a boot constraint for earlycon on db410c and have sent a 
patch. Whenever you repitch boot-constraint, you can add that as well :)


Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH] boot_constraint: Add constraints for earlycon on dragonboard 410c

2018-11-16 Thread Sai Prakash Ranjan

On 11/16/2018 4:23 PM, Viresh Kumar wrote:

Hi,

On 16-11-18, 16:16, Sai Prakash Ranjan wrote:

iface clock is shared with other drivers, which may reconfigure
this before the serial driver comes up. This may lead to
crashes like the one below where GCC_BLSP1_AHB_CLK is same across
multiple drivers like bam dma.

<0>[3.164471] Internal error: synchronous external abort: 9610 [#1] 
PREEMPT SMP
<4>[3.164479] Modules linked in:
<4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 
4.19.0-rc8-8-ge033b9909fff-dirty #175
<4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
<4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO)
<4>[3.164514] pc : msm_read.isra.2+0x20/0x50
<4>[3.164520] lr : msm_read.isra.2+0x1c/0x50
<4>[3.164526] sp : 08033a50
<4>[3.164531] x29: 08033a50 x28: 09486018
<4>[3.164548] x27: 0001 x26: 7dfffe7ff070
<4>[3.164565] x25: 0034 x24: 09486000
<4>[3.164582] x23:  x22: 0978e190
<4>[3.164599] x21: 095e8228 x20: 0034
<4>[3.164616] x19: 7dfffe7ff008 x18: 
<4>[3.164632] x17:  x16: 
<4>[3.164649] x15: 094a96c8 x14: 8978e6bf
<4>[3.164666] x13: 0978e6cd x12: 0038
<4>[3.164683] x11: 094c6000 x10: 0c24
<4>[3.164699] x9 : 80003c89b400 x8 : 08033970
<4>[3.164716] x7 : 8eb04100 x6 : 000af304
<4>[3.164732] x5 : 0c40 x4 : 80003c06f000
<4>[3.164750] x3 : 80003c89b498 x2 : 
<4>[3.164766] x1 : 80003ca68000 x0 : 0800
<0>[3.164785] Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
<4>[3.164791] Call trace:
<4>[3.164797]  msm_read.isra.2+0x20/0x50
<4>[3.164804]  msm_reset_dm_count+0x44/0x80
<4>[3.164810]  __msm_console_write+0x1c8/0x1d0
<4>[3.164816]  msm_serial_early_write_dm+0x3c/0x50
<4>[3.164823]  console_unlock.part.6+0x468/0x528
<4>[3.164829]  vprintk_emit+0x210/0x218
<4>[3.164835]  vprintk_default+0x48/0x58
<4>[3.164841]  vprintk_func+0xf0/0x1c0
<4>[3.164847]  printk+0x74/0x94
<4>[3.164853]  sci_init+0x24/0x3c
<4>[3.164859]  do_one_initcall+0x54/0x248
<4>[3.164866]  kernel_init_freeable+0x210/0x378
<4>[3.164872]  kernel_init+0x18/0x118
<4>[3.164878]  ret_from_fork+0x10/0x1c
<0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260)

Link: 
https://lore.kernel.org/lkml/1cae8f10-55f5-20ce-9105-30af6f88b...@codeaurora.org/
Signed-off-by: Sai Prakash Ranjan 

---

This is purely dependent on boot constraint subsystem by Viresh.

Link: https://lore.kernel.org/lkml/cover.1519380923.git.viresh.ku...@linaro.org/
---
  drivers/soc/qcom/boot_constraint.c | 15 +++
  1 file changed, 15 insertions(+)


Nice to see that the new subsystem has fixed your issue as well and
thanks for the patch. I am afraid though that this patch isn't going
to get merged anywhere as the subsystem isn't merged yet as Greg
wasn't too confident [1] about it. We are looking for more use-cases
apart from earlycon and clcd.




Hi Viresh,

Thanks for your time.

I wanted to get this patch out so that once we have enough use cases,
maybe we can add this as well.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [RFC PATCH 1/3] tracing: Add support for logging data to uncached buffer

2018-08-16 Thread Sai Prakash Ranjan

On 8/16/2018 8:29 AM, Steven Rostedt wrote:


Sorry for the late reply, I actually wrote this email over a week ago,
but never hit send. And the email was pushed back behind other
windows. :-/




Thanks for the review Steven.
And no problem on late reply, I was working on Will's comment about 
instrumentation in arch code and was about to respin a v2 patch. I have 
replied inline, let me know if any more corrections or improvements can 
be done. I would also like if Kees or someone from pstore could comment 
on patch 2.



On Fri,  3 Aug 2018 19:58:42 +0530
Sai Prakash Ranjan  wrote:


diff --git a/kernel/trace/trace_rtb.c b/kernel/trace/trace_rtb.c
new file mode 100644
index ..e8c24db71a2d
--- /dev/null
+++ b/kernel/trace/trace_rtb.c
@@ -0,0 +1,160 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018 The Linux Foundation. All rights reserved.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static struct platform_device *rtb_dev;
+static atomic_t rtb_idx;
+
+struct rtb_state {
+   struct rtb_layout *rtb;
+   phys_addr_t phys;
+   unsigned int nentries;
+   unsigned int size;
+   int enabled;
+};
+
+static struct rtb_state rtb = {
+   .enabled = 0,
+};


No need for the initialization, you could just have:

static struct rtb_state rtb;

And it will be initialized to all zeros. Or did you do that to document
that it is not enabled at boot?



I will correct it in v2. RTB will not be enabled until pstore is 
registered since we use pstore for logs. I will add a comment above the 
static declaration saying so.



+
+static int rtb_panic_notifier(struct notifier_block *this,
+   unsigned long event, void *ptr)
+{
+   rtb.enabled = 0;
+   return NOTIFY_DONE;
+}
+
+static struct notifier_block rtb_panic_blk = {
+   .notifier_call  = rtb_panic_notifier,
+   .priority = INT_MAX,
+};
+
+static void rtb_write_type(const char *log_type,
+   struct rtb_layout *start)
+{
+   start->log_type = log_type;
+}
+
+static void rtb_write_caller(u64 caller, struct rtb_layout *start)
+{
+   start->caller = caller;
+}
+
+static void rtb_write_data(u64 data, struct rtb_layout *start)
+{
+   start->data = data;
+}
+
+static void rtb_write_timestamp(struct rtb_layout *start)
+{
+   start->timestamp = sched_clock();
+}


Why have the above static functions? They are not very helpful, and
appear to be actually confusing. They are used once.



Yes you are right, will remove those.


+
+static void uncached_logk_pc_idx(const char *log_type, u64 caller,
+   u64 data, int idx)
+{
+   struct rtb_layout *start;
+
+   start = &rtb.rtb[idx & (rtb.nentries - 1)];
+
+   rtb_write_type(log_type, start);
+   rtb_write_caller(caller, start);
+   rtb_write_data(data, start);
+   rtb_write_timestamp(start);


How is the above better than:

start->log_type = log_type;
start->caller = caller;
start->data = data;
start->timestamp = sched_clock();

??



Sure, will change it to above and post v2.


+   /* Make sure data is written */
+   mb();
+}
+
+static int rtb_get_idx(void)
+{
+   int i, offset;
+
+   i = atomic_inc_return(&rtb_idx);
+   i--;
+
+   /* Check if index has wrapped around */
+   offset = (i & (rtb.nentries - 1)) -
+((i - 1) & (rtb.nentries - 1));
+   if (offset < 0) {
+   i = atomic_inc_return(&rtb_idx);
+   i--;
+   }
+
+   return i;
+}
+
+noinline void notrace uncached_logk(const char *log_type, void *data)


BTW, all files in this directory have their functions notrace by
default.



Oh I missed it. Will remove notrace.

- Sai Prakash


Re: [PATCH 3/6] tracing: Add tp_pstore cmdline to have tracepoints go to pstore

2018-10-09 Thread Sai Prakash Ranjan

On 10/9/2018 4:10 AM, Joel Fernandes wrote:

On Mon, Oct 08, 2018 at 10:36:59AM -0400, Steven Rostedt wrote:

On Mon, 8 Oct 2018 19:46:15 +0530
Sai Prakash Ranjan  wrote:


Hi Joel,

Sorry for the long delay in updating this thread.
But I just observed one weird behaviour in ftrace-ramoops when I was
trying to use binary record instead of rendering text for event-ramoops.

Even though we set the ftrace-size in device tree for ramoops, the
actual ftrace-ramoops record seems to have more records than specified size.
Is this expected or some bug?


If ftrace-ramoops is using logic similar to the ftrace ring buffer,
then yes, it will be able to store more events than specified. The
ftrace ring buffer is broken up into pages, and everything is rounded
up to the nearest page (note, the data may be smaller than a page,
because each page also contains a header).


In the pstore code, the pages are contiguous. The platform drivers for that
platform configure 'ftrace-size' which is in bytes. That is further divided
by the number of CPUs. The records from all the buffers are then merged at read 
time.

Each function trace record is of type:
struct pstore_ftrace_record {
 unsigned long ip;
 unsigned long parent_ip;
 u64 ts;
};

which should be 24 bytes.

But there is an ECC block (with ECC data and ECC header) added to each CPU in
this case of the backing store of the pstore being RAM (pstore backing stores
can be other media types too).



Thanks for this info.


Below is the ftrace-ramoops size passed in dtsi for db410c and we can
see that the ftrace record is more.

# cat /sys/module/ramoops/parameters/ftrace_size
131072


I'm assuming this too is like the ftrace ring buffer, where the size is
per cpu (and why you do a search per cpu below).


#
# cat /sys/fs/pstore/ftrace-ramoops-0 | wc -c
560888
#
# cat /sys/fs/pstore/ftrace-ramoops-0 | grep CPU:0 | wc -c
137758
#
# cat /sys/fs/pstore/ftrace-ramoops-0 | grep CPU:1 | wc -c
140560
#
# cat /sys/fs/pstore/ftrace-ramoops-0 | grep CPU:2 | wc -c
141174
#
# cat /sys/fs/pstore/ftrace-ramoops-0 | grep CPU:3 | wc -c
141396
#


That could be because you're counting text characters when you're counting.
You need to count the binary size.



Right thanks.


If you are using binary record, isn't this what you want? The
ftrace_size is the size of how much binary data is stored. When you
translate the binary into human text, the text should be bigger.


I don't see this in console or dmesg ramoops and also with the
event-ramoops which I have posted since we don't use binary record, only
ftrace-ramoops uses binary record to store trace data.

Also regarding the warning on "event_call->event.funcs->trace()" call,
I see it everytime without spinlock. Also we see output_printk using
spinlock when making this call. I could not find a way to pass event
buffer size and allocate in pstore. Steven can give some hints with this
I guess.


The spinlock warning you're talking about is this one correct?

[1.389382] WARNING: CPU: 2 PID: 2 at kernel/trace/trace_output.c:289 
trace_raw_output_prep+0x18/0xa0
[1.389416] Modules linked in:
which you reported here:
https://lkml.org/lkml/2018/9/22/319

This warning happens I think because you're trying to format the events while
the trace events are being generated. You said you got this warning when you
didn't use the spinlock. I believe the spinlocking prevents such races, but
if you didn't need to format the events into text into text in the first
place, then you wouldn't need such locking at all.

I believe ftrace doesn't have such issues because such locking is taken care
off when the trace events are formatted from the trace buffer and displayed,
but I could be wrong about that.. I'll let Steven provide more inputs about
how this warning can occur.

Yes Steven would have more insight on this warning.



As a suggestion, please always provide references to the warnings you're
referring to, such as previous LKML posts or atleast the warning message so
folks know which warning you're talking about.



Yes sure.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Crash in msm serial on dragonboard with ftrace bootargs

2018-10-16 Thread Sai Prakash Ranjan

Hi,

On dragonboard 410c, with "ftrace=function" boot args, the console 
output slows down and board resets without any backtrace as below. This 
is tested on latest kernel and seems to exist even in older kernels as well.


[2.949164] EINJ: ACPI disabled.
[3.133001] Serial: 8250/16550 dri
Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic

But with pstore enabled, able to get the below backtrace:

<4>[2.949164] EINJ: ACPI disabled.
<6>[3.133001] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
<6>[3.164097] SuperH (H)SCI(F) driver initialized
<0>[3.164471] Internal error: synchronous external abort: 9610 
[#1] PREEMPT SMP

<4>[3.164479] Modules linked in:
<4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 
4.19.0-rc8-8-ge033b9909fff-dirty #175
<4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016 
SBC (DT)

<4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO)
<4>[3.164514] pc : msm_read.isra.2+0x20/0x50
<4>[3.164520] lr : msm_read.isra.2+0x1c/0x50
<4>[3.164526] sp : 08033a50
<4>[3.164531] x29: 08033a50 x28: 09486018
<4>[3.164548] x27: 0001 x26: 7dfffe7ff070
<4>[3.164565] x25: 0034 x24: 09486000
<4>[3.164582] x23:  x22: 0978e190
<4>[3.164599] x21: 095e8228 x20: 0034
<4>[3.164616] x19: 7dfffe7ff008 x18: 
<4>[3.164632] x17:  x16: 
<4>[3.164649] x15: 094a96c8 x14: 8978e6bf
<4>[3.164666] x13: 0978e6cd x12: 0038
<4>[3.164683] x11: 094c6000 x10: 0c24
<4>[3.164699] x9 : 80003c89b400 x8 : 08033970
<4>[3.164716] x7 : 8eb04100 x6 : 000af304
<4>[3.164732] x5 : 0c40 x4 : 80003c06f000
<4>[3.164750] x3 : 80003c89b498 x2 : 
<4>[3.164766] x1 : 80003ca68000 x0 : 0800
<0>[3.164785] Process swapper/0 (pid: 1, stack limit = 
0x(ptrval))

<4>[3.164791] Call trace:
<4>[3.164797]  msm_read.isra.2+0x20/0x50
<4>[3.164804]  msm_reset_dm_count+0x44/0x80
<4>[3.164810]  __msm_console_write+0x1c8/0x1d0
<4>[3.164816]  msm_serial_early_write_dm+0x3c/0x50
<4>[3.164823]  console_unlock.part.6+0x468/0x528
<4>[3.164829]  vprintk_emit+0x210/0x218
<4>[3.164835]  vprintk_default+0x48/0x58
<4>[3.164841]  vprintk_func+0xf0/0x1c0
<4>[3.164847]  printk+0x74/0x94
<4>[3.164853]  sci_init+0x24/0x3c
<4>[3.164859]  do_one_initcall+0x54/0x248
<4>[3.164866]  kernel_init_freeable+0x210/0x378
<4>[3.164872]  kernel_init+0x18/0x118
<4>[3.164878]  ret_from_fork+0x10/0x1c
<0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260)

Seems to be issue with the msm serial driver and not ftrace.
Could someone look into it.

One more thing is for pstore dmesg-ramoops, I had to change 
late_initcall to postcore_initcall which brings the question as to why 
we changed to late_initcall?
Simple git blame shows to support crypto compress api, but is it really 
helpful? A lot of boottime issues can be caught with pstore enabled at 
postcore_initcall rather than late_initcall, this backtrace

is just one example. Is there any way we could change this?

Thanks,
Sai
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-16 Thread Sai Prakash Ranjan

On 10/16/2018 5:14 PM, Greg Kroah-Hartman wrote:

On Tue, Oct 16, 2018 at 05:08:25PM +0530, Sai Prakash Ranjan wrote:

Hi,

On dragonboard 410c, with "ftrace=function" boot args, the console output
slows down and board resets without any backtrace as below. This is tested
on latest kernel and seems to exist even in older kernels as well.

[2.949164] EINJ: ACPI disabled.
[3.133001] Serial: 8250/16550 dri
Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic

But with pstore enabled, able to get the below backtrace:

<4>[2.949164] EINJ: ACPI disabled.
<6>[3.133001] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
<6>[3.164097] SuperH (H)SCI(F) driver initialized
<0>[3.164471] Internal error: synchronous external abort: 9610 [#1]
PREEMPT SMP
<4>[3.164479] Modules linked in:
<4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted
4.19.0-rc8-8-ge033b9909fff-dirty #175
<4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC
(DT)
<4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO)
<4>[3.164514] pc : msm_read.isra.2+0x20/0x50
<4>[3.164520] lr : msm_read.isra.2+0x1c/0x50
<4>[3.164526] sp : 08033a50
<4>[3.164531] x29: 08033a50 x28: 09486018
<4>[3.164548] x27: 0001 x26: 7dfffe7ff070
<4>[3.164565] x25: 0034 x24: 09486000
<4>[3.164582] x23:  x22: 0978e190
<4>[3.164599] x21: 095e8228 x20: 0034
<4>[3.164616] x19: 7dfffe7ff008 x18: 
<4>[3.164632] x17:  x16: 
<4>[3.164649] x15: 094a96c8 x14: 8978e6bf
<4>[3.164666] x13: 0978e6cd x12: 0038
<4>[3.164683] x11: 094c6000 x10: 0c24
<4>[3.164699] x9 : 80003c89b400 x8 : 08033970
<4>[3.164716] x7 : 8eb04100 x6 : 000af304
<4>[3.164732] x5 : 0c40 x4 : 80003c06f000
<4>[3.164750] x3 : 80003c89b498 x2 : 
<4>[3.164766] x1 : 80003ca68000 x0 : 0800
<0>[3.164785] Process swapper/0 (pid: 1, stack limit =
0x(ptrval))
<4>[3.164791] Call trace:
<4>[3.164797]  msm_read.isra.2+0x20/0x50
<4>[3.164804]  msm_reset_dm_count+0x44/0x80
<4>[3.164810]  __msm_console_write+0x1c8/0x1d0
<4>[3.164816]  msm_serial_early_write_dm+0x3c/0x50
<4>[3.164823]  console_unlock.part.6+0x468/0x528
<4>[3.164829]  vprintk_emit+0x210/0x218
<4>[3.164835]  vprintk_default+0x48/0x58
<4>[3.164841]  vprintk_func+0xf0/0x1c0
<4>[3.164847]  printk+0x74/0x94
<4>[3.164853]  sci_init+0x24/0x3c
<4>[3.164859]  do_one_initcall+0x54/0x248
<4>[3.164866]  kernel_init_freeable+0x210/0x378
<4>[3.164872]  kernel_init+0x18/0x118
<4>[3.164878]  ret_from_fork+0x10/0x1c
<0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260)

Seems to be issue with the msm serial driver and not ftrace.
Could someone look into it.


As you have the hardware to reproduce this, and it has always been an
issue, you are the best placed to help resolve this.



I would definitely test if serial guys could point somewhere, not that 
the backtrace is not pointing :) but I am not aware of this serial driver.


Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-16 Thread Sai Prakash Ranjan

On 10/16/2018 8:59 PM, Steven Rostedt wrote:

On Tue, 16 Oct 2018 17:08:25 +0530
Sai Prakash Ranjan  wrote:


Hi,

On dragonboard 410c, with "ftrace=function" boot args, the console
output slows down and board resets without any backtrace as below. This
is tested on latest kernel and seems to exist even in older kernels as well.


So this only happens when ftrace=function is on the boot console.



Yes. If I do not use boot console, target does not crash.





[2.949164] EINJ: ACPI disabled.
[3.133001] Serial: 8250/16550 dri
Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic

But with pstore enabled, able to get the below backtrace:

<4>[2.949164] EINJ: ACPI disabled.
<6>[3.133001] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
<6>[3.164097] SuperH (H)SCI(F) driver initialized
<0>[3.164471] Internal error: synchronous external abort: 9610
[#1] PREEMPT SMP
<4>[3.164479] Modules linked in:
<4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted
4.19.0-rc8-8-ge033b9909fff-dirty #175
<4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016
SBC (DT)
<4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO)
<4>[3.164514] pc : msm_read.isra.2+0x20/0x50
<4>[3.164520] lr : msm_read.isra.2+0x1c/0x50
<4>[3.164526] sp : 08033a50
<4>[3.164531] x29: 08033a50 x28: 09486018
<4>[3.164548] x27: 0001 x26: 7dfffe7ff070
<4>[3.164565] x25: 0034 x24: 09486000
<4>[3.164582] x23:  x22: 0978e190
<4>[3.164599] x21: 095e8228 x20: 0034
<4>[3.164616] x19: 7dfffe7ff008 x18: 
<4>[3.164632] x17:  x16: 
<4>[3.164649] x15: 094a96c8 x14: 8978e6bf
<4>[3.164666] x13: 0978e6cd x12: 0038
<4>[3.164683] x11: 094c6000 x10: 0c24
<4>[3.164699] x9 : 80003c89b400 x8 : 08033970
<4>[3.164716] x7 : 8eb04100 x6 : 000af304
<4>[3.164732] x5 : 0c40 x4 : 80003c06f000
<4>[3.164750] x3 : 80003c89b498 x2 : 
<4>[3.164766] x1 : 80003ca68000 x0 : 0800
<0>[3.164785] Process swapper/0 (pid: 1, stack limit =
0x(ptrval))
<4>[3.164791] Call trace:
<4>[3.164797]  msm_read.isra.2+0x20/0x50
<4>[3.164804]  msm_reset_dm_count+0x44/0x80
<4>[3.164810]  __msm_console_write+0x1c8/0x1d0
<4>[3.164816]  msm_serial_early_write_dm+0x3c/0x50
<4>[3.164823]  console_unlock.part.6+0x468/0x528
<4>[3.164829]  vprintk_emit+0x210/0x218
<4>[3.164835]  vprintk_default+0x48/0x58
<4>[3.164841]  vprintk_func+0xf0/0x1c0
<4>[3.164847]  printk+0x74/0x94
<4>[3.164853]  sci_init+0x24/0x3c
<4>[3.164859]  do_one_initcall+0x54/0x248
<4>[3.164866]  kernel_init_freeable+0x210/0x378
<4>[3.164872]  kernel_init+0x18/0x118
<4>[3.164878]  ret_from_fork+0x10/0x1c
<0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260)

Seems to be issue with the msm serial driver and not ftrace.
Could someone look into it.


I'm guessing that there may have been an issue with ftrace, it tried to
print, but the printing caused a bug that rebooted the box.

Does function tracing work after boot up? That is, without the
ftrace=function, can you do:

  echo function > /sys/kernel/debug/tracing/current_tracer

without any issue?



Yes ftrace in general works without any issue. I have also tested on 
db820c and sdm845 where "ftrace=function" works fine. I am seeing this 
issue only on db410c board.




One more thing is for pstore dmesg-ramoops, I had to change
late_initcall to postcore_initcall which brings the question as to why
we changed to late_initcall?
Simple git blame shows to support crypto compress api, but is it really
helpful? A lot of boottime issues can be caught with pstore enabled at
postcore_initcall rather than late_initcall, this backtrace
is just one example. Is there any way we could change this?


Does it break if the crypto is not initialized? Perhaps add a command
line flag to have it happen earlier:



I didnt see any breakage, have been using ramoops with postcore_initcall 
for sometime now.



  ramoops=earlyinit

and add a postcore_initcall that checks if that flag is set, and if so,
it does the work then, and the late_initcall() will do nothing.

That way, you can still have unmodified kernels use pstore when it
crashes at boot up.



Sounds good.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-16 Thread Sai Prakash Ranjan

On 10/16/2018 10:27 PM, Steven Rostedt wrote:


OK, can you add to the command line:

  ftrace=function ftrace_filter=*schedule*

to see if it's a specific function that may be causing the issue (but
hopefully it's not one of the scheduling functions that caused it).



Target boots fine with this. So its not scheduling functions that is 
causing it. Also I tried with ftrace_filter=*msm* just to be sure if

tracing driver functions is causing any issue but its NOT.



Does it break if the crypto is not initialized? Perhaps add a command
line flag to have it happen earlier:
  


I didnt see any breakage, have been using ramoops with postcore_initcall
for sometime now.


   ramoops=earlyinit

and add a postcore_initcall that checks if that flag is set, and if so,
it does the work then, and the late_initcall() will do nothing.

That way, you can still have unmodified kernels use pstore when it
crashes at boot up.
   


Sounds good.


Great, I guess you can write a patch to do that ;-)



Sure I can :) but as Kees said it would be better if we could
find a way to make it work with a late initialization of compression.
I will try on that.

Thanks,

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-16 Thread Sai Prakash Ranjan

On 10/16/2018 11:18 PM, Steven Rostedt wrote:

On Tue, 16 Oct 2018 23:06:24 +0530
Sai Prakash Ranjan  wrote:


On 10/16/2018 10:27 PM, Steven Rostedt wrote:


OK, can you add to the command line:

   ftrace=function ftrace_filter=*schedule*

to see if it's a specific function that may be causing the issue (but
hopefully it's not one of the scheduling functions that caused it).
   


Target boots fine with this. So its not scheduling functions that is
causing it. Also I tried with ftrace_filter=*msm* just to be sure if
tracing driver functions is causing any issue but its NOT.


OK, seems that something is being traced that shouldn't be.

When this happens after boot up, it's easy to bisect down to the
problem function. But since it's at boot up, it will take a lot longer.

I would suggest to start by going down the alphabet.

ftrace_filter=a*
ftrace_filter=b*
ftrace_filter=c*
[...]

And at least find the letter the bad function starts with.

Note, it could be more than one function (I've had that a couple of
times), and to find that out, you can test with "ftrace_notrace". Say
you find that the problem function starts with 'x'. You can do:

ftrace_notrace=x*

Which will trace all functions except those that start with an 'x', to
make sure it still boots.

Remember, you still need to have ftrace=function for all of this.

Once you find the letter of the function, you can try the next letter,
or perhaps come up with another method. I would say look at the
functions in /sys/kernel/debug/tracing/available_filter_functions, but
they don't list the init function (that can be traced). But you can
use /proc/kallsyms instead.



Ok got it, this sounds fun. I'll give it a try.

Thanks

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-16 Thread Sai Prakash Ranjan

On 10/16/2018 11:46 PM, Steven Rostedt wrote:


[ Removed ivan.iva...@linaro.org due to getting mail delivery errors ]

On Tue, 16 Oct 2018 23:35:00 +0530
Sai Prakash Ranjan  wrote:


Ok got it, this sounds fun. I'll give it a try.


Awesome, I'm looking forward to seeing what you come up with.



Now, I can trigger this crash reboot with "ftrace=function 
ftrace_filter=f*,g*,h*,i*,j*,k*,l*,m*,n*,o*,p*,q*"


And pstore dmesg log gives slightly different backtrace than 
before(i.e., uart_add_one_port+0x374/0x4c8):


<4>[1.976945] EINJ: ACPI disabled.
<6>[2.070011] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
<6>[2.087353] SuperH (H)SCI(F) driver initialized
<6>[2.093708] msm_serial 78af000.serial: msm_serial: detected port #1
<6>[2.093901] msm_serial 78af000.serial: uartclk = 1920
<6>[2.099078] 78af000.serial: ttyMSM1 at MMIO 0x78af000 (irq = 9, 
base_baud = 120) is a MSM

<6>[2.108386] msm_serial 78b.serial: msm_serial: detected port #0
<6>[2.113076] msm_serial 78b.serial: uartclk = 7372800
<6>[2.119300] 78b.serial: ttyMSM0 at MMIO 0x78b (irq = 10, 
base_baud = 460800) is a MSM
<0>[2.119769] Internal error: synchronous external abort: 9610 
[#1] PREEMPT SMP

<4>[2.119772] Modules linked in:
<4>[2.119780] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
4.19.0-rc8-8-ge033b9909fff-dirty #2
<4>[2.119785] Hardware name: Qualcomm Technologies, Inc. APQ 8016 
SBC (DT)

<4>[2.119789] pstate: 4085 (nZcv daIf -PAN -UAO)
<4>[2.119792] pc : msm_read.isra.2+0x20/0x50
<4>[2.119796] lr : msm_read.isra.2+0x1c/0x50
<4>[2.119799] sp : 08033760
<4>[2.119802] x29: 08033760 x28: 09486018
<4>[2.119809] x27: 0001 x26: 7dfffe7ff070
<4>[2.119816] x25: 0062 x24: 09486000
<4>[2.119824] x23:  x22: 0978e190
<4>[2.119831] x21: 095e8228 x20: 0062
<4>[2.119838] x19: 7dfffe7ff008 x18: 
<4>[2.119845] x17:  x16: 
<4>[2.119852] x15: 094a96c8 x14: 3d20647561625f65
<4>[2.119859] x13: 736162202c303120 x12: 3d20717269282030
<4>[2.119866] x11: 3030306238377830 x10: 0134
<4>[2.119874] x9 : 80003c356400 x8 : 08033680
<4>[2.119881] x7 : 80003c354100 x6 : 000d9b72
<4>[2.119888] x5 : 0150 x4 : 80003bc8f000
<4>[2.119895] x3 : 80003c356498 x2 : 
<4>[2.119902] x1 : 80003bf1 x0 : 0800
<0>[2.119911] Process swapper/0 (pid: 1, stack limit = 
0x(ptrval))

<4>[2.119914] Call trace:
<4>[2.119917]  msm_read.isra.2+0x20/0x50
<4>[2.119920]  msm_reset_dm_count+0x44/0x80
<4>[2.119924]  __msm_console_write+0x1c8/0x1d0
<4>[2.119928]  msm_serial_early_write_dm+0x3c/0x50
<4>[2.119931]  console_unlock.part.6+0x468/0x528
<4>[2.119935]  vprintk_emit+0x210/0x218
<4>[2.119938]  vprintk_default+0x48/0x58
<4>[2.119942]  vprintk_func+0xf0/0x1c0
<4>[2.119945]  printk+0x74/0x94
<4>[2.119949]  uart_add_one_port+0x374/0x4c8
<4>[2.119952]  msm_serial_probe+0x140/0x1c8
<4>[2.119955]  platform_drv_probe+0x58/0xb0
<4>[2.119959]  really_probe+0x1f4/0x3c8
<4>[2.119963]  driver_probe_device+0x12c/0x150
<4>[2.119966]  __driver_attach+0x144/0x148
<4>[2.119969]  bus_for_each_dev+0x78/0xe0
<4>[2.119973]  driver_attach+0x30/0x40
<4>[2.119976]  bus_add_driver+0x1c8/0x2a8
<4>[2.119979]  driver_register+0x68/0x118
<4>[2.119983]  __platform_driver_register+0x54/0x60
<4>[2.119987]  msm_serial_init+0x40/0x70
<4>[2.119990]  do_one_initcall+0x54/0x248
<4>[2.119993]  kernel_init_freeable+0x210/0x378
<4>[2.119997]  kernel_init+0x18/0x118
<4>[2.12]  ret_from_fork+0x10/0x1c
<0>[2.120004] Code: aa1e03e0 8b214273 97e61727 d503201f (b9400260)

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-16 Thread Sai Prakash Ranjan

On 10/17/2018 12:11 AM, Steven Rostedt wrote:

On Tue, 16 Oct 2018 23:55:14 +0530
Sai Prakash Ranjan  wrote:


On 10/16/2018 11:46 PM, Steven Rostedt wrote:


[ Removed ivan.iva...@linaro.org due to getting mail delivery errors ]

On Tue, 16 Oct 2018 23:35:00 +0530
Sai Prakash Ranjan  wrote:
   

Ok got it, this sounds fun. I'll give it a try.


Awesome, I'm looking forward to seeing what you come up with.
   


Now, I can trigger this crash reboot with "ftrace=function
ftrace_filter=f*,g*,h*,i*,j*,k*,l*,m*,n*,o*,p*,q*"


Does it crash with just "f*" or "g*" or any singularity of the above?

Hmm, I wonder if it is from tracing msm_read?



Haa seems like you are right! With "ftrace=function 
ftrace_filter=msm_read" , I can trigger the crash, but

sadly "ftrace_notrace=msm_read" also crashes.

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-16 Thread Sai Prakash Ranjan

On 10/17/2018 12:33 AM, Steven Rostedt wrote:

On Wed, 17 Oct 2018 00:31:03 +0530
Sai Prakash Ranjan  wrote:


Haa seems like you are right! With "ftrace=function
ftrace_filter=msm_read" , I can trigger the crash, but
sadly "ftrace_notrace=msm_read" also crashes.


So there's more than one problem area.

What about ftrace_notrace=m*

?



That too crashes.

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-16 Thread Sai Prakash Ranjan

On 10/17/2018 12:46 AM, Steven Rostedt wrote:

On Tue, 16 Oct 2018 15:15:16 -0400
Steven Rostedt  wrote:


I'd like to see the full command line as well. I bet if you remove the
qcom,msm-uartdm from the command line, and had just ftrace=function, it
may also boot fine too. Can you try that?


Note, I probably wont respond for the rest of the day.



Cool, no problem. Will update if I find anything.

Thanks for your time!

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-17 Thread Sai Prakash Ranjan

On 10/17/2018 2:21 AM, Stephen Boyd wrote:

Quoting Sai Prakash Ranjan (2018-10-16 12:35:57)

On 10/17/2018 12:45 AM, Steven Rostedt wrote:

On Wed, 17 Oct 2018 00:36:05 +0530
Sai Prakash Ranjan  wrote:


On 10/17/2018 12:33 AM, Steven Rostedt wrote:

On Wed, 17 Oct 2018 00:31:03 +0530
Sai Prakash Ranjan  wrote:


Haa seems like you are right! With "ftrace=function
ftrace_filter=msm_read" , I can trigger the crash, but
sadly "ftrace_notrace=msm_read" also crashes.


So there's more than one problem area.

What about ftrace_notrace=m*

?



That too crashes.



Which compiler are you using and can you send me your config.


Config attached.

Compiler: aarch64-linux-gnu-gcc (Linaro GCC 6.3-2017.02) 6.3.1 20170109


I wonder if there's something screwing up with the way ftrace nops are
working on this board.

A couple things of note, 1) it works fine after boot up. 2) it crashes
in the initcall code, so it's not due to ftrace being enabled too early.

I'd like to see the full command line as well. I bet if you remove the
qcom,msm-uartdm from the command line, and had just ftrace=function, it
may also boot fine too. Can you try that?



Kernel command line: root=/dev/disk/by-partlabel/rootfs rw rootwait
ftrace=function ftrace_notrace=m* earlycon console=ttyMSM0,115200n8

qcom,msm-uartdm is not in command line, it is the earlycon. So without
earlycon(bootconsole), board boots fine as we discussed earlier.



Have you tried with earlycon and no ftrace things on the commandline?

  root=/dev/disk/by-partlabel/rootfs rw rootwait earlycon 
console=ttyMSM0,115200n8

If earlycon is causing problems, it sounds like this may be another case
of earlycon uart handing off to the uart driver and that failing because
something gets printed while the uart is transitioning from the earlycon
console to the kernel boot console. I recall the uart would trample on
itself in interesting ways.



Yes I have tried with only earlycon enabled and everything is fine. 
Issue is reproduced only with ftrace=function cmdline.


--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-17 Thread Sai Prakash Ranjan

On 10/17/2018 3:43 PM, Joel Fernandes wrote:

Hi Kees,

On Tue, Oct 16, 2018 at 10:02:53AM -0700, Kees Cook wrote:

On Tue, Oct 16, 2018 at 8:29 AM, Steven Rostedt  wrote:

On Tue, 16 Oct 2018 17:08:25 +0530
Sai Prakash Ranjan  wrote:

One more thing is for pstore dmesg-ramoops, I had to change
late_initcall to postcore_initcall which brings the question as to why
we changed to late_initcall?
Simple git blame shows to support crypto compress api, but is it really
helpful? A lot of boottime issues can be caught with pstore enabled at
postcore_initcall rather than late_initcall, this backtrace
is just one example. Is there any way we could change this?


Does it break if the crypto is not initialized? Perhaps add a command
line flag to have it happen earlier:

  ramoops=earlyinit

and add a postcore_initcall that checks if that flag is set, and if so,
it does the work then, and the late_initcall() will do nothing.

That way, you can still have unmodified kernels use pstore when it
crashes at boot up.


Even better, if we could find a way to make it work with a late
initialization of compression (i.e. postcore_initcall() by default
means anything caught would be uncompressed, but anything after
late_initcall() would be compressed). I'd be very happy to review
patches for that!


What do you think about the (untested) patch below? It seems to me that it
should solve the issue of missing early crash dumps, but I have not tested it
yet. Sai, would you mind trying it out and let me know if you can see the
early crash dumps properly now?

8<---
From: "Joel Fernandes (Google)" 
Subject: [RFC] pstore: allocate compression during late_initcall

ramoop's pstore registration (using pstore_register) has to run during
late_initcall because crypto backend may not be ready during
postcore_initcall. This causes missing of dmesg crash dumps which could
have been caught by pstore.

Instead, lets allow ramoops pstore registration earlier, and once crypto
is ready we can initialize the compression.

Reported-by: Sai Prakash Ranjan 
Signed-off-by: Joel Fernandes (Google) 
---
  fs/pstore/platform.c | 13 +
  fs/pstore/ram.c  |  2 +-
  2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 15e99d5a681d..f09066db2d4d 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -780,6 +780,19 @@ void __init pstore_choose_compression(void)
}
  }
  
+static int __init pstore_compression_late_init(void)

+{
+   /*
+* Check if any pstore backends registered earlier but did not allocate
+* for compression because crypto was not ready, if so then initialize
+* compression.
+*/
+   if (psinfo && !tfm)
+   allocate_buf_for_compression();
+   return 0;
+}
+late_initcall(pstore_compression_late_init);
+
  module_param(compress, charp, 0444);
  MODULE_PARM_DESC(compress, "Pstore compression to use");
  
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c

index bbd1e357c23d..98e48d1a9776 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -940,7 +940,7 @@ static int __init ramoops_init(void)
ramoops_register_dummy();
return platform_driver_register(&ramoops_driver);
  }
-late_initcall(ramoops_init);
+postcore_initcall(ramoops_init);
  
  static void __exit ramoops_exit(void)

  {



Yes I could see the early crash dump. Also I tested with different 
compression (LZO) instead of deflate just to be sure and it works fine, 
thanks :)


Tested-by: Sai Prakash Ranjan 

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-17 Thread Sai Prakash Ranjan

On 10/17/2018 4:39 AM, Joel Fernandes wrote:

On Tue, Oct 16, 2018 at 05:08:25PM +0530, Sai Prakash Ranjan wrote:

Hi,

On dragonboard 410c, with "ftrace=function" boot args, the console output
slows down and board resets without any backtrace as below. This is tested
on latest kernel and seems to exist even in older kernels as well.

[2.949164] EINJ: ACPI disabled.
[3.133001] Serial: 8250/16550 dri
Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic

But with pstore enabled, able to get the below backtrace:
<4>[2.949164] EINJ: ACPI disabled.
<6>[3.133001] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
<6>[3.164097] SuperH (H)SCI(F) driver initialized
<0>[3.164471] Internal error: synchronous external abort: 9610 [#1]
PREEMPT SMP
<4>[3.164479] Modules linked in:
<4>[3.164495] CPU: 2 PID: 1 Comm: swapper/0 Not tainted
4.19.0-rc8-8-ge033b9909fff-dirty #175
<4>[3.164501] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC
(DT)
<4>[3.164508] pstate: 4085 (nZcv daIf -PAN -UAO)
<4>[3.164514] pc : msm_read.isra.2+0x20/0x50
<4>[3.164520] lr : msm_read.isra.2+0x1c/0x50
<4>[3.164526] sp : 08033a50
<4>[3.164531] x29: 08033a50 x28: 09486018
<4>[3.164548] x27: 0001 x26: 7dfffe7ff070
<4>[3.164565] x25: 0034 x24: 09486000
<4>[3.164582] x23:  x22: 0978e190
<4>[3.164599] x21: 095e8228 x20: 0034
<4>[3.164616] x19: 7dfffe7ff008 x18: 
<4>[3.164632] x17:  x16: 
<4>[3.164649] x15: 094a96c8 x14: 8978e6bf
<4>[3.164666] x13: 0978e6cd x12: 0038
<4>[3.164683] x11: 094c6000 x10: 0c24
<4>[3.164699] x9 : 80003c89b400 x8 : 08033970
<4>[3.164716] x7 : 8eb04100 x6 : 000af304
<4>[3.164732] x5 : 0c40 x4 : 80003c06f000
<4>[3.164750] x3 : 80003c89b498 x2 : 
<4>[3.164766] x1 : 80003ca68000 x0 : 0800
<0>[3.164785] Process swapper/0 (pid: 1, stack limit =
0x(ptrval))
<4>[3.164791] Call trace:
<4>[3.164797]  msm_read.isra.2+0x20/0x50
<4>[3.164804]  msm_reset_dm_count+0x44/0x80
<4>[3.164810]  __msm_console_write+0x1c8/0x1d0
<4>[3.164816]  msm_serial_early_write_dm+0x3c/0x50
<4>[3.164823]  console_unlock.part.6+0x468/0x528
<4>[3.164829]  vprintk_emit+0x210/0x218
<4>[3.164835]  vprintk_default+0x48/0x58
<4>[3.164841]  vprintk_func+0xf0/0x1c0
<4>[3.164847]  printk+0x74/0x94
<4>[3.164853]  sci_init+0x24/0x3c
<4>[3.164859]  do_one_initcall+0x54/0x248
<4>[3.164866]  kernel_init_freeable+0x210/0x378
<4>[3.164872]  kernel_init+0x18/0x118
<4>[3.164878]  ret_from_fork+0x10/0x1c
<0>[3.164884] Code: aa1e03e0 8b214273 97e616f7 d503201f (b9400260)

Seems to be issue with the msm serial driver and not ftrace.
Could someone look into it.

One more thing is for pstore dmesg-ramoops, I had to change late_initcall to
postcore_initcall which brings the question as to why we changed to
late_initcall?
Simple git blame shows to support crypto compress api, but is it really
helpful? A lot of boottime issues can be caught with pstore enabled at
postcore_initcall rather than late_initcall, this backtrace
is just one example. Is there any way we could change this?


Any chance you are able to dig deeper into the stack trace? I would
disassemble vmlinux and see which part of msm_read is generating the
synchronous external abort and look into that.



Yes I had checked the part of msm_read which was generating the abort 
and it always seems to be in "msm_wait_for_xmitr" at below pointed location.


static inline void msm_wait_for_xmitr(struct uart_port *port)
{
while (!(msm_read(port, UART_SR) & UART_SR_TX_EMPTY)) { <---
if (msm_read(port, UART_ISR) & UART_ISR_TX_READY)
break;
udelay(1);
}
msm_write(port, UART_CR_CMD_RESET_TX_READY, UART_CR);
}

Also I could confirm that this path is entered repeatedly(with tracing 
register reads/writes from my patch series in 
https://lore.kernel.org/patchwork/cover/983795/ and tp_printk) and crash 
is seen at some random time, so could not get much from this.



Also similar to what Steve suggested, I wonder if it boots for you if you
annotate all the functions in the serial driver with 'notrace'.



I have tried this too, but still the target crashes. So I am doubtful if 
this is ftrace issue? Maybe earlycon uart issue as Stephen is suspecting.


Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-17 Thread Sai Prakash Ranjan

On 10/17/2018 5:08 PM, Sai Prakash Ranjan wrote:


What do you think about the (untested) patch below? It seems to me 
that it
should solve the issue of missing early crash dumps, but I have not 
tested it

yet. Sai, would you mind trying it out and let me know if you can see the
early crash dumps properly now?

8<---
From: "Joel Fernandes (Google)" 
Subject: [RFC] pstore: allocate compression during late_initcall

ramoop's pstore registration (using pstore_register) has to run during
late_initcall because crypto backend may not be ready during
postcore_initcall. This causes missing of dmesg crash dumps which could
have been caught by pstore.

Instead, lets allow ramoops pstore registration earlier, and once crypto
is ready we can initialize the compression.

Reported-by: Sai Prakash Ranjan 
Signed-off-by: Joel Fernandes (Google) 
---
  fs/pstore/platform.c | 13 +
  fs/pstore/ram.c  |  2 +-
  2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 15e99d5a681d..f09066db2d4d 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -780,6 +780,19 @@ void __init pstore_choose_compression(void)
  }
  }
+static int __init pstore_compression_late_init(void)
+{
+    /*
+ * Check if any pstore backends registered earlier but did not 
allocate
+ * for compression because crypto was not ready, if so then 
initialize

+ * compression.
+ */
+    if (psinfo && !tfm)
+    allocate_buf_for_compression();
+    return 0;
+}
+late_initcall(pstore_compression_late_init);
+
  module_param(compress, charp, 0444);
  MODULE_PARM_DESC(compress, "Pstore compression to use");
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index bbd1e357c23d..98e48d1a9776 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -940,7 +940,7 @@ static int __init ramoops_init(void)
  ramoops_register_dummy();
  return platform_driver_register(&ramoops_driver);
  }
-late_initcall(ramoops_init);
+postcore_initcall(ramoops_init);
  static void __exit ramoops_exit(void)
  {



Yes I could see the early crash dump. Also I tested with different 
compression (LZO) instead of deflate just to be sure and it works fine, 
thanks :)


Tested-by: Sai Prakash Ranjan 



I just noticed that allocate_buf_for_compression() is also called from 
pstore_register(). Shouldn't that call be removed now that ramoops_init 
is moved to postcore_initcall and allocate_buf_for_compression() will 
just return doing nothing when called from pstore_register()?


Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH] pstore: Refactor compression initialization

2018-10-17 Thread Sai Prakash Ranjan
int __init pstore_compression_late_init(void)

-{
/*
-* Check if any pstore backends registered earlier but did not allocate
-* for compression because crypto was not ready, if so then initialize
-* compression.
+* Check if any pstore backends registered earlier but did not
+* initialize compression because crypto was not ready. If so,
+* then initialize compression now.
 */
-   if (psinfo && !tfm)
-   allocate_buf_for_compression();
+   allocate_buf_for_compression();


We can also get rid of the 'zbackend' global variable since choosing the
compression backend and allocating the buffers are done at the same time?

Otherwise looks good to me,

Reviewed-by: Joel Fernandes (Google) 



Tested this on top of Joel's patch, works fine on getting early crash 
pstore dmesg logs, thanks.


Tested-by: Sai Prakash Ranjan 

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 3/6] tracing: Add tp_pstore cmdline to have tracepoints go to pstore

2018-10-08 Thread Sai Prakash Ranjan

On 9/26/2018 3:16 PM, Sai Prakash Ranjan wrote:

On 9/26/2018 2:55 AM, Joel Fernandes wrote:

On Sat, Sep 8, 2018 at 1:28 PM Sai Prakash Ranjan
 wrote:


Add the kernel command line tp_pstore option that will have
tracepoints go to persistent ram buffer as well as to the
trace buffer for further debugging. This is similar to tp_printk
cmdline option of ftrace.

Pstore support for event tracing is already added and we enable
logging to pstore only if cmdline is specified.

Passing "tp_pstore" will activate logging to pstore. To turn it
off, the sysctl /proc/sys/kernel/tracepoint_pstore can have '0'
echoed into it. Note, this only works if the cmdline option is
used. Echoing 1 into the sysctl file without the cmdline option
will have no affect.

Signed-off-by: Sai Prakash Ranjan 
---
  .../admin-guide/kernel-parameters.txt | 21 
  include/linux/ftrace.h    |  6 ++-
  kernel/sysctl.c   |  7 +++
  kernel/trace/Kconfig  | 22 +++-
  kernel/trace/trace.c  | 51 +++
  kernel/trace/trace.h  |  7 +++
  6 files changed, 112 insertions(+), 2 deletions(-)


[...]

  config GCOV_PROFILE_FTRACE
 bool "Enable GCOV profiling on ftrace subsystem"
 depends on GCOV_KERNEL
@@ -789,4 +810,3 @@ config GCOV_PROFILE_FTRACE
  endif # FTRACE

  endif # TRACING_SUPPORT
-
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index bf6f1d70484d..018cbbefb769 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -73,6 +73,11 @@ struct trace_iterator *tracepoint_print_iter;
  int tracepoint_printk;
  static DEFINE_STATIC_KEY_FALSE(tracepoint_printk_key);

+/* Pipe tracepoints to pstore */
+struct trace_iterator *tracepoint_pstore_iter;
+int tracepoint_pstore;
+static DEFINE_STATIC_KEY_FALSE(tracepoint_pstore_key);
+
  /* For tracers that don't implement custom flags */
  static struct tracer_opt dummy_tracer_opt[] = {
 { }
@@ -238,6 +243,14 @@ static int __init set_tracepoint_printk(char *str)
  }
  __setup("tp_printk", set_tracepoint_printk);

+static int __init set_tracepoint_pstore(char *str)
+{
+   if ((strcmp(str, "=0") != 0 && strcmp(str, "=off") != 0))
+   tracepoint_pstore = 1;
+   return 1;
+}
+__setup("tp_pstore", set_tracepoint_pstore);
+
  unsigned long long ns2usecs(u64 nsec)
  {
 nsec += 500;
@@ -2376,11 +2389,45 @@ int tracepoint_printk_sysctl(struct ctl_table 
*table, int write,

 return ret;
  }

+static DEFINE_MUTEX(tracepoint_pstore_mutex);
+
+int tracepoint_pstore_sysctl(struct ctl_table *table, int write,
+    void __user *buffer, size_t *lenp,
+    loff_t *ppos)
+{
+   int save_tracepoint_pstore;
+   int ret;
+
+   mutex_lock(&tracepoint_pstore_mutex);
+   save_tracepoint_pstore = tracepoint_pstore;
+
+   ret = proc_dointvec(table, write, buffer, lenp, ppos);
+
+   if (!tracepoint_pstore_iter)
+   tracepoint_pstore = 0;
+
+   if (save_tracepoint_pstore == tracepoint_pstore)
+   goto out;
+
+   if (tracepoint_pstore)
+   static_key_enable(&tracepoint_pstore_key.key);
+   else
+   static_key_disable(&tracepoint_pstore_key.key);
+
+ out:
+   mutex_unlock(&tracepoint_pstore_mutex);
+
+   return ret;
+}
+
  void trace_event_buffer_commit(struct trace_event_buffer *fbuffer)
  {
 if (static_key_false(&tracepoint_printk_key.key))
 output_printk(fbuffer);

+   if (static_key_false(&tracepoint_pstore_key.key))
+   pstore_event_call(fbuffer);


Can you not find a way to pass the size of the even record here, to
pstore? Then you can directly allocate and store the binary record in
pstore itself instead of rendering and storing the text in pstore
which will be more space (and I think time) efficient. I also think if
you do this, then you will not need to use the spinlock in the pstore
(which AIUI is preventing the warning you're seeing in the
event_call->event.funcs->trace() call).



Right, I can check this out.



Hi Joel,

Sorry for the long delay in updating this thread.
But I just observed one weird behaviour in ftrace-ramoops when I was 
trying to use binary record instead of rendering text for event-ramoops.


Even though we set the ftrace-size in device tree for ramoops, the 
actual ftrace-ramoops record seems to have more records than specified size.

Is this expected or some bug?

Below is the ftrace-ramoops size passed in dtsi for db410c and we can 
see that the ftrace record is more.


# cat /sys/module/ramoops/parameters/ftrace_size
131072
#
# cat /sys/fs/pstore/ftrace-ramoops-0 | wc -c
560888
#
# cat /sys/fs/pstore/ftrace-ramoops-0 | grep CPU:0 | wc -c
137758
#
# cat /sys/fs/pstore/ftrace-ram

Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-17 Thread Sai Prakash Ranjan

On 10/18/2018 8:03 AM, Steven Rostedt wrote:

On Wed, 17 Oct 2018 00:36:05 +0530
Sai Prakash Ranjan  wrote:


On 10/17/2018 12:33 AM, Steven Rostedt wrote:

On Wed, 17 Oct 2018 00:31:03 +0530
Sai Prakash Ranjan  wrote:
   

Haa seems like you are right! With "ftrace=function
ftrace_filter=msm_read" , I can trigger the crash, but
sadly "ftrace_notrace=msm_read" also crashes.


So there's more than one problem area.

What about ftrace_notrace=m*

?
   


That too crashes.



So something else is causing an issue besides just msm_read.

Can you do an objdump -dr of the entire vmlinux binary and gzip it and
post it somewhere. Not sure if it would be too big to email. You could
try sending it to me privately. I'd like to see the binary that you are
using.



I have sent the objdump and dot config to you privately.

Thanks,
Sai
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-18 Thread Sai Prakash Ranjan

On 10/19/2018 9:47 AM, Joel Fernandes wrote:

On Thu, Oct 18, 2018 at 09:17:06AM -0400, Steven Rostedt wrote:

On Thu, 18 Oct 2018 10:51:18 +0530
Sai Prakash Ranjan  wrote:


So something else is causing an issue besides just msm_read.

Can you do an objdump -dr of the entire vmlinux binary and gzip it and
post it somewhere. Not sure if it would be too big to email. You could
try sending it to me privately. I'd like to see the binary that you are
using.
   


I have sent the objdump and dot config to you privately.


Thanks. I don't see anything that pops out, but then again, my arm asm
foo is very rusty (it has been literally decades since I did any arm
asm). I wonder if it could simply be a timing issue?

086eb538 :
086eb538:   a9be7bfdstp x29, x30, [sp,#-32]!
086eb53c:   910003fdmov x29, sp
086eb540:   a90153f3stp x19, x20, [sp,#16]
086eb544:   aa0003f4mov x20, x0
086eb548:   2a0103f3mov w19, w1
086eb54c:   aa1e03e0mov x0, x30
086eb550:   97e6bae4bl  0809a0e0 <_mcount>

The above is changed to nop on boot, but then to:

bl ftrace_caller

When ftrace is enabled.

086eb554:   8b334280add x0, x20, w19, uxtw
086eb558:   b940ldr w0, [x0]
086eb55c:   a94153f3ldp x19, x20, [sp,#16]
086eb560:   a8c27bfdldp x29, x30, [sp],#32
086eb564:   d65f03c0ret



0809a0e4 :
0809a0e4:   a9bf7bfdstp x29, x30, [sp,#-16]!
0809a0e8:   910003fdmov x29, sp
0809a0ec:   d10013c0sub x0, x30, #0x4
0809a0f0:   f94003a1ldr x1, [x29]
0809a0f4:   f9400421ldr x1, [x1,#8]
0809a0f8:   d1001021sub x1, x1, #0x4

0809a0fc :
0809a0fc:   d503201fnop

The above nop gets patched to:

bl ftrace_ops_no_ops

Which will iterate through all the registered functions.


0809a100 :
0809a100:   d503201fnop

The above only gets set when function graph tracer is enabled, which it
is not in this case.

0809a104:   a8c17bfdldp x29, x30, [sp],#16
0809a108:   d65f03c0ret


Anyone see any problems here?


This seems sane to me, he says in the other thread that he put 'notrace' to
the msm serial functions (which AIUI should prevent ftrace instrumentation)
and he still sees the issue.



Yes I did add notrace to all functions in msm serial and checked the 
objdump to make sure that those were not instrumented, and yet the 
target crashed. This doesnt seem like an issue with ftrace but rather 
with msm early con.


Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: Crash in msm serial on dragonboard with ftrace bootargs

2018-10-19 Thread Sai Prakash Ranjan

On 10/19/2018 7:21 PM, Steven Rostedt wrote:

On Fri, 19 Oct 2018 12:24:05 +0530
Sai Prakash Ranjan  wrote:


Anyone see any problems here?


This seems sane to me, he says in the other thread that he put 'notrace' to
the msm serial functions (which AIUI should prevent ftrace instrumentation)
and he still sees the issue.
   


Yes I did add notrace to all functions in msm serial and checked the
objdump to make sure that those were not instrumented, and yet the
target crashed. This doesnt seem like an issue with ftrace but rather
with msm early con.


The thing that still bothers me is that it boots fine without ftrace
running, and that it requires tracing something to cause the problem.
This tells me that ftrace has something to do with it. Perhaps it's
just changing the timing.



Yes tracing does cause issue but only if earlycon is present. If I 
disable earlycon, then even tracing wont cause any issue and target 
boots fine.



You said that if you add 'ftrace_filter=msm_read' to the command line,
it still crashes?

So only tracing that function we have an issue, right?



Tracing msm_read does cause the crash, but that is not the only one 
causing the crash because even after "ftrace_notrace=msm_read" the board 
crashes which is why I suspect msm earlycon and not ftrace.


Thanks,
Sai
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 0/6] Tracing register accesses with pstore and dynamic debug

2018-10-19 Thread Sai Prakash Ranjan

On 10/20/2018 10:55 AM, Joel Fernandes wrote:

On Sun, Sep 09, 2018 at 01:57:01AM +0530, Sai Prakash Ranjan wrote:

Hi,

This patch series adds Event tracing support to pstore and is continuation
to the RFC patch introduced to add a new tracing facility for register
accesses called Register Trace Buffer(RTB). Since we decided to not introduce
a separate framework to trace register accesses and use existing framework
like tracepoints, I have moved from RFC. Details of the RFC in link below:

Link: 
https://lore.kernel.org/lkml/cover.1535119710.git.saiprakash.ran...@codeaurora.org/

MSR tracing example given by Steven was helpful in using tracepoints for
register accesses instead of using separate trace. But just having these
IO traces would not help much unless we could have them in some persistent
ram buffer for debugging unclocked access or some kind of bus hang or an
unexpected reset caused by some buggy driver which happens a lot during
initial development stages. By analyzing the last few entries of this buffer,
we could identify the register access which is causing the issue.


Hi Sai,

I wanted to see if I could make some time to get your patches working. We are
hitting usecases that need something like this as well. Basically devices
hanging and then the ramdump does not tell us much, so in this case pstore
events can be really helpful. This usecase came up last year as well.

Anyway while I was going through your patches, I cleaned up some pstore code
as well and I have 3 more patches on top of yours for this clean up. I prefer
we submit the patches together and sync our work together so that there is
least conflict.

Here's my latest tree:
https://github.com/joelagnel/linux-kernel/commits/pstore-events
(note that I have only build tested the patches since I just wrote them and
its quite late in the night here ;-))



Hi Joel,

Thanks for looking into this. Sure, I will be happy to sync up with you 
on this. I can test your additional patches on top of my pstore patches. 
BTW, I'm still stuck at copying binary record into pstore and then 
extract it during read time. Seems like I'm missing something.


Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 0/6] Tracing register accesses with pstore and dynamic debug

2018-10-20 Thread Sai Prakash Ranjan

On 10/20/2018 9:57 PM, Joel Fernandes wrote:

On Sat, Oct 20, 2018 at 12:02:37PM +0530, Sai Prakash Ranjan wrote:

On 10/20/2018 10:55 AM, Joel Fernandes wrote:

On Sun, Sep 09, 2018 at 01:57:01AM +0530, Sai Prakash Ranjan wrote:

Hi,

This patch series adds Event tracing support to pstore and is continuation
to the RFC patch introduced to add a new tracing facility for register
accesses called Register Trace Buffer(RTB). Since we decided to not introduce
a separate framework to trace register accesses and use existing framework
like tracepoints, I have moved from RFC. Details of the RFC in link below:

Link: 
https://lore.kernel.org/lkml/cover.1535119710.git.saiprakash.ran...@codeaurora.org/

MSR tracing example given by Steven was helpful in using tracepoints for
register accesses instead of using separate trace. But just having these
IO traces would not help much unless we could have them in some persistent
ram buffer for debugging unclocked access or some kind of bus hang or an
unexpected reset caused by some buggy driver which happens a lot during
initial development stages. By analyzing the last few entries of this buffer,
we could identify the register access which is causing the issue.


Hi Sai,

I wanted to see if I could make some time to get your patches working. We are
hitting usecases that need something like this as well. Basically devices
hanging and then the ramdump does not tell us much, so in this case pstore
events can be really helpful. This usecase came up last year as well.

Anyway while I was going through your patches, I cleaned up some pstore code
as well and I have 3 more patches on top of yours for this clean up. I prefer
we submit the patches together and sync our work together so that there is
least conflict.

Here's my latest tree:
https://github.com/joelagnel/linux-kernel/commits/pstore-events
(note that I have only build tested the patches since I just wrote them and
its quite late in the night here ;-))



Hi Joel,

Thanks for looking into this. Sure, I will be happy to sync up with you on


Thanks. And added a fourth patch in the tree too.


this. I can test your additional patches on top of my pstore patches. BTW,
I'm still stuck at copying binary record into pstore and then extract it
during read time. Seems like I'm missing something.


Sure, push your latest somewhere and let me know. I'll try to get you unstuck.



Thanks Joel, I will push my changes and let you know in some time.

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 0/6] Tracing register accesses with pstore and dynamic debug

2018-10-20 Thread Sai Prakash Ranjan

On 10/21/2018 9:16 AM, Sai Prakash Ranjan wrote:

On 10/20/2018 9:57 PM, Joel Fernandes wrote:

On Sat, Oct 20, 2018 at 12:02:37PM +0530, Sai Prakash Ranjan wrote:

On 10/20/2018 10:55 AM, Joel Fernandes wrote:

On Sun, Sep 09, 2018 at 01:57:01AM +0530, Sai Prakash Ranjan wrote:

Hi,

This patch series adds Event tracing support to pstore and is 
continuation

to the RFC patch introduced to add a new tracing facility for register
accesses called Register Trace Buffer(RTB). Since we decided to not 
introduce
a separate framework to trace register accesses and use existing 
framework
like tracepoints, I have moved from RFC. Details of the RFC in link 
below:


Link: 
https://lore.kernel.org/lkml/cover.1535119710.git.saiprakash.ran...@codeaurora.org/ 



MSR tracing example given by Steven was helpful in using 
tracepoints for
register accesses instead of using separate trace. But just having 
these
IO traces would not help much unless we could have them in some 
persistent
ram buffer for debugging unclocked access or some kind of bus hang 
or an
unexpected reset caused by some buggy driver which happens a lot 
during
initial development stages. By analyzing the last few entries of 
this buffer,

we could identify the register access which is causing the issue.


Hi Sai,

I wanted to see if I could make some time to get your patches 
working. We are
hitting usecases that need something like this as well. Basically 
devices
hanging and then the ramdump does not tell us much, so in this case 
pstore

events can be really helpful. This usecase came up last year as well.

Anyway while I was going through your patches, I cleaned up some 
pstore code
as well and I have 3 more patches on top of yours for this clean up. 
I prefer
we submit the patches together and sync our work together so that 
there is

least conflict.

Here's my latest tree:
https://github.com/joelagnel/linux-kernel/commits/pstore-events
(note that I have only build tested the patches since I just wrote 
them and

its quite late in the night here ;-))



Hi Joel,

Thanks for looking into this. Sure, I will be happy to sync up with 
you on


Thanks. And added a fourth patch in the tree too.

this. I can test your additional patches on top of my pstore patches. 
BTW,

I'm still stuck at copying binary record into pstore and then extract it
during read time. Seems like I'm missing something.


Sure, push your latest somewhere and let me know. I'll try to get you 
unstuck.




Thanks Joel, I will push my changes and let you know in some time.



Hi Joel,

Here's my tree:
https://github.com/saiprakash-ranjan/linux/commits/pstore-events

I have one patch extra on top of your patches. Nothing much on binary 
record storage in this patch, only removed kmalloc in pstore event call 
to avoid loop.


Thanks,
Sai
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 2/6] pstore: Add event tracing support

2018-09-16 Thread Sai Prakash Ranjan

On 9/9/2018 1:57 AM, Sai Prakash Ranjan wrote:

Currently pstore has function trace support which can be
used to get the function call chain with limited data.
Event tracing has extra data which is useful to debug wide
variety of issues and is heavily used across the kernel.

Adding this support to pstore can be very helpful to debug
different subsystems since almost all of them have trace
events already available. And also it is useful to debug
unknown resets or crashes since we can get lot more info
from event tracing by viewing the last occurred events.

High frequency tracepoints such as sched and irq has also
been tested. This implementation is similar to "tp_printk"
command line feature of ftrace by Steven.

For example, sample pstore output of tracing sched events
after reboot:

   # mount -t pstore pstore /sys/fs/pstore/
   # tail /sys/fs/pstore/event-ramoops-0
   sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=S ==> 
next_comm=rcu_preempt next_pid=10 next_prio=120
   sched_switch: prev_comm=rcu_preempt prev_pid=10 prev_prio=120 prev_state=R+ 
==> next_comm=swapper/1 next_pid=0 next_prio=120
   sched_waking: comm=rcu_sched pid=11 prio=120 target_cpu=002
   sched_wakeup: comm=rcu_sched pid=11 prio=120 target_cpu=002
   sched_switch: prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=S ==> 
next_comm=rcu_sched next_pid=11 next_prio=120
   sched_switch: prev_comm=rcu_sched prev_pid=11 prev_prio=120 prev_state=R+ 
==> next_comm=swapper/2 next_pid=0 next_prio=120
   sched_waking: comm=reboot pid=1867 prio=120 target_cpu=000
   sched_wakeup: comm=reboot pid=1867 prio=120 target_cpu=000
   sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=S ==> 
next_comm=reboot next_pid=1867 next_prio=120

Signed-off-by: Sai Prakash Ranjan 
---
  fs/pstore/Kconfig  |  2 +-
  fs/pstore/ftrace.c | 55 ++
  fs/pstore/inode.c  |  4 +++
  fs/pstore/ram.c| 44 +++---
  include/linux/pstore.h |  2 ++
  include/linux/pstore_ram.h |  1 +
  6 files changed, 104 insertions(+), 4 deletions(-)

diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig
index 503086f7f7c1..6fe087b13a51 100644
--- a/fs/pstore/Kconfig
+++ b/fs/pstore/Kconfig
@@ -126,7 +126,7 @@ config PSTORE_PMSG
  
  config PSTORE_FTRACE

bool "Persistent function tracer"
-   depends on PSTORE
+   depends on PSTORE && PSTORE!=m
depends on FUNCTION_TRACER
depends on DEBUG_FS
help
diff --git a/fs/pstore/ftrace.c b/fs/pstore/ftrace.c
index 06aab07b6bb7..d47dc93ac098 100644
--- a/fs/pstore/ftrace.c
+++ b/fs/pstore/ftrace.c
@@ -24,6 +24,8 @@
  #include 
  #include 
  #include 
+#include 
+#include 
  #include 
  #include "internal.h"
  
@@ -62,6 +64,59 @@ static struct ftrace_ops pstore_ftrace_ops __read_mostly = {

.func   = pstore_ftrace_call,
  };
  
+void notrace pstore_event_call(struct trace_event_buffer *fbuffer)

+{
+   struct trace_iterator *iter;
+   struct trace_seq *s;
+   struct trace_event_call *event_call;
+   struct pstore_record record;
+   struct trace_event *event;
+   struct seq_buf *seq;
+   unsigned long flags;
+
+   if (!psinfo)
+   return;
+
+   if (unlikely(oops_in_progress))
+   return;
+
+   pstore_record_init(&record, psinfo);
+   record.type = PSTORE_TYPE_EVENT;
+
+   iter = kmalloc(sizeof(*iter), GFP_KERNEL);
+   if (!iter)
+   return;
+
+   event_call = fbuffer->trace_file->event_call;
+   if (!event_call || !event_call->event.funcs ||
+   !event_call->event.funcs->trace)
+   goto fail_event;
+
+   event = &fbuffer->trace_file->event_call->event;
+
+   spin_lock_irqsave(&psinfo->buf_lock, flags);
+
+   trace_seq_init(&iter->seq);
+   iter->ent = fbuffer->entry;
+   event_call->event.funcs->trace(iter, 0, event);
+   trace_seq_putc(&iter->seq, 0);
+
+   if (seq->size > psinfo->bufsize)
+   seq->size = psinfo->bufsize;
+
+   s = &iter->seq;
+   seq = &s->seq;
+
+   record.buf = (char *)(seq->buffer);
+   record.size = seq->len;
+   psinfo->write(&record);
+
+   spin_unlock_irqrestore(&psinfo->buf_lock, flags);
+
+fail_event:
+   kfree(iter);
+}
+
  static DEFINE_MUTEX(pstore_ftrace_lock);
  static bool pstore_ftrace_enabled;
  
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c

index 5fcb845b9fec..f099152abbbd 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -345,6 +345,10 @@ int pstore_mkfile(struct dentry *root, struct 
pstore_record *record)
scnprintf(name, sizeof(name), "console-%s-%llu",
  record->psi->name, record->id);
   

Re: [PATCH 2/6] pstore: Add event tracing support

2018-09-17 Thread Sai Prakash Ranjan

On 9/16/2018 7:25 PM, Joel Fernandes wrote:
Sorry for the top post. I've been wanting to do this as well for some 
time. It's quite useful. I am out of office this week and away from work 
machine. I will take a look at your patches next week once I'm back at 
work. Thanks.


Best,
J



Cool, thanks Joel.



Re: [PATCH 1/6] dt-bindings: ramoops: Add event-size property

2018-09-17 Thread Sai Prakash Ranjan

On 9/17/2018 11:15 AM, Rob Herring wrote:

On Sun,  9 Sep 2018 01:57:02 +0530, Sai Prakash Ranjan wrote:

Add an optional property called event-size to reserve
log buffer for trace events.

Signed-off-by: Sai Prakash Ranjan 
---
  .../devicetree/bindings/reserved-memory/ramoops.txt| 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)



Reviewed-by: Rob Herring 



Thanks a lot for the review Rob.

- Sai


Re: [PATCH 2/6] pstore: Add event tracing support

2018-09-17 Thread Sai Prakash Ranjan

On 9/17/2018 8:24 PM, Kees Cook wrote:

On Sun, Sep 16, 2018 at 6:55 AM, Joel Fernandes  wrote:

On Sun, Sep 16, 2018, 12:08 AM Sai Prakash Ranjan
 wrote:


On 9/9/2018 1:57 AM, Sai Prakash Ranjan wrote:

Currently pstore has function trace support which can be
used to get the function call chain with limited data.
Event tracing has extra data which is useful to debug wide
variety of issues and is heavily used across the kernel.

Adding this support to pstore can be very helpful to debug
different subsystems since almost all of them have trace
events already available. And also it is useful to debug
unknown resets or crashes since we can get lot more info
from event tracing by viewing the last occurred events.


Anyone here?


Sorry for the top post. I've been wanting to do this as well for some time.
It's quite useful. I am out of office this week and away from work machine.
I will take a look at your patches next week once I'm back at work. Thanks.


If Steven agrees this shouldn't live in ftrace directly and Joel
reviews these patches, I think it should be fine. I'm travelling, but
I can review it hopefully later this week.



Thank you Kees.

- Sai



Re: [PATCH 2/6] pstore: Add event tracing support

2018-09-17 Thread Sai Prakash Ranjan

On 9/17/2018 11:08 PM, Stephen Boyd wrote:

Quoting Sai Prakash Ranjan (2018-09-11 03:46:01)

On 9/9/2018 1:57 AM, Sai Prakash Ranjan wrote:

+void notrace pstore_event_call(struct trace_event_buffer *fbuffer)
+{
+ struct trace_iterator *iter;
+ struct trace_seq *s;
+ struct trace_event_call *event_call;
+ struct pstore_record record;
+ struct trace_event *event;
+ struct seq_buf *seq;
+ unsigned long flags;
+
+ if (!psinfo)
+ return;
+
+ if (unlikely(oops_in_progress))
+ return;
+
+ pstore_record_init(&record, psinfo);
+ record.type = PSTORE_TYPE_EVENT;
+
+ iter = kmalloc(sizeof(*iter), GFP_KERNEL);
+ if (!iter)
+ return;
+
+ event_call = fbuffer->trace_file->event_call;
+ if (!event_call || !event_call->event.funcs ||
+ !event_call->event.funcs->trace)
+ goto fail_event;
+
+ event = &fbuffer->trace_file->event_call->event;
+
+ spin_lock_irqsave(&psinfo->buf_lock, flags);
+
+ trace_seq_init(&iter->seq);
+ iter->ent = fbuffer->entry;
+ event_call->event.funcs->trace(iter, 0, event);
+ trace_seq_putc(&iter->seq, 0);
+
+ if (seq->size > psinfo->bufsize)
+ seq->size = psinfo->bufsize;
+
+ s = &iter->seq;
+ seq = &s->seq;
+
+ record.buf = (char *)(seq->buffer);
+ record.size = seq->len;
+ psinfo->write(&record);
+
+ spin_unlock_irqrestore(&psinfo->buf_lock, flags);
+
+fail_event:
+ kfree(iter);
+}
+



When tracing sched events on sdm845 mtp, I hit below bug repeatedly.
Seems like pstore_event_call can be called in atomic context.
I will respin the below fix in next version of the patch.
Reviews on other parts would be appreciated, thanks.

diff --git a/fs/pstore/ftrace.c b/fs/pstore/ftrace.c
index d47dc93ac098..a497cf782ee8 100644
--- a/fs/pstore/ftrace.c
+++ b/fs/pstore/ftrace.c
@@ -73,6 +73,7 @@ void notrace pstore_event_call(struct
trace_event_buffer *fbuffer)
  struct trace_event *event;
  struct seq_buf *seq;
  unsigned long flags;
+   gfp_t gfpflags;

  if (!psinfo)
  return;
@@ -83,7 +84,9 @@ void notrace pstore_event_call(struct
trace_event_buffer *fbuffer)
  pstore_record_init(&record, psinfo);
  record.type = PSTORE_TYPE_EVENT;

-   iter = kmalloc(sizeof(*iter), GFP_KERNEL);
+   gfpflags = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC :
GFP_KERNEL;
+




Hi Stephen

Thanks for the comments.


Do you need to allocate at all? Can you throw the iter on the stack?



Yes since we need to copy the contents to pstore buffer.


Using in_atomic() and irqs_disabled() to figure out if an atomic or a
non-atomic allocation should be used is not a good solution.



I took reference from a similar use by graph_trace_open() which can be 
called in atomic context via ftrace_dump(). I am open to correct this if 
there is some other way.


Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH] tty/sysrq: Make local variable 'killer' in sysrq_handle_crash() global

2018-09-17 Thread Sai Prakash Ranjan

On 9/18/2018 3:03 AM, Matthias Kaehlcke wrote:

sysrq_handle_crash() dereferences a NULL pointer on purpose to force
an exception, the local variable 'killer' is assigned to NULL and
dereferenced later. Clang detects the NULL pointer dereference at compile
time and emits a BRK instruction (on arm64) instead of the expected NULL
pointer exception. Change 'killer' to a global variable (and rename it
to 'sysrq_killer' to avoid possible clashes) to prevent Clang from
detecting the condition. By default global variables are initialized
with zero/NULL in C, therefore an explicit initialization is not needed.

Reported-by: Sai Prakash Ranjan 
Suggested-by: Evan Green 
Signed-off-by: Matthias Kaehlcke 
---
  drivers/tty/sysrq.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 06ed20dd01ba..49fa8e758690 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -132,10 +132,10 @@ static struct sysrq_key_op sysrq_unraw_op = {
  #define sysrq_unraw_op (*(struct sysrq_key_op *)NULL)
  #endif /* CONFIG_VT */
  
+char *sysrq_killer;

+
  static void sysrq_handle_crash(int key)
  {
-   char *killer = NULL;
-
/* we need to release the RCU read lock here,
 * otherwise we get an annoying
 * 'BUG: sleeping function called from invalid context'
@@ -144,7 +144,7 @@ static void sysrq_handle_crash(int key)
rcu_read_unlock();
panic_on_oops = 1;  /* force panic */
wmb();
-   *killer = 1;
+   *sysrq_killer = 1;
  }
  static struct sysrq_key_op sysrq_crash_op = {
.handler    = sysrq_handle_crash,



Tested-by: Sai Prakash Ranjan 

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 2/6] pstore: Add event tracing support

2018-09-17 Thread Sai Prakash Ranjan

On 9/18/2018 4:34 AM, Steven Rostedt wrote:

On Sun, 16 Sep 2018 12:37:52 +0530
Sai Prakash Ranjan  wrote:


Hi,

Anyone here?


You also just caught me from coming back from a trip. I'm looking at
your patches now.

-- Steve



Thanks Steve, I just thought you guys might have missed the patch.

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH] tty/sysrq: Make local variable 'killer' in sysrq_handle_crash() global

2018-09-17 Thread Sai Prakash Ranjan

On 9/18/2018 11:41 AM, Jiri Slaby wrote:

On 09/17/2018, 11:33 PM, Matthias Kaehlcke wrote:

sysrq_handle_crash() dereferences a NULL pointer on purpose to force
an exception, the local variable 'killer' is assigned to NULL and
dereferenced later. Clang detects the NULL pointer dereference at compile
time and emits a BRK instruction (on arm64) instead of the expected NULL
pointer exception. Change 'killer' to a global variable (and rename it
to 'sysrq_killer' to avoid possible clashes) to prevent Clang from
detecting the condition. By default global variables are initialized
with zero/NULL in C, therefore an explicit initialization is not needed.

Reported-by: Sai Prakash Ranjan 
Suggested-by: Evan Green 
Signed-off-by: Matthias Kaehlcke 
---
  drivers/tty/sysrq.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 06ed20dd01ba..49fa8e758690 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -132,10 +132,10 @@ static struct sysrq_key_op sysrq_unraw_op = {
  #define sysrq_unraw_op (*(struct sysrq_key_op *)NULL)
  #endif /* CONFIG_VT */
  
+char *sysrq_killer;

+
  static void sysrq_handle_crash(int key)
  {
-   char *killer = NULL;
-
/* we need to release the RCU read lock here,
 * otherwise we get an annoying
 * 'BUG: sleeping function called from invalid context'
@@ -144,7 +144,7 @@ static void sysrq_handle_crash(int key)
rcu_read_unlock();
panic_on_oops = 1;  /* force panic */
wmb();
-   *killer = 1;
+   *sysrq_killer = 1;


Just because a static analyzer is wrong? Oh wait, even compiler is
wrong. At least make it a static global. Or what about OPTIMIZER_HIDE_VAR?



static global does not work, clang still inserts brk. As for 
OPTIMIZE_HIDE_VAR, it seems to work.
But, I dont think it is defined for clang in which case it defaults to 
using barrier(). There is already one wmb(), so will it be right?


--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 5/6] arm64/io: Add header for instrumentation of io operations

2018-09-18 Thread Sai Prakash Ranjan

On 9/18/2018 5:09 AM, Steven Rostedt wrote:

On Sun,  9 Sep 2018 01:57:06 +0530
Sai Prakash Ranjan  wrote:


The new asm-generic/io-instrumented.h will keep arch code
clean and separate from instrumented version which traces
io register accesses. This instrumented header can later
be included in arm as well for tracing io register accesses.



Looks good to me.

Acked-by: Steven Rostedt (VMware) 

-- Steve


Thanks for the ack Steve.

@Will/Mark/Robin Let me know if you have any review comments, thanks.

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH] tty/sysrq: Make local variable 'killer' in sysrq_handle_crash() global

2018-09-18 Thread Sai Prakash Ranjan

On 9/18/2018 12:50 PM, Greg Kroah-Hartman wrote:

On Tue, Sep 18, 2018 at 12:28:39PM +0530, Sai Prakash Ranjan wrote:

On 9/18/2018 11:41 AM, Jiri Slaby wrote:

On 09/17/2018, 11:33 PM, Matthias Kaehlcke wrote:

sysrq_handle_crash() dereferences a NULL pointer on purpose to force
an exception, the local variable 'killer' is assigned to NULL and
dereferenced later. Clang detects the NULL pointer dereference at compile
time and emits a BRK instruction (on arm64) instead of the expected NULL
pointer exception. Change 'killer' to a global variable (and rename it
to 'sysrq_killer' to avoid possible clashes) to prevent Clang from
detecting the condition. By default global variables are initialized
with zero/NULL in C, therefore an explicit initialization is not needed.

Reported-by: Sai Prakash Ranjan 
Suggested-by: Evan Green 
Signed-off-by: Matthias Kaehlcke 
---
   drivers/tty/sysrq.c | 6 +++---
   1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 06ed20dd01ba..49fa8e758690 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -132,10 +132,10 @@ static struct sysrq_key_op sysrq_unraw_op = {
   #define sysrq_unraw_op (*(struct sysrq_key_op *)NULL)
   #endif /* CONFIG_VT */
+char *sysrq_killer;
+
   static void sysrq_handle_crash(int key)
   {
-   char *killer = NULL;
-
/* we need to release the RCU read lock here,
 * otherwise we get an annoying
 * 'BUG: sleeping function called from invalid context'
@@ -144,7 +144,7 @@ static void sysrq_handle_crash(int key)
rcu_read_unlock();
panic_on_oops = 1;  /* force panic */
wmb();
-   *killer = 1;
+   *sysrq_killer = 1;


Just because a static analyzer is wrong? Oh wait, even compiler is
wrong. At least make it a static global. Or what about OPTIMIZER_HIDE_VAR?



static global does not work, clang still inserts brk. As for
OPTIMIZE_HIDE_VAR, it seems to work.
But, I dont think it is defined for clang in which case it defaults to using
barrier(). There is already one wmb(), so will it be right?


Ick, why is this needed at all?  Why are we trying to "roll our own
panic" in this code?



Hi Greg, do you mean like why there is a killer var at all or why this 
change is required?


Below are some additional info about this issue:

CLANG generated:

  ff800842bc1c :
  ff800842bc1c:   a9bf7bfdstp x29, x30, [sp,#-16]!
  ff800842bc20:   910003fdmov x29, sp
  ff800842bc24:   97f1a34ebl  ff800809495c 
<_mcount>
  ff800842bc28:   97f38876bl  ff800810de00 
<__rcu_read_unlock>
  ff800842bc2c:   f0006888adrpx8, ff800913e000 


  ff800842bc30:   320003e9orr w9, wzr, #0x1
  ff800842bc34:   b9091109str w9, [x8,#2320]
  ff800842bc38:   d5033e9fdsb st
  ff800842bc3c:   d4200020brk #0x1  <


GCC generated:

ff8008452f60 :
ff8008452f60:   a9bf7bfdstp x29, x30, [sp,#-16]!
ff8008452f64:   910003fdmov x29, sp
ff8008452f68:   aa1e03e0mov x0, x30
ff8008452f6c:   97f1078cbl  ff8008094d9c <_mcount>
ff8008452f70:   97f2f2ccbl  ff800810faa0 
<__rcu_read_unlock>
ff8008452f74:   900067e1adrpx1, ff800914e000 

ff8008452f78:   52800020mov w0, #0x1 
   // #1

ff8008452f7c:   b9096820str w0, [x1,#2408]
ff8008452f80:   d5033e9fdsb st
ff8008452f84:   d281mov x1, #0x0 
   // #0

ff8008452f88:   3920strbw0, [x1]
ff8008452f8c:   a8c17bfdldp x29, x30, [sp],#16
ff8008452f90:   d65f03c0ret

Trigger sysrq crash:

localhost ~ # echo c > /proc/sysrq-trigger
 [   78.320401] sysrq: SysRq : Trigger a crash
 [   78.324752] Unexpected kernel BRK exception at EL1
 [   78.329691] Unhandled debug exception: ptrace BRK handler 
(0xf201) at 0x0e25c368

 [   78.338384] Internal error: : 0 [#1] PREEMPT SMP

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH] tty/sysrq: Make local variable 'killer' in sysrq_handle_crash() global

2018-09-18 Thread Sai Prakash Ranjan

On 9/18/2018 2:47 PM, Greg Kroah-Hartman wrote:

On Tue, Sep 18, 2018 at 02:35:02PM +0530, Sai Prakash Ranjan wrote:

On 9/18/2018 12:50 PM, Greg Kroah-Hartman wrote:

On Tue, Sep 18, 2018 at 12:28:39PM +0530, Sai Prakash Ranjan wrote:

On 9/18/2018 11:41 AM, Jiri Slaby wrote:

On 09/17/2018, 11:33 PM, Matthias Kaehlcke wrote:

sysrq_handle_crash() dereferences a NULL pointer on purpose to force
an exception, the local variable 'killer' is assigned to NULL and
dereferenced later. Clang detects the NULL pointer dereference at compile
time and emits a BRK instruction (on arm64) instead of the expected NULL
pointer exception. Change 'killer' to a global variable (and rename it
to 'sysrq_killer' to avoid possible clashes) to prevent Clang from
detecting the condition. By default global variables are initialized
with zero/NULL in C, therefore an explicit initialization is not needed.

Reported-by: Sai Prakash Ranjan 
Suggested-by: Evan Green 
Signed-off-by: Matthias Kaehlcke 
---
drivers/tty/sysrq.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 06ed20dd01ba..49fa8e758690 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -132,10 +132,10 @@ static struct sysrq_key_op sysrq_unraw_op = {
#define sysrq_unraw_op (*(struct sysrq_key_op *)NULL)
#endif /* CONFIG_VT */
+char *sysrq_killer;
+
static void sysrq_handle_crash(int key)
{
-   char *killer = NULL;
-
/* we need to release the RCU read lock here,
 * otherwise we get an annoying
 * 'BUG: sleeping function called from invalid context'
@@ -144,7 +144,7 @@ static void sysrq_handle_crash(int key)
rcu_read_unlock();
panic_on_oops = 1;  /* force panic */
wmb();
-   *killer = 1;
+   *sysrq_killer = 1;


Just because a static analyzer is wrong? Oh wait, even compiler is
wrong. At least make it a static global. Or what about OPTIMIZER_HIDE_VAR?



static global does not work, clang still inserts brk. As for
OPTIMIZE_HIDE_VAR, it seems to work.
But, I dont think it is defined for clang in which case it defaults to using
barrier(). There is already one wmb(), so will it be right?


Ick, why is this needed at all?  Why are we trying to "roll our own
panic" in this code?



Hi Greg, do you mean like why there is a killer var at all or why this
change is required?


I understand you are using a compiler that thinks it wants to protect
yourself from your code and tries to "fix" it for you.  That's fine, and
is up to the compiler writers (personally that seems not a good idea.)

My question is why we just don't call panic() here instead of trying to
duplicate the logic of that function here.  Why is that happening?



It seems fine to call panic() here. Dont no why they chose to have a 
null pointer dereference.


--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 5/6] arm64/io: Add header for instrumentation of io operations

2018-09-18 Thread Sai Prakash Ranjan

On 9/18/2018 5:17 PM, Will Deacon wrote:

On Tue, Sep 18, 2018 at 12:40:57PM +0530, Sai Prakash Ranjan wrote:

On 9/18/2018 5:09 AM, Steven Rostedt wrote:

On Sun,  9 Sep 2018 01:57:06 +0530
Sai Prakash Ranjan  wrote:


The new asm-generic/io-instrumented.h will keep arch code
clean and separate from instrumented version which traces
io register accesses. This instrumented header can later
be included in arm as well for tracing io register accesses.



Looks good to me.

Acked-by: Steven Rostedt (VMware) 

-- Steve


Thanks for the ack Steve.

@Will/Mark/Robin Let me know if you have any review comments, thanks.


As I said before, the arm64 bits looks fine to me. I just don't really want
to Ack the thing because the patch also contains the asm-generic changes.

Will



Ok thank you. Is there anyone else whom I should add for asm-generic?

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 2/6] pstore: Add event tracing support

2018-09-18 Thread Sai Prakash Ranjan

On 9/18/2018 5:04 AM, Steven Rostedt wrote:


It looks like pstore_event_call() gets called from a trace event. You
can't call kmalloc() from one. One thing is that kmalloc has
tracepoints itself. You trace those you just entered an infinite loop.




Ok will remove it in v2. But any alternative way to do this?


+
+   event_call = fbuffer->trace_file->event_call;
+   if (!event_call || !event_call->event.funcs ||
+   !event_call->event.funcs->trace)
+   goto fail_event;
+
+   event = &fbuffer->trace_file->event_call->event;
+
+   spin_lock_irqsave(&psinfo->buf_lock, flags);
+
+   trace_seq_init(&iter->seq);
+   iter->ent = fbuffer->entry;


I guess what you are doing is needing to translate the raw data into
ascii output, and need the trace_iterator to do so.

You are already under a psinfo->buf_lock. Add a dummy iterator to that
and use it instead.

trace_seq_init(&psinfo->iter->seq);


+   event_call->event.funcs->trace(iter, 0, event);


  (psinfo->iter, 0 , event);

etc.



Sure, will update in v2.


+   trace_seq_putc(&iter->seq, 0);
+
+   if (seq->size > psinfo->bufsize)
+   seq->size = psinfo->bufsize;
+
+   s = &iter->seq;
+   seq = &s->seq;
+
+   record.buf = (char *)(seq->buffer);
+   record.size = seq->len;
+   psinfo->write(&record);
+
+   spin_unlock_irqrestore(&psinfo->buf_lock, flags);


You may also need to convert these spin_locks into raw_spin_locks as
when PREEMPT_RT enters the kernel you don't want them to turn into
mutexes.

But that can be another patch.



I will change this in v2, but can't we have it in same patch?

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 2/6] pstore: Add event tracing support

2018-09-18 Thread Sai Prakash Ranjan

On 9/19/2018 2:14 AM, Steven Rostedt wrote:

On Tue, 18 Sep 2018 23:22:48 +0530
Sai Prakash Ranjan  wrote:


On 9/18/2018 5:04 AM, Steven Rostedt wrote:


It looks like pstore_event_call() gets called from a trace event. You
can't call kmalloc() from one. One thing is that kmalloc has
tracepoints itself. You trace those you just entered an infinite loop.

   


Ok will remove it in v2. But any alternative way to do this?


I think I describe it below.



Ok got it, will change and post the 2nd version soon.




+
+   event_call = fbuffer->trace_file->event_call;
+   if (!event_call || !event_call->event.funcs ||
+   !event_call->event.funcs->trace)
+   goto fail_event;
+
+   event = &fbuffer->trace_file->event_call->event;
+
+   spin_lock_irqsave(&psinfo->buf_lock, flags);
+
+   trace_seq_init(&iter->seq);
+   iter->ent = fbuffer->entry;


I guess what you are doing is needing to translate the raw data into
ascii output, and need the trace_iterator to do so.

You are already under a psinfo->buf_lock. Add a dummy iterator to that
and use it instead.

trace_seq_init(&psinfo->iter->seq);
   

+   event_call->event.funcs->trace(iter, 0, event);


  (psinfo->iter, 0 , event);

etc.
   


Sure, will update in v2.


+   trace_seq_putc(&iter->seq, 0);
+
+   if (seq->size > psinfo->bufsize)
+   seq->size = psinfo->bufsize;
+
+   s = &iter->seq;
+   seq = &s->seq;
+
+   record.buf = (char *)(seq->buffer);
+   record.size = seq->len;
+   psinfo->write(&record);
+
+   spin_unlock_irqrestore(&psinfo->buf_lock, flags);


You may also need to convert these spin_locks into raw_spin_locks as
when PREEMPT_RT enters the kernel you don't want them to turn into
mutexes.

But that can be another patch.
   


I will change this in v2, but can't we have it in same patch?


I suggested a separate patch because buf_lock is used elsewhere.
Changing it to "raw_spin_lock" will affect more than just what this
patch series does. Thus, I recommend making it a separate patch to keep
this patch series from being more intrusive than it needs to be.



Sure, thanks a lot.

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


[PATCH 1/3] arm64: dts: qcom: sm8250: Fix level triggered PMU interrupt polarity

2021-02-16 Thread Sai Prakash Ranjan
As per interrupt documentation for SM8250 SoC, the polarity
for level triggered PMU interrupt is low, fix this.

Fixes: 60378f1a171e ("arm64: dts: qcom: sm8250: Add sm8250 dts file")
Signed-off-by: Sai Prakash Ranjan 
---
 arch/arm64/boot/dts/qcom/sm8250.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi 
b/arch/arm64/boot/dts/qcom/sm8250.dtsi
index 947e1accae3a..1864c459a563 100644
--- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
@@ -279,7 +279,7 @@ mmcx_reg: mmcx-reg {
 
pmu {
compatible = "arm,armv8-pmuv3";
-   interrupts = ;
+   interrupts = ;
};
 
psci {
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 0/3] arm64: dts: qcom: Fix PMU and timer interrupt properties

2021-02-16 Thread Sai Prakash Ranjan
Fix PMU interrupt polarity for SM8250 and SM8350 SoCs and the timer
interrupt property for SM8250 SoC.

Sai Prakash Ranjan (3):
  arm64: dts: qcom: sm8250: Fix level triggered PMU interrupt polarity
  arm64: dts: qcom: sm8350: Fix level triggered PMU interrupt polarity
  arm64: dts: qcom: sm8250: Fix timer interrupt to specify EL2 physical
timer

 arch/arm64/boot/dts/qcom/sm8250.dtsi | 4 ++--
 arch/arm64/boot/dts/qcom/sm8350.dtsi | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)


base-commit: d79b47c59576a51d8e288a6b98b75ccf4afb8acd
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 3/3] arm64: dts: qcom: sm8250: Fix timer interrupt to specify EL2 physical timer

2021-02-16 Thread Sai Prakash Ranjan
ARM architected timer interrupts DT property specifies EL2/HYP
physical interrupt and not EL2/HYP virtual interrupt for the 4th
interrupt property. As per interrupt documentation for SM8250 SoC,
the EL2/HYP physical timer interrupt is 10 and EL2/HYP virtual timer
interrupt is 12, so fix the 4th timer interrupt to be EL2 physical
timer interrupt (10 in this case).

Fixes: 60378f1a171e ("arm64: dts: qcom: sm8250: Add sm8250 dts file")
Signed-off-by: Sai Prakash Ranjan 
---
 arch/arm64/boot/dts/qcom/sm8250.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi 
b/arch/arm64/boot/dts/qcom/sm8250.dtsi
index 1864c459a563..3232ac6253bb 100644
--- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
@@ -3754,7 +3754,7 @@ timer {
(GIC_CPU_MASK_SIMPLE(8) | IRQ_TYPE_LEVEL_LOW)>,
 ,
-;
};
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 2/3] arm64: dts: qcom: sm8350: Fix level triggered PMU interrupt polarity

2021-02-16 Thread Sai Prakash Ranjan
As per interrupt documentation for SM8350 SoC, the polarity
for level triggered PMU interrupt is low, fix this.

Fixes: b7e8f433a673 ("arm64: dts: qcom: Add basic devicetree support for SM8350 
SoC")
Signed-off-by: Sai Prakash Ranjan 
---
 arch/arm64/boot/dts/qcom/sm8350.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8350.dtsi 
b/arch/arm64/boot/dts/qcom/sm8350.dtsi
index 5ef460458f5c..e8bf3f95c674 100644
--- a/arch/arm64/boot/dts/qcom/sm8350.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8350.dtsi
@@ -153,7 +153,7 @@ memory@8000 {
 
pmu {
compatible = "arm,armv8-pmuv3";
-   interrupts = ;
+   interrupts = ;
};
 
psci {
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH] stm class: Fix out of bound access from bitmap allocation

2019-04-16 Thread Sai Prakash Ranjan

On 4/16/2019 8:30 PM, Alexander Shishkin wrote:

Sai Prakash Ranjan  writes:


From: Mulu He 

Bitmap allocation works on array of unsigned longs and
for stm master allocation when the number of software
channels is 32, 4 bytes are allocated and there is a out of
bound access at the first 8 bytes access of bitmap region.


Does the below fix the problem for you?

 From fb22b9ab109b332e58d72df13563e270befbd0e3 Mon Sep 17 00:00:00 2001
From: Alexander Shishkin 
Date: Tue, 16 Apr 2019 17:47:02 +0300
Subject: [PATCH] stm class: Fix channel bitmap on 32-bit systems

Commit 7bd1d4093c2f ("stm class: Introduce an abstraction for System Trace
Module devices") naively calculates the channel bitmap size in 64-bit
chunks regardless of the size of underlying unsigned long, making the
bitmap half as big on a 32-bit system. This leads to an out of bounds
access with the upper half of the bitmap.

Fix this by using BITS_TO_LONGS. While at it, convert to using
struct_size() for the total size calculation of the master struct.

Signed-off-by: Alexander Shishkin 
Fixes: 7bd1d4093c2f ("stm class: Introduce an abstraction for System Trace Module 
devices")
Reported-by: Mulu He 
Cc: sta...@vger.kernel.org # v4.4+
---
  drivers/hwtracing/stm/core.c | 6 ++
  1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/hwtracing/stm/core.c b/drivers/hwtracing/stm/core.c
index 5b5807cbcf7c..8c45e79e47db 100644
--- a/drivers/hwtracing/stm/core.c
+++ b/drivers/hwtracing/stm/core.c
@@ -166,11 +166,9 @@ stm_master(struct stm_device *stm, unsigned int idx)
  static int stp_master_alloc(struct stm_device *stm, unsigned int idx)
  {
struct stp_master *master;
-   size_t size;
  
-	size = ALIGN(stm->data->sw_nchannels, 8) / 8;

-   size += sizeof(struct stp_master);
-   master = kzalloc(size, GFP_ATOMIC);
+   master = kzalloc(struct_size(master, chan_map, 
BITS_TO_LONGS(stm->data->sw_nchannels)),
+GFP_ATOMIC);
if (!master)
return -ENOMEM;
  



++ David

Yes it does fix the issue. Actually initial fix internally was using
BITS_TO_LONGS, don't no why they deferred from it.

Anyways thanks for the patch.

- Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


[PATCH] coresight: etm4x: Add ETM PIDs for Cortex-A55 and Cortex-A78

2021-02-12 Thread Sai Prakash Ranjan
Add ETM PIDs for Cortex-A55 and Cortex-A78 to the list of
supported ETMs.

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/hwtracing/coresight/coresight-etm4x-core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c 
b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index b20b6ff17cf6..193233792ab5 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -1713,7 +1713,9 @@ static const struct amba_id etm4_ids[] = {
CS_AMBA_ID(0x000bb95a), /* Cortex-A72 */
CS_AMBA_ID(0x000bb959), /* Cortex-A73 */
CS_AMBA_UCI_ID(0x000bb9da, uci_id_etm4),/* Cortex-A35 */
+   CS_AMBA_UCI_ID(0x000bbd05, uci_id_etm4),/* Cortex-A55 */
CS_AMBA_UCI_ID(0x000bbd0c, uci_id_etm4),/* Neoverse N1 */
+   CS_AMBA_UCI_ID(0x000bbd41, uci_id_etm4),/* Cortex-A78 */
CS_AMBA_UCI_ID(0x000f0205, uci_id_etm4),/* Qualcomm Kryo */
CS_AMBA_UCI_ID(0x000f0211, uci_id_etm4),/* Qualcomm Kryo */
CS_AMBA_UCI_ID(0x000bb802, uci_id_etm4),/* Qualcomm Kryo 385 Cortex-A55 
*/

base-commit: 1efbcec2ef8c037f1e801c76e4b9434ee2400be7
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH] coresight: etm4x: Add ETM PIDs for Cortex-A55 and Cortex-A78

2021-02-13 Thread Sai Prakash Ranjan

Hi Mike,

On 2021-02-13 16:24, Mike Leach wrote:

Hi Sai,

This patch does not apply to coresight/next - possibly because the PID
for A55 has been added in an earlier patch ( [b8336ad947e19 ]
coresight: etm4x: add AMBA id for Cortex-A55 and Cortex-A75).



Apologies, my remote was still pointing to linaro coresight repo,
https://git.linaro.org/kernel/coresight.git, I have now updated
the remote to a proper kernel.org coresight repo and will post
the updated patchset.

Thanks,
Sai





On Fri, 12 Feb 2021 at 17:23, Sai Prakash Ranjan
 wrote:


Add ETM PIDs for Cortex-A55 and Cortex-A78 to the list of
supported ETMs.

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/hwtracing/coresight/coresight-etm4x-core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c 
b/drivers/hwtracing/coresight/coresight-etm4x-core.c

index b20b6ff17cf6..193233792ab5 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -1713,7 +1713,9 @@ static const struct amba_id etm4_ids[] = {
CS_AMBA_ID(0x000bb95a), /* Cortex-A72 */
CS_AMBA_ID(0x000bb959), /* Cortex-A73 */
CS_AMBA_UCI_ID(0x000bb9da, uci_id_etm4),/* Cortex-A35 */
+   CS_AMBA_UCI_ID(0x000bbd05, uci_id_etm4),/* Cortex-A55 */
CS_AMBA_UCI_ID(0x000bbd0c, uci_id_etm4),/* Neoverse N1 */
+   CS_AMBA_UCI_ID(0x000bbd41, uci_id_etm4),/* Cortex-A78 */
CS_AMBA_UCI_ID(0x000f0205, uci_id_etm4),/* Qualcomm Kryo */
CS_AMBA_UCI_ID(0x000f0211, uci_id_etm4),/* Qualcomm Kryo */
CS_AMBA_UCI_ID(0x000bb802, uci_id_etm4),/* Qualcomm Kryo 385 
Cortex-A55 */


base-commit: 1efbcec2ef8c037f1e801c76e4b9434ee2400be7
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation




--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK


--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


[PATCHv2] coresight: etm4x: Add ETM PID for Cortex-A78

2021-02-13 Thread Sai Prakash Ranjan
Add ETM PID for Cortex-A78 to the list of supported ETMs.

Signed-off-by: Sai Prakash Ranjan 
---

Changes in v2:
 * Rebased on top of coresight/next from kernel.org coresight repo.

---
 drivers/hwtracing/coresight/coresight-etm4x-core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c 
b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index 15016f757828..a5b13a7779c3 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -1951,6 +1951,7 @@ static const struct amba_id etm4_ids[] = {
CS_AMBA_UCI_ID(0x000bbd05, uci_id_etm4),/* Cortex-A55 */
CS_AMBA_UCI_ID(0x000bbd0a, uci_id_etm4),/* Cortex-A75 */
CS_AMBA_UCI_ID(0x000bbd0c, uci_id_etm4),/* Neoverse N1 */
+   CS_AMBA_UCI_ID(0x000bbd41, uci_id_etm4),/* Cortex-A78 */
CS_AMBA_UCI_ID(0x000f0205, uci_id_etm4),/* Qualcomm Kryo */
CS_AMBA_UCI_ID(0x000f0211, uci_id_etm4),/* Qualcomm Kryo */
CS_AMBA_UCI_ID(0x000bb802, uci_id_etm4),/* Qualcomm Kryo 385 Cortex-A55 
*/

base-commit: 06c18e28c402ecfb842df8e22a19a097c35ffca9
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH 3/4] coresight: etm4x: Add support to exclude kernel mode tracing

2021-02-24 Thread Sai Prakash Ranjan

Hi,

Thanks for taking a look, comments inline.

On 2021-02-23 01:44, Doug Anderson wrote:

Hi,

On Fri, Jan 29, 2021 at 11:08 AM Sai Prakash Ranjan
 wrote:


@@ -1202,6 +1207,13 @@ void etm4_config_trace_mode(struct etmv4_config 
*config)

/* excluding kernel AND user space doesn't make sense */
WARN_ON_ONCE(mode == (ETM_MODE_EXCL_KERN | 
ETM_MODE_EXCL_USER));


+   if (!(mode & ETM_MODE_EXCL_KERN) && 
IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE)) {

+   dev_err(&drvdata->csdev->dev,
+   "Kernel mode tracing is not allowed, check 
your kernel config\n");

+   config->mode |= ETM_MODE_EXCL_KERN;
+   return;


So I'm not an expert on this code, but the above looks suspicious to
me.  Specifically you are still modifying "config->mode" even though
printing an "error" (dev_err, not dev_warn) and then skipping the rest
of this function.  Since you're skipping the rest of this function
you're not applying the access, right?  Naively I'd have expected one
of these:

1. Maybe the "dev_err" should be a "dev_warn" and then you shouldn't
"return".  In this case you're just implicitly adding
"ETM_MODE_EXCL_KERN" (and shouting) but then making things work.  Of
course, then what happens if the user already specified
"ETM_MODE_EXCL_USER" too?  As per the comment above that "doesn't make
sense".  ...so maybe the code wouldn't behave properly...

2. Maybe you should be modifying this function to return an error code.



mode_store() which is the caller of this function sets the config->mode
based on the value passed in the sysfs, so if the user passes the mode
which doesn't exclude the kernel even though the kernel config is 
enabled

and the code just sets it, then that is an error and the user should be
warned about, so I used dev_err, but I can change it to dev_warn if that
is preferred. And to make sysfs mode show the original mode after 
failure,

I set the mode in etm4_config_trace_mode().

But you are right, I am skipping to set the config for other mode bits
and returning which is wrong, will fix it as you suggest below.


3. Maybe you should just be updating the one caller of this function
to error check this right at the beginning of the function and then
fail the sysfs write if the user did the wrong thing.  Then in
etm4_config_trace_mode you could just have a WARN_ON_ONCE if the
kernel wasn't excluded...



Right, will do this.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 1/4] perf/core: Add support to exclude kernel mode instruction tracing

2021-02-09 Thread Sai Prakash Ranjan

Hi Peter,

On 2021-02-02 11:41, Sai Prakash Ranjan wrote:

Hi Peter,

On 2021-02-01 19:11, Peter Zijlstra wrote:

On Mon, Feb 01, 2021 at 01:11:04PM +0530, Sai Prakash Ranjan wrote:


Ok I suppose you mean CONFIG_SECURITY_LOCKDOWN_LSM? But I don't see
how this new config has to depend on that? This can work 
independently

whether complete lockdown is enforced or not since it applies to only
hardware instruction tracing. Ideally this depends on several 
hardware
tracing configs such as ETMs and others but we don't need them 
because

we are already exposing PERF_PMU_CAP_ITRACE check in the events core.


If you don't have lockdown, root pretty much owns the kernel, or am I
missing something?



You are right in saying that without lockdown root would own kernel but
this config(EXCLUDE_KERNEL) will independently make sure that kernel
level pmu tracing is not allowed(we return -EACCES) even if LOCKDOWN
config is disabled. So I'm saying that we don't need to depend on
LOCKDOWN config, its good to have LOCKDOWN config enabled but perf
subsystem doesn't have to care about that.


be used for some speculative execution based attacks. Which other
kernel level PMUs can be used to get a full branch trace that is not
locked down? If there is one, then this should probably be applied to
it as well.


Just the regular counters. The information isn't as accurate, but 
given

enough goes you can infer plenty.

Just like all the SMT size-channel attacks.

Sure, PT and friends make it even easier, but I don't see a 
fundamental

distinction.


Right, we should then exclude all kernel level pmu tracing, is it fine?

if (IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE) && 
!attr.exclude_kernel))

return -EACCES;



Sorry for being pushy, but does the above make sense?

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


[PATCH 1/4] perf/core: Add support to exclude kernel mode instruction tracing

2021-01-29 Thread Sai Prakash Ranjan
Hardware assisted tracing families such as ARM Coresight, Intel PT
provides rich tracing capabilities including instruction level
tracing and accurate timestamps which are very useful for profiling
and also pose a significant security risk. One such example of
security risk is when kernel mode tracing is not excluded and these
hardware assisted tracing can be used to analyze cryptographic code
execution. In this case, even the root user must not be able to infer
anything.

To explain it more clearly in the words of a security team member
(credits: Mattias Nissler),

"Consider a system where disk contents are encrypted and the encryption
key is set up by the user when mounting the file system. From that point
on the encryption key resides in the kernel. It seems reasonable to
expect that the disk encryption key be protected from exfiltration even
if the system later suffers a root compromise (or even against insiders
that have root access), at least as long as the attacker doesn't
manage to compromise the kernel."

Here the idea is to protect such important information from all users
including root users since root privileges does not have to mean full
control over the kernel [1] and root compromise does not have to be
the end of the world.

Currently we can exclude kernel mode tracing via perf_event_paranoid
sysctl but it has following limitations,

 * It is applicable to all PMUs and not just the ones supporting
   instruction tracing.
 * No option to restrict kernel mode instruction tracing by the
   root user.
 * Not possible to restrict kernel mode instruction tracing when the
   hardware assisted tracing IPs like ARM Coresight ETMs use an
   additional interface via sysfs for tracing in addition to perf
   interface.

So introduce a new config CONFIG_EXCLUDE_KERNEL_HW_ITRACE to exclude
kernel mode instruction tracing which will be generic and applicable
to all hardware tracing families and which can also be used with other
interfaces like sysfs in case of ETMs.

[1] https://lwn.net/Articles/796866/

Suggested-by: Suzuki K Poulose 
Suggested-by: Al Grant 
Tested-by: Denis Nikitin 
Link: 
https://lore.kernel.org/lkml/20201015124522.1876-1-saiprakash.ran...@codeaurora.org/
Signed-off-by: Sai Prakash Ranjan 
---
 init/Kconfig | 12 
 kernel/events/core.c |  6 ++
 2 files changed, 18 insertions(+)

diff --git a/init/Kconfig b/init/Kconfig
index af454a51f3c5..31b4d1f26bce 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1832,6 +1832,18 @@ config DEBUG_PERF_USE_VMALLOC
 
 endmenu
 
+config EXCLUDE_KERNEL_HW_ITRACE
+   bool "Exclude kernel mode hardware assisted instruction tracing"
+   depends on PERF_EVENTS
+   help
+ Exclude kernel mode instruction tracing by hardware tracing
+ family such as ARM Coresight ETM, Intel PT and so on.
+
+ This option allows to disable kernel mode instruction tracing
+ offered by hardware assisted tracing for all users(including root)
+ especially for production systems where only userspace tracing might
+ be preferred for security reasons.
+
 config VM_EVENT_COUNTERS
default y
bool "Enable VM event counters for /proc/vmstat" if EXPERT
diff --git a/kernel/events/core.c b/kernel/events/core.c
index aece2fe19693..044a774cef6d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11866,6 +11866,12 @@ SYSCALL_DEFINE5(perf_event_open,
goto err_task;
}
 
+   if (IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE) &&
+   (event->pmu->capabilities & PERF_PMU_CAP_ITRACE) && 
!attr.exclude_kernel) {
+   err = -EACCES;
+   goto err_alloc;
+   }
+
if (is_sampling_event(event)) {
if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) {
err = -EOPNOTSUPP;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 2/4] perf evsel: Print warning for excluding kernel mode instruction tracing

2021-01-29 Thread Sai Prakash Ranjan
Add a warning message to check CONFIG_EXCLUDE_KERNEL_HW_ITRACE kernel
config which excludes kernel mode instruction tracing to help perf tool
users identify the perf event open failure when they attempt kernel mode
tracing with this config enabled.

Tested-by: Denis Nikitin 
Signed-off-by: Sai Prakash Ranjan 
---
 tools/perf/util/evsel.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index c26ea82220bd..09cc0349f883 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2630,7 +2630,8 @@ int evsel__open_strerror(struct evsel *evsel, struct 
target *target,
 ">= 1: Disallow CPU event access\n"
 ">= 2: Disallow kernel profiling\n"
 "To make the adjusted perf_event_paranoid setting permanent 
preserve it\n"
-"in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = 
)",
+"in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = 
)\n\n"
+"Also check CONFIG_EXCLUDE_KERNEL_HW_ITRACE if kernel mode 
tracing is allowed.",
 perf_event_paranoid());
case ENOENT:
return scnprintf(msg, size, "The %s event is not supported.", 
evsel__name(evsel));
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 0/4] Add support to exclude kernel mode hardware assisted instruction tracing

2021-01-29 Thread Sai Prakash Ranjan
Hardware assisted tracing families such as ARM Coresight, Intel PT
provides rich tracing capabilities including instruction level
tracing and accurate timestamps which are very useful for profiling
and also pose a significant security risk. One such example of
security risk is when kernel mode tracing is not excluded and these
hardware assisted tracing can be used to analyze cryptographic code
execution. In this case, even the root user must not be able to infer
anything.

To explain it more clearly, here is the quote from a member of the
security team (credits: Mattias Nissler),

"Consider a system where disk contents are encrypted and the encryption
key is set up by the user when mounting the file system. From that point
on the encryption key resides in the kernel. It seems reasonable to
expect that the disk encryption key be protected from exfiltration even
if the system later suffers a root compromise (or even against insiders
that have root access), at least as long as the attacker doesn't
manage to compromise the kernel."

Here the idea is to protect such important information from all users
including root users since root privileges does not have to mean full
control over the kernel [1] and root compromise does not have to be
the end of the world.

Currently we can exclude kernel mode tracing via perf_event_paranoid
sysctl but it has following limitations,

 * It is applicable to all PMUs and not just the ones supporting
   instruction tracing.
 * No option to restrict kernel mode instruction tracing by the
   root user.
 * Not possible to restrict kernel mode instruction tracing when the
   hardware assisted tracing IPs like ARM Coresight ETMs use an
   additional interface via sysfs for tracing in addition to perf
   interface.

So introduce a new config CONFIG_EXCLUDE_KERNEL_HW_ITRACE to exclude
kernel mode instruction tracing which will be generic and applicable
to all hardware tracing families and which can also be used with other
interfaces like sysfs in case of ETMs.

Patch 1 adds this new config and the support in perf core to exclude
kernel mode tracing for PMUs which support instruction mode tracing.

Patch 2 adds the perf evsel warning message when the perf tool users
attempt to perform a kernel mode instruction trace with the config
enabled to exclude the kernel mode tracing.

Patch 3 and Patch 4 adds the support for excluding kernel mode for
ARM Coresight ETM{4,3}XX sysfs mode using the newly introduced generic
config.

[1] https://lwn.net/Articles/796866/

Sai Prakash Ranjan (4):
  perf/core: Add support to exclude kernel mode instruction tracing
  perf evsel: Print warning for excluding kernel mode instruction
tracing
  coresight: etm4x: Add support to exclude kernel mode tracing
  coresight: etm3x: Add support to exclude kernel mode tracing

 drivers/hwtracing/coresight/coresight-etm3x-core.c | 11 +++
 .../hwtracing/coresight/coresight-etm3x-sysfs.c|  3 ++-
 drivers/hwtracing/coresight/coresight-etm4x-core.c | 14 +-
 .../hwtracing/coresight/coresight-etm4x-sysfs.c|  3 ++-
 init/Kconfig   | 12 
 kernel/events/core.c   |  6 ++
 tools/perf/util/evsel.c|  3 ++-
 7 files changed, 48 insertions(+), 4 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 3/4] coresight: etm4x: Add support to exclude kernel mode tracing

2021-01-29 Thread Sai Prakash Ranjan
On production systems with ETMs enabled, it is preferred to exclude
kernel mode(NS EL1) tracing for security concerns and support only
userspace(NS EL0) tracing. Perf subsystem interface for ETMs use
the newly introduced kernel config CONFIG_EXCLUDE_KERNEL_HW_ITRACE
to exclude kernel mode tracing, but there is an additional interface
via sysfs for ETMs which also needs to be handled to exclude kernel
mode tracing. So we use this same generic kernel config to handle
the sysfs mode of tracing. This config is disabled by default and
would not affect the current configuration which has both kernel and
userspace tracing enabled by default.

Tested-by: Denis Nikitin 
Signed-off-by: Sai Prakash Ranjan 
---
 drivers/hwtracing/coresight/coresight-etm4x-core.c | 14 +-
 .../hwtracing/coresight/coresight-etm4x-sysfs.c|  3 ++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c 
b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index b20b6ff17cf6..f94143057bb8 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -1052,12 +1052,16 @@ static void etm4_set_default(struct etmv4_config 
*config)
return;
 
/*
-* Make default initialisation trace everything
+* Make default initialisation trace everything when
+* CONFIG_EXCLUDE_KERNEL_HW_ITRACE is disabled.
 *
 * This is done by a minimum default config sufficient to enable
 * full instruction trace - with a default filter for trace all
 * achieved by having no filtering.
 */
+   if (IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE))
+   config->mode |= ETM_MODE_EXCL_KERN;
+
etm4_set_default_config(config);
etm4_set_default_filter(config);
 }
@@ -1195,6 +1199,7 @@ static int etm4_set_event_filters(struct etmv4_drvdata 
*drvdata,
 void etm4_config_trace_mode(struct etmv4_config *config)
 {
u32 mode;
+   struct etmv4_drvdata *drvdata = container_of(config, struct 
etmv4_drvdata, config);
 
mode = config->mode;
mode &= (ETM_MODE_EXCL_KERN | ETM_MODE_EXCL_USER);
@@ -1202,6 +1207,13 @@ void etm4_config_trace_mode(struct etmv4_config *config)
/* excluding kernel AND user space doesn't make sense */
WARN_ON_ONCE(mode == (ETM_MODE_EXCL_KERN | ETM_MODE_EXCL_USER));
 
+   if (!(mode & ETM_MODE_EXCL_KERN) && 
IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE)) {
+   dev_err(&drvdata->csdev->dev,
+   "Kernel mode tracing is not allowed, check your kernel 
config\n");
+   config->mode |= ETM_MODE_EXCL_KERN;
+   return;
+   }
+
/* nothing to do if neither flags are set */
if (!(mode & ETM_MODE_EXCL_KERN) && !(mode & ETM_MODE_EXCL_USER))
return;
diff --git a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c 
b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
index 989ce7b8ade7..f1d19d69d151 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
@@ -426,7 +426,8 @@ static ssize_t mode_store(struct device *dev,
else
config->vinst_ctrl &= ~BIT(11);
 
-   if (config->mode & (ETM_MODE_EXCL_KERN | ETM_MODE_EXCL_USER))
+   if ((config->mode & (ETM_MODE_EXCL_KERN | ETM_MODE_EXCL_USER)) ||
+   IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE))
etm4_config_trace_mode(config);
 
spin_unlock(&drvdata->spinlock);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 4/4] coresight: etm3x: Add support to exclude kernel mode tracing

2021-01-29 Thread Sai Prakash Ranjan
On production systems with ETMs enabled, it is preferred to exclude
kernel mode(NS EL1) tracing for security concerns and support only
userspace(NS EL0) tracing. Perf subsystem interface for ETMs use
the newly introduced kernel config CONFIG_EXCLUDE_KERNEL_HW_ITRACE
to exclude kernel mode tracing, but there is an additional interface
via sysfs for ETMs which also needs to be handled to exclude kernel
mode tracing. So we use this same generic kernel config to handle
the sysfs mode of tracing. This config is disabled by default and
would not affect the current configuration which has both kernel and
userspace tracing enabled by default.

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/hwtracing/coresight/coresight-etm3x-core.c  | 11 +++
 drivers/hwtracing/coresight/coresight-etm3x-sysfs.c |  3 ++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm3x-core.c 
b/drivers/hwtracing/coresight/coresight-etm3x-core.c
index 5bf5a5a4ce6d..4da3bfa66b70 100644
--- a/drivers/hwtracing/coresight/coresight-etm3x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm3x-core.c
@@ -195,6 +195,9 @@ void etm_set_default(struct etm_config *config)
if (WARN_ON_ONCE(!config))
return;
 
+   if (IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE))
+   config->mode |= ETM_MODE_EXCL_KERN;
+
/*
 * Taken verbatim from the TRM:
 *
@@ -239,6 +242,7 @@ void etm_set_default(struct etm_config *config)
 void etm_config_trace_mode(struct etm_config *config)
 {
u32 flags, mode;
+   struct etm_drvdata *drvdata = container_of(config, struct etm_drvdata, 
config);
 
mode = config->mode;
 
@@ -248,6 +252,13 @@ void etm_config_trace_mode(struct etm_config *config)
if (mode == (ETM_MODE_EXCL_KERN | ETM_MODE_EXCL_USER))
return;
 
+   if (!(mode & ETM_MODE_EXCL_KERN) && 
IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE)) {
+   dev_err(&drvdata->csdev->dev,
+   "Kernel mode tracing is not allowed, check your kernel 
config\n");
+   config->mode |= ETM_MODE_EXCL_KERN;
+   return;
+   }
+
/* nothing to do if neither flags are set */
if (!(mode & ETM_MODE_EXCL_KERN) && !(mode & ETM_MODE_EXCL_USER))
return;
diff --git a/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c 
b/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c
index e8c7649f123e..26642dafddbb 100644
--- a/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c
+++ b/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c
@@ -164,7 +164,8 @@ static ssize_t mode_store(struct device *dev,
else
config->ctrl &= ~ETMCR_RETURN_STACK;
 
-   if (config->mode & (ETM_MODE_EXCL_KERN | ETM_MODE_EXCL_USER))
+   if ((config->mode & (ETM_MODE_EXCL_KERN | ETM_MODE_EXCL_USER)) ||
+   IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE))
etm_config_trace_mode(config);
 
spin_unlock(&drvdata->spinlock);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH] watchdog: qcom: Remove incorrect usage of QCOM_WDT_ENABLE_IRQ

2021-01-31 Thread Sai Prakash Ranjan

On 2021-01-31 22:33, Jorge Ramirez-Ortiz, Foundries wrote:

On 28/01/21, Sai Prakash Ranjan wrote:

On 2021-01-28 13:49, Jorge Ramirez-Ortiz, Foundries wrote:
> On 26/01/21, Sai Prakash Ranjan wrote:
> > As per register documentation, QCOM_WDT_ENABLE_IRQ which is BIT(1)
> > of watchdog control register is wakeup interrupt enable bit and
> > not related to bark interrupt at all, BIT(0) is used for that.
> > So remove incorrect usage of this bit when supporting bark irq for
> > pre-timeout notification. Currently with this bit set and bark
> > interrupt specified, pre-timeout notification and/or watchdog
> > reset/bite does not occur.
> >
> > Fixes: 36375491a439 ("watchdog: qcom: support pre-timeout when the
> > bark irq is available")
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Sai Prakash Ranjan 
> > ---
> >
> > Reading the conversations from when qcom pre-timeout support was
> > added [1], Bjorn already had mentioned it was not right to touch this
> > bit, not sure which SoC the pre-timeout was tested on at that time,
> > but I have tested this on SDM845, SM8150, SC7180 and watchdog bark
> > and bite does not occur with enabling this bit with the bark irq
> > specified in DT.
>
> this was tested on QCS404. have you validated there? unfortunately I
> no longer have access to that hardware or the documentation
>

I didn't validate on qcs404 yet since I didn't have access to it.
But now that you mention it, let me arrange for a setup and test it
there as well. Note: I did not see bark irq entry in upstream qcs404
dtsi, so you must have had some local change when you tested?


TBH I dont quite remember. I suppose that if those with access to the
documents and hardware are OK with this change then it shouldnt cause
regressions (I just cant check from my end)



No worries, I got the documentation access now and it is the same as
other SoCs which I have tested above, meaning the BIT(1) is not related
to bark irq. I am arranging a setup as well now, it took some time as
I don't work on QCS* chipsets but I can confirm by this week.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 1/4] perf/core: Add support to exclude kernel mode instruction tracing

2021-01-31 Thread Sai Prakash Ranjan

Hi Peter,

On 2021-01-30 01:00, Peter Zijlstra wrote:

On Sat, Jan 30, 2021 at 12:35:10AM +0530, Sai Prakash Ranjan wrote:


Here the idea is to protect such important information from all users
including root users since root privileges does not have to mean full
control over the kernel [1] and root compromise does not have to be
the end of the world.


And yet, your thing lacks:



I guess you mean this lacks an explanation as to why this only applies
to ITRACE and not others? See below.


+config EXCLUDE_KERNEL_HW_ITRACE
+   bool "Exclude kernel mode hardware assisted instruction tracing"
+   depends on PERF_EVENTS

depends on SECURITY_LOCKDOWN

or whatever the appropriate symbol is.



Ok I suppose you mean CONFIG_SECURITY_LOCKDOWN_LSM? But I don't see
how this new config has to depend on that? This can work independently
whether complete lockdown is enforced or not since it applies to only
hardware instruction tracing. Ideally this depends on several hardware
tracing configs such as ETMs and others but we don't need them because
we are already exposing PERF_PMU_CAP_ITRACE check in the events core.


+   help
+ Exclude kernel mode instruction tracing by hardware tracing
+ family such as ARM Coresight ETM, Intel PT and so on.
+
+ This option allows to disable kernel mode instruction tracing
+ offered by hardware assisted tracing for all users(including root)
+	  especially for production systems where only userspace tracing 
might

+ be preferred for security reasons.


Also, colour me unconvinced, pretty much all kernel level PMU usage
can be employed to side-channel / infer crypto keys, why focus on
ITRACE over others?


Here ITRACE is not just instruction trace, it is meant for hardware 
assisted
instruction trace such as Intel PT, Intel BTS, ARM coresight etc. These 
provide
much more capabilities than normal instruction tracing whether its 
kernel level
or userspace. More specifically, these provide more accurate branch 
trace like
Intel PT LBR (Last Branch Record), Intel BTS(Branch Trace Store) which 
can be
used to decode the program flow more accurately with timestamps in real 
time
than other PMUs. Also there is cycle accurate tracing which can 
theoretically
be used for some speculative execution based attacks. Which other kernel 
level
PMUs can be used to get a full branch trace that is not locked down? If 
there

is one, then this should probably be applied to it as well.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 1/4] perf/core: Add support to exclude kernel mode instruction tracing

2021-02-01 Thread Sai Prakash Ranjan

Hi Peter,

On 2021-02-01 19:11, Peter Zijlstra wrote:

On Mon, Feb 01, 2021 at 01:11:04PM +0530, Sai Prakash Ranjan wrote:


Ok I suppose you mean CONFIG_SECURITY_LOCKDOWN_LSM? But I don't see
how this new config has to depend on that? This can work independently
whether complete lockdown is enforced or not since it applies to only
hardware instruction tracing. Ideally this depends on several hardware
tracing configs such as ETMs and others but we don't need them because
we are already exposing PERF_PMU_CAP_ITRACE check in the events core.


If you don't have lockdown, root pretty much owns the kernel, or am I
missing something?



You are right in saying that without lockdown root would own kernel but
this config(EXCLUDE_KERNEL) will independently make sure that kernel
level pmu tracing is not allowed(we return -EACCES) even if LOCKDOWN
config is disabled. So I'm saying that we don't need to depend on
LOCKDOWN config, its good to have LOCKDOWN config enabled but perf
subsystem doesn't have to care about that.


be used for some speculative execution based attacks. Which other
kernel level PMUs can be used to get a full branch trace that is not
locked down? If there is one, then this should probably be applied to
it as well.


Just the regular counters. The information isn't as accurate, but given
enough goes you can infer plenty.

Just like all the SMT size-channel attacks.

Sure, PT and friends make it even easier, but I don't see a fundamental
distinction.


Right, we should then exclude all kernel level pmu tracing, is it fine?

if (IS_ENABLED(CONFIG_EXCLUDE_KERNEL_HW_ITRACE) && 
!attr.exclude_kernel))

return -EACCES;

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 2/3] iommu/io-pgtable-arm: Add IOMMU_LLC page protection flag

2021-02-01 Thread Sai Prakash Ranjan

On 2021-02-01 23:50, Jordan Crouse wrote:

On Mon, Feb 01, 2021 at 08:20:44AM -0800, Rob Clark wrote:

On Mon, Feb 1, 2021 at 3:16 AM Will Deacon  wrote:
>
> On Fri, Jan 29, 2021 at 03:12:59PM +0530, Sai Prakash Ranjan wrote:
> > On 2021-01-29 14:35, Will Deacon wrote:
> > > On Mon, Jan 11, 2021 at 07:45:04PM +0530, Sai Prakash Ranjan wrote:
> > > > Add a new page protection flag IOMMU_LLC which can be used
> > > > by non-coherent masters to set cacheable memory attributes
> > > > for an outer level of cache called as last-level cache or
> > > > system cache. Initial user of this page protection flag is
> > > > the adreno gpu and then can later be used by other clients
> > > > such as video where this can be used for per-buffer based
> > > > mapping.
> > > >
> > > > Signed-off-by: Sai Prakash Ranjan 
> > > > ---
> > > >  drivers/iommu/io-pgtable-arm.c | 3 +++
> > > >  include/linux/iommu.h  | 6 ++
> > > >  2 files changed, 9 insertions(+)
> > > >
> > > > diff --git a/drivers/iommu/io-pgtable-arm.c
> > > > b/drivers/iommu/io-pgtable-arm.c
> > > > index 7439ee7fdcdb..ebe653ef601b 100644
> > > > --- a/drivers/iommu/io-pgtable-arm.c
> > > > +++ b/drivers/iommu/io-pgtable-arm.c
> > > > @@ -415,6 +415,9 @@ static arm_lpae_iopte
> > > > arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
> > > >   else if (prot & IOMMU_CACHE)
> > > >   pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
> > > >   << ARM_LPAE_PTE_ATTRINDX_SHIFT);
> > > > + else if (prot & IOMMU_LLC)
> > > > + pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
> > > > + << ARM_LPAE_PTE_ATTRINDX_SHIFT);
> > > >   }
> > > >
> > > >   if (prot & IOMMU_CACHE)
> > > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > > > index ffaa389ea128..1f82057df531 100644
> > > > --- a/include/linux/iommu.h
> > > > +++ b/include/linux/iommu.h
> > > > @@ -31,6 +31,12 @@
> > > >   * if the IOMMU page table format is equivalent.
> > > >   */
> > > >  #define IOMMU_PRIV   (1 << 5)
> > > > +/*
> > > > + * Non-coherent masters can use this page protection flag to set
> > > > cacheable
> > > > + * memory attributes for only a transparent outer level of cache,
> > > > also known as
> > > > + * the last-level or system cache.
> > > > + */
> > > > +#define IOMMU_LLC(1 << 6)
> > >
> > > On reflection, I'm a bit worried about exposing this because I think it
> > > will
> > > introduce a mismatched virtual alias with the CPU (we don't even have a
> > > MAIR
> > > set up for this memory type). Now, we also have that issue for the PTW,
> > > but
> > > since we always use cache maintenance (i.e. the streaming API) for
> > > publishing the page-tables to a non-coheren walker, it works out.
> > > However,
> > > if somebody expects IOMMU_LLC to be coherent with a DMA API coherent
> > > allocation, then they're potentially in for a nasty surprise due to the
> > > mismatched outer-cacheability attributes.
> > >
> >
> > Can't we add the syscached memory type similar to what is done on android?
>
> Maybe. How does the GPU driver map these things on the CPU side?

Currently we use writecombine mappings for everything, although there
are some cases that we'd like to use cached (but have not merged
patches that would give userspace a way to flush/invalidate)

BR,
-R


LLC/system cache doesn't have a relationship with the CPU cache.  Its 
just a
little accelerator that sits on the connection from the GPU to DDR and 
caches
accesses. The hint that Sai is suggesting is used to mark the buffers 
as
'no-write-allocate' to prevent GPU write operations from being cached 
in the LLC
which a) isn't interesting and b) takes up cache space for read 
operations.


Its easiest to think of the LLC as a bonus accelerator that has no cost 
for

us to use outside of the unfortunate per buffer hint.

We do have to worry about the CPU cache w.r.t I/O coherency (which is a
different hint) and in that case we have all of concerns that Will 
identified.




For mismatched outer cacheability attributes which Will mentioned, I was
referring to [1] in android kernel.

[1] https://android-review.googlesource.com/c/kernel/common/+/1549097/3

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 2/3] iommu/io-pgtable-arm: Add IOMMU_LLC page protection flag

2021-02-01 Thread Sai Prakash Ranjan

On 2021-02-01 23:50, Jordan Crouse wrote:

On Mon, Feb 01, 2021 at 08:20:44AM -0800, Rob Clark wrote:

On Mon, Feb 1, 2021 at 3:16 AM Will Deacon  wrote:
>
> On Fri, Jan 29, 2021 at 03:12:59PM +0530, Sai Prakash Ranjan wrote:
> > On 2021-01-29 14:35, Will Deacon wrote:
> > > On Mon, Jan 11, 2021 at 07:45:04PM +0530, Sai Prakash Ranjan wrote:
> > > > Add a new page protection flag IOMMU_LLC which can be used
> > > > by non-coherent masters to set cacheable memory attributes
> > > > for an outer level of cache called as last-level cache or
> > > > system cache. Initial user of this page protection flag is
> > > > the adreno gpu and then can later be used by other clients
> > > > such as video where this can be used for per-buffer based
> > > > mapping.
> > > >
> > > > Signed-off-by: Sai Prakash Ranjan 
> > > > ---
> > > >  drivers/iommu/io-pgtable-arm.c | 3 +++
> > > >  include/linux/iommu.h  | 6 ++
> > > >  2 files changed, 9 insertions(+)
> > > >
> > > > diff --git a/drivers/iommu/io-pgtable-arm.c
> > > > b/drivers/iommu/io-pgtable-arm.c
> > > > index 7439ee7fdcdb..ebe653ef601b 100644
> > > > --- a/drivers/iommu/io-pgtable-arm.c
> > > > +++ b/drivers/iommu/io-pgtable-arm.c
> > > > @@ -415,6 +415,9 @@ static arm_lpae_iopte
> > > > arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
> > > >   else if (prot & IOMMU_CACHE)
> > > >   pte |= (ARM_LPAE_MAIR_ATTR_IDX_CACHE
> > > >   << ARM_LPAE_PTE_ATTRINDX_SHIFT);
> > > > + else if (prot & IOMMU_LLC)
> > > > + pte |= (ARM_LPAE_MAIR_ATTR_IDX_INC_OCACHE
> > > > + << ARM_LPAE_PTE_ATTRINDX_SHIFT);
> > > >   }
> > > >
> > > >   if (prot & IOMMU_CACHE)
> > > > diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> > > > index ffaa389ea128..1f82057df531 100644
> > > > --- a/include/linux/iommu.h
> > > > +++ b/include/linux/iommu.h
> > > > @@ -31,6 +31,12 @@
> > > >   * if the IOMMU page table format is equivalent.
> > > >   */
> > > >  #define IOMMU_PRIV   (1 << 5)
> > > > +/*
> > > > + * Non-coherent masters can use this page protection flag to set
> > > > cacheable
> > > > + * memory attributes for only a transparent outer level of cache,
> > > > also known as
> > > > + * the last-level or system cache.
> > > > + */
> > > > +#define IOMMU_LLC(1 << 6)
> > >
> > > On reflection, I'm a bit worried about exposing this because I think it
> > > will
> > > introduce a mismatched virtual alias with the CPU (we don't even have a
> > > MAIR
> > > set up for this memory type). Now, we also have that issue for the PTW,
> > > but
> > > since we always use cache maintenance (i.e. the streaming API) for
> > > publishing the page-tables to a non-coheren walker, it works out.
> > > However,
> > > if somebody expects IOMMU_LLC to be coherent with a DMA API coherent
> > > allocation, then they're potentially in for a nasty surprise due to the
> > > mismatched outer-cacheability attributes.
> > >
> >
> > Can't we add the syscached memory type similar to what is done on android?
>
> Maybe. How does the GPU driver map these things on the CPU side?

Currently we use writecombine mappings for everything, although there
are some cases that we'd like to use cached (but have not merged
patches that would give userspace a way to flush/invalidate)

BR,
-R


LLC/system cache doesn't have a relationship with the CPU cache.  Its 
just a
little accelerator that sits on the connection from the GPU to DDR and 
caches
accesses. The hint that Sai is suggesting is used to mark the buffers 
as
'no-write-allocate' to prevent GPU write operations from being cached 
in the LLC
which a) isn't interesting and b) takes up cache space for read 
operations.


Its easiest to think of the LLC as a bonus accelerator that has no cost 
for

us to use outside of the unfortunate per buffer hint.

We do have to worry about the CPU cache w.r.t I/O coherency (which is a
different hint) and in that case we have all of concerns that Will 
identified.




For mismatched outer cacheability attributes which Will mentioned, I was
referring to [1] in android kernel.

[1] https://android-review.googlesource.com/c/kernel/common/+/1549097/3

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 2/3] iommu/io-pgtable-arm: Add IOMMU_LLC page protection flag

2021-02-05 Thread Sai Prakash Ranjan

On 2021-02-04 03:16, Will Deacon wrote:

On Tue, Feb 02, 2021 at 11:56:27AM +0530, Sai Prakash Ranjan wrote:

On 2021-02-01 23:50, Jordan Crouse wrote:
> On Mon, Feb 01, 2021 at 08:20:44AM -0800, Rob Clark wrote:
> > On Mon, Feb 1, 2021 at 3:16 AM Will Deacon  wrote:
> > > On Fri, Jan 29, 2021 at 03:12:59PM +0530, Sai Prakash Ranjan wrote:
> > > > On 2021-01-29 14:35, Will Deacon wrote:
> > > > > On Mon, Jan 11, 2021 at 07:45:04PM +0530, Sai Prakash Ranjan wrote:
> > > > > > +#define IOMMU_LLC(1 << 6)
> > > > >
> > > > > On reflection, I'm a bit worried about exposing this because I think 
it
> > > > > will
> > > > > introduce a mismatched virtual alias with the CPU (we don't even have 
a
> > > > > MAIR
> > > > > set up for this memory type). Now, we also have that issue for the 
PTW,
> > > > > but
> > > > > since we always use cache maintenance (i.e. the streaming API) for
> > > > > publishing the page-tables to a non-coheren walker, it works out.
> > > > > However,
> > > > > if somebody expects IOMMU_LLC to be coherent with a DMA API coherent
> > > > > allocation, then they're potentially in for a nasty surprise due to 
the
> > > > > mismatched outer-cacheability attributes.
> > > > >
> > > >
> > > > Can't we add the syscached memory type similar to what is done on 
android?
> > >
> > > Maybe. How does the GPU driver map these things on the CPU side?
> >
> > Currently we use writecombine mappings for everything, although there
> > are some cases that we'd like to use cached (but have not merged
> > patches that would give userspace a way to flush/invalidate)
> >
>
> LLC/system cache doesn't have a relationship with the CPU cache.  Its
> just a
> little accelerator that sits on the connection from the GPU to DDR and
> caches
> accesses. The hint that Sai is suggesting is used to mark the buffers as
> 'no-write-allocate' to prevent GPU write operations from being cached in
> the LLC
> which a) isn't interesting and b) takes up cache space for read
> operations.
>
> Its easiest to think of the LLC as a bonus accelerator that has no cost
> for
> us to use outside of the unfortunate per buffer hint.
>
> We do have to worry about the CPU cache w.r.t I/O coherency (which is a
> different hint) and in that case we have all of concerns that Will
> identified.
>

For mismatched outer cacheability attributes which Will mentioned, I 
was

referring to [1] in android kernel.


I've lost track of the conversation here :/

When the GPU has a buffer mapped with IOMMU_LLC, is the buffer also 
mapped

into the CPU and with what attributes? Rob said "writecombine for
everything" -- does that mean ioremap_wc() / MEMREMAP_WC?



Rob answered this.

Finally, we need to be careful when we use the word "hint" as 
"allocation
hint" has a specific meaning in the architecture, and if we only 
mismatch on

those then we're actually ok. But I think IOMMU_LLC is more than just a
hint, since it actually drives eviction policy (i.e. it enables 
writeback).


Sorry for the pedantry, but I just want to make sure we're all talking
about the same things!



Sorry for the confusion which probably was caused by my mentioning of
android, NWA(no write allocate) is an allocation hint which we can 
ignore

for now as it is not introduced yet in upstream.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


[PATCH] iommu: Add device name to iommu map/unmap trace events

2021-02-09 Thread Sai Prakash Ranjan
IOMMU map/unmap traces become hard to decode i.e., it becomes hard
to associate the map/unmap events with the particular device from
the iova/paddr/size parameters alone when there are multiple
devices attached. So it is useful to add the device name to iommu
trace events which can be used to filter out map/unmap traces for
a particular device when we are debugging iommu faults such as
context faults where we are interested with the map/unmap traces
for a specific device.

Before:
  map: IOMMU: iova=0x00036000 paddr=0x0001164d8000 size=4096
  unmap: IOMMU: iova=0x00036000 size=4096 unmapped_size=4096

After:
  map: IOMMU: dev=1d84000.ufshc iova=0x000fffa88000 
paddr=0x0001063db000 size=4096
  unmap: IOMMU: dev=1d84000.ufshc iova=0x000fffa88000 size=4096 
unmapped_size=4096

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/iommu/iommu.c|  8 +---
 include/linux/iommu.h|  1 +
 include/trace/events/iommu.h | 20 
 3 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d0b0a15dba84..37081b745f38 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1947,8 +1947,10 @@ static int __iommu_attach_device(struct iommu_domain 
*domain,
return -ENODEV;
 
ret = domain->ops->attach_dev(domain, dev);
-   if (!ret)
+   if (!ret) {
trace_attach_device_to_domain(dev);
+   strscpy(domain->dev_name, dev_name(dev), 
sizeof(domain->dev_name));
+   }
return ret;
 }
 
@@ -2440,7 +2442,7 @@ static int __iommu_map(struct iommu_domain *domain, 
unsigned long iova,
if (ret)
iommu_unmap(domain, orig_iova, orig_size - size);
else
-   trace_map(orig_iova, orig_paddr, orig_size);
+   trace_map(orig_iova, orig_paddr, orig_size, domain->dev_name);
 
return ret;
 }
@@ -2523,7 +2525,7 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
unmapped += unmapped_page;
}
 
-   trace_unmap(orig_iova, size, unmapped);
+   trace_unmap(orig_iova, size, unmapped, domain->dev_name);
return unmapped;
 }
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 5e7fe519430a..6064187d9bb6 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -87,6 +87,7 @@ struct iommu_domain {
void *handler_token;
struct iommu_domain_geometry geometry;
void *iova_cookie;
+   char dev_name[32];
 };
 
 enum iommu_cap {
diff --git a/include/trace/events/iommu.h b/include/trace/events/iommu.h
index 72b4582322ff..44e48fb8b677 100644
--- a/include/trace/events/iommu.h
+++ b/include/trace/events/iommu.h
@@ -85,47 +85,51 @@ DEFINE_EVENT(iommu_device_event, detach_device_from_domain,
 
 TRACE_EVENT(map,
 
-   TP_PROTO(unsigned long iova, phys_addr_t paddr, size_t size),
+   TP_PROTO(unsigned long iova, phys_addr_t paddr, size_t size, const char 
*dev_name),
 
-   TP_ARGS(iova, paddr, size),
+   TP_ARGS(iova, paddr, size, dev_name),
 
TP_STRUCT__entry(
__field(u64, iova)
__field(u64, paddr)
__field(size_t, size)
+   __string(dev_name, dev_name)
),
 
TP_fast_assign(
__entry->iova = iova;
__entry->paddr = paddr;
__entry->size = size;
+   __assign_str(dev_name, dev_name);
),
 
-   TP_printk("IOMMU: iova=0x%016llx paddr=0x%016llx size=%zu",
-   __entry->iova, __entry->paddr, __entry->size
+   TP_printk("IOMMU: dev=%s iova=0x%016llx paddr=0x%016llx size=%zu",
+ __get_str(dev_name), __entry->iova, __entry->paddr, 
__entry->size
)
 );
 
 TRACE_EVENT(unmap,
 
-   TP_PROTO(unsigned long iova, size_t size, size_t unmapped_size),
+   TP_PROTO(unsigned long iova, size_t size, size_t unmapped_size, const 
char *dev_name),
 
-   TP_ARGS(iova, size, unmapped_size),
+   TP_ARGS(iova, size, unmapped_size, dev_name),
 
TP_STRUCT__entry(
__field(u64, iova)
__field(size_t, size)
__field(size_t, unmapped_size)
+   __string(dev_name, dev_name)
),
 
TP_fast_assign(
__entry->iova = iova;
__entry->size = size;
__entry->unmapped_size = unmapped_size;
+   __assign_str(dev_name, dev_name);
),
 
-   TP_printk("IOMMU: iova=0x%016llx size=%zu unmapped_size=%zu",
-   __entry->iova, __entry->size, __entry->unmapped_size
+   TP_printk("IOMMU: dev=%s iova=0x%016llx size=%zu unmapped_size=%zu",
+ __get_str(dev_name), __entry->iova, __entry->size, 
__entry->unmapped_size
)
 );
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 3/9] arm64: dts: qcom: sc7280: Add device tree node for LLCC

2021-02-25 Thread Sai Prakash Ranjan
Add a DT node for Last level cache (aka. system cache)
controller which provides control over the last level
cache present on SC7280 SoC.

Signed-off-by: Sai Prakash Ranjan 
---
 arch/arm64/boot/dts/qcom/sc7280.dtsi | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi 
b/arch/arm64/boot/dts/qcom/sc7280.dtsi
index 3b86052b78bc..aeeb47c70c3a 100644
--- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
@@ -338,6 +338,13 @@ uart5: serial@994000 {
};
};
 
+   system-cache-controller@920 {
+   compatible = "qcom,sc7280-llcc";
+   reg = <0 0x0920 0 0xd>, <0 0x0960 0 
0x5>;
+   reg-names = "llcc_base", "llcc_broadcast_base";
+   interrupts = ;
+   };
+
pdc: interrupt-controller@b22 {
compatible = "qcom,sc7280-pdc", "qcom,pdc";
reg = <0 0xb22 0 0x3>;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 0/9] qcom/sc7280: Enable various hardware blocks on SC7280 SoC

2021-02-25 Thread Sai Prakash Ranjan
This series enables various hardware blocks such as LLCC, IPCC, AOSS QMP
and Coresight on SC7280 SoC.

This series is dependent on the base support added for SC7280 in [1].

[1] https://lore.kernel.org/patchwork/cover/1379842/

Sai Prakash Ranjan (9):
  dt-bindings: arm: msm: Add LLCC for SC7280
  soc: qcom: llcc: Add configuration data for SC7280
  arm64: dts: qcom: sc7280: Add device tree node for LLCC
  dt-bindings: mailbox: qcom-ipcc: Add compatible for SC7280
  arm64: dts: qcom: sc7280: Add IPCC for SC7280 SoC
  dt-bindings: soc: qcom: aoss: Add SC7280 compatible
  soc: qcom: aoss: Add AOSS QMP support for SC7280
  arm64: dts: qcom: sc7280: Add AOSS QMP node
  arm64: dts: qcom: sc7280: Add Coresight support

 .../bindings/arm/msm/qcom,llcc.yaml   |   1 +
 .../bindings/mailbox/qcom-ipcc.yaml   |   1 +
 .../bindings/soc/qcom/qcom,aoss-qmp.txt   |   1 +
 arch/arm64/boot/dts/qcom/sc7280.dtsi  | 520 ++
 drivers/soc/qcom/llcc-qcom.c  |  19 +
 drivers/soc/qcom/qcom_aoss.c  |   1 +
 6 files changed, 543 insertions(+)


base-commit: d79b47c59576a51d8e288a6b98b75ccf4afb8acd
prerequisite-patch-id: d8babdd3c8a9923360af342f3d8d9876820272e5
prerequisite-patch-id: 5757e07e4336d773d402769d09106924962ce31b
prerequisite-patch-id: 9b21eb51aa86619f5695a511c65c9236e3bc0f2b
prerequisite-patch-id: 2f834cc892f7f9109cbf32a87d504ba27b64a5df
prerequisite-patch-id: 14b1185357703d750c3411a16e97675489ca7dde
prerequisite-patch-id: 55c143f21b646c18da921a62bbd2801a5df38c8f
prerequisite-patch-id: 66f4c58aff2f1a7283b0103590ff82384907bae3
prerequisite-patch-id: 75e73e6b13ab91ed5e3a96b59957aa5e867d65ea
prerequisite-patch-id: eb46845b4f9eb3706a26911042c2865a58577198
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 2/9] soc: qcom: llcc: Add configuration data for SC7280

2021-02-25 Thread Sai Prakash Ranjan
Add LLCC configuration data for SC7280 SoC.

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/soc/qcom/llcc-qcom.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/soc/qcom/llcc-qcom.c b/drivers/soc/qcom/llcc-qcom.c
index 8403a77b59fe..15a36dcab990 100644
--- a/drivers/soc/qcom/llcc-qcom.c
+++ b/drivers/soc/qcom/llcc-qcom.c
@@ -109,6 +109,18 @@ static const struct llcc_slice_config sc7180_data[] =  {
{ LLCC_GPU,  12, 128, 1, 0, 0xf, 0x0, 0, 0, 0, 1, 0 },
 };
 
+static const struct llcc_slice_config sc7280_data[] =  {
+   { LLCC_CPUSS,1,  768, 1, 0, 0x3f, 0x0, 0, 0, 0, 1, 1, 0},
+   { LLCC_MDMHPGRW, 7,  512, 2, 1, 0x3f, 0x0, 0, 0, 0, 1, 0, 0},
+   { LLCC_CMPT, 10, 768, 1, 1, 0x3f, 0x0, 0, 0, 0, 1, 0, 0},
+   { LLCC_GPUHTW,   11, 256, 1, 1, 0x3f, 0x0, 0, 0, 0, 1, 0, 0},
+   { LLCC_GPU,  12, 512, 1, 0, 0x3f, 0x0, 0, 0, 0, 1, 0, 0},
+   { LLCC_MMUHWT,   13, 256, 1, 1, 0x3f, 0x0, 0, 0, 0, 1, 1, 0},
+   { LLCC_MDMPNG,   21, 768, 0, 1, 0x3f, 0x0, 0, 0, 0, 1, 0, 0},
+   { LLCC_WLHW, 24, 256, 1, 1, 0x3f, 0x0, 0, 0, 0, 1, 0, 0},
+   { LLCC_MODPE,29, 64,  1, 1, 0x3f, 0x0, 0, 0, 0, 1, 0, 0},
+};
+
 static const struct llcc_slice_config sdm845_data[] =  {
{ LLCC_CPUSS,1,  2816, 1, 0, 0xffc, 0x2,   0, 0, 1, 1, 1 },
{ LLCC_VIDSC0,   2,  512,  2, 1, 0x0,   0x0f0, 0, 0, 1, 1, 0 },
@@ -179,6 +191,12 @@ static const struct qcom_llcc_config sc7180_cfg = {
.need_llcc_cfg  = true,
 };
 
+static const struct qcom_llcc_config sc7280_cfg = {
+   .sct_data   = sc7280_data,
+   .size   = ARRAY_SIZE(sc7280_data),
+   .need_llcc_cfg  = true,
+};
+
 static const struct qcom_llcc_config sdm845_cfg = {
.sct_data   = sdm845_data,
.size   = ARRAY_SIZE(sdm845_data),
@@ -606,6 +624,7 @@ static int qcom_llcc_probe(struct platform_device *pdev)
 
 static const struct of_device_id qcom_llcc_of_match[] = {
{ .compatible = "qcom,sc7180-llcc", .data = &sc7180_cfg },
+   { .compatible = "qcom,sc7280-llcc", .data = &sc7280_cfg },
{ .compatible = "qcom,sdm845-llcc", .data = &sdm845_cfg },
{ .compatible = "qcom,sm8150-llcc", .data = &sm8150_cfg },
{ .compatible = "qcom,sm8250-llcc", .data = &sm8250_cfg },
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 1/9] dt-bindings: arm: msm: Add LLCC for SC7280

2021-02-25 Thread Sai Prakash Ranjan
Add LLCC compatible for SC7280 SoC.

Signed-off-by: Sai Prakash Ranjan 
---
 Documentation/devicetree/bindings/arm/msm/qcom,llcc.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/arm/msm/qcom,llcc.yaml 
b/Documentation/devicetree/bindings/arm/msm/qcom,llcc.yaml
index c299dc907f6c..62fcbd883392 100644
--- a/Documentation/devicetree/bindings/arm/msm/qcom,llcc.yaml
+++ b/Documentation/devicetree/bindings/arm/msm/qcom,llcc.yaml
@@ -22,6 +22,7 @@ properties:
   compatible:
 enum:
   - qcom,sc7180-llcc
+  - qcom,sc7280-llcc
   - qcom,sdm845-llcc
   - qcom,sm8150-llcc
   - qcom,sm8250-llcc
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 5/9] arm64: dts: qcom: sc7280: Add IPCC for SC7280 SoC

2021-02-25 Thread Sai Prakash Ranjan
Add the IPCC DT node which is used to send and receive IPC
signals with remoteprocs for SC7280 SoC.

Cc: Manivannan Sadhasivam 
Signed-off-by: Sai Prakash Ranjan 
---
 arch/arm64/boot/dts/qcom/sc7280.dtsi | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi 
b/arch/arm64/boot/dts/qcom/sc7280.dtsi
index aeeb47c70c3a..65c1e0f2fb56 100644
--- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 / {
@@ -315,6 +316,15 @@ gcc: clock-controller@10 {
#power-domain-cells = <1>;
};
 
+   ipcc: mailbox@408000 {
+   compatible = "qcom,sc7280-ipcc", "qcom,ipcc";
+   reg = <0 0x00408000 0 0x1000>;
+   interrupts = ;
+   interrupt-controller;
+   #interrupt-cells = <3>;
+   #mbox-cells = <2>;
+   };
+
qupv3_id_0: geniqup@9c {
compatible = "qcom,geni-se-qup";
reg = <0 0x009c 0 0x2000>;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 8/9] arm64: dts: qcom: sc7280: Add AOSS QMP node

2021-02-25 Thread Sai Prakash Ranjan
Add a DT node for the AOSS QMP on SC7280 SoC.

Signed-off-by: Sai Prakash Ranjan 
---
 arch/arm64/boot/dts/qcom/sc7280.dtsi | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi 
b/arch/arm64/boot/dts/qcom/sc7280.dtsi
index 65c1e0f2fb56..cbd567ccc04e 100644
--- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 / {
@@ -368,6 +369,19 @@ pdc: interrupt-controller@b22 {
interrupt-controller;
};
 
+   aoss_qmp: qmp@c30 {
+   compatible = "qcom,sc7280-aoss-qmp";
+   reg = <0 0x0c30 0 0x10>;
+   interrupts-extended = <&ipcc IPCC_CLIENT_AOP
+IPCC_MPROC_SIGNAL_GLINK_QMP
+IRQ_TYPE_EDGE_RISING>;
+   mboxes = <&ipcc IPCC_CLIENT_AOP
+   IPCC_MPROC_SIGNAL_GLINK_QMP>;
+
+   #clock-cells = <0>;
+   #power-domain-cells = <1>;
+   };
+
spmi_bus: qcom,spmi@c44 {
compatible = "qcom,spmi-pmic-arb";
reg = <0 0x0c44 0 0x1100>,
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 6/9] dt-bindings: soc: qcom: aoss: Add SC7280 compatible

2021-02-25 Thread Sai Prakash Ranjan
Add SC7280 AOSS QMP compatible to the list of possible bindings.

Signed-off-by: Sai Prakash Ranjan 
---
 Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt 
b/Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt
index 19c059e44681..783dc81b0f26 100644
--- a/Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt
+++ b/Documentation/devicetree/bindings/soc/qcom/qcom,aoss-qmp.txt
@@ -17,6 +17,7 @@ power-domains.
Value type: 
Definition: must be one of:
"qcom,sc7180-aoss-qmp"
+   "qcom,sc7280-aoss-qmp"
"qcom,sdm845-aoss-qmp"
"qcom,sm8150-aoss-qmp"
"qcom,sm8250-aoss-qmp"
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 4/9] dt-bindings: mailbox: qcom-ipcc: Add compatible for SC7280

2021-02-25 Thread Sai Prakash Ranjan
Add IPCC compatible for SC7280 SoC.

Cc: Manivannan Sadhasivam 
Signed-off-by: Sai Prakash Ranjan 
---
 Documentation/devicetree/bindings/mailbox/qcom-ipcc.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/mailbox/qcom-ipcc.yaml 
b/Documentation/devicetree/bindings/mailbox/qcom-ipcc.yaml
index 168beeb7e9f7..06419543d235 100644
--- a/Documentation/devicetree/bindings/mailbox/qcom-ipcc.yaml
+++ b/Documentation/devicetree/bindings/mailbox/qcom-ipcc.yaml
@@ -24,6 +24,7 @@ properties:
   compatible:
 items:
   - enum:
+  - qcom,sc7280-ipcc
   - qcom,sm8250-ipcc
   - const: qcom,ipcc
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 7/9] soc: qcom: aoss: Add AOSS QMP support for SC7280

2021-02-25 Thread Sai Prakash Ranjan
Add AOSS QMP support for SC7280 SoC.

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/soc/qcom/qcom_aoss.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/soc/qcom/qcom_aoss.c b/drivers/soc/qcom/qcom_aoss.c
index 53acb9423bd6..934fcc4d2b05 100644
--- a/drivers/soc/qcom/qcom_aoss.c
+++ b/drivers/soc/qcom/qcom_aoss.c
@@ -597,6 +597,7 @@ static int qmp_remove(struct platform_device *pdev)
 
 static const struct of_device_id qmp_dt_match[] = {
{ .compatible = "qcom,sc7180-aoss-qmp", },
+   { .compatible = "qcom,sc7280-aoss-qmp", },
{ .compatible = "qcom,sdm845-aoss-qmp", },
{ .compatible = "qcom,sm8150-aoss-qmp", },
{ .compatible = "qcom,sm8250-aoss-qmp", },
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 9/9] arm64: dts: qcom: sc7280: Add Coresight support

2021-02-25 Thread Sai Prakash Ranjan
Add coresight components found on SC7280 SoC.

Cc: Mathieu Poirier 
Cc: Suzuki K Poulose 
Cc: Mike Leach 
Cc: Leo Yan 
Signed-off-by: Sai Prakash Ranjan 
---
 arch/arm64/boot/dts/qcom/sc7280.dtsi | 489 +++
 1 file changed, 489 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi 
b/arch/arm64/boot/dts/qcom/sc7280.dtsi
index cbd567ccc04e..3245a18fa2a1 100644
--- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
@@ -349,6 +349,495 @@ uart5: serial@994000 {
};
};
 
+   stm@6002000 {
+   compatible = "arm,coresight-stm", "arm,primecell";
+   reg = <0 0x06002000 0 0x1000>,
+ <0 0x1628 0 0x18>;
+   reg-names = "stm-base", "stm-stimulus-base";
+
+   clocks = <&aoss_qmp>;
+   clock-names = "apb_pclk";
+
+   out-ports {
+   port {
+   stm_out: endpoint {
+   remote-endpoint = 
<&funnel0_in7>;
+   };
+   };
+   };
+   };
+
+   funnel@6041000 {
+   compatible = "arm,coresight-dynamic-funnel", 
"arm,primecell";
+   reg = <0 0x06041000 0 0x1000>;
+
+   clocks = <&aoss_qmp>;
+   clock-names = "apb_pclk";
+
+   out-ports {
+   port {
+   funnel0_out: endpoint {
+   remote-endpoint = 
<&merge_funnel_in0>;
+   };
+   };
+   };
+
+   in-ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@7 {
+   reg = <7>;
+   funnel0_in7: endpoint {
+   remote-endpoint = <&stm_out>;
+   };
+   };
+   };
+   };
+
+   funnel@6042000 {
+   compatible = "arm,coresight-dynamic-funnel", 
"arm,primecell";
+   reg = <0 0x06042000 0 0x1000>;
+
+   clocks = <&aoss_qmp>;
+   clock-names = "apb_pclk";
+
+   out-ports {
+   port {
+   funnel1_out: endpoint {
+   remote-endpoint = 
<&merge_funnel_in1>;
+   };
+   };
+   };
+
+   in-ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@4 {
+   reg = <4>;
+   funnel1_in4: endpoint {
+   remote-endpoint = 
<&apss_merge_funnel_out>;
+   };
+   };
+   };
+   };
+
+   funnel@6045000 {
+   compatible = "arm,coresight-dynamic-funnel", 
"arm,primecell";
+   reg = <0 0x06045000 0 0x1000>;
+
+   clocks = <&aoss_qmp>;
+   clock-names = "apb_pclk";
+
+   out-ports {
+   port {
+   merge_funnel_out: endpoint {
+   remote-endpoint = 
<&swao_funnel_in>;
+   };
+   };
+   };
+
+   in-ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0>;
+   merge_funnel_in0: endpoint {
+   remote-endpoint = 
<&funn

[PATCH 0/2] iommu/arm-smmu-qcom: Add SC7280 support

2021-02-25 Thread Sai Prakash Ranjan
Patch 1 adds the sc7280 smmu compatible.
Patch 2 moves the adreno smmu check before apss smmu to enable
adreno smmu specific implementation.

Sai Prakash Ranjan (2):
  iommu/arm-smmu-qcom: Add SC7280 SMMU compatible
  iommu/arm-smmu-qcom: Move the adreno smmu specific impl earlier

 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)


base-commit: 7060377ce06f9cd3ed6274c0f2310463feb5baec
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 2/2] iommu/arm-smmu-qcom: Move the adreno smmu specific impl earlier

2021-02-25 Thread Sai Prakash Ranjan
Adreno(GPU) SMMU and APSS(Application Processor SubSystem) SMMU
both implement "arm,mmu-500" in some QTI SoCs and to run through
adreno smmu specific implementation such as enabling split pagetables
support, we need to match the "qcom,adreno-smmu" compatible first
before apss smmu or else we will be running apps smmu implementation
for adreno smmu and the additional features for adreno smmu is never
set. For ex: we have "qcom,sc7280-smmu-500" compatible for both apps
and adreno smmu implementing "arm,mmu-500", so the adreno smmu
implementation is never reached because the current sequence checks
for apps smmu compatible(qcom,sc7280-smmu-500) first and runs that
specific impl and we never reach adreno smmu specific implementation.

Suggested-by: Akhil P Oommen 
Signed-off-by: Sai Prakash Ranjan 
---

Its either this or we add a new compatible for adreno smmu implementing
arm,mmu-500 like "qcom,sc7280-adreno-smmu-500".

---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index bea3ee0dabc2..7d0fc2c8e72f 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -345,11 +345,11 @@ struct arm_smmu_device *qcom_smmu_impl_init(struct 
arm_smmu_device *smmu)
 {
const struct device_node *np = smmu->dev->of_node;
 
-   if (of_match_node(qcom_smmu_impl_of_match, np))
-   return qcom_smmu_create(smmu, &qcom_smmu_impl);
-
if (of_device_is_compatible(np, "qcom,adreno-smmu"))
return qcom_smmu_create(smmu, &qcom_adreno_smmu_impl);
 
+   if (of_match_node(qcom_smmu_impl_of_match, np))
+   return qcom_smmu_create(smmu, &qcom_smmu_impl);
+
return smmu;
 }
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 1/2] iommu/arm-smmu-qcom: Add SC7280 SMMU compatible

2021-02-25 Thread Sai Prakash Ranjan
Add compatible for SC7280 SMMU to use the Qualcomm Technologies, Inc.
specific implementation.

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 98b3a1c2a181..bea3ee0dabc2 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -166,6 +166,7 @@ static const struct of_device_id 
qcom_smmu_client_of_match[] __maybe_unused = {
{ .compatible = "qcom,mdss" },
{ .compatible = "qcom,sc7180-mdss" },
{ .compatible = "qcom,sc7180-mss-pil" },
+   { .compatible = "qcom,sc7280-mdss" },
{ .compatible = "qcom,sc8180x-mdss" },
{ .compatible = "qcom,sdm845-mdss" },
{ .compatible = "qcom,sdm845-mss-pil" },
@@ -330,6 +331,7 @@ static struct arm_smmu_device *qcom_smmu_create(struct 
arm_smmu_device *smmu,
 static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
{ .compatible = "qcom,msm8998-smmu-v2" },
{ .compatible = "qcom,sc7180-smmu-500" },
+   { .compatible = "qcom,sc7280-smmu-500" },
{ .compatible = "qcom,sc8180x-smmu-500" },
{ .compatible = "qcom,sdm630-smmu-v2" },
{ .compatible = "qcom,sdm845-smmu-500" },
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH 8/9] arm64: dts: qcom: sc7280: Add AOSS QMP node

2021-02-25 Thread Sai Prakash Ranjan

On 2021-02-26 01:11, Stephen Boyd wrote:

Quoting Sai Prakash Ranjan (2021-02-25 01:30:24)

Add a DT node for the AOSS QMP on SC7280 SoC.

Signed-off-by: Sai Prakash Ranjan 
---
 arch/arm64/boot/dts/qcom/sc7280.dtsi | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi 
b/arch/arm64/boot/dts/qcom/sc7280.dtsi

index 65c1e0f2fb56..cbd567ccc04e 100644
--- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 / {
@@ -368,6 +369,19 @@ pdc: interrupt-controller@b22 {
interrupt-controller;
};

+   aoss_qmp: qmp@c30 {


power-domain-controller@c30? power-controller@c30?



Its an AOSS message RAM and all other SM* SoCs have as qmp@
and the dt binding as well, I see only SM8150 with power-controller,
that should probably be fixed?


+   compatible = "qcom,sc7280-aoss-qmp";
+   reg = <0 0x0c30 0 0x10>;
+   interrupts-extended = <&ipcc IPCC_CLIENT_AOP
+
IPCC_MPROC_SIGNAL_GLINK_QMP
+
IRQ_TYPE_EDGE_RISING>;

+   mboxes = <&ipcc IPCC_CLIENT_AOP
+   IPCC_MPROC_SIGNAL_GLINK_QMP>;
+
+   #clock-cells = <0>;
+   #power-domain-cells = <1>;
+   };
+
spmi_bus: qcom,spmi@c44 {


Ick, should be spmi@



Not introduced by this patch but I'll pass on the comment.


compatible = "qcom,spmi-pmic-arb";
reg = <0 0x0c44 0 0x1100>,



Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 3/9] arm64: dts: qcom: sc7280: Add device tree node for LLCC

2021-02-26 Thread Sai Prakash Ranjan

On 2021-02-26 01:07, Stephen Boyd wrote:

Quoting Sai Prakash Ranjan (2021-02-25 01:30:19)

Add a DT node for Last level cache (aka. system cache)
controller which provides control over the last level
cache present on SC7280 SoC.

Signed-off-by: Sai Prakash Ranjan 
---


Reviewed-by: Stephen Boyd 

Should add system-cache-controller to the devicetree spec. Or just use
cache-controller for the node name.


This was as per discussion in [1][2] where dt-schema throws an error
since it expects cache-level to be associated with cache-controller.

[1] 
https://lore.kernel.org/lkml/5dcd8588.1c69fb81.2528a.3...@mx.google.com/
[2] 
https://lore.kernel.org/lkml/cover.1573814758.git.saiprakash.ran...@codeaurora.org/


Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 2/2] iommu/arm-smmu-qcom: Move the adreno smmu specific impl earlier

2021-02-26 Thread Sai Prakash Ranjan

On 2021-02-25 23:36, Jordan Crouse wrote:

On Thu, Feb 25, 2021 at 03:54:10PM +0530, Sai Prakash Ranjan wrote:

Adreno(GPU) SMMU and APSS(Application Processor SubSystem) SMMU
both implement "arm,mmu-500" in some QTI SoCs and to run through
adreno smmu specific implementation such as enabling split pagetables
support, we need to match the "qcom,adreno-smmu" compatible first
before apss smmu or else we will be running apps smmu implementation
for adreno smmu and the additional features for adreno smmu is never
set. For ex: we have "qcom,sc7280-smmu-500" compatible for both apps
and adreno smmu implementing "arm,mmu-500", so the adreno smmu
implementation is never reached because the current sequence checks
for apps smmu compatible(qcom,sc7280-smmu-500) first and runs that
specific impl and we never reach adreno smmu specific implementation.

Suggested-by: Akhil P Oommen 
Signed-off-by: Sai Prakash Ranjan 
---

Its either this or we add a new compatible for adreno smmu 
implementing

arm,mmu-500 like "qcom,sc7280-adreno-smmu-500".

---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c

index bea3ee0dabc2..7d0fc2c8e72f 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -345,11 +345,11 @@ struct arm_smmu_device 
*qcom_smmu_impl_init(struct arm_smmu_device *smmu)

 {
const struct device_node *np = smmu->dev->of_node;

-   if (of_match_node(qcom_smmu_impl_of_match, np))
-   return qcom_smmu_create(smmu, &qcom_smmu_impl);
-
if (of_device_is_compatible(np, "qcom,adreno-smmu"))
return qcom_smmu_create(smmu, &qcom_adreno_smmu_impl);

+   if (of_match_node(qcom_smmu_impl_of_match, np))
+   return qcom_smmu_create(smmu, &qcom_smmu_impl);
+


It would be good to add a comment here explaining the order here so we
don't accidentally reorganize ourselves back into a problem later.



Sure its better, will add it.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


[PATCHv2 0/2] iommu/arm-smmu-qcom: Add SC7280 support

2021-02-26 Thread Sai Prakash Ranjan
Patch 1 adds the sc7280 smmu compatible.
Patch 2 moves the adreno smmu check before apss smmu to enable
adreno smmu specific implementation.

Changes in v2:
 * Add a comment to make sure this order is not changed in future (Jordan)

Sai Prakash Ranjan (2):
  iommu/arm-smmu-qcom: Add SC7280 SMMU compatible
  iommu/arm-smmu-qcom: Move the adreno smmu specific impl earlier

 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)


base-commit: 7060377ce06f9cd3ed6274c0f2310463feb5baec
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCHv2 1/2] iommu/arm-smmu-qcom: Add SC7280 SMMU compatible

2021-02-26 Thread Sai Prakash Ranjan
Add compatible for SC7280 SMMU to use the Qualcomm Technologies, Inc.
specific implementation.

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index 98b3a1c2a181..bea3ee0dabc2 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -166,6 +166,7 @@ static const struct of_device_id 
qcom_smmu_client_of_match[] __maybe_unused = {
{ .compatible = "qcom,mdss" },
{ .compatible = "qcom,sc7180-mdss" },
{ .compatible = "qcom,sc7180-mss-pil" },
+   { .compatible = "qcom,sc7280-mdss" },
{ .compatible = "qcom,sc8180x-mdss" },
{ .compatible = "qcom,sdm845-mdss" },
{ .compatible = "qcom,sdm845-mss-pil" },
@@ -330,6 +331,7 @@ static struct arm_smmu_device *qcom_smmu_create(struct 
arm_smmu_device *smmu,
 static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
{ .compatible = "qcom,msm8998-smmu-v2" },
{ .compatible = "qcom,sc7180-smmu-500" },
+   { .compatible = "qcom,sc7280-smmu-500" },
{ .compatible = "qcom,sc8180x-smmu-500" },
{ .compatible = "qcom,sdm630-smmu-v2" },
{ .compatible = "qcom,sdm845-smmu-500" },
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCHv2 2/2] iommu/arm-smmu-qcom: Move the adreno smmu specific impl earlier

2021-02-26 Thread Sai Prakash Ranjan
Adreno(GPU) SMMU and APSS(Application Processor SubSystem) SMMU
both implement "arm,mmu-500" in some QTI SoCs and to run through
adreno smmu specific implementation such as enabling split pagetables
support, we need to match the "qcom,adreno-smmu" compatible first
before apss smmu or else we will be running apps smmu implementation
for adreno smmu and the additional features for adreno smmu is never
set. For ex: we have "qcom,sc7280-smmu-500" compatible for both apps
and adreno smmu implementing "arm,mmu-500", so the adreno smmu
implementation is never reached because the current sequence checks
for apps smmu compatible(qcom,sc7280-smmu-500) first and runs that
specific impl and we never reach adreno smmu specific implementation.

Suggested-by: Akhil P Oommen 
Signed-off-by: Sai Prakash Ranjan 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
index bea3ee0dabc2..03f048aebb80 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
@@ -345,11 +345,17 @@ struct arm_smmu_device *qcom_smmu_impl_init(struct 
arm_smmu_device *smmu)
 {
const struct device_node *np = smmu->dev->of_node;
 
-   if (of_match_node(qcom_smmu_impl_of_match, np))
-   return qcom_smmu_create(smmu, &qcom_smmu_impl);
-
+   /*
+* Do not change this order of implementation, i.e., first adreno
+* smmu impl and then apss smmu since we can have both implementing
+* arm,mmu-500 in which case we will miss setting adreno smmu specific
+* features if the order is changed.
+*/
if (of_device_is_compatible(np, "qcom,adreno-smmu"))
return qcom_smmu_create(smmu, &qcom_adreno_smmu_impl);
 
+   if (of_match_node(qcom_smmu_impl_of_match, np))
+   return qcom_smmu_create(smmu, &qcom_smmu_impl);
+
return smmu;
 }
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCHv2 2/2] iommu/arm-smmu-qcom: Move the adreno smmu specific impl earlier

2021-02-27 Thread Sai Prakash Ranjan
Hi Bjorn,

On 2021-02-27 00:44, Bjorn Andersson wrote:
> On Fri 26 Feb 12:23 CST 2021, Rob Clark wrote:
> 
> 
> The current logic picks one of:
> 1) is the compatible mentioned in qcom_smmu_impl_of_match[]
> 2) is the compatible an adreno
> 3) no quirks needed
> 
> The change flips the order of these, so the only way I can see this
> change affecting things is if we expected a match on #2, but we got one
> on #1.
> 
> Which implies that the instance that we want to act according to the
> adreno impl was listed in qcom_smmu_impl_of_match[] - which either is
> wrong, or there's a single instance that needs both behaviors.
> 
> (And I believe Jordan's answer confirms the latter - there's a single
> SMMU instance that needs all them quirks at once)
> 

Let me go through the problem statement in case my commit message wasn't
clear. There are two SMMUs (APSS and GPU) on SC7280 and both are SMMU500
(ARM SMMU IP).

APSS SMMU compatible - ("qcom,sc7280-smmu-500", "arm,mmu-500")
GPU SMMU compatible - ("qcom,sc7280-smmu-500", "qcom,adreno-smmu", 
"arm,mmu-500")

Now if we take SC7180 as an example, GPU SMMU was QSMMU(QCOM SMMU IP)
and APSS SMMU was SMMU500(ARM SMMU IP).

APSS SMMU compatible - ("qcom,sc7180-smmu-500", "arm,mmu-500")
GPU SMMU compatible - ("qcom,sc7180-smmu-v2", "qcom,adreno-smmu", 
"qcom,smmu-v2")

Current code sequence without this patch,

if (of_match_node(qcom_smmu_impl_of_match, np))
 return qcom_smmu_create(smmu, &qcom_smmu_impl);

if (of_device_is_compatible(np, "qcom,adreno-smmu"))
return qcom_smmu_create(smmu, &qcom_adreno_smmu_impl);

Now if we look at the compatible for SC7180, there is no problem because
the APSS SMMU will match the one in qcom_smmu_impl_of_match[] and GPU SMMU
will match "qcom,adreno-smmu" because the compatible strings are different.
But for SC7280, both the APSS SMMU and GPU SMMU 
compatible("qcom,sc7280-smmu-500")
are same. So GPU SMMU will match with the one in qcom_smmu_impl_of_match[]
i.e.., "qcom,sc7280-smmu-500" which functionally doesn't cause any problem
but we will miss settings for split pagetables which are part of GPU SMMU
specific implementation.

We can avoid this with yet another new compatible for GPU SMMU something like
"qcom,sc7280-adreno-smmu-500" but since we can handle this easily in the
driver and since the IPs are same, meaning if there was a hardware quirk
required, then we would need to apply to both of them and would this additional
compatible be of any help?

Coming to the part of quirks now, you are right saying both SMMUs will need
to have the same quirks in SC7280 and similar others where both are based on
same IPs but those should probably be *hardware quirks* and if they are
software based like the S2CR quirk depending on the firmware, then it might
not be applicable to both. In case if it is applicable, then as Jordan 
mentioned,
we can add the same quirks in GPU SMMU implementation.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 8/9] arm64: dts: qcom: sc7280: Add AOSS QMP node

2021-02-27 Thread Sai Prakash Ranjan

On 2021-02-27 00:16, Stephen Boyd wrote:

Quoting Sai Prakash Ranjan (2021-02-25 23:51:00)

On 2021-02-26 01:11, Stephen Boyd wrote:
> Quoting Sai Prakash Ranjan (2021-02-25 01:30:24)
>> Add a DT node for the AOSS QMP on SC7280 SoC.
>>
>> Signed-off-by: Sai Prakash Ranjan 
>> ---
>>  arch/arm64/boot/dts/qcom/sc7280.dtsi | 14 ++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi
>> b/arch/arm64/boot/dts/qcom/sc7280.dtsi
>> index 65c1e0f2fb56..cbd567ccc04e 100644
>> --- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
>> +++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
>> @@ -9,6 +9,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>
>>  / {
>> @@ -368,6 +369,19 @@ pdc: interrupt-controller@b22 {
>> interrupt-controller;
>> };
>>
>> +   aoss_qmp: qmp@c30 {
>
> power-domain-controller@c30? power-controller@c30?
>

Its an AOSS message RAM and all other SM* SoCs have as qmp@
and the dt binding as well, I see only SM8150 with power-controller,
that should probably be fixed?


Node name should be generic while still being meaningful. Nobody knows
what qmp is, but power-controller makes sense. Can you fix this and the
others to be power-controller?



Ok makes sense, I will post changing others as well and see if we get
any comments there.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH 3/9] arm64: dts: qcom: sc7280: Add device tree node for LLCC

2021-02-27 Thread Sai Prakash Ranjan

On 2021-02-27 00:15, Stephen Boyd wrote:

Quoting Sai Prakash Ranjan (2021-02-26 00:04:27)

On 2021-02-26 01:07, Stephen Boyd wrote:
> Quoting Sai Prakash Ranjan (2021-02-25 01:30:19)
>> Add a DT node for Last level cache (aka. system cache)
>> controller which provides control over the last level
>> cache present on SC7280 SoC.
>>
>> Signed-off-by: Sai Prakash Ranjan 
>> ---
>
> Reviewed-by: Stephen Boyd 
>
> Should add system-cache-controller to the devicetree spec. Or just use
> cache-controller for the node name.

This was as per discussion in [1][2] where dt-schema throws an error
since it expects cache-level to be associated with cache-controller.



Ah right. Can you add system-cache-controller to the dt spec?


Sure, I'll add it. Hopefully that won't have to block this change?
Because I might need some time to get permissions to add it there.

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


[PATCH 1/2] coresight: tmc: Add enable flag to indicate the status of ETR/ETF

2020-06-01 Thread Sai Prakash Ranjan
Add a flag to check whether TMC ETR/ETF is enabled or not.
This is later used in shutdown callback to determine if
we require to disable ETR/ETF.

Signed-off-by: Sai Prakash Ranjan 
---
 drivers/hwtracing/coresight/coresight-tmc.c | 2 ++
 drivers/hwtracing/coresight/coresight-tmc.h | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight-tmc.c 
b/drivers/hwtracing/coresight/coresight-tmc.c
index 39fba1d16e6e..5a271ebc4585 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -62,11 +62,13 @@ void tmc_flush_and_stop(struct tmc_drvdata *drvdata)
 
 void tmc_enable_hw(struct tmc_drvdata *drvdata)
 {
+   drvdata->enable = true;
writel_relaxed(TMC_CTL_CAPT_EN, drvdata->base + TMC_CTL);
 }
 
 void tmc_disable_hw(struct tmc_drvdata *drvdata)
 {
+   drvdata->enable = false;
writel_relaxed(0x0, drvdata->base + TMC_CTL);
 }
 
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h 
b/drivers/hwtracing/coresight/coresight-tmc.h
index 71de978575f3..d156860495c7 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -184,6 +184,7 @@ struct etr_buf {
  * @idr_mutex: Access serialisation for idr.
  * @sysfs_buf: SYSFS buffer for ETR.
  * @perf_buf:  PERF buffer for ETR.
+ * @enable:Indicates whether ETR/ETF is enabled.
  */
 struct tmc_drvdata {
void __iomem*base;
@@ -207,6 +208,7 @@ struct tmc_drvdata {
struct mutexidr_mutex;
struct etr_buf  *sysfs_buf;
struct etr_buf  *perf_buf;
+   boolenable;
 };
 
 struct etr_buf_operations {
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



  1   2   3   4   5   6   7   8   >