Re: [PATCH v2 00/11] LSM documentation update
The rest of the warnings are about undocumented hooks. This patchset
fixes the existing documentation. I will try to document the hooks from
warnings in a separate patch. Some of the hooks are trivial enough, but
others require me digging into the code and mailing lists. Can't promise
to do it quickly.

On 27.02.2019 1:09, Kees Cook wrote:
> If you want more work, I do notice the following warnings are still present:
Re: [PATCH v2 00/11] LSM documentation update
On Wed, Feb 27, 2019 at 7:10 AM Denis Efremov wrote:
> The rest of the warnings are about undocumented hooks. This patchset
> fixes the existing documentation. I will try to document the hooks from
> warnings in a separate patch. Some of the hooks are trivial enough, but
> others require me digging into the code and mailing lists. Can't promise
> to do it quickly.

No worries! What you've added already helps a lot. :)

--
Kees Cook
[PATCH v3 2/2] Add selftests for module build using in-kernel headers
This test tries to build a module successfully using the in-kernel
headers found in /proc/kheaders.tar.xz.

Verified pass and fail scenarios by running:
make -C tools/testing/selftests TARGETS=kheaders run_tests

Signed-off-by: Joel Fernandes (Google)
---
 tools/testing/selftests/Makefile              |  1 +
 tools/testing/selftests/kheaders/Makefile     |  5 +
 tools/testing/selftests/kheaders/config       |  1 +
 .../kheaders/run_kheaders_modbuild.sh         | 18 +
 .../selftests/kheaders/testmod/Makefile       |  3 +++
 .../testing/selftests/kheaders/testmod/test.c | 20 +++
 6 files changed, 48 insertions(+)
 create mode 100644 tools/testing/selftests/kheaders/Makefile
 create mode 100644 tools/testing/selftests/kheaders/config
 create mode 100755 tools/testing/selftests/kheaders/run_kheaders_modbuild.sh
 create mode 100644 tools/testing/selftests/kheaders/testmod/Makefile
 create mode 100644 tools/testing/selftests/kheaders/testmod/test.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 400ee81a3043..5a9287fddd0d 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -20,6 +20,7 @@ TARGETS += intel_pstate
 TARGETS += ipc
 TARGETS += ir
 TARGETS += kcmp
+TARGETS += kheaders
 TARGETS += kvm
 TARGETS += lib
 TARGETS += membarrier
diff --git a/tools/testing/selftests/kheaders/Makefile b/tools/testing/selftests/kheaders/Makefile
new file mode 100644
index ..51035ab0732b
--- /dev/null
+++ b/tools/testing/selftests/kheaders/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+
+TEST_PROGS := run_kheaders_modbuild.sh
+
+include ../lib.mk
diff --git a/tools/testing/selftests/kheaders/config b/tools/testing/selftests/kheaders/config
new file mode 100644
index ..5221f9fb5e79
--- /dev/null
+++ b/tools/testing/selftests/kheaders/config
@@ -0,0 +1 @@
+CONFIG_IKHEADERS_PROC=y
diff --git a/tools/testing/selftests/kheaders/run_kheaders_modbuild.sh b/tools/testing/selftests/kheaders/run_kheaders_modbuild.sh
new file mode 100755
index ..f001568e08b0
--- /dev/null
+++ b/tools/testing/selftests/kheaders/run_kheaders_modbuild.sh
@@ -0,0 +1,18 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+HEADERS_XZ=/proc/kheaders.tar.xz
+TMP_DIR_HEADERS=$(mktemp -d)
+TMP_DIR_MODULE=$(mktemp -d)
+SPATH="$(dirname "$(readlink -f "$0")")"
+
+tar -xvf $HEADERS_XZ -C $TMP_DIR_HEADERS > /dev/null
+
+cp -r $SPATH/testmod $TMP_DIR_MODULE/
+
+pushd $TMP_DIR_MODULE/testmod > /dev/null
+make -C $TMP_DIR_HEADERS M=$(pwd) modules
+popd > /dev/null
+
+rm -rf $TMP_DIR_HEADERS
+rm -rf $TMP_DIR_MODULE
diff --git a/tools/testing/selftests/kheaders/testmod/Makefile b/tools/testing/selftests/kheaders/testmod/Makefile
new file mode 100644
index ..7083e28706e8
--- /dev/null
+++ b/tools/testing/selftests/kheaders/testmod/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-m += test.o
diff --git a/tools/testing/selftests/kheaders/testmod/test.c b/tools/testing/selftests/kheaders/testmod/test.c
new file mode 100644
index ..6eb0b8492ffa
--- /dev/null
+++ b/tools/testing/selftests/kheaders/testmod/test.c
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include
+#include
+#include
+
+static int __init hello_init(void)
+{
+	printk(KERN_INFO "Hello, world\n");
+	return 0;
+}
+
+static void __exit hello_exit(void)
+{
+	printk(KERN_INFO "Goodbye, world\n");
+}
+
+module_init(hello_init);
+module_exit(hello_exit);
+MODULE_LICENSE("GPL v2");
--
2.21.0.rc2.261.ga7da99ff1b-goog
[PATCH v3 1/2] Provide in-kernel headers for making it easy to extend the kernel
Introduce in-kernel headers and other artifacts which are made available
as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
it possible to build kernel modules, run eBPF programs, and other
tracing programs that need to extend the kernel for tracing purposes
without any dependency on the file system having headers and build
artifacts.

On Android and embedded systems, it is common to switch kernels but not
have kernel headers available on the file system. Raw kernel headers
also cannot be copied into the filesystem like they can be on other
distros, due to licensing and other issues. There's no linux-headers
package on Android. Further once a different kernel is booted, any
headers stored on the file system will no longer be useful. By storing
the headers as a compressed archive within the kernel, we can avoid these
issues that have been a hindrance for a long time.

The feature is also buildable as a module just in case the user desires
it not being part of the kernel image. This makes it possible to load
and unload the headers on demand. A tracing program, or a kernel module
builder can load the module, do its operations, and then unload the
module to save kernel memory. The total memory needed is 3.8MB.

The code to read the headers is based on /proc/config.gz code and uses
the same technique to embed the headers.

To build a module, the below steps have been tested on an x86 machine:
modprobe kheaders
rm -rf $HOME/headers
mkdir -p $HOME/headers
tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
cd my-kernel-module
make -C $HOME/headers M=$(pwd) modules
rmmod kheaders

Additional notes:
(1)
A limitation of module building with this is, since Module.symvers is
not available in the archive due to a cyclic dependency with building of
the archive into the kernel or module binaries, the modules built using
the archive will not contain symbol versioning (modversion). This is
usually not an issue since the idea of this patch is to build a kernel
module on the fly and load it into the same kernel. An appropriate
warning is already printed by the kernel to alert the user of modules
not having modversions when built using the archive. For building with
modversions, the user can use traditional header packages. For our
tracing usecases, we build modules on the fly with this so it is not a
concern.

(2) I have left IKHD_ST and IKHD_ED markers as is to facilitate
future patches that would extract the headers from a kernel or module
image.

Signed-off-by: Joel Fernandes (Google)
---

Changes since v2:
(Thanks to Masahiro Yamada for several excellent suggestions)
- Added support for out of tree builds.
- Added incremental build support bringing down build time of
  incremental builds from 50 seconds to 5 seconds.
- Fixed various small nits / cleanups.
- clean ups to kheaders.c pointed by Alexey Dobriyan.
- Fixed MODULE_LICENSE in test module and kheaders.c
- Dropped Module.symvers from archive due to circular dependency.

Changes since v1:
- removed IKH_EXTRA variable, not needed (Masahiro Yamada)
- small fix ups to selftest
- added target to main Makefile etc
- added MODULE_LICENSE to test module
- made selftest more quiet

Changes since RFC:
Both changes bring size down to 3.8MB:
- use xz for compression
- strip comments except SPDX lines
- Call out the module name in Kconfig
- Also added selftests in second patch to ensure headers are always
  working.

Other notes:
By the way I still see this error (without the patch) when doing a clean
build:
Makefile:594: include/config/auto.conf: No such file or directory

It appears to be because of commit 0a16d2e8cb7e ("kbuild: use 'include'
directive to load auto.conf from top Makefile")

 Documentation/dontdiff    |  1 +
 init/Kconfig              | 11 ++
 kernel/.gitignore         |  3 ++
 kernel/Makefile           | 36 +++
 kernel/kheaders.c         | 72 +
 scripts/gen_ikh_data.sh   | 76 +++
 scripts/strip-comments.pl |  8 +
 7 files changed, 207 insertions(+)
 create mode 100644 kernel/kheaders.c
 create mode 100755 scripts/gen_ikh_data.sh
 create mode 100755 scripts/strip-comments.pl

diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index 2228fcc8e29f..05a2319ee2a2 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -151,6 +151,7 @@ int8.c
 kallsyms
 kconfig
 keywords.c
+kheaders_data.h*
 ksym.c*
 ksym.h*
 kxgettext
diff --git a/init/Kconfig b/init/Kconfig
index c9386a365eea..63ff0990ae55 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -563,6 +563,17 @@ config IKCONFIG_PROC
 	  This option enables access to the kernel configuration file
 	  through /proc/config.gz.
 
+config IKHEADERS_PROC
+	tristate "Enable kernel header artifacts through /proc/kheaders.tar.xz"
+	select BUILD_BIN2C
+	depends
[PATCH 0/2] doc: net: ieee802154: move from plain text to rst
Hello.

The ieee802154 subsystem doc was still in plain text. With the networking book
taking shape I thought it was time to do the first step and move it over to rst.
This really is only the minimal conversion. I need to take some time to update
and extend the docs.

The patches are based on net-next, but they only touch the networking book so I
would not expect and trouble. From what I have seen they would go through
Jonathan's tree after being acked by Dave? If you want this patches against a
different tree let me know.

regards
Stefan Schmidt

Stefan Schmidt (2):
  doc: net: ieee802154: introduce IEEE 802.15.4 subsystem doc in rst style
  doc: net: ieee802154: remove old plain text docs after switching to rst

 .../{ieee802154.txt => ieee802154.rst}        | 193 +-
 Documentation/networking/index.rst            |   1 +
 2 files changed, 99 insertions(+), 95 deletions(-)
 rename Documentation/networking/{ieee802154.txt => ieee802154.rst} (58%)

--
2.17.2
[PATCH 1/2] doc: net: ieee802154: introduce IEEE 802.15.4 subsystem doc in rst style
Moving the ieee802154 docs from a plain text file into the new rst
style. This commit only does the minimal needed change to bring the
documentation over. Follow up patches will improve and extend on this.

Signed-off-by: Stefan Schmidt
---
 Documentation/networking/ieee802154.rst | 180 
 Documentation/networking/index.rst      |   1 +
 2 files changed, 181 insertions(+)
 create mode 100644 Documentation/networking/ieee802154.rst

diff --git a/Documentation/networking/ieee802154.rst b/Documentation/networking/ieee802154.rst
new file mode 100644
index ..36ca823a1122
--- /dev/null
+++ b/Documentation/networking/ieee802154.rst
@@ -0,0 +1,180 @@
+===============================
+IEEE 802.15.4 Developer's Guide
+===============================
+
+Introduction
+============
+The IEEE 802.15.4 working group focuses on standardization of the bottom
+two layers: Medium Access Control (MAC) and Physical access (PHY). And there
+are mainly two options available for upper layers:
+
+- ZigBee - proprietary protocol from the ZigBee Alliance
+- 6LoWPAN - IPv6 networking over low rate personal area networks
+
+The goal of the Linux-wpan is to provide a complete implementation
+of the IEEE 802.15.4 and 6LoWPAN protocols. IEEE 802.15.4 is a stack
+of protocols for organizing Low-Rate Wireless Personal Area Networks.
+
+The stack is composed of three main parts:
+
+- IEEE 802.15.4 layer; We have chosen to use plain Berkeley socket API,
+  the generic Linux networking stack to transfer IEEE 802.15.4 data
+  messages and a special protocol over netlink for configuration/management
+- MAC - provides access to shared channel and reliable data delivery
+- PHY - represents device drivers
+
+Socket API
+==========
+
+.. c:function:: int sd = socket(PF_IEEE802154, SOCK_DGRAM, 0);
+
+The address family, socket addresses etc. are defined in the
+include/net/af_ieee802154.h header or in the special header
+in the userspace package (see either http://wpan.cakelab.org/ or the
+git tree at https://github.com/linux-wpan/wpan-tools).
+
+6LoWPAN Linux implementation
+============================
+
+The IEEE 802.15.4 standard specifies an MTU of 127 bytes, yielding about 80
+octets of actual MAC payload once security is turned on, on a wireless link
+with a link throughput of 250 kbps or less. The 6LoWPAN adaptation format
+[RFC4944] was specified to carry IPv6 datagrams over such constrained links,
+taking into account limited bandwidth, memory, or energy resources that are
+expected in applications such as wireless Sensor Networks. [RFC4944] defines
+a Mesh Addressing header to support sub-IP forwarding, a Fragmentation header
+to support the IPv6 minimum MTU requirement [RFC2460], and stateless header
+compression for IPv6 datagrams (LOWPAN_HC1 and LOWPAN_HC2) to reduce the
+relatively large IPv6 and UDP headers down to (in the best case) several bytes.
+
+In September 2011 the standard update was published - [RFC6282].
+It deprecates HC1 and HC2 compression and defines IPHC encoding format which is
+used in this Linux implementation.
+
+All the code related to 6lowpan you may find in files: net/6lowpan/*
+and net/ieee802154/6lowpan/*
+
+To setup a 6LoWPAN interface you need:
+1. Add IEEE802.15.4 interface and set channel and PAN ID;
+2. Add 6lowpan interface by command like:
+# ip link add link wpan0 name lowpan0 type lowpan
+3. Bring up 'lowpan0' interface
+
+Drivers
+=======
+
+Like with WiFi, there are several types of devices implementing IEEE 802.15.4.
+1) 'HardMAC'. The MAC layer is implemented in the device itself, the device
+exports a management (e.g. MLME) and data API.
+2) 'SoftMAC' or just radio. These types of devices are just radio transceivers
+possibly with some kinds of acceleration like automatic CRC computation and
+comparation, automagic ACK handling, address matching, etc.
+
+Those types of devices require different approach to be hooked into Linux kernel.
+
+HardMAC
+-------
+
+See the header include/net/ieee802154_netdev.h. You have to implement Linux
+net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family
+code via plain sk_buffs. On skb reception skb->cb must contain additional
+info as described in the struct ieee802154_mac_cb. During packet transmission
+the skb->cb is used to provide additional data to device's header_ops->create
+function. Be aware that this data can be overridden later (when socket code
+submits skb to qdisc), so if you need something from that cb later, you should
+store info in the skb->data on your own.
+
+To hook the MLME interface you have to populate the ml_priv field of your
+net_device with a pointer to struct ieee802154_mlme_ops instance. The fields
+assoc_req, assoc_resp, disassoc_req, start_req, and scan_req are optional.
+All other fields are required.
+
+SoftMAC
+-------
+
+The MAC is the middle layer in the IEEE 802.15.4 Linux stack. This moment it
+provides interface for drivers registration and management of
[PATCH 2/2] doc: net: ieee802154: remove old plain text docs after switching to rst
The plain text docs are converted to rst now, which allows us to remove
the old text file from the tree.

Signed-off-by: Stefan Schmidt
---
 Documentation/networking/ieee802154.txt | 177 
 1 file changed, 177 deletions(-)
 delete mode 100644 Documentation/networking/ieee802154.txt

diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt
deleted file mode 100644
index e74d8e1da0e2..
--- a/Documentation/networking/ieee802154.txt
+++ /dev/null
@@ -1,177 +0,0 @@
-
-		Linux IEEE 802.15.4 implementation
-
-
-Introduction
-============
-The IEEE 802.15.4 working group focuses on standardization of the bottom
-two layers: Medium Access Control (MAC) and Physical access (PHY). And there
-are mainly two options available for upper layers:
- - ZigBee - proprietary protocol from the ZigBee Alliance
- - 6LoWPAN - IPv6 networking over low rate personal area networks
-
-The goal of the Linux-wpan is to provide a complete implementation
-of the IEEE 802.15.4 and 6LoWPAN protocols. IEEE 802.15.4 is a stack
-of protocols for organizing Low-Rate Wireless Personal Area Networks.
-
-The stack is composed of three main parts:
- - IEEE 802.15.4 layer; We have chosen to use plain Berkeley socket API,
-   the generic Linux networking stack to transfer IEEE 802.15.4 data
-   messages and a special protocol over netlink for configuration/management
- - MAC - provides access to shared channel and reliable data delivery
- - PHY - represents device drivers
-
-
-Socket API
-==========
-
-int sd = socket(PF_IEEE802154, SOCK_DGRAM, 0);
-.
-
-The address family, socket addresses etc. are defined in the
-include/net/af_ieee802154.h header or in the special header
-in the userspace package (see either http://wpan.cakelab.org/ or the
-git tree at https://github.com/linux-wpan/wpan-tools).
-
-
-Kernel side
-===========
-
-Like with WiFi, there are several types of devices implementing IEEE 802.15.4.
-1) 'HardMAC'. The MAC layer is implemented in the device itself, the device
-   exports a management (e.g. MLME) and data API.
-2) 'SoftMAC' or just radio. These types of devices are just radio transceivers
-   possibly with some kinds of acceleration like automatic CRC computation and
-   comparation, automagic ACK handling, address matching, etc.
-
-Those types of devices require different approach to be hooked into Linux kernel.
-
-
-HardMAC
-=======
-
-See the header include/net/ieee802154_netdev.h. You have to implement Linux
-net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family
-code via plain sk_buffs. On skb reception skb->cb must contain additional
-info as described in the struct ieee802154_mac_cb. During packet transmission
-the skb->cb is used to provide additional data to device's header_ops->create
-function. Be aware that this data can be overridden later (when socket code
-submits skb to qdisc), so if you need something from that cb later, you should
-store info in the skb->data on your own.
-
-To hook the MLME interface you have to populate the ml_priv field of your
-net_device with a pointer to struct ieee802154_mlme_ops instance. The fields
-assoc_req, assoc_resp, disassoc_req, start_req, and scan_req are optional.
-All other fields are required.
-
-
-SoftMAC
-=======
-
-The MAC is the middle layer in the IEEE 802.15.4 Linux stack. This moment it
-provides interface for drivers registration and management of slave interfaces.
-
-NOTE: Currently the only monitor device type is supported - it's IEEE 802.15.4
-stack interface for network sniffers (e.g. WireShark).
-
-This layer is going to be extended soon.
-
-See header include/net/mac802154.h and several drivers in
-drivers/net/ieee802154/.
-
-
-Device drivers API
-==================
-
-The include/net/mac802154.h defines following functions:
- - struct ieee802154_hw *
-   ieee802154_alloc_hw(size_t priv_data_len, const struct ieee802154_ops *ops):
-   allocation of IEEE 802.15.4 compatible hardware device
-
- - void ieee802154_free_hw(struct ieee802154_hw *hw):
-   freeing allocated hardware device
-
- - int ieee802154_register_hw(struct ieee802154_hw *hw):
-   register PHY which is the allocated hardware device, in the system
-
- - void ieee802154_unregister_hw(struct ieee802154_hw *hw):
-   freeing registered PHY
-
- - void ieee802154_rx_irqsafe(struct ieee802154_hw *hw, struct sk_buff *skb,
-			      u8 lqi):
-   telling 802.15.4 module there is a new received frame in the skb with
-   the RF Link Quality Indicator (LQI) from the hardware device
-
- - void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
-                                 bool ifs_handling):
-   telling 802.15.4 module the frame in the skb is or going to be
-   transmitted through the hardware device
-
-The device driver must implement the following callbacks in the IEEE 802.15.4
-operations structure at least:
-struct ieee802154_ops {
-
Re: [PATCH 0/2] doc: net: ieee802154: move from plain text to rst
On Wed, 27 Feb 2019 20:59:12 +0100
Stefan Schmidt wrote:

> The patches are based on net-next, but they only touch the networking book so I
> would not expect and trouble. From what I have seen they would go through
> Jonathan's tree after being acked by Dave? If you want this patches against a
> different tree let me know.

Usually Dave takes networking documentation patches directly, so that is
what I would expect here.

I took a quick look anyway; seems generally good. The main comment I
would make is that much of what's there would be better placed as
kerneldoc comments in the code itself that can then be pulled into the
formatted docs. But that can be a job for another day...

Thanks,

jon
Re: [PATCH 0/2] doc: net: ieee802154: move from plain text to rst
Hello Jon.

On 27.02.19 21:18, Jonathan Corbet wrote:
> On Wed, 27 Feb 2019 20:59:12 +0100
> Stefan Schmidt wrote:
>
>> The patches are based on net-next, but they only touch the networking book so I
>> would not expect and trouble. From what I have seen they would go through
>> Jonathan's tree after being acked by Dave? If you want this patches against a
>> different tree let me know.
>
> Usually Dave takes networking documentation patches directly, so that is
> what I would expect here.

OK, so I got that wrong. Works for me.

Dave, you want to take them directly, if nothing else comes up during
review, or should I apply them to my tree and send them with the next
pull request?

> I took a quick look anyway; seems generally good. The main comment I
> would make is that much of what's there would be better placed as
> kerneldoc comments in the code itself that can then be pulled into the
> formatted docs. But that can be a job for another day...

Interesting point. That might be indeed a good idea on some parts of
this sparse doc. Need to check how this is handled in other docs. It
also needs extending scope and context, but again something for a follow
up patchset.

regards
Stefan Schmidt
Re: [PATCH 1/2] doc: net: ieee802154: introduce IEEE 802.15.4 subsystem doc in rst style
On 2/27/19 11:59 AM, Stefan Schmidt wrote:
> Moving the ieee802154 docs from a plain text file into the new rst
> style. This commit only does the minimal needed change to bring the
> documentation over. Follow up patches will improve and extend on this.
>
> Signed-off-by: Stefan Schmidt
> ---
>  Documentation/networking/ieee802154.rst | 180 
>  Documentation/networking/index.rst      |   1 +
>  2 files changed, 181 insertions(+)
>  create mode 100644 Documentation/networking/ieee802154.rst

Tested-by: Randy Dunlap

Thanks.

--
~Randy
Re: [PATCH] x86/fpu: Parse comma separated list passed in clearcpuid
On 2/21/19 8:48 AM, Peter Zijlstra wrote:
> On Thu, Feb 21, 2019 at 08:12:25AM -0500, Prarit Bhargava wrote:
>> Users cannot disable multiple CPU features with the kernel parameter
>> clearcpuid=. For example, "clearcpuid=154 clearcpuid=227" only disables
>> CPUID bit 154.
>>
>> Previous to commit 0c2a3913d6f5 ("x86/fpu: Parse clearcpuid= as early XSAVE
>> argument") it was possible to pass multiple clearcpuid options as kernel
>> parameters using individual entries. With the new code it isn't easy to
>> replicate exactly that behaviour but a comma separated list can be easily
>> implemented, eg) "clearcpuid=154,227"
>>
>> Make the clearcpuid parse a comma-separated list of values instead of only
>> a single value.
>
> Can we also please kill the value thing entirely and only accept
> strings. Having to reverse engineer the numbers is madness.
>
> Also, wth would you want to disable XSAVE and EPB ?
>

It looks like Fenghua has implemented this here:

https://marc.info/?l=linux-kernel&m=154908490105208&w=2

so please drop this patch.

Thanks,

P.
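
[Editor's note] For readers who want to see what parsing the comma-separated
form "clearcpuid=154,227" described in the quoted patch could look like, here
is a minimal userspace sketch. It is not the code from the dropped patch or
from the kernel: the function name parse_clearcpuid_list, the caller-supplied
max_bit limit and the output array are assumptions made for illustration only.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not a kernel API): parse a comma-separated list of
 * decimal bit numbers such as "154,227" into out[].
 *
 * max_bit is an assumed, caller-supplied exclusive upper bound for a valid
 * bit number. Returns the number of values stored, or -1 on malformed
 * input, an out-of-range value, or too many values for out[].
 */
int parse_clearcpuid_list(const char *arg, unsigned long max_bit,
                          unsigned long *out, size_t out_len)
{
    const char *p = arg;
    size_t n = 0;

    if (arg == NULL || *arg == '\0')
        return -1;

    for (;;) {
        char *end;
        unsigned long bit;

        /* Reject empty fields, signs and whitespace up front. */
        if (*p < '0' || *p > '9')
            return -1;

        errno = 0;
        bit = strtoul(p, &end, 10);
        if (errno != 0 || bit >= max_bit)
            return -1;

        if (n == out_len)
            return -1;
        out[n++] = bit;

        if (*end == '\0')
            break;          /* consumed the whole argument */
        if (*end != ',')
            return -1;      /* unexpected separator */
        p = end + 1;        /* continue after the comma */
    }

    return (int)n;
}
```

With this sketch, "154,227" yields two entries, while "154," and "154,,227"
are rejected instead of being silently truncated to the first value.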
Re: [PATCH v8 05/26] clocksource: Add driver for the Ingenic JZ47xx OST
Hi,

On Mon, 25 Feb 2019 at 15:05, Stephen Boyd wrote:
> Quoting Paul Cercueil (2019-02-22 19:17:25)
>> Hi,
>>
>> Anything new on this? It still happens on 5.0-rc7.
>> It probes with late_initcall, and not with device_initcall.
>> I have no clue what's going on.
>
> I'm not sure what's going on either. You'll probably have to debug when
> the device is created and when it is probed by enabling the debug
> printing in the driver core or by adding in extra debug prints to narrow
> down the problem. For example, add a '#define DEBUG 1' at the top of
> drivers/base/dd.c and see if that helps give some info on what's going
> on with the drivers and devices.

The doc of __platform_driver_probe says:
"Use this instead of platform_driver_register() when you know the device
is not hotpluggable and has already been registered".

When the parent device and child device are both probed with
builtin_platform_driver_probe(), and the parent calls
devm_of_platform_populate(), it is not certain that the parent's probe
will happen before the child's, and if it does not, the child device has
not been registered and its probe is not allowed to defer.

So it turned out not to be a core bug, rather a misuse of the API.

So I will keep the builtin_platform_driver_probe() in the child, and use
a subsys_initcall() in the parent. That works fine.

Regards,
-Paul
Re: [PATCH v3 1/2] Provide in-kernel headers for making it easy to extend the kernel
Hi Joel,

On Thu, Feb 28, 2019 at 4:40 AM Joel Fernandes (Google) wrote:
>
> Introduce in-kernel headers and other artifacts which are made available
> as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> it possible to build kernel modules, run eBPF programs, and other
> tracing programs that need to extend the kernel for tracing purposes
> without any dependency on the file system having headers and build
> artifacts.
>
> On Android and embedded systems, it is common to switch kernels but not
> have kernel headers available on the file system. Raw kernel headers
> also cannot be copied into the filesystem like they can be on other
> distros, due to licensing and other issues. There's no linux-headers
> package on Android. Further once a different kernel is booted, any
> headers stored on the file system will no longer be useful. By storing
> the headers as a compressed archive within the kernel, we can avoid these
> issues that have been a hindrance for a long time.
>
> The feature is also buildable as a module just in case the user desires
> it not being part of the kernel image. This makes it possible to load
> and unload the headers on demand. A tracing program, or a kernel module
> builder can load the module, do its operations, and then unload the
> module to save kernel memory. The total memory needed is 3.8MB.
>
> The code to read the headers is based on /proc/config.gz code and uses
> the same technique to embed the headers.

Please let me ask a question about the actual use-case.

To build embedded systems including Android,
I use an x86 build machine.

In other words, I cross-compile vmlinux and
in-tree modules.

So,
  target-arch: arm64
  host-arch:   x86

> To build a module, the below steps have been tested on an x86 machine:
> modprobe kheaders
> rm -rf $HOME/headers
> mkdir -p $HOME/headers
> tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> cd my-kernel-module
> make -C $HOME/headers M=$(pwd) modules
> rmmod kheaders

I am guessing the user will run these commands
on the target system.
In other words, external modules are native-compiled.

So,
  target-arch: arm64
  host-arch:   arm64

Is this correct?

If I understood the assumed use-case correctly,
kheaders.tar.xz will contain host-programs compiled for x86,
which will not work on the target system.

Masahiro

> Additional notes:
> (1)
> A limitation of module building with this is, since Module.symvers is
> not available in the archive due to a cyclic dependency with building of
> the archive into the kernel or module binaries, the modules built using
> the archive will not contain symbol versioning (modversion). This is
> usually not an issue since the idea of this patch is to build a kernel
> module on the fly and load it into the same kernel. An appropriate
> warning is already printed by the kernel to alert the user of modules
> not having modversions when built using the archive. For building with
> modversions, the user can use traditional header packages. For our
> tracing usecases, we build modules on the fly with this so it is not a
> concern.
>
> (2) I have left IKHD_ST and IKHD_ED markers as is to facilitate
> future patches that would extract the headers from a kernel or module
> image.
>
> Signed-off-by: Joel Fernandes (Google)
> ---

--
Best Regards
Masahiro Yamada
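
[Editor's note] The host/target mismatch raised in the message above can be
observed by looking at the ELF header of any host program found in the
extracted tree. The following is a minimal userspace sketch under that
assumption; the function name elf_machine is hypothetical, and nothing here
is part of the posted patches. It only relies on the standard ELF layout, in
which e_machine is the 16-bit field at byte offset 18 for both ELF32 and
ELF64.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the e_machine value of an ELF image held in
 * buf (for example EM_X86_64 or EM_AARCH64), or -1 if buf does not start
 * with a recognisable ELF header.
 */
int elf_machine(const unsigned char *buf, size_t len)
{
    /* e_ident is 16 bytes, e_type 2 bytes, e_machine the next 2 bytes. */
    if (buf == NULL || len < 20)
        return -1;
    if (memcmp(buf, ELFMAG, SELFMAG) != 0)
        return -1;

    /* Decode according to the byte order recorded in the file itself. */
    if (buf[EI_DATA] == ELFDATA2LSB)
        return buf[18] | (buf[19] << 8);
    if (buf[EI_DATA] == ELFDATA2MSB)
        return (buf[18] << 8) | buf[19];

    return -1;
}
```

A caller on the target would read the first bytes of a host tool into a
buffer and compare the result with the machine it is running on; a value of
EM_X86_64 on an arm64 target is the situation Masahiro describes.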
Re: [RFC v9 5/5] Documentation: pstore/blk: create document for pstore_blk
On 2/19/19 3:52 AM, liaoweixiong wrote:
> The document, at Documentation/admin-guide/pstore-block.rst,
> tells user how to use pstore_blk and the attentions about panic
> read/write
>
> Signed-off-by: liaoweixiong
> ---
>  Documentation/admin-guide/pstore-block.rst | 233 +
>  MAINTAINERS                                |   1 +
>  fs/pstore/Kconfig                          |   4 +
>  3 files changed, 238 insertions(+)
>  create mode 100644 Documentation/admin-guide/pstore-block.rst
>
> diff --git a/Documentation/admin-guide/pstore-block.rst b/Documentation/admin-guide/pstore-block.rst
> new file mode 100644
> index 000..a828274
> --- /dev/null
> +++ b/Documentation/admin-guide/pstore-block.rst
> @@ -0,0 +1,233 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +Pstore block oops/panic logger
> +==============================
> +
> +Introduction
> +------------
> +
> +Pstore block (pstore_blk) is an oops/panic logger that write its logs to block

	to a block

> +device before the system crashes. Pstore_blk needs block device driver

	needs the block

> +registering a partition path of the block device, like /dev/mmcblk0p7 for mmc

	to register
	for MMC

> +driver, and read/write APIs for this partition when on panic.
> +
> +Pstore block concepts
> +---------------------
> +
> +Pstore block begins at function ``blkz_register``, by which block driver

	by which a block driver

> +registers to pstore_blk. Note that, block driver should register to pstore_blk

	Note that the block driver should

> +after block device has registered. Block driver transfers a structure

	The block driver

> +``blkz_info`` which is defined in *linux/pstore_blk.h*.
> +
> +The following key members of ``struct blkz_info`` may be of interest to you.
> +
> +blkdev
> +~~~~~~
> +
> +The block device to use. Most of the time, it is a partition of block device.
> +It's ok to keep it as NULL if you passing ``read`` and ``write`` in blkz_info as

	if you are passing

> +``blkdev`` is used by blkz_default_general_read/write. If both of ``blkdev``,
> +``read`` and ``write`` are NULL, no block device is effective and the data will
> +be saved in ddr buffer.

what is ddr buffer?

> +
> +It accept the following variants:
> +
> +1. device number in hexadecimal represents itself no

	itself; no

> +   leading 0x, for example b302.
> +#. /dev/ represents the device number of disk
> +#. /dev/ represents the device number of partition - device
> +   number of disk plus the partition number
> +#. /dev/p - same as the above, that form is used when disk

	above; this form

> +   name of partitioned disk ends on a digit.

	ends with a digit.

> +#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the unique id of
> +   a partition if the partition table provides it. The UUID may be either an
> +   EFI/GPT UUID, or refer to an MSDOS partition using the format -PP,
> +   where is a zero-filled hex representation of the 32-bit
> +   "NT disk signature", and PP is a zero-filled hex representation of the
> +   1-based partition number.
> +#. PARTUUID=/PARTNROFF= to select a partition in relation to a
> +   partition with a known unique id.
> +#. : major and minor number of the device separated by a colon.
> +
> +See more on section **read/write**.

	in section

> +
> +total_size
> +~~~~~~~~~~
> +
> +The total size in bytes of block device used for pstore_blk. It **MUST** be less
> +than or equal to size of block device if ``blkdev`` valid. It **MUST** be a
> +multiple of 4096. If ``total_size`` is zero with ``blkdev``, ``total_size`` will be
> +set to equal to size of ``blkdev``.
> +
> +The block device area is divided into many chunks, and each event writes a chunk
> +of information.
> +
> +dmesg_size
> +~~~~~~~~~~
> +
> +The chunk size in bytes for dmesg(oops/panic). It **MUST** be a multiple of
> +SECTOR_SIZE (Most of the time, the SECTOR_SIZE is 512). If you don't need dmesg,
> +you are safely to set it to 0.

	you can safely

> +
> +NOTE that, the remaining space, except ``pmsg_size`` and others, belongs to
> +dmesg. It means that there are multiple chunks for dmesg.
> +
> +Psotre_blk will log to dmesg chunks one by one, and always overwrite the oldest

	Pstore_blk

> +chunk if no free chunk.
> +
> +pmsg_size
> +~~~~~~~~~
> +
> +The chunk size in bytes for pmsg. It **MUST** be a multiple of SECTOR_SIZE (Most
> +of the time, the SECTOR_SIZE is 512). If you don't need pmsg, you ar
Re: [RFC v9 5/5] Documentation: pstore/blk: create document for pstore_blk
Thank you for your correction. I will update the patch in the 12th version. On 2019/02/28 13:15, Randy Dunlap wrote: > On 2/19/19 3:52 AM, liaoweixiong wrote: >> The document, at Documentation/admin-guide/pstore-block.rst, >> tells user how to use pstore_blk and the attentions about panic >> read/write >> >> Signed-off-by: liaoweixiong >> --- >> Documentation/admin-guide/pstore-block.rst | 233 >> + >> MAINTAINERS| 1 + >> fs/pstore/Kconfig | 4 + >> 3 files changed, 238 insertions(+) >> create mode 100644 Documentation/admin-guide/pstore-block.rst >> >> diff --git a/Documentation/admin-guide/pstore-block.rst >> b/Documentation/admin-guide/pstore-block.rst >> new file mode 100644 >> index 000..a828274 >> --- /dev/null >> +++ b/Documentation/admin-guide/pstore-block.rst >> @@ -0,0 +1,233 @@ >> +.. SPDX-License-Identifier: GPL-2.0 >> + >> +Pstore block oops/panic logger >> +== >> + >> +Introduction >> + >> + >> +Pstore block (pstore_blk) is an oops/panic logger that write its logs to >> block > > to a > block > >> +device before the system crashes. Pstore_blk needs block device driver > > needs the block > >> +registering a partition path of the block device, like /dev/mmcblk0p7 for >> mmc > >to register for > MMC > >> +driver, and read/write APIs for this partition when on panic. >> + >> +Pstore block concepts >> +- >> + >> +Pstore block begins at function ``blkz_register``, by which block driver > > by which a block driver > >> +registers to pstore_blk. Note that, block driver should register to >> pstore_blk > > Note that the block driver should > >> +after block device has registered. Block driver transfers a structure > > The block driver > >> +``blkz_info`` which is defined in *linux/pstore_blk.h*. >> + >> +The following key members of ``struct blkz_info`` may be of interest to you. >> + >> +blkdev >> +~~ >> + >> +The block device to use. Most of the time, it is a partition of block >> device. 
>> +It's ok to keep it as NULL if you passing ``read`` and ``write`` in >> blkz_info as > > if you are passing > >> +``blkdev`` is used by blkz_default_general_read/write. If both of >> ``blkdev``, >> +``read`` and ``write`` are NULL, no block device is effective and the data >> will >> +be saved in ddr buffer. > > what is ddr buffer? > It is a buffer allocated from RAM. I have modified it as follows: If both of ``blkdev``, ``read`` and ``write`` are NULL, no block device is effective and the data will only be saved in RAM. >> + >> +It accept the following variants: >> + >> +1. device number in hexadecimal represents itself no > > itself; > no > >> + leading 0x, for example b302. >> +#. /dev/ represents the device number of disk >> +#. /dev/ represents the device number of partition - >> device >> + number of disk plus the partition number >> +#. /dev/p - same as the above, that form is used when >> disk > >above; this form > >> + name of partitioned disk ends on a digit. > >ends with a digit. > >> +#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the unique id >> of >> + a partition if the partition table provides it. The UUID may be either an >> + EFI/GPT UUID, or refer to an MSDOS partition using the format >> -PP, >> + where is a zero-filled hex representation of the 32-bit >> + "NT disk signature", and PP is a zero-filled hex representation of the >> + 1-based partition number. >> +#. PARTUUID=/PARTNROFF= to select a partition in relation to a >> + partition with a known unique id. >> +#. : major and minor number of the device separated by a >> colon. >> + >> +See more on section **read/write**. > > in section > >> + >> +total_size >> +~~ >> + >> +The total size in bytes of block device used for pstore_blk. It **MUST** be >> less >> +than or equal to size of block device if ``blkdev`` valid. It **MUST** be a >> +multiple of 4096. If ``total_size`` is zero with ``blkdev``, ``total_size`` >> will be >> +set to equal to size of ``blkdev``.
>> + >> +The block device area is divided into many chunks, and each event writes a >> chunk >> +of information. >> + >> +dmesg_size >> +~~ >> + >> +The chunk size in bytes for dmesg(oops/panic). It **MUST** be a multiple of >> +SECTOR_SIZE (Most of the time, the SECTOR_SIZE is 512). If you don't need >> dmesg, >> +you are safely t
[PATCH v12 1/4] pstore/blk: new support logger for block devices
pstore_blk is similar to pstore_ram, but dumps logs to block devices rather than to persistent RAM. Why do we need pstore_blk? 1. Most embedded intelligent equipment has no persistent RAM, because it increases costs. We prefer cheaper solutions, like block devices. In fact, there is already an example of a block device logger in the MTD driver (drivers/mtd/mtdoops.c). 2. Not all equipment has a battery, which means that all data in ordinary RAM is lost on power failure. Pstore can do little for such equipment. pstore_blk can only dump Oops/Panic logs to block devices. It only supports dmesg for now. To make pstore_blk work, the block driver should provide the block device and the read/write APIs to use on panic. pstore_blk begins at 'blkz_register', by which a block driver can register a block device to pstore_blk. pstore_blk then divides and manages the block device as zones, similar to pstore_ram. It is recommended that the block driver register with pstore_blk after the block device is ready. pstore_blk works well on the allwinner (sunxi) platform. Signed-off-by: liaoweixiong --- fs/pstore/Kconfig |8 + fs/pstore/Makefile |3 + fs/pstore/blkzone.c| 1031 include/linux/pstore_blk.h | 80 4 files changed, 1122 insertions(+) create mode 100644 fs/pstore/blkzone.c create mode 100644 include/linux/pstore_blk.h diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig index 8b3ba27..defcb75 100644 --- a/fs/pstore/Kconfig +++ b/fs/pstore/Kconfig @@ -152,3 +152,11 @@ config PSTORE_RAM "ramoops.ko". For more information, see Documentation/admin-guide/ramoops.rst. + +config PSTORE_BLK + tristate "Log panic/oops to a block device" + depends on PSTORE + depends on BLOCK + help + This enables panic and oops message to be logged to a block dev + where it can be read back at some later point.
diff --git a/fs/pstore/Makefile b/fs/pstore/Makefile index 967b589..0ee2fc8 100644 --- a/fs/pstore/Makefile +++ b/fs/pstore/Makefile @@ -12,3 +12,6 @@ pstore-$(CONFIG_PSTORE_PMSG) += pmsg.o ramoops-objs += ram.o ram_core.o obj-$(CONFIG_PSTORE_RAM) += ramoops.o + +obj-$(CONFIG_PSTORE_BLK) += pstore_blk.o +pstore_blk-y += blkzone.o diff --git a/fs/pstore/blkzone.c b/fs/pstore/blkzone.c new file mode 100644 index 000..cba55b3 --- /dev/null +++ b/fs/pstore/blkzone.c @@ -0,0 +1,1031 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * + * blkzone.c: Block device Oops/Panic logger + * + * Copyright (C) 2019 liaoweixiong + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#define MODNAME "pstore-blk" +#define pr_fmt(fmt) MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define PSTORE_BLKDEV "/dev/pstore-blk" + +/** + * struct blkz_head - head of zone to flush to storage + * + * @sig: signature to indicate header (BLK_SIG xor BLKZONE-type value) + * @datalen: length of data in @data + * @data: zone data. 
+ */ +struct blkz_buffer { +#define BLK_SIG (0x43474244) /* DBGC */ + uint32_t sig; + atomic_t datalen; + uint8_t data[]; +}; + +/** + * struct blkz_dmesg_header: dmesg information + * + * @magic: magic num for dmesg header + * @time: trigger time + * @compressed: whether conpressed + * @count: oops/panic counter + * @reason: identify oops or panic + */ +struct blkz_dmesg_header { +#define DMESG_HEADER_MAGIC 0x4dfc3ae5 + uint32_t magic; + struct timespec64 time; + bool compressed; + uint32_t counter; + enum kmsg_dump_reason reason; + uint8_t data[0]; +}; + +/** + * struct blkz_zone - zone information + * @off: + * zone offset of block device + * @type: + * frontent type for this zone + * @name: + * frontent name for this zone + * @buffer: + * pointer to data buffer managed by this zone + * @buffer_size: + * bytes in @buffer->data + * @should_recover: + * should recover from storage + * @dirty: + * mark whether the data in @buffer are dirty (not flush to storage yet) + */ +struct blkz_zone { + unsigned long off; + const char *name; + enum pstore_type_id type; + + struct blkz_buffer *buffer; + size_t buffer_size; + bool should_recover; + atomic_t dirty; +}; + +struct blkz_context { + struct blkz_zone **dbzs;/* dmesg block zones */ + unsigned int dmesg_max_cnt; + unsigned int dmesg_read_cnt; + unsigned int dmesg_write_cnt; + /* +*
[PATCH v12 0/4] pstore/block: new support logger for block devices
Why do we need pstore_block? 1. Most embedded intelligent equipment has no persistent RAM, because it increases costs. We prefer cheaper solutions, like block devices. In fact, there is already an example of a block device logger in the MTD driver (drivers/mtd/mtdoops.c). 2. Not all equipment has a battery, which means that all data in ordinary RAM is lost on power failure. Pstore can do little for such equipment. [PATCH v12] On patch 4: 1. Modify the document according to Randy Dunlap's suggestion. [PATCH v11] Change patchset label from RFC to PATCH [PATCH v10] Cancel DT support for blkoops temporarily. On patch 1: 1. pstore/blk should unlink PSTORE_BLKDEV on unregister. On patch 2: 1. cancel DT support temporarily. I will submit other patches to support DT once the DT maintainers have acked. 2. add a spin lock to protect blkz_info when modifying panic operations. 3. change the default value of total size in Kconfig from 1024 to 0. [PATCH v9] On patch 1: 1. rename part_path/part_size, members of blkz_info, to blkdev/total_size 2. if total_size is zero, get size from @blkdev 3. support multiple variants for @blkdev, such as partuuid, major with minor, and /dev/. See details in Documentation. 4. get size from block device 5. add depends on CONFIG_BLOCK On patch 2: 1. update document On patch 3: 1. update code for the new blkzone. Blkoops supports insmod without total_size, for example: "insmod ./blkoops.ko blkdev=93:6" (major:minor). 2. use late_initcall rather than module_init, to avoid the block device not being ready. 3. support for a block driver to add panic APIs to blkoops. With this, the block driver only needs to do the minimum work of providing panic operations. On patch 5: 1. update document [PATCH v8] On patch 2: 1. move DT to /bindings/pstore 2. Delete details for kernel. [PATCH v7] On patch 1: 1. Fix line over 80 characters. On patch 2: 1. Insert a separate patch for DT bindings. [PATCH v6] On patch 1: 1.
Fix according to the email from Kees Cook, including spelling mistakes, explicit overflow test, none of the zeroing etc. 2. On panic, recover only the metadata of dmesg, not the data. 3. No need to recover when erasing. 4. Do not use "blkoops" for blkzone any more because "blkoops" is now used for another module. (rename blkbuf to blkoops) On patch 2: 1. Rename blkbuf to blkoops. 2. Add Kconfig/device tree/module parameter settings for blkoops. 3. Add document for device tree. On patch 3: 1. Blkoops supports pmsg. 2. Fix description for new version patch. On patch 4: 1. Fix description for new version patch. [PATCH v5] On patch 1: 1. rename pstore/rom to pstore/blk 2. Do not allocate any memory in the write path of panic. So, use a local array instead in function romz_recover_dmesg_meta. 3. Add C header file "linux/fs.h" to fix implicit declaration of function 'filp_open','kernel_read'... On patch 3: 1. On panic, do not recover pmsg but flush it if it is dirty. 2. Fix failure to erase pmsg. On patch 4: 1. Create a document for pstore/blk [PATCH v4] On patch 1: 1. Fix always true condition '(--i >= 0) => (0-u32max >= 0)' in function romz_init_zones by defining variable i as 'int' rather than 'unsigned int'. 2. To make the code easier to read, we use macro READ_NEXT_ZONE for the return value of romz_dmesg_read if it needs to read the next zone. Moreover, we assign READ_NEXT_ZONE -1024 rather than 0. 3. Add 'FLUSH_META' to 'enum romz_flush_mode' and rename 'NOT_FLUSH' to 'FLUSH_NONE' 4. Function romz_zone_write worked badly in FLUSH_PART mode because of a bad address and offset to write. On patch 3: NEW: support pmsg for pstore_rom. [PATCH v3] On patch 1: Fix build as module error for undefined 'vfs_read' and 'vfs_write'. Both 'vfs_read' and 'vfs_write' haven't been exported yet, so we use 'kernel_read' and 'kernel_write' instead.
[PATCH v2] On patch 1: Fix build as module error for redefinition of 'romz_unregister' and 'romz_register' [PATCH v1] On patch 1: Core codes of pstore_rom, which works well on allwinner(sunxi) platform. On patch 2: A sample for pstore_rom, using general ram rather than block device. liaoweixiong (4): pstore/blk: new support logger for block devices pstore/blk: add blkoops for pstore_blk pstore/blk: support pmsg for pstore block Documentation: pstore/blk: create document for pstore_blk Documentation/admin-guide/pstore-block.rst | 233 ++ MAINTAINERS|3 +- fs/pstore/Kconfig | 147 fs/pstore/Makefile |5 + fs/pstore/blkoops.c| 208 + fs/pstore/blkzone.c| 1242 include/linux/pstore_blk.h | 87 ++ 7 files changed, 1924 insertions(+), 1 deletion(-) create mode 100644 Documentation/admin-guide/pstore-block.rst create mode 100644 fs/pstore/blkoops.c create mode 100644 fs/pstore/blkzone.c create mode 100644 include/linux/pstore_blk.h -- 1.9.1
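For readers following the changelog, the total_size rules mentioned above (a multiple of 4096, no larger than the block device, and zero meaning "take the size from @blkdev") can be sketched in plain userspace C. This is only an illustration under stated assumptions: the helper name sketch_resolve_total_size and the BLKZ_ALIGN macro are hypothetical and do not appear in the patches, and rejecting a device whose own size is not 4096-aligned is a choice made for this sketch, not something the series is known to do.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption for this sketch: total_size must be a multiple of 4096. */
#define BLKZ_ALIGN 4096u

/*
 * Hypothetical helper, not part of the patch set.
 *
 * requested: configured total_size in bytes (0 means "use the whole device")
 * dev_size:  size of the block device in bytes
 * out:       receives the effective size on success
 *
 * Returns 0 on success, -EINVAL if one of the stated rules is violated.
 */
static int sketch_resolve_total_size(uint64_t requested, uint64_t dev_size,
				     uint64_t *out)
{
	/* Zero falls back to the size of the block device. */
	uint64_t size = requested ? requested : dev_size;

	/* Must be non-zero and a multiple of 4096. */
	if (size == 0 || size % BLKZ_ALIGN)
		return -EINVAL;

	/* Must not exceed the block device. */
	if (size > dev_size)
		return -EINVAL;

	*out = size;
	return 0;
}
```

With an 8192-byte device, a request of 0 resolves to 8192 and a request of 4096 is accepted as-is, while 4097 (unaligned) and 16384 (larger than the device) are both rejected.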
[PATCH v12 3/4] pstore/blk: support pmsg for pstore block
To enable pmsg, just set pmsg_size when the block driver registers with blkzone. Signed-off-by: liaoweixiong --- fs/pstore/Kconfig | 21 fs/pstore/blkoops.c| 10 ++ fs/pstore/blkzone.c| 253 + include/linux/pstore_blk.h | 1 + 4 files changed, 264 insertions(+), 21 deletions(-) diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig index 7dfe00b..b417bf5 100644 --- a/fs/pstore/Kconfig +++ b/fs/pstore/Kconfig @@ -210,6 +210,27 @@ config PSTORE_BLKOOPS_DMESG_SIZE It is the first priority. Take care of that blkoops will take lower priority settings if higher priority one do not set. +config PSTORE_BLKOOPS_PMSG_SIZE + int "pmsg size in kbytes for blkoops" + depends on PSTORE_BLKOOPS + default 64 + help + This just sets size of pmsg (pmsg_size) for pstore/blk. The value must + be a multiple of 4096. Pmsg work only if "blkdev" is set. + + NOTE that, there are three ways to set parameters of blkoops and + prioritize according to configuration flexibility. That is + Kconfig < device tree < module parameters. It means that the value can + be overwritten by higher priority settings. + 1. Kconfig +It just sets a default value. + 2. device tree +It is set on device tree, which will overwrites value from Kconfig, +but can also be overwritten by module parameters. + 3. module parameters +It is the first priority. Take care of that blkoops will take lower +priority settings if higher priority one do not set.
+ config PSTORE_BLKOOPS_TOTAL_SIZE int "total size in kbytes for blkoops" depends on PSTORE_BLKOOPS diff --git a/fs/pstore/blkoops.c b/fs/pstore/blkoops.c index 22c0c84..05140fd 100644 --- a/fs/pstore/blkoops.c +++ b/fs/pstore/blkoops.c @@ -30,6 +30,10 @@ module_param(dmesg_size, long, 0400); MODULE_PARM_DESC(dmesg_size, "demsg size in kbytes"); +static long pmsg_size = -1; +module_param(pmsg_size, long, 0400); +MODULE_PARM_DESC(pmsg_size, "pmsg size in kbytes"); + static long total_size = -1; module_param(total_size, long, 0400); MODULE_PARM_DESC(total_size, "total size in kbytes"); @@ -47,11 +51,13 @@ struct blkz_info blkz_info = { struct blkoops_info { unsigned long dmesg_size; + unsigned long pmsg_size; unsigned long total_size; const char *blkdev; }; struct blkoops_info blkoops_info = { .dmesg_size = CONFIG_PSTORE_BLKOOPS_DMESG_SIZE * 1024, + .pmsg_size = CONFIG_PSTORE_BLKOOPS_PMSG_SIZE * 1024, .total_size = CONFIG_PSTORE_BLKOOPS_TOTAL_SIZE * 1024, .blkdev = CONFIG_PSTORE_BLKOOPS_BLKDEV, }; @@ -104,6 +110,7 @@ static int blkoops_probe(struct platform_device *pdev) check_size(total_size, 4096); check_size(dmesg_size, 4096); + check_size(pmsg_size, 4096); #undef check_size @@ -112,6 +119,7 @@ static int blkoops_probe(struct platform_device *pdev) * through /sys/module/blkoops/parameters/ */ dmesg_size = blkz_info.dmesg_size; + pmsg_size = blkz_info.pmsg_size; total_size = blkz_info.total_size; if (blkz_info.blkdev) strncpy(blkdev, blkz_info.blkdev, 80 - 1); @@ -156,6 +164,8 @@ void blkoops_register_dummy(void) info->blkdev = (const char *)blkdev; if (dmesg_size >= 0) info->dmesg_size = (unsigned long)dmesg_size * 1024; + if (pmsg_size >= 0) + info->pmsg_size = (unsigned long)pmsg_size * 1024; } else if (info->total_size > 0 || strlen(info->blkdev)) { pr_info("using kconfig value\n"); } else { diff --git a/fs/pstore/blkzone.c b/fs/pstore/blkzone.c index cba55b3..cd3d4ed 100644 --- a/fs/pstore/blkzone.c +++ b/fs/pstore/blkzone.c @@ -40,12 +40,14 @@ * * @sig: 
signature to indicate header (BLK_SIG xor BLKZONE-type value) * @datalen: length of data in @data + * @start: offset into @data where the beginning of the stored bytes begin * @data: zone data. */ struct blkz_buffer { #define BLK_SIG (0x43474244) /* DBGC */ uint32_t sig; atomic_t datalen; + atomic_t start; uint8_t data[]; }; @@ -78,6 +80,9 @@ struct blkz_dmesg_header { * frontent name for this zone * @buffer: * pointer to data buffer managed by this zone + * @oldbuf: + * pointer to old data buffer. It is used for single zone such as pmsg, + * saving the old buffer. * @buffer_size: * bytes in @buffer->data * @should_recover: @@ -91,6 +96,7 @@ struct blkz_zone { enum pstore_type_id type; struct blkz_buffer *buffer; + struct blkz_buffer *oldbuf; size_t buffer_size; bool should_recover; atomic_t dirty; @@ -98,8 +104,10 @@ struct blkz_zone { struct blkz_context { struct blkz_zo
[PATCH v12 4/4] Documentation: pstore/blk: create document for pstore_blk
The document, at Documentation/admin-guide/pstore-block.rst, tells users how to use pstore_blk and what to watch out for with panic read/write. Signed-off-by: liaoweixiong --- Documentation/admin-guide/pstore-block.rst | 233 + MAINTAINERS| 1 + fs/pstore/Kconfig | 4 + 3 files changed, 238 insertions(+) create mode 100644 Documentation/admin-guide/pstore-block.rst diff --git a/Documentation/admin-guide/pstore-block.rst b/Documentation/admin-guide/pstore-block.rst new file mode 100644 index 000..c22245d --- /dev/null +++ b/Documentation/admin-guide/pstore-block.rst @@ -0,0 +1,233 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Pstore block oops/panic logger +== + +Introduction + + +Pstore block (pstore_blk) is an oops/panic logger that write its logs to a block +device before the system crashes. Pstore_blk needs the block device driver +to register a partition of the block device, like /dev/mmcblk0p7 for MMC +driver, and read/write APIs for this partition when on panic. + +Pstore block concepts +- + +Pstore block begins at function ``blkz_register``, by which a block driver +registers to pstore_blk. Note that the block driver should register to +pstore_blk after block device has registered. The Block driver transfers a +structure ``blkz_info`` which is defined in *linux/pstore_blk.h*. + +The following key members of ``struct blkz_info`` may be of interest to you. + +blkdev +~~ + +The block device to use. Most of the time, it is a partition of block device. +It's ok to keep it as NULL if you are passing ``read`` and ``write`` in +blkz_info as ``blkdev`` is used by blkz_default_general_read/write. If both of +``blkdev``, ``read`` and ``write`` are NULL, no block device is effective and +the data will only be saved in RAM. + +It accept the following variants: + +1. device number in hexadecimal represents itself; no + leading 0x, for example b302. +#. /dev/ represents the device number of disk +#.
/dev/ represents the device number of partition - device + number of disk plus the partition number +#. /dev/p - same as the above; this form is used when disk + name of partitioned disk ends with a digit. +#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the unique id of + a partition if the partition table provides it. The UUID may be either an + EFI/GPT UUID, or refer to an MSDOS partition using the format -PP, + where is a zero-filled hex representation of the 32-bit + "NT disk signature", and PP is a zero-filled hex representation of the + 1-based partition number. +#. PARTUUID=/PARTNROFF= to select a partition in relation to a + partition with a known unique id. +#. : major and minor number of the device separated by a colon. + +See more in section **read/write**. + +total_size +~~ + +The total size in bytes of block device used for pstore_blk. It **MUST** be less +than or equal to size of block device if ``blkdev`` valid. It **MUST** be a +multiple of 4096. If ``total_size`` is zero with ``blkdev``, ``total_size`` will be +set to equal to size of ``blkdev``. + +The block device area is divided into many chunks, and each event writes a chunk +of information. + +dmesg_size +~~ + +The chunk size in bytes for dmesg(oops/panic). It **MUST** be a multiple of +SECTOR_SIZE (Most of the time, the SECTOR_SIZE is 512). If you don't need dmesg, +you can safely to set it to 0. + +NOTE that, the remaining space, except ``pmsg_size`` and others, belongs to +dmesg. It means that there are multiple chunks for dmesg. + +Pstore_blk will log to dmesg chunks one by one, and always overwrite the oldest +chunk if no free chunk. + +pmsg_size +~ + +The chunk size in bytes for pmsg. It **MUST** be a multiple of SECTOR_SIZE (Most +of the time, the SECTOR_SIZE is 512). If you don't need pmsg, you can safely set +it to 0. + +There is only one chunk for pmsg. + +Pmsg is a user space accessible pstore object. Writes to */dev/pmsg0* are +appended to the chunk. 
On reboot the contents are available in +/sys/fs/pstore/pmsg-pstore-blk-0. + +dump_oops +~ + +Dumping both oopses and panics can be done by setting 1 in the ``dump_oops`` +member while setting 0 in that variable dumps only the panics. + +read/write +~~ + +They are general ``read/write`` APIs. It is safe and recommended to ignore it, +but set ``blkdev``. + +These general APIs are used all the time expect panic. The ``read`` API is +usually used to recover data from block device, and the ``write`` API is usually +to flush new data and erase to block device. + +Pstore_blk will temporarily hold all new data before block device is ready. If +you ignore both of ``read/write`` and ``blkdev``, the old data will be lost. + +NOTE that the general APIs must check whether the block device is ready if +self-defined. + +panic_read/panic_write +~
[PATCH v12 2/4] pstore/blk: add blkoops for pstore_blk
blkoops is a sample for pstore/blk. It can only record oopses, not panics, because no panic read/write APIs are registered. It supports settings via Kconfig and module parameters. It can preserve the oops log across a power failure if "PSTORE_BLKOOPS_BLKDEV" in Kconfig or the "blkdev" module parameter is valid. Otherwise, it can only record data to a RAM buffer, which is lost on reboot. Signed-off-by: liaoweixiong --- MAINTAINERS| 2 +- fs/pstore/Kconfig | 114 ++ fs/pstore/Makefile | 2 + fs/pstore/blkoops.c| 198 + include/linux/pstore_blk.h | 14 +++- 5 files changed, 325 insertions(+), 5 deletions(-) create mode 100644 fs/pstore/blkoops.c diff --git a/MAINTAINERS b/MAINTAINERS index 51029a4..4e9242a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -12318,7 +12318,7 @@ F: drivers/firmware/efi/efi-pstore.c F: drivers/acpi/apei/erst.c F: Documentation/admin-guide/ramoops.rst F: Documentation/devicetree/bindings/reserved-memory/ramoops.txt -K: \b(pstore|ramoops) +K: \b(pstore|ramoops|blkoops) PTP HARDWARE CLOCK SUPPORT M: Richard Cochran diff --git a/fs/pstore/Kconfig b/fs/pstore/Kconfig index defcb75..7dfe00b 100644 --- a/fs/pstore/Kconfig +++ b/fs/pstore/Kconfig @@ -160,3 +160,117 @@ config PSTORE_BLK help This enables panic and oops message to be logged to a block dev where it can be read back at some later point. + +config PSTORE_BLKOOPS + tristate "pstore block with oops logger" + depends on PSTORE_BLK + help + This is a sample for pstore block with oops logger. + + It CANNOT record panic log as no read/write apis for panic registered. + + It CAN record oops log even power failure if + "PSTORE_BLKOOPS_BLKDEV" on Kconfig or "block-device" on dts or + "blkdev" on module parameter is valid. + + Otherwise, it can only record data to ram buffer, which will be + dropped when reboot. + + NOTE that, there are three ways to set parameters of blkoops and + prioritize according to configuration flexibility. That is + Kconfig < device tree < module parameters.
It means that the value can + be overwritten by higher priority settings. + 1. Kconfig +It just sets a default value. + 2. device tree +It is set on device tree, which will overwrites value from Kconfig, +but can also be overwritten by module parameters. + 3. module parameters +It is the first priority. Take care of that blkoops will take lower +priority settings if higher priority one do not set. + +config PSTORE_BLKOOPS_DMESG_SIZE + int "dmesg size in kbytes for blkoops" + depends on PSTORE_BLKOOPS + default 64 + help + This just sets size of dmesg (dmesg_size) for pstore/blk. The value + must be a multiple of 4096. + + NOTE that, there are three ways to set parameters of blkoops and + prioritize according to configuration flexibility. That is + Kconfig < device tree < module parameters. It means that the value can + be overwritten by higher priority settings. + 1. Kconfig +It just sets a default value. + 2. device tree +It is set on device tree, which will overwrites value from Kconfig, +but can also be overwritten by module parameters. + 3. module parameters +It is the first priority. Take care of that blkoops will take lower +priority settings if higher priority one do not set. + +config PSTORE_BLKOOPS_TOTAL_SIZE + int "total size in kbytes for blkoops" + depends on PSTORE_BLKOOPS + default 0 + help + The total size in kbytes pstore/blk can use. It must be less than or + equal to size of block device if @blkdev valid. If @total_size is zero + with @blkdev, @total_size will be set to equal to size of @blkdev. + The value must be a multiple of 4096. + + NOTE that, there are three ways to set parameters of blkoops and + prioritize according to configuration flexibility. That is + Kconfig < device tree < module parameters. It means that the value can + be overwritten by higher priority settings. + 1. Kconfig +It just sets a default value. + 2. 
device tree +It is set on device tree, which will overwrites value from Kconfig, +but can also be overwritten by module parameters. + 3. module parameters +It is the first priority. Take care of that blkoops will take lower +priority settings if higher priority one do not set. + +config PSTORE_BLKOOPS_BLKDEV + string "block device for blkoops" + depends on PSTORE_BLKOOPS + default "" + help + This just sets bloc