Re: [RFC PATCH bpf-next v2 4/4] error-injection: Support fault injection framework
On Tue, 26 Dec 2017 18:12:56 -0800
Alexei Starovoitov wrote:

> On Tue, Dec 26, 2017 at 04:48:25PM +0900, Masami Hiramatsu wrote:
> > Support in-kernel fault-injection framework via debugfs.
> > This allows you to inject a conditional error to specified
> > function using debugfs interfaces.
> >
> > Signed-off-by: Masami Hiramatsu
> > ---
> >  Documentation/fault-injection/fault-injection.txt |   5 +
> >  kernel/Makefile                                   |   1
> >  kernel/fail_function.c                            | 169 +
> >  lib/Kconfig.debug                                 |  10 +
> >  4 files changed, 185 insertions(+)
> >  create mode 100644 kernel/fail_function.c
> >
> > diff --git a/Documentation/fault-injection/fault-injection.txt
> > b/Documentation/fault-injection/fault-injection.txt
> > index 918972babcd8..6243a588dd71 100644
> > --- a/Documentation/fault-injection/fault-injection.txt
> > +++ b/Documentation/fault-injection/fault-injection.txt
> > @@ -30,6 +30,11 @@ o fail_mmc_request
> >    injects MMC data errors on devices permitted by setting
> >    debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request
> >
> > +o fail_function
> > +
> > +  injects error return on specific functions by setting debugfs entries
> > +  under /sys/kernel/debug/fail_function. No boot option supported.
>
> I like it.
> Could you document it a bit better?

Yes, I will do that in the next series.

> In particular retval is configurable, but without an example no one
> will be able to figure out how to use it.

Ah, right. BTW, as I pointed out in the cover mail, should we store the
expected error value range in the injectable list? e.g.

ALLOW_ERROR_INJECTION(open_ctree, -1, -MAX_ERRNO)

And provide APIs to check/get it.

const struct error_range *ei_get_error_range(unsigned long addr);

>
> I think you can drop RFC tag from the next version of these patches.
> Thanks!

Thank you, I'll fix some errors that came from configurations, and resend it.

Thanks!

-- 
Masami Hiramatsu
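The error-range idea floated in the reply above could look roughly like the following userspace sketch. Only the prototype of ei_get_error_range() and the (-1, -MAX_ERRNO) example come from the mail; the layout of struct error_range, the lookup table, the placeholder addresses, and the helper ei_retval_allowed() are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ERRNO 4095	/* same value as in the kernel's <linux/err.h> */

/* Hypothetical layout: inclusive range of values a function may return. */
struct error_range {
	unsigned long addr;	/* address of the injectable function */
	long start;		/* value closest to zero, e.g. -1 */
	long end;		/* most negative value, e.g. -MAX_ERRNO */
};

/* Hypothetical table; the addresses are placeholders, not real symbols. */
static const struct error_range ei_ranges[] = {
	{ 0x1000UL, -1, -MAX_ERRNO },	/* stand-in for open_ctree */
	{ 0x2000UL, -1, -1 },		/* a function that only returns -1 */
};

/* Look up the range registered for addr, or NULL if there is none. */
const struct error_range *ei_get_error_range(unsigned long addr)
{
	for (size_t i = 0; i < sizeof(ei_ranges) / sizeof(ei_ranges[0]); i++)
		if (ei_ranges[i].addr == addr)
			return &ei_ranges[i];
	return NULL;
}

/* Hypothetical check a debugfs "retval" write handler could perform. */
bool ei_retval_allowed(unsigned long addr, long retval)
{
	const struct error_range *r = ei_get_error_range(addr);

	if (!r)
		return false;
	assert(r->start >= r->end);	/* ranges run from -1 downwards */
	return retval <= r->start && retval >= r->end;
}
```

With such a check, a configured retval of 0 or -4096 would be rejected for the first entry, while -12 would be accepted.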
Re: PROBLEM: 4.15.0-rc3 APIC causes lockups on Core 2 Duo laptop
Hi Alexandru,

At 12/24/2017 04:01 AM, Alexandru Chirvasitu wrote:
On Sat, Dec 23, 2017 at 02:32:52PM +0100, Thomas Gleixner wrote:
On Sat, 23 Dec 2017, Dexuan Cui wrote:

From: Alexandru Chirvasitu [mailto:achirva...@gmail.com]
Sent: Friday, December 22, 2017 14:29

The output of that precise command run just now on a freshly-compiled
copy of that commit is attached.

On Fri, Dec 22, 2017 at 09:31:28PM +, Dexuan Cui wrote:

From: Alexandru Chirvasitu [mailto:achirva...@gmail.com]
Sent: Friday, December 22, 2017 06:21

In the absence of logs, the best I can do at the moment is attach a
picture of the screen I am presented with on the boot attempt.

Alex

The panic happens in irq_matrix_assign_system+0x4e/0xd0 in your picture.
IMO we should find which line of code causes the panic. I suppose
"objdump -D kernel/irq/matrix.o" can help to do that.

Thanks,
-- Dexuan

The BUG_ON panic happens at line 147:
BUG_ON(!test_and_clear_bit(bit, cm->alloc_map));

There are 2 bugs in your laptop:

1. Hard lockups on both CPUs after login
2. panic with "apic=debug"

For the second bug, please try the following patch (it needs Thomas's
confirmation :) ) in Linux 4.15-rc5. I think it can fix the panic.

If the second bug is fixed, let's go back to the first bug:

Is Linus's current head 4.15-rc5 bad as well? If yes, please use
"apic=debug" and give the dmesg log.

Thanks,
dou.

8<---

irq/matrix: Remove the overused BUGON() in irq_matrix_assign_system()

Currently, x86 marks the preallocated legacy interrupts when initializing
IRQ (native_init_IRQ), but will clear them if they are not activated in
vector_configure_legacy().

So, in irq_matrix_assign_system(), replacing a legacy vector which may
not be allocated in a cpumap->alloc_map[] with a system vector will
trigger the BUGON().

Remove the BUGON().
Signed-off-by: Dou Liyang
---
 kernel/irq/matrix.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/matrix.c b/kernel/irq/matrix.c
index 0ba0dd8863a7..876cbeab9ca2 100644
--- a/kernel/irq/matrix.c
+++ b/kernel/irq/matrix.c
@@ -143,11 +143,12 @@ void irq_matrix_assign_system(struct irq_matrix *m, unsigned int bit,
 	BUG_ON(m->online_maps > 1 || (m->online_maps && !replace));
 	set_bit(bit, m->system_map);
-	if (replace) {
-		BUG_ON(!test_and_clear_bit(bit, cm->alloc_map));
+
+	if (replace && test_and_clear_bit(bit, cm->alloc_map)){
 		cm->allocated--;
 		m->total_allocated--;
 	}
+
 	if (bit >= m->alloc_start && bit < m->alloc_end)
 		m->systembits_inalloc++;
--
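The effect of the change above can be tried out in userspace with a small model. This is only a sketch under stated assumptions: struct cpumap_model, struct matrix_model and model_assign_system() are hypothetical stand-ins for the kernel structures, the bitmaps are a single 64-bit word, and the bit helper is not atomic, unlike the kernel's test_and_clear_bit().

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_BITS 64

/* Simplified stand-in for the per-CPU map. */
struct cpumap_model {
	unsigned long long alloc_map;	/* one bit per vector */
	unsigned int allocated;
};

/* Simplified stand-in for the global matrix. */
struct matrix_model {
	unsigned long long system_map;
	unsigned int total_allocated;
};

/* Non-atomic model of test_and_clear_bit(): returns the old bit value. */
static bool model_test_and_clear_bit(unsigned int bit, unsigned long long *map)
{
	unsigned long long mask = 1ULL << bit;
	bool was_set = (*map & mask) != 0;

	*map &= ~mask;
	return was_set;
}

void model_assign_system(struct matrix_model *m, struct cpumap_model *cm,
			 unsigned int bit, bool replace)
{
	assert(bit < MODEL_BITS);
	m->system_map |= 1ULL << bit;

	/* Undo the accounting only if the vector really was allocated. */
	if (replace && model_test_and_clear_bit(bit, &cm->alloc_map)) {
		cm->allocated--;
		m->total_allocated--;
	}
}
```

Replacing a vector whose bit was never set in alloc_map now leaves both counters untouched, where the old code would have hit the BUG_ON().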
Re: [PATCH 10/11 v3] ARM: s3c24xx/s3c64xx: constify gpio_led
On Tue, Dec 26, 2017 at 7:50 PM, Arvind Yadav wrote:
> gpio_led are not supposed to change at runtime.
> struct gpio_led_platform_data working with const gpio_led
> provided by . So mark the non-const structs
> as const.
>
> Signed-off-by: Arvind Yadav
> ---
> changes in v2:
>   The GPIO LED driver can be built as a module, it can
>   be loaded after the init sections have gone away.
>   So removed '__initconst'.
> changes in v3:
>   Description was missing.
>
>  arch/arm/mach-s3c24xx/mach-h1940.c    | 2 +-
>  arch/arm/mach-s3c24xx/mach-rx1950.c   | 2 +-
>  arch/arm/mach-s3c64xx/mach-hmt.c      | 2 +-
>  arch/arm/mach-s3c64xx/mach-smartq5.c  | 2 +-
>  arch/arm/mach-s3c64xx/mach-smartq7.c  | 2 +-
>  arch/arm/mach-s3c64xx/mach-smdk6410.c | 2 +-
>  6 files changed, 6 insertions(+), 6 deletions(-)

There were a few build errors reported by kbuild for your patches. Are
you sure that you compiled every file you touch?

Best regards,
Krzysztof
[PATCH] ext4: Remove repeated test in ext4_file_read_iter.
generic_file_read_iter has done the count test.
So ext4_file_read_iter doesn't need to test the count repeatedly.

Signed-off-by: Sean Fu
---
 fs/ext4/file.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index a0ae27b..87ca13e 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -67,9 +67,6 @@ static ssize_t ext4_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	if (unlikely(ext4_forced_shutdown(EXT4_SB(file_inode(iocb->ki_filp)->i_sb))))
 		return -EIO;
 
-	if (!iov_iter_count(to))
-		return 0; /* skip atime */
-
 #ifdef CONFIG_FS_DAX
 	if (IS_DAX(file_inode(iocb->ki_filp)))
 		return ext4_dax_read_iter(iocb, to);
-- 
2.6.2
Re: [PATCH 1/2] MIPS: math-emu: Do not export function `srl128`
On Tue, Dec 26, 2017 at 7:10 PM, Aleksandar Markovic wrote:
>> > Fix non-fatal warning:
>> >
>> > arch/mips/math-emu/dp_maddf.c:19:6: warning: no previous prototype for
>> > ‘srl128’ [-Wmissing-prototypes]
>> >  void srl128(u64 *hptr, u64 *lptr, int count)
>> >
>> > Signed-off-by: Mathieu Malaterre
>> > ---
>> >  arch/mips/math-emu/dp_maddf.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
>> > index 7ad79ed411f5..0e2278a47f43 100644
>> > --- a/arch/mips/math-emu/dp_maddf.c
>> > +++ b/arch/mips/math-emu/dp_maddf.c
>> > @@ -16,7 +16,7 @@
>> >
>> >
>> >  /* 128 bits shift right logical with rounding. */
>> > -void srl128(u64 *hptr, u64 *lptr, int count)
>> > +static void srl128(u64 *hptr, u64 *lptr, int count)
>> >  {
>> >  	u64 low;
>> >
>> > --
>> > 2.11.0
>>
>> Acked-by: Aleksandar Markovic
>
> However, there is an already submitted patch: (the code change is identical)
>
> https://www.linux-mips.org/archives/linux-mips/2017-11/msg00044.html
>
> Status of that patch on patchwork is "Accepted".

Sorry, I did not realize you had sent it already.

Thanks
Re: [PATCH v2] arm64: dts: Hi3660: Fix up psci state id
Hi Leo On 25 December 2017 at 03:22, Leo Yan wrote: > Hi Vincent, > > [ + John, Kevin Wang ] > > On Fri, Dec 22, 2017 at 03:22:51PM +0100, Vincent Guittot wrote: >> Hi Leo, >> >> Sorry for jumping late in the discussion but should we also remove >> the NAP state from the property cpu-idle-states of the CPUs because >> this state not supported by the platform at least for now and may be >> not in a near future ? > > Thanks for bringing up this. > > I don't want to hide anything for patch discussion :) this patch is to > resolve the PSCI parameter mismatching issue between kernel and ARM-TF > and it's not used to resolve the bug for CPU_NAP, so I didn't mention > the CPU_NAP malfunction issue to avoid complex discussion context. > > I want to keep CPU_NAP state and track bug for CPU_NAP fixing; if we > remove this state, I suspect we might have no chance to enable it > anymore. Finally this is up to Hisilicon colleague decision and if they > have time to fix this. > > I will offline to check with Daniel and Kevin for this; and if we > finally decide to remove it we can commit extra patch for this later, > how about you think? I would prefer to remove it right now. Removing NAP from c-state table makes the hikey960 working correctly; I mean even with current ATF and current state id. So it's the best solution to the NAP problem IMO and I don't see the benefit of keeping NAP in the table until this state has been fixed. This will just add uncertainties in the behavior of the board. I don't see why you can't re-add it once it has been fixed. > >> Then, I have another question regarding the update of the >> psci-suspend-parameter. These changes implies an update of the psci >> firmawre which means that we will now have 2 different firmware >> version compatible with 2 different dt. >> >> Is there any way to check that the ATF on the board is the one that >> compatible with the parameter with something like a version ? 
I >> currently use the previous firmware which works fine with current >> kernel and dt binding once the NAP state is removed from the table. >> When moving on recent kernel, I will have to take care of updating the >> firmware and if i need to go back on a previous kernel, i will have to >> make sure that i have the right ATF version. This make a lot of chance >> of having the wrong configuration > > AFAIK, we cannot distinguish the PSCI parameter by PSCI version or And that's my main concern because this adds a new possible regression factor when switching between different kernel version > ARM-TF version number; alternatively one simple way for checking ARM-TF > is we can get commit ID (e.g. 83df7ce) from the ARM-TF log; so any > ARM-TF commit ID is newer than the patch fdae60b6ba27: "Hikey960: > Change to use recommended power state id format" should apply this > kernel patch. > > NOTICE: BL1: Booting BL31 > NOTICE: BL31: v1.4(debug):v1.4-441-g83df7ce-dirty > NOTICE: BL31: Built : 17:31:35, Dec 22 2017 > > BTW, I hope we can upgrade Linux kernel and ARM-TF to latest code base > to avoid compatible issue; for Android offical releasing it uses the > old PSCI parameters with Hisilicon legacy booting images, so they can > work well, but if someone uses ARM-TF mainline code + Android kernel > 4.4/4.9, there must have compatible issue. > > I am monitoring the integration ARM-TF/UEFI into Android on Hikey960, > we need backport this patch onto Android kernel 4.4/4.9 ASAP after > integration ARM-TF/UEFI. > > Thanks, > Leo Yan > >> Regards, >> Vincent >> >> On 12 December 2017 at 10:12, Leo Yan wrote: >> > Thanks a lot for Vincent Guittot careful work to find bug for 'CPU_NAP' >> > idle state. From ftrace log we can observe CA73 CPUs can be easily >> > waken up from 'CPU_NAP' state but the 'waken up' CPUs doesn't handle >> > anything and sleep again; so there have tons of trace events for CA73 >> > CPUs entering and exiting idle state. 
>> > >> > On Hi3660 CA73 has retention state 'CPU_NAP' for CPU idle, this state we >> > set its psci parameter as '0x001' and from this parameter it can >> > calculate state id is 1. Unfortunately ARM trusted firmware (ARM-TF) >> > takes 1 as a invalid value for state id, so the CPU cannot enter idle >> > state and directly bail out to kernel. >> > >> > We want to create good practice for psci parameters platform definition, >> > so review the psci specification. The spec "ARM Power State Coordination >> > Interface - Platform Design Document (ARM DEN 0022D)" recommends state >> > ID in chapter "6.5 Recommended StateID Encoding". The recommended power >> > state IDs can be presented by below listed values; and it divides into >> > three fields, every field can use 4 bits to present power states >> > corresponding to core level, cluster level and system level: >> > 0: Run >> > 1: Standby >> > 2: Retention >> > 3: Powerdown >> > >> > This commit changes psci parameter to compliance with the suggested >> > state ID in the doc. Except we change 'CPU
Re: [GIT PULL] tee dynamic shm for v4.16
On Mon, Dec 25, 2017 at 01:22:18PM -0800, thomas zeng wrote: > > > On 2017年12月21日 08:30, Arnd Bergmann wrote: > > On Fri, Dec 15, 2017 at 2:21 PM, Jens Wiklander > > wrote: > > > Hello arm-soc maintainers, > > > > > > Please pull these tee driver changes. This implements support for dynamic > > > shared memory support in OP-TEE. More specifically is enables mapping of > > > user space memory in secure world to be used as shared memory. > > > > > > This has been reviewed and refined by the OP-TEE community at various > > > places on Github during the last year. An earlier version of this pull > > > request is used in the latest OP-TEE release (2.6.0). This has also been > > > reviewed recently at the kernel mailing lists, with all comments from > > > Mark Rutland and Yury Norov > > > addressed as far as I can tell. > > > > > > This isn't a bugfix so I'm aiming for the next merge window. > > Given that Mark and Yury reviewed this, I'm assuming this is all > > good and have now merged it. However I missed the entire discussion > > about it, so I have one question about the implementation: > > > > What happens when user space passes a buffer that is not > > backed by regular memory but instead is something it has itself > > mapped from a device with special page attributes or physical > > properties? Could this be inconsistent when optee and user > > space disagree on the caching attributes? Can you get into > > trouble if you pass an area from a device that is read-only > > in user space but writable from secure world? Read-only memory is dealt with by calling get_user_pages_fast() with the 'write' parameter set to 1. Mismatch in cache attributes isn't addressed though. This is something that should be checked in the OP-TEE driver, typically drivers/tee/optee/core.c. I would like to add another patch on top of this patch series to guard against cache attributes which aren't normal cached memory. 
So far I haven't been able to find a nice way of doing that; I'd
appreciate any advice or ideas on how to deal with this.

>
> Just recently, we have started to kick the tires of these "shm" related Gen
> Tee Driver patches. And we have in the past encountered real world
> scenarios requiring some of the shared memory regions to be marked as
> "normal IC=0 and OC=0" in EL2 or SEL1, or else HW would misbehave. We worked
> around by hacking the boot code but that works if the regions are
> pre-allocated. Since now these regions can also be managed dynamically, we
> definitely agree with Arnd Bergmann that the dynamic registration SMC
> commands, and potentially the SHM IOCTL commands, must convey cache
> intentions. Is it possible to take this requirement into consideration, in
> this iteration or the follow on?

I'd be happy to discuss using different cache attributes outside this
patch series. We have so far avoided specifying cache attributes by
calling it normal cached memory. Now that we have one use case we're
able to take the next step here.

Thanks,
Jens
Re: [PATCH 2/2] MIPS: math-emu: Declare ys variable as possibly unused
Aleksandar, On Tue, Dec 26, 2017 at 4:12 PM, Aleksandar Markovic wrote: >> Fix non-fatal warning: >> >> arch/mips/math-emu/sp_fdp.c: In function ‘ieee754sp_fdp’: >> arch/mips/math-emu/ieee754int.h:60:31: warning: variable ‘ys’ set but not >> used [-Wunused-but-set-variable] >> unsigned int ym; int ye; int ys; int yc >>^ >> arch/mips/math-emu/sp_fdp.c:37:2: note: in expansion of macro ‘COMPYSP’ >> COMPYSP; >> ^~~ >> >> Signed-off-by: Mathieu Malaterre >> --- >> arch/mips/math-emu/ieee754int.h | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/arch/mips/math-emu/ieee754int.h >> b/arch/mips/math-emu/ieee754int.h >> index 06ac0e2ac7ac..cb8f04cd24bf 100644 >> --- a/arch/mips/math-emu/ieee754int.h >> +++ b/arch/mips/math-emu/ieee754int.h >> @@ -57,7 +57,7 @@ static inline int ieee754_class_nan(int xc) >> unsigned int xm; int xe; int xs __maybe_unused; int xc >> >> #define COMPYSP \ >> - unsigned int ym; int ye; int ys; int yc >> + unsigned int ym; int ye; int ys __maybe_unused; int yc >> >> #define COMPZSP \ >> unsigned int zm; int ze; int zs; int zc > > This will silence the warning, but will do it for all future cases of unused > ys too - in other words, it may well silence even useful, valid warnings. > Also, this introduces an inconsistency among COMPXSP, COMPYSP, and COMPZSP > macros. > > A better solution would be to reduce the scope of ys, so that it is always > used, if declared. Instead of this code segment (in > arch/mips/math-emu/sp_fdp.c): > > union ieee754sp ieee754sp_fdp(union ieee754dp x) > { > union ieee754sp y; > u32 rm; > > COMPXDP; > COMPYSP; > > EXPLODEXDP; > > ieee754_clearcx(); > > FLUSHXDP; > > switch (xc) { > case IEEE754_CLASS_SNAN: > x = ieee754dp_nanxcpt(x); > EXPLODEXDP; > /* Fall through. */ > case IEEE754_CLASS_QNAN: > y = ieee754sp_nan_fdp(xs, xm); > if (!ieee754_csr.nan2008) { > EXPLODEYSP; > if (!ieee754_class_nan(yc)) > y = ieee754sp_indef(); > } > return y; > > > ... 
should be the following: (COMPYSP is moved to a smaller code block) > > union ieee754sp ieee754sp_fdp(union ieee754dp x) > { > union ieee754sp y; > u32 rm; > > COMPXDP; > > EXPLODEXDP; > > ieee754_clearcx(); > > FLUSHXDP; > > switch (xc) { > case IEEE754_CLASS_SNAN: > x = ieee754dp_nanxcpt(x); > EXPLODEXDP; > /* Fall through. */ > case IEEE754_CLASS_QNAN: > { > COMPYSP; > > y = ieee754sp_nan_fdp(xs, xm); > if (!ieee754_csr.nan2008) { > EXPLODEYSP; > if (!ieee754_class_nan(yc)) > y = ieee754sp_indef(); > } > return y; > } > Thanks for the suggestion. However the sign bit is still not used, so the warning is still there. Just for clarity did you see that: #define COMPXSP \ unsigned int xm; int xe; int xs __maybe_unused; int xc I'll try to give it some more thoughts, and come up with something hopefully working. -M
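For readers unfamiliar with the annotation discussed in this thread, here is a small userspace sketch of a declaration macro whose sign variable is set but never read. The names maybe_unused, DECLARE_PARTS, EXPLODE_PARTS and is_nan_bits are hypothetical and only loosely modelled on COMPYSP/EXPLODEYSP; the attribute is the GCC/Clang one that, as far as I know, the kernel's __maybe_unused expands to.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's __maybe_unused annotation. */
#define maybe_unused __attribute__((unused))

/* Declares mantissa, exponent and sign; the sign may go unread. */
#define DECLARE_PARTS \
	unsigned int ym; int ye; int ys maybe_unused

/* Splits a 32-bit float bit pattern into the variables declared above. */
#define EXPLODE_PARTS(bits) do {		\
	ys = (int)((bits) >> 31);		\
	ye = (int)(((bits) >> 23) & 0xff);	\
	ym = (bits) & 0x7fffffu;		\
} while (0)

/* Returns 1 when the pattern encodes a NaN; ys is set but never read. */
int is_nan_bits(unsigned int bits)
{
	DECLARE_PARTS;

	EXPLODE_PARTS(bits);
	return ye == 0xff && ym != 0;
}
```

Dropping maybe_unused from DECLARE_PARTS and building with -Wall should bring back the "set but not used" warning for ys, which matches the observation that narrowing the scope alone does not silence it.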
Re: [PATCH] x86/cpu, x86/pti: Do not enable PTI on AMD processors
On 12/26/2017 09:43 PM, Tom Lendacky wrote:
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -923,8 +923,8 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
>
>  	setup_force_cpu_cap(X86_FEATURE_ALWAYS);
>
> -	/* Assume for now that ALL x86 CPUs are insecure */
> -	setup_force_cpu_bug(X86_BUG_CPU_INSECURE);
> +	if (c->x86_vendor != X86_VENDOR_AMD)
> +		setup_force_cpu_bug(X86_BUG_CPU_INSECURE);

Does this disable it in a way that it can be turned back on via the
kernel command-line? This is a rather wide class of issues and I would
rather not just hard-code it in a way that we say one vendor has never
and will never be affected.
[PATCH v6 0/8] add support for relative references in special sections
This adds support for emitting special sections such as initcall arrays, PCI fixups and tracepoints as relative references rather than absolute references. This reduces the size by 50% on 64-bit architectures, but more importantly, it removes the need for carrying relocation metadata for these sections in relocatables kernels (e.g., for KASLR) that need to fix up these absolute references at boot time. On arm64, this reduces the vmlinux footprint of such a reference by 8x (8 byte absolute reference + 24 byte RELA entry vs 4 byte relative reference) Patch #2 was sent out before as a single patch. This series supersedes the previous submission. This version makes relative ksymtab entries dependent on the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS rather than trying to infer from kbuild test robot replies for which architectures it should be blacklisted. Patch #1 introduces the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS, and sets it for the main architectures that are expected to benefit the most from this feature, i.e., 64-bit architectures or ones that use runtime relocations. Patches #3 - #5 implement relative references for initcalls, PCI fixups and tracepoints, respectively, all of which produce sections with order ~1000 entries on an arm64 defconfig kernel with tracing enabled. This means we save about 28 KB of vmlinux space for each of these patches. Patches #6 - #8 have been added in v5, and implement relative references in jump tables for arm64 and x86. On arm64, this results in significant space savings (650+ KB on a typical distro kernel). On x86, the savings are not as impressive, but still worthwhile. 
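As a rough illustration of what a relative reference is and where the numbers above come from, here is a userspace sketch. The helper names (rel_ref_set, rel_ref_get, saved_bytes) are hypothetical and not taken from the series; the byte sizes are the arm64 figures quoted earlier in this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes quoted in the cover letter for arm64. */
enum {
	ABS_REF_BYTES = 8,	/* absolute 64-bit reference */
	RELA_BYTES    = 24,	/* RELA entry needed to relocate it */
	REL_REF_BYTES = 4,	/* 32-bit relative reference */
};

/* Bytes saved when n entries are emitted as relative references. */
static inline long saved_bytes(long n)
{
	return n * (ABS_REF_BYTES + RELA_BYTES - REL_REF_BYTES);
}

/* Store the distance from the entry itself to the target. */
static inline void rel_ref_set(int32_t *entry, const void *target)
{
	intptr_t delta = (intptr_t)((uintptr_t)target - (uintptr_t)entry);

	assert(delta >= INT32_MIN && delta <= INT32_MAX);
	*entry = (int32_t)delta;
}

/* Resolve the entry: target address = address of entry + stored offset. */
static inline const void *rel_ref_get(const int32_t *entry)
{
	return (const void *)((uintptr_t)entry + (intptr_t)*entry);
}
```

Because the stored value is a difference between two addresses in the same image, it does not change when the image is moved as a whole, which is why no relocation entry is needed. For a section with about 1000 entries, saved_bytes(1000) gives 28000 bytes, in line with the "about 28 KB" per patch mentioned above.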
(Note that these patches do not rely on CONFIG_HAVE_ARCH_PREL32_RELOCATIONS, given that the inline asm that is emitted is already per-arch) For the arm64 kernel, all patches combined reduce the memory footprint of vmlinux by about 1.3 MB (using a config copied from Ubuntu that has KASLR enabled), of which ~1 MB is the size reduction of the RELA section in .init, and the remaining 300 KB is reduction of .text/.data. Branch: git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git relative-special-sections-v6 Changes since v5: - add missing jump_label prototypes to s390 jump_label.h (#6) - fix inverted condition in call to jump_entry_is_module_init() (#6) Changes since v4: - add patches to convert x86 and arm64 to use relative references for jump tables (#6 - #8) - rename PCI patch and add Bjorn's ack (#4) - rebase onto v4.15-rc5 Changes since v3: - fix module unload issue in patch #5 reported by Jessica, by reusing the updated routine for_each_tracepoint_range() for the quiescent check at module unload time; this requires this routine to be moved before tracepoint_module_going() in kernel/tracepoint.c - add Jessica's ack to #2 - rebase onto v4.14-rc1 Changes since v2: - Revert my slightly misguided attempt to appease checkpatch, which resulted in needless churn and worse code. This v3 is based on v1 with a few tweaks that were actually reasonable checkpatch warnings: unnecessary braces (as pointed out by Ingo) and other minor whitespace misdemeanors. Changes since v1: - Remove checkpatch errors to the extent feasible: in some cases, this involves moving extern declarations into C files, and switching to struct definitions rather than typedefs. Some errors are impossible to fix: please find the remaining ones after the diffstat. 
- Used 'int' instead of 'signed int' for the various offset fields: there
  is no ambiguity between architectures regarding its signedness (unlike
  'char')
- Refactor the different patches to be more uniform in the way they define
  the section entry type and accessors in the .h file, and avoid the need
  to add #ifdefs to the C code.

Cc: "H. Peter Anvin"
Cc: Ralf Baechle
Cc: Arnd Bergmann
Cc: Heiko Carstens
Cc: Kees Cook
Cc: Will Deacon
Cc: Michael Ellerman
Cc: Thomas Garnier
Cc: Thomas Gleixner
Cc: "Serge E. Hallyn"
Cc: Bjorn Helgaas
Cc: Benjamin Herrenschmidt
Cc: Russell King
Cc: Paul Mackerras
Cc: Catalin Marinas
Cc: "David S. Miller"
Cc: Petr Mladek
Cc: Ingo Molnar
Cc: James Morris
Cc: Andrew Morton
Cc: Nicolas Pitre
Cc: Josh Poimboeuf
Cc: Steven Rostedt
Cc: Martin Schwidefsky
Cc: Sergey Senozhatsky
Cc: Linus Torvalds
Cc: Jessica Yu
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-m...@linux-mips.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: x...@kernel.org

Ard Biesheuvel (8):
  arch: enable relative relocations for arm64, power, x86, s390 and x86
  module: use relative references for __ksymtab entries
  init: allow initcall tables to be emitted using relative references
  PCI: Add support for relative addressing in quirk tables
  kernel: tracepoints: add support for relative references
  kernel/jump_label: abstract jump_entry member accessors
  arm64/kernel: jump_label: use relative references
  x86/kern
[PATCH] perf test shell: Add -D to check dynamic symbols for ubuntu/debian
On Ubuntu and Debian, we can't find any symbol including "inet_pton" from
'nm -g':

root@vm-lkp-nex04-8G-5 ~# nm -g /lib/x86_64-linux-gnu/libc-2.25.so | grep inet_pton
nm: /lib/x86_64-linux-gnu/libc-2.25.so: no symbols

It looks like libc.so has different symbol compositions on different
distros.

Usage: nm [option(s)] [file(s)]
 List symbols in [file(s)] (a.out by default).
 The options are:
...snip...
  -D, --dynamic          Display dynamic symbols instead of normal symbols
      --defined-only     Display only defined symbols
  -e                     (ignored)
  -f, --format=FORMAT    Use the output format FORMAT. FORMAT can be `bsd',
                         `sysv' or `posix'. The default is `bsd'
  -g, --extern-only      Display only external symbols

I tested on both Debian/Ubuntu and RHEL; they work as expected.

CC: Thomas Richter
CC: Arnaldo Carvalho de Melo
Signed-off-by: Li Zhijian
---
 tools/perf/tests/shell/trace+probe_libc_inet_pton.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh b/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh
index 8b3da21..f939bd6 100755
--- a/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh
+++ b/tools/perf/tests/shell/trace+probe_libc_inet_pton.sh
@@ -11,7 +11,7 @@
 . $(dirname $0)/lib/probe.sh
 
 libc=$(grep -w libc /proc/self/maps | head -1 | sed -r 's/.*[[:space:]](\/.*)/\1/g')
-nm -g $libc 2>/dev/null | fgrep -q inet_pton || exit 254
+nm -gD $libc 2>/dev/null | fgrep -q inet_pton || exit 254
 
 trace_libc_inet_pton_backtrace() {
 	idx=0
-- 
2.7.4
[PATCH v6 6/8] kernel/jump_label: abstract jump_entry member accessors
In preparation of allowing architectures to use relative references in jump_label entries [which can dramatically reduce the memory footprint], introduce abstractions for references to the 'code' and 'key' members of struct jump_entry. Signed-off-by: Ard Biesheuvel --- arch/arm/include/asm/jump_label.h | 27 ++ arch/arm64/include/asm/jump_label.h | 27 ++ arch/mips/include/asm/jump_label.h| 27 ++ arch/powerpc/include/asm/jump_label.h | 27 ++ arch/s390/include/asm/jump_label.h| 20 +++ arch/sparc/include/asm/jump_label.h | 27 ++ arch/tile/include/asm/jump_label.h| 27 ++ arch/x86/include/asm/jump_label.h | 27 ++ kernel/jump_label.c | 38 +--- 9 files changed, 225 insertions(+), 22 deletions(-) diff --git a/arch/arm/include/asm/jump_label.h b/arch/arm/include/asm/jump_label.h index e12d7d096fc0..7b05b404063a 100644 --- a/arch/arm/include/asm/jump_label.h +++ b/arch/arm/include/asm/jump_label.h @@ -45,5 +45,32 @@ struct jump_entry { jump_label_t key; }; +static inline jump_label_t jump_entry_code(const struct jump_entry *entry) +{ + return entry->code; +} + +static inline struct static_key *jump_entry_key(const struct jump_entry *entry) +{ + return (struct static_key *)((unsigned long)entry->key & ~1UL); +} + +static inline bool jump_entry_is_branch(const struct jump_entry *entry) +{ + return (unsigned long)entry->key & 1UL; +} + +static inline bool jump_entry_is_module_init(const struct jump_entry *entry) +{ + return entry->code == 0; +} + +static inline void jump_entry_set_module_init(struct jump_entry *entry) +{ + entry->code = 0; +} + +#define jump_label_swapNULL + #endif /* __ASSEMBLY__ */ #endif diff --git a/arch/arm64/include/asm/jump_label.h b/arch/arm64/include/asm/jump_label.h index 1b5e0e843c3a..9d6e46355c89 100644 --- a/arch/arm64/include/asm/jump_label.h +++ b/arch/arm64/include/asm/jump_label.h @@ -62,5 +62,32 @@ struct jump_entry { jump_label_t key; }; +static inline jump_label_t jump_entry_code(const struct jump_entry *entry) +{ + return entry->code; +} + 
+static inline struct static_key *jump_entry_key(const struct jump_entry *entry) +{ + return (struct static_key *)((unsigned long)entry->key & ~1UL); +} + +static inline bool jump_entry_is_branch(const struct jump_entry *entry) +{ + return (unsigned long)entry->key & 1UL; +} + +static inline bool jump_entry_is_module_init(const struct jump_entry *entry) +{ + return entry->code == 0; +} + +static inline void jump_entry_set_module_init(struct jump_entry *entry) +{ + entry->code = 0; +} + +#define jump_label_swapNULL + #endif /* __ASSEMBLY__ */ #endif /* __ASM_JUMP_LABEL_H */ diff --git a/arch/mips/include/asm/jump_label.h b/arch/mips/include/asm/jump_label.h index e77672539e8e..70df9293dc49 100644 --- a/arch/mips/include/asm/jump_label.h +++ b/arch/mips/include/asm/jump_label.h @@ -66,5 +66,32 @@ struct jump_entry { jump_label_t key; }; +static inline jump_label_t jump_entry_code(const struct jump_entry *entry) +{ + return entry->code; +} + +static inline struct static_key *jump_entry_key(const struct jump_entry *entry) +{ + return (struct static_key *)((unsigned long)entry->key & ~1UL); +} + +static inline bool jump_entry_is_branch(const struct jump_entry *entry) +{ + return (unsigned long)entry->key & 1UL; +} + +static inline bool jump_entry_is_module_init(const struct jump_entry *entry) +{ + return entry->code == 0; +} + +static inline void jump_entry_set_module_init(struct jump_entry *entry) +{ + entry->code = 0; +} + +#define jump_label_swapNULL + #endif /* __ASSEMBLY__ */ #endif /* _ASM_MIPS_JUMP_LABEL_H */ diff --git a/arch/powerpc/include/asm/jump_label.h b/arch/powerpc/include/asm/jump_label.h index 9a287e0ac8b1..412b2699c9f6 100644 --- a/arch/powerpc/include/asm/jump_label.h +++ b/arch/powerpc/include/asm/jump_label.h @@ -59,6 +59,33 @@ struct jump_entry { jump_label_t key; }; +static inline jump_label_t jump_entry_code(const struct jump_entry *entry) +{ + return entry->code; +} + +static inline struct static_key *jump_entry_key(const struct jump_entry 
*entry) +{ + return (struct static_key *)((unsigned long)entry->key & ~1UL); +} + +static inline bool jump_entry_is_branch(const struct jump_entry *entry) +{ + return (unsigned long)entry->key & 1UL; +} + +static inline bool jump_entry_is_module_init(const struct jump_entry *entry) +{ + return entry->code == 0; +} + +static inline void jump_entry_set_module_init(struct jump_entry *entry) +{ + entry->code = 0; +} + +#define jump_label_swapNULL + #else #define ARCH_STATIC_BRANCH(LABEL, KEY) \ 1098: nop;\ diff --git a/arch/s390/include/asm/jump_label.
[PATCH v6 5/8] kernel: tracepoints: add support for relative references
To avoid the need for relocating absolute references to tracepoint structures at boot time when running relocatable kernels (which may take a disproportionate amount of space), add the option to emit these tables as relative references instead. Cc: Steven Rostedt Cc: Ingo Molnar Signed-off-by: Ard Biesheuvel --- include/linux/tracepoint.h | 19 ++-- kernel/tracepoint.c| 50 +++- 2 files changed, 42 insertions(+), 27 deletions(-) diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index a26ffbe09e71..d02bf1a695e8 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -228,6 +228,19 @@ extern void syscall_unregfunc(void); return static_key_false(&__tracepoint_##name.key); \ } +#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS +#define __TRACEPOINT_ENTRY(name)\ + asm(" .section \"__tracepoints_ptrs\", \"a\" \n" \ + " .balign 4\n" \ + " .long " VMLINUX_SYMBOL_STR(__tracepoint_##name) " - .\n" \ + " .previous\n") +#else +#define __TRACEPOINT_ENTRY(name)\ + static struct tracepoint * const __tracepoint_ptr_##name __used \ + __attribute__((section("__tracepoints_ptrs"))) = \ + &__tracepoint_##name +#endif + /* * We have no guarantee that gcc and the linker won't up-align the tracepoint * structures, so we create an array of pointers that will be used for iteration @@ -237,11 +250,9 @@ extern void syscall_unregfunc(void); static const char __tpstrtab_##name[]\ __attribute__((section("__tracepoints_strings"))) = #name; \ struct tracepoint __tracepoint_##name\ - __attribute__((section("__tracepoints"))) = \ + __attribute__((section("__tracepoints"), used)) =\ { __tpstrtab_##name, STATIC_KEY_INIT_FALSE, reg, unreg, NULL };\ - static struct tracepoint * const __tracepoint_ptr_##name __used \ - __attribute__((section("__tracepoints_ptrs"))) = \ - &__tracepoint_##name; + __TRACEPOINT_ENTRY(name); #define DEFINE_TRACE(name) \ DEFINE_TRACE_FN(name, NULL, NULL); diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c index 685c50ae6300..05649fef106c 
100644 --- a/kernel/tracepoint.c +++ b/kernel/tracepoint.c @@ -327,6 +327,28 @@ int tracepoint_probe_unregister(struct tracepoint *tp, void *probe, void *data) } EXPORT_SYMBOL_GPL(tracepoint_probe_unregister); +static void for_each_tracepoint_range(struct tracepoint * const *begin, + struct tracepoint * const *end, + void (*fct)(struct tracepoint *tp, void *priv), + void *priv) +{ + if (!begin) + return; + + if (IS_ENABLED(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS)) { + const int *iter; + + for (iter = (const int *)begin; iter < (const int *)end; iter++) + fct((struct tracepoint *)((unsigned long)iter + *iter), + priv); + } else { + struct tracepoint * const *iter; + + for (iter = begin; iter < end; iter++) + fct(*iter, priv); + } +} + #ifdef CONFIG_MODULES bool trace_module_has_bad_taint(struct module *mod) { @@ -391,15 +413,9 @@ EXPORT_SYMBOL_GPL(unregister_tracepoint_module_notifier); * Ensure the tracer unregistered the module's probes before the module * teardown is performed. Prevents leaks of probe and data pointers. */ -static void tp_module_going_check_quiescent(struct tracepoint * const *begin, - struct tracepoint * const *end) +static void tp_module_going_check_quiescent(struct tracepoint *tp, void *priv) { - struct tracepoint * const *iter; - - if (!begin) - return; - for (iter = begin; iter < end; iter++) - WARN_ON_ONCE((*iter)->funcs); + WARN_ON_ONCE(tp->funcs); } static int tracepoint_module_coming(struct module *mod) @@ -450,8 +466,9 @@ static void tracepoint_module_going(struct module *mod) * Called the going notifier before checking for * quiescence. */ - tp_module_going_check_quiescent(mod->tracepoints_ptrs, - mod->tracepoints_ptrs + mod->num_tracepoints); + for_each_tracepoint_range(mod->tracepoints_ptrs, + mod->tracepoints_ptrs + mod->num_tracepoints, + tp_module_going_check_quiescent, NULL); break; } } @@ -503,1
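For readers following the tracepoint change outside the kernel tree, the place-relative walk that for_each_tracepoint_range() performs can be sketched in user-space C. This is only an illustration under stated assumptions: struct tp, image, rel_table_init(), for_each_rel() and walk_once() are hypothetical names, and the offsets are filled in at run time because there is no linker section here, whereas the patch has the assembler emit them with ".long sym - .".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_TP 3

/* Hypothetical stand-in for struct tracepoint. */
struct tp {
	const char *name;
	int hits;
};

/* Targets and table live in one object so every offset fits in 32 bits. */
static struct {
	struct tp tps[NR_TP];
	int32_t ptrs[NR_TP];	/* stand-in for the __tracepoints_ptrs section */
} image = {
	.tps = { { "sched", 0 }, { "irq", 0 }, { "timer", 0 } },
};

/* Each slot stores "target address minus slot address". */
static void rel_table_init(void)
{
	for (size_t i = 0; i < NR_TP; i++) {
		intptr_t delta = (intptr_t)&image.tps[i] -
				 (intptr_t)&image.ptrs[i];

		assert(delta >= INT32_MIN && delta <= INT32_MAX);
		image.ptrs[i] = (int32_t)delta;
	}
}

/* Resolve each slot by adding its value to its own address. */
static void for_each_rel(const int32_t *begin, const int32_t *end,
			 void (*fct)(struct tp *tp, void *priv), void *priv)
{
	const int32_t *iter;

	for (iter = begin; iter < end; iter++)
		fct((struct tp *)((intptr_t)iter + *iter), priv);
}

static void count_hit(struct tp *tp, void *priv)
{
	tp->hits++;
	(*(int *)priv)++;
}

/* Build the table, walk it once, and return the number of entries visited. */
static int walk_once(void)
{
	int visited = 0;

	rel_table_init();
	for_each_rel(image.ptrs, image.ptrs + NR_TP, count_hit, &visited);
	return visited;
}
```

Because every slot is relative to its own address, the table stays valid wherever the whole image is loaded, which is the property that lets a relocatable kernel skip fixing it up at boot.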
[PATCH v6 3/8] init: allow initcall tables to be emitted using relative references
Allow the initcall tables to be emitted using relative references that are only half the size on 64-bit architectures and don't require fixups at runtime on relocatable kernels. Cc: Petr Mladek Cc: Sergey Senozhatsky Cc: Steven Rostedt Cc: James Morris Cc: "Serge E. Hallyn" Signed-off-by: Ard Biesheuvel --- include/linux/init.h | 44 +++- init/main.c| 32 +++--- kernel/printk/printk.c | 4 +- security/security.c| 4 +- 4 files changed, 53 insertions(+), 31 deletions(-) diff --git a/include/linux/init.h b/include/linux/init.h index ea1b31101d9e..125bbea99c6b 100644 --- a/include/linux/init.h +++ b/include/linux/init.h @@ -109,8 +109,24 @@ typedef int (*initcall_t)(void); typedef void (*exitcall_t)(void); -extern initcall_t __con_initcall_start[], __con_initcall_end[]; -extern initcall_t __security_initcall_start[], __security_initcall_end[]; +#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS +typedef signed int initcall_entry_t; + +static inline initcall_t initcall_from_entry(initcall_entry_t *entry) +{ + return (initcall_t)((unsigned long)entry + *entry); +} +#else +typedef initcall_t initcall_entry_t; + +static inline initcall_t initcall_from_entry(initcall_entry_t *entry) +{ + return *entry; +} +#endif + +extern initcall_entry_t __con_initcall_start[], __con_initcall_end[]; +extern initcall_entry_t __security_initcall_start[], __security_initcall_end[]; /* Used for contructor calls. */ typedef void (*ctor_fn_t)(void); @@ -160,9 +176,20 @@ extern bool initcall_debug; * as KEEP() in the linker script. 
*/ -#define __define_initcall(fn, id) \ +#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS +#define ___define_initcall(fn, id, __sec) \ + __ADDRESSABLE(fn) \ + asm(".section \"" #__sec ".init\", \"a\" \n" \ + "__initcall_" #fn #id ":\n" \ + ".long "VMLINUX_SYMBOL_STR(fn) " - .\n" \ + ".previous \n"); +#else +#define ___define_initcall(fn, id, __sec) \ static initcall_t __initcall_##fn##id __used \ - __attribute__((__section__(".initcall" #id ".init"))) = fn; + __attribute__((__section__(#__sec ".init"))) = fn; +#endif + +#define __define_initcall(fn, id) ___define_initcall(fn, id, .initcall##id) /* * Early initcalls run before initializing SMP. @@ -201,13 +228,8 @@ extern bool initcall_debug; #define __exitcall(fn) \ static exitcall_t __exitcall_##fn __exit_call = fn -#define console_initcall(fn) \ - static initcall_t __initcall_##fn \ - __used __section(.con_initcall.init) = fn - -#define security_initcall(fn) \ - static initcall_t __initcall_##fn \ - __used __section(.security_initcall.init) = fn +#define console_initcall(fn) ___define_initcall(fn,, .con_initcall) +#define security_initcall(fn) ___define_initcall(fn,, .security_initcall) struct obs_kernel_param { const char *str; diff --git a/init/main.c b/init/main.c index 7b606fc48482..2cbe3c2804ab 100644 --- a/init/main.c +++ b/init/main.c @@ -845,18 +845,18 @@ int __init_or_module do_one_initcall(initcall_t fn) } -extern initcall_t __initcall_start[]; -extern initcall_t __initcall0_start[]; -extern initcall_t __initcall1_start[]; -extern initcall_t __initcall2_start[]; -extern initcall_t __initcall3_start[]; -extern initcall_t __initcall4_start[]; -extern initcall_t __initcall5_start[]; -extern initcall_t __initcall6_start[]; -extern initcall_t __initcall7_start[]; -extern initcall_t __initcall_end[]; - -static initcall_t *initcall_levels[] __initdata = { +extern initcall_entry_t __initcall_start[]; +extern initcall_entry_t __initcall0_start[]; +extern initcall_entry_t __initcall1_start[]; +extern initcall_entry_t 
__initcall2_start[]; +extern initcall_entry_t __initcall3_start[]; +extern initcall_entry_t __initcall4_start[]; +extern initcall_entry_t __initcall5_start[]; +extern initcall_entry_t __initcall6_start[]; +extern initcall_entry_t __initcall7_start[]; +extern initcall_entry_t __initcall_end[]; + +static initcall_entry_t *initcall_levels[] __initdata = { __initcall0_start, __initcall1_start, __initcall2_start, @@ -882,7 +882,7 @@ static char *initcall_level_names[] __initdata = { static void __init do_initcall_level(int level) { - initcall_t *fn; + initcall_entry_t *fn; strcpy(initcall_command_line, saved_command_line); parse_args(initcall_level_names[level], @@ -892,7 +892,7 @@ static void __init do_initcall_level(int level) NULL, &repair_env_string); for (fn = initcall_levels[level]; fn < initcall_levels[level+1]; fn++) - do_one_initcall(*fn); + do_one_ini
[PATCH v6 4/8] PCI: Add support for relative addressing in quirk tables
Allow the PCI quirk tables to be emitted in a way that avoids absolute
references to the hook functions. This reduces the size of the entries,
and, more importantly, makes them invariant under runtime relocation
(e.g., for KASLR)

Acked-by: Bjorn Helgaas
Signed-off-by: Ard Biesheuvel
---
 drivers/pci/quirks.c | 13 ++++++++++---
 include/linux/pci.h  | 20 ++++++++++++++++++++
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 10684b17d0bd..b6d51b4d5ce1 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3556,9 +3556,16 @@ static void pci_do_fixups(struct pci_dev *dev, struct pci_fixup *f,
 		     f->vendor == (u16) PCI_ANY_ID) &&
 		    (f->device == dev->device ||
 		     f->device == (u16) PCI_ANY_ID)) {
-			calltime = fixup_debug_start(dev, f->hook);
-			f->hook(dev);
-			fixup_debug_report(dev, calltime, f->hook);
+			void (*hook)(struct pci_dev *dev);
+#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
+			hook = (void *)((unsigned long)&f->hook_offset +
+					f->hook_offset);
+#else
+			hook = f->hook;
+#endif
+			calltime = fixup_debug_start(dev, hook);
+			hook(dev);
+			fixup_debug_report(dev, calltime, hook);
 		}
 }
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c170c9250c8b..e8c34afb5d4a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1792,7 +1792,11 @@ struct pci_fixup {
 	u16 device;		/* You can use PCI_ANY_ID here of course */
 	u32 class;		/* You can use PCI_ANY_ID here too */
 	unsigned int class_shift;	/* should be 0, 8, 16 */
+#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
+	signed int hook_offset;
+#else
 	void (*hook)(struct pci_dev *dev);
+#endif
 };
 
 enum pci_fixup_pass {
@@ -1806,12 +1810,28 @@ enum pci_fixup_pass {
 	pci_fixup_suspend_late,	/* pci_device_suspend_late() */
 };
 
+#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
+#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
+				    class_shift, hook)			\
+	__ADDRESSABLE(hook)						\
+	asm(".section "	#sec ", \"a\"				\n"	\
+	    ".balign	16					\n"	\
+	    ".short "	#vendor ", " #device "			\n"	\
+	    ".long "	#class ", " #class_shift "		\n"	\
+	    ".long "	VMLINUX_SYMBOL_STR(hook) " - .		\n"	\
+	    ".previous						\n");
+#define DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
+				  class_shift, hook)			\
+	__DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,	\
+				  class_shift, hook)
+#else
 /* Anonymous variables would be nice... */
 #define DECLARE_PCI_FIXUP_SECTION(section, name, vendor, device, class,	\
 				  class_shift, hook)			\
 	static const struct pci_fixup __PASTE(__pci_fixup_##name,__LINE__) __used	\
 	__attribute__((__section__(#section), aligned((sizeof(void *)))))    \
 		= { vendor, device, class, class_shift, hook };
+#endif
 
 #define DECLARE_PCI_FIXUP_CLASS_EARLY(vendor, device, class,		\
 					 class_shift, hook)		\
-- 
2.11.0
[PATCH v6 8/8] x86/kernel: jump_table: use relative references
Similar to the arm64 case, 64-bit x86 can benefit from using 32-bit relative references rather than 64-bit absolute ones when emitting struct jump_entry instances. Not only does this reduce the memory footprint of the entries themselves by 50%, it also removes the need for carrying relocation metadata on relocatable builds (i.e., for KASLR) which saves a fair chunk of .init space as well (although the savings are not as dramatic as on arm64) Signed-off-by: Ard Biesheuvel --- arch/x86/include/asm/jump_label.h | 35 +++- arch/x86/kernel/jump_label.c | 59 ++-- tools/objtool/special.c | 4 +- 3 files changed, 65 insertions(+), 33 deletions(-) diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h index 009ff2699d07..91c01af96907 100644 --- a/arch/x86/include/asm/jump_label.h +++ b/arch/x86/include/asm/jump_label.h @@ -36,8 +36,8 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran asm_volatile_goto("1:" ".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t" ".pushsection __jump_table, \"aw\" \n\t" - _ASM_ALIGN "\n\t" - _ASM_PTR "1b, %l[l_yes], %c0 + %c1 \n\t" + ".balign 4\n\t" + ".long 1b - ., %l[l_yes] - ., %c0 + %c1 - .\n\t" ".popsection \n\t" : : "i" (key), "i" (branch) : : l_yes); @@ -52,8 +52,8 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool ".byte 0xe9\n\t .long %l[l_yes] - 2f\n\t" "2:\n\t" ".pushsection __jump_table, \"aw\" \n\t" - _ASM_ALIGN "\n\t" - _ASM_PTR "1b, %l[l_yes], %c0 + %c1 \n\t" + ".balign 4\n\t" + ".long 1b - ., %l[l_yes] - ., %c0 + %c1 - .\n\t" ".popsection \n\t" : : "i" (key), "i" (branch) : : l_yes); @@ -69,19 +69,26 @@ typedef u32 jump_label_t; #endif struct jump_entry { - jump_label_t code; - jump_label_t target; - jump_label_t key; + s32 code; + s32 target; + s32 key; }; static inline jump_label_t jump_entry_code(const struct jump_entry *entry) { - return entry->code; + return (jump_label_t)&entry->code + entry->code; +} + +static inline jump_label_t 
jump_entry_target(const struct jump_entry *entry) +{ + return (jump_label_t)&entry->target + entry->target; } static inline struct static_key *jump_entry_key(const struct jump_entry *entry) { - return (struct static_key *)((unsigned long)entry->key & ~1UL); + unsigned long key = (unsigned long)&entry->key + entry->key; + + return (struct static_key *)(key & ~1UL); } static inline bool jump_entry_is_branch(const struct jump_entry *entry) @@ -99,7 +106,7 @@ static inline void jump_entry_set_module_init(struct jump_entry *entry) entry->code = 0; } -#define jump_label_swapNULL +void jump_label_swap(void *a, void *b, int size); #else /* __ASSEMBLY__ */ @@ -114,8 +121,8 @@ static inline void jump_entry_set_module_init(struct jump_entry *entry) .byte STATIC_KEY_INIT_NOP .endif .pushsection __jump_table, "aw" - _ASM_ALIGN - _ASM_PTR.Lstatic_jump_\@, \target, \key + .balign 4 + .long .Lstatic_jump_\@ - ., \target - ., \key - . .popsection .endm @@ -130,8 +137,8 @@ static inline void jump_entry_set_module_init(struct jump_entry *entry) .Lstatic_jump_after_\@: .endif .pushsection __jump_table, "aw" - _ASM_ALIGN - _ASM_PTR.Lstatic_jump_\@, \target, \key + 1 + .balign 4 + .long .Lstatic_jump_\@ - ., \target - ., \key - . + 1 .popsection .endm diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c index e56c95be2808..cc5034b42335 100644 --- a/arch/x86/kernel/jump_label.c +++ b/arch/x86/kernel/jump_label.c @@ -52,22 +52,24 @@ static void __jump_label_transform(struct jump_entry *entry, * Jump label is enabled for the first time. * So we expect a default_nop... */ - if (unlikely(memcmp((void *)entry->code, default_nop, 5) -!= 0)) - bug_at((void *)entry->code, __LINE__); + if (unlikely(memcmp((void *)jump_entry_code(entry), + default_nop, 5) != 0)) + bug_at((void *)jump_entry_code(entry), + __LINE__); } else { /* * ...otherwise expect an ideal_nop. Otherwise * something went horribly wrong. */ - if (unlikely(memcmp((voi
[PATCH v6 7/8] arm64/kernel: jump_label: use relative references
On a randomly chosen distro kernel build for arm64, vmlinux.o shows the following sections, containing jump label entries, and the associated RELA relocation records, respectively: ... [38088] __jump_table PROGBITS 00e19f30 0002ea10 WA 0 0 8 [38089] .rela__jump_table RELA 01fd8bb0 0008be30 0018 I 38178 38088 8 ... In other words, we have 190 KB worth of 'struct jump_entry' instances, and 573 KB worth of RELA entries to relocate each entry's code, target and key members. This means the RELA section occupies 10% of the .init segment, and the two sections combined represent 5% of vmlinux's entire memory footprint. So let's switch from 64-bit absolute references to 32-bit relative references: this reduces the size of the __jump_table by 50%, and gets rid of the RELA section entirely. Note that this requires some extra care in the sorting routine, given that the offsets change when entries are moved around in the jump_entry table. Signed-off-by: Ard Biesheuvel --- arch/arm64/include/asm/jump_label.h | 27 arch/arm64/kernel/jump_label.c | 22 +--- 2 files changed, 36 insertions(+), 13 deletions(-) diff --git a/arch/arm64/include/asm/jump_label.h b/arch/arm64/include/asm/jump_label.h index 9d6e46355c89..5cec68616125 100644 --- a/arch/arm64/include/asm/jump_label.h +++ b/arch/arm64/include/asm/jump_label.h @@ -30,8 +30,8 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran { asm goto("1: nop\n\t" ".pushsection __jump_table, \"aw\"\n\t" -".align 3\n\t" -".quad 1b, %l[l_yes], %c0\n\t" +".align 2\n\t" +".long 1b - ., %l[l_yes] - ., %c0 - .\n\t" ".popsection\n\t" : : "i"(&((char *)key)[branch]) : : l_yes); @@ -44,8 +44,8 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool { asm goto("1: b %l[l_yes]\n\t" ".pushsection __jump_table, \"aw\"\n\t" -".align 3\n\t" -".quad 1b, %l[l_yes], %c0\n\t" +".align 2\n\t" +".long 1b - ., %l[l_yes] - ., %c0 - .\n\t" ".popsection\n\t" : : "i"(&((char *)key)[branch]) : : l_yes); @@ 
-57,19 +57,26 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool typedef u64 jump_label_t; struct jump_entry { - jump_label_t code; - jump_label_t target; - jump_label_t key; + s32 code; + s32 target; + s32 key; }; static inline jump_label_t jump_entry_code(const struct jump_entry *entry) { - return entry->code; + return (jump_label_t)&entry->code + entry->code; +} + +static inline jump_label_t jump_entry_target(const struct jump_entry *entry) +{ + return (jump_label_t)&entry->target + entry->target; } static inline struct static_key *jump_entry_key(const struct jump_entry *entry) { - return (struct static_key *)((unsigned long)entry->key & ~1UL); + unsigned long key = (unsigned long)&entry->key + entry->key; + + return (struct static_key *)(key & ~1UL); } static inline bool jump_entry_is_branch(const struct jump_entry *entry) @@ -87,7 +94,7 @@ static inline void jump_entry_set_module_init(struct jump_entry *entry) entry->code = 0; } -#define jump_label_swapNULL +void jump_label_swap(void *a, void *b, int size); #endif /* __ASSEMBLY__ */ #endif /* __ASM_JUMP_LABEL_H */ diff --git a/arch/arm64/kernel/jump_label.c b/arch/arm64/kernel/jump_label.c index c2dd1ad3e648..2b8e459e91f7 100644 --- a/arch/arm64/kernel/jump_label.c +++ b/arch/arm64/kernel/jump_label.c @@ -25,12 +25,12 @@ void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type type) { - void *addr = (void *)entry->code; + void *addr = (void *)jump_entry_code(entry); u32 insn; if (type == JUMP_LABEL_JMP) { - insn = aarch64_insn_gen_branch_imm(entry->code, - entry->target, + insn = aarch64_insn_gen_branch_imm(jump_entry_code(entry), + jump_entry_target(entry), AARCH64_INSN_BRANCH_NOLINK); } else { insn = aarch64_insn_gen_nop(); @@ -50,4 +50,20 @@ void arch_jump_label_transform_static(struct jump_entry *entry, */ } +void jump_label_swap(void *a, void *b, int size) +{ + long delta = (unsigned long)a - (unsigned long)b; + struct jump_entry *jea = a; + struct 
jump_entry *jeb = b; + struct jump_entry tmp = *jea; + + jea->code = jeb->code - delta; + jea->t
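The commit message notes that sorting needs extra care once entries hold relative offsets, because an offset is only meaningful at the address where it is stored. The patch text above is cut off, so the following is not the patch's code but a self-contained user-space sketch of the idea, with hypothetical names throughout (struct rel_entry, img, rel32(), entry_init(), rel_entry_swap(), swap_demo_ok()) and two fields instead of the three in struct jump_entry.

```c
#include <assert.h>
#include <stdint.h>

struct rel_entry {
	int32_t code;	/* offset from &code to the patched location */
	int32_t key;	/* offset from &key to the owning key */
};

/* One object holds targets and table, so all offsets fit in 32 bits. */
static struct {
	char text[64];			/* stand-in for kernel text */
	long keys[4];			/* stand-in for static keys */
	struct rel_entry table[2];	/* stand-in for __jump_table */
} img;

static int32_t rel32(const void *field, const void *target)
{
	intptr_t delta = (intptr_t)target - (intptr_t)field;

	assert(delta >= INT32_MIN && delta <= INT32_MAX);
	return (int32_t)delta;
}

static void *entry_code(const struct rel_entry *e)
{
	return (void *)((intptr_t)&e->code + e->code);
}

static void *entry_key(const struct rel_entry *e)
{
	return (void *)((intptr_t)&e->key + e->key);
}

static void entry_init(struct rel_entry *e, const void *code, const void *key)
{
	e->code = rel32(&e->code, code);
	e->key = rel32(&e->key, key);
}

/* Swap two entries, rebasing each offset by the distance it moved. */
static void rel_entry_swap(struct rel_entry *a, struct rel_entry *b)
{
	int32_t delta = (int32_t)((intptr_t)a - (intptr_t)b);
	struct rel_entry tmp = *a;

	a->code = b->code - delta;
	a->key = b->key - delta;
	b->code = tmp.code + delta;
	b->key = tmp.key + delta;
}

/* Returns 1 if both entries still resolve to the right targets after a swap. */
static int swap_demo_ok(void)
{
	struct rel_entry *t = img.table;

	entry_init(&t[0], &img.text[40], &img.keys[3]);
	entry_init(&t[1], &img.text[8], &img.keys[1]);
	rel_entry_swap(&t[0], &t[1]);

	return entry_code(&t[0]) == (void *)&img.text[8] &&
	       entry_key(&t[0]) == (void *)&img.keys[1] &&
	       entry_code(&t[1]) == (void *)&img.text[40] &&
	       entry_key(&t[1]) == (void *)&img.keys[3];
}
```

A plain byte-wise swap would leave both entries pointing 8 bytes away from their intended targets in this layout, which is why a sort over such a table needs its own swap callback.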
[GIT PULL] sound fixes for 4.15-rc6
Linus,

please pull sound fixes for v4.15-rc6 from:

  git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git tags/sound-4.15-rc6

The topmost commit is 44be77c590f381bc629815ac789b8b15ecc4ddcf

sound fixes for 4.15-rc6

It seems that Santa overslept with a bunch of gifts; the majority of
changes here are various device-specific ASoC fixes, most notably the
revert of rcar IOMMU support and fsl_ssi AC97 fixes, but also lots of
small fixes for codecs. Besides that, the usual HD-audio quirks and
fixes are included, too.

Abhijeet Kumar (1):
      ASoC: nau8825: fix issue that pop noise when start capture

Adam Thomson (2):
      ASoC: da7219: Correct IRQ level in DT binding example
      ASoC: da7218: Correct IRQ level in DT binding example

Alexandre Belloni (1):
      ASoC: atmel-classd: select correct Kconfig symbol

Andrew F. Davis (1):
      ASoC: tlv320aic31xx: Fix GPIO1 register definition

Bard Liao (1):
      ASoC: rt5645: reset RT5645_AD_DA_MIXER at probe

Ben Hutchings (1):
      ASoC: wm_adsp: Fix validation of firmware and coeff lengths

Brian Norris (1):
      ASoC: rt5514-spi: only enable wakeup when fully initialized

Guenter Roeck (1):
      ASoC: amd: Add error checking to probe function

Guneshwor Singh (1):
      ASoC: Intel: Skylake: Do not check dev_type for dmic link type

Hui Wang (3):
      ALSA: hda - Add MIC_NO_PRESENCE fixup for 2 HP machines
      ALSA: hda - fix headset mic detection issue on a Dell machine
      ALSA: hda - change the location for one mic on a Lenovo machine

Jiada Wang (2):
      ASoC: rsnd: ssiu: clear SSI_MODE for non TDM Extended modes
      ASoC: rsnd: ssi: fix race condition in rsnd_ssi_pointer_update

Johan Hovold (2):
      ASoC: da7218: fix fix child-node lookup
      ASoC: twl4030: fix child-node lookup

Kuninori Morimoto (2):
      ASoC: rcar: revert IOMMU support so far
      ASoC: rsnd: fixup ADG register mask

Maciej S. Szmigiero (2):
      ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
      ASoC: fsl_ssi: serialize AC'97 register access operations

Naveen Manohar (2):
      ASoC: Intel: kbl: Modify map for Headset Playback to fix pop-noise
      ASoC: Intel: Change kern log level to avoid unwanted messages

Nicolin Chen (1):
      ASoC: fsl_asrc: Fix typo in a field define

Srinivas Kandagatla (1):
      ASoC: codecs: msm8916-wcd: Fix supported formats

Stefan Potyra (1):
      ASoC: rockchip: disable clock on error

Takashi Iwai (2):
      ALSA: hda: Drop useless WARN_ON()
      ALSA: hda - Fix missing COEF init for ALC225/295/299

oder_ch...@realtek.com (3):
      ASoC: rt5514: Make sure the DMIC delay will be happened after normal SUPPLY widgets power on
      ASoC: rt5514: Add the sanity check for the driver_data in the resume function
      ASoC: rt5663: Fix the wrong result of the first jack detection

---
 Documentation/devicetree/bindings/sound/da7218.txt |  2 +-
 Documentation/devicetree/bindings/sound/da7219.txt |  2 +-
 sound/hda/hdac_i915.c                              |  2 +-
 sound/pci/hda/patch_conexant.c                     | 29
 sound/pci/hda/patch_realtek.c                      | 14 +++-
 sound/soc/amd/acp-pcm-dma.c                        |  7 ++
 sound/soc/atmel/Kconfig                            |  2 +-
 sound/soc/codecs/da7218.c                          |  2 +-
 sound/soc/codecs/msm8916-wcd-analog.c              |  2 +-
 sound/soc/codecs/msm8916-wcd-digital.c             |  4 +-
 sound/soc/codecs/nau8825.c                         |  1 +
 sound/soc/codecs/rt5514-spi.c                      | 15 ++--
 sound/soc/codecs/rt5514.c                          |  2 +-
 sound/soc/codecs/rt5645.c                          |  2 +
 sound/soc/codecs/rt5663.c                          |  4 +
 sound/soc/codecs/rt5663.h                          |  4 +
 sound/soc/codecs/tlv320aic31xx.h                   |  2 +-
 sound/soc/codecs/twl4030.c                         |  4 +-
 sound/soc/codecs/wm_adsp.c                         | 12 +--
 sound/soc/fsl/fsl_asrc.h                           |  4 +-
 sound/soc/fsl/fsl_ssi.c                            | 44 ---
 sound/soc/intel/boards/kbl_rt5663_max98927.c       |  2 +-
 .../soc/intel/boards/kbl_rt5663_rt5514_max98927.c  |  2 +-
 sound/soc/intel/skylake/skl-nhlt.c                 | 15 ++--
 sound/soc/intel/skylake/skl-topology.c             |  2 +-
 sound/soc/rockchip/rockchip_spdif.c                | 18 +++--
 sound/soc/sh/rcar/adg.c                            |  6 +-
 sound/soc/sh/rcar/core.c                           |  4 +-
 sound/soc/sh/rcar/dma.c                            | 86 ++
 sound/soc/sh/rcar/ssi.c                            | 16 ++--
 sound/soc/sh/rcar/ssiu.c                           |  5 +-
 31 files changed, 173 insertions(+), 143 deletions(-)

diff --git a/Documentation/devicetree/bindings/sou
[PATCH v6 1/8] arch: enable relative relocations for arm64, power, s390 and x86
Before updating certain subsystems to use place-relative 32-bit
relocations in special sections, to save space and reduce the number
of absolute relocations that need to be processed at runtime by
relocatable kernels, introduce the Kconfig symbol and define it for
some architectures that should be able to support and benefit from it.

Cc: Catalin Marinas
Cc: Will Deacon
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: x...@kernel.org
Signed-off-by: Ard Biesheuvel
---
 arch/Kconfig                    | 10 ++++++++++
 arch/arm64/Kconfig              |  1 +
 arch/arm64/kernel/vmlinux.lds.S |  2 +-
 arch/powerpc/Kconfig            |  1 +
 arch/s390/Kconfig               |  1 +
 arch/x86/Kconfig                |  1 +
 6 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 400b9e1b2f27..dbc036a7bd1b 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -959,4 +959,14 @@ config REFCOUNT_FULL
 	  against various use-after-free conditions that can be used in
 	  security flaw exploits.
 
+config HAVE_ARCH_PREL32_RELOCATIONS
+	bool
+	help
+	  May be selected by an architecture if it supports place-relative
+	  32-bit relocations, both in the toolchain and in the module loader,
+	  in which case relative references can be used in special sections
+	  for PCI fixup, initcalls etc which are only half the size on 64 bit
+	  architectures, and don't require runtime relocation on relocatable
+	  kernels.
+
 source "kernel/gcov/Kconfig"
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c9a7e9e1414f..66c7b9ab2a3d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -89,6 +89,7 @@ config ARM64
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_MMAP_RND_BITS
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
+	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7da3e5c366a0..49ae5b43fe2b 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -156,7 +156,7 @@ SECTIONS
 		CON_INITCALL
 		SECURITY_INITCALL
 		INIT_RAM_FS
-		*(.init.rodata.* .init.bss)	/* from the EFI stub */
+		*(.init.rodata.* .init.bss .init.discard.*)	/* EFI stub */
 	}
 	.exit.data : {
 		ARM_EXIT_KEEP(EXIT_DATA)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index c51e6ce42e7a..e172478e2ae7 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -177,6 +177,7 @@ config PPC
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_MMAP_RND_BITS
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if COMPAT
+	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_TRACEHOOK
 	select ARCH_HAS_STRICT_KERNEL_RWX	if ((PPC_BOOK3S_64 || PPC32) && !RELOCATABLE && !HIBERNATION)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 829c67986db7..ed29d1ebecd9 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -129,6 +129,7 @@ config S390
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_JUMP_LABEL
 	select CPU_NO_EFFICIENT_FFS if !HAVE_MARCH_Z9_109_FEATURES
+	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_SOFT_DIRTY
 	select HAVE_ARCH_TRACEHOOK
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d4fc98c50378..9f2bb853aedb 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -115,6 +115,7 @@ config X86
 	select HAVE_ARCH_MMAP_RND_BITS		if MMU
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if MMU && COMPAT
 	select HAVE_ARCH_COMPAT_MMAP_BASES	if MMU && COMPAT
+	select HAVE_ARCH_PREL32_RELOCATIONS
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
-- 
2.11.0
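The Kconfig help text names PCI fixups as one user of relative references. A rough user-space sketch of what that means for a quirk-table entry follows. It is not kernel code: struct dev, struct fixup_abs, struct fixup_rel, fixup_set_hook(), fixup_run() and demo_run() are hypothetical names, the offset is computed at run time rather than by the assembler, and it assumes that code and static data of one program lie within 2 GiB of each other, as they do with the default code models on common 64-bit targets.

```c
#include <assert.h>
#include <stdint.h>

struct dev {			/* hypothetical stand-in for struct pci_dev */
	int quirks_applied;
};

/* Entry with an absolute hook pointer. */
struct fixup_abs {
	uint16_t vendor;
	uint16_t device;
	uint32_t dev_class;
	unsigned int class_shift;
	void (*hook)(struct dev *dev);
};

/* Entry with a 32-bit offset relative to the hook_offset field itself. */
struct fixup_rel {
	uint16_t vendor;
	uint16_t device;
	uint32_t dev_class;
	unsigned int class_shift;
	int32_t hook_offset;
};

static void demo_hook(struct dev *dev)
{
	dev->quirks_applied++;
}

static struct fixup_rel demo_fixup = { 0x8086, 0x1234, 0, 0, 0 };

/* Store "hook address minus field address". */
static void fixup_set_hook(struct fixup_rel *f, void (*hook)(struct dev *))
{
	intptr_t delta = (intptr_t)hook - (intptr_t)&f->hook_offset;

	assert(delta >= INT32_MIN && delta <= INT32_MAX);
	f->hook_offset = (int32_t)delta;
}

/* Recover the hook by adding the offset back to the field's address. */
static void fixup_run(const struct fixup_rel *f, struct dev *dev)
{
	void (*hook)(struct dev *);

	hook = (void (*)(struct dev *))((intptr_t)&f->hook_offset +
					f->hook_offset);
	hook(dev);
}

/* Returns how many times the hook ran (expected: once). */
static int demo_run(void)
{
	struct dev d = { 0 };

	fixup_set_hook(&demo_fixup, demo_hook);
	fixup_run(&demo_fixup, &d);
	return d.quirks_applied;
}
```

On an LP64 target sizeof(struct fixup_abs) is 24 bytes because the pointer forces 8-byte alignment, while sizeof(struct fixup_rel) is 16 bytes; the reference itself shrinks from 8 bytes to 4, which is the "half the size" the help text refers to.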
[PATCH v6 2/8] module: use relative references for __ksymtab entries
An ordinary arm64 defconfig build has ~64 KB worth of __ksymtab entries, each consisting of two 64-bit fields containing absolute references, to the symbol itself and to a char array containing its name, respectively. When we build the same configuration with KASLR enabled, we end up with an additional ~192 KB of relocations in the .init section, i.e., one 24 byte entry for each absolute reference, which all need to be processed at boot time. Given how the struct kernel_symbol that describes each entry is completely local to module.c (except for the references emitted by EXPORT_SYMBOL() itself), we can easily modify it to contain two 32-bit relative references instead. This reduces the size of the __ksymtab section by 50% for all 64-bit architectures, and gets rid of the runtime relocations entirely for architectures implementing KASLR, either via standard PIE linking (arm64) or using custom host tools (x86). Note that the binary search involving __ksymtab contents relies on each section being sorted by symbol name. This is implemented based on the input section names, not the names in the ksymtab entries, so this patch does not interfere with that. Given that the use of place-relative relocations requires support both in the toolchain and in the module loader, we cannot enable this feature for all architectures. So make it dependent on whether CONFIG_HAVE_ARCH_PREL32_RELOCATIONS is defined. 
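The sizes quoted above follow from simple arithmetic: each absolute reference in a relocatable build costs one 24-byte RELA record on top of the 8-byte field itself. A small sketch of that calculation, with hypothetical helper names (rela_bytes(), prel32_section_bytes()):

```c
#include <assert.h>
#include <stdint.h>

#define RELA_ENTRY_BYTES 24u	/* sizeof(Elf64_Rela): r_offset, r_info, r_addend */

/* Bytes of RELA records needed to relocate every absolute reference. */
static uint64_t rela_bytes(uint64_t section_bytes, uint64_t entry_bytes,
			   uint64_t refs_per_entry)
{
	assert(entry_bytes != 0);
	return section_bytes / entry_bytes * refs_per_entry * RELA_ENTRY_BYTES;
}

/* Section size once each reference is a 4-byte relative offset. */
static uint64_t prel32_section_bytes(uint64_t section_bytes,
				     uint64_t entry_bytes,
				     uint64_t refs_per_entry)
{
	assert(entry_bytes != 0);
	return section_bytes / entry_bytes * refs_per_entry * 4;
}
```

With 64 KB of 16-byte __ksymtab entries holding two references each, rela_bytes() gives 192 KB and prel32_section_bytes() gives 32 KB, matching the figures in the message. The same helper applied to the __jump_table numbers in patch 7/8 (0x2ea10 bytes, 24-byte entries, three references) yields 0x8be30 bytes, the size of the .rela__jump_table section shown there.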
Cc: Arnd Bergmann Cc: Andrew Morton Cc: Ingo Molnar Cc: Kees Cook Cc: Thomas Garnier Cc: Nicolas Pitre Acked-by: Jessica Yu Signed-off-by: Ard Biesheuvel --- arch/x86/include/asm/Kbuild | 1 + arch/x86/include/asm/export.h | 5 --- include/asm-generic/export.h | 12 - include/linux/compiler.h | 11 + include/linux/export.h| 46 +++- kernel/module.c | 33 +++--- 6 files changed, 84 insertions(+), 24 deletions(-) diff --git a/arch/x86/include/asm/Kbuild b/arch/x86/include/asm/Kbuild index 5d6a53fd7521..3e8a88dcaa1d 100644 --- a/arch/x86/include/asm/Kbuild +++ b/arch/x86/include/asm/Kbuild @@ -9,5 +9,6 @@ generated-y += xen-hypercalls.h generic-y += clkdev.h generic-y += dma-contiguous.h generic-y += early_ioremap.h +generic-y += export.h generic-y += mcs_spinlock.h generic-y += mm-arch-hooks.h diff --git a/arch/x86/include/asm/export.h b/arch/x86/include/asm/export.h deleted file mode 100644 index 2a51d66689c5.. --- a/arch/x86/include/asm/export.h +++ /dev/null @@ -1,5 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifdef CONFIG_64BIT -#define KSYM_ALIGN 16 -#endif -#include diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h index 719db1968d81..97ce606459ae 100644 --- a/include/asm-generic/export.h +++ b/include/asm-generic/export.h @@ -5,12 +5,10 @@ #define KSYM_FUNC(x) x #endif #ifdef CONFIG_64BIT -#define __put .quad #ifndef KSYM_ALIGN #define KSYM_ALIGN 8 #endif #else -#define __put .long #ifndef KSYM_ALIGN #define KSYM_ALIGN 4 #endif @@ -25,6 +23,16 @@ #define KSYM(name) name #endif +.macro __put, val, name +#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS + .long \val - ., \name - . +#elif defined(CONFIG_64BIT) + .quad \val, \name +#else + .long \val, \name +#endif +.endm + /* * note on .section use: @progbits vs %progbits nastiness doesn't matter, * since we immediately emit into those sections anyway. 
diff --git a/include/linux/compiler.h b/include/linux/compiler.h index 52e611ab9a6c..fe752d365334 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -327,4 +327,15 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s compiletime_assert(__native_word(t),\ "Need native word sized stores/loads for atomicity.") +/* + * Force the compiler to emit 'sym' as a symbol, so that we can reference + * it from inline assembler. Necessary in case 'sym' could be inlined + * otherwise, or eliminated entirely due to lack of references that are + * visibile to the compiler. + */ +#define __ADDRESSABLE(sym) \ + static void *__attribute__((section(".discard.text"), used))\ + __PASTE(__discard_##sym, __LINE__)(void)\ + { return (void *)&sym; }\ + #endif /* __LINUX_COMPILER_H */ diff --git a/include/linux/export.h b/include/linux/export.h index 1a1dfdb2a5c6..5112d0c41512 100644 --- a/include/linux/export.h +++ b/include/linux/export.h @@ -24,12 +24,6 @@ #define VMLINUX_SYMBOL_STR(x) __VMLINUX_SYMBOL_STR(x) #ifndef __ASSEMBLY__ -struct kernel_symbol -{ - unsigned long value; - const char *name; -}; - #ifdef MODULE extern struct module __this_module; #define THIS_MODULE (&__this_module) @@ -60,17 +54,47 @@ extern struct module __this_module; #define __CRC_SYMBOL(sym, sec) #endif +#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS +#include +/* + * Emit the ksymtab entry as a pair
Re: [PATCH V8 3/3] OPP: Allow "opp-hz" and "opp-microvolt" to contain magic values
On 26-12-17, 14:29, Rob Herring wrote:
> On Mon, Dec 18, 2017 at 03:51:30PM +0530, Viresh Kumar wrote:
> > +On some platforms the exact frequency or voltage may be hidden from the OS by
> > +the firmware and the "opp-hz" or the "opp-microvolt" properties may contain
> > +magic values that represent the frequency or voltage in a firmware dependent
> > +way, for example an index of an array in the firmware.
>
> I'm still not convinced this is a good idea.

You were kind-of a few days back :)

lkml.kernel.org/r/CAL_JsqK-qtAaM_Ou5NtxcWR3F_q=8rmpjum-vqgtkhbtwe5...@mail.gmail.com

So here is the deal:

- I proposed "domain-performance-state" property for this stuff initially.
- But Kevin didn't like that and proposed reusing "opp-hz" and
  "opp-microvolt", which we all agreed to multiple times..
- And we are back to the same discussion now and it's painful and time
  killing for all of us.

TBH, I don't have too strong preferences about any of the suggestions you guys
have and I need you guys to tell me what binding changes to do here and I will
do that.

> If you have firmware
> partially managing things, then I think we should have platform specific
> bindings or drivers.

What about the initial idea then, like "performance-state" for the power
domains? All platforms will anyway replicate that binding only.

> This is complex enough I'm not taking silence from Stephen as an okay.

Sure, but I am not sure how to make him speak :)

-- 
viresh
RE: [PATCH 2/2 v4] scsi: ufs: introduce sysfs entries exposing UFS health info
> -----Original Message-----
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Greg Kroah-Hartman
> Sent: Thursday, December 21, 2017 10:00 AM
> To: Jaegeuk Kim
> Cc: linux-kernel@vger.kernel.org; linux-s...@vger.kernel.org; Jaegeuk Kim
> Subject: Re: [PATCH 2/2 v4] scsi: ufs: introduce sysfs entries exposing UFS
> health info
>
> On Wed, Dec 20, 2017 at 02:13:25PM -0800, Jaegeuk Kim wrote:
> > This patch adds a new sysfs group, namely health, via:
> >
> >    /sys/devices/soc/X.ufshc/health/

As device health is just one piece of information out of the device management,
I think that you should address this in a more comprehensive way, and set hooks
for much more device info:

Allow access to device descriptors, attributes and flags.
The attributes and flags should be placed in separate subfolders.
The LUN specific descriptors and attributes should be placed in a luns
subfolder, and then per descriptor / attribute type.

You might also like to consider differentiating read and write, to control
those types of access as well.

Cheers,
Avri
Re: [PATCH v3 0/3] create sysfs representation of ACPI HMAT
On 22/12/2017 at 23:53, Dan Williams wrote:
> On Thu, Dec 21, 2017 at 12:31 PM, Brice Goglin wrote:
>> On 20/12/2017 at 23:41, Ross Zwisler wrote:
> [..]
>> Hello
>>
>> I can confirm that HPC runtimes are going to use these patches (at least
>> all runtimes that use hwloc for topology discovery, but that's the vast
>> majority of HPC anyway).
>>
>> We really didn't like KNL exposing a hacky SLIT table [1]. We had to
>> explicitly detect that specific crazy table to find out which NUMA nodes
>> were local to which cores, and to find out which NUMA nodes were
>> HBM/MCDRAM or DDR. And then we had to hide the SLIT values to the
>> application because the reported latencies didn't match reality. Quite
>> annoying.
>>
>> With Ross' patches, we can easily get what we need:
>> * which NUMA nodes are local to which CPUs? /sys/devices/system/node/
>> can only report a single local node per CPU (doesn't work for KNL and
>> upcoming architectures with HBM+DDR+...)
>> * which NUMA nodes are slow/fast (for both bandwidth and latency)
>> And we can still look at SLIT under /sys/devices/system/node if really
>> needed.
>>
>> And of course having this in sysfs is much better than parsing ACPI
>> tables that are only accessible to root :)
> On this point, it's not clear to me that we should allow these sysfs
> entries to be world readable. Given /proc/iomem now hides physical
> address information from non-root we at least need to be careful not
> to undo that with new sysfs HMAT attributes. Once you need to be root
> for this info, is parsing binary HMAT vs sysfs a blocker for the HPC
> use case?

I don't think it would be a blocker.

> Perhaps we can enlist /proc/iomem or a similar enumeration interface
> to tell userspace the NUMA node and whether the kernel thinks it has
> better or worse performance characteristics relative to base
> system-RAM, i.e. new IORES_DESC_* values. I'm worried that if we start
> publishing absolute numbers in sysfs userspace will default to looking
> for specific magic numbers in sysfs vs asking the kernel for memory
> that has performance characteristics relative to base "System RAM". In
> other words the absolute performance information that the HMAT
> publishes is useful to the kernel, but it's not clear that userspace
> needs that vs a relative indicator for making NUMA node preference
> decisions.

Some HPC users will benchmark the machine to discover actual performance
numbers anyway. However, most users won't do this. They will want to
know the relative performance of different nodes. If you normalize HMAT
values by dividing them by the system-RAM values, that's likely OK. If you
just say "that node is faster than system RAM", it's not precise enough.

Brice
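The normalization Brice describes could look roughly like the sketch below. It is only an illustration: struct node_perf, its field names, and both helper functions are hypothetical and do not correspond to anything in the HMAT patches; no ACPI table is parsed here.

```c
#include <stddef.h>

/* Hypothetical per-node record; not a structure from the HMAT patches. */
struct node_perf {
	unsigned int read_bw_mbps;	/* absolute read bandwidth, MB/s */
	unsigned int read_latency_ns;	/* absolute read latency, ns */
};

/* Bandwidth relative to base system RAM, in percent (100 == same as RAM). */
static unsigned int relative_bw_pct(const struct node_perf *node,
				    const struct node_perf *base)
{
	if (base->read_bw_mbps == 0)
		return 0;	/* no usable baseline */
	return (unsigned int)(((unsigned long long)node->read_bw_mbps * 100) /
			      base->read_bw_mbps);
}

/* Latency relative to base system RAM, in percent (lower is better). */
static unsigned int relative_latency_pct(const struct node_perf *node,
					 const struct node_perf *base)
{
	if (base->read_latency_ns == 0)
		return 0;	/* no usable baseline */
	return (unsigned int)(((unsigned long long)node->read_latency_ns * 100) /
			      base->read_latency_ns);
}
```

A ratio like this keeps more information than a plain "faster than system RAM" flag, while not exposing the absolute numbers.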
[PATCH] Device tree binding for Avago APDS990X light sensor
From: Filip Matijević

This prepares binding for light sensor used in Nokia N9.

Signed-off-by: Filip Matijević
Signed-off-by: Pavel machek

---

Patches to convert APDS990X driver to device tree and to switch to iio
are available.

diff --git a/Documentation/devicetree/bindings/misc/avago-apds990x.txt b/Documentation/devicetree/bindings/misc/avago-apds990x.txt
new file mode 100644
index 000..e038146
--- /dev/null
+++ b/Documentation/devicetree/bindings/misc/avago-apds990x.txt
@@ -0,0 +1,39 @@
+Avago APDS990X driver
+
+Required properties:
+- compatible: "avago,apds990x"
+- reg: address on the I2C bus
+- interrupts: external interrupt line number
+- Vdd-supply: power supply for VDD
+- Vled-supply: power supply for LEDA
+- ga: Glass attenuation
+- cf1: Clear channel factor 1
+- irf1: IR channel factor 1
+- cf2: Clear channel factor 2
+- irf2: IR channel factor 2
+- df: Device factor
+- pdrive: IR current, one of APDS_IRLED_CURR_XXXmA values
+- ppcount: Proximity pulse count
+
+Example (Nokia N9):
+
+	als_ps@39 {
+		compatible = "avago,apds990x";
+		reg = <0x39>;
+
+		interrupt-parent = <&gpio3>;
+		interrupts = <19 10>; /* gpio_83, IRQF_TRIGGER_FALLING | IRQF_TRIGGER_LOW */
+
+		Vdd-supply = <&vaux1>;
+		Vled-supply = <&vbat>;
+
+		ga = <168834>;
+		cf1 = <4096>;
+		irf1= <7824>;
+		cf2 = <877>;
+		irf2= <1575>;
+		df = <52>;
+
+		pdrive = <0x2>; /* APDS_IRLED_CURR_25mA */
+		ppcount = <5>;
+	};

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
[PATCH v5 0/2] fs: fat: add ioctl to modify fat filesystem partition volume label
The FAT filesystem partition volume label can be read with
FAT_IOCTL_GET_VOLUME_LABEL and written with FAT_IOCTL_SET_VOLUME_LABEL.

The FAT volume label (volume name) is stored identically in the boot
sector and in the root directory, so the boot sector must also be
updated whenever the label is written.

v5:
1. find the volume label entry through the scan function.
2. the volume label only retains the d-characters (reference from Ecma-107).

v4:
1. read/write volume label from/to the location of the respective version.
2. correct volume label check reference from mkfs.fat.
3. fixed some code issues.

v3:
1. write volume label to both boot sector and root directory.

v2:
1. add filesystem version check.
2. add directory permissions check.
3. add volume label string check.
4. fixed part of return value.
5. fixed some indent issues.
6. remove sync_dirty_buffer().

ChenGuanqiao (2):
  fs: fat: Add fat filesystem partition volume label in local structure
  fs: fat: add ioctl method in fat filesystem driver

 fs/fat/dir.c                  |  29 +++
 fs/fat/fat.h                  |   2 +
 fs/fat/file.c                 | 116 ++
 fs/fat/inode.c                |  15 --
 include/uapi/linux/msdos_fs.h |   2 +
 5 files changed, 161 insertions(+), 3 deletions(-)

--
2.11.0
[PATCH v5 2/2] fs: fat: add ioctl method in fat filesystem driver
Signed-off-by: ChenGuanqiao --- fs/fat/file.c | 116 ++ 1 file changed, 116 insertions(+) diff --git a/fs/fat/file.c b/fs/fat/file.c index 4724cc9ad650..517941c7bce4 100644 --- a/fs/fat/file.c +++ b/fs/fat/file.c @@ -15,11 +15,35 @@ #include #include #include +#include #include "fat.h" static long fat_fallocate(struct file *file, int mode, loff_t offset, loff_t len); +/* the characters in this field shall be d-characters, and unused byte shall be set to 0x20. */ +static int fat_format_d_characters(char *label, unsigned long len) +{ + int i; + + for (i=0; ivol_id, user_attr); } +static int fat_ioctl_get_volume_label(struct inode *inode, + u8 __user *vol_label) +{ + int err = 0; + struct fat_slot_info sinfo; + + err = fat_scan_volume_label(inode, &sinfo); + if (err) + goto out; + + if (copy_to_user(vol_label, sinfo.de->name, MSDOS_NAME)) + err = -EFAULT; + + brelse(sinfo.bh); +out: + return err; +} + +static int fat_ioctl_set_volume_label(struct file *file, + u8 __user *vol_label) +{ + int err = 0; + u8 label[MSDOS_NAME]; + struct buffer_head *bh; + struct fat_boot_sector *b; + struct fat_slot_info sinfo; + struct inode *inode = file_inode(file); + struct super_block *sb = inode->i_sb; + struct msdos_sb_info *sbi = MSDOS_SB(sb); + + if (copy_from_user(label, vol_label, sizeof(label))) { + err = -EFAULT; + goto out; + } + + fat_format_d_characters(label, sizeof(label)); + err = mnt_want_write_file(file); + if (err) + goto out; + + /* Update sector's vol_label */ + bh = sb_bread(sb, 0); + if (bh == NULL) { + fat_msg(sb, KERN_ERR, + "unable to read boot sector to write volume label"); + err = -EIO; + goto out_drop_file; + } + + b = (struct fat_boot_sector *)bh->b_data; + + lock_buffer(bh); + if (sbi->fat_bits == 32) + memcpy(b->fat32.vol_label, label, sizeof(label)); + else + memcpy(b->fat16.vol_label, label, sizeof(label)); + + mark_buffer_dirty(bh); + unlock_buffer(bh); + err = sync_dirty_buffer(bh); + brelse(bh); + if (err) + goto out_drop_file; + + /* updates root 
directory's vol_label */ + err = fat_scan_volume_label(inode, &sinfo); + if (err) + goto out_drop_file; + + bh = sinfo.bh; + lock_buffer(bh); + memcpy(sinfo.de->name, label, sizeof(sinfo.de->name)); + mark_buffer_dirty(bh); + unlock_buffer(bh); + err = sync_dirty_buffer(bh); + brelse(bh); + if (err) + goto out_drop_file; + + memcpy(sbi->vol_label, label, sizeof(sbi->vol_label)); + +out_drop_file: + mnt_drop_write_file(file); + out: + return err; +} + long fat_generic_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) { struct inode *inode = file_inode(filp); u32 __user *user_attr = (u32 __user *)arg; + u8 __user *user_vol_label = (u8 __user *)arg; switch (cmd) { case FAT_IOCTL_GET_ATTRIBUTES: @@ -133,6 +245,10 @@ long fat_generic_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) return fat_ioctl_set_attributes(filp, user_attr); case FAT_IOCTL_GET_VOLUME_ID: return fat_ioctl_get_volume_id(inode, user_attr); + case FAT_IOCTL_GET_VOLUME_LABEL: + return fat_ioctl_get_volume_label(inode, user_vol_label); + case FAT_IOCTL_SET_VOLUME_LABEL: + return fat_ioctl_set_volume_label(filp, user_vol_label); default: return -ENOTTY; /* Inappropriate ioctl for device */ } -- 2.11.0
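The body of fat_format_d_characters() did not survive in the patch text above, so the following is only a guess at what such a helper might do, written as a userspace sketch. It assumes that "d-characters" means digits, capital letters and underscore, and that lower-case input is folded to upper case; both points are assumptions, and format_d_characters() and is_d_character() are hypothetical names rather than the patch's code.

```c
#include <stddef.h>

#define LABEL_LEN 11	/* size of the FAT volume label field */

/* d-characters as assumed here: digits, capital letters and underscore. */
static int is_d_character(unsigned char c)
{
	return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || c == '_';
}

/*
 * Hypothetical helper: normalise a label in place.
 * Trailing NUL or 0x20 bytes count as unused, lower-case letters are
 * folded to upper case, and the unused tail is filled with 0x20.
 * Returns 0 on success, -1 if a used byte is not a d-character.
 */
static int format_d_characters(char label[LABEL_LEN])
{
	size_t len = LABEL_LEN;
	size_t i;

	/* Find the used length by skipping trailing padding. */
	while (len > 0 && (label[len - 1] == '\0' || label[len - 1] == 0x20))
		len--;

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)label[i];

		if (c >= 'a' && c <= 'z')
			c = (unsigned char)(c - 'a' + 'A');
		if (!is_d_character(c))
			return -1;
		label[i] = (char)c;
	}

	/* Unused bytes shall be set to 0x20. */
	for (i = len; i < LABEL_LEN; i++)
		label[i] = 0x20;

	return 0;
}
```

Returning an error for an invalid byte is a design choice of this sketch; the patch as posted ignores the helper's return value in fat_ioctl_set_volume_label().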
[PATCH v5 1/2] fs: fat: Add fat filesystem partition volume label in local structure
1. Read the volume label when the FAT filesystem driver loads.
2. Add an interface to scan for the volume label entry.

Signed-off-by: ChenGuanqiao
---
 fs/fat/dir.c                  | 29 +
 fs/fat/fat.h                  |  2 ++
 fs/fat/inode.c                | 15 ---
 include/uapi/linux/msdos_fs.h |  2 ++
 4 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/fs/fat/dir.c b/fs/fat/dir.c
index 81cecbe6d7cf..b369953979e3 100644
--- a/fs/fat/dir.c
+++ b/fs/fat/dir.c
@@ -881,6 +881,35 @@ static int fat_get_short_entry(struct inode *dir, loff_t *pos,
 	return -ENOENT;
 }
 
+static int fat_get_volume_label_entry(struct inode *dir, loff_t *pos,
+				      struct buffer_head **bh,
+				      struct msdos_dir_entry **de)
+{
+	while (fat_get_entry(dir, pos, bh, de) >= 0)
+		if (((*de)->attr & ATTR_VOLUME) && (*de)->attr != ATTR_EXT)
+			return 0;
+	return -ENOENT;
+}
+
+int fat_scan_volume_label(struct inode *dir, struct fat_slot_info *sinfo)
+{
+	struct super_block *sb = dir->i_sb;
+
+	sinfo->slot_off = 0;
+	sinfo->bh = NULL;
+	while (fat_get_volume_label_entry(dir, &sinfo->slot_off,
+					  &sinfo->bh, &sinfo->de) >= 0) {
+		sinfo->slot_off -= sizeof(*sinfo->de);
+		sinfo->nr_slots = 1;
+		sinfo->i_pos = fat_make_i_pos(sb, sinfo->bh, sinfo->de);
+
+		return 0;
+	}
+
+	return -ENOENT;
+}
+EXPORT_SYMBOL_GPL(fat_scan_volume_label);
+
 /*
  * The ".." entry can not provide the "struct fat_slot_info" information
  * for inode, nor a usable i_pos.
So, this function provides some information diff --git a/fs/fat/fat.h b/fs/fat/fat.h index 051dac1ce3be..9e8d525d52c9 100644 --- a/fs/fat/fat.h +++ b/fs/fat/fat.h @@ -85,6 +85,7 @@ struct msdos_sb_info { int dir_per_block;/* dir entries per block */ int dir_per_block_bits; /* log2(dir_per_block) */ unsigned int vol_id;/*volume ID*/ + char vol_label[11]; /*volume label*/ int fatent_shift; const struct fatent_operations *fatent_ops; @@ -299,6 +300,7 @@ extern int fat_dir_empty(struct inode *dir); extern int fat_subdirs(struct inode *dir); extern int fat_scan(struct inode *dir, const unsigned char *name, struct fat_slot_info *sinfo); +extern int fat_scan_volume_label(struct inode *dir, struct fat_slot_info *sinfo); extern int fat_scan_logstart(struct inode *dir, int i_logstart, struct fat_slot_info *sinfo); extern int fat_get_dotdot_entry(struct inode *dir, struct buffer_head **bh, diff --git a/fs/fat/inode.c b/fs/fat/inode.c index 30c52394a7ad..e73379a41d49 100644 --- a/fs/fat/inode.c +++ b/fs/fat/inode.c @@ -45,12 +45,14 @@ struct fat_bios_param_block { u8 fat16_state; u32 fat16_vol_id; + u8 fat16_vol_label[11]; u32 fat32_length; u32 fat32_root_cluster; u16 fat32_info_sector; u8 fat32_state; u32 fat32_vol_id; + u8 fat32_vol_label[11]; }; static int fat_default_codepage = CONFIG_FAT_DEFAULT_CODEPAGE; @@ -1460,12 +1462,16 @@ static int fat_read_bpb(struct super_block *sb, struct fat_boot_sector *b, bpb->fat16_state = b->fat16.state; bpb->fat16_vol_id = get_unaligned_le32(b->fat16.vol_id); + memcpy(bpb->fat16_vol_label, b->fat16.vol_label, + sizeof(bpb->fat16_vol_label)); bpb->fat32_length = le32_to_cpu(b->fat32.length); bpb->fat32_root_cluster = le32_to_cpu(b->fat32.root_cluster); bpb->fat32_info_sector = le16_to_cpu(b->fat32.info_sector); bpb->fat32_state = b->fat32.state; bpb->fat32_vol_id = get_unaligned_le32(b->fat32.vol_id); + memcpy(bpb->fat32_vol_label, b->fat32.vol_label, + sizeof(bpb->fat32_vol_label)); /* Validate this looks like a FAT filesystem BPB */ if 
(!bpb->fat_reserved) { @@ -1723,11 +1729,14 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat, brelse(fsinfo_bh); } - /* interpret volume ID as a little endian 32 bit integer */ - if (sbi->fat_bits == 32) + /* interpret volume ID and label as a little endian 32 bit integer */ + if (sbi->fat_bits == 32) { sbi->vol_id = bpb.fat32_vol_id; - else /* fat 16 or 12 */ + memcpy(sbi->vol_label, bpb.fat32_vol_label, sizeof(sbi->vol_label)); + } else { /* fat 16 or 12 */ sbi->vol_id = bpb.fat16_vol_id; + memcpy(sbi->vol_label, bpb.fat16_vol_label, sizeof(sbi->vol_label)); + } sbi->dir_per_block = sb->s_blocksize / sizeof(struct msdos_dir_entry); sbi->dir_per_block_bits = ffs(sbi->dir_per_block) - 1; diff --git a/include/uapi/linux/msd
[PATCH] bq24190: Simplify code in property_is_writeable
Simplify function that should be trivial.

Signed-off-by: Pavel machek

diff --git a/drivers/power/supply/bq24190_charger.c b/drivers/power/supply/bq24190_charger.c
index 35ff406..4ea8f0a 100644
--- a/drivers/power/supply/bq24190_charger.c
+++ b/drivers/power/supply/bq24190_charger.c
@@ -1193,8 +1193,6 @@ static int bq24190_charger_set_property(struct power_supply *psy,
 static int bq24190_charger_property_is_writeable(struct power_supply *psy,
 		enum power_supply_property psp)
 {
-	int ret;
-
 	switch (psp) {
 	case POWER_SUPPLY_PROP_ONLINE:
 	case POWER_SUPPLY_PROP_TEMP_ALERT_MAX:
@@ -1202,13 +1200,10 @@ static int bq24190_charger_property_is_writeable(struct power_supply *psy,
 	case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT:
 	case POWER_SUPPLY_PROP_CONSTANT_CHARGE_VOLTAGE:
 	case POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT:
-		ret = 1;
-		break;
+		return 1;
 	default:
-		ret = 0;
+		return 0;
 	}
-
-	return ret;
 }
 
 static void bq24190_input_current_limit_work(struct work_struct *work)

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
[RFC PATCH v2 0/2] Reduce IOTLB flush when pass-through dGPU devices
From: Suravee Suthikulpanit

Currently, when a dGPU is passed through to a guest VM, thousands of
IOTLB flush commands are sent from the IOMMU to the end-point device.
This causes a performance issue when launching new VMs, and can cause
IOTLB invalidation time-outs on certain dGPUs.

This can be avoided by adopting the new fast IOTLB flush APIs.

Cc: Alex Williamson
Cc: Joerg Roedel

Changes from V1: (https://lkml.org/lkml/2017/11/17/764)
 * Rebased on top of v4.15-rc5
 * Patch 1/2: Fix iommu_tlb_range_add() size parameter to use unmapped
   instead of len. (per Alex)
 * Patch 1/2: Use a list to keep track of unmapped IOVAs for VFIO remote
   unpinning. I am still not sure whether a list is the best way to
   keep track of the IOVAs, though. (per Alex)
 * Patch 2/2: Fix logic due to missing spin unlock. (per Tom)

Suravee Suthikulpanit (2):
  vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs
  iommu/amd: Add support for fast IOTLB flushing

 drivers/iommu/amd_iommu.c       | 73 -
 drivers/iommu/amd_iommu_init.c  |  7
 drivers/iommu/amd_iommu_types.h |  7
 drivers/vfio/vfio_iommu_type1.c | 89 +++--
 4 files changed, 163 insertions(+), 13 deletions(-)

--
1.8.3.1
[RFC PATCH v2 2/2] iommu/amd: Add support for fast IOTLB flushing
Implement the newly added IOTLB flushing interface for AMD IOMMU. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd_iommu.c | 73 - drivers/iommu/amd_iommu_init.c | 7 drivers/iommu/amd_iommu_types.h | 7 3 files changed, 86 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 7d5eb00..42fe365 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -129,6 +129,12 @@ struct dma_ops_domain { static struct iova_domain reserved_iova_ranges; static struct lock_class_key reserved_rbtree_key; +struct amd_iommu_flush_entries { + struct list_head list; + unsigned long iova; + size_t size; +}; + / * * Helper functions @@ -3043,7 +3049,6 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova, unmap_size = iommu_unmap_page(domain, iova, page_size); mutex_unlock(&domain->api_lock); - domain_flush_tlb_pde(domain); domain_flush_complete(domain); return unmap_size; @@ -3163,6 +3168,69 @@ static bool amd_iommu_is_attach_deferred(struct iommu_domain *domain, return dev_data->defer_attach; } +static void amd_iommu_flush_iotlb_all(struct iommu_domain *domain) +{ + struct protection_domain *dom = to_pdomain(domain); + + domain_flush_tlb_pde(dom); +} + +static void amd_iommu_iotlb_range_add(struct iommu_domain *domain, + unsigned long iova, size_t size) +{ + struct amd_iommu_flush_entries *entry, *p; + unsigned long flags; + bool found = false; + + spin_lock_irqsave(&amd_iommu_flush_list_lock, flags); + list_for_each_entry(p, &amd_iommu_flush_list, list) { + if (iova != p->iova) + continue; + + if (size > p->size) { + p->size = size; + pr_debug("%s: update range: iova=%#lx, size = %#lx\n", +__func__, p->iova, p->size); + } + found = true; + break; + } + + if (!found) { + entry = kzalloc(sizeof(struct amd_iommu_flush_entries), + GFP_KERNEL); + if (entry) { + pr_debug("%s: new range: iova=%lx, size=%#lx\n", +__func__, iova, size); + + entry->iova = iova; + entry->size = size; + 
list_add(&entry->list, &amd_iommu_flush_list); + } + } + spin_unlock_irqrestore(&amd_iommu_flush_list_lock, flags); +} + +static void amd_iommu_iotlb_sync(struct iommu_domain *domain) +{ + struct protection_domain *pdom = to_pdomain(domain); + struct amd_iommu_flush_entries *entry, *next; + unsigned long flags; + + /* Note: +* Currently, IOMMU driver just flushes the whole IO/TLB for +* a given domain. So, just remove entries from the list here. +*/ + spin_lock_irqsave(&amd_iommu_flush_list_lock, flags); + list_for_each_entry_safe(entry, next, &amd_iommu_flush_list, list) { + list_del(&entry->list); + kfree(entry); + } + spin_unlock_irqrestore(&amd_iommu_flush_list_lock, flags); + + domain_flush_tlb_pde(pdom); +} + const struct iommu_ops amd_iommu_ops = { .capable = amd_iommu_capable, .domain_alloc = amd_iommu_domain_alloc, @@ -3181,6 +3249,9 @@ static bool amd_iommu_is_attach_deferred(struct iommu_domain *domain, .apply_resv_region = amd_iommu_apply_resv_region, .is_attach_deferred = amd_iommu_is_attach_deferred, .pgsize_bitmap = AMD_IOMMU_PGSIZES, + .flush_iotlb_all = amd_iommu_flush_iotlb_all, + .iotlb_range_add = amd_iommu_iotlb_range_add, + .iotlb_sync = amd_iommu_iotlb_sync, }; /* diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c index 6fe2d03..e8f8cee 100644 --- a/drivers/iommu/amd_iommu_init.c +++ b/drivers/iommu/amd_iommu_init.c @@ -185,6 +185,12 @@ struct ivmd_header { bool amd_iommu_force_isolation __read_mostly; /* + * IOTLB flush list + */ +LIST_HEAD(amd_iommu_flush_list); +spinlock_t amd_iommu_flush_list_lock; + +/* * List of protection domains - used during resume */ LIST_HEAD(amd_iommu_pd_list); @@ -2490,6 +2496,7 @@ static int __init early_amd_iommu_init(void) __set_bit(0, amd_iommu_pd_alloc_bitmap); spin_lock_init(&amd_iommu_pd_lock); + spin_lock_init(&amd_iommu_flush_list_lock); /* * now the data structures are allocated and basically initialized diff --git a/drivers/iommu/amd_iommu_types.h 
b/drivers/iommu/amd_iommu_types.h index f6b24c7..c3f4a7e 1006
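The range bookkeeping done by amd_iommu_iotlb_range_add() and amd_iommu_iotlb_sync() can be illustrated in plain userspace C. This is a simplified sketch, not the driver code: struct flush_list, flush_list_add() and flush_list_sync() are hypothetical names, a singly linked list replaces list_head, locking is left out entirely, and calloc()/free() stand in for the kernel allocator.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the patch's flush-entry list. */
struct flush_entry {
	struct flush_entry *next;
	unsigned long iova;
	size_t size;
};

struct flush_list {
	struct flush_entry *head;
	size_t count;
};

/* Queue a range; an existing entry with the same iova can only grow. */
static bool flush_list_add(struct flush_list *fl, unsigned long iova,
			   size_t size)
{
	struct flush_entry *p;

	for (p = fl->head; p; p = p->next) {
		if (p->iova != iova)
			continue;
		if (size > p->size)
			p->size = size;
		return true;
	}

	p = calloc(1, sizeof(*p));
	if (!p)
		return false;

	p->iova = iova;
	p->size = size;
	p->next = fl->head;
	fl->head = p;
	fl->count++;
	return true;
}

/*
 * Drop every queued range, as the sync callback does before flushing
 * the whole domain. Returns how many entries were released.
 */
static size_t flush_list_sync(struct flush_list *fl)
{
	size_t released = 0;

	while (fl->head) {
		struct flush_entry *next = fl->head->next;

		free(fl->head);
		fl->head = next;
		released++;
	}
	fl->count = 0;
	return released;
}
```

Because the sync step in the patch flushes the whole domain anyway, the list only needs to be emptied there; the recorded sizes would matter only if a ranged flush were issued per entry.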
[RFC PATCH v2 1/2] vfio/type1: Adopt fast IOTLB flush interface when unmap IOVAs
VFIO IOMMU type1 currently unmaps IOVA pages synchronously, which requires
IOTLB flushing for every unmapping. This results in large IOTLB flushing
overhead when handling pass-through devices that have a large number of
mapped IOVAs.

This can be avoided by using the new IOTLB flushing interface.

Cc: Alex Williamson
Cc: Joerg Roedel
Signed-off-by: Suravee Suthikulpanit
---
 drivers/vfio/vfio_iommu_type1.c | 89 +++--
 1 file changed, 77 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index e30e29a..f000844 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -102,6 +102,13 @@ struct vfio_pfn {
 	atomic_t		ref_count;
 };
 
+struct vfio_regions{
+	struct list_head list;
+	dma_addr_t iova;
+	phys_addr_t phys;
+	size_t len;
+};
+
 #define IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)	\
 					(!list_empty(&iommu->domain_list))
 
@@ -479,6 +486,40 @@ static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
 	return unlocked;
 }
 
+/*
+ * Generally, VFIO needs to unpin remote pages after each IOTLB flush.
+ * Therefore, when using IOTLB flush sync interface, VFIO need to keep track
+ * of these regions (currently using a list).
+ *
+ * This value specifies maximum number of regions for each IOTLB flush sync.
+ */ +#define VFIO_IOMMU_TLB_SYNC_MAX512 + +static long vfio_sync_and_unpin(struct vfio_dma *dma, struct vfio_domain *domain, + struct list_head *regions, bool do_accounting) +{ + long unlocked = 0; + struct vfio_regions *entry, *next; + + iommu_tlb_sync(domain->domain); + + list_for_each_entry_safe(entry, next, regions, list) { + unlocked += vfio_unpin_pages_remote(dma, + entry->iova, + entry->phys >> PAGE_SHIFT, + entry->len >> PAGE_SHIFT, + false); + list_del(&entry->list); + kfree(entry); + } + + if (do_accounting) { + vfio_lock_acct(dma->task, -unlocked, NULL); + return 0; + } + return unlocked; +} + static int vfio_pin_page_external(struct vfio_dma *dma, unsigned long vaddr, unsigned long *pfn_base, bool do_accounting) { @@ -653,7 +694,10 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma, { dma_addr_t iova = dma->iova, end = dma->iova + dma->size; struct vfio_domain *domain, *d; + struct list_head unmapped_regions; + struct vfio_regions *entry; long unlocked = 0; + int cnt = 0; if (!dma->size) return 0; @@ -661,6 +705,8 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma, if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) return 0; + INIT_LIST_HEAD(&unmapped_regions); + /* * We use the IOMMU to track the physical addresses, otherwise we'd * need a much more complicated tracking system. 
Unfortunately that @@ -698,24 +744,36 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma, break; } - unmapped = iommu_unmap(domain->domain, iova, len); - if (WARN_ON(!unmapped)) + entry = kzalloc(sizeof(*entry), GFP_KERNEL); + if (!entry) break; - unlocked += vfio_unpin_pages_remote(dma, iova, - phys >> PAGE_SHIFT, - unmapped >> PAGE_SHIFT, - false); + unmapped = iommu_unmap_fast(domain->domain, iova, len); + if (WARN_ON(!unmapped)) { + kfree(entry); + break; + } + + iommu_tlb_range_add(domain->domain, iova, unmapped); + entry->iova = iova; + entry->phys = phys; + entry->len = unmapped; + list_add_tail(&entry->list, &unmapped_regions); + cnt++; iova += unmapped; + if (cnt >= VFIO_IOMMU_TLB_SYNC_MAX) { + unlocked += vfio_sync_and_unpin(dma, domain, &unmapped_regions, + do_accounting); + cnt = 0; + } cond_resched(); } + if (cnt) + unlocked += vfio_sync_and_unpin(dma, domain, &unmapped_regions, + do_account
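The batching scheme in this hunk (queue each unmapped region, sync once a threshold is reached, then sync once more for any remainder) can be modelled with a small counter-based sketch. Everything here is hypothetical and for illustration only: SYNC_MAX is a small stand-in for VFIO_IOMMU_TLB_SYNC_MAX, and sync_and_unpin() and unmap_batched() merely count work instead of touching an IOMMU.

```c
#include <stddef.h>

#define SYNC_MAX 4	/* small stand-in for VFIO_IOMMU_TLB_SYNC_MAX */

struct batch_stats {
	size_t syncs;		/* number of flush-and-unpin rounds */
	size_t unpinned;	/* total regions released */
};

/* One sync round: flush once, then release every queued region. */
static void sync_and_unpin(size_t *queued, struct batch_stats *st)
{
	st->syncs++;
	st->unpinned += *queued;
	*queued = 0;
}

/*
 * Unmap nr_regions regions, syncing whenever SYNC_MAX regions are
 * queued and once more at the end if anything is left over.
 */
static struct batch_stats unmap_batched(size_t nr_regions)
{
	struct batch_stats st = { 0, 0 };
	size_t queued = 0;
	size_t i;

	for (i = 0; i < nr_regions; i++) {
		queued++;	/* stands in for unmap plus range add */
		if (queued >= SYNC_MAX)
			sync_and_unpin(&queued, &st);
	}

	if (queued)
		sync_and_unpin(&queued, &st);

	return st;
}
```

With a threshold of 4, ten regions cost three sync rounds instead of ten, which is the saving the patch is after.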
[PATCH] ocfs2: try a blocking lock before return AOP_TRUNCATED_PAGE
If we can't get the inode lock immediately in ocfs2_inode_lock_with_page()
when reading a page, we should not return directly, since this leads to a
softlockup problem. Instead, take a blocking lock and immediately unlock
it before returning. This avoids wasting CPU on repeated retries, improves
fairness in acquiring the lock among multiple nodes, and increases
efficiency when the same file is modified frequently from multiple nodes.

The softlockup problem looks like:

Kernel panic - not syncing: softlockup: hung tasks
CPU: 0 PID: 885 Comm: multi_mmap Tainted: G L 4.12.14-6.1-default #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0x5c/0x82
 panic+0xd5/0x21e
 watchdog_timer_fn+0x208/0x210
 ? watchdog_park_threads+0x70/0x70
 __hrtimer_run_queues+0xcc/0x200
 hrtimer_interrupt+0xa6/0x1f0
 smp_apic_timer_interrupt+0x34/0x50
 apic_timer_interrupt+0x96/0xa0
RIP: 0010:unlock_page+0x17/0x30
RSP: :af154080bc88 EFLAGS: 0246 ORIG_RAX: ff10
RAX: dead0100 RBX: f21e009f5300 RCX: 0004
RDX: dead00ff RSI: 0202 RDI: f21e009f5300
RBP: R08: R09: af154080bb00
R10: af154080bc30 R11: 0040 R12: 993749a39518
R13: R14: f21e009f5300 R15: f21e009f5300
 ocfs2_inode_lock_with_page+0x25/0x30 [ocfs2]
 ocfs2_readpage+0x41/0x2d0 [ocfs2]
 ? pagecache_get_page+0x30/0x200
 filemap_fault+0x12b/0x5c0
 ? recalc_sigpending+0x17/0x50
 ? __set_task_blocked+0x28/0x70
 ? __set_current_blocked+0x3d/0x60
 ocfs2_fault+0x29/0xb0 [ocfs2]
 __do_fault+0x1a/0xa0
 __handle_mm_fault+0xbe8/0x1090
 handle_mm_fault+0xaa/0x1f0
 __do_page_fault+0x235/0x4b0
 trace_do_page_fault+0x3c/0x110
 async_page_fault+0x28/0x30
RIP: 0033:0x7fa75ded638e
RSP: 002b:7ffd6657db18 EFLAGS: 00010287
RAX: 55c7662fb700 RBX: 0001 RCX: 55c7662fb700
RDX: 1770 RSI: 7fa75e909000 RDI: 55c7662fb700
RBP: 0003 R08: 000e R09:
R10: 0483 R11: 7fa75ded61b0 R12: 7fa75e90a770
R13: 000e R14: 1770 R15:

Fixes: 1cce4df04f37 ("ocfs2: do not lock/unlock() inode DLM lock")
Signed-off-by: Gang He
---
 fs/ocfs2/dlmglue.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 4689940..5193218 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -2486,6 +2486,15 @@ int ocfs2_inode_lock_with_page(struct inode *inode,
 	ret = ocfs2_inode_lock_full(inode, ret_bh, ex, OCFS2_LOCK_NONBLOCK);
 	if (ret == -EAGAIN) {
 		unlock_page(page);
+		/*
+		 * If we can't get inode lock immediately, we should not return
+		 * directly here, since this will lead to a softlockup problem.
+		 * The method is to get a blocking lock and immediately unlock
+		 * before returning, this can avoid CPU resource waste due to
+		 * lots of retries, and benefits fairness in getting lock.
+		 */
+		if (ocfs2_inode_lock(inode, ret_bh, ex) == 0)
+			ocfs2_inode_unlock(inode, ex);
 		ret = AOP_TRUNCATED_PAGE;
 	}
 
--
1.8.5.6
Re: [v3,21/27] watchdog: replace devm_ioremap_nocache with devm_ioremap
Hi Guenter,

On 2017/12/27 1:28, Guenter Roeck wrote:
> On Sat, Dec 23, 2017 at 07:01:37PM +0800, Yisheng Xie wrote:
>> Default ioremap is ioremap_nocache, so devm_ioremap has the same
>> function with devm_ioremap_nocache, which can just be killed to
>> save the size of devres.o
>>
>> This patch is to use devm_ioremap instead of devm_ioremap_nocache,
>> which should not have any function change but prepare for killing
>> devm_ioremap_nocache.
>>
> I don't have issues with the patch itself - on mips, the definitions
> _are_ the same - but with the description. It is not universally correct
> that the definitions are the same. Please update and resubmit.
>

Right, there are 4 archs that have a different meaning for ioremap, and at
present I still do not know why. For this patch itself, maybe I can update
and resubmit.

Thanks and Merry Christmas.
Yisheng

> Guenter
>
Re: [PATCH] clk: mediatek: adjust dependency of reset.c to avoid unexpectedly being built
On Tue, 2017-12-26 at 17:19 -0800, Stephen Boyd wrote:
> On 12/26, sean.w...@mediatek.com wrote:
> > From: Sean Wang
> >
> > commit 74cb0d6dde8 ("clk: mediatek: fixup test-building of MediaTek clock
> > drivers") can let the build system looking into the directory where the
> > clock drivers resides and then allow test-building the drivers.
> >
> > But the change also gives rise to certain incorrect behavior which is
> > reset.c being built even not depending on either COMPILE_TEST or
> > ARCH_MEDIATEK alternative dependency. To get rid of reset.c being built
> > unexpectedly on the other platforms, it would be a good change that the
> > file should be built depending on its own specific configuration rather
> > than just on generic RESET_CONTROLLER one.
> >
> > Signed-off-by: Sean Wang
> > Cc: Jean Delvare
>
> I've typically seen vendor Kconfigs select the RESET_CONTROLLER
> framework if the vendor Kconfig is enabled. Any reason that same
> method isn't followed here?
>

I just thought an explicit dependency in Kconfig would be slightly better,
regardless of what the vendor Kconfig selects.

But I believe a reset controller is always present on every MediaTek SoC;
at least it can be found in the infracfg and pericfg subsystems, which are
really fundamental hardware blocks. So it is still quite reasonable to add
"RESET_CONTROLLER" to the vendor Kconfig.

Once we do that in the vendor Kconfig, the Kconfig could become something
like this:

config RESET_MEDIATEK
	bool "MediaTek Reset Driver"
	depends on ARCH_MEDIATEK || (RESET_CONTROLLER && COMPILE_TEST)
	help
	  This enables the reset controller driver used on MediaTek SoCs.

where COMPILE_TEST still has to depend on RESET_CONTROLLER to avoid any
compile errors.

I'll make the next version based on the above and the relevant vendor
Kconfig changes.

	Sean
[PATCH] PCI: exynos: remove the deprecated phy codes
pci-exynos has been updated to use the PHY framework
(drivers/phy/samsung/phy-exynos-pcie.c).
Remove the deprecated PHY code from pci-exynos.c and use
phy-exynos-pcie.c instead.
Update the binding documentation accordingly.

Signed-off-by: Jaehoon Chung
---
 .../bindings/pci/samsung,exynos5440-pcie.txt | 58 ++
 drivers/pci/dwc/pci-exynos.c                 | 219 ++---
 2 files changed, 22 insertions(+), 255 deletions(-)

diff --git a/Documentation/devicetree/bindings/pci/samsung,exynos5440-pcie.txt b/Documentation/devicetree/bindings/pci/samsung,exynos5440-pcie.txt
index 34a11bfbfb60..651d957d1051 100644
--- a/Documentation/devicetree/bindings/pci/samsung,exynos5440-pcie.txt
+++ b/Documentation/devicetree/bindings/pci/samsung,exynos5440-pcie.txt
@@ -6,9 +6,6 @@ and thus inherits all the common properties defined in designware-pcie.txt.
 Required properties:
 - compatible: "samsung,exynos5440-pcie"
 - reg: base addresses and lengths of the PCIe controller,
-	the PHY controller, additional register for the PHY controller.
-	(Registers for the PHY controller are DEPRECATED.
-	 Use the PHY framework.)
 - reg-names : First name should be set to "elbi".
 	And use the "config" instead of getting the configuration address space
 	from "ranges".
@@ -23,49 +20,8 @@ For other common properties, refer to Example: -SoC-specific DT Entry: +SoC-specific DT Entry (with using PHY framework): - pcie@29 { - compatible = "samsung,exynos5440-pcie", "snps,dw-pcie"; - reg = <0x29 0x1000 - 0x27 0x1000 - 0x271000 0x40>; - interrupts = <0 20 0>, <0 21 0>, <0 22 0>; - clocks = <&clock 28>, <&clock 27>; - clock-names = "pcie", "pcie_bus"; - #address-cells = <3>; - #size-cells = <2>; - device_type = "pci"; - ranges = <0x0800 0 0x4000 0x4000 0 0x1000 /* configuration space */ - 0x8100 0 0 0x40001000 0 0x0001 /* downstream I/O */ - 0x8200 0 0x40011000 0x40011000 0 0x1ffef000>; /* non-prefetchable memory */ - #interrupt-cells = <1>; - interrupt-map-mask = <0 0 0 0>; - interrupt-map = <0 0 0 0 &gic GIC_SPI 21 IRQ_TYPE_LEVEL_HIGH>; - num-lanes = <4>; - }; - - pcie@2a { - compatible = "samsung,exynos5440-pcie", "snps,dw-pcie"; - reg = <0x2a 0x1000 - 0x272000 0x1000 - 0x271040 0x40>; - interrupts = <0 23 0>, <0 24 0>, <0 25 0>; - clocks = <&clock 29>, <&clock 27>; - clock-names = "pcie", "pcie_bus"; - #address-cells = <3>; - #size-cells = <2>; - device_type = "pci"; - ranges = <0x0800 0 0x6000 0x6000 0 0x1000 /* configuration space */ - 0x8100 0 0 0x60001000 0 0x0001 /* downstream I/O */ - 0x8200 0 0x60011000 0x60011000 0 0x1ffef000>; /* non-prefetchable memory */ - #interrupt-cells = <1>; - interrupt-map-mask = <0 0 0 0>; - interrupt-map = <0 0 0 0 &gic GIC_SPI 24 IRQ_TYPE_LEVEL_HIGH>; - num-lanes = <4>; - }; - -With using PHY framework: pcie_phy0: pcie-phy@27 { ... reg = <0x27 0x1000>, <0x271000 0x40>; @@ -74,13 +30,21 @@ With using PHY framework: }; pcie@29 { - ... 
+ compatible = "samsung,exynos5440-pcie", "snps,dw-pcie"; reg = <0x29 0x1000>, <0x4000 0x1000>; reg-names = "elbi", "config"; + clocks = <&clock 28>, <&clock 27>; + clock-names = "pcie", "pcie_bus"; + #address-cells = <3>; + #size-cells = <2>; + device_type = "pci"; phys = <&pcie_phy0>; ranges = <0x8100 0 0 0x60001000 0 0x0001 0x8200 0 0x60011000 0x60011000 0 0x1ffef000>; - ... + #interrupt-cells = <1>; + interrupt-map-mask = <0 0 0 0>; + interrupt-map = <0 0 0 0 &gic GIC_SPI 21 IRQ_TYPE_LEVEL_HIGH>; + num-lanes = <4>; }; Board-specific DT Entry: diff --git a/drivers/pci/dwc/pci-exynos.c b/drivers/pci/dwc/pci-exynos.c index 5596fdedbb94..56f32aeebd0a 100644 --- a/drivers/pci/dwc/pci-exynos.c +++ b/drivers/pci/dwc/pci-exynos.c @@ -55,49 +55,8 @@ #define PCIE_ELBI_SLV_ARMISC 0x120 #define PCIE_ELBI_SLV_DBI_ENABLE BIT(21) -/* PCIe Purple registers */ -#define PCIE_PHY_GLOBAL_RESET 0x000 -#define PCIE_PHY_COMMON_RESET 0x004 -#define PCIE_PH
Re: [PATCH 4/4] KVM: nVMX: initialize more non-shadowed fields in prepare_vmcs02_full
On 25/12/2017 04:09, Wanpeng Li wrote:
> 2017-12-21 20:43 GMT+08:00 Paolo Bonzini :
>> These fields are also simple copies of the data in the vmcs12 struct.
>> For some of them, prepare_vmcs02 was skipping the copy when the field
>> was unused. In prepare_vmcs02_full, we copy them always as long as the
>> field exists on the host, because the corresponding execution control
>> might be one of the shadowed fields.
>
> Why we don't need to copy them always before the patchset?

Before these patches, we only copy them if the corresponding processor
control is enabled. For example, we only copy the EOI exit bitmap if
APICv is enabled by L1. Here we could have

	write to EOI exit bitmap
	vmlaunch (calls prepare_vmcs02_full)
	enable APICv (but EOI exit bitmap fields are clean)
	vmresume (doesn't call prepare_vmcs02_full)

The vmresume doesn't call prepare_vmcs02_full, so the EOI exit bitmap
must be copied every time prepare_vmcs02_full runs.

Paolo
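The four-step sequence in the reply above can be modelled in user-space C. This is only a toy sketch, not KVM code: every toy_ name, the single dirty flag, and the copy_always switch are assumptions made for illustration. It shows why a copy that is conditional on APICv leaves a stale bitmap once the "full" path is skipped on vmresume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for vmcs12 (the state L1 writes). Hypothetical layout. */
struct toy_vmcs12 {
	bool     apicv_enabled;    /* assumed shadowed: writes leave dirty alone */
	uint64_t eoi_exit_bitmap;  /* assumed non-shadowed: writes set dirty */
	bool     dirty;
};

/* Toy stand-in for vmcs02 (the state the hardware actually runs with). */
struct toy_vmcs02 {
	bool     apicv_enabled;
	uint64_t eoi_exit_bitmap;
};

static void toy_write_eoi_bitmap(struct toy_vmcs12 *v12, uint64_t val)
{
	v12->eoi_exit_bitmap = val;
	v12->dirty = true;
}

static void toy_enable_apicv(struct toy_vmcs12 *v12)
{
	v12->apicv_enabled = true;  /* dirty is deliberately not touched */
}

/* copy_always == false models the old conditional copy,
 * copy_always == true models the unconditional copy. */
static void toy_prepare_full(const struct toy_vmcs12 *v12,
			     struct toy_vmcs02 *v02, bool copy_always)
{
	if (copy_always || v12->apicv_enabled)
		v02->eoi_exit_bitmap = v12->eoi_exit_bitmap;
}

/* Models vmlaunch/vmresume: the full path runs only when vmcs12 is dirty. */
static void toy_enter(struct toy_vmcs12 *v12, struct toy_vmcs02 *v02,
		      bool copy_always)
{
	if (v12->dirty) {
		toy_prepare_full(v12, v02, copy_always);
		v12->dirty = false;
	}
	v02->apicv_enabled = v12->apicv_enabled;  /* cheap path, every entry */
}

/* Replays the sequence and returns the bitmap the "hardware" ends up with. */
uint64_t toy_run_sequence(bool copy_always)
{
	struct toy_vmcs12 v12 = { false, 0, false };
	struct toy_vmcs02 v02 = { false, 0 };

	toy_write_eoi_bitmap(&v12, 0xff);
	toy_enter(&v12, &v02, copy_always);  /* vmlaunch: full path runs */
	toy_enable_apicv(&v12);
	toy_enter(&v12, &v02, copy_always);  /* vmresume: full path skipped */
	return v02.eoi_exit_bitmap;
}
```

With the conditional copy the second entry runs with APICv on but a bitmap that was never copied; with the unconditional copy the value written before vmlaunch is already in place.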
Re: [next] ath10k: wmi: remove redundant integer fc
Colin Ian King wrote: > Variable fc is being assigned but never used, so remove it. Cleans > up the clang warning: > > warning: Value stored to 'fc' is never read > > Signed-off-by: Colin Ian King > Signed-off-by: Kalle Valo Patch applied to ath-next branch of ath.git, thanks. a0709dfd7ff8 ath10k: wmi: remove redundant integer fc -- https://patchwork.kernel.org/patch/10119831/ https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
Re: wil6210: fix build warnings without CONFIG_PM
Arnd Bergmann wrote: > The #ifdef checks are hard to get right, in this case some functions > should have been left inside a CONFIG_PM_SLEEP check as seen by this > message: > > drivers/net/wireless/ath/wil6210/pcie_bus.c:489:12: error: > 'wil6210_pm_resume' defined but not used [-Werror=unused-function] > drivers/net/wireless/ath/wil6210/pcie_bus.c:484:12: error: > 'wil6210_pm_suspend' defined but not used [-Werror=unused-function] > > Using an __maybe_unused is easier here, so I'm replacing all the > other #ifdef in this file as well for consistency. > > Fixes: 94162666cd51 ("wil6210: run-time PM when interface down") > Signed-off-by: Arnd Bergmann > Signed-off-by: Kalle Valo Patch applied to ath-next branch of ath.git, thanks. 203dab8395d9 wil6210: fix build warnings without CONFIG_PM -- https://patchwork.kernel.org/patch/10119565/ https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
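A rough user-space sketch of the __maybe_unused approach described above, assuming GCC or Clang. TOY_CONFIG_PM_SLEEP and all toy_ names are hypothetical stand-ins, not wil6210 symbols. The callbacks are always compiled; only the table that references them is conditional, so no "defined but not used" diagnostic appears when the option is off.

```c
#include <assert.h>
#include <stddef.h>

/* The kernel's __maybe_unused expands to an "unused" attribute like this. */
#define toy_maybe_unused __attribute__((unused))

/* Always compiled, with no #ifdef around the definitions. */
static int toy_maybe_unused toy_pm_suspend(int state)
{
	return state + 1;
}

static int toy_maybe_unused toy_pm_resume(int state)
{
	return state - 1;
}

struct toy_pm_ops {
	int (*suspend)(int state);
	int (*resume)(int state);
};

/* Only the reference to the callbacks depends on the config symbol. */
#ifdef TOY_CONFIG_PM_SLEEP
static const struct toy_pm_ops toy_ops = { toy_pm_suspend, toy_pm_resume };
#else
static const struct toy_pm_ops toy_ops = { NULL, NULL };
#endif

/* Reports whether the sleep callbacks were wired into the ops table. */
int toy_has_pm_callbacks(void)
{
	return toy_ops.suspend != NULL && toy_ops.resume != NULL;
}
```

Building this with -Wall -Werror=unused-function and without TOY_CONFIG_PM_SLEEP should compile cleanly; dropping the attribute would be expected to bring the warning back.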
Re: [PATCH v4] hwrng: exynos - add Samsung Exynos True RNG driver
It was <2017-12-22 pią 19:30>, when Philippe Ombredanne wrote: > On Fri, Dec 22, 2017 at 5:38 PM, Łukasz Stelmach > wrote: >> It was <2017-12-22 pią 14:34>, when Philippe Ombredanne wrote: >>> Łukasz, >>> >>> On Fri, Dec 22, 2017 at 2:23 PM, Łukasz Stelmach >>> wrote: Add support for True Random Number Generator found in Samsung Exynos 5250+ SoCs. Signed-off-by: Łukasz Stelmach Reviewed-by: Krzysztof Kozlowski >>> >>> >>> --- /dev/null +++ b/drivers/char/hw_random/exynos-trng.c @@ -0,0 +1,245 @@ +/* + * RNG driver for Exynos TRNGs + * + * Author: Łukasz Stelmach + * + * Copyright 2017 (c) Samsung Electronics Software, Inc. + * + * Based on the Exynos PRNG driver drivers/crypto/exynos-rng by + * Krzysztof Kozłowski + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + */ >>> >>> >>> Would you mind using the new SPDX tags documented in Thomas patch set >>> [1] rather than this fine but longer legalese? >>> >>> And if you could spread the word to others in your team this would be very >>> nice. >>> See also this fine article posted by Mauro on the Samsung Open Source >>> Group Blog [2] >>> Thank you! >> >> Cool! We've been using SPDX to tag RPM packages in Tizen for three years or >> more. ;-) > > Very nice! any pubic pointers? ^ I assume you request an URL of a publicly available web-page ;-) https://wiki.tizen.org/Packaging/Guidelines#License_Tag -- Łukasz Stelmach Samsung R&D Institute Poland Samsung Electronics
Re: [Ocfs2-devel] [PATCH] ocfs2: try a blocking lock before return AOP_TRUNCATED_PAGE
Hi Gang, Do you mean that too many retries in the loop cost a lot of CPU time and block the page-fault interrupt? We should not add any delay in ocfs2_fault(), right? And I am still a little confused about why your method solves this problem. Thanks, Jun On 2017/12/27 17:29, Gang He wrote: > If we can't get inode lock immediately in the function > ocfs2_inode_lock_with_page() when reading a page, we should not > return directly here, since this will lead to a softlockup problem. > The method is to get a blocking lock and immediately unlock before > returning, this can avoid CPU resource waste due to lots of retries, > and benefits fairness in getting lock among multiple nodes, increase > efficiency in case modifying the same file frequently from multiple > nodes. > The softlockup problem looks like, > Kernel panic - not syncing: softlockup: hung tasks > CPU: 0 PID: 885 Comm: multi_mmap Tainted: G L 4.12.14-6.1-default #1 > Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > Call Trace: > > dump_stack+0x5c/0x82 > panic+0xd5/0x21e > watchdog_timer_fn+0x208/0x210 > ? watchdog_park_threads+0x70/0x70 > __hrtimer_run_queues+0xcc/0x200 > hrtimer_interrupt+0xa6/0x1f0 > smp_apic_timer_interrupt+0x34/0x50 > apic_timer_interrupt+0x96/0xa0 > > RIP: 0010:unlock_page+0x17/0x30 > RSP: :af154080bc88 EFLAGS: 0246 ORIG_RAX: ff10 > RAX: dead0100 RBX: f21e009f5300 RCX: 0004 > RDX: dead00ff RSI: 0202 RDI: f21e009f5300 > RBP: R08: R09: af154080bb00 > R10: af154080bc30 R11: 0040 R12: 993749a39518 > R13: R14: f21e009f5300 R15: f21e009f5300 > ocfs2_inode_lock_with_page+0x25/0x30 [ocfs2] > ocfs2_readpage+0x41/0x2d0 [ocfs2] > ? pagecache_get_page+0x30/0x200 > filemap_fault+0x12b/0x5c0 > ? recalc_sigpending+0x17/0x50 > ? __set_task_blocked+0x28/0x70 > ? 
__set_current_blocked+0x3d/0x60 > ocfs2_fault+0x29/0xb0 [ocfs2] > __do_fault+0x1a/0xa0 > __handle_mm_fault+0xbe8/0x1090 > handle_mm_fault+0xaa/0x1f0 > __do_page_fault+0x235/0x4b0 > trace_do_page_fault+0x3c/0x110 > async_page_fault+0x28/0x30 > RIP: 0033:0x7fa75ded638e > RSP: 002b:7ffd6657db18 EFLAGS: 00010287 > RAX: 55c7662fb700 RBX: 0001 RCX: 55c7662fb700 > RDX: 1770 RSI: 7fa75e909000 RDI: 55c7662fb700 > RBP: 0003 R08: 000e R09: > R10: 0483 R11: 7fa75ded61b0 R12: 7fa75e90a770 > R13: 000e R14: 1770 R15: > > Fixes: 1cce4df04f37 ("ocfs2: do not lock/unlock() inode DLM lock") > Signed-off-by: Gang He > --- > fs/ocfs2/dlmglue.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c > index 4689940..5193218 100644 > --- a/fs/ocfs2/dlmglue.c > +++ b/fs/ocfs2/dlmglue.c > @@ -2486,6 +2486,15 @@ int ocfs2_inode_lock_with_page(struct inode *inode, > ret = ocfs2_inode_lock_full(inode, ret_bh, ex, OCFS2_LOCK_NONBLOCK); > if (ret == -EAGAIN) { > unlock_page(page); > + /* > + * If we can't get inode lock immediately, we should not return > + * directly here, since this will lead to a softlockup problem. > + * The method is to get a blocking lock and immediately unlock > + * before returning, this can avoid CPU resource waste due to > + * lots of retries, and benefits fairness in getting lock. > + */ > + if (ocfs2_inode_lock(inode, ret_bh, ex) == 0) > + ocfs2_inode_unlock(inode, ex); > ret = AOP_TRUNCATED_PAGE; > } > >
Re: [PATCH] backlight: otm3225a: add support for ORISE OTM3225A LCD SoC
Hello Jingoo, Many thanks for taking the time to review my patch! Your suggestions will be implemented in v2 which I will post soon. kind regards, Felix On 22.12.2017 18:23, Jingoo Han wrote: > On Wednesday, December 20, 2017 12:58 PM, Felix Brack wrote: >> >> This patch adds a LCD driver supporting the OTM3225A LCD SoC >> from ORISE Technology. This device can drive TFT LC panels having a >> resolution of 240x320 pixels. After initializing the OTM3225A using >> its SPI interface it switches to use 16-bit RGB as external >> display interface. >> >> Signed-off-by: Felix Brack >> --- >> drivers/video/backlight/Kconfig| 7 ++ >> drivers/video/backlight/Makefile | 1 + >> drivers/video/backlight/otm3225a.c | 210 >> + >> 3 files changed, 218 insertions(+) >> create mode 100644 drivers/video/backlight/otm3225a.c >> >> diff --git a/drivers/video/backlight/Kconfig >> b/drivers/video/backlight/Kconfig >> index 4e1d2ad..06e187b 100644 >> --- a/drivers/video/backlight/Kconfig >> +++ b/drivers/video/backlight/Kconfig >> @@ -150,6 +150,13 @@ config LCD_HX8357 >>If you have a HX-8357 LCD panel, say Y to enable its LCD control >>driver. >> >> + config LCD_OTM3225A >> +tristate "ORISE Technology OTM3225A support" >> +depends on SPI >> +help >> + If you have a panel based on the OTM3225A controller >> + chip then say y to include a driver for it. >> + >> endif # LCD_CLASS_DEVICE >> >> # >> diff --git a/drivers/video/backlight/Makefile >> b/drivers/video/backlight/Makefile >> index 8905129..b177b91 100644 >> --- a/drivers/video/backlight/Makefile >> +++ b/drivers/video/backlight/Makefile >> @@ -17,6 +17,7 @@ obj-$(CONFIG_LCD_S6E63M0) += s6e63m0.o >> obj-$(CONFIG_LCD_TDO24M)+= tdo24m.o >> obj-$(CONFIG_LCD_TOSA) += tosa_lcd.o >> obj-$(CONFIG_LCD_VGG2432A4) += vgg2432a4.o >> +obj-$(CONFIG_LCD_OTM3225A) += otm3225a.o > > All entries of Kconfig were sorted alphabetically 4 years ago to reduce > patch collisions. So please add it in alphabetical order as below. 
> > @@ -13,6 +13,7 @@ obj-$(CONFIG_LCD_LD9040) += ld9040.o > obj-$(CONFIG_LCD_LMS283GF05) += lms283gf05.o > obj-$(CONFIG_LCD_LMS501KF03) += lms501kf03.o > obj-$(CONFIG_LCD_LTV350QV) += ltv350qv.o > +obj-$(CONFIG_LCD_OTM3225A) += otm3225a.o > obj-$(CONFIG_LCD_PLATFORM) += platform_lcd.o > > >> >> obj-$(CONFIG_BACKLIGHT_88PM860X)+= 88pm860x_bl.o >> obj-$(CONFIG_BACKLIGHT_AAT2870) += aat2870_bl.o >> diff --git a/drivers/video/backlight/otm3225a.c >> b/drivers/video/backlight/otm3225a.c >> new file mode 100644 >> index 000..0de75f8 >> --- /dev/null >> +++ b/drivers/video/backlight/otm3225a.c >> @@ -0,0 +1,210 @@ >> +/* >> + * Driver for ORISE Technology OTM3225A SOC for TFT LCD >> + * >> + * Copyright (C) 2014-2017, EETS GmbH, Felix Brack > > Please change the year of copyright as below. > > + * Copyright (C) 2017, EETS GmbH, Felix Brack > >> + * >> + * This program is free software; you can redistribute it and/or modify >> + * it under the terms of the GNU General Public License as published by >> + * the Free Software Foundation; either version 2 of the License, or >> + * (at your option) any later version. >> + >> + * This program is distributed in the hope that it will be useful, >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the >> + * GNU General Public License for more details. >> + >> + * This driver implements a lcd device for the ORISE OTM3225A display >> + * controller. The control interface to the display is SPI and the >> display's >> + * memory is updated over the 16-bit RGB interface. >> + * The main source of information for writing this driver was provided by >> the >> + * OTM3225A datasheet from ORISE Technology. Some information arise from >> the >> + * ILI9328 datasheet from ILITEK as well as from the datasheets and >> sample code >> + * provided by Crystalfontz America Inc. 
who sells the CFAF240320A-032T, >> a 3.2" >> + * TFT LC display using the OTM3225A controller. >> + */ >> + >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include > > So please add these headers in alphabetical order > for readability. > >> + >> +#define OTM3225A_INDEX_REG 0x70 >> +#define OTM3225A_DATA_REG 0x72 >> + >> +struct otm3225a_data { >> +struct spi_device *spi; >> +struct lcd_device *ld; >> +int power; >> +}; >> + >> +struct otm3225a_spi_instruction { >> +unsigned char reg; /* register to write */ >> +unsigned short value; /* data to write to 'reg' */ >> +unsigned short delay; /* delay in ms after write */ >> +}; >> + >> +static struct otm3225a_spi_instruction display_init[] = { >> +{ 0x01, 0x, 0 }, { 0x02, 0x0700, 0 }, { 0x03, 0x50A0, 0 }, >> +{ 0x04, 0x, 0 },
Re: [PATCH v5 03/78] xarray: Add the xa_lock to the radix_tree_root
On Tue, Dec 26, 2017 at 07:43:40PM -0800, Matthew Wilcox wrote: > On Tue, Dec 26, 2017 at 07:54:40PM +0300, Kirill A. Shutemov wrote: > > On Fri, Dec 15, 2017 at 02:03:35PM -0800, Matthew Wilcox wrote: > > > From: Matthew Wilcox > > > > > > This results in no change in structure size on 64-bit x86 as it fits in > > > the padding between the gfp_t and the void *. > > > > The patch does more than described in the subject and commit message. At > > first > > I was confused why do you need to touch idr here. It took few minutes to > > figure > > it out. > > > > Could you please add more into commit message about lockname and xa_ locking > > interface since you introduce it here? > > Sure! How's this? > > xarray: Add the xa_lock to the radix_tree_root > > This results in no change in structure size on 64-bit x86 as it fits in > the padding between the gfp_t and the void *. > > Initialising the spinlock requires a name for the benefit of lockdep, > so RADIX_TREE_INIT() now needs to know the name of the radix tree it's > initialising, and so do IDR_INIT() and IDA_INIT(). > > Also add the xa_lock() and xa_unlock() family of wrappers to make it > easier to use the lock. If we could rely on -fplan9-extensions in > the compiler, we could avoid all of this syntactic sugar, but that > wasn't added until gcc 4.6. > Looks great, thanks. -- Kirill A. Shutemov
Re: [PATCH v5 03/78] xarray: Add the xa_lock to the radix_tree_root
On Tue, Dec 26, 2017 at 07:58:15PM -0800, Matthew Wilcox wrote:
> On Tue, Dec 26, 2017 at 07:43:40PM -0800, Matthew Wilcox wrote:
> > Also add the xa_lock() and xa_unlock() family of wrappers to make it
> > easier to use the lock. If we could rely on -fplan9-extensions in
> > the compiler, we could avoid all of this syntactic sugar, but that
> > wasn't added until gcc 4.6.
>
> Oh, in case anyone's wondering, here's how I'd do it with plan9 extensions:
>
> struct xarray {
>         spinlock_t;
>         int xa_flags;
>         void *xa_head;
> };
>
> ...
>         spin_lock_irqsave(&mapping->pages, flags);
>         __delete_from_page_cache(page, NULL);
>         spin_unlock_irqrestore(&mapping->pages, flags);
> ...
>
> The plan9 extensions permit passing a pointer to a struct which has an
> unnamed element to a function which is expecting a pointer to the type
> of that element. The compiler does any necessary arithmetic to produce
> a pointer. It's exactly as if I had written:
>
>         spin_lock_irqsave(&mapping->pages.xa_lock, flags);
>         __delete_from_page_cache(page, NULL);
>         spin_unlock_irqrestore(&mapping->pages.xa_lock, flags);
>
> More details here: https://9p.io/sys/doc/compiler.html

Yeah, that's neat. Dealing with old compilers is frustrating...

-- 
 Kirill A. Shutemov
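For comparison, here is a rough sketch of the portable alternative the thread alludes to: a named lock member plus thin wrapper macros, in standard C without -fplan9-extensions. The toy_ types and helpers are hypothetical; the toy "spinlock" is a plain flag so the example runs in user space, and it is not the xarray implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock: a flag instead of a real spinlock (assumption for the demo). */
struct toy_spinlock {
	int held;
};

static void toy_spin_lock(struct toy_spinlock *lock)
{
	assert(!lock->held);  /* catches double-locking in this toy model */
	lock->held = 1;
}

static void toy_spin_unlock(struct toy_spinlock *lock)
{
	assert(lock->held);
	lock->held = 0;
}

/* The lock is a named member, so callers must spell out the member... */
struct toy_xarray {
	struct toy_spinlock xa_lock;
	int xa_flags;
	void *xa_head;
};

/* ...or use wrappers that hide it, in the spirit of xa_lock()/xa_unlock(). */
#define toy_xa_lock(xa)   toy_spin_lock(&(xa)->xa_lock)
#define toy_xa_unlock(xa) toy_spin_unlock(&(xa)->xa_lock)

/* Updates xa_flags under the lock; returns 1 if the lock was held
 * while the update took place. */
int toy_locked_update(struct toy_xarray *xa, int flags)
{
	int was_held;

	toy_xa_lock(xa);
	was_held = xa->xa_lock.held;
	xa->xa_flags = flags;
	toy_xa_unlock(xa);
	return was_held;
}
```

The macros are the "syntactic sugar" in question: they take the address of the named member, which is the step the plan9 extension would perform implicitly.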
[PATCH 1/4] PCI/AER: factor out error reporting from AER
This patch factors out error reporting callbacks, which are currently tightly coupled with AER. DPC should be able to call these callbacks when DPC trigger event occurs. Signed-off-by: Oza Pawandeep diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 6402f7f..fd053e5 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -462,7 +462,7 @@ static void ghes_do_proc(struct ghes *ghes, * use, so treat it as a fatal AER error. */ if (gdata->flags & CPER_SEC_RESET) - aer_severity = AER_FATAL; + aer_severity = PCI_ERR_AER_FATAL; aer_recover_queue(pcie_err->device_id.segment, pcie_err->device_id.bus, diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile index 223e4c3..d669497 100644 --- a/drivers/pci/pcie/Makefile +++ b/drivers/pci/pcie/Makefile @@ -6,7 +6,7 @@ # Build PCI Express ASPM if needed obj-$(CONFIG_PCIEASPM) += aspm.o -pcieportdrv-y := portdrv_core.o portdrv_pci.o portdrv_bus.o +pcieportdrv-y := portdrv_core.o portdrv_pci.o portdrv_bus.o pcie-err.o pcieportdrv-$(CONFIG_ACPI) += portdrv_acpi.o obj-$(CONFIG_PCIEPORTBUS) += pcieportdrv.o diff --git a/drivers/pci/pcie/aer/aerdrv.h b/drivers/pci/pcie/aer/aerdrv.h index 5449e5c..bc9db53 100644 --- a/drivers/pci/pcie/aer/aerdrv.h +++ b/drivers/pci/pcie/aer/aerdrv.h @@ -76,36 +76,6 @@ struct aer_rpc { */ }; -struct aer_broadcast_data { - enum pci_channel_state state; - enum pci_ers_result result; -}; - -static inline pci_ers_result_t merge_result(enum pci_ers_result orig, - enum pci_ers_result new) -{ - if (new == PCI_ERS_RESULT_NO_AER_DRIVER) - return PCI_ERS_RESULT_NO_AER_DRIVER; - - if (new == PCI_ERS_RESULT_NONE) - return orig; - - switch (orig) { - case PCI_ERS_RESULT_CAN_RECOVER: - case PCI_ERS_RESULT_RECOVERED: - orig = new; - break; - case PCI_ERS_RESULT_DISCONNECT: - if (new == PCI_ERS_RESULT_NEED_RESET) - orig = PCI_ERS_RESULT_NEED_RESET; - break; - default: - break; - } - - return orig; -} - extern struct bus_type pcie_port_bus_type; void aer_isr(struct 
work_struct *work); void aer_print_error(struct pci_dev *dev, struct aer_err_info *info); diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c index 7448052..758e744 100644 --- a/drivers/pci/pcie/aer/aerdrv_core.c +++ b/drivers/pci/pcie/aer/aerdrv_core.c @@ -165,7 +165,7 @@ static bool is_error_source(struct pci_dev *dev, struct aer_err_info *e_info) return false; /* Check if error is recorded */ - if (e_info->severity == AER_CORRECTABLE) { + if (e_info->severity == PCI_ERR_AER_CORRECTABLE) { pci_read_config_dword(dev, pos + PCI_ERR_COR_STATUS, &status); pci_read_config_dword(dev, pos + PCI_ERR_COR_MASK, &mask); } else { @@ -234,189 +234,6 @@ static bool find_source_device(struct pci_dev *parent, return true; } -static int report_error_detected(struct pci_dev *dev, void *data) -{ - pci_ers_result_t vote; - const struct pci_error_handlers *err_handler; - struct aer_broadcast_data *result_data; - result_data = (struct aer_broadcast_data *) data; - - device_lock(&dev->dev); - dev->error_state = result_data->state; - - if (!dev->driver || - !dev->driver->err_handler || - !dev->driver->err_handler->error_detected) { - if (result_data->state == pci_channel_io_frozen && - dev->hdr_type != PCI_HEADER_TYPE_BRIDGE) { - /* -* In case of fatal recovery, if one of down- -* stream device has no driver. We might be -* unable to recover because a later insmod -* of a driver for this device is unaware of -* its hw state. -*/ - dev_printk(KERN_DEBUG, &dev->dev, "device has %s\n", - dev->driver ? - "no AER-aware driver" : "no driver"); - } - - /* -* If there's any device in the subtree that does not -* have an error_detected callback, returning -* PCI_ERS_RESULT_NO_AER_DRIVER prevents calling of -* the subsequent mmio_enabled/slot_reset/resume -* callbacks of "any" device in the subtree. All the -* devices in the subtree
[PATCH 0/4] Address error and recovery for AER and DPC
This patch set lets DPC and AER co-exist without racing for recovery.

The current implementation of AER, including the broadcasting of error
messages to the EP driver, is tightly coupled with and limited to the AER
service driver. It is important to factor out the broadcasting and other
link-handling callbacks, so that they are handled appropriately not only
when AER is triggered, but also when DPC is triggered, or when both are
triggered simultaneously (e.g. for ERR_FATAL).

With the code modularized, the race between AER and DPC is handled
gracefully. For example, when DPC is active and has kicked in, AER should
not attempt recovery, because DPC takes care of it.

DPC should enumerate the devices after recovering the link, which is
achieved by implementing the error_resume callback.

Oza Pawandeep (4):
  PCI/AER: factor out error reporting from AER
  PCI/DPC/AER: Address Concurrency between AER and DPC
  PCI/ERR: Do not do recovery if DPC service is active
  PCI/DPC: Enumerate the devices after DPC trigger event

 drivers/acpi/apei/ghes.c | 2 +-
 drivers/pci/pcie/Makefile | 2 +-
 drivers/pci/pcie/aer/aerdrv.h | 30 ---
 drivers/pci/pcie/aer/aerdrv_core.c | 306 +
 drivers/pci/pcie/aer/aerdrv_errprint.c | 27 ++-
 drivers/pci/pcie/pcie-dpc.c| 127 ++-
 drivers/pci/pcie/pcie-err.c| 392 +
 drivers/pci/pcie/portdrv.h | 2 +
 include/linux/aer.h| 4 -
 include/linux/pci.h| 23 ++
 10 files changed, 569 insertions(+), 346 deletions(-)
 create mode 100644 drivers/pci/pcie/pcie-err.c

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc., a Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH 3/4] PCI/ERR: Do not do recovery if DPC service is active
If AER attempts to do recovery for any device, and DPC is active on any upstream port, AER should not do recovery, since it will be handled by DPC Change-Id: Ida507ce9145f420e35302db34e967f1b421e15c9 Signed-off-by: Oza Pawandeep diff --git a/drivers/pci/pcie/pcie-err.c b/drivers/pci/pcie/pcie-err.c index 8bac584..1f01e76 100644 --- a/drivers/pci/pcie/pcie-err.c +++ b/drivers/pci/pcie/pcie-err.c @@ -267,6 +267,22 @@ pci_ers_result_t pci_broadcast_error_message(struct pci_dev *dev, return result_data.result; } +/* + * pcie_port_upstream_bridge - returns immediate upstream bridge. + * dev: pcie device + */ +static struct pci_dev *pcie_port_upstream_bridge(struct pci_dev *dev) +{ + struct pci_dev *parent; + + parent = pci_upstream_bridge(dev); + + if (parent && pci_is_pcie(parent)) + return parent; + + return NULL; +} + /** * pci_do_recovery - handle nonfatal/fatal error recovery process * @dev: pointer to a pci_dev data structure of agent detecting an error @@ -280,9 +296,29 @@ void pci_do_recovery(struct pci_dev *dev, int severity) { pci_ers_result_t status, result = PCI_ERS_RESULT_RECOVERED; enum pci_channel_state state; + struct pcie_port_service_driver *driver; + struct pci_dev *pdev = dev; mutex_lock(&pci_err_recovery_lock); + if (severity != PCI_ERR_DPC_FATAL) { + /* +* DPC service could be running in RP +* or any upstream switch. +*/ + do { + driver = pci_find_dpc_service(pdev); + if (driver) { + dev_printk(KERN_NOTICE, &dev->dev, + "AER: Recovery to be done by DPC %s\n", + pci_name(dev)); + mutex_unlock(&pci_err_recovery_lock); + return; + } + pdev = pcie_port_upstream_bridge(dev); + } while (pdev); + } + if ((severity == PCI_ERR_AER_FATAL) || (severity == PCI_ERR_DPC_FATAL)) state = pci_channel_io_frozen; -- Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc., a Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
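The upstream walk described in this patch can be sketched in plain C as below. This is a toy model under stated assumptions: struct toy_port and the toy_ helpers are hypothetical, a boolean stands in for "a DPC service driver is bound", and the loop is written on the assumption that the intent is to test each ancestor in turn, so the cursor (not the original device) is advanced on every iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy PCIe port: a link to its upstream bridge plus a DPC flag. */
struct toy_port {
	struct toy_port *upstream;  /* NULL at the root */
	bool has_dpc_service;       /* stands in for a bound DPC service */
};

/* Returns the nearest port, starting at dev itself, with DPC bound,
 * or NULL if no port on the path to the root has it. */
const struct toy_port *toy_find_dpc_port(const struct toy_port *dev)
{
	const struct toy_port *p;

	for (p = dev; p != NULL; p = p->upstream) {  /* advance the cursor */
		if (p->has_dpc_service)
			return p;
	}
	return NULL;
}

/* AER-initiated recovery backs off when DPC is active anywhere upstream;
 * DPC-initiated recovery (dpc_fatal) always proceeds. */
bool toy_aer_should_recover(const struct toy_port *dev, bool dpc_fatal)
{
	if (dpc_fatal)
		return true;
	return toy_find_dpc_port(dev) == NULL;
}
```

For an endpoint below a switch below a root port with DPC bound, the AER path declines recovery and the DPC path proceeds.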
Re: [PATCH v5 15/15] devicetree: bindings: Document qcom,pvs
Hi Rob, On 12/26/2017 11:06 PM, Rob Herring wrote: > On Thu, Dec 21, 2017 at 5:53 AM, Sricharan R wrote: >> Hi Rob, >> >> On 12/21/2017 2:48 AM, Rob Herring wrote: >>> On Wed, Dec 20, 2017 at 11:55:33AM +0530, Sricharan R wrote: Hi Viresh, On 12/20/2017 8:56 AM, Viresh Kumar wrote: > On 19-12-17, 21:25, Sricharan R wrote: >> + cpu@0 { >> + compatible = "qcom,krait"; >> + enable-method = "qcom,kpss-acc-v1"; >> + device_type = "cpu"; >> + reg = <0>; >> + qcom,acc = <&acc0>; >> + qcom,saw = <&saw0>; >> + clocks = <&kraitcc 0>; >> + clock-names = "cpu"; >> + cpu-supply = <&smb208_s2a>; >> + operating-points-v2 = <&cpu_opp_table>; >> + }; >> + >> + qcom,pvs { >> + qcom,pvs-format-a; >> + }; > > Not sure what Rob is going to say on that :) > Yes. Would be good to know the best way. >>> >>> Seems like this should be a property of an efuse node either implied by >>> the compatible or a separate property. What determines format A vs. B? >>> >> >> Yes, this efuse registers are part of the eeprom (qfprom) tied to the soc. >> So this property (details like bitfields and register offsets that it >> represents) >> can be put soc specific and nvmem apis can be used to read >> the registers. Does something like below look ok ? >> >> qcom,pvs { >> compatible = "qcom,pvs-ipq8064"; >> nvmem-cells = <&pvs_efuse>; >> } > > Why do you need this node? It doesn't look like it corresponds to a > h/w block. It looks like you are just creating it to instantiate a > driver. > >> qfprom: qfprom@70 { >> compatible = "qcom,qfprom"; > > Either this or... > >> reg = <0x0070 0x1000>; >> #address-cells = <1>; >> #size-cells = <1>; >> ranges; >> pvs_efuse: pvs { > > a compatible here should be specific enough so the OS can know what > the bits are. Infact the above "qcom,pvs" node is required mainly to act as a consumer for the nvmem data provider ("qcom,qfprom") (using nvmem-cells = <&pvs_efuse>) Then "qfprom" can be made to contain a "format_a" or "format_b" specific cell. 
So all that is needed is, nvmem-cells = <&pvs_efuse_phandle> needs to be available somewhere. The requirement is similar what is now done by "operating-points-v2-ti-cpu" and the ti-cpufreq.c. There "operating-points-v2-ti-cpu" node, contains the syscon register to read the efuse values. Similarly does defining a new "operating-points-v2-krait-cpu" which would contain the nvmem-cells property look ok ? This would avoid defining a new qcom,pvs node. cpu@0 { compatible = "qcom,krait"; enable-method = "qcom,kpss-acc-v1"; device_type = "cpu"; reg = <0>; qcom,acc = <&acc0>; qcom,saw = <&saw0>; clocks = <&kraitcc 0>; clock-names = "cpu"; cpu-supply = <&smb208_s2a>; operating-points-v2 = <&cpu_opp_table>; }; cpu_opp_table: opp_table { compatible = "operating-points-v2-krait-cpu"; nvmem-cells = <&pvs_efuse_format_a>; /* * Missing opp-shared property means CPUs switch DVFS states * independently. */ opp-14 { opp-hz = /bits/ 64 <14>; opp-microvolt-speed0-pvs0-v0 = <125>; opp-microvolt-speed0-pvs1-v0 = <1175000>; opp-microvolt-speed0-pvs2-v0 = <1125000>; opp-microvolt-speed0-pvs3-v0 = <105>; }; ... } qfprom: qfprom@70 { compatible = "qcom,qfprom"; reg = <0x0070 0x1000>; #address-cells = <1>; #size-cells = <1>; ranges; pvs_efuse_format_a: pvs { reg = <0xc0 0x8>; }; } Regards, Sricharan -- "QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
[PATCH 4/4] PCI/DPC: Enumerate the devices after DPC trigger event
Implement error_resume callback in DPC, which, after DPC trigger event enumerates the devices beneath. Signed-off-by: Oza Pawandeep diff --git a/drivers/pci/pcie/pcie-dpc.c b/drivers/pci/pcie/pcie-dpc.c index e7ced58..78e557f 100644 --- a/drivers/pci/pcie/pcie-dpc.c +++ b/drivers/pci/pcie/pcie-dpc.c @@ -161,6 +161,43 @@ static void dpc_wait_link_inactive(struct dpc_dev *dpc) dev_warn(dev, "Link state not disabled for DPC event\n"); } +static bool dpc_wait_link_active(struct pci_dev *pdev) +{ + unsigned long timeout = jiffies + HZ; + u16 lnk_status; + bool ret = true; + + pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status); + + while (!(lnk_status & PCI_EXP_LNKSTA_DLLLA) && + !time_after(jiffies, timeout)) { + msleep(10); + pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status); + } + + if (!(lnk_status & PCI_EXP_LNKSTA_DLLLA)) { + dev_warn(&pdev->dev, "Link state not enabled after DPC event\n"); + ret = false; + } + + return ret; +} + +/** + * dpc_error_resume - enumerate the devices beneath + * @dev: pointer to Root Port's pci_dev data structure + * + * Invoked by Port Bus driver during nonfatal recovery. 
+ */ +static void dpc_error_resume(struct pci_dev *pdev) +{ + if (dpc_wait_link_active(pdev)) { + pci_lock_rescan_remove(); + pci_rescan_bus(pdev->bus); + pci_unlock_rescan_remove(); + } +} + /** * dpc_reset_link - reset link DPC routine * @dev: pointer to Root Port's pci_dev data structure @@ -419,6 +456,7 @@ static void dpc_remove(struct pcie_device *dev) .service= PCIE_PORT_SERVICE_DPC, .probe = dpc_probe, .remove = dpc_remove, + .error_resume = dpc_error_resume, .reset_link = dpc_reset_link, }; diff --git a/drivers/pci/pcie/pcie-err.c b/drivers/pci/pcie/pcie-err.c index 1f01e76..9c4377c 100644 --- a/drivers/pci/pcie/pcie-err.c +++ b/drivers/pci/pcie/pcie-err.c @@ -231,7 +231,8 @@ pci_ers_result_t pci_reset_link(struct pci_dev *dev, int severity) pci_ers_result_t pci_broadcast_error_message(struct pci_dev *dev, enum pci_channel_state state, char *error_mesg, - int (*cb)(struct pci_dev *, void *)) + int (*cb)(struct pci_dev *, void *), + int severity) { struct pci_err_broadcast_data result_data; @@ -243,6 +244,15 @@ pci_ers_result_t pci_broadcast_error_message(struct pci_dev *dev, result_data.result = PCI_ERS_RESULT_RECOVERED; if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) { + /* If DPC is triggered, call resume error hanlder +* because, at this point we can safely assume that +* link recovery has happened. 
+*/ + if ((severity == PCI_ERR_DPC_FATAL) && + (cb == pci_report_resume)) { + cb(dev, NULL); + return PCI_ERS_RESULT_RECOVERED; + } /* * If the error is reported by a bridge, we think this error * is related to the downstream link of the bridge, so we @@ -328,7 +338,8 @@ void pci_do_recovery(struct pci_dev *dev, int severity) status = pci_broadcast_error_message(dev, state, "error_detected", - pci_report_error_detected); + pci_report_error_detected, + severity); if ((severity == PCI_ERR_AER_FATAL) || (severity == PCI_ERR_DPC_FATAL)) { @@ -337,11 +348,15 @@ void pci_do_recovery(struct pci_dev *dev, int severity) goto failed; } + if (severity == PCI_ERR_DPC_FATAL) + goto resume; + if (status == PCI_ERS_RESULT_CAN_RECOVER) status = pci_broadcast_error_message(dev, state, "mmio_enabled", - pci_report_mmio_enabled); + pci_report_mmio_enabled, + severity); if (status == PCI_ERS_RESULT_NEED_RESET) { /* @@ -352,16 +367,19 @@ void pci_do_recovery(struct pci_dev *dev, int severity) status = pci_broadcast_error_message(dev, state, "slot_reset", - pci_report_slot_reset); + pci_report_slot_reset, + severity); } if (status != PCI_ERS_RESULT_RECOVERED) goto failed; +resume: pci_broadcast_error_message(dev, state, "resume", - pci_repo
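The link-active wait added by this patch is a bounded poll: read the link status, and keep re-reading until the Data Link Layer Link Active bit is set or the budget runs out. A user-space sketch follows, assuming a poll count in place of the jiffies deadline and a callback in place of pcie_capability_read_word(); all toy_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit value as PCI_EXP_LNKSTA_DLLLA in the Link Status register. */
#define TOY_LNKSTA_DLLLA 0x2000u

/* Hypothetical register reader standing in for a config-space read. */
typedef uint16_t (*toy_read_lnksta_fn)(void *ctx);

/* Polls until the link reports active or max_polls re-reads were made.
 * Returns true if the link came up within the budget. */
bool toy_wait_link_active(toy_read_lnksta_fn read_lnksta, void *ctx,
			  unsigned int max_polls)
{
	uint16_t status = read_lnksta(ctx);
	unsigned int polls;

	for (polls = 0;
	     polls < max_polls && !(status & TOY_LNKSTA_DLLLA);
	     polls++) {
		/* kernel code would sleep here, e.g. msleep(10) */
		status = read_lnksta(ctx);
	}
	return (status & TOY_LNKSTA_DLLLA) != 0;
}

/* Fake link that reports active after a given number of reads. */
struct toy_link {
	unsigned int reads_until_up;
};

uint16_t toy_link_read(void *ctx)
{
	struct toy_link *link = ctx;

	if (link->reads_until_up == 0)
		return TOY_LNKSTA_DLLLA;
	link->reads_until_up--;
	return 0;
}
```

Only when the wait succeeds would the caller go on to rescan the bus, which matches the guard in dpc_error_resume() above.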
[PATCH 2/4] PCI/DPC/AER: Address Concurrency between AER and DPC
This patch addresses the race condition between AER and DPC for recovery. Current DPC driver does not do recovery, e.g. calling end-point's driver's callbacks, which sanitize the device. DPC driver implements link_reset callback, and calls pci_do_recovery. Signed-off-by: Oza Pawandeep diff --git a/drivers/pci/pcie/pcie-dpc.c b/drivers/pci/pcie/pcie-dpc.c index 2d976a6..e7ced58 100644 --- a/drivers/pci/pcie/pcie-dpc.c +++ b/drivers/pci/pcie/pcie-dpc.c @@ -15,6 +15,9 @@ #include #include #include "../pci.h" +#include "portdrv.h" + +static pci_ers_result_t dpc_reset_link(struct pci_dev *pdev); struct rp_pio_header_log_regs { u32 dw0; @@ -67,6 +70,60 @@ struct dpc_dev { "Memory Request Completion Timeout", /* Bit Position 18 */ }; +static int find_dpc_dev_iter(struct device *device, void *data) +{ + struct pcie_port_service_driver *service_driver; + struct device **dev; + + dev = (struct device **) data; + + if (device->bus == &pcie_port_bus_type && device->driver) { + service_driver = to_service_driver(device->driver); + if (service_driver->service == PCIE_PORT_SERVICE_DPC) { + *dev = device; + return 1; + } + } + + return 0; +} + +struct device *pci_find_dpc_dev(struct pci_dev *pdev) +{ + struct device *dev = NULL; + + device_for_each_child(&pdev->dev, &dev, find_dpc_dev_iter); + + return dev; +} + +static int find_dpc_service_iter(struct device *device, void *data) +{ + struct pcie_port_service_driver *service_driver, **drv; + + drv = (struct pcie_port_service_driver **) data; + + if (device->bus == &pcie_port_bus_type && device->driver) { + service_driver = to_service_driver(device->driver); + if (service_driver->service == PCIE_PORT_SERVICE_DPC) { + *drv = service_driver; + return 1; + } + } + + return 0; +} + +struct pcie_port_service_driver *pci_find_dpc_service(struct pci_dev *dev) +{ + struct pcie_port_service_driver *drv = NULL; + + device_for_each_child(&dev->dev, &drv, find_dpc_service_iter); + + return drv; +} +EXPORT_SYMBOL(pci_find_dpc_service); + static 
int dpc_wait_rp_inactive(struct dpc_dev *dpc) { unsigned long timeout = jiffies + HZ; @@ -104,11 +161,23 @@ static void dpc_wait_link_inactive(struct dpc_dev *dpc) dev_warn(dev, "Link state not disabled for DPC event\n"); } -static void interrupt_event_handler(struct work_struct *work) +/** + * dpc_reset_link - reset link DPC routine + * @dev: pointer to Root Port's pci_dev data structure + * + * Invoked by Port Bus driver when performing link reset at Root Port. + */ +static pci_ers_result_t dpc_reset_link(struct pci_dev *pdev) { - struct dpc_dev *dpc = container_of(work, struct dpc_dev, work); - struct pci_dev *dev, *temp, *pdev = dpc->dev->port; struct pci_bus *parent = pdev->subordinate; + struct pci_dev *dev, *temp; + struct dpc_dev *dpc; + struct pcie_device *pciedev; + struct device *devdpc; + + devdpc = pci_find_dpc_dev(pdev); + pciedev = to_pcie_device(devdpc); + dpc = get_service_data(pciedev); pci_lock_rescan_remove(); list_for_each_entry_safe_reverse(dev, temp, &parent->devices, @@ -125,7 +194,7 @@ static void interrupt_event_handler(struct work_struct *work) dpc_wait_link_inactive(dpc); if (dpc->rp && dpc_wait_rp_inactive(dpc)) - return; + return PCI_ERS_RESULT_DISCONNECT; if (dpc->rp && dpc->rp_pio_status) { pci_write_config_dword(pdev, dpc->cap_pos + PCI_EXP_DPC_RP_PIO_STATUS, @@ -135,6 +204,17 @@ static void interrupt_event_handler(struct work_struct *work) pci_write_config_word(pdev, dpc->cap_pos + PCI_EXP_DPC_STATUS, PCI_EXP_DPC_STATUS_TRIGGER | PCI_EXP_DPC_STATUS_INTERRUPT); + + return PCI_ERS_RESULT_RECOVERED; +} + +static void interrupt_event_handler(struct work_struct *work) +{ + struct dpc_dev *dpc = container_of(work, struct dpc_dev, work); + struct pci_dev *pdev = dpc->dev->port; + + /* From DPC point of view error is always FATAL. 
*/ + pci_do_recovery(pdev, PCI_ERR_DPC_FATAL); } static void dpc_rp_pio_print_tlp_header(struct device *dev, @@ -339,6 +419,7 @@ static void dpc_remove(struct pcie_device *dev) .service= PCIE_PORT_SERVICE_DPC, .probe = dpc_probe, .remove = dpc_remove, + .reset_link = dpc_reset_link, }; static int __init dpc_service_init(void) diff --git a/drivers/pci/pcie/pcie-err.c b/drivers/pci/pcie/pcie-err.c index d59866c..8bac584 100644 --- a/drivers/pci/pcie/pcie-err.c +++ b/drivers/pci/pcie/pcie-err.c @@ -176,7 +176,7 @@ static pci_ers_result_t pci_defaul
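For readers following the two find_dpc_*_iter helpers in this patch: they rely on the device_for_each_child() contract, where the walk stops as soon as the callback returns non-zero and the match is handed back through the void *data cookie. A minimal userspace sketch of that contract follows. The names struct node, for_each_child, find_dpc and the SERVICE_* values are hypothetical stand-ins, not the kernel types.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct device and the port service id. */
enum service { SERVICE_AER = 1, SERVICE_DPC = 2 };

struct node {
	enum service service;
	struct node *next;	/* next sibling */
};

/* Walk the children; stop at the first callback that returns non-zero. */
static int for_each_child(struct node *first, void *data,
			  int (*fn)(struct node *, void *))
{
	struct node *n;
	int ret = 0;

	for (n = first; n && !ret; n = n->next)
		ret = fn(n, data);
	return ret;
}

/* Same shape as find_dpc_dev_iter(): store the match, return 1 to stop. */
static int find_dpc_iter(struct node *n, void *data)
{
	struct node **out = data;

	if (n->service == SERVICE_DPC) {
		*out = n;
		return 1;
	}
	return 0;
}

/* Returns NULL when no child provides the DPC service. */
static struct node *find_dpc(struct node *first)
{
	struct node *found = NULL;

	for_each_child(first, &found, find_dpc_iter);
	return found;
}
```

In this sketch the lookup returns NULL when nothing matches, so a caller would need to handle that case before using the result.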
Re: [PATCH] backlight: otm3225a: add support for ORISE OTM3225A LCD SoC
Hello Daniel, On 22.12.2017 18:33, Daniel Thompson wrote: > On 22/12/17 17:23, Jingoo Han wrote:>> diff --git > a/drivers/video/backlight/otm3225a.c >>> b/drivers/video/backlight/otm3225a.c >>> new file mode 100644 >>> index 000..0de75f8 >>> --- /dev/null >>> +++ b/drivers/video/backlight/otm3225a.c >>> @@ -0,0 +1,210 @@ >>> +/* >>> + * Driver for ORISE Technology OTM3225A SOC for TFT LCD >>> + * >>> + * Copyright (C) 2014-2017, EETS GmbH, Felix Brack >> >> Please change the year of copyright as below. > >> + * Copyright (C) 2017, EETS GmbH, Felix Brack > > ... and include (or just rely entirely on) a SPDX header to describe the > licensing of the file. > Thanks for the hint. I have opted for this first line in the source file: "// SPDX-License-Identifier: GPL-2.0" > > Daniel. kind regards, Felix
Re: [PATCH v5 05/78] xarray: Replace exceptional entries
On Tue, Dec 26, 2017 at 07:05:34PM -0800, Matthew Wilcox wrote:
> On Tue, Dec 26, 2017 at 08:15:42PM +0300, Kirill A. Shutemov wrote:
> > > 28 files changed, 249 insertions(+), 240 deletions(-)
> >
> > Everything looks fine to me after quick scan, but that's a lot of changes for
> > one patch...
>
> Yeah. It's pretty mechanical though.
>
> > > - if (radix_tree_exceptional_entry(page)) {
> > > + if (xa_is_value(page)) {
> > > if (!invalidate_exceptional_entry2(mapping,
> > > index, page))
> > > ret = -EBUSY;
> >
> > invalidate_exceptional_entry? Are we going to leave the terminology here as
> > is?
>
> That is a great question. If the page cache wants to call its value
> entries exceptional entries, it can continue to do that. I think there's
> a better name for them, but I'm not sure what it is. Right now, the
> page cache uses value entries to store:
>
> 1. Shadow entries (for workingset)
> 2. Swap entries (for shmem)
> 3. DAX entries
>
> I can't come up with a good name for these three things. 'nonpage' is
> the only thing which hasn't immediately fallen off my ideas list.

Yeah, naming problem...

> But I think renaming exceptional entries in the page cache is a great idea,
> and I don't want to do it as part of this patch set ;-)

Fair enough.

--
Kirill A. Shutemov
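For context on the term being discussed: a "value entry" is an integer stored directly in a slot that would otherwise hold a pointer, told apart by a tag bit. Below is a minimal sketch of low-bit tagging. The helper names mk_value, is_value and to_value are hypothetical, and the encoding is an assumption for illustration; it is not necessarily the exact encoding this patch set uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Encode an integer as a tagged slot value: shift left, set bit 0. */
static inline void *mk_value(uintptr_t v)
{
	return (void *)((v << 1) | 1);
}

/* Pointers to aligned objects have bit 0 clear, so bit 0 marks a value. */
static inline bool is_value(const void *entry)
{
	return ((uintptr_t)entry & 1) != 0;
}

/* Recover the integer; only meaningful when is_value(entry) is true. */
static inline uintptr_t to_value(const void *entry)
{
	return (uintptr_t)entry >> 1;
}
```

Under this scheme shadow, swap and DAX entries would all be "values" to the data structure, which is why one common name for the three is hard to find.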
Re: [PATCH v5 06/78] xarray: Change definition of sibling entries
On Tue, Dec 26, 2017 at 07:13:26PM -0800, Matthew Wilcox wrote: > On Tue, Dec 26, 2017 at 08:21:53PM +0300, Kirill A. Shutemov wrote: > > > +/** > > > + * xa_is_internal() - Is the entry an internal entry? > > > + * @entry: Entry retrieved from the XArray > > > + * > > > + * Return: %true if the entry is an internal entry. > > > + */ > > > > What does it mean "internal entry"? Is it just a term for non-value and > > non-data pointer entry? Do we allow anybody besides xarray implementation to > > use internal entires? > > > > Do we have it documented? > > We do! include/linux/radix-tree.h has it documented right now: Looks good. Thanks. -- Kirill A. Shutemov
Re: [PATCH V1 3/4] usb: serial: f81534: add output pin control
On Thu, Dec 21, 2017 at 05:49:45PM +0800, Ji-Ze Hong (Peter Hong) wrote: > Hi Johan, > > Johan Hovold 於 2017/12/19 上午 12:06 寫道: > > On Thu, Nov 16, 2017 at 03:46:08PM +0800, Ji-Ze Hong (Peter Hong) wrote: > >> +static int f81534_set_port_output_pin(struct usb_serial_port *port) > >> +{ > >> + struct f81534_serial_private *serial_priv; > >> + struct f81534_port_private *port_priv; > >> + struct usb_serial *serial; > >> + const struct f81534_port_out_pin *pins; > >> + int status; > >> + int i; > >> + u8 value; > >> + u8 idx; > >> + > >> + serial = port->serial; > >> + serial_priv = usb_get_serial_data(serial); > >> + port_priv = usb_get_serial_port_data(port); > >> + > >> + idx = F81534_CONF_GPIO_OFFSET + port_priv->phy_num; > >> + value = serial_priv->conf_data[idx]; > >> + pins = &f81534_port_out_pins[port_priv->phy_num]; > >> + > >> + for (i = 0; i < ARRAY_SIZE(pins->pin); ++i) { > >> + status = f81534_set_mask_register(serial, > >> + pins->pin[i].reg_addr, pins->pin[i].reg_mask, > >> + value & BIT(i) ? pins->pin[i].reg_mask : 0); > >> + if (status) > >> + return status; > >> + } > > > > You're using 24 (get or set) accesses to update these three registers > > here. Why not read them out (if necessary), determine their new values > > and then write them back when done instead? > > > > In this code, I'm only read/write 3 registers of 0x2ae8, 0x2a90, 0x2a80, > but some register will read/write more than once. Should I change the > code from port_probe() to attach() and re-write it as: > 1: read the 3 register > 2: change them will 12 pin desire value > 3: write it back > Is it ok? Do you expect these pins to ever be changed after probe? If not, then perhaps it can be moved to attach(), but otherwise I guess they should be set at port_probe(). By using shadow registers, you should be able to reduce the number of device accesses, but perhaps it's not worth the complexity. Do you have a rough idea about how long these register updates take? 
I was just worried that these changes will add up to really long probe times. Thanks, Johan
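The shadow-register idea mentioned above could look roughly like this: read each register once, apply every pin update to an in-memory copy, then write each register back once. Everything in this sketch is a hypothetical userspace model (struct fake_dev, struct pin, NUM_REGS, set_pins_shadowed); it does not use the driver's real register addresses or helpers.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 3

/* Hypothetical device model that counts bus accesses. */
struct fake_dev {
	uint8_t regs[NUM_REGS];
	unsigned int accesses;
};

struct pin {
	uint8_t reg;	/* index into regs[] */
	uint8_t mask;	/* bit(s) controlling this pin */
};

static uint8_t dev_read(struct fake_dev *d, uint8_t reg)
{
	d->accesses++;
	return d->regs[reg];
}

static void dev_write(struct fake_dev *d, uint8_t reg, uint8_t val)
{
	d->accesses++;
	d->regs[reg] = val;
}

/*
 * Apply 'value' (bit i drives pins[i]) using shadow copies:
 * NUM_REGS reads plus NUM_REGS writes, however many pins there are.
 */
static void set_pins_shadowed(struct fake_dev *d, const struct pin *pins,
			      size_t npins, unsigned int value)
{
	uint8_t shadow[NUM_REGS];
	size_t i;

	for (i = 0; i < NUM_REGS; i++)
		shadow[i] = dev_read(d, (uint8_t)i);

	for (i = 0; i < npins; i++) {
		if (value & (1u << i))
			shadow[pins[i].reg] |= pins[i].mask;
		else
			shadow[pins[i].reg] &= (uint8_t)~pins[i].mask;
	}

	for (i = 0; i < NUM_REGS; i++)
		dev_write(d, (uint8_t)i, shadow[i]);
}
```

Whether the extra bookkeeping is worth it depends on how long each device access takes, which is the open question in this thread.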
Re: [PATCH 4.14 108/159] kvm, mm: account kvm related kmem slabs to kmemcg
On 23/12/2017 10:24, Greg Kroah-Hartman wrote: > For many subsystems, the maintainers _never_ mark patches for stable. > Others, they catch maybe half of the things they should be applying. > > KVM is one such example of the "half" group, they mark patches as > resolving CVE issues at times, yet don't mark them for stable. So when > I see a patch like this, it triggers the "oh, look, KVM doing the same > thing again", so I take the patch and of course cc: the > developers/maintainers so they can object if they want to. In general there are some cases where I tend to be conservative on applying the "stable" tag, for example: 1) sometimes I'm not very familiar with API changes in the other subsystems (this was the case for this patch). If I am not sure of the amount of backporting effort required, and the bug is not super important, I don't mark it as stable because I don't want to later drop a complex backport on the floor. I prefer to have fewer patches applied, but know that the fixes are backported to all branches. 2) not all bugs are equal; a WARN_ON_ONCE from a syzkaller testcase for example doesn't really matter to a cloud provider that uses KVM, because invalid API usage is not controlled by the customer. But an oops or BUG_ON probably *will* get CCed to stable. So some patches for syzkaller bugs may be CCed, some may not. IIRC the CVE that you mention was a guest user->kernel escalation, but it didn't affect Linux guests at all, and it couldn't be fixed completely on Windows guests because Windows has another bug in the same area. Plus, I knew there would be different conflicts on all LTS branches, so I decided to not mark it for stable. I did dutifully provide a backport when someone (either you or Ben Hutchings) asked for one, though. It does happen that Radim or I forget to Cc stable, so I'm okay with you picking up more patches than what I mark and I will happily do the backports for you. Still, there is some thought put into whether to CC stable or not. 
:) Thanks, Paolo > Over time you get to know what subsystems are like this and what are > not. MM is one that is really good, I almost never take a mm patch > without being told explicitly to do so. Others are horrible and never > mark anything, so stuff has to be picked up manually through Sasha's > process or through other ways. > > So it's not a perfect system, but it seems to work "good enough", and if > you ever have any questions about any patch, always feel free to ask, > there's usually a story behind almost every one...
Re: [PATCH] spi: Add a sysfs interface to instantiate devices
On Sat, Dec 23, 2017 at 09:58:51AM +0100, Geert Uytterhoeven wrote: > >> > > + struct spi_board_info bi = { > >> > > + .modalias = "spidev", > I would make it a little bit more generic and extract the modalias from the > string written. Right, that'd be much better.
Re: [PATCH 0/3] mtd: spi-nor: fix DMA-unsafe buffer issue between MTD and SPI
On Tue, Dec 26, 2017 at 06:45:28PM +, Trent Piepho wrote: > Or, since this only fixes instances of DMA-unsafe buffers used in > access to SPI NOR flash chips, and since there are other SPI master > interface users, those chip specific fixes in some/all spi master > drivers are still needed to fix transfers not originated via spi-nor? SPI client drivers are *supposed* to use DMA safe memory already. How often that happens in cases where it matters is a separate question, we definitely have users with smaller transfers that don't do the right thing but they're normally done using PIO anyway.
Re: [Ocfs2-devel] [PATCH] ocfs2: try a blocking lock before return AOP_TRUNCATED_PAGE
Hi Jun, >>> > Hi Gang, > > Do you mean that too many retrys in loop cast losts of CPU-time and > block page-fault interrupt? We should not add any delay in > ocfs2_fault(), right? And I still feel a little confused why your > method can solve this problem. You can see the related code in function filemap_fault(), if ocfs2 fails to read a page since it can not get a inode lock with non-block mode, the VFS layer code will invoke ocfs2 read page call back function circularly, this will lead to a softlockup problem (like the below back trace). So, we should get a blocking lock to let the dlm lock to this node and also can avoid CPU loop, second, base on my testing, the patch also can improve the efficiency in case modifying the same file frequently from multiple nodes, since the lock acquisition chance is more fair. In fact, the code was modified by a patch 1cce4df04f37 ("ocfs2: do not lock/unlock() inode DLM lock"), before that patch, the code is the same, this patch can be considered to revert that patch, except adding more clear comments. Thanks Gang > > thanks, > Jun > > On 2017/12/27 17:29, Gang He wrote: >> If we can't get inode lock immediately in the function >> ocfs2_inode_lock_with_page() when reading a page, we should not >> return directly here, since this will lead to a softlockup problem. >> The method is to get a blocking lock and immediately unlock before >> returning, this can avoid CPU resource waste due to lots of retries, >> and benefits fairness in getting lock among multiple nodes, increase >> efficiency in case modifying the same file frequently from multiple >> nodes. >> The softlockup problem looks like, >> Kernel panic - not syncing: softlockup: hung tasks >> CPU: 0 PID: 885 Comm: multi_mmap Tainted: G L 4.12.14-6.1-default #1 >> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 >> Call Trace: >> >> dump_stack+0x5c/0x82 >> panic+0xd5/0x21e >> watchdog_timer_fn+0x208/0x210 >> ? 
watchdog_park_threads+0x70/0x70 >> __hrtimer_run_queues+0xcc/0x200 >> hrtimer_interrupt+0xa6/0x1f0 >> smp_apic_timer_interrupt+0x34/0x50 >> apic_timer_interrupt+0x96/0xa0 >> >> RIP: 0010:unlock_page+0x17/0x30 >> RSP: :af154080bc88 EFLAGS: 0246 ORIG_RAX: ff10 >> RAX: dead0100 RBX: f21e009f5300 RCX: 0004 >> RDX: dead00ff RSI: 0202 RDI: f21e009f5300 >> RBP: R08: R09: af154080bb00 >> R10: af154080bc30 R11: 0040 R12: 993749a39518 >> R13: R14: f21e009f5300 R15: f21e009f5300 >> ocfs2_inode_lock_with_page+0x25/0x30 [ocfs2] >> ocfs2_readpage+0x41/0x2d0 [ocfs2] >> ? pagecache_get_page+0x30/0x200 >> filemap_fault+0x12b/0x5c0 >> ? recalc_sigpending+0x17/0x50 >> ? __set_task_blocked+0x28/0x70 >> ? __set_current_blocked+0x3d/0x60 >> ocfs2_fault+0x29/0xb0 [ocfs2] >> __do_fault+0x1a/0xa0 >> __handle_mm_fault+0xbe8/0x1090 >> handle_mm_fault+0xaa/0x1f0 >> __do_page_fault+0x235/0x4b0 >> trace_do_page_fault+0x3c/0x110 >> async_page_fault+0x28/0x30 >> RIP: 0033:0x7fa75ded638e >> RSP: 002b:7ffd6657db18 EFLAGS: 00010287 >> RAX: 55c7662fb700 RBX: 0001 RCX: 55c7662fb700 >> RDX: 1770 RSI: 7fa75e909000 RDI: 55c7662fb700 >> RBP: 0003 R08: 000e R09: >> R10: 0483 R11: 7fa75ded61b0 R12: 7fa75e90a770 >> R13: 000e R14: 1770 R15: >> >> Fixes: 1cce4df04f37 ("ocfs2: do not lock/unlock() inode DLM lock") >> Signed-off-by: Gang He >> --- >> fs/ocfs2/dlmglue.c | 9 + >> 1 file changed, 9 insertions(+) >> >> diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c >> index 4689940..5193218 100644 >> --- a/fs/ocfs2/dlmglue.c >> +++ b/fs/ocfs2/dlmglue.c >> @@ -2486,6 +2486,15 @@ int ocfs2_inode_lock_with_page(struct inode *inode, >> ret = ocfs2_inode_lock_full(inode, ret_bh, ex, OCFS2_LOCK_NONBLOCK); >> if (ret == -EAGAIN) { >> unlock_page(page); >> +/* >> + * If we can't get inode lock immediately, we should not return >> + * directly here, since this will lead to a softlockup problem. 
>> + * The method is to get a blocking lock and immediately unlock >> + * before returning, this can avoid CPU resource waste due to >> + * lots of retries, and benefits fairness in getting lock. >> + */ >> +if (ocfs2_inode_lock(inode, ret_bh, ex) == 0) >> +ocfs2_inode_unlock(inode, ex); >> ret = AOP_TRUNCATED_PAGE; >> } >> >>
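The control flow of the patched path can be sketched in userspace as follows: try the lock without blocking; on -EAGAIN, wait behind the current holder once, drop the lock again, and only then ask the caller to retry. The toy_lock model and the RETRY_PAGE code are hypothetical stand-ins for the DLM lock and AOP_TRUNCATED_PAGE; a real blocking acquire would sleep rather than flip a flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical return code telling the caller to retry the whole read. */
#define RETRY_PAGE 1

/* Toy lock model: 'contended' means another node currently holds it. */
struct toy_lock {
	bool contended;
	bool held;
	unsigned int blocking_waits;
};

static int toy_trylock(struct toy_lock *l)
{
	if (l->contended || l->held)
		return -EAGAIN;
	l->held = true;
	return 0;
}

/* Stand-in for a blocking acquire: "waits" until the other holder is done. */
static int toy_lock_blocking(struct toy_lock *l)
{
	l->blocking_waits++;
	l->contended = false;
	l->held = true;
	return 0;
}

static void toy_unlock(struct toy_lock *l)
{
	l->held = false;
}

/*
 * Try without blocking; on -EAGAIN, queue up behind the current holder
 * once, release the lock, then ask the caller to retry instead of
 * spinning on the non-blocking attempt.
 */
static int lock_for_read(struct toy_lock *l)
{
	int ret = toy_trylock(l);

	if (ret == -EAGAIN) {
		if (toy_lock_blocking(l) == 0)
			toy_unlock(l);
		ret = RETRY_PAGE;
	}
	return ret;
}
```

In this model the retry that follows a blocking wait is likely to succeed on its first non-blocking attempt, which is the behaviour the commit message relies on to avoid the retry loop.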
Applied "regmap: debugfs: document why we don't create the debugfs entries" to the regmap tree
The patch regmap: debugfs: document why we don't create the debugfs entries has been applied to the regmap tree at https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap.git All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark From 078711d7f88d33b0adebb402a1bcb2aa89afe68b Mon Sep 17 00:00:00 2001 From: Bartosz Golaszewski Date: Fri, 22 Dec 2017 18:42:08 +0100 Subject: [PATCH] regmap: debugfs: document why we don't create the debugfs entries This is a follow-up to commit a5ba91c380b8 ("regmap: debugfs: emit a debug message when locking is disabled"). I figured that a user may see this message, grep the code, come to this place and he still won't know why we actually disabled debugfs. Add a comment explaining the reason.
Signed-off-by: Bartosz Golaszewski Signed-off-by: Mark Brown --- drivers/base/regmap/regmap-debugfs.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/base/regmap/regmap-debugfs.c b/drivers/base/regmap/regmap-debugfs.c index ae962b756863..f3266334063e 100644 --- a/drivers/base/regmap/regmap-debugfs.c +++ b/drivers/base/regmap/regmap-debugfs.c @@ -529,6 +529,13 @@ void regmap_debugfs_init(struct regmap *map, const char *name) struct regmap_range_node *range_node; const char *devname = "dummy"; + /* +* Userspace can initiate reads from the hardware over debugfs. +* Normally internal regmap structures and buffers are protected with +* a mutex or a spinlock, but if the regmap owner decided to disable +* all locking mechanisms, this is no longer the case. For safety: +* don't create the debugfs entries if locking is disabled. +*/ if (map->debugfs_disable) { dev_dbg(map->dev, "regmap locking disabled - not creating debugfs entries\n"); return; -- 2.15.0
Applied "regmap: Add one flag to indicate if a hwlock should be used" to the regmap tree
The patch regmap: Add one flag to indicate if a hwlock should be used has been applied to the regmap tree at https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap.git All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark From a4887813c3a9481ab87c8a71ab1de50b975cc823 Mon Sep 17 00:00:00 2001 From: Baolin Wang Date: Mon, 25 Dec 2017 14:37:09 +0800 Subject: [PATCH] regmap: Add one flag to indicate if a hwlock should be used Since the hwlock id 0 is valid for hardware spinlock core, but now id 0 is treated as one invalid value for regmap. Thus we should add one extra flag for regmap config to indicate if a hardware spinlock should be used, then id 0 can be valid for regmap to request.
Signed-off-by: Baolin Wang Signed-off-by: Mark Brown --- drivers/base/regmap/regmap.c | 2 +- include/linux/regmap.h | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c index f25ab18ca057..d23a5c99b639 100644 --- a/drivers/base/regmap/regmap.c +++ b/drivers/base/regmap/regmap.c @@ -671,7 +671,7 @@ struct regmap *__regmap_init(struct device *dev, map->lock = config->lock; map->unlock = config->unlock; map->lock_arg = config->lock_arg; - } else if (config->hwlock_id) { + } else if (config->use_hwlock) { map->hwlock = hwspin_lock_request_specific(config->hwlock_id); if (!map->hwlock) { ret = -ENXIO; diff --git a/include/linux/regmap.h b/include/linux/regmap.h index 15eddc1353ba..c78e0057df66 100644 --- a/include/linux/regmap.h +++ b/include/linux/regmap.h @@ -317,6 +317,7 @@ typedef void (*regmap_unlock)(void *); * * @ranges: Array of configuration entries for virtual address ranges. * @num_ranges: Number of range configuration entries. + * @use_hwlock: Indicate if a hardware spinlock should be used. * @hwlock_id: Specify the hardware spinlock id. * @hwlock_mode: The hardware spinlock mode, should be HWLOCK_IRQSTATE, * HWLOCK_IRQ or 0. @@ -365,6 +366,7 @@ struct regmap_config { const struct regmap_range_cfg *ranges; unsigned int num_ranges; + bool use_hwlock; unsigned int hwlock_id; unsigned int hwlock_mode; }; -- 2.15.0
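The problem this patch addresses can be shown in a few lines: when 0 is a legal id, the id field cannot also mean "not requested". The sketch below uses a hypothetical, trimmed-down struct cfg and helper names rather than the real struct regmap_config.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down config with only the fields discussed. */
struct cfg {
	bool use_hwlock;
	unsigned int hwlock_id;	/* 0 is a valid id */
};

/* Old check: id 0 is indistinguishable from "no hwlock requested". */
static bool wants_hwlock_old(const struct cfg *c)
{
	return c->hwlock_id != 0;
}

/* New check: the explicit flag decides, so id 0 can be requested. */
static bool wants_hwlock_new(const struct cfg *c)
{
	return c->use_hwlock;
}
```

A config that sets use_hwlock with hwlock_id 0 is skipped by the old check and honoured by the new one.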
Re: [PATCH v6 00/11] Intel SGX Driver
On Dec 12, 3:07pm, Pavel Machek wrote: } Subject: Re: [PATCH v6 00/11] Intel SGX Driver Good morning, I hope this note finds the holiday season going well for everyone. This note is a bit delayed due to the holidays, my apologies. Pretty wide swath on this e-mail but will include the copy list due to the possible general interest and impact of these issues. We have done an independent implementation of Intel's platform software (PSW), directed at the use of SGX on intelligent network endpoint devices, so we have some experience with the issues under discussion. > On Sat 2017-11-25 21:29:17, Jarkko Sakkinen wrote: > > Intel(R) SGX is a set of CPU instructions that can be used by > > applications to set aside private regions of code and data. The > > code outside the enclave is disallowed to access the memory inside > > the enclave by the CPU access control. In a way you can think > > that SGX provides inverted sandbox. It protects the application > > from a malicious host. > Would you list guarantees provided by SGX? Obviously, confidentiality and integrity. SGX was designed to address an Iago threat model, a very difficult challenge to address in reality. On SGX capable platforms, the Memory Encryption Engine (MEE) is an integrated component of the hardware MMU, as SGX is a virtual memory play. As a result, the executable code and data are encrypted in main memory and only decrypted when the data is fed from memory onto the hardware fetch queues. Irregardless of anything else, this has implications with respect to cold boot attacks, if an architect chooses to worry about that threat modality. In reality, we believe the guarantee that is most important is integrity, given the issues below. > For example, host can still observe timing of cachelines being > accessed by "protected" app, right? Can it also introduce bit flips? Timing attacks are the bane of SGX, just as they are throughout the rest of the commodity architectures. 
Jarkko cited Beecham's work, which is a good reference. Oakland's work on controlled side-channel attacks is also a very good, and fundamental, read on the issues involved. Microsoft Research and Georgia Tech have a paper out discussing the use of transactional memory to mitigate these. I don't have the citation immediately available, but a bit-flip attack has also been described on enclaves. Due to the nature of the architecture, they tend to crash the enclave so they are more in the category of a denial-of-service attack, rather than a functional confidentiality or integrity compromise. At the end of the day, giving up complete observational and functional control to an adversary is a difficult challenge to address. There is also a large difference between attacks that can be conducted in a carefully controlled lab environment and what an adversary or malware can implement in practice. Platforms which require security assurances ultimately need a root of trust. That either comes from a TPM or a Trusted Execution Environment like SGX. Realistically, we think the future involves an integration of both technologies. The only other alternative is perfect software and I think the jury has already weighed in on that. The advantage of SGX over a TPM is that it is blindingly fast with respect to performance. The IMA community has been involved in a debate over the list digest patches in order to overcome performance issues with TPM based extension measurements. We lifted most of the IMA infrastructure into an SGX enclave and demonstrated significant performance impacts as a result. The bigger question, for community integration, is the availability of hardware. I see Jarkko's patches are based on the notion of having flexible launch control available, i.e. the ability to program the relevant MSRs with the checksum of the identity modulus which is to serve as the root of trust. I'm not sure there is any hardware in the wild that currently supports this, Jarkko comments?
Even with that, the question arises as to what is going to be trusted to program those registers. The obvious candidate for this is TXT/tboot which underscores a future involving the integration of these technologies. Unfortunately, in the security field it is way more fun, and seemingly advantageous from a reputational perspective, to break things than to build solutions :-)(

> Pavel

I hope the above clarifications are helpful. Best wishes for a pleasant holiday weekend to everyone.

Dr. Greg

}-- End of excerpt from Pavel Machek

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND 58102             development.
PH: 701-281-1686 FAX: 701-281-3949 EMAIL: g...@enjellic.com
--
"I suppose that could could happen but he wouldn't know a Galois Field if it kicked him in the nuts."
Re: [PATCH] USB: serial: ftdi_sio: add id for Airbus DS P8GR
On Wed, Dec 20, 2017 at 08:47:44PM +0100, Max Schulze wrote: > Add AIRBUS_DS_P8GR device IDs to ftdi_sio driver. > > Signed-off-by: Max Schulze Thanks for the patch. Note that I moved the new defines to try to keep (some of) the ids sorted on VID, and dropped the comment header in the id-table before applying. Johan
Re: [PATCH 10/11 v3] ARM: s3c24xx/s3c64xx: constify gpio_led
Hi,

On Wednesday 27 December 2017 01:49 PM, Krzysztof Kozlowski wrote:
> On Tue, Dec 26, 2017 at 7:50 PM, Arvind Yadav wrote:
>> gpio_led are not supposed to change at runtime.
>> struct gpio_led_platform_data working with const gpio_led
>> provided by . So mark the non-const structs as const.
>>
>> Signed-off-by: Arvind Yadav
>> ---
>> changes in v2:
>> The GPIO LED driver can be built as a module, it can be loaded
>> after the init sections have gone away. So removed '__initconst'.
>> changes in v3:
>> Description was missing.
>>
>> arch/arm/mach-s3c24xx/mach-h1940.c    | 2 +-
>> arch/arm/mach-s3c24xx/mach-rx1950.c   | 2 +-
>> arch/arm/mach-s3c64xx/mach-hmt.c      | 2 +-
>> arch/arm/mach-s3c64xx/mach-smartq5.c  | 2 +-
>> arch/arm/mach-s3c64xx/mach-smartq7.c  | 2 +-
>> arch/arm/mach-s3c64xx/mach-smdk6410.c | 2 +-
>> 6 files changed, 6 insertions(+), 6 deletions(-)
>
> There were few build errors reported by kbuild for your patches. Are
> you sure that you compiled every file you touch?
>
> Best regards,
> Krzysztof

Yes, I got a few build errors, which I have fixed, and I have sent an updated patch. I have now cross-checked it and it does not have any build failures.

Regards
arvind
[PATCH] PCI: imx6: Add PHY reference clock source support
i.MX7D variant of the IP can use either Crystal Oscillator input or internal clock input as a Reference Clock input for PCIe PHY. Add support for an optional property 'pcie-phy-refclk-internal'. If present then an internal clock input is used as PCIe PHY reference clock source. By default an external oscillator input is still used. Verified on Compulab SBC-iMX7 Single Board Computer. Signed-off-by: Ilya Ledvich --- Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt | 5 + drivers/pci/dwc/pci-imx6.c | 8 +++- 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt b/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt index 7b1e48b..f9cf11e 100644 --- a/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt +++ b/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt @@ -50,6 +50,11 @@ Additional required properties for imx7d-pcie: - "pciephy" - "apps" +Additional optional properties for imx7d-pcie: +- pcie-phy-refclk-internal: If present then an internal PLL input is used as + PCIe PHY reference clock source. By default an external ocsillator input + is used. + Example: pcie@0x0100 { diff --git a/drivers/pci/dwc/pci-imx6.c b/drivers/pci/dwc/pci-imx6.c index b734835..a616192 100644 --- a/drivers/pci/dwc/pci-imx6.c +++ b/drivers/pci/dwc/pci-imx6.c @@ -61,6 +61,7 @@ struct imx6_pcie { u32 tx_swing_low; int link_gen; struct regulator*vpcie; + boolpciephy_refclk_sel; }; /* Parameters for the waiting for PCIe PHY PLL to lock on i.MX7 */ @@ -474,7 +475,9 @@ static void imx6_pcie_init_phy(struct imx6_pcie *imx6_pcie) switch (imx6_pcie->variant) { case IMX7D: regmap_update_bits(imx6_pcie->iomuxc_gpr, IOMUXC_GPR12, - IMX7D_GPR12_PCIE_PHY_REFCLK_SEL, 0); + IMX7D_GPR12_PCIE_PHY_REFCLK_SEL, + imx6_pcie->pciephy_refclk_sel ? 
+ IMX7D_GPR12_PCIE_PHY_REFCLK_SEL : 0); break; case IMX6SX: regmap_update_bits(imx6_pcie->iomuxc_gpr, IOMUXC_GPR12, @@ -840,6 +843,9 @@ static int imx6_pcie_probe(struct platform_device *pdev) imx6_pcie->vpcie = NULL; } + imx6_pcie->pciephy_refclk_sel = + of_property_read_bool(node, "pcie-phy-refclk-internal"); + platform_set_drvdata(pdev, imx6_pcie); ret = imx6_add_pcie_port(imx6_pcie, pdev); -- 1.9.1
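The IMX7D hunk relies on the read-modify-write semantics of regmap_update_bits(): only the bits inside the mask change, and they take the value passed in. A small userspace sketch of that arithmetic follows. The bit position behind REFCLK_SEL_MASK is a made-up placeholder (the real mask comes from the SoC headers), and update_bits and apply_refclk_sel are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position; the real mask comes from the SoC headers. */
#define REFCLK_SEL_MASK (1u << 5)

/* Read-modify-write: only bits inside 'mask' change. */
static uint32_t update_bits(uint32_t old, uint32_t mask, uint32_t val)
{
	return (old & ~mask) | (val & mask);
}

/* Set the select bit for the internal clock, clear it for the external one. */
static uint32_t apply_refclk_sel(uint32_t gpr12, bool internal)
{
	return update_bits(gpr12, REFCLK_SEL_MASK,
			   internal ? REFCLK_SEL_MASK : 0);
}
```

With the property absent the value argument is 0, so the select bit is cleared exactly as before the patch and the external oscillator remains the default.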
[PATCH v2] MIPS: Use proper Return keyword
For reference: * https://www.kernel.org/doc/html/latest/doc-guide/kernel-doc.html#function-documentation Fix non-fatal warning: arch/mips/kernel/branch.c:418: warning: Excess function parameter 'returns' description in '__compute_return_epc_for_insn' Signed-off-by: Mathieu Malaterre --- v2: Actually use the correct keyword arch/mips/kernel/branch.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/mips/kernel/branch.c b/arch/mips/kernel/branch.c index b79ed9af9886..e48f6c0a9e4a 100644 --- a/arch/mips/kernel/branch.c +++ b/arch/mips/kernel/branch.c @@ -399,7 +399,7 @@ int __MIPS16e_compute_return_epc(struct pt_regs *regs) * * @regs: Pointer to pt_regs * @insn: branch instruction to decode - * @returns: -EFAULT on error and forces SIGILL, and on success + * Return: -EFAULT on error and forces SIGILL, and on success * returns 0 or BRANCH_LIKELY_TAKEN as appropriate after * evaluating the branch. * -- 2.11.0
Re: [PATCH 1/2] ARM: dts: imx6: RDU2: disable internal watchdog
Hi Andrey, On Wed, Dec 27, 2017 at 1:56 AM, Andrey Smirnov wrote: > The system has an external watchdog in the environment processor > so the internal watchdog is of no use. > > Cc: Sascha Hauer > Cc: Fabio Estevam > Cc: Rob Herring > Cc: Mark Rutland > Cc: linux-arm-ker...@lists.infradead.org > Cc: devicet...@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Cc: cphe...@gmail.com > Signed-off-by: Lucas Stach > Signed-off-by: Andrey Smirnov Patch looks good. Just not clear if the authorship comes from you or Lucas. If Lucas is the original author then his name should appear in the From field.
Re: [PATCH 2/2] ARM: dts: imx6: RDU2: correct RTC compatible
Hi Andrey, On Wed, Dec 27, 2017 at 1:56 AM, Andrey Smirnov wrote: > The RTC is manufactured by Maxim. This is a cosmetic fix, as Linux > doesn't match the vendor string for i2c devices. > > Cc: Sascha Hauer > Cc: Fabio Estevam > Cc: Rob Herring > Cc: Mark Rutland > Cc: linux-arm-ker...@lists.infradead.org > Cc: devicet...@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Cc: cphe...@gmail.com > Signed-off-by: Lucas Stach > Signed-off-by: Andrey Smirnov This patch seems to be from Lucas (https://patchwork.kernel.org/patch/10099397/), so his name should appear in the From field. Anyway, this patch was sent earlier and we suggested keeping the existing binding, which is the documented form: https://patchwork.kernel.org/patch/10099397/
Re: [PATCH v2] gpio: winbond: add driver
On Wed, Dec 27, 2017 at 1:24 AM, William Breathitt Gray wrote: > Here are the error messages printed on my system when I add a select > ISA_BUS_API line to the GPIO_WINBOND Kconfig option in the version 2 of > your patch: > > drivers/gpio/Kconfig:13:error: recursive dependency detected! > For a resolution refer to Documentation/kbuild/kconfig-language.txt > subsection "Kconfig recursive dependency limitations" > drivers/gpio/Kconfig:13:symbol GPIOLIB is selected by STX104 > For a resolution refer to Documentation/kbuild/kconfig-language.txt > subsection "Kconfig recursive dependency limitations" > drivers/iio/adc/Kconfig:659:symbol STX104 depends on ISA_BUS_API > For a resolution refer to Documentation/kbuild/kconfig-language.txt > subsection "Kconfig recursive dependency limitations" > arch/Kconfig:818: symbol ISA_BUS_API is selected by GPIO_WINBOND > For a resolution refer to Documentation/kbuild/kconfig-language.txt > subsection "Kconfig recursive dependency limitations" > drivers/gpio/Kconfig:701: symbol GPIO_WINBOND depends on GPIOLIB So STX104 depends on ISA_BUS_API which in turn is selected by GPIO_WINBOND which also depends on GPIOLIB. > The issue seems to relate to the select GPIOLIB line for the STX104 > Kconfig option (which has a ISA_BUS_API dependency). Switching GPIOLIB > to be a dependency, or alternatively selecting ISA_BUS_API, alleviates > the recursion. > > Linus, is my use of select GPIOLIB for the STX104 Kconfig option > appropriate in this context -- or should it instead be part of the > depends on line? The STX104 driver includes linux/gpio/driver.h and > makes use of the devm_gpiochip_add_data function to add support for some > minor auxililary GPIO lines on the STX104 device. In the STX104 case, it seems to be appropriate to select GPIOLIB, as it is a GPIO provider, not consumer. 
Usually I prefer that drivers just select what they need so I don't
have to run around in the whole kernel tree and turn things on to the
left and right before I can finally select my driver, but maybe that
is just me.

The other ISA GPIO drivers depend on ISA_BUS_API. I guess that, unlike
the GPIOLIB symbol, it cannot be universally selected, so shouldn't
this driver also just depend on ISA_BUS_API and have it selected from
the machine or wherever?

Yours,
Linus Walleij
[PATCH] ASoC: rt5514-spi: Check the validity of drvdata pointer on resume
The rt5514-spi driver seems to assume the validity of the drvdata
pointer on resume, but the pointer may not be populated, leading to a
not-so-nice crash.

This stems from the fact that rt5514_spi_pcm_probe() is never called
on my system (a kevin Chromebook). No idea why, but if it can happen,
it is worth fixing.

Fixes: e9c50aa6bd39 ("ASoC: rt5514-spi: check irq status to schedule data copy in resume function")
Signed-off-by: Marc Zyngier
---
 sound/soc/codecs/rt5514-spi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/codecs/rt5514-spi.c b/sound/soc/codecs/rt5514-spi.c
index 2df91db765ac..9255afcf2c3a 100644
--- a/sound/soc/codecs/rt5514-spi.c
+++ b/sound/soc/codecs/rt5514-spi.c
@@ -482,7 +482,7 @@ static int __maybe_unused rt5514_resume(struct device *dev)
 	if (device_may_wakeup(dev))
 		disable_irq_wake(irq);
 
-	if (rt5514_dsp->substream) {
+	if (rt5514_dsp && rt5514_dsp->substream) {
 		rt5514_spi_burst_read(RT5514_IRQ_CTRL, (u8 *)&buf, sizeof(buf));
 		if (buf[0] & RT5514_IRQ_STATUS_BIT)
 			rt5514_schedule_copy(rt5514_dsp);
-- 
2.14.2
Re: [RESEND PATCH] blackfin: defconfig: Cleanup from old Kconfig options
On Tue, Dec 26, 2017 at 2:30 PM, Krzysztof Kozlowski wrote: > Remove old, dead Kconfig options (in order appearing in this commit): > - EXPERIMENTAL is gone since v3.9; > - INET_LRO: commit 7bbf3cae65b6 ("ipv4: Remove inet_lro library"); > - USB_DEVICE_CLASS: commit 007bab91324e ("USB: remove >CONFIG_USB_DEVICE_CLASS"); > - HID_SUPPORT: commit 1f41a6a99476 ("HID: Fix the generic Kconfig >options"); > - NETDEV_1000 and NETDEV_1: commit f860b0522f65 ("drivers/net: >Kconfig and Makefile cleanup"); NET_ETHERNET should be replaced with >just ETHERNET but that is separate change; > - RCU_CPU_STALL_DETECTOR: commit a00e0d714fbd ("rcu: Remove conditional >compilation for RCU CPU stall warnings"); > - MISC_DEVICES: commit 7c5763b8453a ("drivers: misc: Remove >MISC_DEVICES config option"); > > Signed-off-by: Krzysztof Kozlowski Acked-by: Linus Walleij This architecture is becoming a burden :/ Last blackfin pull request was in June 2014 by Steven Miao. Steven, what's up with this? I don't mind the arch/blackfin as much as the plethora of boardfiles that need to be maintained as soon as we change some platform data or so, and it's just a mystery whether things really get tested and any of the changes made here since 2014 are screwing up the blackfin. No ACKs or Tested-by's ever appear. Is there active testing of mainline with blackfin? Yours, Linus Walleij
Re: [RESEND PATCH] blackfin: defconfig: Cleanup from old Kconfig options
On Wed, Dec 27, 2017 at 12:29:33PM +0100, Linus Walleij wrote: > On Tue, Dec 26, 2017 at 2:30 PM, Krzysztof Kozlowski wrote: > > > Remove old, dead Kconfig options (in order appearing in this commit): > > > > Signed-off-by: Krzysztof Kozlowski > > Acked-by: Linus Walleij > > This architecture is becoming a burden :/ > > Last blackfin pull request was in June 2014 by Steven Miao. April 2015, actually. Doesn't change the conclusion, though. > Steven, what's up with this? I don't mind the arch/blackfin as much > as the plethora of boardfiles that need to be maintained as soon as > we change some platform data or so, and it's just a mystery whether > things really get tested and any of the changes made here since 2014 > are screwing up the blackfin. No ACKs or Tested-by's ever appear. > > Is there active testing of mainline with blackfin? After multiple pings, I just (two days ago) sent the following: https://marc.info/?l=linux-kernel&m=151421639229901 So even if there's no actual testing, at the very least it'd be good to mark the architecture as orphaned, so people don't wait for ACKs that never come. Meow! -- // If you believe in so-called "intellectual property", please immediately // cease using counterfeit alphabets. Instead, contact the nearest temple // of Amon, whose priests will provide you with scribal services for all // your writing needs, for Reasonable And Non-Discriminatory prices.
RE: [PATCH 1/4] lockd: convert nlm_host.h_count from atomic_t to refcount_t
> On Fri, Dec 22, 2017 at 09:25:53AM -0500, J. Bruce Fields wrote: > > On Fri, Dec 22, 2017 at 09:29:15AM +, Reshetova, Elena wrote: > > > > > > On Wed, Nov 29, 2017 at 01:15:43PM +0200, Elena Reshetova wrote: > > > > atomic_t variables are currently used to implement reference > > > > counters with the following properties: > > > > - counter is initialized to 1 using atomic_set() > > > > - a resource is freed upon counter reaching zero > > > > - once counter reaches zero, its further > > > >increments aren't allowed > > > > - counter schema uses basic atomic operations > > > >(set, inc, inc_not_zero, dec_and_test, etc.) > > > > > > >Whoops, I forgot that this doesn't apply to h_count. > > > > > > >Well, it's confusing, because h_count is actually used in two different > > > >ways: depending on whether a nlm_host represents a client or server, it > > > >may have the above properties or not. > > > > > > > > > So, what happens when it is not having the above properties? Is the object > > > being reused or? > > > > The object isn't destroyed when the counter hits zero--zero is just > > taken as a hint to some garbage collection algorithm that it would be OK > > to destroy it. So decrementing to or incrementing from zero is OK. > > In more detail: the nlm_host objects that are used on the NFS server to > represent NFS clients are put by nlmsvc_release_host, and then may > eventually be freed by nlm_gc_hosts. > > The nlm_host objects that are used on the NFS client to represent NFS > servers are put (and freed when h_count goes to zero) by > nlmclnt_release_host. > > In both cases reference are taken by nlm_get_host. It would be possible > to replace nlm_get_host by two different functions if that would help. > Most callers are obviously only client-side or server-side. The only > exception is next_host_state. It could be passed a pointer to the "get" > function it should use. 
>
> After that we might actually just want to define separate client and
> server structs like:
>
> 	struct nlm_clnt_host {
> 		struct nlm_host ch_host;
> 		refcount_t ch_count;
> 		...
> 	}
>
> 	struct nlm_srv_host {
> 		struct nlm_host sh_host;
> 		refcount_t sh_count;
> 		...
> 	}
>
> rather than have a single h_count which is used in two confusingly
> different ways. There are also some other nlm_host fields that really
> only make sense for client or server.

This sounds reasonable to me, but obviously it is a bigger change, and
I might not know NFS well enough to do it correctly.

In any case, even for the current server case, where freeing might not
happen and the object gets reused later on, is it possible to simply
re-initialize the object (and its reference counter) properly before
reusing it? I think this is the only thing that correct refcounting
needs in this case: instead of calling refcount_inc() on a reused
object, you would explicitly do refcount_set(counter, 1) when the
reuse happens.

Best Regards,
Elena

>
> --b.
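To make the suggestion above concrete, here is a minimal userspace sketch of the "re-initialize on reuse" idea. It is not the lockd code: `struct ref`, `struct host`, `ref_set()`, `ref_inc_not_zero()`, `ref_dec_and_test()` and `host_reuse()` are hypothetical stand-ins built on C11 atomics, and the sketch assumes the reuse path runs under whatever lock protects the host lookup table.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's refcount_t. */
struct ref {
	atomic_uint count;
};

static void ref_set(struct ref *r, unsigned int n)
{
	atomic_store(&r->count, n);
}

/* Take a reference unless the counter has already dropped to zero. */
static bool ref_inc_not_zero(struct ref *r)
{
	unsigned int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when this was the last one. */
static bool ref_dec_and_test(struct ref *r)
{
	unsigned int old = atomic_fetch_sub(&r->count, 1);

	assert(old != 0);	/* dropping below zero is a bug */
	return old == 1;
}

/* Hypothetical server-side host: parked at zero instead of freed. */
struct host {
	struct ref h_count;
};

/*
 * Reuse path. Assumption: the caller holds the lock that protects the
 * lookup table, so no other thread can reach the object while its
 * counter is zero. The counter is set explicitly rather than
 * incremented from zero.
 */
static void host_reuse(struct host *h)
{
	ref_set(&h->h_count, 1);
}
```

With this shape, an increment from zero is never needed: a parked object stays at zero until `host_reuse()` hands it out again with a count of one.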
[PATCH] clk: qcom: Add support for controlling Fabia PLL
Fabia PLL is a Digital Frequency Locked Loop (DFLL) clock generator which has a wide range of frequency output. It supports dynamic updating of the output frequency ("frequency slewing") without need to turn off the PLL before configuration. Add support for initial configuration and programming sequence to control fabia PLLs. Signed-off-by: Amit Nischal --- drivers/clk/qcom/clk-alpha-pll.c | 305 +++ drivers/clk/qcom/clk-alpha-pll.h | 16 ++ 2 files changed, 321 insertions(+) diff --git a/drivers/clk/qcom/clk-alpha-pll.c b/drivers/clk/qcom/clk-alpha-pll.c index ad7478b..947607d 100644 --- a/drivers/clk/qcom/clk-alpha-pll.c +++ b/drivers/clk/qcom/clk-alpha-pll.c @@ -58,6 +58,8 @@ #define PLL_TEST_CTL(p)((p)->offset + (p)->regs[PLL_OFF_TEST_CTL]) #define PLL_TEST_CTL_U(p) ((p)->offset + (p)->regs[PLL_OFF_TEST_CTL_U]) #define PLL_STATUS(p) ((p)->offset + (p)->regs[PLL_OFF_STATUS]) +#define PLL_OPMODE(p) ((p)->offset + (p)->regs[PLL_OFF_OPMODE]) +#define PLL_FRAC(p)((p)->offset + (p)->regs[PLL_OFF_FRAC]) const u8 clk_alpha_pll_regs[][PLL_OFF_MAX_REGS] = { [CLK_ALPHA_PLL_TYPE_DEFAULT] = { @@ -90,6 +92,18 @@ [PLL_OFF_TEST_CTL] = 0x1c, [PLL_OFF_STATUS] = 0x24, }, + [CLK_ALPHA_PLL_TYPE_FABIA] = { + [PLL_OFF_L_VAL] = 0x04, + [PLL_OFF_USER_CTL] = 0x0c, + [PLL_OFF_USER_CTL_U] = 0x10, + [PLL_OFF_CONFIG_CTL] = 0x14, + [PLL_OFF_CONFIG_CTL_U] = 0x18, + [PLL_OFF_TEST_CTL] = 0x1c, + [PLL_OFF_TEST_CTL_U] = 0x20, + [PLL_OFF_STATUS] = 0x24, + [PLL_OFF_OPMODE] = 0x2c, + [PLL_OFF_FRAC] = 0x38, + }, }; /* @@ -107,6 +121,12 @@ #define PLL_HUAYRA_N_MASK 0xff #define PLL_HUAYRA_ALPHA_WIDTH 16 +#define FABIA_OPMODE_STANDBY 0x0 +#define FABIA_OPMODE_RUN 0x1 + +#define FABIA_PLL_OUT_MASK 0x7 +#define FABIA_PLL_RATE_MARGIN 500 + #define pll_alpha_width(p) \ ((PLL_ALPHA_VAL_U(p) - PLL_ALPHA_VAL(p) == 4) ? 
\ ALPHA_REG_BITWIDTH : ALPHA_REG_16BIT_WIDTH) @@ -819,3 +839,288 @@ static int clk_alpha_pll_postdiv_set_rate(struct clk_hw *hw, unsigned long rate, .recalc_rate = clk_alpha_pll_postdiv_recalc_rate, }; EXPORT_SYMBOL_GPL(clk_alpha_pll_postdiv_ro_ops); + +void clk_fabia_pll_configure(struct clk_alpha_pll *pll, struct regmap *regmap, +const struct alpha_pll_config *config) +{ + u32 val, mask; + + if (config->l) + regmap_write(regmap, PLL_L_VAL(pll), config->l); + + if (config->alpha) + regmap_write(regmap, PLL_FRAC(pll), config->alpha); + + if (config->config_ctl_val) + regmap_write(regmap, PLL_CONFIG_CTL(pll), + config->config_ctl_val); + + if (config->post_div_mask) { + mask = config->post_div_mask; + val = config->post_div_val; + regmap_update_bits(regmap, PLL_USER_CTL(pll), mask, val); + } + + regmap_update_bits(regmap, PLL_MODE(pll), PLL_UPDATE_BYPASS, + PLL_UPDATE_BYPASS); + + regmap_update_bits(regmap, PLL_MODE(pll), PLL_RESET_N, PLL_RESET_N); +} + +static int alpha_pll_fabia_enable(struct clk_hw *hw) +{ + int ret; + struct clk_alpha_pll *pll = to_clk_alpha_pll(hw); + u32 val, opmode_val; + + ret = regmap_read(pll->clkr.regmap, PLL_MODE(pll), &val); + if (ret) + return ret; + + /* If in FSM mode, just vote for it */ + if (val & PLL_VOTE_FSM_ENA) { + ret = clk_enable_regmap(hw); + if (ret) + return ret; + return wait_for_pll_enable_active(pll); + } + + /* Read opmode value */ + ret = regmap_read(pll->clkr.regmap, PLL_OPMODE(pll), &opmode_val); + if (ret) + return ret; + + /* Skip If PLL is already running */ + if ((opmode_val & FABIA_OPMODE_RUN) && (val & PLL_OUTCTRL)) + return 0; + + /* Disable PLL output */ + ret = regmap_update_bits(pll->clkr.regmap, PLL_MODE(pll), + PLL_OUTCTRL, 0); + if (ret) + return ret; + + /* Set Operation mode to STANBY */ + ret = regmap_write(pll->clkr.regmap, PLL_OPMODE(pll), + FABIA_OPMODE_STANDBY); + if (ret) + return ret; + + /* PLL should be in STANDBY mode before continuing */ + mb(); + + /* Bring PLL out of reset */ + ret = 
regmap_update_bits(pll->clkr.regmap, PLL_MODE(pll), + PLL_RESET_N, PLL_RES
[PATCH IMPROVEMENT/BUGFIX 1/1] block, bfq: limit tags for writes and async I/O
Asynchronous I/O can easily starve synchronous I/O (both sync reads
and sync writes), by consuming all request tags. Similarly, storms of
synchronous writes, such as those that sync(2) may trigger, can starve
synchronous reads. In their turn, these two problems may also cause
BFQ to lose control on latency for interactive and soft real-time
applications. For example, on a PLEXTOR PX-256M5S SSD, LibreOffice
Writer takes 0.6 seconds to start if the device is idle, but it takes
more than 45 seconds (!) if there are sequential writes in the
background.

This commit addresses this issue by limiting the maximum percentage of
tags that asynchronous I/O requests and synchronous write requests can
consume. In particular, this commit grants a higher threshold to
synchronous writes, to prevent the latter from being starved by
asynchronous I/O.

According to the above test, LibreOffice Writer now starts in about
1.2 seconds on average, regardless of the background workload, and
apart from some rare outlier. To check this improvement, run, e.g.,
sudo ./comm_startup_lat.sh bfq 5 5 seq 10 "lowriter --terminate_after_init"
for the comm_startup_lat benchmark in the S suite [1].

[1] https://github.com/Algodev-github/S

Signed-off-by: Paolo Valente
---
 block/bfq-iosched.c | 77 +
 block/bfq-iosched.h | 12 +
 2 files changed, 89 insertions(+)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index e33c5c4c9856..6f75015d18c0 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -417,6 +417,82 @@ static struct request *bfq_choose_req(struct bfq_data *bfqd,
 	}
 }
 
+/*
+ * See the comments on bfq_limit_depth for the purpose of
+ * the depths set in the function.
+ */
+static void bfq_update_depths(struct bfq_data *bfqd, struct sbitmap_queue *bt)
+{
+	bfqd->sb_shift = bt->sb.shift;
+
+	/*
+	 * In-word depths if no bfq_queue is being weight-raised:
+	 * leaving 25% of tags only for sync reads.
+	 *
+	 * In next formulas, right-shift the value
+	 * (1U<<bfqd->sb_shift), instead of computing directly
+	 * (1U<<(bfqd->sb_shift - something)), to be robust against
+	 * any possible value of bfqd->sb_shift, without having to
+	 * limit 'something'.
+	 */
+	/* no more than 50% of tags for async I/O */
+	bfqd->word_depths[0][0] = max((1U<<bfqd->sb_shift)>>1, 1U);
+	/*
+	 * no more than 75% of tags for sync writes (25% extra tags
+	 * w.r.t. async I/O, to prevent async I/O from starving sync
+	 * writes)
+	 */
+	bfqd->word_depths[0][1] = max(((1U<<bfqd->sb_shift) * 3)>>2, 1U);
+
+	/*
+	 * In-word depths in case some bfq_queue is being weight-
+	 * raised: leaving ~63% of tags for sync reads. This is the
+	 * highest percentage for which, in our tests, application
+	 * start-up times didn't suffer from any regression due to tag
+	 * shortage.
+	 */
+	/* no more than ~18% of tags for async I/O */
+	bfqd->word_depths[1][0] = max(((1U<<bfqd->sb_shift) * 3)>>4, 1U);
+	/* no more than ~37% of tags for sync writes (~20% extra tags) */
+	bfqd->word_depths[1][1] = max(((1U<<bfqd->sb_shift) * 6)>>4, 1U);
+}
+
+/*
+ * Async I/O can easily starve sync I/O (both sync reads and sync
+ * writes), by consuming all tags. Similarly, storms of sync writes,
+ * such as those that sync(2) may trigger, can starve sync reads.
+ * Limit depths of async I/O and sync writes so as to counter both
+ * problems.
+ */
+static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
+{
+	struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
+	struct bfq_data *bfqd = data->q->elevator->elevator_data;
+	struct sbitmap_queue *bt;
+
+	if (op_is_sync(op) && !op_is_write(op))
+		return;
+
+	if (data->flags & BLK_MQ_REQ_RESERVED) {
+		if (unlikely(!tags->nr_reserved_tags)) {
+			WARN_ON_ONCE(1);
+			return;
+		}
+		bt = &tags->breserved_tags;
+	} else
+		bt = &tags->bitmap_tags;
+
+	if (unlikely(bfqd->sb_shift != bt->sb.shift))
+		bfq_update_depths(bfqd, bt);
+
+	data->shallow_depth =
+		bfqd->word_depths[!!bfqd->wr_busy_queues][op_is_sync(op)];
+
+	bfq_log(bfqd, "[%s] wr_busy %d sync %d depth %u",
+			__func__, bfqd->wr_busy_queues, op_is_sync(op),
+			data->shallow_depth);
+}
+
 static struct bfq_queue *
 bfq_rq_pos_tree_lookup(struct bfq_data *bfqd, struct rb_root *root,
 		     sector_t sector, struct rb_node **ret_parent,
@@ -5265,6 +5341,7 @@ static struct elv_fs_entry bfq_attrs[] = {
 static struct elevator_type iosched_bfq_mq = {
 	.ops.mq = {
+		.limit_depth		= bfq_limit_depth,
 		.prepare_request	= bfq_prepare_request,
 		.finish_
[PATCH IMPROVEMENT/BUGFIX 0/1] block, bfq: address starvation caused by tag consumption
Hi Jens, all, here's the patch I anticipated in my last email. It addresses (serious) starvation problems caused by request-tag exhaustion, as explained in more detail in the commit message. I started from the solution in the function kyber_limit_depth, but then I had to define more articulate limits, to counter starvation also in cases not covered in kyber_limit_depth. If this solution proves to be effective, I'm willing to port it somehow to the other schedulers. Thanks, Paolo Paolo Valente (1): block, bfq: limit tags for writes and async I/O block/bfq-iosched.c | 77 + block/bfq-iosched.h | 12 + 2 files changed, 89 insertions(+) -- 2.15.1
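For readers who want to check the tag percentages quoted in the patch (50% and 75% without weight-raising, roughly 18% and 37% with it), the arithmetic can be sketched in plain C. This is only an illustration: `max_u()` and `word_depths()` are hypothetical helpers, and the multipliers 3 and 6 are assumptions chosen to match the percentages stated in the patch comments. The sketch also shows why shifting right after computing the word size stays well defined for very small shifts.

```c
#include <assert.h>

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/*
 * Fill d[wr][sync] for a bitmap word holding (1 << shift) tags.
 *   wr   = 1 when some queue is being weight-raised, else 0
 *   sync = 0 for async I/O, 1 for sync writes
 * Each depth is clamped to at least 1 so that I/O can always proceed.
 */
static void word_depths(unsigned int shift, unsigned int d[2][2])
{
	unsigned int word;

	assert(shift < 26);	/* keeps word * 6 within 32 bits */
	word = 1U << shift;

	d[0][0] = max_u(word >> 1, 1U);		/* 50% for async I/O */
	d[0][1] = max_u((word * 3) >> 2, 1U);	/* 75% for sync writes */
	d[1][0] = max_u((word * 3) >> 4, 1U);	/* 3/16, about 18% */
	d[1][1] = max_u((word * 6) >> 4, 1U);	/* 6/16, about 37% */
}
```

For a 64-tag word (shift 6) this gives depths of 32, 48, 12 and 24. For a single-tag word (shift 0) every product shifts down to 0 and is clamped to 1, whereas an expression like 1U << (shift - 1) would not be defined there.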
Re: [PATCH] clk: mediatek: remove unnecessary include header from reset.c
Hi Sean, Stephen, On Wed, 27 Dec 2017 11:33:00 +0800, Sean Wang wrote: > On Tue, 2017-12-26 at 17:10 -0800, Stephen Boyd wrote: > > drivers/clk/mediatek/reset.c:64:6: warning: symbol > > 'mtk_register_reset_controller' was not declared. Should it be static? > > It cannot be static since the function would be referenced in other > files under the same folder > > > One point I felt confused which is I didn't see the warning complains > when I did these build test, even I also added -Werror and -Wall to > build all files under driver/clk/mediatek. My toolchain is based on gcc > version 5.2.0 (GCC). I tested and I get the warning here (gcc 4.8.5 on SUSE) but only after setting CONFIG_RESET_CONTROLLER=y. Without it, drivers/clk/mediatek/reset.o is never built, so no warning can be generated. > If the warning still is, the include "clk-mtk.h" should be good to stay > there because the declaration it needs is in the clk-mtk.h Agreed. -- Jean Delvare SUSE L3 Support
Re: [PATCH v2] x86/kexec: Exclude GART aperture from vmcore
On Wed, Dec 27, 2017 at 03:44:49PM +0800, Baoquan He wrote:
> > yes, instead of crashing the machine (because GART may be initialized in the
> > 2nd kernel, overlapping the 1st kernel memory, which the 2nd kernel with its
> > fake e820 map sees as unused).
> >
> > I'd say this is an improvement.
>
> I don't get what you said. If 'iommu=off' only specified in 1st kernel,
> kdump kernel will think the memory which GART bar pointed as a hole.
> This is incorrect. I don't see the improvement.

So he says, this memory is unused. Why is that incorrect?!? Why do I
care about dumping unused memory?!?!

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--
[PATCH 0/2] Add efuse driver for Ingenic JZ4780 SoC
This patchset brings support for read-only access to the JZ4780 efuse
as found on MIPS Creator CI20.

To keep the driver as simple as possible, it was not possible to
re-use most of the nvmem core functionalities. This driver is not
compatible with the original efuse driver as found in the custom linux
kernel from upstream (1); in particular, it exposes neither
`/sys/devices/platform/*/chip_id` nor `/sys/devices/platform/*/user_id`
to users.

The goal of this driver is to provide access to the MAC address to the
dm9000 driver.

(1) https://github.com/ZubairLK/CI20_linux/commit/6efd4ffca7dcfaff0794ab60cd6922ce96c60419

Mathieu Malaterre (1):
  dts: Probe efuse for CI20

PrasannaKumar Muralidharan (1):
  nvmem: add driver for JZ4780 efuse

 .../ABI/testing/sysfs-driver-jz4780-efuse          |  16 ++
 .../bindings/nvmem/ingenic,jz4780-efuse.txt        |  17 ++
 MAINTAINERS                                        |   5 +
 arch/mips/boot/dts/ingenic/jz4780.dtsi             |  40 ++-
 arch/mips/configs/ci20_defconfig                   |   2 +
 drivers/nvmem/Kconfig                              |  10 +
 drivers/nvmem/Makefile                             |   2 +
 drivers/nvmem/jz4780-efuse.c                       | 274 +
 8 files changed, 354 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-jz4780-efuse
 create mode 100644 Documentation/devicetree/bindings/nvmem/ingenic,jz4780-efuse.txt
 create mode 100644 drivers/nvmem/jz4780-efuse.c

-- 
2.11.0
[PATCH 1/2] nvmem: add driver for JZ4780 efuse
From: PrasannaKumar Muralidharan

This patch brings support for the JZ4780 efuse. Currently it only
exposes read-only access to the entire 8K-bit efuse memory.

Tested-by: Mathieu Malaterre
Signed-off-by: PrasannaKumar Muralidharan
---
 .../ABI/testing/sysfs-driver-jz4780-efuse          |  16 ++
 .../bindings/nvmem/ingenic,jz4780-efuse.txt        |  17 ++
 MAINTAINERS                                        |   5 +
 arch/mips/boot/dts/ingenic/jz4780.dtsi             |  40 ++-
 drivers/nvmem/Kconfig                              |  10 +
 drivers/nvmem/Makefile                             |   2 +
 drivers/nvmem/jz4780-efuse.c                       | 274 +
 7 files changed, 352 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-jz4780-efuse
 create mode 100644 Documentation/devicetree/bindings/nvmem/ingenic,jz4780-efuse.txt
 create mode 100644 drivers/nvmem/jz4780-efuse.c

diff --git a/Documentation/ABI/testing/sysfs-driver-jz4780-efuse b/Documentation/ABI/testing/sysfs-driver-jz4780-efuse
new file mode 100644
index ..bb6f5d6ceea0
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-jz4780-efuse
@@ -0,0 +1,16 @@
+What:		/sys/devices/*//nvmem
+Date:		December 2017
+Contact:	PrasannaKumar Muralidharan
+Description:	read-only access to the efuse on the Ingenic JZ4780 SoC
+		The SoC has a one time programmable 8K efuse that is
+		split into segments. The driver supports read only.
+		The segments are
+		0x000   64 bit Random Number
+		0x008  128 bit Ingenic Chip ID
+		0x018  128 bit Customer ID
+		0x028 3520 bit Reserved
+		0x1E0    8 bit Protect Segment
+		0x1E1 2296 bit HDMI Key
+		0x300 2048 bit Security boot key
+Users:		any user space application which wants to read the Chip
+		and Customer ID
diff --git a/Documentation/devicetree/bindings/nvmem/ingenic,jz4780-efuse.txt b/Documentation/devicetree/bindings/nvmem/ingenic,jz4780-efuse.txt
new file mode 100644
index ..cd6d67ec22fc
--- /dev/null
+++ b/Documentation/devicetree/bindings/nvmem/ingenic,jz4780-efuse.txt
@@ -0,0 +1,17 @@
+Ingenic JZ EFUSE driver bindings
+
+Required properties:
+- "compatible"		Must be set to "ingenic,jz4780-efuse"
+- "reg"			Register location and length
+- "clocks"		Handle for the ahb clock for the efuse.
+- "clock-names"		Must be "bus_clk"
+
+Example:
+
+efuse: efuse@134100d0 {
+	compatible = "ingenic,jz4780-efuse";
+	reg = <0x134100D0 0xFF>;
+
+	clocks = <&cgu JZ4780_CLK_AHB2>;
+	clock-names = "bus_clk";
+};
diff --git a/MAINTAINERS b/MAINTAINERS
index a6e86e20761e..7a050c20c533 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6902,6 +6902,11 @@ M:	Zubair Lutfullah Kakakhel
 S:	Maintained
 F:	drivers/dma/dma-jz4780.c
 
+INGENIC JZ4780 EFUSE Driver
+M:	PrasannaKumar Muralidharan
+S:	Maintained
+F:	drivers/nvmem/jz4780-efuse.c
+
 INGENIC JZ4780 NAND DRIVER
 M:	Harvey Hunt
 L:	linux-...@lists.infradead.org
diff --git a/arch/mips/boot/dts/ingenic/jz4780.dtsi b/arch/mips/boot/dts/ingenic/jz4780.dtsi
index 9b5794667aee..3fb9d916a2ea 100644
--- a/arch/mips/boot/dts/ingenic/jz4780.dtsi
+++ b/arch/mips/boot/dts/ingenic/jz4780.dtsi
@@ -224,21 +224,37 @@
 		reg = <0x10002000 0x100>;
 	};
 
-	nemc: nemc@1341 {
-		compatible = "ingenic,jz4780-nemc";
-		reg = <0x1341 0x1>;
-		#address-cells = <2>;
+
+	ahb2: ahb2 {
+		compatible = "simple-bus";
+		#address-cells = <1>;
 		#size-cells = <1>;
-		ranges = <1 0 0x1b00 0x100
-			  2 0 0x1a00 0x100
-			  3 0 0x1900 0x100
-			  4 0 0x1800 0x100
-			  5 0 0x1700 0x100
-			  6 0 0x1600 0x100>;
+		ranges = <>;
+
+		nemc: nemc@1341 {
+			compatible = "ingenic,jz4780-nemc";
+			reg = <0x1341 0x1>;
+			#address-cells = <2>;
+			#size-cells = <1>;
+			ranges = <1 0 0x1b00 0x100
+				  2 0 0x1a00 0x100
+				  3 0 0x1900 0x100
+				  4 0 0x1800 0x100
+				  5 0 0x1700 0x100
+				  6 0 0x1600 0x100>;
+
+			clocks = <&cgu JZ4780_CLK_NEMC>;
+
+			status = "disabled";
+		};
 
-		clocks = <&cgu JZ4780_CLK_NEMC>;
+		efuse: efuse@134100d0 {
+			compatible
[PATCH 2/2] dts: Probe efuse for CI20
Signed-off-by: Mathieu Malaterre --- arch/mips/configs/ci20_defconfig | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/mips/configs/ci20_defconfig b/arch/mips/configs/ci20_defconfig index b5f4ad8f2c45..62c63617e97a 100644 --- a/arch/mips/configs/ci20_defconfig +++ b/arch/mips/configs/ci20_defconfig @@ -171,3 +171,5 @@ CONFIG_STACKTRACE=y # CONFIG_FTRACE is not set CONFIG_CMDLINE_BOOL=y CONFIG_CMDLINE="earlycon console=ttyS4,115200 clk_ignore_unused" +CONFIG_NVMEM=y +CONFIG_JZ4780_EFUSE=y -- 2.11.0
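As a cross-check of the segment list in the ABI file of patch 1/2, the layout can be written as a table and verified mechanically. The sketch below is only an illustration, not driver code: `struct efuse_segment`, `segments[]` and `efuse_check_layout()` are hypothetical names, and the fifth row is read as offset 0x1E0 with a width of 8 bits, which is an assumption that makes the offsets line up.

```c
#include <stddef.h>

/* One row of the documented layout; offsets are in bytes. */
struct efuse_segment {
	unsigned int offset;
	unsigned int bits;
	const char *name;
};

static const struct efuse_segment segments[] = {
	{ 0x000,   64, "Random Number" },
	{ 0x008,  128, "Ingenic Chip ID" },
	{ 0x018,  128, "Customer ID" },
	{ 0x028, 3520, "Reserved" },
	{ 0x1E0,    8, "Protect Segment" },	/* assumed reading of the row */
	{ 0x1E1, 2296, "HDMI Key" },
	{ 0x300, 2048, "Security boot key" },
};

#define NUM_SEGMENTS (sizeof(segments) / sizeof(segments[0]))

/*
 * Return the total number of bits if every segment is byte-sized and
 * starts exactly where the previous one ends, or 0 otherwise.
 */
static unsigned int efuse_check_layout(void)
{
	unsigned int next = 0, total = 0;
	size_t i;

	for (i = 0; i < NUM_SEGMENTS; i++) {
		if (segments[i].offset != next || segments[i].bits % 8 != 0)
			return 0;
		next += segments[i].bits / 8;
		total += segments[i].bits;
	}
	return total;
}
```

Under that reading the segments are contiguous and add up to 8192 bits, which agrees with the "8K efuse" wording in the description.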
[RFC PATCH] memory-hotplug: add sysfs immovable_mem attribute
Sometimes users specify the memory regions of immovable nodes on the
kernel command line, for example with "kernel_core" or with the
"immovable_mem=" option from the patchset that I have sent. But users
may not know which memory regions these are. So add this interface to
print them.

It will show like this: "nn@ss,nn@ss,...". "nn" means the size of a
memory region, "ss" means the start position of this region.

Signed-off-by: Chao Fan
---
 drivers/base/memory.c | 50 ++
 1 file changed, 50 insertions(+)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 1d60b58a8c19..9cadf1a9dccb 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -25,6 +25,7 @@
 
 #include
 #include
+#include
 
 static DEFINE_MUTEX(mem_sysfs_mutex);
 
@@ -389,6 +390,52 @@ static ssize_t show_phys_device(struct device *dev,
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
+/*
+ * Immovable memory region
+ */
+
+static ssize_t
+show_immovable_mem(struct device *dev, struct device_attribute *attr,
+		   char *buf)
+{
+	struct acpi_table_header *table_header = NULL;
+	struct acpi_srat_mem_affinity *ma;
+	struct acpi_subtable_header *th;
+	unsigned long long table_size;
+	unsigned long long table_end;
+	char pbuf[35], *p = buf;
+	int len;
+
+	acpi_get_table(ACPI_SIG_SRAT, 0, &table_header);
+
+	table_size = sizeof(struct acpi_table_srat);
+	table_end = (unsigned long)table_header + table_header->length;
+	th = (struct acpi_subtable_header *)((unsigned long)
+	      table_header + table_size);
+
+	while (((unsigned long)th) +
+	       sizeof(struct acpi_subtable_header) < table_end) {
+		if (th->type == 1) {
+			ma = (struct acpi_srat_mem_affinity *)th;
+			if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE)
+				continue;
+			len = sprintf(pbuf, "%llx@%llx",
+				      ma->length, ma->base_address);
+			if (p != buf) {
+				*p = ',';
+				p++;
+			}
+			memcpy(p, pbuf, len);
+			p = p + len;
+		}
+		th = (struct acpi_subtable_header *)((unsigned long)
+		      th + th->length);
+	}
+	return sprintf(buf, "%s\n", buf);
+}
+
+static DEVICE_ATTR(immovable_mem, 0444, show_immovable_mem, NULL);
+
 static void print_allowed_zone(char *buf, int nid, unsigned long start_pfn,
 		unsigned long nr_pages, int online_type,
 		struct zone *default_zone)
@@ -798,6 +845,9 @@ static struct attribute *memory_root_attrs[] = {
 #endif
 
 	&dev_attr_block_size_bytes.attr,
+#ifdef CONFIG_MEMORY_HOTREMOVE
+	&dev_attr_immovable_mem.attr,
+#endif
 	&dev_attr_auto_online_blocks.attr,
 	NULL
 };
-- 
2.14.3
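The "nn@ss,nn@ss,..." output format described above can be sketched in userspace C as follows. This is not the patch's code: `struct mem_region` and `format_immovable()` are hypothetical, and the regions come from a plain array instead of the ACPI SRAT. Two points the sketch tries to show are that skipped (hot-pluggable) entries must still advance the iteration, which the `for` loop guarantees, and that the output is built with bounded writes rather than by formatting a buffer into itself.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified view of one memory affinity entry. */
struct mem_region {
	unsigned long long base;
	unsigned long long length;
	int hot_pluggable;
};

/*
 * Write "len@base,len@base,...\n" (hex) for every region that is not
 * hot-pluggable. Returns the number of bytes written, not counting
 * the terminating NUL, or -1 if buf is too small.
 */
static int format_immovable(const struct mem_region *r, size_t n,
			    char *buf, size_t size)
{
	size_t used = 0, i;
	int first = 1;

	for (i = 0; i < n; i++) {
		int len;

		if (r[i].hot_pluggable)
			continue;	/* the for loop still advances i */

		len = snprintf(buf + used, size - used, "%s%llx@%llx",
			       first ? "" : ",", r[i].length, r[i].base);
		if (len < 0 || (size_t)len >= size - used)
			return -1;	/* output would be truncated */
		used += (size_t)len;
		first = 0;
	}

	if (size - used < 2)
		return -1;		/* no room for "\n" and NUL */
	buf[used++] = '\n';
	buf[used] = '\0';
	return (int)used;
}
```

In a sysfs show callback the buffer size would be PAGE_SIZE; here the caller passes it explicitly.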
Re: [RFC PATCH] memory-hotplug: add sysfs immovable_mem attribute
The result in my virtual machine looks like this: [root@localhost ~]# cat /sys/devices/system/memory/immovable_mem a@0,1f30@10,1f40@1f40,1f40@3e80,1f40@5dc0,1f40@7d00,1f40@9c40,480@bb80,1ac0@1,1f40@11ac0,1f40@13a00,1f40@15940 Any comments will be welcome. Thanks, Chao Fan On Wed, Dec 27, 2017 at 08:30:12PM +0800, Chao Fan wrote: >In sometimes users specify the memory region in immovable node in >some kernel commandline, such as "kernel_core" or the "immovable_mem=" >in the patchset that I have send. But users don't know the memory >region. So add this interface to print it. > >It will show like this: "nn@ss,nn@ss,...". "nn" means the size of memory >region, "ss" means the start position of this region. > >Signed-off-by: Chao Fan >--- > drivers/base/memory.c | 50 ++ > 1 file changed, 50 insertions(+) > >diff --git a/drivers/base/memory.c b/drivers/base/memory.c >index 1d60b58a8c19..9cadf1a9dccb 100644 >--- a/drivers/base/memory.c >+++ b/drivers/base/memory.c >@@ -25,6 +25,7 @@ > > #include > #include >+#include > > static DEFINE_MUTEX(mem_sysfs_mutex); > >@@ -389,6 +390,52 @@ static ssize_t show_phys_device(struct device *dev, > } > > #ifdef CONFIG_MEMORY_HOTREMOVE >+/* >+ * Immovable memory region >+ */ >+ >+static ssize_t >+show_immovable_mem(struct device *dev, struct device_attribute *attr, >+ char *buf) >+{ >+ struct acpi_table_header *table_header = NULL; >+ struct acpi_srat_mem_affinity *ma; >+ struct acpi_subtable_header *th; >+ unsigned long long table_size; >+ unsigned long long table_end; >+ char pbuf[35], *p = buf; >+ int len; >+ >+ acpi_get_table(ACPI_SIG_SRAT, 0, &table_header); >+ >+ table_size = sizeof(struct acpi_table_srat); >+ table_end = (unsigned long)table_header + table_header->length; >+ th = (struct acpi_subtable_header *)((unsigned long) >+table_header + table_size); >+ >+ while (((unsigned long)th) + >+ sizeof(struct acpi_subtable_header) < table_end) { >+ if (th->type == 1) { >+ ma = (struct acpi_srat_mem_affinity *)th; >+ if (ma->flags & 
ACPI_SRAT_MEM_HOT_PLUGGABLE) >+ continue; >+ len = sprintf(pbuf, "%llx@%llx", >+ ma->length, ma->base_address); >+ if (p != buf) { >+ *p = ','; >+ p++; >+ } >+ memcpy(p, pbuf, len); >+ p = p + len; >+ } >+ th = (struct acpi_subtable_header *)((unsigned long) >+th + th->length); >+ } >+ return sprintf(buf, "%s\n", buf); >+} >+ >+static DEVICE_ATTR(immovable_mem, 0444, show_immovable_mem, NULL); >+ > static void print_allowed_zone(char *buf, int nid, unsigned long start_pfn, > unsigned long nr_pages, int online_type, > struct zone *default_zone) >@@ -798,6 +845,9 @@ static struct attribute *memory_root_attrs[] = { > #endif > > &dev_attr_block_size_bytes.attr, >+#ifdef CONFIG_MEMORY_HOTREMOVE >+ &dev_attr_immovable_mem.attr, >+#endif > &dev_attr_auto_online_blocks.attr, > NULL > }; >-- >2.14.3 >
Re: [Ocfs2-devel] [PATCH] ocfs2: try a blocking lock before return AOP_TRUNCATED_PAGE
Hi Gang, I like your fix. It looks good to me. On 2017/12/27 17:30, Gang He wrote: > If we can't get inode lock immediately in the function > ocfs2_inode_lock_with_page() when reading a page, we should not > return directly here, since this will lead to a softlockup problem. > The method is to get a blocking lock and immediately unlock before > returning, this can avoid CPU resource waste due to lots of retries, > and benefits fairness in getting lock among multiple nodes, increase > efficiency in case modifying the same file frequently from multiple > nodes. > The softlockup problem looks like, > Kernel panic - not syncing: softlockup: hung tasks > CPU: 0 PID: 885 Comm: multi_mmap Tainted: G L 4.12.14-6.1-default #1 > Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > Call Trace: > >dump_stack+0x5c/0x82 >panic+0xd5/0x21e >watchdog_timer_fn+0x208/0x210 >? watchdog_park_threads+0x70/0x70 >__hrtimer_run_queues+0xcc/0x200 >hrtimer_interrupt+0xa6/0x1f0 >smp_apic_timer_interrupt+0x34/0x50 >apic_timer_interrupt+0x96/0xa0 > > RIP: 0010:unlock_page+0x17/0x30 > RSP: :af154080bc88 EFLAGS: 0246 ORIG_RAX: ff10 > RAX: dead0100 RBX: f21e009f5300 RCX: 0004 > RDX: dead00ff RSI: 0202 RDI: f21e009f5300 > RBP: R08: R09: af154080bb00 > R10: af154080bc30 R11: 0040 R12: 993749a39518 > R13: R14: f21e009f5300 R15: f21e009f5300 >ocfs2_inode_lock_with_page+0x25/0x30 [ocfs2] >ocfs2_readpage+0x41/0x2d0 [ocfs2] >? pagecache_get_page+0x30/0x200 >filemap_fault+0x12b/0x5c0 >? recalc_sigpending+0x17/0x50 >? __set_task_blocked+0x28/0x70 >? 
__set_current_blocked+0x3d/0x60 >ocfs2_fault+0x29/0xb0 [ocfs2] >__do_fault+0x1a/0xa0 >__handle_mm_fault+0xbe8/0x1090 >handle_mm_fault+0xaa/0x1f0 >__do_page_fault+0x235/0x4b0 >trace_do_page_fault+0x3c/0x110 >async_page_fault+0x28/0x30 > RIP: 0033:0x7fa75ded638e > RSP: 002b:7ffd6657db18 EFLAGS: 00010287 > RAX: 55c7662fb700 RBX: 0001 RCX: 55c7662fb700 > RDX: 1770 RSI: 7fa75e909000 RDI: 55c7662fb700 > RBP: 0003 R08: 000e R09: > R10: 0483 R11: 7fa75ded61b0 R12: 7fa75e90a770 > R13: 000e R14: 1770 R15: > > Fixes: 1cce4df04f37 ("ocfs2: do not lock/unlock() inode DLM lock") > Signed-off-by: Gang He Reviewed-by: Changwei Ge > --- > fs/ocfs2/dlmglue.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c > index 4689940..5193218 100644 > --- a/fs/ocfs2/dlmglue.c > +++ b/fs/ocfs2/dlmglue.c > @@ -2486,6 +2486,15 @@ int ocfs2_inode_lock_with_page(struct inode *inode, > ret = ocfs2_inode_lock_full(inode, ret_bh, ex, OCFS2_LOCK_NONBLOCK); > if (ret == -EAGAIN) { > unlock_page(page); > + /* > + * If we can't get inode lock immediately, we should not return > + * directly here, since this will lead to a softlockup problem. > + * The method is to get a blocking lock and immediately unlock > + * before returning, this can avoid CPU resource waste due to > + * lots of retries, and benefits fairness in getting lock. > + */ > + if (ocfs2_inode_lock(inode, ret_bh, ex) == 0) > + ocfs2_inode_unlock(inode, ex); > ret = AOP_TRUNCATED_PAGE; > } > >
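The pattern in the quoted hunk (try a non-blocking lock; on contention, take the blocking lock once, drop it immediately, and only then ask the caller to retry) can be sketched in userspace with a POSIX mutex. This is an analogy, not the ocfs2 code: `lock_or_wait_then_retry()` and `RETRY_PAGE` are hypothetical names, `RETRY_PAGE` stands in for AOP_TRUNCATED_PAGE, and the page unlock that the kernel code performs before blocking has no counterpart here.

```c
#include <errno.h>
#include <pthread.h>

#define RETRY_PAGE 1	/* hypothetical stand-in for AOP_TRUNCATED_PAGE */

/*
 * Returns 0 with the mutex held, RETRY_PAGE if the mutex was contended
 * (the caller should redo its work from the top), or a negative errno
 * value on unexpected errors.
 */
static int lock_or_wait_then_retry(pthread_mutex_t *m)
{
	int err = pthread_mutex_trylock(m);

	if (err == 0)
		return 0;		/* fast path: lock acquired */
	if (err != EBUSY)
		return -err;

	/*
	 * Contended. Instead of returning at once and spinning through
	 * retries, sleep until the current holder is done, then release
	 * the lock again straight away and let the caller retry.
	 */
	if (pthread_mutex_lock(m) == 0)
		pthread_mutex_unlock(m);
	return RETRY_PAGE;
}
```

The blocking acquire serves only as a wait: the retry that follows is much more likely to succeed on its first non-blocking attempt, so the caller does not burn CPU in a tight retry loop.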
Re: [PATCH] objtool: Fix clang enum conversion warning
On Tue, 26 Dec 2017, Nick Desaulniers wrote: I sent a similar one recently: https://patchwork.kernel.org/patch/10131815/ (maybe Josh is just forwarding me an earlier fix?) Reviewed-by: Nick Desaulniers I actually submitted this (other) patch to LKML on 2017-12-10: https://patchwork.kernel.org/patch/10103977/ I also pointed this out on the llvmlinux mailing list: https://lists.linuxfoundation.org/pipermail/llvmlinux/2017-December/001535.html (The mail might not have been distributed to its recipients yet, because I have only been on the llvmlinux mailing list for a few days, and I might not have been whitelisted to get through the spam filtering of that list.) Nick submitted another patch to LKML on 2017-12-24 (see above). The source code change is the same, but the commit message was different. Now the third patch from Josh here is another identical patch with yet another commit message, combining information from both patches. Assuming that the authorship of this one-line change does not matter, as it is largely suggested by the clang compiler anyway, and we want to move the change forward, we should decide which of the three patches to move forward. I can give my Reviewed-by and Tested-by to any of them. Lukas
[PATCH] sched: cgroup: export nr_running for each cpu cgroup
Exporting nr_running for each cpu cgroup helps us monitor containers conveniently. The total number of threads in a cpu cgroup can be obtained from the tasks file, or from the pids subsystem. But we still do not know how many processes are running in a container unless we traverse the status of all processes, which is a little expensive. Hence export nr_running. Signed-off-by: Yafang Shao --- kernel/sched/core.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 644fa2e..926575a 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6648,10 +6648,17 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v) { struct task_group *tg = css_tg(seq_css(sf)); struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth; + unsigned int nr_running = 0; + int cpu; + + for_each_online_cpu(cpu) + if (tg->cfs_rq[cpu]) + nr_running += tg->cfs_rq[cpu]->h_nr_running; seq_printf(sf, "nr_periods %d\n", cfs_b->nr_periods); seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled); seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time); + seq_printf(sf, "nr_running %u\n", nr_running); return 0; } -- 1.8.3.1
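The loop in the patch above can be modelled in plain userspace C. This is only a sketch: `toy_cfs_rq` and `toy_sum_nr_running()` are hypothetical names standing in for the scheduler's per-CPU runqueues, and an ordinary array index replaces for_each_online_cpu().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU cfs_rq; only the counter is modelled. */
struct toy_cfs_rq {
	unsigned int h_nr_running;
};

/*
 * Sum the hierarchical runnable count over all CPUs, skipping CPUs
 * that have no runqueue allocated (NULL entry), as the patch does.
 */
unsigned int toy_sum_nr_running(struct toy_cfs_rq *const *cfs_rq, size_t nr_cpus)
{
	unsigned int nr_running = 0;
	size_t cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++)
		if (cfs_rq[cpu])
			nr_running += cfs_rq[cpu]->h_nr_running;

	return nr_running;
}
```

The result is a snapshot: each per-CPU counter is read at a slightly different time, so the sum is approximate on a busy system, which is presumably fine for monitoring.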
Re: [SIL2review] [PATCH] fixdep: free memory on second error path of do_config_file
On Mon, 18 Dec 2017, Masahiro Yamada wrote: 2017-12-15 17:23 GMT+09:00 Nicholas Mc Guire : On Thu, Dec 14, 2017 at 08:54:10PM +0100, Lukas Bulwahn wrote: Commit dee81e988674 ("fixdep: faster CONFIG_ search") introduces the memory leak when `map = mmap(...)` was replaced with `map = malloc(...)` and `read(fd, map, ...)`. It introduces a new second error path, which does not free the allocated memory for `map`. We now correct that behavior and free `map` before the do_config_file() function returns. Facebook's static analysis tool Infer (http://fbinfer.com) found this memory leak: scripts/basic/fixdep.c:297: error: MEMORY_LEAK memory dynamically allocated by call to `malloc()` at line 290, \ column 8 is not reachable after line 297, column 3. Fixes: dee81e988674 ("fixdep: faster CONFIG_ search") Signed-off-by: Lukas Bulwahn Reviewed-by: Nicholas Mc Guire --- scripts/basic/fixdep.c | 1 + 1 file changed, 1 insertion(+) diff --git a/scripts/basic/fixdep.c b/scripts/basic/fixdep.c index bbf62cb..131c450 100644 --- a/scripts/basic/fixdep.c +++ b/scripts/basic/fixdep.c @@ -296,6 +296,7 @@ static void do_config_file(const char *filename) if (read(fd, map, st.st_size) != st.st_size) { perror("fixdep: read"); close(fd); + free(map); This looks reasonable but actually it is not clear why do_config_file() should return at all if read fails as the read error would go unnoticed in the current code and allow the build to continue. So this probably should be an exit(2) here and not a return, which would then take care of the free() anyway. I agree. This should be exit(2). You can also remove close() as well since the operating system will close it anyway. At least I do not see the rationale to allow continuation if the read failed as the file should not be empty nor a mismatch with st.st_size expected. If it were due to an EINTR then it still should terminate as EINTR was not handled and we thus could miss a valid dependency. 
Note: this probably also should be applied to the if (!map) condition before as well, as at that point it is known that map > 0 and a malloc() failure would allow skipping parse_config_file() for a valid config. I agree. We should change this too. I created a new patch with your requested changes, but I did not consider it a second version of this patch by the time of writing. You can find the new patch at: v1: https://patchwork.kernel.org/patch/10126547/ v2: https://patchwork.kernel.org/patch/10128297/ This patch here is now superseded by the new patch. So, this patch here can be abandoned. Lukas
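The error handling suggested in this thread (exit(2) on both the malloc() and the read() failure instead of returning) could look roughly like the sketch below. It is not the code from either patchwork link: `read_file_or_die()` is a hypothetical helper written for illustration, and the real do_config_file() parses the buffer rather than returning it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a whole file into a NUL-terminated buffer.
 * Every failure terminates the process with status 2, so a failed read
 * cannot be silently skipped and there is no error path left that
 * could leak the buffer.
 */
char *read_file_or_die(const char *filename, size_t *size_out)
{
	struct stat st;
	char *map;
	int fd;

	fd = open(filename, O_RDONLY);
	if (fd < 0) {
		perror(filename);
		exit(2);
	}
	if (fstat(fd, &st) < 0) {
		perror(filename);
		exit(2);
	}
	map = malloc(st.st_size + 1);
	if (!map) {
		perror("fixdep: malloc");
		exit(2);
	}
	/* A short read (including one caused by EINTR) is treated as fatal. */
	if (read(fd, map, st.st_size) != st.st_size) {
		perror("fixdep: read");
		exit(2);
	}
	map[st.st_size] = '\0';
	close(fd);

	*size_out = st.st_size;
	return map;
}
```

Because the process exits on failure, the explicit close() and free() on the error paths become unnecessary, which is the point made in the review.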
Re: [PATCH v2] x86/kexec: Exclude GART aperture from vmcore
On 12/27/17 at 01:25pm, Borislav Petkov wrote: > On Wed, Dec 27, 2017 at 03:44:49PM +0800, Baoquan He wrote: > > > yes, instead of crashing the machine (because GART may be initialized in > > > the > > > 2nd kernel, overlapping the 1st kernel memory, which the 2nd kernel with > > > its > > > fake e820 map sees as unused). > > > > > > I'd say this is an improvement. > > > > I don't get what you said. If 'iommu=off' only specified in 1st kernel, > > kdump kernel will think the memory which GART bar pointed as a hole. > > This is incorrect. I don't see the improvement. > > So he says, this memory is unused. Why is that incorrect?!? With 'iommu=off' specified in the 1st kernel, that region will be normal memory, and there could be important kernel data written there. Meanwhile, the kdump kernel will treat that region as a GART aperture, so trying to read data from that region will cause the error which Jiri originally tried to fix.
[PATCH 2/5] kasan: don't use __builtin_return_address(1)
__builtin_return_address(1) is unreliable without frame pointers. With defconfig on kmalloc_pagealloc_invalid_free test I am getting: BUG: KASAN: double-free or invalid-free in (null) Pass caller PC from callers explicitly. Signed-off-by: Dmitry Vyukov Cc: linux...@kvack.org Cc: linux-kernel@vger.kernel.org Cc: kasan-...@googlegroups.com --- include/linux/kasan.h | 9 + mm/kasan/kasan.c | 8 mm/kasan/kasan.h | 2 +- mm/kasan/report.c | 4 ++-- mm/slab.c | 6 +++--- mm/slub.c | 8 6 files changed, 19 insertions(+), 18 deletions(-) diff --git a/include/linux/kasan.h b/include/linux/kasan.h index fc9e642533f8..f0d13c30acc6 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -56,14 +56,14 @@ void kasan_poison_object_data(struct kmem_cache *cache, void *object); void kasan_init_slab_obj(struct kmem_cache *cache, const void *object); void kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags); -void kasan_kfree_large(void *ptr); +void kasan_kfree_large(void *ptr, unsigned long ip); void kasan_poison_kfree(void *ptr); void kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size, gfp_t flags); void kasan_krealloc(const void *object, size_t new_size, gfp_t flags); void kasan_slab_alloc(struct kmem_cache *s, void *object, gfp_t flags); -bool kasan_slab_free(struct kmem_cache *s, void *object); +bool kasan_slab_free(struct kmem_cache *s, void *object, unsigned long ip); struct kasan_cache { int alloc_meta_offset; @@ -108,7 +108,7 @@ static inline void kasan_init_slab_obj(struct kmem_cache *cache, const void *object) {} static inline void kasan_kmalloc_large(void *ptr, size_t size, gfp_t flags) {} -static inline void kasan_kfree_large(void *ptr) {} +static inline void kasan_kfree_large(void *ptr, unsigned long ip) {} static inline void kasan_poison_kfree(void *ptr) {} static inline void kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size, gfp_t flags) {} @@ -117,7 +117,8 @@ static inline void kasan_krealloc(const void *object, size_t 
new_size, static inline void kasan_slab_alloc(struct kmem_cache *s, void *object, gfp_t flags) {} -static inline bool kasan_slab_free(struct kmem_cache *s, void *object) +static inline bool kasan_slab_free(struct kmem_cache *s, void *object, + unsigned long ip) { return false; } diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c index ecb64fda79e6..32f555ded938 100644 --- a/mm/kasan/kasan.c +++ b/mm/kasan/kasan.c @@ -501,7 +501,7 @@ static void kasan_poison_slab_free(struct kmem_cache *cache, void *object) kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE); } -bool kasan_slab_free(struct kmem_cache *cache, void *object) +bool kasan_slab_free(struct kmem_cache *cache, void *object, unsigned long ip) { s8 shadow_byte; @@ -511,7 +511,7 @@ bool kasan_slab_free(struct kmem_cache *cache, void *object) shadow_byte = READ_ONCE(*(s8 *)kasan_mem_to_shadow(object)); if (shadow_byte < 0 || shadow_byte >= KASAN_SHADOW_SCALE_SIZE) { - kasan_report_invalid_free(object, __builtin_return_address(1)); + kasan_report_invalid_free(object, ip); return true; } @@ -601,10 +601,10 @@ void kasan_poison_kfree(void *ptr) kasan_poison_slab_free(page->slab_cache, ptr); } -void kasan_kfree_large(void *ptr) +void kasan_kfree_large(void *ptr, unsigned long ip) { if (ptr != page_address(virt_to_head_page(ptr))) - kasan_report_invalid_free(ptr, __builtin_return_address(1)); + kasan_report_invalid_free(ptr, ip); /* The object will be poisoned by page_alloc. 
*/ } diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h index 57f517d1dfce..2792de927fcd 100644 --- a/mm/kasan/kasan.h +++ b/mm/kasan/kasan.h @@ -107,7 +107,7 @@ static inline const void *kasan_shadow_to_mem(const void *shadow_addr) void kasan_report(unsigned long addr, size_t size, bool is_write, unsigned long ip); -void kasan_report_invalid_free(void *object, void *ip); +void kasan_report_invalid_free(void *object, unsigned long ip); #if defined(CONFIG_SLAB) || defined(CONFIG_SLUB) void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache); diff --git a/mm/kasan/report.c b/mm/kasan/report.c index 55916ad21722..75206991ece0 100644 --- a/mm/kasan/report.c +++ b/mm/kasan/report.c @@ -326,12 +326,12 @@ static void print_shadow_for_address(const void *addr) } } -void kasan_report_invalid_free(void *object, void *ip) +void kasan_report_invalid_free(void *object, unsigned long ip) { unsigned long flags; kasan_start_report(&flags); - pr_err("BUG: KASAN: double-free or invalid-free in %pS\n", ip); + pr_err("BUG
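The idea of this patch, capturing the caller PC once at the outermost entry point and passing it down as a parameter, can be shown in a small userspace sketch. All `toy_*` names are hypothetical; the kernel passes an unsigned long obtained from _RET_IP_, which expands to a cast of __builtin_return_address(0). The builtin is a GCC/clang extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the report function: it only records ip. */
void *toy_last_reported_ip;

void toy_report_invalid_free(void *object, void *ip)
{
	(void)object;
	toy_last_reported_ip = ip;
}

/*
 * Inner check. It receives the caller PC as an argument, so it never
 * needs __builtin_return_address(1), which requires walking a frame
 * and is unreliable without frame pointers.
 */
__attribute__((noinline))
bool toy_check_free(void *object, void *expected_start, void *ip)
{
	if (object != expected_start) {
		toy_report_invalid_free(object, ip);
		return true;
	}
	return false;
}

/*
 * Outer entry point. Level 0 is the return address of this function,
 * i.e. a location inside whoever called toy_free(); reading it does
 * not depend on frame pointers.
 */
__attribute__((noinline))
bool toy_free(void *object, void *expected_start)
{
	return toy_check_free(object, expected_start,
			      __builtin_return_address(0));
}
```

The cost is one extra parameter on each function in the call chain, which is the signature churn visible in the diff above.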
[PATCH 1/5] kasan: detect invalid frees for large objects
Detect frees of pointers into middle of large heap objects. I dropped const from kasan_kfree_large() because it starts propagating through a bunch of functions in kasan_report.c, slab/slub nearest_obj(), all of their local variables, fixup_red_left(), etc. Signed-off-by: Dmitry Vyukov Cc: linux...@kvack.org Cc: linux-kernel@vger.kernel.org Cc: kasan-...@googlegroups.com --- include/linux/kasan.h | 4 ++-- lib/test_kasan.c | 33 + mm/kasan/kasan.c | 12 +--- mm/kasan/kasan.h | 3 +-- mm/kasan/report.c | 3 +-- mm/slub.c | 4 ++-- 6 files changed, 44 insertions(+), 15 deletions(-) diff --git a/include/linux/kasan.h b/include/linux/kasan.h index e3eb834c9a35..fc9e642533f8 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -56,7 +56,7 @@ void kasan_poison_object_data(struct kmem_cache *cache, void *object); void kasan_init_slab_obj(struct kmem_cache *cache, const void *object); void kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags); -void kasan_kfree_large(const void *ptr); +void kasan_kfree_large(void *ptr); void kasan_poison_kfree(void *ptr); void kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size, gfp_t flags); @@ -108,7 +108,7 @@ static inline void kasan_init_slab_obj(struct kmem_cache *cache, const void *object) {} static inline void kasan_kmalloc_large(void *ptr, size_t size, gfp_t flags) {} -static inline void kasan_kfree_large(const void *ptr) {} +static inline void kasan_kfree_large(void *ptr) {} static inline void kasan_poison_kfree(void *ptr) {} static inline void kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size, gfp_t flags) {} diff --git a/lib/test_kasan.c b/lib/test_kasan.c index 2724f86c4cef..e9c5d765be66 100644 --- a/lib/test_kasan.c +++ b/lib/test_kasan.c @@ -94,6 +94,37 @@ static noinline void __init kmalloc_pagealloc_oob_right(void) ptr[size] = 0; kfree(ptr); } + +static noinline void __init kmalloc_pagealloc_uaf(void) +{ + char *ptr; + size_t size = KMALLOC_MAX_CACHE_SIZE + 10; + + 
pr_info("kmalloc pagealloc allocation: use-after-free\n"); + ptr = kmalloc(size, GFP_KERNEL); + if (!ptr) { + pr_err("Allocation failed\n"); + return; + } + + kfree(ptr); + ptr[0] = 0; +} + +static noinline void __init kmalloc_pagealloc_invalid_free(void) +{ + char *ptr; + size_t size = KMALLOC_MAX_CACHE_SIZE + 10; + + pr_info("kmalloc pagealloc allocation: invalid-free\n"); + ptr = kmalloc(size, GFP_KERNEL); + if (!ptr) { + pr_err("Allocation failed\n"); + return; + } + + kfree(ptr + 1); +} #endif static noinline void __init kmalloc_large_oob_right(void) @@ -505,6 +536,8 @@ static int __init kmalloc_tests_init(void) kmalloc_node_oob_right(); #ifdef CONFIG_SLUB kmalloc_pagealloc_oob_right(); + kmalloc_pagealloc_uaf(); + kmalloc_pagealloc_invalid_free(); #endif kmalloc_large_oob_right(); kmalloc_oob_krealloc_more(); diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c index 8aaee42fcfab..ecb64fda79e6 100644 --- a/mm/kasan/kasan.c +++ b/mm/kasan/kasan.c @@ -511,8 +511,7 @@ bool kasan_slab_free(struct kmem_cache *cache, void *object) shadow_byte = READ_ONCE(*(s8 *)kasan_mem_to_shadow(object)); if (shadow_byte < 0 || shadow_byte >= KASAN_SHADOW_SCALE_SIZE) { - kasan_report_double_free(cache, object, - __builtin_return_address(1)); + kasan_report_invalid_free(object, __builtin_return_address(1)); return true; } @@ -602,12 +601,11 @@ void kasan_poison_kfree(void *ptr) kasan_poison_slab_free(page->slab_cache, ptr); } -void kasan_kfree_large(const void *ptr) +void kasan_kfree_large(void *ptr) { - struct page *page = virt_to_page(ptr); - - kasan_poison_shadow(ptr, PAGE_SIZE << compound_order(page), - KASAN_FREE_PAGE); + if (ptr != page_address(virt_to_head_page(ptr))) + kasan_report_invalid_free(ptr, __builtin_return_address(1)); + /* The object will be poisoned by page_alloc. 
*/ } int kasan_module_alloc(void *addr, size_t size) diff --git a/mm/kasan/kasan.h b/mm/kasan/kasan.h index 7c0bcd1f4c0d..57f517d1dfce 100644 --- a/mm/kasan/kasan.h +++ b/mm/kasan/kasan.h @@ -107,8 +107,7 @@ static inline const void *kasan_shadow_to_mem(const void *shadow_addr) void kasan_report(unsigned long addr, size_t size, bool is_write, unsigned long ip); -void kasan_report_double_free(struct kmem_cache *cache, void *object, - void *ip); +void kasan_report_invalid_free(void *object, void *ip); #if defined(CONFIG_SLAB) || defined(CONFIG_SLUB) void quarantine_put
Re: [PATCH 1/3] staging: irda: fix type from "unsigned" to "unsigned int"
On Tue, Dec 26, 2017 at 09:52:54PM -0800, JI-HUN KIM wrote: > Clean up checkpatch warning: > WARNING: Prefer 'unsigned int' to bare use of 'unsigned' > > Signed-off-by: JI-HUN KIM > --- > drivers/staging/irda/drivers/esi-sir.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) Please read drivers/staging/irda/TODO sorry. greg k-h
[PATCH 0/5] kasan: detect invalid frees
KASAN detects double-frees, but does not detect invalid-frees (when a pointer into the middle of a heap object is passed to free). We recently had a very unpleasant case in crypto code which freed an inner object inside of a heap allocation. This went unnoticed during free, but totally corrupted the heap and later led to a bunch of random crashes all over kernel code. Detect invalid frees. Dmitry Vyukov (5): kasan: detect invalid frees for large objects kasan: don't use __builtin_return_address(1) kasan: detect invalid frees for large mempool objects kasan: unify code between kasan_slab_free() and kasan_poison_kfree() kasan: detect invalid frees include/linux/kasan.h | 13 lib/test_kasan.c | 83 +++ mm/kasan/kasan.c | 57 +++ mm/kasan/kasan.h | 3 +- mm/kasan/report.c | 5 ++-- mm/mempool.c | 6 ++-- mm/slab.c | 6 ++-- mm/slub.c | 10 +++ 8 files changed, 135 insertions(+), 48 deletions(-) -- 2.15.1.620.gb9897f4670-goog
[PATCH 5/5] kasan: detect invalid frees
Detect frees of pointers into middle of heap objects. Signed-off-by: Dmitry Vyukov Cc: linux...@kvack.org Cc: linux-kernel@vger.kernel.org Cc: kasan-...@googlegroups.com --- lib/test_kasan.c | 50 ++ mm/kasan/kasan.c | 6 ++ 2 files changed, 56 insertions(+) diff --git a/lib/test_kasan.c b/lib/test_kasan.c index e9c5d765be66..a808d81b409d 100644 --- a/lib/test_kasan.c +++ b/lib/test_kasan.c @@ -523,6 +523,54 @@ static noinline void __init kasan_alloca_oob_right(void) *(volatile char *)p; } +static noinline void __init kmem_cache_double_free(void) +{ + char *p; + size_t size = 200; + struct kmem_cache *cache; + + cache = kmem_cache_create("test_cache", size, 0, 0, NULL); + if (!cache) { + pr_err("Cache allocation failed\n"); + return; + } + pr_info("double-free on heap object\n"); + p = kmem_cache_alloc(cache, GFP_KERNEL); + if (!p) { + pr_err("Allocation failed\n"); + kmem_cache_destroy(cache); + return; + } + + kmem_cache_free(cache, p); + kmem_cache_free(cache, p); + kmem_cache_destroy(cache); +} + +static noinline void __init kmem_cache_invalid_free(void) +{ + char *p; + size_t size = 200; + struct kmem_cache *cache; + + cache = kmem_cache_create("test_cache", size, 0, SLAB_TYPESAFE_BY_RCU, + NULL); + if (!cache) { + pr_err("Cache allocation failed\n"); + return; + } + pr_info("invalid-free of heap object\n"); + p = kmem_cache_alloc(cache, GFP_KERNEL); + if (!p) { + pr_err("Allocation failed\n"); + kmem_cache_destroy(cache); + return; + } + + kmem_cache_free(cache, p + 1); + kmem_cache_destroy(cache); +} + static int __init kmalloc_tests_init(void) { /* @@ -560,6 +608,8 @@ static int __init kmalloc_tests_init(void) ksize_unpoisons_memory(); copy_user_test(); use_after_scope_test(); + kmem_cache_double_free(); + kmem_cache_invalid_free(); kasan_restore_multi_shot(multishot); diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c index 578843fab5dc..3fb497d4fbf8 100644 --- a/mm/kasan/kasan.c +++ b/mm/kasan/kasan.c @@ -495,6 +495,12 @@ static bool __kasan_slab_free(struct 
kmem_cache *cache, void *object, s8 shadow_byte; unsigned long rounded_up_size; + if (unlikely(nearest_obj(cache, virt_to_head_page(object), object) != + object)) { + kasan_report_invalid_free(object, ip); + return true; + } + /* RCU slabs could be legally used after free within the RCU period */ if (unlikely(cache->flags & SLAB_TYPESAFE_BY_RCU)) return false; -- 2.15.1.620.gb9897f4670-goog
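The check added in this patch compares the freed pointer against the start of the object that contains it. A simplified userspace model is sketched below. The `toy_slab` layout (equal-sized objects packed back to back from a base address) is an assumption made for illustration; the real nearest_obj() in SLAB/SLUB also has to account for details such as red zones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slab model: nr_objects objects of object_size bytes from base. */
struct toy_slab {
	uintptr_t base;
	size_t object_size;
	size_t nr_objects;
};

/* Round ptr down to the start of the object that contains it. */
uintptr_t toy_nearest_obj(const struct toy_slab *slab, uintptr_t ptr)
{
	size_t index = (ptr - slab->base) / slab->object_size;

	return slab->base + index * slab->object_size;
}

/*
 * A free is invalid if the pointer lies outside the slab or does not
 * point exactly at the start of an object.
 */
bool toy_is_invalid_free(const struct toy_slab *slab, uintptr_t ptr)
{
	uintptr_t end = slab->base + slab->object_size * slab->nr_objects;

	if (ptr < slab->base || ptr >= end)
		return true;

	return toy_nearest_obj(slab, ptr) != ptr;
}
```

In this model the `p + 1` case from kmem_cache_invalid_free() is flagged because rounding down yields `p`, which differs from the pointer that was passed in.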
[PATCH 4/5] kasan: unify code between kasan_slab_free() and kasan_poison_kfree()
Both of these functions deal with freeing of slab objects. However, kasan_poison_kfree() mishandles SLAB_TYPESAFE_BY_RCU (must also not poison such objects) and does not detect double-frees. Unify code between these functions. This solves both of the problems and allows to add more common code (e.g. detection of invalid frees). Signed-off-by: Dmitry Vyukov Cc: linux...@kvack.org Cc: linux-kernel@vger.kernel.org Cc: kasan-...@googlegroups.com --- mm/kasan/kasan.c | 28 1 file changed, 12 insertions(+), 16 deletions(-) diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c index 77c103748728..578843fab5dc 100644 --- a/mm/kasan/kasan.c +++ b/mm/kasan/kasan.c @@ -489,21 +489,11 @@ void kasan_slab_alloc(struct kmem_cache *cache, void *object, gfp_t flags) kasan_kmalloc(cache, object, cache->object_size, flags); } -static void kasan_poison_slab_free(struct kmem_cache *cache, void *object) -{ - unsigned long size = cache->object_size; - unsigned long rounded_up_size = round_up(size, KASAN_SHADOW_SCALE_SIZE); - - /* RCU slabs could be legally used after free within the RCU period */ - if (unlikely(cache->flags & SLAB_TYPESAFE_BY_RCU)) - return; - - kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE); -} - -bool kasan_slab_free(struct kmem_cache *cache, void *object, unsigned long ip) +static bool __kasan_slab_free(struct kmem_cache *cache, void *object, + unsigned long ip, bool quarantine) { s8 shadow_byte; + unsigned long rounded_up_size; /* RCU slabs could be legally used after free within the RCU period */ if (unlikely(cache->flags & SLAB_TYPESAFE_BY_RCU)) @@ -515,9 +505,10 @@ bool kasan_slab_free(struct kmem_cache *cache, void *object, unsigned long ip) return true; } - kasan_poison_slab_free(cache, object); + rounded_up_size = round_up(cache->object_size, KASAN_SHADOW_SCALE_SIZE); + kasan_poison_shadow(object, rounded_up_size, KASAN_KMALLOC_FREE); - if (unlikely(!(cache->flags & SLAB_KASAN))) + if (!quarantine || unlikely(!(cache->flags & SLAB_KASAN))) return 
false; set_track(&get_alloc_info(cache, object)->free_track, GFP_NOWAIT); @@ -525,6 +516,11 @@ bool kasan_slab_free(struct kmem_cache *cache, void *object, unsigned long ip) return true; } +bool kasan_slab_free(struct kmem_cache *cache, void *object, unsigned long ip) +{ + return __kasan_slab_free(cache, object, ip, true); +} + void kasan_kmalloc(struct kmem_cache *cache, const void *object, size_t size, gfp_t flags) { @@ -602,7 +598,7 @@ void kasan_poison_kfree(void *ptr, unsigned long ip) kasan_poison_shadow(ptr, PAGE_SIZE << compound_order(page), KASAN_FREE_PAGE); } else { - kasan_poison_slab_free(page->slab_cache, ptr); + __kasan_slab_free(page->slab_cache, ptr, ip, false); } } -- 2.15.1.620.gb9897f4670-goog
Re: [PATCH 2/2] dts: Probe efuse for CI20
On Wed, Dec 27, 2017 at 01:27:02PM +0100, Mathieu Malaterre wrote: > Signed-off-by: Mathieu Malaterre I know I can't take patches without any changelog text at all, and really, you shouldn't ever create such a thing :) thanks, greg k-h
[PATCH 3/5] kasan: detect invalid frees for large mempool objects
Detect frees of pointers into middle of mempool objects. I did a one-off test, but it turned out to be very tricky, so I reverted it. First, mempool does not call kasan_poison_kfree() unless allocation function fails. I stubbed an allocation function to fail on second and subsequent allocations. But then mempool stopped to call kasan_poison_kfree() at all, because it does it only when allocation function is mempool_kmalloc(). We could support this special failing test allocation function in mempool, but it also can't live with kasan tests, because these are in a module. Signed-off-by: Dmitry Vyukov Cc: linux...@kvack.org Cc: linux-kernel@vger.kernel.org Cc: kasan-...@googlegroups.com --- include/linux/kasan.h | 4 ++-- mm/kasan/kasan.c | 11 --- mm/mempool.c | 6 +++--- 3 files changed, 13 insertions(+), 8 deletions(-) diff --git a/include/linux/kasan.h b/include/linux/kasan.h index f0d13c30acc6..fc45f8952d1e 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -57,7 +57,7 @@ void kasan_init_slab_obj(struct kmem_cache *cache, const void *object); void kasan_kmalloc_large(const void *ptr, size_t size, gfp_t flags); void kasan_kfree_large(void *ptr, unsigned long ip); -void kasan_poison_kfree(void *ptr); +void kasan_poison_kfree(void *ptr, unsigned long ip); void kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size, gfp_t flags); void kasan_krealloc(const void *object, size_t new_size, gfp_t flags); @@ -109,7 +109,7 @@ static inline void kasan_init_slab_obj(struct kmem_cache *cache, static inline void kasan_kmalloc_large(void *ptr, size_t size, gfp_t flags) {} static inline void kasan_kfree_large(void *ptr, unsigned long ip) {} -static inline void kasan_poison_kfree(void *ptr) {} +static inline void kasan_poison_kfree(void *ptr, unsigned long ip) {} static inline void kasan_kmalloc(struct kmem_cache *s, const void *object, size_t size, gfp_t flags) {} static inline void kasan_krealloc(const void *object, size_t new_size, diff --git 
a/mm/kasan/kasan.c b/mm/kasan/kasan.c index 32f555ded938..77c103748728 100644 --- a/mm/kasan/kasan.c +++ b/mm/kasan/kasan.c @@ -588,17 +588,22 @@ void kasan_krealloc(const void *object, size_t size, gfp_t flags) kasan_kmalloc(page->slab_cache, object, size, flags); } -void kasan_poison_kfree(void *ptr) +void kasan_poison_kfree(void *ptr, unsigned long ip) { struct page *page; page = virt_to_head_page(ptr); - if (unlikely(!PageSlab(page))) + if (unlikely(!PageSlab(page))) { + if (ptr != page_address(page)) { + kasan_report_invalid_free(ptr, ip); + return; + } kasan_poison_shadow(ptr, PAGE_SIZE << compound_order(page), KASAN_FREE_PAGE); - else + } else { kasan_poison_slab_free(page->slab_cache, ptr); + } } void kasan_kfree_large(void *ptr, unsigned long ip) diff --git a/mm/mempool.c b/mm/mempool.c index 7d8c5a0010a2..5c9dce34719b 100644 --- a/mm/mempool.c +++ b/mm/mempool.c @@ -103,10 +103,10 @@ static inline void poison_element(mempool_t *pool, void *element) } #endif /* CONFIG_DEBUG_SLAB || CONFIG_SLUB_DEBUG_ON */ -static void kasan_poison_element(mempool_t *pool, void *element) +static __always_inline void kasan_poison_element(mempool_t *pool, void *element) { if (pool->alloc == mempool_alloc_slab || pool->alloc == mempool_kmalloc) - kasan_poison_kfree(element); + kasan_poison_kfree(element, _RET_IP_); if (pool->alloc == mempool_alloc_pages) kasan_free_pages(element, (unsigned long)pool->pool_data); } @@ -119,7 +119,7 @@ static void kasan_unpoison_element(mempool_t *pool, void *element, gfp_t flags) kasan_alloc_pages(element, (unsigned long)pool->pool_data); } -static void add_element(mempool_t *pool, void *element) +static __always_inline void add_element(mempool_t *pool, void *element) { BUG_ON(pool->curr_nr >= pool->min_nr); poison_element(pool, element); -- 2.15.1.620.gb9897f4670-goog
Re: [RFC PATCH] memory-hotplug: add sysfs immovable_mem attribute
On Wed, Dec 27, 2017 at 08:30:12PM +0800, Chao Fan wrote: > In sometimes users specify the memory region in immovable node in > some kernel commandline, such as "kernel_core" or the "immovable_mem=" > in the patchset that I have send. But users don't know the memory > region. So add this interface to print it. > > It will show like this: "nn@ss,nn@ss,...". "nn" means the size of memory > region, "ss" means the start position of this region. > > Signed-off-by: Chao Fan > --- > drivers/base/memory.c | 50 ++ > 1 file changed, 50 insertions(+) Why did you not also create the needed Documentation/ABI/ file update? That's required for sysfs attributes. > > diff --git a/drivers/base/memory.c b/drivers/base/memory.c > index 1d60b58a8c19..9cadf1a9dccb 100644 > --- a/drivers/base/memory.c > +++ b/drivers/base/memory.c > @@ -25,6 +25,7 @@ > > #include > #include > +#include > > static DEFINE_MUTEX(mem_sysfs_mutex); > > @@ -389,6 +390,52 @@ static ssize_t show_phys_device(struct device *dev, > } > > #ifdef CONFIG_MEMORY_HOTREMOVE > +/* > + * Immovable memory region > + */ > + > +static ssize_t > +show_immovable_mem(struct device *dev, struct device_attribute *attr, > +char *buf) > +{ > + struct acpi_table_header *table_header = NULL; > + struct acpi_srat_mem_affinity *ma; > + struct acpi_subtable_header *th; > + unsigned long long table_size; > + unsigned long long table_end; > + char pbuf[35], *p = buf; > + int len; > + > + acpi_get_table(ACPI_SIG_SRAT, 0, &table_header); > + > + table_size = sizeof(struct acpi_table_srat); > + table_end = (unsigned long)table_header + table_header->length; > + th = (struct acpi_subtable_header *)((unsigned long) > + table_header + table_size); > + > + while (((unsigned long)th) + > +sizeof(struct acpi_subtable_header) < table_end) { > + if (th->type == 1) { > + ma = (struct acpi_srat_mem_affinity *)th; > + if (ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) > + continue; > + len = sprintf(pbuf, "%llx@%llx", > +ma->length, ma->base_address); sysfs 
is "one value per file", and if you ever have to care if you are overrunning the length of the buffer, that's a huge hint you are doing something wrong here. sorry, greg k-h
Re: BUG warnings in 4.14.9
On Wed, Dec 27, 2017 at 04:25:00AM +, alexander.le...@verizon.com wrote: > On Tue, Dec 26, 2017 at 10:54:37PM +0200, Ido Schimmel wrote: > >On Tue, Dec 26, 2017 at 07:59:55PM +0100, Willy Tarreau wrote: > >> Guys, > >> > >> Chris reported the bug below and confirmed that reverting commit > >> 9704f81 (ipv6: grab rt->rt6i_ref before allocating pcpu rt) seems to > >> have fixed the issue for him. This patch is a94b9367 in mainline. > >> > >> I personally have no opinion on the patch, just found it because it > >> was the only one touching this area between 4.14.8 and 4.14.9 :-) > >> > >> Should this be reverted or maybe fixed differently ? > > > >Maybe I'm missing something, but how come this patch even made its way > >into 4.14.y? It's part of a series to RCU-ify IPv6 FIB lookup that went > >into 4.15. > > > >Anyway, the mentioned bug was already fixed by commit 951f788a80ff > >("ipv6: fix a BUG in rt6_get_pcpu_route()") when the code was still in > >net-next. > > Uh, you're right. Greg, please just revert 9704f81. Thanks! Now reverted, sorry about this. greg k-h
Re: [PATCH 2/2] usb: quirks: Add reset-resume quirk for Dell DW1820 QCA Rome Bluetooth
On Tue, Dec 26, 2017 at 10:01:46PM +0100, Marcel Holtmann wrote: > Hi Greg, > > > Commit ("fd865802c66bc451dc515ed89360f84376ce1a56 Bluetooth: btusb: fix > > QCA Rome suspend/resume") enables reset_resume in btusb_probe(). This > > makes the device reset during btusb_open(), firmware loading gets > > interrupted as a result. > > > > We still want to reset the device to solve the original issue, but we > > should do it before btusb_open(). > > > > Hence, add reset-resume quirk in usb core instead of btusb. > > > > Cc: sta...@vger.kernel.org > > Cc: Leif Liddy > > Cc: Matthias Kaehlcke > > Cc: Brian Norris > > Cc: Daniel Drake > > Signed-off-by: Kai-Heng Feng > > > > --- > > drivers/usb/core/quirks.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c > > index a10b346b9777..96951104c45b 100644 > > --- a/drivers/usb/core/quirks.c > > +++ b/drivers/usb/core/quirks.c > > @@ -197,6 +197,9 @@ static const struct usb_device_id usb_quirk_list[] = { > > { USB_DEVICE(0x0b05, 0x17e0), .driver_info = > > USB_QUIRK_IGNORE_REMOTE_WAKEUP }, > > > > + /* QCA Rome Bluetooth in Dell DW1820 wireless module */ > > + { USB_DEVICE(0x0cf3, 0xe007), .driver_info = USB_QUIRK_RESET_RESUME }, > > + > > can I get an ACK from you to take this patch through bluetooth-next tree? Or > are you planning to take it? It's not in my queue at all, so I didn't even have the chance to take it :) Acked-by: Greg Kroah-Hartman
cancel_work_sync() can cause priority inversion
Hi, For those who care about Linux RT behavior: while analyzing traces, I just found a priority inversion caused by an RT task calling cancel_work_sync() while the work item in question is executing in a non-RT kworker that was preempted for a significant time. WBR, Nikita Yushchenko
Re: [PATCH 2/2 v4] scsi: ufs: introduce sysfs entries exposing UFS health info
On Wed, Dec 27, 2017 at 09:00:10AM +, Avri Altman wrote: > > > > -Original Message- > > From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi- > > ow...@vger.kernel.org] On Behalf Of Greg Kroah-Hartman > > Sent: Thursday, December 21, 2017 10:00 AM > > To: Jaegeuk Kim > > Cc: linux-kernel@vger.kernel.org; linux-s...@vger.kernel.org; Jaegeuk Kim > > > > Subject: Re: [PATCH 2/2 v4] scsi: ufs: introduce sysfs entries exposing UFS > > health info > > > > On Wed, Dec 20, 2017 at 02:13:25PM -0800, Jaegeuk Kim wrote: > > > This patch adds a new sysfs group, namely health, via: > > > > > >/sys/devices/soc/X.ufshc/health/ > As device health is just one piece of information out of the device > management, > I think that you should address this in a more comprehensive way, > And set hooks for much more device info: > Allow access to device descriptors, attributes and flags. Add on patches are easy to create for this if people really want and need it :) > The attributes and flags should be placed in separate subfolders Why? What is that going to help with? > The LUN specific descriptors and attributes should be placed in a luns > subfolder, and then per descriptor / attribute type Again, why? > You might also would like to consider differentiating read and write - > to control those type of accesses as well. What do you mean by this exactly? As it is, this is a step forward in getting attributes that people are asking for and already using, into the kernel tree. Please don't object because not all attributes that are possible are being added here, it should be trivial to add more as needed, right? I'm really tired of seeing all of the various out-of-tree forks of this driver, it's about time that someone works to get those features merged, right? thanks, greg k-h