Re: [PATCH] socket.7: add description for SO_BUSY_POLL
Eliezer Tamir writes: > +While busy polling my improve latency of some application, care must be > +taken when using it since this will increase both CPU utilization power > usage. s/my/may/; missing 'and' before 'power usage'. [And should 'application' be plural?] Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] vfs.git part 2
Al Viro writes: > On Thu, Jul 11, 2013 at 02:42:54PM -0700, Linus Torvalds wrote: >> But with an *old* kernel, O_TMPFILE will just silently be ignored as >> an unrecognized flag, and things won't work. If you do >> >> fd = open("dirname", O_CREAT | O_TMPFILE | O_RDWR, 0666); >> >> it may be that it ends up acting as a "create file at specified >> directory path" instead of what the user *meant* for it to do, which >> was "create unnamed temporary file in the specified directory". >> > > It's slightly less painful than that - if dirname exists, the old kernels > will fail; O_CREAT for existing directory means an error. But isn't the problem the case where dirname does not exist? I.e., the application has to make sure that "/some/where" exists and is a directory before open("/some/where", O_CREAT | O_TMPFILE | O_RDWR, 0666) can be relied upon to fail on kernels not recognizing O_TMPFILE, instead of just creating "where" in "/some". Just thinking out loud, and please tell me to shut up if it doesn't make sense: The documentation for O_DIRECTORY seems to imply that one could require O_DIRECTORY to be given when using O_TMPFILE. The "If pathname is not a directory, cause the open to fail" certainly seems to make sense when O_TMPFILE is used, and older kernels should complain when seeing the O_CREAT|O_DIRECTORY combination. It is a hack, though. Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] vfs.git part 2
Al Viro writes: > On Fri, Jul 12, 2013 at 12:02:45PM +0000, Rasmus Villemoes wrote: > >> But isn't the problem the case where dirname does not exist? I.e., the >> application has to make sure that "/some/where" exists and is a directory >> before open("/some/where", O_CREAT | O_TMPFILE | O_RDWR, 0666) can be >> relied upon to fail on kernels not recognizing O_TMPFILE, instead of >> just creating "where" in "/some". >> >> Just thinking out loud, and please tell me to shut up if it doesn't make >> sense: The documentation for O_DIRECTORY seems to imply that one could >> require O_DIRECTORY to be given when using O_TMPFILE. The "If pathname >> is not a directory, cause the open to fail" certainly seems to make >> sense when O_TMPFILE is used, and older kernels should complain when >> seeing the O_CREAT|O_DIRECTORY combination. It is a hack, though. > > They should, but they won't ;-/ I see; I should test before I post, but... > It's the same problem - we do *not* validate the flags argument. > We'll get to do_last(), hit lookup_open(), which will create the > sucker and go to finish_open_created. Which is past the logics > checking for LOOKUP_DIRECTORY trying to return a non-directory and it > would've been too late to fail anyway - the file has already been > created. IOW, O_DIRECTORY is ignored when O_CREAT is present *and* > file didn't exist already. In that case we almost certainly can treat > that as a bug (i.e. start failing open() on O_CREAT | O_DIRECTORY in > all cases - I'd be _very_ surprised if somebody called open() with > such combination of flags), but that doesn't help with older > kernels... ... it seems that if one then omits O_CREAT, things work out ok, as long as one uses O_RDWR (which is the only sane thing to do with O_TMPFILE, I guess): open("/tmp/test/dir", O_DIRECTORY | O_RDWR, 0666) -> -1; Is a directory open("/tmp/test/dir", O_DIRECTORY | O_RDONLY, 0666) -> 3; Success open("/tmp/test/file", O_DIRECTORY | O_RDWR, 0666) -> -1; Not a directory open("/tmp/test/link_to_file", O_DIRECTORY | O_RDWR, 0666) -> -1; Not a directory open("/tmp/test/link_to_nowhere", O_DIRECTORY | O_RDWR, 0666) -> -1; No such file or directory open("/tmp/test/link_to_dir", O_DIRECTORY | O_RDWR, 0666) -> -1; Is a directory open("/tmp/test/link_to_dir", O_DIRECTORY | O_RDONLY, 0666) -> 3; Success open("/tmp/test/link_to_dir", O_NOFOLLOW | O_DIRECTORY | O_RDWR, 0666) -> -1; Too many levels of symbolic links open("/tmp/test/link_to_dir", O_NOFOLLOW | O_DIRECTORY | O_RDONLY, 0666) -> -1; Too many levels of symbolic links (The above flags are what an old kernel would effectively see with or without O_TMPFILE present, I suppose.) How about simply making O_TMPFILE == O_DIRECTORY | O_RDWR | O_TMPFILE_INTERNAL, and letting the correct use be open("/some/dir", O_TMPFILE) [with or without a mode argument] Using O_DIRECTORY when we don't want to open a directory, and omitting O_CREAT when we do want to create something new, is somewhat counter-intuitive, but I think this would solve the problem with old kernels. Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] fs: Properly pair fget_light/fput_light in fdget/fdput
fdget() is implemented using fget_light(). Its companion fdput() currently open-codes the behaviour of fput_light(). This is slightly confusing. To follow the documentation in fs/file.c:688ff to the letter, use fput_light() in fdput(). Signed-off-by: Rasmus Villemoes --- include/linux/file.h |3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/include/linux/file.h b/include/linux/file.h index cbacf4f..30be9b0 100644 --- a/include/linux/file.h +++ b/include/linux/file.h @@ -33,8 +33,7 @@ struct fd { static inline void fdput(struct fd fd) { - if (fd.need_put) - fput(fd.file); + fput_light(fd.file, fd.need_put); } extern struct file *fget(unsigned int fd); -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] drivers: use IS_ERR_VALUE() in memory_lseek()
Use IS_ERR_VALUE() instead of comparing the new offset to a hard-coded value of -MAX_ERRNO (which is also off-by-one due to the use of ~ instead of -). Signed-off-by: Rasmus Villemoes --- drivers/char/mem.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/char/mem.c b/drivers/char/mem.c index 1ccbe94..2ca6d78 100644 --- a/drivers/char/mem.c +++ b/drivers/char/mem.c @@ -745,7 +745,7 @@ static loff_t memory_lseek(struct file *file, loff_t offset, int orig) offset += file->f_pos; case SEEK_SET: /* to avoid userland mistaking f_pos=-9 as -EBADF=-9 */ - if ((unsigned long long)offset >= ~0xFFFULL) { + if (IS_ERR_VALUE((unsigned long long)offset)) { ret = -EOVERFLOW; break; } -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mm: mremap: validate input before taking lock
Ping... Rasmus Villemoes writes: > This patch is very similar to 84d96d897671: Perform some basic > validation of the input to mremap() before taking the > ¤t->mm->mmap_sem lock. This also makes the MREMAP_FIXED => > MREMAP_MAYMOVE dependency slightly more explicit. > > Signed-off-by: Rasmus Villemoes > --- > mm/mremap.c | 18 ++ > 1 file changed, 10 insertions(+), 8 deletions(-) > > diff --git a/mm/mremap.c b/mm/mremap.c > index 463a257..00b6905 100644 > --- a/mm/mremap.c > +++ b/mm/mremap.c > @@ -456,13 +456,14 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned > long, old_len, > unsigned long charged = 0; > bool locked = false; > > - down_write(¤t->mm->mmap_sem); > - > if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE)) > - goto out; > + return ret; > + > + if (flags & MREMAP_FIXED && !(flags & MREMAP_MAYMOVE)) > + return ret; > > if (addr & ~PAGE_MASK) > - goto out; > + return ret; > > old_len = PAGE_ALIGN(old_len); > new_len = PAGE_ALIGN(new_len); > @@ -473,12 +474,13 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned > long, old_len, >* a zero new-len is nonsensical. >*/ > if (!new_len) > - goto out; > + return ret; > + > + down_write(¤t->mm->mmap_sem); > > if (flags & MREMAP_FIXED) { > - if (flags & MREMAP_MAYMOVE) > - ret = mremap_to(addr, old_len, new_addr, new_len, > - &locked); > + ret = mremap_to(addr, old_len, new_addr, new_len, > + &locked); > goto out; > } > > -- > 1.7.9.5 > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majord...@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: mailto:"d...@kvack.org";> em...@kvack.org > -- Rasmus Villemoes <http://rasmusvillemoes.dk/> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/3] mfd: Add irq domain support for max8998 interrupts
Tomasz Figa writes: > All uses of irq_base in platform data and max8997 driver private data > are removed. > [...] > +static int max8998_irq_domain_map(struct irq_domain *d, unsigned int irq, > + irq_hw_number_t hw) > +{ > + struct max8997_dev *max8998 = d->host_data; I may be wrong, but this sure looks like a typo (or copy-pasto). -- Rasmus Villemoes <http://rasmusvillemoes.dk/> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [git pull] vfs.git part 2
Linus Torvalds writes: > On Fri, Jul 12, 2013 at 12:39 PM, Al Viro wrote: >> >> [suggested by Rasmus Villemoes] make O_DIRECTORY | O_RDWR part of O_TMPFILE; >> that will fail on old kernels in a lot more cases than what I came up with. > > So see earlier about why I'm not convinced about O_RDWR. But even if > we really want that (and it might be better to start off too narrow > than accept anything else) your patch tests those bits the wrong way > (any O_RDWR test should be done using the O_ACCMODE mask, not using > the O_RDWR value itself as a mask) On further thought, I think I'll retract the suggestion to include O_RDWR in O_TMPFILE: If that is done, I don't see how one can ever allow O_WRONLY functionality without breaking the ABI (or introducing O_TMPFILE_WRONLY). So I suggest O_TMPFILE == O_DIRECTORY|__O_TMPFILE, and correct use is open(path, O_TMPFILE|O_RDWR, mode). Also, O_TMPFILE without O_RDWR (or O_WRONLY) should then give -EINVAL. Pros: The access mode is explicit Easy to allow O_TMPFILE|O_WRONLY in the future (or now?) Static code checkers complaining about the lack of O_{RDONLY,RDWR,WRONLY} in open() (?) Cons: We catch one fewer case on older kernels, but not one which is likely to occur in real programs: On old kernels, O_TMPFILE|O_RDONLY, or equivalently O_TMPFILE, may give a valid file descriptor (in the case where path does name a directory). However: Anyone writing a program using O_TMPFILE probably at least tests on a kernel knowing about that, and so would be given the -EINVAL error, and the documentation can just spell out that O_TMPFILE is not compatible with O_RDONLY. Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Add per-process flag to control thp
Alex Thorlton writes: > This patch implements functionality to allow processes to disable the use of > transparent hugepages through the prctl syscall. A few comments: Is there a reason it shouldn't be possible for a process to un-disable/reenable thp? > +static inline int transparent_hugepage_enabled(struct vm_area_struct *vma) > +{ > + return !current->thp_disabled & _transparent_hugepage_enabled(vma); > +} Should probably be &&. > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 50d04b9..f084c76 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -1406,6 +1406,9 @@ struct task_struct { > unsigned intsequential_io; > unsigned intsequential_io_avg; > #endif > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE > + int thp_disabled; > +#endif > }; Is there a reason this needs a whole int in task_struct and not just a single bit? Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/4] asm-generic: fix minor include guard issues
I wrote a script to help find (potential) problems with include guards, such as the same macro being used in different header files. These patches are the result of applying that to include/asm-generic. They are mostly trivial, but [1/4] fixes two almost-problems. I didn't touch include/asm-generic/rwsem.h, although it might need some attention. It was introduced in dd472da38, seemingly a copy of arch/powerpc/include/asm/rwsem.h, which was removed in 0766387bc. It is not #included from anywhere, but is used in arch/{hexagon,powerpc}/include/asm/Kbuild. Since it is no longer powerpc-specific, the _ASM_POWERPC_RWSEM_H and #ifdef CONFIG_PPC64 seem malplaced. Rasmus Villemoes (4): asm-generic: Make sure include guards contain the substring ASM_GENERIC asm-generic: Use different include guards asm-generic: Add missing include guards asm-generic: Fix typo in comment after #endif include/asm-generic/bitops/arch_hweight.h |2 +- include/asm-generic/bitops/atomic.h |2 +- include/asm-generic/cacheflush.h|6 +++--- include/asm-generic/clkdev.h|4 ++-- include/asm-generic/dma-coherent.h |4 ++-- include/asm-generic/dma-contiguous.h|4 ++-- include/asm-generic/dma-mapping-broken.h|6 +++--- include/asm-generic/dma-mapping-common.h|4 ++-- include/asm-generic/ide_iops.h |4 include/asm-generic/io-64-nonatomic-hi-lo.h |6 +++--- include/asm-generic/io-64-nonatomic-lo-hi.h |6 +++--- include/asm-generic/iomap.h |4 ++-- include/asm-generic/memory_model.h |4 ++-- include/asm-generic/pci_iomap.h |2 +- include/asm-generic/rtc.h |6 +++--- include/asm-generic/signal.h|2 +- include/asm-generic/statfs.h|4 ++-- include/asm-generic/syscall.h |6 +++--- include/asm-generic/unistd.h|5 + include/asm-generic/vga.h |2 +- include/asm-generic/word-at-a-time.h|6 +++--- include/asm-generic/xor.h |4 22 files changed, 53 insertions(+), 40 deletions(-) -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/4] asm-generic: Make sure include guards contain the substring ASM_GENERIC
This patch ensures that the include guards used in various include/asm-generic/*.h files contain the substring ASM_GENERIC. This should reduce the risk of bugs arising from arch/*/asm/foo.h including asm-generic/foo.h, while both use the same include guard. No such instances exist, but two near-hits are: arch/mn10300/include/asm/rtc.h uses _ASM_RTC_H and includes asm-generic/rtc.h which uses __ASM_RTC_H__ arch/hexagon/include/asm/cacheflush.h uses _ASM_CACHEFLUSH_H and includes asm-generic/cacheflush.h which uses __ASM_CACHEFLUSH_H Signed-off-by: Rasmus Villemoes --- include/asm-generic/cacheflush.h|6 +++--- include/asm-generic/clkdev.h|4 ++-- include/asm-generic/dma-coherent.h |4 ++-- include/asm-generic/dma-contiguous.h|4 ++-- include/asm-generic/io-64-nonatomic-hi-lo.h |6 +++--- include/asm-generic/io-64-nonatomic-lo-hi.h |6 +++--- include/asm-generic/iomap.h |4 ++-- include/asm-generic/memory_model.h |4 ++-- include/asm-generic/rtc.h |6 +++--- include/asm-generic/statfs.h|4 ++-- include/asm-generic/syscall.h |6 +++--- include/asm-generic/word-at-a-time.h|6 +++--- 12 files changed, 30 insertions(+), 30 deletions(-) diff --git a/include/asm-generic/cacheflush.h b/include/asm-generic/cacheflush.h index 87bc536..c9717ba 100644 --- a/include/asm-generic/cacheflush.h +++ b/include/asm-generic/cacheflush.h @@ -1,5 +1,5 @@ -#ifndef __ASM_CACHEFLUSH_H -#define __ASM_CACHEFLUSH_H +#ifndef __ASM_GENERIC_CACHEFLUSH_H +#define __ASM_GENERIC_CACHEFLUSH_H /* Keep includes the same across arches. */ #include @@ -31,4 +31,4 @@ #define copy_from_user_page(vma, page, vaddr, dst, src, len) \ memcpy(dst, src, len) -#endif /* __ASM_CACHEFLUSH_H */ +#endif /* __ASM_GENERIC_CACHEFLUSH_H */ diff --git a/include/asm-generic/clkdev.h b/include/asm-generic/clkdev.h index 90a32a6..fb1cad9 100644 --- a/include/asm-generic/clkdev.h +++ b/include/asm-generic/clkdev.h @@ -10,8 +10,8 @@ * * Helper for the clk API to assist looking up a struct clk. */ -#ifndef __ASM_CLKDEV_H -#define __ASM_CLKDEV_H +#ifndef __ASM_GENERIC_CLKDEV_H +#define __ASM_GENERIC_CLKDEV_H #include diff --git a/include/asm-generic/dma-coherent.h b/include/asm-generic/dma-coherent.h index 2be8a2d..33a971e 100644 --- a/include/asm-generic/dma-coherent.h +++ b/include/asm-generic/dma-coherent.h @@ -1,5 +1,5 @@ -#ifndef DMA_COHERENT_H -#define DMA_COHERENT_H +#ifndef _ASM_GENERIC_DMA_COHERENT_H +#define _ASM_GENERIC_DMA_COHERENT_H #ifdef CONFIG_HAVE_GENERIC_DMA_COHERENT /* diff --git a/include/asm-generic/dma-contiguous.h b/include/asm-generic/dma-contiguous.h index 294b1e7..cedad6b 100644 --- a/include/asm-generic/dma-contiguous.h +++ b/include/asm-generic/dma-contiguous.h @@ -1,5 +1,5 @@ -#ifndef ASM_DMA_CONTIGUOUS_H -#define ASM_DMA_CONTIGUOUS_H +#ifndef _ASM_GENERIC_DMA_CONTIGUOUS_H +#define _ASM_GENERIC_DMA_CONTIGUOUS_H #ifdef __KERNEL__ #ifdef CONFIG_CMA diff --git a/include/asm-generic/io-64-nonatomic-hi-lo.h b/include/asm-generic/io-64-nonatomic-hi-lo.h index a6806a9..982cc29 100644 --- a/include/asm-generic/io-64-nonatomic-hi-lo.h +++ b/include/asm-generic/io-64-nonatomic-hi-lo.h @@ -1,5 +1,5 @@ -#ifndef _ASM_IO_64_NONATOMIC_HI_LO_H_ -#define _ASM_IO_64_NONATOMIC_HI_LO_H_ +#ifndef _ASM_GENERIC_IO_64_NONATOMIC_HI_LO_H_ +#define _ASM_GENERIC_IO_64_NONATOMIC_HI_LO_H_ #include #include @@ -25,4 +25,4 @@ static inline void writeq(__u64 val, volatile void __iomem *addr) } #endif -#endif /* _ASM_IO_64_NONATOMIC_HI_LO_H_ */ +#endif /* _ASM_GENERIC_IO_64_NONATOMIC_HI_LO_H_ */ diff --git a/include/asm-generic/io-64-nonatomic-lo-hi.h b/include/asm-generic/io-64-nonatomic-lo-hi.h index ca546b1..abb1549 100644 --- a/include/asm-generic/io-64-nonatomic-lo-hi.h +++ b/include/asm-generic/io-64-nonatomic-lo-hi.h @@ -1,5 +1,5 @@ -#ifndef _ASM_IO_64_NONATOMIC_LO_HI_H_ -#define _ASM_IO_64_NONATOMIC_LO_HI_H_ +#ifndef _ASM_GENERIC_IO_64_NONATOMIC_LO_HI_H_ +#define _ASM_GENERIC_IO_64_NONATOMIC_LO_HI_H_ #include #include @@ -25,4 +25,4 @@ static inline void writeq(__u64 val, volatile void __iomem *addr) } #endif -#endif /* _ASM_IO_64_NONATOMIC_LO_HI_H_ */ +#endif /* _ASM_GENERIC_IO_64_NONATOMIC_LO_HI_H_ */ diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h index 6afd7d6..66b6798 100644 --- a/include/asm-generic/iomap.h +++ b/include/asm-generic/iomap.h @@ -1,5 +1,5 @@ -#ifndef __GENERIC_IO_H -#define __GENERIC_IO_H +#ifndef __ASM_GENERIC_IOMAP_H +#define __ASM_GENERIC_IOMAP_H #include #include diff --git a/include/asm-generic/memory_model.h b/include/asm-generic/memory_model.h index aea9e45..e40c952 100644 --- a/include/asm-generic/memory_model.h +++ b/include/asm-generic/memory_model.h @@ -1,5 +1,5 @@ -#ifndef __ASM_MEMORY_MODEL_H -#define __ASM_MEMORY_MODEL_H +#ifndef __ASM_GENERIC_MEMORY_MODEL_H
[PATCH 3/4] asm-generic: Add missing include guards
Add include guards to include/asm-generic/{ide_iops,unistd,xor}.h. Signed-off-by: Rasmus Villemoes --- include/asm-generic/ide_iops.h |4 include/asm-generic/unistd.h |5 + include/asm-generic/xor.h |4 3 files changed, 13 insertions(+) diff --git a/include/asm-generic/ide_iops.h b/include/asm-generic/ide_iops.h index 1b91d06..cfe2ab8 100644 --- a/include/asm-generic/ide_iops.h +++ b/include/asm-generic/ide_iops.h @@ -1,4 +1,6 @@ /* Generic I/O and MEMIO string operations. */ +#ifndef _ASM_GENERIC_IDE_IOPS_H +#define _ASM_GENERIC_IDE_IOPS_H #define __ide_insw insw #define __ide_insl insl @@ -36,3 +38,5 @@ static __inline__ void __ide_mm_outsl(void __iomem * port, void *addr, u32 count addr += 4; } } + +#endif /* _ASM_GENERIC_IDE_IOPS_H */ diff --git a/include/asm-generic/unistd.h b/include/asm-generic/unistd.h index 86e..0cff2e9 100644 --- a/include/asm-generic/unistd.h +++ b/include/asm-generic/unistd.h @@ -1,3 +1,6 @@ +#ifndef _ASM_GENERIC_UNISTD_H +#define _ASM_GENERIC_UNISTD_H + #include #include @@ -10,3 +13,5 @@ #define __ARCH_WANT_STAT64 #define __ARCH_WANT_SYS_LLSEEK #endif + +#endif /* _ASM_GENERIC_UNISTD_H */ diff --git a/include/asm-generic/xor.h b/include/asm-generic/xor.h index b4d8432..d063878 100644 --- a/include/asm-generic/xor.h +++ b/include/asm-generic/xor.h @@ -12,6 +12,8 @@ * (for example /usr/src/linux/COPYING); if not, write to the Free * Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. */ +#ifndef _ASM_GENERIC_XOR_H +#define _ASM_GENERIC_XOR_H #include @@ -716,3 +718,5 @@ static struct xor_block_template xor_block_32regs_p __maybe_unused = { xor_speed(&xor_block_32regs); \ xor_speed(&xor_block_32regs_p); \ } while (0) + +#endif /* _ASM_GENERIC_XOR_H */ -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 4/4] asm-generic: Fix typo in comment after #endif
Fix the comment so that it matches the actual name of the include guard used. Signed-off-by: Rasmus Villemoes --- include/asm-generic/bitops/arch_hweight.h |2 +- include/asm-generic/bitops/atomic.h |2 +- include/asm-generic/pci_iomap.h |2 +- include/asm-generic/signal.h |2 +- include/asm-generic/vga.h |2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/include/asm-generic/bitops/arch_hweight.h b/include/asm-generic/bitops/arch_hweight.h index 6a211f4..d6be0af 100644 --- a/include/asm-generic/bitops/arch_hweight.h +++ b/include/asm-generic/bitops/arch_hweight.h @@ -22,4 +22,4 @@ static inline unsigned long __arch_hweight64(__u64 w) { return __sw_hweight64(w); } -#endif /* _ASM_GENERIC_BITOPS_HWEIGHT_H_ */ +#endif /* _ASM_GENERIC_BITOPS_ARCH_HWEIGHT_H_ */ diff --git a/include/asm-generic/bitops/atomic.h b/include/asm-generic/bitops/atomic.h index 9ae6c34..d3ac7d5 100644 --- a/include/asm-generic/bitops/atomic.h +++ b/include/asm-generic/bitops/atomic.h @@ -186,4 +186,4 @@ static inline int test_and_change_bit(int nr, volatile unsigned long *addr) return (old & mask) != 0; } -#endif /* _ASM_GENERIC_BITOPS_ATOMIC_H */ +#endif /* _ASM_GENERIC_BITOPS_ATOMIC_H_ */ diff --git a/include/asm-generic/pci_iomap.h b/include/asm-generic/pci_iomap.h index ce37349..0b5f271 100644 --- a/include/asm-generic/pci_iomap.h +++ b/include/asm-generic/pci_iomap.h @@ -32,4 +32,4 @@ static inline void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned lon } #endif -#endif /* __ASM_GENERIC_IO_H */ +#endif /* __ASM_GENERIC_PCI_IOMAP_H */ diff --git a/include/asm-generic/signal.h b/include/asm-generic/signal.h index d840c90..1fc827f 100644 --- a/include/asm-generic/signal.h +++ b/include/asm-generic/signal.h @@ -11,4 +11,4 @@ #undef __HAVE_ARCH_SIG_BITOPS #endif /* __ASSEMBLY__ */ -#endif /* _ASM_GENERIC_SIGNAL_H */ +#endif /* __ASM_GENERIC_SIGNAL_H */ diff --git a/include/asm-generic/vga.h b/include/asm-generic/vga.h index 36c8ff5..93b8a80 100644 --- a/include/asm-generic/vga.h +++ b/include/asm-generic/vga.h @@ -21,4 +21,4 @@ #define vga_readb(x) (*(x)) #define vga_writeb(x, y) (*(y) = (x)) -#endif /* _ASM_GENERIC_VGA_H */ +#endif /* __ASM_GENERIC_VGA_H */ -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/4] asm-generic: Use different include guards
For consistency, use different include guards in include/asm-generic/dma-mapping-{broken,common}.h, even though it is unlikely (if not outright a bug) to ever include both. Signed-off-by: Rasmus Villemoes --- include/asm-generic/dma-mapping-broken.h |6 +++--- include/asm-generic/dma-mapping-common.h |4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/include/asm-generic/dma-mapping-broken.h b/include/asm-generic/dma-mapping-broken.h index 6c32af9..34621ea 100644 --- a/include/asm-generic/dma-mapping-broken.h +++ b/include/asm-generic/dma-mapping-broken.h @@ -1,5 +1,5 @@ -#ifndef _ASM_GENERIC_DMA_MAPPING_H -#define _ASM_GENERIC_DMA_MAPPING_H +#ifndef _ASM_GENERIC_DMA_MAPPING_BROKEN_H +#define _ASM_GENERIC_DMA_MAPPING_BROKEN_H /* define the dma api to allow compilation but not linking of * dma dependent code. Code that depends on the dma-mapping @@ -92,4 +92,4 @@ extern void dma_cache_sync(struct device *dev, void *vaddr, size_t size, enum dma_data_direction direction); -#endif /* _ASM_GENERIC_DMA_MAPPING_H */ +#endif /* _ASM_GENERIC_DMA_MAPPING_BROKEN_H */ diff --git a/include/asm-generic/dma-mapping-common.h b/include/asm-generic/dma-mapping-common.h index de8bf89..2ae73e6 100644 --- a/include/asm-generic/dma-mapping-common.h +++ b/include/asm-generic/dma-mapping-common.h @@ -1,5 +1,5 @@ -#ifndef _ASM_GENERIC_DMA_MAPPING_H -#define _ASM_GENERIC_DMA_MAPPING_H +#ifndef _ASM_GENERIC_DMA_MAPPING_COMMON_H +#define _ASM_GENERIC_DMA_MAPPING_COMMON_H #include #include -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH/RFC] coccinelle: replace 0/1 with false/true in functions returning bool
This semantic patch replaces "return {0,1};" with "return {false,true};" in functions returning bool. There doesn't seem to be any false positives, but some whitespace mangling is happening, for example: diff -u -p a/block/blk-throttle.c b/block/blk-throttle.c --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -734,9 +734,7 @@ static inline void throtl_extend_slice(s static bool throtl_slice_used(struct throtl_grp *tg, bool rw) { if (time_in_range(jiffies, tg->slice_start[rw], tg->slice_end[rw])) - return 0; - - return 1; + return false;return true; } Is there a way to prevent this, or is this the kind of thing which must be handled in post-processing? Signed-off-by: Rasmus Villemoes --- scripts/coccinelle/misc/boolreturn.cocci | 51 ++ 1 file changed, 51 insertions(+) create mode 100644 scripts/coccinelle/misc/boolreturn.cocci diff --git a/scripts/coccinelle/misc/boolreturn.cocci b/scripts/coccinelle/misc/boolreturn.cocci new file mode 100644 index 000..e6ece0d --- /dev/null +++ b/scripts/coccinelle/misc/boolreturn.cocci @@ -0,0 +1,51 @@ +/// Return statements in functions returning bool should use +/// true/false instead of 1/0. +// + +virtual patch +virtual report + + +@r1 depends on patch@ +identifier fn; +typedef bool; +symbol false; +symbol true; +@@ + +bool fn ( ... ) +{ +... +( +- return 0; ++ return false; +| +- return 1; ++ return true; +) +... +} + +@r2 depends on !patch@ +identifier fn; +position p; +@@ + +bool fn ( ... ) +{ + ... +( +* return 0@p ; +| +* return 1@p ; +) + ... +} + + +@script:python depends on report@ +p << r2.p; +fn << r2.fn; +@@ + +coccilib.report.print_report(p[0], "WARNING: return of 0/1 in function '%s' with return type bool" % fn) -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] coccinelle: replace 0/1 with false/true in functions returning bool
This semantic patch replaces "return {0,1};" with "return {false,true};" in functions returning bool. Signed-off-by: Rasmus Villemoes --- v2: Simplified script, and eliminate whitespace mangling at the same time. Thanks to Julia Lawall. scripts/coccinelle/misc/boolreturn.cocci | 56 ++ 1 file changed, 56 insertions(+) create mode 100644 scripts/coccinelle/misc/boolreturn.cocci diff --git a/scripts/coccinelle/misc/boolreturn.cocci b/scripts/coccinelle/misc/boolreturn.cocci new file mode 100644 index 000..c8d494c --- /dev/null +++ b/scripts/coccinelle/misc/boolreturn.cocci @@ -0,0 +1,56 @@ +/// Return statements in functions returning bool should use +/// true/false instead of 1/0. +// +// Confidence: High + +virtual patch +virtual report + + +@r1 depends on patch@ +identifier fn; +typedef bool; +symbol false; +symbol true; +@@ + +bool fn ( ... ) +{ +... +return +( +- 0 ++ false +| +- 1 ++ true +) + ; +... +} + +@r2 depends on !patch@ +identifier fn; +position p; +@@ + +bool fn ( ... ) +{ + ... +return +( +* 0@p +| +* 1@p +) + ; + ... +} + + +@script:python depends on report@ +p << r2.p; +fn << r2.fn; +@@ + +coccilib.report.print_report(p[0], "WARNING: return of 0/1 in function '%s' with return type bool" % fn) -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v3] coccinelle: replace 0/1 with false/true in functions returning bool
This semantic patch replaces "return {0,1};" with "return {false,true};" in functions returning bool. Signed-off-by: Rasmus Villemoes --- v2: Simplified script, and eliminate whitespace mangling at the same time. Thanks to Julia Lawall. v3: Further improvements from Julia. In particular, it now also does header files. scripts/coccinelle/misc/boolreturn.cocci | 58 ++ 1 file changed, 58 insertions(+) create mode 100644 scripts/coccinelle/misc/boolreturn.cocci diff --git a/scripts/coccinelle/misc/boolreturn.cocci b/scripts/coccinelle/misc/boolreturn.cocci new file mode 100644 index 000..a43c7b0 --- /dev/null +++ b/scripts/coccinelle/misc/boolreturn.cocci @@ -0,0 +1,58 @@ +/// Return statements in functions returning bool should use +/// true/false instead of 1/0. +// +// Confidence: High +// Options: --no-includes --include-headers + +virtual patch +virtual report +virtual context + +@r1 depends on patch@ +identifier fn; +typedef bool; +symbol false; +symbol true; +@@ + +bool fn ( ... ) +{ +<... +return +( +- 0 ++ false +| +- 1 ++ true +) + ; +...> +} + +@r2 depends on report || context@ +identifier fn; +position p; +@@ + +bool fn ( ... ) +{ +<... +return +( +* 0@p +| +* 1@p +) + ; +...> +} + + +@script:python depends on report@ +p << r2.p; +fn << r2.fn; +@@ + +msg = "WARNING: return of 0/1 in function '%s' with return type bool" % fn +coccilib.report.print_report(p[0], msg) -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm: madvise: complete input validation before taking lock
In madvise(), there doesn't seem to be any reason for taking the ¤t->mm->mmap_sem before start and len_in have been validated. Incidentally, this removes the need for the out: label. Signed-off-by: Rasmus Villemoes --- diff --git a/mm/madvise.c b/mm/madvise.c index c58c94b..d2ae668 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -473,27 +473,27 @@ SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior) if (!madvise_behavior_valid(behavior)) return error; - write = madvise_need_mmap_write(behavior); - if (write) - down_write(¤t->mm->mmap_sem); - else - down_read(¤t->mm->mmap_sem); - if (start & ~PAGE_MASK) - goto out; + return error; len = (len_in + ~PAGE_MASK) & PAGE_MASK; /* Check to see whether len was rounded up from small -ve to zero */ if (len_in && !len) - goto out; + return error; end = start + len; if (end < start) - goto out; + return error; error = 0; if (end == start) - goto out; + return error; + + write = madvise_need_mmap_write(behavior); + if (write) + down_write(¤t->mm->mmap_sem); + else + down_read(¤t->mm->mmap_sem); /* * If the interval [start,end) covers some unmapped address @@ -541,7 +541,6 @@ SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior) } out_plug: blk_finish_plug(&plug); -out: if (write) up_write(¤t->mm->mmap_sem); else -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] watchdog: bcm2835_wdt: set WDOG_HW_RUNNING bit when appropriate
On 2016-07-15 15:46, Guenter Roeck wrote: On 07/15/2016 01:15 AM, Rasmus Villemoes wrote: +static bool bcm2835_wdt_is_running(struct bcm2835_wdt *wdt) +{ +uint32_t cur; + +cur = readl(wdt->base + PM_RSTC); + +return !!(cur & PM_RSTC_WRCFG_FULL_RESET); +} + static int bcm2835_wdt_start(struct watchdog_device *wdog) { struct bcm2835_wdt *wdt = watchdog_get_drvdata(wdog); @@ -70,6 +79,7 @@ static int bcm2835_wdt_start(struct watchdog_device *wdog) PM_RSTC_WRCFG_FULL_RESET, wdt->base + PM_RSTC); spin_unlock_irqrestore(&wdt->lock, flags); +set_bit(WDOG_HW_RUNNING, &wdog->status); You don't need to set this bit here unless the watchdog can not be stopped. return 0; } @@ -79,6 +89,7 @@ static int bcm2835_wdt_stop(struct watchdog_device *wdog) struct bcm2835_wdt *wdt = watchdog_get_drvdata(wdog); writel_relaxed(PM_PASSWORD | PM_RSTC_RESET, wdt->base + PM_RSTC); +clear_bit(WDOG_HW_RUNNING, &wdog->status); ... and since you clear the bit, it can be stopped. Both setting and resetting the bit is therefore not necessary. Well, if the bit isn't cleared here, but it was set during probe(), the framework will (re)start this watchdog (and keep it fed) since there's no separate ping method. I suppose that's reasonable semantics if the watchdog was running at boot (and I like how that ends up interacting with my open_deadline proposal), but probably a little too subtle. This would also change if the ->start method was broken up into separate ping and start methods, which it seems that it could be. If we do clear the bit here, I think it's neater to set it in start as well, even if that doesn't really have any effect. Rasmus
Re: [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
On 2016-07-15 16:29, Guenter Roeck wrote: On 07/15/2016 12:32 AM, Rasmus Villemoes wrote: The initial timeout should be specified as module option or as devicetree parameter, and there should be no additional configuration option. I was under the impression that device tree was exclusively for describing hardware, and this certainly is not that. I also wanted to avoid having to modify each driver, which would seem to be necessary if it was module parameter/DT - the only thing required of a driver now is that it correctly reports WDOG_HW_RUNNING. What is "hardware" ? It is supposed to describe the system, isn't it ? Part of that system is its clock rate, and the means how the OS is loaded, and both have impact on the initial timeout (and the regular timeout). You might as well argue that clock rates should not be in devicetree either. Clock rates are, after all, just reflecting the _use_ of the hardware, not the hardware itself. But they are used to configure hardware. The init timeout is not a property of any particular device - it configures how the kernel behaves, and as such I find it quite natural to have it in the kernel's .config (and overridable on command line and via sysfs). Devicetree could be handled in the core, with a function to set the initial timeout, or possibly even with the watchdog registration itself. But where in the device tree would you put this value? I'd really prefer not having to modify the node representing each individual watchdog device I might use. Rasmus
[PATCH] fs/proc/array.c: slightly improve render_sigset_t
format_decode and vsnprintf occasionally show up in perf top, so I went looking for places that might not need the full printf power. With the help of kprobes, I gathered some statistics on which format strings we mostly pass to vsnprintf. On a trivial desktop workload, I hit "%x" 25% of the time, so something apparently reads /proc/pid/status (which does 5*16 printf("%x") calls) a lot. With this patch, reading /proc/pid/status is 30% faster according to this microbenchmark: char buf[4096]; int i, fd; for (i = 0; i < 1; ++i) { fd = open("/proc/self/status", O_RDONLY); read(fd, buf, sizeof(buf)); close(fd); } Signed-off-by: Rasmus Villemoes --- fs/proc/array.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/proc/array.c b/fs/proc/array.c index 88c7de12197b..7f73b689a15c 100644 --- a/fs/proc/array.c +++ b/fs/proc/array.c @@ -251,7 +251,7 @@ void render_sigset_t(struct seq_file *m, const char *header, if (sigismember(set, i+2)) x |= 2; if (sigismember(set, i+3)) x |= 4; if (sigismember(set, i+4)) x |= 8; - seq_printf(m, "%x", x); + seq_putc(m, hex_asc[x]); } while (i >= 4); seq_putc(m, '\n'); -- 2.1.4
[PATCH] ecryptfs: remove private bin2hex implementation
Calling sprintf in a loop is not very efficient, and in any case, we already have an implementation of bin-to-hex conversion in lib/ which we might as well use. Note that ecryptfs_to_hex used to nul-terminate the destination (and the kernel doc was wrong about the required output size), while bin2hex doesn't. [All but one user of ecryptfs_to_hex explicitly nul-terminates the result anyway.] Signed-off-by: Rasmus Villemoes --- fs/ecryptfs/crypto.c | 15 --- fs/ecryptfs/ecryptfs_kernel.h | 8 +++- 2 files changed, 7 insertions(+), 16 deletions(-) diff --git a/fs/ecryptfs/crypto.c b/fs/ecryptfs/crypto.c index e5e29f8c920b..7acd57da4f14 100644 --- a/fs/ecryptfs/crypto.c +++ b/fs/ecryptfs/crypto.c @@ -42,21 +42,6 @@ #define ENCRYPT1 /** - * ecryptfs_to_hex - * @dst: Buffer to take hex character representation of contents of - * src; must be at least of size (src_size * 2) - * @src: Buffer to be converted to a hex string representation - * @src_size: number of bytes to convert - */ -void ecryptfs_to_hex(char *dst, char *src, size_t src_size) -{ - int x; - - for (x = 0; x < src_size; x++) - sprintf(&dst[x * 2], "%.2x", (unsigned char)src[x]); -} - -/** * ecryptfs_from_hex * @dst: Buffer to take the bytes from src hex; must be at least of * size (src_size / 2) diff --git a/fs/ecryptfs/ecryptfs_kernel.h b/fs/ecryptfs/ecryptfs_kernel.h index 4ba1547bb9ad..548caac3a907 100644 --- a/fs/ecryptfs/ecryptfs_kernel.h +++ b/fs/ecryptfs/ecryptfs_kernel.h @@ -51,7 +51,13 @@ #define ECRYPTFS_XATTR_NAME "user.ecryptfs" void ecryptfs_dump_auth_tok(struct ecryptfs_auth_tok *auth_tok); -extern void ecryptfs_to_hex(char *dst, char *src, size_t src_size); +static inline void +ecryptfs_to_hex(char *dst, char *src, size_t src_size) +{ + char *end = bin2hex(dst, src, src_size); + *end = '\0'; +} + extern void ecryptfs_from_hex(char *dst, char *src, int dst_size); struct ecryptfs_key_record { -- 2.1.4
Re: [PATCH] fs/proc/array.c: slightly improve render_sigset_t
On Wed, Sep 21 2016, Kees Cook wrote: > On Tue, Sep 20, 2016 at 3:28 PM, Rasmus Villemoes > wrote: >> format_decode and vsnprintf occasionally show up in perf top, so I >> went looking for places that might not need the full printf >> power. With the help of kprobes, I gathered some statistics on which >> format strings we mostly pass to vsnprintf. On a trivial desktop >> workload, I hit "%x" 25% of the time, so something apparently reads >> /proc/pid/status (which does 5*16 printf("%x") calls) a lot. >> >> With this patch, reading /proc/pid/status is 30% faster according to >> this microbenchmark: >> >> char buf[4096]; >> int i, fd; >> for (i = 0; i < 1; ++i) { >> fd = open("/proc/self/status", O_RDONLY); >> read(fd, buf, sizeof(buf)); >> close(fd); >> } >> >> Signed-off-by: Rasmus Villemoes > > Heheh. Nice. :) > > Acked-by: Kees Cook > > Out of curiosity, what other stuff ended up near the top? I'd be > curious to see your kprobes too. I don't have the results from the other machine handy, but it very much depends on what one is doing. I ran a 'find $HOME ...', and "%.2x" ended up accounting for 99%. Turns out ecryptfs does a lot of bin-to-hex conversions, but very inefficiently (calling sprintf("%.2x") in a loop...). Patch sent. Other than that, the top consists of stuff like "%c%c " (/proc/pid/smaps, VmFlags:) " %s" (/proc/cpuinfo, the flags: line) "%*s" (maybe from seq_pad, but not sure who the user of that is) and a lot of other short strings which are rather hard to trace to their source, but presumably most are related to some /proc file. The kprobe is just echo 'p:vsnprintf vsnprintf fmt=+0(%dx):string' > /sys/kernel/debug/tracing/kprobe_events This doesn't escape the strings, so embedded newlines mess up the trace buffer slightly. I used this little piece of line noise to extract the format strings. use File::Slurp; my $txt = read_file($ARGV[0]); while ($txt =~ m/fmt="(.*?)"\n/mg) { $_ = $1; s/\\//g; s/\n/\\n/g; s/\t/\\t/g; s/"/\\"/g; print "\"$_\"\n"; } Rasmus
Re: [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
On 2016-07-21 02:31, Guenter Roeck wrote: On Thu, Jul 21, 2016 at 12:08:52AM +0200, Rasmus Villemoes wrote: I hear you. "configure hardware" is a slippery term, though. After all, one would typically configure the initial timeout in hardware, just as any "normal" timeout. In many cases, this will actually already be the case (and should be), since the watchdog should be enabled by the ROMMON or BIOS before control is passed to the kernel. As such, the initial timeout should already be set when the driver is loaded. So this suggests to me that you've misunderstood how I imagine the "open-deadline" (let me keep that name for now) to work. I do certainly expect the bootloader to have configured and started the watchdog device before control is handed to the kernel, preferably with a rather large timeout, so that the watchdog doesn't reset the board before the kernel gets around to loading the appropriate driver and detecting that there's a dog to be fed. In fact, if the watchdog isn't running when the kernel starts, my patch does nothing. Overall, my conclusion is that if a devicetree property is not acceptable for some reason, we should drop the notion of supporting an initial or "open" timeout entirely, and leave it up to the BIOS/ROMMON as well as the respective watchdog driver to set an acceptable default or initial timeout. This initial timeout can (and will) be overwritten with the desired runtime timeout by the watchdog daemon after it opened the watchdog device. Devicetree could be handled in the core, with a function to set the initial timeout, or possibly even with the watchdog registration itself. But where in the device tree would you put this value? I'd really prefer not having to modify the node representing each individual watchdog device I might use. The existing "timeout-sec" property is in the watchdog node as well. Yes, but my "timeout" is not used in any way for configuring the watchdog device nor the driver for it. Nor does an appropriate value depend on which watchdog hardware one has. I'm interested in, to borrow your words, "fixing the gap" between handing over control from the kernel to user-space, by making sure that the kernel only feeds the watchdog for a finite amount of time - if userspace hasn't come up and opened the watchdog device by then, the board reboots. I set a rather generous default of two minutes; it could be (and may in some cases need to be) more, but the important thing is that the kernel doesn't feed the watchdog indefinitely. Rasmus
[PATCH] mm/shmem.c: constify anon_ops
Every other dentry_operations instance is const, and this one might as well be. Signed-off-by: Rasmus Villemoes --- mm/shmem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/shmem.c b/mm/shmem.c index fd8b2b5741b1..693ffdc5899a 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -4077,7 +4077,7 @@ EXPORT_SYMBOL_GPL(shmem_truncate_range); /* common code */ -static struct dentry_operations anon_ops = { +static const struct dentry_operations anon_ops = { .d_dname = simple_dname }; -- 2.1.4
[PATCH] fs/internal.h: add const to ns_dentry_operations declaration
The actual definition in fs/nsfs.c is already const. Signed-off-by: Rasmus Villemoes --- Maybe fs/nsfs.c (and fs/buffer.c) should grow includes of internal.h. I think they are the only two providing stuff declared there but not including it. fs/internal.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/internal.h b/fs/internal.h index ba0737649d4a..395887882315 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -156,7 +156,7 @@ extern void mnt_pin_kill(struct mount *m); /* * fs/nsfs.c */ -extern struct dentry_operations ns_dentry_operations; +extern const struct dentry_operations ns_dentry_operations; /* * fs/ioctl.c -- 2.1.4
[PATCH] fs/aio.c: eliminate redundant loads in put_aio_ring_file
Using a local variable we can prevent gcc from reloading aio_ring_file->f_inode->i_mapping twice, eliminating 2x2 dependent loads. Signed-off-by: Rasmus Villemoes --- fs/aio.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/fs/aio.c b/fs/aio.c index fb8e45b88cd4..32ce10e55723 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -269,14 +269,17 @@ __initcall(aio_setup); static void put_aio_ring_file(struct kioctx *ctx) { struct file *aio_ring_file = ctx->aio_ring_file; + struct address_space *i_mapping; + if (aio_ring_file) { truncate_setsize(aio_ring_file->f_inode, 0); /* Prevent further access to the kioctx from migratepages */ - spin_lock(&aio_ring_file->f_inode->i_mapping->private_lock); - aio_ring_file->f_inode->i_mapping->private_data = NULL; + i_mapping = aio_ring_file->f_inode->i_mapping; + spin_lock(&i_mapping->private_lock); + i_mapping->private_data = NULL; ctx->aio_ring_file = NULL; - spin_unlock(&aio_ring_file->f_inode->i_mapping->private_lock); + spin_unlock(&i_mapping->private_lock); fput(aio_ring_file); } -- 2.1.4
[PATCH] usb: musb: remove redundant stack buffers
aDate is always the empty string, so entirely pointless. The aRevision formatting might as well be done as part of the pr_debug() call - that also avoids it altogether if pr_debug is compiled out. Signed-off-by: Rasmus Villemoes --- drivers/usb/musb/musb_core.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/drivers/usb/musb/musb_core.c b/drivers/usb/musb/musb_core.c index 74fc3069cb42..1724f1889c99 100644 --- a/drivers/usb/musb/musb_core.c +++ b/drivers/usb/musb/musb_core.c @@ -1448,7 +1448,7 @@ static int musb_core_init(u16 musb_type, struct musb *musb) { u8 reg; char *type; - char aInfo[90], aRevision[32], aDate[12]; + char aInfo[90]; void __iomem*mbase = musb->mregs; int status = 0; int i; @@ -1482,7 +1482,6 @@ static int musb_core_init(u16 musb_type, struct musb *musb) pr_debug("%s: ConfigData=0x%02x (%s)\n", musb_driver_name, reg, aInfo); - aDate[0] = 0; if (MUSB_CONTROLLER_MHDRC == musb_type) { musb->is_multipoint = 1; type = "M"; @@ -1497,11 +1496,10 @@ static int musb_core_init(u16 musb_type, struct musb *musb) /* log release info */ musb->hwvers = musb_read_hwvers(mbase); - snprintf(aRevision, 32, "%d.%d%s", MUSB_HWVERS_MAJOR(musb->hwvers), - MUSB_HWVERS_MINOR(musb->hwvers), - (musb->hwvers & MUSB_HWVERS_RC) ? "RC" : ""); - pr_debug("%s: %sHDRC RTL version %s %s\n", -musb_driver_name, type, aRevision, aDate); + pr_debug("%s: %sHDRC RTL version %d.%d%s\n", +musb_driver_name, type, MUSB_HWVERS_MAJOR(musb->hwvers), +MUSB_HWVERS_MINOR(musb->hwvers), +(musb->hwvers & MUSB_HWVERS_RC) ? "RC" : ""); /* configure ep0 */ musb_configure_ep0(musb); -- 2.1.4
[PATCH] mm/page_alloc.c: eliminate unsigned confusion in __rmqueue_fallback
Since current_order starts as MAX_ORDER-1 and is then only decremented, the second half of the loop condition seems superfluous. However, if order is 0, we may decrement current_order past 0, making it UINT_MAX. This is obviously too subtle ([1], [2]). Since we need to add some comment anyway, change the two variables to signed, making the counting-down for loop look more familiar, and apparently also making gcc generate slightly smaller code. [1] https://lkml.org/lkml/2016/6/20/493 [2] https://lkml.org/lkml/2017/6/19/345 Signed-off-by: Rasmus Villemoes --- Michal, something like this, perhaps? mm/page_alloc.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 2302f250d6b1..e656f4da9772 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2204,19 +2204,23 @@ static bool unreserve_highatomic_pageblock(const struct alloc_context *ac, * list of requested migratetype, possibly along with other pages from the same * block, depending on fragmentation avoidance heuristics. Returns true if * fallback was found so that __rmqueue_smallest() can grab it. + * + * The use of signed ints for order and current_order is a deliberate + * deviation from the rest of this file, to make the for loop + * condition simpler. */ static inline bool -__rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) +__rmqueue_fallback(struct zone *zone, int order, int start_migratetype) { struct free_area *area; - unsigned int current_order; + int current_order; struct page *page; int fallback_mt; bool can_steal; /* Find the largest possible block of pages in the other list */ for (current_order = MAX_ORDER-1; - current_order >= order && current_order <= MAX_ORDER-1; + current_order >= order; --current_order) { area = &(zone->free_area[current_order]); fallback_mt = find_suitable_fallback(area, current_order, -- 2.11.0
Re: [PATCH v5 2/3] watchdog: introduce watchdog.open_timeout commandline parameter
On 2017-05-25 02:56, Guenter Roeck wrote: > On 05/22/2017 07:06 AM, Rasmus Villemoes wrote: >> diff --git a/Documentation/watchdog/watchdog-parameters.txt >> b/Documentation/watchdog/watchdog-parameters.txt >> index 914518a..4801ec6 100644 >> --- a/Documentation/watchdog/watchdog-parameters.txt >> +++ b/Documentation/watchdog/watchdog-parameters.txt >> @@ -8,6 +8,15 @@ See Documentation/admin-guide/kernel-parameters.rst >> for information on >> providing kernel parameters for builtin drivers versus loadable >> modules. >> +The watchdog core currently understands one parameter, > > We have a second parameter queued, handle_boot_enabled. Please rephrase > so we don't have to change the text with each new parameter. I agree that it makes sense to rephrase to avoid having to edit when other parameters are added, and I'll send v6 in a moment. But regarding the handle_boot_enabled, I think my patch set implements a superset of that functionality (just set watchdog.open_timeout to 1ms), which is why I asked Sebastian specifically to look at the patches, and he said that they'd work for his use case as well. So while I personally don't really care if both go in, it does seem a little silly and somewhat confusing to have two parameters to do more-or-less the same thing. Btw, where can I find the watchdog-next tree (or whatever it's called) to see what is queued up? -- Rasmus Villemoes Software Developer Prevas A/S Hedeager 1 DK-8200 Aarhus N +45 51210274 rasmus.villem...@prevas.dk www.prevas.dk
[PATCH v6 1/3] watchdog: introduce watchdog_worker_should_ping helper
This will be useful when the condition becomes slightly more complicated in the next patch. Reviewed-by: Guenter Roeck Reviewed-by: Esben Haabendal Signed-off-by: Rasmus Villemoes --- drivers/watchdog/watchdog_dev.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c index d5d2bbd..caa4b90 100644 --- a/drivers/watchdog/watchdog_dev.c +++ b/drivers/watchdog/watchdog_dev.c @@ -192,18 +192,23 @@ static int watchdog_ping(struct watchdog_device *wdd) return __watchdog_ping(wdd); } +static bool watchdog_worker_should_ping(struct watchdog_core_data *wd_data) +{ + struct watchdog_device *wdd = wd_data->wdd; + + return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd)); +} + static void watchdog_ping_work(struct work_struct *work) { struct watchdog_core_data *wd_data; - struct watchdog_device *wdd; wd_data = container_of(to_delayed_work(work), struct watchdog_core_data, work); mutex_lock(&wd_data->lock); - wdd = wd_data->wdd; - if (wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd))) - __watchdog_ping(wdd); + if (watchdog_worker_should_ping(wd_data)) + __watchdog_ping(wd_data->wdd); mutex_unlock(&wd_data->lock); } -- 2.7.4
[PATCH v6 0/3] watchdog: allow setting deadline for opening /dev/watchdogN
If a watchdog driver tells the framework that the device is running, the framework takes care of feeding the watchdog until userspace opens the device. If the userspace application which is supposed to do that never comes up properly, the watchdog is fed indefinitely by the kernel. This can be especially problematic for embedded devices. These patches allow one to set a maximum time for which the kernel will feed the watchdog, thus ensuring that either userspace has come up, or the board gets reset. This allows fallback logic in the bootloader to attempt some recovery (for example, if an automatic update is in progress, it could roll back to the previous version). The patches have been tested on a Raspberry Pi 2 and a Wandboard. v6 tweaks the wording in watchdog-parameters.txt to avoid having to update it if and when the watchdog core grows new parameters. It also adds a little more rationale to the commit messages for 2/3 and 3/3, and adds Reviewed-bys to 1/3 which is unchanged from v5. v5 is identical to v4 posted in January, just rebased to current master (v4.12-rc2). v4 is mostly identical to v1. The differences are that the ability to compile out this feature is removed, and the ability to set the default value for the watchdog.open_timeout command line parameter via Kconfig is split into a separate patch. Compared to v2/v3, this drops the ability to set the open_timeout via a device property; I'll leave implementing that to those who actually need it. Rasmus Villemoes (3): watchdog: introduce watchdog_worker_should_ping helper watchdog: introduce watchdog.open_timeout commandline parameter watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT Documentation/watchdog/watchdog-parameters.txt | 9 +++ drivers/watchdog/Kconfig | 9 +++ drivers/watchdog/watchdog_dev.c| 37 +++--- 3 files changed, 51 insertions(+), 4 deletions(-) -- 2.7.4
[PATCH v6 2/3] watchdog: introduce watchdog.open_timeout commandline parameter
The watchdog framework takes care of feeding a hardware watchdog until userspace opens /dev/watchdogN. If that never happens for some reason (buggy init script, corrupt root filesystem or whatnot) but the kernel itself is fine, the machine stays up indefinitely. This patch allows setting an upper limit for how long the kernel will take care of the watchdog, thus ensuring that the watchdog will eventually reset the machine. This is particularly useful for embedded devices where some fallback logic is implemented in the bootloader (e.g., use a different root partition, boot from network, ...). The open timeout is also used as a maximum time for an application to re-open /dev/watchdogN after closing it. A value of 0 (the default) means infinite timeout, preserving the current behaviour. The unit is milliseconds rather than seconds because that covers more use cases. For example, one can effectively disable the kernel handling by setting the open_timeout to 1 ms. There are also customers with very strict requirements that may want to set the open_timeout to something like 4500 ms, which combined with a hardware watchdog that must be pinged every 250 ms ensures userspace is up no more than 5 seconds after the bootloader hands control to the kernel (250 ms until the driver gets registered and kernel handling starts, 4500 ms of kernel handling, and then up to 250 ms from the last ping until userspace takes over). Signed-off-by: Rasmus Villemoes --- Documentation/watchdog/watchdog-parameters.txt | 8 drivers/watchdog/watchdog_dev.c| 26 +- 2 files changed, 33 insertions(+), 1 deletion(-) diff --git a/Documentation/watchdog/watchdog-parameters.txt b/Documentation/watchdog/watchdog-parameters.txt index 914518a..8577c27 100644 --- a/Documentation/watchdog/watchdog-parameters.txt +++ b/Documentation/watchdog/watchdog-parameters.txt @@ -8,6 +8,14 @@ See Documentation/admin-guide/kernel-parameters.rst for information on providing kernel parameters for builtin drivers versus loadable modules. +The watchdog core parameter watchdog.open_timeout is the maximum time, +in milliseconds, for which the watchdog framework will take care of +pinging a hardware watchdog until userspace opens the corresponding +/dev/watchdogN device. A value of 0 (the default) means an infinite +timeout. Setting this to a non-zero value can be useful to ensure that +either userspace comes up properly, or the board gets reset and allows +fallback logic in the bootloader to try something else. + - acquirewdt: diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c index caa4b90..c807067 100644 --- a/drivers/watchdog/watchdog_dev.c +++ b/drivers/watchdog/watchdog_dev.c @@ -66,6 +66,7 @@ struct watchdog_core_data { struct mutex lock; unsigned long last_keepalive; unsigned long last_hw_keepalive; + unsigned long open_deadline; struct delayed_work work; unsigned long status; /* Internal status bits */ #define _WDOG_DEV_OPEN 0 /* Opened ? */ @@ -80,6 +81,21 @@ static struct watchdog_core_data *old_wd_data; static struct workqueue_struct *watchdog_wq; +static unsigned open_timeout; +module_param(open_timeout, uint, 0644); + +static bool watchdog_past_open_deadline(struct watchdog_core_data *data) +{ + if (!open_timeout) + return false; + return time_is_before_jiffies(data->open_deadline); +} + +static void watchdog_set_open_deadline(struct watchdog_core_data *data) +{ + data->open_deadline = jiffies + msecs_to_jiffies(open_timeout); +} + static inline bool watchdog_need_worker(struct watchdog_device *wdd) { /* All variables in milli-seconds */ @@ -196,7 +212,13 @@ static bool watchdog_worker_should_ping(struct watchdog_core_data *wd_data) { struct watchdog_device *wdd = wd_data->wdd; - return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd)); + if (!wdd) + return false; + + if (watchdog_active(wdd)) + return true; + + return watchdog_hw_running(wdd) && !watchdog_past_open_deadline(wd_data); } static void watchdog_ping_work(struct work_struct *work) @@ -857,6 +879,7 @@ static int watchdog_release(struct inode *inode, struct file *file) watchdog_ping(wdd); } + watchdog_set_open_deadline(wd_data); watchdog_update_worker(wdd); /* make sure that /dev/watchdog can be re-opened */ @@ -955,6 +978,7 @@ static int watchdog_cdev_register(struct watchdog_device *wdd, dev_t devno) /* Record time of most recent heartbeat as 'just before now'. */ wd_data->last_hw_keepalive = jiffies - 1; + watchdog_set_open_deadline(wd_data); /* * If the watchdog is running, prevent its driver from being unloaded, -- 2.7.4
[PATCH v6 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT
This allows setting a default value for the watchdog.open_timeout commandline parameter via Kconfig. Some BSPs allow remote updating of the kernel image and root file system, but updating the bootloader requires physical access. Hence, if one has a firmware update that requires relaxing the watchdog.open_timeout a little, the value used must be baked into the kernel image itself and cannot come from the u-boot environment via the kernel command line. Being able to set the initial value in .config doesn't change the fact that the value on the command line, if present, takes precedence, and is of course immensely useful for development purposes while one has console acccess, as well as usable in the cases where one can make a permanent update of the kernel command line. Signed-off-by: Rasmus Villemoes --- Documentation/watchdog/watchdog-parameters.txt | 3 ++- drivers/watchdog/Kconfig | 9 + drivers/watchdog/watchdog_dev.c| 2 +- 3 files changed, 12 insertions(+), 2 deletions(-) diff --git a/Documentation/watchdog/watchdog-parameters.txt b/Documentation/watchdog/watchdog-parameters.txt index 8577c27..fa34625 100644 --- a/Documentation/watchdog/watchdog-parameters.txt +++ b/Documentation/watchdog/watchdog-parameters.txt @@ -11,7 +11,8 @@ modules. The watchdog core parameter watchdog.open_timeout is the maximum time, in milliseconds, for which the watchdog framework will take care of pinging a hardware watchdog until userspace opens the corresponding -/dev/watchdogN device. A value of 0 (the default) means an infinite +/dev/watchdogN device. The defalt value is +CONFIG_WATCHDOG_OPEN_TIMEOUT. A value of 0 means an infinite timeout. Setting this to a non-zero value can be useful to ensure that either userspace comes up properly, or the board gets reset and allows fallback logic in the bootloader to try something else. diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig index 8b9049d..11946fb 100644 --- a/drivers/watchdog/Kconfig +++ b/drivers/watchdog/Kconfig @@ -52,6 +52,15 @@ config WATCHDOG_SYSFS Say Y here if you want to enable watchdog device status read through sysfs attributes. +config WATCHDOG_OPEN_TIMEOUT + int "Timeout value for opening watchdog device" + default 0 + help + The maximum time, in milliseconds, for which the watchdog + framework takes care of pinging a hardware watchdog. A value + of 0 means infinite. The value set here can be overridden by + the commandline parameter "watchdog.open_timeout". + # # General Watchdog drivers # diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c index c807067..098b9cb 100644 --- a/drivers/watchdog/watchdog_dev.c +++ b/drivers/watchdog/watchdog_dev.c @@ -81,7 +81,7 @@ static struct watchdog_core_data *old_wd_data; static struct workqueue_struct *watchdog_wq; -static unsigned open_timeout; +static unsigned open_timeout = CONFIG_WATCHDOG_OPEN_TIMEOUT; module_param(open_timeout, uint, 0644); static bool watchdog_past_open_deadline(struct watchdog_core_data *data) -- 2.7.4
Re: [PATCH v7] printk: hash addresses printed with %p
On 25 October 2017 at 01:57, Tobin C. Harding wrote: > On Tue, Oct 24, 2017 at 09:25:20PM +0200, Rasmus Villemoes wrote: >> >> I haven't followed the discussion too closely, but has it been >> considered to exempt NULL from hashing? > [snip] > > The code in question is; > > static noinline_for_stack > char *pointer(const char *fmt, char *buf, char *end, void *ptr, > struct printf_spec spec) > { > const int default_width = 2 * sizeof(void *); > > if (!ptr && *fmt != 'K') { > /* > * Print (null) with the same width as a pointer so it makes > * tabular output look nice. > */ > if (spec.field_width == -1) > spec.field_width = default_width; > return string(buf, end, "(null)", spec); > } Ah, yes, I should have re-read the current code before commenting. So we're effectively already exempting NULL due to this early handling. Good, let's leave that. Regarding the tabular output: Ignore it, it's completely irrelevant to the hardening efforts (good work, btw), and probably completely irrelevant period. If anything I'd say the comment and the attempted alignment should just be killed. > This check and print "(null)" is at the wrong level of abstraction. If we > want tabular output to be > correct for _all_ pointer specifiers then spec.field_width (for NULL) should > be set to match whatever > field_width is used in the associated output function. Removing the NULL > check above would require > NULL checks adding to at least; > > resource_string() > bitmap_list_string() > bitmap_string() > mac_address_string() > ip4_addr_string() > ip4_addr_string_sa() > ip6_addr_string_sa() > uuid_string() > netdev_bits() > address_val() > dentry_name() > bdev_name() > ptr_to_id() No, please don't. The NULL check makes perfect sense early in pointer(), because many of these handlers would dereference the pointer, and while it's probably a bug to pass NULL to say %pD, it's obviously better to print (null) than crash. Adding NULL checks to each of these is error-prone and lots of bloat for no real value (consider that many of these can produce lots of variants of output, and e.g. dotted-decimal ip addresses don't even have a well-defined width at all). > My question is [snip] is it too trivial to matter? Yes. Rasmus
[PATCH 5/5] USB: Add format_template attribute to struct usb_class_driver
This serves as human-readable documentation as well as allowing the format_template plugin to complain about any static initializers of this struct member that do not have the same set of printf specifiers. Signed-off-by: Rasmus Villemoes --- include/linux/usb.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/usb.h b/include/linux/usb.h index 9c63792a8134..8d48c0cd1c01 100644 --- a/include/linux/usb.h +++ b/include/linux/usb.h @@ -1236,7 +1236,7 @@ extern struct bus_type usb_bus_type; * parameters used for them. */ struct usb_class_driver { - char *name; + char *name __format_template("foobar_%d"); char *(*devnode)(struct device *dev, umode_t *mode); const struct file_operations *fops; int minor_base; -- 2.11.0
[PATCH 1/5] plugins: implement format_template attribute
Most format strings in the kernel are string literals, so the compiler and other static analyzers can do type checking. This plugin covers a few of the remaining cases, by introducing a format_template attribute. Consider struct usb_class_driver. Its member 'name' is used as a format string in usb_register_dev(), and that use obviously expects that the format string contains a single "%d" (or maybe %u). So the idea is that we simply attach __format_template("%d") to the declaration of the name member of struct usb_class_driver. We can then check that any static initialization of that member is with a string literal with the same set of specifiers. This is what the plugin currently does. There are a number of things it also could/should do: - Check run-time assignments of literals to a __format_template annotated struct member. - Use this at call sites of *printf functions, where the format string argument is a __format_template annotated struct member - that is, use the template in lieu of a string literal. For now, this is not implemented - mostly because I'm lazy and don't want to write my own format checking code (again), and I suppose there should be an internal gcc function I could (ab)use to say "check this variadic function call, but use _this_ as format string". - It should also be possible to attach it to function parameters - e.g. the namefmt parameter to kthread_create_on_cpu should have __format_template("%u"); its only caller is __smpboot_create_thread which passes struct smp_hotplug_thread->thread_comm, which in turn should also have that attribute. Combined with the above, this would check the entire chain from initializer to sprintf(). While strictly speaking "%*s" and "%d %s" both expect (int, const char*), they're morally distinct, so I don't want to treat them as equivalent. If this is ever a problem, I think one should let the attribute take an optional flag argument, which could then control how strict or lax the checking should be. I'm not sure how much this affects compilation time, but there's not really any point in building with this all the time - it should suffice that the various build bots do it once in a while. Even without the plugin, the __format_template(...) in headers serve as concise documentation. Applying this attribute to smp_hotplug_thread::thread_comm and modifying kernel/watchdog.c slightly, an example of the error message produced for violations is: kernel/watchdog.c:528:1: error: initializer string 'watchdog/%u %d' contains extra format specifier '%d' compared to format template 'foobar/%u' Signed-off-by: Rasmus Villemoes --- arch/Kconfig | 18 ++ include/linux/compiler.h | 6 + scripts/Makefile.gcc-plugins | 2 + scripts/gcc-plugins/format_template_plugin.c | 331 +++ 4 files changed, 357 insertions(+) create mode 100644 scripts/gcc-plugins/format_template_plugin.c diff --git a/arch/Kconfig b/arch/Kconfig index 057370a0ac4e..71c582eaeb69 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -517,6 +517,24 @@ config GCC_PLUGIN_RANDSTRUCT_PERFORMANCE in structures. This reduces the performance hit of RANDSTRUCT at the cost of weakened randomization. +config GCC_PLUGIN_FORMAT_TEMPLATE + bool "Enable format_template attribute" + depends on GCC_PLUGINS + help + This plugin implements a format_template attribute which can + be attached to struct members which are supposed to hold a + (printf) format string. This allows the compiler to check + that (a) any string statically assigned to such a struct + member has format specifiers compatible with those in the + template and (b) when such a struct member is used as the + format argument to a printf function, use the template in + lieu of a string literal to do type checking of the variadic + arguments. + + Even without using the plugin, attaching the format_template + attribute can be beneficial, since it serves as + documentation. + config HAVE_CC_STACKPROTECTOR bool help diff --git a/include/linux/compiler.h b/include/linux/compiler.h index 202710420d6d..478914ad280b 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -625,4 +625,10 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s (_p1); \ }) +#ifdef HAVE_ATTRIBUTE_FORMAT_TEMPLATE +#define __format_template(x) __attribute__((__format_template__(x))) +#else +#define __format_template(x) +#endif + #endif /* __LINUX_COMPILER_H */ diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins index b2a95af7df18..2f9bc96aab90 100644 --- a/scripts/Makefile.gcc-plugins +++ b/scripts/Makefi
[PATCH 3/5] staging: speakup: Apply format_template attribute
This serves as human-readable documentation as well as allowing the format_template plugin to complain about any static initializers of this struct member that do not have the same set of printf specifiers. Signed-off-by: Rasmus Villemoes --- drivers/staging/speakup/spk_types.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/speakup/spk_types.h b/drivers/staging/speakup/spk_types.h index c50de6035a9a..156000c0c4d3 100644 --- a/drivers/staging/speakup/spk_types.h +++ b/drivers/staging/speakup/spk_types.h @@ -108,7 +108,7 @@ struct st_var_header { }; struct num_var_t { - char *synth_fmt; + char *synth_fmt __format_template("%d"); int default_val; int low; int high; -- 2.11.0
[PATCH 2/5] hwmon: (applesmc) Apply format_template attribute
This serves as human-readable documentation as well as allowing the format_template plugin to complain about any static initializers of this struct member that do not have the same set of printf specifiers. Signed-off-by: Rasmus Villemoes --- drivers/hwmon/applesmc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c index 5c677ba44014..9e6d23cba23e 100644 --- a/drivers/hwmon/applesmc.c +++ b/drivers/hwmon/applesmc.c @@ -110,7 +110,7 @@ struct applesmc_dev_attr { /* Dynamic device node group */ struct applesmc_node_group { - char *format; /* format string */ + char *format __format_template("%d"); /* format string */ void *show; /* show function */ void *store;/* store function */ int option; /* function argument */ -- 2.11.0
[PATCH 4/5] linux/smpboot.h: Apply format_template attribute
This serves as human-readable documentation as well as allowing the format_template plugin to complain about any static initializers of this struct member that do not have the same set of printf specifiers. Signed-off-by: Rasmus Villemoes --- include/linux/smpboot.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/smpboot.h b/include/linux/smpboot.h index c174844cf663..967285b9637d 100644 --- a/include/linux/smpboot.h +++ b/include/linux/smpboot.h @@ -42,7 +42,7 @@ struct smp_hotplug_thread { void(*unpark)(unsigned int cpu); cpumask_var_t cpumask; boolselfparking; - const char *thread_comm; + const char *thread_comm __format_template("foobar/%u"); }; int smpboot_register_percpu_thread_cpumask(struct smp_hotplug_thread *plug_thread, -- 2.11.0
[PATCH] irqdomain: drop pointless NULL check in virq_debug_show_one
We've already done data->irq and data->hwirq, and also unconditionally dereference data in the irq_data_get_irq_chip_data() call. Signed-off-by: Rasmus Villemoes --- kernel/irq/irqdomain.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c index ac4644e92b49..8aba7b4a2a91 100644 --- a/kernel/irq/irqdomain.c +++ b/kernel/irq/irqdomain.c @@ -921,8 +921,7 @@ static void virq_debug_show_one(struct seq_file *m, struct irq_desc *desc) chip = irq_data_get_irq_chip(data); seq_printf(m, "%-15s ", (chip && chip->name) ? chip->name : "none"); - seq_printf(m, data ? "0x%p " : " %p ", - irq_data_get_irq_chip_data(data)); + seq_printf(m, "0x%p ", irq_data_get_irq_chip_data(data)); seq_printf(m, " %c", (desc->action && desc->action->handler) ? '*' : ' '); direct = (irq == hwirq) && (irq < domain->revmap_direct_max_irq); -- 2.11.0
[PATCH 0/7] net: core: devname allocation cleanups
It's somewhat confusing to have both dev_alloc_name and dev_get_valid_name. I can't see why the former is less strict than the latter, so make them (or rather dev_alloc_name_ns and dev_get_valid_name) equivalent, hardening dev_alloc_name() a little. Obvious follow-up patches would be to only export one function, and make dev_alloc_name a static inline wrapper for that (whichever name is chosen for the exported interface). But maybe there is a good reason the two exported interfaces do different checking, so I'll refrain from including the trivial but tree-wide renaming in this series. Rasmus Villemoes (7): net: core: improve sanity checking in __dev_alloc_name net: core: move dev_alloc_name_ns a little higher net: core: eliminate dev_alloc_name{,_ns} code duplication net: core: drop pointless check in __dev_alloc_name net: core: check dev_valid_name in __dev_alloc_name net: core: maybe return -EEXIST in __dev_alloc_name net: core: dev_get_valid_name is now the same as dev_alloc_name_ns net/core/dev.c | 62 +- 1 file changed, 22 insertions(+), 40 deletions(-) -- 2.11.0
[PATCH 7/7] net: core: dev_get_valid_name is now the same as dev_alloc_name_ns
If name contains a %, it's easy to see that this patch doesn't change anything (other than eliminate the duplicate dev_valid_name call). Otherwise, we'll now just spend a little time in snprintf() copying name to the stack buffer allocated in dev_alloc_name_ns, and do the __dev_get_by_name using that buffer rather than name. Signed-off-by: Rasmus Villemoes --- net/core/dev.c | 14 +- 1 file changed, 1 insertion(+), 13 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 7c08b4ca7b76..e29eea26f9c1 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1144,19 +1144,7 @@ EXPORT_SYMBOL(dev_alloc_name); int dev_get_valid_name(struct net *net, struct net_device *dev, const char *name) { - BUG_ON(!net); - - if (!dev_valid_name(name)) - return -EINVAL; - - if (strchr(name, '%')) - return dev_alloc_name_ns(net, dev, name); - else if (__dev_get_by_name(net, name)) - return -EEXIST; - else if (dev->name != name) - strlcpy(dev->name, name, IFNAMSIZ); - - return 0; + return dev_alloc_name_ns(net, dev, name); } EXPORT_SYMBOL(dev_get_valid_name); -- 2.11.0
[PATCH 5/7] net: core: check dev_valid_name in __dev_alloc_name
We currently only exclude non-sysfs-friendly names via dev_get_valid_name; there doesn't seem to be a reason to allow such names when we're called via dev_alloc_name. This does duplicate the dev_valid_name check in the dev_get_valid_name() case; we'll fix that shortly. Signed-off-by: Rasmus Villemoes --- net/core/dev.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/core/dev.c b/net/core/dev.c index 14541b7a3195..c0a92cf27566 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1062,6 +1062,9 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf) unsigned long *inuse; struct net_device *d; + if (!dev_valid_name(name)) + return -EINVAL; + p = strchr(name, '%'); if (p) { /* -- 2.11.0
[PATCH 2/7] net: core: move dev_alloc_name_ns a little higher
No functional change. Signed-off-by: Rasmus Villemoes --- net/core/dev.c | 26 +- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 87e19804757b..240ae6bc1097 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1105,6 +1105,19 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf) return -ENFILE; } +static int dev_alloc_name_ns(struct net *net, +struct net_device *dev, +const char *name) +{ + char buf[IFNAMSIZ]; + int ret; + + ret = __dev_alloc_name(net, name, buf); + if (ret >= 0) + strlcpy(dev->name, buf, IFNAMSIZ); + return ret; +} + /** * dev_alloc_name - allocate a name for a device * @dev: device @@ -1134,19 +1147,6 @@ int dev_alloc_name(struct net_device *dev, const char *name) } EXPORT_SYMBOL(dev_alloc_name); -static int dev_alloc_name_ns(struct net *net, -struct net_device *dev, -const char *name) -{ - char buf[IFNAMSIZ]; - int ret; - - ret = __dev_alloc_name(net, name, buf); - if (ret >= 0) - strlcpy(dev->name, buf, IFNAMSIZ); - return ret; -} - int dev_get_valid_name(struct net *net, struct net_device *dev, const char *name) { -- 2.11.0
[PATCH 6/7] net: core: maybe return -EEXIST in __dev_alloc_name
If we're given format string with no %d, -EEXIST is a saner error code. Signed-off-by: Rasmus Villemoes --- net/core/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/core/dev.c b/net/core/dev.c index c0a92cf27566..7c08b4ca7b76 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1104,7 +1104,7 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf) * when the name is long and there isn't enough space left * for the digits, or if all bits are used. */ - return -ENFILE; + return p ? -ENFILE : -EEXIST; } static int dev_alloc_name_ns(struct net *net, -- 2.11.0
[PATCH 4/7] net: core: drop pointless check in __dev_alloc_name
The only caller passes a stack buffer as buf, so it won't equal the passed-in name. Moreover, we're already using buf as a scratch buffer inside the if (p) {} block, so if buf and name were the same, that snprintf() call would be overwriting its own format string. Signed-off-by: Rasmus Villemoes --- net/core/dev.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 1077bfe97bde..14541b7a3195 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1093,8 +1093,7 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf) free_page((unsigned long) inuse); } - if (buf != name) - snprintf(buf, IFNAMSIZ, name, i); + snprintf(buf, IFNAMSIZ, name, i); if (!__dev_get_by_name(net, buf)) return i; -- 2.11.0
[PATCH 3/7] net: core: eliminate dev_alloc_name{,_ns} code duplication
dev_alloc_name contained a BUG_ON(), which I moved to dev_alloc_name_ns; the only other caller of that already has the same BUG_ON. Signed-off-by: Rasmus Villemoes --- net/core/dev.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 240ae6bc1097..1077bfe97bde 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1112,6 +1112,7 @@ static int dev_alloc_name_ns(struct net *net, char buf[IFNAMSIZ]; int ret; + BUG_ON(!net); ret = __dev_alloc_name(net, name, buf); if (ret >= 0) strlcpy(dev->name, buf, IFNAMSIZ); @@ -1134,16 +1135,7 @@ static int dev_alloc_name_ns(struct net *net, int dev_alloc_name(struct net_device *dev, const char *name) { - char buf[IFNAMSIZ]; - struct net *net; - int ret; - - BUG_ON(!dev_net(dev)); - net = dev_net(dev); - ret = __dev_alloc_name(net, name, buf); - if (ret >= 0) - strlcpy(dev->name, buf, IFNAMSIZ); - return ret; + return dev_alloc_name_ns(dev_net(dev), dev, name); } EXPORT_SYMBOL(dev_alloc_name); -- 2.11.0
[PATCH 1/7] net: core: improve sanity checking in __dev_alloc_name
__dev_alloc_name is called from the public (and exported) dev_alloc_name(), so we don't have a guarantee that strlen(name) is at most IFNAMSIZ. If somebody manages to get __dev_alloc_name called with a % char beyond the 31st character, we'd be making a snprintf() call that will very easily crash the kernel (using an appropriate %p extension, we'll likely dereference some completely bogus pointer). In the normal case where strlen() is sane, we don't even save anything by limiting to IFNAMSIZ, so just use strchr(). Signed-off-by: Rasmus Villemoes --- net/core/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/core/dev.c b/net/core/dev.c index 11596a302a26..87e19804757b 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1062,7 +1062,7 @@ static int __dev_alloc_name(struct net *net, const char *name, char *buf) unsigned long *inuse; struct net_device *d; - p = strnchr(name, IFNAMSIZ-1, '%'); + p = strchr(name, '%'); if (p) { /* * Verify the string as this thing may have come from -- 2.11.0
[PATCH] arm: dts: ls1021a: add reboot node to .dtsi
The LS1021A can be reset via the dcfg regmap in the same way as the arm64 layerscape SoCs, so add the corresponding DT node. Signed-off-by: Rasmus Villemoes --- arch/arm/boot/dts/ls1021a.dtsi | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi index 9319e1f..3ff2b8a 100644 --- a/arch/arm/boot/dts/ls1021a.dtsi +++ b/arch/arm/boot/dts/ls1021a.dtsi @@ -108,6 +108,13 @@ ; }; + reboot { + compatible = "syscon-reboot"; + regmap = <&dcfg>; + offset = <0xb0>; + mask = <0x02>; + }; + soc { compatible = "simple-bus"; #address-cells = <2>; -- 2.7.4
[PATCH] arm: dts: ls1021a: add "fsl,ls1021a-esdhc" compatible string to esdhc node
Commit a22950c888e3 (mmc: sdhci-of-esdhc: add quirk SDHCI_QUIRK_BROKEN_TIMEOUT_VAL for ls1021a) added logic to the driver to enable the broken timeout val quirk for ls1021a, but did not add the corresponding compatible string to the device tree, so it didn't really have any effect. Fix that. Signed-off-by: Rasmus Villemoes --- arch/arm/boot/dts/ls1021a.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi index 3ff2b8a9f01a..9cdec545decc 100644 --- a/arch/arm/boot/dts/ls1021a.dtsi +++ b/arch/arm/boot/dts/ls1021a.dtsi @@ -162,7 +162,7 @@ }; esdhc: esdhc@156 { - compatible = "fsl,esdhc"; + compatible = "fsl,ls1021a-esdhc", "fsl,esdhc"; reg = <0x0 0x156 0x0 0x1>; interrupts = ; clock-frequency = <0>; -- 2.7.4
bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On Thu, Nov 09 2017, Linus Torvalds wrote: > The code disassembles to > >0: 83 c9 08 or $0x8,%ecx >3: 40 f6 c6 04 test $0x4,%sil >7: 0f 45 d1 cmovne %ecx,%edx >a: 89 d1mov%edx,%ecx >c: 80 cd 04 or $0x4,%ch >f: 40 f6 c6 08 test $0x8,%sil > 13: 0f 45 d1 cmovne %ecx,%edx > 16: 89 d1mov%edx,%ecx > 18: 80 cd 08 or $0x8,%ch > 1b: 40 f6 c6 10 test $0x10,%sil > 1f: 0f 45 d1 cmovne %ecx,%edx > 22: 89 d1mov%edx,%ecx > 24: 80 cd 10 or $0x10,%ch > 27: 83 e6 20 and$0x20,%esi > 2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction > 31: 0f 45 d1 cmovne %ecx,%edx > 34: 83 ca 20 or $0x20,%edx > 37: 89 f1mov%esi,%ecx > 39: 83 e1 10 and$0x10,%ecx > 3c: 89 cfmov%ecx,%edi > > and all those odd cmovne and bit-ops are just the bit selection code > in flags_by_mnt(), which is inlined through calculate_f_flags (which > is _also_ inlined) into vfs_statfs(). > > Sadly, gcc makes a mess of it and actually generates code that looks > like the original C. I would have hoped that gcc could have turned > >if (x & BIT) > y |= OTHER_BIT; > > into > > y |= (x & BIT) shifted-by-the-bit-difference-between BIT/OTHER_BIT; > > but that doesn't happen. Actually, new enough gcc (7.1, I think) does contain a pattern that does this, but unfortunately only if one spells it y |= (x & BIT) ? OTHER_BIT : 0; which is half-way to doing it by hand, I suppose. Doing the - if (mnt_flags & MNT_READONLY) - flags |= ST_RDONLY; + flags |= (mnt_flags & MNT_READONLY) ? ST_RDONLY : 0; and pasting into godbolt.org, one can apparently get gcc to compile it to flags_by_mnt(int): leal (%rdi,%rdi), %edx movl %edi, %eax sarl $6, %eax movl %edx, %ecx andl $1, %eax andl $12, %edx andl $2, %ecx orl %ecx, %eax orl %eax, %edx movl %edi, %eax sall $7, %eax andl $7168, %eax orl %edx, %eax ret Rasmus
Re: [PATCH] lib: Avoid redundant sizeof checking in __bitmap_weight() calculation.
On 14 November 2017 at 07:57, Rakib Mullick wrote: > Currently, during __bitmap_weight() calculation hweight_long() is used. > Inside a hweight_long() a check has been made to figure out whether a > hweight32() or hweight64() version to use. > > diff --git a/lib/bitmap.c b/lib/bitmap.c > index d8f0c09..552096f 100644 > --- a/lib/bitmap.c > +++ b/lib/bitmap.c > @@ -241,10 +241,15 @@ EXPORT_SYMBOL(__bitmap_subset); > int __bitmap_weight(const unsigned long *bitmap, unsigned int bits) > { > unsigned int k, lim = bits/BITS_PER_LONG; > - int w = 0; > - > - for (k = 0; k < lim; k++) > - w += hweight_long(bitmap[k]); > + int w = 0, is32 = sizeof(bitmap[0]) ? 1 : 0; > + hint: sizeof() very rarely evaluates to zero... So this is the same as "is32 = 1". So the patch as-is is broken (and may explain the 1-byte delta in vmlinux). But even if this condition is fixed, the patch doesn't change anything, since the sizeof() evaluation is done at compile-time, regardless of whether it happens inside the inlined hweight_long or outside. So it is certainly not worth it to duplicate the loop. Rasmus
[PATCH] arm: dts: ls1021a: Add label to USB controllers
From: Esben Haabendal Add usb2 and usb3 labels to USB2 and USB3 controller device tree nodes, for easier modification in board dts files. Signed-off-by: Esben Haabendal Signed-off-by: Rasmus Villemoes --- arch/arm/boot/dts/ls1021a.dtsi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi index 9cdec545decc..50ad4dccb22b 100644 --- a/arch/arm/boot/dts/ls1021a.dtsi +++ b/arch/arm/boot/dts/ls1021a.dtsi @@ -675,7 +675,7 @@ }; }; - usb@860 { + usb2: usb@860 { compatible = "fsl-usb2-dr-v2.5", "fsl-usb2-dr"; reg = <0x0 0x860 0x0 0x1000>; interrupts = ; @@ -683,7 +683,7 @@ phy_type = "ulpi"; }; - usb3@310 { + usb3: usb3@310 { compatible = "snps,dwc3"; reg = <0x0 0x310 0x0 0x1>; interrupts = ; -- 2.7.4
[PATCH] arm: dts: ls1021a: Specify interrupt-affinity for pmu node
From: Esben Haabendal This avoids the warning hw perfevents: no interrupt-affinity property for /pmu, guessing. Signed-off-by: Esben Haabendal [RV: adapt commit log to the warning emitted in current mainline] Signed-off-by: Rasmus Villemoes --- arch/arm/boot/dts/ls1021a.dtsi | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi index 9319e1f..d62af07 100644 --- a/arch/arm/boot/dts/ls1021a.dtsi +++ b/arch/arm/boot/dts/ls1021a.dtsi @@ -106,6 +106,7 @@ compatible = "arm,cortex-a7-pmu"; interrupts = , ; + interrupt-affinity = <&cpu0>, <&cpu1>; }; soc { -- 2.7.4
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On 14 November 2017 at 00:54, Linus Torvalds wrote: > On Mon, Nov 13, 2017 at 3:30 PM, Linus Torvalds > wrote: >> >> So let's just rewrite that mnt_flags conversion that way, justr to get >> gcc to generate the obvious code. > > Oh wow. I tried to do the same thing in fs/namespace.c where it does > the reverse bit translation, and gcc makes a _horrible_ mess of it and > actually makes the code much worse because for some reason the pattern > doesn't trigger. [trimming cc list] Can you be more specific? That's not what I see with gcc 7.1. I've found two blocks where I replaced the if's with the ternary expressions, one in clone_mnt, one in do_mount. In clone_mnt, gcc-7.1 seems to do (first instruction is the unconditional |= MNT_LOCK_ATIME) 10d7: 0d 00 00 04 00 or $0x4,%eax 10dc: 89 43 30mov%eax,0x30(%rbx) 10df: a8 02 test $0x2,%al 10e1: 74 08 je 10eb 10e3: 0d 00 00 20 00 or $0x20,%eax 10e8: 89 43 30mov%eax,0x30(%rbx) 10eb: a8 01 test $0x1,%al 10ed: 74 08 je 10f7 10ef: 0d 00 00 10 00 or $0x10,%eax 10f4: 89 43 30mov%eax,0x30(%rbx) and after patching 10cc: 0d 00 00 04 00 or $0x4,%eax 10d1: 89 c2 mov%eax,%edx 10d3: c1 e2 10shl$0x10,%edx 10d6: 81 e2 00 00 40 00 and$0x40,%edx 10dc: 09 d0 or %edx,%eax 10de: 89 c2 mov%eax,%edx 10e0: c1 e2 14shl$0x14,%edx 10e3: 81 e2 00 00 20 00 and$0x20,%edx 10e9: 09 c2 or %eax,%edx (with a final store of %eax to 0x30(%rbx)). Either way it's four instructions per flag, but I assume the one without the branches is preferable. Similarly, in do_mount, before we have 3429: 44 89 f8mov%r15d,%eax 342c: 83 c8 01or $0x1,%eax 342f: 40 f6 c5 02 test $0x2,%bpl 3433: 44 0f 45 f8 cmovne %eax,%r15d 3437: 44 89 f8mov%r15d,%eax 343a: 83 c8 02or $0x2,%eax 343d: 40 f6 c5 04 test $0x4,%bpl 3441: 44 0f 45 f8 cmovne %eax,%r15d but after patching it does something like 3425: 4d 89 femov%r15,%r14 3428: 48 c1 ea 07 shr$0x7,%rdx 342c: 49 d1 eeshr%r14 342f: 89 d0 mov%edx,%eax 3431: c1 e1 05shl$0x5,%ecx 3434: 83 e0 08and$0x8,%eax 3437: 41 83 e6 07 and$0x7,%r14d 343b: 41 09 c6or %eax,%r14d 343e: 89 d0 mov%edx,%eax which actually makes use of MS_{NOSUID,NODEV,NOEXEC} all being 1 bit off from their MNT_ counterparts (witness the shr %r14 and the and with 0x7), so there we also cut 37 bytes according to bloat-o-meter. Now, it does seem that older (and not that old in absolute terms) compilers may generate worse code after the transformation - the 'replace with shift-mask-or' pattern doesn't exist until 7.1. E.g. with 5.4 we have $ scripts/bloat-o-meter fs/namespace.o.{0,1}-5.4 add/remove: 0/0 grow/shrink: 2/0 up/down: 50/0 (50) function old new delta do_mount31533195 +42 clone_mnt768 776 +8 5.4 compiles the "if() ..." construction roughly as 7.1, i.e. with cmovs, but the ternary expressions are transformed into this abomination 336f: 48 89 damov%rbx,%rdx 3372: 83 e2 04and$0x4,%edx 3375: 48 83 fa 01 cmp$0x1,%rdx 3379: 19 d2 sbb%edx,%edx 337b: f7 d2 not%edx 337d: 83 e2 02and$0x2,%edx 3380: 09 d0 or %edx,%eax 3382: 48 89 damov%rbx,%rdx 3385: 83 e2 08and$0x8,%edx 3388: 48 83 fa 01 cmp$0x1,%rdx 338c: 19 d2 sbb%edx,%edx 338e: f7 d2 not%edx 3390: 83 e2 04and$0x4,%edx 3393: 09 d0 or %edx,%eax Was it something like this you saw? Rasmus
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On 14 November 2017 at 23:43, Linus Torvalds wrote: > On Tue, Nov 14, 2017 at 2:24 PM, Rasmus Villemoes > wrote: >> >> Can you be more specific? That's not what I see with gcc 7.1. > > I have gcc-7.2.1, and it made a horrible mess of the do_mount() code. Odd. 7.2 and 7.1 (both of which I've just compiled from source, no special configure flags or anything) generate exactly the same (good) code for fs/namespace.o after patching. I also tried with CONFIG_CC_OPTIMIZE_FOR_SIZE=y just in case, but that still produces reasonable code. Oh well. Rasmus
[PATCH] drivers/char/random.c: remove unused dont_count_entropy
Ever since "random: kill dead extract_state struct" [1], the dont_count_entropy member of struct timer_rand_state has been effectively unused. Since it hasn't found a new use in 12 years, it's probably safe to finally kill it. [1] Pre-git, https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=c1c48e61c251f57e7a3f1bf11b3c462b2de9dcb5 Signed-off-by: Rasmus Villemoes --- drivers/char/random.c | 53 --- 1 file changed, 25 insertions(+), 28 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index 8ad92707e45f..cc42392c74dc 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -964,7 +964,6 @@ static ssize_t extract_crng_user(void __user *buf, size_t nbytes) struct timer_rand_state { cycles_t last_time; long last_delta, last_delta2; - unsigned dont_count_entropy:1; }; #define INIT_TIMER_RAND_STATE { INITIAL_JIFFIES, }; @@ -1030,35 +1029,33 @@ static void add_timer_randomness(struct timer_rand_state *state, unsigned num) * We take into account the first, second and third-order deltas * in order to make our estimate. */ + delta = sample.jiffies - state->last_time; + state->last_time = sample.jiffies; + + delta2 = delta - state->last_delta; + state->last_delta = delta; + + delta3 = delta2 - state->last_delta2; + state->last_delta2 = delta2; + + if (delta < 0) + delta = -delta; + if (delta2 < 0) + delta2 = -delta2; + if (delta3 < 0) + delta3 = -delta3; + if (delta > delta2) + delta = delta2; + if (delta > delta3) + delta = delta3; - if (!state->dont_count_entropy) { - delta = sample.jiffies - state->last_time; - state->last_time = sample.jiffies; - - delta2 = delta - state->last_delta; - state->last_delta = delta; - - delta3 = delta2 - state->last_delta2; - state->last_delta2 = delta2; - - if (delta < 0) - delta = -delta; - if (delta2 < 0) - delta2 = -delta2; - if (delta3 < 0) - delta3 = -delta3; - if (delta > delta2) - delta = delta2; - if (delta > delta3) - delta = delta3; + /* +* delta is now minimum absolute delta. +* Round down by 1 bit on general principles, +* and limit entropy entimate to 12 bits. +*/ + credit_entropy_bits(r, min_t(int, fls(delta>>1), 11)); - /* -* delta is now minimum absolute delta. -* Round down by 1 bit on general principles, -* and limit entropy entimate to 12 bits. -*/ - credit_entropy_bits(r, min_t(int, fls(delta>>1), 11)); - } preempt_enable(); } -- 2.11.0
[PATCH] genirq: __setup_irq(): fix type of literal 1 in shift
If we ever get a value >= 31 from ffz(), we'd be invoking UB; in the > 31 case, probably assigning the same thread_mask to to multiple irqactions (at least on x86_64, where the shift count is implicitly truncated to 5 bits). In practice, I think the bug is mostly harmless, since when we first get ffz == 31, the RHS is - ignoring UB - (int)(0x8000), and sign-extension means that the 32nd irqaction just ends up eating the top 33 bits of thread_mask, so all subsequent __setup_irq calls would just end up hitting the -EBUSY branch. (AFAICT, no code seems to rely on ->thread_mask being a single bit; it's all whole-sale |= and &=). However, a sufficiently aggressive optimizer may use the UB of 1<<31 to decide that doesn't happen, and hence elide the sign-extension code, so that subsequent calls can indeed get ffz > 31. In any case, this is the right thing to do. Signed-off-by: Rasmus Villemoes --- kernel/irq/manage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c index 4bff6a10ae8e..a1a448e1760d 100644 --- a/kernel/irq/manage.c +++ b/kernel/irq/manage.c @@ -1305,7 +1305,7 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new) * thread_mask assigned. See the loop above which or's * all existing action->thread_mask bits. */ - new->thread_mask = 1 << ffz(thread_mask); + new->thread_mask = 1UL << ffz(thread_mask); } else if (new->handler == irq_default_primary_handler && !(desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)) { -- 2.11.0
Re: [PATCH] string: Improve the generic strlcpy() implementation
On Mon, Oct 05 2015, Ingo Molnar wrote: > * Linus Torvalds wrote: > >> So I finally pulled it. I like the patch, I like the new interface, >> but despite that I wasn't really sure if I wanted to pull it in - thus >> the long delay of me just seeing this in my list of pending pulls for >> almost a month, but never really getting to the point where I decided >> I want to commit to it. > > Interesting. I noticed that strscpy() says this in its comments: > > * In addition, the implementation is robust to the string changing out > * from underneath it, unlike the current strlcpy() implementation. > > The strscpy() interface is very nice, but shouldn't we also fix this > strlcpy() > unrobustness/race it refers to, in light of the 2000+ existing strlcpy() call > sites? How about every single occurence of %s in a format string? vsnprintf also has that "issue", but has it actually ever been a problem? The window for something bad to happen is probably also much larger in the printf case, and especially when when some %p extension is used and/or the vsnprintf user is kasprintf() (where we 'replay' the formatting, having hopefully obtained a correct-sized buffer). In fact, writing this, it occurs to me that we should probably check the return value of the second vsnprintf call in kasprintf and compare to the first, issuing a warning if they don't match. I'm not against making strlcpy more robust, but I think the theoretical race is far more likely to manifest through a member of the printf family. Note that, unless one cares for performance or worries about 2G+ length strings, strlcpy could just be 'return snprintf(dst, len, "%s", src);', which would give the "check for insanely large/negative len" for free [though not giving strlen(src) as return value - but the caller is much more likely to be tripped up by no copying having taken place anyway]. > Another problem is that strlcpy() will also happily do bad stuff if we pass > it a negative size. Instead of that we will from now on print a (one time) > warning and return safely. Well, not too sure about that 'safely'. If the caller somehow managed to compute an insanely large (remaining) capacity in the buffer and has that in a size_t variable, then proceeds to comparing the return value to the supposed buffer size to check for overflow, he will think that everything is fine and proceed to using likely uninitialized contents of his buffer. I think a return value of 0 might be slightly better. Assuming the caller has the capacity in a signed variable (so it only became huge by being converted to size_t) and makes a signed comparison with the return value, both 0 and strlen() triggers an overflow check, so we wouldn't be worse off in that case. Clearly the same is true if the return value is not used at all. If the return value is used mindlessly for advancing dst and decrementing the capacity, staying put is probably better. Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 2/6] kernel/cpu.c: change type of cpu_possible_bits and friends
Change cpu_possible_bits and friends (online, present, active) from being bitmaps that happen to have the right size to actually being struct cpumasks. Also rename them to __cpu_xyz_mask. This is mostly a small cleanup in preparation for exporting them and, eventually, eliminating the extra indirection through the cpu_xyz_mask variables. Acked-by: Rusty Russell Signed-off-by: Rasmus Villemoes --- kernel/cpu.c | 44 ++-- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/kernel/cpu.c b/kernel/cpu.c index 82cf9dff4295..fea4a3ce3011 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -772,71 +772,71 @@ const DECLARE_BITMAP(cpu_all_bits, NR_CPUS) = CPU_BITS_ALL; EXPORT_SYMBOL(cpu_all_bits); #ifdef CONFIG_INIT_ALL_POSSIBLE -static DECLARE_BITMAP(cpu_possible_bits, CONFIG_NR_CPUS) __read_mostly - = CPU_BITS_ALL; +static struct cpumask __cpu_possible_mask __read_mostly + = {CPU_BITS_ALL}; #else -static DECLARE_BITMAP(cpu_possible_bits, CONFIG_NR_CPUS) __read_mostly; +static struct cpumask __cpu_possible_mask __read_mostly; #endif -const struct cpumask *const cpu_possible_mask = to_cpumask(cpu_possible_bits); +const struct cpumask *const cpu_possible_mask = &__cpu_possible_mask; EXPORT_SYMBOL(cpu_possible_mask); -static DECLARE_BITMAP(cpu_online_bits, CONFIG_NR_CPUS) __read_mostly; -const struct cpumask *const cpu_online_mask = to_cpumask(cpu_online_bits); +static struct cpumask __cpu_online_mask __read_mostly; +const struct cpumask *const cpu_online_mask = &__cpu_online_mask; EXPORT_SYMBOL(cpu_online_mask); -static DECLARE_BITMAP(cpu_present_bits, CONFIG_NR_CPUS) __read_mostly; -const struct cpumask *const cpu_present_mask = to_cpumask(cpu_present_bits); +static struct cpumask __cpu_present_mask __read_mostly; +const struct cpumask *const cpu_present_mask = &__cpu_present_mask; EXPORT_SYMBOL(cpu_present_mask); -static DECLARE_BITMAP(cpu_active_bits, CONFIG_NR_CPUS) __read_mostly; -const struct cpumask *const cpu_active_mask = to_cpumask(cpu_active_bits); +static struct cpumask __cpu_active_mask __read_mostly; +const struct cpumask *const cpu_active_mask = &__cpu_active_mask; EXPORT_SYMBOL(cpu_active_mask); void set_cpu_possible(unsigned int cpu, bool possible) { if (possible) - cpumask_set_cpu(cpu, to_cpumask(cpu_possible_bits)); + cpumask_set_cpu(cpu, &__cpu_possible_mask); else - cpumask_clear_cpu(cpu, to_cpumask(cpu_possible_bits)); + cpumask_clear_cpu(cpu, &__cpu_possible_mask); } void set_cpu_present(unsigned int cpu, bool present) { if (present) - cpumask_set_cpu(cpu, to_cpumask(cpu_present_bits)); + cpumask_set_cpu(cpu, &__cpu_present_mask); else - cpumask_clear_cpu(cpu, to_cpumask(cpu_present_bits)); + cpumask_clear_cpu(cpu, &__cpu_present_mask); } void set_cpu_online(unsigned int cpu, bool online) { if (online) { - cpumask_set_cpu(cpu, to_cpumask(cpu_online_bits)); - cpumask_set_cpu(cpu, to_cpumask(cpu_active_bits)); + cpumask_set_cpu(cpu, &__cpu_online_mask); + cpumask_set_cpu(cpu, &__cpu_active_mask); } else { - cpumask_clear_cpu(cpu, to_cpumask(cpu_online_bits)); + cpumask_clear_cpu(cpu, &__cpu_online_mask); } } void set_cpu_active(unsigned int cpu, bool active) { if (active) - cpumask_set_cpu(cpu, to_cpumask(cpu_active_bits)); + cpumask_set_cpu(cpu, &__cpu_active_mask); else - cpumask_clear_cpu(cpu, to_cpumask(cpu_active_bits)); + cpumask_clear_cpu(cpu, &__cpu_active_mask); } void init_cpu_present(const struct cpumask *src) { - cpumask_copy(to_cpumask(cpu_present_bits), src); + cpumask_copy(&__cpu_present_mask, src); } void init_cpu_possible(const struct cpumask *src) { - cpumask_copy(to_cpumask(cpu_possible_bits), src); + cpumask_copy(&__cpu_possible_mask, src); } void init_cpu_online(const struct cpumask *src) { - cpumask_copy(to_cpumask(cpu_online_bits), src); + cpumask_copy(&__cpu_online_mask, src); } -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 5/6] kernel/cpu.c: eliminate cpu_*_mask
Replace the variables cpu_possible_mask, cpu_online_mask, cpu_present_mask and cpu_active_mask with macros expanding to expressions of the same type and value, eliminating some indirection. Acked-by: Rusty Russell Signed-off-by: Rasmus Villemoes --- include/linux/cpumask.h | 8 kernel/cpu.c| 8 2 files changed, 4 insertions(+), 12 deletions(-) diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h index d4545a1852f2..52ab539aefce 100644 --- a/include/linux/cpumask.h +++ b/include/linux/cpumask.h @@ -85,14 +85,14 @@ extern int nr_cpu_ids; *only one CPU. */ -extern const struct cpumask *const cpu_possible_mask; -extern const struct cpumask *const cpu_online_mask; -extern const struct cpumask *const cpu_present_mask; -extern const struct cpumask *const cpu_active_mask; extern struct cpumask __cpu_possible_mask; extern struct cpumask __cpu_online_mask; extern struct cpumask __cpu_present_mask; extern struct cpumask __cpu_active_mask; +#define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask) +#define cpu_online_mask ((const struct cpumask *)&__cpu_online_mask) +#define cpu_present_mask ((const struct cpumask *)&__cpu_present_mask) +#define cpu_active_mask ((const struct cpumask *)&__cpu_active_mask) #if NR_CPUS > 1 #define num_online_cpus() cpumask_weight(cpu_online_mask) diff --git a/kernel/cpu.c b/kernel/cpu.c index e08db26d351b..dd70f600442f 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -778,23 +778,15 @@ struct cpumask __cpu_possible_mask __read_mostly struct cpumask __cpu_possible_mask __read_mostly; #endif EXPORT_SYMBOL(__cpu_possible_mask); -const struct cpumask *const cpu_possible_mask = &__cpu_possible_mask; -EXPORT_SYMBOL(cpu_possible_mask); struct cpumask __cpu_online_mask __read_mostly; EXPORT_SYMBOL(__cpu_online_mask); -const struct cpumask *const cpu_online_mask = &__cpu_online_mask; -EXPORT_SYMBOL(cpu_online_mask); struct cpumask __cpu_present_mask __read_mostly; EXPORT_SYMBOL(__cpu_present_mask); -const struct cpumask *const cpu_present_mask = &__cpu_present_mask; -EXPORT_SYMBOL(cpu_present_mask); struct cpumask __cpu_active_mask __read_mostly; EXPORT_SYMBOL(__cpu_active_mask); -const struct cpumask *const cpu_active_mask = &__cpu_active_mask; -EXPORT_SYMBOL(cpu_active_mask); void set_cpu_possible(unsigned int cpu, bool possible) { -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 4/6] drivers/base/cpu.c: use __cpu_*_mask directly
The only user of the lvalue-ness of the cpu_*_mask variables is in drivers/base/cpu.c, and that is mostly a work-around for the fact that not even const variables can be used in static initialization. Now that the underlying struct cpumasks are exposed we can take their address. Acked-by: Rusty Russell Acked-by: Greg Kroah-Hartman Signed-off-by: Rasmus Villemoes --- drivers/base/cpu.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index 91bbb1959d8d..691eeea2f19a 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -200,7 +200,7 @@ static const struct attribute_group *hotplugable_cpu_attr_groups[] = { struct cpu_attr { struct device_attribute attr; - const struct cpumask *const * const map; + const struct cpumask *const map; }; static ssize_t show_cpus_attr(struct device *dev, @@ -209,7 +209,7 @@ static ssize_t show_cpus_attr(struct device *dev, { struct cpu_attr *ca = container_of(attr, struct cpu_attr, attr); - return cpumap_print_to_pagebuf(true, buf, *ca->map); + return cpumap_print_to_pagebuf(true, buf, ca->map); } #define _CPU_ATTR(name, map) \ @@ -217,9 +217,9 @@ static ssize_t show_cpus_attr(struct device *dev, /* Keep in sync with cpu_subsys_attrs */ static struct cpu_attr cpu_attrs[] = { - _CPU_ATTR(online, &cpu_online_mask), - _CPU_ATTR(possible, &cpu_possible_mask), - _CPU_ATTR(present, &cpu_present_mask), + _CPU_ATTR(online, &__cpu_online_mask), + _CPU_ATTR(possible, &__cpu_possible_mask), + _CPU_ATTR(present, &__cpu_present_mask), }; /* -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 6/6] kernel/cpu.c: make set_cpu_* static inlines
Almost all callers of the set_cpu_* functions pass an explicit true or false. Making them static inline thus replaces the function calls with a simple set_bit/clear_bit, saving some .text. Acked-by: Rusty Russell Signed-off-by: Rasmus Villemoes --- include/linux/cpumask.h | 43 +++ kernel/cpu.c| 34 -- 2 files changed, 39 insertions(+), 38 deletions(-) diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h index 52ab539aefce..fc14275ff34e 100644 --- a/include/linux/cpumask.h +++ b/include/linux/cpumask.h @@ -720,14 +720,49 @@ extern const DECLARE_BITMAP(cpu_all_bits, NR_CPUS); #define for_each_present_cpu(cpu) for_each_cpu((cpu), cpu_present_mask) /* Wrappers for arch boot code to manipulate normally-constant masks */ -void set_cpu_possible(unsigned int cpu, bool possible); -void set_cpu_present(unsigned int cpu, bool present); -void set_cpu_online(unsigned int cpu, bool online); -void set_cpu_active(unsigned int cpu, bool active); void init_cpu_present(const struct cpumask *src); void init_cpu_possible(const struct cpumask *src); void init_cpu_online(const struct cpumask *src); +static inline void +set_cpu_possible(unsigned int cpu, bool possible) +{ + if (possible) + cpumask_set_cpu(cpu, &__cpu_possible_mask); + else + cpumask_clear_cpu(cpu, &__cpu_possible_mask); +} + +static inline void +set_cpu_present(unsigned int cpu, bool present) +{ + if (present) + cpumask_set_cpu(cpu, &__cpu_present_mask); + else + cpumask_clear_cpu(cpu, &__cpu_present_mask); +} + +static inline void +set_cpu_online(unsigned int cpu, bool online) +{ + if (online) { + cpumask_set_cpu(cpu, &__cpu_online_mask); + cpumask_set_cpu(cpu, &__cpu_active_mask); + } else { + cpumask_clear_cpu(cpu, &__cpu_online_mask); + } +} + +static inline void +set_cpu_active(unsigned int cpu, bool active) +{ + if (active) + cpumask_set_cpu(cpu, &__cpu_active_mask); + else + cpumask_clear_cpu(cpu, &__cpu_active_mask); +} + + /** * to_cpumask - convert an NR_CPUS bitmap to a struct cpumask * * @bitmap: the bitmap diff --git a/kernel/cpu.c b/kernel/cpu.c index dd70f600442f..5210d80efc28 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -788,40 +788,6 @@ EXPORT_SYMBOL(__cpu_present_mask); struct cpumask __cpu_active_mask __read_mostly; EXPORT_SYMBOL(__cpu_active_mask); -void set_cpu_possible(unsigned int cpu, bool possible) -{ - if (possible) - cpumask_set_cpu(cpu, &__cpu_possible_mask); - else - cpumask_clear_cpu(cpu, &__cpu_possible_mask); -} - -void set_cpu_present(unsigned int cpu, bool present) -{ - if (present) - cpumask_set_cpu(cpu, &__cpu_present_mask); - else - cpumask_clear_cpu(cpu, &__cpu_present_mask); -} - -void set_cpu_online(unsigned int cpu, bool online) -{ - if (online) { - cpumask_set_cpu(cpu, &__cpu_online_mask); - cpumask_set_cpu(cpu, &__cpu_active_mask); - } else { - cpumask_clear_cpu(cpu, &__cpu_online_mask); - } -} - -void set_cpu_active(unsigned int cpu, bool active) -{ - if (active) - cpumask_set_cpu(cpu, &__cpu_active_mask); - else - cpumask_clear_cpu(cpu, &__cpu_active_mask); -} - void init_cpu_present(const struct cpumask *src) { cpumask_copy(&__cpu_present_mask, src); -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 3/6] kernel/cpu.c: export __cpu_*_mask
Exporting the cpumasks __cpu_possible_mask and friends will allow us to remove the extra indirection through the cpu_*_mask variables. It will also allow the set_cpu_* functions to become static inlines, which will give a .text reduction. Acked-by: Rusty Russell Signed-off-by: Rasmus Villemoes --- include/linux/cpumask.h | 4 kernel/cpu.c| 14 +- 2 files changed, 13 insertions(+), 5 deletions(-) diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h index 59915ea5373c..d4545a1852f2 100644 --- a/include/linux/cpumask.h +++ b/include/linux/cpumask.h @@ -89,6 +89,10 @@ extern const struct cpumask *const cpu_possible_mask; extern const struct cpumask *const cpu_online_mask; extern const struct cpumask *const cpu_present_mask; extern const struct cpumask *const cpu_active_mask; +extern struct cpumask __cpu_possible_mask; +extern struct cpumask __cpu_online_mask; +extern struct cpumask __cpu_present_mask; +extern struct cpumask __cpu_active_mask; #if NR_CPUS > 1 #define num_online_cpus() cpumask_weight(cpu_online_mask) diff --git a/kernel/cpu.c b/kernel/cpu.c index fea4a3ce3011..e08db26d351b 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -772,23 +772,27 @@ const DECLARE_BITMAP(cpu_all_bits, NR_CPUS) = CPU_BITS_ALL; EXPORT_SYMBOL(cpu_all_bits); #ifdef CONFIG_INIT_ALL_POSSIBLE -static struct cpumask __cpu_possible_mask __read_mostly +struct cpumask __cpu_possible_mask __read_mostly = {CPU_BITS_ALL}; #else -static struct cpumask __cpu_possible_mask __read_mostly; +struct cpumask __cpu_possible_mask __read_mostly; #endif +EXPORT_SYMBOL(__cpu_possible_mask); const struct cpumask *const cpu_possible_mask = &__cpu_possible_mask; EXPORT_SYMBOL(cpu_possible_mask); -static struct cpumask __cpu_online_mask __read_mostly; +struct cpumask __cpu_online_mask __read_mostly; +EXPORT_SYMBOL(__cpu_online_mask); const struct cpumask *const cpu_online_mask = &__cpu_online_mask; EXPORT_SYMBOL(cpu_online_mask); -static struct cpumask __cpu_present_mask __read_mostly; +struct cpumask __cpu_present_mask __read_mostly; +EXPORT_SYMBOL(__cpu_present_mask); const struct cpumask *const cpu_present_mask = &__cpu_present_mask; EXPORT_SYMBOL(cpu_present_mask); -static struct cpumask __cpu_active_mask __read_mostly; +struct cpumask __cpu_active_mask __read_mostly; +EXPORT_SYMBOL(__cpu_active_mask); const struct cpumask *const cpu_active_mask = &__cpu_active_mask; EXPORT_SYMBOL(cpu_active_mask); -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 0/6] kernel/cpu.c: eliminate some indirection
v2: fix build failure on ppc, add acks. The four cpumasks cpu_{possible,online,present,active}_bits are exposed readonly via the corresponding const variables cpu_xyz_mask. But they are also accessible for arbitrary writing via the exposed functions set_cpu_xyz. There's quite a bit of code throughout the kernel which iterates over or otherwise accesses these bitmaps, and having the access go via the cpu_xyz_mask variables is nowadays [1] simply a useless indirection. It may be that any problem in CS can be solved by an extra level of indirection, but that doesn't mean every extra indirection solves a problem. In this case, it even necessitates some minor ugliness (see 4/6). Patch 1/6 is new in v2, and fixes a build failure on ppc by renaming a struct member, to avoid problems when the identifier cpu_online_mask becomes a macro later in the series. The next four patches eliminate the cpu_xyz_mask variables by simply exposing the actual bitmaps, after renaming them to discourage direct access - that still happens through cpu_xyz_mask, which are now simply macros with the same type and value as they used to have. After that, there's no longer any reason to have the setter functions be out-of-line: The boolean parameter is almost always a literal true or false, so by making them static inlines they will usually compile to one or two instructions. For a defconfig build on x86_64, bloat-o-meter says we save ~3000 bytes. We also save a little stack (stackdelta says 127 functions have a 16 byte smaller stack frame, while two grow by that amount). Mostly because, when iterating over the mask, gcc typically loads the value of cpu_xyz_mask into a callee-saved register and from there into %rdi before each find_next_bit call - now it can just load the appropriate immediate address into %rdi before each call. [1] See Rusty's kind explanation http://thread.gmane.org/gmane.linux.kernel/2047078/focus=2047722 for some historic context. Rasmus Villemoes (6): powerpc/fadump: rename cpu_online_mask member of struct fadump_crash_info_header kernel/cpu.c: change type of cpu_possible_bits and friends kernel/cpu.c: export __cpu_*_mask drivers/base/cpu.c: use __cpu_*_mask directly kernel/cpu.c: eliminate cpu_*_mask kernel/cpu.c: make set_cpu_* static inlines arch/powerpc/include/asm/fadump.h | 2 +- arch/powerpc/kernel/fadump.c | 4 +-- drivers/base/cpu.c| 10 +++--- include/linux/cpumask.h | 55 - kernel/cpu.c | 64 --- 5 files changed, 68 insertions(+), 67 deletions(-) -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] scsi: move Additional Sense Codes to separate file
On Mon, Oct 05 2015, Bart Van Assche wrote: > On 10/05/15 02:26, Rasmus Villemoes wrote: >> -{0x041A, "Logical unit not ready, start stop unit command in " >> - "progress"}, >> -{0x041B, "Logical unit not ready, sanitize in progress"}, >> -{0x041C, "Logical unit not ready, additional power use not yet " >> - "granted"}, > > Please convert these multi-line strings into single line string > constants such that users can look up these easily with grep. I can certainly do that, but I'd prefer to do it in a separate patch. I really want to keep this as mechanical as possible. >> + >> +SENSE_CODE(0, NULL) > > The above looks confusing to me. Please leave this out and add { 0, > NULL } at the end of the additional[] array instead. OK, I agree that that would have been cleaner. However, the sentinel entry is removed in 2/2 (since we loop using ARRAY_SIZE instead of checking for the sentinel). Let me know if you want me to resend the 1600 line monster anyway with this addressed. Thanks, Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] scsi: reduce CONFIG_SCSI_CONSTANTS=y impact by 8k
On Mon, Oct 05 2015, Bart Van Assche wrote: > On 10/05/15 02:26, Rasmus Villemoes wrote: >> struct error_info { >> unsigned short code12; /* 0x0302 looks better than 0x03,0x02 */ >> -const char * text; >> +unsigned short size; >> }; > > Had you considered to use the type uint16_t instead of unsigned short ? > Yes, but I thought I'd keep it consistent with the other member. AFAIK, they're one and the same on all relevant arches. I actually think spelling it u16 for both members would make sense (for the code because it explicitly is meant to hold two bytes), but again I think that's better done as a trivial follow-up patch, if we really want to change this. Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: reduce CONFIG_SCSI_CONSTANTS impact by 4k
On Tue, Oct 06 2015, Julian Calaby wrote: > Hi Rasmus, > >> >> diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c >> index 47aaccd5e68e..ccd34b0481cd 100644 >> --- a/drivers/scsi/constants.c >> +++ b/drivers/scsi/constants.c >> @@ -292,17 +292,31 @@ bool scsi_opcode_sa_name(int opcode, int >> service_action, >> >> struct error_info { >> unsigned short code12; /* 0x0302 looks better than 0x03,0x02 */ >> - const char * text; >> + unsigned short size; >> }; >> >> >> +/* >> + * There are 700+ entries in this table. To save space, we don't store >> + * (code, pointer) pairs, which would make sizeof(struct >> + * error_info)==16 on 64 bits. Rather, the second element just stores >> + * the size (including \0) of the corresponding string, and we use the >> + * sum of these to get the appropriate offset into additional_text >> + * defined below. This approach saves 12 bytes per entry. >> + */ >> static const struct error_info additional[] = >> { >> -#define SENSE_CODE(c, s) {c, s}, >> +#define SENSE_CODE(c, s) {c, sizeof(s)}, >> #include "sense_codes.h" >> #undef SENSE_CODE >> }; >> >> +static const char *additional_text = >> +#define SENSE_CODE(c, s) s "\0" >> +#include "sense_codes.h" >> +#undef SENSE_CODE >> + ; >> + >> struct error_info2 { >> unsigned char code1, code2_min, code2_max; >> const char * str; >> @@ -364,11 +378,14 @@ scsi_extd_sense_format(unsigned char asc, unsigned >> char ascq, const char **fmt) >> { >> int i; >> unsigned short code = ((asc << 8) | ascq); >> + unsigned offset = 0; >> >> *fmt = NULL; >> - for (i = 0; additional[i].text; i++) >> + for (i = 0; i < ARRAY_SIZE(additional); i++) { >> if (additional[i].code12 == code) >> - return additional[i].text; >> + return additional_text + offset; >> + offset += additional[i].size; > > You don't seem to be accounting for the null bytes here. Well, no, I account for the nul bytes where I define the table (the comment actually says as much). sizeof("foo") is 4. Since additional_text ends up pointing to a string containing "foo" "\0" "xyzzy" "\0" "..." "\0" aka "foo\0xyzzy\0...\0" this is the right amount to skip. As I said in the cover letter, I did test this (so that I'd at least catch silly off-by-ones), and I do get the right strings out. Thanks, Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 1/6] powerpc/fadump: rename cpu_online_mask member of struct fadump_crash_info_header
As preparation for eliminating the indirect access to the various global cpu_*_bits bitmaps via the pointer variables cpu_*_mask, rename the cpu_online_mask member of struct fadump_crash_info_header to simply online_mask, thus allowing cpu_online_mask to become a macro. Acked-by: Michael Ellerman Signed-off-by: Rasmus Villemoes --- arch/powerpc/include/asm/fadump.h | 2 +- arch/powerpc/kernel/fadump.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h index 493e72f64b35..b4407d0add27 100644 --- a/arch/powerpc/include/asm/fadump.h +++ b/arch/powerpc/include/asm/fadump.h @@ -191,7 +191,7 @@ struct fadump_crash_info_header { u64 elfcorehdr_addr; u32 crashing_cpu; struct pt_regs regs; - struct cpumask cpu_online_mask; + struct cpumask online_mask; }; /* Crash memory ranges */ diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c index 26d091a1a54c..3cb3b02a13dd 100644 --- a/arch/powerpc/kernel/fadump.c +++ b/arch/powerpc/kernel/fadump.c @@ -415,7 +415,7 @@ void crash_fadump(struct pt_regs *regs, const char *str) else ppc_save_regs(&fdh->regs); - fdh->cpu_online_mask = *cpu_online_mask; + fdh->online_mask = *cpu_online_mask; /* Call ibm,os-term rtas call to trigger firmware assisted dump */ rtas_os_term((char *)str); @@ -646,7 +646,7 @@ static int __init fadump_build_cpu_notes(const struct fadump_mem_struct *fdm) } /* Lower 4 bytes of reg_value contains logical cpu id */ cpu = be64_to_cpu(reg_entry->reg_value) & FADUMP_CPU_ID_MASK; - if (fdh && !cpumask_test_cpu(cpu, &fdh->cpu_online_mask)) { + if (fdh && !cpumask_test_cpu(cpu, &fdh->online_mask)) { SKIP_TO_NEXT_CPU(reg_entry); continue; } -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] x86: dumpstack, use pr_cont
On Tue, Oct 06 2015, Jiri Slaby wrote: > When dumping flags with which the kernel was built, we print them one > by one in separate printks. Let's use pr_cont as they are > continuation prints. > > Signed-off-by: Jiri Slaby > Cc: Thomas Gleixner > Cc: Ingo Molnar > Cc: "H. Peter Anvin" > Cc: x...@kernel.org > --- > arch/x86/kernel/dumpstack.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c > index 9c30acfadae2..3850c992f767 100644 > --- a/arch/x86/kernel/dumpstack.c > +++ b/arch/x86/kernel/dumpstack.c > @@ -260,18 +260,18 @@ int __die(const char *str, struct pt_regs *regs, long > err) > printk(KERN_DEFAULT > "%s: %04lx [#%d] ", str, err & 0x, ++die_counter); > #ifdef CONFIG_PREEMPT > - printk("PREEMPT "); > + pr_cont("PREEMPT "); > #endif > #ifdef CONFIG_SMP > - printk("SMP "); > + pr_cont("SMP "); > #endif > #ifdef CONFIG_DEBUG_PAGEALLOC > - printk("DEBUG_PAGEALLOC "); > + pr_cont("DEBUG_PAGEALLOC"); cosmetic: this lost a space. May I suggest diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c index 9c30acfadae2..b473a47d1851 100644 --- a/arch/x86/kernel/dumpstack.c +++ b/arch/x86/kernel/dumpstack.c @@ -257,21 +257,23 @@ int __die(const char *str, struct pt_regs *regs, long err) unsigned short ss; unsigned long sp; #endif - printk(KERN_DEFAULT - "%s: %04lx [#%d] ", str, err & 0x, ++die_counter); + static const char build_flags[] = "" #ifdef CONFIG_PREEMPT - printk("PREEMPT "); + " PREEMPT" #endif #ifdef CONFIG_SMP - printk("SMP "); + " SMP" #endif #ifdef CONFIG_DEBUG_PAGEALLOC - printk("DEBUG_PAGEALLOC "); + " DEBUG_PAGEALLOC" #endif #ifdef CONFIG_KASAN - printk("KASAN"); + " KASAN" #endif - printk("\n"); + ; + printk(KERN_DEFAULT + "%s: %04lx [#%d]%s\n", str, err & 0x, ++die_counter, + build_flags); if (notify_die(DIE_OOPS, str, regs, err, current->thread.trap_nr, SIGSEGV) == NOTIFY_STOP) return 1; instead, so that there's only one printk call and the pr_cont issue goes away? Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/2] linux/kconfig.h: generalize IS_ENABLED logic
It's not hard to generalize the macro magic used to build the IS_ENABLED macro and friends to produce a few other potentially useful macros: CHOOSE_EXPR(CONFIG_FOO, expr): if CONFIG_FOO is set expands to expr, otherwise expands to nothing. CHOOSE_EXPR(CONFIG_FOO, expr1, expr2): if CONFIG_FOO is set, expands to expr1, otherwise expands to expr2. While the latter is roughly the same as __builtin_choose_expr(IS_ENABLED(CONFIG_FOO), expr1, expr2), the macro version has the advantage that expr1 and expr2 may be string literals, and they would preserve their ability to be concatenated with other string literals. For example, this little snippet #ifdef CONFIG_X86_64 " x86-tsc: TSC cycle counter\n" #endif from kernel/trace/trace.c (which is surrounded by other string literals) could be written as CHOOSE_EXPR(CONFIG_X86_64, " x86-tsc: TSC cycle counter\n") We're also not really restricted to expressions in the C sense; the only limitation I can see is that they cannot contain unparenthesized commas. (Obviously, if one starts getting too creative, readability will suffer rather than increase.) Similarly, we can define helpers for conditional struct members and their associated initializers. It would probably take some time to get used to reading, to pick another random example, struct task_struct { ... COND_DECLARATION(CONFIG_KASAN, unsigned int kasan_depth) ... } #define INIT_KASAN(tsk) COND_INITIALIZER(CONFIG_KASAN, .kasan_depth = 1) [and I'm certainly not proposing any mass conversion], but I think it might be nice to avoid lots of short #ifdef/#else/#endif sections. The above would replace 3 and 5 lines, respectively. Also, git grep'ing for CONFIG_KASAN currently just reveals that _something_ in sched.h and init_task.h depends on it; with the above, one could at least deduce that it's guarding a certain member of some struct. Namewise, I think CHOOSE_EXPR is appropriate because of its similarity to __builtin_choose_expr, but I'm not sure about the COND_* ones. Feel free to suggest better names, and/or to flame this idea to death. Signed-off-by: Rasmus Villemoes --- include/linux/kconfig.h | 75 + 1 file changed, 75 insertions(+) diff --git a/include/linux/kconfig.h b/include/linux/kconfig.h index b33c7797eb57..ac209814b111 100644 --- a/include/linux/kconfig.h +++ b/include/linux/kconfig.h @@ -51,4 +51,79 @@ #define IS_ENABLED(option) \ (IS_BUILTIN(option) || IS_MODULE(option)) +/* + * CHOOSE_EXPR, COND_DECLARATION and COND_INITIALIZER only work for + * boolean config options. + * + * CHOOSE_EXPR(CONFIG_FOO, expr): if CONFIG_FOO is set expands to + * expr, otherwise expands to nothing. + * + * CHOOSE_EXPR(CONFIG_FOO, expr1, expr2): if CONFIG_FOO is set, + * expands to expr1, otherwise expands to expr2. + * + * COND_DECLARATION(CONFIG_FOO, decl): if CONFIG_FOO is set, expands to + * + * decl; + * + * (a semicolon should not be part of decl), otherwise expands to + * nothing. + * + * COND_INITIALIZER(CONFIG_FOO, init): if CONFIG_FOO is set, expands to + * + * init, + * + * otherwise expands to nothing. + * + * CHOOSE_EXPR(CONFIG_FOO, expr1, expr2) is roughly equivalent to + * __builtin_choose_expr(IS_ENABLED(CONFIG_FOO), expr1, + * expr2). However, since the expansion is done by the preprocessor, + * expr1 and expr2 can be string literals which can then participate + * in string concatenation. Also, we're not really limited to + * expressions, and can choose to expand to nothing (this is also used + * internally by the COND_* macros). The only limitation is that expr1 + * and expr2 cannot contain unparenthesized commas. + * + * COND_DECLARATION can, for example, be used inside a struct + * declaration to eliminate a #ifdef/#endif pair. This would look + * something like + * + * struct foo { + * int a; + * COND_DECLARATION(CONFIG_FOO_DEBUG, int b) + * int c; + * }; + * + * COND_INITIALIZER is the companion for initializing such + * conditionally defined members, again for eliminating the bracketing + * #ifdef/#endif pair. + * + * struct foo f = { + * .a = 1, + * COND_INITIALIZER(CONFIG_FOO_DEBUG, .b = 2) + * .c = 3 + * }; + * + * This is mostly useful when only a single or a few members would be + * protected by the #ifdef/#endif. One advantage of the COND_* macros + * is that git grep'ing for CONFIG_FOO_DEBUG reveals more information + * (above, we would see that it protects the "b" member of some + * struct). + */ + +#define _COMMA , +#define _COND_PUNCTUATION_0(p) +#define _COND_PUNCTUATION_1(p) p + +#define CHOOSE_EXPR(cfg, expr, ...) _CHOOSE_EXPR(cfg, expr, ##__VA_ARGS__, /* empty defalt arg */) +#define _CHOOSE_EXPR(cfg, expr, def, ...) __CHOOSE_EXPR(__ARG_PLACEHOLDER_##cfg, expr, def) +#define __CHOOSE_EXPR(arg1_or_junk, expr, def) ___CHOOSE_EXPR(arg1_or_junk expr, def) +#define ___CHOOSE
[PATCH 2/2] x86: dumpstack: eliminate some #ifdefs
Jiri Slaby noted that these #ifdef protected printks should really be pr_cont. However, we might as well get completely rid of both the multiple printk calls and the tiny #ifdef sections by just building an appropriate string and passing that to the first printk call. Signed-off-by: Rasmus Villemoes --- Mostly an example of how 1/2 can be used. arch/x86/kernel/dumpstack.c | 23 +-- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c index 9c30acfadae2..6ae7e65e734e 100644 --- a/arch/x86/kernel/dumpstack.c +++ b/arch/x86/kernel/dumpstack.c @@ -257,21 +257,16 @@ int __die(const char *str, struct pt_regs *regs, long err) unsigned short ss; unsigned long sp; #endif + static const char build_flags[] = "" + CHOOSE_EXPR(CONFIG_PREEMPT, " PREEMPT") + CHOOSE_EXPR(CONFIG_SMP, " SMP") + CHOOSE_EXPR(CONFIG_DEBUG_PAGEALLOC, " DEBUG_PAGEALLOC") + CHOOSE_EXPR(CONFIG_KASAN, " KASAN"); + printk(KERN_DEFAULT - "%s: %04lx [#%d] ", str, err & 0x, ++die_counter); -#ifdef CONFIG_PREEMPT - printk("PREEMPT "); -#endif -#ifdef CONFIG_SMP - printk("SMP "); -#endif -#ifdef CONFIG_DEBUG_PAGEALLOC - printk("DEBUG_PAGEALLOC "); -#endif -#ifdef CONFIG_KASAN - printk("KASAN"); -#endif - printk("\n"); + "%s: %04lx [#%d]%s\n", str, err & 0x, ++die_counter, + build_flags); + if (notify_die(DIE_OOPS, str, regs, err, current->thread.trap_nr, SIGSEGV) == NOTIFY_STOP) return 1; -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] string: Improve the generic strlcpy() implementation
On Tue, Oct 06 2015, Ingo Molnar wrote: > * Rasmus Villemoes wrote: > >> >> I'm not against making strlcpy more robust, but I think the theoretical race >> is >> far more likely to manifest through a member of the printf family. > > So the printf family is generally less frequently used in ABI output than > string > copies, Huh? snprintf and friends are often used just to copy or concatenate strings (essentially, any format string containing no specifiers other than %s does exactly that) - even if a str*() function could do the same thing. I see no reason to believe this wouldn't also be done in cases where the final string ends up being presented to userspace. > but yeah, since there are 15,000 s[n]printf() calls in the kernel it's > more likely to be an issue not just by virtue of timing, but also by sheer > mass of > usage, statistically. Yes. > So I'd improve it all in the following order: > > - fix the strscpy() uninitialized use > > - base strlcpy() on strscpy() via the patch I sent. This makes all users > faster > and eliminates the theoretical race. I'm not so sure about that part. I'd strongly suspect that the vast majority of strings handled by strlcpy (and in the future strscpy) are shorter than 32 bytes, so is all the word_at_a_time and pre/post alignment yoga really worth it? strscpy is 299 bytes - that's a lot of instruction cache lines, and it will almost always be cache cold. This isn't an argument against basing strlcpy on strscpy (that would likely just make the former a little smaller), but I'd like to see numbers (cycle counts, distribution of input lengths, ...) before I believe in the performance argument. > - phase out 'simple' strlcpy() uses via an automated patch. This gets rid > of > 2,000 strlcpy() call sites in a single pass. That seems to be exactly the kind of mass-conversion Linus referred to above. Also, you can't really do it if strscpy keeps it __must_check annotation, as you'd then introduce 2000+ warnings... Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] string: Improve the generic strlcpy() implementation
On Wed, Oct 07 2015, Ingo Molnar wrote: > We have procfs and sysfs as well, where format strings are indeed dominant, > but > are you sure this race exists in snprintf() in that form? I.e. can the return > value of snprintf() be different from the true length of the output string, > if the > source string is modified in parallel? Well, if truncation has happened the return value is different (larger). But assuming the output buffer is large enough, the 'compute strlen, then do copying, potentially copying a nul byte which wasn't there moments before' is pretty obvious: lib/vsprintf.c: static noinline_for_stack char *string(char *buf, char *end, const char *s, struct printf_spec spec) { int len, i; if ((unsigned long)s < PAGE_SIZE) s = "(null)"; len = strnlen(s, spec.precision); if (!(spec.flags & LEFT)) { while (len < spec.field_width--) { if (buf < end) *buf = ' '; ++buf; } } for (i = 0; i < len; ++i) { if (buf < end) *buf = *s; ++buf; ++s; } while (len < spec.field_width--) { if (buf < end) *buf = ' '; ++buf; } return buf; } (spec.precision is an s16 which by default is set to -1, so for the usual case of plain %s the upper bound in strnlen is (size_t)-1, effectively infinity). If it wasn't for the field width padding it would probably not be that hard to fix. But, to rephrase an earlier question: Can anyone point to an instance where the strlcpy source or a %s argument to a printf function can actually change under us? I'd like to see if one can intentionally trigger the potential race, but I suspect that the vast majority cannot have a problem - maybe someone has an idea of specific places that are worth looking at. >> > So I'd improve it all in the following order: >> > >> > - fix the strscpy() uninitialized use >> > >> > - base strlcpy() on strscpy() via the patch I sent. This makes all users >> > faster >> > and eliminates the theoretical race. >> >> I'm not so sure about that part. I'd strongly suspect that the vast >> majority of strings handled by strlcpy (and in the future strscpy) are >> shorter than 32 bytes, so is all the word_at_a_time and pre/post >> alignment yoga really worth it? > > That's a good question, I'll measure it. Here's a few pseudo-datapoints. About half the strlcpy instances (just from lazy grepping) has a string literal as src, and those only have about 1/8th chance of being aligned. A quick skim through a small vmlinux showed 34 calls of strlcpy where %rsi was easily seen to be a literal address, of which 7 were aligned. That's 1/5, but the sample size is rather small (also, the 1/8 is admittedly an underestimate, since gcc seems to put long enough literals in rodata.str1.8). Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] linux/kconfig.h: generalize IS_ENABLED logic
On Wed, Oct 07 2015, Michal Marek wrote: > On 2015-10-06 23:05, Rasmus Villemoes wrote: >> It's not hard to generalize the macro magic used to build the >> IS_ENABLED macro and friends to produce a few other potentially useful >> macros: >> >> CHOOSE_EXPR(CONFIG_FOO, expr): if CONFIG_FOO is set expands to >> expr, otherwise expands to nothing. >> >> CHOOSE_EXPR(CONFIG_FOO, expr1, expr2): if CONFIG_FOO is set, >> expands to expr1, otherwise expands to expr2. > > FWIW, I agree with Ingo that the CHOOSE_EXPR name is not really obvious. > IF_CONFIG is a better alternative IMO, since the average programmer > probably does not know __builtin_choose_expr() to see the analogy. OK, CHOOSE_EXPR is out. But I think IF_CONFIG/COND_CONFIG might be a little annoying or redundant, since "CONFIG" would also always be part of the first argument. Come to think of it, since this would be a primitive for conditional compilation whose primary purpose is to eliminate the verbosity of #ifdef/#endif, I'd prefer plain and simple COND, with COND_INITIALIZER as a sidekick for that special purpose. Unfortunately, COND is already used in a few places :( So I'll go with COND_CONFIG for now, but wait a few days before sending v2, to see if anyone else has comments or naming suggestions. > While the C standard syntax requires struct-declaration to actually > declare a member, the compiler will happily ignore the extra semicolon > if you write > > truct task_struct { >... >CHOOSE_EXPR(CONFIG_KASAN, unsigned int kasan_depth); >... > } > > So I think that the COND_DECLARATION macro is not necessary. Thanks, I didn't know that. I see that both gcc and clang accept it whether the extra semicolon is at the beginning or end of the struct, and whether there's even a single actual member. OK, then COND_DECLARATION is redundant (though it might still be useful as a natural buddy to COND_INITIALIZER). >> #define INIT_KASAN(tsk) COND_INITIALIZER(CONFIG_KASAN, .kasan_depth = 1) > > COND_INITIALIZER on the other hand is useful (CHOOSE_EXPR(CONFIG_KASAN, > .kasan_depth = 1 _COMMA) does does not work, unfortunately). Yeah, since we need to do the multiple expansion thing there's no way of preventing _COMMA from expanding too early, so I'm pretty sure one would need some specialized version of CHOOSE_EXPR (or whatever the name ends up being). Also, I wouldn't really want users to have to supply the _COMMA. One could also consider making COND_ARGUMENT an alias for it - that could be useful for some seq_printf calls generating /proc files (where the format string would be built with COND_CONFIG). >> [and I'm certainly not proposing any mass conversion], but I think it >> might be nice to avoid lots of short #ifdef/#else/#endif sections. > > It should be accompanied by a patch to scripts/tags.sh teaching > ctags/etags about the new macros. Do you mean that something like --regex-c='/COND_CONFIG\([^,]*,([^,]*)\)/\1/' should be added so ctags would pick up the text in the true branch? I'm not very familiar with ctags. Thanks for the feedback, Michal and Ingo. Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] lib: Make _find_next_bit helper function inline
I've lost track of what's up and down in this, but now that I look at this again let me throw in my two observations of stupid gcc behaviour: For the current code, both debian's gcc (4.7) and 5.1 partially inlines _find_next_bit, namely the "if (!nbits || start >= nbits)" test. I know it does it to avoid a function call, but in this case the early return condition is unlikely, so there's not much to gain. Moreover, it fails to optimize the test to simply "if (start >= nbits)" - everything being unsigned, these are obviously equivalent. Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] linux/clock_cooling.h: use unique include guard
Using the same include guard as linux/cpu_cooling.h will cause build breakage if those headers are ever used from the same TU. Signed-off-by: Rasmus Villemoes --- include/linux/clock_cooling.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/include/linux/clock_cooling.h b/include/linux/clock_cooling.h index 4d1019d56f7f..8cee7b281e96 100644 --- a/include/linux/clock_cooling.h +++ b/include/linux/clock_cooling.h @@ -20,8 +20,8 @@ * General Public License for more details. */ -#ifndef __CPU_COOLING_H__ -#define __CPU_COOLING_H__ +#ifndef __CLOCK_COOLING_H__ +#define __CLOCK_COOLING_H__ #include #include @@ -62,4 +62,4 @@ unsigned long clock_cooling_get_level(struct thermal_cooling_device *cdev, } #endif /* CONFIG_CLOCK_THERMAL */ -#endif /* __CPU_COOLING_H__ */ +#endif /* __CLOCK_COOLING_H__ */ -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/4] net: cavium: liquidio: use kzalloc in setup_glist()
We save a little .text and get rid of the sizeof(...) style inconsistency. Signed-off-by: Rasmus Villemoes --- drivers/net/ethernet/cavium/liquidio/lio_main.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c index 0660deecc2c9..f683d97d7614 100644 --- a/drivers/net/ethernet/cavium/liquidio/lio_main.c +++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c @@ -818,10 +818,9 @@ static int setup_glist(struct lio *lio) INIT_LIST_HEAD(&lio->glist); for (i = 0; i < lio->tx_qsize; i++) { - g = kmalloc(sizeof(*g), GFP_KERNEL); + g = kzalloc(sizeof(*g), GFP_KERNEL); if (!g) break; - memset(g, 0, sizeof(struct octnic_gather)); g->sg_size = ((ROUNDUP4(OCTNIC_MAX_SG) >> 2) * OCT_SG_ENTRY_SIZE); -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/4] net: mv643xx_eth: use kzalloc
The double memset is a little ugly; using kzalloc avoids it altogether. Signed-off-by: Rasmus Villemoes --- drivers/net/ethernet/marvell/mv643xx_eth.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c index d52639bc491f..960169efe636 100644 --- a/drivers/net/ethernet/marvell/mv643xx_eth.c +++ b/drivers/net/ethernet/marvell/mv643xx_eth.c @@ -1859,14 +1859,11 @@ oom: return; } - mc_spec = kmalloc(0x200, GFP_ATOMIC); + mc_spec = kzalloc(0x200, GFP_ATOMIC); if (mc_spec == NULL) goto oom; mc_other = mc_spec + (0x100 >> 2); - memset(mc_spec, 0, 0x100); - memset(mc_other, 0, 0x100); - netdev_for_each_mc_addr(ha, dev) { u8 *a = ha->addr; u32 *table; -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 0/4] net: a few kzalloc/memset cleanups
These are just a few k[czm]alloc/memset related cleanups. Rasmus Villemoes (4): net: cavium: liquidio: use kzalloc in setup_glist() net: jme: use kzalloc() instead of kmalloc+memset net: mv643xx_eth: use kzalloc net: qlcnic: delete redundant memsets drivers/net/ethernet/cavium/liquidio/lio_main.c | 3 +-- drivers/net/ethernet/jme.c | 8 ++-- drivers/net/ethernet/marvell/mv643xx_eth.c | 5 + drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 2 -- drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c | 2 -- drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c | 2 -- 6 files changed, 4 insertions(+), 18 deletions(-) -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/4] net: jme: use kzalloc() instead of kmalloc+memset
Using kzalloc saves a tiny bit on .text. Signed-off-by: Rasmus Villemoes --- drivers/net/ethernet/jme.c | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c index 6e9a792097d3..060dd3922974 100644 --- a/drivers/net/ethernet/jme.c +++ b/drivers/net/ethernet/jme.c @@ -583,7 +583,7 @@ jme_setup_tx_resources(struct jme_adapter *jme) atomic_set(&txring->next_to_clean, 0); atomic_set(&txring->nr_free, jme->tx_ring_size); - txring->bufinf = kmalloc(sizeof(struct jme_buffer_info) * + txring->bufinf = kzalloc(sizeof(struct jme_buffer_info) * jme->tx_ring_size, GFP_ATOMIC); if (unlikely(!(txring->bufinf))) goto err_free_txring; @@ -592,8 +592,6 @@ jme_setup_tx_resources(struct jme_adapter *jme) * Initialize Transmit Descriptors */ memset(txring->alloc, 0, TX_RING_ALLOC_SIZE(jme->tx_ring_size)); - memset(txring->bufinf, 0, - sizeof(struct jme_buffer_info) * jme->tx_ring_size); return 0; @@ -845,7 +843,7 @@ jme_setup_rx_resources(struct jme_adapter *jme) rxring->next_to_use = 0; atomic_set(&rxring->next_to_clean, 0); - rxring->bufinf = kmalloc(sizeof(struct jme_buffer_info) * + rxring->bufinf = kzalloc(sizeof(struct jme_buffer_info) * jme->rx_ring_size, GFP_ATOMIC); if (unlikely(!(rxring->bufinf))) goto err_free_rxring; @@ -853,8 +851,6 @@ jme_setup_rx_resources(struct jme_adapter *jme) /* * Initiallize Receive Descriptors */ - memset(rxring->bufinf, 0, - sizeof(struct jme_buffer_info) * jme->rx_ring_size); for (i = 0 ; i < jme->rx_ring_size ; ++i) { if (unlikely(jme_make_new_rx_buf(jme, i))) { jme_free_rx_resources(jme); -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 4/4] net: qlcnic: delete redundant memsets
In all cases, mbx->req.arg and mbx->rsp.arg have just been allocated using kcalloc(), so these six memsets are redundant. Signed-off-by: Rasmus Villemoes --- I cranked $context_lines to 11 to make the kcallocs visible. drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 2 -- drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c | 2 -- drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c | 2 -- 3 files changed, 6 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c index 5ab3adf88166..9f0bdd993955 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c @@ -910,24 +910,22 @@ int qlcnic_83xx_alloc_mbx_args(struct qlcnic_cmd_args *mbx, mbx->req.arg = kcalloc(mbx->req.num, sizeof(u32), GFP_ATOMIC); if (!mbx->req.arg) return -ENOMEM; mbx->rsp.arg = kcalloc(mbx->rsp.num, sizeof(u32), GFP_ATOMIC); if (!mbx->rsp.arg) { kfree(mbx->req.arg); mbx->req.arg = NULL; return -ENOMEM; } - memset(mbx->req.arg, 0, sizeof(u32) * mbx->req.num); - memset(mbx->rsp.arg, 0, sizeof(u32) * mbx->rsp.num); temp = adapter->ahw->fw_hal_version << 29; mbx->req.arg[0] = (type | (mbx->req.num << 16) | temp); mbx->cmd_op = type; return 0; } } dev_err(&adapter->pdev->dev, "%s: Invalid mailbox command opcode 0x%x\n", __func__, type); return -EINVAL; } diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c index 6e6f18fc5d76..a5f422f26cb4 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c @@ -65,24 +65,22 @@ int qlcnic_82xx_alloc_mbx_args(struct qlcnic_cmd_args *mbx, mbx->req.arg = kcalloc(mbx->req.num, sizeof(u32), GFP_ATOMIC); if (!mbx->req.arg) return -ENOMEM; mbx->rsp.arg = kcalloc(mbx->rsp.num, sizeof(u32), GFP_ATOMIC); if (!mbx->rsp.arg) { kfree(mbx->req.arg); mbx->req.arg = NULL; return -ENOMEM; } - memset(mbx->req.arg, 0, sizeof(u32) * mbx->req.num); - memset(mbx->rsp.arg, 0, sizeof(u32) * mbx->rsp.num); mbx->req.arg[0] = type; break; } } return 0; } /* Free up mailbox registers */ void qlcnic_free_mbx_args(struct qlcnic_cmd_args *cmd) { kfree(cmd->req.arg); diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c index 546cd5f1c85a..7327b729ba2e 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c @@ -721,24 +721,22 @@ static int qlcnic_sriov_alloc_bc_mbx_args(struct qlcnic_cmd_args *mbx, u32 type) mbx->req.arg = kcalloc(mbx->req.num, sizeof(u32), GFP_ATOMIC); if (!mbx->req.arg) return -ENOMEM; mbx->rsp.arg = kcalloc(mbx->rsp.num, sizeof(u32), GFP_ATOMIC); if (!mbx->rsp.arg) { kfree(mbx->req.arg); mbx->req.arg = NULL; return -ENOMEM; } - memset(mbx->req.arg, 0, sizeof(u32) * mbx->req.num); - memset(mbx->rsp.arg, 0, sizeof(u32) * mbx->rsp.num); mbx->req.arg[0] = (type | (mbx->req.num << 16) | (3 << 29)); mbx->rsp.arg[0] = (type & 0x) | mbx->rsp.num << 16; return 0; } } return -EINVAL; } static int qlcnic_sriov_prepare_bc_hdr(struct qlcnic_bc_trans *trans,
Re: [PATCH 3/4] net: mv643xx_eth: use kzalloc
On Wed, Sep 09 2015, Joe Perches wrote: > On Wed, 2015-09-09 at 10:38 +0200, Rasmus Villemoes wrote: >> The double memset is a little ugly; using kzalloc avoids it altogether. > [] >> diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c >> b/drivers/net/ethernet/marvell/mv643xx_eth.c > [] >> @@ -1859,14 +1859,11 @@ oom: >> return; >> } >> >> -mc_spec = kmalloc(0x200, GFP_ATOMIC); >> +mc_spec = kzalloc(0x200, GFP_ATOMIC); >> if (mc_spec == NULL) >> goto oom; >> mc_other = mc_spec + (0x100 >> 2); > > This sure looks wrong as it sets a pointer > to unallocated memory. > >> -memset(mc_spec, 0, 0x100); >> -memset(mc_other, 0, 0x100); > > So this does a memset of random memory. > Huh? mc_spec and mc_other are u32*, we allocate 0x200 = 512 bytes = 128 u32s, and pointer arithmetic makes mc_other point to the latter 64. Then the memory is cleared 256 bytes at a time. It's unusual and slightly obfuscated code, but I don't think it's wrong. > > for (i = 0; i < 0x100; i += 4) { > wrl(mp, SPECIAL_MCAST_TABLE(mp->port_num) + i, mc_spec[i >> 2]); > wrl(mp, OTHER_MCAST_TABLE(mp->port_num) + i, mc_other[i >> 2]); > } I'd probably have written that as for (i = 0; i < 64; ++i) { wrl(mp, SPECIAL_MCAST_TABLE(mp->port_num) + 4*i, mc_spec[i]); wrl(mp, OTHER_MCAST_TABLE(mp->port_num) + 4*i, mc_other[i]); } but again, I don't think it's wrong [haven't checked what SPECIAL_MCAST_TABLE/OTHER_MCAST_TABLE do, though]. Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] lib/vsprintf.c: increase the size of the field_width variable
On Wed, Sep 09 2015, Joe Perches wrote: > On Wed, 2015-09-09 at 12:13 +0200, Maurizio Lombardi wrote: >> When printing a bitmap using the "%*pb[l]" printk format >> a 16 bit variable (field_width) is used to store the size of the bitmap. >> In some cases 16 bits are not sufficient, the variable overflows and >> printk does not work as expected. > > If more than 16 bits are necessary, it couldn't work > as a single printk is limited to 1024 bytes. I'm also a little confused; I don't see what printk has to do with the reported problem (I'd expect the /sys/... file to be generated by something like seq_printf). >> 3. Bitmap should be set, but still empty >> # cat /sys/bus/pseudo/drivers/scsi_debug/map > [] >> diff --git a/lib/vsprintf.c b/lib/vsprintf.c > [] >> @@ -384,8 +384,8 @@ struct printf_spec { >> u8 flags; /* flags to number() */ >> u8 base; /* number base, 8, 10 or 16 only */ >> u8 qualifier; /* number qualifier, one of 'hHlLtzZ' */ >> -s16 field_width;/* width of output field */ >> s16 precision; /* # of digits/chars */ >> +s32 field_width;/* width of output field */ >> }; >> >> static noinline_for_stack > > And this makes the sizeof struct printf_spec more than > 8 bytes which isn't desireable on x86-32. Or other architectures, I imagine. I'm pretty sure struct printf_spec purposely has sizeof==8 to allow it to be (relatively cheaply) passed around by value. > %*pb is meant for smallish bitmaps, not big ones. Yup. And that leads to my other confusion: Given that the expected output is given as "0-15", does the bitmap really consist of > S16_MAX bits with only the first 16 set? Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] tracing: use kstrdup_const instead of private implementation
The kernel now has kstrdup_const/kfree_const for reusing .rodata (typically string literals) when possible; there's no reason to duplicate that logic in the tracing system. Moreover, as the comment above core_kernel_data states, it may not always return true for .rodata - that is for example the case on x86_64, where we thus end up kstrdup'ing all the passed-in strings. Arguably, testing for .rodata explicitly (as kstrdup_const does) is also more correct: I don't think one is supposed to be able to change the name after creating the event_subsystem by passing the address of a static char (but non-const) array. Signed-off-by: Rasmus Villemoes --- kernel/trace/trace_events.c | 24 1 file changed, 8 insertions(+), 16 deletions(-) diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c index 7ca09cdc20c2..ae97d73b31fa 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -38,21 +38,19 @@ static LIST_HEAD(ftrace_common_fields); static struct kmem_cache *field_cachep; static struct kmem_cache *file_cachep; -#define SYSTEM_FL_FREE_NAME(1 << 31) - static inline int system_refcount(struct event_subsystem *system) { - return system->ref_count & ~SYSTEM_FL_FREE_NAME; + return system->ref_count; } static int system_refcount_inc(struct event_subsystem *system) { - return (system->ref_count++) & ~SYSTEM_FL_FREE_NAME; + return system->ref_count++; } static int system_refcount_dec(struct event_subsystem *system) { - return (--system->ref_count) & ~SYSTEM_FL_FREE_NAME; + return --system->ref_count; } /* Double loops, do not use break, only goto's work */ @@ -460,8 +458,7 @@ static void __put_system(struct event_subsystem *system) kfree(filter->filter_string); kfree(filter); } - if (system->ref_count & SYSTEM_FL_FREE_NAME) - kfree(system->name); + kfree_const(system->name); kfree(system); } @@ -1492,13 +1489,9 @@ create_new_subsystem(const char *name) system->ref_count = 1; /* Only allocate if dynamic (kprobes and modules) */ - if (!core_kernel_data((unsigned long)name)) { - system->ref_count |= SYSTEM_FL_FREE_NAME; - system->name = kstrdup(name, GFP_KERNEL); - if (!system->name) - goto out_free; - } else - system->name = name; + system->name = kstrdup_const(name, GFP_KERNEL); + if (!system->name) + goto out_free; system->filter = NULL; @@ -1511,8 +1504,7 @@ create_new_subsystem(const char *name) return system; out_free: - if (system->ref_count & SYSTEM_FL_FREE_NAME) - kfree(system->name); + kfree_const(system->name); kfree(system); return NULL; } -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] trace/ftrace.c: remove redundant swap function
To cover the common case of sorting an array of pointers, Daniel Wagner recently modified the library sort() to use a specific swap function for size==8, in addition to the size==4 case which was already handled. Since sizeof(long) is either 4 or 8, ftrace_swap_ips() is redundant and we can just let sort() pick an appropriate and fast swap callback. Signed-off-by: Rasmus Villemoes --- kernel/trace/ftrace.c | 13 + 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index b0623ac785a2..186bafa9935d 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -4783,17 +4783,6 @@ static int ftrace_cmp_ips(const void *a, const void *b) return 0; } -static void ftrace_swap_ips(void *a, void *b, int size) -{ - unsigned long *ipa = a; - unsigned long *ipb = b; - unsigned long t; - - t = *ipa; - *ipa = *ipb; - *ipb = t; -} - static int ftrace_process_locs(struct module *mod, unsigned long *start, unsigned long *end) @@ -4813,7 +4802,7 @@ static int ftrace_process_locs(struct module *mod, return 0; sort(start, count, sizeof(*start), -ftrace_cmp_ips, ftrace_swap_ips); +ftrace_cmp_ips, NULL); start_pg = ftrace_allocate_pages(count); if (!start_pg) -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] futex: eliminate cache miss from futex_hash()
futex_hash() references two global variables: the base pointer futex_queues and the size of the array futex_hashsize. The latter is marked __read_mostly, while the former is not, so they are likely to end up very far from each other. This means that futex_hash() is likely to encounter two cache misses. We could mark futex_queues as __read_mostly as well, but that doesn't guarantee they'll end up next to each other (and even if they do, they may still end up in different cache lines). So put the two variables in a small singleton struct with sufficient alignment and mark that as __read_mostly. A diff of the disassembly shows what I'd expect: : 31 d1 xor%edx,%ecx : c1 ca 12ror$0x12,%edx : 29 d1 sub%edx,%ecx -: 48 8b 15 25 c8 e5 00mov0xe5c825(%rip),%rdx# 81f149c8 +: 48 8b 15 35 c8 e5 00mov0xe5c835(%rip),%rdx# 81f149d8 <__futex_data+0x8> : 31 c8 xor%ecx,%eax : c1 c9 08ror$0x8,%ecx : 29 c8 sub%ecx,%eax : 48 83 ea 01 sub$0x1,%rdx : 48 21 d0and%rdx,%rax : 48 c1 e0 06 shl$0x6,%rax -: 48 03 05 e4 5e 02 01add0x1025ee4(%rip),%rax# 820de0a0 +: 48 03 05 14 c8 e5 00add0xe5c814(%rip),%rax# 81f149d0 <__futex_data> : c3 retq : 0f 1f 00nopl (%rax) Signed-off-by: Rasmus Villemoes --- Resending since this was never picked up - and I assume it's actually ok. Also, this time the alignment is spelled 2*sizeof(long) to avoid wasting 8 bytes on 32bit. kernel/futex.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/kernel/futex.c b/kernel/futex.c index 6e443efc65f4..dfc86e93c31d 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -255,9 +255,18 @@ struct futex_hash_bucket { struct plist_head chain; } cacheline_aligned_in_smp; -static unsigned long __read_mostly futex_hashsize; +/* + * The base of the bucket array and its size are always used together + * (after initialization only in hash_futex()), so ensure that they + * reside in the same cacheline. + */ +static struct { + struct futex_hash_bucket *queues; + unsigned longhashsize; +} __futex_data __read_mostly __aligned(2*sizeof(long)); +#define futex_queues (__futex_data.queues) +#define futex_hashsize (__futex_data.hashsize) -static struct futex_hash_bucket *futex_queues; /* * Fault injections for futexes. -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] lib/dynamic_debug.c: use kstrdup_const
Using kstrdup_const, thus reusing .rodata when possible, saves around 2 kB of runtime memory on my laptop/.config combination. Signed-off-by: Rasmus Villemoes --- lib/dynamic_debug.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c index e491e02eff54..e3952e9c8ec0 100644 --- a/lib/dynamic_debug.c +++ b/lib/dynamic_debug.c @@ -42,7 +42,7 @@ extern struct _ddebug __stop___verbose[]; struct ddebug_table { struct list_head link; - char *mod_name; + const char *mod_name; unsigned int num_ddebugs; struct _ddebug *ddebugs; }; @@ -841,12 +841,12 @@ int ddebug_add_module(struct _ddebug *tab, unsigned int n, const char *name) { struct ddebug_table *dt; - char *new_name; + const char *new_name; dt = kzalloc(sizeof(*dt), GFP_KERNEL); if (dt == NULL) return -ENOMEM; - new_name = kstrdup(name, GFP_KERNEL); + new_name = kstrdup_const(name, GFP_KERNEL); if (new_name == NULL) { kfree(dt); return -ENOMEM; @@ -907,7 +907,7 @@ int ddebug_dyndbg_module_param_cb(char *param, char *val, const char *module) static void ddebug_table_free(struct ddebug_table *dt) { list_del_init(&dt->link); - kfree(dt->mod_name); + kfree_const(dt->mod_name); kfree(dt); } -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] PM / wakeup: wakeup_source_create: use kstrdup_const
Using kstrdup_const allows us to save a little runtime memory (and a string copy) in the common case where name is a string literal. Signed-off-by: Rasmus Villemoes --- drivers/base/power/wakeup.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c index 51f15bc15774..332e607e0030 100644 --- a/drivers/base/power/wakeup.c +++ b/drivers/base/power/wakeup.c @@ -91,7 +91,7 @@ struct wakeup_source *wakeup_source_create(const char *name) if (!ws) return NULL; - wakeup_source_prepare(ws, name ? kstrdup(name, GFP_KERNEL) : NULL); + wakeup_source_prepare(ws, name ? kstrdup_const(name, GFP_KERNEL) : NULL); return ws; } EXPORT_SYMBOL_GPL(wakeup_source_create); @@ -154,7 +154,7 @@ void wakeup_source_destroy(struct wakeup_source *ws) wakeup_source_drop(ws); wakeup_source_record(ws); - kfree(ws->name); + kfree_const(ws->name); kfree(ws); } EXPORT_SYMBOL_GPL(wakeup_source_destroy); -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/2] lib: introduce kvasprintf_const
This adds kvasprintf_const which tries to use kstrdup_const if possible: If the format string contains no % characters, or if the format string is exactly "%s", we delegate to kstrdup_const. Otherwise, we fall back to kvasprintf. Just as for kstrdup_const, the main motivation is to save memory by reusing .rodata when possible. The return value should be freed by kfree_const, just like for kstrdup_const. There is deliberately no kasprintf_const: In the vast majority of cases, the format string argument is a literal, so one can determine statically whether one could instead use kstrdup_const directly (which would also require one to change all corresponding kfree calls to kfree_const). Signed-off-by: Rasmus Villemoes --- include/linux/kernel.h | 2 ++ lib/kasprintf.c| 16 2 files changed, 18 insertions(+) diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 5582410727cb..2c13f747ac2e 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -413,6 +413,8 @@ extern __printf(2, 3) char *kasprintf(gfp_t gfp, const char *fmt, ...); extern __printf(2, 0) char *kvasprintf(gfp_t gfp, const char *fmt, va_list args); +extern __printf(2, 0) +const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args); extern __scanf(2, 3) int sscanf(const char *, const char *, ...); diff --git a/lib/kasprintf.c b/lib/kasprintf.c index 32f12150fc4f..f194e6e593e1 100644 --- a/lib/kasprintf.c +++ b/lib/kasprintf.c @@ -31,6 +31,22 @@ char *kvasprintf(gfp_t gfp, const char *fmt, va_list ap) } EXPORT_SYMBOL(kvasprintf); +/* + * If fmt contains no % (or is exactly %s), use kstrdup_const. If fmt + * (or the sole vararg) points to rodata, we will then save a memory + * allocation and string copy. In any case, the return value should be + * freed using kfree_const(). + */ +const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list ap) +{ + if (!strchr(fmt, '%')) + return kstrdup_const(fmt, gfp); + if (!strcmp(fmt, "%s")) + return kstrdup_const(va_arg(ap, const char*), gfp); + return kvasprintf(gfp, fmt, ap); +} +EXPORT_SYMBOL(kvasprintf_const); + char *kasprintf(gfp_t gfp, const char *fmt, ...) { va_list ap; -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/2] kobject: use kvasprintf_const for formatting ->name
Sometimes kobject_set_name_vargs is called with a format string conaining no %, or a format string of precisely "%s", where the single vararg happens to point to .rodata. kvasprintf_const detects these cases for us and returns a copy of that pointer instead of duplicating the string, thus saving some run-time memory. Otherwise, it falls back to kvasprintf. We just need to always deallocate ->name using kfree_const. Unfortunately, the dance we need to do to perform the '/' -> '!' sanitization makes the resulting code rather ugly. I instrumented kstrdup_const to provide some statistics on the memory saved, and for me this gave an additional ~14KB after boot (306KB was already saved; this patch bumped that to 320KB). I have KMALLOC_SHIFT_LOW==3, and since 80% of the kvasprintf_const hits were satisfied by an 8-byte allocation, the 14K would roughly be quadrupled when KMALLOC_SHIFT_LOW==5. Whether these numbers are sufficient to justify the ugliness I'll leave to others to decide. Signed-off-by: Rasmus Villemoes --- lib/kobject.c | 30 ++ 1 file changed, 22 insertions(+), 8 deletions(-) diff --git a/lib/kobject.c b/lib/kobject.c index 3e3a5c3cb330..fee2fd950306 100644 --- a/lib/kobject.c +++ b/lib/kobject.c @@ -257,18 +257,32 @@ static int kobject_add_internal(struct kobject *kobj) int kobject_set_name_vargs(struct kobject *kobj, const char *fmt, va_list vargs) { - char *s; + const char *s; if (kobj->name && !fmt) return 0; - s = kvasprintf(GFP_KERNEL, fmt, vargs); + s = kvasprintf_const(GFP_KERNEL, fmt, vargs); if (!s) return -ENOMEM; - /* ewww... some of these buggers have '/' in the name ... */ - strreplace(s, '/', '!'); - kfree(kobj->name); + /* +* ewww... some of these buggers have '/' in the name ... If +* that's the case, we need to make sure we have an actual +* allocated copy to modify, since kvasprintf_const may have +* returned something from .rodata. +*/ + if (strchr(s, '/')) { + char *t; + + t = kstrdup(s, GFP_KERNEL); + kfree_const(s); + if (!t) + return -ENOMEM; + strreplace(t, '/', '!'); + s = t; + } + kfree_const(kobj->name); kobj->name = s; return 0; @@ -466,7 +480,7 @@ int kobject_rename(struct kobject *kobj, const char *new_name) envp[0] = devpath_string; envp[1] = NULL; - name = dup_name = kstrdup(new_name, GFP_KERNEL); + name = dup_name = kstrdup_const(new_name, GFP_KERNEL); if (!name) { error = -ENOMEM; goto out; @@ -486,7 +500,7 @@ int kobject_rename(struct kobject *kobj, const char *new_name) kobject_uevent_env(kobj, KOBJ_MOVE, envp); out: - kfree(dup_name); + kfree_const(dup_name); kfree(devpath_string); kfree(devpath); kobject_put(kobj); @@ -632,7 +646,7 @@ static void kobject_cleanup(struct kobject *kobj) /* free name if we allocated it */ if (name) { pr_debug("kobject: '%s': free name\n", name); - kfree(name); + kfree_const(name); } } -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/3] ACPI / scan: use kstrdup_const in acpi_add_id()
Empirically, acpi_add_id is mostly called with string literals, so using kstrdup_const for initializing struct acpi_hardware_id::id saves a little run-time memory and a string copy. Signed-off-by: Rasmus Villemoes --- drivers/acpi/scan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c index a3eaf2080707..afaac47eefb4 100644 --- a/drivers/acpi/scan.c +++ b/drivers/acpi/scan.c @@ -1184,7 +1184,7 @@ static void acpi_add_id(struct acpi_device_pnp *pnp, const char *dev_id) if (!id) return; - id->id = kstrdup(dev_id, GFP_KERNEL); + id->id = kstrdup_const(dev_id, GFP_KERNEL); if (!id->id) { kfree(id); return; @@ -1322,7 +1322,7 @@ void acpi_free_pnp_ids(struct acpi_device_pnp *pnp) struct acpi_hardware_id *id, *tmp; list_for_each_entry_safe(id, tmp, &pnp->ids, list) { - kfree(id->id); + kfree_const(id->id); kfree(id); } kfree(pnp->unique_id); -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/3] ACPI: constify struct acpi_hardware_id::id
This is preparation for using kstrdup_const to initialize that member. Signed-off-by: Rasmus Villemoes --- drivers/acpi/scan.c| 4 ++-- drivers/pnp/pnpacpi/core.c | 4 ++-- include/acpi/acpi_bus.h| 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c index 01136b879038..a3eaf2080707 100644 --- a/drivers/acpi/scan.c +++ b/drivers/acpi/scan.c @@ -1472,7 +1472,7 @@ bool acpi_device_is_present(struct acpi_device *adev) } static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler, - char *idstr, + const char *idstr, const struct acpi_device_id **matchid) { const struct acpi_device_id *devid; @@ -1491,7 +1491,7 @@ static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler, return false; } -static struct acpi_scan_handler *acpi_scan_match_handler(char *idstr, +static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr, const struct acpi_device_id **matchid) { struct acpi_scan_handler *handler; diff --git a/drivers/pnp/pnpacpi/core.c b/drivers/pnp/pnpacpi/core.c index 5153d1d69aee..9113876487ed 100644 --- a/drivers/pnp/pnpacpi/core.c +++ b/drivers/pnp/pnpacpi/core.c @@ -207,7 +207,7 @@ struct pnp_protocol pnpacpi_protocol = { }; EXPORT_SYMBOL(pnpacpi_protocol); -static char *__init pnpacpi_get_id(struct acpi_device *device) +static const char *__init pnpacpi_get_id(struct acpi_device *device) { struct acpi_hardware_id *id; @@ -222,7 +222,7 @@ static char *__init pnpacpi_get_id(struct acpi_device *device) static int __init pnpacpi_add_device(struct acpi_device *device) { struct pnp_dev *dev; - char *pnpid; + const char *pnpid; struct acpi_hardware_id *id; int error; diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h index 9a1b46c64fc1..fec002919b65 100644 --- a/include/acpi/acpi_bus.h +++ b/include/acpi/acpi_bus.h @@ -227,7 +227,7 @@ typedef char acpi_device_class[20]; struct acpi_hardware_id { struct list_head list; - char *id; + const char *id; }; struct acpi_pnp_type { -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/3] ACPI: constify first argument of struct acpi_scan_handler::match
One wouldn't expect a "match" function modify the string it searches for, and indeed the only instance of the struct acpi_scan_handler::match callback, acpi_pnp_match, can easily be changed. While there, update its helper matching_id(). This is also preparation for constifying struct acpi_hardware_id::id. Signed-off-by: Rasmus Villemoes --- drivers/acpi/acpi_pnp.c | 4 ++-- include/acpi/acpi_bus.h | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/acpi/acpi_pnp.c b/drivers/acpi/acpi_pnp.c index c58940b231d6..48fc3ad13a4b 100644 --- a/drivers/acpi/acpi_pnp.c +++ b/drivers/acpi/acpi_pnp.c @@ -316,7 +316,7 @@ static const struct acpi_device_id acpi_pnp_device_ids[] = { {""}, }; -static bool matching_id(char *idstr, char *list_id) +static bool matching_id(const char *idstr, const char *list_id) { int i; @@ -333,7 +333,7 @@ static bool matching_id(char *idstr, char *list_id) return true; } -static bool acpi_pnp_match(char *idstr, const struct acpi_device_id **matchid) +static bool acpi_pnp_match(const char *idstr, const struct acpi_device_id **matchid) { const struct acpi_device_id *devid; diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h index 5ba8fb64f664..9a1b46c64fc1 100644 --- a/include/acpi/acpi_bus.h +++ b/include/acpi/acpi_bus.h @@ -129,7 +129,7 @@ static inline struct acpi_hotplug_profile *to_acpi_hotplug_profile( struct acpi_scan_handler { const struct acpi_device_id *ids; struct list_head list_node; - bool (*match)(char *idstr, const struct acpi_device_id **matchid); + bool (*match)(const char *idstr, const struct acpi_device_id **matchid); int (*attach)(struct acpi_device *dev, const struct acpi_device_id *id); void (*detach)(struct acpi_device *dev); void (*bind)(struct device *phys_dev); -- 2.1.3 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] x86: wmi: Remove private %pUL implementation
The work performed by wmi_gtoa is equivalent to simply sprintf(out, "%pUL", in), so one could replace its body by this. However, most users feed the result directly as a %s argument to some other function which also understands the %p extensions (they all ultimately use vsnprintf), so we can eliminate some stack buffers and quite a bit of code by just using %pUL directly. In wmi_dev_uevent I'm not sure whether there's room for a nul-terminator in env->buf, so I've just replaced wmi_gtoa with the equivalent sprintf call. Signed-off-by: Rasmus Villemoes --- Resending, hoping to get it picked up this time. drivers/platform/x86/wmi.c | 51 ++ 1 file changed, 6 insertions(+), 45 deletions(-) diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c index aac47573f9ed..eb391a281833 100644 --- a/drivers/platform/x86/wmi.c +++ b/drivers/platform/x86/wmi.c @@ -194,34 +194,6 @@ static bool wmi_parse_guid(const u8 *src, u8 *dest) return true; } -/* - * Convert a raw GUID to the ACII string representation - */ -static int wmi_gtoa(const char *in, char *out) -{ - int i; - - for (i = 3; i >= 0; i--) - out += sprintf(out, "%02X", in[i] & 0xFF); - - out += sprintf(out, "-"); - out += sprintf(out, "%02X", in[5] & 0xFF); - out += sprintf(out, "%02X", in[4] & 0xFF); - out += sprintf(out, "-"); - out += sprintf(out, "%02X", in[7] & 0xFF); - out += sprintf(out, "%02X", in[6] & 0xFF); - out += sprintf(out, "-"); - out += sprintf(out, "%02X", in[8] & 0xFF); - out += sprintf(out, "%02X", in[9] & 0xFF); - out += sprintf(out, "-"); - - for (i = 10; i <= 15; i++) - out += sprintf(out, "%02X", in[i] & 0xFF); - - *out = '\0'; - return 0; -} - static bool find_guid(const char *guid_string, struct wmi_block **out) { char tmp[16], guid_input[16]; @@ -457,11 +429,7 @@ EXPORT_SYMBOL_GPL(wmi_set_block); static void wmi_dump_wdg(const struct guid_block *g) { - char guid_string[37]; - - wmi_gtoa(g->guid, guid_string); - - pr_info("%s:\n", guid_string); + pr_info("%pUL:\n", g->guid); pr_info("\tobject_id: %c%c\n", g->object_id[0], g->object_id[1]); pr_info("\tnotify_id: %02X\n", g->notify_id); pr_info("\treserved: %02X\n", g->reserved); @@ -661,7 +629,6 @@ EXPORT_SYMBOL_GPL(wmi_has_guid); static ssize_t modalias_show(struct device *dev, struct device_attribute *attr, char *buf) { - char guid_string[37]; struct wmi_block *wblock; wblock = dev_get_drvdata(dev); @@ -670,9 +637,7 @@ static ssize_t modalias_show(struct device *dev, struct device_attribute *attr, return strlen(buf); } - wmi_gtoa(wblock->gblock.guid, guid_string); - - return sprintf(buf, "wmi:%s\n", guid_string); + return sprintf(buf, "wmi:%pUL\n", wblock->gblock.guid); } static DEVICE_ATTR_RO(modalias); @@ -695,7 +660,7 @@ static int wmi_dev_uevent(struct device *dev, struct kobj_uevent_env *env) if (!wblock) return -ENOMEM; - wmi_gtoa(wblock->gblock.guid, guid_string); + sprintf(guid_string, "%pUL", wblock->gblock.guid); strcpy(&env->buf[env->buflen - 1], "wmi:"); memcpy(&env->buf[env->buflen - 1 + 4], guid_string, 36); @@ -721,12 +686,9 @@ static struct class wmi_class = { static int wmi_create_device(const struct guid_block *gblock, struct wmi_block *wblock, acpi_handle handle) { - char guid_string[37]; - wblock->dev.class = &wmi_class; - wmi_gtoa(gblock->guid, guid_string); - dev_set_name(&wblock->dev, "%s", guid_string); + dev_set_name(&wblock->dev, "%pUL", gblock->guid); dev_set_drvdata(&wblock->dev, wblock); @@ -877,7 +839,6 @@ static void acpi_wmi_notify(struct acpi_device *device, u32 event) struct guid_block *block; struct wmi_block *wblock; struct list_head *p; - char guid_string[37]; list_for_each(p, &wmi_block_list) { wblock = list_entry(p, struct wmi_block, list); @@ -888,8 +849,8 @@ static void acpi_wmi_notify(struct acpi_device *device, u32 event) if (wblock->handler) wblock->handler(event, wblock->handler_data); if (debug_event) { - wmi_gtoa(wblock->gblock.guid, guid_string); - pr_info(&qu
[PATCH] drm: sti: remove redundant sign extensions
arg is long int, so arg = (arg << 22) >> 22 makes the upper 22 bits of arg equal to bit 9 (or bit 41). But we then mask away all but bits 0-9, so this is entirely redundant. Signed-off-by: Rasmus Villemoes --- gcc seems to be smart enough to realize this - the generated code is the same. This is thus just a tiny cleanup. drivers/gpu/drm/sti/sti_awg_utils.c | 4 1 file changed, 4 deletions(-) diff --git a/drivers/gpu/drm/sti/sti_awg_utils.c b/drivers/gpu/drm/sti/sti_awg_utils.c index 6029a2e3db1d..00d0698be9d3 100644 --- a/drivers/gpu/drm/sti/sti_awg_utils.c +++ b/drivers/gpu/drm/sti/sti_awg_utils.c @@ -65,7 +65,6 @@ static int awg_generate_instr(enum opcode opcode, mux = 0; data_enable = 0; - arg = (arg << 22) >> 22; arg &= (0x3ff); break; case REPEAT: @@ -77,14 +76,12 @@ static int awg_generate_instr(enum opcode opcode, mux = 0; data_enable = 0; - arg = (arg << 22) >> 22; arg &= (0x3ff); break; case JUMP: mux = 0; data_enable = 0; arg |= 0x40; /* for jump instruction 7th bit is 1 */ - arg = (arg << 22) >> 22; arg &= 0x3ff; break; case STOP: @@ -94,7 +91,6 @@ static int awg_generate_instr(enum opcode opcode, case RPTSET: case RPLSET: case HOLD: - arg = (arg << 24) >> 24; arg &= (0x0ff); break; default: -- 2.6.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] mm/maccess.c: actually return -EFAULT from strncpy_from_unsafe
As far as I can tell, strncpy_from_unsafe never returns -EFAULT. ret is the result of a __copy_from_user_inatomic(), which is 0 for success and positive (in this case necessarily 1) for access error - it is never negative. So we were always returning the length of the, possibly truncated, destination string. Signed-off-by: Rasmus Villemoes --- Probably not -stable-worthy. I can only find two callers, one of which ignores the return value. mm/maccess.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/maccess.c b/mm/maccess.c index 34fe24759ed1..d318db246826 100644 --- a/mm/maccess.c +++ b/mm/maccess.c @@ -99,5 +99,5 @@ long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count) pagefault_enable(); set_fs(old_fs); - return ret < 0 ? ret : src - unsafe_addr; + return ret ? -EFAULT : src - unsafe_addr; } -- 2.6.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] intel: i40e: fix confused code
This code is pretty confused. The variable name 'bytes_not_copied' clearly indicates that the programmer knew the semantics of copy_{to,from}_user, but then the return value is checked for being negative and used as a -Exxx return value. I'm not sure this is the proper fix, but at least we get rid of the dead code which pretended to check for access faults. Signed-off-by: Rasmus Villemoes --- There are other things worth looking at. i40e_dbg_netdev_ops_buf is a static buffer of size 256, which can be filled from user space (in i40e_dbg_netdev_ops_write). That function correctly checks that the input is at most 255 bytes. However, in i40e_dbg_netdev_ops_read we snprintf() it along a device name (and some white space) into kmalloc'ed buffer, also of size 256. Hence the snprintf output can be truncated, but snprintf() returns the total size that would be generated - so when we then proceed to using that in copy_to_user(), we may actually copy from beyond the allocated buffer, hence leaking a little kernel data. In i40e_dbg_command_write, we allocate a buffer based on count which is user-supplied. While kmalloc() refuses completely insane sizes, we may still allocate a few MB. Moreover, if allocation fails, returning 'count' is rather odd; -ENOMEM would make more sense. drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 24 ++-- 1 file changed, 10 insertions(+), 14 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c index d7c15d17faa6..e9ecd3f9cafe 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c +++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c @@ -103,8 +103,8 @@ static ssize_t i40e_dbg_dump_read(struct file *filp, char __user *buffer, len = min_t(int, count, (i40e_dbg_dump_data_len - *ppos)); bytes_not_copied = copy_to_user(buffer, &i40e_dbg_dump_buf[*ppos], len); - if (bytes_not_copied < 0) - return bytes_not_copied; + if (bytes_not_copied) + return -EFAULT; *ppos += len; return len; @@ -353,8 +353,8 @@ static ssize_t i40e_dbg_command_read(struct file *filp, char __user *buffer, bytes_not_copied = copy_to_user(buffer, buf, len); kfree(buf); - if (bytes_not_copied < 0) - return bytes_not_copied; + if (bytes_not_copied) + return -EFAULT; *ppos = len; return len; @@ -995,12 +995,10 @@ static ssize_t i40e_dbg_command_write(struct file *filp, if (!cmd_buf) return count; bytes_not_copied = copy_from_user(cmd_buf, buffer, count); - if (bytes_not_copied < 0) { + if (bytes_not_copied) { kfree(cmd_buf); - return bytes_not_copied; + return -EFAULT; } - if (bytes_not_copied > 0) - count -= bytes_not_copied; cmd_buf[count] = '\0'; cmd_buf_tmp = strchr(cmd_buf, '\n'); @@ -2034,8 +2032,8 @@ static ssize_t i40e_dbg_netdev_ops_read(struct file *filp, char __user *buffer, bytes_not_copied = copy_to_user(buffer, buf, len); kfree(buf); - if (bytes_not_copied < 0) - return bytes_not_copied; + if (bytes_not_copied) + return -EFAULT; *ppos = len; return len; @@ -2068,10 +2066,8 @@ static ssize_t i40e_dbg_netdev_ops_write(struct file *filp, memset(i40e_dbg_netdev_ops_buf, 0, sizeof(i40e_dbg_netdev_ops_buf)); bytes_not_copied = copy_from_user(i40e_dbg_netdev_ops_buf, buffer, count); - if (bytes_not_copied < 0) - return bytes_not_copied; - else if (bytes_not_copied > 0) - count -= bytes_not_copied; + if (bytes_not_copied) + return -EFAULT; i40e_dbg_netdev_ops_buf[count] = '\0'; buf_tmp = strchr(i40e_dbg_netdev_ops_buf, '\n'); -- 2.6.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 0/6] kernel/cpu.c: eliminate some indirection
On Tue, Oct 06 2015, Rasmus Villemoes wrote: > v2: fix build failure on ppc, add acks. Does anyone want to take these through their tree? Rasmus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/