On Wed, Nov 20, 2013 at 01:48:06AM +0000, Phillip Lougher wrote: > Add a multi-threaded decompression implementation which uses > percpu variables. > > Using percpu variables has advantages and disadvantages over > implementations which do not use percpu variables. > > Advantages: > * the nature of percpu variables ensures decompression is > load-balanced across the multiple cores. > * simplicity. > > Disadvantages: it limits decompression to one thread per core. > > V2: > * squashfs_decompressor_create: improve error handling path, re freeing > of decompressors and comp_opts > * decompressor_multi_percpu.c: include percpu.h header > * Kconfig: indentation > > Signed-off-by: Phillip Lougher <phil...@squashfs.org.uk> > --- > fs/squashfs/Kconfig | 57 ++++++++++++++----- > fs/squashfs/Makefile | 10 +--- > fs/squashfs/decompressor_multi_percpu.c | 98 > +++++++++++++++++++++++++++++++++ > 3 files changed, 145 insertions(+), 20 deletions(-) > create mode 100644 fs/squashfs/decompressor_multi_percpu.c > > diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig > index 1c6d340..159bd66 100644 > --- a/fs/squashfs/Kconfig > +++ b/fs/squashfs/Kconfig > @@ -25,6 +25,50 @@ config SQUASHFS > > If unsure, say N. > > +choice > + prompt "Decompressor parallelisation options"
Nitpick: How about adding default explicitly? default SQUASHFS_DECOMP_SINGLE > + depends on SQUASHFS > + help > + Squashfs now supports three parallelisation options for > + decompression. Each one exhibits various trade-offs between > + decompression performance and CPU and memory usage. > + > + If in doubt, select "Single threaded compression" > + > +config SQUASHFS_DECOMP_SINGLE > + bool "Single threaded compression" > + help > + Traditionally Squashfs has used single-threaded decompression. > + Only one block (data or metadata) can be decompressed at any > + one time. This limits CPU and memory usage to a minimum. > + > +config SQUASHFS_DECOMP_MULTI > + bool "Use multiple decompressors for parallel I/O" > + help > + By default Squashfs uses a single decompressor but it gives > + poor performance on parallel I/O workloads when using multiple CPU > + machines due to waiting on decompressor availability. > + > + If you have a parallel I/O workload and your system has enough memory, > + using this option may improve overall I/O performance. > + > + This decompressor implementation uses up to two parallel > + decompressors per core. It dynamically allocates decompressors > + on a demand basis. > + > +config SQUASHFS_DECOMP_MULTI_PERCPU > + bool "Use percpu multiple decompressors for parallel I/O" > + help > + By default Squashfs uses a single decompressor but it gives > + poor performance on parallel I/O workloads when using multiple CPU > + machines due to waiting on decompressor availability. > + > + This decompressor implementation uses a maximum of one > + decompressor per core. It uses percpu variables to ensure Minor: ^ unnecessary white space. > + decompression is load-balanced across the cores. Actually, I am not sure it's good idea to mention percpu in description. Normal people wouldn't know that and I think what they can want to know is what's benefit compared to SQUASHFS_DECOMP_MULTI. How about this? This decompressor implementation uses a maximum of one decompressor per core and the decompressor is allocated statically so memory footprint is small and limited and I/O cannot be fluctuated by not failing decompressor dynamic allocation compared to SQUAHSDS_DECOMP_MULTI. And I'd like to see what's your point about "decompression is load-balanced across the cores". If scheduler assigns process A, B, C into a core, it couldn't be load-balanced. If scheduler assign process A, B, C into each core, it could be load-balanced. And it's same with SQUSHFS_DECOMP_MULTI. Could you elaborate it a bit? Otherwise, looks good to me. Thanks! > + > +endchoice > + > config SQUASHFS_XATTR > bool "Squashfs XATTR support" > depends on SQUASHFS > @@ -63,19 +107,6 @@ config SQUASHFS_LZO > > If unsure, say N. > > -config SQUASHFS_MULTI_DECOMPRESSOR > - bool "Use multiple decompressors for handling parallel I/O" > - depends on SQUASHFS > - help > - By default Squashfs uses a single decompressor but it gives > - poor performance on parallel I/O workloads when using multiple CPU > - machines due to waiting on decompressor availability. > - > - If you have a parallel I/O workload and your system has enough memory, > - using this option may improve overall I/O performance. > - > - If unsure, say N. > - > config SQUASHFS_XZ > bool "Include support for XZ compressed file systems" > depends on SQUASHFS > diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile > index dfebc3b..5833b96 100644 > --- a/fs/squashfs/Makefile > +++ b/fs/squashfs/Makefile > @@ -5,14 +5,10 @@ > obj-$(CONFIG_SQUASHFS) += squashfs.o > squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o > squashfs-y += namei.o super.o symlink.o decompressor.o > - > +squashfs-$(CONFIG_SQUASHFS_DECOMP_SINGLE) += decompressor_single.o > +squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI) += decompressor_multi.o > +squashfs-$(CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU) += > decompressor_multi_percpu.o > squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o > squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o > squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o > squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o > - > -ifdef CONFIG_SQUASHFS_MULTI_DECOMPRESSOR > - squashfs-y += decompressor_multi.o > -else > - squashfs-y += decompressor_single.o > -endif > diff --git a/fs/squashfs/decompressor_multi_percpu.c > b/fs/squashfs/decompressor_multi_percpu.c > new file mode 100644 > index 0000000..0e7b679 > --- /dev/null > +++ b/fs/squashfs/decompressor_multi_percpu.c > @@ -0,0 +1,98 @@ > +/* > + * Copyright (c) 2013 > + * Phillip Lougher <phil...@squashfs.org.uk> > + * > + * This work is licensed under the terms of the GNU GPL, version 2. See > + * the COPYING file in the top-level directory. > + */ > + > +#include <linux/types.h> > +#include <linux/slab.h> > +#include <linux/percpu.h> > +#include <linux/buffer_head.h> > + > +#include "squashfs_fs.h" > +#include "squashfs_fs_sb.h" > +#include "decompressor.h" > +#include "squashfs.h" > + > +/* > + * This file implements multi-threaded decompression using percpu > + * variables, one thread per cpu core. > + */ > + > +struct squashfs_stream { > + void *stream; > +}; > + > +void *squashfs_decompressor_create(struct squashfs_sb_info *msblk, > + void *comp_opts) > +{ > + struct squashfs_stream *stream; > + struct squashfs_stream __percpu *percpu; > + int err, cpu; > + > + percpu = alloc_percpu(struct squashfs_stream); > + if (percpu == NULL) > + return ERR_PTR(-ENOMEM); > + > + for_each_possible_cpu(cpu) { > + stream = per_cpu_ptr(percpu, cpu); > + stream->stream = msblk->decompressor->init(msblk, comp_opts); > + if (IS_ERR(stream->stream)) { > + err = PTR_ERR(stream->stream); > + goto out; > + } > + } > + > + kfree(comp_opts); > + return (__force void *) percpu; > + > +out: > + for_each_possible_cpu(cpu) { > + stream = per_cpu_ptr(percpu, cpu); > + if (!IS_ERR_OR_NULL(stream->stream)) > + msblk->decompressor->free(stream->stream); > + } > + free_percpu(percpu); > + return ERR_PTR(err); > +} > + > +void squashfs_decompressor_destroy(struct squashfs_sb_info *msblk) > +{ > + struct squashfs_stream __percpu *percpu = > + (struct squashfs_stream __percpu *) msblk->stream; > + struct squashfs_stream *stream; > + int cpu; > + > + if (msblk->stream) { > + for_each_possible_cpu(cpu) { > + stream = per_cpu_ptr(percpu, cpu); > + msblk->decompressor->free(stream->stream); > + } > + free_percpu(percpu); > + } > +} > + > +int squashfs_decompress(struct squashfs_sb_info *msblk, > + void **buffer, struct buffer_head **bh, int b, int offset, int length, > + int srclength, int pages) > +{ > + struct squashfs_stream __percpu *percpu = > + (struct squashfs_stream __percpu *) msblk->stream; > + struct squashfs_stream *stream = get_cpu_ptr(percpu); > + int res = msblk->decompressor->decompress(msblk, stream->stream, buffer, > + bh, b, offset, length, srclength, pages); > + put_cpu_ptr(stream); > + > + if (res < 0) > + ERROR("%s decompression failed, data probably corrupt\n", > + msblk->decompressor->name); > + > + return res; > +} > + > +int squashfs_max_decompressors(void) > +{ > + return num_possible_cpus(); > +} > -- > 1.8.3.2 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Kind regards, Minchan Kim -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/