date:20161214

Re: [PATCH 09/18] arm64: introduce binfmt_elf32.c

2016-12-14 Thread Yury Norov

On Mon, Dec 05, 2016 at 03:10:19PM +, Catalin Marinas wrote:
> On Fri, Oct 21, 2016 at 11:33:08PM +0300, Yury Norov wrote:
> > As we support more than one compat formats, it looks more reasonable
> > to not use fs/compat_binfmt.c. Custom binfmt_elf32.c allows to move aarch32
> > specific definitions there and make code more maintainable and readable.
> 
> Can you remind me why we need this patch (rather than using the default
> fs/compat_binfmt_elf.c which you include here anyway)?

https://patchwork.kernel.org/patch/8756121/

This is mostly to avoid runtime checks and hide some re-definitions
for aarch32 from ilp32, to avoid re-re-definition.

> 
> > --- /dev/null
> > +++ b/arch/arm64/kernel/binfmt_elf32.c
> > @@ -0,0 +1,31 @@
> > +/*
> > + * Support for AArch32 Linux ELF binaries.
> > + */
> > +
> > +/* AArch32 EABI. */
> > +#define EF_ARM_EABI_MASK   0xff00
> > +
> > +#define compat_start_thread		compat_start_thread
> > +#define COMPAT_SET_PERSONALITY(ex) \
> > +do {   \
> > +   clear_thread_flag(TIF_32BIT_AARCH64);   \
> > +   set_thread_flag(TIF_32BIT); \
> > +} while (0)
> 
> You introduce this here but it seems to still be present in asm/elf.h.

Hmm... Maybe the chunk that deletes it from asm/elf.h was dropped during
some rebase. Thank you for the catch. I'll check it again.

Yury
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] kasan: Support for r/w instrumentation control

2016-12-14 Thread Andrey Ryabinin



On 12/13/2016 11:58 AM, Dmitry Vyukov wrote:

> --- a/Documentation/dev-tools/kasan.rst
> +++ b/Documentation/dev-tools/kasan.rst
> @@ -40,6 +40,14 @@ similar to the following to the respective kernel Makefile:
> 
>  KASAN_SANITIZE := n
> 
> +Sometimes it may be useful to disable instrumentation of reads, or writes
> +or both for the entire kernel. For example, if binary size is a concern,
> +it may be useful to disable instrumentation of reads to reduce binary size but
> +still catch more harmful bugs on writes. Or, if one is interested only in
> +sanitization of a particular module and performance is a concern, she can
> +disable instrumentation of both reads and writes for kernel code.
> +Instrumentation can be disabled with CONFIG_KASAN_READS and CONFIG_KASAN_WRITES.
> +

I don't understand this. How can this be related to modules? Configs are global.
You can't just disable/enable a config per module.

>  Error reports
> 

[PATCH] doc: Explain light-handed markup preference a bit better

2016-12-14 Thread Daniel Vetter

We already had a super-short blurb, but worth extending it I think:
We're still pretty far away from anything like a consensus, but
there's clearly a lot of people who prefer an as-light as possible
approach to converting existing .txt files to .rst. Make sure this is
properly taken into account and clear.

Motivated by discussions with Peter and Christoph and others.

v2:
- Mention that existing headings should be kept when converting
  existing .txt files (Mauro).
- Explain that we prefer :: for quoting code, it's easier on the
  eyes (Mauro).
- Explain that blindly converting outdated docs is harmful. Motivated
  by comments Peter made in our discussion.

v3: Make the explanations around fixed-width quoting more concise
(Jani).

v4:
- Rebase onto docs-4.10.
- Go with the more terse recommendation from Jani, defer to the much
  more detailed conversion guide Mauro is working on for details.

Cc: Jonathan Corbet 
Cc: linux-doc@vger.kernel.org
Cc: Christoph Hellwig 
Cc: Peter Zijlstra 
Cc: Jani Nikula 
Cc: Mauro Carvalho Chehab 
Signed-off-by: Daniel Vetter 
---
 Documentation/doc-guide/sphinx.rst | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/Documentation/doc-guide/sphinx.rst b/Documentation/doc-guide/sphinx.rst
index 96fe7ccb2c67..532d65b70500 100644
--- a/Documentation/doc-guide/sphinx.rst
+++ b/Documentation/doc-guide/sphinx.rst
@@ -73,7 +73,16 @@ Specific guidelines for the kernel documentation
 
 Here are some specific guidelines for the kernel documentation:
 
-* Please don't go overboard with reStructuredText markup. Keep it simple.
+* Please don't go overboard with reStructuredText markup. Keep it
+  simple. For the most part the documentation should be plain text with
+  just enough consistency in formatting that it can be converted to
+  other formats.
+
+* Please keep the formatting changes minimal when converting existing
+  documentation to reStructuredText.
+
+* Also update the content, not just the formatting, when converting
+  documentation.
 
 * Please stick to this order of heading adornments:
 
@@ -103,6 +112,12 @@ Here are some specific guidelines for the kernel documentation:
   the order as encountered."), having the higher levels the same overall makes
   it easier to follow the documents.
 
+* For inserting fixed width text blocks (for code examples, use case
+  examples, etc.), use ``::`` for anything that doesn't really benefit
+  from syntax highlighting, especially short snippets. Use
+  ``.. code-block:: <language>`` for longer code blocks that benefit
+  from highlighting.
+
 
 the C domain
 
-- 
2.11.0


Re: [RFC 03/10] kmod: add dynamic max concurrent thread count

2016-12-14 Thread Petr Mladek

On Thu 2016-12-08 11:48:14, Luis R. Rodriguez wrote:
> We currently statically limit the number of modprobe threads which
> we allow to run concurrently to 50. As per Keith Owens, this was a
> completely arbitrary value, and it was set in the 2.3.38 days [0]
> over 16 years ago in year 2000.
> 
> Although we haven't yet hit our lower limits, experimentation [1]
> shows that if and when we hit this limit, the worst case will be
> fatal -- consider get_fs_type() failures upon mount on a system which
> has many partitions, some of which might even be with the same
> filesystem. It's best to be prudent and set this value to something
> more sensible, which ensures we're far from hitting the limit and
> also allows a default build / user run time override.
> 
> The worst case is fatal given that once a module fails to load there
> is a period of time during which subsequent requests for the same module
> will fail, so in the case of partitions it's not just one request that
> could fail, but a whole series of partitions. This latter issue of a
> module request failure domino effect can be addressed later, but
> increasing the limit to something more meaningful should at least give us
> enough cushion to avoid this for a while.
> 
> Set this value up with slightly more meaningful modern limits:
> 
> Bump this up to 64  max for small systems (CONFIG_BASE_SMALL)
> Bump this up to 128 max for larger systems (!CONFIG_BASE_SMALL)
> 
> diff --git a/init/Kconfig b/init/Kconfig
> index 271692a352f1..da2c25746937 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -2111,6 +2111,29 @@ config TRIM_UNUSED_KSYMS
>  
> If unsure, or if you need to build out-of-tree modules, say N.
>  
> +config MAX_KMOD_CONCURRENT
> + int "Max allowed concurrent request_module() calls (6=>64, 10=>1024)"
> + range 0 14

Wouldn't too small a range break loading of module dependencies?
I am not sure how it is implemented, but it might require having
some more module loads in progress.

I would make 6 the minimum. Nobody has trouble with the current limit.

> + default 6 if !BASE_SMALL
> + default 7 if BASE_SMALL

Aren't the conditions inverted?

> diff --git a/init/main.c b/init/main.c
> index 8161208d4ece..1fa441aa32c6 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -638,6 +638,7 @@ asmlinkage __visible void __init start_kernel(void)
>   thread_stack_cache_init();
>   cred_init();
>   fork_init();
> + init_kmod_umh();
>   proc_caches_init();
>   buffer_init();
>   key_init();
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 0277d1216f80..cb6f7ca7b8a5 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -186,6 +174,31 @@ int __request_module(bool wait, const char *fmt, ...)
>   return ret;
>  }
>  EXPORT_SYMBOL(__request_module);
> +
> +/*
> + * If modprobe needs a service that is in a module, we get a recursive
> + * loop.  Limit the number of running kmod threads to max_threads/2 or
> + * CONFIG_MAX_KMOD_CONCURRENT, whichever is the smaller.  A cleaner method
> + * would be to run the parents of this process, counting how many times
> + * kmod was invoked.  That would mean accessing the internals of the
> + * process tables to get the command line, proc_pid_cmdline is static
> + * and it is not worth changing the proc code just to handle this case.
> + *
> + * "trace the ppid" is simple, but will fail if someone's
> + * parent exits.  I think this is as good as it gets.
> + *
> + * You can override this with a kernel parameter, for instance to allow
> + * 4096 concurrent modprobe instances:
> + *
> + *   kmod.max_modprobes=4096
> + */
> +void __init init_kmod_umh(void)
> +{
> + if (!max_modprobes)
> + max_modprobes = min(max_threads/2,
> + 2 << CONFIG_MAX_KMOD_CONCURRENT);

This should be

1 << CONFIG_MAX_KMOD_CONCURRENT);

1 << 1 = 2;

Note that this calculation is also mentioned in some comments and
documentation.
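
The difference is easy to check in isolation. Below is a minimal userspace sketch, not kernel code; the helper names kmod_limit_from_order() and kmod_effective_limit() are made up for illustration and do not appear in the patch. It assumes the Kconfig value is meant to be a power-of-two exponent, as the "(6=>64, 10=>1024)" prompt text suggests:

```c
/* Hypothetical helper: turn a Kconfig-style order into a thread limit. */
static unsigned int kmod_limit_from_order(unsigned int order)
{
	return 1U << order;	/* order 6 -> 64, order 10 -> 1024 */
}

/* Same clamp as in the quoted hunk: never exceed half of max_threads. */
static unsigned int kmod_effective_limit(unsigned int max_threads,
					 unsigned int order)
{
	unsigned int by_order = kmod_limit_from_order(order);
	unsigned int by_threads = max_threads / 2;

	return by_order < by_threads ? by_order : by_threads;
}
```

With "2 <<" instead of "1 <<", an order of 6 would yield 128 rather than the documented 64.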

Best Regards,
Petr

Re: [RFC 06/10] kmod: provide sanity check on kmod_concurrent access

2016-12-14 Thread Petr Mladek

On Thu 2016-12-08 11:48:50, Luis R. Rodriguez wrote:
> Only decrement *iff* we're positive. Warn if we've hit
> a situation where the counter is already 0 after we're done
> with a modprobe call, this would tell us we have an unaccounted
> counter access -- this in theory should not be possible as
> only one routine controls the counter, however preemption is
> one case that could trigger this situation. Avoid that situation
> by disabling preemption while we access the counter.

I am curious about this. How could enabled preemption cause
the counter to become negative?

Unaccounted access would be possible if put() is called
without get() or if put() is called before get().

I do not see how the value could become negative when
the calls are paired and ordered.
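
To make the pairing argument concrete, here is a small single-threaded sketch. The names umh_counter, umh_get() and umh_put() are hypothetical, and a plain int stands in for whatever atomic type the real code uses, so this only illustrates the accounting and says nothing about preemption:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kmod concurrency counter. */
struct umh_counter {
	int in_flight;	/* get() calls not yet matched by a put() */
	int limit;	/* maximum allowed concurrent users */
};

/* Returns true and takes a slot if one is free. */
static bool umh_get(struct umh_counter *c)
{
	if (c->in_flight >= c->limit)
		return false;
	c->in_flight++;
	return true;
}

/*
 * Returns true for a balanced put(); returns false and leaves the
 * counter at 0 when there was no matching get().
 */
static bool umh_put(struct umh_counter *c)
{
	if (c->in_flight <= 0)
		return false;	/* unaccounted access: put() without get() */
	c->in_flight--;
	return true;
}
```

As long as every successful umh_get() is followed by exactly one umh_put(), the false branch in umh_put() is never taken; only an unbalanced caller can reach it.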

Best Regards,
Petr

Re: [PATCH 0/2] Add maintainers to the admin guide

2016-12-14 Thread Joe Perches

On Tue, 2016-12-13 at 07:38 -0200, Mauro Carvalho Chehab wrote:
> On Mon, 12 Dec 2016 12:56:50 -0800
> Joe Perches wrote:
> > Does the boxing with the === blocks align properly?
> > Is it really useful?  Is there another/better way?
> 
> Do you mean those?
> 
>   =========================  ==============================
>   ``F:``  ``drivers/net/``   all files in and below
>                              ``drivers/net``
>   ``F:``  ``drivers/net/*``  all files in ``drivers/net``,
>                              but not below
>   ``F:``  ``*/net/*``        all files in "any top level
>                              directory" ``/net``
>   =========================  ==============================

Yes.

> This is a table. We might instead use a literal block, like:
> 
> ::
> 
>   ``F:``  ``drivers/net/``   all files in and below
>                              ``drivers/net``
>   ``F:``  ``drivers/net/*``  all files in ``drivers/net``,
>                              but not below
>   ``F:``  ``*/net/*``        all files in "any top level
>                              directory" ``/net``
> 
> But the result looks uglier when generating LaTeX or HTML, as it won't
> unwrap the continuation lines of the field descriptions.
> 
> Another alternative would be to use ascii artwork, like:
> 
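>  +---------------------------+--------------------------------+
>  | ``F:``  ``drivers/net/``  | all files in and below         |
>  |                           | ``drivers/net``                |
>  +---------------------------+--------------------------------+
>  | ``F:``  ``drivers/net/*`` | all files in ``drivers/net``,  |
>  |                           | but not below                  |
>  +---------------------------+--------------------------------+
>  | ``F:``  ``*/net/*``       | all files in "any top level    |
>  |                           | directory" ``/net``            |
>  +---------------------------+--------------------------------+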

Isn't the ascii art going to produce odd-looking
output after the sphinx conversion because of the
doubled quotes being converted to bold?

I suspect the table formatting just isn't necessary
and it could be paragraphed instead.


Re: [RFC 07/10] kmod: use simplified rate limit printk

2016-12-14 Thread Petr Mladek

On Thu 2016-12-08 11:49:01, Luis R. Rodriguez wrote:
> Just use the simplified rate limit printk when the max modprobe
> limit is reached, while at it throw out a bone should the error
> be triggered.
> 
> Signed-off-by: Luis R. Rodriguez 
> ---
>  kernel/kmod.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index 09cf35a2075a..ef65f4c3578a 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -158,7 +158,6 @@ int __request_module(bool wait, const char *fmt, ...)
>   va_list args;
>   char module_name[MODULE_NAME_LEN];
>   int ret;
> - static int kmod_loop_msg;
>  
>   /*
>* We don't allow synchronous module loading from async.  Module
> @@ -183,13 +182,8 @@ int __request_module(bool wait, const char *fmt, ...)
>  
>   ret = kmod_umh_threads_get();
>   if (ret) {
> - /* We may be blaming an innocent here, but unlikely */
> - if (kmod_loop_msg < 5) {
> - printk(KERN_ERR
> -"request_module: runaway loop modprobe %s\n",
> -module_name);
> - kmod_loop_msg++;
> - }
> + pr_err_ratelimited("request_module: modprobe limit (%u) reached with module %s\n",
> +max_modprobes, module_name);

I like this change. I would only be even more descriptive about which
limit is reached. Something like:

pr_err_ratelimited("request_module: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
		   module_name, max_modprobes);

Either way, feel free to add:

Reviewed-by: Petr Mladek 

Best Regards,
Petr


Re: [RFC 07/10] kmod: use simplified rate limit printk

2016-12-14 Thread Joe Perches

On Wed, 2016-12-14 at 17:23 +0100, Petr Mladek wrote:
> On Thu 2016-12-08 11:49:01, Luis R. Rodriguez wrote:
> > Just use the simplified rate limit printk when the max modprobe
> > limit is reached, while at it throw out a bone should the error
> > be triggered.
[]
> > diff --git a/kernel/kmod.c b/kernel/kmod.c
[]
> > @@ -183,13 +182,8 @@ int __request_module(bool wait, const char *fmt, ...)
> >  
> > ret = kmod_umh_threads_get();
> > if (ret) {
> > -   /* We may be blaming an innocent here, but unlikely */
> > -   if (kmod_loop_msg < 5) {
> > -   printk(KERN_ERR
> > -  "request_module: runaway loop modprobe %s\n",
> > -  module_name);
> > -   kmod_loop_msg++;
> > -   }
> > +   pr_err_ratelimited("request_module: modprobe limit (%u) reached with module %s\n",
> > +  max_modprobes, module_name);
> 
> I like this change. I would only be even more descriptive in which
> limit is reached. Something like
> 
>   pr_err_ratelimited("request_module: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
>  module_name, max_modprobes);
> 
> Either way, feel free to add:
> 
> Reviewed-by: Petr Mladek 

Seems sensible.

I suggest using "%s: ", __func__ instead of embedding
the function name.
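
As a rough userspace illustration of that suggestion: format_limit_msg() below is a made-up stand-in for the error path, and snprintf() replaces pr_err_ratelimited() so the result can be inspected. __func__ itself is standard C99:

```c
#include <stdio.h>

/* Hypothetical userspace stand-in for the error path in the patch. */
static int format_limit_msg(char *buf, size_t len,
			    const char *module_name,
			    unsigned int max_modprobes)
{
	/* __func__ expands to the name of the enclosing function. */
	return snprintf(buf, len,
			"%s: module \"%s\" reached limit (%u) of concurrent modprobe calls\n",
			__func__, module_name, max_modprobes);
}
```

The prefix then follows the function automatically if it is ever renamed.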


Re: [PATCH 0/2] Add maintainers to the admin guide

2016-12-14 Thread Mauro Carvalho Chehab

On Wed, 14 Dec 2016 08:14:44 -0800
Joe Perches wrote:

> On Tue, 2016-12-13 at 07:38 -0200, Mauro Carvalho Chehab wrote:
> > On Mon, 12 Dec 2016 12:56:50 -0800
> > Joe Perches wrote:
> > > Does the boxing with the === blocks align properly?
> > > Is it really useful?  Is there another/better way?
> > 
> > Do you mean those?
> > 
> >   =========================  ==============================
> >   ``F:``  ``drivers/net/``   all files in and below
> >                              ``drivers/net``
> >   ``F:``  ``drivers/net/*``  all files in ``drivers/net``,
> >                              but not below
> >   ``F:``  ``*/net/*``        all files in "any top level
> >                              directory" ``/net``
> >   =========================  ==============================
> 
> Yes.
> 
> > This is a table. We might instead use a literal block, like:
> > 
> > ::
> > 
> >   ``F:``  ``drivers/net/``   all files in and below
> >                              ``drivers/net``
> >   ``F:``  ``drivers/net/*``  all files in ``drivers/net``,
> >                              but not below
> >   ``F:``  ``*/net/*``        all files in "any top level
> >                              directory" ``/net``
> > 
> > But the result looks uglier when generating LaTeX or HTML, as it won't
> > unwrap the continuation lines of the field descriptions.
> > 
> > Another alternative would be to use ascii artwork, like:
> > 
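> >  +---------------------------+--------------------------------+
> >  | ``F:``  ``drivers/net/``  | all files in and below         |
> >  |                           | ``drivers/net``                |
> >  +---------------------------+--------------------------------+
> >  | ``F:``  ``drivers/net/*`` | all files in ``drivers/net``,  |
> >  |                           | but not below                  |
> >  +---------------------------+--------------------------------+
> >  | ``F:``  ``*/net/*``       | all files in "any top level    |
> >  |                           | directory" ``/net``            |
> >  +---------------------------+--------------------------------+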
> 
> Isn't the ascii art going to produce odd-looking
> output after the sphinx conversion because of the
> doubled quotes being converted to bold?

Doubled quotes should be converted to monospaced fonts, and not to
bold. We might remove the double quotes, but the end result would
be worse, as we would need to escape the asterisks.

> I suspect the table formatting just isn't necessary
> and it could be paragraphed instead.

We could use indented paragraphs instead, like:

   * ``F:`` ``drivers/net/``
   all files in and below ``drivers/net``

   * ``F:`` ``*/net/*``
   all files in "any top level directory" ``/net``

But, IMHO, it would look worse than tables on both ASCII and on
formatted outputs.

Thanks,
Mauro

Re: [RFC 06/10] kmod: provide sanity check on kmod_concurrent access

2016-12-14 Thread Luis R. Rodriguez

On Wed, Dec 14, 2016 at 05:08:58PM +0100, Petr Mladek wrote:
> On Thu 2016-12-08 11:48:50, Luis R. Rodriguez wrote:
> > Only decrement *iff* we're positive. Warn if we've hit
> > a situation where the counter is already 0 after we're done
> > with a modprobe call, this would tell us we have an unaccounted
> > counter access -- this in theory should not be possible as
> > only one routine controls the counter, however preemption is
> > one case that could trigger this situation. Avoid that situation
> > by disabling preemption while we access the counter.
> 
> I am curious about it. How could enabled preemption cause that
> the counter will get negative?

As the commit log describes, in theory this is not possible today,
as we only have one routine controlling the counter. If we
were to expand this, then such possibilities would become more real.

> Unaccounted access would be possible if put() is called
> without get() or if put() is called before get().

Exactly, so this guards against buggy users of the get/put calls in
the future. I can just drop the preemption disable / enable for now,
as it should not be an issue now.

> I do not see a way how the value might get negative when
> the calls are paired and ordered.

Right, this just keeps parity with module_put(). It's perhaps
*preemptively* too cautious though, so I could just drop the
preemption enable/disable for now, as that would slow things
down a bit.

  Luis

Re: [PATCH] Add +~800M crashkernel explaination

2016-12-14 Thread Robert LeBlanc

On Tue, Dec 13, 2016 at 8:08 PM, Xunlei Pang  wrote:
> On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote:
>> On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He  wrote:
>>> On 12/09/16 at 05:22pm, Robert LeBlanc wrote:
>>>> When trying to configure crashkernel greater than about 800 MB, the
>>>> kernel fails to allocate memory on x86 and x86_64. This is due to an
>>>> undocumented limit that the crashkernel and other low memory items must
>>>> be allocated below 896 MB unless the ",high" option is given. This
>>>> updates the documentation to explain this and what I understand the
>>>> limitations to be on the option.
>>> This is true, but not very accurate. You found it's about 800M; that's
>>> because the current kernel usually needs about 40M of space to run, plus
>>> some extra reservation before reserve_crashkernel is invoked, another ~10M.
>>> However, that's the normal case; people may build modules in or have some
>>> special code that bloats the kernel. This patch makes sense to address the
>>> low|high issue, but it might not be good to state ~800M so definitively.
>> My testing showed that I could go anywhere from about 830M to 880M,
>> depending on distro, kernel version, and stuff that you mentioned. I
>> just thought some rule of thumb of when to consider using high would
>> be good. People may not think that 800 MB is 'large' when you have 512
>> GB of RAM for instance. I thought about making 512 MB be the rule of
>> thumb, but you can do a lot with ~300 MB.
>
> Hi Robert,
>
> I think you are correct.
>
> For x86, the kernel uses memblock to locate a suitable range starting from 16MB to some "end";
> without the "high" prefix, "end" is CRASH_ADDR_LOW_MAX, otherwise CRASH_ADDR_HIGH_MAX.
>
> You can find the definition for both 32-bit and 64-bit:
> #ifdef CONFIG_X86_32
> # define CRASH_ADDR_LOW_MAX	(512 << 20)
> # define CRASH_ADDR_HIGH_MAX	(512 << 20)
> #else
> # define CRASH_ADDR_LOW_MAX	(896UL << 20)
> # define CRASH_ADDR_HIGH_MAX	MAXMEM
> #endif
>
> As some memory is already allocated by the kernel, it is highly likely to get a reservation
> failure after specifying a crashkernel value near 800MB (for x86_64), which is what you hit. We can't
> give the exact threshold, but it would be better to have some explanation of this in the document.
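
To put rough numbers on the explanation above, here is a small userspace sketch, not kernel code. The helper crash_fits_low() and the MB() macro are made up for illustration; the limits are the ones quoted above, and the ~50MB of earlier reservations is the estimate given earlier in this thread. Real memblock allocation also depends on alignment and fragmentation, which this ignores:

```c
#include <stdbool.h>

#define MB(x)		((unsigned long long)(x) << 20)
#define SEARCH_START	MB(16)	/* search starts at 16MB, per the thread */
#define X86_64_LOW_MAX	MB(896)	/* CRASH_ADDR_LOW_MAX on 64-bit */
#define X86_32_LOW_MAX	MB(512)	/* CRASH_ADDR_LOW_MAX on 32-bit */

/*
 * Hypothetical check: can `size` bytes fit between 16MB and `low_max`
 * once `already_used` bytes of that window are taken?
 */
static bool crash_fits_low(unsigned long long size,
			   unsigned long long low_max,
			   unsigned long long already_used)
{
	unsigned long long window = low_max - SEARCH_START;

	if (already_used >= window)
		return false;
	return size <= window - already_used;
}
```

Under these assumptions the x86_64 window is 880MB, which leaves about 830MB once ~50MB is already reserved; that is in line with the 830M to 880M range seen in testing.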

To make sure I'm understanding what you are saying, you want me to go
into a bit more detail about the limitation and specify the
differences between x86 and x86_64, right?

>> I'm happy to adjust the wording, what would you recommend? Also, I'm
>> not 100% sure that I got the cases covered correctly. I was surprised
>> that I could not get it to work with the "new" format with the
>> multiple ranges, and that specifying an offset wouldn't work either,
>> although the offset kind of makes sense. Do you know for sure that it
>> doesn't work with ranges?
>>
>> I tried,
>>
>> crashkernel=256M-1G:128M,high,1G-4G:256M,high,4G-:512M,high
>>
>> and
>>
>> crashkernel=256M-1G:128M,1G-4G:256M,4G-:512M,high
>>
>> and neither worked. It seems that a better separator would be ';'
>> instead of ',' for ranges, then you could specify options better. Kind
>> of hard to change now.
>
> For "crashkernel=range1:size1[,range2:size2,...][@offset]"
> I'm afraid it doesn't support the "high" prefix in the current implementation, so there is no guarantee.
> I guess we can add a note to eliminate the confusion.

I tried to express in the extended syntax section that ',high' is not
available and you have to use the 'simple' format. Do you think this
needs to be expanded as well?
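
To show why ',high' collides with the range separator, here is an illustrative sketch. It is not the kernel's parser; count_non_range_entries() is a made-up helper that simply splits on ',' and counts the entries that lack the ':' of a range:size pair:

```c
#include <string.h>

/*
 * Count comma-separated entries that lack the ':' of a range:size pair.
 * Illustrative only; this is not how the kernel parses crashkernel=.
 */
static int count_non_range_entries(const char *spec)
{
	int bad = 0;
	const char *p = spec;

	while (*p) {
		size_t len = strcspn(p, ",");	/* length of this entry */

		if (!memchr(p, ':', len))
			bad++;
		p += len;
		if (*p == ',')
			p++;
	}
	return bad;
}
```

Fed the two strings I tried (without the "crashkernel=" prefix), every "high" shows up as an entry that is not a range, so a comma-splitting parser has no way to tell it apart from a malformed range.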



Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

>>>> Signed-off-by: Robert LeBlanc 
>>>> ---
>>>>  Documentation/kdump/kdump.txt | 22 +++++++++++++++++-----
>>>>  1 file changed, 17 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
>>>> index b0eb27b..aa3efa8 100644
>>>> --- a/Documentation/kdump/kdump.txt
>>>> +++ b/Documentation/kdump/kdump.txt
>>>> @@ -256,7 +256,9 @@ While the "crashkernel=size[@offset]" syntax is sufficient for most
>>>>  configurations, sometimes it's handy to have the reserved memory dependent
>>>>  on the value of System RAM -- that's mostly for distributors that pre-setup
>>>>  the kernel command line to avoid a unbootable system after some memory has
>>>> -been removed from the machine.
>>>> +been removed from the machine. If you need to allocate more than ~800M
>>>> +for x86 or x86_64 then you must use the simple format as the format
>>>> +',high' conflicts with the separators of ranges.
>>>>
>>>>  The syntax is:
>>>>
>>>> @@ -282,11 +284,21 @@ Boot into System Kernel
>>>>  1) Update the boot loader (such as grub, yaboot, or lilo) configuration
>>>>     files as necessary.
>>>>
>>>> -2) Boot the system kernel with the boot parameter "crashkernel=Y@X",
>>>> +2) Boot the system kernel with the boot parameter "crashkernel=Y[@X | ,high]",
>>>>     wh

Re: [PATCH] Add +~800M crashkernel explaination

2016-12-14 Thread Xunlei Pang

On 12/15/2016 at 01:50 AM, Robert LeBlanc wrote:
> On Tue, Dec 13, 2016 at 8:08 PM, Xunlei Pang  wrote:
>> On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote:
>>> On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He  wrote:
>>>> On 12/09/16 at 05:22pm, Robert LeBlanc wrote:
>>>>> When trying to configure crashkernel greater than about 800 MB, the
>>>>> kernel fails to allocate memory on x86 and x86_64. This is due to an
>>>>> undocumented limit that the crashkernel and other low memory items must
>>>>> be allocated below 896 MB unless the ",high" option is given. This
>>>>> updates the documentation to explain this and what I understand the
>>>>> limitations to be on the option.
>>>> This is true, but not very accurate. You found it's about 800M; that's
>>>> because the current kernel usually needs about 40M of space to run, plus
>>>> some extra reservation before reserve_crashkernel is invoked, another ~10M.
>>>> However, that's the normal case; people may build modules in or have some
>>>> special code that bloats the kernel. This patch makes sense to address the
>>>> low|high issue, but it might not be good to state ~800M so definitively.
>>> My testing showed that I could go anywhere from about 830M to 880M,
>>> depending on distro, kernel version, and stuff that you mentioned. I
>>> just thought some rule of thumb of when to consider using high would
>>> be good. People may not think that 800 MB is 'large' when you have 512
>>> GB of RAM for instance. I thought about making 512 MB be the rule of
>>> thumb, but you can do a lot with ~300 MB.
>> Hi Robert,
>>
>> I think you are correct.
>>
>> For x86, the kernel uses memblock to locate a suitable range starting from 16MB to some "end";
>> without the "high" prefix, "end" is CRASH_ADDR_LOW_MAX, otherwise CRASH_ADDR_HIGH_MAX.
>>
>> You can find the definition for both 32-bit and 64-bit:
>> #ifdef CONFIG_X86_32
>> # define CRASH_ADDR_LOW_MAX	(512 << 20)
>> # define CRASH_ADDR_HIGH_MAX	(512 << 20)
>> #else
>> # define CRASH_ADDR_LOW_MAX	(896UL << 20)
>> # define CRASH_ADDR_HIGH_MAX	MAXMEM
>> #endif
>>
>> As some memory is already allocated by the kernel, it is highly likely to get a reservation
>> failure after specifying a crashkernel value near 800MB (for x86_64), which is what you hit. We can't
>> give the exact threshold, but it would be better to have some explanation of this in the document.
> To make sure I'm understanding what you are saying, you want me to go
> into a bit more detail about the limitation and specify the
> differences between x86 and x86_64, right?

Yeah, it would be better to have one, at least to mention the different
upper bounds.

As I replied in another post, if you really want to detail the behaviour,
you should mention "crashkernel=size[KMG][@offset[KMG]]" with @offset[KMG]
specified explicitly. After all, it's handled differently, with no upper
bound limitation, but doing this may put the first kernel at risk of
lacking low memory (some devices require 32-bit DMA). It must be used with
care, because the kernel will assume users are aware of what they are
doing and will make a successful reservation as long as the given range is
available.

>
>>> I'm happy to adjust the wording, what would you recommend? Also, I'm
>>> not 100% sure that I got the cases covered correctly. I was surprised
>>> that I could not get it to work with the "new" format with the
>>> multiple ranges, and that specifying an offset wouldn't work either,
>>> although the offset kind of makes sense. Do you know for sure that it
>>> doesn't work with ranges?
>>>
>>> I tried,
>>>
>>> crashkernel=256M-1G:128M,high,1G-4G:256M,high,4G-:512M,high
>>>
>>> and
>>>
>>> crashkernel=256M-1G:128M,1G-4G:256M,4G-:512M,high
>>>
>>> and neither worked. It seems that a better separator would be ';'
>>> instead of ',' for ranges, then you could specify options better. Kind
>>> of hard to change now.
>> For "crashkernel=range1:size1[,range2:size2,...][@offset]"
>> I'm afraid it doesn't support the "high" prefix in the current implementation, so there is no guarantee.
>> I guess we can add a note to eliminate the confusion.
> I tried to express in the extended syntax section that ',high' is not
> available and you have to use the 'simple' format. Do you think this

ditto

> needs to be expanded as well?

If you really have good reasons or use cases, please try it :-)

Regards,
Xunlei

>
>
> 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>
>>>>> Signed-off-by: Robert LeBlanc 
>>>>> ---
>>>>>  Documentation/kdump/kdump.txt | 22 +++++++++++++++++-----
>>>>>  1 file changed, 17 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
>>>>> index b0eb27b..aa3efa8 100644
>>>>> --- a/Documentation/kdump/kdump.txt
>>>>> +++ b/Documentation/kdump/kdump.txt
>>>>> @@ -256,7 +256,9 @@ While the "crashkernel=size[@offset]" syntax is sufficient for most
>>>>>  configurations,

Re: [RFC 10/10] kmod: add a sanity check on module loading

2016-12-14 Thread Rusty Russell

"Luis R. Rodriguez"  writes:
> kmod has an optimization in place whereby if some kernel code
> uses request_module() on a module already loaded we never bother
> userspace as the module already is loaded. This is not true for
> get_fs_type() though as it uses aliases.

Well, the obvious thing to do here is block kmod if we're currently
loading the same module.  Otherwise it has to do some weird spinning
thing in userspace anyway.

We already have module_wq for this, we just need a bit more code to
share the return value; and there's a weird corner case there where we
have "modprobe foo param=invalid" then "modprobe foo param=valid" and we
fail both with -EINVAL, but it's probably not worth fixing.

Cheers,
Rusty.

[PATCH 6/8] Documentation/sparse: drop __CHECK_ENDIAN__

2016-12-14 Thread Michael S. Tsirkin

It's no longer used.

Signed-off-by: Michael S. Tsirkin 
---
 Documentation/translations/zh_CN/sparse.txt | 7 +------
 Documentation/dev-tools/sparse.rst          | 7 +------
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/Documentation/translations/zh_CN/sparse.txt b/Documentation/translations/zh_CN/sparse.txt
index cc144e5..e41dc94 100644
--- a/Documentation/translations/zh_CN/sparse.txt
+++ b/Documentation/translations/zh_CN/sparse.txt
@@ -92,9 +92,4 @@ DaveJ 把每小时自动生成的 git 源码树 tar 包放在以下地址：
 如果你已经编译了内核，用后一种方式可以很快地检查整个源码树。
 
 make 的可选变量 CHECKFLAGS 可以用来向 sparse 工具传递参数。编译系统会自
-动向 sparse 工具传递 -Wbitwise 参数。你可以定义 __CHECK_ENDIAN__ 来进行
-大小尾检查。
-
-   make C=2 CHECKFLAGS="-D__CHECK_ENDIAN__"
-
-这些检查默认都是被关闭的，因为他们通常会产生大量的警告。
+动向 sparse 工具传递 -Wbitwise 参数。
diff --git a/Documentation/dev-tools/sparse.rst b/Documentation/dev-tools/sparse.rst
index e08e6a8..78aa00a 100644
--- a/Documentation/dev-tools/sparse.rst
+++ b/Documentation/dev-tools/sparse.rst
@@ -102,9 +102,4 @@ be recompiled or not.  The latter is a fast way to check the whole tree if you
 have already built it.
 
 The optional make variable CF can be used to pass arguments to sparse.  The
-build system passes -Wbitwise to sparse automatically.  To perform endianness
-checks, you may define __CHECK_ENDIAN__::
-
-make C=2 CF="-D__CHECK_ENDIAN__"
-
-These checks are disabled by default as they generate a host of warnings.
+build system passes -Wbitwise to sparse automatically.
-- 
MST


[PATCH 3/8] Documentation/sparse: drop bitwise

2016-12-14 Thread Michael S. Tsirkin

We dropped __CHECK_ENDIAN__ so __bitwise__ is now an implementation
detail. People should use __bitwise everywhere.

Signed-off-by: Michael S. Tsirkin 
---
 Documentation/dev-tools/sparse.rst | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/Documentation/dev-tools/sparse.rst b/Documentation/dev-tools/sparse.rst
index 8c250e8..e08e6a8 100644
--- a/Documentation/dev-tools/sparse.rst
+++ b/Documentation/dev-tools/sparse.rst
@@ -51,13 +51,6 @@ sure that bitwise types don't get mixed up (little-endian vs big-endian
 vs cpu-endian vs whatever), and there the constant "0" really _is_
 special.
 
-__bitwise__ - to be used for relatively compact stuff (gfp_t, etc.) that
-is mostly warning-free and is supposed to stay that way.  Warnings will
-be generated without __CHECK_ENDIAN__.
-
-__bitwise - noisy stuff; in particular, __le*/__be* are that.  We really
-don't want to drown in noise unless we'd explicitly asked for it.
-
 Using sparse for lock checking
 ------------------------------
 
-- 
MST

