Re: [PATCH 0/1] mm, compaction: correct the bounds of __fragmentation_index()

2018-02-19 Thread Michal Hocko
On Sun 18-02-18 16:47:54, robert.m.har...@oracle.com wrote:
> From: "Robert M. Harris" 
> 
> __fragmentation_index() calculates a value used to determine whether
> compaction should be favoured over page reclaim in the event of
> allocation failure.  The function purports to return a value between 0
> and 1000, representing units of 1/1000.  Barring the case of a
> pathological shortfall of memory, the lower bound is instead 500.  This
> is significant because it is the default value of
> sysctl_extfrag_threshold, i.e. the value below which compaction should
> be avoided in favour of page reclaim for costly pages.
> 
> Here's an illustration using a zone that I fragmented with selective
> calls to __alloc_pages() and __free_pages --- the fragmentation for
> order-1 could not be minimised further yet is reported as 0.5:

Cover letter for a single patch is usually an overkill. Why is this
information not valuable in the patch description directly?

> # head -1 /proc/buddyinfo
> Node 0, zone  DMA   1983  0  0  0  0  0  0  0 
>  0  0  0 
> # head -1 /sys/kernel/debug/extfrag/extfrag_index
> Node 0, zone  DMA -1.000 0.500 0.750 0.875 0.937 0.969 0.984 0.992 0.996 
> 0.998 0.999 
> # 
> 
> With extreme memory shortage the reported fragmentation index does go
> lower.  In fact, it can go below zero:
> 
> # head -1 /proc/buddyinfo
> Node 0, zone  DMA  1  0  0  0  0  0  0  0 
>  0  0  0 
> # head -1 /sys/kernel/debug/extfrag/extfrag_index
> Node 0, zone  DMA -1.000 0.-500 0.-250 0.-125 0.-62 0.-31 0.-15 0.-07 
> 0.-03 0.-01 0.000 
> # 
> 
> This patch implements and documents a modified version of the original
> expression that returns a value in the range 0 <= index < 1000.  It
> amends the default value of sysctl_extfrag_threshold to preserve the
> existing behaviour.  With this patch in place, the same two tests yield
> 
> # head -1 /proc/buddyinfo
> Node 0, zone  DMA   1983  0  0  0  0  0  0  0 
>  0  0  0 
> # head -1 /sys/kernel/debug/extfrag/extfrag_index
> Node 0, zone  DMA -1.000 0.000 0.500 0.750 0.875 0.937 0.969 0.984 0.992 
> 0.996 0.998 
> # 
> 
> and
> 
> # head -1 /proc/buddyinfo
> Node 0, zone  DMA  1  0  0  0  0  0  0  0 
>  0  0  0 
> # head -1 /sys/kernel/debug/extfrag/extfrag_index
> Node 0, zone  DMA -1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 
> 0.000 0.000 
> # 
> 
> Robert M. Harris (1):
>   mm, compaction: correct the bounds of __fragmentation_index()
> 
>  Documentation/sysctl/vm.txt |  2 +-
>  mm/compaction.c |  2 +-
>  mm/vmstat.c | 47 
> +++--
>  3 files changed, 39 insertions(+), 12 deletions(-)
> 
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] mm, compaction: correct the bounds of __fragmentation_index()

2018-02-19 Thread Michal Hocko
On Sun 18-02-18 16:47:55, robert.m.har...@oracle.com wrote:
> From: "Robert M. Harris" 
> 
> __fragmentation_index() calculates a value used to determine whether
> compaction should be favoured over page reclaim in the event of allocation
> failure.  The calculation itself is opaque and, on inspection, does not
> match its existing description.  The function purports to return a value
> between 0 and 1000, representing units of 1/1000.  Barring the case of a
> pathological shortfall of memory, the lower bound is instead 500.  This is
> significant because it is the default value of sysctl_extfrag_threshold,
> i.e. the value below which compaction should be avoided in favour of page
> reclaim for costly pages.
> 
> This patch implements and documents a modified version of the original
> expression that returns a value in the range 0 <= index < 1000.  It amends
> the default value of sysctl_extfrag_threshold to preserve the existing
> behaviour.

It is not really clear to me what is the actual problem you are trying
to solve by this patch. Is there any bug or are you just trying to
improve the current implementation to be more effective?

> Signed-off-by: Robert M. Harris 
> ---
>  Documentation/sysctl/vm.txt |  2 +-
>  mm/compaction.c |  2 +-
>  mm/vmstat.c | 47 
> +++--
>  3 files changed, 39 insertions(+), 12 deletions(-)
> 
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> index 5025ff9..384a78b 100644
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -237,7 +237,7 @@ of memory, values towards 1000 imply failures are due to 
> fragmentation and -1
>  implies that the allocation will succeed as long as watermarks are met.
>  
>  The kernel will not compact memory in a zone if the
> -fragmentation index is <= extfrag_threshold. The default value is 500.
> +fragmentation index is <= extfrag_threshold. The default value is 0.
>  
>  ==
>  
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 10cd757..9db6ef4 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1730,7 +1730,7 @@ static enum compact_result compact_zone_order(struct 
> zone *zone, int order,
>   return ret;
>  }
>  
> -int sysctl_extfrag_threshold = 500;
> +int sysctl_extfrag_threshold;
>  
>  /**
>   * try_to_compact_pages - Direct compact to satisfy a high-order allocation
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 40b2db6..013f1af 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1044,15 +1044,22 @@ static void fill_contig_page_info(struct zone *zone,
>  }
>  
>  /*
> - * A fragmentation index only makes sense if an allocation of a requested
> - * size would fail. If that is true, the fragmentation index indicates
> - * whether external fragmentation or a lack of memory was the problem.
> - * The value can be used to determine if page reclaim or compaction
> - * should be used
> + * If there is no block of at least the requested size, implying that an
> + * allocation would fail, then it might be possible to conjure one by
> + * compaction.  As this is expensive it is reserved for those cases in which
> + * there is a relatively high degree of fragmentation.  For low degrees, page
> + * reclaim is more appropriate since an allocation failure is more likely to 
> be
> + * caused by a lack of memory.
> + *
> + * This function calculates an index in the range 0 to 1, expressed in units 
> of
> + * 1/1000, indicating low and high fragmentation respectively.  The special
> + * value of -1 indicates that free blocks of sufficient size are available 
> and
> + * that an allocation should therefore succeed.
>   */
>  static int __fragmentation_index(unsigned int order, struct contig_page_info 
> *info)
>  {
>   unsigned long requested = 1UL << order;
> + int result;
>  
>   if (WARN_ON_ONCE(order >= MAX_ORDER))
>   return 0;
> @@ -1060,17 +1067,37 @@ static int __fragmentation_index(unsigned int order, 
> struct contig_page_info *in
>   if (!info->free_blocks_total)
>   return 0;
>  
> - /* Fragmentation index only makes sense when a request would fail */
>   if (info->free_blocks_suitable)
>   return -1000;
>  
>   /*
> -  * Index is between 0 and 1 so return within 3 decimal places
> +  * If the number of requested-size blocks that could be constructed if
> +  * all free blocks were compacted is
> +  *
> +  *  B = info->free_pages/requested
> +  *
> +  * then, conceptually, the number of fragments into which each
> +  * requested-size block has been split is
> +  *
> +  *  N = info->free_blocks_total/B
>*
> -  * 0 => allocation would fail due to lack of memory
> -  * 1 => allocation would fail due to fragmentation
> +  * In the least and most fragmented cases all free memory resides on
> +  * either the order - 1

Re: Inline emphasis warnings (more)

2018-02-19 Thread Jani Nikula
On Fri, 16 Feb 2018, Markus Heiser  wrote:
>> Am 16.02.2018 um 19:56 schrieb Markus Heiser :
>> 
>> 
>>> Am 01.10.2017 um 20:33 schrieb Randy Dunlap :
>>> 
>>> A different "problem" this time:
>>> 
>>> Documentation/driver-api/basics.rst pulls in include/linux/kernel.h.
>>>  has kernel-doc notation for (among others) trace_printk()
>>> and trace_puts().
>>> 
>>> Part of the text for trace_printk() says:
>>> 
>>> * Note: __trace_printk is an internal function for trace_printk and
>>> *   the @ip is passed in via the trace_printk macro.
>>> 
>
> OK found the problem, this comment results in reST markup:
>
> --
> **Note**
>
> __trace_printk is an internal function for :c:func:`trace_printk()` and
>   the **ip** is passed in via the :c:func:`trace_printk()` macro.
>
> This function allows a kernel developer to ... 
> ---
>
> in reST this is a definition list item [1] like:
>
> --
> term (up to a line of text)
>Definition of the term, which must be indented
>
> next term
>Description.
> --
>
> Anyway, the 'Note:' is quite unfortunate at this point. It starts
> a new section with title 'Note'. I guess this is not what we wan't
> here. To tag a node-block correct I recommend to patch the comment
> like:
>
> modified   include/linux/kernel.h
> @@ -638,8 +638,10 @@ do { 
>   \
>   * trace_printk - printf formatting in the ftrace buffer
>   * @fmt: the printf format for printing
>   *
> - * Note: __trace_printk is an internal function for trace_printk() and
> - *   the @ip is passed in via the trace_printk() macro.
> + * .. node:

Shouldn't this be ".. note:"?

BR,
Jani.

> + *
> + *   __trace_printk is an internal function for trace_printk() and
> + *   the @ip is passed in via the trace_printk() macro.
>   *
>   * This function allows a kernel developer to debug fast path sections
>   * that printk is not appropriate for. By scattering in various
>
>
> I hope the explanation could help clarify.
>
> [1] http://www.sphinx-doc.org/en/master/rest.html#lists-and-quote-like-blocks
>
> -- Markus --
>
>
>>> I am seeing this 'produced' (when I view the output html) as:
>>> 
>>> __trace_printk is an internal function for trace_printk and
>>> the ip is passed in via the trace_printk macro.
>>> 
>>> (BUT the first line is all BOLD).
>>> 
>>> I tried changing that to
>>> \_\_trace_printk is an internal function...
>>> with the same BOLD result.
>>> 
>>> So I tried the source as
>>> ``__trace_printk`` is an internal function...
>>> with the same BOLD result.
>>> 
>>> Maybe this is a s/w bug?
>>> Running Sphinx v1.3.1
>> 
>> Sorry for my late reply.  It is not only you, see:
>> 
>> - (perl kernel-doc) 
>> https://www.kernel.org/doc/html/latest/driver-api/basics.html#c.trace_printk
>> - (python kernel-doc) 
>> https://h2626237.stratoserver.net/kernel/linux_src_doc/include/linux/kernel_h.html#trace-printk
>> 
>> I have to dig.
>> 
>> -- Markus 
>> To unsubscribe from this list: send the line "unsubscribe linux-doc" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

-- 
Jani Nikula, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] doc: Rename .system_keyring to .builtin_trusted_keys

2018-02-19 Thread Philipp Hahn
Commit d3bfe84129f65e0af2450743ebdab33d161d01c9 changed the name but did
not update the documentation.

Fixes: d3bfe84129f65e0af2450743ebdab33d161d01c9
Signed-off-by: Philipp Hahn 
---
 Documentation/admin-guide/module-signing.rst | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/module-signing.rst 
b/Documentation/admin-guide/module-signing.rst
index 27e59498b487..62e389fdcb86 100644
--- a/Documentation/admin-guide/module-signing.rst
+++ b/Documentation/admin-guide/module-signing.rst
@@ -180,11 +180,11 @@ Public keys in the kernel
 =
 
 The kernel contains a ring of public keys that can be viewed by root.  They're
-in a keyring called ".system_keyring" that can be seen by::
+in a keyring called ".builtin_trusted_keys" that can be seen by::
 
[root@deneb ~]# cat /proc/keys
...
-   223c7853 I-- 1 perm 1f03 0 0 keyring   
.system_keyring: 1
+   223c7853 I-- 1 perm 1f03 0 0 keyring   
.builtin_trusted_keys: 1
302d2d52 I-- 1 perm 1f01 0 0 asymmetri Fedora 
kernel signing key: d69a84e6bce3d216b979e9505b3e3ef9a7118079: X509.RSA a7118079 
[]
...
 
@@ -197,15 +197,15 @@ add those in also (e.g. from the UEFI key database).
 
 Finally, it is possible to add additional public keys by doing::
 
-   keyctl padd asymmetric "" [.system_keyring-ID] <[key-file]
+   keyctl padd asymmetric "" [.builtin_trusted_keys-ID] <[key-file]
 
 e.g.::
 
keyctl padd asymmetric "" 0x223c7853 http://vger.kernel.org/majordomo-info.html


[PATCH v2] doc: module-signing.rst: Fix reST formatting

2018-02-19 Thread Philipp Hahn
Move the _if_ outside the verbatim string.

Make key ring name as a verbatim string.

Signed-off-by: Philipp Hahn 
---
 Documentation/admin-guide/module-signing.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/module-signing.rst 
b/Documentation/admin-guide/module-signing.rst
index 62e389fdcb86..f8b584179cff 100644
--- a/Documentation/admin-guide/module-signing.rst
+++ b/Documentation/admin-guide/module-signing.rst
@@ -204,8 +204,8 @@ e.g.::
keyctl padd asymmetric "" 0x223c7853 http://vger.kernel.org/majordomo-info.html


[PATCH v2] doc: admin-guide/module-signing.rst fixes

2018-02-19 Thread Philipp Hahn
Two trivial corrections for the module signing documentation.

v2: Add missing Signed-off-by

Philipp Hahn (2):
  doc: Rename .system_keyring to .builtin_trusted_keys
  doc: module-signing.rst: Fix reST formatting

 Documentation/admin-guide/module-signing.rst | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

-- 
2.11.0

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Inline emphasis warnings (more)

2018-02-19 Thread Markus Heiser

> Am 19.02.2018 um 10:31 schrieb Jani Nikula :
> 
>> modified   include/linux/kernel.h
>> @@ -638,8 +638,10 @@ do {
>>\
>>  * trace_printk - printf formatting in the ftrace buffer
>>  * @fmt: the printf format for printing
>>  *
>> - * Note: __trace_printk is an internal function for trace_printk() and
>> - *   the @ip is passed in via the trace_printk() macro.
>> + * .. node:
> 
> Shouldn't this be ".. note:"?

Oops, just a typo of mine. Thanks for the heads-up.

Full list of annotations:

  http://docutils.sourceforge.net/docs/ref/rst/directives.html#admonitions

-- Markus --

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] doc: admin-guide/module-signing.rst fixes

2018-02-19 Thread Philipp Hahn
Two trivial corrections for the module signing documentation.

Philipp Hahn (2):
  doc: Rename .system_keyring to .builtin_trusted_keys
  doc: module-signing.rst: Fix reST formatting

 Documentation/admin-guide/module-signing.rst | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

-- 
2.11.0

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] doc: module-signing.rst: Fix reST formatting

2018-02-19 Thread Philipp Hahn
Move the _if_ outside the verbatim string.

Make key ring name as a verbatim string.
---
 Documentation/admin-guide/module-signing.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/module-signing.rst 
b/Documentation/admin-guide/module-signing.rst
index 62e389fdcb86..f8b584179cff 100644
--- a/Documentation/admin-guide/module-signing.rst
+++ b/Documentation/admin-guide/module-signing.rst
@@ -204,8 +204,8 @@ e.g.::
keyctl padd asymmetric "" 0x223c7853 http://vger.kernel.org/majordomo-info.html


[PATCH] doc: Rename .system_keyring to .builtin_trusted_keys

2018-02-19 Thread Philipp Hahn
Commit d3bfe84129f65e0af2450743ebdab33d161d01c9 changed the name but did
not update the documentation.

Fixes: d3bfe84129f65e0af2450743ebdab33d161d01c9
---
 Documentation/admin-guide/module-signing.rst | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/module-signing.rst 
b/Documentation/admin-guide/module-signing.rst
index 27e59498b487..62e389fdcb86 100644
--- a/Documentation/admin-guide/module-signing.rst
+++ b/Documentation/admin-guide/module-signing.rst
@@ -180,11 +180,11 @@ Public keys in the kernel
 =
 
 The kernel contains a ring of public keys that can be viewed by root.  They're
-in a keyring called ".system_keyring" that can be seen by::
+in a keyring called ".builtin_trusted_keys" that can be seen by::
 
[root@deneb ~]# cat /proc/keys
...
-   223c7853 I-- 1 perm 1f03 0 0 keyring   
.system_keyring: 1
+   223c7853 I-- 1 perm 1f03 0 0 keyring   
.builtin_trusted_keys: 1
302d2d52 I-- 1 perm 1f01 0 0 asymmetri Fedora 
kernel signing key: d69a84e6bce3d216b979e9505b3e3ef9a7118079: X509.RSA a7118079 
[]
...
 
@@ -197,15 +197,15 @@ add those in also (e.g. from the UEFI key database).
 
 Finally, it is possible to add additional public keys by doing::
 
-   keyctl padd asymmetric "" [.system_keyring-ID] <[key-file]
+   keyctl padd asymmetric "" [.builtin_trusted_keys-ID] <[key-file]
 
 e.g.::
 
keyctl padd asymmetric "" 0x223c7853 http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] mm, compaction: correct the bounds of __fragmentation_index()

2018-02-19 Thread Mel Gorman
On Sun, Feb 18, 2018 at 04:47:55PM +, robert.m.har...@oracle.com wrote:
> From: "Robert M. Harris" 
> 
> __fragmentation_index() calculates a value used to determine whether
> compaction should be favoured over page reclaim in the event of allocation
> failure.  The calculation itself is opaque and, on inspection, does not
> match its existing description.  The function purports to return a value
> between 0 and 1000, representing units of 1/1000.  Barring the case of a
> pathological shortfall of memory, the lower bound is instead 500.  This is
> significant because it is the default value of sysctl_extfrag_threshold,
> i.e. the value below which compaction should be avoided in favour of page
> reclaim for costly pages.
> 
> This patch implements and documents a modified version of the original
> expression that returns a value in the range 0 <= index < 1000.  It amends
> the default value of sysctl_extfrag_threshold to preserve the existing
> behaviour.
> 
> Signed-off-by: Robert M. Harris 

You have to update sysctl_extfrag_threshold as well for the new bounds.
It effectively makes it a no-op but it was a no-op already and adjusting
that default should be supported by data indicating it's safe.

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] Add support for in-line nested struct comments

2018-02-19 Thread Jani Nikula
On Sun, 18 Feb 2018, Jonathan Corbet  wrote:
> On Fri, 16 Feb 2018 11:48:14 -0200
> Mauro Carvalho Chehab  wrote:
>
>> his series fix two bugs at kernel-doc.rst examples and add support
>> for in-line nested struct comments.
>> 
>> It also converts one documentation at intel_dpio_phy to use it,
>> in order to give a practical example about how to use it.
>
> OK, I've applied everything but the last patch, which I assume will go
> through the DRM tree.

I was going to reference the kernel-doc commit while applying patch 6,
but I can't find the others. I guess applied literally meant just
applied, not pushed... ;)

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] mm, compaction: correct the bounds of __fragmentation_index()

2018-02-19 Thread Robert Harris


> On 19 Feb 2018, at 08:26, Michal Hocko  wrote:
> 
> On Sun 18-02-18 16:47:55, robert.m.har...@oracle.com wrote:
>> From: "Robert M. Harris" 
>> 
>> __fragmentation_index() calculates a value used to determine whether
>> compaction should be favoured over page reclaim in the event of allocation
>> failure.  The calculation itself is opaque and, on inspection, does not
>> match its existing description.  The function purports to return a value
>> between 0 and 1000, representing units of 1/1000.  Barring the case of a
>> pathological shortfall of memory, the lower bound is instead 500.  This is
>> significant because it is the default value of sysctl_extfrag_threshold,
>> i.e. the value below which compaction should be avoided in favour of page
>> reclaim for costly pages.
>> 
>> This patch implements and documents a modified version of the original
>> expression that returns a value in the range 0 <= index < 1000.  It amends
>> the default value of sysctl_extfrag_threshold to preserve the existing
>> behaviour.
> 
> It is not really clear to me what is the actual problem you are trying
> to solve by this patch. Is there any bug or are you just trying to
> improve the current implementation to be more effective?

There is not a significant bug.

The first problem is that the mathematical expression in
__fragmentation_index() is opaque, particularly given the lack of
description in the comments or the original commit message.  This patch
provides such a description.

Simply annotating the expression did not make sense since the formula
doesn't work as advertised.  The fragmentation index is described as
being in the range 0 to 1000 but the bounds of the formula are instead
500 to 1000.  This patch changes the formula so that its lower bound is
0.

The fragmentation index is compared to the tuneable
sysctl_extfrag_threshold, which defaults to 500.  If the index is above
this value then compaction is preferred over page reclaim in the event
of allocation failure.  Given the issue above, the index will almost
always exceed the default threshold and compaction will occur even if
there is low fragmentation.  This patch changes the default value of the
tuneable to 0, meaning that the existing behaviour will be unchanged.
Changing sysctl_extfrag_threshold back to something non-zero in a future
patch would effect the behaviour intended by the original code but would
require more comprehensive testing since it would modify the kernel's
performance under memory pressure.

Robert Harris
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] mm, compaction: correct the bounds of __fragmentation_index()

2018-02-19 Thread Robert Harris


> On 19 Feb 2018, at 09:47, Mel Gorman  wrote:
> 
> On Sun, Feb 18, 2018 at 04:47:55PM +, robert.m.har...@oracle.com wrote:
>> From: "Robert M. Harris" 
>> 
>> __fragmentation_index() calculates a value used to determine whether
>> compaction should be favoured over page reclaim in the event of allocation
>> failure.  The calculation itself is opaque and, on inspection, does not
>> match its existing description.  The function purports to return a value
>> between 0 and 1000, representing units of 1/1000.  Barring the case of a
>> pathological shortfall of memory, the lower bound is instead 500.  This is
>> significant because it is the default value of sysctl_extfrag_threshold,
>> i.e. the value below which compaction should be avoided in favour of page
>> reclaim for costly pages.
>> 
>> This patch implements and documents a modified version of the original
>> expression that returns a value in the range 0 <= index < 1000.  It amends
>> the default value of sysctl_extfrag_threshold to preserve the existing
>> behaviour.
>> 
>> Signed-off-by: Robert M. Harris 
> 
> You have to update sysctl_extfrag_threshold as well for the new bounds.

This patch makes its default value zero.

> It effectively makes it a no-op but it was a no-op already and adjusting
> that default should be supported by data indicating it's safe.

Would it be acceptable to demonstrate using tracing that in both the
pre- and post-patch cases

  1. compaction is attempted regardless of fragmentation index,
 excepting that

  2. reclaim is preferred even for non-zero fragmentation during
 an extreme shortage of memory

?

Robert Harris
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] mm, compaction: correct the bounds of __fragmentation_index()

2018-02-19 Thread Michal Hocko
On Mon 19-02-18 12:14:26, Robert Harris wrote:
> 
> 
> > On 19 Feb 2018, at 08:26, Michal Hocko  wrote:
> > 
> > On Sun 18-02-18 16:47:55, robert.m.har...@oracle.com wrote:
> >> From: "Robert M. Harris" 
> >> 
> >> __fragmentation_index() calculates a value used to determine whether
> >> compaction should be favoured over page reclaim in the event of allocation
> >> failure.  The calculation itself is opaque and, on inspection, does not
> >> match its existing description.  The function purports to return a value
> >> between 0 and 1000, representing units of 1/1000.  Barring the case of a
> >> pathological shortfall of memory, the lower bound is instead 500.  This is
> >> significant because it is the default value of sysctl_extfrag_threshold,
> >> i.e. the value below which compaction should be avoided in favour of page
> >> reclaim for costly pages.
> >> 
> >> This patch implements and documents a modified version of the original
> >> expression that returns a value in the range 0 <= index < 1000.  It amends
> >> the default value of sysctl_extfrag_threshold to preserve the existing
> >> behaviour.
> > 
> > It is not really clear to me what is the actual problem you are trying
> > to solve by this patch. Is there any bug or are you just trying to
> > improve the current implementation to be more effective?
> 
> There is not a significant bug.
> 
> The first problem is that the mathematical expression in
> __fragmentation_index() is opaque, particularly given the lack of
> description in the comments or the original commit message.  This patch
> provides such a description.
> 
> Simply annotating the expression did not make sense since the formula
> doesn't work as advertised.  The fragmentation index is described as
> being in the range 0 to 1000 but the bounds of the formula are instead
> 500 to 1000.  This patch changes the formula so that its lower bound is
> 0.

But why do we want to fix that in the first place? Why don't we simply
deprecate the tunable and remove it altogether? Who is relying on tuning
this option. Considering how it doesn't work as advertised and nobody
complaining I have that feeling that it is not really used in wild...
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] mm, compaction: correct the bounds of __fragmentation_index()

2018-02-19 Thread Mel Gorman
On Mon, Feb 19, 2018 at 12:26:39PM +, Robert Harris wrote:
> 
> 
> > On 19 Feb 2018, at 09:47, Mel Gorman  wrote:
> > 
> > On Sun, Feb 18, 2018 at 04:47:55PM +, robert.m.har...@oracle.com wrote:
> >> From: "Robert M. Harris" 
> >> 
> >> __fragmentation_index() calculates a value used to determine whether
> >> compaction should be favoured over page reclaim in the event of allocation
> >> failure.  The calculation itself is opaque and, on inspection, does not
> >> match its existing description.  The function purports to return a value
> >> between 0 and 1000, representing units of 1/1000.  Barring the case of a
> >> pathological shortfall of memory, the lower bound is instead 500.  This is
> >> significant because it is the default value of sysctl_extfrag_threshold,
> >> i.e. the value below which compaction should be avoided in favour of page
> >> reclaim for costly pages.
> >> 
> >> This patch implements and documents a modified version of the original
> >> expression that returns a value in the range 0 <= index < 1000.  It amends
> >> the default value of sysctl_extfrag_threshold to preserve the existing
> >> behaviour.
> >> 
> >> Signed-off-by: Robert M. Harris 
> > 
> > You have to update sysctl_extfrag_threshold as well for the new bounds.
> 
> This patch makes its default value zero.
> 

Sorry, I'm clearly blind.

> > It effectively makes it a no-op but it was a no-op already and adjusting
> > that default should be supported by data indicating it's safe.
> 
> Would it be acceptable to demonstrate using tracing that in both the
> pre- and post-patch cases
> 
>   1. compaction is attempted regardless of fragmentation index,
>  excepting that
> 
>   2. reclaim is preferred even for non-zero fragmentation during
>  an extreme shortage of memory
> 

If you can demonstrate that for both reclaim-intensive and
compaction-intensive workloads then yes. Also include the reclaim and
compaction stats from /proc/vmstat and not just tracepoints to demonstrate
that reclaim doesn't get out of control and reclaim the world in
response to failed high-order allocations such as THP.

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 0/3] Introduce Linux kernel labs documentation

2018-02-19 Thread Octavian Purdila
On Mon, Feb 19, 2018 at 1:22 AM, Jonathan Corbet  wrote:
> On Sun, 11 Feb 2018 20:05:43 +0200
> Daniel Baluta  wrote:
>
>> The Linux kernel labs documentation is a collection of "labs" for
>> various device driver topics. For each topic there are two parts: a
>> walk-through which explain the basic concepts and a hands-on part which
>> contains a few exercises.
>
> So I've only had a chance to give this a quick once-over.  It seems like
> useful stuff to have.  Naturally I have a few thoughts...
>

Hi Jon,

Great to hear that, thanks for checking them out.

> The biggest, perhaps, is that, if we're going to bring this material into
> the kernel documentation, we should integrate it properly.  So, for
> example, references to kernel functions and data structures should link to
> the documentation for those objects.  Most of the time, that can be done
> fairly simply using Sphinx markup - :c:func:`printk()`, for example.
>

Good point. They are partly added, some labs are better then others,
but we will make sure they are complete.

We noticed that some APIs we used don't have kernel doc comments.
Would it be appropriate to add them as separate patches in this
series?
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] linux-next: SLIMbus: doc: Fix a warning "Title underline too short"

2018-02-19 Thread Masanari Iida
This patch fixes a warning during "make xmldocs"

Documentation/driver-api/slimbus.rst:93:
WARNING: Title underline too short.

Signed-off-by: Masanari Iida 
---
 Documentation/driver-api/slimbus.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/driver-api/slimbus.rst 
b/Documentation/driver-api/slimbus.rst
index 7555ecd538de..a97449cf603a 100644
--- a/Documentation/driver-api/slimbus.rst
+++ b/Documentation/driver-api/slimbus.rst
@@ -90,7 +90,7 @@ controller resets the bus. This notification allows the 
driver to take necessary
 steps to boot the device so that it's functional after the bus has been reset.
 
 Driver and Controller APIs:
---
+---
 .. kernel-doc:: include/linux/slimbus.h
:internal:
 
-- 
2.16.2.246.ga4ee8fe9

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 0/3] Introduce Linux kernel labs documentation

2018-02-19 Thread Jonathan Corbet
On Mon, 19 Feb 2018 14:53:22 +0100
Octavian Purdila  wrote:

> We noticed that some APIs we used don't have kernel doc comments.
> Would it be appropriate to add them as separate patches in this
> series?

Absolutely, though they should generally go through the appropriate
maintainers for the subsystem involved rather than me.

Thanks,

jon
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] linux-next: SLIMbus: doc: Fix a warning "Title underline too short"

2018-02-19 Thread Jonathan Corbet
On Mon, 19 Feb 2018 22:55:50 +0900
Masanari Iida  wrote:

> This patch fixes a warning during "make xmldocs"
> 
> Documentation/driver-api/slimbus.rst:93:
> WARNING: Title underline too short.
> 
> Signed-off-by: Masanari Iida 

Applied, thanks.

jon
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] linux-next: SLIMbus: doc: Fix a warning "Title underline too short"

2018-02-19 Thread Matthew Wilcox
On Mon, Feb 19, 2018 at 10:55:50PM +0900, Masanari Iida wrote:
>  Driver and Controller APIs:
> ---
> +---
>  .. kernel-doc:: include/linux/slimbus.h

Is this the right fix?  Shouldn't we rather delete the : instead?
We don't normally have a colon at the end of subsection titles.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] mm, compaction: correct the bounds of __fragmentation_index()

2018-02-19 Thread Robert Harris


> On 19 Feb 2018, at 13:10, Mel Gorman  wrote:
> 
> On Mon, Feb 19, 2018 at 12:26:39PM +, Robert Harris wrote:
>> 
>> 
>>> On 19 Feb 2018, at 09:47, Mel Gorman  wrote:
>>> 
>>> On Sun, Feb 18, 2018 at 04:47:55PM +, robert.m.har...@oracle.com wrote:
 From: "Robert M. Harris" 
 
 __fragmentation_index() calculates a value used to determine whether
 compaction should be favoured over page reclaim in the event of allocation
 failure.  The calculation itself is opaque and, on inspection, does not
 match its existing description.  The function purports to return a value
 between 0 and 1000, representing units of 1/1000.  Barring the case of a
 pathological shortfall of memory, the lower bound is instead 500.  This is
 significant because it is the default value of sysctl_extfrag_threshold,
 i.e. the value below which compaction should be avoided in favour of page
 reclaim for costly pages.
 
 This patch implements and documents a modified version of the original
 expression that returns a value in the range 0 <= index < 1000.  It amends
 the default value of sysctl_extfrag_threshold to preserve the existing
 behaviour.
 
 Signed-off-by: Robert M. Harris 
>>> 
>>> You have to update sysctl_extfrag_threshold as well for the new bounds.
>> 
>> This patch makes its default value zero.
>> 
> 
> Sorry, I'm clearly blind.
> 
>>> It effectively makes it a no-op but it was a no-op already and adjusting
>>> that default should be supported by data indicating it's safe.
>> 
>> Would it be acceptable to demonstrate using tracing that in both the
>> pre- and post-patch cases
>> 
>>  1. compaction is attempted regardless of fragmentation index,
>> excepting that
>> 
>>  2. reclaim is preferred even for non-zero fragmentation during
>> an extreme shortage of memory
>> 
> 
> If you can demonstrate that for both reclaim-intensive and
> compaction-intensive workloads then yes. Also include the reclaim and
> compaction stats from /proc/vmstat and not just tracepoints to demonstrate
> that reclaim doesn't get out of control and reclaim the world in
> response to failed high-order allocations such as THP.

Understood.  Thanks.

Robert Harris--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/23] kconfig: move compiler capability tests to Kconfig

2018-02-19 Thread Ulf Magnusson
Hello,

On Sun, Feb 18, 2018 at 11:13 PM, Sam Ravnborg  wrote:
> Hi Masahiro.
>
> On Sat, Feb 17, 2018 at 03:38:28AM +0900, Masahiro Yamada wrote:
>> I brushed up the implementation in this version.
>>
>> In the previous RFC, CC_HAS_ was described by using 'option shell=',
>> like this:
>>
>> config CC_HAS_STACKPROTECTOR
>> bool
>> option shell="$CC -Werror -fstack-protector -c -x c /dev/null"
>>
>> After I thought a bit more, the following syntax is more grammatical,
>> and flexible.
>>
>> config CC_HAS_STACKPROTECTOR
>> bool
>> default $(shell $CC -Werror -fstack-protector -c -x c /dev/null)
>
> Looks good - but maybe we should go one step further.
>
> So we in the syntax explicit handles:
> - shell commands
> - other commands, defined as strings
> - environment variables
> - config variables
>
> Each case is explicit - so the reader is not confused what is used when.
>
> $(shell foo)- output of the shell command foo. Uses $SHELL as the shell.
>   May include optional paramters.
>   foo may be a config variable referenced using ${} or a 
> config variable prefixed with $
> Example:
>
> config BUILD_DIR
> string
> default $(shell cd ${objtree}; pwd)
>
> $(call bar) - output of the bar command that may take optional parameters.
>   bar may be a text string, a config variable or an 
> environment variable
>The definition of bar may reference the parameters using 
> $(1), $(2)
>In this context a config variable needs to be prefixed 
> with $
>
> Example:
>
> config reverse
> string
> default $(2) $(1)
>
> config NEW_ORDER
> string
> $(call $reverse, A, B)  # Will assign REVERSE the 
> value "B A"
>
>
> Example2:
>
> config CC_OPTION
> string
> default $(shell ${srctree}/scripts/cc-option ${CC} 
> $(1) $(2))
>
> config CC_OPTIMIZE
> string
> $(call $CC_OPTION, -Oz, -Os)
>
>
> ${FOO}  - environment variable
>
> The above is inspired by how make implement similar functionality.
>
> I'm not happy that we in one context can reference CONFIG variables
> directly, but inside the $(call ...) and $(shell ...) needs the $ prefix.
> But I could not come up with something un-ambigious where this could be 
> avoided.

I think we should be careful about allowing references to config
symbols. It mixes up the parsing and evaluation phases, since $() is
expanded during parsing (which I consider a feature and think is
needed to retain sanity).

Patch 06/23 removes the last existing instance of symbol references in
strings by getting rid of 'option env'. That's an improvement to me.
We shouldn't add it back.

>
> The above proposal include the functionality of the macro stuff proposed in 
> this patch-set.
> But with a simpler syntax and we keep all the other kconfig logic (depends on 
> etc) - so
> users will not be limited in their creativity.
>
>> Current limitations:
>>
>> Dependency on outside scripts.
>> Inter-option dependency:
>> Functions are evaluated statically:
>
> Same limitations exists with the syntax suggested above.
>
> Sam

Cheers,
Ulf
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Intel-gfx] [PATCH 0/6] Add support for in-line nested struct comments

2018-02-19 Thread Daniel Vetter
On Fri, Feb 16, 2018 at 11:48:14AM -0200, Mauro Carvalho Chehab wrote:
> This series fix two bugs at kernel-doc.rst examples and add support
> for in-line nested struct comments.
> 
> It also converts one documentation at intel_dpio_phy to use it,
> in order to give a practical example about how to use it.
> 
> Mauro Carvalho Chehab (6):
>   doc-guide: kernel-doc: fix example for nested_foobar struct
>   doc-guide: kernel-doc: fix example for inlined comments
>   doc-guide: kernel-doc: move in-line section to be after nested struct
>   scripts: kernel-doc: support in-line comments on nested structs/unions
>   doc-guide: kernel-doc: add examples about nested union/structs
>   drm: intel_dpio_phy: fix kernel-doc comments at nested struct

Oh, this is cool. Thanks a lot for doing this.
-Daniel

> 
>  Documentation/doc-guide/kernel-doc.rst | 69 
> --
>  drivers/gpu/drm/i915/intel_dpio_phy.c  |  2 +-
>  scripts/kernel-doc |  2 +-
>  3 files changed, 43 insertions(+), 30 deletions(-)
> 
> -- 
> 2.14.3
> 
> 
> ___
> Intel-gfx mailing list
> intel-...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] irqchip/gic-v3-its: Add workaround for ThunderX2 erratum #174

2018-02-19 Thread Jayachandran C
On Sun, Jan 21, 2018 at 11:35:34AM +, Marc Zyngier wrote:
> On Sun, 21 Jan 2018 07:00:48 +,
> Jayachandran C wrote:
> > 
> > On Thu, Jan 18, 2018 at 10:58:20AM +0530, Ganapatrao Kulkarni wrote:
> > > This erratum is observed on the ThunderX2 GICv3 ITS. When a
> > > MOVI command is used to change affinity of a LPI to a collection/cpu
> > > on another node, the LPI is not delivered to the cpu.
> > > An additional INV command is required after the MOVI to deliver
> > > the LPI to the new destination.
> > > 
> > > If we add INV after MOVI, there is a chance that we lose LPIs which
> > > are raised when the affinity is changed. So for now, adding workaround fix
> > > to disable inter node affinity change.
> > > 
> > > Signed-off-by: Ganapatrao Kulkarni 
> > > ---
> > > 
> > > v2: Added workaround to avoid inter node affinity change.
> > > 
> > > v1: Initial patch
> > > 
> > >  Documentation/arm64/silicon-errata.txt |  1 +
> > >  arch/arm64/Kconfig | 10 ++
> > >  drivers/irqchip/irq-gic-v3-its.c   | 21 -
> > >  3 files changed, 31 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/Documentation/arm64/silicon-errata.txt 
> > > b/Documentation/arm64/silicon-errata.txt
> > > index fc1c884..fb27cb5 100644
> > > --- a/Documentation/arm64/silicon-errata.txt
> > > +++ b/Documentation/arm64/silicon-errata.txt
> > > @@ -63,6 +63,7 @@ stable kernels.
> > >  | Cavium | ThunderX Core   | #27456  | 
> > > CAVIUM_ERRATUM_27456|
> > >  | Cavium | ThunderX Core   | #30115  | 
> > > CAVIUM_ERRATUM_30115|
> > >  | Cavium | ThunderX SMMUv2 | #27704  | N/A   
> > >   |
> > > +| Cavium | ThunderX2 ITS   | #174| 
> > > CAVIUM_ERRATUM_174  |
> > >  | Cavium | ThunderX2 SMMUv3| #74 | N/A   
> > >   |
> > >  | Cavium | ThunderX2 SMMUv3| #126| N/A   
> > >   |
> > >  || | |   
> > >   |
> > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > index c9a7e9e..0dbf3bd 100644
> > > --- a/arch/arm64/Kconfig
> > > +++ b/arch/arm64/Kconfig
> > > @@ -461,6 +461,16 @@ config ARM64_ERRATUM_843419
> > >  
> > > If unsure, say Y.
> > >  
> > > +config CAVIUM_ERRATUM_174
> > > + bool "Cavium ThunderX2 erratum 174"
> > > + default y
> > > + help
> > > +   Cavium ThunderX2 dual socket systems may loose interrupts
> > > +   on affinity change to a cpu on other node.
> > > +   This workaround fix avoids inter node affinity change.
> > 
> > This has to be fixed up to match the commit message (and for spelling).
> > I have seen some questions offlist about how important this fix is,
> > and how it can affect users - so that would be useful to have in the
> > description as well.
> > 
> > To clarify, this errata comes into play only when the irq affinity is
> > forced from the node given by the device (and ITS) affinity to another
> > node.  This should not happen in normal, useful configurations.
> 
> Define normal. That's all under control of userspace, and the kernel
> doesn't really have a say. irqbalance will happily move interrupts
> around. Disable all CPUs from node at runtime (again, from userspace),
> and you'll get the exact same thing. I can't see what's so "abnormal"
> about any of that.
> 
> > Also, we will hold further posting of this errata until we do another
> > round of investigation with the hardware team for a better solution.
> > If we can handle the pending interrupts for the small window of MOVI/INV
> > in first workaround, we will not need this restriction at all.
> 
> What do you mean by "If we can handle the pending interrupts for the
> small window of MOVI/INV"? Taking the interrupt on the source CPU?
> Sure, that would be fine. But that's assuming that the souce CPU is in
> a position to actually handle this, and is not simply going down.
> 
> If there is only a slight possibility that you may loose an interrupt
> in the MOVI/INV window (which is not that small, since that's a 4
> command sequence), your only other solution is to inject a spurious
> interrupt to replace the one you may have lost in that window.
> 
> In the meantime, and until I see a patch fixing this (or a decent
> explanation of why this isn't a problem), I'll consider it broken.

After reviewing the issue with our hardware team, we decided to
tweak the redistributor cache configuration from firmware rather
than go with this errata workaournd in Linux (and other OSes).

So, with the new firmware MOVI will work across nodes as expected,
and this patch is no longer neeeded.

Thanks,
JC.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/3] taint: Add taint for randstruct

2018-02-19 Thread Kees Cook
Since the randstruct plugin can intentionally produce extremely unusual
kernel structure layouts (even performance pathological ones), some
maintainers want to be able to trivially determine if an Oops is coming
from a randstruct-built kernel, so as to keep their sanity when debugging.
This adds the new flag and initializes taint_mask immediately when built
with randstruct.

Signed-off-by: Kees Cook 
---
 Documentation/sysctl/kernel.txt | 1 +
 include/linux/kernel.h  | 3 ++-
 kernel/panic.c  | 4 +++-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 4a890c7fb6c3..eded671d55eb 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -991,6 +991,7 @@ ORed together. The letters are seen in "Tainted" line of 
Oops reports.
  16384 (L): A soft lockup has previously occurred on the system.
  32768 (K): The kernel has been live patched.
  65536 (X): Auxiliary taint, defined and used by for distros.
+131072 (T): The kernel was built with the struct randomization plugin.
 
 ==
 
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 62fa060a6e1a..ce1e43bf5f49 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -551,7 +551,8 @@ extern enum system_states {
 #define TAINT_SOFTLOCKUP   14
 #define TAINT_LIVEPATCH15
 #define TAINT_AUX  16
-#define TAINT_FLAGS_COUNT  17
+#define TAINT_RANDSTRUCT   17
+#define TAINT_FLAGS_COUNT  18
 
 struct taint_flag {
char c_true;/* character printed when tainted */
diff --git a/kernel/panic.c b/kernel/panic.c
index 15d333a54ece..0153cae0d330 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -34,7 +34,8 @@
 #define PANIC_BLINK_SPD 18
 
 int panic_on_oops = CONFIG_PANIC_ON_OOPS_VALUE;
-static unsigned long tainted_mask;
+static unsigned long tainted_mask =
+   IS_ENABLED(CONFIG_GCC_PLUGIN_RANDSTRUCT) ? (1 << TAINT_RANDSTRUCT) : 0;
 static int pause_on_oops;
 static int pause_on_oops_flag;
 static DEFINE_SPINLOCK(pause_on_oops_lock);
@@ -325,6 +326,7 @@ const struct taint_flag taint_flags[TAINT_FLAGS_COUNT] = {
[ TAINT_SOFTLOCKUP ]= { 'L', ' ', false },
[ TAINT_LIVEPATCH ] = { 'K', ' ', true },
[ TAINT_AUX ]   = { 'X', ' ', true },
+   [ TAINT_RANDSTRUCT ]= { 'T', ' ', true },
 };
 
 /**
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/3] taint: Convert to indexed initialization

2018-02-19 Thread Kees Cook
This converts to using indexed initializers instead of comments, adds a
comment on why the taint flags can't be an enum, and make sure that no one
forgets to update the taint_flags when adding new bits.

Signed-off-by: Kees Cook 
---
 include/linux/kernel.h |  1 +
 kernel/panic.c | 36 +++-
 2 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index ce51455e2adf..62fa060a6e1a 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -533,6 +533,7 @@ extern enum system_states {
SYSTEM_RESTART,
 } system_state;
 
+/* This cannot be an enum because some may be used in assembly source. */
 #define TAINT_PROPRIETARY_MODULE   0
 #define TAINT_FORCED_MODULE1
 #define TAINT_CPU_OUT_OF_SPEC  2
diff --git a/kernel/panic.c b/kernel/panic.c
index 2cfef408fec9..c5e0fd5a188e 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -308,23 +308,23 @@ EXPORT_SYMBOL(panic);
  * is being removed anyway.
  */
 const struct taint_flag taint_flags[TAINT_FLAGS_COUNT] = {
-   { 'P', 'G', true }, /* TAINT_PROPRIETARY_MODULE */
-   { 'F', ' ', true }, /* TAINT_FORCED_MODULE */
-   { 'S', ' ', false },/* TAINT_CPU_OUT_OF_SPEC */
-   { 'R', ' ', false },/* TAINT_FORCED_RMMOD */
-   { 'M', ' ', false },/* TAINT_MACHINE_CHECK */
-   { 'B', ' ', false },/* TAINT_BAD_PAGE */
-   { 'U', ' ', false },/* TAINT_USER */
-   { 'D', ' ', false },/* TAINT_DIE */
-   { 'A', ' ', false },/* TAINT_OVERRIDDEN_ACPI_TABLE */
-   { 'W', ' ', false },/* TAINT_WARN */
-   { 'C', ' ', true }, /* TAINT_CRAP */
-   { 'I', ' ', false },/* TAINT_FIRMWARE_WORKAROUND */
-   { 'O', ' ', true }, /* TAINT_OOT_MODULE */
-   { 'E', ' ', true }, /* TAINT_UNSIGNED_MODULE */
-   { 'L', ' ', false },/* TAINT_SOFTLOCKUP */
-   { 'K', ' ', true }, /* TAINT_LIVEPATCH */
-   { 'X', ' ', true }, /* TAINT_AUX */
+   [ TAINT_PROPRIETARY_MODULE ]= { 'P', 'G', true },
+   [ TAINT_FORCED_MODULE ] = { 'F', ' ', true },
+   [ TAINT_CPU_OUT_OF_SPEC ]   = { 'S', ' ', false },
+   [ TAINT_FORCED_RMMOD ]  = { 'R', ' ', false },
+   [ TAINT_MACHINE_CHECK ] = { 'M', ' ', false },
+   [ TAINT_BAD_PAGE ]  = { 'B', ' ', false },
+   [ TAINT_USER ]  = { 'U', ' ', false },
+   [ TAINT_DIE ]   = { 'D', ' ', false },
+   [ TAINT_OVERRIDDEN_ACPI_TABLE ] = { 'A', ' ', false },
+   [ TAINT_WARN ]  = { 'W', ' ', false },
+   [ TAINT_CRAP ]  = { 'C', ' ', true },
+   [ TAINT_FIRMWARE_WORKAROUND ]   = { 'I', ' ', false },
+   [ TAINT_OOT_MODULE ]= { 'O', ' ', true },
+   [ TAINT_UNSIGNED_MODULE ]   = { 'E', ' ', true },
+   [ TAINT_SOFTLOCKUP ]= { 'L', ' ', false },
+   [ TAINT_LIVEPATCH ] = { 'K', ' ', true },
+   [ TAINT_AUX ]   = { 'X', ' ', true },
 };
 
 /**
@@ -354,6 +354,8 @@ const char *print_tainted(void)
 {
static char buf[TAINT_FLAGS_COUNT + sizeof("Tainted: ")];
 
+   BUILD_BUG_ON(ARRAY_SIZE(taint_flags) != TAINT_FLAGS_COUNT);
+
if (tainted_mask) {
char *s;
int i;
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/3] taint: Consolidate documentation

2018-02-19 Thread Kees Cook
This consolidates the taint bit documentation into a single place with
both numeric and letter values. Additionally adds the missing TAINT_AUX
documentation.

Signed-off-by: Kees Cook 
---
 Documentation/sysctl/kernel.txt | 53 +
 kernel/panic.c  | 23 --
 2 files changed, 31 insertions(+), 45 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 412314eebda6..4a890c7fb6c3 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -964,32 +964,33 @@ detect a hard lockup condition.
 
 tainted:
 
-Non-zero if the kernel has been tainted.  Numeric values, which
-can be ORed together:
-
-   1 - A module with a non-GPL license has been loaded, this
-   includes modules with no license.
-   Set by modutils >= 2.4.9 and module-init-tools.
-   2 - A module was force loaded by insmod -f.
-   Set by modutils >= 2.4.9 and module-init-tools.
-   4 - Unsafe SMP processors: SMP with CPUs not designed for SMP.
-   8 - A module was forcibly unloaded from the system by rmmod -f.
-  16 - A hardware machine check error occurred on the system.
-  32 - A bad page was discovered on the system.
-  64 - The user has asked that the system be marked "tainted".  This
-   could be because they are running software that directly modifies
-   the hardware, or for other reasons.
- 128 - The system has died.
- 256 - The ACPI DSDT has been overridden with one supplied by the user
-instead of using the one provided by the hardware.
- 512 - A kernel warning has occurred.
-1024 - A module from drivers/staging was loaded.
-2048 - The system is working around a severe firmware bug.
-4096 - An out-of-tree module has been loaded.
-8192 - An unsigned module has been loaded in a kernel supporting module
-   signature.
-16384 - A soft lockup has previously occurred on the system.
-32768 - The kernel has been live patched.
+Non-zero if the kernel has been tainted. Numeric values, which can be
+ORed together. The letters are seen in "Tainted" line of Oops reports.
+
+ 1 (P):  A module with a non-GPL license has been loaded, this
+ includes modules with no license.
+ Set by modutils >= 2.4.9 and module-init-tools.
+ 2 (F): A module was force loaded by insmod -f.
+Set by modutils >= 2.4.9 and module-init-tools.
+ 4 (S): Unsafe SMP processors: SMP with CPUs not designed for SMP.
+ 8 (R): A module was forcibly unloaded from the system by rmmod -f.
+16 (M): A hardware machine check error occurred on the system.
+32 (B): A bad page was discovered on the system.
+64 (U): The user has asked that the system be marked "tainted". This
+could be because they are running software that directly modifies
+the hardware, or for other reasons.
+   128 (D): The system has died.
+   256 (A): The ACPI DSDT has been overridden with one supplied by the user
+instead of using the one provided by the hardware.
+   512 (W): A kernel warning has occurred.
+  1024 (C): A module from drivers/staging was loaded.
+  2048 (I): The system is working around a severe firmware bug.
+  4096 (O): An out-of-tree module has been loaded.
+  8192 (E): An unsigned module has been loaded in a kernel supporting module
+signature.
+ 16384 (L): A soft lockup has previously occurred on the system.
+ 32768 (K): The kernel has been live patched.
+ 65536 (X): Auxiliary taint, defined and used by for distros.
 
 ==
 
diff --git a/kernel/panic.c b/kernel/panic.c
index c5e0fd5a188e..15d333a54ece 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -328,27 +328,12 @@ const struct taint_flag taint_flags[TAINT_FLAGS_COUNT] = {
 };
 
 /**
- * print_tainted - return a string to represent the kernel taint state.
+ * print_tainted - return a string to represent the kernel taint state.
  *
- *  'P' - Proprietary module has been loaded.
- *  'F' - Module has been forcibly loaded.
- *  'S' - SMP with CPUs not designed for SMP.
- *  'R' - User forced a module unload.
- *  'M' - System experienced a machine check exception.
- *  'B' - System has hit bad_page.
- *  'U' - Userspace-defined naughtiness.
- *  'D' - Kernel has oopsed before
- *  'A' - ACPI table overridden.
- *  'W' - Taint on warning.
- *  'C' - modules from drivers/staging are loaded.
- *  'I' - Working around severe firmware bug.
- *  'O' - Out-of-tree module has been loaded.
- *  'E' - Unsigned module has been loaded.
- *  'L' - A soft lockup has previously occurred.
- *  'K' - Kernel has been live patched.
- *  'X' - Auxiliary taint, for distros' use.
+ * For individual taint flag meanings, see Documentation/sysctl/kernel.txt
  *
- * The string is overwritten by the next call to print_tainted().
+ * The string is overwritten by the next call to print_tainted(),
+ * but is always NULL terminated.
  */
 

[PATCH v2 0/3] taint: Add taint for randstruct

2018-02-19 Thread Kees Cook
This cleans up the taint flags and documentation before adding a new
one for randstruct. This v2 reverts the #define->enum change as some
architectures include TAINT flags in assembly source, which cannot
use enums.

Patch 3/3 reads:

Since the randstruct plugin can intentionally produce extremely unusual
kernel structure layouts (even performance pathological ones), some
maintainers want to be able to trivially determine if an Oops is coming
from a randstruct-built kernel, so as to keep their sanity when debugging.
This adds the new flag and initializes taint_mask immediately when built
with randstruct.

Thanks,

-Kees

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/17] trace doc: convert trace/ftrace.txt to rst format

2018-02-19 Thread Philippe Ombredanne
Changbin, Steven,

On Sat, Feb 17, 2018 at 6:39 AM,   wrote:
> From: Changbin Du 
>
> This converts the plain text documentation to reStructuredText format and
> add it into Sphinx TOC tree. No essential content change.
>
> Cc: Steven Rostedt 
> Signed-off-by: Changbin Du 
> ---
>  Documentation/trace/ftrace.rst | 3332 
> 
>  Documentation/trace/ftrace.txt | 3220 --
>  Documentation/trace/index.rst  |1 +
>  3 files changed,  insertions(+), 3220 deletions(-)
>  create mode 100644 Documentation/trace/ftrace.rst
>  delete mode 100644 Documentation/trace/ftrace.txt
>
> diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst
> new file mode 100644
> index 000..636aa9bf
> --- /dev/null
> +++ b/Documentation/trace/ftrace.rst
> @@ -0,0 +1,3332 @@
> +
> +ftrace - Function Tracer
> +
> +
> +Copyright 2008 Red Hat Inc.
> +
> +:Author:   Steven Rostedt 
> +:License:  The GNU Free Documentation License, Version 1.2
> +  (dual licensed under the GPL v2)


Do you mind using an SPDX id per [1] rather that this?

Steven, are you OK with this? Can you ack?

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/license-rules.rst
-- 
Cordially
Philippe Ombredanne
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html