Re: [PATCH v2] usbip: fix protocol documentation

2018-01-30 Thread Alexandre Demers
I think I should have sent it to Jonathan instead of Valentina, and to
linux-doc instead of linux-usb.

Alexandre Demers


On 2018-01-28 02:15, Alexandre Demers wrote:
> While reading the document, I found what seems to be 4 mistakes:
>  - 1 clarification: it seems more logical to say that the device driver on
>    the client machine is for the "imported" USB device management, not for
>    the "exported" devices;
>  - 2 typos: "the the" and "completition -> completion";
>  - 1 error in the protocol description (I think): each device has a
>    specific path, thus every exported USB device description starts at that
>    field, not at the busid.
>
> Signed-off-by: Alexandre Demers 
> ---
>  Documentation/usb/usbip_protocol.txt | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/usb/usbip_protocol.txt b/Documentation/usb/usbip_protocol.txt
> index 16b6fe27284c..23527101e2ea 100644
> --- a/Documentation/usb/usbip_protocol.txt
> +++ b/Documentation/usb/usbip_protocol.txt
> @@ -2,7 +2,7 @@ PRELIMINARY DRAFT, MAY CONTAIN MISTAKES!
>  28 Jun 2011
>
>  The USB/IP protocol follows a server/client architecture. The server exports the
> -USB devices and the clients imports them. The device driver for the exported
> +USB devices and the clients imports them. The device driver for the imported
>  USB device runs on the client machine.
>
>  The client may ask for the list of the exported USB devices. To get the list the
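> @@ -153,7 +153,7 @@ OP_REP_DEVLIST: Reply with the list of exported USB devices.
>  -----------+--------+------------+---------------------------------------------------
>   0x144     |        | m_0        | From now on each interface is described, all
>             |        |            |   together bNumInterfaces times, with the
> -           |        |            |   the following 4 fields:
> +           |        |            |   following 4 fields:
>  -----------+--------+------------+---------------------------------------------------
>             | 1      |            | bInterfaceClass
>  -----------+--------+------------+---------------------------------------------------
> @@ -164,7 +164,7 @@ OP_REP_DEVLIST: Reply with the list of exported USB devices.
>   0x147     | 1      |            | padding byte for alignment, shall be set to zero
>  -----------+--------+------------+---------------------------------------------------
>   0xC +     |        |            | The second exported USB device starts at i=1
> - i*0x138 + |        |            | with the busid field.
> + i*0x138 + |        |            | with the path field.
>   m_(i-1)*4 |        |            |
>
>  OP_REQ_IMPORT: Request to import (attach) a remote USB device.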
>  OP_REQ_IMPORT: Request to import (attach) a remote USB device.
> @@ -351,7 +351,7 @@ USBIP_RET_UNLINK: Reply for URB unlink
>   0x10      | 4      |            | ep: endpoint number
>  -----------+--------+------------+---------------------------------------------------
>   0x14      | 4      |            | status: This is the value contained in the
> -           |        |            |   urb->status in the URB completition handler.
> +           |        |            |   urb->status in the URB completion handler.
>             |        |            |   FIXME: a better explanation needed.
>  -----------+--------+------------+---------------------------------------------------
>   0x30      | n      |            | URB data bytes. For ISO transfers the padding
> --
> 2.16.1.windows.1
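
The offset arithmetic in the OP_REP_DEVLIST hunk above can be sketched
as follows. This is only an illustration under stated assumptions: the
sizes 0xC, 0x138 and 4 come from the quoted table, the m_(i-1)*4 term is
read here as "4 bytes for every interface of the devices before device
i", and the function and constant names are hypothetical, not taken from
the usbip sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes taken from the offsets in the quoted table. */
enum {
	DEVLIST_HEADER_SIZE = 0xC,   /* reply header before the first device */
	DEVICE_RECORD_SIZE  = 0x138, /* fixed part of one device, starting at "path" */
	INTERFACE_SIZE      = 4      /* class, subclass, protocol, padding byte */
};

/*
 * Byte offset of the "path" field of exported device i.
 * num_interfaces[j] is bNumInterfaces of device j (hypothetical input array).
 * Assumption: interface records of all preceding devices are counted.
 */
size_t devlist_device_offset(const uint8_t *num_interfaces, size_t i)
{
	size_t off = DEVLIST_HEADER_SIZE + i * DEVICE_RECORD_SIZE;

	assert(num_interfaces != NULL || i == 0);
	for (size_t j = 0; j < i; j++)
		off += (size_t)num_interfaces[j] * INTERFACE_SIZE;
	return off;
}
```

With one interface on the first device, the second device would start
at 0x148, right after the padding byte at 0x147, which matches the
"starts with the path field" reading of the patch.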

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch -mm v2 1/3] mm, memcg: introduce per-memcg oom policy tunable

2018-01-30 Thread Michal Hocko
On Mon 29-01-18 14:38:02, David Rientjes wrote:
> On Fri, 26 Jan 2018, Michal Hocko wrote:
> 
> > > The cgroup aware oom killer is needlessly declared for the entire system
> > > by a mount option.  It's unnecessary to force the system into a single
> > > oom policy: either cgroup aware, or the traditional process aware.
> > > 
> > > This patch introduces a memory.oom_policy tunable for all mem cgroups.
> > > It is currently a no-op: it can only be set to "none", which is its
> > > default policy.  It will be expanded in the next patch to define cgroup
> > > aware oom killer behavior.
> > > 
> > > This is an extensible interface that can be used to define cgroup aware
> > > assessment of mem cgroup subtrees or the traditional process aware
> > > assessment.
> > > 
> > 
> > So what is the actual semantic and scope of this policy. Does it apply
> > only down the hierarchy. Also how do you compare cgroups with different
> > policies? Let's say you have
> >   root
> >  / |  \
> > A  B   C
> >/ \/ \
> >   D   E  F   G
> > 
> > Assume A: cgroup, B: oom_group=1, C: tree, G: oom_group=1
> > 
> 
> At each level of the hierarchy, memory.oom_policy compares immediate 
> children, it's the only way that an admin can lock in a specific oom 
> policy like "tree" and then delegate the subtree to the user.  If you've 
> configured it as above, comparing A and C should be the same based on the 
> cumulative usage of their child mem cgroups.

So cgroup == tree if we are memcg aware OOM killing, right? Why do we
need both then? Just to make memcg aware OOM killing possible?

> The policy for B hasn't been specified, but since it does not have any 
> children "cgroup" and "tree" should be the same.

So now you have a killable cgroup selected by process criterion? That
just doesn't make any sense. So I guess it would at least require to
enforce (cgroup || tree) to allow oom_group.

But even then it doesn't make much sense to me because having a memcg
killable or not is an attribute of the _memcg_ rather than the OOM
context, no? In other words how much sense does it make to have B OOM
entity or not depending on whether this is a global OOM or B OOM. Either
the workload running inside B can cope with partial tear down or it
cannot. Or do you have an example when something like that would be
useful?
 
> > Now we have the global OOM killer to choose a victim. From a quick
> > glance over those patches, it seems that we will be comparing only
> > tasks because root->oom_policy != MEMCG_OOM_POLICY_CGROUP. A, B and C
> > policies are ignored.
> 
> Right, a policy of "none" reverts its subtree back to per-process 
> comparison if you are either not using the cgroup aware oom killer or your 
> subtree is not using the cgroup aware oom killer.

So how are you going to compare none cgroups with those that consider
full memcg or hierarchy (cgroup, tree)? Are you going to consider
oom_score_adj?

> > Moreover If I select any of B's tasks then I will
> > happily kill it breaking the expectation that the whole memcg will go
> > away. Weird, don't you think? Or did I misunderstand?
> > 
> 
> It's just as weird as the behavior of memory.oom_group today without using 
> the mount option :)

Which is why oom_group returns -ENOTSUPP, so you simply cannot even set
any memcg as oom killable. And you do not have this weirdness.

> In that case, mem_cgroup_select_oom_victim() always 
> returns false and the value of memory.oom_group is ignored.  I agree that 
> it's weird in -mm and there's nothing preventing us from separating 
> memory.oom_group from the cgroup aware oom killer and allowing it to be 
> set regardless of a selection change.

It is not weird. I suspect you misunderstood the code and its intention.

> If memory.oom_group is set, and the 
> kill originates from that mem cgroup, kill all processes attached to it 
> and its subtree.
> 
> This is a criticism of the current implementation in -mm, however, my 
> extension only respects its weirdness.
> 
> > So let's assume that root: cgroup. Then we are finally comparing
> > cgroups. D, E, B, C. Of those D, E and F do not have any
> > policy. Do they inherit their policy from the parent? If they don't then
> > we should be comparing their tasks separately, no? The code disagrees
> > because once we are in the cgroup mode, we do not care about separate
> > tasks.
> > 
> 
> No, perhaps I wasn't clear in the documentation: the policy at each level 
> of the hierarchy is specified by memory.oom_policy and compares its 
> immediate children with that policy.  So the per-cgroup usage of A, B, and 
> C are compared regardless of A, B, and C's own oom policies.

You are still operating in terms of levels. And that is rather confusing
because we are operating on a _tree_ and that walk has to be independent
of the way we walk that tree - i.e. whether we do DFS or BFS ordering.

> > Let's say we choose C because it has the largest cumulative consumption.
> > It

Re: [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable

2018-01-30 Thread Michal Hocko
On Mon 29-01-18 11:11:39, Tejun Heo wrote:
> Hello, Michal.
> 
> On Mon, Jan 29, 2018 at 11:46:57AM +0100, Michal Hocko wrote:
> > @@ -1292,7 +1292,11 @@ the memory controller considers only cgroups belonging to the sub-tree
> >  of the OOM'ing cgroup.
> >  
> >  The root cgroup is treated as a leaf memory cgroup, so it's compared
> > -with other leaf memory cgroups and cgroups with oom_group option set.
> > +with other leaf memory cgroups and cgroups with oom_group option
> > +set. Due to internal implementation restrictions the size of the root
> > +cgroup is a cumulative sum of oom_badness of all its tasks (in other
> > +words oom_score_adj of each task is obeyed). This might change in the
> > +future.
> 
> Thanks, we can definitely use more documentation.  However, it's a bit
> difficult to follow.  Maybe expand it to a separate paragraph on the
> current behavior with a clear warning that the default OOM heuristics
> is subject to changes?

Does this sound any better?

From ea4fa9c36d3ec2cf13d1949169924a1a54b9fcd6 Mon Sep 17 00:00:00 2001
From: Michal Hocko 
Date: Tue, 30 Jan 2018 09:54:15 +0100
Subject: [PATCH] oom, memcg: clarify root memcg oom accounting

David Rientjes has pointed out that the way the root memcg is accounted
for by the cgroup aware OOM killer is undocumented. Unlike regular
cgroups there is no accounting going on in the root memcg (mostly for
performance reasons). Therefore we are summing up oom_badness of its
tasks. This might result in over-accounting because of the
oom_score_adj setting. Document this for now.

Signed-off-by: Michal Hocko 
---
 Documentation/cgroup-v2.txt | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index 2eaed1e2243d..67bdf19f8e5b 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -1291,8 +1291,14 @@ This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
 the memory controller considers only cgroups belonging to the sub-tree
 of the OOM'ing cgroup.
 
-The root cgroup is treated as a leaf memory cgroup, so it's compared
-with other leaf memory cgroups and cgroups with oom_group option set.
+Leaf cgroups are compared based on their cumulative memory usage. The
+root cgroup is treated as a leaf memory cgroup as well, so it's
+compared with other leaf memory cgroups. Due to internal implementation
+restrictions the size of the root cgroup is a cumulative sum of
+oom_badness of all its tasks (in other words oom_score_adj of each task
+is obeyed). Relying on oom_score_adj (apart from OOM_SCORE_ADJ_MIN)
+can lead to overestimating of the root cgroup consumption and it is
+therefore discouraged. This might change in the future, though.
 
 If there are no cgroups with the enabled memory controller,
 the OOM killer is using the "traditional" process-based approach.
-- 
2.15.1
-- 
Michal Hocko
SUSE Labs


Re: [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable

2018-01-30 Thread Roman Gushchin
On Tue, Jan 30, 2018 at 09:54:45AM +0100, Michal Hocko wrote:
> On Mon 29-01-18 11:11:39, Tejun Heo wrote:

Hello, Michal!

> diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
> index 2eaed1e2243d..67bdf19f8e5b 100644
> --- a/Documentation/cgroup-v2.txt
> +++ b/Documentation/cgroup-v2.txt
> @@ -1291,8 +1291,14 @@ This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
>  the memory controller considers only cgroups belonging to the sub-tree
>  of the OOM'ing cgroup.
>  
> -The root cgroup is treated as a leaf memory cgroup, so it's compared
> -with other leaf memory cgroups and cgroups with oom_group option set.
  ^
IMO, this statement is important. Isn't it?

> +Leaf cgroups are compared based on their cumulative memory usage. The
> +root cgroup is treated as a leaf memory cgroup as well, so it's
> +compared with other leaf memory cgroups. Due to internal implementation
> +restrictions the size of the root cgroup is a cumulative sum of
> +oom_badness of all its tasks (in other words oom_score_adj of each task
> +is obeyed). Relying on oom_score_adj (apart from OOM_SCORE_ADJ_MIN)
> +can lead to overestimating of the root cgroup consumption and it is

Hm, and underestimating too. Also OOM_SCORE_ADJ_MIN isn't any different
in this case. Say all tasks except a small one have oom_score_adj set to
-999: this means the root cgroup has an extremely low chance of being selected.
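
The over- and underestimation point can be made concrete with a small
model. This is a simplified sketch, not the kernel's oom_badness(): it
assumes that a task's score is its page usage shifted by
oom_score_adj/1000 of total memory, clamped to at least 1, and that
OOM_SCORE_ADJ_MIN tasks score 0. The struct and function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical, simplified task: only the fields this model needs. */
struct task_model {
	long pages;          /* rss + swap + page tables, in pages */
	long oom_score_adj;  /* -1000 .. 1000 */
};

/* Simplified per-task badness: usage shifted by adj/1000 of total memory. */
long badness_model(const struct task_model *t, long totalpages)
{
	long points;

	if (t->oom_score_adj == OOM_SCORE_ADJ_MIN)
		return 0;                       /* unkillable: contributes nothing */
	points = t->pages + t->oom_score_adj * (totalpages / 1000);
	return points > 0 ? points : 1;     /* never below 1 for a killable task */
}

/* Root cgroup "size" as described in the patch: sum of per-task badness. */
long root_size_model(const struct task_model *tasks, size_t n, long totalpages)
{
	long sum = 0;

	assert(tasks != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		sum += badness_model(&tasks[i], totalpages);
	return sum;
}
```

In this model, two 50000-page tasks at -999 plus one 100-page task at 0
sum to 102 on a 1000000-page system although they use 100100 pages,
while a single 100-page task at +500 is counted as 500100.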

> +therefore discouraged. This might change in the future, though.

Other than that looks very good to me.

Thank you!


Re: [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable

2018-01-30 Thread Michal Hocko
On Tue 30-01-18 11:58:51, Roman Gushchin wrote:
> On Tue, Jan 30, 2018 at 09:54:45AM +0100, Michal Hocko wrote:
> > On Mon 29-01-18 11:11:39, Tejun Heo wrote:
> 
> Hello, Michal!
> 
> > diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
> > index 2eaed1e2243d..67bdf19f8e5b 100644
> > --- a/Documentation/cgroup-v2.txt
> > +++ b/Documentation/cgroup-v2.txt
> > @@ -1291,8 +1291,14 @@ This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
> >  the memory controller considers only cgroups belonging to the sub-tree
> >  of the OOM'ing cgroup.
> >  
> > -The root cgroup is treated as a leaf memory cgroup, so it's compared
> > -with other leaf memory cgroups and cgroups with oom_group option set.
>   ^
> IMO, this statement is important. Isn't it?
> 
> > +Leaf cgroups are compared based on their cumulative memory usage. The
> > +root cgroup is treated as a leaf memory cgroup as well, so it's
> > +compared with other leaf memory cgroups. Due to internal implementation
> > +restrictions the size of the root cgroup is a cumulative sum of
> > +oom_badness of all its tasks (in other words oom_score_adj of each task
> > +is obeyed). Relying on oom_score_adj (apart from OOM_SCORE_ADJ_MIN)
> > +can lead to overestimating of the root cgroup consumption and it is
> 
> Hm, and underestimating too. Also OOM_SCORE_ADJ_MIN isn't any different
> in this case. Say all tasks except a small one have oom_score_adj set to
> -999: this means the root cgroup has an extremely low chance of being selected.
> 
> > +therefore discouraged. This might change in the future, though.
> 
> Other than that looks very good to me.

This?

diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index 2eaed1e2243d..34ad80ee90f2 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -1291,8 +1291,15 @@ This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
 the memory controller considers only cgroups belonging to the sub-tree
 of the OOM'ing cgroup.
 
-The root cgroup is treated as a leaf memory cgroup, so it's compared
-with other leaf memory cgroups and cgroups with oom_group option set.
+Leaf cgroups and cgroups with oom_group option set are compared based
+on their cumulative memory usage. The root cgroup is treated as a
+leaf memory cgroup as well, so it's compared with other leaf memory
+cgroups. Due to internal implementation restrictions the size of
+the root cgroup is a cumulative sum of oom_badness of all its tasks
+(in other words oom_score_adj of each task is obeyed). Relying on
+oom_score_adj (apart from OOM_SCORE_ADJ_MIN) can lead to over or
+underestimating of the root cgroup consumption and it is therefore
+discouraged. This might change in the future, though.
 
 If there are no cgroups with the enabled memory controller,
 the OOM killer is using the "traditional" process-based approach.
-- 
Michal Hocko
SUSE Labs


Re: [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable

2018-01-30 Thread Roman Gushchin
On Tue, Jan 30, 2018 at 01:08:52PM +0100, Michal Hocko wrote:
> On Tue 30-01-18 11:58:51, Roman Gushchin wrote:
> > On Tue, Jan 30, 2018 at 09:54:45AM +0100, Michal Hocko wrote:
> > > On Mon 29-01-18 11:11:39, Tejun Heo wrote:
> > 
> > Hello, Michal!
> > 
> > > diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
> > > index 2eaed1e2243d..67bdf19f8e5b 100644
> > > --- a/Documentation/cgroup-v2.txt
> > > +++ b/Documentation/cgroup-v2.txt
> > > @@ -1291,8 +1291,14 @@ This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
> > >  the memory controller considers only cgroups belonging to the sub-tree
> > >  of the OOM'ing cgroup.
> > >  
> > > -The root cgroup is treated as a leaf memory cgroup, so it's compared
> > > -with other leaf memory cgroups and cgroups with oom_group option set.
> >   ^
> > IMO, this statement is important. Isn't it?
> > 
> > > +Leaf cgroups are compared based on their cumulative memory usage. The
> > > +root cgroup is treated as a leaf memory cgroup as well, so it's
> > > +compared with other leaf memory cgroups. Due to internal implementation
> > > +restrictions the size of the root cgroup is a cumulative sum of
> > > +oom_badness of all its tasks (in other words oom_score_adj of each task
> > > +is obeyed). Relying on oom_score_adj (apart from OOM_SCORE_ADJ_MIN)
> > > +can lead to overestimating of the root cgroup consumption and it is
> > 
> > Hm, and underestimating too. Also OOM_SCORE_ADJ_MIN isn't any different
> > in this case. Say all tasks except a small one have oom_score_adj set to
> > -999: this means the root cgroup has an extremely low chance of being selected.
> > 
> > > +therefore discouraged. This might change in the future, though.
> > 
> > Other than that looks very good to me.
> 
> This?
> 
> diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
> index 2eaed1e2243d..34ad80ee90f2 100644
> --- a/Documentation/cgroup-v2.txt
> +++ b/Documentation/cgroup-v2.txt
> @@ -1291,8 +1291,15 @@ This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
>  the memory controller considers only cgroups belonging to the sub-tree
>  of the OOM'ing cgroup.
>  
> -The root cgroup is treated as a leaf memory cgroup, so it's compared
> -with other leaf memory cgroups and cgroups with oom_group option set.
> +Leaf cgroups and cgroups with oom_group option set are compared based
> +on their cumulative memory usage. The root cgroup is treated as a
> +leaf memory cgroup as well, so it's compared with other leaf memory
> +cgroups. Due to internal implementation restrictions the size of
> +the root cgroup is a cumulative sum of oom_badness of all its tasks
> +(in other words oom_score_adj of each task is obeyed). Relying on
> +oom_score_adj (apart from OOM_SCORE_ADJ_MIN) can lead to over or
> +underestimating of the root cgroup consumption and it is therefore
> +discouraged. This might change in the future, though.

Acked-by: Roman Gushchin 

Thank you!


Re: [PATCH v10 27/27] mm: display pkey in smaps if arch_pkeys_enabled() is true

2018-01-30 Thread Michal Hocko
On Thu 18-01-18 17:50:48, Ram Pai wrote:
[...]
> @@ -851,9 +848,13 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
>  (unsigned long)(mss->pss >> (10 + PSS_SHIFT)));
>  
>   if (!rollup_mode) {
> - arch_show_smap(m, vma);
> +#ifdef CONFIG_ARCH_HAS_PKEYS
> + if (arch_pkeys_enabled())
> + seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> +#endif
>   show_smap_vma_flags(m, vma);
>   }
> +

Why do you need to add ifdef here? The previous patch should make
arch_pkeys_enabled == F when CONFIG_ARCH_HAS_PKEYS=n. Btw. could you
merge those two patches into one? It is usually much easier to review a
new helper function if it is added along with a user.

>   m_cache_vma(m, vma);
>   return ret;
>  }
> -- 
> 1.7.1

-- 
Michal Hocko
SUSE Labs


Re: [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable

2018-01-30 Thread Michal Hocko
On Tue 30-01-18 12:13:22, Roman Gushchin wrote:
> On Tue, Jan 30, 2018 at 01:08:52PM +0100, Michal Hocko wrote:
> > On Tue 30-01-18 11:58:51, Roman Gushchin wrote:
> > > On Tue, Jan 30, 2018 at 09:54:45AM +0100, Michal Hocko wrote:
> > > > On Mon 29-01-18 11:11:39, Tejun Heo wrote:
> > > 
> > > Hello, Michal!
> > > 
> > > > diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
> > > > index 2eaed1e2243d..67bdf19f8e5b 100644
> > > > --- a/Documentation/cgroup-v2.txt
> > > > +++ b/Documentation/cgroup-v2.txt
> > > > @@ -1291,8 +1291,14 @@ This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
> > > >  the memory controller considers only cgroups belonging to the sub-tree
> > > >  of the OOM'ing cgroup.
> > > >  
> > > > -The root cgroup is treated as a leaf memory cgroup, so it's compared
> > > > -with other leaf memory cgroups and cgroups with oom_group option set.
> > >   ^
> > > IMO, this statement is important. Isn't it?
> > > 
> > > > +Leaf cgroups are compared based on their cumulative memory usage. The
> > > > +root cgroup is treated as a leaf memory cgroup as well, so it's
> > > > +compared with other leaf memory cgroups. Due to internal implementation
> > > > +restrictions the size of the root cgroup is a cumulative sum of
> > > > +oom_badness of all its tasks (in other words oom_score_adj of each task
> > > > +is obeyed). Relying on oom_score_adj (apart from OOM_SCORE_ADJ_MIN)
> > > > +can lead to overestimating of the root cgroup consumption and it is
> > > 
> > > Hm, and underestimating too. Also OOM_SCORE_ADJ_MIN isn't any different
> > > in this case. Say all tasks except a small one have oom_score_adj set to
> > > -999: this means the root cgroup has an extremely low chance of being selected.
> > > 
> > > > +therefore discouraged. This might change in the future, though.
> > > 
> > > Other than that looks very good to me.
> > 
> > This?
> > 
> > diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
> > index 2eaed1e2243d..34ad80ee90f2 100644
> > --- a/Documentation/cgroup-v2.txt
> > +++ b/Documentation/cgroup-v2.txt
> > @@ -1291,8 +1291,15 @@ This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
> >  the memory controller considers only cgroups belonging to the sub-tree
> >  of the OOM'ing cgroup.
> >  
> > -The root cgroup is treated as a leaf memory cgroup, so it's compared
> > -with other leaf memory cgroups and cgroups with oom_group option set.
> > +Leaf cgroups and cgroups with oom_group option set are compared based
> > +on their cumulative memory usage. The root cgroup is treated as a
> > +leaf memory cgroup as well, so it's compared with other leaf memory
> > +cgroups. Due to internal implementation restrictions the size of
> > +the root cgroup is a cumulative sum of oom_badness of all its tasks
> > +(in other words oom_score_adj of each task is obeyed). Relying on
> > +oom_score_adj (apart from OOM_SCORE_ADJ_MIN) can lead to over or
> > +underestimating of the root cgroup consumption and it is therefore
> > +discouraged. This might change in the future, though.
> 
> Acked-by: Roman Gushchin 

Andrew?

From 361275a05ad7026b8f721f8aa756a4975a2c42b1 Mon Sep 17 00:00:00 2001
From: Michal Hocko 
Date: Tue, 30 Jan 2018 09:54:15 +0100
Subject: [PATCH] oom, memcg: clarify root memcg oom accounting

David Rientjes has pointed out that the way the root memcg is accounted
for by the cgroup aware OOM killer is undocumented. Unlike regular
cgroups there is no accounting going on in the root memcg (mostly for
performance reasons). Therefore we are summing up oom_badness of its
tasks. This might result in over-accounting because of the
oom_score_adj setting. Document this for now.

Acked-by: Roman Gushchin 
Signed-off-by: Michal Hocko 
---
 Documentation/cgroup-v2.txt | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
index 2eaed1e2243d..34ad80ee90f2 100644
--- a/Documentation/cgroup-v2.txt
+++ b/Documentation/cgroup-v2.txt
@@ -1291,8 +1291,15 @@ This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
 the memory controller considers only cgroups belonging to the sub-tree
 of the OOM'ing cgroup.
 
-The root cgroup is treated as a leaf memory cgroup, so it's compared
-with other leaf memory cgroups and cgroups with oom_group option set.
+Leaf cgroups and cgroups with oom_group option set are compared based
+on their cumulative memory usage. The root cgroup is treated as a
+leaf memory cgroup as well, so it's compared with other leaf memory
+cgroups. Due to internal implementation restrictions the size of
+the root cgroup is a cumulative sum of oom_badness of all its tasks
+(in other words oom_score_adj of each task is obeyed). Relying on
+oom_score_adj (apart from OOM_SCORE_ADJ_MIN) can lead to over or
+underestimating of the root cgroup consumpt

[PATCH] Fix broken link in Documentation/process/kernel-docs.rst

2018-01-30 Thread Grigory Shipunov
Kernel Glossary has moved from /Glossary to /KernelGlossary
on kernelnewbies.org. This patch corrects this link.

Signed-off-by: Grigory Shipunov 
---
 Documentation/process/kernel-docs.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/process/kernel-docs.rst b/Documentation/process/kernel-docs.rst
index b8cac85a4001..3fb28de556e4 100644
--- a/Documentation/process/kernel-docs.rst
+++ b/Documentation/process/kernel-docs.rst
@@ -58,7 +58,7 @@ On-line docs
 * Title: **Linux Kernel Mailing List Glossary**
 
   :Author: various
-  :URL: http://kernelnewbies.org/glossary/
+  :URL: https://kernelnewbies.org/KernelGlossary
   :Date: rolling version
   :Keywords: glossary, terms, linux-kernel.
   :Description: From the introduction: "This glossary is intended as
-- 
2.14.3



Re: [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable

2018-01-30 Thread Tejun Heo
On Tue, Jan 30, 2018 at 01:20:11PM +0100, Michal Hocko wrote:
> From 361275a05ad7026b8f721f8aa756a4975a2c42b1 Mon Sep 17 00:00:00 2001
> From: Michal Hocko 
> Date: Tue, 30 Jan 2018 09:54:15 +0100
> Subject: [PATCH] oom, memcg: clarify root memcg oom accounting
> 
> David Rientjes has pointed out that the way the root memcg is accounted
> for by the cgroup aware OOM killer is undocumented. Unlike regular
> cgroups there is no accounting going on in the root memcg (mostly for
> performance reasons). Therefore we are summing up oom_badness of its
> tasks. This might result in over-accounting because of the
> oom_score_adj setting. Document this for now.
> 
> Acked-by: Roman Gushchin 
> Signed-off-by: Michal Hocko 

Acked-by: Tejun Heo 

Thanks, Michal.

-- 
tejun


Re: [PATCH v10 27/27] mm: display pkey in smaps if arch_pkeys_enabled() is true

2018-01-30 Thread Ram Pai
On Tue, Jan 30, 2018 at 01:16:11PM +0100, Michal Hocko wrote:
> On Thu 18-01-18 17:50:48, Ram Pai wrote:
> [...]
> > @@ -851,9 +848,13 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
> >(unsigned long)(mss->pss >> (10 + PSS_SHIFT)));
> >  
> > if (!rollup_mode) {
> > -   arch_show_smap(m, vma);
> > +#ifdef CONFIG_ARCH_HAS_PKEYS
> > +   if (arch_pkeys_enabled())
> > +   seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> > +#endif
> > show_smap_vma_flags(m, vma);
> > }
> > +
> 
> Why do you need to add ifdef here? The previous patch should make
> arch_pkeys_enabled == F when CONFIG_ARCH_HAS_PKEYS=n.

You are right, it need not be wrapped in CONFIG_ARCH_HAS_PKEYS. I had to do it
because vma_pkey(vma) is not defined on some architectures.

I will provide a generic vma_pkey() definition for architectures that do 
not support PKEYS.
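
A generic fallback of that kind could look roughly like the sketch
below. It is a user-space mock under stated assumptions, not the actual
follow-up patch: struct vma_model and show_pkey_model() are hypothetical
stand-ins, and only the names arch_pkeys_enabled(), vma_pkey() and
CONFIG_ARCH_HAS_PKEYS come from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct vm_area_struct. */
struct vma_model {
	unsigned int pkey;
};

#ifdef CONFIG_ARCH_HAS_PKEYS
/* An architecture with protection keys would supply real versions. */
static inline bool arch_pkeys_enabled(void) { return true; }
static inline unsigned int vma_pkey(const struct vma_model *vma) { return vma->pkey; }
#else
/* Generic fallbacks: pkeys are reported as disabled and the key is 0. */
static inline bool arch_pkeys_enabled(void) { return false; }
static inline unsigned int vma_pkey(const struct vma_model *vma) { (void)vma; return 0; }
#endif

/* Caller needs no #ifdef; writes the smaps line only when pkeys are enabled. */
int show_pkey_model(char *buf, size_t len, const struct vma_model *vma)
{
	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	if (arch_pkeys_enabled())
		return snprintf(buf, len, "ProtectionKey:  %8u\n", vma_pkey(vma));
	return 0;
}
```

Because both helpers always exist, the call site compiles everywhere
and the compiler can drop the dead branch when the fallback returns
false.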



> Btw. could you
> merge those two patches into one. It is usually much easier to review a
> new helper function if it is added along with a user.


ok.

Thanks,
RP



[PATCH] Documentation: Fix 'file_mapped' -> 'mapped_file'

2018-01-30 Thread Florian Schmidt
There is no entry file_mapped in the memory.stat file. This looks like a
simple word flip that's gone unnoticed since 2010 (dc10e281f5fc,
memcg: update documentation).

Signed-off-by: Florian Schmidt 
---
 Documentation/cgroup-v1/memory.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/cgroup-v1/memory.txt b/Documentation/cgroup-v1/memory.txt
index cefb63639070..a4af2e124e24 100644
--- a/Documentation/cgroup-v1/memory.txt
+++ b/Documentation/cgroup-v1/memory.txt
@@ -524,9 +524,9 @@ Note:
 	Only anonymous and swap cache memory is listed as part of 'rss' stat.
 	This should not be confused with the true 'resident set size' or the
 	amount of physical memory used by the cgroup.
-	'rss + file_mapped" will give you resident set size of cgroup.
+	'rss + mapped_file" will give you resident set size of cgroup.
 	(Note: file and shmem may be shared among other cgroups. In that case,
-	 file_mapped is accounted only when the memory cgroup is owner of page
+	 mapped_file is accounted only when the memory cgroup is owner of page
 	 cache.)
 
 5.3 swappiness
-- 
2.16.1
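
The corrected note ("rss + mapped_file") can be sketched as a small
parser over memory.stat-style text. This is a minimal illustration, not
kernel or libcgroup code: it assumes one "name value" pair per line, and
the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up one "name value" line in memory.stat-style text.
 * Returns 0 and stores the value on success, -1 if the key is absent.
 */
int stat_lookup(const char *text, const char *key, unsigned long long *out)
{
	size_t klen;

	assert(text != NULL && key != NULL && out != NULL);
	klen = strlen(key);
	for (const char *line = text; line && *line; ) {
		/* Match the whole key followed by a space, so "rss" != "rss_huge". */
		if (strncmp(line, key, klen) == 0 && line[klen] == ' ')
			return sscanf(line + klen, "%llu", out) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Resident set size of the cgroup as the note describes: rss + mapped_file. */
int cgroup_resident_bytes(const char *text, unsigned long long *out)
{
	unsigned long long rss, mapped;

	if (stat_lookup(text, "rss", &rss) || stat_lookup(text, "mapped_file", &mapped))
		return -1;
	*out = rss + mapped;
	return 0;
}
```

Looking up "file_mapped" with this helper fails, which is the mismatch
the patch fixes in the documentation.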



Re: [PATCH] Documentation: Fix 'file_mapped' -> 'mapped_file'

2018-01-30 Thread Michal Hocko
On Tue 30-01-18 17:42:13, Florian Schmidt wrote:
> There is no entry file_mapped in the memory.stat file. This looks like a
> simple word flip that's gone unnoticed since 2010 (dc10e281f5fc,
> memcg: update documentation).
> 
> Signed-off-by: Florian Schmidt 

Acked-by: Michal Hocko 

Thanks for catching this.

> ---
>  Documentation/cgroup-v1/memory.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/cgroup-v1/memory.txt b/Documentation/cgroup-v1/memory.txt
> index cefb63639070..a4af2e124e24 100644
> --- a/Documentation/cgroup-v1/memory.txt
> +++ b/Documentation/cgroup-v1/memory.txt
> @@ -524,9 +524,9 @@ Note:
>   Only anonymous and swap cache memory is listed as part of 'rss' stat.
>   This should not be confused with the true 'resident set size' or the
>   amount of physical memory used by the cgroup.
> - 'rss + file_mapped" will give you resident set size of cgroup.
> + 'rss + mapped_file" will give you resident set size of cgroup.
>   (Note: file and shmem may be shared among other cgroups. In that case,
> -  file_mapped is accounted only when the memory cgroup is owner of page
> +  mapped_file is accounted only when the memory cgroup is owner of page
>cache.)
>  
>  5.3 swappiness
> -- 
> 2.16.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe cgroups" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Michal Hocko
SUSE Labs


Re: [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable

2018-01-30 Thread Johannes Weiner
On Tue, Jan 30, 2018 at 01:20:11PM +0100, Michal Hocko wrote:
> From 361275a05ad7026b8f721f8aa756a4975a2c42b1 Mon Sep 17 00:00:00 2001
> From: Michal Hocko 
> Date: Tue, 30 Jan 2018 09:54:15 +0100
> Subject: [PATCH] oom, memcg: clarify root memcg oom accounting
> 
> David Rientjes has pointed out that the current way how the root memcg
> is accounted for the cgroup aware OOM killer is undocumented. Unlike
> regular cgroups there is no accounting going on in the root memcg
> (mostly for performance reasons). Therefore we are summing up oom_badness
> of its tasks. This might result in an over accounting because of the
> oom_score_adj setting. Document this for now.
> 
> Acked-by: Roman Gushchin 
> Signed-off-by: Michal Hocko 

Acked-by: Johannes Weiner 


Re: [PATCH] Documentation: Fix 'file_mapped' -> 'mapped_file'

2018-01-30 Thread Tejun Heo
On Tue, Jan 30, 2018 at 05:42:13PM +0100, Florian Schmidt wrote:
> There is no entry file_mapped in the memory.stat file. This looks like a
> simple word flip that's gone unnoticed since 2010 (dc10e281f5fc,
> memcg: update documentation).
> 
> Signed-off-by: Florian Schmidt 

Applied to cgroup/for-4.16.

Thanks.

-- 
tejun


[PATCH] Documentation/process: kernel maintainer PGP guide

2018-01-30 Thread Konstantin Ryabitsev
This guide is an adapted version of the more general "Protecting Code
Integrity" guide written and maintained by The Linux Foundation IT for
use with open-source projects. It provides the oft-lacking guidance on
the following topics:

- how to properly protect one's PGP keys to minimize the risks of them
  being stolen and used maliciously to impersonate a kernel developer
- how to configure Git to properly use GnuPG
- when and how to use PGP with Git
- how to verify fellow Linux Kernel developer identities

I believe this document should live with the rest of the documentation
describing proper processes one should follow when participating in
kernel development. Placing it in a wiki on some place like kernel.org
would be insufficient for a number of reasons -- primarily, because only
a relatively small subset of maintainers have accounts on kernel.org,
but also because even those who do rarely remember that such wiki
exists. Keeping it with the rest of in-kernel docs should hopefully give
it more visibility, but also help keep it up-to-date as tools and
processes evolve.

Signed-off-by: Konstantin Ryabitsev 
---
 Documentation/process/index.rst|   1 +
 Documentation/process/maintainer-pgp-guide.rst | 899 +
 2 files changed, 900 insertions(+)
 create mode 100644 Documentation/process/maintainer-pgp-guide.rst

diff --git a/Documentation/process/index.rst b/Documentation/process/index.rst
index a430f6eee756..1c9fe657ed01 100644
--- a/Documentation/process/index.rst
+++ b/Documentation/process/index.rst
@@ -24,6 +24,7 @@ Below are the essential guides that every developer should 
read.
development-process
submitting-patches
coding-style
+   maintainer-pgp-guide
email-clients
kernel-enforcement-statement
kernel-driver-statement
diff --git a/Documentation/process/maintainer-pgp-guide.rst 
b/Documentation/process/maintainer-pgp-guide.rst
new file mode 100644
index ..21ec9169a4d5
--- /dev/null
+++ b/Documentation/process/maintainer-pgp-guide.rst
@@ -0,0 +1,899 @@
+===========================
+Kernel Maintainer PGP guide
+===========================
+
+This document is aimed at Linux kernel developers, and especially
+subsystem maintainers. It contains a subset of information discussed in
+the more general "`Protecting Code Integrity`_" guide published by the
+Linux Foundation. Please read that document for more in-depth discussion
+on some of the topics mentioned in this guide.
+
+.. _`Protecting Code Integrity`: 
https://github.com/lfit/itpol/blob/master/protecting-code-integrity.md
+
+The role of PGP in Linux Kernel development
+===========================================
+
+PGP helps ensure the integrity of the code that is produced by the Linux
+Kernel development community and, to a lesser degree, establish trusted
+communication channels between developers via PGP-signed email exchange.
+
+The Linux Kernel source code is available in two main formats:
+
+- Distributed source repositories (git)
+- Periodic release snapshots (tarballs)
+
+Both git repositories and tarballs carry PGP signatures of the kernel
+developers who create official kernel releases. These signatures offer a
+cryptographic guarantee that downloadable versions made available via
+kernel.org or any other mirrors are identical to what these developers
+have on their workstations. To this end:
+
+- git repositories provide PGP signatures on all tags
+- tarballs provide detached PGP signatures with all downloads
+
+Trusting the developers, not infrastructure
+-------------------------------------------
+
+Ever since the 2011 compromise of core kernel.org systems, the main
+operating principle of the Kernel Archives project has been to assume
+that any part of the infrastructure can be compromised at any time. For
+this reason, the administrators have taken deliberate steps to emphasize
+that trust must always be placed with developers and never with the code
+hosting infrastructure, regardless of how good the security practices
+for the latter may be.
+
+The above guiding principle is the reason why this guide is needed. We
+want to make sure that by placing trust into developers we do not simply
+shift the blame for potential future security incidents to someone else.
+The goal is to provide a set of guidelines developers can use to create
+a secure working environment and safeguard the PGP keys used to
+establish the integrity of the Linux Kernel itself.
+
+PGP tools
+=========
+
+Use GnuPG v2
+------------
+
+Your distro should already have GnuPG installed by default, you just
+need to verify that you are using version 2.x and not the legacy 1.4
+release -- many distributions still package both, with the default
+``gpg`` command invoking GnuPG v.1. To check, run::
+
+$ gpg --version | head -n1
+
+If you see ``gpg (GnuPG) 1.4.x``, then you are using GnuPG v.1. Try the
+``gpg2`` command (if you don't have it, you may need to install the
+gnupg2 package)::
+
+   

Re: [PATCH v9 5/7] arm64: kvm: Introduce KVM_ARM_SET_SERROR_ESR ioctl

2018-01-30 Thread James Morse
Hi gengdongjiu,

On 24/01/18 20:06, gengdongjiu wrote:
>> On 06/01/18 16:02, Dongjiu Geng wrote:
>>> The ARM64 RAS SError Interrupt (SEI) syndrome value is specific to the
>>> guest and user space needs a way to tell KVM this value. So we add a
>>> new ioctl. Before user space specifies the Exception Syndrome Register
>>> (ESR), it first checks whether KVM has the capability to set
>>> the guest ESR. If it does, user space sets it; otherwise there is nothing to do.
>>>
>>> Specifying the ESR is supported only for AArch64, not for AArch32.
>>
>> After this patch user-space can trigger an SError in the guest. If it wants 
>> to migrate the guest, how does the pending SError get migrated?
>>
>> I think we need to fix migration first. Andrew Jones suggested using
>> KVM_GET/SET_VCPU_EVENTS:
>> https://www.spinics.net/lists/arm-kernel/msg616846.html
>>
>> Given KVM uses kvm_inject_vabt() on v8.0 hardware too, we should cover 
>> systems without the v8.2 RAS Extensions with the same API. I
>> think this means a bit to read/write whether SError is pending, and another 
>> to indicate the ESR should be set/read.
>> CPUs without the v8.2 RAS Extensions can reject pending-SError that had an 
>> ESR.
> 
> For CPUs without the v8.2 RAS Extensions, the ESR is always 0; 
> we can only inject an SError with ESR 0 into the guest and cannot set its ESR.

0? It's always implementation-defined. On Juno it seems to be always-0, but
other systems may behave differently. (Juno may generate another ESR value when
I'm not watching it...)

Just because we can't control the ESR doesn't mean injecting an SError isn't
something user-space may want to do.
If we tackle migration of pending-SError first, I think that will give us the
API to create a new pending SError with/without an ESR as appropriate.


> As for using KVM_GET/SET_VCPU_EVENTS, I will check the code and
> consider your suggestion at the same time.

(Not my suggestion; it was Andrew Jones's idea.)

> The IOCTL KVM_GET/SET_VCPU_EVENTS has been used by X86.

We would be re-using the struct to have values with slightly different meanings.
But for migration the upshot is the same, call KVM_GET_VCPU_EVENTS on one
system, and pass the struct to KVM_SET_VCPU_EVENTS on the new system. If we're
lucky Qemu may be able to do this in shared x86/arm64 code.


>>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index
>>> 5c7f657..738ae90 100644
>>> --- a/arch/arm64/kvm/guest.c
>>> +++ b/arch/arm64/kvm/guest.c
>>> @@ -277,6 +277,11 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu 
>>> *vcpu,
>>> return -EINVAL;
>>>  }
>>>
>>> +int kvm_arm_set_sei_esr(struct kvm_vcpu *vcpu, u32 *syndrome) {
>>> +   return -EINVAL;
>>> +}
>>
>> Does nothing in the patch that adds the support? This is a bit odd.
>> (oh, it's hiding in patch 6...)
> 
> To make this patch simple and small, I add it in patch 6.

But that made the functionality of this patch: A new API to return -EINVAL from
the kernel.

Swapping the patches round would have avoided this.
Regardless, I think this will fold out in a rebase.


Thanks,

James


Re: [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable

2018-01-30 Thread Andrew Morton
On Tue, 30 Jan 2018 13:20:11 +0100 Michal Hocko  wrote:

> Subject: [PATCH] oom, memcg: clarify root memcg oom accounting
> 
> David Rientjes has pointed out that the current way how the root memcg
> is accounted for the cgroup aware OOM killer is undocumented. Unlike
> regular cgroups there is no accounting going on in the root memcg
> (mostly for performance reasons). Therefore we are summing up oom_badness
> of its tasks. This might result in an over accounting because of the
> oom_score_adj setting. Document this for now.

Thanks.  Some tweakage:

--- 
a/Documentation/cgroup-v2.txt~mm-oom-docs-describe-the-cgroup-aware-oom-killer-fix-2-fix
+++ a/Documentation/cgroup-v2.txt
@@ -1292,13 +1292,13 @@ of the OOM'ing cgroup.
 
 Leaf cgroups and cgroups with oom_group option set are compared based
 on their cumulative memory usage. The root cgroup is treated as a
-leaf memory cgroup as well, so it's compared with other leaf memory
+leaf memory cgroup as well, so it is compared with other leaf memory
 cgroups. Due to internal implementation restrictions the size of
-the root cgroup is a cumulative sum of oom_badness of all its tasks
+the root cgroup is the cumulative sum of oom_badness of all its tasks
 (in other words oom_score_adj of each task is obeyed). Relying on
-oom_score_adj (appart from OOM_SCORE_ADJ_MIN) can lead to over or
-underestimating of the root cgroup consumption and it is therefore
-discouraged. This might change in the future, though.
+oom_score_adj (apart from OOM_SCORE_ADJ_MIN) can lead to over- or
+underestimation of the root cgroup consumption and it is therefore
+discouraged. This might change in the future, however.
 
 If there are no cgroups with the enabled memory controller,
 the OOM killer is using the "traditional" process-based approach.
_
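
To illustrate why relying on oom_score_adj can skew the root cgroup's size,
here is a simplified Python model. It loosely follows the shape of
oom_badness() (memory footprint plus oom_score_adj scaled by total pages /
1000); the function names, the (footprint, adj) tuple layout and the numbers
are assumptions for illustration, and details of the real function are
omitted.

```python
OOM_SCORE_ADJ_MIN = -1000


def oom_badness(footprint_pages, oom_score_adj, total_pages):
    """Simplified per-task badness: footprint plus a scaled oom_score_adj."""
    if oom_score_adj == OOM_SCORE_ADJ_MIN:
        return 0  # task is exempt from OOM killing
    points = footprint_pages + oom_score_adj * (total_pages // 1000)
    return points if points > 0 else 1


def root_cgroup_size(tasks, total_pages):
    """Root cgroup 'size' as the cumulative sum of its tasks' badness."""
    return sum(oom_badness(pages, adj, total_pages) for pages, adj in tasks)


TOTAL_PAGES = 1_000_000  # hypothetical machine size in pages

# Two tasks with the same real footprint; only the second sets oom_score_adj.
tasks = [(10_000, 0), (10_000, 500)]

actual_pages = sum(pages for pages, _ in tasks)
estimated = root_cgroup_size(tasks, TOTAL_PAGES)
print(actual_pages, estimated)
```

In this model a positive oom_score_adj inflates the root cgroup far beyond
the memory its tasks really use, and a negative one shrinks it the same way.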



Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

2018-01-30 Thread James Morse
Hi gengdongjiu,

On 23/01/18 09:23, gengdongjiu wrote:
> On 2018/1/23 3:39, James Morse wrote:
>> gengdongjiu wrote:
>>> This error source parsing and handling method
>>> is similar to that of the SEA.
>>
>> There are problems with doing this:
>>
>> Oct. 18, 2017, 10:26 a.m. James Morse wrote:
>> | How do SEA and SEI interact?
>> |
>> | As far as I can see they can both interrupt each other, which isn't something
>> | the single in_nmi() path in APEI can handle. I think we should fix this
>> | first.
>>
>> [..]
>>
>> | SEA gets away with a lot of things because it's synchronous. SEI isn't. Xie
>> | XiuQi pointed to the memory_failure_queue() code. We can use this directly
>> | from SEA, but not SEI. (what happens if an SError arrives while we are
>> | queueing memory_failure work from an IRQ).
>> |
>> | The one that scares me is the trace-point reporting stuff. What happens if an
>> | SError arrives while we are enabling a trace point? (these are static-keys
>> | right?)
>> |
>> |  I don't think we can just plumb SEI in like this and be done with it.
>> |  (I'm looking at teasing out the estatus cache code from being x86:NMI only.
>> |  This way we solve the same 'can't do this from NMI context' with the same
>> |  code.)
>>
>>
>> I will post what I've got for this estatus-cache thing as an RFC; it's not ready
>> to be considered yet.

> Yes, I know you are doing that. Your patch series will consider all of the 
> above, right?

Assuming I got it right, yes. It currently makes the race Xie XiuQi spotted
worse, which I want to fix too. (details on the cover letter)


> If your patches take that into account, this patch can be based on your patchset. 
> Thanks.

I'd like to pick these patches onto the end of that series, but first I want to
know what NOTIFY_SEI means for any OS. The ACPI spec doesn't say, and because
it's asynchronous, route-able and mask-able, there are many more corners than
NOTIFY_SEA.

This thing is a notification using an emulated SError exception. (emulated
because physical-SError must be routed to EL3 for firmware-first, and
virtual-SError belongs to EL2).

Does your firmware emulate SError exactly as the TakeException() pseudo code in
the Arm-Arm?
Is the emulated SError routed following the routing rules for HCR_EL2.{AMO, TGE}?
What does your firmware do when it wants to emulate SError but it's masked?
(e.g.1: The physical-SError interrupted EL2 and the SPSR shows EL2 had PSTATE.A
 set.
 e.g.2: The physical-SError interrupted EL2 but HCR_EL2 indicates the emulated
 SError should go to EL1. This effectively masks SError.)


Answers to these let us determine whether a bug is in the firmware or the
kernel. If firmware is expecting the OS to do something special, I'd like to
know about it from the beginning!


>>> Expose API ghes_notify_sei() to external users. External
>>> modules can call this exposed API to parse APEI table and
>>> handle the SEI notification.
>>
>> external modules? You mean called by the arch code when it gets this 
>> NOTIFY_SEI?

> yes, called by kernel ARCH code, such as below, I remember I have discussed 
> with you.

Sure. The phrase 'external modules' usually means the '.ko' files that live in
/lib/modules; nothing outside the kernel tree should be doing this stuff.


Thanks,

James



Re: [PATCH v5] input: pxrc: new driver for PhoenixRC Flight Controller Adapter

2018-01-30 Thread Marcus Folkesson
Hello Dmitry,

Would you mind having a look at v5?

Thanks!


On Sat, Jan 20, 2018 at 09:58:40PM +0100, Marcus Folkesson wrote:
> This driver lets you plug in your RC controller to the adapter and
> use it as input device in various RC simulators.
> 
> Signed-off-by: Marcus Folkesson 
> ---
> 
> v5:
>   - Drop autosuspend support
>   - Use pm_mutex instead of input_dev->mutex
>   - Use pxrc->is_open instead of input_dev->users
> v4:
>   - Add call to usb_mark_last_busy() in irq
>   - Move code from pxrc_resume() to pxrc_reset_resume()
> v3:
>   - Use RUDDER and MISC instead of TILT_X and TILT_Y
>   - Drop kref and anchor
>   - Rework URB handling
>   - Add PM support
> v2:
>   - Change module license to GPLv2 to match SPDX tag
> 
> 
>  Documentation/input/devices/pxrc.rst |  57 +++
>  drivers/input/joystick/Kconfig   |   9 ++
>  drivers/input/joystick/Makefile  |   1 +
>  drivers/input/joystick/pxrc.c| 303 
> +++
>  4 files changed, 370 insertions(+)
>  create mode 100644 Documentation/input/devices/pxrc.rst
>  create mode 100644 drivers/input/joystick/pxrc.c
> 
> diff --git a/Documentation/input/devices/pxrc.rst 
> b/Documentation/input/devices/pxrc.rst
> new file mode 100644
> index ..ca11f646bae8
> --- /dev/null
> +++ b/Documentation/input/devices/pxrc.rst
> @@ -0,0 +1,57 @@
> +==========================================
> +pxrc - PhoenixRC Flight Controller Adapter
> +==========================================
> +
> +:Author: Marcus Folkesson 
> +
> +This driver let you use your own RC controller plugged into the
> +adapter that comes with PhoenixRC [1]_ or other compatible adapters.
> +
> +The adapter supports 7 analog channels and 1 digital input switch.
> +
> +Notes
> +=====
> +
> +Many RC controllers is able to configure which stick goes to which channel.
> +This is also configurable in most simulators, so a matching is not necessary.
> +
> +The driver is generating the following input event for analog channels:
> +
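> ++---------+----------------+
> +| Channel |      Event     |
> ++=========+================+
> +| 1       |  ABS_X         |
> ++---------+----------------+
> +| 2       |  ABS_Y         |
> ++---------+----------------+
> +| 3       |  ABS_RX        |
> ++---------+----------------+
> +| 4       |  ABS_RY        |
> ++---------+----------------+
> +| 5       |  ABS_RUDDER    |
> ++---------+----------------+
> +| 6       |  ABS_THROTTLE  |
> ++---------+----------------+
> +| 7       |  ABS_MISC      |
> ++---------+----------------+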
> +
> +The digital input switch is generated as an `BTN_A` event.
> +
> +Manual Testing
> +==============
> +
> +To test this driver's functionality you may use `input-event` which is part 
> of
> +the `input layer utilities` suite [2]_.
> +
> +For example::
> +
> +> modprobe pxrc
> +> input-events 
> +
> +To print all input events from input `devnr`.
> +
> +References
> +==========
> +
> +.. [1] http://www.phoenix-sim.com/
> +.. [2] https://www.kraxel.org/cgit/input/
> diff --git a/drivers/input/joystick/Kconfig b/drivers/input/joystick/Kconfig
> index f3c2f6ea8b44..332c0cc1b2ab 100644
> --- a/drivers/input/joystick/Kconfig
> +++ b/drivers/input/joystick/Kconfig
> @@ -351,4 +351,13 @@ config JOYSTICK_PSXPAD_SPI_FF
>  
> To drive rumble motor a dedicated power supply is required.
>  
> +config JOYSTICK_PXRC
> + tristate "PhoenixRC Flight Controller Adapter"
> + depends on USB_ARCH_HAS_HCD
> + depends on USB
> + help
> +   Say Y here if you want to use the PhoenixRC Flight Controller Adapter.
> +
> +   To compile this driver as a module, choose M here: the
> +   module will be called pxrc.
>  endif
> diff --git a/drivers/input/joystick/Makefile b/drivers/input/joystick/Makefile
> index 67651efda2e1..dd0492ebbed7 100644
> --- a/drivers/input/joystick/Makefile
> +++ b/drivers/input/joystick/Makefile
> @@ -23,6 +23,7 @@ obj-$(CONFIG_JOYSTICK_JOYDUMP)  += joydump.o
>  obj-$(CONFIG_JOYSTICK_MAGELLAN)  += magellan.o
>  obj-$(CONFIG_JOYSTICK_MAPLE) += maplecontrol.o
>  obj-$(CONFIG_JOYSTICK_PSXPAD_SPI)+= psxpad-spi.o
> +obj-$(CONFIG_JOYSTICK_PXRC)  += pxrc.o
>  obj-$(CONFIG_JOYSTICK_SIDEWINDER)+= sidewinder.o
>  obj-$(CONFIG_JOYSTICK_SPACEBALL) += spaceball.o
>  obj-$(CONFIG_JOYSTICK_SPACEORB)  += spaceorb.o
> diff --git a/drivers/input/joystick/pxrc.c b/drivers/input/joystick/pxrc.c
> new file mode 100644
> index ..07a0dbd3ced2
> --- /dev/null
> +++ b/drivers/input/joystick/pxrc.c
> @@ -0,0 +1,303 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Driver for Phoenix RC Flight Controller Adapter
> + *
> + * Copyright (C) 2018 Marcus Folkesson 
> + *
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define PXRC_VENDOR_ID   (

[PATCH] dm cache: Documentation: update default migration_throttling value

2018-01-30 Thread John Pittman
In commit f8350daf7af0 ("dm cache: tune migration throttling") the
value for DEFAULT_MIGRATION_THRESHOLD was decreased from 204800 to
2048.  Edit device-mapper/cache.txt to reflect the correct default
value for migration_threshold.

Signed-off-by: John Pittman 
---
 Documentation/device-mapper/cache.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/device-mapper/cache.txt 
b/Documentation/device-mapper/cache.txt
index cdfd0fe..c72455e 100644
--- a/Documentation/device-mapper/cache.txt
+++ b/Documentation/device-mapper/cache.txt
@@ -119,7 +119,7 @@ doing here to avoid migrating during those peak io moments.
 
 For the time being, a message "migration_threshold <#sectors>"
 can be used to set the maximum number of sectors being migrated,
-the default being 204800 sectors (or 100MB).
+the default being 2048 sectors (or 1MiB).
 
 Updating on-disk metadata
 -------------------------
-- 
2.7.5



Re: dm cache: Documentation: update default migration_throttling value

2018-01-30 Thread Mike Snitzer
On Tue, Jan 30 2018 at  4:39pm -0500,
John Pittman  wrote:

> In commit f8350daf7af0 ("dm cache: tune migration throttling") the
> value for DEFAULT_MIGRATION_THRESHOLD was decreased from 204800 to
> 2048.  Edit device-mapper/cache.txt to reflect the correct default
> value for migration_threshold.
> 
> Signed-off-by: John Pittman 

Thanks, I've picked this up.. but I used "1MB" instead of "1MiB".
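
For reference, the arithmetic behind the two defaults can be checked with a
short Python sketch. It assumes the 512-byte sector unit used by
device-mapper; the helper name sectors_to_mib is hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector (device-mapper unit)


def sectors_to_mib(sectors):
    """Convert a sector count to mebibytes (2**20 bytes)."""
    return sectors * SECTOR_SIZE / (1024 * 1024)


old_default = sectors_to_mib(204800)  # value documented before the patch
new_default = sectors_to_mib(2048)    # current DEFAULT_MIGRATION_THRESHOLD
print(old_default, new_default)
```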


Re: [patch -mm v2 1/3] mm, memcg: introduce per-memcg oom policy tunable

2018-01-30 Thread David Rientjes
On Tue, 30 Jan 2018, Michal Hocko wrote:

> > > So what is the actual semantic and scope of this policy. Does it apply
> > > only down the hierarchy. Also how do you compare cgroups with different
> > > policies? Let's say you have
> > >   root
> > >  / |  \
> > > A  B   C
> > >/ \/ \
> > >   D   E  F   G
> > > 
> > > Assume A: cgroup, B: oom_group=1, C: tree, G: oom_group=1
> > > 
> > 
> > At each level of the hierarchy, memory.oom_policy compares immediate 
> > children, it's the only way that an admin can lock in a specific oom 
> > policy like "tree" and then delegate the subtree to the user.  If you've 
> > configured it as above, comparing A and C should be the same based on the 
> > cumulative usage of their child mem cgroups.
> 
> So cgroup == tree if we are memcg aware OOM killing, right? Why do we
> need both then? Just to make memcg aware OOM killing possible?
> 

We need "tree" to account the usage of the subtree rather than simply the 
cgroup alone, but "cgroup" and "tree" are accounted with the same units.  
In your example, D and E are treated as individual memory consumers and C 
is treated as the sum of all subtree memory consumers.

If we have /students/michal and /students/david, and both of these are 
"cgroup" policy, as the current patchset in -mm implements, and you use 
1GB, but I create /students/david/{a,b,c,d} each with 512MB of usage, you 
always get oom killed.

If we both have "tree" policy, I always get oom killed because my usage is 
2GB.  /students/michal and /students/david are compared based on their 
total usage instead of each cgroup being an individual memory consumer.

This is impossible with what is in -mm.
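
To make the comparison concrete, here is a toy Python model of the two
policies using the numbers above. It is only a sketch of the selection rule
as described in this mail, not the kernel implementation; the function names
and the flat path-to-usage dictionary are assumptions made for illustration.

```python
GB = 1024 ** 3
MB = 1024 ** 2

# Usage charged directly to each cgroup, as in the /students example.
USAGE = {
    "/students/michal": 1 * GB,
    "/students/david/a": 512 * MB,
    "/students/david/b": 512 * MB,
    "/students/david/c": 512 * MB,
    "/students/david/d": 512 * MB,
}


def select_victim_cgroup(usage):
    """'cgroup' policy: every cgroup is an individual memory consumer."""
    return max(usage, key=usage.get)


def select_victim_tree(usage, parent):
    """'tree' policy: compare immediate children of parent by subtree total."""
    prefix = parent.rstrip("/") + "/"
    totals = {}
    for path, used in usage.items():
        if path.startswith(prefix):
            child = prefix + path[len(prefix):].split("/")[0]
            totals[child] = totals.get(child, 0) + used
    return max(totals, key=totals.get)


print(select_victim_cgroup(USAGE))
print(select_victim_tree(USAGE, "/students"))
```

With "cgroup" the 1GB consumer loses to no single 512MB cgroup, so it is
always picked; with "tree" the 2GB subtree is picked instead.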

> > The policy for B hasn't been specified, but since it does not have any 
> > children "cgroup" and "tree" should be the same.
> 
> So now you have a killable cgroup selected by process criterion? That
> just doesn't make any sense. So I guess it would at least require to
> enforce (cgroup || tree) to allow oom_group.
> 

Hmm, I'm not sure why we would limit memory.oom_group to any policy.  Even 
if we are selecting a process, even without selecting cgroups as victims, 
killing a process may still render an entire cgroup useless and it makes 
sense to kill all processes in that cgroup.  If an unlucky process is 
selected with today's heuristic of oom_badness() or with a "none" policy 
with my patchset, I don't see why we can't enable the user to kill all 
other processes in the cgroup.  It may not make sense for some trees, but 
I think it could be useful for others.

> > Right, a policy of "none" reverts its subtree back to per-process 
> > comparison if you are either not using the cgroup aware oom killer or your 
> > subtree is not using the cgroup aware oom killer.
> 
> So how are you going to compare none cgroups with those that consider
> full memcg or hierarchy (cgroup, tree)? Are you going to consider
> oom_score_adj?
> 

No, I think it would make sense to make the restriction that to set 
"none", the ancestor mem cgroups would also need the same policy, which is 
to select the largest process while still respecting 
/proc/pid/oom_score_adj.

> > In that case, mem_cgroup_select_oom_victim() always 
> > returns false and the value of memory.oom_group is ignored.  I agree that 
> > it's weird in -mm and there's nothing preventing us from separating 
> > memory.oom_group from the cgroup aware oom killer and allowing it to be 
> > set regardless of a selection change.
> 
> it is not weird. I suspect you misunderstood the code and its intention.
> 

We agree that memory.oom_group and a selection logic are two different 
things, and that's why I find it weird that memory.oom_group cannot be set 
without locking the entire hierarchy into a selection logic.  If you have 
a subtree oom, it makes sense for you to be able to kill all processes as 
a property of the workload.  That's independent of how the target mem 
cgroup was selected.  Regardless of the selection logic, we're going 
to target a specific mem cgroup for kill.  Choosing to kill one or all 
processes is still useful.

> > No, perhaps I wasn't clear in the documentation: the policy at each level 
> > of the hierarchy is specified by memory.oom_policy and compares its 
> > immediate children with that policy.  So the per-cgroup usage of A, B, and 
> > C are compared regardless of A, B, and C's own oom policies.
> 
> You are still operating in terms of levels. And that is rather confusing
> because we are operating on a _tree_ and that walk has to be independent
> on the way we walk that tree - i.e. whether we do DFS or BFS ordering.
> 

The selection criterion for the proposed policies, which can be extended, 
is to compare individual cgroups (for "cgroup" policy) to determine the 
victim and within that subtree, to allow the selection to be delegated 
further.  If the goal is the largest cgroup, all mem cgroups down the tree 
will have "cgroup" set.  If you

Re: [PATCH] Documentation/process: kernel maintainer PGP guide

2018-01-30 Thread Jani Nikula
On Tue, 30 Jan 2018, Konstantin Ryabitsev  
wrote:
> This guide is an adapted version of the more general "Protecting Code
> Integrity" guide written and maintained by The Linux Foundation IT for
> use with open-source projects. It provides the oft-lacking guidance on
> the following topics:
>
> - how to properly protect one's PGP keys to minimize the risks of them
>   being stolen and used maliciously to impersonate a kernel developer
> - how to configure Git to properly use GnuPG
> - when and how to use PGP with Git
> - how to verify fellow Linux Kernel developer identities
>
> I believe this document should live with the rest of the documentation
> describing proper processes one should follow when participating in
> kernel development. Placing it in a wiki on some place like kernel.org
> would be insufficient for a number of reasons -- primarily, because only
> a relatively small subset of maintainers have accounts on kernel.org,
> but also because even those who do rarely remember that such wiki
> exists. Keeping it with the rest of in-kernel docs should hopefully give
> it more visibility, but also help keep it up-to-date as tools and
> processes evolve.

FWIW, agreed on having this with the kernel documentation.

I can't say I reviewed it, but glancing through I didn't spot any errors
either. Lots of good stuff.

Just one nit: I think it would be better to move the Maintainer: bit
from the end to near the top, as a reStructuredText field list. See 'git
grep :Author:' under Documentation for examples. You could even add a
MAINTAINERS entry to improve your chances of being Cc'd on changes.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center