Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-14 Thread Quan Xu



On 2017/11/14 15:12, Wanpeng Li wrote:

2017-11-14 15:02 GMT+08:00 Quan Xu :


On 2017/11/13 18:53, Juergen Gross wrote:

On 13/11/17 11:06, Quan Xu wrote:

From: Quan Xu 

So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called
in idle path which will poll for a while before we enter the real idle
state.

In virtualization, idle path includes several heavy operations
includes timer access(LAPIC timer or TSC deadline timer) which will
hurt performance especially for latency intensive workload like message
passing task. The cost is mainly from the vmexit which is a hardware
context switch between virtual machine and hypervisor. Our solution is
to poll for a while and do not enter real idle path if we can get the
schedule event during polling.

Poll may cause the CPU waste so we adopt a smart polling mechanism to
reduce the useless poll.

Signed-off-by: Yang Zhang 
Signed-off-by: Quan Xu 
Cc: Juergen Gross 
Cc: Alok Kataria 
Cc: Rusty Russell 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: virtualizat...@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: xen-de...@lists.xenproject.org

Hmm, is the idle entry path really so critical to performance that a new
pvops function is necessary?

Juergen, Here is the data we get when running benchmark netperf:
  1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
 29031.6 bit/s -- 76.1 %CPU

  2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
 35787.7 bit/s -- 129.4 %CPU

  3. w/ kvm dynamic poll:
 35735.6 bit/s -- 200.0 %CPU

Actually we can reduce the CPU utilization by sleeping for a period of
time, as is already done in the poll logic of the IO subsystem;
then we can improve the algorithm in kvm instead of introducing another
duplicate one in the kvm guest.

We really appreciate upstream's kvm dynamic poll mechanism, which is
really helpful in a lot of scenarios.

However, as the description says, in virtualization the idle path includes
several heavy operations, including timer access (LAPIC timer or TSC
deadline timer), which hurt performance, especially for latency-
intensive workloads like message passing tasks. The cost comes mainly from
the vmexit, which is a hardware context switch between the virtual machine
and the hypervisor.

With upstream's kvm dynamic poll mechanism, even if you could provide a
better algorithm, how could you bypass the timer access (LAPIC timer or TSC
deadline timer), or the hardware context switch between virtual machine
and hypervisor? I know there is a tradeoff.

Furthermore, here is the data we get when running the contextswitch benchmark
to measure latency (lower is better):

1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
  3402.9 ns/ctxsw -- 199.8 %CPU

2. w/ patch and disable kvm dynamic poll:
  1163.5 ns/ctxsw -- 205.5 %CPU

3. w/ kvm dynamic poll:
  2280.6 ns/ctxsw -- 199.5 %CPU

So these two solutions are quite similar, but not duplicates.

That is also why we add a generic idle poll before entering the real idle path.
When a reschedule event is pending, we can bypass the real idle path.
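
A minimal userspace sketch of this poll-then-idle idea, not the patch's
actual code. The fake_cpu structure and both function names are
hypothetical stand-ins, and a fixed poll count is assumed in place of the
adaptive ("smart") polling window the series describes:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the guest's "reschedule pending" state. */
struct fake_cpu {
    int checks;     /* how many times we have polled so far */
    int event_at;   /* poll count at which work shows up; <= 0 means never */
};

/* One poll: count it and report whether work has shown up yet. */
static bool resched_pending(struct fake_cpu *cpu)
{
    cpu->checks++;
    return cpu->event_at > 0 && cpu->checks >= cpu->event_at;
}

/*
 * Poll at most max_polls times before giving up.
 * Returns true if a reschedule event was seen, so the caller can skip
 * the real idle path (and the vmexit that comes with it).
 */
static bool poll_before_idle(struct fake_cpu *cpu, int max_polls)
{
    assert(max_polls >= 0);
    for (int i = 0; i < max_polls; i++) {
        if (resched_pending(cpu))
            return true;    /* work arrived: bypass the real idle path */
    }
    return false;           /* nothing arrived: enter the real idle path */
}
```

If work arrives on the third poll of a ten-poll budget, the function
returns true after three checks; if nothing ever arrives, all ten polls
are spent and the caller falls through to the real idle path.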


Quan
Alibaba Cloud





Regards,
Wanpeng Li


  4. w/patch and w/ kvm dynamic poll:
 42225.3 bit/s -- 198.7 %CPU

  5. idle=poll
 37081.7 bit/s -- 998.1 %CPU



  With this patch, we improve performance by 23%. We could even improve
  performance by 45.4% with both the patch and kvm dynamic poll. Also, the
  CPU cost is much lower than in the 'idle=poll' case.


Wouldn't a function pointer, maybe guarded
by a static key, be enough? A further advantage would be that this would
work on other architectures, too.


I assume this feature will be ported to other archs. A new pvops makes the code
clean and easy to maintain. Also, I tried to add it to the existing pvops, but it
doesn't fit.



Quan
Alibaba Cloud


Juergen



--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-14 Thread Wanpeng Li
2017-11-14 16:15 GMT+08:00 Quan Xu :
>
>
> On 2017/11/14 15:12, Wanpeng Li wrote:
>>
>> 2017-11-14 15:02 GMT+08:00 Quan Xu :
>>>
>>>
>>> On 2017/11/13 18:53, Juergen Gross wrote:

>>>> On 13/11/17 11:06, Quan Xu wrote:
>>>>>
>>>>> From: Quan Xu
>>>>>
>>>>> So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called
>>>>> in idle path which will poll for a while before we enter the real idle
>>>>> state.
>>>>>
>>>>> In virtualization, idle path includes several heavy operations
>>>>> includes timer access(LAPIC timer or TSC deadline timer) which will
>>>>> hurt performance especially for latency intensive workload like message
>>>>> passing task. The cost is mainly from the vmexit which is a hardware
>>>>> context switch between virtual machine and hypervisor. Our solution is
>>>>> to poll for a while and do not enter real idle path if we can get the
>>>>> schedule event during polling.
>>>>>
>>>>> Poll may cause the CPU waste so we adopt a smart polling mechanism to
>>>>> reduce the useless poll.
>>>>>
>>>>> Signed-off-by: Yang Zhang
>>>>> Signed-off-by: Quan Xu
>>>>> Cc: Juergen Gross
>>>>> Cc: Alok Kataria
>>>>> Cc: Rusty Russell
>>>>> Cc: Thomas Gleixner
>>>>> Cc: Ingo Molnar
>>>>> Cc: "H. Peter Anvin"
>>>>> Cc: x...@kernel.org
>>>>> Cc: virtualizat...@lists.linux-foundation.org
>>>>> Cc: linux-ker...@vger.kernel.org
>>>>> Cc: xen-de...@lists.xenproject.org
>>>>
>>>> Hmm, is the idle entry path really so critical to performance that a new
>>>> pvops function is necessary?
>>>
>>> Juergen, Here is the data we get when running benchmark netperf:
>>>   1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
>>>  29031.6 bit/s -- 76.1 %CPU
>>>
>>>   2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
>>>  35787.7 bit/s -- 129.4 %CPU
>>>
>>>   3. w/ kvm dynamic poll:
>>>  35735.6 bit/s -- 200.0 %CPU
>>
>> Actually we can reduce the CPU utilization by sleeping a period of
>> time as what has already been done in the poll logic of IO subsystem,
>> then we can improve the algorithm in kvm instead of introduing another
>> duplicate one in the kvm guest.
>
> We really appreciate upstream's kvm dynamic poll mechanism, which is
> really helpful for a lot of scenario..
>
> However, as description said, in virtualization, idle path includes
> several heavy operations includes timer access (LAPIC timer or TSC
> deadline timer) which will hurt performance especially for latency
> intensive workload like message passing task. The cost is mainly from
> the vmexit which is a hardware context switch between virtual machine
> and hypervisor.
>
> for upstream's kvm dynamic poll mechanism, even you could provide a
> better algorism, how could you bypass timer access (LAPIC timer or TSC
> deadline timer), or a hardware context switch between virtual machine
> and hypervisor. I know these is a tradeoff.
>
> Furthermore, here is the data we get when running benchmark contextswitch
> to measure the latency(lower is better):
>
> 1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
>   3402.9 ns/ctxsw -- 199.8 %CPU
>
> 2. w/ patch and disable kvm dynamic poll:
>   1163.5 ns/ctxsw -- 205.5 %CPU
>
> 3. w/ kvm dynamic poll:
>   2280.6 ns/ctxsw -- 199.5 %CPU
>
> so, these tow solution are quite similar, but not duplicate..
>
> that's also why to add a generic idle poll before enter real idle path.
> When a reschedule event is pending, we can bypass the real idle path.
>

There is similar logic in the idle governor/driver, so how does this
patchset influence the decision in the idle governor/driver when
running on bare metal (power management is not exposed to the guest, so
we will not enter the idle driver in the guest)?

Regards,
Wanpeng Li


Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-14 Thread Quan Xu



On 2017/11/14 15:30, Juergen Gross wrote:

On 14/11/17 08:02, Quan Xu wrote:


On 2017/11/13 18:53, Juergen Gross wrote:

On 13/11/17 11:06, Quan Xu wrote:

From: Quan Xu 

So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called
in idle path which will poll for a while before we enter the real idle
state.

In virtualization, idle path includes several heavy operations
includes timer access(LAPIC timer or TSC deadline timer) which will
hurt performance especially for latency intensive workload like message
passing task. The cost is mainly from the vmexit which is a hardware
context switch between virtual machine and hypervisor. Our solution is
to poll for a while and do not enter real idle path if we can get the
schedule event during polling.

Poll may cause the CPU waste so we adopt a smart polling mechanism to
reduce the useless poll.

Signed-off-by: Yang Zhang 
Signed-off-by: Quan Xu 
Cc: Juergen Gross 
Cc: Alok Kataria 
Cc: Rusty Russell 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: virtualizat...@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: xen-de...@lists.xenproject.org

Hmm, is the idle entry path really so critical to performance that a new
pvops function is necessary?

Juergen, Here is the data we get when running benchmark netperf:
  1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
     29031.6 bit/s -- 76.1 %CPU

  2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
     35787.7 bit/s -- 129.4 %CPU

  3. w/ kvm dynamic poll:
     35735.6 bit/s -- 200.0 %CPU

  4. w/patch and w/ kvm dynamic poll:
     42225.3 bit/s -- 198.7 %CPU

  5. idle=poll
     37081.7 bit/s -- 998.1 %CPU



  With this patch, we improve performance by 23%. We could even improve
  performance by 45.4% with both the patch and kvm dynamic poll. Also, the
  CPU cost is much lower than in the 'idle=poll' case.
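
The percentages follow directly from the netperf numbers listed above,
taking case 1 as the baseline. A small sketch of the arithmetic
(gain_percent() is a hypothetical helper, not something from the patch
set):

```c
#include <assert.h>

/* Relative gain of `value` over `baseline`, expressed in percent. */
static double gain_percent(double baseline, double value)
{
    assert(baseline > 0.0);
    return (value - baseline) / baseline * 100.0;
}

/*
 * Inputs taken from the netperf results in this thread:
 *   gain_percent(29031.6, 35787.7)   case 2 vs. case 1 (patch only)
 *   gain_percent(29031.6, 42225.3)   case 4 vs. case 1 (patch + kvm dynamic poll)
 */
```

The first call gives roughly 23.3 and the second roughly 45.4, which
matches the figures quoted in the mail.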

I don't question the general idea. I just think pvops isn't the best way
to implement it.


Wouldn't a function pointer, maybe guarded
by a static key, be enough? A further advantage would be that this would
work on other architectures, too.

I assume this feature will be ported to other archs.. a new pvops makes


  sorry, a typo: s/other archs/other hypervisors/
  (it refers to hypervisors like Xen, HyperV and VMware).


code
clean and easy to maintain. also I tried to add it into existed pvops,
but it
doesn't match.

You are aware that pvops is x86 only?


yes, I'm aware..


I really don't see the big difference in maintainability compared to the
static key / function pointer variant:

void (*guest_idle_poll_func)(void);
struct static_key guest_idle_poll_key __read_mostly;

static inline void guest_idle_poll(void)
{
    if (static_key_false(&guest_idle_poll_key))
        guest_idle_poll_func();
}




Thank you for your sample code :)
I agree there is no big difference. I think we are discussing two
different things:

 1) x86 VM on different hypervisors
 2) VMs of different archs on the kvm hypervisor

What I want to do is x86 VM on different hypervisors, such as kvm / xen
/ hyperv.



And KVM would just need to set guest_idle_poll_func and enable the
static key. Works on non-x86 architectures, too.
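
For illustration only, a minimal userspace sketch of the function pointer
variant quoted above. It assumes a plain bool in place of the kernel's
static key, and guest_idle_poll_register() and demo_kvm_idle_poll() are
hypothetical names that do not appear in the patch set:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hook a hypervisor backend may install; unset by default. */
static void (*guest_idle_poll_func)(void);

/* Assumption: a plain bool stands in for the kernel's static key. */
static bool guest_idle_poll_enabled;

/* Called from the (architecture-independent) idle path. */
static inline void guest_idle_poll(void)
{
    if (guest_idle_poll_enabled)
        guest_idle_poll_func();
}

/* Hypothetical registration helper a backend such as KVM could call. */
static void guest_idle_poll_register(void (*fn)(void))
{
    assert(fn != NULL);
    guest_idle_poll_func = fn;          /* install the hook first ... */
    guest_idle_poll_enabled = true;     /* ... then enable the fast-path check */
}

/* Hypothetical backend hook; it only counts how often it was called. */
static int demo_poll_calls;

static void demo_kvm_idle_poll(void)
{
    demo_poll_calls++;
}
```

Before registration, guest_idle_poll() is a no-op; after
guest_idle_poll_register(demo_kvm_idle_poll), each call reaches the
backend hook. Nothing in the sketch is x86-specific, which is the point
being argued for this variant.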



.. referring to 'pv_mmu_ops', HyperV and Xen can implement their own
functions for 'pv_mmu_ops'.

I think the same applies to pv_idle_ops.

With the above explanation, do you still think I need to define the static
key/function pointer variant?

Btw, any interest in porting it to a Xen HVM guest? :)

Quan
Alibaba Cloud


Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-14 Thread Quan Xu



On 2017/11/14 16:22, Wanpeng Li wrote:

2017-11-14 16:15 GMT+08:00 Quan Xu :


On 2017/11/14 15:12, Wanpeng Li wrote:

2017-11-14 15:02 GMT+08:00 Quan Xu :


On 2017/11/13 18:53, Juergen Gross wrote:

On 13/11/17 11:06, Quan Xu wrote:

From: Quan Xu 

So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called
in idle path which will poll for a while before we enter the real idle
state.

In virtualization, idle path includes several heavy operations
includes timer access(LAPIC timer or TSC deadline timer) which will
hurt performance especially for latency intensive workload like message
passing task. The cost is mainly from the vmexit which is a hardware
context switch between virtual machine and hypervisor. Our solution is
to poll for a while and do not enter real idle path if we can get the
schedule event during polling.

Poll may cause the CPU waste so we adopt a smart polling mechanism to
reduce the useless poll.

Signed-off-by: Yang Zhang 
Signed-off-by: Quan Xu 
Cc: Juergen Gross 
Cc: Alok Kataria 
Cc: Rusty Russell 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: virtualizat...@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: xen-de...@lists.xenproject.org

Hmm, is the idle entry path really so critical to performance that a new
pvops function is necessary?

Juergen, Here is the data we get when running benchmark netperf:
   1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
  29031.6 bit/s -- 76.1 %CPU

   2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
  35787.7 bit/s -- 129.4 %CPU

   3. w/ kvm dynamic poll:
  35735.6 bit/s -- 200.0 %CPU

Actually we can reduce the CPU utilization by sleeping a period of
time as what has already been done in the poll logic of IO subsystem,
then we can improve the algorithm in kvm instead of introduing another
duplicate one in the kvm guest.

We really appreciate upstream's kvm dynamic poll mechanism, which is
really helpful for a lot of scenario..

However, as description said, in virtualization, idle path includes
several heavy operations includes timer access (LAPIC timer or TSC
deadline timer) which will hurt performance especially for latency
intensive workload like message passing task. The cost is mainly from
the vmexit which is a hardware context switch between virtual machine
and hypervisor.

for upstream's kvm dynamic poll mechanism, even you could provide a
better algorism, how could you bypass timer access (LAPIC timer or TSC
deadline timer), or a hardware context switch between virtual machine
and hypervisor. I know these is a tradeoff.

Furthermore, here is the data we get when running benchmark contextswitch
to measure the latency(lower is better):

1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
   3402.9 ns/ctxsw -- 199.8 %CPU

2. w/ patch and disable kvm dynamic poll:
   1163.5 ns/ctxsw -- 205.5 %CPU

3. w/ kvm dynamic poll:
   2280.6 ns/ctxsw -- 199.5 %CPU

so, these tow solution are quite similar, but not duplicate..

that's also why to add a generic idle poll before enter real idle path.
When a reschedule event is pending, we can bypass the real idle path.


There is a similar logic in the idle governor/driver, so how this
patchset influence the decision in the idle governor/driver when
running on bare-metal(power managment is not exposed to the guest so
we will not enter into idle driver in the guest)?



This is expected to take effect only when running as a virtual machine with
the proper CONFIG_* options enabled. It cannot work on bare metal, even with
the proper CONFIG_* options enabled.

Quan
Alibaba Cloud


Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-14 Thread Juergen Gross
On 14/11/17 10:38, Quan Xu wrote:
> 
> 
> On 2017/11/14 15:30, Juergen Gross wrote:
>> On 14/11/17 08:02, Quan Xu wrote:
>>>
>>> On 2017/11/13 18:53, Juergen Gross wrote:
>>>> On 13/11/17 11:06, Quan Xu wrote:
>>>>> From: Quan Xu
>>>>>
>>>>> So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called
>>>>> in idle path which will poll for a while before we enter the real idle
>>>>> state.
>>>>>
>>>>> In virtualization, idle path includes several heavy operations
>>>>> includes timer access(LAPIC timer or TSC deadline timer) which will
>>>>> hurt performance especially for latency intensive workload like
>>>>> message
>>>>> passing task. The cost is mainly from the vmexit which is a hardware
>>>>> context switch between virtual machine and hypervisor. Our solution is
>>>>> to poll for a while and do not enter real idle path if we can get the
>>>>> schedule event during polling.
>>>>>
>>>>> Poll may cause the CPU waste so we adopt a smart polling mechanism to
>>>>> reduce the useless poll.
>>>>>
>>>>> Signed-off-by: Yang Zhang
>>>>> Signed-off-by: Quan Xu
>>>>> Cc: Juergen Gross
>>>>> Cc: Alok Kataria
>>>>> Cc: Rusty Russell
>>>>> Cc: Thomas Gleixner
>>>>> Cc: Ingo Molnar
>>>>> Cc: "H. Peter Anvin"
>>>>> Cc: x...@kernel.org
>>>>> Cc: virtualizat...@lists.linux-foundation.org
>>>>> Cc: linux-ker...@vger.kernel.org
>>>>> Cc: xen-de...@lists.xenproject.org
>>>> Hmm, is the idle entry path really so critical to performance that a
>>>> new
>>>> pvops function is necessary?
>>> Juergen, Here is the data we get when running benchmark netperf:
>>>   1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
>>>  29031.6 bit/s -- 76.1 %CPU
>>>
>>>   2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
>>>  35787.7 bit/s -- 129.4 %CPU
>>>
>>>   3. w/ kvm dynamic poll:
>>>  35735.6 bit/s -- 200.0 %CPU
>>>
>>>   4. w/patch and w/ kvm dynamic poll:
>>>  42225.3 bit/s -- 198.7 %CPU
>>>
>>>   5. idle=poll
>>>  37081.7 bit/s -- 998.1 %CPU
>>>
>>>
>>>
>>>   w/ this patch, we will improve performance by 23%.. even we could
>>> improve
>>>   performance by 45.4%, if we use w/patch and w/ kvm dynamic poll.
>>> also the
>>>   cost of CPU is much lower than 'idle=poll' case..
>> I don't question the general idea. I just think pvops isn't the best way
>> to implement it.
>>
>>>> Wouldn't a function pointer, maybe guarded
>>>> by a static key, be enough? A further advantage would be that this
>>>> would
>>>> work on other architectures, too.
>>> I assume this feature will be ported to other archs.. a new pvops makes
> 
>   sorry, a typo.. /other archs/other hypervisors/
>   it refers hypervisor like Xen, HyperV and VMware)..
> 
>>> code
>>> clean and easy to maintain. also I tried to add it into existed pvops,
>>> but it
>>> doesn't match.
>> You are aware that pvops is x86 only?
> 
> yes, I'm aware..
> 
>> I really don't see the big difference in maintainability compared to the
>> static key / function pointer variant:
>>
>> void (*guest_idle_poll_func)(void);
>> struct static_key guest_idle_poll_key __read_mostly;
>>
>> static inline void guest_idle_poll(void)
>> {
>>     if (static_key_false(&guest_idle_poll_key))
>>         guest_idle_poll_func();
>> }
> 
> 
> 
> thank you for your sample code :)
> I agree there is no big difference.. I think we are discussion for two
> things:
>  1) x86 VM on different hypervisors
>  2) different archs VM on kvm hypervisor
> 
> What I want to do is x86 VM on different hypervisors, such as kvm / xen
> / hyperv ..

Why limit the solution to x86 if the more general solution isn't
harder?

As you didn't give any reason why the pvops approach is better other
than you don't care for non-x86 platforms you won't get an "Ack" from
me for this patch.

> 
>> And KVM would just need to set guest_idle_poll_func and enable the
>> static key. Works on non-x86 architectures, too.
>>
> 
> .. referred to 'pv_mmu_ops', HyperV and Xen can implement their own
> functions for 'pv_mmu_ops'.
> I think it is the same to pv_idle_ops.
> 
> with above explaination, do you still think I need to define the static
> key/function pointer variant?
> 
> btw, any interest to port it to Xen HVM guest? :)

Maybe. But this should work for Xen on ARM, too.


Juergen


Re: git pull

2017-11-14 Thread Greg Kroah-Hartman
Adding lkml and linux-doc mailing lists...

On Tue, Nov 14, 2017 at 10:11:55AM +1100, Tobin C. Harding wrote:
> Hi Greg,
> 
> This is totally asking a favour, feel free to ignore. How do you format
> your [GIT PULL] emails to Linus? Do you create a tag and then run a git
> command to get the email?
> 
> I tried to do it manually and failed pretty hard (as you no doubt will
> notice on LKML).

Well, I think you got it right the third time, so nice job :)

Anyway, this actually came up at the kernel summit / maintainer meeting
a few weeks ago, in that "how do I make a good pull request to Linus" is
something we need to document.

Here's what I do, and it seems to work well, so maybe we should turn it
into the start of the documentation for how to do it.

---

To start with, put your changes on a branch, hopefully one named in a
semi-useful way (I use 'char-misc-next' for my char/misc driver patches
to be merged into linux-next).  That is the branch you wish to tag and
have Linus pull from.

Name the tag with something useful that you can understand if you run
across it in a few weeks, and something that will be "unique".
Continuing the example of my char-misc tree, for the patches to be sent
to Linus for the 4.15-rc1 merge window, I would name the tag
'char-misc-4.15-rc1':
        git tag -u KEY_ID -s char-misc-4.15-rc1 char-misc-next

that will create a signed tag called 'char-misc-4.15-rc1' based on the
last commit in the char-misc-next branch, and sign it with my gpg key
KEY_ID (replace KEY_ID with your own gpg key id.)

When you run the above command, git will drop you into an editor and ask
you to describe the tag.  In this case, you are describing a pull
request, so outline what is contained here, why it should be merged, and
what, if any, testing has happened to it.  All of this information will
end up in the tag itself, and then in the merge commit that Linus makes,
so write it up well, as it will be in the kernel tree forever.

An example pull request of mine might look like:
        Char/Misc patches for 4.15-rc1

        Here is the big char/misc patch set for the 4.15-rc1 merge
        window.  Contained in here is the normal set of new functions
        added to all of these crazy drivers, as well as the following
        brand new subsystems:
                - time_travel_controller: Finally a set of drivers for
                  the latest time travel bus architecture that provides
                  i/o to the CPU before it asked for it, allowing
                  uninterrupted processing
                - relativity_shifters: due to the effect that the
                  time_travel_controllers have on the overall system,
                  there was a need for a new set of relativity shifter
                  drivers to accommodate the newly formed black holes
                  that would threaten to suck CPUs into them.  This
                  subsystem handles this in a way to successfully
                  neutralize the problems.  There is a Kconfig option to
                  force these to be enabled when needed, so problems
                  should not occur.

        All of these patches have been successfully tested in the latest
        linux-next releases, and the original problems that it found
        have all been resolved (apologies to anyone living near Canberra
        for the lack of the Kconfig options in the earlier versions of
        the linux-next tree creations.)

        Signed-off-by: Your-name-here


The tag message format is just like a git commit message: one line at the
top for a "summary subject", and be sure to sign off at the bottom.

Now that you have a local signed tag, you need to push it up to where it
can be retrieved by others:
        git push origin char-misc-4.15-rc1
pushes the char-misc-4.15-rc1 tag to where the 'origin' repo is located.

The last thing to do is create the pull request message.  Git handily
will do this for you with the 'git request-pull' command, but it needs a
bit of help determining what you want to pull, and what to base the pull
against (to show the correct changes to be pulled and the diffstat.)

I use the following command to generate a pull request:
        git request-pull master git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git/ char-misc-4.15-rc1

This is asking git to compare the difference from the
'char-misc-4.15-rc1' tag location, to the head of the 'master' branch
(which in my case points to the last location in Linus's tree that I
diverged from, usually a -rc release) and to use the git:// protocol to
pull from.  If you wish to use https://, that can be used here instead
as well (but note that some people behind firewalls will have problems
with https git pulls).

If the char-misc-4.15-rc1 tag is not present in the repo that I am
asking to be pulled from, git will complain saying it is not there, a
handy way to remember to actually push it to a public location.

The output of 'git request-pu

[PATCH v2] sched/deadline: fix runtime accounting in documentation

2017-11-14 Thread Claudio Scordino
Signed-off-by: Claudio Scordino 
Signed-off-by: Luca Abeni 
Acked-by: Daniel Bristot de Oliveira 
CC: Jonathan Corbet 
CC: "Peter Zijlstra (Intel)" 
CC: Ingo Molnar 
CC: linux-doc@vger.kernel.org
Cc: Tommaso Cucinotta 
Cc: Mathieu Poirier 
---
 Documentation/scheduler/sched-deadline.txt | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/Documentation/scheduler/sched-deadline.txt b/Documentation/scheduler/sched-deadline.txt
index e89e36e..8ce78f8 100644
--- a/Documentation/scheduler/sched-deadline.txt
+++ b/Documentation/scheduler/sched-deadline.txt
@@ -204,10 +204,17 @@ CONTENTS
  It does so by decrementing the runtime of the executing task Ti at a pace equal
  to
 
-   dq = -max{ Ui, (1 - Uinact) } dt
+   dq = -max{ Ui / Umax, (1 - Uinact - Uextra) } dt
 
- where Uinact is the inactive utilization, computed as (this_bq - running_bw),
- and Ui is the bandwidth of task Ti.
+ where:
+
+  - Ui is the bandwidth of task Ti;
+  - Umax is the maximum reclaimable utilization (subjected to RT throttling
+limits);
+  - Uinact is the (per runqueue) inactive utilization, computed as
+(this_bq - running_bw);
+  - Uextra is the (per runqueue) extra reclaimable utilization
+(subjected to RT throttling limits).
 
 
  Let's now see a trivial example of two deadline tasks with runtime equal
-- 
2.7.4
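
As a worked illustration of the updated accounting rule in the patch
above, here is a small sketch. The function name and the sample values
are hypothetical, chosen so the arithmetic is exact:

```c
#include <assert.h>

/*
 * Runtime change over a time slice dt, following the documented rule
 *   dq = -max{ Ui / Umax, (1 - Uinact - Uextra) } dt
 * All utilizations are fractions of one CPU.
 */
static double runtime_decrement(double ui, double umax,
                                double uinact, double uextra, double dt)
{
    assert(umax > 0.0);
    double scaled_bw = ui / umax;               /* first term of the max */
    double reclaim = 1.0 - uinact - uextra;     /* second term of the max */
    double pace = scaled_bw > reclaim ? scaled_bw : reclaim;
    return -pace * dt;
}
```

With Ui = 0.25, Umax = 1.0, Uinact = 0.5, Uextra = 0 and dt = 4, the
second term (0.5) dominates and dq = -2. With Ui = 0.5, Uinact = 0.75
and dt = 2, the first term (0.5) dominates and dq = -1.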



Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-14 Thread Quan Xu



On 2017/11/14 18:27, Juergen Gross wrote:

On 14/11/17 10:38, Quan Xu wrote:


On 2017/11/14 15:30, Juergen Gross wrote:

On 14/11/17 08:02, Quan Xu wrote:

On 2017/11/13 18:53, Juergen Gross wrote:

On 13/11/17 11:06, Quan Xu wrote:

From: Quan Xu 

So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is called
in idle path which will poll for a while before we enter the real idle
state.

In virtualization, idle path includes several heavy operations
includes timer access(LAPIC timer or TSC deadline timer) which will
hurt performance especially for latency intensive workload like
message
passing task. The cost is mainly from the vmexit which is a hardware
context switch between virtual machine and hypervisor. Our solution is
to poll for a while and do not enter real idle path if we can get the
schedule event during polling.

Poll may cause the CPU waste so we adopt a smart polling mechanism to
reduce the useless poll.

Signed-off-by: Yang Zhang 
Signed-off-by: Quan Xu 
Cc: Juergen Gross 
Cc: Alok Kataria 
Cc: Rusty Russell 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Cc: virtualizat...@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: xen-de...@lists.xenproject.org

Hmm, is the idle entry path really so critical to performance that a
new
pvops function is necessary?

Juergen, Here is the data we get when running benchmark netperf:
   1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
  29031.6 bit/s -- 76.1 %CPU

   2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
  35787.7 bit/s -- 129.4 %CPU

   3. w/ kvm dynamic poll:
  35735.6 bit/s -- 200.0 %CPU

   4. w/patch and w/ kvm dynamic poll:
  42225.3 bit/s -- 198.7 %CPU

   5. idle=poll
  37081.7 bit/s -- 998.1 %CPU



   w/ this patch, we will improve performance by 23%.. even we could
improve
   performance by 45.4%, if we use w/patch and w/ kvm dynamic poll.
also the
   cost of CPU is much lower than 'idle=poll' case..

I don't question the general idea. I just think pvops isn't the best way
to implement it.


Wouldn't a function pointer, maybe guarded
by a static key, be enough? A further advantage would be that this
would
work on other architectures, too.

I assume this feature will be ported to other archs.. a new pvops makes

   sorry, a typo.. /other archs/other hypervisors/
   it refers hypervisor like Xen, HyperV and VMware)..


code
clean and easy to maintain. also I tried to add it into existed pvops,
but it
doesn't match.

You are aware that pvops is x86 only?

yes, I'm aware..


I really don't see the big difference in maintainability compared to the
static key / function pointer variant:

void (*guest_idle_poll_func)(void);
struct static_key guest_idle_poll_key __read_mostly;

static inline void guest_idle_poll(void)
{
    if (static_key_false(&guest_idle_poll_key))
        guest_idle_poll_func();
}



thank you for your sample code :)
I agree there is no big difference.. I think we are discussion for two
things:
  1) x86 VM on different hypervisors
  2) different archs VM on kvm hypervisor

What I want to do is x86 VM on different hypervisors, such as kvm / xen
/ hyperv ..

Why limit the solution to x86 if the more general solution isn't
harder?

As you didn't give any reason why the pvops approach is better other
than you don't care for non-x86 platforms you won't get an "Ack" from
me for this patch.



It just looks a little odd to me. I understand you care about non-x86 archs.

Are you aware of 'pv_time_ops' for the arm64/arm/x86 archs, defined in
   - arch/arm64/include/asm/paravirt.h
   - arch/x86/include/asm/paravirt_types.h
   - arch/arm/include/asm/paravirt.h

I am unfamiliar with the arm code. IIUC, if you implemented pv_idle_ops
for the arm/arm64 archs, you would define the same structure in
   - arch/arm64/include/asm/paravirt.h or
   - arch/arm/include/asm/paravirt.h

.. instead of a static key / function.

Then you would implement a real function in
   - arch/arm/kernel/paravirt.c.

Also, I wonder HOW/WHERE to define a static key/function so that it benefits
both x86 and non-x86 archs?

Quan
Alibaba Cloud


And KVM would just need to set guest_idle_poll_func and enable the
static key. Works on non-x86 architectures, too.


.. referred to 'pv_mmu_ops', HyperV and Xen can implement their own
functions for 'pv_mmu_ops'.
I think it is the same to pv_idle_ops.

with above explaination, do you still think I need to define the static
key/function pointer variant?

btw, any interest to port it to Xen HVM guest? :)

Maybe. But this should work for Xen on ARM, too.


Juergen





Re: [PATCH RFC v3 1/6] x86/paravirt: Add pv_idle_ops to paravirt ops

2017-11-14 Thread Juergen Gross
On 14/11/17 12:43, Quan Xu wrote:
> 
> 
> On 2017/11/14 18:27, Juergen Gross wrote:
>> On 14/11/17 10:38, Quan Xu wrote:
>>>
>>> On 2017/11/14 15:30, Juergen Gross wrote:
>>>> On 14/11/17 08:02, Quan Xu wrote:
>>>>> On 2017/11/13 18:53, Juergen Gross wrote:
>>>>>> On 13/11/17 11:06, Quan Xu wrote:
>>>>>>> From: Quan Xu 
>>>>>>>
>>>>>>> So far, pv_idle_ops.poll is the only ops for pv_idle. .poll is
>>>>>>> called
>>>>>>> in idle path which will poll for a while before we enter the real
>>>>>>> idle
>>>>>>> state.
>>>>>>>
>>>>>>> In virtualization, idle path includes several heavy operations
>>>>>>> includes timer access(LAPIC timer or TSC deadline timer) which will
>>>>>>> hurt performance especially for latency intensive workload like
>>>>>>> message
>>>>>>> passing task. The cost is mainly from the vmexit which is a hardware
>>>>>>> context switch between virtual machine and hypervisor. Our
>>>>>>> solution is
>>>>>>> to poll for a while and do not enter real idle path if we can get
>>>>>>> the
>>>>>>> schedule event during polling.
>>>>>>>
>>>>>>> Poll may cause the CPU waste so we adopt a smart polling
>>>>>>> mechanism to
>>>>>>> reduce the useless poll.
>>>>>>>
>>>>>>> Signed-off-by: Yang Zhang 
>>>>>>> Signed-off-by: Quan Xu 
>>>>>>> Cc: Juergen Gross 
>>>>>>> Cc: Alok Kataria 
>>>>>>> Cc: Rusty Russell 
>>>>>>> Cc: Thomas Gleixner 
>>>>>>> Cc: Ingo Molnar 
>>>>>>> Cc: "H. Peter Anvin" 
>>>>>>> Cc: x...@kernel.org
>>>>>>> Cc: virtualizat...@lists.linux-foundation.org
>>>>>>> Cc: linux-ker...@vger.kernel.org
>>>>>>> Cc: xen-de...@lists.xenproject.org
>>>>>> Hmm, is the idle entry path really so critical to performance that a
>>>>>> new
>>>>>> pvops function is necessary?
>>>>> Juergen, Here is the data we get when running benchmark netperf:
>>>>>    1. w/o patch and disable kvm dynamic poll (halt_poll_ns=0):
>>>>>   29031.6 bit/s -- 76.1 %CPU
>>>>>
>>>>>    2. w/ patch and disable kvm dynamic poll (halt_poll_ns=0):
>>>>>   35787.7 bit/s -- 129.4 %CPU
>>>>>
>>>>>    3. w/ kvm dynamic poll:
>>>>>   35735.6 bit/s -- 200.0 %CPU
>>>>>
>>>>>    4. w/patch and w/ kvm dynamic poll:
>>>>>   42225.3 bit/s -- 198.7 %CPU
>>>>>
>>>>>    5. idle=poll
>>>>>   37081.7 bit/s -- 998.1 %CPU
>>>>>
>>>>>
>>>>>
>>>>>    w/ this patch, we will improve performance by 23%.. even we could
>>>>> improve
>>>>>    performance by 45.4%, if we use w/patch and w/ kvm dynamic poll.
>>>>> also the
>>>>>    cost of CPU is much lower than 'idle=poll' case..
>>>> I don't question the general idea. I just think pvops isn't the best
>>>> way
>>>> to implement it.
>>>>
>>>>>> Wouldn't a function pointer, maybe guarded
>>>>>> by a static key, be enough? A further advantage would be that this
>>>>>> would
>>>>>> work on other architectures, too.
>>>>> I assume this feature will be ported to other archs.. a new pvops
>>>>> makes
>>>    sorry, a typo.. /other archs/other hypervisors/
>>>    it refers hypervisor like Xen, HyperV and VMware)..
>>>
>>>>> code
>>>>> clean and easy to maintain. also I tried to add it into existed pvops,
>>>>> but it
>>>>> doesn't match.
>>>> You are aware that pvops is x86 only?
>>> yes, I'm aware..
>>>
>>>> I really don't see the big difference in maintainability compared to
>>>> the
>>>> static key / function pointer variant:
>>>>
>>>> void (*guest_idle_poll_func)(void);
>>>> struct static_key guest_idle_poll_key __read_mostly;
>>>>
>>>> static inline void guest_idle_poll(void)
>>>> {
>>>> 	if (static_key_false(&guest_idle_poll_key))
>>>> 		guest_idle_poll_func();
>>>> }
>>>
>>>
>>> thank you for your sample code :)
>>> I agree there is no big difference.. I think we are discussing two
>>> things:
>>>   1) x86 VM on different hypervisors
>>>   2) different archs VM on kvm hypervisor
>>>
>>> What I want to do is x86 VM on different hypervisors, such as kvm / xen
>>> / hyperv ..
>> Why limit the solution to x86 if the more general solution isn't
>> harder?
>>
>> As you didn't give any reason why the pvops approach is better other
>> than you don't care for non-x86 platforms you won't get an "Ack" from
>> me for this patch.
> 
> 
> It just looks a little odd to me. I understand you care about non-x86
> arch.
> 
> Are you aware of 'pv_time_ops' for arm64/arm/x86 archs, defined in
>    - arch/arm64/include/asm/paravirt.h
>    - arch/x86/include/asm/paravirt_types.h
>    - arch/arm/include/asm/paravirt.h

Yes, I know. This is just a hack to make it compile. Other than the
same names this has nothing to do with pvops, but is just a function
vector.

> I am unfamiliar with arm code. IIUC, if you'd implement pv_idle_ops
> for arm/arm64 arch, you'd define the same structure in
>    - arch/arm64/include/asm/paravirt.h or
>    - arch/arm/include/asm/paravirt.h
> 
> .. instead of static key / function.
> 
> then implement a real function in
>    - arch/arm/kernel/paravirt.c.

So just to use pvops you want to implement it in each arch instead
of using a mechanism available everywhere?
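
For reference, a minimal user-space sketch of the static key / function
pointer variant quoted above. It is only an illustration: a plain bool
stands in for the kernel's static key, and guest_idle_poll_register(),
example_poll() and poll_calls are hypothetical names that appear in
neither the patch nor the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Poll hook; stays NULL until a hypervisor guest registers one. */
void (*guest_idle_poll_func)(void);

/* Assumption: a plain flag stands in for the kernel's static key. */
bool guest_idle_poll_enabled;

/* Hypothetical counter so the effect of a poll is observable. */
int poll_calls;

/* Hypothetical poll routine a hypervisor guest might provide. */
void example_poll(void)
{
	poll_calls++;
}

/* Hypothetical registration helper: set the hook, then enable the flag. */
void guest_idle_poll_register(void (*fn)(void))
{
	assert(fn != NULL);
	guest_idle_poll_func = fn;
	guest_idle_poll_enabled = true;
}

/* Called from the idle path; does nothing until a hook is registered. */
void guest_idle_poll(void)
{
	if (guest_idle_poll_enabled)
		guest_idle_poll_func();
}
```

Nothing in the sketch is arch-specific: a guest only has to register its
poll function, and the idle path calls guest_idle_poll() unconditionally.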

>

Re: git pull

2017-11-14 Thread Ulf Hansson
[...]

>
> An example pull request of mine might look like:
> Char/Misc patches for 4.15-rc1
>
> Here is the big char/misc patch set for the 4.15-rc1 merge
> window.  Contained in here is the normal set of new functions
> added to all of these crazy drivers, as well as the following
> brand new subsystems:
> - time_travel_controller: Finally a set of drivers for
>   the latest time travel bus architecture that provides
>   i/o to the CPU before it asked for it, allowing
>   uninterrupted processing
> - relativity_shifters: due to the effect that the
>   time_travel_controllers have on the overall system,
>   there was a need for a new set of relativity shifter
>   drivers to accommodate the newly formed black holes
>   that would threaten to suck CPUs into them.  This
>   subsystem handles this in a way to successfully
>   neutralize the problems.  There is a Kconfig option to
>   force these to be enabled when needed, so problems
>   should not occur.
>
> All of these patches have been successfully tested in the latest
> linux-next releases, and the original problems that it found
> have all been resolved (apologies to anyone living near Canberra
> for the lack of the Kconfig options in the earlier versions of
> the linux-next tree creations.)
>
> Signed-off-by: Your-name-here 
>
>
> The tag message format is just like a git commit id.  One line at the
> top for a "summary subject" and be sure to sign-off at the bottom.

I don't add my s-o-b to signed tags for pull requests, but perhaps I should.

However, I think most maintainers don't use it, nor does it seem
like Linus preserves the tag when he does the pull.

[...]

Kind regards
Uffe


Re: git pull

2017-11-14 Thread Greg Kroah-Hartman
On Tue, Nov 14, 2017 at 01:00:14PM +0100, Ulf Hansson wrote:
> [...]
> 
> >
> > An example pull request of mine might look like:
> > Char/Misc patches for 4.15-rc1
> >
> > Here is the big char/misc patch set for the 4.15-rc1 merge
> > window.  Contained in here is the normal set of new functions
> > added to all of these crazy drivers, as well as the following
> > brand new subsystems:
> > - time_travel_controller: Finally a set of drivers for
> >   the latest time travel bus architecture that provides
> >   i/o to the CPU before it asked for it, allowing
> >   uninterrupted processing
> > - relativity_shifters: due to the effect that the
> >   time_travel_controllers have on the overall system,
> >   there was a need for a new set of relativity shifter
> >   drivers to accommodate the newly formed black holes
> >   that would threaten to suck CPUs into them.  This
> >   subsystem handles this in a way to successfully
> >   neutralize the problems.  There is a Kconfig option to
> >   force these to be enabled when needed, so problems
> >   should not occur.
> >
> > All of these patches have been successfully tested in the latest
> > linux-next releases, and the original problems that it found
> > have all been resolved (apologies to anyone living near Canberra
> > for the lack of the Kconfig options in the earlier versions of
> > the linux-next tree creations.)
> >
> > Signed-off-by: Your-name-here 
> >
> >
> > The tag message format is just like a git commit id.  One line at the
> > top for a "summary subject" and be sure to sign-off at the bottom.
> 
> I don't add my s-o-b to signed tags for pull requests, but perhaps I should.
> 
> However, I think most maintainers don't use it, nor does it seem
> like Linus preserves the tag when he does the pull.

The text of the tag is in the merge commit, but you are right, the
signed-off-by doesn't seem to be in the merge commit; I guess Linus's
workflow removes them.  I know I keep them in there if present for pull
requests that people send to me.

thanks,

greg k-h


Re: [PATCH v8 0/7] Support RAS virtualization in KVM

2017-11-14 Thread James Morse
Hi Dongjiu Geng,

On 10/11/17 19:54, Dongjiu Geng wrote:
> This series patches mainly do below things:
> 
> 1. Trap RAS ERR* register accesses to EL2 from Non-secure EL1.
>    KVM will do a minimum simulation; these registers are simulated
>    to RAZ/WI in KVM.
> 2. Route synchronous External Abort exceptions from Non-secure EL0
>    and EL1 to EL2. When exception EL3 routing is enabled by firmware,
>    system will trap to EL3 firmware instead of EL2 KVM, then firmware
>    judges whether EL2 routing is enabled, if enabled, jump to EL2 KVM,
>    otherwise jump to EL1 host kernel.
> 3. Enable APEI ARMv8 SEI notification to parse the CPER records for SError
>    in the ACPI GHES driver, KVM will call handle_guest_sei() to let ACPI
>    driver to parse the CPER record for SError which happened in the guest
> 4. Although we can use APEI driver to handle the guest SError, not all
>    systems support SEI notification, such as kernel-first. So here KVM will
>    also classify the Error through Exception Syndrome Register and take
>    different approaches according to Asynchronous Error Type

> 5. If the guest SError error is not propagated and not consumed, then KVM
>    returns recoverable error status to user-space, user-space will specify
>    the guest ESR

I thought we'd gone over this. There should be no RAS errors/notifications in
user space. Only the symptoms should be sent, using the SIGBUS_MCEERR_A{O,R} if
the kernel has handled as much as it can. This hides the actual mechanisms the
kernel and firmware used.

User-space should not have to know how to handle RAS errors directly. This is a
service the operating system provides for it. This abstraction means the same
user-space code is portable between x86, arm64, powerpc etc.

What if the firmware uses another notification method? User space should expect
the kernel to hide things like this from it.

If the kernel has no information to interpret a notification, how is user space
supposed to know?

I understand you are trying to work around your 'memory corruption at an unknown
address'[0] problem, but if the kernel can't know where this corrupt memory is
it should really reboot. What stops this corrupt data being swapped to disk?

Killing 'the thing' that was running at the time is not sufficient because we
don't know that this 'got' all the users of the corrupt memory. KSM can merge
pages between guests. This is the difference between the error persisting
forever killing off all the VMs one by one, and the corrupt page being silently
re-read from disk clearing the error.


>    and inject a virtual SError. For other Asynchronous Error Types, KVM
>    directly injects virtual SError with IMPLEMENTATION DEFINED ESR or KVM
>    panics if the error is fatal. In the RAS extension, guest virtual ESR
>    must be set, because all-zero means 'RAS error: Uncategorized' instead
>    of 'no valid ISS', so set this ESR to IMPLEMENTATION DEFINED by default
>    if user space does not specify it.


Thanks,

James


[0] https://www.spinics.net/lists/arm-kernel/msg605345.html


Re: [PATCH v8 7/7] arm64: kvm: handle SError Interrupt by categorization

2017-11-14 Thread James Morse
Hi Dongjiu Geng,

On 10/11/17 19:54, Dongjiu Geng wrote:
> If it is not RAS SError, directly inject virtual SError,
> which will keep the old way. If it is RAS SError, firstly
> let host ACPI module to handle it.

> For the ACPI handling,
> if the error address is invalid, APEI driver will not
> identify the address to hwpoison memory and can not notify
> guest to do the recovery.

The guest can't do any recovery either. There is no recovery you can do without
some information about what the error is.

This is your memory corruption at an unknown address? We should reboot.

(I agree memory-failure.c's me_kernel() is ignoring kernel errors, we should
try and fix this. It makes some sense for polled or irq notifications, but not
SEA/SEI).


> In order to be safe, KVM continues
> categorizing errors and handles them separately.

> If the RAS error is not propagated, let host user space
> handle it.

No. Host user space should not know anything about the kernel or platform RAS
support. Doing so creates an ABI link between EL3 firmware and Qemu. This is
totally unmaintainable.

This thing needs to be portable. The kernel should handle the error, and report
any symptoms to user-space. e.g. 'this memory is gone'.

We shouldn't special case KVM.


> The reason is that sometimes we can only kill the
> guest's affected application instead of panicking the whole guest OS.
> Host user space specifies a valid ESR and injects a virtual
> SError; the guest can just kill the current application if the
> non-consumed error comes from a guest application.
> 
> Signed-off-by: Dongjiu Geng 
> Signed-off-by: Quanming Wu 

The last Signed-off-by should match the person posting the patch. It's a chain
of custody for GPL-signoff purposes, not a 'partially-written-by'. If you want
to credit Quanming Wu you can add CC and they can Ack/Review your patch.


> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 7debb74..1afdc87 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -178,6 +179,66 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu *vcpu)
>   return arm_exit_handlers[hsr_ec];
>  }
>  
> +/**
> + * kvm_handle_guest_sei - handles SError interrupt or asynchronous aborts
> + * @vcpu:the VCPU pointer
> + *
> + * For RAS SError interrupt, firstly let host kernel handle it.
> + * If the AET is [ESR_ELx_AET_UER], then let user space handle it,
> + */
> +static int kvm_handle_guest_sei(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +{
> +	unsigned int esr = kvm_vcpu_get_hsr(vcpu);
> +	bool impdef_syndrome = esr & ESR_ELx_ISV;	/* aka IDS */
> +	unsigned int aet = esr & ESR_ELx_AET;
> +
> +	/*
> +	 * This is not RAS SError
> +	 */
> +	if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
> +		kvm_inject_vabt(vcpu);
> +		return 1;
> +	}

> +	/* The host kernel may handle this abort. */
> +	handle_guest_sei();

This has to claim the SError as a notification. If APEI claims the error, KVM
doesn't need to do anything more. You ignore its return code.
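
A rough sketch of what acting on the return code could look like, written
as stand-alone C rather than as a change to the patch. The return
convention (0 meaning the notification was claimed) is an assumption, and
handle_sei_exit(), enum sei_result and the stub functions are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcomes of looking at a guest SError exit. */
enum sei_result {
	SEI_CLAIMED,			/* firmware-first records found and handled */
	SEI_NEEDS_CLASSIFICATION,	/* fall back to ESR-based handling */
};

/*
 * 'claim' stands in for the APEI hook. Assumption: it returns 0 when it
 * found and processed records for this SError, non-zero otherwise.
 */
enum sei_result handle_sei_exit(int (*claim)(void))
{
	assert(claim != NULL);

	if (claim() == 0)
		return SEI_CLAIMED;	/* nothing more for KVM to do */

	return SEI_NEEDS_CLASSIFICATION;
}

/* Hypothetical stubs: one claims the notification, one does not. */
int stub_claimed(void)
{
	return 0;
}

int stub_not_claimed(void)
{
	return -2;
}
```

The only point is that the caller branches on the result instead of
discarding it, so a claimed notification never reaches the ESR-based
classification.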


> +
> +	/*
> +	 * In below two conditions, it will directly inject the
> +	 * virtual SError:
> +	 * 1. The Syndrome is IMPLEMENTATION DEFINED
> +	 * 2. It is Uncategorized SEI
> +	 */
> +	if (impdef_syndrome ||
> +	    ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR)) {
> +		kvm_inject_vabt(vcpu);
> +		return 1;
> +	}
> +
> +	switch (aet) {
> +	case ESR_ELx_AET_CE:	/* corrected error */
> +	case ESR_ELx_AET_UEO:	/* restartable error, not yet consumed */
> +		return 1;	/* continue processing the guest exit */

> +	case ESR_ELx_AET_UER:	/* The error has not been propagated */
> +		/*
> +		 * Userspace only handle the guest SError Interrupt(SEI) if the
> +		 * error has not been propagated
> +		 */
> +		run->exit_reason = KVM_EXIT_EXCEPTION;
> +		run->ex.exception = ESR_ELx_EC_SERROR;
> +		run->ex.error_code = KVM_SEI_SEV_RECOVERABLE;
> +		return 0;

We should not pass RAS notifications to user space. The kernel either handles
them, or it panics(). User space shouldn't even know if the kernel supports RAS
until it gets an MCEERR signal.

You're making your firmware-first notification an EL3->EL0 signal, bypassing 
the OS.

If we get a RAS SError and there are no CPER records or values in the ERR nodes,
we should panic as it looks like the CPU/firmware is broken. (spurious RAS 
errors)


> +	default:
> +		/*
> +		 * Until now, the CPU supports RAS and SEI is fatal, or host
> +		 * does not support to handle the SError.
> +		 */
> +		panic("This Asynchronous SError interrupt is dangerous, panic");
> +	}
> +
> +	return 0;
> +}
> +
>  /*
>   * Return > 0 to return to guest, < 0 on error, 0 (and set exit_reason) on
>   * prop

Re: [PATCH v2 1/6] PM / core: Add LEAVE_SUSPENDED driver flag

2017-11-14 Thread Ulf Hansson
On 11 November 2017 at 00:45, Rafael J. Wysocki  wrote:
> On Fri, Nov 10, 2017 at 10:09 AM, Ulf Hansson  wrote:
>> On 8 November 2017 at 14:25, Rafael J. Wysocki  wrote:
>>> From: Rafael J. Wysocki 
>>>
>>> Define and document a new driver flag, DPM_FLAG_LEAVE_SUSPENDED, to
>>> instruct the PM core and middle-layer (bus type, PM domain, etc.)
>>> code that it is desirable to leave the device in runtime suspend
>>> after system-wide transitions to the working state (for example,
>>> the device may be slow to resume and it may be better to avoid
>>> resuming it right away).
>>>
>>> Generally, the middle-layer code involved in the handling of the
>>> device is expected to indicate to the PM core whether or not the
>>> device may be left in suspend with the help of the device's
>>> power.may_skip_resume status bit.  That has to happen in the "noirq"
>>> phase of the preceding system suspend (or analogous) transition.
>>> The middle layer is then responsible for handling the device as
>>> appropriate in its "noirq" resume callback which is executed
>>> regardless of whether or not the device may be left suspended, but
>>> the other resume callbacks (except for ->complete) will be skipped
>>> automatically by the core if the device really can be left in
>>> suspend.
>>
>> I don't understand the reason why you need to skip invoking resume
>> callbacks to achieve this behavior, could you elaborate on that?
>
> The reason why it is done this way is because that takes less code and
> is easier (or at least less error-prone, because it avoids repeating
> patterns in middle layers).
>
> Note that the callbacks only may be skipped by the core if the middle
> layer has set power.skip_resume for the device (or if the core is
> handling it in patch [5/6], but that's one more step ahead still).
>
>> Couldn't the PM domain or the middle-layer instead decide what to do?
>
> They still can, the whole thing is a total opt-in.
>
> But to be constructive, do you have any specific examples in mind?

See more below.

>
>> To me it sounds a bit prone to errors by skipping callbacks from the
>> PM core, and I wonder if the general driver author will be able to
>> understand how to use this flag properly.
>
> This has nothing to do with general driver authors and I'm not sure
> what you mean here and where you are going with this.

Let me elaborate.

My general goal is that I want to make it easier (or as easy as
possible) for the general driver author to deploy runtime PM and
system-wide PM support - in an optimized manner. Therefore, I am
pondering over the solution you picked in this series, trying to
understand how it fits into those aspects.

In particular, I am a bit worried, from a complexity point of view, about
the part with skipping callbacks from the PM core. We have observed
some difficulties with the direct_complete path (i2c dw driver), which
is based on a similar approach to this one.

Additionally, in this case, to trigger skipping of callbacks, first,
drivers need to inform the middle layer; second, the middle layer acts
on that information and then informs the PM core; then, in the third
step, the PM core can decide what to do. It doesn't sound
straightforward.

I guess I need to be convinced that this new approach is going to be
better than the direct_complete path, so it somehow can replace it
along the road. Otherwise, we may end up just having yet another way
of skipping callbacks in the PM core and I don't like that.

Of course, I also realize this whole thing is opt-in, so nothing will
break and we are all good. :-)

>
>> That said, as the series doesn't include any changes for drivers making
>> use of the flag, could you please fold in such a change, as it would provide
>> a more complete picture?
>
> I've already done so, see https://patchwork.kernel.org/patch/10007349/
>
> IMHO it's not really useful to drag this stuff (which doesn't change
> BTW) along with every iteration of the core patches.

Well, to me it's useful because it shows how these flags can/will be used.

Anyway, I thought you scrapped that patch and were working on a new
version. I will have a look then.

[...]

>>>   * device_resume_noirq - Execute a "noirq resume" callback for given device.
>>>   * @dev: Device to handle.
>>>   * @state: PM transition of the system being carried out.
>>> @@ -575,6 +587,12 @@ static int device_resume_noirq(struct de
>>> error = dpm_run_callback(callback, dev, state, info);
>>> dev->power.is_noirq_suspended = false;
>>>
>>> +   if (dev_pm_may_skip_resume(dev)) {
>>> +   pm_runtime_set_suspended(dev);
>>
>> According to the doc, the DPM_FLAG_LEAVE_SUSPENDED intends to leave
>> the device in runtime suspend state during system resume.
>> However, here you are actually trying to change its runtime PM state to that.
>
> So the doc needs to be fixed. :-)

Yep.

>
> But I'm guessing that this just is a misunderstanding and you mean the
> phrase "it may be desirable to leave so

Re: [PATCH v3] cpuset: Enable cpuset controller in default hierarchy

2017-11-14 Thread Waiman Long
On 10/26/2017 02:12 PM, Waiman Long wrote:
> On 10/26/2017 10:39 AM, Tejun Heo wrote:
>> Hello, Waiman.
>>
>> On Wed, Oct 25, 2017 at 11:50:34AM -0400, Waiman Long wrote:
>>> Ping! Any comment on this patch?
>> Sorry about the lack of response.  Here are my two thoughts.
>>
>> 1. I'm not really sure about the memory part.  Mostly because of the
>>    way it's configured and enforced is completely out of step with how
>>    mm behaves in general.  I'd like to get more input from mm folks on
>>    this.
> Yes, I also have doubts about which of the additional features are being
> actively used. That is why the current patch exposes only the memory_migrate
> flag in addition to the core *cpus and *mems control files. All the
> other v1 features are not exposed, pending further investigation and
> feedback. One way to get more feedback is to have something that people
> can play with. Maybe we could somehow tag it as experimental so that we
> can change the interface later on, when necessary, if you have concerns
> about setting the APIs in stone.
>
>> 2. I want to think more about how we expose the effective settings.
>>    Not that anything is wrong with what cpuset does, but more that I
>>    wanna ensure that it's something we can follow in other cases where
>>    we have similar hierarchical property propagation.
> Currently, the effective setting is exposed via the effective_cpus and
> effective_mems control files. Unlike other controllers that control
> resources, cpuset is unique in the sense that it is propagating
> hierarchical constraints on CPUs and memory nodes down the tree. I
> understand your desire to have a unified framework that can be applied
> to most controllers, but I doubt cpuset is a good model in this regard.

What do you think we can do for the 4.16 development cycle? I would really
like to see at least some kind of experimental support for cpuset v2.
That may be the best way to gather feedback and decide what to do next.

Cheers,
Longman


Re: git pull

2017-11-14 Thread Linus Torvalds
On Tue, Nov 14, 2017 at 3:05 AM, Greg Kroah-Hartman
 wrote:
>
> Name the tag with something useful that you can understand if you run
> across it in a few weeks, and something that will be "unique".
> Continuing the example of my char-misc tree, for the patches to be sent
> to Linus for the 4.15-rc1 merge window, I would name the tag
> 'char-misc-4.15-rc1':
> git tag -u KEY_ID -s char-misc-4.15-rc1 char-misc-next

Side note: since you _usually_ would use the same key for the same
project, just set it once with

git config user.signingkey "keyname"

and if you use the same key for everything, just add "--global".

Or just edit your .git/config or ~/.gitconfig file by hand, it's
designed to be human-readable and writable, and not some garbage like
XML:

   [torvalds@i7 ~]$ head -4 .gitconfig
   [user]
           name = Linus Torvalds
           email = torva...@linux-foundation.org
           signingkey = torva...@linux-foundation.org

it's not really all that complicated ;)

Then you don't need that "-u KEY_ID" when you sign things.

Anyway, at least to me, the important part is the *message*. I want to
understand what I'm pulling, and why I should pull it. I also want to
use that message as the message for the merge, so it should not just
make sense to me, but make sense as a historical record too.

Note that if there is something odd about the pull request, that
should very much be in the explanation. If you're touching files that
you don't maintain, explain _why_. I will see it in the diffstat
anyway, and if you didn't mention it, I'll just be extra suspicious.
And when you send me new stuff after the merge window (or even
bug-fixes, but ones that look scary), explain not just what they do
and why they do it, but explain the _timing_. What happened that this
didn't go through the merge window..

I will take both what you write in the email pull request _and_ in the
signed tag, so depending on your workflow, you can either describe
your work in the signed tag (which will also automatically make it
into the pull request email), or you can make the signed tag just a
placeholder with nothing interesting in it, and describe the work
later when you actually send me the pull request.

And yes, I will edit the message. Partly because I tend to do just
trivial formatting (the whole indentation and quoting etc), but partly
because part of the message may make sense for me at pull time
(describing the conflicts and your personal issues for sending it
right now), but may not make sense in the context of a merge commit
message, so I will try to make it all make sense. I will also fix any
speeling mistaeks and bad grammar I notice, particularly for
non-native speakers (but also for native ones ;^). But I may miss
some, or even add some.

   Linus


Re: [PATCH v2] sched/deadline: fix runtime accounting in documentation

2017-11-14 Thread Mathieu Poirier
Hi Claudio,

On 14 November 2017 at 04:19, Claudio Scordino  wrote:
> Signed-off-by: Claudio Scordino 
> Signed-off-by: Luca Abeni 
> Acked-by: Daniel Bristot de Oliveira 
> CC: Jonathan Corbet 
> CC: "Peter Zijlstra (Intel)" 
> CC: Ingo Molnar 
> CC: linux-doc@vger.kernel.org
> Cc: Tommaso Cucinotta 
> Cc: Mathieu Poirier 
> ---

Please add a short description for your change, even if it is trivial.


>  Documentation/scheduler/sched-deadline.txt | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/scheduler/sched-deadline.txt b/Documentation/scheduler/sched-deadline.txt
> index e89e36e..8ce78f8 100644
> --- a/Documentation/scheduler/sched-deadline.txt
> +++ b/Documentation/scheduler/sched-deadline.txt
> @@ -204,10 +204,17 @@ CONTENTS
>   It does so by decrementing the runtime of the executing task Ti at a pace equal
>   to
>
> -   dq = -max{ Ui, (1 - Uinact) } dt
> +   dq = -max{ Ui / Umax, (1 - Uinact - Uextra) } dt
>
> - where Uinact is the inactive utilization, computed as (this_bq - running_bw),
> - and Ui is the bandwidth of task Ti.
> + where:
> +
> +  - Ui is the bandwidth of task Ti;
> +  - Umax is the maximum reclaimable utilization (subjected to RT throttling
> +limits);
> +  - Uinact is the (per runqueue) inactive utilization, computed as
> +(this_bq - running_bw);
> +  - Uextra is the (per runqueue) extra reclaimable utilization
> +(subjected to RT throttling limits).

I think there would be value in defining 'dq' and 'dt'.  That way
people know exactly what they are and it leaves no room for
interpretation.
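
As a starting point, a small sketch of the updated formula in plain C,
reading dq as the change in the task's remaining runtime over an elapsed
interval dt. That reading is my assumption and should be confirmed;
grub_depletion_rate() and grub_dq() are hypothetical helper names, not
kernel functions.

```c
#include <assert.h>

/*
 * Pace at which runtime is consumed per unit of elapsed time:
 * max{ Ui / Umax, 1 - Uinact - Uextra }.
 */
double grub_depletion_rate(double u_i, double u_max,
			   double u_inact, double u_extra)
{
	double by_bandwidth;
	double by_reclaim;

	assert(u_max > 0.0);	/* Umax is a divisor, so it must be positive */

	by_bandwidth = u_i / u_max;
	by_reclaim = 1.0 - u_inact - u_extra;

	return by_bandwidth > by_reclaim ? by_bandwidth : by_reclaim;
}

/* dq = -max{ Ui / Umax, 1 - Uinact - Uextra } dt */
double grub_dq(double u_i, double u_max,
	       double u_inact, double u_extra, double dt)
{
	return -grub_depletion_rate(u_i, u_max, u_inact, u_extra) * dt;
}
```

For example, with Ui = 0.25, Umax = 1, Uinact = 0.5 and Uextra = 0 the
pace is max{0.25, 0.5} = 0.5, so dt = 4 time units deplete 2 units of
runtime.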

Thanks,
Mathieu

>
>
>   Let's now see a trivial example of two deadline tasks with runtime equal
> --
> 2.7.4
>


Re: git pull

2017-11-14 Thread Tobin C. Harding
Added Linus to To: header.

On Tue, Nov 14, 2017 at 12:05:00PM +0100, Greg Kroah-Hartman wrote:
> Adding lkml and linux-doc mailing lists...
> 
> On Tue, Nov 14, 2017 at 10:11:55AM +1100, Tobin C. Harding wrote:
> > Hi Greg,
> > 
> > This is totally asking a favour, feel free to ignore. How do you format
> > your [GIT PULL] emails to Linus? Do you create a tag and then run a git
> > command to get the email?
> > 
> > I tried to do it manually and failed pretty hard (as you no doubt will
> > notice on LKML).
> 
> Well, I think you got it right the third time, so nice job :)
> 
> Anyway, this actually came up at the kernel summit / maintainer meeting
> a few weeks ago, in that "how do I make a good pull request to Linus" is
> something we need to document.
> 
> Here's what I do, and it seems to work well, so maybe we should turn it
> into the start of the documentation for how to do it.
> 
> ---
> 
> To start with, put your changes on a branch, hopefully one named in a
> semi-useful way (I use 'char-misc-next' for my char/misc driver patches
> to be merged into linux-next).  That is the branch you wish to tag and
> have Linus pull from.
> 
> Name the tag with something useful that you can understand if you run
> across it in a few weeks, and something that will be "unique".
> Continuing the example of my char-misc tree, for the patches to be sent
> to Linus for the 4.15-rc1 merge window, I would name the tag
> 'char-misc-4.15-rc1':
>   git tag -u KEY_ID -s char-misc-4.15-rc1 char-misc-next
> 
> that will create a signed tag called 'char-misc-4.15-rc1' based on the
> last commit in the char-misc-next branch, and sign it with my gpg key
> KEY_ID (replace KEY_ID with your own gpg key id.)
> 
> When you run the above command, git will drop you into an editor and ask
> you to describe the tag.  In this case, you are describing a pull
> request, so outline what is contained here, why it should be merged, and
> what, if any, testing has happened to it.  All of this information will
> end up in the tag itself, and then in the merge commit that Linus makes,
> so write it up well, as it will be in the kernel tree for forever.
> 
> An example pull request of mine might look like:
>   Char/Misc patches for 4.15-rc1
> 
>   Here is the big char/misc patch set for the 4.15-rc1 merge
>   window.  Contained in here is the normal set of new functions
>   added to all of these crazy drivers, as well as the following
>   brand new subsystems:
>   - time_travel_controller: Finally a set of drivers for
> the latest time travel bus architecture that provides
> i/o to the CPU before it asked for it, allowing
> uninterrupted processing
>   - relativity_shifters: due to the effect that the
> time_travel_controllers have on the overall system,
> there was a need for a new set of relativity shifter
> drivers to accommodate the newly formed black holes
> that would threaten to suck CPUs into them.  This
> subsystem handles this in a way to successfully
> neutralize the problems.  There is a Kconfig option to
> force these to be enabled when needed, so problems
> should not occur.
> 
>   All of these patches have been successfully tested in the latest
>   linux-next releases, and the original problems that it found
>   have all been resolved (apologies to anyone living near Canberra
>   for the lack of the Kconfig options in the earlier versions of
>   the linux-next tree creations.)
> 
>   Signed-off-by: Your-name-here 
> 
> 
> The tag message format is just like a git commit id.  One line at the
> top for a "summary subject" and be sure to sign-off at the bottom.
> 
> Now that you have a local signed tag, you need to push it up to where it
> can be retrieved by others:
>   git push origin char-misc-4.15-rc1
> pushes the char-misc-4.15-rc1 tag to where the 'origin' repo is located.
> 
> The last thing to do is create the pull request message.  Git handily
> will do this for you with the 'git request-pull' command, but it needs a
> bit of help determining what you want to pull, and what to base the pull
> against (to show the correct changes to be pulled and the diffstat.)
> 
> I use the following command to generate a pull request:
>   git request-pull master git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git/ char-misc-4.15-rc1
> 
> This is asking git to compare the difference from the
> 'char-misc-4.15-rc1' tag location, to the head of the 'master' branch
> (which in my case points to the last location in Linus's tree that I
> diverged from, usually a -rc release) and to use the git:// protocol to
> pull from.  If you wish to use https://, that can be used here instead
> as well (but note that some people behind firewalls will have problems
> with https 

Re: git pull

2017-11-14 Thread Tobin C. Harding
On Tue, Nov 14, 2017 at 12:05:00PM +0100, Greg Kroah-Hartman wrote:
> Adding lkml and linux-doc mailing lists...
> 
> On Tue, Nov 14, 2017 at 10:11:55AM +1100, Tobin C. Harding wrote:
> > Hi Greg,
> > 
> > This is totally asking a favour, feel free to ignore. How do you format
> > your [GIT PULL] emails to Linus? Do you create a tag and then run a git
> > command to get the email?
> > 
> > I tried to do it manually and failed pretty hard (as you no doubt will
> > notice on LKML).
> 
> Well, I think you got it right the third time, so nice job :)

Lucky. Three strikes and you're out, isn't it?

> Anyway, this actually came up at the kernel summit / maintainer meeting
> a few weeks ago, in that "how do I make a good pull request to Linus" is
> something we need to document.
> 
> Here's what I do, and it seems to work well, so maybe we should turn it
> into the start of the documentation for how to do it.

Patch to come.

> ---
> 
> To start with, put your changes on a branch, hopefully one named in a
> semi-useful way (I use 'char-misc-next' for my char/misc driver patches
> to be merged into linux-next).  That is the branch you wish to tag and
> have Linus pull from.
> 
> Name the tag with something useful that you can understand if you run
> across it in a few weeks, and something that will be "unique".
> Continuing the example of my char-misc tree, for the patches to be sent
> to Linus for the 4.15-rc1 merge window, I would name the tag
> 'char-misc-4.15-rc1':
>   git tag -u KEY_ID -s char-misc-4.15-rc1 char-misc-next
> 
> that will create a signed tag called 'char-misc-4.15-rc1' based on the
> last commit in the char-misc-next branch, and sign it with my gpg key
> KEY_ID (replace KEY_ID with your own gpg key id.)
> 
> When you run the above command, git will drop you into an editor and ask
> you to describe the tag.  In this case, you are describing a pull
> request, so outline what is contained here, why it should be merged, and
> what, if any, testing has happened to it.  All of this information will
> end up in the tag itself, and then in the merge commit that Linus makes,
> so write it up well, as it will be in the kernel tree for forever.
> 
> An example pull request of mine might look like:
>   Char/Misc patches for 4.15-rc1
> 
>   Here is the big char/misc patch set for the 4.15-rc1 merge
>   window.  Contained in here is the normal set of new functions
>   added to all of these crazy drivers, as well as the following
>   brand new subsystems:
>   - time_travel_controller: Finally a set of drivers for
> the latest time travel bus architecture that provides
> i/o to the CPU before it asked for it, allowing
> uninterrupted processing
>   - relativity_shifters: due to the effect that the
> time_travel_controllers have on the overall system,
> there was a need for a new set of relativity shifter
> drivers to accommodate the newly formed black holes
> that would threaten to suck CPUs into them.  This
> subsystem handles this in a way to successfully
> neutralize the problems.  There is a Kconfig option to
> force these to be enabled when needed, so problems
> should not occur.
> 
>   All of these patches have been successfully tested in the latest
>   linux-next releases, and the original problems that it found
>   have all been resolved (apologies to anyone living near Canberra
>   for the lack of the Kconfig options in the earlier versions of
>   the linux-next tree creations.)
> 
>   Signed-off-by: Your-name-here 
> 
> 
> The tag message format is just like a git commit id.  One line at the
> top for a "summary subject" and be sure to sign-off at the bottom.
> 
> Now that you have a local signed tag, you need to push it up to where it
> can be retrieved by others:
>   git push origin char-misc-4.15-rc1
> pushes the char-misc-4.15-rc1 tag to where the 'origin' repo is located.
> 
> The last thing to do is create the pull request message.  Git handily
> will do this for you with the 'git request-pull' command, but it needs a
> bit of help determining what you want to pull, and what to base the pull
> against (to show the correct changes to be pulled and the diffstat.)
> 
> I use the following command to generate a pull request:
>   git request-pull master git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git/ char-misc-4.15-rc1
> 
> This is asking git to compare the difference from the
> 'char-misc-4.15-rc1' tag location, to the head of the 'master' branch
> (which in my case points to the last location in Linus's tree that I
> diverged from, usually a -rc release) and to use the git:// protocol to
> pull from.  If you wish to use https://, that can be used here instead
> as well (but note that some people behind firewalls wil

Re: git pull

2017-11-14 Thread Linus Torvalds
On Tue, Nov 14, 2017 at 1:33 PM, Tobin C. Harding  wrote:
>
> Linus do you care what protocol? I'm patching Documentation and since
> the point is creating pull requests for you 'some people' don't matter.

I actually tend to prefer the regular git:// protocol and signed tags.

It's true that https should have the proper certificate and perhaps
help with DNS spoofing, but I'm not convinced that git won't just
accept self-signed random certs, and I basically don't think we should
trust that.

In contrast, using ssh I would actually trust, but it's not convenient
and involves people sending things that aren't necessarily publicly
available.

So instead, I prefer just using git:// and not trying to fool people
into thinking the protocol is secure - the security should come from
the signed tag.

And then people can do this:

  [url "ssh://g...@gitolite.kernel.org"]
  insteadOf = https://git.kernel.org
  insteadOf = http://git.kernel.org
  insteadOf = git://git.kernel.org

which makes git.kernel.org addresses use ssh, and avoid the whole
possible DNS spoofing problem.
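
(A minimal sketch of setting up the same kind of rewrite from the command
line rather than by editing the config file by hand. The host
'example.org' and the 'git' ssh user are placeholders chosen for
illustration, not real settings, and the script writes to a throwaway
config file so nothing in your own setup is changed.)

```shell
#!/bin/sh
set -e

# Throwaway config file, so the real ~/.gitconfig is left untouched.
cfg=$(mktemp)

# Placeholder ssh base URL (an assumption for illustration only).
base='ssh://git@example.org'

# One insteadOf entry per URL prefix that should be rewritten to ssh.
for prefix in https://example.org http://example.org git://example.org; do
    git config --file "$cfg" --add "url.$base.insteadOf" "$prefix"
done

# Show the resulting rewrite rules.
git config --file "$cfg" --get-all "url.$base.insteadOf"
```

To apply such rules for real, the same 'git config' calls can be run
with --global in place of --file "$cfg", using your actual host names.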

That said, I actually would prefer even kernel.org repositories to
just send pull requests with signed tags, despite the protocol itself
being secure for that (and only that).

Other hosts I will simply not trust without it because I can't do the above.

Side note: there's an unrelated advantage of using "git://" over
"https://". It means that people who do automation see that it's a git
repo. It also means, for example, that people that highlight https://
URL's and perhaps use them for spam marking hopefully don't do that
with git:// format.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] docs: add submitting-pull-requests.rst

2017-11-14 Thread Tobin C. Harding
There is currently no documentation on how to create a pull request for
Linus.

Anyway, this actually came up at the kernel summit / maintainer
meeting a few weeks ago, in that "how do I make a good pull request
to Linus" is something we need to document.

Here's what I do, and it seems to work well, so maybe we should turn
it into the start of the documentation for how to do it.

Create document from email thread on LKML (referenced in document).

Signed-off-by: Tobin C. Harding 
---

Is it rude to send this during the merge window? Can resend after it closes if
it makes life easier.

thanks,
Tobin.

 Documentation/process/submitting-pull-requests.rst | 171 +
 1 file changed, 171 insertions(+)
 create mode 100644 Documentation/process/submitting-pull-requests.rst

diff --git a/Documentation/process/submitting-pull-requests.rst b/Documentation/process/submitting-pull-requests.rst
new file mode 100644
index ..9528aead4809
--- /dev/null
+++ b/Documentation/process/submitting-pull-requests.rst
@@ -0,0 +1,171 @@
+Submitting Pull Requests to Linus: a guide for maintainers
+==========================================================
+
+This document is aimed at kernel maintainers.  It describes a method for creating
+a pull request to be sent to Linus.
+
+Configure Git
+-------------
+
+Since you _usually_ would use the same key for the same project, just set it
+once with
+
+   git config user.signingkey "keyname"
+
+and if you use the same key for everything, just add "--global".
+
+Or just edit your .git/config or ~/.gitconfig file by hand, it's designed to be
+human-readable and writable, and not some garbage like XML:
+
+   [torvalds@i7 ~]$ head -4 .gitconfig
+   [user]
+   name = Linus Torvalds
+   email = torva...@linux-foundation.org
+   signingkey = torva...@linux-foundation.org
+
+You may need to tell git to use gpg2
+
+   [gpg]
+   program = /path/to/gpg2
+
+You may also like to tell gpg which tty to use (add to shell rc file)
+
+   export GPG_TTY=$(tty)
+
+
+Branch, Tag, Push
+-----------------
+
+Next, put your changes on a branch, hopefully one named in a semi-useful way (I
+use 'char-misc-next' for my char/misc driver patches to be merged into
+linux-next).  That is the branch you wish to tag and have Linus pull from.
+
+Name the tag with something useful that you can understand if you run across it
+in a few weeks, and something that will be "unique".  Continuing the example of
+the char-misc tree, for the patches to be sent to Linus for the 4.15-rc1 merge
+window, I would name the tag 'char-misc-4.15-rc1':
+
+   git tag -s char-misc-4.15-rc1 char-misc-next
+
+that will create a signed tag called 'char-misc-4.15-rc1' based on the last
+commit in the char-misc-next branch, and sign it with your gpg key (configured
+above).
+
+When you run the above command, git will drop you into an editor and ask you to
+describe the tag.  In this case, you are describing a pull request, so outline
+what is contained here, why it should be merged, and what, if any, testing has
+happened to it.  All of this information will end up in the tag itself, and then
+in the merge commit that Linus makes, so write it up well, as it will be in the
+kernel tree for forever.
+
+   Anyway, at least to me, the important part is the *message*. I want to
+   understand what I'm pulling, and why I should pull it. I also want to
+   use that message as the message for the merge, so it should not just
+   make sense to me, but make sense as a historical record too.
+
+   Note that if there is something odd about the pull request, that
+   should very much be in the explanation. If you're touching files that
+   you don't maintain, explain _why_. I will see it in the diffstat
+   anyway, and if you didn't mention it, I'll just be extra suspicious.
+   And when you send me new stuff after the merge window (or even
+   bug-fixes, but ones that look scary), explain not just what they do
+   and why they do it, but explain the _timing_. What happened that this
+   didn't go through the merge window..
+
+   I will take both what you write in the email pull request _and_ in the
+   signed tag, so depending on your workflow, you can either describe
+   your work in the signed tag (which will also automatically make it
+   into the pull request email), or you can make the signed tag just a
+   placeholder with nothing interesting in it, and describe the work
+   later when you actually send me the pull request.
+
+   And yes, I will edit the message. Partly because I tend to do just
+   trivial formatting (the whole indentation and quoting etc), but partly
+   because part of the message may make sense for me at pull time
+   (describing the conflicts and your personal issues for sending it
+   right now), but may not make sens

Re: [PATCH] docs: add submitting-pull-requests.rst

2017-11-14 Thread Jonathan Corbet
On Wed, 15 Nov 2017 09:54:21 +1100
"Tobin C. Harding"  wrote:

> There is currently no documentation on how to create a pull request for
> Linus.
> 
> Anyway, this actually came up at the kernel summit / maintainer
> meeting a few weeks ago, in that "how do I make a good pull request
> to Linus" is something we need to document.
> 
> Here's what I do, and it seems to work well, so maybe we should turn
> it into the start of the documentation for how to do it.
> 
> Create document from email thread on LKML (referenced in document).
> 
> Signed-off-by: Tobin C. Harding 
> ---
> 
> Is it rude to send this during the merge window? Can resend after it closes if
> it makes life easier.

I can handle patches during the merge window.  That said, while I welcome
this effort and think it's a good start, there's a few things I'll
quibble about:

 - Much of this was actually written by Greg, I believe, and some by Linus.
   That deserves credit in the changelog, if nowhere else.

 - Putting it in Documentation/process as RST is good.  But it should be
   added to index.rst and made part of the docs build.  I suspect you
   haven't run it through sphinx at all yet, right?  Some things are
   unlikely to format the way you think they might.

Finally, I see this as being the first installment in what, I hope, will
someday be a nice "how to be a kernel maintainer" manual.  I wouldn't
insist on it before taking a patch like this, but if you could see
through to organizing it as a chapter in a bigger sub-book, that would be
great.

Finally finally... Dan Williams [CC'd] has plans for doing some
maintainer-level documentation.  He may have thoughts on how this fits
into what he's scheming, and I'd suggest copying him on the next
iteration.

Finally finally finally...some specific comments on the text.  Some of
them might be read to suggest a major expansion of the work you've done;
please see that as me saying "that would be nice".  Doing all of this is
not a precondition to getting this document added!

> +Submitting Pull Requests to Linus: a guide for maintainers
> +==========================================================
> +
> +This document is aimed at kernel maintainers.  It describes a method for creating
> +a pull request to be sent to Linus.

Limiting text widths to, say, 75 columns when possible is preferable.  Word
has it some maintainers are still reading the docs on their adm3a
terminals.

Most maintainers push directly to Linus, so that's an obvious best focus,
but pull requests happen at other levels too.  One would hope that this
information would be applicable at all levels, so it might be nice to
describe it as such.

> +Configure Git
> +-------------

"Configure Git to use your private key"

We are, of course, missing the whole discussion on why one would want a
keypair, how to create it, how to get it into the web of trust, etc.  All
fodder for a separate chapter in our shiny new maintainer book :)  But it
is worth saying at least that this is about making Git use your key so you
can sign tags for pull requests.

> +Since you _usually_ would use the same key for the same project, just set it
> +once with

If you end a line like that with "::", the following indented section will
be formatted as code by sphinx.  That's almost always what you want.

> + git config user.signingkey "keyname"
> +
> +and if you use the same key for everything, just add "--global".
> +
> +Or just edit your .git/config or ~/.gitconfig file by hand, it's designed to be
> +human-readable and writable, and not some garbage like XML:
> +
> + [torvalds@i7 ~]$ head -4 .gitconfig
> + [user]
> + name = Linus Torvalds
> + email = torva...@linux-foundation.org
> + signingkey = torva...@linux-foundation.org
> +
> +You may need to tell git to use gpg2
> +
> + [gpg]
> + program = /path/to/gpg2
> +
> +You may also like to tell gpg which tty to use (add to shell rc file)
> +
> + export GPG_TTY=$(tty)
> +
> +
> +Branch, Tag, Push
> +-----------------
> +
> +Next, put your changes on a branch, hopefully one named in a semi-useful way (I
> +use 'char-misc-next' for my char/misc driver patches to be merged into
> +linux-next).  That is the branch you wish to tag and have Linus pull from.

Management of patches and branches would, of course, make for another nice
chapter.

> +Name the tag with something useful that you can understand if you run across it
> +in a few weeks, and something that will be "unique".  Continuing the example of

Greg likes to put quotes in weird places, but we don't need to preserve
that :)  Git will force the tag to be "unique", so we can just say unique. 

> +the char-misc tree, for the patches to be sent to Linus for the 4.15-rc1 merge
> +window, I would name the tag 'char-misc-4.15-rc1':
> +
> + git tag -s char-misc-4.15-rc1 char-misc-next
> +
> +that will create a signed tag called 'char-misc-4.15-rc1' based on

Re: [PATCH v2 1/6] PM / core: Add LEAVE_SUSPENDED driver flag

2017-11-14 Thread Rafael J. Wysocki
On Tuesday, November 14, 2017 5:07:59 PM CET Ulf Hansson wrote:
> On 11 November 2017 at 00:45, Rafael J. Wysocki  wrote:
> > On Fri, Nov 10, 2017 at 10:09 AM, Ulf Hansson  wrote:
> >> On 8 November 2017 at 14:25, Rafael J. Wysocki  wrote:
> >>> From: Rafael J. Wysocki 
> >>>
> >>> Define and document a new driver flag, DPM_FLAG_LEAVE_SUSPENDED, to
> >>> instruct the PM core and middle-layer (bus type, PM domain, etc.)
> >>> code that it is desirable to leave the device in runtime suspend
> >>> after system-wide transitions to the working state (for example,
> >>> the device may be slow to resume and it may be better to avoid
> >>> resuming it right away).
> >>>
> >>> Generally, the middle-layer code involved in the handling of the
> >>> device is expected to indicate to the PM core whether or not the
> >>> device may be left in suspend with the help of the device's
> >>> power.may_skip_resume status bit.  That has to happen in the "noirq"
> >>> phase of the preceding system suspend (or analogous) transition.
> >>> The middle layer is then responsible for handling the device as
> >>> appropriate in its "noirq" resume callback which is executed
> >>> regardless of whether or not the device may be left suspended, but
> >>> the other resume callbacks (except for ->complete) will be skipped
> >>> automatically by the core if the device really can be left in
> >>> suspend.
> >>
> >> I don't understand the reason to why you need to skip invoking resume
> >> callbacks to achieve this behavior, could you elaborate on that?
> >
> > The reason why it is done this way is because that takes less code and
> > is easier (or at least less error-prone, because it avoids repeating
> > patterns in middle layers).
> >
> > Note that the callbacks only may be skipped by the core if the middle
> > layer has set power.skip_resume for the device (or if the core is
> > handling it in patch [5/6], but that's one more step ahead still).
> >
> >> Couldn't the PM domain or the middle-layer instead decide what to do?
> >
> > They still can, the whole thing is a total opt-in.
> >
> > But to be constructive, do you have any specific examples in mind?
> 
> See more below.
> 
> >
> >> To me it sounds a bit prone to errors by skipping callbacks from the
> >> PM core, and I wonder if the general driver author will be able to
> >> understand how to use this flag properly.
> >
> > This has nothing to do with general driver authors and I'm not sure
> > what you mean here and where you are going with this.
> 
> Let me elaborate.
> 
> My general goal is that I want to make it easier (or as easy as
> possible) for the general driver author to deploy runtime PM and
> system-wide PM support - in an optimized manner. Therefore, I am
> pondering over the solution you picked in this series, trying to
> understand how it fits into those aspects.
> 
> In particular, I am a bit worried from a complexity point of view, about
> the part with skipping callbacks from the PM core. We have observed
> some difficulties with the direct_complete path (i2c dw driver), which
> is based on a similar approach as this one.

These are resume callbacks, not suspend callbacks.  Also not all of them
are skipped.  That is quite a bit different from skipping *all* callbacks.

Moreover, at the point the core decides to skip the callbacks, the device
*has* *to* be left suspended and there simply is no point in running them
no matter what.

That part of code can be trivially moved to middle layers, but then each
of them will have to do exactly the same thing.  I don't see any reason to
do that and I'm not finding one in your comments.  Sorry.

> Additionally, in this case, to trigger skipping of callbacks to
> happen, first, drivers need to inform the middle-layer, second, the
> middle layer acts on that information and then informs the PM core,
> then in the third step, the PM core can decide what to do. It doesn't
> sound straight-forward.

It really doesn't work like that.

First, the driver sets the LEAVE_SUSPENDED flag for the core to consume.
The middle layers don't have to look at it at all.

Second, each middle layer sets power.may_skip_resume for devices whose
state after system suspend should match the runtime suspend state.  The
middle layer must know that this is the case to set that bit.  [The core
effectively does that part for devices handled by it directly in patch
[5/6].]

The core then takes the LEAVE_SUSPENDED flags, power.may_skip_resume bits,
status of the children and consumers into account in order to produce the
power.must_resume bits and those are used (later) to decide whether or not
to resume the devices.  That decision is made by the core and so the core
acts on it and the middle layers must follow.

> I guess I need to be convinced that this new approach is going to be
> better than the direct_complete path, so it somehow can replace it
> along the road. Otherwise, we may end up just having yet another way
> of skipping callbacks in the PM cor

Re: [PATCH] docs: add submitting-pull-requests.rst

2017-11-14 Thread Tobin C. Harding
On Tue, Nov 14, 2017 at 04:48:16PM -0700, Jonathan Corbet wrote:

Awesome comments Jon, I knew there would be more to writing docs than
first met the eye.

> On Wed, 15 Nov 2017 09:54:21 +1100
> "Tobin C. Harding"  wrote:
> 
> > There is currently no documentation on how to create a pull request for
> > Linus.
> > 
> > Anyway, this actually came up at the kernel summit / maintainer
> > meeting a few weeks ago, in that "how do I make a good pull request
> > to Linus" is something we need to document.
> > 
> > Here's what I do, and it seems to work well, so maybe we should turn
> > it into the start of the documentation for how to do it.
> > 
> > Create document from email thread on LKML (referenced in document).
> > 
> > Signed-off-by: Tobin C. Harding 
> > ---
> > 
> > Is it rude to send this during the merge window? Can resend after it closes if
> > it makes life easier.
> 
> I can handle patches during the merge window.  That said, while I welcome
> this effort and think it's a good start, there's a few things I'll
> quibble about:
> 
>  - Much of this was actually written by Greg, I believe, and some by Linus.
>    That deserves credit in the changelog, if nowhere else.

Yeah, I struggled for ages with the tense, Greg's stuff is obviously
written as him. But I didn't want to paraphrase and present it as if I'd
written it. After your comments I'm still unsure of the _best_ way to
present this material with a good flow but still giving credit where
credit is due? It didn't seem right to add their names to the document
(thereby presenting myself as them). I did not think of the changelog -
I'll go that path for v2.

>  - Putting it in Documentation/process as RST is good.  But it should be
>    added to index.rst and made part of the docs build.  I suspect you
>    haven't run it through sphinx at all yet, right?  Some things are
>    unlikely to format the way you think they might.

My bad, I knew I would botch some of the RST stuff, didn't think to run
it through sphinx (I tend to view kernel docs as the raw files ;)

> Finally, I see this as being the first installment in what, I hope, will
> someday be a nice "how to be a kernel maintainer" manual.  I wouldn't
> insist on it before taking a patch like this, but if you could see
> through to organizing it as a chapter in a bigger sub-book, that would be
> great.

Happy to do so. I'm in no way qualified to produce much of the text but
perhaps can assist in getting things moving.

> Finally finally... Dan Williams [CC'd] has plans for doing some
> maintainer-level documentation.  He may have thoughts on how this fits
> into what he's scheming, and I'd suggest copying him on the next
> iteration.

Let's liaise on this Dan (if you want to).

> Finally finally finally...some specific comments on the text.  Some of
> them might be read to suggest a major expansion of the work you've done;
> please see that as me saying "that would be nice".  Doing all of this is
> not a precondition to getting this document added!

There is no rush to get merged, let's get it into shape first :)

> > +Submitting Pull Requests to Linus: a guide for maintainers
> > +==========================================================
> > +
> > +This document is aimed at kernel maintainers.  It describes a method for creating
> > +a pull request to be sent to Linus.
> 
> Limiting text widths to, say, 75 columns when possible is preferable.  Word
> has it some maintainers are still reading the docs on their adm3a
> terminals.

Got it. (idea for next doc 'column widths howto' - your canonical guide
to column widths (includes git brief, commit log, email, source code,
and docs)).

I'm kidding. 75 it is.

> Most maintainers push directly to Linus, so that's an obvious best focus,
> but pull requests happen at other levels too.  One would hope that this
> information would be applicable at all levels, so it might be nice to
> describe it as such.

Oh, Greg had this, I stripped it out. Back in on next spin.

> > +Configure Git
> > +-------------
> 
> "Configure Git to use your private key"
> 
> We are, of course, missing the whole discussion on why one would want a
> keypair, how to create it, how to get it into the web of trust, etc.  All
> fodder for a separate chapter in our shiny new maintainer book :)  But it
> is worth saying at least that this is about making Git use your key so you
> can sign tags for pull requests.

Funny you should say that, I'm trying to get into the web of trust so
perhaps I can help with that document (as I work out how to do it).

> > +Since you _usually_ would use the same key for the same project, just set it
> > +once with
> 
> If you end a line like that with "::", the following indented section will
> be formatted as code by sphinx.  That's almost always what you want.
> 
> > +   git config user.signingkey "keyname"

cool.

> > +
> > +and if you use the same key for everything, just add "--global".
> > +
> > +Or just edit your .g

Re: [PATCH v3 4/6] PM / core: Add helpers for subsystem callback selection

2017-11-14 Thread Ulf Hansson
On 12 November 2017 at 01:42, Rafael J. Wysocki  wrote:
> From: Rafael J. Wysocki 
>
> Add helper routines to find and return a suitable subsystem callback
> during the "noirq" phases of system suspend/resume (or analogous)
> transitions as well as during the "late" phase of system suspend and
> the "early" phase of system resume (or analogous) transitions.
>
> The helpers will be called from additional sites going forward.
>
> Signed-off-by: Rafael J. Wysocki 

With a minor nitpick, see below, feel free to add:

Reviewed-by: Ulf Hansson 

> ---
>
> v2 -> v3: No changes.
>
> ---
>  drivers/base/power/main.c |  196 +++---
>  1 file changed, 136 insertions(+), 60 deletions(-)
>
> Index: linux-pm/drivers/base/power/main.c
> ===
> --- linux-pm.orig/drivers/base/power/main.c
> +++ linux-pm/drivers/base/power/main.c
> @@ -525,6 +525,14 @@ static void dpm_watchdog_clear(struct dp
>  #define dpm_watchdog_clear(x)
>  #endif
>
> +static pm_callback_t dpm_subsys_suspend_noirq_cb(struct device *dev,
> +pm_message_t state,
> +const char **info_p);
> +
> +static pm_callback_t dpm_subsys_suspend_late_cb(struct device *dev,
> +   pm_message_t state,
> +   const char **info_p);
> +

There is no need to declare these functions.

Perhaps a following patch in the series needs them, but then that
change should add these or even better (in my opinion) just move the
implementations and avoid the declarations altogether.

[...]

Kind regards
Uffe