On Fri, Apr 01, 2016 at 09:14:30AM +0200, Juergen Gross wrote:
> + if (cpu >= nr_cpu_ids)
> + return -EINVAL;
> + if (cpu != 0)
> + return -EINVAL;
The other functions return -ENXIO for this.
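For reference, a minimal sketch of how the checks could look with the
suggested error code (illustrative only; the surrounding function is the one
from Juergen's patch):

        /* Reject CPUs outside the possible range, as the other helpers do. */
        if (cpu >= nr_cpu_ids)
                return -ENXIO;

        /* Only physical CPU 0 is handled here. */
        if (cpu != 0)
                return -ENXIO;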
On Fri, Apr 01, 2016 at 09:14:33AM +0200, Juergen Gross wrote:
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -14,6 +14,7 @@
> #include
> #include
> #include
> +#include
>
> #include "smpboot.h"
>
> @@ -758,9 +759,14 @@ struct smp_sync_call_struct {
> static void smp_call_sync_callback
On Fri, Apr 01, 2016 at 10:28:46AM +0200, Juergen Gross wrote:
> On 01/04/16 09:43, Peter Zijlstra wrote:
> > On Fri, Apr 01, 2016 at 09:14:33AM +0200, Juergen Gross wrote:
> >> --- a/kernel/smp.c
> >> +++ b/kernel/smp.c
> >> @@ -14,6 +14,7 @@
> >&
On Fri, Apr 01, 2016 at 11:03:21AM +0200, Juergen Gross wrote:
> > Maybe just make the vpin thing an option like:
> >
> > smp_call_on_cpu(int (*func)(void *), int phys_cpu);
> > Also; is something like the vpin thing possible on KVM? because if we're
> > going to expose it to generic code lik
On Mon, Apr 04, 2016 at 01:52:06PM +0200, Jan Kara wrote:
> Sounds like a good idea to me. I've also consulted this with Petr Mladek
> (added to CC) who is using printk_func per-cpu variable in his
> printk-from-NMI patches and he also doesn't see a problem with this.
There's a few printk() varian
On Mon, Apr 04, 2016 at 08:32:21AM -0700, Andy Lutomirski wrote:
> Adding locking would be easy enough, wouldn't it?
See patch in this thread..
> But do any platforms really boot a second CPU before switching to real
> printk?
I _only_ use early_printk() as printk() is a quagmire of fail :-)
On Tue, Apr 05, 2016 at 07:10:04AM +0200, Juergen Gross wrote:
> +int smp_call_on_cpu(unsigned int cpu, bool pin, int (*func)(void *), void
> *par)
Why .pin and not .phys? .pin does not (to me) reflect the
hypervisor/physical-cpu thing.
Also, as per smp_call_function_single() would it not be mor
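For readers skimming the thread, a usage sketch of the interface quoted above
(read_token() and the token variable are made-up placeholder names, not part
of the patch):

static int read_token(void *arg)
{
        /* Runs pinned to the requested CPU; the return value is passed back. */
        int *token = arg;

        *token = 42;    /* stand-in for the platform-specific work */
        return 0;
}

        ...
        int token, ret;

        ret = smp_call_on_cpu(0, true, read_token, &token);
        if (ret)
                pr_warn("smp_call_on_cpu failed: %d\n", ret);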
kernel add a service function for this purpose. This will enable
> the possibility to take special measures in virtualized environments
> like Xen, too.
>
> Signed-off-by: Juergen Gross
Thanks!
Acked-by: Peter Zijlstra (Intel)
We've unconditionally used the queued spinlock for many releases now.
It's time to remove the old ticket lock code.
Cc: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
---
arch/x86/Kconfig | 3 +-
arch/x86/include/asm/paravirt.h | 18 ---
arch/x86/includ
On Wed, May 18, 2016 at 03:13:44PM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, May 18, 2016 at 08:43:02PM +0200, Peter Zijlstra wrote:
> >
> > We've unconditionally used the queued spinlock for many releases now.
>
> Like since 4.2?
Yeah, that seems to be the right
On Thu, Apr 20, 2017 at 03:24:53PM +0200, Vitaly Kuznetsov wrote:
> In this patch I suggest we set __max_logical_packages based on the
> max_physical_pkg_id and total_cpus,
So my 4 socket 144 CPU system will then get max_physical_pkg_id=144,
instead of 4.
This wastes quite a bit of memory for the
On Thu, Apr 20, 2017 at 05:40:37PM +0200, Vitaly Kuznetsov wrote:
> > This is getting ludicrous. Xen is plain broken, and instead of fixing
> > it, you propose to somehow deal with its obviously crack induced
> > behaviour :-(
>
> Totally agree and I don't like the solution I propose (and that's w
On Wed, Aug 16, 2017 at 05:12:35PM +0200, Ingo Molnar wrote:
> Unfortunately mcmodel=large looks pretty heavy too AFAICS, at the machine
> instruction level.
>
> Function calls look like this:
>
> -mcmodel=medium:
>
>757: e8 98 ff ff ff callq 6f4
>
> -mcmodel=large
>
>7
On Mon, Aug 21, 2017 at 03:32:22PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 16, 2017 at 05:12:35PM +0200, Ingo Molnar wrote:
> > Unfortunately mcmodel=large looks pretty heavy too AFAICS, at the machine
> > instruction level.
> >
> > Function calls look like this
On Tue, Aug 15, 2017 at 07:20:38AM -0700, Thomas Garnier wrote:
> On Tue, Aug 15, 2017 at 12:56 AM, Ingo Molnar wrote:
> > Have you considered a kernel with -mcmodel=small (or medium) instead of
> > -fpie
> > -mcmodel=large? We can pick a random 2GB window in the (non-kernel)
> > canonical
> >
On Thu, Aug 24, 2017 at 11:22:58AM +0200, Vitaly Kuznetsov wrote:
> diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
> index c7797307fc2b..d43a7fcafee9 100644
> --- a/arch/x86/include/asm/tlb.h
> +++ b/arch/x86/include/asm/tlb.h
> @@ -15,4 +15,9 @@
>
> #include
>
> +stati
On Thu, Aug 24, 2017 at 05:27:21PM +0200, Vitaly Kuznetsov wrote:
> Do you think adding something like
>
> /*
> * While x86 architecture in general requires an IPI to perform TLB
> * shootdown, enablement code for several hypervisors overrides
> * .flush_tlb_others hook in pv_mmu_ops and implem
On Tue, Sep 05, 2017 at 03:24:43PM +0200, Juergen Gross wrote:
> diff --git a/arch/x86/include/asm/qspinlock.h
> b/arch/x86/include/asm/qspinlock.h
> index 48a706f641f2..fbd98896385c 100644
> --- a/arch/x86/include/asm/qspinlock.h
> +++ b/arch/x86/include/asm/qspinlock.h
> @@ -17,6 +17,25 @@ stati
On Tue, Sep 05, 2017 at 10:02:57AM -0400, Waiman Long wrote:
> On 09/05/2017 09:24 AM, Juergen Gross wrote:
> > +static inline bool native_virt_spin_lock(struct qspinlock *lock)
> > +{
> > + if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
> > + return false;
> > +
>
> I think you can tak
Guys, please trim email.
On Tue, Sep 05, 2017 at 10:31:46AM -0400, Waiman Long wrote:
> For clarification, I was actually asking if you consider just adding one
> more jump label to skip it for Xen/KVM instead of making
> virt_spin_lock() a pv-op.
I don't understand. What performance are you wor
On Wed, Sep 06, 2017 at 08:44:09AM -0400, Waiman Long wrote:
> On 09/06/2017 03:08 AM, Peter Zijlstra wrote:
> > Guys, please trim email.
> >
> > On Tue, Sep 05, 2017 at 10:31:46AM -0400, Waiman Long wrote:
> >> For clarification, I was actually asking if you cons
On Wed, Sep 06, 2017 at 11:49:49AM -0400, Waiman Long wrote:
> > #define virt_spin_lock virt_spin_lock
> > static inline bool virt_spin_lock(struct qspinlock *lock)
> > {
> > + if (!static_branch_likely(&virt_spin_lock_key))
> > + return false;
> > if (!static_cpu_has(X86_FEATURE
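For context, the shape this helper converged on is roughly the following
test-and-set fallback guarded by the static key (reproduced from memory as a
sketch, so details may differ from the final upstream code):

#define virt_spin_lock virt_spin_lock
static inline bool virt_spin_lock(struct qspinlock *lock)
{
        if (!static_branch_likely(&virt_spin_lock_key))
                return false;

        /*
         * On hypervisors without PARAVIRT_SPINLOCKS support we fall back to
         * a test-and-set spinlock, because fair locks have horrible lock
         * 'holder' preemption issues.
         */
        do {
                while (atomic_read(&lock->val) != 0)
                        cpu_relax();
        } while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0);

        return true;
}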
On Tue, Oct 10, 2017 at 05:14:08PM +0800, Dongli Zhang wrote:
> After guest live migration on xen, steal time in /proc/stat
> (cpustat[CPUTIME_STEAL]) might decrease because steal returned by
> paravirt_steal_clock() might be less than this_rq()->prev_steal_time.
So why not fix paravirt_steal_clock()?
On Tue, Oct 10, 2017 at 02:42:01PM +0200, Stanislaw Gruszka wrote:
> > > + u64 steal, steal_time;
> > > + s64 steal_delta;
> > > +
> > > + steal_time = paravirt_steal_clock(smp_processor_id());
> > > + steal = steal_delta = steal_time - this_rq()->prev_steal_time;
>
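For illustration, a sketch of the clamping approach discussed here (not the
exact patch; the variable and field names follow the quoted hunk):

        u64 steal;
        s64 steal_delta;

        steal = paravirt_steal_clock(smp_processor_id());
        steal_delta = steal - this_rq()->prev_steal_time;

        /* After live migration the reported steal time may jump backwards. */
        if (steal_delta < 0)
                steal_delta = 0;

        this_rq()->prev_steal_time += steal_delta;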
On Mon, Nov 13, 2017 at 06:06:02PM +0800, Quan Xu wrote:
> From: Yang Zhang
>
> Implement a generic idle poll which resembles the functionality
> found in arch/. Provide weak arch_cpu_idle_poll function which
> can be overridden by the architecture code if needed.
No, we want less of those magic
On Wed, Nov 15, 2017 at 11:03:08PM +0100, Thomas Gleixner wrote:
> If I understand the problem correctly then he wants to avoid the heavy
> lifting in tick_nohz_idle_enter() in the first place, but there is already
> an interesting quirk there which makes it exit early.
Sure. And there are people
On Thu, Aug 10, 2017 at 02:52:52PM +0200, Juergen Gross wrote:
> Xen's paravirt patch function xen_patch() does some special casing for
> irq_ops functions to apply relocations when those functions can be
> patched inline instead of calls.
>
> Unfortunately none of the special case function replac
On Thu, Aug 10, 2017 at 06:24:53PM +0200, Peter Zijlstra wrote:
> -ENTRY(xen_irq_enable_direct)
> - FRAME_BEGIN
> - /* Unmask events */
> - movb $0, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
> -
> - /*
> - * Preempt here doesn't matter becau
On Fri, Aug 11, 2017 at 11:23:10AM +0200, Vitaly Kuznetsov wrote:
> Peter Zijlstra writes:
>
> > On Thu, Aug 10, 2017 at 07:08:22PM +, Jork Loeser wrote:
> >
> >> > > Subject: Re: [tip:x86/platform] x86/hyper-v: Use hypercall for remote
> >>
On Fri, Aug 11, 2017 at 12:05:45PM +0100, Andrew Cooper wrote:
> >> Oh, I see your concern. Hyper-V, however, is not the first x86
> >> hypervisor trying to avoid IPIs on remote TLB flush, Xen does this
> >> too. Briefly looking at xen_flush_tlb_others() I don't see anything
> >> special, do we kno
On Fri, Aug 11, 2017 at 02:22:25PM +0200, Juergen Gross wrote:
> Wait - the TLB can be cleared at any time, as Andrew was pointing out.
> No cpu can rely on an address being accessible just because IF is being
> cleared. All that matters is the existing and valid page table entry.
>
> So clearing
On Fri, Aug 11, 2017 at 02:46:41PM +0200, Juergen Gross wrote:
> Aah, okay. Now I understand the problem. The TLB isn't the issue but the
> IPI is serving two purposes here: TLB flushing (which is allowed to
> happen at any time) and serialization regarding access to critical pages
> (which seems t
On Fri, Aug 11, 2017 at 03:07:29PM +0200, Juergen Gross wrote:
> On 11/08/17 14:54, Peter Zijlstra wrote:
> > On Fri, Aug 11, 2017 at 02:46:41PM +0200, Juergen Gross wrote:
> >> Aah, okay. Now I understand the problem. The TLB isn't the issue but the
> >> IPI i
On Wed, Sep 06, 2017 at 07:36:23PM +0200, Juergen Gross wrote:
> With virt_spin_lock() being guarded by a static key the bare metal case
> can be optimized by patching the call away completely. In case a kernel
> running as a guest it can decide whether to use paravirtualized
> spinlocks, the curren
On Wed, Feb 08, 2017 at 01:00:24PM -0500, Waiman Long wrote:
> It was found when running fio sequential write test with a XFS ramdisk
> on a 2-socket x86-64 system, the %CPU times as reported by perf were
> as follows:
>
> 71.27% 0.28% fio [k] down_write
> 70.99% 0.01% fio [k] call_rwsem_d
On Wed, Feb 08, 2017 at 01:00:25PM -0500, Waiman Long wrote:
> As the vcpu_is_preempted() call is pretty costly compared with other
> checks within mutex_spin_on_owner() and rwsem_spin_on_owner(), they
> are done at a reduced frequency of once every 256 iterations.
That's just disgusting.
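(For readers without the full patch: the pattern being objected to is roughly
the following, where owner_spinnable() stands in for whatever the real loop
condition is - a sketch, not the actual code.)

        int loop = 0;

        while (owner_spinnable(owner)) {
                /* Only poll the hypervisor once every 256 iterations. */
                if (!(++loop & 0xff) && vcpu_is_preempted(task_cpu(owner)))
                        break;
                cpu_relax();
        }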
On Fri, Feb 10, 2017 at 10:43:09AM -0500, Waiman Long wrote:
> It was found when running fio sequential write test with a XFS ramdisk
> on a VM running on a 2-socket x86-64 system, the %CPU times as reported
> by perf were as follows:
>
> 69.75% 0.59% fio [k] down_write
> 69.15% 0.01% fio
On Fri, Feb 10, 2017 at 12:00:43PM -0500, Waiman Long wrote:
> >> +asm(
> >> +".pushsection .text;"
> >> +".global __raw_callee_save___kvm_vcpu_is_preempted;"
> >> +".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
> >> +"__raw_callee_save___kvm_vcpu_is_preempted:"
> >> +FRAME_BEGIN
>
On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote:
> That way we'd end up with something like:
>
> asm("
> push %rdi;
> movslq %edi, %rdi;
> movq __per_cpu_offset(,%rdi,8), %rax;
> cmpb $0, %[offset](%rax);
> setne %al;
> pop %rdi;
> "
On Mon, Feb 13, 2017 at 03:12:45PM -0500, Waiman Long wrote:
> On 02/13/2017 02:42 PM, Waiman Long wrote:
> > On 02/13/2017 05:53 AM, Peter Zijlstra wrote:
> >> On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote:
> >>> That way we'd end up w
On Mon, Feb 13, 2017 at 12:06:44PM -0800, h...@zytor.com wrote:
> >Maybe:
> >
> >movsql %edi, %rax;
> >movq __per_cpu_offset(,%rax,8), %rax;
> >cmpb $0, %[offset](%rax);
> >setne %al;
> >
> >?
>
> We could kill the zero or sign extend by changing the calling
> interface to pass an unsigned long i
On Mon, Feb 13, 2017 at 05:24:36PM -0500, Waiman Long wrote:
> >> movsql %edi, %rax;
> >> movq __per_cpu_offset(,%rax,8), %rax;
> >> cmpb $0, %[offset](%rax);
> >> setne %al;
> I have thought of that too. However, the goal is to eliminate memory
> read/write from/to stack. Eliminating a register
On Mon, Feb 13, 2017 at 05:34:01PM -0500, Waiman Long wrote:
> It is the address of &steal_time that will exceed the 32-bit limit.
That seems extremely unlikely. That would mean we have more than 4G
worth of per-cpu variables declared in the kernel.
On Tue, Feb 14, 2017 at 09:46:17AM -0500, Waiman Long wrote:
> On 02/14/2017 04:39 AM, Peter Zijlstra wrote:
> > On Mon, Feb 13, 2017 at 05:34:01PM -0500, Waiman Long wrote:
> >> It is the address of &steal_time that will exceed the 32-bit limit.
> > That seems extreme
On Wed, Feb 15, 2017 at 04:37:49PM -0500, Waiman Long wrote:
> The cpu argument in the function prototype of vcpu_is_preempted()
> is changed from int to long. That makes it easier to provide a better
> optimized assembly version of that function.
>
> For Xen, vcpu_is_preempted(long) calls xen_vcp
On Wed, Feb 15, 2017 at 04:37:50PM -0500, Waiman Long wrote:
> +/*
> + * Hand-optimize version for x86-64 to avoid 8 64-bit register saving and
> + * restoring to/from the stack. It is assumed that the preempted value
> + * is at an offset of 16 from the beginning of the kvm_steal_time structure
>
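For context, a reconstruction of the hand-optimized helper being described (a
sketch from memory; the offset 16 is taken from the quoted comment and the
symbol names are as I recall them, not verified against the patch):

asm(
".pushsection .text;"
".global __raw_callee_save___kvm_vcpu_is_preempted;"
".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
"__raw_callee_save___kvm_vcpu_is_preempted:"
"movq   __per_cpu_offset(,%rdi,8), %rax;"       /* cpu (long) -> per-cpu base */
"cmpb   $0, steal_time+16(%rax);"               /* kvm_steal_time.preempted   */
"setne  %al;"
"ret;"
".popsection");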
On Thu, Feb 16, 2017 at 04:02:57PM -0500, Waiman Long wrote:
> On 02/16/2017 11:09 AM, Peter Zijlstra wrote:
> > On Wed, Feb 15, 2017 at 04:37:49PM -0500, Waiman Long wrote:
> >> The cpu argument in the function prototype of vcpu_is_preempted()
> >> is changed from
is will go through the KVM tree, if people want me to
take it through the tip tree, please let me know.
Acked-by: Peter Zijlstra (Intel)
On Wed, Nov 02, 2016 at 05:08:33AM -0400, Pan Xinhui wrote:
> diff --git a/arch/x86/include/asm/paravirt_types.h
> b/arch/x86/include/asm/paravirt_types.h
> index 0f400c0..38c3bb7 100644
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -310,6 +310,8
On Wed, Nov 16, 2016 at 12:19:09PM +0800, Pan Xinhui wrote:
> Hi, Peter.
> I think we can avoid a function call in a simpler way. How about below
>
> static inline bool vcpu_is_preempted(int cpu)
> {
> /* only set in pv case*/
> if (pv_lock_ops.vcpu_is_preempted)
>
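Completing the quoted sketch to show the intended shape (my reconstruction of
what Xinhui seems to be proposing, not the actual patch):

static inline bool vcpu_is_preempted(int cpu)
{
        /* Only set in the paravirt case; bare metal leaves this NULL. */
        if (pv_lock_ops.vcpu_is_preempted)
                return pv_lock_ops.vcpu_is_preempted(cpu);

        return false;
}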
On Wed, Nov 16, 2016 at 12:29:44PM +0100, Christian Borntraeger wrote:
> On 11/16/2016 11:23 AM, Peter Zijlstra wrote:
> > On Wed, Nov 16, 2016 at 12:19:09PM +0800, Pan Xinhui wrote:
> >> Hi, Peter.
> >>I think we can avoid a function call in a simpler way. How a
On Tue, Dec 06, 2016 at 01:46:37AM -0700, Jan Beulich wrote:
> > + asm volatile (
> > + "pushfl\n\t"
> > + "pushl %%cs\n\t"
> > + "pushl $1f\n\t"
> > + "iret\n\t"
> > + "1:"
> > + : "+r" (__sp) : : "cc", "memory");
>
> I don't think EFL
How can this be missing? Things compile fine now, right? So please
better explain why we do this change.
On Thu, Nov 05, 2015 at 05:30:01PM +, Stefano Stabellini wrote:
> On Thu, 5 Nov 2015, Peter Zijlstra wrote:
> > How can this be missing? Things compile fine now, right?
>
> Fair enough.
>
>
> > So please better explain why we do this change.
>
> asm/par
On Tue, Nov 10, 2015 at 11:27:33AM +, Stefano Stabellini wrote:
> On Mon, 9 Nov 2015, Peter Zijlstra wrote:
> > On Thu, Nov 05, 2015 at 05:30:01PM +, Stefano Stabellini wrote:
> > > On Thu, 5 Nov 2015, Peter Zijlstra wrote:
> > > > How can this be missing?
On Tue, Nov 10, 2015 at 11:57:49AM +, Stefano Stabellini wrote:
> __current_kernel_time64 returns a struct timespec64, without taking the
> xtime lock. Mirrors __current_kernel_time/current_kernel_time.
It always helps if you include a reason why you want a patch.
On Fri, Mar 11, 2016 at 12:59:30PM +0100, Juergen Gross wrote:
> +int call_sync_on_phys_cpu(unsigned cpu, int (*func)(void *), void *par)
> +{
> + cpumask_var_t old_mask;
> + int ret;
> +
> + if (cpu >= nr_cpu_ids)
> + return -EINVAL;
> +
> + if (!alloc_cpumask_var(&old_
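For context, the overall pattern of the quoted function appears to be "pin
the current task to the target CPU, run the callback, restore the old
affinity" - roughly as sketched below (a reconstruction, not the submitted
patch):

int call_sync_on_phys_cpu(unsigned int cpu, int (*func)(void *), void *par)
{
        cpumask_var_t old_mask;
        int ret;

        if (cpu >= nr_cpu_ids)
                return -EINVAL;

        if (!alloc_cpumask_var(&old_mask, GFP_KERNEL))
                return -ENOMEM;

        /* Remember the current affinity and pin ourselves to the target CPU. */
        cpumask_copy(old_mask, &current->cpus_allowed);
        ret = set_cpus_allowed_ptr(current, cpumask_of(cpu));
        if (ret)
                goto out;

        ret = func(par);

        set_cpus_allowed_ptr(current, old_mask);
out:
        free_cpumask_var(old_mask);
        return ret;
}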
On Fri, Mar 11, 2016 at 12:59:28PM +0100, Juergen Gross wrote:
> Some hardware (e.g. Dell Studio laptops) require special functions to
> be called on physical cpu 0 in order to avoid occasional hangs. When
> running as dom0 under Xen this could be achieved only via special boot
> parameters (vcpu p
On Fri, Mar 11, 2016 at 01:19:50PM +0100, Peter Zijlstra wrote:
> On Fri, Mar 11, 2016 at 12:59:30PM +0100, Juergen Gross wrote:
> > +int call_sync_on_phys_cpu(unsigned cpu, int (*func)(void *), void *par)
> > +{
> > + cpumask_var_t old_mask;
> > + int ret;
> >
On Fri, Mar 11, 2016 at 01:43:53PM +0100, Juergen Gross wrote:
> On 11/03/16 13:19, Peter Zijlstra wrote:
> > On Fri, Mar 11, 2016 at 12:59:30PM +0100, Juergen Gross wrote:
> >> +int call_sync_on_phys_cpu(unsigned cpu, int (*func)(void *), void *par)
> >> +{
>
On Fri, Mar 11, 2016 at 01:48:12PM +0100, Juergen Gross wrote:
> On 11/03/16 13:42, Peter Zijlstra wrote:
> > how about something like:
> >
> > struct xen_callback_struct {
> > struct work_struct work;
> > struct completion done;
int
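A hedged reconstruction of where that suggestion seems to be heading: wrap
the callback in a work item, queue it on a worker bound to the target CPU and
wait for completion (field names beyond those quoted are guesses):

struct xen_callback_struct {
        struct work_struct      work;
        struct completion       done;
        int                     (*func)(void *);
        void                    *data;
        int                     ret;
};

static void xen_callback_work(struct work_struct *work)
{
        struct xen_callback_struct *xcs =
                container_of(work, struct xen_callback_struct, work);

        xcs->ret = xcs->func(xcs->data);
        complete(&xcs->done);
}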
On Fri, Mar 11, 2016 at 01:15:04PM +, One Thousand Gnomes wrote:
> On Fri, 11 Mar 2016 13:25:14 +0100
> Peter Zijlstra wrote:
>
> > On Fri, Mar 11, 2016 at 12:59:28PM +0100, Juergen Gross wrote:
> > > Some hardware (e.g. Dell Studio laptops) require special functio
On Mon, Mar 14, 2016 at 11:10:16AM -0700, Andy Lutomirski wrote:
> A couple of the wrmsr users actually care about performance. These
> are the ones involved in context switching and, to a lesser extent, in
> switching in and out of guest mode.
Right, this very much includes a number of perf MSRs
On Sun, Dec 20, 2015 at 05:07:19PM +, Andrew Cooper wrote:
>
> Very much +1 for fixing this.
>
> Those names would be fine, but they do add yet another set of options in
> an already-complicated area.
>
> An alternative might be to have the regular smp_{w,r,}mb() not revert
> back to nops if
On Thu, Dec 31, 2015 at 09:06:30PM +0200, Michael S. Tsirkin wrote:
> On s390 read_barrier_depends, smp_read_barrier_depends
> smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the
> asm-generic variants exactly. Drop the local definitions and pull in
> asm-generic/barrier.h inst
On Thu, Dec 31, 2015 at 09:07:10PM +0200, Michael S. Tsirkin wrote:
> -#define smp_store_release(p, v)
> \
> -do { \
> - compiletime_assert_atomic_type(*p);
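For reference, the asm-generic fallback that the removed local definition
duplicates looks roughly like this (from memory, so treat it as a sketch of
asm-generic/barrier.h rather than a verbatim copy):

#define smp_store_release(p, v)                                         \
do {                                                                    \
        compiletime_assert_atomic_type(*p);                             \
        smp_mb();                                                       \
        WRITE_ONCE(*p, v);                                              \
} while (0)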
On Sun, Jan 03, 2016 at 11:12:44AM +0200, Michael S. Tsirkin wrote:
> On Sat, Jan 02, 2016 at 11:24:38AM +, Russell King - ARM Linux wrote:
> > My only concern is that it gives people an additional handle onto a
> > "new" set of barriers - just because they're prefixed with __*
> > unfortunate
On Thu, Dec 31, 2015 at 09:08:22PM +0200, Michael S. Tsirkin wrote:
> +#ifdef CONFIG_SMP
> +#define fence() metag_fence()
> +#else
> +#define fence() do { } while (0)
> #endif
James, it strikes me as odd that fence() is a no-op instead of a
barrier() for UP; can you verify/explain?
On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote:
> This defines __smp_xxx barriers for s390,
> for use by virtualization.
>
> Some smp_xxx barriers are removed as they are
> defined correctly by asm-generic/barriers.h
>
> Note: smp_mb, smp_rmb and smp_wmb are defined as full ba
On Mon, Jan 04, 2016 at 02:36:58PM +0100, Peter Zijlstra wrote:
> On Sun, Jan 03, 2016 at 11:12:44AM +0200, Michael S. Tsirkin wrote:
> > On Sat, Jan 02, 2016 at 11:24:38AM +, Russell King - ARM Linux wrote:
>
> > > My only concern is that it gives people an add
On Thu, Dec 31, 2015 at 09:09:47PM +0200, Michael S. Tsirkin wrote:
> At the moment, xchg on sh only supports 4 and 1 byte values, so using it
> from smp_store_mb means attempts to store a 2 byte value using this
> macro fail.
>
> And happens to be exactly what virtio drivers want to do.
>
> Chec
On Thu, Dec 31, 2015 at 09:10:01PM +0200, Michael S. Tsirkin wrote:
> drivers/xen/xenbus/xenbus_comms.c uses
> full memory barriers to communicate with the other side.
>
> For guests compiled with CONFIG_SMP, smp_wmb and smp_mb
> would be sufficient, so mb() and wmb() here are only needed if
> a n
On Mon, Jan 04, 2016 at 03:25:58PM +, James Hogan wrote:
> It is used along with the metag specific __global_lock1() (global
> voluntary lock between hw threads) whenever a write is performed, and by
> smp_mb/smp_rmb to try to catch other cases, but I've never been
> confident this fixes every
On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote:
> This statement doesn't fit MIPS barriers variations. Moreover, there is a
> reason to extend that even more specific, at least for smp_store_release and
> smp_load_acquire, look into
>
> http://patchwork.linux-mips.org/patch/1
On Tue, Jan 12, 2016 at 10:43:36AM +0200, Michael S. Tsirkin wrote:
> On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote:
> > On 01/10/2016 06:18 AM, Michael S. Tsirkin wrote:
> > >On mips dma_rmb, dma_wmb, smp_store_mb, read_barrier_depends,
> > >smp_read_barrier_depends, smp_store_re
On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote:
> 2) the changelog _completely_ fails to explain the sync 0x11 and sync
> 0x12 semantics nor does it provide a publicly accessible link to
> documentation that does.
Ralf pointed me at: https://imgtec.com/mips/architectur
On Tue, Jan 12, 2016 at 11:25:55AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote:
> > 2) the changelog _completely_ fails to explain the sync 0x11 and sync
> > 0x12 semantics nor does it provide a publicly accessible link to
> &g
> duplicate patch, and assume conflict will be resolved.
>
> I would really appreciate some feedback on arch bits (especially the x86
> bits),
> and acks for merging this through the vhost tree.
Thanks for doing this, looks good to me.
Acked-by: Peter Zijlstra (Intel)
.
>
> 3. I bothered the MIPS Arch team for a long time until I completely understood that MIPS
> SYNC_WMB, SYNC_MB, SYNC_RMB, SYNC_RELEASE and SYNC_ACQUIRE do exactly
> what is required in Documentation/memory-barriers.txt
Ha! And you think that document covers all the really fun details?
In part
On Wed, Jan 13, 2016 at 11:02:35AM -0800, Leonid Yegoshin wrote:
> I asked the HW team about it, but I have a question - does it have any relationship with
> replacing MIPS SYNC with lightweight SYNCs (SYNC_WMB etc)?
Of course. If you cannot explain the semantics of the primitives you
introduce, how can we ju
On Thu, Jan 14, 2016 at 11:42:02AM -0800, Leonid Yegoshin wrote:
> And the only point - please use the appropriate SYNC_* barriers instead of
> a heavy bold hammer. That stuff was designed explicitly to support the
> requirements of Documentation/memory-barriers.txt
That's madness. That document changes
On Thu, Jan 14, 2016 at 09:15:13PM +0100, Peter Zijlstra wrote:
> On Thu, Jan 14, 2016 at 11:42:02AM -0800, Leonid Yegoshin wrote:
> > And the only point - please use the appropriate SYNC_* barriers instead of
> > a heavy bold hammer. That stuff was designed explicitly to support the
>
On Thu, Jan 14, 2016 at 01:29:13PM -0800, Paul E. McKenney wrote:
> So smp_mb() provides transitivity, as do pairs of smp_store_release()
> and smp_read_acquire(),
But they provide different grades of transitivity, which is where all
the confusion lies.
smp_mb() is strongly/globally transitive,
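To make the distinction concrete, an illustrative three-CPU example (mine,
not from the thread), using the usual kernel primitives:

int x, y;
int r1, r2, r3;

void cpu0(void)
{
        WRITE_ONCE(x, 1);
}

void cpu1(void)
{
        r1 = smp_load_acquire(&x);      /* suppose this reads 1 */
        smp_store_release(&y, 1);
}

void cpu2(void)
{
        r2 = smp_load_acquire(&y);      /* suppose this reads 1 */
        r3 = READ_ONCE(x);
}

/*
 * Local transitivity: cpu2 participates in the release/acquire chain
 * (its acquire reads cpu1's release), so r3 == 1 is guaranteed.  A
 * bystander CPU doing only plain loads of x and y gets no such
 * guarantee; agreement for arbitrary observers needs smp_mb().
 */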
On Fri, Jan 15, 2016 at 09:55:54AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 14, 2016 at 01:29:13PM -0800, Paul E. McKenney wrote:
> > So smp_mb() provides transitivity, as do pairs of smp_store_release()
> > and smp_read_acquire(),
>
> But they provide different grades o
On Fri, Jan 15, 2016 at 09:46:12AM -0800, Paul E. McKenney wrote:
> On Fri, Jan 15, 2016 at 10:13:48AM +0100, Peter Zijlstra wrote:
> > And the stuff we're confused about is how best to express the difference
> > and guarantees of these two forms of transitivity and how exact
ase()/smp_load_acquire() chains is local. This
> commit therefore introduces the notion of local transitivity and
> gives an example.
>
> Reported-by: Peter Zijlstra
> Reported-by: Will Deacon
> Signed-off-by: Paul E. McKenney
I think it fails to
On Mon, Jan 25, 2016 at 10:12:11PM -0800, Paul E. McKenney wrote:
> On Mon, Jan 25, 2016 at 06:02:34PM +, Will Deacon wrote:
> > Thanks for having a go at this. I tried defining something axiomatically,
> > but got stuck pretty quickly. In my scheme, I used "data-directed
> > transitivity" ins
On Mon, Jan 25, 2016 at 10:03:22PM -0800, Paul E. McKenney wrote:
> On Mon, Jan 25, 2016 at 04:42:43PM +, Will Deacon wrote:
> > On Fri, Jan 15, 2016 at 01:58:53PM -0800, Paul E. McKenney wrote:
> > > On Fri, Jan 15, 2016 at 10:27:14PM +0100, Peter Zijlstra wrote:
> &g
On Thu, Jan 14, 2016 at 02:20:46PM -0800, Paul E. McKenney wrote:
> On Thu, Jan 14, 2016 at 01:24:34PM -0800, Leonid Yegoshin wrote:
> > On 01/14/2016 12:48 PM, Paul E. McKenney wrote:
> > >
> > >So SYNC_RMB is intended to implement smp_rmb(), correct?
> > Yes.
> > >
> > >You could use SYNC_ACQUIRE
On Tue, Jan 26, 2016 at 11:24:02AM +0100, Peter Zijlstra wrote:
> Yeah, this goes under the header: memory-barriers.txt is _NOT_ a
> specification (I seem to keep repeating this).
Do we want this?
---
Documentation/memory-barriers.txt | 17 +
1 file changed, 17 inse
On Wed, Jan 27, 2016 at 12:52:07AM +0800, Boqun Feng wrote:
> I recall that last time you and Linus came into a conclusion that even
> on Alpha, a barrier for read->write with data dependency is unnecessary:
>
> http://article.gmane.org/gmane.linux.kernel/2077661
>
> And in an earlier mail of tha
On Tue, Jan 26, 2016 at 02:33:40PM -0800, Linus Torvalds wrote:
> If it turns out that some architecture does actually need a barrier
> between a read and a dependent write, then that will mean that
>
> (a) we'll have to make up a _new_ barrier, because
> "smp_read_barrier_depends()" is not that
e insane to require it
when building new hardware.
Signed-off-by: Peter Zijlstra (Intel)
---
Documentation/memory-barriers.txt | 18 +-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/Documentation/memory-barriers.txt
b/Documentation/memory-barriers.txt
index a
On Tue, Jan 26, 2016 at 12:13:39PM -0800, Paul E. McKenney wrote:
> On Tue, Jan 26, 2016 at 11:19:27AM +0100, Peter Zijlstra wrote:
> > So isn't smp_mb__after_unlock_lock() exactly such a scenario? And would
> > not someone trying to implement RCsc locks using locally tr
Cc: Rik van Riel
Cc: Linus Torvalds
Cc: Raghavendra K T
Cc: Thomas Gleixner
Cc: Ingo Molnar
Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
Link:
http://lkml.kernel.org/r/1421784755-21945-2-git-send-email-waiman.l...@hp.com
---
include/asm-generic/qspi
current kvm code. We can do a single
entry because any nesting will wake the vcpu and cause the lower loop
to retry.
Signed-off-by: Peter Zijlstra (Intel)
---
include/asm-generic/qspinlock.h |3
kernel/locking/qspinlock.c | 69 +-
kernel/locking/qspinlock_paravirt.h
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Cc: "Paul E. McKenney"
Cc: Linus Torvalds
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Rik van Riel
Cc: Raghavendra K T
Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
Link:
http://lkml.kernel.org/r/14
From: Peter Zijlstra
Because the qspinlock needs to touch a second cacheline (the per-cpu
mcs_nodes[]); add a pending bit and allow a single in-word spinner
before we punt to the second cacheline.
It is possible to observe the pending bit without the locked bit when
the last owner has just
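For orientation, the lock-word layout this changelog is describing (as
documented in the qspinlock code for the NR_CPUS < 16K case; reproduced here
from memory):

/*
 *  0- 7: locked byte
 *     8: pending
 *  9-15: not used
 * 16-17: tail index
 * 18-31: tail cpu (+1)
 */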
Scott J Norton
Cc: Paolo Bonzini
Cc: Douglas Hatch
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Cc: "Paul E. McKenney"
Cc: Rik van Riel
Cc: Linus Torvalds
Cc: Raghavendra K T
Cc: Thomas Gleixner
Cc: Ingo Molnar
Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Inte
From: Peter Zijlstra
When we detect a hypervisor (!paravirt, see qspinlock paravirt support
patches), revert to a simple test-and-set lock to avoid the horrors
of queue preemption.
Cc: Ingo Molnar
Cc: David Vrabel
Cc: Oleg Nesterov
Cc: Scott J Norton
Cc: Paolo Bonzini
Cc: Douglas Hatch
Cc