Re: [PATCH] x86/pci: fix intel_mid_pci.c build error when ACPI is not enabled

2020-08-13 Thread Arjan van de Ven
/intel_mid_pci.c:303:2: error: implicit declaration of function ‘acpi_noirq_set’; did you mean ‘acpi_irq_get’? [-Werror=implicit-function-declaration] acpi_noirq_set(); Signed-off-by: Randy Dunlap Cc: Jacob Pan Cc: Len Brown Cc: Bjorn Helgaas Cc: Jesse Barnes Cc: Arjan van de Ven Cc: linux

Re: [PATCH v11 2/3] x86,/proc/pid/status: Add AVX-512 usage elapsed time

2019-02-20 Thread Arjan van de Ven
On 2/20/2019 7:35 AM, David Laight wrote: From: Sent: 16 February 2019 12:56 To: Li, Aubrey ... The above experiment just confirms what I said: The numbers are inaccurate and potentially misleading to a large extent when the AVX using task is not scheduled out for a longer time. Not only tha

Re: [PATCH] x86/speculation: Add document to describe Spectre and its mitigations

2019-01-14 Thread Arjan van de Ven
On 1/14/2019 5:06 AM, Jiri Kosina wrote: On Mon, 14 Jan 2019, Pavel Machek wrote: Frankly I'd not call it Meltdown, as it works only on data in the cache, so the defense is completely different. Seems more like a l1tf :-). Meltdown on x86 also seems to work only for data in L1D, but the pipel

Re: [PATCH] x86/speculation: Add document to describe Spectre and its mitigations

2018-12-31 Thread Arjan van de Ven
On 12/31/2018 8:22 AM, Ben Greear wrote: On 12/21/2018 05:17 PM, Tim Chen wrote: On 12/21/18 1:59 PM, Ben Greear wrote: On 12/21/18 9:44 AM, Tim Chen wrote: Thomas, Andi and I have made an update to our draft of the Spectre admin guide. We may be out on Christmas vacation for a while.  But

Re: WARNING in __rcu_read_unlock

2018-12-17 Thread Arjan van de Ven
On 12/17/2018 3:29 AM, Paul E. McKenney wrote: As does this sort of report on a line that contains simple integer arithmetic and boolean operations.;-) Any chance of a bisection? btw this looks like something caused a stack overflow and thus all the weirdness that then happens

Re: [PATCH v4 1/2] x86/fpu: track AVX-512 usage of tasks

2018-12-11 Thread Arjan van de Ven
On 12/11/2018 3:46 PM, Li, Aubrey wrote: On 2018/12/12 1:18, Dave Hansen wrote: On 12/10/18 4:24 PM, Aubrey Li wrote: The tracking turns on the usage flag at the next context switch of the task, but requires 3 consecutive context switches with no usage to clear it. This decay is required becaus
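
The decay rule quoted above (set the flag on use, clear it only after 3 consecutive context switches with no use) can be sketched in user-space C. This is an illustration only, not the patch's code: `struct avx512_tracker`, `avx512_on_context_switch()` and `AVX512_DECAY_SWITCHES` are hypothetical names, and only the count of 3 comes from the quoted text.

```c
#include <stdbool.h>

/* Hypothetical constant: the quoted text asks for 3 idle switches. */
#define AVX512_DECAY_SWITCHES 3

/* Hypothetical per-task state; the real patch keeps its own fields. */
struct avx512_tracker {
    bool in_use;                /* usage flag reported for the task */
    unsigned int idle_switches; /* consecutive switches with no usage */
};

/* Call once per context switch of the task. */
static void avx512_on_context_switch(struct avx512_tracker *t,
                                     bool used_since_last_switch)
{
    if (used_since_last_switch) {
        /* Any use sets the flag and restarts the decay count. */
        t->in_use = true;
        t->idle_switches = 0;
        return;
    }
    /* No use: clear the flag only after enough idle switches in a row. */
    if (t->in_use && ++t->idle_switches >= AVX512_DECAY_SWITCHES) {
        t->in_use = false;
        t->idle_switches = 0;
    }
}
```

With this shape, a task that touches AVX-512 on every other context switch never has its flag cleared, which is the point of the decay.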

Re: [patch V2 27/28] x86/speculation: Add seccomp Spectre v2 user space protection mode

2018-12-04 Thread Arjan van de Ven
On processors with enhanced IBRS support, we recommend setting IBRS to 1 and left set. Then why doesn't CPU with EIBRS support actually *default* to '1', with opt-out possibility for OS? (slightly longer answer) you can pretty much assume that on these CPUs, IBRS doesn't actually do anything

Re: [patch V2 27/28] x86/speculation: Add seccomp Spectre v2 user space protection mode

2018-12-04 Thread Arjan van de Ven
On processors with enhanced IBRS support, we recommend setting IBRS to 1 and left set. Then why doesn't CPU with EIBRS support actually *default* to '1', with opt-out possibility for OS? the BIOSes could indeed get this set up this way. do you want to trust the bios to get it right?

Re: [patch 01/24] x86/speculation: Update the TIF_SSBD comment

2018-11-21 Thread Arjan van de Ven
On 11/21/2018 2:53 PM, Borislav Petkov wrote: On Wed, Nov 21, 2018 at 11:48:41PM +0100, Thomas Gleixner wrote: Btw, I really do not like the app2app wording. I'd rather go for usr2usr, but that's kinda horrible as well. But then, all of this is horrible. Any better ideas? It needs to have "ta

Re: STIBP by default.. Revert?

2018-11-20 Thread Arjan van de Ven
On 11/20/2018 11:27 PM, Jiri Kosina wrote: On Mon, 19 Nov 2018, Arjan van de Ven wrote: In the documentation, AMD officially recommends against this by default, and I can speak for Intel that our position is that as well: this really must not be on by default. Thanks for pointing to the AMD

Re: Re: STIBP by default.. Revert?

2018-11-18 Thread Arjan van de Ven
On 11/19/2018 6:00 AM, Linus Torvalds wrote: On Sun, Nov 18, 2018 at 1:49 PM Jiri Kosina wrote: So why do that STIBP slow-down by default when the people who *really* care already disabled SMT? BTW for them, there is no impact at all. Right. People who really care about security and are a

Re: [RFC PATCH v1 2/2] proc: add /proc//thread_state

2018-11-12 Thread Arjan van de Ven
I'd prefer the kernel to do such clustering... I think that is a next step. Also, while the kernel can do this at a best effort basis, it cannot take into account things the kernel doesn't know about, like high priority job peak load etc.., things a job scheduler would know. Then again, a j

Re: [RFC] x86, tsc: Add kcmdline args for skipping tsc calibration sequences

2018-07-13 Thread Arjan van de Ven
On 7/13/2018 12:19 PM, patrickg wrote: This RFC patch is intended to allow bypass CPUID, MSR and QuickPIT calibration methods should the user desire to. The current ordering in ML x86 tsc is to calibrate in the order listed above; returning whenever there's a successful calibration. However t

Re: [RFC][PATCH] x86: proposed new ARCH_CAPABILITIES MSR bit for RSB-underflow

2018-02-16 Thread Arjan van de Ven
On 2/16/2018 11:43 AM, Linus Torvalds wrote: On Fri, Feb 16, 2018 at 11:38 AM, Linus Torvalds wrote: Of course, your patch still doesn't allow for "we claim to be skylake for various other independent reasons, but the RSB issue is fixed". .. maybe nobody ever has a reason to do that, though?

Re: [PATCH] platform/x86: intel_turbo_max_3: Remove restriction for HWP platforms

2018-02-14 Thread Arjan van de Ven
On 2/14/2018 11:29 AM, Andy Shevchenko wrote: On Mon, Feb 12, 2018 at 9:50 PM, Srinivas Pandruvada wrote: On systems supporting HWP (Hardware P-States) mode, we expected to enumerate core priority via ACPI-CPPC tables. Unfortunately deployment of TURBO 3.0 didn't use this method to show core pr

Re: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

2018-02-13 Thread Arjan van de Ven
So, any hints on what you think should be the correct fix here? the patch sure looks correct to me, it now has a nice table for CPU IDs including all of AMD (and soon hopefully the existing Intel ones that are not exposed to meltdown)

Re: [RFC,05/10] x86/speculation: Add basic IBRS support infrastructure

2018-01-31 Thread Arjan van de Ven
On 1/31/2018 2:15 AM, Thomas Gleixner wrote: Good luck with making all that work. on the Intel side we're checking what we can do that works and doesn't break things right now; hopefully we just end up with a bit in the arch capabilities MSR for "you should do RSB stuffing" and then the HV's c

Re: [PATCH] x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel

2018-01-30 Thread Arjan van de Ven
On 1/30/2018 5:11 AM, Borislav Petkov wrote: On Tue, Jan 30, 2018 at 01:57:21PM +0100, Thomas Gleixner wrote: So much for the theory. That's not going to work. If the boot cpu has the feature then the alternatives will have been applied. So even if the flag mismatch can be observed when a second

Re: [RFC,05/10] x86/speculation: Add basic IBRS support infrastructure

2018-01-30 Thread Arjan van de Ven
On 1/29/2018 7:32 PM, Linus Torvalds wrote: On Mon, Jan 29, 2018 at 5:32 PM, Arjan van de Ven wrote: the most simple solution is that we set the internal feature bit in Linux to turn on the "stuff the RSB" workaround if we're on a SKL *or* as a guest in a VM. That so

Re: [RFC,05/10] x86/speculation: Add basic IBRS support infrastructure

2018-01-29 Thread Arjan van de Ven
On 1/29/2018 4:23 PM, Linus Torvalds wrote: Why do you even _care_ about the guest, and how it acts wrt Skylake? What you should care about is not so much the guests (which do their own thing) but protect guests from each other, no? the most simple solution is that we set the internal feature

Re: [RFC,05/10] x86/speculation: Add basic IBRS support infrastructure

2018-01-29 Thread Arjan van de Ven
On 1/29/2018 12:42 PM, Eduardo Habkost wrote: The question is how the hypervisor could tell that to the guest. If Intel doesn't give us a CPUID bit that can be used to tell that retpolines are enough, maybe we should use a hypervisor CPUID bit for that? the objective is to have retpoline be saf

Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation

2018-01-26 Thread Arjan van de Ven
On 1/26/2018 10:11 AM, David Woodhouse wrote: I am *actively* ignoring Skylake right now. This is about per-SKL userspace even with SMEP, because we think Intel's document lies to us. if you think we lie to you then I think we're done with the conversation? Please tell us then what you deploy

Re: [PATCH v3 5/6] x86/pti: Do not enable PTI on processors which are not vulnerable to Meltdown

2018-01-26 Thread Arjan van de Ven
On 1/26/2018 7:27 AM, Dave Hansen wrote: On 01/26/2018 04:14 AM, Yves-Alexis Perez wrote: I know we'll still be able to manually enable PTI with a command line option, but it's also a hardening feature which has the nice side effect of emulating SMEP on CPU which don't support it (e.g the Atom b

Re: [RFC PATCH 1/2] x86/ibpb: Skip IBPB when we switch back to same user process

2018-01-25 Thread Arjan van de Ven
This patch tries to address the case when we do switch to init_mm and back. Do you still have objections to the approach in this patch to save the last active mm before switching to init_mm? how do you know the last active mm did not go away and started a new process with new content? (other t

Re: [RFC PATCH 1/2] x86/ibpb: Skip IBPB when we switch back to same user process

2018-01-25 Thread Arjan van de Ven
The idea is simple, do what we do for virt. Don't send IPI's to CPUs that don't need them (in virt's case because the vCPU isn't running, in our case because we're not in fact running a user process), but mark the CPU as having needed a TLB flush. I am really uncomfortable with that idea. You re

Re: [RFC PATCH 1/2] x86/ibpb: Skip IBPB when we switch back to same user process

2018-01-25 Thread Arjan van de Ven
On 1/25/2018 5:50 AM, Peter Zijlstra wrote: On Thu, Jan 25, 2018 at 05:21:30AM -0800, Arjan van de Ven wrote: This means that 'A -> idle -> A' should never pass through switch_mm to begin with. Please clarify how you think it does. the idle code does leave_mm() to avoid ha

Re: [RFC PATCH 1/2] x86/ibpb: Skip IBPB when we switch back to same user process

2018-01-25 Thread Arjan van de Ven
This means that 'A -> idle -> A' should never pass through switch_mm to begin with. Please clarify how you think it does. the idle code does leave_mm() to avoid having to IPI CPUs in deep sleep states for a tlb flush. (trust me, that you really want, sequentially IPI's a pile of cores in a d

Re: [RFC 05/10] x86/speculation: Add basic IBRS support infrastructure

2018-01-24 Thread Arjan van de Ven
On 1/24/2018 1:10 AM, Greg Kroah-Hartman wrote: That means the whitelist ends up basically empty right now. Should I add a command line parameter to override it? Otherwise we end up having to rebuild the kernel every time there's a microcode release which covers a new CPU SKU (which is why I ki

Re: [RFC 04/10] x86/mm: Only flush indirect branches when switching into non dumpable process

2018-01-21 Thread Arjan van de Ven
On 1/21/2018 8:21 AM, Ingo Molnar wrote: So if it's only about the scheduler barrier, what cycle cost are we talking about here? in the order of 5000 to 1 cycles. (depends a bit on the cpu generation but this range is a reasonable approximation) Because putting something like this

Re: kexec reboot fails with extra wbinvd introduced for AME SME

2018-01-17 Thread Arjan van de Ven
Does anybody have any other ideas? the only other weird case that comes to mind; what happens if there's a line dirty in the caches, but the memory is now mapped uncached. (Which could happen if kexec does muck with MTRRs, CR0 or other similar things in weird ways)... not sure what happens in

Re: kexec reboot fails with extra wbinvd introduced for AME SME

2018-01-17 Thread Arjan van de Ven
Does anybody have any other ideas? wbinvd is thankfully not common, but also not rare (MTRR setup and a bunch of other cases) and in some other operating systems it happens even more than on Linux.. it's generally not totally broken like this. I can only imagine a machine check case where a

Re: [tip:x86/pti] x86/retpoline: Fill RSB on context switch for affected CPUs

2018-01-15 Thread Arjan van de Ven
This would means that userspace would see return predictions based on the values the kernel 'stuffed' into the RSB to fill it. Potentially this leaks a kernel address to userspace. KASLR pretty much died in May this year to be honest with the KAISER paper (if not before then) also with KPTI

Re: [PATCH 3/8] kvm: vmx: pass MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD down to the guest

2018-01-10 Thread Arjan van de Ven
On 1/10/2018 5:20 AM, Paolo Bonzini wrote: * a simple specification that does "IBRS=1 blocks indirect branch prediction altogether" would actually satisfy the specification just as well, and it would be nice to know if that's what the processor actually does. it doesn't exactly, not for all. s

Re: [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU

2018-01-09 Thread Arjan van de Ven
On 1/9/2018 8:17 AM, Paolo Bonzini wrote: On 09/01/2018 16:19, Arjan van de Ven wrote: On 1/9/2018 7:00 AM, Liran Alon wrote: - ar...@linux.intel.com wrote: On 1/9/2018 3:41 AM, Paolo Bonzini wrote: The above ("IBRS simply disables the indirect branch predictor") was my

Re: [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU

2018-01-09 Thread Arjan van de Ven
I'm sorry I'm not familiar with your L0/L1/L2 terminology (maybe it's before coffee has had time to permeate the brain) These are standard terminology for guest levels: L0 == hypervisor that runs on bare-metal L1 == hypervisor that runs as L0 guest. L2 == software that runs as L1 guest. (We ar

Re: [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU

2018-01-09 Thread Arjan van de Ven
On 1/9/2018 7:00 AM, Liran Alon wrote: - ar...@linux.intel.com wrote: On 1/9/2018 3:41 AM, Paolo Bonzini wrote: The above ("IBRS simply disables the indirect branch predictor") was my take-away message from private discussion with Intel. My guess is that the vendors are just handwavi

Re: [PATCH 6/7] x86/svm: Set IBPB when running a different VCPU

2018-01-09 Thread Arjan van de Ven
On 1/9/2018 3:41 AM, Paolo Bonzini wrote: The above ("IBRS simply disables the indirect branch predictor") was my take-away message from private discussion with Intel. My guess is that the vendors are just handwaving a spec that doesn't match what they have implemented, because honestly a microc

Re: [PATCH 00/18] prevent bounds-check bypass via speculative execution

2018-01-06 Thread Arjan van de Ven
It sounds like Coverity was used to produce these patches? If so, is there a plan to have smatch (hey Dan) or other open source static analysis tool be possibly enhanced to do a similar type of work? I'd love for that to happen; the tricky part is being able to have even a sort of sensible conce

Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods

2017-07-20 Thread Arjan van de Ven
On 7/20/2017 1:11 AM, Thomas Gleixner wrote: On Thu, 20 Jul 2017, Li, Aubrey wrote: Don't get me wrong, even if a fast path is acceptable, we still need to figure out if the coming idle is short and when to switch. I'm just worried about if irq timings is not an ideal statistics, we have to skip

Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods

2017-07-20 Thread Arjan van de Ven
On 7/20/2017 5:50 AM, Paul E. McKenney wrote: To make this work reasonably, you would also need some way to check for the case where the prediction idle time is short but the real idle time is very long. so the case where you predict very short but is actually "indefinite", the real solution li

Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods

2017-07-18 Thread Arjan van de Ven
On 7/18/2017 9:36 AM, Peter Zijlstra wrote: On Tue, Jul 18, 2017 at 08:29:40AM -0700, Arjan van de Ven wrote: the most obvious way to do this (for me, maybe I'm naive) is to add another C state, lets call it "C1-lite" with its own thresholds and power levels etc, and just let

Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods

2017-07-18 Thread Arjan van de Ven
On 7/18/2017 8:20 AM, Paul E. McKenney wrote: 3.2) how to determine if the idle is short or long. My current proposal is to use a tunable value via /sys, while Peter prefers an auto-adjust mechanism. I didn't get the details of an auto-adjust mechanism yet the most obvious way to do this (for

Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods

2017-07-17 Thread Arjan van de Ven
On 7/17/2017 12:53 PM, Thomas Gleixner wrote: On Mon, 17 Jul 2017, Arjan van de Ven wrote: On 7/17/2017 12:23 PM, Peter Zijlstra wrote: Of course, this all assumes a Gaussian distribution to begin with, if we get bimodal (or worse) distributions we can still get it wrong. To fix that, we'd

Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods

2017-07-17 Thread Arjan van de Ven
On 7/17/2017 12:46 PM, Thomas Gleixner wrote: On Mon, 17 Jul 2017, Arjan van de Ven wrote: On 7/17/2017 12:23 PM, Peter Zijlstra wrote: Now I think the problem is that the current predictor goes for an average idle duration. This means that we, on average, get it wrong 50% of the time. For

Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods

2017-07-17 Thread Arjan van de Ven
On 7/17/2017 12:23 PM, Peter Zijlstra wrote: Of course, this all assumes a Gaussian distribution to begin with, if we get bimodal (or worse) distributions we can still get it wrong. To fix that, we'd need to do something better than what we currently have. fwiw some time ago I made a chart for

Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods

2017-07-17 Thread Arjan van de Ven
On 7/17/2017 12:23 PM, Peter Zijlstra wrote: Now I think the problem is that the current predictor goes for an average idle duration. This means that we, on average, get it wrong 50% of the time. For performance that's bad. that's not really what it does; it looks at next tick and then discount

Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods

2017-07-14 Thread Arjan van de Ven
On 7/14/2017 8:38 AM, Peter Zijlstra wrote: No, that's wrong. We want to fix the normal C state selection process to pick the right C state. The fast-idle criteria could cut off a whole bunch of available C states. We need to understand why our current C state pick is wrong and amend the algorit

Re: [x86/mm] e2a7dcce31: kernel_BUG_at_arch/x86/mm/tlb.c

2017-05-30 Thread Arjan van de Ven
On 5/27/2017 9:56 AM, Andy Lutomirski wrote: On Sat, May 27, 2017 at 9:00 AM, Andy Lutomirski wrote: On Sat, May 27, 2017 at 6:31 AM, kernel test robot wrote: FYI, we noticed the following commit: commit: e2a7dcce31f10bd7471b4245a6d1f2de344e7adf ("x86/mm: Rework lazy TLB to track the actua

Re: [patch 12/18] async: Adjust system_state checks

2017-05-14 Thread Arjan van de Ven
On 5/14/2017 11:27 AM, Thomas Gleixner wrote: looks good .. ack

Re: [PATCH] use get_random_long for the per-task stack canary

2017-05-04 Thread Arjan van de Ven
On 5/4/2017 6:32 AM, Daniel Micay wrote: The stack canary is an unsigned long and should be fully initialized to random data rather than only 32 bits of random data. that makes sense to me... ack

Re: [PATCH 5/6] notifiers: Use CHECK_DATA_CORRUPTION() on checks

2017-03-22 Thread Arjan van de Ven
On 3/22/2017 12:29 PM, Kees Cook wrote: When performing notifier function pointer sanity checking, allow CONFIG_BUG_ON_DATA_CORRUPTION to upgrade from a WARN to a BUG. Additionally enables CONFIG_DEBUG_NOTIFIERS when selecting CONFIG_BUG_ON_DATA_CORRUPTION. Any feedback on this change? By defa

Re: [PATCH 1/5] x86: Implement __WARN using UD0

2017-03-21 Thread Arjan van de Ven
On 3/21/2017 8:14 AM, Peter Zijlstra wrote: For self-documentation purposes, maybe use a define for the length of the ud0 instruction? #define TWO 2 ;-) some things make sense as a define, others don't (adding a comment, maybe)

Re: [PATCH] x86/dmi: Switch dmi_remap to ioremap_cache

2017-03-09 Thread Arjan van de Ven
On 3/9/2017 9:48 AM, Julian Brost wrote: I'm not entirely sure whether it's actually the kernel or HP to blame, but for now, hp-health is completely broken on 4.9 (probably on everything starting from 4.6), so this patch should be reviewed again. it looks like another kernel driver is doing a

Re: [PATCH] x86: Implement __WARN using UD0

2017-02-23 Thread Arjan van de Ven
On 2/23/2017 5:28 AM, Peter Zijlstra wrote: By using "UD0" for WARNs we remove the function call and its possible __FILE__ and __LINE__ immediate arguments from the instruction stream. Total image size will not change much, what we win in the instruction stream we'll lose because of the __bug_

Re: [RFC] x86/mm/KASLR: Remap GDTs at fixed location

2017-01-05 Thread Arjan van de Ven
On 1/5/2017 9:54 AM, Thomas Garnier wrote: That's my goal too. I started by doing a RO remap and got couple problems with hibernation. I can try again for the next iteration or delay it for another patch. I also need to look at KVM GDT usage, I am not familiar with it yet. don't we write to t

Re: [RFC] x86/mm/KASLR: Remap GDTs at fixed location

2017-01-05 Thread Arjan van de Ven
On 1/5/2017 8:40 AM, Thomas Garnier wrote: Well, it happens only when KASLR memory randomization is enabled. Do you think it should have a separate config option? no I would want it a runtime option "sgdt from ring 3" is going away with UMIP (and is already possibly gone in virtual machines

Re: [PATCH 1/3] cpuidle/menu: stop seeking deeper idle if current state is too deep

2017-01-05 Thread Arjan van de Ven
On 1/5/2017 7:43 AM, Rik van Riel wrote: On Thu, 2017-01-05 at 23:29 +0800, Alex Shi wrote: The obsolete commit 71abbbf85 want to introduce a dynamic cstates, but it was removed for long time. Just left the nonsense deeper cstate checking. Since all target_residency and exit_latency are going l

Re: [RFC] x86/mm/KASLR: Remap GDTs at fixed location

2017-01-05 Thread Arjan van de Ven
On 1/5/2017 12:11 AM, Ingo Molnar wrote: * Thomas Garnier wrote: Each processor holds a GDT in its per-cpu structure. The sgdt instruction gives the base address of the current GDT. This address can be used to bypass KASLR memory randomization. With another bug, an attacker could target other

Re: [PATCH] proc: Fix timerslack_ns CAP_SYS_NICE check when adjusting self

2016-08-10 Thread Arjan van de Ven
On 8/10/2016 12:03 PM, John Stultz wrote: I wasn't entirely sure. I didn't think PR_SET_TIMERSLACK has a security hook, but looking again I now see the top-level security_task_prctl() check, so maybe not skipping it in this case would be good? the easy fix would be to add back the ptrace check

Re: [PATCH 2/2] proc: Add /proc//timerslack_ns interface

2016-07-14 Thread Arjan van de Ven
On 7/14/2016 10:45 AM, Kees Cook wrote: On Thu, Jul 14, 2016 at 9:09 AM, John Stultz wrote: On Thu, Jul 14, 2016 at 5:48 AM, Serge E. Hallyn wrote: Quoting Kees Cook (keesc...@chromium.org): I think the original CAP_SYS_NICE should be fine. A malicious CAP_SYS_NICE process can do plenty of i

Re: [PATCH 2/2] proc: Add /proc//timerslack_ns interface

2016-07-14 Thread Arjan van de Ven
On 7/14/2016 5:48 AM, Serge E. Hallyn wrote: Can someone give a detailed explanation of what you could do with the new timerslack feature and compare it to what you can do with sys_nice? what you can do with the timerslack feature is add up to 4 seconds of extra time/delay on top of each selec

Re: [PATCH 2/2] proc: Add /proc//timerslack_ns interface

2016-07-13 Thread Arjan van de Ven
On 7/13/2016 8:39 PM, Kees Cook wrote: So I worry I'm a bit stuck here. For general systems, CAP_SYS_NICE is too low a level of privilege to set a tasks timerslack, but apparently CAP_SYS_PTRACE is too high a privilege for Android's system_server to require just to set a tasks timerslack value.

Re: [PATCH 1/8] x86: don't use module.h just for AUTHOR / LICENSE tags

2016-07-13 Thread Arjan van de Ven
contained at the top of the file in the comments. Cc: Arjan van de Ven Acked-by: Arjan van de Ven originally these were tested as modules, but they really shouldn't be modules in the normal kernel (and aren't per Kconfig)

Re: [patch V2 00/20] timer: Refactor the timer wheel

2016-06-26 Thread Arjan van de Ven
On Sun, Jun 26, 2016 at 12:00 PM, Pavel Machek wrote: > > Umm. I'm not sure if you should be designing kernel... > > I have alarm clock application. It does sleep(60) many times till its > time to wake me up. I'll be very angry if sleep(60) takes 65 seconds > without some very, very good reason.

Re: [patch V2 00/20] timer: Refactor the timer wheel

2016-06-20 Thread Arjan van de Ven
so is there really an issue? sounds like KISS principle can apply On Mon, Jun 20, 2016 at 7:46 AM, Thomas Gleixner wrote: > On Mon, 20 Jun 2016, Arjan van de Ven wrote: >> On Mon, Jun 20, 2016 at 6:56 AM, Thomas Gleixner wrote: >> > >> > 2) Cut off at 37hrs for

Re: [patch V2 00/20] timer: Refactor the timer wheel

2016-06-20 Thread Arjan van de Ven
On Mon, Jun 20, 2016 at 6:56 AM, Thomas Gleixner wrote: > > 2) Cut off at 37hrs for HZ=1000. We could make this configurable as a 1000HZ >option so datacenter folks can use this and people who don't care and want >better batching for power can use the 4ms thingy. if there really is one u

Re: initialize a mutex into locked state?

2016-06-17 Thread Arjan van de Ven
On 6/17/2016 7:54 AM, Oleg Drokin wrote: Yes, we can add all sorts of checks that have various impacts on code readability, we can also move code around that also have code readability and CPU impact. But in my discussion with Arjan he said this is a new use case that was not met before and s

Re: [patch V2 00/20] timer: Refactor the timer wheel

2016-06-17 Thread Arjan van de Ven
>To achieve this capacity with HZ=1000 without increasing the storage size >by another level, we reduced the granularity of the first wheel level from >1ms to 4ms. According to our data, there is no user which relies on that >1ms granularity and 99% of those timers are canceled befo

Re: [patch 13/20] timer: Switch to a non cascading wheel

2016-06-16 Thread Arjan van de Ven
ixner wrote: > On Wed, 15 Jun 2016, Thomas Gleixner wrote: >> On Wed, 15 Jun 2016, Arjan van de Ven wrote: >> > what would 1 more timer wheel do? >> >> Waste storage space and make the collection of expired timers more expensive. >> >> The selection

Re: [patch 13/20] timer: Switch to a non cascading wheel

2016-06-15 Thread Arjan van de Ven
what would 1 more timer wheel do? On Wed, Jun 15, 2016 at 7:53 AM, Thomas Gleixner wrote: > On Tue, 14 Jun 2016, Eric Dumazet wrote: >> Original TCP RFCs tell timeout is infinite ;) >> >> Practically, conntrack has a 5 days timeout, but I really doubt anyone >> expects an idle TCP flow to stay 'a

Re: [patch 13/20] timer: Switch to a non cascading wheel

2016-06-14 Thread Arjan van de Ven
evaluating a 120 hours timer every 37 hours to see if it should fire... not too horrid. On Tue, Jun 14, 2016 at 9:28 AM, Thomas Gleixner wrote: > On Tue, 14 Jun 2016, Ingo Molnar wrote: >> * Thomas Gleixner wrote: >> > On Mon, 13 Jun 2016, Peter Zijlstra wrote: >> > > On Mon, Jun 13, 2016 at 08:4
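
As a rough illustration of the arithmetic in this reply, the sketch below counts how often a timer longer than the wheel's capacity would have to be looked at again before its remaining time fits. It assumes the simplest possible scheme (requeue at the capacity limit, subtract, repeat); the snippet does not show how the patch series actually handles such timers, and `requeues_before_fit()` is a hypothetical helper.

```c
/*
 * Count how many times a timer that exceeds the wheel capacity is
 * re-evaluated before the remaining time fits in the wheel.
 * Hypothetical helper; assumes requeueing at the capacity limit.
 */
static unsigned int requeues_before_fit(unsigned long timeout_hours,
                                        unsigned long capacity_hours)
{
    unsigned int requeues = 0;

    while (timeout_hours > capacity_hours) {
        timeout_hours -= capacity_hours; /* time left after this wakeup */
        requeues++;
    }
    return requeues;
}
```

Under that assumption, `requeues_before_fit(120, 37)` returns 3: the timer is examined at 37, 74 and 111 hours, and the last 9 hours fit in the wheel.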

Re: [patch 04/20] cpufreq/powernv: Initialize timer as pinned

2016-06-13 Thread Arjan van de Ven
On Mon, Jun 13, 2016 at 1:40 AM, Thomas Gleixner wrote: > mod_timer(&gpstates->timer, jiffies + msecs_to_jiffies(timer_interval)); are you sure this is right? the others did not get replaced by mod_timer().. (and this is more evidence that a relative API in msecs is what drivers really want)

Re: [patch 06/20] drivers/tty/metag_da: Initialize timer as pinned

2016-06-13 Thread Arjan van de Ven
I know it's not related to this patch, but it'd be nice to, as you're changing the api name anyway, make a mod_pinned_relative() so that more direct users of jiffies can go away... or even better, mod_pinned_relative_ms() so that these drivers also do not need to care about HZ. On Mon, Jun 13, 201
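
The relative, millisecond-based interface suggested here can be sketched in plain user-space C. Nothing below is kernel code: `ms_to_ticks()` and `relative_expiry_ticks()` are hypothetical stand-ins, the tick rate is passed as a parameter instead of `HZ`, and the round-up behaviour is an assumption chosen so that a timer never fires early.

```c
/*
 * Convert a delay in milliseconds to timer ticks, rounding up so the
 * delay is never shortened. Hypothetical stand-in for a kernel helper.
 * Uses a 64-bit intermediate to avoid overflow for ordinary inputs.
 */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
    unsigned long long scaled = (unsigned long long)ms * hz;

    return (unsigned long)((scaled + 999ULL) / 1000ULL);
}

/*
 * Compute an absolute expiry from "now" plus a relative delay in ms,
 * which is the calculation a relative API would hide from drivers.
 */
static unsigned long relative_expiry_ticks(unsigned long now_ticks,
                                           unsigned long delay_ms,
                                           unsigned long hz)
{
    return now_ticks + ms_to_ticks(delay_ms, hz);
}
```

A driver written against such a helper would pass only the delay in milliseconds and would no longer need to mention jiffies or the tick rate itself.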

Re: S3 resume regression [1cf4f629d9d2 ("cpu/hotplug: Move online calls to hotplugged cpu")]

2016-05-11 Thread Arjan van de Ven
Oh, and this was with acpi_idle. This machine already failed to resume from S3 with intel_idle since forever, as detailed in https://bugzilla.kernel.org/show_bug.cgi?id=107151 but acpi_idle worked fine until now. can you disable (in sysfs) all C states other than C0/C1 and see if that makes i

Re: S3 resume regression [1cf4f629d9d2 ("cpu/hotplug: Move online calls to hotplugged cpu")]

2016-05-11 Thread Arjan van de Ven
On 5/11/2016 3:19 AM, Ville Syrjälä wrote: Oh, and this was with acpi_idle. This machine already failed to resume from S3 with intel_idle since forever, as detailed in https://bugzilla.kernel.org/show_bug.cgi?id=107151 but acpi_idle worked fine until now. this is the important clue part afaics

Re: [PATCH v5 3/9] x86/head: Move early exception panic code into early_fixup_exception

2016-04-04 Thread Arjan van de Ven
On 4/4/2016 8:32 AM, Andy Lutomirski wrote: Adding locking would be easy enough, wouldn't it? But do any platforms really boot a second CPU before switching to real printk? Given that I see all the smpboot stuff in dmesg, I guess real printk happens first. I admit I haven't actually checked.

Re: [PATCH] x86: Enable full randomization on i386 and X86_32.

2016-03-10 Thread Arjan van de Ven
Arjan, or other folks, can you remember why x86_32 disabled mmap randomization here? There doesn't seem to be a good reason for it that I see. for unlimited stack it got really messy with threaded apps. anyway, I don't mind seeing if this will indeed work, with time running out where 32 bit is

Re: [PATCH v2] arm64: add alignment fault hanling

2016-02-16 Thread Arjan van de Ven
On 2/16/2016 10:50 AM, Linus Torvalds wrote: On Tue, Feb 16, 2016 at 9:04 AM, Will Deacon wrote: [replying to self and adding some x86 people] Background: Euntaik reports a problem where userspace has ended up with a memory page mapped adjacent to an MMIO page (e.g. from /dev/mem or a PCI memo

Re: [PATCH] prctl: Add PR_SET_TIMERSLACK_PID for setting timer slack of an arbitrary thread.

2016-02-05 Thread Arjan van de Ven
and most of the RT guys would only tolerate a little bit of it is there any real/practical use of going longer than 4 seconds? if there is then yeah fixing it makes sense. if it's just theoretical... shrug... 32 bit systems have a bunch of other limits/differences as well. So I'd think it would b
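
A quick check of where a limit of about 4 seconds could come from, assuming (the snippet does not say so explicitly) that the slack is held as a 32-bit count of nanoseconds. `max_slack_seconds_u32()` is a hypothetical helper for the arithmetic only.

```c
#include <stdint.h>

/*
 * Largest slack, in seconds, representable in a 32-bit nanosecond
 * counter. Assumption: the slack value is a 32-bit nanosecond count.
 */
static double max_slack_seconds_u32(void)
{
    return (double)UINT32_MAX / 1e9;
}
```

Under that assumption the ceiling is a little under 4.3 seconds, which matches the "4 seconds" mentioned in the thread.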

Re: [PATCH] prctl: Add PR_SET_TIMERSLACK_PID for setting timer slack of an arbitrary thread.

2016-02-05 Thread Arjan van de Ven
On 2/5/2016 4:51 PM, John Stultz wrote: On Fri, Feb 5, 2016 at 2:35 PM, John Stultz wrote: On Fri, Feb 5, 2016 at 12:50 PM, Andrew Morton wrote: On Fri, 5 Feb 2016 12:44:04 -0800 Kees Cook wrote: Could this be exposed as a writable /proc entry instead? Like the oom_* stuff? /proc//timer_s

Re: [RFC][PATCH v2] prctl: Add PR_SET_TIMERSLACK_PID for setting timer slack of an arbitrary thread.

2016-01-26 Thread Arjan van de Ven
arbitrary apps do not change the timer slack for other apps. Acked-by: Arjan van de Ven only slight concern is the locking around the value of the field in the task struct, but nobody does read-modify-write on it, so they'll get either the new or the old version, which should be ok. (until now

Re: 4.4-rc5: ugly warn on: 5 W+X pages found

2015-12-15 Thread Arjan van de Ven
On 12/14/2015 11:56 PM, Pavel Machek wrote: On Mon 2015-12-14 13:24:08, Arjan van de Ven wrote: That's weird. The only API to do that seems to be manually setting kmap_prot to _PAGE_KERNEL_EXEC, and nothing does that. (Why is kmap_prot a variable on x86 at all? It has exactly one w

Re: 4.4-rc5: ugly warn on: 5 W+X pages found

2015-12-14 Thread Arjan van de Ven
That's weird. The only API to do that seems to be manually setting kmap_prot to _PAGE_KERNEL_EXEC, and nothing does that. (Why is kmap_prot a variable on x86 at all? It has exactly one writer, and that's the code that initializes it in the first place. Shouldn't we #define kmap_prot _PAGE_KE

Re: [PATCH 3/4] sched: introduce synchronized idle injection

2015-11-18 Thread Arjan van de Ven
On 11/18/2015 7:44 AM, Morten Rasmussen wrote: I would not necessarily want to punish all cpus system-wide if we have local overheating in one corner. I would rather have it apply to only the overheating socket in a multi-socket machine and only the big cores in a big.LITTLE system. most of th

Re: [PATCH 3/4] sched: introduce synchronized idle injection

2015-11-18 Thread Arjan van de Ven
On 11/18/2015 12:36 AM, Ingo Molnar wrote: What will such throttling do to latencies, as observed by user-space tasks? What's the typical expected frequency of the throttling frequency that you are targeting? for this to meaningfully reduce power consumption, deep system power states need t

Re: [PATCH 2/4] timer: relax tick stop in idle entry

2015-11-16 Thread Arjan van de Ven
On 11/16/2015 6:53 PM, Paul E. McKenney wrote: Fair point. When in the five-jiffy throttling state, what can wake up a CPU? In an earlier version of this proposal, the answer was "nothing", but maybe that has changed. device interrupts are likely to wake the cpus. -- To unsubscribe from this

Re: [PATCH 2/4] timer: relax tick stop in idle entry

2015-11-16 Thread Arjan van de Ven
On 11/16/2015 3:28 PM, Paul E. McKenney wrote: Is this mostly a special-purpose embedded thing, or do you expect distros to be enabling this? If the former, I suggest CONFIG_RCU_NOCB_CPU_ALL, but if distros are doing this for general-purpose workloads, I instead suggest CONFIG_RCU_FAST_NO_HZ.

Re: [RFC PATCH 3/3] sched: introduce synchronized idle injection

2015-11-05 Thread Arjan van de Ven
On 11/5/2015 7:32 AM, Jacob Pan wrote: On Thu, 5 Nov 2015 15:33:32 +0100 Peter Zijlstra wrote: On Thu, Nov 05, 2015 at 06:22:58AM -0800, Arjan van de Ven wrote: On 11/5/2015 2:09 AM, Peter Zijlstra wrote: I can see such a scheme having a fairly big impact on latency, esp. with forced

Re: [RFC PATCH 3/3] sched: introduce synchronized idle injection

2015-11-05 Thread Arjan van de Ven
On 11/5/2015 6:33 AM, Peter Zijlstra wrote: On Thu, Nov 05, 2015 at 06:22:58AM -0800, Arjan van de Ven wrote: On 11/5/2015 2:09 AM, Peter Zijlstra wrote: I can see such a scheme having a fairly big impact on latency, esp. with forced idleness such as this. That's not going to be popula

Re: [RFC PATCH 3/3] sched: introduce synchronized idle injection

2015-11-05 Thread Arjan van de Ven
On 11/5/2015 2:09 AM, Peter Zijlstra wrote: I can see such a scheme having a fairly big impact on latency, esp. with forced idleness such as this. That's not going to be popular for many workloads. idle injection is a last ditch effort in thermal management, before this gets used the hardware

Re: [PATCH 3/3] cpuidle,menu: smooth out measured_us calculation

2015-11-04 Thread Arjan van de Ven
measured_us, will reduce the error. there is no perfect answer for this issue; but at least this makes the situation a lot better, so Acked-by: Arjan van de Ven

Re: [PATCH 1/3] cpuidle,x86: increase forced cut-off for polling to 20us

2015-11-04 Thread Arjan van de Ven
Acked-by: Arjan van de Ven

Re: [PATCH 2/3] cpuidle,menu: use interactivity_req to disable polling

2015-11-04 Thread Arjan van de Ven
performance and power on those CPUs if we are expecting a very low wakeup latency. Disable polling based on the estimated interactivity requirement, not on the time to the next timer interrupt. good catch! Acked-by: Arjan van de Ven

Re: [tip:x86/mm] x86/mm: Warn on W^X mappings

2015-10-08 Thread Arjan van de Ven
On 10/8/2015 7:57 AM, Borislav Petkov wrote: + pr_info("x86/mm: Checked W+X mappings: passed, no W+X pages found.\n"); Do we really want to issue anything here in the success case? IMO, we should be quiet if the check passes and only scream when something's wrong... I would like

Re: [tip:x86/mm] x86/mm: Warn on W^X mappings

2015-10-06 Thread Arjan van de Ven
enabled without exposing the debugfs interface. Switch EFI_PGT_DUMP to using X86_PTDUMP_CORE so that it also does not require enabling the debugfs interface. I like it, so Acked-by: Arjan van de Ven I also have/had an old userland script to do similar checks but using the debugfs interface

Re: [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

2015-09-21 Thread Arjan van de Ven
On 9/21/2015 9:36 AM, Linus Torvalds wrote: On Mon, Sep 21, 2015 at 1:46 AM, Ingo Molnar wrote: Linus, what's your preference? So quite frankly, is there any reason we don't just implement native_read_msr() as just unsigned long long native_read_msr(unsigned int msr) { int er

Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops

2015-09-17 Thread Arjan van de Ven
On 9/17/2015 8:29 AM, Paolo Bonzini wrote: On 17/09/2015 17:27, Arjan van de Ven wrote: ( We should double check that rdmsr()/wrmsr() results are never left uninitialized, but are set to zero or so, for cases where the return code is not checked. ) It sure looks like

Re: [PATCH 0/3] x86/paravirt: Fix baremetal paravirt MSR ops

2015-09-17 Thread Arjan van de Ven
( We should double check that rdmsr()/wrmsr() results are never left uninitialized, but are set to zero or so, for cases where the return code is not checked. ) It sure looks like native_read_msr_safe doesn't clear the output if the rdmsr fails. I'd suggest to return some poison not j
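
The concern in this message can be illustrated with a small userspace sketch. Everything in it is hypothetical: read_reg_safe, fake_hw_read and READ_POISON are stand-ins invented for illustration, not the kernel's native_read_msr_safe(), and the poison constant is an arbitrary choice. It shows only the pattern of always writing the output, so that a caller who ignores the return code never consumes an uninitialized value.

```c
#include <errno.h>
#include <stdint.h>

/* Arbitrary, easily recognised poison value (an assumption for this sketch). */
#define READ_POISON 0xdeadbeefdeadbeefULL

/* Stand-in for a hardware access that can fail: in this toy model only
 * register 0x10 exists; any other register number reports an error and
 * leaves *val untouched, as a faulting read would. */
static int fake_hw_read(uint32_t reg, uint64_t *val)
{
    if (reg != 0x10)
        return -EIO;
    *val = 42;
    return 0;
}

/* "Safe" wrapper: returns the error code, and always defines *out.
 * On failure the caller gets READ_POISON instead of stack garbage. */
static int read_reg_safe(uint32_t reg, uint64_t *out)
{
    uint64_t tmp;
    int err = fake_hw_read(reg, &tmp);

    *out = err ? READ_POISON : tmp;
    return err;
}
```

Whether zero or a poison pattern is the better failure value is the design question raised in the thread. The sketch uses a poison pattern purely so that an unchecked failure is easy to spot.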

Re: V4.0.x fails to create /dev/rtc0 on Winbook TW100 when CONFIG_PINCTRL_BAYTRAIL is set, bisected to commit 7486341

2015-07-11 Thread Arjan van de Ven
On 7/11/2015 11:26 AM, Porteus Kiosk wrote: Hello Arjan, We need it for setting up the time in the hardware clock through the 'hwclock' command. Thank you. hmm thinking about it after coffee... there is an RTC that can be exposed to userspace. hrmpf. Wonder why it's not there for you

Re: V4.0.x fails to create /dev/rtc0 on Winbook TW100 when CONFIG_PINCTRL_BAYTRAIL is set, bisected to commit 7486341

2015-07-11 Thread Arjan van de Ven
On 7/11/2015 11:21 AM, Arjan van de Ven wrote: On 7/11/2015 10:59 AM, Larry Finger wrote: On a Winbook TW100 BayTrail tablet, kernel 4.0 and later do not create /dev/rtc0 when CONFIG_PINCTRL_BAYTRAIL is set in the configuration. Removing this option from the config creates a real-time clock
