RE: [BUG] 2.6.23-rc3-mm1 Kernel panic - not syncing: DMA: Memory would be corrupted

2007-08-22 Thread Luck, Tony
> Attached the boot log and config file. Kamelesh, I don't see anything obvious in the boot_log. I used your "dotconfig" file to build a 2.6.23-rc3-mm1 kernel and booted it on my test system ... it worked just fine (except that for some reason the network did not come up :-( ) I tried to compar

RE: [BUG] 2.6.23-rc3-mm1 Kernel panic - not syncing: DMA: Memory would be corrupted

2007-08-22 Thread Luck, Tony
[ 20.903201] [] swiotlb_full+0x50/0x120 [ 20.903202] sp=e0014322fac0 bsp=e00143229120 [ 20.916902] [] swiotlb_map_single+0x120/0x1c0 [ 20.916904] sp=e0014322fac0 bsp=e001432290d8 [ 20.931215] [] swiotlb_alloc_coherent+0x150/0x240 [ 20.931217] sp=e0014322fac0 bsp=e00143229090

RE: [BUG] 2.6.23-rc3-mm1 Kernel panic - not syncing: DMA: Memory would be corrupted

2007-08-22 Thread Luck, Tony
> The more ioc's you have, the more space you will use. Default SW IOTLB allocation is 64MB ... how much should we see used per ioc? Kamelesh: You could try increasing the amount of sw iotlb space available by booting with a swiotlb=131072 argument (argument value is the number of 2K slabs to all
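As a quick check of the arithmetic in the message above, here is a minimal userspace sketch. It assumes only the 2K slab size stated in the message; the helper names `swiotlb_bytes` and `swiotlb_slabs_for_mb` are made up for this illustration and are not kernel functions.

```c
#include <stdint.h>

/* Slab size as stated in the message above: each slab is 2K. */
#define SLAB_BYTES 2048ULL

/* Hypothetical helper (not a kernel API): bytes reserved for a
 * given swiotlb=<nslabs> boot argument. */
static uint64_t swiotlb_bytes(uint64_t nslabs)
{
	return nslabs * SLAB_BYTES;
}

/* Hypothetical inverse: number of 2K slabs that make up a pool
 * of the given size in megabytes. */
static uint64_t swiotlb_slabs_for_mb(uint64_t mb)
{
	return (mb << 20) / SLAB_BYTES;
}
```

Under that assumption, swiotlb=131072 comes to 256MB, four times the 64MB default, which itself corresponds to 32768 slabs.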

RE: [BUG] 2.6.23-rc3-mm1 Kernel panic - not syncing: DMA: Memory would be corrupted

2007-08-22 Thread Luck, Tony
> Hmm. Must be something else going on then. It should be less than 1MB > per ioc plus whatever is used for streaming I/O. > > | mptbase: Initiating ioc2 bringup | GSI 16 > (level, low) -> CPU 2 (0xc418) vector 50 > | ioc2: LSI53C1030 C0: Capabilities={Initiator}

RE: [BUG] 2.6.23-rc3-mm1 Kernel panic - not syncing: DMA: Memory would be corrupted

2007-08-23 Thread Luck, Tony
> __get_free_pages() of swiotlb_alloc_coherent() fails in rc3-mm1. > But, it doesn't fail on rc2-mm2, and kernel can boot up. That looks to be part of the problem here ... failing an order=3 allocation during boot on a system that just a few lines earlier in the boot log reported "Memory: 37474000

RE: "double" hpet clocksource && hard freeze [bisected]

2007-08-23 Thread Luck, Tony
> I have a double "hpet" entry in "available_clocksource": > $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource > tsc hpet hpet acpi_pm jiffies Oops. It seems that drivers/char/hpet.c and arch/x86_64/kernel/hpet.c both register a clocksource named "hpet". P

RE: "double" hpet clocksource && hard freeze [bisected]

2007-08-24 Thread Luck, Tony
> > Prevent duplicate names being registered with clocksource. This also > > eliminates the duplication of hpet clock registration when the arch > > uses the hpet timer and the hpet driver does too. The patch was > > compile and link tested. > > This one works too. It is good to avoid registering

RE: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()

2007-08-24 Thread Luck, Tony
>> static inline void wait_for_init_deassert(atomic_t *deassert) >> { >> -while (!atomic_read(deassert)); >> +while (!atomic_read(deassert)) >> +cpu_relax(); >> return; >> } > > For less-than-brilliant people like me, it's totally non-obvious that > cpu_relax() is needed

RE: [Tech-board-discuss] Re: [Ksummit-2007-discuss] Re: LinuxFoundation Technical Advisory Board Elections

2007-08-24 Thread Luck, Tony
Have there been any more nominations? At the moment we are sitting with three people standing for five positions, so the whole concept of who should be allowed to vote in the election seems to be moot. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a messa

RE: [PATCH] ACPI, EINJ: Enhance error injection tolerance level

2014-12-08 Thread Luck, Tony
> If it solves problem related to ACPI, you should not use in this file: > > + if ((pci_probe & PCI_PROBE_MMCONF) == 0) > > + return 0; > This is very x86 specific. We are making a lot of effort to make > MMCONFIG platform independent now. So you'd like to see the changes in this patc

RE: [PATCH] x86: mce: Avoid timer double-add during CMCI interrupt storms.

2014-12-09 Thread Luck, Tony
> Right, so this polling thing once again proves its fragility to me - we > have had problems with it in the past so maybe we should think about > replacing it with something simpler and much more robust instead of > this flaky dynamically adjustable polling thing. Dynamic intervals for pollin

[PATCHv2] ACPI, EINJ: Enhance error injection tolerance level

2014-12-10 Thread Luck, Tony
From: "Chen, Gong" Some BIOSes utilize PCI MMCFG space read/write operations to trigger specific errors. EINJ will report errors as below when hitting such cases: APEI: Can not request [mem 0x83f990a0-0x83f990a3] for APEI EINJ Trigger registers It is because on x86 platform ACPI based PCI MMCFG

RE: [PATCH] x86, mce: use mce_usable_address() for UCNA memory error recovery

2015-01-05 Thread Luck, Tony
> The IA32_MCi_ADDR MSR contains the address of the code or data memory > location that produced the machine-check error. The IA32_MCi_ADDR > register is either not implemented or contains no address if the ADDRV > flag in the IA32_MCi_STATUS register is clear. The address returned is > an offset i

[PATCH] x86, mce: Get rid of TIF_MCE_NOTIFY and associated mce tricks

2015-01-05 Thread Luck, Tony
We now switch to the kernel stack when a machine check interrupts during user mode. This means that we can perform recovery actions in the tail of do_machine_check() Signed-off-by: Tony Luck --- On top of Andy's x86/paranoid branch Andy: Should I really move that: pr_err("Uncorrected ha

RE: [PATCH] x86, mce: Get rid of TIF_MCE_NOTIFY and associated mce tricks

2015-01-06 Thread Luck, Tony
> For the context-related bits: > > Reviewed-by: Andy Lutomirski Thanks. > Should I stick this in my -next branch so it can stew? That's fine with me. Boris ... any comments? -Tony

RE: [GIT PULL] RAS update for 3.20

2015-01-13 Thread Luck, Tony
> Nothing special this time, just an error messages improvement from Andy > and a cleanup from me. Also this APEI change that hasn't seen much commentary since Tomas Nowicki pointed out that v1 had cluttered up the APEI code with some x86-ism after he'd spent time cleaning it up - so v2 addresse

RE: [PATCH v4 4/5] pstore: add pmsg

2015-01-16 Thread Luck, Tony
> Cool, seems reasonable to me. Thanks for the respins! I've applied it - but I have a nagging worry about it being user accessible. We've limited access to some pstore features on platforms that only support backend drivers that write to limited lifetime flash storage - efivars & erst. Should

RE: [PATCH] x86, mce, severities: Add AMD severities function

2015-03-19 Thread Luck, Tony
>> It would be best if what Tony suggested comes ontop of your patch with >> his Suggested-by: tag. This ordering should be also the easiest wrt >> functionality and bisectability. >> >> > > Ok, I'll have it ready and send out a V2 by tomorrow if there are no > other comments/reviews. Fame & glor

RE: [PATCH] tracing: add trace event for memory-failure

2015-03-20 Thread Luck, Tony
> RAS user space tools like rasdaemon which base on trace event, could > receive mce error event, but no memory recovery result event. So, I > want to add this event to make this scenario complete. Excellent answer. Are you going to write that patch for rasdaemon? -Tony

RE: [PATCH] x86, mce, severities: Add AMD severities function

2015-03-20 Thread Luck, Tony
>> And I call this from mcheck_init(). >> I tested the above bits on AMD and the severities grading works fine. >> >> Should we also come up with a '_default' function to assign to mce_severity >> function pointer? > >I think that should be > > default: > WARN_ONCE("WTF?!"); >

RE: [PATCH V2 2/2] x86, mce, severities: Define mce_severity function pointer

2015-03-20 Thread Luck, Tony
+ default: + WARN_ONCE(1, "WTF!?"); + break; You meant to type: mce_severity = mce_severity_default; just there, right? -Tony

RE: [RFC PATCH 5/5] GHES: Make NMI handler have a single reader

2015-04-23 Thread Luck, Tony
> I think we should apply this. > > Here's why: nothing in the ghes_notify_nmi() handler does CPU-specific > accesses This looks to be true. > Tony, objections? No objections. -Tony

RE: [PATCH] mm/hugetlb: reduce arch dependent code about huge_pmd_unshare

2015-04-23 Thread Luck, Tony
> Memory fails me. Why do some architectures (arm, arm64, x86_64) want > huge_pmd_[un]share() while other architectures (ia64, tile, mips, > powerpc, metag, sh, s390) do not? Potentially laziness/ignorance-of-feature? It looks like this feature started on x86_64 and then spread to arm*. Huge p

RE: [PATCH] x86: Drop 32-bit support ... finally.

2015-04-01 Thread Luck, Tony
> Hey! My father still has a PDP-15/30, complete with core memory, 4 > DECtape units, and a high speed punched paper tape reader, all in > a really tiny footprint of 4 full-height 19 inch racks, in his house. > It hasn't been turned on in over a decade, but doesn't deserve a Free > Unixlike OS of

RE: [PATCH v8] x86: mce: kexec: switch MCE handler for kexec/kdump

2015-04-09 Thread Luck, Tony
> Why? Those CPUs are offlined and num_online_cpus() in mce_start() should > account for that, no? > > And if those are offlined, they're very very unlikely to trigger an MCE > as they're idle and not executing code. Let's step back a few feet and look at the big picture. There are three main cl

RE: [PATCH v8] x86: mce: kexec: switch MCE handler for kexec/kdump

2015-04-09 Thread Luck, Tony
> If only APEI EINJ could be taught to do delayed injection, regardless of > OS kernel running. Tony, is something like that even possible at all? Use: # echo 1 > notrigger that allows you to plant a land-mine in memory that will get tripped later. Pick the memory address in a clever way and

[GIT PULL] pstore fix for 4.1

2015-04-13 Thread Luck, Tony
The following changes since commit 06e5801b8cb3fc057d88cb4dc03c0b64b2744cda: Linux 4.0-rc4 (2015-03-15 17:38:20 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-pstore for you to fetch changes up to 62f269ef8191568

RE: [PATCH] mm/slub: fix a BUG_ON() when offlining a memory node and CONFIG_SLUB_DEBUG is on

2012-07-17 Thread Luck, Tony
> This suggests that a call to early_kmem_cache_node_alloc was not needed > because the per node structure already existed. Lets fix that instead. Perhaps by just having one API for users to call? It seems odd to force users to figure out whether they are called before some magic time during boot

RE: [RFC][PATCH v2 2/3] Hold multiple logs

2012-07-19 Thread Luck, Tony
> With this patch, efi_pstore can hold multiple logs with a new kernel > parameter, efi_pstore_log_num. How big a value for efi_pstore_log_num have you tried? Did you see any problems with EFI running out of space? Do you get some helpful error message if you pick a number that is too big? -T

RE: [RFC][PATCH v2 2/3] Hold multiple logs

2012-07-19 Thread Luck, Tony
> If users specify a number that is too big, the message will be meaningless. > I just couldn't decide the appropriate number by myself. > Then, I make it tunable. I think that 3 or 4 logs should be plenty to cover almost all situations. E.g. with 3 logs you could capture 2 OOPS (and perhaps mi

RE: [RFC][PATCH v2 2/3] Hold multiple logs

2012-07-19 Thread Luck, Tony
> If you are concerned about multiple OOPS case, I think a user app which logs > from /dev/pstore to /var/log should be developed. Agreed - we need an app/daemon to do this. > Once it is developed, we don't need to care about multiple oops case and the > appropriate number is two. Only if you

Re: [RFC PATCH 00/12] mm: mirrored memory support for page buddy allocations

2015-06-12 Thread Luck, Tony
On Fri, Jun 12, 2015 at 08:42:33AM +, Naoya Horiguchi wrote: > 4?) I don't have the whole picture of how address range mirroring works, > but I'm curious about what happens when an uncorrected memory error happens > on a mirror page. If HW/FW do some useful work invisible from kernel, > p

Re: [RFC PATCH 10/12] mm: add the buddy system interface

2015-06-15 Thread Luck, Tony
On Mon, Jun 15, 2015 at 05:47:27PM +0900, Kamezawa Hiroyuki wrote: > So, there are 3 ideas. > > (1) kernel only from MIRROR / user only from MOVABLE (Tony) > (2) kernel only from MIRROR / user from MOVABLE + MIRROR(ASAP) (AKPM > suggested) > This makes use of the fact MOVABLE memory is re

Re: MCE Bug?

2015-06-17 Thread Luck, Tony
On Wed, Jun 17, 2015 at 11:41:56AM +0200, Borislav Petkov wrote: > And I was waiting in line to get a chance to do some injection on our > EINJ box here too. But it seems you have the required setup already so > if you want to give those changes a run, I've uploaded them here: > > git://git.kernel

RE: MCE Bug?

2015-06-17 Thread Luck, Tony
> if you want to give those changes a run, I've uploaded them here: > > git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git#tip-ras Latest experiments show that sometimes checking kventd_up() before calling schedule_work() helps ... but mostly only when I fake some early logs from low numbe

changing format/size of data in TRACE_EVENT(extlog_mem_event)

2015-06-24 Thread Luck, Tony
In we define a trace event for memory errors. The last field is: __field_struct(struct cper_mem_err_compact, data) where the structure is defined in as: struct cper_mem_err_compact { __u64 validation_bits; __u16 node; __u16 card; __u16 mo

[GIT PULL] pstore for 4.2

2015-06-24 Thread Luck, Tony
The following changes since commit e26081808edadfd257c6c9d81014e3b25e9a6118: Linux 4.1-rc4 (2015-05-18 10:13:47 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-pstore for you to fetch changes up to 078550569eafc66

RE: [RFC PATCH 12/12] mm: let slab/slub/slob use mirrored memory

2015-06-04 Thread Luck, Tony
- page = alloc_pages_exact_node(nodeid, flags | __GFP_NOTRACK, cachep->gfporder); + page = alloc_pages_exact_node(nodeid, flags | __GFP_NOTRACK | __GFP_MIRROR, + cachep->gfporder); Set some global "got_mirror"[*] if we have any mirrored memory t

RE: [PATCH] efi: Work around ia64 build problem with ESRT driver.

2015-06-05 Thread Luck, Tony
-obj-$(CONFIG_EFI) += efi.o esrt.o vars.o reboot.o +obj-$(CONFIG_EFI) += efi.o vars.o reboot.o +ifeq ($(CONFIG_IA64),) +obj-$(CONFIG_EFI) += esrt.o +endif I doubt that ia64 systems are going to start implementing ERST - so it's fine

RE: [RFC PATCH 01/12] mm: add a new config to manage the code

2015-06-08 Thread Luck, Tony
> > +config MEMORY_MIRROR > > + bool "Address range mirroring support" > > + depends on X86 && NUMA > > + default y > Is it correct for the systems (NOT xeon) without memory support built in? Is the "&& NUMA" doing that? If you support NUMA, then you are not a minimal config for

RE: [PATCH 4/4 Rebase] x86, MCE: Avoid potential deadlock in MCE context

2015-06-08 Thread Luck, Tony
> So AFAINM, we want to do MCE work only after we've logged something to > the genpool. So we can do the much simplified thing below and kick the > workqueue from within mce_log() as everything that logs, calls that > function. > > Tony, any concerns? @@ -156,7 +156,8 @@ void mce_log(struct mce *m

RE: [RFC PATCH 08/12] mm: use mirrorable to switch allocate mirrored memory

2015-06-04 Thread Luck, Tony
> Add a new interface in path /proc/sys/vm/mirrorable. When set to 1, it means > we should allocate mirrored memory for both user and kernel processes. With some "to be defined later" mechanism for how the user requests mirror vs. not mirror. Plus some capability/ulimit pieces that restrict who c

RE: [RFC PATCH 02/12] mm: introduce mirror_info

2015-06-04 Thread Luck, Tony
+#ifdef CONFIG_MEMORY_MIRROR +struct numa_mirror_info { + int node; + unsigned long start; + unsigned long size; +}; + +struct mirror_info { + int count; + struct numa_mirror_info info[MAX_NUMNODES]; +}; Do we really need this? My patch series leaves all the mirrored

RE: [RFC PATCH 10/12] mm: add the buddy system interface

2015-06-04 Thread Luck, Tony
+#ifdef CONFIG_MEMORY_MIRROR + if (change_to_mirror(gfp_mask, ac.high_zoneidx)) + ac.migratetype = MIGRATE_MIRROR; +#endif We may have to be smarter than this here. I'd like to encourage the enterprise Linux distributions to set CONFIG_MEMORY_MIRROR=y But the reality is that mo

RE: [PATCH] ia64: remove paravirt code

2015-06-09 Thread Luck, Tony
> Hey folks, any feedback? Sorry - it sounds like a good idea - haven't had time to play with the patch yet. -Tony

RE: [RFC PATCH 10/12] mm: add the buddy system interface

2015-06-10 Thread Luck, Tony
> I guess, mirrored memory should be allocated if !__GFP_HIGHMEM or > !__GFP_MOVABLE HIGHMEM shouldn't matter - partial memory mirror only makes any sense on X86_64 systems ... 32-bit kernels don't even boot on systems with 64GB, and the minimum rational configuration for a machine that support

Re: [RFC PATCH 00/12] mm: mirrored memory support for page buddy allocations

2015-06-18 Thread Luck, Tony
On Thu, Jun 18, 2015 at 11:55:42AM +0200, Vlastimil Babka wrote: > >>>If there are many mirror regions in one node, then it will be many holes > >>>in the > >>>normal zone, is this fine? > >> > >>Yeah, it doesn't matter how many holes there are. > > > >So mirror zone and normal zone will span each

Re: [PATCH] x86/mce: Initialize workqueues only once (alternate proposal)

2015-06-19 Thread Luck, Tony
96d98bfd0366 ("x86/mce: Don't use percpu workqueues") dropped the per-CPU workqueues in the MCE code but left the initialization per-CPU. This led to early boot time splats (below) in the workqueues code because we were overwriting the workqueue during INIT_WORK() on each new CPU which would appea

RE: [RFC PATCH 00/12] mm: mirrored memory support for page buddy allocations

2015-06-19 Thread Luck, Tony
> What's your suggestions? a new zone or a new migratetype? > Maybe add a new zone will change more mm code. I don't understand this code well enough (yet) to make a recommendation. I think our primary concern may not be "how much code we change", but more "how can we minimize the run-time impac

Re: [PATCH v6 1/2] acpi: apei: Rename ghes_severity() to ghes_cper_severity()

2018-05-22 Thread Luck, Tony
On Tue, May 22, 2018 at 04:54:26PM +0200, Borislav Petkov wrote: > I especially don't want to have the case where a PCIe error is *really* > fatal and then we noodle in some handlers debating about the severity > because it got marked as recoverable intermittently and end up causing > data corrupti

Re: [PATCH v6 1/2] acpi: apei: Rename ghes_severity() to ghes_cper_severity()

2018-05-22 Thread Luck, Tony
On Tue, May 22, 2018 at 08:10:47PM +0200, Rafael J. Wysocki wrote: > > PCIe fatal means that the link or the device is broken. > > And that may really mean that the component in question is on fire. > We just don't know. Components on fire could be the root cause of many errors. If we really beli

Re: [PATCH v6 1/2] acpi: apei: Rename ghes_severity() to ghes_cper_severity()

2018-05-22 Thread Luck, Tony
On Tue, May 22, 2018 at 01:19:34PM -0500, Alex G. wrote: > Firmware started passing "fatal" GHES headers with the explicit intent of > crashing an OS. At the same time, we've learnt how to handle these errors in > a number of cases. With DPC (coming soon to firmware-first) the error is > contained,

RE: [PATCH v2 0/8] Decode IA32/X64 CPER

2018-03-01 Thread Luck, Tony
> One much more important thing I forgot about yesterday: how is > this thing playing into our RAS reporting, x86 decoding chain, etc > infrastructure? > > Is CPER bypassing it completely and the firmware is doing everything > now? I sure hope not. Intel gives OEMs lots of options to catch and twe

RE: [PATCH] x86/mce: Save microcode revision in machine check records

2018-03-01 Thread Luck, Tony
+ c = &cpu_data(m->cpu); Bother. Breaks on systems with >255 cpus because "cpu" is __u8. s/m->cpu/m->extcpu/ -Tony
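The truncation described above can be reproduced in userspace. This is a minimal sketch, not kernel code: `struct mce_sketch` and `record_cpu` are hypothetical stand-ins, the 8-bit width of `cpu` comes from the message, and the 32-bit width of `extcpu` is an assumption made for the illustration.

```c
#include <stdint.h>

/* Stand-in for the two fields named in the message; NOT the real struct mce. */
struct mce_sketch {
	uint8_t  cpu;     /* 8 bits, per the message: can only hold 0..255 */
	uint32_t extcpu;  /* assumed wide enough for the full CPU number */
};

/* Hypothetical helper: store one CPU number in both fields. */
static struct mce_sketch record_cpu(uint32_t cpu_nr)
{
	struct mce_sketch m;

	m.cpu = (uint8_t)cpu_nr;  /* silently reduced modulo 256 */
	m.extcpu = cpu_nr;        /* keeps the full value */
	return m;
}
```

With these stand-ins, CPU number 300 is stored as 44 in the narrow field, so indexing per-CPU data by `cpu` would pick the wrong CPU, while `extcpu` still holds 300.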

Re: v4.16+ seeing many unaligned access in dequeue_task_fair() on IA64

2018-04-03 Thread Luck, Tony
On Tue, Apr 03, 2018 at 09:37:06AM +0200, Peter Zijlstra wrote: > On Mon, Apr 02, 2018 at 04:24:49PM -0700, Luck, Tony wrote: > > Any guesses before I start to bisect? > > That doesn't sound good. The only guess I have at this moment is you > accidentally enabled RAN

RE: v4.16+ seeing many unaligned access in dequeue_task_fair() on IA64

2018-04-03 Thread Luck, Tony
> bisect says: > > d519329f72a6 ("sched/fair: Update util_est only on util_avg updates") > > Reverting just this commit makes the problem go away. The unaligned read and write seem to come from: struct util_est ue = READ_ONCE(p->se.avg.util_est); WRITE_ONCE(p->se.avg.util_est, ue); which is puz

Re: v4.16+ seeing many unaligned access in dequeue_task_fair() on IA64

2018-04-04 Thread Luck, Tony
On Wed, Apr 04, 2018 at 09:25:13AM +0200, Peter Zijlstra wrote: > Right, I remember being careful with that. Which again brings me to the > RANDSTRUCT thing, which will mess that up. No RANDSTRUCT config options set for my build. > Does the below cure things? It makes absolutely no difference for

RE: [PATCH v3] x86,sched: allow topologies where NUMA nodes share an LLC

2018-03-29 Thread Luck, Tony
> Hmm, if we shrink llc-size by splitting it, do we also need to create a > unique "id" for each slice? RDT uses the cache id ... but it doesn't play well with cluster on die mode ... so our recommendation is to not use RDT if COD mode is enabled. If the result of these changes happens to be som

[PATCH] x86, mce: Fix stack out-of-bounds write in mce-inject.c:flags_read()

2018-04-27 Thread Luck, Tony
Each of the strings that we want to put into the buf[MAX_FLAG_OPT_SIZE] in flags_read() is two characters. But the sprintf() adds a trailing newline and will add a terminating NUL byte. So MAX_FLAG_OPT_SIZE needs to be 4. Reported-by: Dmitry Vyukov Cc: Signed-off-by: Tony Luck --- diff --git
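The byte count in the commit message above can be checked with a small userspace sketch. It assumes only what the message states (a two-character string, a trailing newline, a terminating NUL); the helper name `flag_buf_needed` and the placeholder string "ab" are made up for the illustration and do not come from mce-inject.c.

```c
#include <stdio.h>

/* Hypothetical helper: bytes needed to hold "<flag>\n" plus the
 * terminating NUL byte. */
static int flag_buf_needed(const char *flag)
{
	/* With a size of 0, snprintf() writes nothing and returns the
	 * number of characters it would have written, not counting
	 * the NUL. */
	return snprintf(NULL, 0, "%s\n", flag) + 1;
}
```

For any two-character string such as "ab" this gives 2 + 1 + 1 = 4, matching the conclusion that MAX_FLAG_OPT_SIZE needs to be 4.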

Re: [PATCH 2/3] x86/mce: Fix incorrect "Machine check from unknown source" message

2018-05-29 Thread Luck, Tony
On Tue, May 29, 2018 at 12:42:22PM +0200, Borislav Petkov wrote: > > +* fatal error. We call "mce_severity()" again to > > +* make sure we have the right "msg". > > */ > > -if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) > > -

[PATCH 2/3 V2] x86/mce: Fix incorrect "Machine check from unknown source" message

2018-05-29 Thread Luck, Tony
Some injection testing resulted in the following console log: mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd8000100134 mce: [Hardware Error]: RIP 10: {pmem_do_bvec+0x11d/0x330 [nd_pmem]} mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88 mce: [Hardware

RE: [PATCH 2/3 V2] x86/mce: Fix incorrect "Machine check from unknown source" message

2018-05-29 Thread Luck, Tony
> It is still assigning. Ah. That would be because I forgot to "git add" before "git commit --amend" :-( > I'll simply do: > > if (worst >= MCE_PANIC_SEVERITY && mca_cfg.tolerant < 3) { > mce_severity(&m, cfg->tolerant, &msg, true); > mce_panic("Local fatal machi

Re: [PATCH 2/3 V2] x86/mce: Fix incorrect "Machine check from unknown source" message

2018-05-29 Thread Luck, Tony
On Tue, May 29, 2018 at 07:53:14PM +0200, Borislav Petkov wrote: > Nah, the cleanups will all go ontop. This is just a dirty branch to show > my intention but yours go first and then the cleanup. Couple of thoughts: In "x86/mce: Carve out bank scanning code" you drop the extra call to mce_severi

Re: [PATCH 03/15] x86/split_lock: Handle #AC exception for split lock in kernel mode

2018-05-15 Thread Luck, Tony
On Tue, May 15, 2018 at 08:51:24AM -0700, Dave Hansen wrote: > > + pr_info_ratelimited("Alignment check for split lock at %lx\n", address); > > This is a potential KASLR bypass, I believe. We shouldn't be printing > raw kernel addresses. > > We have some nice printk's for page faults that give

Re: [PATCH 1/2] ras: fix an off-by-one error in __find_elem()

2019-04-16 Thread Luck, Tony
On Tue, Apr 16, 2019 at 11:07:26AM +0200, Borislav Petkov wrote: > On Mon, Apr 15, 2019 at 06:20:00PM -0700, Cong Wang wrote: > > ce_arr.array[] is always within the range [0, ce_arr.n-1]. > > However, the binary search code in __find_elem() uses ce_arr.n > > as the maximum index, which could lead

Re: [PATCH 1/2] ras: fix an off-by-one error in __find_elem()

2019-04-16 Thread Luck, Tony
On Tue, Apr 16, 2019 at 04:18:57PM -0700, Cong Wang wrote: > > The problem case occurs when we've seen enough distinct > > errors that we have filled every entry, then we try to > > look up a pfn that is larger than any seen before. > > > > The loop: > > > > while (min < max) { > >

Re: [PATCH 1/2] ras: fix an off-by-one error in __find_elem()

2019-04-16 Thread Luck, Tony
On Tue, Apr 16, 2019 at 04:47:55PM -0700, Cong Wang wrote: > 229 static void del_elem(struct ce_array *ca, int idx) > 230 { > 231 /* Save us a function call when deleting the last element. */ > 232 if (ca->n - (idx + 1)) > 233 memmove((void *)&ca->array[idx], > 234

Re: [PATCH 1/2] ras: fix an off-by-one error in __find_elem()

2019-04-17 Thread Luck, Tony
On Tue, Apr 16, 2019 at 07:37:41PM -0700, Cong Wang wrote: > On Tue, Apr 16, 2019 at 7:31 PM Cong Wang wrote: > > Yes it is, I have a stacktrace in production which clearly shows > > del_elem.isra.1+0x34/0x40, unlike the one I triggered via fake > > PFN's. I can show you if you want, it is on 4.14

[PATCH v2] x86, mce: Fix machine_check_poll() tests for which errors

2019-03-12 Thread Luck, Tony
There has been a lurking "TBD" in the machine check poll routine ever since it was first split out from the machine check handler. The potential issue is that the poll routine may have just begun a read from the STATUS register in a machine check bank when the hardware logs an error in that bank

Re: [PATCH] EDAC, {skx|i10nm}_edac: Fix randconfig build error

2019-03-13 Thread Luck, Tony
On Wed, Mar 06, 2019 at 09:15:13PM +0100, Arnd Bergmann wrote: > On Wed, Mar 6, 2019 at 6:58 PM Luck, Tony wrote: > > From: Qiuxu Zhuo > > > > This seems cleaner than adding all the EXPORTs to skx_common.c > > I also tried a build with the 0x8A152468-config.gz that A

Re: [PATCH] EDAC, {skx|i10nm}_edac: Fix randconfig build error

2019-03-14 Thread Luck, Tony
On Thu, Mar 14, 2019 at 12:04:13PM +0100, Borislav Petkov wrote: > On Thu, Mar 14, 2019 at 08:09:06AM +0100, Arnd Bergmann wrote: > > > So where should we go. Proposed solutions are piling up: > > > > > > 1) Make skx_common a module > > > [downside: have to EXPORT everything in it] > > > 2)

Re: [PATCH] EDAC, {skx|i10nm}_edac: Fix randconfig build error

2019-03-15 Thread Luck, Tony
On Fri, Mar 15, 2019 at 10:43:42AM +0100, Borislav Petkov wrote: > On Thu, Mar 14, 2019 at 02:59:52PM -0700, Luck, Tony wrote: > > I made a patch based on option #3. Rough steps were: > > > > $ cat skx_common.c >> skx_common.h > > That doesn't look real

Re: linux-next: Tree for Feb 6 (drivers/edac/skx* and i10nm)

2019-02-06 Thread Luck, Tony
On Wed, Feb 06, 2019 at 01:11:23PM -0800, Randy Dunlap wrote: > on x86_64: > > ld: drivers/edac/skx_common.o: in function `skx_mce_check_error': > skx_common.c:(.text+0x982): undefined reference to `adxl_decode' > ld: drivers/edac/skx_common.o: in function `skx_adxl_get': > skx_common.c:(.init.tex

Re: [GIT PULL] x86/mm changes for v4.21

2019-02-06 Thread Luck, Tony
On Tue, Dec 25, 2018 at 12:11:06AM +0100, Ingo Molnar wrote: > Peter Zijlstra (9): > x86/mm/cpa: Add ARRAY and PAGES_ARRAY selftests > x86/mm/cpa: Add __cpa_addr() helper > x86/mm/cpa: Make cpa_data::vaddr invariant > x86/mm/cpa: Simplify the code after making cpa->vaddr inv

Re: [PATCH RESEND 4/5] x86/MCE: Make number of MCA banks per_cpu

2019-04-08 Thread Luck, Tony
On Mon, Apr 08, 2019 at 02:12:17PM +, Ghannam, Yazen wrote: > +DEFINE_PER_CPU_READ_MOSTLY(u8, num_banks); > +EXPORT_PER_CPU_SYMBOL_GPL(num_banks); The name "num_banks" is a bit generic for an exported symbol. I think it should have a "mce_" prefix. -Tony

RE: [PATCH RESEND 4/5] x86/MCE: Make number of MCA banks per_cpu

2019-04-08 Thread Luck, Tony
> Actually, it should not be exported at all. A function returning the num > banks is better instead. Are all the places it is used in non-pre-emptible sections of code? Looping in the CMCI and #MC handlers should be fine. But do we need get_cpu()/put_cpu() in any places? -Tony

Re: [PATCH RESEND 4/5] x86/MCE: Make number of MCA banks per_cpu

2019-04-08 Thread Luck, Tony
On Mon, Apr 08, 2019 at 10:48:34PM +, Ghannam, Yazen wrote: > Okay, so drop the export and leave the injector code as-is (it's > already doing a rdmsrl_on_cpu()). It's still a globally visible symbol (shared by core.c and amd.c). So I think it needs a "mce_" prefix. While it doesn't collide n

checkpatch warnings for references to earlier commits

2021-02-22 Thread Luck, Tony
Would it be possible to teach checkpatch not to warn about canonical references to earlier commits? E.g. WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line) #7: commit e80634a75aba ("EDAC, skx: Retrieve and print retry_rd_err_log registers") Thanks -Tony

RE: [PATCH v2] x86/mce: fix wrong no-return-ip logic in do_machine_check()

2021-02-23 Thread Luck, Tony
> What I think is qemu has not an easy to get the MCE signature from host or > currently no methods for this > So qemu treat all AR will be No RIPV, Do more is better than do less. RIPV would be important in the guest in the case where the guest can fix the problem that caused the machine check

Re: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.

2021-02-23 Thread Luck, Tony
On Tue, Feb 23, 2021 at 07:33:46AM -0800, Andy Lutomirski wrote: > > > On Feb 23, 2021, at 4:44 AM, Aili Yao wrote: > > > > On Fri, 5 Feb 2021 17:01:35 +0800 > > Aili Yao wrote: > > > >> When one page is already hwpoisoned by MCE AO action, processes may not > >> be killed, processes mapping

RE: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-02-02 Thread Luck, Tony
> And the much more important question is, what is the code supposed to > do when that overflow *actually* happens in real life? Because IINM, > an overflow condition on the same page would mean killing the task to > contain the error and not killing the machine... Correct. The cases I've actually

RE: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-02-02 Thread Luck, Tony
> So that "system hang or panic" which the validation folks triggered, > that cannot be reproduced anymore? Did they run the latest version of > the patch? I will get the validation folks to run the latest version (and play around with hyperthreading if they see problems). -Tony

Re: [PATCH] x86/fault: Send SIGBUS to user process always for hwpoison page access.

2021-01-28 Thread Luck, Tony
On Thu, Jan 28, 2021 at 07:43:26PM +0800, Aili Yao wrote: > when one page is already hwpoisoned by AO action, process may not be > killed, the process mapping this page may make a syscall include this > page and result to trigger a VM_FAULT_HWPOISON fault, as it's in kernel > mode it may be fixed b

RE: linux-next: removal of the ia64 tree

2021-01-28 Thread Luck, Tony
Stephen, Yes. Most stuff I do these days goes through the RAS tree I share with Boris. -Tony -Original Message- From: Stephen Rothwell Sent: Thursday, January 28, 2021 11:53 AM To: Luck, Tony Cc: Arnd Bergmann ; Linux Kernel Mailing List ; Linux Next Mailing List Subject: linux

Re: [PATCH] x86/fault: Send SIGBUS to user process always for hwpoison page access.

2021-01-29 Thread Luck, Tony
Thanks for the explanation and test code. I think I see better what is going on here. [I took your idea for using madvise(...MADV_HWPOISON) and added a new "-S" option to my einj_mem_uc test program to use madvise instead of ACPI/EINJ for injections. Update pushed here: git://git.kernel.or

Re: [PATCH v5] x86/mce: Avoid infinite loop for copy from user recovery

2021-02-01 Thread Luck, Tony
On Thu, Jan 28, 2021 at 06:57:35PM +0100, Borislav Petkov wrote: > Crazy idea: if you still can reproduce on -rc3, you could bisect: i.e., > if you apply the patch on -rc3 and it explodes and if you apply the same > patch on -rc5 and it works, then that could be a start... Yeah, don't > have a bett

RE: [PATCH v2] x86/fault: Send a SIGBUS to user process always for hwpoison page access.

2021-02-01 Thread Luck, Tony
> In any case, this patch needs rebasing on top of my big fault series Is that series in some GIT tree? Or can you give a lore.kernel.org link? Thanks -Tony

Re: [PATCH 02/49] x86/cpu: Describe hybrid CPUs in cpuinfo_x86

2021-02-08 Thread Luck, Tony
On Mon, Feb 08, 2021 at 02:04:24PM -0500, Liang, Kan wrote: > On 2/8/2021 12:56 PM, Borislav Petkov wrote: > > I think it's good enough for perf, but I'm not sure whether other codes need > the CPU type information. > > Ricardo, do you know? > > Maybe we should implement a generic function as be

RE: Pstore : Query on using ramoops driver for DDR

2021-02-09 Thread Luck, Tony
> Can we use existing backend pstore ram driver (fs/pstore/ram.c) for DDR > instead of SRAM ? The expectation for pstore is that the system will go through a reset when it crashes. Most systems do not preserve DDR contents across reset. > Was the current driver written only to support persistent

RE: [PATCH] x86, sched: Allow NUMA nodes to share an LLC on Intel platforms

2021-02-09 Thread Luck, Tony
> +#define X86_BUG_NUMA_SHARES_LLC X86_BUG(25) /* CPU may > enumerate an LLC shared by multiple NUMA nodes */ During internal review I wondered why this is a "BUG" rather than a "FEATURE" bit. Apparently, the suggestion for "BUG" came from earlier community discussions. Historical

Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-02-25 Thread Luck, Tony
On Thu, Feb 25, 2021 at 12:38:06PM +, HORIGUCHI NAOYA(堀口 直也) wrote: > Thank you for shedding light on this, this race looks worrisome to me. > We call try_to_unmap() inside memory_failure(), where we find affected > ptes by page_vma_mapped_walk() and convert into hwpoison entries in > try_to_un

Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-02-26 Thread Luck, Tony
On Fri, Feb 26, 2021 at 10:52:50AM +0800, Aili Yao wrote: > Hi naoya,Oscar,david: > > > > > We could use some negative value (error code) to report the reported case, > > > then as you mentioned above, some callers need change to handle the > > > new case, and the same is true if you use some posi

Re: [PATCH] x86/mm: Don't try to change poison pages to uncacheable in a guest

2020-05-18 Thread Luck, Tony
On Mon, May 18, 2020 at 06:55:00PM +0200, Borislav Petkov wrote: > On Mon, May 18, 2020 at 08:36:25AM -0700, Luck, Tony wrote: > > The VMM gets the page fault (because the unmapping of the guest > > physical address is at the VMM EPT level). The VMM can't map a new >

[PATCH v2] x86/mm: Change so poison pages are either unmapped or marked uncacheable

2020-05-20 Thread Luck, Tony
An interesting thing happened when a guest Linux instance took a machine check. The VMM unmapped the bad page from guest physical space and passed the machine check to the guest. Linux took all the normal actions to offline the page from the process that was using it. But then guest Linux crashed

RE: [PATCH v3 2/2] x86/kvm: Expose new features for supported cpuid

2020-08-10 Thread Luck, Tony
> As you suggest, I will split the kvm patch into two parts, SERIALIZE and > TSXLDTRK, and this series will include three patches then, 2 kvm patches > and 1 kernel patch. SERIALIZE could get merged into 5.9, but TSXLDTRK > should wait for the next release. I just want to double confirm with >

[GIT PULL] EDAC for 5.9

2020-08-02 Thread Luck, Tony
Hi Linus, Boris is on vacation and asked me to send you the pull request for EDAC changes that are queued for v5.9 -Tony --- The following changes since commit b3a9e3b9622ae10064826dccb4f7a52bd88c7407: Linux 5.8-rc1 (2020-06-14 12:45:04 -0700) are available in the Git repository at: git:/

RE: [PATCH] x86/mce: Increase maximum number of banks to 64

2020-08-20 Thread Luck, Tony
>> How much does vmlinux size grow with your change? >> > > It seems to get smaller. > > -rwxrwxr-x 1 yghannam yghannam 807634088 Aug 20 17:51 vmlinux-32banks > -rwxrwxr-x 1 yghannam yghannam 807634072 Aug 20 17:50 vmlinux-64banks You need to run: $ size vmlinux text data bss de

RE: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available

2020-08-05 Thread Luck, Tony
> I meant asm as in a .S file. But the code we have is fine for this purpose, > at least for now. There seem to be some drivers that call sync_core: drivers/misc/sgi-gru/grufault.c:sync_core(); drivers/misc/sgi-gru/grufault.c:sync_core();/* make sure

Re: [PATCH] EDAC/ie31200: fallback if host bridge device is already initialized

2020-08-06 Thread Luck, Tony
On Thu, Aug 06, 2020 at 05:35:49PM -0400, Jason Baron wrote: > > > On 7/16/20 4:33 PM, Jason Baron wrote: > > > > > > On 7/16/20 2:52 PM, Luck, Tony wrote: > >> On Thu, Jul 16, 2020 at 02:25:11PM -0400, Jason Baron wrote: > >>> The Intel uncore d

Re: [LKP] Re: [x86/mce] 1de08dccd3: will-it-scale.per_process_ops -14.1% regression

2020-08-18 Thread Luck, Tony
On Tue, Aug 18, 2020 at 04:29:43PM +0800, Feng Tang wrote: > Hi Borislav, > > On Sat, Apr 25, 2020 at 03:01:36PM +0200, Borislav Petkov wrote: > > On Sat, Apr 25, 2020 at 07:44:14PM +0800, kernel test robot wrote: > > > Greeting, > > > > > > FYI, we noticed a -14.1% regression of will-it-scale.pe
