RE: [PATCH 3/6] x86-mce: Clear CMCI enable on all claimed CMCI banks before reboot.

2014-07-09 Thread Luck, Tony
+ if (!xchg(&reboot_notifier_registered, true)) + register_reboot_notifier(&cmci_reboot_notifier); This is super-safe ... but isn't the xchg() overkill? I thought we serialized bringup of other cpus. Has this "do it once" caught on elsewhere in the kernel ... I suppose it is
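For readers unfamiliar with the idiom being questioned here, a minimal userspace sketch of the "do it once" exchange pattern follows. It assumes C11 `atomic_exchange` as a stand-in for the kernel's `xchg()`; `register_once`, `fake_register` and `registrations` are hypothetical names for illustration and are not taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for register_reboot_notifier(); it only counts calls. */
static int registrations;

static void fake_register(void)
{
    registrations++;
}

/* Starts out false; the first caller to swap in true gets false back. */
static atomic_bool notifier_registered;

/* Returns true only for the caller that actually performed the registration. */
static bool register_once(void)
{
    if (!atomic_exchange(&notifier_registered, true)) {
        fake_register();
        return true;
    }
    return false;
}
```

Calling `register_once()` repeatedly, from any number of threads, runs `fake_register()` a single time. If callers are already serialized, as the reply suggests CPU bringup is, a plain flag test would give the same result.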

RE: [PATCH 4/6] x86-mce: Add spinlocks to prevent duplicated MCP and CMCI reports.

2014-07-09 Thread Luck, Tony
if (!(flags & MCP_UC) && - (m.status & (mca_cfg.ser ? MCI_STATUS_S : MCI_STATUS_UC))) + (m.status & (mca_cfg.ser ? MCI_STATUS_S : MCI_STATUS_UC))) { + spin_unlock_irqrestore(&mce_banks[i].poll_spinlock, +

RE: [PATCH 5/6] x86-mce: check if no_way_out applies before deciding not to clear MCE banks.

2014-07-09 Thread Luck, Tony
+ if (!(no_way_out && cfg->tolerant < 3)) mce_clear_state(toclear); Style - I think this is easier to grok: if (!no_way_out || cfg->tolerant >=3) mce_clear_state(toclear); but not too strongly if other like !(a && b) form. I'm never sure how to trea
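The two spellings of the condition are equivalent by De Morgan's law, so the choice is purely one of style. A small sketch that checks this exhaustively is shown below; the function names are hypothetical, and the range 0..3 for `tolerant` is an assumption made for the illustration.

```c
#include <stdbool.h>

/* Form used in the patch: clear the state unless we are about to panic. */
static bool should_clear_patch(bool no_way_out, int tolerant)
{
    return !(no_way_out && tolerant < 3);
}

/* Form suggested in the reply, obtained by applying De Morgan's law. */
static bool should_clear_suggested(bool no_way_out, int tolerant)
{
    return !no_way_out || tolerant >= 3;
}

/* Returns true if both forms agree for every input in the assumed range. */
static bool forms_agree(void)
{
    for (int nwo = 0; nwo <= 1; nwo++) {
        for (int tolerant = 0; tolerant <= 3; tolerant++) {
            if (should_clear_patch(nwo, tolerant) !=
                should_clear_suggested(nwo, tolerant))
                return false;
        }
    }
    return true;
}
```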

RE: [PATCH 6/6] x86-mce: ensure the MCP timer is not already set in the mce_timer_fn.

2014-07-09 Thread Luck, Tony
+ /* Ensure a CMCI interrupt can't preempt this. */ + local_irq_save(flags); if (mce_available(__this_cpu_ptr(&cpu_info))) { machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_poll_banks)); Does this remove the problem that you

RE: [PATCH 5/6] x86-mce: check if no_way_out applies before deciding not to clear MCE banks.

2014-07-09 Thread Luck, Tony
> to run with tolerant=3, but I kind of understood the logic to be that > if we're going to keep running, we need to clear the banks, and if > we're going to crash, we need to leave them intact That makes sense ... fold some text like that into the commit description, and this part is: Acked-by:

RE: [PATCH 4/6] x86-mce: Add spinlocks to prevent duplicated MCP and CMCI reports.

2014-07-09 Thread Luck, Tony
> I don't think we got the description right here. I think the real > issue here was machine check polls happening on multiple CPUs with > shared banks, all reporting the same MCEs. This is very reproducible > when booting with mce=no_cmci, since all CPUs will handle all banks, > and there's AFAICT

RE: linux-next: manual merge of the char-misc tree with the ia64 tree

2014-07-15 Thread Luck, Tony
> Wait, do we really need a drivers/ras directory for one single driver? > Why not put it in drivers/misc/ instead? A whole subdir at the top of > drivers seems overkill and odd. > > As it's a memory driver, what about drivers/firmware/ or drivers/edac/ > or drivers/platform? It isn't really a fi

RE: linux-next: manual merge of the char-misc tree with the ia64 tree

2014-07-15 Thread Luck, Tony
> Currently, all the stuff we're doing is x86-only so arch/x86/ras/ might > be a good place too, if other arches wanna do their own thing or if x86 > RAS facilities turn out to be PITA to make arch-independent. Oops - not quite. We moved creation of the EDAC trace point into drivers/ras/ras.c an

RE: [PATCH v2 0/3] Clean up ACPI core to prepare for running ACPI on ARM64

2014-07-16 Thread Luck, Tony
> Are there any objections against this series from the x86 and ia64 > maintainers? I'll believe the claim of no functional changes for ia64 ... so no objections from me. -Tony

RE: [PATCH v2 1/3] ACPI: ARM64 does not have a BIOS add config for BIOS table scan.

2014-07-16 Thread Luck, Tony
> + select ACPI_LEGACY_TABLES_LOOKUP if ACPI > This shouldn't actually be set on IA64, should it? IA64 doesn't have > BIOS, either, it has EFI/UEFI, like ARM64... Which ACPI tables are in the "LEGACY" category affected by this option? -Tony

RE: [tip:x86/urgent] x86/mce: Fix CMCI preemption bugs

2014-04-17 Thread Luck, Tony
> Hohum, __raw_spin_lock_irqsave does preempt_disable(). And > machine_check_poll should be running in irq context so why would the > original issue happen? > >> kernel: [7.341085] BUG: using __this_cpu_write() in preemptible >> [] code: modprobe/546 > > Unfortunately, I have only one

[GIT PULL] RAS - fix CMCI storm code

2014-03-31 Thread Luck, Tony
The following changes since commit b098d6726bbfb94c06d6e1097466187afddae61f: Linux 3.14-rc8 (2014-03-24 19:31:17 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-cmci-storm for you to fetch changes up to 27f6c573e0f77

RE: [PATCH 1/2] memory-failure: Send right signal code to correct thread

2014-05-20 Thread Luck, Tony
> Looks good to me, thank you. > Reviewed-by: Naoya Horiguchi Thanks for your time reviewing this > and I think this is worth going into stable trees. Good point. I should dig in the git history and make one of those fancy "Fixes: sha1 title" tags too. -Tony -- To unsubscribe from this list:

RE: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU

2014-05-21 Thread Luck, Tony
>> mce_regin, which is only called by monarch CPU, can be used for system >> panics as quickly as possible if there is a truly data corrupting error. >> But Monarch CPU don't have to help all other CPU to clean mces_clean. >> One advantage of Per-CPU is the isolation of errors propagation, being >>

RE: [RFC] x86_64: A real proposal for iret-less return to kernel

2014-05-21 Thread Luck, Tony
> But sending signals from #MC context is definitely a bad idea. I think > we had addressed this with irq_work at some point but my memory is very > hazy. We added code for recoverable errors to get out of the MC context before trying to lookup the page and send the signal. Bottom of do_machine_c

RE: [RFC] x86_64: A real proposal for iret-less return to kernel

2014-05-21 Thread Luck, Tony
>> That TIF_MCE_NOTIFY prevents the return to user mode, and we end up in >> mce_notify_process(). > > Why is this necessary? The recovery path has to do more than just send a signal - it needs to walk processes and "mm"s to see which have mapped the physical address that the h/w told us has go

RE: [RFC] x86_64: A real proposal for iret-less return to kernel

2014-05-21 Thread Luck, Tony
>> The recovery path has to do more than just send a signal - it needs to walk >> processes and >> "mm"s to see which have mapped the physical address that the h/w told us has >> gone bad. > > I still feel like I'm missing something. If we interrupted user space > code, then the context we're in

RE: [RFC] x86_64: A real proposal for iret-less return to kernel

2014-05-21 Thread Luck, Tony
On Wed, May 21, 2014 at 03:39:11PM -0700, Andy Lutomirski wrote: > But if we get a new MCE in here, it will be an MCE from kernel context > and it's fatal. So, yes, we'll clobber the stack, but we'll never > return (unless tolerant is set to something insane), so who cares? Remember that machine c

RE: [RFC] x86_64: A real proposal for iret-less return to kernel

2014-05-21 Thread Luck, Tony
> FWIW, this means that there really is a problem if one of these #MC > errors hits an innocent bystander who just happens to be handling an > NMI, at least if we delete the nested NMI code. But I think my > simplified proposal gets this right. Yes. Bystander broadcast machine checks can and will

RE: [RFC] x86_64: A real proposal for iret-less return to kernel

2014-05-21 Thread Luck, Tony
> MCE is frankly misdesigned. It's a piece of shit, and any of the > hardware designers that claim that what they do is for system > stability are out to lunch. This is a prime example of what *NOT* to > do, and how you can actually spread what was potentially a localized > and recoverable error, a

RE: [PATCH] x86, MCE: Kill CPU_POST_DEAD

2014-05-22 Thread Luck, Tony
>> So I think we can reduce it to just the one rwsem (with recursion) if we >> shoot CPU_POST_DEAD in the head. > > Here's the first bullet. Stressing my box here with Steve's hotplug > script seems to work fine. > > Tony, any objections? what was this comment referring to: /* intentionally i

RE: [RFC] Unnecessary work and noise from mce code in suspend/resume path

2014-05-23 Thread Luck, Tony
>> When we suspend a laptop we offline all but one processor. But >> the mce code registers on a notify chain so it can clean up >> some sysfs entries. Part of that code calls device_unregister() >> which will fire kobject_uevent() which might wake up some user >> code that is watching for such thi

RE: [PATCH] x86, MCE: Flesh out when to panic comment

2014-05-27 Thread Luck, Tony
>> I think the comment is still not explaining the big part of what the >> discussion was about -- i.e. if it was in kernel context, we always >> panic. > > I thought the pointer to mce_severity was enough? People should open an > editor and look at the function and at its gory insanity. :-P It is

RE: [PATCH] x86, MCE: Flesh out when to panic comment

2014-05-27 Thread Luck, Tony
> And this tolerant check looks fishy to me: > >if (s->sev >= MCE_UC_SEVERITY && ctx == IN_KERNEL) { >if (panic_on_oops || tolerant < 1) >return MCE_PANIC_SEVERITY; >} > > since we set it to 1 by default. But I'

RE: [RFC PATCH 0/3] RAS: Correctable Errors Collector thing

2014-05-28 Thread Luck, Tony
> A possible alternative would be to soft-offline the page. This is > currently done in APEI code when corrected memory error thresholds are > exceeded and reported by UEFI via a generic hardware error source > (GHES). +1 This is what the existing mcelog(8) daemon does when it sees an excessive

RE: [PATCH v2] x86/mce: Improve mcheck_init_device() error handling

2014-05-05 Thread Luck, Tony
+err_device_create: + /* +* mce_device_remove behave properly if mce_device_create was not +* called on that device. +*/ + for_each_possible_cpu(i) + mce_device_remove(i); grammar comment "s/behave/behaves/" Though perhaps this is better:

RE: [patch 31/32] ia64: Use irq_init_desc

2014-05-07 Thread Luck, Tony
> Switch over to the new interface. No functional change. ia64 parts: Tested-by: Tony Luck

RE: [PATCH 0/3] HWPOISON: improve memory error handling for multithread process

2014-05-30 Thread Luck, Tony
> This patchset is the summary of recent discussion about memory error handling > on multithread application. Patch 1 and 2 is for action required errors, and > patch 3 is for action optional errors. Naoya, You suggested early in the discussion (when there were just two patches) that they deserve

RE: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-05-30 Thread Luck, Tony
>> For memory error location, I will utilize type offset to save one >> more byte, furthermore, I want to drop requestor_id, responder_id >> and target_id. 1) They are very rare (I've never seen them by now) > > My concern is, are we sure we're never going to need them at all? Tony, > what's your t

RE: [PATCH 5/7 v6] trace, RAS: Add eMCA trace event interface

2014-06-02 Thread Luck, Tony
>> All of this stuff only applies to server systems - so quibbling over >> a handful of *bytes* in an error record on a system that has tens, >> hundreds or even thousands of *gigabytes* of memory seems >> a bit pointless. > > But there's still only a limited number of bytes in the ring buffer no >

RE: [PATCH 0/3] HWPOISON: improve memory error handling for multithread process

2014-06-02 Thread Luck, Tony
> I'm not sure that "[PATCH 3/3] mm/memory-failure.c: support dedicated > thread to handle SIGBUS(BUS_MCEERR_AO)" is a -stable thing? That's a > feature addition more than a bugfix? No - the old behavior was crazy - someone with a multithreaded process might well expect that if they call prctl(PF

RE: [PATCH v6 1/9] efi: Use early_mem*() instead of early_io*()

2014-06-24 Thread Luck, Tony
> I am CC'ing IA-64 guys. The *_unmap() functions are no-op on ia64 - because we have mappings for everything all the time - the *_map() functions just need to compute the proper address to use to get the right attributes (so we don't mix and match cacheable and uncachable access to the same ad

RE: [PATCH 00/17] x86, ia64 NUMA cleanup

2014-02-04 Thread Luck, Tony
> I added this series to my "next" branch for v3.14. Tony, let me know > if you see any ia64 issues. It showed up in next-20140204 - and doesn't seem to have caused any build or boot problems on my test machines. -Tony

RE: [PATCH v3 8/9] ACPI, APEI, CPER: Cleanup CPER memory error output format

2013-10-21 Thread Luck, Tony
+ if (severity != CPER_SEV_FATAL) >>> >>> Shouldn't this just be (severity == CPER_SEV_CORRECTED)? >> IMO, only fatal error can't be handlered gracefully in current >> kernel plus H/W. Once it can be recovered by H/W and OS, we >> can call it recovered. > Sure, but we don't recover in all

Re: [PATCH v3 4/9] ACPI, x86: Extended error log driver for x86 platform

2013-10-21 Thread Luck, Tony
> But yes, this is possible and it would make it all even cleaner > and simpler by simply not needing the reg/dereg interfaces for > mce_ext_err_print but adding it to the chain. So this is on top of the 9 patch series (using the V4 that Chen Gong posted for part 4/9 and V3 for all the others). O

RE: [PATCH V2 Resend 13/34] cpufreq: ia64: Convert to light weight ->target_index() routine

2013-10-22 Thread Luck, Tony
Kumar Tested-by: Tony Luck -Tony

[GIT PULL] For x86/mce ... enhanced error logs

2013-10-22 Thread Luck, Tony
Ingo, Ultimate plan is to use these enhanced error logs to feed a perf/trace event ... but we are still discussing the exact format of that, and also how it should interact/complement/replace the existing EDAC trace event. Meanwhile all this precursor work has been reviewed and agreed on by Mauro

[GIT PULLv2] For x86/mce ... enhanced error logs

2013-10-23 Thread Luck, Tony
Replacement for yesterday's pull request - fixes a build bug when CONFIG_SMP=n found by Fengguang's zero-day auto-build robot army. If you pulled (and pushed) that one before finding this in your mailbox - then I can send the one-line patch to be applied on top of yesterday's version. -Tony The

[PATCH] UEFI, CPER: Move cper.c to a more proper place

2013-10-25 Thread Luck, Tony
From: "Chen, Gong" CPER (Common Platform Error Record - See UEFI spec, appendix N) support is implemented via cper.c, which is under drivers/acpi/apei. But it is not APEI specific, nor even ACPI specific. So move it to lib/ as a function library. Signed-off-by: Chen, Gong Signed-off-by: Tony Lu

RE: [PATCH] don't select EFI from certain special ACPI drivers

2013-12-17 Thread Luck, Tony
> No, it hasn't. But I explicitly checked the relevant EFI=n and EFI=y > cases. Jan, I pushed your patch into my "next" tree - the robots will notice soon and send us e-mail if they find any issues. -Tony

[GIT PULL] Update to error injection interface

2013-12-17 Thread Luck, Tony
Ingo, Please queue in x86/ras branch for next merge window. Thanks -Tony The following changes since commit 319e2e3f63c348a9b66db4667efa73178e18b17d: Linux 3.13-rc4 (2013-12-15 12:31:33 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.g

RE: [PATCH v2 00/10] Kconfig: cleanup SERIO_I8042 dependencies

2013-12-18 Thread Luck, Tony
> This is v2 of the patch series. Changes from version 1: > > o Added acks. arm, ia64, and sh are only ones without acks. ia64 bits look OK Acked-by: Tony Luck

RE: [patch x86/mce] ACPI, x86: Export boot_cpu_physical_apicid to modules

2013-11-14 Thread Luck, Tony
> ERROR: "boot_cpu_physical_apicid" [drivers/acpi/acpi_extlog.ko] undefined! > > The symbol needs to be exported for it to be available. Good - but I wonder how many more useless layers there are to this onion :-( First I had to add a "#include " Then add the dependency on CONFIG_X86_LOCAL_APIC N

RE: [patch x86/mce] ACPI, x86: Export boot_cpu_physical_apicid to modules

2013-11-14 Thread Luck, Tony
,.. then Acked-by: Tony Luck -Tony

RE: [PATCH] x86: Add check for number of available vectors before CPU down [v3]

2013-12-20 Thread Luck, Tony
> I haven't double checked, but I'm assuming the hot plug locks are held > while you are doing this. I dug into that when looking at v2 - the whole thing is under "stop_machine()" so locking was not an issue. Reviewed-by: Tony Luck -Tony

RE: [PATCH] x86, MCE: support memory error recovery for both UCNA and Deferred error in machine_check_poll

2014-10-23 Thread Luck, Tony
> The general idea of preemptively poisoning pages which contain deferred > errors is fine though. Agreed. I used to think that it wasn't likely to be very useful because in many cases the UCNA errors are just a trail of breadcrumbs set by different units on the chip as the poison passed through o

RE: [PATCH 1/2 v2] x86, mce, severity: extend the the mce_severity

2014-11-06 Thread Luck, Tony
>> +int mce_severity(struct mce *m, int tolerant, char **msg, bool is_excp) > > You're adding a function argument which is carrying redundant info which > is already present in *m... > >> { >> +enum exception excp = (is_excp ? EXCP_CONTEXT : NO_EXCP); > > ... and so this should be: > > e

RE: [PATCH 1/2 v2] x86, mce, severity: extend the the mce_severity

2014-11-06 Thread Luck, Tony
> Basically, this check is being done only for machine check exceptions > only. But you proposed setting excp by looking at mcg_status: > excp = ((m->mcg_status & MCG_STATUS_MCIP) ? EXCP_CONTEXT : NO_EXCP); Which makes the code rather self referential. If we actually did arrive in MCE handler w

RE: [PATCH 2/2] x86, mce: support memory error recovery for both UCNA and Deferred error in machine_check_poll

2014-10-27 Thread Luck, Tony
+ m->mcgstatus |= (MCG_STATUS_MCIP|MCG_STATUS_RIPV); + severity = mce_severity(m, mca_cfg.tolerant, NULL); This seems a big hack to make mce_severity() work when called from CMCI context (when MCG_STATUS register is not set). It would also be confusing as the subsequent logged entries

RE: [PATCH 1/2 v2] x86, mce, severity: extend the the mce_severity

2014-11-06 Thread Luck, Tony
> I'm under the assumption that at all times, when we get a MCE, MCIP will > be set. For example, mce_gather_info() reads MCG_STATUS before we call > mce_severity() in do_machine_check(). > > Or am I missing something? Architecturally it is true that MCIP will be set when machine check is signaled

RE: [PATCH] sb_edac: fix TAD presence check for sbridge_mci_bind_devs()

2015-08-05 Thread Luck, Tony
> In 7d375bff, NUM_CHANNELS was changed to 8 and the channel space was > renumerated to handle EN, EP, and EX configurations. > > The *_mci_bind_devs functions, except for sbridge_mci_bind_devs(), got a > new device presence check in the form of saw_chan_mask. However, > sbridge_mci_bind_devs() st

RE: [PATCH] mm: Check if section present during memory block (un)registering

2015-08-25 Thread Luck, Tony
> It appears this should be backported into -stable kernels, yes? Do you > know which kernel versions need the fix? For my setup the problem is first seen after: commit bdee237c0343 " x86: mm: Use 2GB memory block size on large memory x86-64 systems" which appeared in v3.19 and forced a 2GB me

RE: [RFT PATCH] ia64: zx1_defconfig: convert to use libata PATA drivers

2015-08-17 Thread Luck, Tony
> I'm running kernels configured that way (i.e. using libata PATA > drivers) for years on my hp workstation zx6000 (zx1 chipset) without > apparent problems. Do you have PATA drives? My zx6000 just has SCSI: scsi host0: ioc0: LSI53C1030 C0, FwRev=01032341h, Ports=1, MaxQ=255, IRQ=57 scsi 0:0

RE: [PATCH v2] pstore: add lzo/lz4 compression support

2016-05-02 Thread Luck, Tony
> Tony, should I take over the pstore tree? Do you have any testing > procedures I could use? My testing is rather manual at the moment. Kees, Sure - I seem to be bad about keeping track of stuff here. I don't have any good tests ... just manually crash a machine and make sure things show up in

RE: [PATCH] ia64: Remove superfluous SMP function call

2016-05-02 Thread Luck, Tony
> Replace smp_call_function_single() with a direct call to > ia64_mca_cmc_vector_adjust(). The function itselfs handles disable and > enable interrupts, therefore the smp_call_function_single() calling > convention is not preserved. Applied. Thanks. -Tony

RE: [PATCH 0/5] ia64: Fix compiler warnings

2016-05-05 Thread Luck, Tony
> ia64/PCI: Fix incorrect PCI resource end address > ia64/PCI: Remove unused 'addr' and fix build warning > ia64: Reduce stack usage by iterating over nodemask > ia64/traps: Silence GCC warning about uninitialised variable > ia64/unaligned: Silence another GCC warning about an uninitialized va

RE: [PATCH] ie31200_edac: add skylake support

2016-05-04 Thread Luck, Tony
> I've verified that the 'ce_count' is correctly incrementing with bad dimms. Did you re-test on at least one of the previous 3 generations of CPUs supported by this driver? All would be nice, but the bulk of the opportunities for cut&paste errors seem to be in code that looks like: if

RE: [PATCH] ie31200_edac: add skylake support

2016-05-04 Thread Luck, Tony
> I verified that at least the memory sizes, ie the 'size_mb' files > are correct on the old h/w. I don't have bad dimms atm to test > the old h/w error paths though. That said this driver does get a > lot indirect testing here (just from being loaded), - so I would > likely find out if there were

RE: [PATCH 2/3] perf/x86/mbm: Fix mbm counting for RMID reuse

2016-05-10 Thread Luck, Tony
>> (3) Also we may not want to count at every sched_in and sched_out >> because the MSR reads involve quite a bit of overhead. > > Every single other PMU driver just does this; why are you special? They just have to read a register. We have to write the IA32_EM_EVT_SEL MSR and then read fro

RE: [lkp] [EDAC, sb_edac] 2c1ea4c700: kmsg.EDAC_sbridge:Failed_to_register_device_with_error

2016-05-24 Thread Luck, Tony
> [ 55.677523] EDAC sbridge: ECC is disabled. Aborting Works on my HSW-EX. Maybe it depends on memory configuration or some BIOS settings? The EDAC driver is looking at the MCMTR register to determine whether ECC is enabled (and this change in the code shouldn't really affect that). What doe

RE: [PATCH] x86, signals: add missing signal_compat code for x86 features

2016-05-24 Thread Luck, Tony
> Tony / Borislav, do we have tests for the machine check code that could > have caught this? If I had built one of my recovery test programs as a 32-byte binary instead of native 64-bit I might have noticed (I only print the lsb field ... which would have been garbage on the stack, maybe I'd ha

Re: [PATCH] Cleanup useless codes in CMCI handler

2015-11-11 Thread Luck, Tony
On Wed, Nov 11, 2015 at 10:16:45AM -0500, Chen, Gong wrote: > UCNA errors share the same handler with CMCI. But it doesn't > need extra operation to save error record in genpool. Remove > these uselss codes. I'd have emphasised that this same mce is being added to the genpool *twice* (once here, a

Re: [RFC PATCH 0/3] Machine check recovery when kernel accesses poison

2015-11-11 Thread Luck, Tony
On Wed, Nov 11, 2015 at 09:41:58PM +0100, Borislav Petkov wrote: > On Tue, Nov 10, 2015 at 01:55:46PM -0800, Luck, Tony wrote: > > I need to add more to the motivation part of this. The people who want > > this are playing with NVDIMMs as storage. So think of many GBytes of > >

RE: [RFC PATCH 0/3] Machine check recovery when kernel accesses poison

2015-11-11 Thread Luck, Tony
> If you know that it is in the nvdimm range, you can grade the error with > lower severity... Grading the severity isn't the main issue. > Or do you mean that without the exception table we'll return back to the > insn causing the error and loop indefinitely this way? Yes. We need to NOT return

RE: [PATCH v2 2/3] efi-pstore: implement efivars_pstore_exit()

2015-11-11 Thread Luck, Tony
>>> module_init(efivars_pstore_init); >> >> Looks OK to me. Kees, are you picking this up? > > I can, though usually it goes through Tony. Can I count that as "Acked-by" from both of you? -Tony

Re: [PATCH 1/3] x86, ras: Add new infrastructure for machine check fixup tables

2015-11-12 Thread Luck, Tony
On Wed, Nov 11, 2015 at 08:14:56PM -0800, Andy Lutomirski wrote: > On 11/06/2015 12:57 PM, Tony Luck wrote: > >Copy the existing page fault fixup mechanisms to create a new table > >to be used when fixing machine checks. Note: > >1) At this time we only provide a macro to annotate assembly code > >

Re: [PATCH 2/3] x86, ras: Extend machine check recovery code to annotated ring0 areas

2015-11-12 Thread Luck, Tony
On Wed, Nov 11, 2015 at 08:19:35PM -0800, Andy Lutomirski wrote: > >@@ -1132,9 +1133,15 @@ void do_machine_check(struct pt_regs *regs, long > >error_code) > > if (no_way_out) > > mce_panic("Fatal machine check on current CPU", &m, > > msg); > > if (wors

Re: [PATCH 3/3] x86, ras: Add mcsafe_memcpy() function to recover from machine checks

2015-11-12 Thread Luck, Tony
On Thu, Nov 12, 2015 at 08:53:13AM +0100, Ingo Molnar wrote: > > +extern phys_addr_t mcsafe_memcpy(void *dst, const void __user *src, > > + unsigned size); > > So what's the longer term purpose, where will mcsafe_memcpy() be used? The initial plan is to use this for file

Re: [PATCH 1/3] x86, ras: Add new infrastructure for machine check fixup tables

2015-11-12 Thread Luck, Tony
On Thu, Nov 12, 2015 at 12:04:36PM -0800, Andy Lutomirski wrote: > > We already have code to recover from machine checks encountered > > while the processor is executing ring3 code. > > I meant failures during copy_from_user, copy_to_user, etc. Yes. copy_from_user() will be pretty interesting fr

RE: [Patch V0] x86, mce: Ensure offline CPU's don't participate in mce rendezvous process.

2015-12-04 Thread Luck, Tony
> Franky, I'm not sure at all and very very wary of adding *any* code > which runs on an offlined CPU. Because *no one* does that and it hasn't > been tested at all. So who knows what happens. > > What we should be doing is execute the *minimal* amount of code possible > and get out. No counting, n

RE: [PATCH 1/5] ia64: ftrace: fix the comments for ftrace_modify_code

2015-12-04 Thread Luck, Tony
> Suggested-by: Steven Rostedt > Signed-off-by: Li Bin Sure. Acked-by: Tony Luck [assuming that Steven is going to apply this whole series] -Tony

RE: [Patch V0] x86, mce: Ensure offline CPU's don't participate in mce rendezvous process.

2015-12-04 Thread Luck, Tony
> I don't mean that - I mean the stuff we do before we call > cpu_is_offline() like ist_enter, this_cpu_inc(mce_exception_count), > etc. Then we do a whole another bunch of stuff at the "out:" label like > printk and whatnot which shouldn't run on an offlined CPU. ist_enter() is black magic to me.

RE: [Patch V0] x86, mce: Ensure offline CPU's don't participate in mce rendezvous process.

2015-12-04 Thread Luck, Tony
> Whether it is kosher or not is beside the point. Why should an offlined > CPU even noodle through all that code if it doesn't need/have to? It can > return immediately instead. Ashok wants to move in stage 2 to having the offline cpu scan banks and report any errors seen there. To do that we'll

RE: [Patch V1] x86, mce: Ensure offline CPU's don't participate in mce rendezvous process.

2015-12-04 Thread Luck, Tony
> With that hunk here you want to clear MSR_IA32_MCG_STATUS in the > !cfg->banks case, right? I can't imagine how we'd get into do_machine_check without any banks. Would indeed be a separate patch ... but value seems limited. -Tony

some 4.4 issues on 4S Xeon servers

2015-12-07 Thread Luck, Tony
4.4 isn't going smoothly on my 4 socket Xeon servers (18 core per socket if that is important). User space is RHEL 7.2. Kernel config is the RHEL one (with whatever mods happen running "make oldconfig" and hitting to every question.) 1) there was a problem in drm_calc_timestamping_constants(), th

RE: [Patch V2] x86, mce: Ensure offline CPU's don't participate in mce rendezvous process.

2015-12-07 Thread Luck, Tony
> Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast > exception handler Is that what we printed in this case? ... boy is that a misleading message ... we got *extra* cpus (the offline ones), not "Not all". Good job we have a fix :-) -Tony

RE: [Patch V2] x86, mce: Ensure offline CPU's don't participate in mce rendezvous process.

2015-12-07 Thread Luck, Tony
> And that is incorrect too, because the MCE (at least the one I'm > injecting) gets broadcasted to the CPUs on the *node* and not to the > whole system. Which system? What kind of machine check? On Intel we expect machine checks to be broadcast to all logical cpus on all nodes (unless local mac

Re: [Patch V2] x86, mce: Ensure offline CPU's don't participate in mce rendezvous process.

2015-12-07 Thread Luck, Tony
On Mon, Dec 07, 2015 at 11:34:27PM +0100, Borislav Petkov wrote: > BIOS is doing funny cores enumeration: > > node #0, CPUs 0-7 > node #1, CPUs 8-15 > node #2, CPUs 16-23 > node #3, CPUs 24-31 > > and then starts from node 0 again: > > node #0, CPUs:#32 #33 #34 #35 #36 #37 #38

RE: [PATCHV2 2/3] x86, ras: Extend machine check recovery code to annotated ring0 areas

2015-12-15 Thread Luck, Tony
>> +/* Fault was in recoverable area of the kernel */ >> +if ((m.cs & 3) != 3 && worst == MCE_AR_SEVERITY) >> +if (!fixup_mcexception(regs, m.addr)) >> +mce_panic("Failed kernel mode recovery", &m, NULL); > ^^^
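The `(m.cs & 3) != 3` test in the quoted hunk relies on the low two bits of the saved CS selector holding the privilege level at the time of the fault. A minimal sketch of that check is given below; `not_ring3` is a hypothetical helper name, and the selector values mentioned afterwards are assumed typical x86-64 Linux values used only as sample inputs.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * The low two bits of a code-segment selector give the privilege level.
 * A value other than 3 means the fault did not come from user mode (ring 3).
 */
static bool not_ring3(uint16_t cs)
{
    return (cs & 3) != 3;
}
```

With the assumed sample selectors, `not_ring3(0x10)` is true (kernel code segment) and `not_ring3(0x33)` is false (user code segment), so only the former would be a candidate for the kernel-mode fixup path.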

RE: [PATCHV3 1/3] x86, ras: Add new infrastructure for machine check fixup tables

2015-12-16 Thread Luck, Tony
> Looks generally good. > > Reviewed-by: Andy Lutomirski You say that to part 1/3 ... what happens when you get to part 3/3 and you read my attempts at writing x86 assembly code? >> +#ifdef CONFIG_MCE_KERNEL_RECOVERY >> +int fixup_mcexception(struct pt_regs *regs) >> +{ >> + const struct e

RE: [PATCH v3 2/2] mm: Introduce kernelcore=mirror option

2015-12-17 Thread Luck, Tony
>>> As Tony requested, we may need a knob to stop a fallback in >>> "movable->normal", later. >>> >> >> If the mirrored memory is small and the other is large, >> I think we can both enable "non-mirrored -> normal" and "normal -> >> non-mirrored". > > Size of mirrored memory can be configured by

RE: [PATCH v3 2/2] mm: Introduce kernelcore=mirror option

2015-12-17 Thread Luck, Tony
>Hmm...like this ? > sysctl.vm.fallback_mirror_memory = 0 // never fallback # default. > sysctl.vm.fallback_mirror_memory = 1 // the user memory may be > allocated from mirrored zone. > sysctl.vm.fallback_mirror_memory = 2 // usually kernel allocates > memory from mirrored z

RE: [UNTESTED PATCH] x86, mce: Avoid double entry of deferred errors into the genpool.

2015-11-19 Thread Luck, Tony
> Applied, thanks. Did you test it (note the "UNTESTED" in the subject!). My usual system for this is getting upgrades and being flaky at the moment. > Btw, looking at that mce.usable_addr, it doesn't make a whole lotta > sense to me and we can use mce_usable_address() directly instead and use

RE: [PATCH 1/4] EDAC: add DDR4 flag

2015-12-03 Thread Luck, Tony
> For patch 2 and 3 I'd need an ack from Mauro/Tony. CCed. parts 2 & 3 are OK Acked-by: Tony Luck part4 (the actual KNL piece) seems not to break earlier (Broadwell) system ... but that doesn't qualify enough for Ack/Review/Tested -by. -Tony

RE: [PATCH 1/4] EDAC: add DDR4 flag

2015-12-03 Thread Luck, Tony
> It already has your Reviewed-by. Is it still valid? So it does ... that was a long time ago ... but not so long that anything important changed. Yes, still valid. -Tony

Re: [PATCHV2 3/3] x86, ras: Add mcsafe_memcpy() function to recover from machine checks

2015-12-14 Thread Luck, Tony
On Mon, Dec 14, 2015 at 09:36:25AM +0100, Ingo Molnar wrote: > > /* deal with it */ > > > > That way the magic is isolated to the function that needs the magic. > > Seconded - this is the usual pattern we use in all assembly functions. Ok - you want me to write some x86 assembly code (you ma

Re: [PATCHV2 1/3] x86, ras: Add new infrastructure for machine check fixup tables

2015-12-14 Thread Luck, Tony
On Sat, Dec 12, 2015 at 11:11:42AM +0100, Borislav Petkov wrote: > > +config MCE_KERNEL_RECOVERY > > + depends on X86_MCE && X86_64 > > + def_bool y > > Shouldn't that depend on NVDIMM or whatnot? Looks too generic now. Not sure what the "whatnot" would be though. Making it depend on X86_MCE

RE: [PATCHV2 3/3] x86, ras: Add mcsafe_memcpy() function to recover from machine checks

2015-12-15 Thread Luck, Tony
>> ... and the non-temporal version is the optimal one even though we're >> defaulting to copy_user_enhanced_fast_string for memcpy on modern Intel >> CPUs...? My current generation cpu has a bit of an issue with recovering from a machine check in a "rep mov" ... so I'm working with a version of m

RE: [UNTESTED PATCH] x86, mce: Avoid double entry of deferred errors into the genpool.

2015-11-24 Thread Luck, Tony
>> Ok ... applied those two on top of my "UNTESTED" patch and injected an error >> to force a UCNA log. > > Ok, what error type is that in EINJ nomenclature? I had only > > /sys/kernel/debug/apei/einj/available_error_type:0x0002 Processor > Uncorrectable non-fatal > /sys/kernel/debug/apei

RE: [Patch V2] x86, mce: Ensure offline CPU's don't participate in mce rendezvous process.

2015-12-08 Thread Luck, Tony
> No, the system did panic in both times. The "strange" observation is > that the MCE gets reported only on the cores on node 0. Or at least only > the printks from mce_panic() on the cores on node0 reach the serial > console. You only see messages and logs from node0, because the cpus there are t

RE: [PATCH 3/3] x86, ras: Add mcsafe_memcpy() function to recover from machine checks

2015-12-08 Thread Luck, Tony
> Is that an "Acked-by"? I'd like to pull this plus Vishal's > gendisk-badblocks patches into a unified libnvdimm-error-handling > branch. We're looking to have v4.5 able to avoid or survive nvdimm > media errors through the pmem driver and DAX paths. I'm making a V2 that fixes some build errors

RE: [PATCH v3 2/2] mm: Introduce kernelcore=mirror option

2015-12-09 Thread Luck, Tony
> How about add some comment, if mirrored memory is too small, then the > normal zone is small, so it may be oom. > The mirrored memory is at least 1/64 of whole memory, because struct > pages usually take 64 bytes per page. 1/64th is the absolute lower bound (for the page structures as you say).
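The 1/64 figure in the snippet above can be checked with a little arithmetic. This is a sketch only: it assumes 4 KiB pages (the snippet does not state the page size) and takes the 64-byte page structure from the quoted text. The function and macro names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES   4096ULL  /* assumption: 4 KiB pages */
#define STRUCT_PAGE_BYTES 64ULL    /* from the quoted text: about 64 bytes per page */

/* Bytes of page structures needed to describe total_bytes of memory. */
static uint64_t page_struct_bytes(uint64_t total_bytes)
{
	return (total_bytes / PAGE_SIZE_BYTES) * STRUCT_PAGE_BYTES;
}
```

For 64 GiB of RAM this gives 1 GiB, i.e. 64/4096 = 1/64 of the total, before counting any other kernel allocations that would also have to fit in mirrored memory.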

RE: [PATCH 03/34] ia64: rename nop->iosapic_nop

2015-12-30 Thread Luck, Tony
> asm-generic/barrier.h defines a nop() macro. > To be able to use this header on ia64, we shouldn't > call local functions/variables nop(). > > There's one instance where this breaks on ia64: > rename the function to iosapic_nop to avoid the conflict. Acked-by: Tony Luck

RE: [PATCH 04/34] ia64: reuse asm-generic/barrier.h

2015-12-30 Thread Luck, Tony
> On ia64 smp_rmb, smp_wmb, read_barrier_depends, smp_read_barrier_depends > and smp_store_mb() match the asm-generic variants exactly. Drop the > local definitions and pull in asm-generic/barrier.h instead. > > This is in preparation to refactoring this code area. Acked-by: Tony Luck

RE: [PATCH v6 2/4] x86: Cleanup and add a new exception class

2016-01-04 Thread Luck, Tony
> So you're touching those again in patch 2. Why not add those defines to > patch 1 directly and diminish the churn? To preserve authorship. Andy did patch 1 (the clever part). Patch 2 is just syntactic sugar on top of it. -Tony

RE: [PATCH 6/6] arm64: switch to relative exception tables

2016-01-04 Thread Luck, Tony
> May I humbly ask why the [Finnish] you don't use the equivalent of the > x86 _ASM_EXTABLE() macro? In fact, why don't we make that one generic, too? I'm messing with that right now (with help from Andy Lutomirski and Boris) to add different classes of exception table (so I can tag some instruct

Re: [PATCH 6/6] arm64: switch to relative exception tables

2016-01-04 Thread Luck, Tony
On Mon, Jan 04, 2016 at 08:28:52PM +0100, Ard Biesheuvel wrote: > On 4 January 2016 at 20:21, H. Peter Anvin wrote: > > I suspect that means we will also need to go back to arch-specific > > sorting for x86. > > > > AFAICT, Tony's patches are not incompatible with mine. The fixup > address is off

[GIT PULL] Machine check recovery when kernel accesses poison

2016-02-16 Thread Luck, Tony
The following changes since commit 18558cae0272f8fd9647e69d3fec1565a7949865: Linux 4.5-rc4 (2016-02-14 13:05:20 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-mcsafev11 for you to fetch changes up to 2e5bfb23c89800a

RE: [PATCHv2 0/2] fs/pstore: Use memcpy_from/toio() instead of memcpy.

2016-02-17 Thread Luck, Tony
> Tony, are you able to pull these? I've been distracted ... I need to dig into the pile of pending pstore patches. Was there a consensus on the device tree ones? I saw a "you shouldn't do that", and a "but it's really convenient and doesn't hurt anyone else" exchange. -Tony

RE: [PATCH v11 3/4] x86, mce: Add __mcsafe_copy()

2016-02-18 Thread Luck, Tony
> > > I think the whole notion of mcsafe here is 'wrong'. This copy variant > > > simply > > > reports the kind of trap that happened (#PF or #MC) and could arguably be > > > extended to include more types if the hardware were to generate more. > > > > What would a better name be? memcpy_ret()
