[GIT PULL] Some error injection fixes to queue for 3.11

2013-06-07 Thread Luck, Tony
The following changes since commit d683b96b072dc4680fc74964eca77e6a23d1fa6e: Linux 3.10-rc4 (2013-06-02 17:11:17 +0900) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-einj for you to fetch changes up to ace3647afb3eca214f6

RE: [PATCH v3 06/27] ia64, irq: Add dummy create_irq_nr()

2013-06-11 Thread Luck, Tony
> Still need ia64 guys to kill create_irq() in arch/ia64. Was there already a patch to do that? I'm afraid my eyes tend to glaze over when I see [part 64/87: ia64 ...] and assume that it's general cleanup that will flow through with all the other parts of the patch series. Please point me at som

[GIT PULL] Fixes for pstore for 3.11 merge window

2013-07-01 Thread Luck, Tony
The following changes since commit d683b96b072dc4680fc74964eca77e6a23d1fa6e: Linux 3.10-rc4 (2013-06-02 17:11:17 +0900) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-pstore for you to fetch changes up to 0d838347f1325c

[GIT PULL] Fix thermal power-limit code

2013-07-01 Thread Luck, Tony
The following changes since commit 317ddd256b9c24b0d78fa8018f80f1e495481a10: Linux 3.10-rc5 (2013-06-08 17:41:04 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-mce-therm for you to fetch changes up to 6bb2ff846f2

RE: [PATCH] x86/mce: Update MCE severity condition check

2013-06-26 Thread Luck, Tony
MCESEV( > - KEEP, "HT thread notices Action required: data load error", > - SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, > MCI_UC_SAR|MCI_ADDR|MCACOD_DATA), > - MCGMASK(MCG_STATUS_EIPV, 0) > + KEEP, "Action required but unaffected th

RE: [PATCH] x86/mce: Update MCE severity condition check

2013-06-26 Thread Luck, Tony
> And this obviously is the case for the hardware too, I assume, not only > the SDM? Yes - we have a magic process which reconfigures all deployed silicon whenever a new SDM is published :-) Actually the SDM had been collecting new features for each generation ... each time just bolting on a new

[GIT PULL] Another mce cleanup for the 3.11 queue

2013-06-27 Thread Luck, Tony
The following changes since commit 9e895ace5d82df8929b16f58e9f515f6d54ab82d: Linux 3.10-rc7 (2013-06-22 09:47:31 -1000) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-mce for you to fetch changes up to 33d7885b594e169256dae

RE: [patch] pstore: d_alloc_name() doesn't return an ERR_PTR

2013-08-14 Thread Luck, Tony
>> Signed-off-by: Dan Carpenter > > Thanks for the catch! > > Acked-by: Kees Cook Thanks. Applied. -Tony

RE: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error trace event

2013-08-14 Thread Luck, Tony
> Didn't we say at some point, "log only the panic message which kills > the machine"? We've wandered around different strategies here. We definitely want the panic log. Some people want all other "kernel exit" logs (shutdown, reboot, kexec). When there is enough space in the pstore backend we

RE: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error trace event

2013-08-15 Thread Luck, Tony
> * We parse some APEI table and disable those MCA banks which the BIOS > wants to handle first. We have no idea which errors the BIOS has chosen for itself. We just know which bank numbers ... and Intel processors change mappings of which errors are logged in which banks in every new processor t

RE: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error trace event

2013-08-15 Thread Luck, Tony
> Well, if I have serial connected to the box, it will contain basically > everything the machine said, no? Yes - but the serial port is too slow to log everything that you might conceivably need to debug your problem. Imagine trying to log every interrupt and every pagefault on every processor d

RE: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error trace event

2013-08-15 Thread Luck, Tony
> AFAIKT, APEI doesn't provide the silkscreen label. Some code (or some > datasheet) is needed to translate between what APEI provides into the > silkscreen label. In theory it could. The ACPI generic error structure used to report includes a 20-byte free format field which a BIOS could use to des

RE: [RFC PATCH v2 00/11] Add (de)compression support to pstore

2013-08-16 Thread Luck, Tony
> Needs testing with erst backend, efivars and persistent ram. Tested against ERST - works fine for me now. Need to stare at the code to see if there are any more bits that could be cleaned up. Thanks for addressing my issues from v1 -Tony

RE: [Ksummit-2013-discuss] When to push bug fixes to mainline

2013-07-16 Thread Luck, Tony
>> Maybe some QA period before the release might help, but who would >> care? (Especially under the situation where everybody has own x.y >> stable tree?) > > Hopefully people tracking the upstream stable trees would be throwing > any pre-release stuff into their QA processes before it was officia

RE: [Ksummit-2013-discuss] [ATTEND] How to act on LKML

2013-07-17 Thread Luck, Tony
> Those are just stories; things that happened. What you need to provide > is *evidence* that if the community changes, things will be better, > and unless you have a study of series of collaborative groups like the > Linux kernel, that demonstrates that suppressing swearing has a > positive effect

[GIT PULL] Fix a regression in mce-severity.c

2013-07-30 Thread Luck, Tony
[ Sent this as a patch last Thursday - seems to have been lost in all the general noise, or is in a queue and I'm just too impatient: http://marc.info/?l=linux-kernel&m=137478705715276&w=2 ] The following changes since commit 5ae90d8e467e625e447000cb4335c4db973b1095: Linux 3.11-rc3 (2013-07

RE: [PATCH 00/11] Add compression support to pstore

2013-08-01 Thread Luck, Tony
> Could you please review and let me know your comments!! Skimmed through it today and didn't notice anything I hated. It built fine - but doesn't seem to be working on top of ERST. This doesn't seem to be your fault though, when I rebuilt a plain 3.11-rc3 it didn't log anything via pstore

[GIT PULL] x86/mce fix to queue for 3.12

2013-08-05 Thread Luck, Tony
The following changes since commit c095ba7224d8edc71dcef0d655911399a8bd4a3f: Linux 3.11-rc4 (2013-08-04 13:46:46 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-mce-f-bit for you to fetch changes up to 0ca06c0857aee1

RE: [RFC PATCH v2 04/11] pstore: Add compression support to pstore

2013-08-22 Thread Luck, Tony
<1>[ 383.209057] RIP [] sysrq_handle_crash+0x16/0x20 <4>[ 383.209057] RSP <4>[ 383.209057] CR2: <4>[ 383.209057] ---[ end trace 04a1cddad37b4b33 ]--- <3>[ 383.209057] pstore: compression failed for Part 2 returned -5 <3>[ 383.209057] pstore: Capture uncompressed oops/panic

RE: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-18 Thread Luck, Tony
+ BUG_ON(mci->mc_idx >= EDAC_MAX_MCS); Do we have to "BUG_ON()" here? Couldn't we be gentler with something like: if (mci->mc_idx >= EDAC_MAX_MCS) { printk_once(KERN_WARNING "Too many memory controllers\n"); return; /* probably need to make sure call
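The suggestion above (warn once and return instead of BUG_ON()) can be sketched in plain userspace C. This is only an illustration of the pattern, not the EDAC code: the value of EDAC_MAX_MCS, the struct layout, register_mc() and registered_count are all assumptions made up for the example, and fprintf() with a static flag stands in for printk_once().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumed limit for illustration only; the kernel defines the real EDAC_MAX_MCS. */
#define EDAC_MAX_MCS 16

/* Hypothetical, cut-down stand-in for the kernel structure. */
struct mem_ctl_info {
	unsigned int mc_idx;
};

/* Number of controllers accepted so far (illustrative bookkeeping). */
static int registered_count;

/*
 * Returns true if the controller was accepted, false if it was rejected.
 * An out-of-range index prints a single warning and returns cleanly
 * instead of halting, which is the behaviour printk_once() + return
 * would give in place of BUG_ON().
 */
static bool register_mc(const struct mem_ctl_info *mci)
{
	static bool warned;

	assert(mci != NULL);

	if (mci->mc_idx >= EDAC_MAX_MCS) {
		if (!warned) {
			warned = true;
			fprintf(stderr, "Too many memory controllers\n");
		}
		return false;
	}

	registered_count++;
	return true;
}
```

The caller still has to cope with the rejected controller, which is what the truncated "probably need to make sure call..." remark in the message appears to be getting at.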

[PATCH] mce: acpi/apei: Only disable banks listed in HEST if mce is configured

2013-07-19 Thread Luck, Tony
From: "Naveen N. Rao" Randconfig testing found this error: >> hest.c(.init.text+0x6004): undefined reference to 'mce_disable_bank' Fix by wrapping body of hest_parse_cmc() inside #ifdef CONFIG_X86_MCE Reported-by: "Wu, Fengguang" Signed-off-by: Naveen N. Rao Signed-off-by: Tony Luck --- [Pe

RE: [Ksummit-2013-discuss] Maybe it's time to shut this thread down (Was: Re: [ 00/19] 3.10.1-stable review)

2013-07-22 Thread Luck, Tony
On 07/18/2013 03:54 PM, Sarah Sharp wrote: > Let's shift this discussion away from the terms "abuse" and > "professionalism" to "respect" and "civility". And Daniel Philips replied: > Brilliant, and +1 for a session at KS. In the mean time, why don't we > all try to demonstrate the real meaning o

[PATCH] x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors.

2013-07-23 Thread Luck, Tony
The 0x1000 bit of the MCACOD field of machine check MCi_STATUS registers is only defined for corrected errors (where it means that hardware may be filtering errors see SDM section 15.9.2.1). For uncorrected errors it may, or may not be set - so we should mask it out when checking for the architect
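As a rough sketch of the masking described above: the 0x1000 and 0xefff values come from this thread, while the macro names, the helper uc_mcacod_matches() and the example code value 0x0134 are assumptions chosen for illustration rather than the definitions used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MCACOD_FIELD 0xffffULL  /* MCACOD occupies the low 16 bits of MCi_STATUS */
#define MCACOD_F_BIT 0x1000ULL  /* 'F' bit: only defined for corrected errors */
#define MCACOD_NO_F  (MCACOD_FIELD & ~MCACOD_F_BIT)  /* evaluates to 0xefff */

/* Example architectural error code, assumed here purely for illustration. */
#define EXAMPLE_MCACOD_DATA 0x0134ULL

/*
 * Compare the MCACOD field of an uncorrected error against an expected
 * code while ignoring the 'F' bit, which may or may not be set.
 */
static bool uc_mcacod_matches(uint64_t status, uint64_t code)
{
	/* The expected code must not itself use the 'F' bit or upper bits. */
	assert((code & ~MCACOD_NO_F) == 0);

	return (status & MCACOD_NO_F) == code;
}
```

With this helper a status whose low bits are 0x1134 and one whose low bits are 0x0134 both match the same expected code, so the rule no longer depends on an undefined bit.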

RE: [PATCH] x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors.

2013-07-24 Thread Luck, Tony
> How about just changing MCACOD to 0xefff? I don't think we ever care > about the 'F' bit, so we could simplify this by just changing MCACOD. That certainly reduces the size of the patch ... I was a little worried about just changing this because it doesn't match the definition of the MCACOD fiel

RE: [PATCH] x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors.

2013-07-25 Thread Luck, Tony
MCESEV( + PANIC, "Action required but kernel thread is not continuable", + SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR, MCI_UC_SAR|MCI_ADDR), + MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, MCG_STATUS_RIPV|MCG_STATUS_EIPV), + KERNEL +

[PATCH] x86/mce: Fix mce regression from recent cleanup

2013-07-25 Thread Luck, Tony
In commit 33d7885b594e169256daef652e8d3527b2298e75 x86/mce: Update MCE severity condition check we simplified the rules to recognise each classification of recoverable machine check combining the instruction and data fetch rules into a single entry based on clarifications in the June 2013 SDM t

RE: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error trace event

2013-08-12 Thread Luck, Tony
>> We are, of course, going to have only one tracepoint which reports >> memory errors, not two. > > Yes, that's my point. Is life that simple? We have systems that have no EDAC driver (in some cases because the architecture precludes one from ever being written, in other because we either don't

RE: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error trace event

2013-08-13 Thread Luck, Tony
> In the meantime, like Boris suggests, I think we can have a different > trace event for raw APEI reports - userspace can use it as it pleases. > > Once ghes_edac gets better, users can decide whether they want raw APEI > reports or the EDAC-processed version and choose one or the other trace >

RE: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error trace event

2013-08-13 Thread Luck, Tony
> Why would you need dmesg if you get your hw errors over the tracepoint? Redundancy is a good thing when talking about mission critical systems. dmesg may be feeding to a serial console to be logged and analysed on another system. The tracepoint data goes to a process on the system experiencing

RE: [PATCH 3/3] mce: acpi/apei: trace: Enable ghes memory error trace event

2013-08-13 Thread Luck, Tony
> What about sending tracepoint data over serial and/or network? I agree > that dmesg over serial would be helpful but we need a similar sure-fire > way for carrying error info out. Generic tracepoints are architected to be able to fire at very high rates and log huge amounts of information. So w

RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-10 Thread Luck, Tony
> With hyperthread turns on, the num_online_cpus reports the number of all > logical cores. > What I found in testing is only half the cores receives the mce broadcast, so > I assume only the physical cores get broadcast. See Intel Software Developer Manual Volume 3B Section 15.10.4.1, 3rd bulle

RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-10 Thread Luck, Tony
> I used intel edac error injector and saw the same problem. I actually wrote > down the core numbers > and I saw mce got to 0-5 and 12-17, but not the others. I have 2 sockets, 24 > logical cores. Mauro: How does the EDAC injector work on E5645 (Westmere-EP)? Does it create a real error in m

RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-10 Thread Luck, Tony
> So only one socket gets the machine check. So is there still a problem but > the fix will be different? > I think the error inject creates a real machine check, but since each CPU has > its own memory controller, > the machine check may only send to the CPU the error happens. If there is a rea

RE: CONFIG_HYPERVISOR_GUEST=y {-- replace -- CONFIG_PARAVIRT_GUEST=y {= { # CONFIG_HYPERVISOR_GUEST is not set } Re: 3.10-rc1 Fw: [PATCH 2/2] x86: Make Linux guest support optional

2013-05-15 Thread Luck, Tony
> Tony, what's your take on this - CONFIG_PARAVIRT_GUEST is > present in ia64 and I recently changed the Kconfig symbol to > CONFIG_HYPERVISOR_GUEST on x86. I can fix it for correctness but is > hyperv and vmware balloon even attempted on ia64? > > Btw, it depends on BROKEN on ia64 so I'm already s

RE: [PATCH] staging/adt7316 Fix some 'interesting' string operations

2013-04-08 Thread Luck, Tony
> I think it is a good idea to switch directly to strtobool. But anyway, if you > don't want to respin the patch it is fine as it is. I didn't know that strtobool() existed ... but now that I do I agree that it would be better to use it here. But ... I'm less comfortable updating the patch to u
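For readers who, like the author here, had not met strtobool(): a minimal userspace sketch of that style of parsing, assuming the first-character convention ('y', 'Y', '1' for true; 'n', 'N', '0' for false; anything else rejected). parse_bool() is a hypothetical stand-in written for this note, not the kernel function and not the code from the adt7316 patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace stand-in for a strtobool()-style helper.
 * Only the first character is examined.
 * Returns 0 and stores the result on success, -EINVAL otherwise.
 */
static int parse_bool(const char *s, bool *res)
{
	assert(res != NULL);

	if (s == NULL)
		return -EINVAL;

	switch (s[0]) {
	case 'y':
	case 'Y':
	case '1':
		*res = true;
		return 0;
	case 'n':
	case 'N':
	case '0':
		*res = false;
		return 0;
	default:
		return -EINVAL;
	}
}
```

The appeal over hand-rolled string comparisons is that every caller gets the same accepted inputs and the same error return.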

RE: [PATCH] x86, mce: Print warning if MCE handler fails to register /dev/mcelog

2013-04-09 Thread Luck, Tony
- misc_register(&mce_chrdev_device); + if (misc_register(&mce_chrdev_device) != 0) + pr_warn("Failed to register mcelog device\n"); Did this actually happen to you? Or is this just "good practice" to check the return value from misc_register? If this can really happen,

RE: [PATCH] pstore/ram: fix error return code in ramoops_probe()

2013-05-08 Thread Luck, Tony
> From: Wei Yongjun > > Fix to return a negative error code from the error handling > case instead of 0, as done elsewhere in this function. Applied - will be in a "please pull" to Linus soon. Thanks -Tony

[GIT PULL] Couple of trivial pstore cleanups

2013-05-09 Thread Luck, Tony
The following changes since commit f6161aa153581da4a3867a2d1a7caf4be19b6ec9: Linux 3.9-rc2 (2013-03-10 16:54:19 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-pstore for you to fetch changes up to 47110b88912a997

RE: x86_mce: mce_start uses number of phsical cores instead of logical cores

2013-05-10 Thread Luck, Tony
> +#if NR_CPUS > 1 > + cpus /= cpumask_weight(cpu_core_mask(0)) / cpu_data(0).booted_cores; > +#endif Not entirely sure what you are trying to do here (apart from making "cpus" be a smaller number). What is the reasoning behind the right hand side of this expression? Is this problem more rel

RE: memcpy_fromio in dmi_scan.c

2013-04-23 Thread Luck, Tony
> I don't have much knowledge about IA64 either. All I see is that while > x86 implements memcpy_fromio() with memcpy [1], ia64 implements it with > readb [2]. There must be a reason for that, and I can only suppose that > memcpy on __iomem pointers doesn't work on IA64. If memcpy doesn't work > th

RE: memcpy_fromio in dmi_scan.c

2013-04-24 Thread Luck, Tony
> That being said, "my" SN2 machine was previously running kernel 3.0.34 > which has the old dmi_scan code and it also said "DMI not present or > invalid." Plus dmidecode fails on this machine with: > > /sys/firmware/efi/systab: SMBIOS entry point missing > > So it might as well be that DMI support

RE: [PATCH RT v2] x86/mce: Defer mce wakeups to threads for PREEMPT_RT

2013-04-12 Thread Luck, Tony
> I'm not (yet). But I just wanted to make sure there wasn't any little > subtleties that I might be missing. I don't think there are any hidden subtleties ... if there are then they are hidden from me too. -Tony

RE: [PATCH] ia64: Fix example error_injection_tool

2013-03-28 Thread Luck, Tony
Nice catch. - mask[j]=1

RE: [PATCH 2/2] PCI/IA64: fix pci_dev->enable_cnt balance when doing pci hotplug

2013-04-01 Thread Luck, Tony
> In IA64 platform, we don't call pci_enable_bridges() > when scan all pci buses during system boot up. But in > X86 we do it in Your patch looks plausible ... but I have a question. X86 doesn't *directly* call pci_enable_bridges() from any arch/x86/* file. Do we need this in an arch/ia64 file be

Re: [RESEND PATCH 0/5 V2] x86: mce: Bugfixes, cleanups and a new CMCI poll version

2012-08-03 Thread Luck, Tony
I applied this series on top of v3.6-rc1 and took it for a test drive with a little storm of 20 corrected interrupts. The series worked ... but the console log was entirely unhelpful in letting me know what had just happened to my system. All I saw was: mce: [Hardware Error]: Machine check event

RE: [PATCH] pstore: fix printk format warning

2012-08-06 Thread Luck, Tony
> Btw, I see no maintainers for the pstore, and it surely no longer > belongs to staging. Tony, I can send patches to you, or I can create > a git tree (actually, I already had it for my own convenience).. So > how about the following patch? Acked-by: Tony Luck

RE: [PATCH 1/2] x86, mce: Enable MCA support by default

2012-09-17 Thread Luck, Tony
> MCA is the basic support for hardware error logging and reporting, and > it is majorly unwise to run without it so enable machine check software > support by default on x86. Acked-by: Tony Luck

RE: [PATCH 0/6][RFC] Rework vsyscall to avoid truncation/rounding issue in timekeeping core

2012-09-19 Thread Luck, Tony
> Does anything except the vDSO actually use the vDSO data page? It's > mapped as part of the vDSO image (i.e. at a non-constant address), and > it's not immediate obvious how userspace would locate that page. Just for reference - on ia64 the address of the entry point for the magic fast system c

RE: [PATCH] pstore: avoid recursive spinlocks in the oops_in_progress case

2012-09-20 Thread Luck, Tony
> Mm... why break? We don't know what the back-end driver will do if we allow another call while a previous one is still in progress. It might end up corrupting the backing non-volatile storage and losing some previously saved records. Existing drivers (ERST and EFI) are dependent on f/w ... so

RE: [PATCH] pstore: avoid recursive spinlocks in the oops_in_progress case

2012-09-20 Thread Luck, Tony
> True, but the lock is used to protect pstore->buf, I doubt that > any backend will actually want to grab it, no? The lock is doing double duty to protect the buffer, and the back-end driver. But even if we split it into two (one for the buffer, taken by pstore, and one internal to the backend t

RE: [PATCH 3/3] HWPOISON: improve handling/reporting of memory error on dirty pagecache

2012-08-11 Thread Luck, Tony
> dirty pagecache error recoverable under some conditions. Consider that > if there is a copy of the corrupted dirty pagecache on user buffer and > you write() over the error page with the copy data, then we can ignore > the effect of the error because no one consumes the corrupted data. This soun

[GIT PULL] x86 - handle CMCI storms gracefully

2012-08-12 Thread Luck, Tony
The following changes since commit 0d7614f09c1ebdbaa1599a5aba7593f147bf96ee: Linux 3.6-rc1 (2012-08-02 16:38:10 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-cmci-storm3 for you to fetch changes up to 55babd8f41f12

RE: [BUGFIX PATCH][RESEND] kexec & iosapic: kexec oops when iosapic was removed

2012-08-12 Thread Luck, Tony
> vec = irq_to_vector(irq); > list_for_each_entry(rte, &info->rtes, > rte_list) { > + if (rte->refcnt == NO_REF_RTE) > + continue; > + > iosapic_write(rte->iosapic, >

RE: [RFC][PATCH] kprobes: kprobe-booster for ia64

2008-02-05 Thread Luck, Tony
+/* Insert a long branch code */ +static void __kprobes set_brl_inst(void *from, void *to) +{ + s64 rel = ((s64) to - (s64) from) >> 4; + bundle_t *brl; + brl = (bundle_t *) ((u64) from & ~0xf); + brl->quad0.template = 0x05; /* [MLX](stop) */ + brl->quad0.slot0 = N

RE: [RFC][PATCH v2 2/3] Hold multiple logs

2012-07-20 Thread Luck, Tony
> > What is the harm of not using this and just letting the number be infinite > > (or until EFI runs out of space)? Is it a big deal if extra failures > > are logged? The big question is what happens when EFI runs out of space. Matthew avoided the question by implementing the "just one record"

RE: [PATCH 2/2] x86/mce: Add quirk for instruction recovery on Sandy Bridge processors

2012-07-23 Thread Luck, Tony
> Other things which could probably be used are alternatives or jump > labels but one if-test is simply not worth the complexity. It might be if this were a super-hot path. But if you are getting so many machine checks that you can see the effect of one extra "if" ... then you are hurting in so m

RE: [RFC][PATCH v2 2/3] Hold multiple logs

2012-07-24 Thread Luck, Tony
> I talked with Matthew a bit privately and he suggested to use > QueryVariableInfo service which is supported in EFI 2.0 or later. > If we can use it, we know the remaining NVRAM space before calling > SetVariable. So we can have (pseudo)-code like this: if (QueryVariableInfo says eno

RE: [RFC][PATCH v2 2/3] Hold multiple logs

2012-07-24 Thread Luck, Tony
> One thing that's worth noting - UEFI systems will typically only recover > deleted space on reset. create->delete->create->delete will reduce > available space until the platform is rebooted, at which point the > deleted portion will become available again. Some ACPI/ERST systems do this too

RE: [RFC][PATCH v2 2/3] Hold multiple logs

2012-07-24 Thread Luck, Tony
> So, we don't need to introduce a overwriting policy. > I will make a patch using QueryVariableInfo and just writing multiple logs. I don't think that's what Matthew said. Here's the bad scenario he envisions: System is running. It has an OOPs, which gets logged by pstore, and the system carri

RE: [RFC][PATCH v2 2/3] Hold multiple logs

2012-07-24 Thread Luck, Tony
> I think we inevitably lose in that scenario. I'd need to verify, but my > recollection is that overwriting existing variables may be equivalent to > a delete/create cycle. This would mean that EFI really wants the OS to treat EFI variables as pretty much exclusively read-only. Any activity w

RE: [PATCH] debug: Do not permit CONFIG_DEBUG_STACK_USAGE=y on IA64 or PARISC

2012-07-25 Thread Luck, Tony
> Since the problem is an invalid assumption about how the stack grows, > why not just condition it on that. We actually have a config option for > this: CONFIG_STACK_GROWSUP. But for some reason ia64 doesn't define > this, why not, Tony? It looks deliberate because you have replaced a > lot of

RE: [PATCH] ia64: rename platform_* to ia64_platform_*

2012-07-25 Thread Luck, Tony
>> Is platform_name particularly special? Yes. It is the symbol that is currently colliding with other subsystem namespace. >> Perhaps it's be better to rename all the other >> platform_ uses to ia64_platform_ > > That's good point in general, oh well I just wanted to make the minimal > change.

RE: [PATCH] x86/mce: Need to let kill_proc() send signal to doomed process

2012-07-09 Thread Luck, Tony
> This makes mi->restartable unused? It does ... but it's not what I meant ... somehow I lost the code that set MF_MUST_KILL based on mi->restartable. Doh! >> +doit = !!PageDirty(ppage) || (flags & MF_MUST_KILL) != 0; > > Maybe > >!!(flags & MF_MUST_KILL)

[GIT PULL] x86/mce fix (ready for 3.6 merge window)

2012-07-10 Thread Luck, Tony
The following changes since commit 6887a4131da3adaab011613776d865f4bcfb5678: Linux 3.5-rc5 (2012-06-30 16:08:57 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-mce-ripvfix for you to fetch changes up to b99c2fc9366d4

[GIT PULL v2] x86/mce fix (ready for 3.6 merge window)

2012-07-11 Thread Luck, Tony
The following changes since commit 6887a4131da3adaab011613776d865f4bcfb5678: Linux 3.5-rc5 (2012-06-30 16:08:57 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git mce-ripvfix for you to fetch changes up to 6751ed65dc6642af64f7b8a440a7556

RE: [RFC PATCH 3/3] Convert mce_disabled

2012-10-12 Thread Luck, Tony
> Or, you can modify the mca_config I have there and use bools and pass a > pointer to each actual bool member in each DEVICE_BIT_ATTR invocation > (and rename it to DEVICE_BOOL_ATTR). Yeah, that could work, unless I'm > missing something else, of course. This looks like the best solution to me. S

RE: [PATCH v5 0/5] Add movablecore_map boot option

2013-01-14 Thread Luck, Tony
> hm, why. Obviously SRAT support will improve things, but is it > actually unusable/unuseful with the command line configuration? Users will want to set these moveable zones along node boundaries (the whole purpose is to be able to remove a node by making sure the kernel won't allocate anything

Re: [PATCH v5 0/5] Add movablecore_map boot option

2013-01-14 Thread Luck, Tony
>> >> I don't think so because user can easily get raw address by kernel >> message in x86. >> Which will fail if on some subsequent boot a DIMM fails BIST and is removed from the memory map by the BIOS which will then change all the node boundaries for those above the failed DIMM. -Tony

RE: [TRIVIAL PATCH 16/26] x86: Convert print_symbol to %pSR

2012-12-12 Thread Luck, Tony
> I think I'd go ahead and ACK this unless Tony has some comments. I'm not > happy about the two pr_emerg calls based on the conditional. As written the patch has the nice property of not making any changes to the console output (except to eliminate the possibility of interleaved output that the o

[GIT PULL] x86/mce allow bios to set per-bank CMCI threshold

2012-09-27 Thread Luck, Tony
The following changes since commit 961ebea4ae68075bb5a0acc19f5852bed82bb877: Merge tag 'ras_queue_for_3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/mce (2012-09-19 17:01:50 +0200) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/gi

RE: [PATCH v5 0/5] Add movablecore_map boot option

2013-01-17 Thread Luck, Tony
> 2. If the user *does* care which nodes are movable, then the user needs > to be able to specify that *in a way that makes sense to the user*. > This may mean involving the DMI information as well as SRAT in order to > get "silk screen" type information out. One reason they might care would be

RE: [PATCH v10 3/3] aerdrv: Cleanup log output for AER

2013-01-17 Thread Luck, Tony
> These changes make cper_print_aer more consistent with aer_print_error > and clean things up by eliminating the use of the prefix variable and > replacing it with dev_printk. Applied v10 series and put it into my "next" branch so linux-next will grab it on the next cycle. Will try to interest t

RE: [PATCH v5 0/5] Add movablecore_map boot option

2013-01-18 Thread Luck, Tony
> kernel absolutely should not care much about SMBIOS (DMI info), > AFAIK, every BIOS vendor did not fill accurate info in SMBIOS, > mostly only on demand when OEMs required SMBIOS to report some > specific info. > furthermore, SMBIOS is so old and benefits nobody (in my personal > opinion), so maybe

RE: [patch] sched: unlocked context-switches

2005-04-08 Thread Luck, Tony
>tested on x86, and all other arches should work as well, but if an >architecture has irqs-off assumptions in its switch_to() logic >it might break. (I havent found any but there may such assumptions.) The ia64_switch_to() code includes a section that can change a pinned MMU mapping (when the st

RE: more git updates..

2005-04-10 Thread Luck, Tony
>Also, I did actually debate that issue with myself, and decided that even >if we do have tons of files per directory, git doesn't much care. The >reason? Git never _searches_ for them. Assuming you have enough memory to >cache the tree, you just end up doing a "lookup", and inside the kernel >that

RE: [PATCH] consolidate sys_ptrace

2005-08-10 Thread Luck, Tony
>Some architectures have a too different ptrace so we have to exclude >them: alpha, ia64, m32r, parisc, sparc, sparc64. They continue to >keep their implementations. So it should be no surprise that this patch works ok for ia64, but here is the ACK anyway. >+#ifndef __ARCH_SYS_PTRACE Most of th

RE: [PATCH] IDE: don't offer IDE_GENERIC on ia64

2005-08-11 Thread Luck, Tony
>Tony, others, does this change give you any heartburn? On >the 460GX and 870 boxes I have, IDE is a PCI device. No heartburn for me ... as you say IDE is built into one of the 870 chips. I don't know whether any non-Intel chipsets provide legacy IDE. -Tony

RE: Multiple virtual address mapping for the same code on IA-64 linux kernel.

2005-08-16 Thread Luck, Tony
>I have been investigating a problem in which there has been a dramatic > core size (complete program size) of a program running on a IA-64 >machine running kernel version 2.4.21-4.0.1 (A redhat advanced server >distribution) compared to other 64-bit architectures like amd64 and >EM64T. There has

RE: [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later

2005-04-21 Thread Luck, Tony
>I just checked 2.6.12-rc3 and the fls() fix is indeed missing. Do you >know what happened? If BitKeeper were still in use, I'd have dropped that patch into my "release" tree and asked Linus to "pull" ... but it's not, and I was stalled. I should have a "git" tree up and running in the next coup

RE: [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later

2005-04-21 Thread Luck, Tony
>Yeah, I'm facing the same issue. I started playing with git last >night. Apart from disk-space usage, it's very nice, though I really >hope someone puts together a web-interface on top of git soon so we >can seek what changed when and by whom. Disk space issues? A complete git repository of th

RE: [Gelato-technical] Re: Serious performance degradation on a RAID with kernel 2.6.10-bk7 and later

2005-04-21 Thread Luck, Tony
>That said, is there any plan to change how this functions in the future >to solve these problems? I.e. have it not use so much diskspace and >thus use less bandwith. Am I misunderstanding in assuming that after >say 1000 commits go into the tree it could end up several megs or gigs >bigger? >

RE: [patch] properly stop devices before poweroff

2005-07-26 Thread Luck, Tony
I started on my OLS homework from Andrew ... and began looking into what is going on here. The story so far: Pavel added calls to device_suspend() to three of the cases in the sys_reboot() path. This stopped ia64 from being able to shut down. There is an oops with a stacktrace pointing back at the

RE: [PATCH] e1000: no need for reboot notifier

2005-07-26 Thread Luck, Tony
>> sys_reboot() now calls device_suspend(), so it is no longer necessary for >> the e1000 driver to register a reboot notifier [in fact doing so results >> in e1000_suspend() getting called twice]. > >Does this fix the ia64 reboot, or do we still have the >mpt-fusion problem? We still have the mp

RE: [patch] properly stop devices before poweroff

2005-07-27 Thread Luck, Tony
>> The remaining problem is cause by the order of the calls in sys_reboot: >> >> device_suspend(PMSG_SUSPEND); >> device_shutdown(); >> >> The call to device_suspend() shuts down the mpt/fusion >driver. But then >> device_shutdown() calls sd_shutdown() which prin

removal of sys_set_zone_reclaim

2005-08-02 Thread Luck, Tony
The definition of __NR_set_zone_reclaim is still in the i386 and ia64 versions of <asm/unistd.h>. Was this intentional (keep the system call number reserved in case this is resurrected), or just an oversight in the removal patch?
-Tony

RE: [PATCH] optimize writer path in time_interpolator_get_counter()

2005-08-02 Thread Luck, Tony
>> I'm still seeing the asymmetric behavior where cpu3 sees the really high times, while cpu0,1,2 are seeing peaks of 170us, which is still not pretty.
>
> Is this an SMP system? Updates are performed by cpu0 and therefore the cacheline is mostly exclusively owned by that processor and then

RE: [PATCH] optimize writer path in time_interpolator_get_counter()

2005-08-03 Thread Luck, Tony
> Think about a threaded process that gets time on multiple processors and then compares the times. This means that the time value obtained later on one thread may indicate a time earlier than that obtained on another thread. An essential requirement for time values is that they are monoton

CONFIG_PRINTK_TIME woes

2005-08-18 Thread Luck, Tony
It has been pointed out to me that ia64 doesn't boot with CONFIG_PRINTK_TIME=y. The issue is the call to sched_clock() ... which on ia64 accesses some per-cpu data to adjust for possible variations in processor speed between different cpus. Since the per-cpu page is not set up for the first few p

RE: CONFIG_PRINTK_TIME woes

2005-08-23 Thread Luck, Tony
> I'd hate to have to test for something for CONFIG_PRINTK_TIME every time sched_clock() is being called.
Me too.
> The quick fix would seem to be to only allow CONFIG_PRINTK_TIME from kernel cmdline to make it happen a bit later. So basically make int printk_time = 0 until command line is evalu

RE: [PATCH 05/15] ia64: remove use of asm/segment.h

2005-08-24 Thread Luck, Tony
> There are still a few drivers that include asm/segment.h, so I don't think we should remove asm/segment.h itself just yet.
Agreed. The sequence should be to send patches to get rid of all "#include <asm/segment.h>" references. Once they have all gone, then a patch can remove the files. If you are concerned

RE: [PATCH 05/15] ia64: remove use of asm/segment.h

2005-08-24 Thread Luck, Tony
>> I'll apply this for ia64 w/o the deletion.
This is now in my test tree. I will send to Linus soon after 2.6.13 is released.
> I've posted a patch before this to remove all non-architecture users of asm/segment.h.
>
> http://www.ussg.iu.edu/hypermail/linux/kernel/0508.3/0099.html
Good. Afte

RE: [patch 2.6.13] swiotlb: add swiotlb_sync_single_range_for_{cpu,device}

2005-08-30 Thread Luck, Tony
> +swiotlb_sync_single_range_for_cpu(struct device *hwdev,
> +swiotlb_sync_single_range_for_device(struct device *hwdev,
Huh? These look identical ... same args, same code, just a different name.
-Tony

RE: [PATCH] Only process_die notifier in ia64_do_page_fault if KPROBES is configured.

2005-08-30 Thread Luck, Tony
> Please do not generate any code if the feature cannot ever be used (CONFIG_KPROBES off). With this patch we still have lots of unnecessary code being executed on each page fault.
I can (eventually) wrap this call inside the #ifdef CONFIG_KPROBES. But I'd like to keep following leads on maki

RE: [RFC] Consistently use the name asm-offsets.h

2005-09-08 Thread Luck, Tony
The existing ia64 specific rule to generate offsets.h has to "echo #define IA64_TASK_SIZE 0 > include/asm-ia64/offsets.h" before building asm-offsets.s to avoid compilation errors. So long as you take care of this somehow in the generic version, go wild. -Tony

RE: [RFC PATCH 13/22 -v2] handle accurate time keeping over longdelays

2008-01-10 Thread Luck, Tony
> If you noticed in my email, the fix for ppc was a bit easier, as it has only a 64bit counter that is quite unlikely to wrap twice between calls to update_wall_time().
"quite unlikely" ... Hmmm just how fast are you driving the clocks on your ppc? Even at 100GHz it is almost SIX YEARS betwe
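
The "almost SIX YEARS" figure can be checked with a little arithmetic: a free-running 64-bit counter wraps after 2^64 ticks, so at 100 GHz that is 2^64 / 10^11 seconds. A minimal Python sketch of that calculation follows; the function name and the choice of a 365.25-day year are illustrative assumptions, not anything from the thread.

```python
# Assumption: convert seconds to years using a 365.25-day (Julian) year.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def wrap_interval_years(counter_bits: int, frequency_hz: float) -> float:
    """Years until a free-running counter of the given width wraps once.

    Hypothetical helper for illustration only; it is not a kernel API.
    """
    seconds = 2 ** counter_bits / frequency_hz
    return seconds / SECONDS_PER_YEAR


# The case discussed in the message: a 64-bit counter clocked at 100 GHz.
years_at_100ghz = wrap_interval_years(64, 100e9)
print(f"64-bit counter at 100 GHz wraps after about {years_at_100ghz:.2f} years")
```

The result is a little under six years for a single wrap, which is consistent with the figure quoted in the message.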

RE: [PATCH] ia64: Avoid unnecessary TLB flushes when allocating memory

2007-12-13 Thread Luck, Tony
> Test case: Run 'find /usr -type f | xargs cat > /dev/null' in the background to fill the buffer cache, then run something that uses memory, e.g. 'gmake -j50 install'. Instrumentation showed that the number of global TLB purges went from a few millions down to about 170 over a 12 hours r

RE: IA64 Linux 2.6.24-rc3-git1 build error

2007-11-20 Thread Luck, Tony
> unknown-linux-gnu/bin/ld: section .data.patch [a500 -> a507] overlaps section .dynamic [a3c8 -> a507]
Andrew Morton saw a similar thing and proposed
- . = GATE_ADDR + 0x500;
+ . = GATE_ADDR + 0x600;
in gate.lds.S ... which does
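
To see why moving the location counter from GATE_ADDR + 0x500 to GATE_ADDR + 0x600 would clear this particular overlap, here is a small Python sketch of the range check the linker is complaining about. It assumes the bracketed values in the error can be read as offsets 0x3c8..0x507 for .dynamic and 0x500..0x507 for .data.patch relative to GATE_ADDR, and that .dynamic keeps its size; the helper names are hypothetical.

```python
def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two inclusive address ranges share at least one byte."""
    return a_start <= b_end and b_start <= a_end


# Assumed offsets relative to GATE_ADDR, read off the linker message.
DYNAMIC = (0x3C8, 0x507)
PATCH_LEN = 0x507 - 0x500 + 1  # .data.patch spans 8 bytes in the message


def patch_range(offset: int) -> tuple[int, int]:
    """Inclusive range .data.patch would occupy if placed at the given offset."""
    return (offset, offset + PATCH_LEN - 1)


for offset in (0x500, 0x600):
    clash = overlaps(*patch_range(offset), *DYNAMIC)
    print(f"GATE_ADDR + {offset:#x}: overlap with .dynamic = {clash}")
```

Under these assumptions the 0x500 placement collides with the tail of .dynamic and the 0x600 placement does not; it says nothing about whether other sections in the real gate.lds.S layout are affected.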

RE: [PATCH] ia64: Avoid unnecessary TLB flushes when allocating memory

2007-12-14 Thread Luck, Tony
> As you can see, the global purge rates can be pretty respectable under this kind of load. I chose -j50 to generate enough processes to stress my own system, you may need more with 4G. Check with xosview or similar that the buffer cache fills up memory but is kept relatively small by user-

RE: [PATCH RFC][try 2] IA64 signal : remove redundant code in setup_sigcontext()

2007-12-18 Thread Luck, Tony
>> This patch removes some redundant code in the function setup_sigcontext().
>>
>> The registers ar.ccv,b7,r14,ar.csd,ar.ssd,r2-r3 and r16-r31 are not restored in restore_sigcontext() when (flags & IA64_SC_FLAG_IN_SYSCALL) is true. So we don't need to zero those variables in setup_sigcont

RE: kernel BUG at drivers/block/cciss.c:1260! (with recent linux-2.6 tree)

2008-01-29 Thread Luck, Tony
> > Will try this next.
>
> Should work even better since it avoids a lock and copy, but please do test if you have the time.
That one works too (survived two full builds at "make -j32" on the 16-way system). Thanks for the quick turnaround.
-Tony
