RE: linux-next: manual merge of the ia64 tree with Linus' tree

2012-11-26 Thread Luck, Tony
> I fixed it up (see below) and can carry the fix as necessary. I rebased the series onto 3.7-rc7 (using the same merge fix that you did) ... so you shouldn't see the merge error next time. -Tony -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message

RE: [PATCH v2 0/5] Add movablecore_map boot option

2012-11-28 Thread Luck, Tony
> 1. use firmware information > According to ACPI spec 5.0, SRAT table has memory affinity structure > and the structure has Hot Pluggable Field. See "5.2.16.2 Memory > Affinity Structure". If we use the information, we might be able to > specify movable memory by firmware. For example, if

RE: [PATCH v2 0/5] Add movablecore_map boot option

2012-11-29 Thread Luck, Tony
> The other bit is that if you really really want high reliability, memory > mirroring is the way to go; it is the only way you will be able to > hotremove memory without having to have a pre-event to migrate the > memory away from the affected node before the memory is offlined. Some platforms do

RE: [PATCH v2 0/5] Add movablecore_map boot option

2012-11-29 Thread Luck, Tony
> If any significant percentage of memory is in ZONE_MOVABLE then the memory > hotplug people will have to deal with all the lowmem/highmem problems > that used to be faced by 32-bit x86 with PAE enabled. While these problems may still exist on large systems - I think it becomes harder to constru

RE: [PATCH] pstore: Create a convenient mount point for pstore

2013-02-12 Thread Luck, Tony
> > Signed-off-by: Josh Boyer > > Acked-by: Kees Cook Queued for next merge window. Thanks. -Tony

RE: [PATCH v6 -next 0/2] make efivars/efi_pstore interrupt-safe

2013-02-12 Thread Luck, Tony
> Changelog > v5 -> v6 > - Rebase to a latest linux-next tree. > - Modify a comment from "efivar_update_sysfs_entry" to > "efivar_update_sysfs_entries" in include/linux/efi.h (Patch 2/2) Applied to my internal pstore topic branch - which feeds to linux-next. Note that my branch was based

RE: linux-next: manual merge of the vfs tree with the ia64 tree

2013-04-02 Thread Luck, Tony
> Today's linux-next merge of the vfs tree got a conflict in > arch/ia64/kernel/palinfo.c between commit 40c275bd92b8 ("[IA64] Fix stack > overflow in create_palinfo_proc_entries") from the ia64 tree and commit > d8e904861a28 ("palinfo fixes") from the vfs tree. > > I fixed it up (arbitrarily choos

RE: [PATCH 2/2] PCI/IA64: fix pci_dev->enable_cnt balance when doing pci hotplug

2013-04-02 Thread Luck, Tony
> But this patch mainly to fix the unbalanced dev->enable_cnt in IA64 which > will print WARNING Calltrace > in dmesg. Thanks for the explanation. > If you think it is valuable, I will try to improve resource assignment in > IA64 like other arch (eg arm, m68k, mips and sh..) > in another patch.

RE: [PATCH] x86/mce: Rework cmci_rediscover() to play well with CPU hotplug

2013-04-02 Thread Luck, Tony
>> Tony mentioned that this patch worked fine for him. So could you >> kindly pick up this patch? > > Normally, Tony picks up the Intel side of MCE. Tony, want me to do it? I'll pick it up. Thanks. -Tony

[GIT PULL] x86/mce - clean up cmci_rediscover()

2013-04-03 Thread Luck, Tony
The following changes since commit 07961ac7c0ee8b546658717034fe692fd12eefa9: Linux 3.9-rc5 (2013-03-31 15:12:43 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-cmci_rediscover for you to fetch changes up to 7a0c819d2

RE: [PATCH 1/2] efivars: Check max_size only if it is non-zero.

2013-04-04 Thread Luck, Tony
> Some (broken?) EFI implementations return always a MaximumVariableSize of 0, > check against max_size only if it is non-zero. The spec doesn't say that zero has any special meaning - so if an implementation returns max_size == 0 but lets you set a variable to a size > 0, then I don't think ther

RE: [PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-04 Thread Luck, Tony
+ if (WARN_ON_ONCE(!nb)) + goto out; + WARN_ON_ONCE() will drop a stack trace to the console - is that going to be useful? If you want a message perhaps: if (!nb) { printk_once("something interesting about not having a

[PATCH] staging/adt7316 Fix some 'interesting' string operations

2013-04-04 Thread Luck, Tony
Calling memcmp() to check the value of the first byte in a string is overkill. Just use buf[0] == '1' or buf[0] != '1' as appropriate. Signed-off-by: Tony Luck --- [Inspired by a rant on IRC about a different driver doing something similar] diff --git a/drivers/staging/iio/addac/adt7316.c b/d
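The point about memcmp() can be illustrated with a small userspace sketch. The function names below are hypothetical and are not taken from the adt7316 driver; the sketch only assumes that `buf` points at a readable, NUL-terminated string.

```c
#include <stdbool.h>
#include <string.h>

/* Roundabout form: a one-byte memcmp() against the literal "1". */
static bool starts_with_one_memcmp(const char *buf)
{
    return memcmp(buf, "1", 1) == 0;
}

/* Direct form suggested in the patch description: test the byte itself. */
static bool starts_with_one_direct(const char *buf)
{
    return buf[0] == '1';
}
```

Both helpers give the same answer for any string, including the empty string, so the direct comparison loses nothing and drops a library call.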

RE: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-28 Thread Luck, Tony
> I will post a patch to fix it. How about always keep node0 unhotpluggable ? Node 0 (or more specifically the node that contains memory <4GB) will be full of BIOS reserved holes in the memory map. It probably isn't removable even if Linux thinks it is. Someday we might have a smart BIOS that can

RE: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Luck, Tony
> assume first cpu only have 1G ram, and other 31 socket will have bunch of ram That doesn't seem to be a very realistic assumption. Can you even still buy 1G DIMMs for servers? I'd think that a minimum would be to have each of four channels populated with a 4G DIMM - so 16GB on first cpu. But e

RE: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Luck, Tony
> b. it will be freed to slub before run time. > like init code and initrd disk. If this is a problem - I'd be inclined to disable the code that frees it. It's only a few hundred KB of code, and possibly a few MB of initrd. Too small to worry about on a hot pluggable server. > In that ca

RE: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting

2013-02-19 Thread Luck, Tony
> In efi_init() memory aligns in IA64_GRANULE_SIZE(16M). If set > "crashkernel=1024M-:600M" Is this where the real problem begins? Should we insist that users provide crashkernel parameters rounded to GRANULE boundaries? -Tony
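A rough sketch of what "rounded to GRANULE boundaries" would mean for the 600M value quoted above. `GRANULE_SIZE` and both helper names are assumptions made for this illustration; the 16 MiB figure is the one quoted in the message, and none of this is the kernel's actual crashkernel parsing code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed granule size for this sketch: 16 MiB, as quoted in the thread. */
#define GRANULE_SIZE (16ULL << 20)

/* True if 'bytes' is a whole number of granules. */
static bool granule_aligned(uint64_t bytes)
{
    return (bytes & (GRANULE_SIZE - 1)) == 0;
}

/* Round 'bytes' up to the next granule boundary (power-of-two size). */
static uint64_t granule_round_up(uint64_t bytes)
{
    return (bytes + GRANULE_SIZE - 1) & ~(GRANULE_SIZE - 1);
}
```

With these definitions 600M is not granule aligned (600 / 16 = 37.5) and rounds up to 608M, while 1024M is already aligned.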

[GIT PULL] pstore patches for 3.9 merge window

2013-02-21 Thread Luck, Tony
The following changes since commit d1c3ed669a2d452cacfb48c2d171a1f364dae2ed: Linux 3.8-rc2 (2013-01-02 18:13:21 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-pstore for you to fetch changes up to fb0af3f2b1b613e

[GIT PULL] Fix ia64 build breakage

2013-02-22 Thread Luck, Tony
The following changes since commit 2ef14f465b9e096531343f5b734cffc5f759f4a6: Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip (2013-02-21 18:06:55 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linu

RE: [PATCH RFC] x86/mce: Move MCE sysfs attributes out of the per-cpu location

2012-08-29 Thread Luck, Tony
> Note: I'm not sure if it's ok to change sysfs entries and this does break > userspace tools that depend on the current path for some of these attributes. > So, they will need to be updated to use the new path. However, if we ever get > to a point where cpu0 can be offlined, these tools will need

RE: [PATCH RESEND] memory hotplug: fix a double register section info bug

2012-09-14 Thread Luck, Tony
> This is an unusual configuration but it's not unheard of. PPC64 in rare > (and usually broken) configurations can have one node span another. Tony > should know if such a configuration is normally allowed on Itanium or if > this should be considered a platform bug. Tony? We definitely have platf

RE: x86/non-x86: percpu, node ids, apic ids x86.git fixup

2008-01-30 Thread Luck, Tony
> this i believe builds an implicit dependency between the mca_asm.o > position within the image and the ia64_mca_data percpu variable it > accesses - it relies on the immediate 22 addressing mode that has 4MB of > scope. Per chance, the .config you sent creates a 14MB image, and the > percpu v

RE: x86/non-x86: percpu, node ids, apic ids x86.git fixup

2008-01-30 Thread Luck, Tony
> I'll start digging on why this doesn't boot ... but you might as well > send the fixes so far upstream to Linus so that the SMP fix is available Well a pure 2.6.24 version compiled with CONFIG_SMP=n booted just fine, so the breakage is recent ... and more than likely related to this change. I'v

RE: x86/non-x86: percpu, node ids, apic ids x86.git fixup

2008-01-31 Thread Luck, Tony
> hm, as far as i could check, on ia64 UP the .percpu section link > difference was the only ia64 difference i could find out of those > changes. Could you try to copy a 2.6.24 include/asm-generic/percpu.h, > include/asm-ia64.h and include/linux/percpu.h into your current tree, > and see whethe

RE: x86/non-x86: percpu, node ids, apic ids x86.git fixup

2008-01-31 Thread Luck, Tony
> So the percpu changes are innocent ... something else since 2.6.24 is > to blame. Only 5749 commits :-) I'll start bisecting. 12 bisections later ... nothing! I think I got lost in the maze. Bisection #5 had a crash, but it looked to be a very different crash (and looked to happen later than

RE: x86/non-x86: percpu, node ids, apic ids x86.git fixup

2008-02-05 Thread Luck, Tony
> Applied that patch and UP kernel built ok, and then crashed in the > same place with the memset() to a user-looking address from kmem_cache_alloc() > > So the percpu changes are innocent ... something else since 2.6.24 is > to blame. Only 5749 commits :-) I'll start bisecting. The bisection na

RE: [PATCH] MAINTAINERS: Add myself as the SWIOTLB maintainer.

2012-10-04 Thread Luck, Tony
> Now that I've an IA64 box on top of the other boxes > (IBM with Calgary-X, Intel VT-d, AMD Vi, and AMD GART - that > can use SWIOTLB as fallback) I can reliably do regression > testing. Acked-by: Tony Luck

RE: [PATCH v2 1/2] Replace if statement with WARN_ON_ONCE() in cmci_rediscover().

2012-10-23 Thread Luck, Tony
> First of all, I do think I was answering your question. As I said > before, if an online cpu == dying here, there must be something wrong. > Am I right here ? Yes - but there is a fuzzy line over where it is good to check for "something wrong" or whether to trust that the caller of the function

RE: [PATCH 02/26] pstore: add flags

2012-10-23 Thread Luck, Tony
> I wonder if the default should be to not show headers, and to add this > flag to the backends that want the pstore-added header. I think the > more common case going forward will be to be without headers since > backends should arguably be storing metadata themselves. Perhaps just add the headings whe

RE: [PATCH 011/193] arch/ia64: remove CONFIG_EXPERIMENTAL

2012-10-23 Thread Luck, Tony
> This config item has not carried much meaning for a while now and is > almost always enabled by default. As agreed during the Linux kernel > summit, remove it. Acked-by: Tony Luck [ditto for parts 012 and 013 of 193]

RE: [PATCH 0/5] Rework MCA configuration handling code, v2

2012-10-25 Thread Luck, Tony
> Third round, incorporating feedback from the last time. Paste one of these onto each piece: Acked-by: Tony Luck Acked-by: Tony Luck Acked-by: Tony Luck Acked-by: Tony Luck Acked-by: Tony Luck -Tony

RE: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Luck, Tony
> Just curious -- can you reproduce the same problem with > CONFIG_PRINTK_TIME as I'm seeing? Yes I can reproduce this (on latest Linus tree). System dies with no console output ... looks like the boot cpu may have taken a machine check (it isn't responding to my debugger). -Tony

RE: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Luck, Tony
> I guess sched_init() is too early... it does seem really strange to > me, but I just double checked with Ingo's patch and it does indeed > hang. The slow way to make progress is just to go through > start_kernel() line-by-line and enable cpu_clock() at each stage, and > see where it stops hangin

RE: [PATCH 1/5] driver core / ACPI: Move ACPI support to core device and driver types

2012-10-31 Thread Luck, Tony
On 10/31/2012 11:42 AM, Rafael J. Wysocki wrote: > I wonder if the x86 and/or ia64 maintainers have any reservations? Can you elaborate on the "tested by mika" that you put into the 0/5 message. Especially w.r.t. ia64. Compile tested? Boot tested? Ran with some new device that uses the ACPI enumer

RE: [PATCH 1/5] driver core / ACPI: Move ACPI support to core device and driver types

2012-10-31 Thread Luck, Tony
> By "tested" I mean "run with some new devices that use the ACPI enumeration > provided here, on x86". Sorry for being too vague. Do you or Mika have access to an ia64 box to test. If not, can you suggest some way that I could exercise this code w/o the new devices. Or at least reassure myself

RE: [PATCH 1/5] driver core / ACPI: Move ACPI support to core device and driver types

2012-10-31 Thread Luck, Tony
> The BIOSes of currently available ia64 systems don't contain ACPI nodes whose > IDs will match the IDs of the new devices (ie. the ones that are going to be > added to acpi_platform_device_ids[]), so for ia64 it should be sufficient to > test that code as is (ie. without any new devices in the sy

RE: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

2012-11-01 Thread Luck, Tony
> That is correct, unfortunately. That information is not available to > software in all cases. Maybe APEI could be used for that DIMM location > mapping through simple tables instead of letting it fumble the error > handling path. Not much hope for "simple"[1] tables. There is also a timings iss

RE: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

2012-11-01 Thread Luck, Tony
> Right, but at least in the csrow case, we still can compute back the > csrow even with the interleaving, after we know how it is done exactly > (on which address bits, etc). I think this should be doable on Intel > controllers too but I don't know. No. Architecturally all Intel provides is the p

RE: [PATCH] debug: Do not permit CONFIG_DEBUG_STACK_USAGE=y on IA64 or PARISC

2012-07-28 Thread Luck, Tony
> I agree with this. Most of it looks easily fixable, but how would I > enable the fix for ia64? For PA it's simple: I'll just use > CONFIG_STACK_GROWSUP, but that won't work for you. ia64 has an ugly chicken vs. egg build dependency. When trying to build our asm-offsets.h file (to get #define

RE: [PATCH] pstore: avoid recursive spinlocks in the oops_in_progress case

2012-09-24 Thread Luck, Tony
> And my plan was to get rid of the fact that backends touch pstore->buf > directly. Backends would always receive anonymous 'buf' pointer (we > already have write_buf callback that does exactly this), and thus it It feels like we are just shuffling the lock problem from one place to another. In

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Luck, Tony
> If we go back to first principles, what do we want to do? We want the > system administrator to know that a file might be potentially > corrupted. And perhaps, if a program tries to read from that file, it > should get an error. If we have a program that has that file mmap'ed > at the time of

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Luck, Tony
> Well, we could set a new attribute bit on the file which indicates > that the file has been corrupted, and this could cause any attempts to > open the file to return some error until the bit has been cleared. That sounds a lot better than renaming/moving the file. > This would persist across re

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Luck, Tony
> What I would recommend is adding a > > #define FS_CORRUPTED_FL 0x0100 /* File is corrupted */ > > ... and which could be accessed and cleared via the lsattr and chattr > programs. Good - but we need some space to save the corrupted range information too. These errors should be

[GIT PULL] Fix a cmci discovery problem

2012-10-30 Thread Luck, Tony
The following changes since commit 8f0d8163b50e01f398b14bcd4dc039ac5ab18d64: Linux 3.7-rc3 (2012-10-28 12:24:48 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-tangchen for you to fetch changes up to 85b97637bb40a9f4

RE: [GIT PULL] EFI changes for v3.11

2013-07-08 Thread Luck, Tony
>> >> Tony Luck (1): >> [IA64] sim: Add casts to avoid assignment warnings >> >> arch/ia64/hp/sim/boot/fw-emu.c | 20 ++-- >> 1 file changed, 10 insertions(+), 10 deletions(-) > > I don't see this commit in Lin

RE: [PATCH 4] mce: acpi/apei: Add a sysctl to control page offlining on firmware report

2013-07-08 Thread Luck, Tony
> Nope, this is a just-in-case thing. I think you or Tony asked to have > this in a previous discussion so that we're covered if firmware starts > acting up. Other than that, I'm ok if this is left out. I'm struggling to think of a case where this would help. It implies that we are on a running

RE: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Luck, Tony
> What would be a reasonable maximum limit for the number of memory > controllers, on a -EX machine? Westmere-EX has one memory controller per socket ... and there are glueless systems up to 8 sockets. So 8 there. Not sure if any OEM is building larger machines with a node controller (SGI? Not

RE: [RESEND PATCH v2 1/4] mm/hwpoison: fix traverse hugetlbfs page to avoid printk flood

2013-09-16 Thread Luck, Tony
This is good - but the real solution is to stop poisoning entire huge pages ... they should be broken into 4K pages and just one 4K page should be poisoned. Naoya Horiguchi: I thought that you were looking at this problem some months ago. Any progress? -Tony

RE: [RESEND PATCH v2 1/4] mm/hwpoison: fix traverse hugetlbfs page to avoid printk flood

2013-09-16 Thread Luck, Tony
>>Sorry, I have no meaningful progress on this. Splitting hugepages is not >>a trivial operation, and introduce more complexity on hugetlbfs code. >>I don't hit on any usecase of it rather than memory failure, so I'm not >>sure that it's worth doing now. > > Agreed. ;-) Agreed that huge pages shou

RE: [RESEND PATCH v2 1/4] mm/hwpoison: fix traverse hugetlbfs page to avoid printk flood

2013-09-17 Thread Luck, Tony
> Transparent huge pages are not helpful for DB workload which there is a lot > of > shared memory Hmm. Perhaps they should be. If a database allocates most[1] of the memory on a machine to a shared memory segment - that *ought* to be a candidate for using transparent huge pages. Now that we h

[GIT PULL] pstore/compression fixes

2013-09-18 Thread Luck, Tony
The following changes since commit e831cbfc1ad843b5542cc45f777e1a00b73c0685: Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux (2013-09-11 08:36:03 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.gi

RE: [RFC PATCH v2 04/11] pstore: Add compression support to pstore

2013-09-04 Thread Luck, Tony
> The reason behind compression failure is the size of big_oops_buf which is too > big for efivars case. I will do some experiments with different kind of texts > for buffer size 1024 to check if 100/53 suits for all the cases. ... > Yes this can be changed to zlib_inflateInit2(). Original patch

RE: [PATCH] lockref: remove cpu_relax() again

2013-09-05 Thread Luck, Tony
> *If* however the cpu_relax() makes sense on other platforms maybe we could > add something like we have already with "arch_mutex_cpu_relax()": I'll do some more measurements on ia64. During my first tests cpu_relax() seemed to be a big win - but I only ran "./t" a couple of times. Later (with

RE: [PATCH] lockref: remove cpu_relax() again

2013-09-05 Thread Luck, Tony
> And there can't be any livelock, since by definition somebody else > _did_ make progress. In fact, adding the cpu_relax() probably just > makes things much less fair - once somebody else raced on you, the > cpu_relax() now makes it more likely that _another_ cpu does so too. > > That said, let's

RE: [PATCH] lockref: remove cpu_relax() again

2013-09-05 Thread Luck, Tony
> Also, it strikes me that ia64 has tons of different versions of > cmpxchg, and the one you use by default is the one with "acquire" > semantics Not "tons", just two. You can ask for "acquire" or "release" semantics, there is no relaxed option. Worse still - early processor implementations actu

RE: [PATCH] lockref: remove cpu_relax() again

2013-09-05 Thread Luck, Tony
> That said, another thing that strikes me is that you have 32 CPU > threads, and the stupid test-program I sent out had MAX_THREADS set to > 16. Did you change that? Because if not, then some of the extreme > performance profile might be about how the threads get scheduled on > your machine (HT t

RE: [PATCH] lockref: remove cpu_relax() again

2013-09-05 Thread Luck, Tony
>> Worse still - early processor implementations actually just ignored >> the acquire/release and did a full fence all the time. Unfortunately >> this meant a lot of badly written code that used .acq when they really >> wanted .rel became legacy out in the wild - so when we made a cpu >> that stri

RE: [PATCH 1/3] pstore: Adjust buffer size for compression for smaller registered buffers

2013-09-11 Thread Luck, Tony
- big_oops_buf_sz = (psinfo->bufsize * 100) / 45; + big_oops_buf_sz = (psinfo->bufsize * 100) / cmpr; Tested on an ERST backed system. Seems to be working (we save a little less information per ERST record than before this change (uncompressed size goes down from ~17500 to ~16400 by
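The sizing expression in the quoted diff is plain integer arithmetic, sketched below in userspace. The helper name is hypothetical, and reading `cmpr` as "expected compressed size as a percentage of the uncompressed size" is an interpretation of the diff, not something the message states.

```c
#include <stddef.h>

/*
 * Size of the uncompressed staging buffer for a backend buffer of
 * 'bufsize' bytes, assuming the compressed output is about 'cmpr'
 * percent of the input. The caller must pass a non-zero 'cmpr'.
 */
static size_t big_oops_buf_size(size_t bufsize, unsigned int cmpr)
{
    return (bufsize * 100) / cmpr;
}
```

For a 1024-byte backend buffer, the old fixed divisor of 45 gives 2275 bytes and a divisor of 60 gives 1706 bytes, so a larger `cmpr` means a smaller staging buffer and less text captured per record.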

RE: [PATCH v2] pstore: Adjust buffer size for compression for smaller registered buffers

2013-09-12 Thread Luck, Tony
+ default: + cmpr = 60; + break; + } Is this the right "default"? It may be a good choice for a backend with a really tiny buffer (1 ... 999). But less good for a (theoretical) backend with a larger buffer (10001 ... infinity and beyond). Which are you

RE: [PATCH] pstore/ram: (really) fix undefined usage of rounddown_pow_of_two

2013-08-30 Thread Luck, Tony
>> Previous attempt to fix was b042e47491ba5f487601b5141a3f1d8582304170 >> >> Suggested use of is_power_of_2() was bogus because is_power_of_2(0) is >> false (documented behaviour). >> >> Signed-off-by: Maxime Bizon > > Yes, excellent point. :) > > Acked-by: Kees Cook Applied. Thanks. -Tony
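To see why a power-of-two test cannot guard the zero case, here is a userspace sketch of the usual bit trick. Both function names are hypothetical; they stand in for, and are not copies of, the kernel helpers, and the zero-safe round-down is only one possible way to avoid the undefined case, not necessarily what the applied patch does.

```c
#include <stdbool.h>

/* Usual bit trick: exactly one bit set. Zero is deliberately rejected. */
static bool is_pow2(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Zero-safe round-down: clear the lowest set bit until at most one
 * bit remains. Returns 0 for an input of 0.
 */
static unsigned long rounddown_pow2_or_zero(unsigned long n)
{
    while (n & (n - 1))
        n &= n - 1;
    return n;
}
```

Because `is_pow2(0)` is false, code of the form "if not a power of two, round down" still reaches the rounding step with 0, which is exactly the input it must not be given.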

[GIT PULL] pstore changes for 3.12

2013-09-03 Thread Luck, Tony
The following changes since commit b36f4be3de1b123d8601de062e7dbfc904f305fb: Linux 3.11-rc6 (2013-08-18 14:36:53 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-pstore for you to fetch changes up to 3bd11cf56e4d9c

[PATCH] lockref: Relax in cmpxchg loop

2013-09-03 Thread Luck, Tony
While we are likely to succeed and break out of this loop, it isn't guaranteed. We should be power and thread friendly if we do have to go around for a second (or third, or more) attempt. Signed-off-by: Tony Luck --- diff --git a/lib/lockref.c b/lib/lockref.c index 7819c2d..9d76f40 100644 ---
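The pattern described here, a compare-and-exchange retry loop that pauses between attempts, can be sketched with C11 atomics. This is not the lockref code: `counter_inc` and `relax_hint` are hypothetical names, and `relax_hint` is an empty placeholder for whatever pause instruction an architecture would provide.

```c
#include <stdatomic.h>

/* Placeholder for an architecture pause hint; a no-op in this sketch. */
static inline void relax_hint(void)
{
}

/* Atomically increment *counter and return the new value. */
static unsigned int counter_inc(atomic_uint *counter)
{
    unsigned int old = atomic_load_explicit(counter, memory_order_relaxed);

    /* On failure 'old' is refreshed with the current value, so retry. */
    while (!atomic_compare_exchange_weak_explicit(counter, &old, old + 1,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed))
        relax_hint();

    return old + 1;
}
```

Whether the hint is worth having is a measurement question rather than a correctness one: the loop terminates either way, because a failed exchange means some other thread made progress.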

RE: [PATCH 01/11] random: don't feed stack data into pool when interrupt regs NULL

2013-09-30 Thread Luck, Tony
> In this case fast_mix would use two uninitialized ints from the stack > and mix it into the pool. Is the concern here that an attacker might know (or be able to control) what is on the stack - and so get knowledge of what is being mixed into the pool? > In this case set the input to 0. And

[PATCH] X86 ACPI: Use #ifdef not #if for CONFIG_X86 check

2012-10-05 Thread Luck, Tony
Fix a build warning on ia64: include/linux/acpi.h:437:5: warning: "CONFIG_X86" is not defined Signed-off-by: Tony Luck --- diff --git a/include/linux/acpi.h b/include/linux/acpi.h index 4f42332..f70f18d 100644 --- a/include/linux/acpi.h +++ b/include/linux/acpi.h @@ -434,7 +434,7 @@ void acpi_
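The difference between the two preprocessor forms can be shown with a made-up symbol. `CONFIG_EXAMPLE_ARCH` is hypothetical and deliberately left undefined; with `#if CONFIG_EXAMPLE_ARCH` the preprocessor would treat the unknown name as 0 and GCC's `-Wundef` would report it as not defined, while `#ifdef` asks only whether the name exists.

```c
/* CONFIG_EXAMPLE_ARCH is a hypothetical symbol, deliberately undefined. */

#ifdef CONFIG_EXAMPLE_ARCH
#define HAVE_EXAMPLE_ARCH 1
#else
#define HAVE_EXAMPLE_ARCH 0
#endif

/* Returns 1 only when the build defined CONFIG_EXAMPLE_ARCH. */
static int example_arch_enabled(void)
{
    return HAVE_EXAMPLE_ARCH;
}
```

Both forms select the same branch when the symbol is absent; the `#ifdef` form simply does so without evaluating an undefined identifier.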

RE: boot failure on i7-3317u in Samsung 900x3c

2012-10-08 Thread Luck, Tony
> STATUS be23110a MCGSTATUS 5 The SDM can help here. See volume 3A, section 15.9.2 "Compound Error Codes". The low 16 bits of the status in this case are 0001 0001 0000 1010 This tells us that you have a cache error in L2 cache severe enough that the processor has begun filtering further
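The decoding step can be sketched as bit extraction on the low 16 bits of the status, which end in 0x110a here. The field layout used below (`000F 0001 RRRR TTLL` for cache hierarchy errors, with F as the filtering bit and LL as the cache level) is written from memory of the SDM's compound error code table and should be checked against the manual; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct cache_err {
    bool filtered;      /* bit 12 (F): correction report filtering */
    unsigned int rrrr;  /* bits 7:4: request type */
    unsigned int tt;    /* bits 3:2: transaction type */
    unsigned int level; /* bits 1:0 (LL): 0..2 = L0..L2, 3 = generic */
};

/* Decode a cache hierarchy error code; false if the pattern differs. */
static bool decode_cache_err(uint16_t code, struct cache_err *out)
{
    /* Bits 15:13 must be 000 and bits 11:8 must be 0001; bit 12 is free. */
    if ((code & 0xEF00) != 0x0100)
        return false;

    out->filtered = (code >> 12) & 1;
    out->rrrr = (code >> 4) & 0xF;
    out->tt = (code >> 2) & 0x3;
    out->level = code & 0x3;
    return true;
}
```

Feeding in 0x110a yields the filtering bit set and a level of 2, which matches the reading in the message: an L2 cache error with filtering active.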

RE: [RFC PATCH 0/3] mca_config stuff

2012-10-10 Thread Luck, Tony
> Therefore, I can toggle the bits in the mce code with mca_cfg.. > When defining accessing them through the device attributes in sysfs, I > use a new macro DEVICE_BIT_ATTR which gets the corresponding bit number > of that same bit in the bitfield. This gives only one function which > operates on a

RE: [RFC PATCH 3/3] Convert mce_disabled

2012-10-10 Thread Luck, Tony
struct mca_config { - u64 dont_log_ce : 1, -#define MCA_CFG_DONT_LOG_CE 0 - __resv1 : 63; + u64 dont_log_ce : 1, +#define MCA_CFG_DONT_LOG_CE 0 + mca_disabled: 1, +#define MCA_CFG_MCA_DISABLED 1 + __resv1 : 62; }; If we
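The idea of pairing each flag with a bit number, so that one accessor can serve every flag, might look like the following in userspace. This sketch uses a plain `uint64_t` instead of a C bitfield, because the position of a bitfield member within its storage unit is implementation-defined; all names here are hypothetical and only echo the ones in the quoted diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit numbers, mirroring the #defines in the quoted diff. */
#define CFG_DONT_LOG_CE  0
#define CFG_MCA_DISABLED 1

/* Set or clear bit 'nr' in *flags. */
static void cfg_set_bit(uint64_t *flags, unsigned int nr, bool on)
{
    if (on)
        *flags |= UINT64_C(1) << nr;
    else
        *flags &= ~(UINT64_C(1) << nr);
}

/* Test bit 'nr' in flags. */
static bool cfg_test_bit(uint64_t flags, unsigned int nr)
{
    return (flags >> nr) & 1;
}
```

A single pair of show/store handlers can then take the bit number as a parameter rather than needing one handler per flag.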

RE: [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning

2012-12-05 Thread Luck, Tony
if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) { - if (PageAnon(page)) + if (PageHuge(page)) + ; + else if (PageAnon(page)) dec_mm_counter(mm, MM_ANONPAGES); else

RE: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage

2012-12-05 Thread Luck, Tony
> This patch fixes the warning from __list_del_entry() which is triggered > when a process tries to do free_huge_page() for a hwpoisoned hugepage. Ultimately it would be nice to avoid poisoning huge pages. Generally we know the location of the poison to a cache line granularity (but sometimes only

RE: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

2012-10-17 Thread Luck, Tony
> What's wrong with userspace tools parsing /proc/cmdline and seeing that > mce_bios_cmci_threshold has been set since this is the only way to set > it anyway? The argument might be on the command line, but may have been rejected because the BIOS didn't set the thresholds? So then you'd have to lo

RE: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

2012-10-18 Thread Luck, Tony
> @Tony: I'll send it upwards soonish in case there are no objections. > This way no stable backport will be needed. Acked-by: Tony Luck

RE: new execve/kernel_thread design

2012-10-19 Thread Luck, Tony
> Surprisingly enough, ia64 one seems to work on actual hardware; I have sent > Tony an incremental patch cleaning copy_thread() up, waiting for results of > testing that on SMP box. Tiny bit faster than plain 3.7-rc1. lmbench3 reports fork+execve test at between 558 to 567 usec with the new code,

RE: [PATCH v2 2/2] Do not change worker's running cpu in cmci_rediscover().

2012-10-19 Thread Luck, Tony
> In this case, the following BUG_ON in try_to_wake_up_local() will be > triggered: > BUG_ON(rq != this_rq()); Logically this looks OK - what is the test case to trigger this? I've done a moderate amount of testing of cpu online/offline while injecting corrected errors (when testing the CMCI s

RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online

2012-12-07 Thread Luck, Tony
> This patch skips taking a psinfo->buf_lock when just one cpu is online > because stopped cpus turn to offline via smp_send_stop() > in some architectures like x86, powerpc or arm64. That seems an impressive list of preconditions. So for this to help we need to have taken all but one cpu offline

RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online

2012-12-10 Thread Luck, Tony
> But you are assuming that kmsg_dump is perfect and it isn't, in which case > by putting kmsg_dump in the kdump path, you actually may be blocking kdump > from working. I think the concern is that kdump isn't perfect, so sometimes we don't get a good dump from it. In those cases it would have be

[GIT PULL] ACPI5 error injection fix

2012-12-11 Thread Luck, Tony
The following changes since commit b69f0859dc8e633c5d8c06845811588fe17e68b3: Linux 3.7-rc8 (2012-12-03 11:22:37 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-einj-fix-for-acpi5 for you to fetch changes up to 112f1f

[GIT PULL] pstore fixes for 3.8 merge window

2012-12-11 Thread Luck, Tony
The following changes since commit 9489e9dcae718d5fde988e4a684a0f55b5f94d17: Linux 3.7-rc7 (2012-11-25 17:59:19 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-pstore_mevent for you to fetch changes up to f94ec0c0

RE: [PATCH v5 0/7] efi_pstore: multiple event logging support

2012-11-13 Thread Luck, Tony
v4 -> v5 - Rebase to 3.7-rc5 - Add count to an argument of a write callback executed in pstore_console_write() to build successfully in case where CONFIG_PSTORE_CONSOLE=y is specified. (Patch 5/7) Applied. It's in my git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git next

RE: [PATCH 1/1] arch Kconfig: remove references to IRQ_PER_CPU

2012-11-13 Thread Luck, Tony
> But IRQ_PER_CPU wasn't removed from any of the architecture Kconfig > files where it was defined or selected. It's completely unused so remove > the remaining references. Acked-by: Tony Luck [Hope someone picks up this whole patch ... otherwise I can take the ia64 hunk]

RE: Fwd: PROBLEM: Random kernel panic & system freeze when watching video

2013-01-02 Thread Luck, Tony
>I had to build the latest mcelog from kernel.org and it tells you a >little bit more: it is an internal parity error. I don't know, though, >what errors reported in bank 2 pertain to on this cpu model - Intel >should know :). Intel is a big place ... we didn't document which bank reports which st

[GIT PULL] Use perf/event tracing to report PCI Express advanced errors

2013-01-04 Thread Luck, Tony
The following changes since commit d1c3ed669a2d452cacfb48c2d171a1f364dae2ed: Linux 3.8-rc2 (2013-01-02 18:13:21 -0800) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-aer-trace for you to fetch changes up to 2cced2d95961acd

RE: [PATCH v9 3/3] aerdrv: Cleanup log output for AER

2013-01-04 Thread Luck, Tony
I sent a "please pull" to Ingo/Peter/Thomas about an hour ago ... if they push back (or ignore) we can fold your ack and nit-picks into another version. > s/elimiating/eliminating/ above. Ugh ... nobody spotted this one ("many eyes" really does work!) > I remove the "v1-v2" notes when I merge pat

[PATCH] ia64: Make sure interrupts enabled when we "safe_halt()"

2013-04-16 Thread Luck, Tony
In commit d166991234347215dc23fc9dc15a63a83a1a54e1 idle: Implement generic idle function Thomas Gleixner cleaned up many things but perturbed some fragile code that was keeping ia64 alive. So we started seeing: WARNING: at kernel/cpu/idle.c:94 cpu_idle_loop+0x360/0x380() and other unpleasantn

RE: [PATCH 0/5] ACPI / scan: Make it possible to use the container hotplug with other scan handlers

2013-06-14 Thread Luck, Tony
>> Tony promised me to test those patches on his box, so we'll know for sure >> in a while. Tested this series - and the box boots just fine with no unexpected messages. But I should note that this box doesn't have anything that is hot pluggable, so I couldn't test hotplug (which seems to be dee

RE: [GIT PULL] Some error injection fixes to queue for 3.11

2013-06-19 Thread Luck, Tony
>> Pulled, thanks Tony! >> >> Len, are you fine with this route [tip:x86/ras tree] for the >> drivers/acpi/apei/einj.c changes? > > Yes, the RAS guys basically own that code. These patches also got picked up by Rafael and are in his ACPI tree too. I think the patches were applied identically, so

RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-19 Thread Luck, Tony
> Interesting, why? Why would we even need such an option? My impression > is, if ACPI tells us FF, MCE code doesn't poll those banks anymore. So > where do the duplicated reports come from? The option is only disabling the Linux side of firmware first ... the BIOS will still be doing it and gener

RE: [PATCH v4 0/8] Nvram-to-pstore

2013-06-19 Thread Luck, Tony
> You need to mount pstore to access the files. > > # mkdir /dev/pstore > # mount -t pstore - /dev/pstore > > to unmount > > # umount /dev/pstore > > References: http://lwn.net/Articles/421297/ Note that /dev/pstore has fallen out of fashion as the mount point ... we now (since 3.9) suggest /

RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-19 Thread Luck, Tony
> Why, fill out struct mce and do mce_log(mce) does not suffice? There is (or should be) a lot more interesting stuff in the CPER than just the address. Stuff that we don't have fields for in the existing mcelog structure. We also need to treat filtered records from modern APEI implementations

RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-19 Thread Luck, Tony
>> There is (or should be) > > Ha! Oh ye of little faith - I'm sure the BIOS will get this right this time :-) > Ok, seriously: so the situation should still be fine, FF reported errors > get the CPER format while the rest, the "old" MCE format. > > cper.c is doing printk so I'm guessing it woul

RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-19 Thread Luck, Tony
> Ok, where is that semantics? What in a CPER record does say "this error > should tell you that you need to offline the containing page and I'm > telling you this exactly only once"? Error Severity 0, i.e. Recoverable? Naveen - this one is for you (or for your BIOS team). Can you get us a sample

RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-19 Thread Luck, Tony
> The above question about what to do *without* going to userspace and > back is maybe more interesting and we'd need a clean design there... > we'll see. Yes - this case (where the BIOS did all the threshold math and made the decision) should be one where Linux kernel could just implement the ac

[PATCH] [IA64] sim: Add casts to avoid assignment warnings

2013-06-20 Thread Luck, Tony
Pointers in the efi_runtime_services_t structure now have type "void *" (formerly they were "unsigned long"). So we now see a bunch of warnings like this: arch/ia64/hp/sim/boot/fw-emu.c:293: warning: assignment makes pointer from integer without a cast Add (void *) casts to the 10 affected lines
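The warning and its fix can be illustrated with a minimal sketch. The struct and function names below are hypothetical stand-ins, not the real efi_runtime_services_t layout or the code in fw-emu.c; the only point is that storing an integer address in a "void *" slot needs an explicit cast.

```c
#include <stdint.h>

/* Hypothetical stand-in for a services table whose slots are "void *". */
struct demo_runtime_services {
	void *get_time;
};

/*
 * Hypothetical helper: store an address held as an integer into a
 * pointer-typed slot. Without the (void *) cast, gcc reports
 * "assignment makes pointer from integer without a cast".
 */
void demo_set_get_time(struct demo_runtime_services *rt, unsigned long addr)
{
	rt->get_time = (void *)addr;
}
```

The cast changes no behaviour; it only tells the compiler that the integer-to-pointer conversion is intentional.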

RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-20 Thread Luck, Tony
> - Two, the Generic Error Data Entry (aka UEFI Section Descriptor) has a > flag which indicates 'Error Threshold Exceeded'. From the UEFI spec, it > looks like we could consider this as an indication to offline the page; > though I am not sure if/how this relates to the threshold value above.
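As a rough sketch of the idea being discussed, a consumer of such a record could test the flag before deciding to offline a page. Everything here is an assumption for illustration: the struct is a trimmed-down, hypothetical view of a Generic Error Data Entry, and the bit value chosen for 'Error Threshold Exceeded' should be checked against the UEFI spec before use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit for "Error Threshold Exceeded"; verify against the UEFI spec. */
#define DEMO_SEC_ERROR_THRESHOLD_EXCEEDED 0x0008u

/* Hypothetical, trimmed-down view of a Generic Error Data Entry. */
struct demo_gdata {
	uint32_t flags;     /* section descriptor flags */
	uint64_t phys_addr; /* address reported by firmware */
};

/* Return true when firmware says its threshold was exceeded for this record. */
bool demo_should_offline_page(const struct demo_gdata *g)
{
	return (g->flags & DEMO_SEC_ERROR_THRESHOLD_EXCEEDED) != 0;
}
```

Whether this flag alone is a sufficient trigger for offlining is exactly the open question in the thread; the sketch only shows how the check itself might look.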

RE: [PATCH v3] aerdrv: Move cper_print_aer() call out of interrupt context

2013-05-29 Thread Luck, Tony
> + /* > + * TODO: This function needs to be re-written so that it's output > + * matches the output of aer_print_error(). Right now, the output > + * is formatted very differently. > + */ So we have this big "TODO" comment sitting there very prominently ... which Linus i

[GIT PULL] Fix aer error logging

2013-05-31 Thread Luck, Tony
The following changes since commit e4aa937ec75df0eea0bee03bffa3303ad36c986b: Linux 3.10-rc3 (2013-05-26 16:00:47 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-aertracefix for you to fetch changes up to 37448adfc7ce

RE: [PATCH] efi, pstore: Cocci spatch "memdup.spatch"

2013-06-03 Thread Luck, Tony
> Who wants to pick this one up? Tony? Sure - I'll take it. -Tony

RE: [PATCH 3/3] powerpc/pseries: Support compression of oops text via pstore

2013-06-25 Thread Luck, Tony
> Introducing headersize in pstore_write() API would need changes at > multiple places where it's being called. The idea is to move the > compression support to pstore infrastructure so that other platforms > could also make use of it. Any thoughts on the back/forward compatibility as we switch to c

RE: [PATCH] x86/MCE: Update MCE severity condition check

2013-06-25 Thread Luck, Tony
> The SDM talks about "non-affected" logical processors, but perhaps we > can call this an "unaffected" thread? "unaffected" sounds a bit more natural (but close enough to the wording in the SDM that people should see the connection). -Tony

RE: [PATCH v2 1/2] mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC

2013-06-25 Thread Luck, Tony
+/* + * Indicates MCA banks controlled by the current cpu for CMCI. Note that this + * can change when a cpu is offlined or brought online since some MCA banks + * are shared across cpus. When a cpu is offlined, cmci_clear() disables CMCI + * on all banks owned by the cpu and clears this bitfield.
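The ownership bookkeeping that this comment describes can be sketched in a few lines. This is not the kernel's implementation: the struct, the 32-bank limit, and all function names are hypothetical, and the real cmci_clear() also touches hardware registers, which is omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_BANKS 32u

/* Hypothetical per-CPU state: bit n set means this CPU owns bank n for CMCI. */
struct demo_cpu {
	uint32_t banks_owned;
};

/* Record that this CPU has taken ownership of a bank. */
void demo_claim_bank(struct demo_cpu *c, unsigned int bank)
{
	if (bank < DEMO_MAX_BANKS)
		c->banks_owned |= (uint32_t)1 << bank;
}

/* Query whether this CPU currently owns a bank. */
bool demo_owns_bank(const struct demo_cpu *c, unsigned int bank)
{
	return bank < DEMO_MAX_BANKS && ((c->banks_owned >> bank) & 1u) != 0;
}

/*
 * Model of the offline path: walk the owned banks (where real code would
 * disable CMCI on each one), then clear the bitmap. Returns the number of
 * banks released so that another CPU could claim them.
 */
unsigned int demo_cpu_offline(struct demo_cpu *c)
{
	unsigned int released = 0;

	for (unsigned int bank = 0; bank < DEMO_MAX_BANKS; bank++) {
		if (demo_owns_bank(c, bank))
			released++;
	}
	c->banks_owned = 0;
	return released;
}
```

Because some banks are shared across CPUs, the set of owned banks is not fixed at boot; a bank released by one CPU may later be claimed by another.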

[GIT PULL] for tip x86/ras branch - queue for 3.11

2013-06-25 Thread Luck, Tony
The following changes since commit 9e895ace5d82df8929b16f58e9f515f6d54ab82d: Linux 3.10-rc7 (2013-06-22 09:47:31 -1000) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-mce-bitmap-comment for you to fetch changes up to 06444
