from:"Luck, Tony"

RE: linux-next: manual merge of the ia64 tree with Linus' tree

2012-11-26 Thread Luck, Tony

> I fixed it up (see below) and can carry the fix as necessary.

I rebased the series onto 3.7-rc7 (using the same merge fix that you did) ... 
so you shouldn't see the merge error next time.

-Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

RE: [PATCH v2 0/5] Add movablecore_map boot option

2012-11-28 Thread Luck, Tony

> 1. use firmware information
>   According to ACPI spec 5.0, SRAT table has memory affinity structure
>   and the structure has Hot Pluggable Field. See "5.2.16.2 Memory
>   Affinity Structure". If we use the information, we might be able to
>   specify movable memory by firmware. For example, if Hot Pluggable
>   Field is enabled, Linux sets the memory as movable memory.
> 
> 2. use boot option
>   This is our proposal. New boot option can specify memory range to use
>   as movable memory.

Isn't this just moving the work to the user? To pick good values for the
movable areas, they need to know how the memory lines up across
node boundaries ... because they need to make sure to allow some
non-movable memory allocations on each node so that the kernel can
take advantage of node locality.

So the user would have to read at least the SRAT table, and perhaps
more, to figure out what to provide as arguments.

Since this is going to be used on a dynamic system where nodes might
be added and removed - the right values for these arguments might
change from one boot to the next. So even if the user gets them right
on day 1, a month later when a new node has been added, or a broken
node removed, the values would be stale.

-Tony

RE: [PATCH v2 0/5] Add movablecore_map boot option

2012-11-29 Thread Luck, Tony

> The other bit is that if you really really want high reliability, memory
> mirroring is the way to go; it is the only way you will be able to
> hotremove memory without having to have a pre-event to migrate the
> memory away from the affected node before the memory is offlined.

Some platforms don't support cross-node mirrors ... but we still want to
be able to remove a node.

-Tony

RE: [PATCH v2 0/5] Add movablecore_map boot option

2012-11-29 Thread Luck, Tony

> If any significant percentage of memory is in ZONE_MOVABLE then the memory
> hotplug people will have to deal with all the lowmem/highmem problems
> that used to be faced by 32-bit x86 with PAE enabled. 

While these problems may still exist on large systems - I think it becomes
harder to construct workloads that run into problems.  In those bad old days
a significant fraction of lowmem was consumed by the kernel ... so it was
pretty easy to find meta-data intensive workloads that would push it over
a cliff.  Here we are talking about systems with say 128GB per node divided
into 64GB movable and 64GB non-movable (and I'd regard this as a rather
low-end machine).  Unless the workload consists of zillions of tiny processes
all mapping shared memory blocks, the percentage of memory allocated to
the kernel is going to be tiny compared with the old 4GB days.

-Tony


RE: [PATCH] pstore: Create a convenient mount point for pstore

2013-02-12 Thread Luck, Tony

> > Signed-off-by: Josh Boyer 
>
> Acked-by: Kees Cook 

Queued for next merge window. Thanks.

-Tony

RE: [PATCH v6 -next 0/2] make efivars/efi_pstore interrupt-safe

2013-02-12 Thread Luck, Tony

> Changelog
> v5 -> v6
>   - Rebase to a latest linux-next tree.
>   - Modify a comment from "efivar_update_sysfs_entry" to
>     "efivar_update_sysfs_entries" in include/linux/efi.h (Patch 2/2)

Applied to my internal pstore topic branch - which feeds to linux-next. Note
that my branch was based on 3.8-rc2 so I unwound the changes to match
up with Linus' latest.  Hope I didn't break anything in the process.

-Tony

RE: linux-next: manual merge of the vfs tree with the ia64 tree

2013-04-02 Thread Luck, Tony

> Today's linux-next merge of the vfs tree got a conflict in
> arch/ia64/kernel/palinfo.c between commit 40c275bd92b8 ("[IA64] Fix stack
> overflow in create_palinfo_proc_entries") from the ia64 tree and commit
> d8e904861a28 ("palinfo fixes") from the vfs tree.
>
> I fixed it up (arbitrarily choosing the vfs tree version) and can carry
> the fix as necessary (no action is required).

It looks like you picked the version from the ia64 tree - but it would have
been better to pick Al's version. He wastes less space on the stack by only
declaring cpustr[3+4+1] instead of cpustr[32], but more importantly he
checks the return value from proc_mkdir().
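
A minimal userspace sketch of the buffer sizing mentioned above, assuming the
directory name has the form "cpu" followed by a decimal CPU number. The format
string and the helper name format_cpu_name() are illustrative assumptions, not
taken from palinfo.c. Three bytes for "cpu", up to four digits and one
terminating NUL give 3+4+1 = 8 bytes, and the return value is checked rather
than ignored:

```c
#include <assert.h>
#include <stdio.h>

/* "cpu" (3) + up to 4 decimal digits + terminating NUL (1) */
#define CPUSTR_LEN (3 + 4 + 1)

/*
 * Format the per-CPU directory name into buf (hypothetical helper).
 * Returns 0 on success, -1 if the name would not fit in CPUSTR_LEN bytes.
 */
static int format_cpu_name(char buf[CPUSTR_LEN], int cpu)
{
	int n;

	assert(cpu >= 0);
	n = snprintf(buf, CPUSTR_LEN, "cpu%d", cpu);

	/* snprintf() returns the length it wanted to write, so n >= size means truncation */
	return (n < 0 || n >= CPUSTR_LEN) ? -1 : 0;
}
```

With this sizing any CPU number up to 9999 fits; a five-digit number is
reported as an error instead of silently overflowing the stack buffer.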

I'll drop 40c275bd92b8 out of my tree so Al's can go in without a conflict.

-Tony

RE: [PATCH 2/2] PCI/IA64: fix pci_dev->enable_cnt balance when doing pci hotplug

2013-04-02 Thread Luck, Tony

> But this patch mainly to fix the unbalanced dev->enable_cnt in IA64 which
> will print WARNING Calltrace in dmesg.

Thanks for the explanation.

> If you think it is valuable, I will try to improve resource assignment in
> IA64 like other arch (eg arm, m68k, mips and sh..) in another patch.

Making the ia64 code more like the x86 code might help avoid such problems
in the future (lots more people look at x86 than ia64 - if ours is the same,
or very similar, then it is likely that changes made to x86 will be correct
for ia64 too).

Only you can decide how much this is worth to you and your company - perhaps
there will be no more changes that break ia64 even with the code differences.
Or perhaps it will be easier for you to just fix things as they break than to
undertake a restructure of the code.

-Tony

RE: [PATCH] x86/mce: Rework cmci_rediscover() to play well with CPU hotplug

2013-04-02 Thread Luck, Tony

>> Tony mentioned that this patch worked fine for him. So could you
>> kindly pick up this patch?
>
> Normally, Tony picks up the Intel side of MCE. Tony, want me to do it?

I'll pick it up. Thanks.

-Tony

[GIT PULL] x86/mce - clean up cmci_rediscover()

2013-04-03 Thread Luck, Tony

The following changes since commit 07961ac7c0ee8b546658717034fe692fd12eefa9:

  Linux 3.9-rc5 (2013-03-31 15:12:43 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-cmci_rediscover

for you to fetch changes up to 7a0c819d28f5c91955854e048766d6afef7c8a3d:

  x86/mce: Rework cmci_rediscover() to play well with CPU hotplug (2013-04-02 14:04:01 -0700)


Clean up cmci_rediscover code to fix problems found by Dave Jones


Srivatsa S. Bhat (1):
  x86/mce: Rework cmci_rediscover() to play well with CPU hotplug

 arch/x86/include/asm/mce.h             |  4 ++--
 arch/x86/kernel/cpu/mcheck/mce.c       |  2 +-
 arch/x86/kernel/cpu/mcheck/mce_intel.c | 25 +
 3 files changed, 8 insertions(+), 23 deletions(-)

RE: [PATCH 1/2] efivars: Check max_size only if it is non-zero.

2013-04-04 Thread Luck, Tony

> Some (broken?) EFI implementations return always a MaximumVariableSize of 0,
> check against max_size only if it is non-zero.

The spec doesn't say that zero has any special meaning - so if an implementation
returns max_size == 0 but lets you set a variable to a size > 0, then I don't
think there is a need for parentheses or a "?" in this commit comment.

But if Linux silently accepts such broken EFI, then there is no feedback loop
to let EFI implementations know that they are broken.  In other areas we have
thrown out messages about firmware being broken ... perhaps:

	if (max_size == 0)
		printk_once("Broken EFI implementation is returning MaxVariableSize=0\n");

would help? After all there probably *is* a maximum size - but EFI isn't 
telling us what it is.
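
A userspace sketch of the "complain once, then carry on" behaviour suggested
above. warn_once() is a stand-in written for this illustration, not the
kernel's printk_once(), and size_ok() and warn_count are hypothetical names
used only to show where the message would go relative to the size check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int warn_count; /* how many messages were actually emitted */

/* Stand-in for printk_once(): emit the message on the first call only. */
#define warn_once(msg)                  \
	do {                            \
		static bool warned;     \
		if (!warned) {          \
			warned = true;  \
			warn_count++;   \
			fputs(msg, stderr); \
		}                       \
	} while (0)

/* Hypothetical helper: true if size is acceptable given the reported limit. */
static bool size_ok(unsigned long size, unsigned long max_size)
{
	if (max_size == 0) {
		/* Firmware reported no usable limit: say so once, then skip the check. */
		warn_once("Broken EFI implementation is returning MaxVariableSize=0\n");
		return true;
	}
	return size <= max_size;
}
```

The write is still allowed when the firmware reports zero, but the console log
now carries one line of feedback for whoever maintains that firmware.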

-Tony

RE: [PATCH] x86, amd, mce: Prevent potential cpu-online oops

2013-04-04 Thread Luck, Tony

+	if (WARN_ON_ONCE(!nb))
+		goto out;
+

WARN_ON_ONCE() will drop a stack trace to the console - is that going to be
useful?

If you want a message perhaps:

	if (!nb) {
		printk_once("something interesting about not having access to north bridge\n");
		goto out;
	}

-Tony

[PATCH] staging/adt7316 Fix some 'interesting' string operations

2013-04-04 Thread Luck, Tony

Calling memcmp() to check the value of the first byte in a string is overkill.
Just use buf[0] == '1' or buf[0] != '1' as appropriate.

Signed-off-by: Tony Luck 

---

[Inspired by a rant on IRC about a different driver doing something similar]
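
A small userspace check of the equivalence the patch relies on: comparing one
byte with memcmp() against "1" gives the same answer as testing buf[0]
directly. The two helper names are made up for this illustration and do not
appear in the driver:

```c
#include <assert.h>
#include <string.h>

/* The form the driver used: compare exactly one byte via memcmp(). */
static int is_one_memcmp(const char *buf)
{
	assert(buf != NULL);
	return !memcmp(buf, "1", 1);
}

/* The form the patch switches to: look at the first byte directly. */
static int is_one_direct(const char *buf)
{
	assert(buf != NULL);
	return buf[0] == '1';
}
```

Both return 1 for input such as "1\n" and 0 for "0\n", so the change is a
readability cleanup rather than a behaviour change.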

diff --git a/drivers/staging/iio/addac/adt7316.c b/drivers/staging/iio/addac/adt7316.c
index 0b431bc..506b5a7 100644
--- a/drivers/staging/iio/addac/adt7316.c
+++ b/drivers/staging/iio/addac/adt7316.c
@@ -256,7 +256,7 @@ static ssize_t adt7316_store_enabled(struct device *dev,
 	struct adt7316_chip_info *chip = iio_priv(dev_info);
 	int enable;
 
-	if (!memcmp(buf, "1", 1))
+	if (buf[0] == '1')
 		enable = 1;
 	else
 		enable = 0;
@@ -299,7 +299,7 @@ static ssize_t adt7316_store_select_ex_temp(struct device *dev,
 		return -EPERM;
 
 	config1 = chip->config1 & (~ADT7516_SEL_EX_TEMP);
-	if (!memcmp(buf, "1", 1))
+	if (buf[0] == '1')
 		config1 |= ADT7516_SEL_EX_TEMP;
 
 	ret = chip->bus.write(chip->bus.client, ADT7316_CONFIG1, config1);
@@ -495,7 +495,7 @@ static ssize_t adt7316_store_disable_averaging(struct device *dev,
 	int ret;
 
 	config2 = chip->config2 & (~ADT7316_DISABLE_AVERAGING);
-	if (!memcmp(buf, "1", 1))
+	if (buf[0] == '1')
 		config2 |= ADT7316_DISABLE_AVERAGING;
 
 	ret = chip->bus.write(chip->bus.client, ADT7316_CONFIG2, config2);
@@ -534,7 +534,7 @@ static ssize_t adt7316_store_enable_smbus_timeout(struct device *dev,
 	int ret;
 
 	config2 = chip->config2 & (~ADT7316_EN_SMBUS_TIMEOUT);
-	if (!memcmp(buf, "1", 1))
+	if (buf[0] == '1')
 		config2 |= ADT7316_EN_SMBUS_TIMEOUT;
 
 	ret = chip->bus.write(chip->bus.client, ADT7316_CONFIG2, config2);
@@ -597,7 +597,7 @@ static ssize_t adt7316_store_powerdown(struct device *dev,
 	int ret;
 
 	config1 = chip->config1 & (~ADT7316_PD);
-	if (!memcmp(buf, "1", 1))
+	if (buf[0] == '1')
 		config1 |= ADT7316_PD;
 
 	ret = chip->bus.write(chip->bus.client, ADT7316_CONFIG1, config1);
@@ -635,7 +635,7 @@ static ssize_t adt7316_store_fast_ad_clock(struct device *dev,
 	int ret;
 
 	config3 = chip->config3 & (~ADT7316_ADCLK_22_5);
-	if (!memcmp(buf, "1", 1))
+	if (buf[0] == '1')
 		config3 |= ADT7316_ADCLK_22_5;
 
 	ret = chip->bus.write(chip->bus.client, ADT7316_CONFIG3, config3);
@@ -681,7 +681,7 @@ static ssize_t adt7316_store_da_high_resolution(struct device *dev,
 
 	chip->dac_bits = 8;
 
-	if (!memcmp(buf, "1", 1)) {
+	if (buf[0] == '1') {
 		config3 = chip->config3 | ADT7316_DA_HIGH_RESOLUTION;
 		if (chip->id == ID_ADT7316 || chip->id == ID_ADT7516)
 			chip->dac_bits = 12;
@@ -731,7 +731,7 @@ static ssize_t adt7316_store_AIN_internal_Vref(struct device *dev,
 	if ((chip->id & ID_FAMILY_MASK) != ID_ADT75XX)
 		return -EPERM;
 
-	if (memcmp(buf, "1", 1))
+	if (buf[0] != '1')
 		config3 = chip->config3 & (~ADT7516_AIN_IN_VREF);
 	else
 		config3 = chip->config3 | ADT7516_AIN_IN_VREF;
@@ -773,7 +773,7 @@ static ssize_t adt7316_store_enable_prop_DACA(struct device *dev,
 	int ret;
 
 	config3 = chip->config3 & (~ADT7316_EN_IN_TEMP_PROP_DACA);
-	if (!memcmp(buf, "1", 1))
+	if (buf[0] == '1')
 		config3 |= ADT7316_EN_IN_TEMP_PROP_DACA;
 
 	ret = chip->bus.write(chip->bus.client, ADT7316_CONFIG3, config3);
@@ -812,7 +812,7 @@ static ssize_t adt7316_store_enable_prop_DACB(struct device *dev,
 	int ret;
 
 	config3 = chip->config3 & (~ADT7316_EN_EX_TEMP_PROP_DACB);
-	if (!memcmp(buf, "1", 1))
+	if (buf[0] == '1')
 		config3 |= ADT7316_EN_EX_TEMP_PROP_DACB;
 
 	ret = chip->bus.write(chip->bus.client, ADT7316_CONFIG3, config3);
@@ -1018,7 +1018,7 @@ static ssize_t adt7316_store_DA_AB_Vref_bypass(struct device *dev,
 		return -EPERM;
 
 	dac_config = chip->dac_config & (~ADT7316_VREF_BYPASS_DAC_AB);
-	if (!memcmp(buf, "1", 1))
+	if (buf[0] == '1')
 		dac_config |= ADT7316_VREF_BYPASS_DAC_AB;
 
 	ret = chip->bus.write(chip->bus.client, ADT7316_DAC_CONFIG, dac_config);
@@ -1063,7 +1063,7 @@ static ssize_t adt7316_store_DA_CD_Vref_bypass(struct device *dev,
 		return -EPERM;
 
 	dac_config = chip->dac_config & (~ADT7316_VREF_BYPASS_DAC_CD);
-	if (!memcmp(buf, "1", 1))
+	if (buf[0] == '1')
 		dac_config |= ADT7316_VREF_BYPASS_DAC_CD;
 
 	ret = chip->bus.write(chip->bus.client, ADT7316_DAC_CONFIG, dac_config);
@@ -1982,7 +1982,7 @@ static ssize_t adt7316_set_int_enabled(struct device *dev,
 	int ret;
 
 	config1 = chip->config1 & (~ADT7316_INT_EN);
-	if (!memcmp(buf, "1", 1))
+	if (buf[0] == '1')

RE: [PATCH 3/3] acpi, memory-hotplug: Support getting hotplug info from SRAT.

2013-01-28 Thread Luck, Tony

> I will post a patch to fix it. How about always keep node0 unhotpluggable ?

Node 0 (or more specifically the node that contains memory <4GB) will be
full of BIOS reserved holes in the memory map. It probably isn't removable
even if Linux thinks it is.  Someday we might have a smart BIOS that can
relocate itself to another node - but for now making node0 unhotpluggable
looks to be a plausible interim move.

Ultimately we'd like to be able to remove any node (just not all of them at
the same time ... just like we can now offline any cpu - but not all of them
together).

-Tony

RE: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Luck, Tony

> assume first cpu only have 1G ram, and other 31 socket will have bunch of ram

That doesn't seem to be a very realistic assumption. Can you even still buy 1G
DIMMs for servers?   I'd think that a minimum would be to have each of four
channels populated with a 4G DIMM - so 16GB on first cpu. But even that feels
rather low.

I think that making sure that the system can boot is good (and maybe it should
ignore/override[*] parameters that would prevent booting). But let's be
realistic about the cases we actually have to deal with (before somebody comes
and talks about systems with just 16MB).

-Tony

[*] with some noisy warnings in the console log

RE: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

2013-02-27 Thread Luck, Tony

>   b. it will be freed to slub before run time.
>   like init code and initrd disk.

If this is a problem - I'd be inclined to disable the code that frees it. It's
only a few hundred KB of code, and possibly a few MB of initrd. Too small to
worry about on a hot pluggable server.

> In that case, so they should just boot system with numa=off.

But we will still care about NUMA locality.

-Tony

RE: [PATCH V3] ia64/mm: fix a bad_page bug when crash kernel booting

2013-02-19 Thread Luck, Tony

> In efi_init() memory aligns in IA64_GRANULE_SIZE(16M). If set
> "crashkernel=1024M-:600M"

Is this where the real problem begins?  Should we insist that users provide
crashkernel parameters rounded to GRANULE boundaries?
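
To make the alignment question concrete: 600M is not a multiple of the 16M
granule quoted above (600 / 16 = 37.5), so a granule-aligned reservation would
be either 592M or 608M. A minimal sketch of the rounding, using a local
GRANULE_SIZE constant and helper names that are assumptions for illustration
rather than the kernel's own definitions:

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE_SIZE (16ULL << 20) /* 16MB, the granule size quoted above */

/* Round addr down to a granule boundary (GRANULE_SIZE is a power of two). */
static uint64_t granule_round_down(uint64_t addr)
{
	return addr & ~(GRANULE_SIZE - 1);
}

/* Round addr up to a granule boundary; the caller must leave room for the carry. */
static uint64_t granule_round_up(uint64_t addr)
{
	assert(addr <= UINT64_MAX - (GRANULE_SIZE - 1));
	return (addr + GRANULE_SIZE - 1) & ~(GRANULE_SIZE - 1);
}
```

Whether the kernel should round silently or reject unaligned values is the
policy question raised in the message; the sketch only shows the arithmetic.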

-Tony

[GIT PULL] pstore patches for 3.9 merge window

2013-02-21 Thread Luck, Tony

The following changes since commit d1c3ed669a2d452cacfb48c2d171a1f364dae2ed:

  Linux 3.8-rc2 (2013-01-02 18:13:21 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-pstore

for you to fetch changes up to fb0af3f2b1b613e5ea75426d454c7e5b1d1eef49:

  pstore: Create a convenient mount point for pstore (2013-02-12 13:07:22 -0800)


A few fixes to reduce places where pstore might hang
a system in the crash path. Plus a new mountpoint
(/sys/fs/pstore ... makes more sense than /dev/pstore).


Josh Boyer (1):
  pstore: Create a convenient mount point for pstore

Seiji Aguchi (4):
  pstore: Avoid deadlock in panic and emergency-restart path
  efi_pstore: Avoid deadlock in non-blocking paths
  efivars: Disable external interrupt while holding efivars->lock
  efi_pstore: Introducing workqueue updating sysfs

 Documentation/ABI/testing/pstore |  10 +--
 drivers/firmware/efivars.c       | 180 +--
 fs/pstore/inode.c                |  18 +++-
 fs/pstore/platform.c             |  35 ++--
 include/linux/efi.h              |   3 +-
 include/linux/pstore.h           |   6 ++
 6 files changed, 192 insertions(+), 60 deletions(-)

[GIT PULL] Fix ia64 build breakage

2013-02-22 Thread Luck, Tony

The following changes since commit 2ef14f465b9e096531343f5b734cffc5f759f4a6:

  Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip (2013-02-21 18:06:55 -0800)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-fix-ia64-build

for you to fetch changes up to bc681593b588786e6326b3e5f78ccc1683e2269c:

  sched: move RR_TIMESLICE from sysctl.h to rt.h (2013-02-22 09:20:11 -0800)


Fix ia64 build


Clark Williams (1):
  sched: move RR_TIMESLICE from sysctl.h to rt.h

 include/linux/sched/rt.h     | 6 ++
 include/linux/sched/sysctl.h | 6 --
 2 files changed, 6 insertions(+), 6 deletions(-)

RE: [PATCH RFC] x86/mce: Move MCE sysfs attributes out of the per-cpu location

2012-08-29 Thread Luck, Tony

> Note: I'm not sure if it's ok to change sysfs entries and this does break
> userspace tools that depend on the current path for some of these attributes.
> So, they will need to be updated to use the new path. However, if we ever get
> to a point where cpu0 can be offlined, these tools will need to be updated
> anyway (as they mostly hardcode machinecheck0 currently)

Linus clarified his "never break user space" edict at the kernel summit
on Monday. Paraphrasing:

  If nobody notices, or nobody complains, then we can make changes. But
  if anyone does complain, then the patch gets reverted.

So if you want to do this, the right approach would be to change the
utilities that use this to look in the new location for these sysfs files
first, and fall back to looking in the old per-cpu place.

Next (or in parallel) have the kernel provide both interfaces.

Wait a long[1] time so that most people have updated utilities.

Delete the per-cpu interfaces from the kernel.

Delete the per-cpu references from the utilities.
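
The first step above (utilities try the new location, then fall back to the
old per-cpu one) could look roughly like this in a userspace tool. It is a
sketch under stated assumptions: open_first() is a hypothetical helper, and
the actual sysfs paths would be whatever the kernel ends up exporting.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Open the first path in paths[] that can be read, trying them in order.
 * On success, *used (if non-NULL) is set to the index that worked.
 * Returns NULL if none of the n paths can be opened.
 */
static FILE *open_first(const char *const paths[], int n, int *used)
{
	assert(paths != NULL);

	for (int i = 0; i < n; i++) {
		FILE *f = fopen(paths[i], "r");

		if (f) {
			if (used)
				*used = i;
			return f;
		}
	}
	return NULL;
}
```

A tool would list the new path first and the old per-cpu path second, so the
same binary works on kernels from before, during and after the transition.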

-Tony

[1] Long enough that there are no complaints. At least a year, probably two or more.

RE: [PATCH RESEND] memory hotplug: fix a double register section info bug

2012-09-14 Thread Luck, Tony

> This is an unusual configuration but it's not unheard of. PPC64 in rare
> (and usually broken) configurations can have one node span another. Tony
> should know if such a configuration is normally allowed on Itanium or if
> this should be considered a platform bug. Tony?

We definitely have platforms where the physical memory on node 0
that we skipped to leave physical address space for PCI mem mapped
devices gets tagged back at the very top of memory, after other nodes.

E.g. A 2-node system with 8G on each might look like this:

0-2G      RAM on node 0
2G-4G     PCI map space
4G-8G     RAM on node 0
8G-16G    RAM on node 1
16G-18G   RAM on node 0

Is this the situation that we are talking about? Or something different?
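
The example layout can be written down as data to check whether one node's
address span encloses memory that belongs to another node. This is an
illustrative sketch only; struct mem_range and node_spans_other() are names
invented here, not kernel structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mem_range {
	uint64_t start_gb; /* inclusive */
	uint64_t end_gb;   /* exclusive */
	int node;          /* -1 marks a non-RAM hole such as PCI space */
};

/* The 2-node example layout from the message above. */
static const struct mem_range example_map[] = {
	{  0,  2,  0 },
	{  2,  4, -1 },
	{  4,  8,  0 },
	{  8, 16,  1 },
	{ 16, 18,  0 },
};

/* True if RAM owned by another node lies inside node's [lowest start, highest end) span. */
static bool node_spans_other(const struct mem_range *map, size_t n, int node)
{
	uint64_t lo = UINT64_MAX, hi = 0;

	assert(map != NULL || n == 0);

	/* First pass: find the overall span covered by this node's ranges. */
	for (size_t i = 0; i < n; i++) {
		if (map[i].node != node)
			continue;
		if (map[i].start_gb < lo)
			lo = map[i].start_gb;
		if (map[i].end_gb > hi)
			hi = map[i].end_gb;
	}
	/* Second pass: look for another node's RAM overlapping that span. */
	for (size_t i = 0; i < n; i++) {
		if (map[i].node >= 0 && map[i].node != node &&
		    map[i].start_gb < hi && map[i].end_gb > lo)
			return true;
	}
	return false;
}
```

For this layout node 0 spans 0-18G and therefore encloses all of node 1,
while node 1 (8G-16G) does not enclose any node 0 memory.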

-Tony

RE: x86/non-x86: percpu, node ids, apic ids x86.git fixup

2008-01-30 Thread Luck, Tony

> this i believe builds an implicit dependency between the mca_asm.o 
> position within the image and the ia64_mca_data percpu variable it 
> accesses - it relies on the immediate 22 addressing mode that has 4MB of 
> scope. Per chance, the .config you sent creates a 14MB image, and the 
> percpu variables moved too far away for the linker to be able to fulfill 
> this constraint.

Sounds very plausible.
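
The "4MB of scope" figure follows from the immediate width: assuming the
22-bit immediate is signed, it can encode offsets from -2MB up to just under
+2MB. A small sketch of that range check (fits_imm22() is a name made up for
this illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A signed 22-bit immediate covers [-2^21, 2^21 - 1], a 4MB window in total. */
static bool fits_imm22(int64_t offset)
{
	return offset >= -(INT64_C(1) << 21) && offset < (INT64_C(1) << 21);
}
```

An offset on the order of the 14MB image mentioned in the quoted text falls
well outside this window, which is consistent with the link failure described.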

> The workaround is to define PER_CPU_ATTRIBUTES to link percpu variables 
> back into the .percpu section on UP too - which ia64 links specially 
> into its vmlinux.lds. But ultimately i think the better solution would 
> be to remove this dependency between arch/ia64/kernel/mca_asm.S and the 
> position of the percpu data.

Yup.  That fixes the build ... the resulting binary doesn't boot though :-(
I just realized that it has been a while since I tried booting a UP
kernel ... so the problem may be unrelated bitrot elsewhere.

Overall you are right that the mca_asm.S code should not be dependent on
the relative location of the data objects.

I'll start digging on why this doesn't boot ... but you might as well
send the fixes so far upstream to Linus so that the SMP fix is available
(which is all anyone really cares about ... there are very, very few
UP ia64 systems in existence).

Acked-by: Tony Luck <[EMAIL PROTECTED]>


-Tony

RE: x86/non-x86: percpu, node ids, apic ids x86.git fixup

2008-01-30 Thread Luck, Tony

> I'll start digging on why this doesn't boot ... but you might as well
> send the fixes so far upstream to Linus so that the SMP fix is available

Well a pure 2.6.24 version compiled with CONFIG_SMP=n booted just fine, so
the breakage is recent ... and more than likely related to this change.

I've only had a casual dig at the failing case ... kernel dies in memset()
as called from kmem_cache_alloc() with the address being written as
0x40117b48 (which is off in the virtual address space range used
by users ... not a kernel address).

I'll dig some more tomorrow.

-Tony

RE: x86/non-x86: percpu, node ids, apic ids x86.git fixup

2008-01-31 Thread Luck, Tony

> hm, as far as i could check, on ia64 UP the .percpu section link 
> difference was the only ia64 difference i could find out of those 
> changes. Could you try to copy a 2.6.24 include/asm-generic/percpu.h, 
> include/asm-ia64.h and include/linux/percpu.h into your current tree, 
> and see whether that boots? If yes, then it's the percpu changes. The 
> patch below does this ontop of very latest -git - and it builds fine 
> with your UP config with a crosscompiler.

Applied that patch and UP kernel built ok, and then crashed in the
same place with the memset() to a user-looking address from kmem_cache_alloc()

So the percpu changes are innocent ... something else since 2.6.24 is
to blame.  Only 5749 commits :-)  I'll start bisecting.

-Tony

RE: x86/non-x86: percpu, node ids, apic ids x86.git fixup

2008-01-31 Thread Luck, Tony

> So the percpu changes are innocent ... something else since 2.6.24 is
> to blame.  Only 5749 commits :-)  I'll start bisecting.

12 bisections later ... nothing!  I think I got lost in the
maze.  Bisection #5 had a crash, but it looked to be a very
different crash (and looked to happen later than the bug I was
hunting).  So I marked that as "good" on the theory that it
looked like this bug wasn't in the kernel. Same thing happened
at bisection #9.  But I ended up with:

commit bfada697bd534d2c16fd07fbef3a4924c4d4e014
Author: Pavel Emelyanov <[EMAIL PROTECTED]>
Date:   Sun Dec 2 00:57:08 2007 +1100

[IPV4]: Use ctl paths to register devinet sysctls


Which just looks too improbable to be the cause of the UP
crash.  Git won't revert it out from top of tree automatically
so I can't easily test whether some weird magic means that
this is the buggy commit.

Perhaps the issue is another offset of object X in kernel w.r.t.
object Y ... and so the good/bad choices in the bisection are
actually pretty random depending on how much code is stuffed
between X & Y at each bisection point.

-Tony

RE: x86/non-x86: percpu, node ids, apic ids x86.git fixup

2008-02-05 Thread Luck, Tony

> Applied that patch and UP kernel built ok, and then crashed in the
> same place with the memset() to a user-looking address from kmem_cache_alloc()
>
> So the percpu changes are innocent ... something else since 2.6.24 is
> to blame.  Only 5749 commits :-)  I'll start bisecting.

The bisection narrowed in on an innocent patch in ipv4 space.  Meanwhile
the rush of patches continues.  When I retested yesterday, with Linus'
HEAD at 8af03e782..., the CONFIG_SMP=n kernel worked perfectly.  So
maybe it was fixed?  Or maybe the bug depends on the relative
location of various bits of code/data and as the kernel grows and
shrinks with incoming changes the problem comes and goes :-(

-Tony

RE: [PATCH] MAINTAINERS: Add myself as the SWIOTLB maintainer.

2012-10-04 Thread Luck, Tony

> Now that I've an IA64 box on top of the other boxes
> (IBM with Calgary-X, Intel VT-d, AMD Vi, and AMD GART - that
> can use SWIOTLB as fallback) I can reliably do regression
> testing.

Acked-by: Tony Luck 


RE: [PATCH v2 1/2] Replace if statement with WARN_ON_ONCE() in cmci_rediscover().

2012-10-23 Thread Luck, Tony

> First of all, I do think I was answering your question. As I said
> before, if an online cpu == dying here, there must be something wrong.
> Am I right here ?

Yes - but there is a fuzzy line over where it is good to check for "something
wrong" or whether to trust that the caller of the function knew what they were
doing.

For example we trust that "dying" is a valid cpu number.  If we were
super-paranoid that someone might change the code and call us with a
bad argument, we might add:

	BUG_ON(dying < 0 || dying >= MAX_NR_CPUS);

This would certainly help debug the case if someone did make a bogus
change ... but I think it is clear that this test is way past the fuzzy line and
into pointless.

Back to the case in question: do we think there is a credible case where
the "dying" cpu can show up in our "for_each_online_cpu()" loop? The
original author of the code was worried enough to make a test, but thought
that the appropriate action was to silently skip it. You want to add a WARN_ON,
which will cause users who read the console logs to worry, but that most users
will never see.

-Tony

RE: [PATCH 02/26] pstore: add flags

2012-10-23 Thread Luck, Tony

> I wonder if the default should be to not show headers, and to add this
> flag to the backends that want the pstore-added header. I think the
> more common case going forward will to be without headers since
> backends should arguably storing metadata themselves.

Perhaps just add the headings when pstore breaks a dump into
pieces because of a back-end size limitation. I.e. if there is only
one piece, then no headings. If there are two or more, include
a heading to aid with putting the pieces together later.
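
A rough userspace sketch of that rule: a dump that fits in one record is
stored bare, and only a dump that has to be split gets a heading on each
piece. The "Part N of M" heading text and both function names are assumptions
for illustration, not the format pstore uses, and the heading is not counted
against the size limit here to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Number of records needed for dump_len bytes at most limit bytes each. */
static size_t part_count(size_t dump_len, size_t limit)
{
	assert(limit > 0);
	return (dump_len + limit - 1) / limit;
}

/*
 * Format record idx (0-based) of the dump into out.
 * Returns the snprintf() length, or -1 if idx is out of range.
 */
static int format_part(char *out, size_t out_sz, const char *dump,
		       size_t dump_len, size_t limit, size_t idx)
{
	size_t parts = part_count(dump_len, limit);
	size_t off, len;

	if (idx >= parts)
		return -1;
	off = idx * limit;
	len = dump_len - off < limit ? dump_len - off : limit;

	/* Only add a heading when the dump had to be broken into pieces. */
	if (parts > 1)
		return snprintf(out, out_sz, "Part %zu of %zu\n%.*s",
				idx + 1, parts, (int)len, dump + off);
	return snprintf(out, out_sz, "%.*s", (int)len, dump + off);
}
```

With this rule a back-end that can hold the whole dump never sees a
pstore-added heading, which also covers the common case raised in the quoted
text.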

-Tony

RE: [PATCH 011/193] arch/ia64: remove CONFIG_EXPERIMENTAL

2012-10-23 Thread Luck, Tony

> This config item has not carried much meaning for a while now and is
> almost always enabled by default. As agreed during the Linux kernel
> summit, remove it.

Acked-by: Tony Luck 

[ditto for parts 012 and 013 of 193]


RE: [PATCH 0/5] Rework MCA configuration handling code, v2

2012-10-25 Thread Luck, Tony

> Third round, incorporating feedback from the last time.

Paste one of these onto each piece:

Acked-by: Tony Luck 
Acked-by: Tony Luck 
Acked-by: Tony Luck 
Acked-by: Tony Luck 
Acked-by: Tony Luck 

-Tony

RE: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Luck, Tony

> Just curious -- can you reproduce the same problem with
> CONFIG_PRINTK_TIME as I'm seeing?

Yes I can reproduce this (on latest Linus tree).  System
dies with no console output ... looks like the boot cpu
may have taken a machine check (it isn't responding to my
debugger).

-Tony

RE: Strange hang on ia64 with CONFIG_PRINTK_TIME=y

2008-02-14 Thread Luck, Tony

> I guess sched_init() is too early... it does seem really strange to
> me, but I just double checked with Ingo's patch and it does indeed
> hang.  The slow way to make progress is just to go through
> start_kernel() line-by-line and enable cpu_clock() at each stage, and
> see where it stops hanging.  I'll give that a shot as a background
> process (my ia64 box takes quite a while to boot, so each test takes a
> long time but requires very little of my attention).

We *ought* to be safe after cpu_init() ... which is called from setup_arch(),
which is several calls before sched_init().

Thanks for looking at this though ... my ability to test just went
away for a while ... some lab re-organization means all my systems
just got powered off and removed from their rack so the rack can be
moved to a new location.

-Tony

RE: [PATCH 1/5] driver core / ACPI: Move ACPI support to core device and driver types

2012-10-31 Thread Luck, Tony

On 10/31/2012 11:42 AM, Rafael J. Wysocki wrote:
> I wonder if the x86 and/or ia64 maintainers have any reservations?

Can you elaborate on the "tested by mika" that you put into the 0/5
message. Especially w.r.t. ia64. Compile tested? Boot tested? Ran with
some new device that uses the ACPI enumeration provided by this series?

Nothing in the concept or code scares me ... but I'd like to know that it
actually works :-)

-Tony

RE: [PATCH 1/5] driver core / ACPI: Move ACPI support to core device and driver types

2012-10-31 Thread Luck, Tony

> By "tested" I mean "run with some new devices that use the ACPI enumeration
> provided here, on x86".  Sorry for being too vague.

Do you or Mika have access to an ia64 box to test?  If not, can you suggest
some way that I could exercise this code w/o the new devices? Or at least
reassure myself that all is benign in a system full of old devices.

-Tony

RE: [PATCH 1/5] driver core / ACPI: Move ACPI support to core device and driver types

2012-10-31 Thread Luck, Tony

> The BIOSes of currently available ia64 systems don't contain ACPI nodes whose
> IDs will match the IDs of the new devices (ie. the ones that are going to be
> added to acpi_platform_device_ids[]), so for ia64 it should be sufficient to
> test that code as is (ie. without any new devices in the system).

Ok - built cleanly on ia64.  Boots too. Just one new console message:

ACPI: bus type platform registered

that seems pretty harmless.

Acked-by: Tony Luck

RE: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

2012-11-01 Thread Luck, Tony

> That is correct, unfortunately. That information is not available to
> software in all cases. Maybe APEI could be used for that DIMM location
> mapping through simple tables instead of letting it fumble the error
> handling path.

Not much hope for "simple"[1] tables.  There is also a timing issue on
systems with rank sparing, memory mirroring etc.  ... you need to decode
to the DIMM at the time the error happened. If you wait until later, then
the system may have switched over to the spare rank or mirror ... and then
your decode will point at the new target, rather than the old.

-Tony

[1] Consider a 4 cpu-socket machine with 4 channels per socket and three
DIMMs per channel - so there are 48 sockets on the motherboard. Then
some lab monkey takes a box of random 1, 2, 4, 8 GB DIMMs and fills
most of the sockets. BIOS will somehow make sense out of this and interleave
where it finds matching speeds across pairs/quads of channels (though size
need not match ... if you have a 2G and 4G DIMM you may get interleaving for
the matching part, then non-interleaved for the "extra" 2G).

RE: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

2012-11-01 Thread Luck, Tony

> Right, but at least in the csrow case, we still can compute back the
> csrow even with the interleaving, after we know how it is done exactly
> (on which address bits, etc). I think this should be doable on Intel
> controllers too but I don't know.

No. Architecturally all Intel provides is the physical address in MCi_ADDR.
To do anything with that you are into per-system space, and the
registers that define the mappings are not necessarily available
to OS code ... sometimes they are, and sometimes they are even
documented in places where Mauro can use them to write an
EDAC driver ... but there are no guarantees.

-Tony

RE: [PATCH] debug: Do not permit CONFIG_DEBUG_STACK_USAGE=y on IA64 or PARISC

2012-07-28 Thread Luck, Tony

> I agree with this.  Most of it looks easily fixable, but how would I
> enable the fix for ia64?  For PA it's simple: I'll just use
> CONFIG_STACK_GROWSUP, but that won't work for you.

ia64 has an ugly chicken vs. egg build dependency. When trying to build our
asm-offsets.h file (to get #define constants for various structure sizes and
offsets in a format that is usable in assembly code) we get:

include/linux/sched.h:2539: error: 'IA64_TASK_SIZE' undeclared (first use in this function)

Which is sad because IA64_TASK_SIZE is one of the #defines that asm-offsets.h
is trying to produce.

Which is why I just threw up my hands in despair and said "!IA64" for this
option.

-Tony

RE: [PATCH] pstore: avoid recursive spinlocks in the oops_in_progress case

2012-09-24 Thread Luck, Tony

> And my plan was to get rid of the fact that backends touch pstore->buf
> directly. Backends would always receive anonymous 'buf' pointer (we
> already have write_buf callback that does exactly this), and thus it

It feels like we are just shuffling the lock problem from one place
to another.  In the panic case we have to use a pre-allocated buffer
(hoping that we can allocate one seems to be a foolish plan). So
we'd need a lock around use of that buffer somewhere - whether
it is in the panic code, the pstore generic code, or the back-end
driver.

Can you describe where you'd like to end up?

-Tony

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Luck, Tony

> If we go back to first principles, what do we want to do?  We want the
> system administrator to know that a file might be potentially
> corrupted.  And perhaps, if a program tries to read from that file, it
> should get an error.  If we have a program that has that file mmap'ed
> at the time of the error, perhaps we should kill the program with some
> kind of signal.  But to force a reboot of the entire system?  Or to
> remount the file system read-only?  That seems to be completely
> disproportionate for what might be 2 or 3 bits getting flipped in a
> page cache for a file.

I think that we know that the file *is* corrupted, not just "potentially".
We probably know the location of the corruption to cache-line granularity.
Perhaps better on systems where we have access to ecc syndrome bits,
perhaps worse ... we do have some errors where the low bits of the address
are not known.

I'm in total agreement that forcing a reboot or fsck is unhelpful here.

But what should we do?  We don't want to let the error be propagated. That
could cause a cascade of more failures as applications make bad decisions
based on the corrupted data.

Perhaps we could ask the filesystem to move the file to a top-level
"corrupted" directory (analogous to "lost+found") with some attached
metadata to help recovery tools know where the file came from, and the
range of corrupted bytes in the file? We'd also need to invalidate existing
open file descriptors (or less damaging - flag them to avoid the corrupted
area??). Whatever we do, it needs to be persistent across a reboot ... the
lost bits are not going to magically heal themselves.

We already have code to send SIGBUS to applications that have the
corrupted page mmap(2)'d (see mm/memory-failure.c).

Other ideas?

-Tony

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Luck, Tony

> Well, we could set a new attribute bit on the file which indicates
> that the file has been corrupted, and this could cause any attempts to
> open the file to return some error until the bit has been cleared.

That sounds a lot better than renaming/moving the file.

> This would persist across reboots.  The only problem is that system
> administrators might get very confused (at least at first, when they
> first run a kernel or a distribution which has this feature enabled).

Yes. This would require some education. But new attributes have been
added in the past (e.g. immutable) that caused confusion to users and
tools that didn't know about them.

> Application programs could also get very confused when any attempt to
> open or read from a file suddenly returned some new error code (EIO,
> or should we designate a new errno code for this purpose, so there is
> a better indication of what the heck was going on?)

EIO sounds wrong ... but it is perhaps the best of the existing codes. Adding
a new one is also challenging.

> Also, if we just log the message in dmesg, if the system administrator
> doesn't find the "this file is corrupted" bit right away

This is pretty much a given. Nobody will see the message in the console log
until it is far too late.

> I'm not sure it's worth it to go to these extents, but I could imagine
> some customers wanting to have this sort of information.  Do we know
> what their "nice to have" / "must have" requirements might be?

18 years ago Intel rather famously attempted to sell users on the idea that a
rare divide error that sometimes gave the wrong answer could be ignored. Before
my time at Intel, but it is still burned into the corporate psyche that
customers really don't like to get the wrong answers from their computers.

Whether it is worth it may depend on the relative frequency of data being
corrupted this way, compared to all the other ways that it might get messed
up. If it were a thousand times more likely that data got silently corrupted
on its path to media, sitting spinning on the media, and then back off the
drive again - then all this fancy stuff wouldn't make any real difference.
I have no data on the relative error rates of memory and i/o - so I can't
answer this.

-Tony

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Luck, Tony

> What I would recommend is adding a 
>
> #define FS_CORRUPTED_FL   0x0100 /* File is corrupted */
>
> ... and which could be accessed and cleared via the lsattr and chattr
> programs.

Good - but we need some space to save the corrupted range information
too. These errors should be quite rare, so one range per file should be
enough.

New file systems should plan to add space in their on-disk format. The
corruption isn't going to go away across a reboot.
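
As a sketch of what "flag plus one range per file" might look like, here is
some standalone C. Only FS_CORRUPTED_FL and its value come from the quoted
proposal; struct corrupt_info, mark_corrupted() and read_allowed() are
hypothetical names, and letting reads outside the range succeed is one
possible policy, not something agreed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FS_CORRUPTED_FL	0x0100	/* File is corrupted (value from the proposal) */

/* Hypothetical per-file record: the flag plus a single corrupted range. */
struct corrupt_info {
	uint32_t flags;
	uint64_t start;		/* first corrupted byte */
	uint64_t len;		/* length of the corrupted range */
};

/* Record a corrupted range.  Only one range is kept, so a second error
 * widens the existing range to cover both. */
static void mark_corrupted(struct corrupt_info *ci, uint64_t start, uint64_t len)
{
	assert(len > 0);

	if (ci->flags & FS_CORRUPTED_FL) {
		uint64_t end = ci->start + ci->len;
		uint64_t new_end = start + len;

		if (start < ci->start)
			ci->start = start;
		if (new_end > end)
			end = new_end;
		ci->len = end - ci->start;
	} else {
		ci->flags |= FS_CORRUPTED_FL;
		ci->start = start;
		ci->len = len;
	}
}

/* One possible policy: refuse only reads that overlap the bad range. */
static bool read_allowed(const struct corrupt_info *ci, uint64_t off, uint64_t len)
{
	if (!(ci->flags & FS_CORRUPTED_FL))
		return true;
	return off + len <= ci->start || off >= ci->start + ci->len;
}
```

The cost of keeping a single range is visible in mark_corrupted(): two
errors far apart in the same file mark everything in between as bad too.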

-Tony

[GIT PULL] Fix a cmci discovery problem

2012-10-30 Thread Luck, Tony

The following changes since commit 8f0d8163b50e01f398b14bcd4dc039ac5ab18d64:

  Linux 3.7-rc3 (2012-10-28 12:24:48 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-tangchen

for you to fetch changes up to 85b97637bb40a9f486459dd254598759af9c3d50:

  x86/mce: Do not change worker's running cpu in cmci_rediscover(). (2012-10-30 14:38:12 -0700)


Fix problem in CMCI rediscovery code that was illegally
migrating worker threads to other cpus.


Tang Chen (1):
  x86/mce: Do not change worker's running cpu in cmci_rediscover().

 arch/x86/kernel/cpu/mcheck/mce_intel.c |   31 ++-
 1 file changed, 18 insertions(+), 13 deletions(-)

RE: [GIT PULL] EFI changes for v3.11

2013-07-08 Thread Luck, Tony

>> 
>> Tony Luck (1):
>>   [IA64] sim: Add casts to avoid assignment warnings
>> 
>>  arch/ia64/hp/sim/boot/fw-emu.c | 20 ++--
>>  1 file changed, 10 insertions(+), 10 deletions(-)
>
> I don't see this commit in Linus' tree so presumably Tony is still
> seeing these warnings.

Correct - I see 10 warnings about "assignment makes pointer from integer"
when building Linus' tree (HEAD = d2b4a646).

My patch doesn't appear to be in linux-next either (next-20130708).

I had hoped to have this patch follow in the same path that the
one that changed the types and introduced the warnings took ...
but since that didn't work perhaps I should just ask Linus to pull
it from my ia64 tree.

-Tony

RE: [PATCH 4] mce: acpi/apei: Add a sysctl to control page offlining on firmware report

2013-07-08 Thread Luck, Tony

> Nope, this is a just-in-case thing. I think you or Tony asked to have 
> this in a previous discussion so that we're covered if firmware starts 
> acting up. Other than that, I'm ok if this is left out.

I'm struggling to think of a case where this would help.  It implies that
we are on a running system, and we somehow notice that the BIOS is
telling us to take some pages offline - and that we know better than the
BIOS that we'd like to just ignore any more such messages from the BIOS.

But we still leave the BIOS in charge of logging the errors and keeping
track of the thresholds.

I'm happy with just the acpi=nocmcff to avoid a BIOS that does weird
stuff.  Or do you think we might still have to deal with a string of APEI
messages?

-Tony

RE: BUG: key ffff880c1148c478 not in .data! (V3.10.0)

2013-07-12 Thread Luck, Tony

> What would be a reasonable maximum limit for the number of memory
> controllers, on a -EX machine?

Westmere-EX has one memory controller per socket ... and there are glueless 
systems up to 8 sockets.  So 8 there. Not sure if any OEM is building larger 
machines with a node controller (SGI? Not sure if they build their behemoths 
from -EP or -EX parts).

Ivy Bridge ups the ante with two memory controllers on a socket. So plan on 
doubling soon.

-Tony

RE: [RESEND PATCH v2 1/4] mm/hwpoison: fix traverse hugetlbfs page to avoid printk flood

2013-09-16 Thread Luck, Tony

This is good - but the real solution is to stop poisoning entire huge pages ...
they should be broken into 4K pages and just one 4K page should be poisoned.

Naoya Horiguchi: I thought that you were looking at this problem some months
ago. Any progress?

-Tony

RE: [RESEND PATCH v2 1/4] mm/hwpoison: fix traverse hugetlbfs page to avoid printk flood

2013-09-16 Thread Luck, Tony

>>Sorry, I have no meaningful progress on this. Splitting hugepages is not
>>a trivial operation, and introduce more complexity on hugetlbfs code.
>>I don't hit on any usecase of it rather than memory failure, so I'm not
>>sure that it's worth doing now.
>
> Agreed. ;-)

Agreed that huge pages should be split - or that it is not worth splitting them?

Actually I wonder how useful huge pages still are - transparent huge pages may
give most of the benefits without having to modify applications to use them.
Plus the kernel does know how to split them when an error occurs (which I care
about more than most people).

-Tony

RE: [RESEND PATCH v2 1/4] mm/hwpoison: fix traverse hugetlbfs page to avoid printk flood

2013-09-17 Thread Luck, Tony

> Transparent huge pages are not helpful for DB workload which there is a lot
> of shared memory

Hmm. Perhaps they should be.  If a database allocates most[1] of the memory on a
machine to a shared memory segment - that *ought* to be a candidate for using
transparent huge pages.  Now that we have them they seem a better choice (much
more flexibility) than hugetlbfs.

-Tony

[1] I've been told that it is normal to configure over 95% of physical memory
to the shared memory region to run a particular transaction based benchmark
with one commercial data base application.


[GIT PULL] pstore/compression fixes

2013-09-18 Thread Luck, Tony

The following changes since commit e831cbfc1ad843b5542cc45f777e1a00b73c0685:

  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux (2013-09-11 08:36:03 -0700)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-pstore

for you to fetch changes up to 802e4c6f5887205eda110c6bfb90c9bfa93dc8a7:

  pstore: Remove the messages related to compression failure (2013-09-16 09:28:29 -0700)


Three pstore fixes related to compression:
1) Better adjustment of size of compression buffer (was too big
   for EFIVARS backend resulting in compression failure)
2) Use zlib_inflateInit2 instead of zlib_inflateInit
3) Don't print messages about compression failure.  They will
   waste space that may better be used to log console output
   leading to the crash.


Aruna Balakrishnaiah (3):
  pstore: Adjust buffer size for compression for smaller registered buffers
  pstore: Use zlib_inflateInit2 instead of zlib_inflateInit
  pstore: Remove the messages related to compression failure

 fs/pstore/platform.c | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

RE: [RFC PATCH v2 04/11] pstore: Add compression support to pstore

2013-09-04 Thread Luck, Tony

> The reason behind compression failure is the size of big_oops_buf which is too
> big for efivars case. I will do some experiments with different kind of texts
> for buffer size 1024 to check if 100/53 suits for all the cases.
...

> Yes this can be changed to zlib_inflateInit2().

Original patch series was just pulled by Linus ... so we'll need a patch on top
of current Linus git tree to fix these issues.  But let's make sure that
efivars, erst, etc. are all happy with the changes we make before I ask Linus
to pull another pstore piece.

Thanks

-Tony

RE: [PATCH] lockref: remove cpu_relax() again

2013-09-05 Thread Luck, Tony

> *If* however the cpu_relax() makes sense on other platforms maybe we could
> add something like we have already with "arch_mutex_cpu_relax()":

I'll do some more measurements on ia64.  During my first tests cpu_relax()
seemed to be a big win - but I only ran "./t" a couple of times.  Later (with
the cpu_relax() in place) I ran a bunch more iterations, and found that the
variation from run to run is much larger with lockref.  The mean score is 60%
higher, but the standard deviation is an order of magnitude bigger (enough that
one run out of 20 with lockref scored lower than the pre-lockref kernel).

I think this is expected ... cmpxchg is a free-for-all - and sometimes poor
placement across the four socket system might cause short term starvation to a
thread while threads on another socket monopolize the cache line.

-Tony

RE: [PATCH] lockref: remove cpu_relax() again

2013-09-05 Thread Luck, Tony

> And there can't be any livelock, since by definition somebody else
> _did_ make progress. In fact, adding the cpu_relax() probably just
> makes things much less fair - once somebody else raced on you, the
> cpu_relax() now makes it more likely that _another_ cpu does so too.
>
> That said, let's see Tony's numbers are.

Data from 20 runs of "./t"

3.11 + Linus enabling patches, but ia64 not enabled (commit bc08b449ee14a from 
Linus tree).
mean 3469553.80
min 3367709.00
max 3494154.00
stddev = 43613.722742

Now add ia64 enabling (including the cpu_relax())
mean 5509067.15 // nice boost
min 3191639.00 // worst case is worse than worst case before we made the change
max 6508629.00
stddev = 793243.943875 // much more variation from run to run

Comment out the cpu_relax()
mean 2185864.40 // this sucks
min 2141242.00
max 2286505.00
stddev = 40847.960152 // but it consistently sucks

So Linus is right that the cpu_relax() makes things less fair ... but without
it performance sucks so much that I don't want to use the clever cmpxchg at
all - I'm much better off without it!

This may be caused by Itanium hyper-threading (SOEMT - switch on event
multi-threading) where the spinning thread means that its buddy retires no
instructions until h/w times it out and forces a switch.  But that's just a
guess - losing the cacheline to whoever made the change that caused the cmpxchg
to fail should also force a thread switch.

-Tony

RE: [PATCH] lockref: remove cpu_relax() again

2013-09-05 Thread Luck, Tony

> Also, it strikes me that ia64 has tons of different versions of
> cmpxchg, and the one you use by default is the one with "acquire"
> semantics

Not "tons", just two.  You can ask for "acquire" or "release" semantics,
there is no relaxed option.

Worse still - early processor implementations actually just ignored
the acquire/release and did a full fence all the time.  Unfortunately
this meant a lot of badly written code that used .acq when they really
wanted .rel became legacy out in the wild - so when we made a cpu
that strictly did the .acq or .rel ... all that code started breaking - so
we had to back-pedal and keep the "legacy" behavior of a full fence :-(

-Tony

RE: [PATCH] lockref: remove cpu_relax() again

2013-09-05 Thread Luck, Tony

> That said, another thing that strikes me is that you have 32 CPU
> threads, and the stupid test-program I sent out had MAX_THREADS set to
> 16.  Did you change that? Becuase if not, then some of the extreme
> performance profile might be about how the threads get scheduled on
> your machine (HT threads vs full cores etc).

I'll try to get new numbers with 32 threads[*] - but even if they look good, I'd
be upset about the 16 thread case being worse with the cmpxchg/no-cpu-relax
case than the original code.

-Tony

[*] probably not till tomorrow

RE: [PATCH] lockref: remove cpu_relax() again

2013-09-05 Thread Luck, Tony

>> Worse still - early processor implementations actually just ignored
>> the acquire/release and did a full fence all the time.  Unfortunately
>> this meant a lot of badly written code that used .acq when they really
>> wanted .rel became legacy out in the wild - so when we made a cpu
>> that strictly did the .acq or .rel ... all that code started breaking - so
>> we had to back-pedal and keep the "legacy" behavior of a full fence :-(
>
> Ugh. Can you try what happens with the weaker release-semantics
> performance-wise for that code? Do it *just* for the lockref code..

No.  I can change the Linux code to say "cmpxchg.rel" here ... but the
h/w will do exactly the same thing it did when I had "cmpxchg.acq".

-Tony

RE: [PATCH 1/3] pstore: Adjust buffer size for compression for smaller registered buffers

2013-09-11 Thread Luck, Tony

-   big_oops_buf_sz = (psinfo->bufsize * 100) / 45;
+   big_oops_buf_sz = (psinfo->bufsize * 100) / cmpr;

Tested on an ERST backed system.  Seems to be working. We save a little less
information per ERST record than before this change (uncompressed size goes
down from ~17500 to ~16400 bytes) - but this patch switched the denominator
from 45 to 48 (for ERST) - so that seems plausible.
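
The arithmetic can be checked with a small standalone sketch of the formula
in the patch. The helper name big_oops_buf_size() is made up, and the record
size of 7875 bytes is a hypothetical value picked only because it reproduces
the ~17500 figure with the old denominator; it is not a measured ERST size.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Size of the uncompressed staging buffer for a back-end whose records
 * hold `bufsize` bytes, assuming text compresses to `cmpr` percent of
 * its original size.  Same integer arithmetic as the patch.
 */
static size_t big_oops_buf_size(size_t bufsize, unsigned cmpr)
{
	assert(cmpr > 0 && cmpr <= 100);
	return (bufsize * 100) / cmpr;
}
```

With bufsize = 7875 this gives 17500 for cmpr = 45 and 16406 for cmpr = 48,
in line with the drop from ~17500 to ~16400 bytes seen on the test system.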

Seiji: let me know how the efivars tests go.

-Tony

RE: [PATCH v2] pstore: Adjust buffer size for compression for smaller registered buffers

2013-09-12 Thread Luck, Tony

+       default:
+               cmpr = 60;
+               break;
+       }
 
Is this the right "default"?  It may be a good choice for a backend with a
really tiny buffer (1 ... 999).  But less good for a (theoretical) backend with
a larger buffer (10001 ... infinity and beyond).  Which are you trying to catch
here?

-Tony

RE: [PATCH] pstore/ram: (really) fix undefined usage of rounddown_pow_of_two

2013-08-30 Thread Luck, Tony

>> Previous attempt to fix was b042e47491ba5f487601b5141a3f1d8582304170
>>
>> Suggested use of is_power_of_2() was bogus because is_power_of_2(0) is
>> false (documented behaviour).
>>
>> Signed-off-by: Maxime Bizon 
>
> Yes, excellent point. :)
>
> Acked-by: Kees Cook 

Applied.  Thanks.

-Tony

[GIT PULL] pstore changes for 3.12

2013-09-03 Thread Luck, Tony

The following changes since commit b36f4be3de1b123d8601de062e7dbfc904f305fb:

  Linux 3.11-rc6 (2013-08-18 14:36:53 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-pstore

for you to fetch changes up to 3bd11cf56e4d9c9a79c0c1a4ebe381c674ec9709:

  pstore/ram: (really) fix undefined usage of rounddown_pow_of_two (2013-08-30 15:57:01 -0700)


Big part of this is the addition of compression to the
generic pstore layer so that all backends can use the
pitiful amounts of storage they control more effectively.
Three other small fixes/cleanups too.


Aruna Balakrishnaiah (11):
  powerpc/pseries: Remove (de)compression in nvram with pstore enabled
  pstore: Add new argument 'compressed' in pstore write callback
  pstore/Kconfig: Select ZLIB_DEFLATE and ZLIB_INFLATE when PSTORE is selected
  pstore: Add compression support to pstore
  pstore: Introduce new argument 'compressed' in the read callback
  pstore: Add decompression support to pstore
  pstore: Add file extension to pstore file if compressed
  powerpc/pseries: Read and write to the 'compressed' flag of pstore
  erst: Read and write to the 'compressed' flag of pstore
  efi-pstore: Read and write to the 'compressed' flag of pstore
  pstore/ram: Read and write to the 'compressed' flag of pstore

Dan Carpenter (1):
  pstore: d_alloc_name() doesn't return an ERR_PTR

Maxime Bizon (1):
  pstore/ram: (really) fix undefined usage of rounddown_pow_of_two

Wei Yongjun (1):
  acpi/apei/erst: Add missing iounmap() on error in erst_exec_move_data()

 arch/powerpc/platforms/pseries/nvram.c | 112 -
 drivers/acpi/apei/erst.c               |  25 +++-
 drivers/firmware/efi/efi-pstore.c      |  27 -
 fs/pstore/Kconfig                      |   2 +
 fs/pstore/inode.c                      |  10 +-
 fs/pstore/internal.h                   |   5 +-
 fs/pstore/platform.c                   | 212 ++---
 fs/pstore/ram.c                        |  47 ++--
 include/linux/pstore.h                 |   6 +-
 9 files changed, 306 insertions(+), 140 deletions(-)

[PATCH] lockref: Relax in cmpxchg loop

2013-09-03 Thread Luck, Tony

While we are likely to succeed and break out of this loop, it isn't
guaranteed.  We should be power and thread friendly if we do have to
go around for a second (or third, or more) attempt.

Signed-off-by: Tony Luck 

---

diff --git a/lib/lockref.c b/lib/lockref.c
index 7819c2d..9d76f40 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -19,6 +19,7 @@
                if (likely(old.lock_count == prev.lock_count)) {        \
                        SUCCESS;                                        \
                }                                                       \
+               cpu_relax();                                            \
        }                                                               \
 } while (0)
 
-- 
1.8.1.4
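
For readers without the kernel tree to hand, the shape of the loop being
patched can be sketched in portable C11. This is an illustration only, not
the lockref code: get_not_zero() and relax_hint() are hypothetical names, a
bare 64-bit counter replaces the combined lock+count word, and the compiler
barrier in relax_hint() is just a placeholder for whatever cpu_relax() does
on a given architecture.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for cpu_relax(): here only a compiler barrier. */
static inline void relax_hint(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/*
 * Increment *count unless it is zero, retrying on contention.
 * Returns 1 if the increment happened, 0 if the count was zero.
 */
static int get_not_zero(_Atomic uint64_t *count)
{
	uint64_t old;

	assert(count != NULL);
	old = atomic_load_explicit(count, memory_order_relaxed);

	while (old != 0) {
		/* On failure `old` is refreshed with the current value. */
		if (atomic_compare_exchange_weak_explicit(count, &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return 1;
		relax_hint();	/* lost the race: back off before retrying */
	}
	return 0;
}
```

The relax call sits only on the retry path, so the uncontended case pays
nothing for it.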


RE: [PATCH 01/11] random: don't feed stack data into pool when interrupt regs NULL

2013-09-30 Thread Luck, Tony

> In this case fast_mix would use two uninitialized ints from the stack
> and mix it into the pool.

Is the concern here that an attacker might know (or be able to control) what is on
the stack - and so get knowledge of what is being mixed into the pool?

> In this case set the input to 0.

And the fix is to guarantee that everyone knows what is being mixed in? (!)

Wouldn't it be better to adjust the "nbytes" parameter to

fast_mix(..., ..., sizeof (input));

to only mix in the part of input[] that we successfully initialized?

Untested patch below.

Signed-off-by: Tony Luck 

---

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 7737b5bd26af..5c4ec0abb702 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -745,16 +745,19 @@ void add_interrupt_randomness(int irq, int irq_flags)
struct pt_regs  *regs = get_irq_regs();
unsigned long   now = jiffies;
__u32   input[4], cycles = get_cycles();
+   int nbytes;
 
input[0] = cycles ^ jiffies;
input[1] = irq;
+   nbytes = 2 * sizeof(input[0]);
if (regs) {
__u64 ip = instruction_pointer(regs);
input[2] = ip;
input[3] = ip >> 32;
+   nbytes += 2 * sizeof(input[0]);
}
 
-   fast_mix(fast_pool, input, sizeof(input));
+   fast_mix(fast_pool, input, nbytes);
 
if ((fast_pool->count & 1023) &&
!time_after(now, fast_pool->last + HZ))



[PATCH] X86 ACPI: Use #ifdef not #if for CONFIG_X86 check

2012-10-05 Thread Luck, Tony

Fix a build warning on ia64:

include/linux/acpi.h:437:5: warning: "CONFIG_X86" is not defined

Signed-off-by: Tony Luck 

---

diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 4f42332..f70f18d 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -434,7 +434,7 @@ void acpi_os_set_prepare_sleep(int (*func)(u8 sleep_state,
 
 acpi_status acpi_os_prepare_sleep(u8 sleep_state,
  u32 pm1a_control, u32 pm1b_control);
-#if CONFIG_X86
+#ifdef CONFIG_X86
 void arch_reserve_mem_area(acpi_physical_address addr, size_t size);
 #else
 static inline void arch_reserve_mem_area(acpi_physical_address addr,
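
Why the one-word change matters: Kconfig symbols are defined only when the option is enabled, so on a non-x86 build "#if CONFIG_X86" evaluates an undefined identifier (treated as 0, with a warning under -Wundef), while "#ifdef" just tests for existence. A small sketch, using the made-up macro DEMO_CONFIG_X86, which is deliberately left undefined:

```c
/* DEMO_CONFIG_X86 is hypothetical and intentionally not defined here. */

static const char *which_branch(void)
{
#ifdef DEMO_CONFIG_X86		/* clean even when the macro does not exist */
	return "arch-specific";
#else
	return "generic stub";
#endif
}
```

Writing "#if DEMO_CONFIG_X86" instead would select the same branch, but a compiler run with -Wundef would complain in the same way as the ia64 build did.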

RE: boot failure on i7-3317u in Samsung 900x3c

2012-10-08 Thread Luck, Tony

> STATUS be23110a MCGSTATUS 5

The SDM can help here. See volume 3A, section 15.9.2 "Compound Error Codes".

The low 16 bits of the status in this case are 0001 0001 0000 1010

This tells us that you have a cache error in L2 cache severe enough that the
processor has begun filtering further error reports.

-Tony
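
The decode above can be sketched in a few lines of C. The field layout used here (pattern 000F 0001 RRRR TTLL for cache hierarchy errors) is an assumption based on the SDM's compound error code table and should be checked against the current edition; the struct and function names are made up for illustration.

```c
#include <stdint.h>

struct cache_mcacod {
	int is_cache_hierarchy;	/* code matches 000F 0001 RRRR TTLL */
	int filtered;		/* F, bit 12 */
	unsigned int request;	/* RRRR, bits 7:4 (0 = generic error) */
	unsigned int type;	/* TT, bits 3:2 (2 = generic) */
	unsigned int level;	/* LL, bits 1:0 (2 = L2) */
};

/* Split the low 16 bits of an MCi_STATUS value into the fields above. */
static struct cache_mcacod decode_cache_mcacod(uint64_t status)
{
	uint16_t code = (uint16_t)(status & 0xffff);
	struct cache_mcacod d;

	/* Ignore F (bit 12) when matching the fixed part of the pattern. */
	d.is_cache_hierarchy = (code & 0xef00) == 0x0100;
	d.filtered = (code >> 12) & 1;
	d.request = (code >> 4) & 0xf;
	d.type = (code >> 2) & 0x3;
	d.level = code & 0x3;
	return d;
}
```

Feeding it 0x110a yields a cache hierarchy error with F set and LL equal to 2, which matches the reading given in the message.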

RE: [RFC PATCH 0/3] mca_config stuff

2012-10-10 Thread Luck, Tony

> Therefore, I can toggle the bits in the mce code with mca_cfg..
> When defining accessing them through the device attributes in sysfs, I
> use a new macro DEVICE_BIT_ATTR which gets the corresponding bit number
> of that same bit in the bitfield. This gives only one function which
> operates on a bitfield instead of a single function per bit in the
> bitfield.

Is this true across all architectures? I know that PA-RISC instructions
that operate on bitfields use "0" to operate on the high order bit
rather than the low order one. I don't recall whether this spills over
into the compiler. If it did, then you'd have to have different #defines
for the bit numbers[1].  For this specific use case it wouldn't matter because
you are just using it in x86 code. But device_store_bit() and device_show_bit()
are in generic code - so they must be able to work across all architectures.

-Tony

[1] Or fix the store/show bit functions to transform the bit numbers from
"little-bitian" to "big-bitian" on architectures that count the other way.
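
The transform mentioned in the footnote could look like the sketch below. It is an illustration under assumptions, not what device_store_bit()/device_show_bit() do: flip_bit_nr() and bit_mask() are hypothetical helpers, and the underlying concern is that the C standard leaves bitfield allocation order to the implementation.

```c
#include <stdint.h>

/*
 * Convert a bit number counted from the least-significant end into the
 * number of the same bit counted from the most-significant end of a
 * 'width'-bit word. The mapping is its own inverse.
 */
static unsigned int flip_bit_nr(unsigned int nr, unsigned int width)
{
	return width - 1 - nr;
}

/* Build a mask for bit 'nr' under either numbering convention. */
static uint64_t bit_mask(unsigned int nr, unsigned int width, int msb_is_zero)
{
	if (msb_is_zero)
		nr = flip_bit_nr(nr, width);
	return (uint64_t)1 << nr;
}
```

With a helper like this the #define values could stay the same everywhere, and only the generic store/show code would need to know which way the architecture counts.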

RE: [RFC PATCH 3/3] Convert mce_disabled

2012-10-10 Thread Luck, Tony

struct mca_config {
-   u64 dont_log_ce : 1,
-#define MCA_CFG_DONT_LOG_CE0
-   __resv1 : 63;
+   u64 dont_log_ce : 1,
+#define MCA_CFG_DONT_LOG_CE  0
+   mca_disabled: 1,
+#define MCA_CFG_MCA_DISABLED 1
+   __resv1 : 62;
 };

If we do head in this direction - I don't think it is useful to change just one bit
on each commit. We should batch in larger groups.

-Tony

RE: [PATCH 3/3] HWPOISON, hugetlbfs: fix RSS-counter warning

2012-12-05 Thread Luck, Tony

if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
-   if (PageAnon(page))
+   if (PageHuge(page))
+   ;
+   else if (PageAnon(page))
dec_mm_counter(mm, MM_ANONPAGES);
else
dec_mm_counter(mm, MM_FILEPAGES);

This style minimizes the "diff" ... but wouldn't it be nicer to say:

if (!PageHuge(page)) {
old code in here
}

-Tony

RE: [PATCH 1/3] HWPOISON, hugetlbfs: fix warning on freeing hwpoisoned hugepage

2012-12-05 Thread Luck, Tony

> This patch fixes the warning from __list_del_entry() which is triggered
> when a process tries to do free_huge_page() for a hwpoisoned hugepage.

Ultimately it would be nice to avoid poisoning huge pages. Generally we know the
location of the poison to a cache line granularity (but sometimes only to a 4K
granularity) ... and it is rather inefficient to take an entire 2M page out of service.
With 1G pages things would be even worse!!

It also makes life harder for applications that would like to catch the SIGBUS
and try to take their own recovery actions. Losing more data than they really
need to will make it less likely that they can do something to work around the
loss.

Has anyone looked at how hard it might be to have the code in memory-failure.c
break up a huge page and only poison the 4K that needs to be taken out of service?

-Tony

RE: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

2012-10-17 Thread Luck, Tony

> What's wrong with userspace tools parsing /proc/cmdline and seeing that
> mce_bios_cmci_threshold has been set since this is the only way to set
> it anyway?

The argument might be on the command line, but may have been rejected
because the BIOS didn't set the thresholds? So then you'd have to look at
the command line, *and* check /var/log/messages to make sure we hadn't
printed the message saying the BIOS was unsupportive.

BUT ... I don't think that knowing this is sufficient. A userspace tool would
want to know what value had been set for each bank. So if it really wants to
do something interesting, just knowing that "bios set some thresholds" doesn't
sound like enough information.

BUT (squared) do you even really need to know that thresholds were set? You
could look at bits {52:38} in the MCi_STATUS information for the bank to see
how many corrected errors had been logged.

-Tony
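
Pulling that field out of a status value is a shift and a mask. A minimal sketch, assuming only what the message states (the count lives in bits 52:38, so it is 15 bits wide); mci_status_cec() is a made-up name.

```c
#include <stdint.h>

/* Extract the corrected error count, bits 52:38 of MCi_STATUS. */
static unsigned int mci_status_cec(uint64_t status)
{
	return (unsigned int)((status >> 38) & 0x7fff);
}
```

A tool that reads the bank's status register could compare this count between samples rather than relying on knowing whether the BIOS programmed a threshold.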

RE: [PATCH v3] x86/mce: Honour bios-set CMCI threshold

2012-10-18 Thread Luck, Tony

> @Tony: I'll send it upwards soonish in case there are no objections.
> This way no stable backport will be needed.

Acked-by: Tony Luck 


RE: new execve/kernel_thread design

2012-10-19 Thread Luck, Tony

> Surprisingly enough, ia64 one seems to work on actual hardware; I have sent
> Tony an incremental patch cleaning copy_thread() up, waiting for results of
> testing that on SMP box.

Tiny bit faster than plain 3.7-rc1. lmbench3 reports the fork+execve test at between
558 and 567 usec with the new code, compared with 562-572 usec with the old.

-Tony

RE: [PATCH v2 2/2] Do not change worker's running cpu in cmci_rediscover().

2012-10-19 Thread Luck, Tony

> In this case, the following BUG_ON in try_to_wake_up_local() will be triggered:
> BUG_ON(rq != this_rq());

Logically this looks OK - what is the test case to trigger this?  I've done a moderate
amount of testing of cpu online/offline while injecting corrected errors (when testing
the CMCI storm patches) ... but didn't see this problem.

-Tony

RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online

2012-12-07 Thread Luck, Tony

> This patch skips taking a psinfo->buf_lock when just one cpu is online
> because stopped cpus turn to offline via smp_send_stop()
> in some architectures like x86, powerpc or arm64.

That seems an impressive list of preconditions.  So for this to
help we need to have taken all but one cpu offline, then be in
some code that is holding the pstore lock and get hit by an NMI
which causes us to recurse into the pstore code.

Can all these things really happen (did you run into this problem
on a real system)? Or is this just a theoretical problem?  Ugly (but
practical) hacks might be OK to solve real problems. But do we really
want them to fix problems that actually never happen?

-Tony

RE: [RFC][PATCH] pstore: Skip spinlock when just one cpu is online

2012-12-10 Thread Luck, Tony

> But you are assuming that kmsg_dump is perfect and it isn't, in which case
> by putting kmsg_dump in the kdump path, you actually may be blocking kdump
> from working.

I think the concern is that kdump isn't perfect, so sometimes we don't get a good dump
from it. In those cases it would have been nice to have a pstore log of the original
problem.

But ... I don't see an answer to this problem.  Adding more code just increases the
number of possible places we can fail (especially as we are executing in a state
where we know that things are all messed up ... the first kernel panic'd because
something bad happened that we didn't know how to fix).

A boot argument might help - so we can force use of pstore in cases where
kdump is failing (or prevent use of pstore in cases where it seems to be preventing
us getting to kdump ... I don't have a preference).  BUT this would only be useful
if we had a repeatable problem so that we could switch to the other mode ... and
it seems likely that the kinds of problems that cause pstore or kdump to fail would
be weird cases that are not very repeatable :-(

-Tony

[GIT PULL] ACPI5 error injection fix

2012-12-11 Thread Luck, Tony

The following changes since commit b69f0859dc8e633c5d8c06845811588fe17e68b3:

  Linux 3.7-rc8 (2012-12-03 11:22:37 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-einj-fix-for-acpi5

for you to fetch changes up to 112f1fc08d0b3f81c594af617d88c0db6ce0873c:

  ACPI, APEI, EINJ: Add missed ACPI5 support for error trigger table (2012-12-07 11:50:02 -0800)


Trivial fix for error injection code using ACPI5 version of EINJ


Chen Gong (1):
  ACPI, APEI, EINJ: Add missed ACPI5 support for error trigger table

 drivers/acpi/apei/einj.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

[GIT PULL] pstore fixes for 3.8 merge window

2012-12-11 Thread Luck, Tony

The following changes since commit 9489e9dcae718d5fde988e4a684a0f55b5f94d17:

  Linux 3.7-rc7 (2012-11-25 17:59:19 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git tags/please-pull-pstore_mevent

for you to fetch changes up to f94ec0c0594ef73ab3a2f1f32735aca8ddaf65e2:

  efi_pstore: Add a format check for an existing variable name at erasing time (2012-11-26 16:08:37 -0800)


Patch series to allow EFI variable backend to pstore
to hold multiple records.


Seiji Aguchi (7):
  efi_pstore: Check remaining space with QueryVariableInfo() before writing data
  efi_pstore: Add a logic erasing entries to an erase callback
  efi_pstore: Remove a logic erasing entries from a write callback to hold multiple logs
  efi_pstore: Add ctime to argument of erase callback
  efi_pstore: Add a sequence counter to a variable name
  efi_pstore: Add a format check for an existing variable name at reading time
  efi_pstore: Add a format check for an existing variable name at erasing time

 drivers/acpi/apei/erst.c   |   16 ++---
 drivers/firmware/efivars.c |  163 +++-
 fs/pstore/inode.c          |    7 +-
 fs/pstore/internal.h       |    2 +-
 fs/pstore/platform.c       |   13 ++--
 fs/pstore/ram.c            |    9 ++-
 include/linux/efi.h        |    1 +
 include/linux/pstore.h     |    6 +-
 8 files changed, 144 insertions(+), 73 deletions(-)

RE: [PATCH v5 0/7] efi_pstore: multiple event logging support

2012-11-13 Thread Luck, Tony

v4 -> v5
   - Rebase to 3.7-rc5
   - Add count to an argument of a write callback executed in 
pstore_console_write()
     to build successfully in case where CONFIG_PSTORE_CONSOLE=y is specified. (Patch 5/7)

Applied. It's in my
   git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git next
tree now so should pick up into linux-next by tomorrow or the day after.

-Tony

RE: [PATCH 1/1] arch Kconfig: remove references to IRQ_PER_CPU

2012-11-13 Thread Luck, Tony

> But IRQ_PER_CPU wasn't removed from any of the architecture Kconfig
> files where it was defined or selected. It's completely unused so remove
> the remaining references.

Acked-by: Tony Luck 

[Hope someone picks up this whole patch ... otherwise I can take the ia64 hunk]

RE: Fwd: PROBLEM: Random kernel panic & system freeze when watching video

2013-01-02 Thread Luck, Tony

>I had to build the latest mcelog from kernel.org and it tells you a
>little bit more: it is an internal parity error. I don't know, though,
>what errors reported in bank 2 pertain to on this cpu model - Intel
>should know :).

Intel is a big place ... we didn't document which bank reports
which structure in the public data sheet ... so I'd have to find
some internal docs for that.  Not sure it would help though as
the MC2STATUS register value 0xb205 just says
"internal error" - without any extra bits in the "mscod" subfield.
This is the "default:" at the end of the internal "switch" where
all the usual suspects have been eliminated and we just know
that something went wrong.

One possibility is a thermal problem (since I presume that
video decode is making the CPU work hard). Does the machine
get noticeably warmer, or does the fan kick up to high speed
when this happens?

Another (harder to isolate) is power supply ... if the CPU is
drawing more power during video decode, then you might
see a brownout where voltage drops too low. This could cause
all sorts of internal problems.

How predictable is the problem?  If you power on the system
from cold (unused for a few minutes) and immediately start
watching a video (same video in multiple tests) ... do you see
the hang/crash at the same point in the movie? Or are there
large variations?

If we have a plain s/w bug doing something evil that trips
the problem, I'd expect a high level of repeatability. If you
have a thermal problem then you may be able to accelerate
it by partially blocking the fan exhaust. Electrical ... I suppose
it might be different on AC power vs. battery.

-Tony

[GIT PULL] Use perf/event tracing to report PCI Express advanced errors

2013-01-04 Thread Luck, Tony

The following changes since commit d1c3ed669a2d452cacfb48c2d171a1f364dae2ed:

  Linux 3.8-rc2 (2013-01-02 18:13:21 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-aer-trace

for you to fetch changes up to 2cced2d95961acd318e9395578a60ee424d9db80:

  aerdrv: Cleanup log output for AER (2013-01-03 14:35:41 -0800)


Use perf/event tracing to report PCI Express advanced errors.


Lance Ortiz (3):
  aerdrv: Trace Event for PCI Express Advanced Error Reporting
  aerdrv: Enhanced AER logging
  aerdrv: Cleanup log output for AER

 drivers/acpi/apei/cper.c               |   19 ++--
 drivers/pci/pcie/aer/aerdrv_errprint.c |   63 ++
 include/linux/aer.h                    |    4 +-
 include/trace/events/ras.h             |   77 
 4 files changed, 129 insertions(+), 34 deletions(-)
 create mode 100644 include/trace/events/ras.h

RE: [PATCH v9 3/3] aerdrv: Cleanup log output for AER

2013-01-04 Thread Luck, Tony

I sent a "please pull" to Ingo/Peter/Thomas about an hour ago ... if they
push back (or ignore) we can fold your ack and nit-picks into another
version.

> s/elimiating/eliminating/ above.
Ugh ... nobody spotted this one ("many eyes" really does work!)

> I remove the "v1-v2" notes when I merge patches because I don't think
> they're useful any more.  But if Tony applies these, he can use his
> judgment.

Yes - I do too.  In fact it is easier on the maintainer if this sort of meta-commentary
goes *after* the "---" in the patch. Then it is available for review (where it is
most helpful), but tools will automatically drop it when applying the patch.

-Tony

[PATCH] ia64: Make sure interrupts enabled when we "safe_halt()"

2013-04-16 Thread Luck, Tony

In commit d166991234347215dc23fc9dc15a63a83a1a54e1
   idle: Implement generic idle function
Thomas Gleixner cleaned up many things but perturbed some
fragile code that was keeping ia64 alive. So we started
seeing:
   WARNING: at kernel/cpu/idle.c:94 cpu_idle_loop+0x360/0x380()
and other unpleasantness like system hangs during boot.

We really shouldn't ever halt with interrupts disabled.

Signed-off-by: Tony Luck 

---

Please fold into the same branch as the generic idle changes.

diff --git a/arch/ia64/include/asm/irqflags.h b/arch/ia64/include/asm/irqflags.h
index 2b68d85..1bf2cf2 100644
--- a/arch/ia64/include/asm/irqflags.h
+++ b/arch/ia64/include/asm/irqflags.h
@@ -89,6 +89,7 @@ static inline bool arch_irqs_disabled(void)
 
 static inline void arch_safe_halt(void)
 {
+   arch_local_irq_enable();
ia64_pal_halt_light();  /* PAL_HALT_LIGHT */
 }
 

RE: [PATCH 0/5] ACPI / scan: Make it possible to use the container hotplug with other scan handlers

2013-06-14 Thread Luck, Tony

>> Tony promised me to test those patches on his box, so we'll know for sure
>> in a while.

Tested this series - and the box boots just fine with no unexpected messages.

But I should note that this box doesn't have anything that is hot pluggable, so I
couldn't test hotplug (which seems to be deeply involved with things that this
patch is touching).

Of course that means that I haven't been testing hotplug - so it might have been
broken for years and I'd never have noticed.

-Tony

RE: [GIT PULL] Some error injection fixes to queue for 3.11

2013-06-19 Thread Luck, Tony

>> Pulled, thanks Tony!
>> 
>> Len, are you fine with this route [tip:x86/ras tree] for the
>> drivers/acpi/apei/einj.c changes?
>
> Yes, the RAS guys basically own that code.

These patches also got picked up by Rafael and are in his ACPI tree
too.  I think the patches were applied identically, so there should not
be any merge conflicts when this all comes back together in the 3.11
merge window.

Rafael already had a chat about who will take future apei changes
so that we won't have this happen again.

-Tony

RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-19 Thread Luck, Tony

> Interesting, why? Why would we even need such an option? My impression
> is, if ACPI tells us FF, MCE code doesn't poll those banks anymore. So
> where do the duplicated reports come from?

The option is only disabling the Linux side of firmware first ... the BIOS
will still be doing it and generating records to feed to the OS using APEI.

So Linux may see the error in a bank and report it, and BIOS may report
the same error.  Though I'd expect that to be rare as whoever saw it first
would most likely clear the bank before the other could see it.

I asked for the option because I'm nervous about just skipping some banks
on the say-so of the BIOS ... what if the BIOS did something wrong. This
option gives us a way to return to the way things were before this patch.

These parts are now looking good ... but we still need to tackle what Linux
does when it does get the CPER record.  I suspect we need to preserve
the existing "fake an mcelog entry with just the address" on old platforms,
but need to do something smarter on new ones.

-Tony

RE: [PATCH v4 0/8] Nvram-to-pstore

2013-06-19 Thread Luck, Tony

> You need to mount pstore to access the files.
> 
> # mkdir /dev/pstore
> # mount -t pstore - /dev/pstore
> 
> to unmount
> 
> # umount /dev/pstore
> 
> References: http://lwn.net/Articles/421297/

Note that /dev/pstore has fallen out of fashion as the mount point ... we now (since 3.9)
suggest /sys/fs/pstore

> Documentation/ABI/testing/pstore

This file was updated with the new location.

-Tony
 

RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-19 Thread Luck, Tony

> Why, fill out struct mce and do mce_log(mce) does not suffice?

There is (or should be) a lot more interesting stuff in the CPER than just the address. Stuff
that we don't have fields for in the existing mcelog structure.  We also need to treat filtered
records from modern APEI implementations a bit differently from the old stuff.  The original
user of this code was Westmere-EX, which used it as a workaround for a missing address in
MCi_ADDR for corrected errors.  So in that scenario we had every error being reported and
the mcelog(8) daemon doing the threshold analysis to decide when to take action.

In this new modern world - Naveen wants to have the BIOS decide the threshold, so we'd
like Linux to take some action as soon as it sees just one CPER.

-Tony

RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-19 Thread Luck, Tony

>> There is (or should be)
>
> Ha!

Oh ye of little faith - I'm sure the BIOS will get this right this time :-)


> Ok, seriously: so the situation should still be fine, FF reported errors
> get the CPER format while the rest, the "old" MCE format.
>
> cper.c is doing printk so I'm guessing it would need to get its own
> tracepoint and carry that to userspace.

Yes - a tracepoint is the right answer here for all the new stuff.

> Concerning the RAS daemon, Robert and I are making good progress so once
> we have the persistent events in perf, we can read that tracepoint in
> userspace and do whatever we want with the error info.

Mauro has a rasdaemon in progress
   git://git.fedorahosted.org/rasdaemon.git
just picks up perf/events and logs to a sqlite database.

>> In this new modern world - Naveen wants to have the BIOS decide the
>> threshold, so we'd like Linux to take some action as soon as it sees
>> just one CPER.
>
> Why would Linux have to intervene if it is doing FF - wasn't the deal
> behind Firmware First for the firmware to get the error first and handle
> accordingly?

Because Linux can do runtime things that the BIOS can't - like offline a 4K page.
The idea here is that the BIOS does whatever the OEM thinks is the right level of
thresholding - not bothering the OS with petty details of random corrected
errors that mean nothing.  But if there is some repeated error (like a stuck bit)
then the BIOS can provide a CPER to the OS telling it that it would be a good idea
to stop using that page.

And this is where the semantics of a CPER change between the original WSM-EX
implementation ... where Linux expects to see all the errors and do its own
thresholding, only taking a page offline if it sees a lot of CPERs refer to the same
page; and now - where the BIOS does the counting and tells Linux just once to
take the page offline.

-Tony
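
The two semantics described above can be contrasted in a small sketch. Everything here is hypothetical: the tracker structure, the threshold of 3, and report_error() are illustrative only and do not reflect how mcelog(8) or the kernel actually count errors.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT	12	/* 4K pages */
#define DEMO_NSLOTS	64	/* tiny fixed table, enough for a sketch */
#define DEMO_THRESHOLD	3	/* arbitrary; must be at least 2 here */

struct page_count {
	uint64_t pfn;
	unsigned int hits;
};

struct tracker {
	struct page_count slot[DEMO_NSLOTS];
	size_t used;
};

/* Return 1 when the page containing 'addr' should be taken offline. */
static int report_error(struct tracker *t, uint64_t addr, int bios_did_threshold)
{
	uint64_t pfn = addr >> DEMO_PAGE_SHIFT;
	size_t i;

	/* New style: the BIOS already counted, so one record is the verdict. */
	if (bios_did_threshold)
		return 1;

	/* Old style: the OS sees every error and counts per page itself. */
	for (i = 0; i < t->used; i++)
		if (t->slot[i].pfn == pfn)
			return ++t->slot[i].hits >= DEMO_THRESHOLD;

	if (t->used < DEMO_NSLOTS) {
		t->slot[t->used].pfn = pfn;
		t->slot[t->used].hits = 1;
		t->used++;
	}
	return 0;
}
```

The open question in the thread is how the consumer learns which of the two branches applies to a given record, which is why a flag carried in the CPER itself is attractive.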

RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-19 Thread Luck, Tony

> Ok, where is that semantics? What in a CPER record does say "this error
> should tell you that you need to offline the containing page and I'm
> telling you this exactly only once"? Error Severity 0, i.e. Recoverable?

Naveen - this one is for you (or for your BIOS team).  Can you get us a sample
CPER that you plan to provide when the BIOS decides that its threshold has
been exceeded?  How will it be different from what old WSM-EX platforms
were sending to us?  Hopefully the answer is encoded in the CPER record
and not in some code we have to put in Linux to say
"if (IBMplatform) do_thing_1(); else ... "

> Ok, we're talking about the S in RAS now. Do we have error recovery
> strategies specified anywhere? Are they per-platform or generic? Is this
> CPER strategy above, for example, only valid for some platforms or for
> all APEI-using hardware?

mcelog(8) daemon has been doing this for years ... but it used the "predictive
failure analysis" buzzwords that were popular way back then (today the
marketing people seem to prefer "self healing"). Whatever the name, the
concept is the same ... take some set of corrected event reports and infer
from them that something worse may happen soon, and use that information
to try to avoid the (possibly) impending crash.

> Questions over questions...

Questions are good - they help fill out gaps

-Tony

RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-19 Thread Luck, Tony

> The above question about what to do *without* going to userspace and
> back is maybe more interesting and we'd need a clean design there...
> we'll see.

Yes - this case (where the BIOS did all the threshold math and made the decision)
should be one where the Linux kernel could just implement the action directly.
Perhaps controlled by a knob to say whether we really trust the BIOS that much.

But we will also have cases where a smart user agent can correlate data
from multiple sources to identify the real root cause (e.g. some temperature
anomalies around the same time as some memory errors that occur at 10am
on the third Tuesday each month -> cause is air conditioner maintenance guy
that shuts down the a/c for 10 minutes to change the filter).

I'll leave writing an agent that smart as an exercise for the concerned data
center manager :-)

-Tony

[PATCH] [IA64] sim: Add casts to avoid assignment warnings

2013-06-20 Thread Luck, Tony

Pointers in the efi_runtime_services_t structure now have type
"void *" (formerly they were "unsigned long"). So we now see a
bunch of warnings like this:

arch/ia64/hp/sim/boot/fw-emu.c:293: warning: assignment makes pointer from integer without a cast

Add (void *) casts to the 10 affected lines to make the build quiet again.

Signed-off-by: Tony Luck 

---

Boris, Matt - Can you add this patch to the same tree that

   commit 43ab0476a648053e5998bf081f47f215375a4502 [linux-next id]
   efi: Convert runtime services function ptrs

is in so that it will follow along behind it.  Thanks.

 arch/ia64/hp/sim/boot/fw-emu.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/ia64/hp/sim/boot/fw-emu.c b/arch/ia64/hp/sim/boot/fw-emu.c
index 271f412..87bf9ad 100644
--- a/arch/ia64/hp/sim/boot/fw-emu.c
+++ b/arch/ia64/hp/sim/boot/fw-emu.c
@@ -290,16 +290,16 @@ sys_fw_init (const char *args, int arglen)
efi_runtime->hdr.signature = EFI_RUNTIME_SERVICES_SIGNATURE;
efi_runtime->hdr.revision = EFI_RUNTIME_SERVICES_REVISION;
efi_runtime->hdr.headersize = sizeof(efi_runtime->hdr);
-   efi_runtime->get_time = __pa(&fw_efi_get_time);
-   efi_runtime->set_time = __pa(&efi_unimplemented);
-   efi_runtime->get_wakeup_time = __pa(&efi_unimplemented);
-   efi_runtime->set_wakeup_time = __pa(&efi_unimplemented);
-   efi_runtime->set_virtual_address_map = __pa(&efi_unimplemented);
-   efi_runtime->get_variable = __pa(&efi_unimplemented);
-   efi_runtime->get_next_variable = __pa(&efi_unimplemented);
-   efi_runtime->set_variable = __pa(&efi_unimplemented);
-   efi_runtime->get_next_high_mono_count = __pa(&efi_unimplemented);
-   efi_runtime->reset_system = __pa(&efi_reset_system);
+   efi_runtime->get_time = (void *)__pa(&fw_efi_get_time);
+   efi_runtime->set_time = (void *)__pa(&efi_unimplemented);
+   efi_runtime->get_wakeup_time = (void *)__pa(&efi_unimplemented);
+   efi_runtime->set_wakeup_time = (void *)__pa(&efi_unimplemented);
+   efi_runtime->set_virtual_address_map = (void *)__pa(&efi_unimplemented);
+   efi_runtime->get_variable = (void *)__pa(&efi_unimplemented);
+   efi_runtime->get_next_variable = (void *)__pa(&efi_unimplemented);
+   efi_runtime->set_variable = (void *)__pa(&efi_unimplemented);
+   efi_runtime->get_next_high_mono_count = (void *)__pa(&efi_unimplemented);
+   efi_runtime->reset_system = (void *)__pa(&efi_reset_system);
 
efi_tables->guid = SAL_SYSTEM_TABLE_GUID;
efi_tables->table = __pa(sal_systab);
-- 
1.8.1.4
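
The warning and the fix can be reproduced outside the kernel. What follows
is a minimal sketch, not the fw-emu.c code: fake_pa() and
struct fake_runtime_services are hypothetical stand-ins for __pa() and
efi_runtime_services_t, assuming only that the macro yields an integer and
that the structure field is a "void *".

```c
#include <stdint.h>

/* Hypothetical stand-in for efi_runtime_services_t: fields are "void *". */
struct fake_runtime_services {
	void *get_time;
};

/* Hypothetical stand-in for __pa(): hands back an address as an integer. */
static uintptr_t fake_pa(const void *addr)
{
	return (uintptr_t)addr;
}

/* Dummy object whose address plays the role of &fw_efi_get_time. */
static int fake_get_time_storage;

/* Returns 1 if the stored pointer refers to the expected address. */
static int install_get_time(struct fake_runtime_services *rt)
{
	/*
	 * Without the (void *) cast, gcc reports
	 * "assignment makes pointer from integer without a cast".
	 */
	rt->get_time = (void *)fake_pa(&fake_get_time_storage);
	return rt->get_time == (void *)&fake_get_time_storage;
}
```

The cast only tells the compiler that the integer-to-pointer conversion is
intentional; the value stored is the same as before.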


RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ff mode for corrected errors

2013-06-20 Thread Luck, Tony

> - Two, the Generic Error Data Entry (aka UEFI Section Descriptor) has a 
> flag which indicates 'Error Threshold Exceeded'. From the UEFI spec, it 
> looks like we could consider this as an indication to offline the page; 
> though I am not sure if/how this relates to the threshold value above.

This one seems to make sense ... the flag description sounds like exactly what
we want - I won't feel embarrassed explaining to people why Linux takes
action when it sees a record like this.

-Tony
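
As a rough sketch of the check being discussed (not the actual APEI code):
the struct and function names below are hypothetical, and the bit position
assumes that "Error threshold exceeded" is bit 3 of the section descriptor
flags, which should be confirmed against the UEFI spec before relying on it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the "Error threshold exceeded" flag (bit 3). */
#define SEC_ERROR_THRESHOLD_EXCEEDED 0x08u

/* Hypothetical, trimmed-down view of a Generic Error Data Entry. */
struct fake_gedata {
	uint8_t flags;
};

/* True when firmware says the error threshold has been exceeded. */
static bool should_offline_page(const struct fake_gedata *entry)
{
	return (entry->flags & SEC_ERROR_THRESHOLD_EXCEEDED) != 0;
}
```

A caller would look at this result for a corrected memory error record and
only then consider offlining the affected page.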

RE: [PATCH v3] aerdrv: Move cper_print_aer() call out of interrupt context

2013-05-29 Thread Luck, Tony

> + /*
> +  * TODO: This function needs to be re-written so that it's output
> +  * matches the output of aer_print_error().  Right now, the output
> +  * is formatted very differently.
> +  */

So we have this big "TODO" comment sitting there very prominently ... which
Linus is bound to ask about if I ask him to pull this into 3.10-rcX ... what's
the impact of this?  What should I say when he asks why should he pull this fix
into 3.10 when there is still some work to do?  Is matching the output no big
deal and can wait for some future, while moving the pci bits to the work
function needs to go in now?

-Tony

[GIT PULL] Fix aer error logging

2013-05-31 Thread Luck, Tony

The following changes since commit e4aa937ec75df0eea0bee03bffa3303ad36c986b:

  Linux 3.10-rc3 (2013-05-26 16:00:47 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-aertracefix

for you to fetch changes up to 37448adfc7ce0d6d5892b87aa8d57edde4126f49:

  aerdrv: Move cper_print_aer() call out of interrupt context (2013-05-30 10:51:20 -0700)


Can't call pci_get_domain_bus_and_slot() from interrupt context


Lance Ortiz (1):
  aerdrv: Move cper_print_aer() call out of interrupt context

 drivers/acpi/apei/cper.c               | 18 --
 drivers/acpi/apei/ghes.c               |  4 +++-
 drivers/pci/pcie/aer/aerdrv_core.c     |  5 -
 drivers/pci/pcie/aer/aerdrv_errprint.c |  4 ++--
 include/linux/aer.h                    |  5 +++--
 5 files changed, 12 insertions(+), 24 deletions(-)

RE: [PATCH] efi, pstore: Cocci spatch "memdup.spatch"

2013-06-03 Thread Luck, Tony

> Who wants to pick this one up? Tony?

Sure - I'll take it.

-Tony

RE: [PATCH 3/3] powerpc/pseries: Support compression of oops text via pstore

2013-06-25 Thread Luck, Tony

> Introducing headersize in pstore_write() API would need changes at
> multiple places where it's being called. The idea is to move the
> compression support to pstore infrastructure so that other platforms
> could also make use of it.

Any thoughts on the back/forward compatibility as we switch to compressed
pstore data?   E.g. imagine I have a system installed with some Linux
distribution with a kernel too old to know about compressed pstore. I use that
machine to run the latest kernels that do compression ... and one fine day one
of them crashes hard - logging in compressed form to pstore. Now I boot my
distro kernel to pick up the pieces ... what do I see in /sys/fs/pstore/*?
Some compressed files? Can I read them with some tool?

This is somewhat of a corner case - but not completely unrealistic ... I'd at
least like to be reassured that the old kernel won't choke when it sees the
compressed blobs.

-Tony


RE: [PATCH] x86/MCE: Update MCE severity condition check

2013-06-25 Thread Luck, Tony

> The SDM talks about "non-affected" logical processors, but perhaps we
> can call this an "unaffected" thread?

"unaffected" sounds a bit more natural (but close enough to the wording in
the SDM that people should see the connection).

-Tony

RE: [PATCH v2 1/2] mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC

2013-06-25 Thread Luck, Tony

+/*
+ * Indicates MCA banks controlled by the current cpu for CMCI. Note that this
+ * can change when a cpu is offlined or brought online since some MCA banks
+ * are shared across cpus. When a cpu is offlined, cmci_clear() disables CMCI
+ * on all banks owned by the cpu and clears this bitfield. At this point,
+ * cmci_rediscover() kicks in and a different cpu may end up taking
+ * ownership of some of the shared MCA banks that were previously owned
+ * by the offlined cpu.
+ */
 static DEFINE_PER_CPU(mce_banks_t, mce_banks_owned);

Maybe an extra sentence or two at the beginning to say *why* we need this.
E.g.

/*
 * CMCI can be delivered to multiple cpus that share a machine check bank
 * so we need to designate a single cpu to process errors logged in each bank
 * in the interrupt handler (otherwise we would have many races and potential
 * double reporting of the same error).
 */
...

-Tony
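
To make the ownership idea concrete, here is a simplified single-threaded
sketch. It is not the mce_intel.c implementation: banks_owned[],
fake_cmci_discover() and fake_cmci_clear() are hypothetical names, the
"visible banks" mask is passed in by the caller rather than probed from
hardware, and there is no locking.

```c
#include <stdint.h>

#define NR_FAKE_CPUS 4

/* One bit per bank; a set bit means "this cpu handles CMCI for the bank". */
static uint32_t banks_owned[NR_FAKE_CPUS];

/* Claim every visible bank that no other cpu has already claimed. */
static void fake_cmci_discover(int cpu, uint32_t visible_banks)
{
	uint32_t taken = 0;

	for (int other = 0; other < NR_FAKE_CPUS; other++)
		if (other != cpu)
			taken |= banks_owned[other];

	banks_owned[cpu] |= visible_banks & ~taken;
}

/* Called when a cpu goes offline: give up all banks it owned. */
static void fake_cmci_clear(int cpu)
{
	banks_owned[cpu] = 0;
}
```

With two cpus sharing banks 1 and 2, the first to run discovery owns them;
after it is cleared, a rediscovery pass on the second cpu picks them up, so
each bank has at most one owner at any time.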

[GIT PULL] for tip x86/ras branch - queue for 3.11

2013-06-25 Thread Luck, Tony

The following changes since commit 9e895ace5d82df8929b16f58e9f515f6d54ab82d:

  Linux 3.10-rc7 (2013-06-22 09:47:31 -1000)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/please-pull-mce-bitmap-comment

for you to fetch changes up to 0644414e62561f0ba1bea7c5ba6a94cc50dac3e3:

  mce: acpi/apei: Add comments to clarify usage of the various bitfields in the MCA subsystem (2013-06-25 13:53:27 -0700)


Better comments so we understand our existing machine check
bank bitmaps - prelude to adding another bitmap soon.


Naveen N. Rao (1):
  mce: acpi/apei: Add comments to clarify usage of the various bitfields in the MCA subsystem

 arch/x86/kernel/cpu/mcheck/mce.c   |  5 -
 arch/x86/kernel/cpu/mcheck/mce_intel.c | 12 
 2 files changed, 16 insertions(+), 1 deletion(-)
