from:"Mike Travis"

Re: [PATCH 13/14] x86/UV: Update UV support for external NMI signals

2013-03-19 Thread Mike Travis

On 3/14/2013 12:20 AM, Ingo Molnar wrote:
> 
> * Mike Travis  wrote:
> 
>>
>> There is an exception where the NMI_LOCAL notifier chain is used. When 
>> the perf tools are in use, it's possible that our NMI was captured by 
>> some other NMI handler and then ignored.  We set a per_cpu flag for 
>> those CPUs that ignored the initial NMI, and then send them an IPI NMI 
>> signal.
> 
> "Other" NMI handlers should never lose NMIs - if they do then they should 
> be fixed I think.
> 
> Thanks,
> 
>   Ingo

Hi Ingo,

I suspect that the other NMI handlers would not grab ours if we were
on the NMI_LOCAL chain to claim them.  The problem though is the UV
Hub is not designed to have that amount of traffic reading the MMRs.
This was handled in previous kernel versions by a.) putting us at the
bottom of the chain; and b.) as soon as a handler claimed an NMI as
its own, the search would be stopped.

Neither of these is true any more, as all handlers are now called for
all NMIs.  (I measured anywhere from 0.5M to 4M NMIs per second on a
64-socket, 1024-CPU-thread system [not sure why the rate changes].)
This was the primary motivation for placing the UV NMI handler on the
NMI_UNKNOWN chain, so it would be called only if all other handlers
"gave up", and thus not incur the overhead of the MMR reads on every
NMI event.

The good news is that I haven't yet encountered a case where the
"missing" cpus were not called into the NMI loop.  Even better news
is that on the previous (3.0 vintage) kernels running two perf tops
would almost always cause either tons of the infamous "dazed and
confused" messages, or would lock up the system.  Now it results in
quite a few messages like:

[  961.119417] perf_event_intel: clearing PMU state on CPU#652

followed by a dump of a number of cpu PMC registers.  But the system
remains responsive.  (This was experienced in our Customer Training
Lab where multiple system admins were in the class.)

The bad news is I'm not sure why the errant NMI interrupts are lost.
I have noticed that restricting the 'perf tops' to separate and
distinct cpusets seems to lessen this "stomping on each other's perf
event handlers" effect, which might be more representative of actual
customer usage.

So in total the situation is vastly improved... :)

Thanks,
Mike
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 0/7] x86: SGI UV3 Kernel Updates

2013-02-05 Thread Mike Travis


Kernel updates for SGI Ultraviolet system 3 (UV3)

The new MMR definitions are added, and then the updates
to each module are applied.  Afterwards, a "trim" patch
reduces the size of the MMR definitions file by about
a third.  This keeps "bi-sectability" in place.

[PATCH 3/7] x86: UV3 Update Hub Info

2013-02-05 Thread Mike Travis

This patch updates the UV HUB info for UV3.

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
---
 arch/x86/include/asm/uv/uv_hub.h |   44 +++
 1 file changed, 36 insertions(+), 8 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv_hub.h
+++ linux/arch/x86/include/asm/uv/uv_hub.h
@@ -5,7 +5,7 @@
  *
  * SGI UV architectural definitions
  *
- * Copyright (C) 2007-2010 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2007-2013 Silicon Graphics, Inc. All rights reserved.
  */
 
 #ifndef _ASM_X86_UV_UV_HUB_H
@@ -175,6 +175,7 @@ DECLARE_PER_CPU(struct uv_hub_info_s, __
  */
 #define UV1_HUB_REVISION_BASE  1
 #define UV2_HUB_REVISION_BASE  3
+#define UV3_HUB_REVISION_BASE  5
 
 static inline int is_uv1_hub(void)
 {
@@ -183,6 +184,23 @@ static inline int is_uv1_hub(void)
 
 static inline int is_uv2_hub(void)
 {
+   return ((uv_hub_info->hub_revision >= UV2_HUB_REVISION_BASE) &&
+   (uv_hub_info->hub_revision < UV3_HUB_REVISION_BASE));
+}
+
+static inline int is_uv3_hub(void)
+{
+   return uv_hub_info->hub_revision >= UV3_HUB_REVISION_BASE;
+}
+
+static inline int is_uv_hub(void)
+{
+   return uv_hub_info->hub_revision;
+}
+
+/* code common to uv2 and uv3 only */
+static inline int is_uvx_hub(void)
+{
return uv_hub_info->hub_revision >= UV2_HUB_REVISION_BASE;
 }
 
@@ -230,14 +248,23 @@ union uvh_apicid {
 #define UV2_LOCAL_MMR_SIZE (32UL * 1024 * 1024)
 #define UV2_GLOBAL_MMR32_SIZE  (32UL * 1024 * 1024)
 
-#define UV_LOCAL_MMR_BASE  (is_uv1_hub() ? UV1_LOCAL_MMR_BASE \
-   : UV2_LOCAL_MMR_BASE)
-#define UV_GLOBAL_MMR32_BASE   (is_uv1_hub() ? UV1_GLOBAL_MMR32_BASE  \
-   : UV2_GLOBAL_MMR32_BASE)
-#define UV_LOCAL_MMR_SIZE  (is_uv1_hub() ? UV1_LOCAL_MMR_SIZE :   \
-   UV2_LOCAL_MMR_SIZE)
+#define UV3_LOCAL_MMR_BASE 0xfa00UL
+#define UV3_GLOBAL_MMR32_BASE  0xfc00UL
+#define UV3_LOCAL_MMR_SIZE (32UL * 1024 * 1024)
+#define UV3_GLOBAL_MMR32_SIZE  (32UL * 1024 * 1024)
+
+#define UV_LOCAL_MMR_BASE  (is_uv1_hub() ? UV1_LOCAL_MMR_BASE : \
+   (is_uv2_hub() ? UV2_LOCAL_MMR_BASE : \
+   UV3_LOCAL_MMR_BASE))
+#define UV_GLOBAL_MMR32_BASE   (is_uv1_hub() ? UV1_GLOBAL_MMR32_BASE :\
+   (is_uv2_hub() ? UV2_GLOBAL_MMR32_BASE :\
+   UV3_GLOBAL_MMR32_BASE))
+#define UV_LOCAL_MMR_SIZE  (is_uv1_hub() ? UV1_LOCAL_MMR_SIZE : \
+   (is_uv2_hub() ? UV2_LOCAL_MMR_SIZE : \
+   UV3_LOCAL_MMR_SIZE))
 #define UV_GLOBAL_MMR32_SIZE   (is_uv1_hub() ? UV1_GLOBAL_MMR32_SIZE :\
-   UV2_GLOBAL_MMR32_SIZE)
+   (is_uv2_hub() ? UV2_GLOBAL_MMR32_SIZE :\
+   UV3_GLOBAL_MMR32_SIZE))
 #define UV_GLOBAL_MMR64_BASE   (uv_hub_info->global_mmr_base)
 
 #define UV_GLOBAL_GRU_MMR_BASE 0x400
@@ -599,6 +626,7 @@ static inline void uv_hub_send_ipi(int p
  * 1 - UV1 rev 1.0 initial silicon
  * 2 - UV1 rev 2.0 production silicon
  * 3 - UV2 rev 1.0 initial silicon
+ * 5 - UV3 rev 1.0 initial silicon
  */
 static inline int uv_get_min_hub_revision_id(void)
 {

[PATCH 2/7] x86, UV: UV3 Update ACPI Check

2013-02-05 Thread Mike Travis

Add UV3 to the exclusion list.  Instead of listing every
new series of SGI UV systems, just check that oem_id
has the prefix "SGI".

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
Cc: Jiang Liu 
Cc: Bjorn Helgaas 
Cc: Yinghai Lu 
Cc: Greg Kroah-Hartman 
---
 arch/x86/pci/mmconfig-shared.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- linux.orig/arch/x86/pci/mmconfig-shared.c
+++ linux/arch/x86/pci/mmconfig-shared.c
@@ -548,8 +548,7 @@ static int __init acpi_mcfg_check_entry(
if (cfg->address < 0x)
return 0;
 
-   if (!strcmp(mcfg->header.oem_id, "SGI") ||
-   !strcmp(mcfg->header.oem_id, "SGI2"))
+   if (!strncmp(mcfg->header.oem_id, "SGI", 3))
return 0;
 
if (mcfg->header.revision >= 1) {

[PATCH 6/7] x86, UV: UV3 Check current gru hub support.

2013-02-05 Thread Mike Travis

This patch checks current hub support.

Signed-off-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 drivers/misc/sgi-gru/grufile.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux.orig/drivers/misc/sgi-gru/grufile.c
+++ linux/drivers/misc/sgi-gru/grufile.c
@@ -517,7 +517,7 @@ static int __init gru_init(void)
 {
int ret;
 
-   if (!is_uv_system())
+   if (!is_uv_system() || (is_uvx_hub() && !is_uv2_hub()))
return 0;
 
 #if defined CONFIG_IA64

[PATCH 4/7] x86, UV: UV3 Update x2apic Support

2013-02-05 Thread Mike Travis

This patch adds support for the SGI UV3 hub to the
common x2apic functions.

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
Cc: Alexander Gordeev 
Cc: Suresh Siddha 
Cc: Michael S. Tsirkin 
Cc: Steffen Persvold 
---
 arch/x86/kernel/apic/x2apic_uv_x.c |  206 ++---
 1 file changed, 171 insertions(+), 35 deletions(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -5,7 +5,7 @@
  *
  * SGI UV APIC functions (note: not an Intel compatible APIC)
  *
- * Copyright (C) 2007-2010 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2007-2013 Silicon Graphics, Inc. All rights reserved.
  */
 #include 
 #include 
@@ -91,10 +91,16 @@ static int __init early_get_pnodeid(void
m_n_config.v = uv_early_read_mmr(UVH_RH_GAM_CONFIG_MMR);
uv_min_hub_revision_id = node_id.s.revision;
 
-   if (node_id.s.part_number == UV2_HUB_PART_NUMBER)
-   uv_min_hub_revision_id += UV2_HUB_REVISION_BASE - 1;
-   if (node_id.s.part_number == UV2_HUB_PART_NUMBER_X)
+   switch (node_id.s.part_number) {
+   case UV2_HUB_PART_NUMBER:
+   case UV2_HUB_PART_NUMBER_X:
uv_min_hub_revision_id += UV2_HUB_REVISION_BASE - 1;
+   break;
+   case UV3_HUB_PART_NUMBER:
+   case UV3_HUB_PART_NUMBER_X:
+   uv_min_hub_revision_id += UV3_HUB_REVISION_BASE - 1;
+   break;
+   }
 
uv_hub_info->hub_revision = uv_min_hub_revision_id;
pnode = (node_id.s.node_id >> 1) & ((1 << m_n_config.s.n_skt) - 1);
@@ -130,13 +136,16 @@ static void __init uv_set_apicid_hibit(v
 
 static int __init uv_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
 {
-   int pnodeid, is_uv1, is_uv2;
+   int pnodeid, is_uv1, is_uv2, is_uv3;
 
is_uv1 = !strcmp(oem_id, "SGI");
is_uv2 = !strcmp(oem_id, "SGI2");
-   if (is_uv1 || is_uv2) {
+   is_uv3 = !strncmp(oem_id, "SGI3", 4);   /* there are varieties of UV3 */
+   if (is_uv1 || is_uv2 || is_uv3) {
uv_hub_info->hub_revision =
-   is_uv1 ? UV1_HUB_REVISION_BASE : UV2_HUB_REVISION_BASE;
+   (is_uv1 ? UV1_HUB_REVISION_BASE :
+   (is_uv2 ? UV2_HUB_REVISION_BASE :
+ UV3_HUB_REVISION_BASE));
pnodeid = early_get_pnodeid();
early_get_apic_pnode_shift();
x86_platform.is_untracked_pat_range =  uv_is_untracked_pat_range;
@@ -450,14 +459,17 @@ static __init void map_high(char *id, un
 
paddr = base << pshift;
bytes = (1UL << bshift) * (max_pnode + 1);
-   printk(KERN_INFO "UV: Map %s_HI 0x%lx - 0x%lx\n", id, paddr,
-   paddr + bytes);
+   if (!paddr) {
+   pr_info("UV: Map %s_HI base address NULL\n", id);
+   return;
+   }
+   pr_info("UV: Map %s_HI 0x%lx - 0x%lx\n", id, paddr, paddr + bytes);
if (map_type == map_uc)
init_extra_mapping_uc(paddr, bytes);
else
init_extra_mapping_wb(paddr, bytes);
-
 }
+
 static __init void map_gru_high(int max_pnode)
 {
union uvh_rh_gam_gru_overlay_config_mmr_u gru;
@@ -468,7 +480,8 @@ static __init void map_gru_high(int max_
map_high("GRU", gru.s.base, shift, shift, max_pnode, map_wb);
gru_start_paddr = ((u64)gru.s.base << shift);
gru_end_paddr = gru_start_paddr + (1UL << shift) * (max_pnode + 1);
-
+   } else {
+   pr_info("UV: GRU disabled\n");
}
 }
 
@@ -480,23 +493,146 @@ static __init void map_mmr_high(int max_
mmr.v = uv_read_local_mmr(UVH_RH_GAM_MMR_OVERLAY_CONFIG_MMR);
if (mmr.s.enable)
map_high("MMR", mmr.s.base, shift, shift, max_pnode, map_uc);
+   else
+   pr_info("UV: MMR disabled\n");
 }
 
-static __init void map_mmioh_high(int max_pnode)
+/*
+ * This commonality works because both 0 & 1 versions of the MMIOH OVERLAY
+ * and REDIRECT MMR regs are exactly the same on UV3.
+ */
+struct mmioh_config {
+   unsigned long overlay;
+   unsigned long redirect;
+   char *id;
+};
+
+static __initdata struct mmioh_config mmiohs[] = {
+   {
+   UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR,
+   UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR,
+   "MMIOH0"
+   },
+   {
+   UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR,
+   UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR,
+   "MMIOH1"
+   },
+};
+
+static __init void map_mmioh_high_uv3(int index, int min_pnode, int max_pnode)
+{
+   union uv3h_rh_gam_mmioh_overlay_config0_mmr_u overlay;
+

[PATCH 5/7] x86, UV: UV3 Update Time Support

2013-02-05 Thread Mike Travis

This patch updates time support for the SGI UV3 hub.

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
---
 arch/x86/platform/uv/uv_time.c |   13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

--- linux.orig/arch/x86/platform/uv/uv_time.c
+++ linux/arch/x86/platform/uv/uv_time.c
@@ -15,7 +15,7 @@
  *  along with this program; if not, write to the Free Software
  *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
  *
- *  Copyright (c) 2009 Silicon Graphics, Inc.  All Rights Reserved.
+ *  Copyright (c) 2009-2013 Silicon Graphics, Inc.  All Rights Reserved.
  *  Copyright (c) Dimitri Sivanich
  */
 #include 
@@ -102,9 +102,10 @@ static int uv_intr_pending(int pnode)
if (is_uv1_hub())
return uv_read_global_mmr64(pnode, UVH_EVENT_OCCURRED0) &
UV1H_EVENT_OCCURRED0_RTC1_MASK;
-   else
-   return uv_read_global_mmr64(pnode, UV2H_EVENT_OCCURRED2) &
-   UV2H_EVENT_OCCURRED2_RTC_1_MASK;
+   else if (is_uvx_hub())
+   return uv_read_global_mmr64(pnode, UVXH_EVENT_OCCURRED2) &
+   UVXH_EVENT_OCCURRED2_RTC_1_MASK;
+   return 0;
 }
 
 /* Setup interrupt and return non-zero if early expiration occurred. */
@@ -122,8 +123,8 @@ static int uv_setup_intr(int cpu, u64 ex
uv_write_global_mmr64(pnode, UVH_EVENT_OCCURRED0_ALIAS,
UV1H_EVENT_OCCURRED0_RTC1_MASK);
else
-   uv_write_global_mmr64(pnode, UV2H_EVENT_OCCURRED2_ALIAS,
-   UV2H_EVENT_OCCURRED2_RTC_1_MASK);
+   uv_write_global_mmr64(pnode, UVXH_EVENT_OCCURRED2_ALIAS,
+   UVXH_EVENT_OCCURRED2_RTC_1_MASK);
 
val = (X86_PLATFORM_IPI_VECTOR << UVH_RTC1_INT_CONFIG_VECTOR_SHFT) |
((u64)apicid << UVH_RTC1_INT_CONFIG_APIC_ID_SHFT);

Re: [PATCH 4/7] x86, UV: UV3 Update x2apic Support

2013-02-06 Thread Mike Travis



Subject: x86, UV: UV3 Update x2apic Support

This patch adds support for the SGI UV3 hub to the common
x2apic functions.  The primary changes account for the
similarities between UV2 and UV3, which are encompassed within the
"UVX" nomenclature.  One significant difference in UV3 is
the handling of the MMIOH regions, which are redirected to the
target blade (with the device) in a different manner.  UV3 also
has two MMIOH regions, for both small and large BARs.  This aids
in limiting the amount of physical address space used for I/O
(and removed from real memory) in the max config of 64TB.

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
Cc: Alexander Gordeev 
Cc: Suresh Siddha 
Cc: Michael S. Tsirkin 
Cc: Steffen Persvold 
---
 arch/x86/kernel/apic/x2apic_uv_x.c |  206 ++---
 1 file changed, 171 insertions(+), 35 deletions(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -5,7 +5,7 @@
  *
  * SGI UV APIC functions (note: not an Intel compatible APIC)
  *
- * Copyright (C) 2007-2010 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2007-2013 Silicon Graphics, Inc. All rights reserved.
  */
 #include 
 #include 
@@ -91,10 +91,16 @@ static int __init early_get_pnodeid(void
m_n_config.v = uv_early_read_mmr(UVH_RH_GAM_CONFIG_MMR);
uv_min_hub_revision_id = node_id.s.revision;
 
-   if (node_id.s.part_number == UV2_HUB_PART_NUMBER)
-   uv_min_hub_revision_id += UV2_HUB_REVISION_BASE - 1;
-   if (node_id.s.part_number == UV2_HUB_PART_NUMBER_X)
+   switch (node_id.s.part_number) {
+   case UV2_HUB_PART_NUMBER:
+   case UV2_HUB_PART_NUMBER_X:
uv_min_hub_revision_id += UV2_HUB_REVISION_BASE - 1;
+   break;
+   case UV3_HUB_PART_NUMBER:
+   case UV3_HUB_PART_NUMBER_X:
+   uv_min_hub_revision_id += UV3_HUB_REVISION_BASE - 1;
+   break;
+   }
 
uv_hub_info->hub_revision = uv_min_hub_revision_id;
pnode = (node_id.s.node_id >> 1) & ((1 << m_n_config.s.n_skt) - 1);
@@ -130,13 +136,16 @@ static void __init uv_set_apicid_hibit(v
 
 static int __init uv_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
 {
-   int pnodeid, is_uv1, is_uv2;
+   int pnodeid, is_uv1, is_uv2, is_uv3;
 
is_uv1 = !strcmp(oem_id, "SGI");
is_uv2 = !strcmp(oem_id, "SGI2");
-   if (is_uv1 || is_uv2) {
+   is_uv3 = !strncmp(oem_id, "SGI3", 4);   /* there are varieties of UV3 */
+   if (is_uv1 || is_uv2 || is_uv3) {
uv_hub_info->hub_revision =
-   is_uv1 ? UV1_HUB_REVISION_BASE : UV2_HUB_REVISION_BASE;
+   (is_uv1 ? UV1_HUB_REVISION_BASE :
+   (is_uv2 ? UV2_HUB_REVISION_BASE :
+ UV3_HUB_REVISION_BASE));
pnodeid = early_get_pnodeid();
early_get_apic_pnode_shift();
x86_platform.is_untracked_pat_range =  uv_is_untracked_pat_range;
@@ -450,14 +459,17 @@ static __init void map_high(char *id, un
 
paddr = base << pshift;
bytes = (1UL << bshift) * (max_pnode + 1);
-   printk(KERN_INFO "UV: Map %s_HI 0x%lx - 0x%lx\n", id, paddr,
-   paddr + bytes);
+   if (!paddr) {
+   pr_info("UV: Map %s_HI base address NULL\n", id);
+   return;
+   }
+   pr_info("UV: Map %s_HI 0x%lx - 0x%lx\n", id, paddr, paddr + bytes);
if (map_type == map_uc)
init_extra_mapping_uc(paddr, bytes);
else
init_extra_mapping_wb(paddr, bytes);
-
 }
+
 static __init void map_gru_high(int max_pnode)
 {
union uvh_rh_gam_gru_overlay_config_mmr_u gru;
@@ -468,7 +480,8 @@ static __init void map_gru_high(int max_
map_high("GRU", gru.s.base, shift, shift, max_pnode, map_wb);
gru_start_paddr = ((u64)gru.s.base << shift);
gru_end_paddr = gru_start_paddr + (1UL << shift) * (max_pnode + 1);
-
+   } else {
+   pr_info("UV: GRU disabled\n");
}
 }
 
@@ -480,23 +493,146 @@ static __init void map_mmr_high(int max_
mmr.v = uv_read_local_mmr(UVH_RH_GAM_MMR_OVERLAY_CONFIG_MMR);
if (mmr.s.enable)
map_high("MMR", mmr.s.base, shift, shift, max_pnode, map_uc);
+   else
+   pr_info("UV: MMR disabled\n");
 }
 
-static __init void map_mmioh_high(int max_pnode)
+/*
+ * This commonality works because both 0 & 1 versions of the MMIOH OVERLAY
+ * and REDIRECT MMR regs are exactly the same on UV3.
+ */
+struct mmioh_config {
+   unsigned long overlay;
+   uns

[PATCH 3/7] x86: UV3 Update Hub Info

2013-02-08 Thread Mike Travis

This patch updates the UV HUB info for UV3.  The "is_uv3_hub" and
"is_uvx_hub" (UV2 or UV3) functions are added as well as the addresses
and sizes of the MMR regions for UV3.

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
---
 arch/x86/include/asm/uv/uv_hub.h |   44 +++
 1 file changed, 36 insertions(+), 8 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv_hub.h
+++ linux/arch/x86/include/asm/uv/uv_hub.h
@@ -5,7 +5,7 @@
  *
  * SGI UV architectural definitions
  *
- * Copyright (C) 2007-2010 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2007-2013 Silicon Graphics, Inc. All rights reserved.
  */
 
 #ifndef _ASM_X86_UV_UV_HUB_H
@@ -175,6 +175,7 @@ DECLARE_PER_CPU(struct uv_hub_info_s, __
  */
 #define UV1_HUB_REVISION_BASE  1
 #define UV2_HUB_REVISION_BASE  3
+#define UV3_HUB_REVISION_BASE  5
 
 static inline int is_uv1_hub(void)
 {
@@ -183,6 +184,23 @@ static inline int is_uv1_hub(void)
 
 static inline int is_uv2_hub(void)
 {
+   return ((uv_hub_info->hub_revision >= UV2_HUB_REVISION_BASE) &&
+   (uv_hub_info->hub_revision < UV3_HUB_REVISION_BASE));
+}
+
+static inline int is_uv3_hub(void)
+{
+   return uv_hub_info->hub_revision >= UV3_HUB_REVISION_BASE;
+}
+
+static inline int is_uv_hub(void)
+{
+   return uv_hub_info->hub_revision;
+}
+
+/* code common to uv2 and uv3 only */
+static inline int is_uvx_hub(void)
+{
return uv_hub_info->hub_revision >= UV2_HUB_REVISION_BASE;
 }
 
@@ -230,14 +248,23 @@ union uvh_apicid {
 #define UV2_LOCAL_MMR_SIZE (32UL * 1024 * 1024)
 #define UV2_GLOBAL_MMR32_SIZE  (32UL * 1024 * 1024)
 
-#define UV_LOCAL_MMR_BASE  (is_uv1_hub() ? UV1_LOCAL_MMR_BASE \
-   : UV2_LOCAL_MMR_BASE)
-#define UV_GLOBAL_MMR32_BASE   (is_uv1_hub() ? UV1_GLOBAL_MMR32_BASE  \
-   : UV2_GLOBAL_MMR32_BASE)
-#define UV_LOCAL_MMR_SIZE  (is_uv1_hub() ? UV1_LOCAL_MMR_SIZE :   \
-   UV2_LOCAL_MMR_SIZE)
+#define UV3_LOCAL_MMR_BASE 0xfa00UL
+#define UV3_GLOBAL_MMR32_BASE  0xfc00UL
+#define UV3_LOCAL_MMR_SIZE (32UL * 1024 * 1024)
+#define UV3_GLOBAL_MMR32_SIZE  (32UL * 1024 * 1024)
+
+#define UV_LOCAL_MMR_BASE  (is_uv1_hub() ? UV1_LOCAL_MMR_BASE : \
+   (is_uv2_hub() ? UV2_LOCAL_MMR_BASE : \
+   UV3_LOCAL_MMR_BASE))
+#define UV_GLOBAL_MMR32_BASE   (is_uv1_hub() ? UV1_GLOBAL_MMR32_BASE :\
+   (is_uv2_hub() ? UV2_GLOBAL_MMR32_BASE :\
+   UV3_GLOBAL_MMR32_BASE))
+#define UV_LOCAL_MMR_SIZE  (is_uv1_hub() ? UV1_LOCAL_MMR_SIZE : \
+   (is_uv2_hub() ? UV2_LOCAL_MMR_SIZE : \
+   UV3_LOCAL_MMR_SIZE))
 #define UV_GLOBAL_MMR32_SIZE   (is_uv1_hub() ? UV1_GLOBAL_MMR32_SIZE :\
-   UV2_GLOBAL_MMR32_SIZE)
+   (is_uv2_hub() ? UV2_GLOBAL_MMR32_SIZE :\
+   UV3_GLOBAL_MMR32_SIZE))
 #define UV_GLOBAL_MMR64_BASE   (uv_hub_info->global_mmr_base)
 
 #define UV_GLOBAL_GRU_MMR_BASE 0x400
@@ -599,6 +626,7 @@ static inline void uv_hub_send_ipi(int p
  * 1 - UV1 rev 1.0 initial silicon
  * 2 - UV1 rev 2.0 production silicon
  * 3 - UV2 rev 1.0 initial silicon
+ * 5 - UV3 rev 1.0 initial silicon
  */
 static inline int uv_get_min_hub_revision_id(void)
 {

[PATCH 2/7] x86, UV: UV3 Update ACPI Check

2013-02-08 Thread Mike Travis

Add UV3 to the exclusion list.  Instead of listing every new series of
SGI UV systems, just check that oem_id has the prefix "SGI".

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
Cc: Jiang Liu 
Cc: Bjorn Helgaas 
Cc: Yinghai Lu 
Cc: Greg Kroah-Hartman 
---
 arch/x86/pci/mmconfig-shared.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- linux.orig/arch/x86/pci/mmconfig-shared.c
+++ linux/arch/x86/pci/mmconfig-shared.c
@@ -548,8 +548,7 @@ static int __init acpi_mcfg_check_entry(
if (cfg->address < 0x)
return 0;
 
-   if (!strcmp(mcfg->header.oem_id, "SGI") ||
-   !strcmp(mcfg->header.oem_id, "SGI2"))
+   if (!strncmp(mcfg->header.oem_id, "SGI", 3))
return 0;
 
if (mcfg->header.revision >= 1) {

[PATCH 6/7] x86, UV: UV3 Check current gru hub support.

2013-02-08 Thread Mike Travis

This patch checks current hub support to avoid panicking the
system until all the GRU changes for UV3+ are in place.

Signed-off-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 drivers/misc/sgi-gru/grufile.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux.orig/drivers/misc/sgi-gru/grufile.c
+++ linux/drivers/misc/sgi-gru/grufile.c
@@ -517,7 +517,7 @@ static int __init gru_init(void)
 {
int ret;
 
-   if (!is_uv_system())
+   if (!is_uv_system() || (is_uvx_hub() && !is_uv2_hub()))
return 0;
 
 #if defined CONFIG_IA64

[PATCH 0/7] x86: SGI UV3 Kernel Updates

2013-02-08 Thread Mike Travis


Kernel updates for SGI Ultraviolet system 3 (UV3).

The new MMR definitions are added, and then the updates to each module
are applied.  Afterwards, a "trim" patch reduces the size of the MMR
definitions file by about a third.  This keeps "bi-sectability" in place.

[PATCH 4/7] x86, UV: UV3 Update x2apic Support

2013-02-08 Thread Mike Travis

This patch adds support for the SGI UV3 hub to the common x2apic
functions.  The primary changes are to account for the similarities
between UV2 and UV3 which are encompassed within the "UVX" nomenclature.

One significant difference within UV3 is the handling of the MMIOH
regions, which are redirected to the target blade (with the device) in
a different manner.  UV3 also now has two MMIOH regions, for both small and
large BARs.  This helps limit the amount of physical address space that is
taken from real memory and used for I/O in the max config of 64TB.

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
Cc: Alexander Gordeev 
Cc: Suresh Siddha 
Cc: Michael S. Tsirkin 
Cc: Steffen Persvold 
---
 arch/x86/kernel/apic/x2apic_uv_x.c |  206 ++---
 1 file changed, 171 insertions(+), 35 deletions(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -5,7 +5,7 @@
  *
  * SGI UV APIC functions (note: not an Intel compatible APIC)
  *
- * Copyright (C) 2007-2010 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2007-2013 Silicon Graphics, Inc. All rights reserved.
  */
 #include 
 #include 
@@ -91,10 +91,16 @@ static int __init early_get_pnodeid(void
m_n_config.v = uv_early_read_mmr(UVH_RH_GAM_CONFIG_MMR);
uv_min_hub_revision_id = node_id.s.revision;
 
-   if (node_id.s.part_number == UV2_HUB_PART_NUMBER)
-   uv_min_hub_revision_id += UV2_HUB_REVISION_BASE - 1;
-   if (node_id.s.part_number == UV2_HUB_PART_NUMBER_X)
+   switch (node_id.s.part_number) {
+   case UV2_HUB_PART_NUMBER:
+   case UV2_HUB_PART_NUMBER_X:
uv_min_hub_revision_id += UV2_HUB_REVISION_BASE - 1;
+   break;
+   case UV3_HUB_PART_NUMBER:
+   case UV3_HUB_PART_NUMBER_X:
+   uv_min_hub_revision_id += UV3_HUB_REVISION_BASE - 1;
+   break;
+   }
 
uv_hub_info->hub_revision = uv_min_hub_revision_id;
pnode = (node_id.s.node_id >> 1) & ((1 << m_n_config.s.n_skt) - 1);
@@ -130,13 +136,16 @@ static void __init uv_set_apicid_hibit(v
 
 static int __init uv_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
 {
-   int pnodeid, is_uv1, is_uv2;
+   int pnodeid, is_uv1, is_uv2, is_uv3;
 
is_uv1 = !strcmp(oem_id, "SGI");
is_uv2 = !strcmp(oem_id, "SGI2");
-   if (is_uv1 || is_uv2) {
+   is_uv3 = !strncmp(oem_id, "SGI3", 4);   /* there are varieties of UV3 */
+   if (is_uv1 || is_uv2 || is_uv3) {
uv_hub_info->hub_revision =
-   is_uv1 ? UV1_HUB_REVISION_BASE : UV2_HUB_REVISION_BASE;
+   (is_uv1 ? UV1_HUB_REVISION_BASE :
+   (is_uv2 ? UV2_HUB_REVISION_BASE :
+ UV3_HUB_REVISION_BASE));
pnodeid = early_get_pnodeid();
early_get_apic_pnode_shift();
x86_platform.is_untracked_pat_range =  uv_is_untracked_pat_range;
@@ -450,14 +459,17 @@ static __init void map_high(char *id, un
 
paddr = base << pshift;
bytes = (1UL << bshift) * (max_pnode + 1);
-   printk(KERN_INFO "UV: Map %s_HI 0x%lx - 0x%lx\n", id, paddr,
-   paddr + bytes);
+   if (!paddr) {
+   pr_info("UV: Map %s_HI base address NULL\n", id);
+   return;
+   }
+   pr_info("UV: Map %s_HI 0x%lx - 0x%lx\n", id, paddr, paddr + bytes);
if (map_type == map_uc)
init_extra_mapping_uc(paddr, bytes);
else
init_extra_mapping_wb(paddr, bytes);
-
 }
+
 static __init void map_gru_high(int max_pnode)
 {
union uvh_rh_gam_gru_overlay_config_mmr_u gru;
@@ -468,7 +480,8 @@ static __init void map_gru_high(int max_
map_high("GRU", gru.s.base, shift, shift, max_pnode, map_wb);
gru_start_paddr = ((u64)gru.s.base << shift);
gru_end_paddr = gru_start_paddr + (1UL << shift) * (max_pnode + 1);
-
+   } else {
+   pr_info("UV: GRU disabled\n");
}
 }
 
@@ -480,23 +493,146 @@ static __init void map_mmr_high(int max_
mmr.v = uv_read_local_mmr(UVH_RH_GAM_MMR_OVERLAY_CONFIG_MMR);
if (mmr.s.enable)
map_high("MMR", mmr.s.base, shift, shift, max_pnode, map_uc);
+   else
+   pr_info("UV: MMR disabled\n");
 }
 
-static __init void map_mmioh_high(int max_pnode)
+/*
+ * This commonality works because both 0 & 1 versions of the MMIOH OVERLAY
+ * and REDIRECT MMR regs are exactly the same on UV3.
+ */
+struct mmioh_config {
+   unsigned long overlay;
+   unsigned long redirect;
+   char *id;
+};
+
+static __initdata struct mmioh

[PATCH 5/7] x86, UV: UV3 Update Time Support

2013-02-08 Thread Mike Travis

This patch updates time support for the SGI UV3 hub.  Since the UV2
and UV3 time support is identical, "is_uvx_hub" is used instead of
having both "is_uv2_hub" and "is_uv3_hub".

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
---
 arch/x86/platform/uv/uv_time.c |   13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

--- linux.orig/arch/x86/platform/uv/uv_time.c
+++ linux/arch/x86/platform/uv/uv_time.c
@@ -15,7 +15,7 @@
  *  along with this program; if not, write to the Free Software
  *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
  *
- *  Copyright (c) 2009 Silicon Graphics, Inc.  All Rights Reserved.
+ *  Copyright (c) 2009-2013 Silicon Graphics, Inc.  All Rights Reserved.
  *  Copyright (c) Dimitri Sivanich
  */
 #include 
@@ -102,9 +102,10 @@ static int uv_intr_pending(int pnode)
if (is_uv1_hub())
return uv_read_global_mmr64(pnode, UVH_EVENT_OCCURRED0) &
UV1H_EVENT_OCCURRED0_RTC1_MASK;
-   else
-   return uv_read_global_mmr64(pnode, UV2H_EVENT_OCCURRED2) &
-   UV2H_EVENT_OCCURRED2_RTC_1_MASK;
+   else if (is_uvx_hub())
+   return uv_read_global_mmr64(pnode, UVXH_EVENT_OCCURRED2) &
+   UVXH_EVENT_OCCURRED2_RTC_1_MASK;
+   return 0;
 }
 
 /* Setup interrupt and return non-zero if early expiration occurred. */
@@ -122,8 +123,8 @@ static int uv_setup_intr(int cpu, u64 ex
uv_write_global_mmr64(pnode, UVH_EVENT_OCCURRED0_ALIAS,
UV1H_EVENT_OCCURRED0_RTC1_MASK);
else
-   uv_write_global_mmr64(pnode, UV2H_EVENT_OCCURRED2_ALIAS,
-   UV2H_EVENT_OCCURRED2_RTC_1_MASK);
+   uv_write_global_mmr64(pnode, UVXH_EVENT_OCCURRED2_ALIAS,
+   UVXH_EVENT_OCCURRED2_RTC_1_MASK);
 
val = (X86_PLATFORM_IPI_VECTOR << UVH_RTC1_INT_CONFIG_VECTOR_SHFT) |
((u64)apicid << UVH_RTC1_INT_CONFIG_APIC_ID_SHFT);

Re: [PATCH 6/7] x86, UV: UV3 Check current gru hub support.

2013-02-11 Thread Mike Travis




On 2/11/2013 1:40 AM, Ingo Molnar wrote:
> 
> * Mike Travis  wrote:
> 
>> This patch checks current hub support to avoid panicking the
>> system until all the GRU changes for UV3+ are in place.
>>
>> Signed-off-by: Dimitri Sivanich 
>> Signed-off-by: Mike Travis 

It was Dimitri's patch, and I put the Author: right after
the Subject: line but since it was sent from my quilt mail
command, it was changed.  It was my mistake with the second
s-o-b: line.

> That's a weird signoff sequence. It should either also include a
> From: Dimitri tag at the front, or the Signed-off-by should be
> Acked-by.
> 
> Please also use consistent titles in your patches, standardizing
> on this pattern would be nice:
> 
>   x86/UV/UV3: Check current gru hub support
> 
> while UV patches affecting all UV models would come with a
> 'x86/UV:' title or so.

I can do this.

Thanks,
Mike

> Thanks,
> 
>   Ingo


Re: [PATCH 0/7] x86/UV/UV3: Kernel Updates for SGI UV3.

2013-02-11 Thread Mike Travis


[sorry about that, it sent the mailbox instead of each patch.  I talked
to Dimitri and he doesn't care about being the originator of the patch
so I'm just going to resend with me as the From: person.]

On 2/11/2013 11:32 AM, Mike Travis wrote:

Kernel updates for SGI Ultraviolet System 3 (UV3).

The new MMR definitions are added, and then the updates to each module
are applied.  Afterwards, a "trim" patch reduces the size of the MMR
definitions file by about a third.  This keeps "bi-sectability" in place.



From tra...@gulag1.americas.sgi.com  Mon Feb 11 13:32:53 2013

Message-Id:<20130211193252.982866...@gulag1.americas.sgi.com>
User-Agent: quilt/0.47-15.17.1
Date: Mon, 11 Feb 2013 13:32:53 -0600
From: Mike Travis
To: Thomas Gleixner,
  Ingo Molnar,
  "H. Peter Anvin"
Cc: Andrew Morton,
  x...@kernel.org,
  linux-kernel@vger.kernel.org,
  Russ Anderson
Bcc:tra...@sgi.com
Subject: [PATCH 1/7] x86/UV/UV3: Update MMR register definitions for SGI 
Ultraviolet System 3 (UV3)
References:<20130211193252.865583...@gulag1.americas.sgi.com>
Content-Disposition: inline; filename=uv3-update-mmrs.patch


[PATCH 0/7] x86/UV/UV3: Kernel Updates for SGI UV3.

2013-02-11 Thread Mike Travis


Kernel updates for SGI Ultraviolet System 3 (UV3).

The new MMR definitions are added, and then the updates to each module
are applied.  Afterwards, a "trim" patch reduces the size of the MMR
definitions file by about a third.  This keeps "bi-sectability" in place.


[PATCH 6/7] x86/UV/UV3: Check current gru hub support for SGI UV3

2013-02-11 Thread Mike Travis

This patch checks current hub support to avoid panicking the
system until all the GRU changes for UV3+ are in place.

Signed-off-by: Mike Travis 
Acked-by: Dimitri Sivanich 
---
 drivers/misc/sgi-gru/grufile.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux.orig/drivers/misc/sgi-gru/grufile.c
+++ linux/drivers/misc/sgi-gru/grufile.c
@@ -517,7 +517,7 @@ static int __init gru_init(void)
 {
int ret;
 
-   if (!is_uv_system())
+   if (!is_uv_system() || (is_uvx_hub() && !is_uv2_hub()))
return 0;
 
 #if defined CONFIG_IA64


[PATCH 4/7] x86/UV/UV3: Update x2apic Support for SGI UV3

2013-02-11 Thread Mike Travis

This patch adds support for the SGI UV3 hub to the common x2apic
functions.  The primary changes are to account for the similarities
between UV2 and UV3 which are encompassed within the "UVX" nomenclature.

One significant difference within UV3 is the handling of the MMIOH
regions which are redirected to the target blade (with the device) in
a different manner.  It also now has two MMIOH regions for both small and
large BARs.  This aids in limiting the amount of physical address space
removed from real memory that's used for I/O in the max config of 64TB.

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
Cc: Alexander Gordeev 
Cc: Suresh Siddha 
Cc: Michael S. Tsirkin 
Cc: Steffen Persvold 
---
 arch/x86/kernel/apic/x2apic_uv_x.c |  206 ++---
 1 file changed, 171 insertions(+), 35 deletions(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -5,7 +5,7 @@
  *
  * SGI UV APIC functions (note: not an Intel compatible APIC)
  *
- * Copyright (C) 2007-2010 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2007-2013 Silicon Graphics, Inc. All rights reserved.
  */
 #include 
 #include 
@@ -91,10 +91,16 @@ static int __init early_get_pnodeid(void
m_n_config.v = uv_early_read_mmr(UVH_RH_GAM_CONFIG_MMR);
uv_min_hub_revision_id = node_id.s.revision;
 
-   if (node_id.s.part_number == UV2_HUB_PART_NUMBER)
-   uv_min_hub_revision_id += UV2_HUB_REVISION_BASE - 1;
-   if (node_id.s.part_number == UV2_HUB_PART_NUMBER_X)
+   switch (node_id.s.part_number) {
+   case UV2_HUB_PART_NUMBER:
+   case UV2_HUB_PART_NUMBER_X:
uv_min_hub_revision_id += UV2_HUB_REVISION_BASE - 1;
+   break;
+   case UV3_HUB_PART_NUMBER:
+   case UV3_HUB_PART_NUMBER_X:
+   uv_min_hub_revision_id += UV3_HUB_REVISION_BASE - 1;
+   break;
+   }
 
uv_hub_info->hub_revision = uv_min_hub_revision_id;
pnode = (node_id.s.node_id >> 1) & ((1 << m_n_config.s.n_skt) - 1);
@@ -130,13 +136,16 @@ static void __init uv_set_apicid_hibit(v
 
 static int __init uv_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
 {
-   int pnodeid, is_uv1, is_uv2;
+   int pnodeid, is_uv1, is_uv2, is_uv3;
 
is_uv1 = !strcmp(oem_id, "SGI");
is_uv2 = !strcmp(oem_id, "SGI2");
-   if (is_uv1 || is_uv2) {
+   is_uv3 = !strncmp(oem_id, "SGI3", 4);   /* there are varieties of UV3 */
+   if (is_uv1 || is_uv2 || is_uv3) {
uv_hub_info->hub_revision =
-   is_uv1 ? UV1_HUB_REVISION_BASE : UV2_HUB_REVISION_BASE;
+   (is_uv1 ? UV1_HUB_REVISION_BASE :
+   (is_uv2 ? UV2_HUB_REVISION_BASE :
+ UV3_HUB_REVISION_BASE));
pnodeid = early_get_pnodeid();
early_get_apic_pnode_shift();
x86_platform.is_untracked_pat_range = uv_is_untracked_pat_range;
@@ -450,14 +459,17 @@ static __init void map_high(char *id, un
 
paddr = base << pshift;
bytes = (1UL << bshift) * (max_pnode + 1);
-   printk(KERN_INFO "UV: Map %s_HI 0x%lx - 0x%lx\n", id, paddr,
-   paddr + bytes);
+   if (!paddr) {
+   pr_info("UV: Map %s_HI base address NULL\n", id);
+   return;
+   }
+   pr_info("UV: Map %s_HI 0x%lx - 0x%lx\n", id, paddr, paddr + bytes);
if (map_type == map_uc)
init_extra_mapping_uc(paddr, bytes);
else
init_extra_mapping_wb(paddr, bytes);
-
 }
+
 static __init void map_gru_high(int max_pnode)
 {
union uvh_rh_gam_gru_overlay_config_mmr_u gru;
@@ -468,7 +480,8 @@ static __init void map_gru_high(int max_
map_high("GRU", gru.s.base, shift, shift, max_pnode, map_wb);
gru_start_paddr = ((u64)gru.s.base << shift);
gru_end_paddr = gru_start_paddr + (1UL << shift) * (max_pnode + 1);
-
+   } else {
+   pr_info("UV: GRU disabled\n");
}
 }
 
@@ -480,23 +493,146 @@ static __init void map_mmr_high(int max_
mmr.v = uv_read_local_mmr(UVH_RH_GAM_MMR_OVERLAY_CONFIG_MMR);
if (mmr.s.enable)
map_high("MMR", mmr.s.base, shift, shift, max_pnode, map_uc);
+   else
+   pr_info("UV: MMR disabled\n");
 }
 
-static __init void map_mmioh_high(int max_pnode)
+/*
+ * This commonality works because both 0 & 1 versions of the MMIOH OVERLAY
+ * and REDIRECT MMR regs are exactly the same on UV3.
+ */
+struct mmioh_config {
+   unsigned long overlay;
+   unsigned long redirect;
+   char *id;
+};
+
+static __initdata struct mmioh

[PATCH 2/7] x86/UV/UV3: Update ACPI Check to include SGI UV3

2013-02-11 Thread Mike Travis

Add UV3 to the exclusion list.  Instead of adding every new series of
SGI UV systems, just check that oem_id has the prefix "SGI".
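
The prefix check can be illustrated with a small user-space sketch.
The helper name oem_id_is_sgi is hypothetical and is not part of the
patch; it only shows that comparing the first three bytes accepts
"SGI", "SGI2", "SGI3" and any later series alike.

```c
#include <string.h>

/*
 * Hypothetical helper (not from the patch): returns nonzero when
 * oem_id starts with "SGI".  strncmp() looks at no more than three
 * bytes, so every "SGI*" series matches without listing each one.
 */
static int oem_id_is_sgi(const char *oem_id)
{
	return strncmp(oem_id, "SGI", 3) == 0;
}
```

For example, oem_id_is_sgi("SGI3") is true, while a shorter string
such as "SG" does not match.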

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
Cc: Jiang Liu 
Cc: Bjorn Helgaas 
Cc: Yinghai Lu 
Cc: Greg Kroah-Hartman 
---
 arch/x86/pci/mmconfig-shared.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- linux.orig/arch/x86/pci/mmconfig-shared.c
+++ linux/arch/x86/pci/mmconfig-shared.c
@@ -548,8 +548,7 @@ static int __init acpi_mcfg_check_entry(
if (cfg->address < 0x)
return 0;
 
-   if (!strcmp(mcfg->header.oem_id, "SGI") ||
-   !strcmp(mcfg->header.oem_id, "SGI2"))
+   if (!strncmp(mcfg->header.oem_id, "SGI", 3))
return 0;
 
if (mcfg->header.revision >= 1) {


[PATCH 5/7] x86/UV/UV3: Update Time Support for SGI UV3

2013-02-11 Thread Mike Travis

This patch updates time support for the SGI UV3 hub.  Since the UV2
and UV3 time support is identical, "is_uvx_hub" is used instead of
having both "is_uv2_hub" and "is_uv3_hub".

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
---
 arch/x86/platform/uv/uv_time.c |   13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

--- linux.orig/arch/x86/platform/uv/uv_time.c
+++ linux/arch/x86/platform/uv/uv_time.c
@@ -15,7 +15,7 @@
  *  along with this program; if not, write to the Free Software
  *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
  *
- *  Copyright (c) 2009 Silicon Graphics, Inc.  All Rights Reserved.
+ *  Copyright (c) 2009-2013 Silicon Graphics, Inc.  All Rights Reserved.
  *  Copyright (c) Dimitri Sivanich
  */
 #include 
@@ -102,9 +102,10 @@ static int uv_intr_pending(int pnode)
if (is_uv1_hub())
return uv_read_global_mmr64(pnode, UVH_EVENT_OCCURRED0) &
UV1H_EVENT_OCCURRED0_RTC1_MASK;
-   else
-   return uv_read_global_mmr64(pnode, UV2H_EVENT_OCCURRED2) &
-   UV2H_EVENT_OCCURRED2_RTC_1_MASK;
+   else if (is_uvx_hub())
+   return uv_read_global_mmr64(pnode, UVXH_EVENT_OCCURRED2) &
+   UVXH_EVENT_OCCURRED2_RTC_1_MASK;
+   return 0;
 }
 
 /* Setup interrupt and return non-zero if early expiration occurred. */
@@ -122,8 +123,8 @@ static int uv_setup_intr(int cpu, u64 ex
uv_write_global_mmr64(pnode, UVH_EVENT_OCCURRED0_ALIAS,
UV1H_EVENT_OCCURRED0_RTC1_MASK);
else
-   uv_write_global_mmr64(pnode, UV2H_EVENT_OCCURRED2_ALIAS,
-   UV2H_EVENT_OCCURRED2_RTC_1_MASK);
+   uv_write_global_mmr64(pnode, UVXH_EVENT_OCCURRED2_ALIAS,
+   UVXH_EVENT_OCCURRED2_RTC_1_MASK);
 
val = (X86_PLATFORM_IPI_VECTOR << UVH_RTC1_INT_CONFIG_VECTOR_SHFT) |
((u64)apicid << UVH_RTC1_INT_CONFIG_APIC_ID_SHFT);


[PATCH 3/7] x86/UV/UV3: Update Hub Info for SGI UV3

2013-02-11 Thread Mike Travis

This patch updates the UV HUB info for UV3.  The "is_uv3_hub" and
"is_uvx_hub" (UV2 or UV3) functions are added as well as the addresses
and sizes of the MMR regions for UV3.
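
As a rough sketch of how the revision bases partition the hub types,
the following user-space code mirrors the comparisons in the diff
below.  The rev_is_* names are hypothetical, and the revision is passed
as a parameter here instead of being read from uv_hub_info.

```c
/* Revision bases as defined in the diff below. */
#define UV2_HUB_REVISION_BASE	3
#define UV3_HUB_REVISION_BASE	5

/* Hypothetical stand-ins for is_uv2_hub(), is_uv3_hub(), is_uvx_hub(). */
static int rev_is_uv2(int rev)
{
	return rev >= UV2_HUB_REVISION_BASE && rev < UV3_HUB_REVISION_BASE;
}

static int rev_is_uv3(int rev)
{
	return rev >= UV3_HUB_REVISION_BASE;
}

/* "UVX" covers code common to UV2 and UV3. */
static int rev_is_uvx(int rev)
{
	return rev >= UV2_HUB_REVISION_BASE;
}
```

With these bases, revisions 3 and 4 are UV2, 5 and above are UV3, and
both ranges count as UVX.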

Signed-off-by: Mike Travis 
Acked-by: Russ Anderson 
Reviewed-by: Dimitri Sivanich 
---
 arch/x86/include/asm/uv/uv_hub.h |   44 +++
 1 file changed, 36 insertions(+), 8 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv_hub.h
+++ linux/arch/x86/include/asm/uv/uv_hub.h
@@ -5,7 +5,7 @@
  *
  * SGI UV architectural definitions
  *
- * Copyright (C) 2007-2010 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2007-2013 Silicon Graphics, Inc. All rights reserved.
  */
 
 #ifndef _ASM_X86_UV_UV_HUB_H
@@ -175,6 +175,7 @@ DECLARE_PER_CPU(struct uv_hub_info_s, __
  */
 #define UV1_HUB_REVISION_BASE  1
 #define UV2_HUB_REVISION_BASE  3
+#define UV3_HUB_REVISION_BASE  5
 
 static inline int is_uv1_hub(void)
 {
@@ -183,6 +184,23 @@ static inline int is_uv1_hub(void)
 
 static inline int is_uv2_hub(void)
 {
+   return ((uv_hub_info->hub_revision >= UV2_HUB_REVISION_BASE) &&
+   (uv_hub_info->hub_revision < UV3_HUB_REVISION_BASE));
+}
+
+static inline int is_uv3_hub(void)
+{
+   return uv_hub_info->hub_revision >= UV3_HUB_REVISION_BASE;
+}
+
+static inline int is_uv_hub(void)
+{
+   return uv_hub_info->hub_revision;
+}
+
+/* code common to uv2 and uv3 only */
+static inline int is_uvx_hub(void)
+{
return uv_hub_info->hub_revision >= UV2_HUB_REVISION_BASE;
 }
 
@@ -230,14 +248,23 @@ union uvh_apicid {
 #define UV2_LOCAL_MMR_SIZE (32UL * 1024 * 1024)
 #define UV2_GLOBAL_MMR32_SIZE  (32UL * 1024 * 1024)
 
-#define UV_LOCAL_MMR_BASE  (is_uv1_hub() ? UV1_LOCAL_MMR_BASE \
-   : UV2_LOCAL_MMR_BASE)
-#define UV_GLOBAL_MMR32_BASE   (is_uv1_hub() ? UV1_GLOBAL_MMR32_BASE  \
-   : UV2_GLOBAL_MMR32_BASE)
-#define UV_LOCAL_MMR_SIZE  (is_uv1_hub() ? UV1_LOCAL_MMR_SIZE :   \
-   UV2_LOCAL_MMR_SIZE)
+#define UV3_LOCAL_MMR_BASE 0xfa00UL
+#define UV3_GLOBAL_MMR32_BASE  0xfc00UL
+#define UV3_LOCAL_MMR_SIZE (32UL * 1024 * 1024)
+#define UV3_GLOBAL_MMR32_SIZE  (32UL * 1024 * 1024)
+
+#define UV_LOCAL_MMR_BASE  (is_uv1_hub() ? UV1_LOCAL_MMR_BASE : \
+   (is_uv2_hub() ? UV2_LOCAL_MMR_BASE : \
+   UV3_LOCAL_MMR_BASE))
+#define UV_GLOBAL_MMR32_BASE   (is_uv1_hub() ? UV1_GLOBAL_MMR32_BASE :\
+   (is_uv2_hub() ? UV2_GLOBAL_MMR32_BASE :\
+   UV3_GLOBAL_MMR32_BASE))
+#define UV_LOCAL_MMR_SIZE  (is_uv1_hub() ? UV1_LOCAL_MMR_SIZE : \
+   (is_uv2_hub() ? UV2_LOCAL_MMR_SIZE : \
+   UV3_LOCAL_MMR_SIZE))
 #define UV_GLOBAL_MMR32_SIZE   (is_uv1_hub() ? UV1_GLOBAL_MMR32_SIZE :\
-   UV2_GLOBAL_MMR32_SIZE)
+   (is_uv2_hub() ? UV2_GLOBAL_MMR32_SIZE :\
+   UV3_GLOBAL_MMR32_SIZE))
 #define UV_GLOBAL_MMR64_BASE   (uv_hub_info->global_mmr_base)
 
 #define UV_GLOBAL_GRU_MMR_BASE 0x400
@@ -599,6 +626,7 @@ static inline void uv_hub_send_ipi(int p
  * 1 - UV1 rev 1.0 initial silicon
  * 2 - UV1 rev 2.0 production silicon
  * 3 - UV2 rev 1.0 initial silicon
+ * 5 - UV3 rev 1.0 initial silicon
  */
 static inline int uv_get_min_hub_revision_id(void)
 {


[PATCH 02/14] KDB: fix errant character in KDB show regs

2013-03-12 Thread Mike Travis

When KDB prints the process regs and backtrace, every line is preceded
by the character 'd'.  This is the level argument to printk, which
is not interpreted when KDB is printing.  Skip over this possible
printk level in the outgoing string to fix this.

Here is a small sample:

dRIP: 0010:[]  [] poll_idle+0x4a/0x90
dRSP: 0018:88081d5eddd8  EFLAGS: 0246
dRAX: 0004 RBX: 0216ae7fbf5d RCX: 021658a8e600
dRDX: 88081d5ec010 RSI: 819a7d20 RDI: 8193c140
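
A minimal user-space sketch of the skipping step follows.  It assumes
the level marker is an SOH byte (\001) followed by '0'..'7' or 'd';
the name skip_level is a hypothetical stand-in for the kernel's
printk_skip_level() and the code is not taken from the patch.

```c
#define SOH	'\001'	/* assumed start-of-header byte before the level */

/*
 * Hypothetical stand-in for printk_skip_level(): if the string starts
 * with SOH plus a level character, return a pointer just past the two
 * marker bytes; otherwise return the string unchanged.
 */
static const char *skip_level(const char *s)
{
	if (s[0] == SOH && s[1] != '\0') {
		char c = s[1];

		if ((c >= '0' && c <= '7') || c == 'd')
			return s + 2;
	}
	return s;
}
```

Under that assumption, a buffer holding SOH, 'd' and then "RIP: ..."
is written out starting at "RIP: ...", which removes the stray 'd'.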

Cc: Tim Bird 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 kernel/debug/kdb/kdb_io.c |   16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

--- linux.orig/kernel/debug/kdb/kdb_io.c
+++ linux/kernel/debug/kdb/kdb_io.c
@@ -559,6 +559,7 @@ int vkdb_printf(const char *fmt, va_list
int retlen = 0;
int fnd, len;
char *cp, *cp2, *cphold = NULL, replaced_byte = ' ';
+   const char *ostring;
char *moreprompt = "more> ";
struct console *c = console_drivers;
static DEFINE_SPINLOCK(kdb_printf_lock);
@@ -690,20 +691,21 @@ kdb_printit:
/*
 * Write to all consoles.
 */
-   retlen = strlen(kdb_buffer);
+   ostring = printk_skip_level(kdb_buffer);
+   retlen = strlen(ostring);
if (!dbg_kdb_mode && kgdb_connected) {
-   gdbstub_msg_write(kdb_buffer, retlen);
+   gdbstub_msg_write(ostring, retlen);
} else {
if (dbg_io_ops && !dbg_io_ops->is_console) {
len = retlen;
-   cp = kdb_buffer;
+   cp = (char *)ostring;
while (len--) {
dbg_io_ops->write_char(*cp);
cp++;
}
}
while (c) {
-   c->write(c, kdb_buffer, retlen);
+   c->write(c, ostring, retlen);
touch_nmi_watchdog();
c = c->next;
}
@@ -711,7 +713,7 @@ kdb_printit:
if (logging) {
saved_loglevel = console_loglevel;
console_loglevel = 0;
-   printk(KERN_INFO "%s", kdb_buffer);
+   pr_info("%s", ostring);
}
 
if (KDB_STATE(PAGER)) {
@@ -723,10 +725,10 @@ kdb_printit:
int got = 0;
len = retlen;
while (len--) {
-   if (kdb_buffer[len] == '\n') {
+   if (ostring[len] == '\n') {
kdb_nextline++;
got = 0;
-   } else if (kdb_buffer[len] == '\r') {
+   } else if (ostring[len] == '\r') {
got = 0;
} else {
got++;

-- 

[PATCH 01/14] KDB: fix the interrupt of the KDB btc command

2013-03-12 Thread Mike Travis

The KDB 'btc' (backtrace cpus) command ignores the 'quit' reply
to the 'more>' prompt.  This is quite annoying when you have a
large number of processors and thousands of lines are being
printed.  This fixes that problem.
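
The idea can be sketched in user space as a loop that tests an
interrupt flag before starting each CPU.  All names here are
hypothetical: cmd_interrupted stands in for KDB_FLAG(CMD_INTERRUPT),
and backtrace_one() pretends the user answers 'q' while CPU 2 is
being printed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for KDB_FLAG(CMD_INTERRUPT). */
static bool cmd_interrupted;

/* Hypothetical per-CPU dump; the 'quit' reply arrives during CPU 2. */
static void backtrace_one(int cpu)
{
	if (cpu == 2)
		cmd_interrupted = true;
}

/* Dump CPUs in order and return how many were actually dumped. */
static int backtrace_all(int ncpus)
{
	int cpu, dumped = 0;

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (cmd_interrupted)
			break;	/* honor 'quit' before the next CPU */
		backtrace_one(cpu);
		dumped++;
	}
	return dumped;
}
```

With 1024 CPUs, the loop in this sketch stops after three dumps
instead of printing the remaining CPUs.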

Cc: David Howells 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 kernel/debug/kdb/kdb_bt.c |2 ++
 1 file changed, 2 insertions(+)

--- linux.orig/kernel/debug/kdb/kdb_bt.c
+++ linux/kernel/debug/kdb/kdb_bt.c
@@ -123,6 +123,8 @@ kdb_bt(int argc, const char **argv)
kdb_ps_suppressed();
/* Run the active tasks first */
for_each_online_cpu(cpu) {
+   if (KDB_FLAG(CMD_INTERRUPT))
+   return 0;
p = kdb_curr_task(cpu);
if (kdb_bt1(p, mask, argcount, btaprompt))
return 0;

-- 

[PATCH 00/14] x86/UV/KDB/NMI: Updates for NMI/KDB handler for SGI UV

2013-03-12 Thread Mike Travis


These are kernel updates for the NMI/KDB handler on the SGI
Ultraviolet System.

* Fix problem where 'quit' to more> prompt doesn't stop output.
* Fix problem where reg dump shows letter 'd' in first column of
  every line.
* Up the number of LINES so the entire entry message is displayed.
* Moves KDB header defines to an externally visible header so external
  KDB modules can be built.
* Exports some significant KDB functions for use by external modules.
* Consolidates the '| grep' support into new kdb_grep.c file.
* Updates kdb grep support to add some new options.
* Reestablishes support for kdump from the kdb prompt.
* Adds back in the 'pshelp' command.
* Adds new entry into KGDB for systems with global NMI support.
* Adds support for external 'uvtrace' module.
* Updates NMI support for new internal system (SMM) NMI handler.
* Adds back the capability of NMI entering KDB/KGDB.

-- 

[PATCH 13/14] x86/UV: Update UV support for external NMI signals

2013-03-12 Thread Mike Travis

This patch updates the UV NMI handler for the external SMM
'POWER NMI' command.  This command sets a special flag in one
of the MMRs on each HUB and sends the NMI signal to all cpus
in the system.

The code has also been optimized to minimize reading of the MMRs as
much as possible, by using a per HUB atomic NMI flag.  Too high a
rate of reading MMRs not only disrupts the UV Hub's primary function
of directing NumaLink traffic, but can also cause problems.  And to
avoid excessive overhead when perf tools are causing millions of
NMIs per second (when running on a large number of CPUs), this
handler uses primarily the NMI_UNKNOWN notifier chain.
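
A simplified, single-threaded sketch of the per-hub flag idea is shown
here using C11 atomics.  The struct and function names are hypothetical
and the MMR is simulated by a plain variable; the real handler also
takes a lock and clears the flag, which this sketch omits.

```c
#include <stdatomic.h>

/* Hypothetical per-hub state; the MMR is simulated by a variable. */
struct hub_nmi {
	atomic_int in_nmi;		/* set once an NMI is confirmed */
	atomic_int read_mmr_count;	/* number of simulated MMR reads */
	unsigned long long mmr;		/* simulated hub register value */
};

/* Each simulated register read is counted so the cost is visible. */
static unsigned long long read_mmr(struct hub_nmi *hub)
{
	atomic_fetch_add(&hub->read_mmr_count, 1);
	return hub->mmr;
}

/* Returns 1 if this hub has a pending NMI, 0 otherwise. */
static int cpu_check_nmi(struct hub_nmi *hub)
{
	if (atomic_load(&hub->in_nmi))
		return 1;		/* cached answer, no MMR read */
	if (read_mmr(hub) != 0) {
		atomic_store(&hub->in_nmi, 1);
		return 1;
	}
	return 0;
}
```

In this sketch, once one CPU has seen the NMI bit, every later CPU on
the same hub gets its answer from the flag, so only a single register
read is made no matter how many CPUs check.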

There is an exception where the NMI_LOCAL notifier chain is used.
When the perf tools are in use, it's possible that our NMI was
captured by some other NMI handler and then ignored.  We set a
per_cpu flag for those CPUs that ignored the initial NMI, and then
send them an IPI NMI signal.

There are also some new parameters introduced to alter and tune the
behavior of the NMI handler.  These parameters are not documented in
Documentation/kernel-parameters.txt as they are only useful to SGI
support personnel, and are not generally useful to system users.

Cc: Russ Anderson 
Cc: Alexander Gordeev 
Cc: Suresh Siddha 
Cc: "Michael S. Tsirkin" 
Cc: Steffen Persvold 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 arch/x86/include/asm/uv/uv_hub.h   |   57 +++
 arch/x86/include/asm/uv/uv_mmrs.h  |   31 +
 arch/x86/kernel/apic/x2apic_uv_x.c |1 
 arch/x86/platform/uv/uv_nmi.c  |  600 ++---
 4 files changed, 648 insertions(+), 41 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv_hub.h
+++ linux/arch/x86/include/asm/uv/uv_hub.h
@@ -502,8 +502,8 @@ struct uv_blade_info {
unsigned short  nr_online_cpus;
unsigned short  pnode;
short   memory_nid;
-   spinlock_t  nmi_lock;
-   unsigned long   nmi_count;
+   spinlock_t  nmi_lock;   /* obsolete, see uv_hub_nmi */
+   unsigned long   nmi_count;  /* obsolete, see uv_hub_nmi */
 };
 extern struct uv_blade_info *uv_blade_info;
 extern short *uv_node_to_blade;
@@ -576,6 +576,59 @@ static inline int uv_num_possible_blades
return uv_possible_blades;
 }
 
+/* Per Hub NMI support */
+extern void uv_nmi_setup(void);
+
+/* BMC sets a bit this MMR non-zero before sending an NMI */
+#define UVH_NMI_MMR        UVH_SCRATCH5
+#define UVH_NMI_MMR_CLEAR  UVH_SCRATCH5_ALIAS
+#define UVH_NMI_MMR_SHIFT  63
+#define UVH_NMI_MMR_TYPE   "SCRATCH5"
+
+/* Newer SMM NMI handler, not present in all systems */
+#define UVH_NMI_MMRX   UVH_EVENT_OCCURRED0
+#define UVH_NMI_MMRX_CLEAR UVH_EVENT_OCCURRED0_ALIAS
+#define UVH_NMI_MMRX_SHIFT (is_uv1_hub() ? \
+   UV1H_EVENT_OCCURRED0_EXTIO_INT0_SHFT :\
+   UVXH_EVENT_OCCURRED0_EXTIO_INT0_SHFT)
+#define UVH_NMI_MMRX_TYPE  "EXTIO_INT0"
+
+/* Non-zero indicates newer SMM NMI handler present */
+#define UVH_NMI_MMRX_SUPPORTED UVH_EXTIO_INT0_BROADCAST
+
+/* Indicates to BIOS that we want to use the newer SMM NMI handler */
+#define UVH_NMI_MMRX_REQ   UVH_SCRATCH5_ALIAS_2
+#define UVH_NMI_MMRX_REQ_SHIFT 62
+
+struct uv_hub_nmi_s {
+   raw_spinlock_t  nmi_lock;
+   atomic_t        in_nmi;         /* flag this node in UV NMI IRQ */
+   atomic_t        cpu_owner;      /* last locker of this struct */
+   atomic_t        read_mmr_count; /* count of MMR reads */
+   atomic_t        nmi_count;      /* count of true UV NMIs */
+   unsigned long   nmi_value;      /* last value read from NMI MMR */
+};
+
+struct uv_cpu_nmi_s {
+   struct uv_hub_nmi_s *hub;
+   atomic_t        state;
+   atomic_t        pinging;
+   int             queries;
+   int             pings;
+};
+
+DECLARE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
+#define uv_cpu_nmi (__get_cpu_var(__uv_cpu_nmi))
+#define uv_hub_nmi (uv_cpu_nmi.hub)
+#define uv_cpu_nmi_per(cpu)(per_cpu(__uv_cpu_nmi, cpu))
+#define uv_hub_nmi_per(cpu)(uv_cpu_nmi_per(cpu).hub)
+
+/* uv_cpu_nmi_states */
+#define UV_NMI_STATE_OUT        0
+#define UV_NMI_STATE_IN         1
+#define UV_NMI_STATE_DUMP       2
+#define UV_NMI_STATE_DUMP_DONE  3
+
 /* Update SCIR state */
 static inline void uv_set_scir_bits(unsigned char value)
 {
--- linux.orig/arch/x86/include/asm/uv/uv_mmrs.h
+++ linux/arch/x86/include/asm/uv/uv_mmrs.h
@@ -461,6 +461,23 @@ union uvh_event_occurred0_u {
 
 
 /* = */
+/* UVH_EXTIO_INT0_BROADCAST  */
+/* ===

[PATCH 10/14] KGDB/KDB: add support for external NMI handler to call KGDB/KDB.

2013-03-12 Thread Mike Travis

This patch adds an interface (kgdb_nmicallin) that can be used by
external NMI handlers to call the KGDB/KDB handler.  The primary need
for this is for those types of NMI interrupts where all the CPUs
have already received the NMI signal.  Therefore no send_IPI(NMI)
is required, and in fact it will cause a 2nd unhandled NMI to occur.

Since all the CPUs are getting the NMI at roughly the same time, it's not
guaranteed that the first CPU that hits the NMI handler will manage to
enter KGDB and set the dbg_master_lock before the slaves start entering.
The new argument "send_ready" is used by KGDB to signal the NMI handler
to release the slave CPUs for entry into KGDB.
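
The handshake can be pictured with a small C11 sketch.  The function
names are hypothetical and this is not the kernel code: the master
sets the optional flag once it owns the debugger, and the NMI handler
on the other CPUs holds them back until the flag is nonzero.

```c
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical master-side step: called once the master CPU holds the
 * debugger.  The pointer is optional, mirroring the NULL check on
 * send_ready in the diff below.
 */
static void master_entered_debugger(atomic_int *send_ready)
{
	if (send_ready)
		atomic_store(send_ready, 1);
}

/* Hypothetical slave-side test: slaves wait until this returns 1. */
static int slave_may_enter(atomic_int *send_ready)
{
	return atomic_load(send_ready) != 0;
}
```

A caller that does not need the handshake passes NULL, and the master
step does nothing.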

Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 include/linux/kgdb.h  |1 +
 kernel/debug/debug_core.c |   39 +++
 kernel/debug/debug_core.h |1 +
 3 files changed, 41 insertions(+)

--- linux.orig/include/linux/kgdb.h
+++ linux/include/linux/kgdb.h
@@ -310,6 +310,7 @@ extern int
 kgdb_handle_exception(int ex_vector, int signo, int err_code,
  struct pt_regs *regs);
 extern int kgdb_nmicallback(int cpu, void *regs);
+extern int kgdb_nmicallin(int cpu, int trapnr, void *regs, atomic_t *snd_rdy);
 extern void gdbstub_exit(int status);
 
 extern int kgdb_single_step;
--- linux.orig/kernel/debug/debug_core.c
+++ linux/kernel/debug/debug_core.c
@@ -578,6 +578,10 @@ return_normal:
/* Signal the other CPUs to enter kgdb_wait() */
if ((!kgdb_single_step) && kgdb_do_roundup)
kgdb_roundup_cpus(flags);
+
+   /* If optional send ready pointer, signal CPUs to proceed */
+   if (kgdb_info[cpu].send_ready)
+   atomic_set(kgdb_info[cpu].send_ready, 1);
 #endif
 
/*
@@ -729,6 +733,41 @@ int kgdb_nmicallback(int cpu, void *regs
return 0;
}
 #endif
+   return 1;
+}
+
+int kgdb_nmicallin(int cpu, int trapnr, void *regs, atomic_t *send_ready)
+{
+#ifdef CONFIG_SMP
+   if (!kgdb_io_ready(0))
+   return 1;
+
+   if (kgdb_info[cpu].enter_kgdb == 0) {
+   struct kgdb_state kgdb_var;
+   struct kgdb_state *ks = &kgdb_var;
+   int save_kgdb_do_roundup = kgdb_do_roundup;
+
+   memset(ks, 0, sizeof(struct kgdb_state));
+   ks->cpu = cpu;
+   ks->ex_vector   = trapnr;
+   ks->signo   = SIGTRAP;
+   ks->err_code= 0;
+   ks->kgdb_usethreadid= 0;
+   ks->linux_regs  = regs;
+
+   /* Do not broadcast NMI */
+   kgdb_do_roundup = 0;
+   kgdb_info[cpu].send_ready = send_ready;
+   kgdb_cpu_enter(ks, regs, DCPU_WANT_MASTER);
+   kgdb_do_roundup = save_kgdb_do_roundup;
+   kgdb_info[cpu].send_ready = NULL;
+
+   /* Wait till all the CPUs have quit from the debugger. */
+   while (atomic_read(&slaves_in_kgdb))
+   cpu_relax();
+   return 0;
+   }
+#endif
return 1;
 }
 
--- linux.orig/kernel/debug/debug_core.h
+++ linux/kernel/debug/debug_core.h
@@ -37,6 +37,7 @@ struct kgdb_state {
 struct debuggerinfo_struct {
void                *debuggerinfo;
struct task_struct  *task;
+   atomic_t        *send_ready;
int exception_state;
int ret_state;
int irq_depth;

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 06/14] KDB: consolidate KDB grep code

2013-03-12 Thread Mike Travis

This patch consolidates various parts of the grep code in KDB
into a new file, kdb_grep.c, in preparation for various cleanups
and additions.

Cc: Tim Bird 
Cc: Anton Vorontsov 
Cc: Sasha Levin 
Cc: Rusty Russell 
Cc: Greg Kroah-Hartman 
Cc: "Vincent Stehlé" 
Cc: Andrei Warkentin 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 kernel/debug/kdb/Makefile  |2 
 kernel/debug/kdb/kdb_grep.c|  145 +
 kernel/debug/kdb/kdb_io.c  |   38 --
 kernel/debug/kdb/kdb_main.c|   92 --
 kernel/debug/kdb/kdb_private.h |4 +
 5 files changed, 152 insertions(+), 129 deletions(-)

--- linux.orig/kernel/debug/kdb/Makefile
+++ linux/kernel/debug/kdb/Makefile
@@ -7,7 +7,7 @@
 #
 
 CCVERSION  := $(shell $(CC) -v 2>&1 | sed -ne '$$p')
-obj-y := kdb_io.o kdb_main.o kdb_support.o kdb_bt.o gen-kdb_cmds.o kdb_bp.o kdb_debugger.o
+obj-y := kdb_io.o kdb_main.o kdb_support.o kdb_bt.o gen-kdb_cmds.o kdb_bp.o kdb_grep.o kdb_debugger.o
 obj-$(CONFIG_KDB_KEYBOARD)+= kdb_keyboard.o
 
 clean-files := gen-kdb_cmds.c
--- /dev/null
+++ linux/kernel/debug/kdb/kdb_grep.c
@@ -0,0 +1,145 @@
+/*
+ * Kernel Debugger Architecture Grep Support
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 1999-2004,2013 Silicon Graphics, Inc.  All Rights Reserved.
+ * Copyright (c) 2009 Wind River Systems, Inc.  All Rights Reserved.
+ */
+
+#include 
+#include 
+#include 
+#include "kdb_private.h"
+
+#define GREP_LEN 256
+char kdb_grep_string[GREP_LEN];
+int kdb_grepping_flag;
+EXPORT_SYMBOL(kdb_grepping_flag);
+int kdb_grep_leading;
+int kdb_grep_trailing;
+
+/*
+ * The "str" argument may point to something like  | grep xyz
+ */
+void kdb_grep_parse(const char *str)
+{
+   int len;
+   char *cp = (char *)str, *cp2;
+
+   /* sanity check: we should have been called with the \ first */
+   if (*cp != '|')
+   return;
+   cp++;
+   while (isspace(*cp))
+   cp++;
+   if (strncmp(cp, "grep ", 5)) {
+   kdb_printf("invalid 'pipe', see grephelp\n");
+   return;
+   }
+   cp += 5;
+   while (isspace(*cp))
+   cp++;
+   cp2 = strchr(cp, '\n');
+   if (cp2)
+   *cp2 = '\0'; /* remove the trailing newline */
+   len = strlen(cp);
+   if (len == 0) {
+   kdb_printf("invalid 'pipe', see grephelp\n");
+   return;
+   }
+   /* now cp points to a nonzero length search string */
+   if (*cp == '"') {
+   /* allow it be "x y z" by removing the "'s - there must
+  be two of them */
+   cp++;
+   cp2 = strchr(cp, '"');
+   if (!cp2) {
+   kdb_printf("invalid quoted string, see grephelp\n");
+   return;
+   }
+   *cp2 = '\0'; /* end the string where the 2nd " was */
+   }
+   kdb_grep_leading = 0;
+   if (*cp == '^') {
+   kdb_grep_leading = 1;
+   cp++;
+   }
+   len = strlen(cp);
+   kdb_grep_trailing = 0;
+   if (*(cp+len-1) == '$') {
+   kdb_grep_trailing = 1;
+   *(cp+len-1) = '\0';
+   }
+   len = strlen(cp);
+   if (!len)
+   return;
+   if (len >= GREP_LEN) {
+   kdb_printf("search string too long\n");
+   return;
+   }
+   strcpy(kdb_grep_string, cp);
+   kdb_grepping_flag++;
+   return;
+}
+
+
+/*
+ * search arg1 to see if it contains arg2
+ * (kdmain.c provides flags for ^pat and pat$)
+ *
+ * return 1 for found, 0 for not found
+ */
+int kdb_grep_search(char *searched)
+{
+   char firstchar, *cp;
+   char *searchfor = kdb_grep_string;
+   int len1, len2;
+
+   /* not counting the newline at the end of "searched" */
+   len1 = strlen(searched)-1;
+   len2 = strlen(searchfor);
+   if (len1 < len2)
+   return 0;
+   if (kdb_grep_leading && kdb_grep_trailing && len1 != len2)
+   return 0;
+   if (kdb_grep_leading) {
+   if (!strncmp(searched, searchfor, len2))
+   return 1;
+   } else if (kdb_grep_trailing) {
+   if (!strncmp(searched+len1-len2, searchfor, len2))
+   return 1;
+   } else {
+   firstchar = *searchfor;
+   cp = searched;
+   while ((cp = strchr(cp, firstchar))) {
+   if (!strncmp(cp, searchfor,

[PATCH 07/14] KDB: clean up KDB grep code, add some options

2013-03-12 Thread Mike Travis

This patch cleans up the grep 'pipe' code in KDB and adds some new options:

* allows multiple '| grep' options to be used.
* adds '-v' flag to invert the search.
* adds '-o' flag for optional ('OR') patterns.
* adds '-u' flag to delay printing until match found.

Options may be mixed in any combination.
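
As a loose illustration of chained filters with '-v', the sketch below
treats each '| grep' as one stage and prints a line only if every
stage accepts it.  That AND semantics is an assumption made for the
example, the names are hypothetical, and the '-o' and '-u' options are
not modeled.

```c
#include <string.h>

/* Hypothetical description of one '| grep [-v] pattern' stage. */
struct grep_stage {
	const char *pattern;
	int invert;		/* nonzero for '-v' */
};

/*
 * Returns 1 if the line survives all stages.  A plain stage needs the
 * pattern to be present; an inverted stage needs it to be absent.
 */
static int line_passes(const char *line, const struct grep_stage *st, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		int hit = strstr(line, st[i].pattern) != NULL;

		if (hit == !!st[i].invert)
			return 0;
	}
	return 1;
}
```

With the stages "grep RIP" and "grep -v idle", a line containing
"RIP: schedule" passes, while "RIP: poll_idle" is dropped by the
second stage.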

Cc: Tim Bird 
Cc: Anton Vorontsov 
Cc: Sasha Levin 
Cc: Rusty Russell 
Cc: Greg Kroah-Hartman 
Cc: "Vincent Stehlé" 
Cc: Andrei Warkentin 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 kernel/debug/kdb/kdb_grep.c|  352 +
 kernel/debug/kdb/kdb_io.c  |   39 ++--
 kernel/debug/kdb/kdb_main.c|   10 -
 kernel/debug/kdb/kdb_private.h |   55 +-
 4 files changed, 361 insertions(+), 95 deletions(-)

--- linux.orig/kernel/debug/kdb/kdb_grep.c
+++ linux/kernel/debug/kdb/kdb_grep.c
@@ -11,80 +11,224 @@
 
 #include 
 #include 
+#include 
 #include 
 #include "kdb_private.h"
 
-#define GREP_LEN 256
-char kdb_grep_string[GREP_LEN];
-int kdb_grepping_flag;
+#define KDB_GREP_PATT_LEN 511
+#define KDB_GREP_MAX 8
+
+static char kdb_grep_patterns[KDB_GREP_PATT_LEN+1];
+static int kdb_grep_pattern_idx;
+
+/* Note: kdb_grep_stack[0] intentionally left zero */
+struct kdb_grep_stack_s kdb_grep_stack[KDB_GREP_MAX+1];
+int kdb_grepping_flag; /* now kdb_grep_stack index */
 EXPORT_SYMBOL(kdb_grepping_flag);
-int kdb_grep_leading;
-int kdb_grep_trailing;
 
-/*
- * The "str" argument may point to something like  | grep xyz
- */
-void kdb_grep_parse(const char *str)
+static void kdb_grep_stack_clear(void)
 {
-   int len;
-   char *cp = (char *)str, *cp2;
+   kdb_grep_stack[kdb_grepping_flag].flags = 0;
+   kdb_grep_stack[kdb_grepping_flag].pattern_idx = 0;
+}
+
+static int kdb_grep_push(void)
+{
+   if (kdb_grepping_flag < KDB_GREP_MAX) {
+   ++kdb_grepping_flag;
+   kdb_grep_stack_clear();
+   kdb_grep_set(enabled);
+   return 1;
+   }
+   return 0;
+}
+
+static void kdb_grep_pop(void)
+{
+   if (kdb_grepping_flag > 0) {
+   kdb_grep_pattern_idx =
+   kdb_grep_stack[kdb_grepping_flag].pattern_idx;
+
+   kdb_grep_patterns[kdb_grep_pattern_idx] = '\0';
+
+   if (kdb_grep(suspended))
+   kdb_grep_set_lvl(suspended, kdb_grepping_flag - 1);
+
+   kdb_grep_stack_clear();
+   --kdb_grepping_flag;
+
+   if (!kdb_grep(enabled))
+   kdb_grep_pop();
+   }
+}
+
+void kdb_grep_clear_all(void)
+{
+   kdb_grepping_flag = 0;
+   kdb_grep_pattern_idx = 0;
+   kdb_grep_patterns[0] = 0;
+   memset(kdb_grep_stack, 0, sizeof(kdb_grep_stack));
+}
+
+static int kdb_grep_error(const char *str)
+{
+   kdb_grep_clear_all();
+   kdb_printf("grep error: %s, see grephelp\n", str);
+   return -1;
+}
 
-   /* sanity check: we should have been called with the \ first */
+static const char *kdb_grep_pattern(int lvl)
+{
+   return &kdb_grep_patterns[kdb_grep_stack[lvl].pattern_idx];
+}
+
+static int kdb_grep_add_pattern(char *str)
+{
+   int len = strlen(str);
+
+   if (!len) {
+   kdb_grep_error("empty search pattern");
+   return 0;
+   }
+
+   if ((kdb_grep_pattern_idx + len) >= KDB_GREP_PATT_LEN) {
+   kdb_grep_error("search string(s) too long");
+   return 0;
+   }
+
+   /* copy string into pattern(s) buffer */
+   kdb_grep_stack[kdb_grepping_flag].pattern_idx = kdb_grep_pattern_idx;
+   strcpy((char *)kdb_grep_pattern(kdb_grepping_flag), str);
+   kdb_grep_pattern_idx += len + 1;
+   kdb_grep_patterns[kdb_grep_pattern_idx] = '\0';
+   return 1;
+}
+
+static char *is_grep(const char *cp)
+{
+   /* sanity check: we should have been called with the | first */
if (*cp != '|')
-   return;
+   return 0;
cp++;
while (isspace(*cp))
cp++;
+
if (strncmp(cp, "grep ", 5)) {
-   kdb_printf("invalid 'pipe', see grephelp\n");
-   return;
+   kdb_grep_error("invalid 'pipe'");
+   return NULL;
}
cp += 5;
+   return (char *)cp;
+}
+
+/*
+ * The "str" argument may point to something like  | grep xyz
+ */
+int kdb_grep_parse(char *str)
+{
+   int len;
+   char *cp, *cp2;
+   char *newgrep;
+
+   cp = is_grep(str);
+   if (!cp)
+   return -1;
+repeat:
+   if (!kdb_grep_push())
+   return kdb_grep_error("too many grep's");
+
+   newgrep = NULL;
+
while (isspace(*cp))
cp++

[PATCH 12/14] x86/UV: Add uvtrace support

2013-03-12 Thread Mike Travis

This patch adds support for the uvtrace KDB module by providing a
skeleton call to the registered trace function.  It also provides
another separate 'NMI' tracer that is triggered by the system wide
'power nmi' command.
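
The hook pattern described above (a function pointer that stays NULL
until a tracer is loaded, wrapped in a macro that tests it first) can
be sketched in userspace as follows.  The names trace_func, trace(),
buffer_tracer() and do_work() are hypothetical stand-ins, not the
symbols added by this patch, and the macro relies on the GNU
##__VA_ARGS__ extension just as the kernel version does.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hook pointer; stays NULL until a tracer registers itself. */
static void (*trace_func)(const char *func, int line, const char *fmt, ...);

/* Call the hook only when one is installed. */
#define trace(fmt, ...)							\
do {									\
	if (trace_func)							\
		trace_func(__func__, __LINE__, fmt, ##__VA_ARGS__);	\
} while (0)

/* Hypothetical tracer: keeps the most recent message in a buffer. */
static char last_msg[128];

static void buffer_tracer(const char *func, int line, const char *fmt, ...)
{
	va_list ap;
	int n;

	(void)line;	/* the line number is not used in this sketch */
	n = snprintf(last_msg, sizeof(last_msg), "%s: ", func);
	if (n < 0 || (size_t)n >= sizeof(last_msg))
		return;
	va_start(ap, fmt);
	vsnprintf(last_msg + n, sizeof(last_msg) - (size_t)n, fmt, ap);
	va_end(ap);
}

static void do_work(int cpu)
{
	trace("cpu %d entered", cpu);
}

static void trace_demo(void)
{
	do_work(3);			/* no hook yet: nothing is recorded */
	assert(last_msg[0] == '\0');

	trace_func = buffer_tracer;	/* "load" the tracer */
	do_work(3);
	assert(strcmp(last_msg, "do_work: cpu 3 entered") == 0);
}
```

The point of the NULL test is that call sites cost a single branch
when no tracer is loaded.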

Cc: Alex Shi 
Cc: Cliff Wickman 
Cc: Alexander Gordeev 
Cc: Suresh Siddha 
Cc: "Michael S. Tsirkin" 
Cc: Steffen Persvold 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 arch/x86/include/asm/uv/uv.h  |   12 ++--
 arch/x86/platform/uv/uv_nmi.c |   10 +-
 2 files changed, 19 insertions(+), 3 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv.h
+++ linux/arch/x86/include/asm/uv/uv.h
@@ -14,24 +14,32 @@ extern void uv_cpu_init(void);
 extern void uv_nmi_init(void);
 extern void uv_register_nmi_notifier(void);
 extern void uv_system_init(void);
+extern void (*uv_trace_nmi_func)(int cpu, struct pt_regs *regs, int ignored);
+extern void (*uv_trace_func)(const char *f, const int l, const char *fmt, ...);
+#define uvtrace(fmt, ...)  \
+do {   \
+   if (unlikely(uv_trace_func))\
+   (uv_trace_func)(__func__, __LINE__, fmt, ##__VA_ARGS__);\
+} while (0)
 extern const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
 struct mm_struct *mm,
 unsigned long start,
 unsigned long end,
 unsigned int cpu);
 
-#else  /* X86_UV */
+#else  /* !X86_UV */
 
 static inline enum uv_system_type get_uv_system_type(void) { return UV_NONE; }
 static inline int is_uv_system(void)   { return 0; }
 static inline void uv_cpu_init(void)   { }
 static inline void uv_system_init(void) { }
+static inline void uvtrace(void *fmt, ...) { }
 static inline void uv_register_nmi_notifier(void) { }
 static inline const struct cpumask *
 uv_flush_tlb_others(const struct cpumask *cpumask, struct mm_struct *mm,
unsigned long start, unsigned long end, unsigned int cpu)
 { return cpumask; }
 
-#endif /* X86_UV */
+#endif /* !X86_UV */
 
 #endif /* _ASM_X86_UV_UV_H */
--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -1,5 +1,5 @@
 /*
- * SGI NMI support routines
+ * SGI NMI/TRACE support routines
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -20,6 +20,7 @@
  */
 
 #include 
+#include 
 #include 
 
 #include 
@@ -34,6 +35,13 @@
 DEFINE_PER_CPU(unsigned long, cpu_last_nmi_count);
 static DEFINE_SPINLOCK(uv_nmi_lock);
 
+void (*uv_trace_func)(const char *f, const int l, const char *fmt, ...);
+EXPORT_SYMBOL(uv_trace_func);
+
+void (*uv_trace_nmi_func)(int cpu, struct pt_regs *regs, int ignored);
+EXPORT_SYMBOL(uv_trace_nmi_func);
+
+
 /*
  * When NMI is received, print a stack trace.
  */

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 05/14] KDB: add more exports for supporting KDB modules

2013-03-12 Thread Mike Travis

This patch adds some important KDB functions to be externally
usable by loadable KDB modules.  Note that often drivers bring
in KDB modules for debugging, and in the past KDB has not been
limited to use by GPL only modules.  This patch restores KDB
usefulness to non-GPL modules.

Cc: Tim Bird 
Cc: Anton Vorontsov 
Cc: Sasha Levin 
Cc: Rusty Russell 
Cc: Greg Kroah-Hartman 
Cc: Cong Wang 
Cc: Stephen Boyd 
Cc: Al Viro 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
Cc: Serge Hallyn 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 kernel/debug/kdb/kdb_io.c  |5 -
 kernel/debug/kdb/kdb_main.c|   21 ++---
 kernel/debug/kdb/kdb_support.c |   17 +
 kernel/kallsyms.c  |9 +
 kernel/signal.c|5 +++--
 5 files changed, 47 insertions(+), 10 deletions(-)

--- linux.orig/kernel/debug/kdb/kdb_io.c
+++ linux/kernel/debug/kdb/kdb_io.c
@@ -30,6 +30,7 @@
 char kdb_prompt_str[CMD_BUFLEN];
 
 int kdb_trap_printk;
+EXPORT_SYMBOL(kdb_trap_printk);
 
 static int kgdb_transition_check(char *buffer)
 {
@@ -447,6 +448,7 @@ char *kdb_getstr(char *buffer, size_t bu
kdb_nextline = 1;   /* Prompt and input resets line number */
return kdb_read(buffer, bufsize);
 }
+EXPORT_SYMBOL(kdb_getstr);
 
 /*
  * kdb_input_flush
@@ -839,6 +841,7 @@ kdb_print_out:
preempt_enable();
return retlen;
 }
+EXPORT_SYMBOL(vkdb_printf);
 
 int kdb_printf(const char *fmt, ...)
 {
@@ -851,4 +854,4 @@ int kdb_printf(const char *fmt, ...)
 
return r;
 }
-EXPORT_SYMBOL_GPL(kdb_printf);
+EXPORT_SYMBOL(kdb_printf);
--- linux.orig/kernel/debug/kdb/kdb_main.c
+++ linux/kernel/debug/kdb/kdb_main.c
@@ -53,19 +53,23 @@ int kdb_grep_trailing;
  * Kernel debugger state flags
  */
 int kdb_flags;
+EXPORT_SYMBOL(kdb_flags);
 atomic_t kdb_event;
+EXPORT_SYMBOL(kdb_event);
 
 /*
  * kdb_lock protects updates to kdb_initial_cpu.  Used to
  * single thread processors through the kernel debugger.
  */
 int kdb_initial_cpu = -1;  /* cpu number that owns kdb */
+EXPORT_SYMBOL(kdb_initial_cpu);
 int kdb_nextline = 1;
 int kdb_state; /* General KDB state */
 
 struct task_struct *kdb_current_task;
 EXPORT_SYMBOL(kdb_current_task);
 struct pt_regs *kdb_current_regs;
+EXPORT_SYMBOL(kdb_current_regs);
 
 const char *kdb_diemsg;
 static int kdb_go_count;
@@ -186,6 +190,7 @@ struct task_struct *kdb_curr_task(int cp
 #endif
return p;
 }
+EXPORT_SYMBOL(kdb_curr_task);
 
 /*
  * kdbgetenv - This function will return the character string value of
@@ -217,6 +222,7 @@ char *kdbgetenv(const char *match)
}
return NULL;
 }
+EXPORT_SYMBOL(kdbgetenv);
 
 /*
  * kdballocenv - This function is used to allocate bytes for
@@ -293,6 +299,7 @@ int kdbgetintenv(const char *match, int
*value = (int) val;
return diag;
 }
+EXPORT_SYMBOL(kdbgetintenv);
 
 /*
  * kdbgetularg - This function will convert a numeric string into an
@@ -325,6 +332,7 @@ int kdbgetularg(const char *arg, unsigne
 
return 0;
 }
+EXPORT_SYMBOL(kdbgetularg);
 
 int kdbgetu64arg(const char *arg, u64 *value)
 {
@@ -344,6 +352,7 @@ int kdbgetu64arg(const char *arg, u64 *v
 
return 0;
 }
+EXPORT_SYMBOL(kdbgetu64arg);
 
 /*
  * kdb_set - This function implements the 'set' command.  Alter an
@@ -425,6 +434,7 @@ int kdb_set(int argc, const char **argv)
 
return KDB_ENVFULL;
 }
+EXPORT_SYMBOL(kdb_set);
 
 static int kdb_check_regs(void)
 {
@@ -585,6 +595,7 @@ int kdbgetaddrarg(int argc, const char *
 
return 0;
 }
+EXPORT_SYMBOL(kdbgetaddrarg);
 
 static void kdb_cmderror(int diag)
 {
@@ -1049,6 +1060,7 @@ int kdb_parse(const char *cmdstr)
return 0;
}
 }
+EXPORT_SYMBOL(kdb_parse);
 
 
 static int handle_ctrl_cmd(char *cmd)
@@ -1109,6 +1121,7 @@ void kdb_set_current_task(struct task_st
}
kdb_current_regs = NULL;
 }
+EXPORT_SYMBOL(kdb_set_current_task);
 
 /*
  * kdb_local - The main code for kdb.  This routine is invoked on a
@@ -2249,6 +2262,7 @@ void kdb_ps_suppressed(void)
kdb_printf(" suppressed,\nuse 'ps A' to see all.\n");
}
 }
+EXPORT_SYMBOL(kdb_ps_suppressed);
 
 /*
  * kdb_ps - This function implements the 'ps' command which shows a
@@ -2281,6 +2295,7 @@ void kdb_ps1(const struct task_struct *p
}
}
 }
+EXPORT_SYMBOL(kdb_ps1);
 
 static int kdb_ps(int argc, const char **argv)
 {
@@ -2697,7 +2712,7 @@ int kdb_register_repeat(char *cmd,
 
return 0;
 }
-EXPORT_SYMBOL_GPL(kdb_register_repeat);
+EXPORT_SYMBOL(kdb_register_repeat);
 
 
 /*
@@ -2721,7 +2736,7 @@ int kdb_register(char *cmd,
return kdb_register_repeat(cmd, func, usage, help, minlen,
   KDB_REPEAT_NONE);
 }
-EXPORT_SYMBOL_GPL(kdb_register);
+EXPORT_SYMBOL(kdb_register);
 
 /*
  * kdb_unregister - This function is used to unregister a kernel
@@ -2750,7

[PATCH 11/14] x86/UV: Move NMI support

2013-03-12 Thread Mike Travis

This patch moves the UV NMI support from the x2apic file to a
new separate uv_nmi.c file in preparation for the next sequence
of patches.  It minimizes bloat of the x2apic file, and has the
added benefit of putting the upcoming /sys/module parameters under
the name 'uv_nmi' instead of 'x2apic_uv_x', which was obscure.
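
For readers unfamiliar with the handler being relocated: its comments
describe a per-blade NMI count that each cpu compares against its own
last-seen count.  A minimal userspace sketch of that idiom follows.
struct blade, cpu_saw_new_nmi() and the plain fields standing in for
the MMR and the per-cpu variable are assumptions for illustration,
and the locking done by the real handler is omitted.

```c
#include <assert.h>
#include <stdbool.h>

struct blade {
	bool nmi_pending;		/* stands in for the pending bit in the MMR */
	unsigned long nmi_count;	/* bumped once per real NMI */
};

/*
 * Returns true if this cpu has not yet handled the latest NMI seen
 * on its blade, and records that it has now done so.
 */
static bool cpu_saw_new_nmi(struct blade *b, unsigned long *cpu_last_count)
{
	if (b->nmi_pending) {		/* first cpu in consumes the flag */
		b->nmi_count++;
		b->nmi_pending = false;
	}

	if (*cpu_last_count == b->nmi_count)
		return false;		/* nothing new for this cpu */

	*cpu_last_count = b->nmi_count;
	return true;
}

static void nmi_count_demo(void)
{
	struct blade b = { .nmi_pending = true, .nmi_count = 0 };
	unsigned long cpu0_last = 0, cpu1_last = 0;

	assert(cpu_saw_new_nmi(&b, &cpu0_last));	/* clears the flag */
	assert(cpu_saw_new_nmi(&b, &cpu1_last));	/* still notices it */
	assert(!cpu_saw_new_nmi(&b, &cpu0_last));	/* already handled */
}
```

Only the first cpu touches the pending flag; every other cpu on the
blade learns of the NMI from the count alone.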

Cc: Alex Shi 
Cc: Cliff Wickman 
Cc: Alexander Gordeev 
Cc: Suresh Siddha 
Cc: "Michael S. Tsirkin" 
Cc: Steffen Persvold 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 arch/x86/include/asm/uv/uv.h   |2 
 arch/x86/kernel/apic/x2apic_uv_x.c |   69 -
 arch/x86/platform/uv/Makefile  |2 
 arch/x86/platform/uv/uv_nmi.c  |  101 +
 4 files changed, 104 insertions(+), 70 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv.h
+++ linux/arch/x86/include/asm/uv/uv.h
@@ -12,6 +12,7 @@ extern enum uv_system_type get_uv_system
 extern int is_uv_system(void);
 extern void uv_cpu_init(void);
 extern void uv_nmi_init(void);
+extern void uv_register_nmi_notifier(void);
 extern void uv_system_init(void);
 extern const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
 struct mm_struct *mm,
@@ -25,6 +26,7 @@ static inline enum uv_system_type get_uv
 static inline int is_uv_system(void)   { return 0; }
 static inline void uv_cpu_init(void)   { }
 static inline void uv_system_init(void) { }
+static inline void uv_register_nmi_notifier(void) { }
 static inline const struct cpumask *
 uv_flush_tlb_others(const struct cpumask *cpumask, struct mm_struct *mm,
unsigned long start, unsigned long end, unsigned int cpu)
--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -39,12 +39,6 @@
 #include 
 #include 
 
-/* BMC sets a bit this MMR non-zero before sending an NMI */
-#define UVH_NMI_MMR            UVH_SCRATCH5
-#define UVH_NMI_MMR_CLEAR      (UVH_NMI_MMR + 8)
-#define UV_NMI_PENDING_MASK    (1UL << 63)
-DEFINE_PER_CPU(unsigned long, cpu_last_nmi_count);
-
 DEFINE_PER_CPU(int, x2apic_extra_bits);
 
 #define PR_DEVEL(fmt, args...) pr_devel("%s: " fmt, __func__, args)
@@ -56,7 +50,6 @@ int uv_min_hub_revision_id;
 EXPORT_SYMBOL_GPL(uv_min_hub_revision_id);
 unsigned int uv_apicid_hibits;
 EXPORT_SYMBOL_GPL(uv_apicid_hibits);
-static DEFINE_SPINLOCK(uv_nmi_lock);
 
 static struct apic apic_x2apic_uv_x;
 
@@ -795,68 +788,6 @@ void __cpuinit uv_cpu_init(void)
set_x2apic_extra_bits(uv_hub_info->pnode);
 }
 
-/*
- * When NMI is received, print a stack trace.
- */
-int uv_handle_nmi(unsigned int reason, struct pt_regs *regs)
-{
-   unsigned long real_uv_nmi;
-   int bid;
-
-   /*
-* Each blade has an MMR that indicates when an NMI has been sent
-* to cpus on the blade. If an NMI is detected, atomically
-* clear the MMR and update a per-blade NMI count used to
-* cause each cpu on the blade to notice a new NMI.
-*/
-   bid = uv_numa_blade_id();
-   real_uv_nmi = (uv_read_local_mmr(UVH_NMI_MMR) & UV_NMI_PENDING_MASK);
-
-   if (unlikely(real_uv_nmi)) {
-   spin_lock(&uv_blade_info[bid].nmi_lock);
-   real_uv_nmi = (uv_read_local_mmr(UVH_NMI_MMR) & UV_NMI_PENDING_MASK);
-   if (real_uv_nmi) {
-   uv_blade_info[bid].nmi_count++;
-   uv_write_local_mmr(UVH_NMI_MMR_CLEAR, UV_NMI_PENDING_MASK);
-   }
-   spin_unlock(&uv_blade_info[bid].nmi_lock);
-   }
-
-   if (likely(__get_cpu_var(cpu_last_nmi_count) == uv_blade_info[bid].nmi_count))
-   return NMI_DONE;
-
-   __get_cpu_var(cpu_last_nmi_count) = uv_blade_info[bid].nmi_count;
-
-   /*
-* Use a lock so only one cpu prints at a time.
-* This prevents intermixed output.
-*/
-   spin_lock(&uv_nmi_lock);
-   pr_info("UV NMI stack dump cpu %u:\n", smp_processor_id());
-   dump_stack();
-   spin_unlock(&uv_nmi_lock);
-
-   return NMI_HANDLED;
-}
-
-void uv_register_nmi_notifier(void)
-{
-   if (register_nmi_handler(NMI_UNKNOWN, uv_handle_nmi, 0, "uv"))
-   printk(KERN_WARNING "UV NMI handler failed to register\n");
-}
-
-void uv_nmi_init(void)
-{
-   unsigned int value;
-
-   /*
-* Unmask NMI on all cpus
-*/
-   value = apic_read(APIC_LVT1) | APIC_DM_NMI;
-   value &= ~APIC_LVT_MASKED;
-   apic_write(APIC_LVT1, value);
-}
-
 void __init uv_system_init(void)
 {
union uvh_rh_gam_config_mmr_u  m_n_config;
--- linux.orig/arch/x86/platform/uv/Makefile
+++ linux/arch/x86/platform/uv/Makefile
@@ -1 +1 @@
-obj-$(CONFIG_X86_UV)   += tlb_uv.o bios_uv.o uv_irq.o uv_sysfs.o uv_time.o
+obj-$(CON

[PATCH 14/14] x86/UV: Add call to KGDB/KDB from NMI handler

2013-03-12 Thread Mike Travis

This patch restores the ability to enter KDB (and KGDB) from the UV
NMI handler.  It utilizes the newly added kgdb_nmicallin function
to gain entry to KGDB/KDB by the master.  The slaves still enter via
the standard kgdb_nmicallback function.

The handler also uses the new 'send_ready' pointer to tell KGDB/KDB
to signal the slaves when to proceed into the KGDB slave loop.
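
The slave side of this handshake amounts to spinning on a shared
value until the master publishes a decision.  Below is a minimal
single-threaded sketch using C11 atomics.  The patch itself only
shows that zero means "keep waiting" and 2 means "dump state"; the
value 1 for "enter the debugger", the enum names and slave_wait()
are assumptions, and the kernel loop also calls cpu_relax() on each
pass.

```c
#include <assert.h>
#include <stdatomic.h>

/* Assumed encoding of the shared "slave continue" value. */
enum { SLAVE_WAIT = 0, SLAVE_ENTER_DEBUGGER = 1, SLAVE_DUMP_STATE = 2 };

enum slave_action { ACTION_DEBUGGER, ACTION_DUMP };

/* Spin until the master publishes a nonzero value, then act on it. */
static enum slave_action slave_wait(atomic_int *slave_continue)
{
	int sig;

	do {
		sig = atomic_load(slave_continue);
	} while (sig == SLAVE_WAIT);

	return sig == SLAVE_DUMP_STATE ? ACTION_DUMP : ACTION_DEBUGGER;
}

/* The "master" publishes before the "slave" runs, so no thread is needed. */
static void slave_wait_demo(void)
{
	atomic_int flag;

	atomic_init(&flag, SLAVE_ENTER_DEBUGGER);
	assert(slave_wait(&flag) == ACTION_DEBUGGER);

	atomic_store(&flag, SLAVE_DUMP_STATE);
	assert(slave_wait(&flag) == ACTION_DUMP);
}
```

Using one shared value for both "go" and "fall back to a dump" means
the slaves never have to guess whether the master got into KGDB/KDB.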

Cc: Alexander Gordeev 
Cc: Suresh Siddha 
Cc: "Michael S. Tsirkin" 
Cc: Steffen Persvold 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 arch/x86/platform/uv/uv_nmi.c |   73 --
 1 file changed, 71 insertions(+), 2 deletions(-)

--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -21,6 +21,8 @@
 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -33,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -511,6 +514,68 @@ static void uv_nmi_touch_watchdogs(void)
touch_nmi_watchdog();
 }
 
+#ifdef CONFIG_KGDB_KDB
+
+/* Disable to force process dump instead of entering KDB or KGDB */
+static int uv_nmi_kdb_on = 1;
+module_param_named(kdb_on, uv_nmi_kdb_on, int, 0644);
+
+/* Call KDB from NMI handler */
+static void uv_call_kdb(int cpu, struct pt_regs *regs,
+   int master, unsigned long *flags)
+{
+   int ret;
+
+   if (master) {
+   /* call KGDB NMI handler as MASTER */
+   local_irq_restore(*flags);
+   ret = kgdb_nmicallin(cpu, X86_TRAP_NMI, regs,
+   &uv_nmi_slave_continue);
+   local_irq_save(*flags);
+
+   /*
+* if KGDB/KDB did not handle the NMI, then signal slaves
+*   to do process dump instead.
+*/
+   if (ret) {
+   uv_nmi_dump_state(cpu, regs, 1);
+   return;
+   }
+   } else {
+   int sig;
+
+   /* wait for KGDB to say it's ready for slaves to enter */
+   do {
+   cpu_relax();
+   sig = atomic_read(&uv_nmi_slave_continue);
+   } while (!sig);
+
+   /*
+* if KGDB/KDB did not handle the NMI for the master, then
+*   the master signals the slaves to do process dump instead.
+*/
+   if (sig == 2) {
+   uv_nmi_dump_state(cpu, regs, 0);
+   return;
+   }
+
+   /* call KGDB as slave */
+   local_irq_restore(*flags);
+   ret = kgdb_nmicallback(cpu, regs);
+   local_irq_save(*flags);
+   }
+   uv_nmi_sync_exit(master);
+}
+
+#else /* !CONFIG_KGDB_KDB */
+static inline void uv_call_kdb(int cpu, struct pt_regs *regs,
+   int master, unsigned long *flags)
+{
+   pr_err("UV: NMI error: KDB is not enabled in this kernel\n");
+   uv_nmi_dump_state(cpu, regs, master);
+}
+#endif /* !CONFIG_KGDB_KDB */
+
 /*
  * UV NMI handler
  */
@@ -537,8 +602,12 @@ int uv_handle_nmi(unsigned int reason, s
if (master && uv_nmi_kdump_requested)
uv_nmi_kdump(regs);
 
-   /* Dump state of each cpu */
-   uv_nmi_dump_state(cpu, regs, master);
+   /* Call KDB if enabled */
+   if (uv_nmi_kdb_on)
+   uv_call_kdb(cpu, regs, master, &flags);
+
+   else    /* Otherwise dump state of each cpu */
+   uv_nmi_dump_state(cpu, regs, master);
 
/* Clear per_cpu "in nmi" flag */
atomic_set(&uv_cpu_nmi.state, UV_NMI_STATE_OUT);


[PATCH 03/14] KDB: up the default LINES value

2013-03-12 Thread Mike Travis

Currently the default for the # of lines displayed by the KDB pager
is 24.  This does not allow all of the lines for the entry messages,
reg dump and process trace.  Increase it to something more reasonable.

Cc: Tim Bird 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 kernel/debug/kdb/kdb_io.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux.orig/kernel/debug/kdb/kdb_io.c
+++ linux/kernel/debug/kdb/kdb_io.c
@@ -586,7 +586,7 @@ int vkdb_printf(const char *fmt, va_list
 
diag = kdbgetintenv("LINES", &linecount);
if (diag || linecount <= 1)
-   linecount = 24;
+   linecount = 60;
 
diag = kdbgetintenv("COLUMNS", &colcount);
if (diag || colcount <= 1)


[PATCH 09/14] KDB: Add pshelp command.

2013-03-12 Thread Mike Travis

This patch restores the capability for providing help with the
PS and BT arguments.

Cc: Anton Vorontsov 
Cc: Sasha Levin 
Cc: Rusty Russell 
Cc: Greg Kroah-Hartman 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 kernel/debug/kdb/kdb_main.c |   28 
 1 file changed, 28 insertions(+)

--- linux.orig/kernel/debug/kdb/kdb_main.c
+++ linux/kernel/debug/kdb/kdb_main.c
@@ -2787,6 +2787,32 @@ static int kdb_grep_help(int argc, const
 }
 
 /*
+ * display help for the ps and bt status flag
+ */
+static int kdb_ps_help(int argc, const char **argv)
+{
+   kdb_printf("The meaning of the State flag in ps command output:\n");
+   kdb_printf("  R RUNNING\n");
+   kdb_printf("  D TASK_UNINTERRUPTIBLE\n");
+   kdb_printf("  S TASK_INTERRUPTIBLE\n");
+   kdb_printf("  T TASK_STOPPED\n");
+   kdb_printf("  C TASK_TRACED\n");
+   kdb_printf("  Z EXIT_ZOMBIE\n");
+   kdb_printf("  E EXIT_DEAD\n");
+   kdb_printf("  U UNRUNNABLE\n");
+   kdb_printf("  M sleeping DAEMON\n");
+   kdb_printf("  I IDLE\n");
+   kdb_printf("  (note that most idles are named 'kworker/NN')\n");
+   kdb_printf("\n");
+   kdb_printf(
+   "The above can be specified to ps and bta to select tasks\n");
+   kdb_printf("  A all of above\n");
+   kdb_printf("  default is RDSTCZEU  (not Idle or sleeping Daemon)\n");
+   return 0;
+}
+
+
+/*
  * kdb_register_repeat - This function is used to register a kernel
  * debugger command.
  * Inputs:
@@ -2972,6 +2998,8 @@ static void __init kdb_inittab(void)
  "Enter kgdb mode", 0, KDB_REPEAT_NONE);
kdb_register_repeat("ps", kdb_ps, "[|A]",
  "Display active task list", 0, KDB_REPEAT_NONE);
+   kdb_register_repeat("pshelp", kdb_ps_help, "",
+ "Display help for the ps and bt task State flag", 0, KDB_REPEAT_NONE);
kdb_register_repeat("pid", kdb_pid, "",
  "Switch to another task", 0, KDB_REPEAT_NONE);
kdb_register_repeat("reboot", kdb_reboot, "",


[PATCH 08/14] KDB: Restore call to kdump from KDB

2013-03-12 Thread Mike Travis

This patch restores the capability of calling kdump from inside
KDB.  First it returns to the original CPU that KDB was called
by, and also verifies that the crash_kexec kernel has been loaded.
Both are better than just using the 'sr c' option and possibly
hanging the system.
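
The flow described above (refuse if no crash kernel is loaded, arm a
request, get back to the initial cpu, then act on the request at the
next debugger entry) can be modelled with a small state machine.
This is a simplified userspace sketch: struct debugger,
debugger_enter() and cmd_kdump() are hypothetical names, and a
boolean stands in for crash_kexec() taking over the machine.

```c
#include <assert.h>
#include <stdbool.h>

enum kdump_state { KDUMP_RESET, KDUMP_REQUESTED };

struct debugger {
	int initial_cpu;		/* cpu that first entered the debugger */
	int current_cpu;		/* cpu the debugger is running on now */
	bool crash_image_loaded;	/* has a crash kernel been loaded? */
	enum kdump_state state;
	bool crash_started;		/* stands in for crash_kexec() */
};

/* Runs on each debugger entry, before any command is processed. */
static void debugger_enter(struct debugger *d, int cpu)
{
	d->current_cpu = cpu;
	if (d->state == KDUMP_RESET)
		return;

	if (d->crash_image_loaded)
		d->crash_started = true;
	else
		d->state = KDUMP_RESET;	/* request failed: disarm it */
}

/* The 'kdump' command: check the image, arm, go to the initial cpu. */
static int cmd_kdump(struct debugger *d)
{
	if (!d->crash_image_loaded)
		return -1;

	d->state = KDUMP_REQUESTED;
	debugger_enter(d, d->initial_cpu);	/* models the cpu switch */
	return 0;
}

static void kdump_demo(void)
{
	struct debugger d = { .initial_cpu = 0, .current_cpu = 5 };

	assert(cmd_kdump(&d) == -1);		/* nothing loaded: refused */
	assert(!d.crash_started && d.state == KDUMP_RESET);

	d.crash_image_loaded = true;
	assert(cmd_kdump(&d) == 0);
	assert(d.current_cpu == 0 && d.crash_started);
}
```

Checking for the image before arming the request is what keeps a
mistyped command from leaving the system in a half-switched state.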

Cc: Anton Vorontsov 
Cc: Sasha Levin 
Cc: Rusty Russell 
Cc: Greg Kroah-Hartman 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 include/linux/kdb.h |7 +++
 kernel/debug/kdb/kdb_main.c |   79 
 2 files changed, 86 insertions(+)

--- linux.orig/include/linux/kdb.h
+++ linux/include/linux/kdb.h
@@ -144,6 +144,13 @@ static inline const char *kdb_walk_kalls
 }
 #endif /* ! CONFIG_KALLSYMS */
 
+#if defined(CONFIG_KEXEC)
+enum {
+   KDB_KDUMP_RESET,
+   KDB_KDUMP_KDUMP,
+};
+#endif
+
 /* Dynamic kdb shell command registration */
 extern int kdb_register(char *, kdb_func_t, char *, char *, short);
 extern int kdb_register_repeat(char *, kdb_func_t, char *, char *,
--- linux.orig/kernel/debug/kdb/kdb_main.c
+++ linux/kernel/debug/kdb/kdb_main.c
@@ -42,6 +42,10 @@
 #include 
 #include "kdb_private.h"
 
+#if defined(CONFIG_KEXEC)
+#include 
+#endif
+
 /*
  * Kernel debugger state flags
  */
@@ -1052,6 +1056,73 @@ void kdb_set_current_task(struct task_st
 }
 EXPORT_SYMBOL(kdb_set_current_task);
 
+#if defined(CONFIG_KEXEC)
+
+static int kdb_kdump_state = KDB_KDUMP_RESET;  /* KDB kdump state */
+
+static int kdb_cpu(int argc, const char **argv);
+
+/*
+ * kdb_kdump_check
+ *
+ * This is where the kdump on monarch cpu is handled.
+ *
+ */
+void kdb_kdump_check(struct pt_regs *regs)
+{
+   if (kdb_kdump_state != KDB_KDUMP_RESET) {
+   crash_kexec(regs);
+
+   /*
+* If the call above returned then something didn't work
+*/
+   kdb_printf("kdb_kdump_check: crash_kexec failed!\n");
+   kdb_printf
+   ("Please check if the kdump kernel has been properly loaded\n");
+   kdb_kdump_state = KDB_KDUMP_RESET;
+   }
+}
+
+
+/*
+ * kdb_kdump
+ * This function implements the 'kdump' command.
+ *
+ * Returns:
+ * zero for success, a kdb diagnostic if error
+ */
+
+static int
+kdb_kdump(int argc, const char **argv)
+{
+   char cpu_id[8];
+   const char *cpu_argv[] = {NULL, cpu_id, NULL};
+   int ret = KDB_CMD_CPU;
+
+   if (!kexec_crash_image) {
+   kdb_printf("kdump error: crash kernel not loaded\n");
+   return KDB_NOTFOUND;
+   }
+
+   kdb_kdump_state = KDB_KDUMP_KDUMP;
+
+   /* Switch back to the initial cpu before process kdump command */
+   if (smp_processor_id() != kdb_initial_cpu) {
+   scnprintf(cpu_id, sizeof(cpu_id), "%d", kdb_initial_cpu);
+   ret = kdb_cpu(1, cpu_argv);
+   if (ret != KDB_CMD_CPU) {
+   kdb_printf
+   ("kdump: Failed to switch to initial cpu %d; aborted\n",
+   kdb_initial_cpu);
+   kdb_kdump_state = KDB_KDUMP_RESET;
+   }
+   }
+
+   return ret;
+}
+
+#endif /* CONFIG_KEXEC */
+
 /*
  * kdb_local - The main code for kdb.  This routine is invoked on a
  * specific processor, it is not global.  The main kdb() routine
@@ -1079,6 +1150,10 @@ static int kdb_local(kdb_reason_t reason
struct task_struct *kdb_current =
kdb_curr_task(raw_smp_processor_id());
 
+#if defined(CONFIG_KEXEC)
+   kdb_kdump_check(regs);
+#endif
+
KDB_DEBUG_STATE("kdb_local 1", reason);
kdb_go_count = 0;
if (reason == KDB_REASON_DEBUG) {
@@ -2726,6 +2801,10 @@ static void __init kdb_inittab(void)
  "Display Help Message", 0, KDB_REPEAT_NONE);
kdb_register_repeat("cpu", kdb_cpu, "",
  "Switch to new cpu", 0, KDB_REPEAT_NONE);
+#if defined(CONFIG_KEXEC)
+   kdb_register_repeat("kdump", kdb_kdump, "",
+ "Enter kdump crash kexec", 0, KDB_REPEAT_NONE);
+#endif
kdb_register_repeat("kgdb", kdb_kgdb, "",
  "Enter kgdb mode", 0, KDB_REPEAT_NONE);
kdb_register_repeat("ps", kdb_ps, "[|A]",


[PATCH 04/14] KDB: allow KDB modules to be external modules

2013-03-12 Thread Mike Travis

Since KDB modules are not built within the Linux kernel build domain,
symbols needed by them must be available in a header file that is
accessible.  This patch moves the significant routines used by KDB
modules to the external kdb.h header file.
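
Among the declarations moved are the kdb_getarea and kdb_putarea
macros, which take variable names and pass the address and size on
to pointer-taking workers.  The idiom is easy to try in userspace.
In the sketch below fake_mem, getarea_size(), putarea_size() and the
two macros are stand-ins written for this example; they are not the
KDB functions and do no real address checking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fake "target memory" standing in for the debugged address space. */
static unsigned char fake_mem[64];

/* Pointer-taking workers; return 0 on success, -1 if out of range. */
static int getarea_size(void *dst, unsigned long addr, size_t size)
{
	if (addr > sizeof(fake_mem) || size > sizeof(fake_mem) - addr)
		return -1;
	memcpy(dst, fake_mem + addr, size);
	return 0;
}

static int putarea_size(unsigned long addr, const void *src, size_t size)
{
	if (addr > sizeof(fake_mem) || size > sizeof(fake_mem) - addr)
		return -1;
	memcpy(fake_mem + addr, src, size);
	return 0;
}

/* Like the kdb macros, these take a variable name, not a pointer. */
#define getarea(x, addr) getarea_size(&(x), addr, sizeof((x)))
#define putarea(addr, x) putarea_size(addr, &(x), sizeof((x)))

static void area_demo(void)
{
	unsigned int in = 0xdeadbeefU, out = 0;

	assert(putarea(8, in) == 0);	/* size comes from the variable */
	assert(getarea(out, 8) == 0);
	assert(out == in);
}
```

Because the size is taken from the variable itself, a caller cannot
pass a length that disagrees with the object being filled.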

Cc: Vincent Stehle 
Cc: Andrei Warkentin 
Cc: Anton Vorontsov 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 include/linux/kdb.h|   89 +
 kernel/debug/debug_core.h  |1 
 kernel/debug/kdb/kdb_private.h |   79 
 3 files changed, 89 insertions(+), 80 deletions(-)

--- linux.orig/include/linux/kdb.h
+++ linux/include/linux/kdb.h
@@ -25,6 +25,7 @@ typedef int (*kdb_func_t)(int, const cha
 #include 
 #include 
 #include 
+#include 
 
 #define KDB_POLL_FUNC_MAX  5
 extern int kdb_poll_idx;
@@ -148,6 +149,93 @@ extern int kdb_register(char *, kdb_func
 extern int kdb_register_repeat(char *, kdb_func_t, char *, char *,
   short, kdb_repeat_t);
 extern int kdb_unregister(char *);
+
+/*
+ * Exported Symbols for kernel loadable modules to use.
+ *
+ * (All of these need to be within an #ifdef CONFIG_KGDB_KDB domain)
+ */
+extern int kdb_parse(const char *cmdstr);
+extern int kdb_getarea_size(void *, unsigned long, size_t);
+extern int kdb_putarea_size(unsigned long, void *, size_t);
+
+/*
+ * Like get_user and put_user, kdb_getarea and kdb_putarea take variable
+ * names, not pointers.  The underlying *_size functions take pointers.
+ */
+#define kdb_getarea(x, addr) kdb_getarea_size(&(x), addr, sizeof((x)))
+#define kdb_putarea(addr, x) kdb_putarea_size(addr, &(x), sizeof((x)))
+
+extern int kdb_getphysword(unsigned long *word,
+   unsigned long addr, size_t size);
+extern int kdb_getword(unsigned long *, unsigned long, size_t);
+extern int kdb_putword(unsigned long, unsigned long, size_t);
+
+extern int kdbgetularg(const char *, unsigned long *);
+extern int kdbgetu64arg(const char *, u64 *);
+extern char *kdbgetenv(const char *);
+extern int kdbgetaddrarg(int, const char **, int*, unsigned long *,
+long *, char **);
+
+/* Symbol table format returned by kallsyms. */
+typedef struct __ksymtab {
+   unsigned long value;/* Address of symbol */
+   const char *mod_name;   /* Module containing symbol or
+* "kernel" */
+   unsigned long mod_start;
+   unsigned long mod_end;
+   const char *sec_name;   /* Section containing symbol */
+   unsigned long sec_start;
+   unsigned long sec_end;
+   const char *sym_name;   /* Full symbol name, including
+* any version */
+   unsigned long sym_start;
+   unsigned long sym_end;
+} kdb_symtab_t;
+extern int kallsyms_symbol_next(char *prefix_name, int flag);
+extern int kallsyms_symbol_complete(char *prefix_name, int max_len);
+
+extern int kdbgetsymval(const char *, kdb_symtab_t *);
+extern int kdbnearsym(unsigned long, kdb_symtab_t *);
+extern void kdbnearsym_cleanup(void);
+extern char *kdb_strdup(const char *str, gfp_t type);
+extern void kdb_symbol_print(unsigned long, const kdb_symtab_t *, unsigned int);
+
+extern void kdb_ps_suppressed(void);
+extern void kdb_ps1(const struct task_struct *p);
+extern void kdb_print_nameval(const char *name, unsigned long val);
+extern void kdb_send_sig_info(struct task_struct *p, struct siginfo *info);
+extern void kdb_meminfo_proc_show(void);
+extern char *kdb_getstr(char *, size_t, char *);
+
+/* Defines for kdb_symbol_print */
+#define KDB_SP_SPACEB  0x0001  /* Space before string */
+#define KDB_SP_SPACEA  0x0002  /* Space after string */
+#define KDB_SP_PAREN   0x0004  /* Parenthesis around string */
+#define KDB_SP_VALUE   0x0008  /* Print the value of the address */
+#define KDB_SP_SYMSIZE 0x0010  /* Print the size of the symbol */
+#define KDB_SP_NEWLINE 0x0020  /* Newline after string */
+#define KDB_SP_DEFAULT (KDB_SP_VALUE|KDB_SP_PAREN)
+
+#define KDB_TSK(cpu) (kgdb_info[cpu].task)
+#define KDB_TSKREGS(cpu) (kgdb_info[cpu].debuggerinfo)
+
+extern struct task_struct *kdb_curr_task(int);
+
+#define kdb_task_has_cpu(p) (task_curr(p))
+
+/* Simplify coexistence with NPTL */
+#define kdb_do_each_thread(g, p) do_each_thread(g, p)
+#define kdb_while_each_thread(g, p) while_each_thread(g, p)
+
+#define GFP_KDB (in_interrupt() ? GFP_ATOMIC : GFP_KERNEL)
+
+extern void *debug_kmalloc(size_t size, gfp_t flags);
+extern void debug_kfree(void *);
+extern void debug_kusage(void);
+
+extern void kdb_set_current_task(struct task_struct *);
+extern struct task_struct *kdb_current_task;
 #else /* ! CONFIG_KGDB_KDB */
 static inline __printf(1, 2) int kdb_printf(const char *fmt, ...) { return 0; }
 static inline void kdb_init(int level) {

Re: [PATCH 05/14] KDB: add more exports for supporting KDB modules

2013-03-12 Thread Mike Travis


Let me see if I can understand the concept better.  By denying
an external hardware vendor the use of KDB to support a significant
piece of proprietary hardware on Linux, I am furthering the interests
of Linux and the community how?

Looking back at the KDB sources originally posted on oss.sgi.com I
did not see any restrictions on the use of KDB.  How/why was that
restriction granted and by whom?  Was SGI, the original copyright
owner of KDB, asked or even informed of that decision?  I'm not
trying to be a lawyer here, but someone decided (perhaps wrongly)
that KDB should only be used by GPL modules.

I'm not married to this matter by any means and I will change them all
if that's what's needed for acceptance.  But I do think that placing
unnecessary roadblocks in the path of developing more capabilities
for the Linux system is a disservice to the users of
Linux and the overall Linux community.

Thanks,
Mike

On 3/12/2013 1:09 PM, Eric W. Biederman wrote:

Mike Travis  writes:


This patch adds some important KDB functions to be externally
usable by loadable KDB modules.  Note that often drivers bring
in KDB modules for debugging, and in the past KDB has not been
limited to use by GPL only modules.  This patch restores KDB
usefulness to non-GPL modules.


It is not ok to change EXPORT_SYMBOL_GPL to EXPORT_SYMBOL.

The symbols you are changing to EXPORT_SYMBOL from EXPORT_SYMBOL_GPL you
should not even be messing with if your source code is not in the main
kernel tree.

This patch is totally not ok.

I don't know what past you are referring to but you are changing symbols
that have never been exported as anything other than EXPORT_SYMBOL_GPL
to EXPORT_SYMBOL.  The past I remember is the past where kdb was not in
the kernel tree at all.

Please go back to the drawing board and come back with a solution where
you are working with the community instead of asking the rest of
us to support something you won't share.

Nacked-by: "Eric W. Biederman" 


--- linux.orig/kernel/signal.c
+++ linux/kernel/signal.c
@@ -1419,7 +1419,7 @@ out_unlock:
rcu_read_unlock();
return ret;
  }
-EXPORT_SYMBOL_GPL(kill_pid_info_as_cred);
+EXPORT_SYMBOL(kill_pid_info_as_cred);

  /*
   * kill_something_info() interprets pid in interesting ways just like kill(2).
@@ -2491,7 +2491,7 @@ out:
  }

  EXPORT_SYMBOL(recalc_sigpending);
-EXPORT_SYMBOL_GPL(dequeue_signal);
+EXPORT_SYMBOL(dequeue_signal);
  EXPORT_SYMBOL(flush_signals);
  EXPORT_SYMBOL(force_sig);
  EXPORT_SYMBOL(send_sig);
@@ -3661,4 +3661,5 @@ kdb_send_sig_info(struct task_struct *t,
else
kdb_printf("Signal %d is sent to process %d.\n", sig, t->pid);
  }
+EXPORT_SYMBOL(kdb_send_sig_info);
  #endif/* CONFIG_KGDB_KDB */

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
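
Editor's sketch: the disagreement above turns on the difference between EXPORT_SYMBOL and EXPORT_SYMBOL_GPL. The following is a toy userspace model, not the kernel's module loader; every toy_* name is hypothetical. It only illustrates the rule the reviewers describe: a GPL-only export is not resolvable by a module that does not declare a GPL-compatible license. The table entries mirror the pre-patch state shown in the quoted diffs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model of an exported-symbol table; not the kernel's real data structures. */
struct toy_symbol {
    const char *name;
    bool gpl_only; /* true models EXPORT_SYMBOL_GPL, false models EXPORT_SYMBOL */
};

/* Pre-patch state of three symbols mentioned in the thread. */
static const struct toy_symbol toy_exports[] = {
    { "recalc_sigpending", false },
    { "dequeue_signal",    true  },
    { "kdb_register",      true  },
};

/* Return true if a module with the given license status may link to `name`. */
static bool toy_can_resolve(const char *name, bool module_is_gpl_compatible)
{
    for (size_t i = 0; i < sizeof(toy_exports) / sizeof(toy_exports[0]); i++) {
        if (strcmp(toy_exports[i].name, name) != 0)
            continue;
        /* GPL-only exports are hidden from modules without a compatible license. */
        return !toy_exports[i].gpl_only || module_is_gpl_compatible;
    }
    return false; /* symbol is not exported at all */
}
```

In this model the patch under review would amount to flipping gpl_only from true to false for dequeue_signal and kdb_register, which is exactly the change the reviewers reject.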

Re: [PATCH 05/14] KDB: add more exports for supporting KDB modules

2013-03-12 Thread Mike Travis




On 3/12/2013 3:01 PM, Thomas Gleixner wrote:

On Tue, 12 Mar 2013, Mike Travis wrote:

This patch adds some important KDB functions to be externally
usable by loadable KDB modules.  Note that often drivers bring
in KDB modules for debugging, and in the past KDB has not been
limited to use by GPL only modules.  This patch restores KDB
usefulness to non-GPL modules.


Which past? We only care about Linus git tree as THE past.


-EXPORT_SYMBOL_GPL(kdb_register);
+EXPORT_SYMBOL(kdb_register);


AFAICT that function has never been a non-GPL symbol in Linus'
tree. Whatever the original out of tree kdb stuff used is totally
irrelevant.

Stop trying to resolve your company's or your company's customers' legal
issues by false claims.

The GPL is there for a reason.

Aside from that, the whole attempt to export stuff which has not been
exported before without the _GPL extension is also:

 Nacked-by: Thomas Gleixner 

Thanks,

tglx



No problem, as I mentioned I don't care nor am I trying to resolve
anything except the problems of running Linux on the UV system.
There is no hidden agenda.

I will wait though for more feedback on the other patches before
submitting a 'v2' version.

Thanks,
Mike

Re: [PATCH 05/14] KDB: add more exports for supporting KDB modules

2013-03-12 Thread Mike Travis




On 3/12/2013 3:13 PM, Greg Kroah-Hartman wrote:

On Tue, Mar 12, 2013 at 03:03:17PM -0700, Mike Travis wrote:

Let me see if I can understand the concept better.  By denying
an external hardware vendor the use of KDB to support a significant
piece of proprietary hardware on Linux, I am furthering the interests
of Linux and the community how?


Did SGI lawyers really agree to this patch?  Consider running this
by them if you have any questions as to why we are objecting to this.
If, after discussing it with them, they still are asking for this
change, please resend it, with their signed-off-by: on it showing that
they really want this change.

greg k-h



There is nobody else involved believe me.  I am just trying to do
the right thing.  This is not that big an issue as it has absolutely
no relevance to anything within that patch set.  I'm trying to
improve the overall experience of using KDB, which I've found most
helpful in the past to get around some very thorny issues, particularly
in regards to bringing up new hardware.  If blocking that usage by
non-GPL modules is what's required, then by all means I'm for it.

But understanding more of why the restriction is in place, would be
very helpful the next time I encounter it.

Thanks,
Mike

Re: [PATCH 05/14] KDB: add more exports for supporting KDB modules

2013-03-12 Thread Mike Travis



On 3/12/2013 3:39 PM, Eric W. Biederman wrote:
> Mike Travis  writes:
> 
>> Let me see if I can understand the concept better.  By denying
>> an external hardware vendor the use of KDB to support a significant
>> piece of proprietary hardware on Linux, I am furthering the interests
>> of Linux and the community how?
> 
> By ignoring interests of someone who does not cooperate with the
> community we encourage people to cooperate with the community.

I can see this point.
> 
>> Looking back at the KDB sources originally posted on oss.sgi.com I
>> did not see any restrictions on the use of KDB.  How/why was that
>> restriction granted and by whom?  Was SGI, the original copyright
>> owner of KDB, asked or even informed of that decision?  I'm not
>> trying to be a lawyer here, but someone decided (perhaps wrongly)
>> that KDB should only be used by GPL modules.
> 
> The symbols quoted below have absolutely nothing to do with KDB
> ever.  They are pieces of code that you should only use in very
> exceptional circumstances, or you risk breaking the kernel in strange
> and mysterious ways.

Yes, those below were indeed a mistake on my part.  Thanks for catching that.
> 
> Beyond that there are modules with GPL compatible licenses.  That is the
> only kind of module that the kernel license allows.

Okay.
> 
>> I'm not married to this matter by any means and I will change them all
>> if that's what's needed for acceptance.  But I do think that placing
>> unnecessary roadblocks in the path of developing more capabilities
>> for the Linux system does a disservice to the users of
>> Linux and the overall Linux community.
> 
> A capability that no one else can use, and that generates support
> requests that can not be supported is not developing more capabilities
> for the Linux system.  It is denying those of us who ask for repayment
> in code, our compensation.  It is theft.

Not sure I've ever looked at it this way, but again I can see your point.
> 
> Eric

Thanks for the meaningful feedback Eric.

Mike

> 
>>>> --- linux.orig/kernel/signal.c
>>>> +++ linux/kernel/signal.c
>>>> @@ -1419,7 +1419,7 @@ out_unlock:
>>>>rcu_read_unlock();
>>>>return ret;
>>>>}
>>>> -EXPORT_SYMBOL_GPL(kill_pid_info_as_cred);
>>>> +EXPORT_SYMBOL(kill_pid_info_as_cred);
>>>>
>>>>/*
>>>> * kill_something_info() interprets pid in interesting ways just like 
>>>> kill(2).
>>>> @@ -2491,7 +2491,7 @@ out:
>>>>}
>>>>
>>>>EXPORT_SYMBOL(recalc_sigpending);
>>>> -EXPORT_SYMBOL_GPL(dequeue_signal);
>>>> +EXPORT_SYMBOL(dequeue_signal);
>>>>EXPORT_SYMBOL(flush_signals);
>>>>EXPORT_SYMBOL(force_sig);
>>>>EXPORT_SYMBOL(send_sig);
>>>> @@ -3661,4 +3661,5 @@ kdb_send_sig_info(struct task_struct *t,
>>>>else
>>>>kdb_printf("Signal %d is sent to process %d.\n", sig, 
>>>> t->pid);
>>>>}
>>>> +EXPORT_SYMBOL(kdb_send_sig_info);
>>>>#endif  /* CONFIG_KGDB_KDB */

Re: [PATCH 2/6] percpu: Change Kconfig to HAVE_SETUP_PER_CPU_AREA linux-2.6.git

2008-01-30 Thread Mike Travis

Ingo Molnar wrote:
> * [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> 
>> Change:
>>  config ARCH_SETS_UP_PER_CPU_AREA
>> to:
>>  config HAVE_SETUP_PER_CPU_AREA
> 
> undocumented change:
> 
>>  config ARCH_NO_VIRT_TO_BUS
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -380,6 +380,8 @@ static void __init setup_per_cpu_areas(v
>>  
>>  /* Copy section for each CPU (we discard the original) */
>>  size = ALIGN(PERCPU_ENOUGH_ROOM, PAGE_SIZE);
>> +printk(KERN_INFO
>> +"PERCPU: Allocating %lu bytes of per cpu data (main)\n", size);
>>  ptr = alloc_bootmem_pages(size * nr_possible_cpus);
> 
> but looks fine to me.
> 
>   Ingo

Sorry, I should have noted this.  The primary reason I put this in, is
that if the HAVE_SETUP_PER_CPU_AREA is not set when it should be, then
the incorrect (generic) setup_per_cpu_areas() is used and weird things
happen later on.  The above line documents that PERCPU has been allocated
by init/main.c version of this function in the startup messages.
(Since it's a static function, there is no "duplicate label" error in
the linker.)

Thanks,
Mike
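
Editor's sketch: the quoted hunk rounds the per-cpu area size up to a page boundary and then prints it. The userspace approximation below assumes a 4096-byte page; the TOY_* macros and toy_percpu_size() are hypothetical stand-ins for the kernel's ALIGN(), PAGE_SIZE and the code in init/main.c, and the input value in the usage note is made up.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_PAGE_SIZE 4096UL /* assumed page size */

/* Round x up to the next multiple of a; a must be a power of two. */
#define TOY_ALIGN(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* Compute the aligned per-cpu size and log it the way the patch does. */
static unsigned long toy_percpu_size(unsigned long enough_room)
{
    unsigned long size = TOY_ALIGN(enough_room, TOY_PAGE_SIZE);

    /* The "(main)" tag identifies the generic init/main.c code path. */
    printf("PERCPU: Allocating %lu bytes of per cpu data (main)\n", size);
    return size;
}
```

With a hypothetical request of 40000 bytes this reports 40960 bytes, i.e. ten 4 KiB pages. Seeing the "(main)" tag in the boot log is what tells you the generic setup_per_cpu_areas() ran rather than an architecture-specific one.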

Re: [PATCH] percpu: fix DEBUG_PREEMPT per_cpu checking

2008-02-11 Thread Mike Travis

Thanks Hugh for catching this.  I've added it to my test code base
and it works fine for x86_64...

Reviewed-by: Mike Travis <[EMAIL PROTECTED]>

Hugh Dickins wrote:
> Recent percpu changes have broken CONFIG_DEBUG_PREEMPT's per_cpu checking
> on several architectures.  On s390, sparc64 and x86 it's been weakened to
> not checking at all; whereas on powerpc64 it's become too strict, issuing
> warnings from __raw_get_cpu_var in io_schedule and init_timer for example.
> 
> Fix this by weakening powerpc's __my_cpu_offset to use the non-checking
> local_paca instead of get_paca (which itself contains such a check);
> and strengthening the generic my_cpu_offset to go the old slow way via
> smp_processor_id when CONFIG_DEBUG_PREEMPT (debug_smp_processor_id is
> where all the knowledge of what's correct when lives).
> 
> Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
> ---
> ia64 would be in the first group too, but does not support DEBUG_PREEMPT?
> 
>  include/asm-generic/percpu.h |2 ++
>  include/asm-powerpc/percpu.h |2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> --- 2.6.24-git18/include/asm-generic/percpu.h 2008-02-08 11:31:30.0 
> +
> +++ linux/include/asm-generic/percpu.h2008-02-08 12:27:08.0 
> +
> @@ -32,6 +32,8 @@ extern unsigned long __per_cpu_offset[NR
>   */
>  #ifndef __my_cpu_offset
>  #define __my_cpu_offset per_cpu_offset(raw_smp_processor_id())
> +#endif
> +#ifdef CONFIG_DEBUG_PREEMPT
>  #define my_cpu_offset per_cpu_offset(smp_processor_id())
>  #else
>  #define my_cpu_offset __my_cpu_offset
> --- 2.6.24-git18/include/asm-powerpc/percpu.h 2008-02-08 11:31:31.0 
> +
> +++ linux/include/asm-powerpc/percpu.h2008-02-08 12:29:17.0 
> +
> @@ -13,7 +13,7 @@
>  #include 
>  
>  #define __per_cpu_offset(cpu) (paca[cpu].data_offset)
> -#define __my_cpu_offset get_paca()->data_offset
> +#define __my_cpu_offset local_paca->data_offset
>  #define per_cpu_offset(x) (__per_cpu_offset(x))
>  
>  #endif /* CONFIG_SMP */
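
Editor's sketch: Hugh's fix separates a non-checking accessor (__my_cpu_offset) from one that goes through the checked smp_processor_id() when CONFIG_DEBUG_PREEMPT is set. The model below is a userspace approximation with hypothetical toy_* names; the "warning" is just a counter, and TOY_DEBUG_PREEMPT is defined unconditionally so the checking path is exercised.

```c
#include <assert.h>

#define TOY_DEBUG_PREEMPT 1 /* stands in for CONFIG_DEBUG_PREEMPT=y */

static int toy_preemptible;     /* 1 models running with preemption enabled */
static int toy_warnings;        /* number of debug warnings raised so far */
static int toy_current_cpu = 2; /* hypothetical CPU the caller runs on */
static unsigned long toy_per_cpu_offset[4] = { 0x0, 0x1000, 0x2000, 0x3000 };

/* Non-checking accessor, in the spirit of raw_smp_processor_id(). */
static int toy_raw_processor_id(void)
{
    return toy_current_cpu;
}

/* Checking accessor: complains when called from preemptible context. */
static int toy_checked_processor_id(void)
{
    if (toy_preemptible)
        toy_warnings++;
    return toy_current_cpu;
}

/* Always the cheap, non-checking form. */
#define toy_raw_my_cpu_offset (toy_per_cpu_offset[toy_raw_processor_id()])

/* The checked form is used only when the debug option is enabled. */
#ifdef TOY_DEBUG_PREEMPT
#define toy_my_cpu_offset (toy_per_cpu_offset[toy_checked_processor_id()])
#else
#define toy_my_cpu_offset toy_raw_my_cpu_offset
#endif
```

Both macros yield the same offset; the only difference is that the checked form records a warning when the caller could be migrated to another CPU in mid-access.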


Re: 2.6.24 git2/mm1: cpu_to_node mapping to non-existant nodes causing boot failure

2008-02-13 Thread Mike Travis

Mel Gorman wrote:
> On (03/02/08 17:16), Andrew Morton didst pronounce:
>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24/2.6.24-mm1/
>>
> 
> bl6-13 (4-way x86_64 machine) from test.kernel.org is failing to boot recent
> -mm and mainline trees. I noticed it when testing -mm before rebasing other
> patches but the oops on mainline looks the same. The full console log is
> below but the important difference between a working and non-working kernel
> is the following
> 
> -PERCPU: Allocating 62512 bytes of per cpu data
> -Built 1 zonelists in Node order, mobility grouping on.  Total pages: 255875
> +PERCPU: Allocating 65560 bytes of per cpu data
> +cpu with no node 2, num_online_nodes 1
> +cpu with no node 3, num_online_nodes 1
> +Built 1 zonelists in Node order, mobility grouping on.  Total pages:
> 251257
> 
> "cpu with no node 2" is actually saying that cpu 2 has no node and the
> message is just misleading. The number of online nodes and cpu mappings
> are not adding up as I got this from a debugging patch

I'll take a closer look though I've not been able to duplicate your
error yet.  It does appear from the message text that the code is
out-of-date.  The latest "setup_per_cpu_areas()" should say:

   "cpu %d has no node, num_online_nodes %d\n",
i, num_online_nodes());

There are a number of backed up patches in the queue.  I'm resubmitting
the whole set re-based on 2.6.25-rc1 shortly.  (I don't know though, that
any will address this problem.)

Thanks,
Mike

> 
> Online nodes
>  o 0
> CPU <-> node mappings (cpu_to_node)
>  o CPU 0 -> 0
>  o CPU 1 -> 0
>  o CPU 2 -> 1
>  o CPU 3 -> 1
> 
> As the failing code in __alloc_pages() is;
> 
> restart:
> z = zonelist->zones;  /* the list of zones suitable for gfp_mask */
> if (unlikely(*z == NULL)) {
> 
> it implies that an attempt is being made to use an uninitialised zonelist.
> 
> If I bodge cpu_to_node() to return 0, the machine boots but I didn't
> see an obvious candidate in origin.patch for the root-cause when I looked
> around. I'll bisect this in the morning if this is not a known problem
> and no one suggests a possibility.
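
Editor's sketch: the inconsistency Mel reports (CPUs 2 and 3 mapped to node 1 while only node 0 is online) can be expressed as a simple consistency check. This is a userspace illustration with hypothetical toy_* names; it is not the kernel's debugging patch, and the online-node set is modelled as a plain bitmask.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

#define TOY_NR_CPUS 4

/* cpu_to_node mapping as reported in the message above. */
static const int toy_reported_map[TOY_NR_CPUS] = { 0, 0, 1, 1 };

/* Count CPUs whose node is not set in the online-node bitmask. */
static int toy_count_cpus_without_node(const int cpu_to_node[], int nr_cpus,
                                       unsigned long online_nodes)
{
    const int max_node = (int)(sizeof(unsigned long) * CHAR_BIT);
    int bad = 0;

    for (int cpu = 0; cpu < nr_cpus; cpu++) {
        int node = cpu_to_node[cpu];

        /* Out-of-range or offline nodes both count as "no usable node". */
        if (node < 0 || node >= max_node || !(online_nodes & (1UL << node))) {
            printf("cpu %d maps to offline node %d\n", cpu, node);
            bad++;
        }
    }
    return bad;
}
```

With only node 0 online (mask 0x1) the check flags CPUs 2 and 3, which matches the two "cpu with no node" lines in the failing boot log.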

The 
> 
> Linux version 2.6.24-mm1-autokern1 ([EMAIL PROTECTED]) (gcc version 4.1.1 
> 20060525 (Red Hat 4.1.1-1)) #1 SMP Wed Feb 13 08:15:47 CST 2008
> Command line: ro root=/dev/VolGroup00/LogVol00 console=tty0 
> console=ttyS1,19200 selinux=no autobench_args: 
> root=/dev/mapper/VolGroup00-LogVol00 ABAT:1202913709 
> earlyprintk=serial,ttyS1,19200
> BIOS-provided physical RAM map:
>  BIOS-e820:  - 0009d400 (usable)
>  BIOS-e820: 0009d400 - 000a (reserved)
>  BIOS-e820: 000e - 0010 (reserved)
>  BIOS-e820: 0010 - 3ffcddc0 (usable)
>  BIOS-e820: 3ffcddc0 - 3ffd (ACPI data)
>  BIOS-e820: 3ffd - 4000 (reserved)
>  BIOS-e820: fec0 - 0001 (reserved)
> console [earlyser0] enabled
> end_pfn_map = 1048576
> kernel direct mapping tables up to 1 @ 8000-d000
> DMI 2.3 present.
> ACPI: RSDP 000FDFC0, 0014 (r0 IBM   )
> ACPI: RSDT 3FFCFF80, 0034 (r1 IBMSERBLADE 1000 IBM  45444F43)
> ACPI: FACP 3FFCFEC0, 0084 (r2 IBMSERBLADE 1000 IBM  45444F43)
> ACPI: DSDT 3FFCDDC0, 1EA6 (r1 IBMSERBLADE 1000 INTL  2002025)
> ACPI: FACS 3FFCFCC0, 0040
> ACPI: APIC 3FFCFE00, 009C (r1 IBMSERBLADE 1000 IBM  45444F43)
> ACPI: SRAT 3FFCFD40, 0098 (r1 IBMSERBLADE 1000 IBM  45444F43)
> ACPI: HPET 3FFCFD00, 0038 (r1 IBMSERBLADE 1000 IBM  45444F43)
> SRAT: PXM 0 -> APIC 0 -> Node 0
> SRAT: PXM 0 -> APIC 1 -> Node 0
> SRAT: PXM 1 -> APIC 2 -> Node 1
> SRAT: PXM 1 -> APIC 3 -> Node 1
> SRAT: Node 0 PXM 0 0-4000
> Bootmem setup node 0 -3ffcd000
> early res: 0 [0-fff] BIOS data page
> early res: 1 [6000-7fff] SMP_TRAMPOLINE
> early res: 2 [20-9e87ef] TEXT DATA BSS
> early res: 3 [37e5f000-37fef981] RAMDISK
> early res: 4 [9d400-a03ff] EBDA
> early res: 5 [8000-afff] PGTABLE
> Zone PFN ranges:
>   DMA 0 -> 4096
>   DMA324096 ->  1048576
>   Normal1048576 ->  1048576
> Movable zone start PFN for each node
> early_node_map[2] active PFN ranges
> 0:0 ->  157
> 0:  256 ->   262093
> Detected use of extended apic ids on hypertransport bus
> Detected use of extended apic ids on hypertransport bus
> ACPI: PM-Timer IO Port: 0x2208
> ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
> Processor #0 (Bootup-CPU)
> ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
> Processor #1
> ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
> Processor #2
> ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
> Processor #3
> ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
> ACPI: IOAPIC (id[0x0e]

Re: [PATCH 2/4] acpi: change cpufreq tables to per_cpu variables

2008-02-13 Thread Mike Travis

Andrew Morton wrote:
> On Fri, 08 Feb 2008 15:37:40 -0800
> Mike Travis <[EMAIL PROTECTED]> wrote:
> 
>> Change cpufreq tables from arrays to per_cpu variables in
>> drivers/acpi/processor_thermal.c
>>
>> Based on linux-2.6.git + x86.git
> 
> I fixed a bunch of rejects in "[PATCH 1/4] cpufreq: change cpu freq tables
> to per_cpu variables" and it compiles OK.  But this one was beyond my
> should-i-repair-it threshold, sorry.

Should I rebase all the pending patches on 2.6.25-rc1 or 2.6.24-mm1
(or some other combination)?

Thanks,
Mike

Re: [RFC] bitmap relative operator for mempolicy extensions

2008-02-14 Thread Mike Travis

Christoph Lameter wrote:
> On Thu, 14 Feb 2008, Andi Kleen wrote:
> 
>> You're saying the kernel should use these relative masks internally?
> 
> There are just some thoughts about this. I did not have time to look into the
> details. Mike?

There are a few places where the entire cpumask is not needed.  For
example, in the area of core siblings on a node.  There's a limit
to how many cores/threads can be on a node and the full 4k cpumask
is not needed.  How this pertains to this new functionality I'm
not sure yet.
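
Editor's sketch: to put a number on the size argument, assume "4k cpumask" means a bitmap sized for 4096 CPUs and that a node holds a hypothetical 16 hardware threads. The helper below (toy_bitmap_bytes() is a made-up name) computes the storage a bitmap of a given width needs, rounded up to whole unsigned longs as kernel bitmaps are.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bytes needed for a bitmap of nbits bits, rounded up to whole unsigned longs. */
static size_t toy_bitmap_bytes(size_t nbits)
{
    size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
    size_t longs = (nbits + bits_per_long - 1) / bits_per_long;

    return longs * sizeof(unsigned long);
}
```

A full 4096-bit mask costs 512 bytes per instance, while a 16-bit sibling mask fits in a single unsigned long, which is the saving being alluded to.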

>  
>> That means it would be impossible to run workloads that use the complete
>> machine because you couldn't represent all nodes.
> 
> Not sure how they are addressing this.


Re: 2.6.24 git2/mm1: cpu_to_node mapping to non-existant nodes causing boot failure

2008-02-14 Thread Mike Travis

Mel Gorman wrote:
> On (13/02/08 10:45), Mike Travis didst pronounce:
>> Mel Gorman wrote:
>>> On (03/02/08 17:16), Andrew Morton didst pronounce:
>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24/2.6.24-mm1/
>>>>
>>> bl6-13 (4-way x86_64 machine) from test.kernel.org is failing to boot recent
>>> -mm and mainline trees. I noticed it when testing -mm before rebasing other
>>> patches but the oops on mainline looks the same. The full console log is
>>> below but the important difference between a working and non-working kernel
>>> is the following
>>>
>>> -PERCPU: Allocating 62512 bytes of per cpu data
>>> -Built 1 zonelists in Node order, mobility grouping on.  Total pages: 255875
>>> +PERCPU: Allocating 65560 bytes of per cpu data
>>> +cpu with no node 2, num_online_nodes 1
>>> +cpu with no node 3, num_online_nodes 1
>>> +Built 1 zonelists in Node order, mobility grouping on.  Total pages:
>>> 251257
>>>
>>> "cpu with no node 2" is actually saying that cpu 2 has no node and the
>>> message is just misleading. The number of online nodes and cpu mappings
>>> are not adding up as I got this from a debugging patch
>> I'll take a closer look though I've not been able to duplicate your
>> error yet.  It does appear from the message text that the code is
>> out-of-date.  The latest "setup_per_cpu_areas()" should say:
>>
>>"cpu %d has no node, num_online_nodes %d\n",
>> i, num_online_nodes());
>>
>> There are a number of backed up patches in the queue.  I'm resubmitting
>> the whole set re-based on 2.6.25-rc1 shortly.  (I don't know though, that
>> any will address this problem.)
>>
> 
> According to git-bisect, the problem patch is below. It doesn't back out
> cleanly so I haven't verified for sure the bisect is correct yet.

This might make sense.  This code is in preparation for the extended
apic's available on the new processors.  I've tested the code with
our simulator (with no errors) and I'm setting up to test on a real
machine that has multiple numa nodes.  I wonder if maybe BIOS is not
providing correct node data, or the ACPI parsing is in error?  You
might try adding "apic=debug" to the boot command line.

For the short term, we can remove this patch if it's causing the
problem.  A more complete patch will be available soon that contains
the entire set of x2apic changes.

Thanks,
Mike
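
Editor's sketch: the bisected commit quoted below widens APIC ID storage from u8 to u16. The thread does not establish the root cause, so the following only illustrates two generic hazards of such a widening, under assumed sentinel values (0xFF for the 8-bit table, 0xFFFF for the 16-bit one). All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_BAD_APICID_8  0xFFu   /* assumed sentinel for a u8 table */
#define TOY_BAD_APICID_16 0xFFFFu /* assumed sentinel for a u16 table */

/* A u8 slot cannot hold IDs above 255: the value is truncated on store. */
static uint8_t toy_store_u8(unsigned int apicid)
{
    return (uint8_t)apicid;
}

/* A u16 slot keeps the same ID intact. */
static uint16_t toy_store_u16(unsigned int apicid)
{
    return (uint16_t)apicid;
}

/* Mixed widths: an "unset" u8 entry never equals the 16-bit sentinel. */
static bool toy_u8_entry_matches_16bit_sentinel(uint8_t entry)
{
    return entry == TOY_BAD_APICID_16;
}
```

The second hazard is the relevant one for partial conversions: if any table or comparison is left at the old width, "unset" entries can silently stop being recognised as unset.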
 
> 
> commit ef97001f3d869d7cc1956e0cc0d89e514e3f7db0
> Author: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
> Date:   Wed Jan 30 13:33:10 2008 +0100
> 
> x86: change size of APICIDs from u8 to u16
> 
> Change the size of APICIDs from u8 to u16.  This partially
> supports the new x2apic mode that will be present on future
> processor chips. (Chips actually support 32-bit APICIDs, but that
> change is more intrusive. Supporting 16-bit is sufficient for now).
> 
> Signed-off-by: Jack Steiner <[EMAIL PROTECTED]>
>     
> I've included just the partial change from u8 to u16 apicids.  The
> remaining x2apic changes will be in a separate patch.
> 
> In addition, the fake_node_to_pxm_map[] and fake_apicid_to_node[]
> tables have been moved from local data to the __initdata section
> reducing stack pressure when MAX_NUMNODES and MAX_LOCAL_APIC are
> increased in size.
> 
> Signed-off-by: Mike Travis <[EMAIL PROTECTED]>
> Reviewed-by: Christoph Lameter <[EMAIL PROTECTED]>
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
> 
> diff --git a/arch/x86/kernel/genapic_64.c b/arch/x86/kernel/genapic_64.c
> index ce703e2..ac2b78f 100644
> --- a/arch/x86/kernel/genapic_64.c
> +++ b/arch/x86/kernel/genapic_64.c
> @@ -32,10 +32,10 @@
>   * array during this time.  Is it zeroed when the per_cpu
>   * data area is removed.
>   */
> -u8 x86_cpu_to_apicid_init[NR_CPUS] __initdata
> +u16 x86_cpu_to_apicid_init[NR_CPUS] __initdata
>   = { [0 ... NR_CPUS-1] = BAD_APICID };
>  void *x86_cpu_to_apicid_ptr;
> -DEFINE_PER_CPU(u8, x86_cpu_to_apicid) = BAD_APICID;
> +DEFINE_PER_CPU(u16, x86_cpu_to_apicid) = BAD_APICID;
>  EXPORT_PER_CPU_SYMBOL(x86_cpu_to_apicid);
>  
>  struct genapic __read_mostly *genapic = &apic_flat;
> diff --git a/arch/x86/kernel/mpparse_64.c b/arch/x86/kernel/mpparse_64.c
> index ef4aab1..17d21e5 100644
> --- a/arch/x86/kernel/mpparse_64.c
> +++ b/arch/x86/kernel/mpparse_64.c
> @@ -6

Re: 2.6.24 git2/mm1: cpu_to_node mapping to non-existant nodes causing boot failure

2008-02-15 Thread Mike Travis

Mel Gorman wrote:
> On (14/02/08 12:41), Mike Travis didst pronounce:
>> Mel Gorman wrote:
>>> On (13/02/08 10:45), Mike Travis didst pronounce:
>>>> Mel Gorman wrote:
>>>>> On (03/02/08 17:16), Andrew Morton didst pronounce:
>>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24/2.6.24-mm1/
>>>>>>
>>>>> bl6-13 (4-way x86_64 machine) from test.kernel.org is failing to boot 
>>>>> recent
>>>>> -mm and mainline trees. I noticed it when testing -mm before rebasing 
>>>>> other
>>>>> patches but the oops on mainline looks the same. The full console log is
>>>>> below but the important difference between a working and non-working 
>>>>> kernel
>>>>> is the following
>>>>>
>>>>> -PERCPU: Allocating 62512 bytes of per cpu data
>>>>> -Built 1 zonelists in Node order, mobility grouping on.  Total pages: 
>>>>> 255875
>>>>> +PERCPU: Allocating 65560 bytes of per cpu data
>>>>> +cpu with no node 2, num_online_nodes 1
>>>>> +cpu with no node 3, num_online_nodes 1
>>>>> +Built 1 zonelists in Node order, mobility grouping on.  Total pages:
>>>>> 251257
>>>>>
>>>>> "cpu with no node 2" is actually saying that cpu 2 has no node and the
>>>>> message is just misleading. The number of online nodes and cpu mappings
>>>>> are not adding up as I got this from a debugging patch
>>>> I'll take a closer look though I've not been able to duplicate your
>>>> error yet.  It does appear from the message text that the code is
>>>> out-of-date.  The latest "setup_per_cpu_areas()" should say:
>>>>
>>>>"cpu %d has no node, num_online_nodes %d\n",
>>>> i, num_online_nodes());
>>>>
>>>> There are a number of backed up patches in the queue.  I'm resubmitting
>>>> the whole set re-based on 2.6.25-rc1 shortly.  (I don't know though, that
>>>> any will address this problem.)
>>>>
>>> According to git-bisect, the problem patch is below. It doesn't back out
>>> cleanly so I haven't verified for sure the bisect is correct yet.
>> This might make sense.  This code is in preparation for the extended
>> apic's available on the new processors.  I've tested the code with
>> our simulator (with no errors) and I'm setting up to test on a real
>> machine that has multiple numa nodes.  I wonder if maybe BIOS is not
>> providing correct node data, or the ACPI parsing is in error?  You
>> might try adding "apic=debug" to the boot command line.
>>
> 
> I tried this, but the dmesg complained about a malformed option. I'll
> check out why tomorrow but it didn't appear particularly helpful.
> 
>> For the short term, we can remove this patch if it's causing the
>> problem.  A more complete patch will be available soon that contains
>> the entire set of x2apic changes.
>>
> 
> If you send me patches to apply on top of 2.6.25-rc1, I'll give them a spin
> on the machine in question. Reverting didn't work out very well as there are
> too many collisions with patches that were applied later. I eventually got
> the machine booting but it only succeeds because it only brings up one core
> on each processor.  The patch, which is pretty brain damaged is below in case
> it helps you guess what the real problem is. dmesg logs are attached of the
> vanilla failure with acpi=debug and the log with the patch applied showing
> "__cpu_up: bad cpu 1" and "__cpu_up: bad cpu3" (i.e. the second cores of
> each machine).
> 

Thanks Mel.  I'm heading up to MV today to debug on the NUMA machine.

-Mike
> 
> diff -ru linux-2.6/arch/x86/kernel/genapic_64.c 
> linux-2.6-working/arch/x86/kernel/genapic_64.c
> --- linux-2.6/arch/x86/kernel/genapic_64.c2008-02-14 16:32:55.0 
> -0600
> +++ linux-2.6-working/arch/x86/kernel/genapic_64.c2008-02-14 
> 15:46:18.0 -0600
> @@ -25,10 +25,10 @@
>  #endif
>  
>  /* which logical CPU number maps to which CPU (physical APIC ID) */
> -u16 x86_cpu_to_apicid_init[NR_CPUS] __initdata
> +u8 x86_cpu_to_apicid_init[NR_CPUS] __initdata
>   = { [0 ... NR_CPUS-1] = BAD_APICID };
>  void *x86_cpu_to_apicid_early_ptr;
> -DEFINE_PER_CPU(u16, x86_cpu_to_apicid) = BAD_APICID;
> +DEFINE_PER_CPU(u8, x86_cpu_to_apicid) = BAD_APICID;
>  E

[PATCH 2/2] x86/UV: Add call to KGDB/KDB from NMI handler

2013-10-02 Thread Mike Travis

This patch restores the capability to enter KDB (and KGDB) from the UV NMI
handler.  This is needed because the UV system console is not capable of
sending the 'break' signal to the serial console port.  It is also useful
when the kernel is hung in such a way that it isn't responding to normal
external I/O, so sending 'g' to sysrq-trigger does not work either.

Another benefit of the external NMI command is that all the cpus receive
the NMI signal at roughly the same time so they are more closely aligned
timewise.

It utilizes the newly added kgdb_nmicallin function to gain entry
to KGDB/KDB by the master.  The slaves still enter via the standard
kgdb_nmicallback function.  It also uses the new 'send_ready' pointer
to tell KGDB/KDB to signal the slaves when to proceed into the KGDB
slave loop.

It is enabled when the nmi action is set to "kdb" and the kernel is
built with CONFIG_KGDB_KDB enabled.  Note that if kgdb is connected that
interface will be used instead.

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/platform/uv/uv_nmi.c |   47 +-
 1 file changed, 46 insertions(+), 1 deletion(-)

--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -21,7 +21,9 @@
 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -32,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -153,8 +156,9 @@ module_param_named(retry_count, uv_nmi_r
  *  "dump" - dump process stack for each cpu
  *  "ips"  - dump IP info for each cpu
  *  "kdump"- do crash dump
+ *  "kdb"  - enter KDB/KGDB (default)
  */
-static char uv_nmi_action[8] = "dump";
+static char uv_nmi_action[8] = "kdb";
 module_param_string(action, uv_nmi_action, sizeof(uv_nmi_action), 0644);
 
 static inline bool uv_nmi_action_is(const char *action)
@@ -540,6 +544,43 @@ static inline void uv_nmi_kdump(int cpu,
 }
 #endif /* !CONFIG_KEXEC */
 
+#ifdef CONFIG_KGDB_KDB
+/* Call KDB from NMI handler */
+static void uv_call_kdb(int cpu, struct pt_regs *regs, int master)
+{
+   int ret;
+
+   if (master) {
+   /* call KGDB NMI handler as MASTER */
+   ret = kgdb_nmicallin(cpu, X86_TRAP_NMI, regs,
+   &uv_nmi_slave_continue);
+   if (ret) {
+   pr_alert("KDB returned error, is kgdboc set?\n");
+   atomic_set(&uv_nmi_slave_continue, SLAVE_EXIT);
+   }
+   } else {
+   /* wait for KGDB signal that it's ready for slaves to enter */
+   int sig;
+
+   do {
+   cpu_relax();
+   sig = atomic_read(&uv_nmi_slave_continue);
+   } while (!sig);
+
+   /* call KGDB as slave */
+   if (sig == SLAVE_CONTINUE)
+   kgdb_nmicallback(cpu, regs);
+   }
+   uv_nmi_sync_exit(master);
+}
+
+#else /* !CONFIG_KGDB_KDB */
+static inline void uv_call_kdb(int cpu, struct pt_regs *regs, int master)
+{
+   pr_err("UV: NMI error: KGDB/KDB is not enabled in this kernel\n");
+}
+#endif /* !CONFIG_KGDB_KDB */
+
 /*
  * UV NMI handler
  */
@@ -576,6 +617,10 @@ static int uv_handle_nmi(unsigned int re
if (uv_nmi_action_is("ips") || uv_nmi_action_is("dump"))
uv_nmi_dump_state(cpu, regs, master);
 
+   /* Call KDB if enabled */
+   else if (uv_nmi_action_is("kdb"))
+   uv_call_kdb(cpu, regs, master);
+
/* Clear per_cpu "in nmi" flag */
atomic_set(&uv_cpu_nmi.state, UV_NMI_STATE_OUT);
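
Editor's sketch: the master/slave handshake in uv_call_kdb() can be modelled in userspace with a single atomic flag. This is a single-threaded, step-by-step approximation with hypothetical toy_* names. The numeric values of the three states are assumptions; the patch only shows that zero means "keep waiting" and that KGDB writes 1 to release the slaves.

```c
#include <assert.h>
#include <stdatomic.h>

/* Assumed encoding of the shared flag. */
enum { TOY_SLAVE_CLEAR = 0, TOY_SLAVE_CONTINUE = 1, TOY_SLAVE_EXIT = 2 };

static atomic_int toy_slave_continue = TOY_SLAVE_CLEAR;

/* Master side: publish the outcome of the debugger entry attempt. */
static void toy_master_signal(int entered_ok)
{
    atomic_store(&toy_slave_continue,
                 entered_ok ? TOY_SLAVE_CONTINUE : TOY_SLAVE_EXIT);
}

/*
 * Slave side, one polling step: returns -1 while still waiting, 1 if the
 * slave should enter the debugger, 0 if it should skip it and exit.
 */
static int toy_slave_poll(void)
{
    int sig = atomic_load(&toy_slave_continue);

    if (sig == TOY_SLAVE_CLEAR)
        return -1;
    return sig == TOY_SLAVE_CONTINUE;
}
```

In the real handler the slave spins with cpu_relax() between polls; the point of the flag is that a slave never enters KGDB before the master holds the debugger lock, and never waits forever if the master's entry fails.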
 


[PATCH 0/2] Add back capability of calling KGDB/KDB via UV NMI call.

2013-10-02 Thread Mike Travis


The following two patches add the capability of calling KGDB/KDB after
receiving the NMI signal from the UV system 'power nmi' command.  This
is mainly required because the system console on UV cannot send the
break signal so when the system I/O is not working, the power nmi
command from the CMC is the only method to interrupt the system.


[PATCH 1/2] KGDB/KDB: add support for external NMI handler to call KGDB/KDB.

2013-10-02 Thread Mike Travis

This patch adds a kgdb_nmicallin() interface that can be used by
external NMI handlers to call the KGDB/KDB handler.  The primary need
for this is for those types of NMI interrupts where all the CPUs
have already received the NMI signal.  Therefore no send_IPI(NMI)
is required, and in fact it would cause a second, unhandled NMI to occur,
which generates the "Dazed and confused" messages.

Since all the CPUs are getting the NMI at roughly the same time, it's not
guaranteed that the first CPU that hits the NMI handler will manage to
enter KGDB and set the dbg_master_lock before the slaves start entering.
The new argument "send_ready" was added for KGDB to signal the NMI handler
to release the slave CPUs for entry into KGDB.
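
Editor's sketch: the decision described in the two paragraphs above (release already-waiting slaves via send_ready instead of sending another NMI IPI) reduces to a small branch, mirroring the debug_core.c hunk in this patch. The enum names and toy_master_roundup() are hypothetical; the function only returns which action would be taken.

```c
#include <assert.h>

enum toy_reason { TOY_REASON_OTHER = 0, TOY_REASON_SYSTEM_NMI = 1 };
enum toy_action {
    TOY_ACTION_NONE,
    TOY_ACTION_SET_SEND_READY,
    TOY_ACTION_ROUNDUP
};

/* Decide how the master brings the other CPUs into the debugger. */
static enum toy_action toy_master_roundup(enum toy_reason reason,
                                          int single_step, int do_roundup)
{
    /* External NMI: every CPU already took the NMI, so just release them. */
    if (reason == TOY_REASON_SYSTEM_NMI)
        return TOY_ACTION_SET_SEND_READY;

    /* Otherwise fall back to the usual IPI roundup when permitted. */
    if (!single_step && do_roundup)
        return TOY_ACTION_ROUNDUP;

    return TOY_ACTION_NONE;
}
```

Skipping the roundup in the SYSTEM_NMI case is what avoids the second, unclaimed NMI on each CPU.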

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
Acked-by: Jason Wessel 
---
v3: make entry into SYSTEM_NMI section more specific
---
 include/linux/kdb.h |1 +
 include/linux/kgdb.h|1 +
 kernel/debug/debug_core.c   |   30 +-
 kernel/debug/debug_core.h   |1 +
 kernel/debug/kdb/kdb_debugger.c |5 -
 kernel/debug/kdb/kdb_main.c |3 +++
 6 files changed, 39 insertions(+), 2 deletions(-)

--- linux.orig/include/linux/kdb.h
+++ linux/include/linux/kdb.h
@@ -109,6 +109,7 @@ typedef enum {
KDB_REASON_RECURSE, /* Recursive entry to kdb;
 * regs probably valid */
KDB_REASON_SSTEP,   /* Single Step trap. - regs valid */
+   KDB_REASON_SYSTEM_NMI,  /* In NMI due to SYSTEM cmd; regs valid */
 } kdb_reason_t;
 
 extern int kdb_trap_printk;
--- linux.orig/include/linux/kgdb.h
+++ linux/include/linux/kgdb.h
@@ -310,6 +310,7 @@ extern int
 kgdb_handle_exception(int ex_vector, int signo, int err_code,
  struct pt_regs *regs);
 extern int kgdb_nmicallback(int cpu, void *regs);
+extern int kgdb_nmicallin(int cpu, int trapnr, void *regs, atomic_t *snd_rdy);
 extern void gdbstub_exit(int status);
 
 extern int kgdb_single_step;
--- linux.orig/kernel/debug/debug_core.c
+++ linux/kernel/debug/debug_core.c
@@ -575,8 +575,12 @@ return_normal:
raw_spin_lock(&dbg_slave_lock);
 
 #ifdef CONFIG_SMP
+   /* If SYSTEM_NMI, slaves are already waiting */
+   if (ks->err_code == KDB_REASON_SYSTEM_NMI)
+   atomic_set(ks->send_ready, 1);
+
/* Signal the other CPUs to enter kgdb_wait() */
-   if ((!kgdb_single_step) && kgdb_do_roundup)
+   else if ((!kgdb_single_step) && kgdb_do_roundup)
kgdb_roundup_cpus(flags);
 #endif
 
@@ -729,6 +733,30 @@ int kgdb_nmicallback(int cpu, void *regs
return 0;
}
 #endif
+   return 1;
+}
+
+int kgdb_nmicallin(int cpu, int trapnr, void *regs, atomic_t *send_ready)
+{
+#ifdef CONFIG_SMP
+   if (!kgdb_io_ready(0) || !send_ready)
+   return 1;
+
+   if (kgdb_info[cpu].enter_kgdb == 0) {
+   struct kgdb_state kgdb_var;
+   struct kgdb_state *ks = &kgdb_var;
+
+   memset(ks, 0, sizeof(struct kgdb_state));
+   ks->cpu = cpu;
+   ks->ex_vector   = trapnr;
+   ks->signo   = SIGTRAP;
+   ks->err_code= KDB_REASON_SYSTEM_NMI;
+   ks->linux_regs  = regs;
+   ks->send_ready  = send_ready;
+   kgdb_cpu_enter(ks, regs, DCPU_WANT_MASTER);
+   return 0;
+   }
+#endif
return 1;
 }
 
--- linux.orig/kernel/debug/debug_core.h
+++ linux/kernel/debug/debug_core.h
@@ -26,6 +26,7 @@ struct kgdb_state {
unsigned long   threadid;
longkgdb_usethreadid;
struct pt_regs  *linux_regs;
+   atomic_t*send_ready;
 };
 
 /* Exception state values */
--- linux.orig/kernel/debug/kdb/kdb_debugger.c
+++ linux/kernel/debug/kdb/kdb_debugger.c
@@ -69,7 +69,10 @@ int kdb_stub(struct kgdb_state *ks)
if (atomic_read(&kgdb_setting_breakpoint))
reason = KDB_REASON_KEYBOARD;
 
-   if (in_nmi())
+   if (ks->err_code == KDB_REASON_SYSTEM_NMI && ks->signo == SIGTRAP)
+   reason = KDB_REASON_SYSTEM_NMI;
+
+   else if (in_nmi())
reason = KDB_REASON_NMI;
 
for (i = 0, bp = kdb_breakpoints; i < KDB_MAXBPT; i++, bp++) {
--- linux.orig/kernel/debug/kdb/kdb_main.c
+++ linux/kernel/debug/kdb/kdb_main.c
@@ -1200,6 +1200,9 @@ static int kdb_local(kdb_reason_t reason
   instruction_pointer(regs));
kdb_dumpregs(regs);
break;
+   case KDB_REASON_SYSTEM_NMI:
+   kdb_printf("due to System NonMaskable Interrupt\n");
+   break;
case KDB_REASON_NMI:
kdb_printf("due t

[PATCH 1/1] PATCH: KGDB/KDB Fix no KDB config problem.

2013-10-03 Thread Mike Travis

Some code added to the debug_core module had KDB dependencies
that it shouldn't have.

Signed-off-by: Mike Travis 
---
 kernel/debug/debug_core.c |8 
 kernel/debug/debug_core.h |2 ++
 2 files changed, 6 insertions(+), 4 deletions(-)

--- linux.orig/kernel/debug/debug_core.c
+++ linux/kernel/debug/debug_core.c
@@ -575,8 +575,8 @@ return_normal:
raw_spin_lock(&dbg_slave_lock);
 
 #ifdef CONFIG_SMP
-   /* If SYSTEM_NMI, slaves are already waiting */
-   if (ks->err_code == KDB_REASON_SYSTEM_NMI)
+   /* If send_ready set, slaves are already waiting */
+   if (ks->send_ready)
atomic_set(ks->send_ready, 1);
 
/* Signal the other CPUs to enter kgdb_wait() */
@@ -682,11 +682,11 @@ kgdb_handle_exception(int evector, int s
if (arch_kgdb_ops.enable_nmi)
arch_kgdb_ops.enable_nmi(0);
 
+   memset(ks, 0, sizeof(struct kgdb_state));
ks->cpu = raw_smp_processor_id();
ks->ex_vector   = evector;
ks->signo   = signo;
ks->err_code= ecode;
-   ks->kgdb_usethreadid= 0;
ks->linux_regs  = regs;
 
if (kgdb_reenter_check(ks))
@@ -750,7 +750,7 @@ int kgdb_nmicallin(int cpu, int trapnr,
ks->cpu = cpu;
ks->ex_vector   = trapnr;
ks->signo   = SIGTRAP;
-   ks->err_code= KDB_REASON_SYSTEM_NMI;
+   ks->err_code= KGDB_KDB_REASON_SYSTEM_NMI;
ks->linux_regs  = regs;
ks->send_ready  = send_ready;
kgdb_cpu_enter(ks, regs, DCPU_WANT_MASTER);
--- linux.orig/kernel/debug/debug_core.h
+++ linux/kernel/debug/debug_core.h
@@ -75,11 +75,13 @@ extern int kdb_stub(struct kgdb_state *k
 extern int kdb_parse(const char *cmdstr);
 extern int kdb_common_init_state(struct kgdb_state *ks);
 extern int kdb_common_deinit_state(void);
+#define KGDB_KDB_REASON_SYSTEM_NMI KDB_REASON_SYSTEM_NMI
 #else /* ! CONFIG_KGDB_KDB */
 static inline int kdb_stub(struct kgdb_state *ks)
 {
return DBG_PASS_EVENT;
 }
+#define KGDB_KDB_REASON_SYSTEM_NMI 0
 #endif /* CONFIG_KGDB_KDB */
 
 #endif /* _DEBUG_CORE_H_ */


[PATCH 0/1] PATCH: KGDB/KDB Fix no KDB config problem.

2013-10-03 Thread Mike Travis


* Fix build problem when KDB is not defined.


Re: [PATCH 1/1] PATCH: KGDB/KDB Fix no KDB config problem.

2013-10-03 Thread Mike Travis



On 10/3/2013 9:51 AM, Ingo Molnar wrote:
> 
> * Mike Travis  wrote:
> 
>> Some code added to the debug_core module had KDB dependencies
>> that it shouldn't have.
>>
>> Signed-off-by: Mike Travis 
>> ---
>>  kernel/debug/debug_core.c |8 
>>  kernel/debug/debug_core.h |2 ++
>>  2 files changed, 6 insertions(+), 4 deletions(-)
>>
>> --- linux.orig/kernel/debug/debug_core.c
>> +++ linux/kernel/debug/debug_core.c
>> @@ -575,8 +575,8 @@ return_normal:
>>  raw_spin_lock(&dbg_slave_lock);
>>  
>>  #ifdef CONFIG_SMP
>> -/* If SYSTEM_NMI, slaves are already waiting */
>> -if (ks->err_code == KDB_REASON_SYSTEM_NMI)
>> +/* If send_ready set, slaves are already waiting */
>> +if (ks->send_ready)
>>  atomic_set(ks->send_ready, 1);
>>  
>>  /* Signal the other CPUs to enter kgdb_wait() */
>> @@ -682,11 +682,11 @@ kgdb_handle_exception(int evector, int s
>>  if (arch_kgdb_ops.enable_nmi)
>>  arch_kgdb_ops.enable_nmi(0);
>>  
>> +memset(ks, 0, sizeof(struct kgdb_state));
>>  ks->cpu = raw_smp_processor_id();
>>  ks->ex_vector   = evector;
>>  ks->signo   = signo;
>>  ks->err_code= ecode;
>> -ks->kgdb_usethreadid= 0;
>>  ks->linux_regs  = regs;
>>  
>>  if (kgdb_reenter_check(ks))
>> @@ -750,7 +750,7 @@ int kgdb_nmicallin(int cpu, int trapnr,
>>  ks->cpu = cpu;
>>  ks->ex_vector   = trapnr;
>>  ks->signo   = SIGTRAP;
>> -ks->err_code= KDB_REASON_SYSTEM_NMI;
>> +ks->err_code= KGDB_KDB_REASON_SYSTEM_NMI;
>>  ks->linux_regs  = regs;
>>  ks->send_ready  = send_ready;
>>  kgdb_cpu_enter(ks, regs, DCPU_WANT_MASTER);
>> --- linux.orig/kernel/debug/debug_core.h
>> +++ linux/kernel/debug/debug_core.h
>> @@ -75,11 +75,13 @@ extern int kdb_stub(struct kgdb_state *k
>>  extern int kdb_parse(const char *cmdstr);
>>  extern int kdb_common_init_state(struct kgdb_state *ks);
>>  extern int kdb_common_deinit_state(void);
>> +#define KGDB_KDB_REASON_SYSTEM_NMI KDB_REASON_SYSTEM_NMI
>>  #else /* ! CONFIG_KGDB_KDB */
>>  static inline int kdb_stub(struct kgdb_state *ks)
>>  {
>>  return DBG_PASS_EVENT;
>>  }
>> +#define KGDB_KDB_REASON_SYSTEM_NMI 0
>>  #endif /* CONFIG_KGDB_KDB */
>>  
>>  #endif /* _DEBUG_CORE_H_ */
> 
> Hm, the KGDB_KDB_REASON_SYSTEM_NMI definition is a bit ugly. I still think 
> there are layering violations here and just kludging it around doesn't 
> solve it - a helper function that keeps kgdb details to the kgdb code 
> would.
> 
> Anyway, if Jason is fine with this solution and upholds his Acked-by then 
> I'll merge this into the first patch and apply the two patches.
> 
> Thanks,
> 
>   Ingo
> 

Would you prefer a simple #ifdef CONFIG_KGDB_KDB around the assignment?

Re: [PATCH 1/1] PATCH: KGDB/KDB Fix no KDB config problem.

2013-10-03 Thread Mike Travis



On 10/3/2013 10:15 AM, Ingo Molnar wrote:
> 
> * Mike Travis  wrote:
> 
>>
>>
>> On 10/3/2013 9:51 AM, Ingo Molnar wrote:
>>>
>>> * Mike Travis  wrote:
>>>
>>>> Some code added to the debug_core module had KDB dependencies
>>>> that it shouldn't have.
>>>>
>>>> Signed-off-by: Mike Travis 
>>>> ---
>>>>  kernel/debug/debug_core.c |8 
>>>>  kernel/debug/debug_core.h |2 ++
>>>>  2 files changed, 6 insertions(+), 4 deletions(-)
>>>>
>>>> --- linux.orig/kernel/debug/debug_core.c
>>>> +++ linux/kernel/debug/debug_core.c
>>>> @@ -575,8 +575,8 @@ return_normal:
>>>>raw_spin_lock(&dbg_slave_lock);
>>>>  
>>>>  #ifdef CONFIG_SMP
>>>> -  /* If SYSTEM_NMI, slaves are already waiting */
>>>> -  if (ks->err_code == KDB_REASON_SYSTEM_NMI)
>>>> +  /* If send_ready set, slaves are already waiting */
>>>> +  if (ks->send_ready)
>>>>atomic_set(ks->send_ready, 1);
>>>>  
>>>>/* Signal the other CPUs to enter kgdb_wait() */
>>>> @@ -682,11 +682,11 @@ kgdb_handle_exception(int evector, int s
>>>>if (arch_kgdb_ops.enable_nmi)
>>>>arch_kgdb_ops.enable_nmi(0);
>>>>  
>>>> +  memset(ks, 0, sizeof(struct kgdb_state));
>>>>ks->cpu = raw_smp_processor_id();
>>>>ks->ex_vector   = evector;
>>>>ks->signo   = signo;
>>>>ks->err_code= ecode;
>>>> -  ks->kgdb_usethreadid= 0;
>>>>ks->linux_regs  = regs;
>>>>  
>>>>if (kgdb_reenter_check(ks))
>>>> @@ -750,7 +750,7 @@ int kgdb_nmicallin(int cpu, int trapnr,
>>>>ks->cpu = cpu;
>>>>ks->ex_vector   = trapnr;
>>>>ks->signo   = SIGTRAP;
>>>> -  ks->err_code= KDB_REASON_SYSTEM_NMI;
>>>> +  ks->err_code= KGDB_KDB_REASON_SYSTEM_NMI;
>>>>ks->linux_regs  = regs;
>>>>ks->send_ready  = send_ready;
>>>>kgdb_cpu_enter(ks, regs, DCPU_WANT_MASTER);
>>>> --- linux.orig/kernel/debug/debug_core.h
>>>> +++ linux/kernel/debug/debug_core.h
>>>> @@ -75,11 +75,13 @@ extern int kdb_stub(struct kgdb_state *k
>>>>  extern int kdb_parse(const char *cmdstr);
>>>>  extern int kdb_common_init_state(struct kgdb_state *ks);
>>>>  extern int kdb_common_deinit_state(void);
>>>> +#define KGDB_KDB_REASON_SYSTEM_NMI KDB_REASON_SYSTEM_NMI
>>>>  #else /* ! CONFIG_KGDB_KDB */
>>>>  static inline int kdb_stub(struct kgdb_state *ks)
>>>>  {
>>>>return DBG_PASS_EVENT;
>>>>  }
>>>> +#define KGDB_KDB_REASON_SYSTEM_NMI 0
>>>>  #endif /* CONFIG_KGDB_KDB */
>>>>  
>>>>  #endif /* _DEBUG_CORE_H_ */
>>>
>>> Hm, the KGDB_KDB_REASON_SYSTEM_NMI definition is a bit ugly. I still think 
>>> there are layering violations here and just kludging it around doesn't 
>>> solve it - a helper function that keeps kgdb details to the kgdb code 
>>> would.
>>>
>>> Anyway, if Jason is fine with this solution and upholds his Acked-by then 
>>> I'll merge this into the first patch and apply the two patches.
>>>
>>> Thanks,
>>>
>>> Ingo
>>>
>>
>> Would you prefer a simple #ifdef CONFIG_KGDB_KDB around the assignment?
> 
> An #ifdef doesn't solve the layering violation!
> 
> kernel/debug/debug_core.c didn't have any serious use of #ifdef 
> CONFIG_KGDB_KDB before (it used it to flavor a few printks) and I'm not 
> convinced it needs one for this feature either.
> 
> So if that whole chunk is kdb dependent, why not shuffle that into a 
> nicely named function and host it somewhere appropriate in 
> kernel/debug/kdb/? That function would turn into an empty inline function 
> in the !CONFIG_KGDB_KDB case.
> 
> But ... it's up to Jason really whether he wants to abstract that piece of 
> code out, I'm just kibitzing here really.
> 
> Thanks,
> 
>   Ingo
> 

If I moved it to KDB then we would lose the capability to enter KGDB
mode and the gdb_stub functionality.  The #ifdef could have a comment
to this effect?

Re: [PATCH 1/1] PATCH: KGDB/KDB Fix no KDB config problem.

2013-10-03 Thread Mike Travis



On 10/3/2013 10:53 AM, Mike Travis wrote:
> 
> 
> On 10/3/2013 10:15 AM, Ingo Molnar wrote:
>>
>> * Mike Travis  wrote:
>>
>>>
>>>
>>> On 10/3/2013 9:51 AM, Ingo Molnar wrote:
>>>>
>>>> * Mike Travis  wrote:
>>>>
>>>>> Some code added to the debug_core module had KDB dependencies
>>>>> that it shouldn't have.
>>>>>
>>>>> Signed-off-by: Mike Travis 
>>>>> ---
>>>>>  kernel/debug/debug_core.c |8 
>>>>>  kernel/debug/debug_core.h |2 ++
>>>>>  2 files changed, 6 insertions(+), 4 deletions(-)
>>>>>
>>>>> --- linux.orig/kernel/debug/debug_core.c
>>>>> +++ linux/kernel/debug/debug_core.c
>>>>> @@ -575,8 +575,8 @@ return_normal:
>>>>>   raw_spin_lock(&dbg_slave_lock);
>>>>>  
>>>>>  #ifdef CONFIG_SMP
>>>>> - /* If SYSTEM_NMI, slaves are already waiting */
>>>>> - if (ks->err_code == KDB_REASON_SYSTEM_NMI)
>>>>> + /* If send_ready set, slaves are already waiting */
>>>>> + if (ks->send_ready)
>>>>>   atomic_set(ks->send_ready, 1);
>>>>>  
>>>>>   /* Signal the other CPUs to enter kgdb_wait() */
>>>>> @@ -682,11 +682,11 @@ kgdb_handle_exception(int evector, int s
>>>>>   if (arch_kgdb_ops.enable_nmi)
>>>>>   arch_kgdb_ops.enable_nmi(0);
>>>>>  
>>>>> + memset(ks, 0, sizeof(struct kgdb_state));
>>>>>   ks->cpu = raw_smp_processor_id();
>>>>>   ks->ex_vector   = evector;
>>>>>   ks->signo   = signo;
>>>>>   ks->err_code= ecode;
>>>>> - ks->kgdb_usethreadid= 0;
>>>>>   ks->linux_regs  = regs;
>>>>>  
>>>>>   if (kgdb_reenter_check(ks))
>>>>> @@ -750,7 +750,7 @@ int kgdb_nmicallin(int cpu, int trapnr,
>>>>>   ks->cpu = cpu;
>>>>>   ks->ex_vector   = trapnr;
>>>>>   ks->signo   = SIGTRAP;
>>>>> - ks->err_code= KDB_REASON_SYSTEM_NMI;
>>>>> + ks->err_code= KGDB_KDB_REASON_SYSTEM_NMI;
>>>>>   ks->linux_regs  = regs;
>>>>>   ks->send_ready  = send_ready;
>>>>>   kgdb_cpu_enter(ks, regs, DCPU_WANT_MASTER);
>>>>> --- linux.orig/kernel/debug/debug_core.h
>>>>> +++ linux/kernel/debug/debug_core.h
>>>>> @@ -75,11 +75,13 @@ extern int kdb_stub(struct kgdb_state *k
>>>>>  extern int kdb_parse(const char *cmdstr);
>>>>>  extern int kdb_common_init_state(struct kgdb_state *ks);
>>>>>  extern int kdb_common_deinit_state(void);
>>>>> +#define KGDB_KDB_REASON_SYSTEM_NMI KDB_REASON_SYSTEM_NMI
>>>>>  #else /* ! CONFIG_KGDB_KDB */
>>>>>  static inline int kdb_stub(struct kgdb_state *ks)
>>>>>  {
>>>>>   return DBG_PASS_EVENT;
>>>>>  }
>>>>> +#define KGDB_KDB_REASON_SYSTEM_NMI 0
>>>>>  #endif /* CONFIG_KGDB_KDB */
>>>>>  
>>>>>  #endif /* _DEBUG_CORE_H_ */
>>>>
>>>> Hm, the KGDB_KDB_REASON_SYSTEM_NMI definition is a bit ugly. I still think 
>>>> there are layering violations here and just kludging it around doesn't 
>>>> solve it - a helper function that keeps kgdb details to the kgdb code 
>>>> would.
>>>>
>>>> Anyway, if Jason is fine with this solution and upholds his Acked-by then 
>>>> I'll merge this into the first patch and apply the two patches.
>>>>
>>>> Thanks,
>>>>
>>>>Ingo
>>>>
>>>
>>> Would you prefer a simple #ifdef CONFIG_KGDB_KDB around the assignment?
>>
>> An #ifdef doesn't solve the layering violation!
>>
>> kernel/debug/debug_core.c didn't have any serious use of #ifdef 
>> CONFIG_KGDB_KDB before (it used it to flavor a few printks) and I'm not 
>> convinced it needs one for this feature either.
>>
>> So if that whole chunk is kdb dependent, why not shuffle that into a 
>> nicely named function and host it somewhere appropriate in 
>> kernel/debug/kdb/? That function would turn into an empty inline function 
>> in the !CONFIG_KGDB_KDB case.
>>
>> But ... it's up to Jason really whether he wants to abstract that piece of 
>> code out, I'm just kibitzing here really.
>>
>> Thanks,
>>
>>  Ingo
>>
> 
> If I moved it to KDB then we would lose the capability to enter KGDB
> mode and the gdb_stub functionality.  The #ifdef could have a comment
> to this effect?
> 

Another option would be to pass the code in as the caller knows if KDB
is defined or not?
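
A minimal sketch of that last option, assuming the reason code becomes an
argument supplied by the caller.  Every name here (demo_nmicallin,
demo_caller_reason, DEMO_CONFIG_KDB, the enum values) is hypothetical and
only stands in for the real kgdb/kdb symbols; the point is that the core
entry stores the code without knowing where it came from.

```c
#include <stddef.h>

/* Hypothetical reason codes; the real ones live in include/linux/kdb.h. */
enum demo_reason { DEMO_REASON_NONE = 0, DEMO_REASON_SYSTEM_NMI };

/* Hypothetical, trimmed-down stand-in for struct kgdb_state. */
struct demo_state {
	int cpu;
	int err_code;
};

/* Core entry: records whatever code the caller passes, no KDB knowledge. */
static void demo_nmicallin(struct demo_state *ks, int cpu, int err_code)
{
	ks->cpu = cpu;
	ks->err_code = err_code;
}

/* Caller side: the only place that looks at the (hypothetical) config. */
static int demo_caller_reason(void)
{
#ifdef DEMO_CONFIG_KDB
	return DEMO_REASON_SYSTEM_NMI;
#else
	return DEMO_REASON_NONE;
#endif
}
```

With this shape the config test lives in the platform NMI handler rather
than in the debug core, which may or may not address the layering concern
raised above.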

[PATCH 8/9] x86/UV: Add uvtrace support

2013-09-05 Thread Mike Travis

This patch adds support for the uvtrace module by providing a skeleton
call to the registered trace function.  It also provides another separate
'NMI' tracer that is triggered by the system wide 'power nmi' command.

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/include/asm/uv/uv.h  |8 
 arch/x86/platform/uv/uv_nmi.c |   13 -
 2 files changed, 20 insertions(+), 1 deletion(-)

--- linux.orig/arch/x86/include/asm/uv/uv.h
+++ linux/arch/x86/include/asm/uv/uv.h
@@ -14,6 +14,13 @@ extern void uv_cpu_init(void);
 extern void uv_nmi_init(void);
 extern void uv_register_nmi_notifier(void);
 extern void uv_system_init(void);
+extern void (*uv_trace_nmi_func)(unsigned int reason, struct pt_regs *regs);
+extern void (*uv_trace_func)(const char *f, const int l, const char *fmt, ...);
+#define uv_trace(fmt, ...) \
+do {   \
+   if (unlikely(uv_trace_func))\
+   (uv_trace_func)(__func__, __LINE__, fmt, ##__VA_ARGS__);\
+} while (0)
 extern const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
 struct mm_struct *mm,
 unsigned long start,
@@ -26,6 +33,7 @@ static inline enum uv_system_type get_uv
 static inline int is_uv_system(void)   { return 0; }
 static inline void uv_cpu_init(void)   { }
 static inline void uv_system_init(void){ }
+static inline void uv_trace(void *fmt, ...){ }
 static inline void uv_register_nmi_notifier(void) { }
 static inline const struct cpumask *
 uv_flush_tlb_others(const struct cpumask *cpumask, struct mm_struct *mm,
--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -1,5 +1,5 @@
 /*
- * SGI NMI support routines
+ * SGI NMI/TRACE support routines
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -39,6 +39,13 @@
 #include 
 #include 
 
+void (*uv_trace_func)(const char *f, const int l, const char *fmt, ...);
+EXPORT_SYMBOL(uv_trace_func);
+
+void (*uv_trace_nmi_func)(unsigned int reason, struct pt_regs *regs);
+EXPORT_SYMBOL(uv_trace_nmi_func);
+
+
 /*
  * UV handler for NMI
  *
@@ -592,6 +599,10 @@ int uv_handle_nmi(unsigned int reason, s
return NMI_DONE;
}
 
+   /* Call possible NMI trace function */
+   if (unlikely(uv_trace_nmi_func))
+   (uv_trace_nmi_func)(reason, regs);
+
/* Indicate we are the first CPU into the NMI handler */
master = (atomic_read(&uv_nmi_cpu) == cpu);
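
The optional-hook pattern used by uv_trace() can be tried out in user
space.  The sketch below is an illustration only: trace_func, trace(),
buf_tracer() and handle_event() are made-up names, the kernel's unlikely()
hint is dropped, and the tracer writes into a buffer rather than into
whatever the uvtrace module actually records to.  It relies on the GNU
`##__VA_ARGS__` extension, as the patch does.

```c
#include <stdarg.h>
#include <stdio.h>

/* Optional hook; NULL means tracing is off (same idea as uv_trace_func). */
static void (*trace_func)(const char *f, const int l, const char *fmt, ...);

/* Same shape as the uv_trace() macro in the patch. */
#define trace(fmt, ...) \
do { \
	if (trace_func) \
		(trace_func)(__func__, __LINE__, fmt, ##__VA_ARGS__); \
} while (0)

/* Hypothetical tracer: formats "<function>: <message>" into a buffer. */
static char trace_buf[128];

static void buf_tracer(const char *f, const int l, const char *fmt, ...)
{
	va_list ap;
	int n = snprintf(trace_buf, sizeof(trace_buf), "%s: ", f);

	(void)l;	/* line number is not used in this sketch */
	if (n < 0 || (size_t)n >= sizeof(trace_buf))
		return;

	va_start(ap, fmt);
	vsnprintf(trace_buf + n, sizeof(trace_buf) - n, fmt, ap);
	va_end(ap);
}

/* Example call site: costs one pointer test when no tracer is installed. */
static void handle_event(int cpu)
{
	trace("cpu %d entered", cpu);
}
```

Installing a tracer is just `trace_func = buf_tracer;` and clearing it
again turns every call site back into a no-op.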
 


[PATCH 3/9] x86/UV: Add summary of cpu activity to UV NMI handler

2013-09-05 Thread Mike Travis

The standard NMI handler dumps the states of all the cpus.  This includes
a full register dump and stack trace.  This can be way more information
than what is needed.  This patch adds a "summary" dump that is basically
a form of the "ps" command.  It includes the symbolic IP address as well
as the command field and basic process information.

It is enabled when the nmi action is changed to "ips".

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/platform/uv/uv_nmi.c |   48 ++
 1 file changed, 44 insertions(+), 4 deletions(-)

--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -139,6 +139,19 @@ module_param_named(wait_count, uv_nmi_wa
 static int uv_nmi_retry_count = 500;
 module_param_named(retry_count, uv_nmi_retry_count, int, 0644);
 
+/*
+ * Valid NMI Actions:
+ *  "dump" - dump process stack for each cpu
+ *  "ips"  - dump IP info for each cpu
+ */
+static char uv_nmi_action[8] = "dump";
+module_param_string(action, uv_nmi_action, sizeof(uv_nmi_action), 0644);
+
+static inline bool uv_nmi_action_is(const char *action)
+{
+   return (strncmp(uv_nmi_action, action, strlen(action)) == 0);
+}
+
 /* Setup which NMI support is present in system */
 static void uv_nmi_setup_mmrs(void)
 {
@@ -367,13 +380,38 @@ static void uv_nmi_wait(int master)
atomic_read(&uv_nmi_cpus_in_nmi), num_online_cpus());
 }
 
+static void uv_nmi_dump_cpu_ip_hdr(void)
+{
+   printk(KERN_DEFAULT
+   "\nUV: %4s %6s %-32s %s   (Note: PID 0 not listed)\n",
+   "CPU", "PID", "COMMAND", "IP");
+}
+
+static void uv_nmi_dump_cpu_ip(int cpu, struct pt_regs *regs)
+{
+   printk(KERN_DEFAULT "UV: %4d %6d %-32.32s ",
+   cpu, current->pid, current->comm);
+
+   printk_address(regs->ip, 1);
+}
+
 /* Dump this cpu's state */
 static void uv_nmi_dump_state_cpu(int cpu, struct pt_regs *regs)
 {
const char *dots = " . ";
 
-   printk(KERN_DEFAULT "UV:%sNMI process trace for CPU %d\n", dots, cpu);
-   show_regs(regs);
+   if (uv_nmi_action_is("ips")) {
+   if (cpu == 0)
+   uv_nmi_dump_cpu_ip_hdr();
+
+   if (current->pid != 0)
+   uv_nmi_dump_cpu_ip(cpu, regs);
+
+   } else if (uv_nmi_action_is("dump")) {
+   printk(KERN_DEFAULT
+   "UV:%sNMI process trace for CPU %d\n", dots, cpu);
+   show_regs(regs);
+   }
atomic_set(&uv_cpu_nmi.state, UV_NMI_STATE_DUMP_DONE);
 }
 
@@ -420,7 +458,8 @@ static void uv_nmi_dump_state(int cpu, s
int ignored = 0;
int saved_console_loglevel = console_loglevel;
 
-   pr_alert("UV: tracing processes for %d CPUs from CPU %d\n",
+   pr_alert("UV: tracing %s for %d CPUs from CPU %d\n",
+   uv_nmi_action_is("ips") ? "IPs" : "processes",
atomic_read(&uv_nmi_cpus_in_nmi), cpu);
 
console_loglevel = uv_nmi_loglevel;
@@ -482,7 +521,8 @@ int uv_handle_nmi(unsigned int reason, s
uv_nmi_wait(master);
 
/* Dump state of each cpu */
-   uv_nmi_dump_state(cpu, regs, master);
+   if (uv_nmi_action_is("ips") || uv_nmi_action_is("dump"))
+   uv_nmi_dump_state(cpu, regs, master);
 
/* Clear per_cpu "in nmi" flag */
atomic_set(&uv_cpu_nmi.state, UV_NMI_STATE_OUT);
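
The action check above is a prefix comparison, which is easy to see in a
user-space sketch.  nmi_action and action_is() mirror the patch;
set_action() is a hypothetical helper that models a write to the module
parameter.  One likely benefit of comparing only strlen(action) bytes,
stated here as an assumption, is that a value written with a trailing
newline still matches.

```c
#include <stdbool.h>
#include <string.h>

/* Stand-in for the uv_nmi_action[] module parameter buffer. */
static char nmi_action[8] = "dump";

/* Same test as uv_nmi_action_is(): compare the first strlen(action) bytes. */
static bool action_is(const char *action)
{
	return strncmp(nmi_action, action, strlen(action)) == 0;
}

/* Hypothetical helper modelling a write to the parameter. */
static void set_action(const char *val)
{
	strncpy(nmi_action, val, sizeof(nmi_action) - 1);
	nmi_action[sizeof(nmi_action) - 1] = '\0';
}
```

Because only a prefix is compared, any stored value that merely begins
with "dump" or "ips" selects that action as well.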


[PATCH 5/9] KGDB/KDB: add support for external NMI handler to call KGDB/KDB.

2013-09-05 Thread Mike Travis

This patch adds a kgdb_nmicallin() interface that can be used by
external NMI handlers to call the KGDB/KDB handler.  The primary need
for this is for those types of NMI interrupts where all the CPUs
have already received the NMI signal.  Therefore no send_IPI(NMI)
is required, and in fact sending one would cause a 2nd unhandled NMI
to occur, which generates the "Dazed and Confused" messages.

Since all the CPUs are getting the NMI at roughly the same time, it's not
guaranteed that the first CPU that hits the NMI handler will manage to
enter KGDB and set the dbg_master_lock before the slaves start entering.
The new argument "send_ready" was added for KGDB to signal the NMI handler
to release the slave CPUs for entry into KGDB.

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 include/linux/kgdb.h  |1 +
 kernel/debug/debug_core.c |   41 +
 kernel/debug/debug_core.h |1 +
 3 files changed, 43 insertions(+)

--- linux.orig/include/linux/kgdb.h
+++ linux/include/linux/kgdb.h
@@ -310,6 +310,7 @@ extern int
 kgdb_handle_exception(int ex_vector, int signo, int err_code,
  struct pt_regs *regs);
 extern int kgdb_nmicallback(int cpu, void *regs);
+extern int kgdb_nmicallin(int cpu, int trapnr, void *regs, atomic_t *snd_rdy);
 extern void gdbstub_exit(int status);
 
 extern int kgdb_single_step;
--- linux.orig/kernel/debug/debug_core.c
+++ linux/kernel/debug/debug_core.c
@@ -578,6 +578,10 @@ return_normal:
/* Signal the other CPUs to enter kgdb_wait() */
if ((!kgdb_single_step) && kgdb_do_roundup)
kgdb_roundup_cpus(flags);
+
+   /* If optional send ready pointer, signal CPUs to proceed */
+   if (kgdb_info[cpu].send_ready)
+   atomic_set(kgdb_info[cpu].send_ready, 1);
 #endif
 
/*
@@ -729,6 +733,43 @@ int kgdb_nmicallback(int cpu, void *regs
return 0;
}
 #endif
+   return 1;
+}
+
+int kgdb_nmicallin(int cpu, int trapnr, void *regs, atomic_t *send_ready)
+{
+#ifdef CONFIG_SMP
+   if (!kgdb_io_ready(0))
+   return 1;
+
+   if (kgdb_info[cpu].enter_kgdb == 0) {
+   struct kgdb_state kgdb_var;
+   struct kgdb_state *ks = &kgdb_var;
+   int save_kgdb_do_roundup = kgdb_do_roundup;
+
+   memset(ks, 0, sizeof(struct kgdb_state));
+   ks->cpu = cpu;
+   ks->ex_vector   = trapnr;
+   ks->signo   = SIGTRAP;
+   ks->err_code= 0;
+   ks->kgdb_usethreadid= 0;
+   ks->linux_regs  = regs;
+
+   /* Do not broadcast NMI */
+   kgdb_do_roundup = 0;
+
+   /* Indicate there are slaves waiting */
+   kgdb_info[cpu].send_ready = send_ready;
+   kgdb_cpu_enter(ks, regs, DCPU_WANT_MASTER);
+   kgdb_do_roundup = save_kgdb_do_roundup;
+   kgdb_info[cpu].send_ready = NULL;
+
+   /* Wait till all the CPUs have quit from the debugger. */
+   while (atomic_read(&slaves_in_kgdb))
+   cpu_relax();
+   return 0;
+   }
+#endif
return 1;
 }
 
--- linux.orig/kernel/debug/debug_core.h
+++ linux/kernel/debug/debug_core.h
@@ -37,6 +37,7 @@ struct kgdb_state {
 struct debuggerinfo_struct {
void*debuggerinfo;
struct task_struct  *task;
+   atomic_t*send_ready;
int exception_state;
int ret_state;
int irq_depth;
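
The send_ready handshake described in the changelog can be modelled
without any kernel code.  This is a single-threaded sketch under stated
assumptions: struct dbg_entry, master_enter() and slave_may_enter() are
hypothetical names, C11 atomics stand in for the kernel's atomic_t, and a
plain flag stands in for dbg_master_lock.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the debugger's per-entry bookkeeping. */
struct dbg_entry {
	atomic_int *send_ready;	/* NULL when no slaves are being held back */
	bool master_locked;	/* models ownership of dbg_master_lock */
};

/* Master side: claim the master role first, then release the slaves. */
static void master_enter(struct dbg_entry *e)
{
	e->master_locked = true;
	if (e->send_ready)
		atomic_store(e->send_ready, 1);	/* models atomic_set(ptr, 1) */
}

/* Slave side: one poll of the flag; nonzero means "safe to enter now". */
static int slave_may_enter(atomic_int *send_ready)
{
	return atomic_load(send_ready) != 0;
}
```

In the NMI handler the slave cpus would spin on the equivalent of
slave_may_enter() before calling into the debugger, so none of them can
race the master for the master lock.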


[PATCH 9/9] x86/UV: Add ability to disable UV NMI handler

2013-09-05 Thread Mike Travis

For performance reasons, the NMI handler may be disabled to lessen the
performance impact caused by multiple perf tools running concurrently.
If the system nmi command is issued when the UV NMI handler is disabled,
the "Dazed and Confused" messages occur for all cpus.  The NMI handler is
disabled by setting the nmi disabled variable to '1'.  Setting it back to
'0' will re-enable the NMI handler.

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/platform/uv/uv_nmi.c |   69 ++
 1 file changed, 69 insertions(+)

--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -73,6 +73,7 @@ static struct uv_hub_nmi_s **uv_hub_nmi_
 DEFINE_PER_CPU(struct uv_cpu_nmi_s, __uv_cpu_nmi);
 EXPORT_PER_CPU_SYMBOL_GPL(__uv_cpu_nmi);
 
+static int uv_nmi_registered;
 static unsigned long nmi_mmr;
 static unsigned long nmi_mmr_clear;
 static unsigned long nmi_mmr_pending;
@@ -130,6 +131,31 @@ module_param_named(ping_count, uv_nmi_pi
 static local64_t uv_nmi_ping_misses;
 module_param_named(ping_misses, uv_nmi_ping_misses, local64, 0644);
 
+static int uv_nmi_disabled;
+static int param_get_disabled(char *buffer, const struct kernel_param *kp)
+{
+   return sprintf(buffer, "%u\n", uv_nmi_disabled);
+}
+
+static void uv_nmi_notify_disabled(void);
+static int param_set_disabled(const char *val, const struct kernel_param *kp)
+{
+   int ret = param_set_bint(val, kp);
+
+   if (ret)
+   return ret;
+
+   uv_nmi_notify_disabled();
+   return 0;
+}
+
+static struct kernel_param_ops param_ops_disabled = {
+   .get = param_get_disabled,
+   .set = param_set_disabled,
+};
+#define param_check_disabled(name, p) __param_check(name, p, int)
+module_param_named(disabled, uv_nmi_disabled, disabled, 0644);
+
 /*
  * Following values allow tuning for large systems under heavy loading
  */
@@ -634,6 +660,8 @@ int uv_handle_nmi(unsigned int reason, s
atomic_set(&uv_nmi_cpus_in_nmi, -1);
atomic_set(&uv_nmi_cpu, -1);
atomic_set(&uv_in_nmi, 0);
+   if (uv_nmi_disabled)
+   uv_nmi_notify_disabled();
}
 
uv_nmi_touch_watchdogs();
@@ -664,11 +692,30 @@ int uv_handle_nmi_ping(unsigned int reas
 
 void uv_register_nmi_notifier(void)
 {
+   if (uv_nmi_registered || uv_nmi_disabled)
+   return;
+
if (register_nmi_handler(NMI_UNKNOWN, uv_handle_nmi, 0, "uv"))
pr_warn("UV: NMI handler failed to register\n");
 
if (register_nmi_handler(NMI_LOCAL, uv_handle_nmi_ping, 0, "uvping"))
pr_warn("UV: PING NMI handler failed to register\n");
+
+   uv_nmi_registered = 1;
+   pr_info("UV: NMI handler registered\n");
+}
+
+static void uv_nmi_disabled_msg(void)
+{
+   pr_err("UV: NMI handler disabled, power nmi command will be ignored\n");
+}
+
+static void uv_unregister_nmi_notifier(void)
+{
+   unregister_nmi_handler(NMI_UNKNOWN, "uv");
+   unregister_nmi_handler(NMI_LOCAL, "uvping");
+   uv_nmi_registered = 0;
+   uv_nmi_disabled_msg();
 }
 
 void uv_nmi_init(void)
@@ -688,6 +735,11 @@ void uv_nmi_setup(void)
int size = sizeof(void *) * (1 << NODES_SHIFT);
int cpu, nid;
 
+   if (uv_nmi_disabled) {
+   uv_nmi_disabled_msg();
+   return;
+   }
+
/* Setup hub nmi info */
uv_nmi_setup_mmrs();
uv_hub_nmi_list = kzalloc(size, GFP_KERNEL);
@@ -709,4 +761,21 @@ void uv_nmi_setup(void)
BUG_ON(!uv_nmi_cpu_mask);
 }
 
+static void uv_nmi_notify_disabled(void)
+{
+   if (uv_nmi_disabled) {
+   /* if in nmi, handler will disable when finished */
+   if (atomic_read(&uv_in_nmi))
+   return;
 
+   if (uv_nmi_registered)
+   uv_unregister_nmi_notifier();
+
+   } else {
+   /* nmi control lists not yet allocated? */
+   if (!uv_hub_nmi_list)
+   uv_nmi_setup();
+
+   uv_register_nmi_notifier();
+   }
+}
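
The enable/disable flow is easier to follow as a small state model.  The
sketch below is an assumption-laden simplification: plain ints replace the
kernel atomics, set_disabled() accepts only "0" and "1" (param_set_bint()
accepts more spellings), and nmi_exit() models only the tail of
uv_handle_nmi() on the master cpu.

```c
#include <string.h>

static int nmi_disabled;	/* models uv_nmi_disabled */
static int nmi_registered;	/* models uv_nmi_registered */
static int in_nmi;		/* models uv_in_nmi */

/* Mirrors uv_nmi_notify_disabled(): defer the unregister while in NMI. */
static void notify_disabled(void)
{
	if (nmi_disabled) {
		if (in_nmi)
			return;	/* the handler finishes the job on exit */
		nmi_registered = 0;
	} else {
		nmi_registered = 1;
	}
}

/* Hypothetical, simplified parameter setter. */
static int set_disabled(const char *val)
{
	if (strcmp(val, "0") != 0 && strcmp(val, "1") != 0)
		return -1;
	nmi_disabled = (val[0] == '1');
	notify_disabled();
	return 0;
}

/* Models the end of the NMI handler re-checking the disabled flag. */
static void nmi_exit(void)
{
	in_nmi = 0;
	if (nmi_disabled)
		notify_disabled();
}
```

The deferred path is the interesting one: a write of "1" during an NMI
leaves the handler registered until nmi_exit() runs.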


[PATCH 4/9] x86/UV: Add kdump to UV NMI handler

2013-09-05 Thread Mike Travis

If a system has hung and no longer responds to external events, this
patch adds the capability of doing a standard kdump and system reboot
when triggered by the system NMI command.

It is enabled when the nmi action is changed to "kdump" and the
kernel is built with CONFIG_KEXEC enabled.

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/platform/uv/uv_nmi.c |   41 +
 1 file changed, 41 insertions(+)

--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -21,6 +21,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -70,6 +71,7 @@ static atomic_t   uv_in_nmi;
 static atomic_t uv_nmi_cpu = ATOMIC_INIT(-1);
 static atomic_t uv_nmi_cpus_in_nmi = ATOMIC_INIT(-1);
 static atomic_t uv_nmi_slave_continue;
+static atomic_t uv_nmi_kexec_failed;
 static cpumask_var_t uv_nmi_cpu_mask;
 
 /* Values for uv_nmi_slave_continue */
@@ -143,6 +145,7 @@ module_param_named(retry_count, uv_nmi_r
  * Valid NMI Actions:
  *  "dump" - dump process stack for each cpu
  *  "ips"  - dump IP info for each cpu
+ *  "kdump"- do crash dump
  */
 static char uv_nmi_action[8] = "dump";
 module_param_string(action, uv_nmi_action, sizeof(uv_nmi_action), 0644);
@@ -496,6 +499,40 @@ static void uv_nmi_touch_watchdogs(void)
touch_nmi_watchdog();
 }
 
+#if defined(CONFIG_KEXEC)
+static void uv_nmi_kdump(int cpu, int master, struct pt_regs *regs)
+{
+   /* Call crash to dump system state */
+   if (master) {
+   pr_emerg("UV: NMI executing crash_kexec on CPU%d\n", cpu);
+   crash_kexec(regs);
+
+   pr_emerg("UV: crash_kexec unexpectedly returned, ");
+   if (!kexec_crash_image) {
+   pr_cont("crash kernel not loaded\n");
+   atomic_set(&uv_nmi_kexec_failed, 1);
+   uv_nmi_sync_exit(1);
+   return;
+   }
+   pr_cont("kexec busy, stalling cpus while waiting\n");
+   }
+
+   /* If crash exec fails the slaves should return, otherwise stall */
+   while (atomic_read(&uv_nmi_kexec_failed) == 0)
+   mdelay(10);
+
+   /* Crash kernel most likely not loaded, return in an orderly fashion */
+   uv_nmi_sync_exit(0);
+}
+
+#else /* !CONFIG_KEXEC */
+static inline void uv_nmi_kdump(int cpu, int master, struct pt_regs *regs)
+{
+   if (master)
+   pr_err("UV: NMI kdump: KEXEC not supported in this kernel\n");
+}
+#endif /* !CONFIG_KEXEC */
+
 /*
  * UV NMI handler
  */
@@ -517,6 +554,10 @@ int uv_handle_nmi(unsigned int reason, s
/* Indicate we are the first CPU into the NMI handler */
master = (atomic_read(&uv_nmi_cpu) == cpu);
 
+   /* If NMI action is "kdump", then attempt to do it */
+   if (uv_nmi_action_is("kdump"))
+   uv_nmi_kdump(cpu, master, regs);
+
/* Pause as all cpus enter the NMI handler */
uv_nmi_wait(master);
 

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 2/9] x86/UV: Update UV support for external NMI signals

2013-09-05 Thread Mike Travis

The current UV NMI handler has not been updated for the changes in the
system NMI handler and the perf operations.  The UV NMI handler reads
an MMR in the UV Hub to check to see if the NMI event was caused by
the external 'system NMI' that the operator can initiate on the System
Mgmt Controller.

The problem arises when the perf tools are running, causing millions of
perf events per second on very large CPU count systems.  Previously this
was okay because the perf NMI handler ran at a higher priority on the
NMI call chain and if the NMI was a perf event, it would stop calling
other NMI handlers remaining on the NMI call chain.

Now the system NMI handler calls all the handlers on the NMI call
chain including the UV NMI handler.  This causes the UV NMI handler
to read the MMRs at the same millions per second rate.  This can lead
to significant performance loss and possible system failures.  It can
also cause thousands of 'Dazed and Confused' messages to be sent to the
system console.  This effectively makes perf tools unusable on UV systems.

To avoid this excessive overhead when perf tools are running, this code
has been optimized to minimize reading of the MMRs as much as possible,
by moving to the NMI_UNKNOWN notifier chain.  This chain is called only
when all the users on the standard NMI_LOCAL call chain have been called
and none of them have claimed this NMI.
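
The difference between the two chains can be modelled in user space.  This
is only a sketch under stated assumptions: perf_handler, uv_handler,
run_chain and dispatch_nmi are hypothetical names, and the counter mmr_reads
stands in for the cost of a real hub MMR read.

```c
#include <stddef.h>

enum { NMI_DONE = 0, NMI_HANDLED = 1 };

typedef int (*nmi_handler_fn)(void);

/* Simulated event sources and a counter of expensive hub MMR reads. */
static int perf_event_pending;
static int uv_nmi_pending;
static int mmr_reads;

/* Stand-in for the perf handler: claims the NMI if a perf event is pending. */
static int perf_handler(void)
{
	if (!perf_event_pending)
		return NMI_DONE;
	perf_event_pending = 0;
	return NMI_HANDLED;
}

/* Stand-in for the UV handler: every call costs one simulated MMR read. */
static int uv_handler(void)
{
	mmr_reads++;
	if (!uv_nmi_pending)
		return NMI_DONE;
	uv_nmi_pending = 0;
	return NMI_HANDLED;
}

/* Call every handler on a chain and count how many claimed the NMI. */
static int run_chain(const nmi_handler_fn *chain, size_t n)
{
	int handled = 0;

	for (size_t i = 0; i < n; i++)
		handled += chain[i]();
	return handled;
}

/* The "unknown" chain runs only when nobody on the local chain claimed it. */
static int dispatch_nmi(void)
{
	static const nmi_handler_fn local_chain[] = { perf_handler };
	static const nmi_handler_fn unknown_chain[] = { uv_handler };
	int handled = run_chain(local_chain, 1);

	if (handled)
		return handled;
	return run_chain(unknown_chain, 1);
}
```

In this model a perf event is claimed on the local chain and mmr_reads stays
at zero.  The MMR is read only for an NMI that no local handler claimed.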

There is an exception where the NMI_LOCAL notifier chain is used.  When
the perf tools are in use, it's possible that the UV NMI was captured by
some other NMI handler and then either ignored or mistakenly processed as
a perf event.  We set a per_cpu ('ping') flag for those CPUs that ignored
the initial NMI, and then send them an IPI NMI signal.  The NMI_LOCAL
handler on each cpu does not need to read the MMR, but instead checks the
in-memory flag indicating it was pinged.  There are two module variables:
'ping_count', indicating how many requested NMI events occurred, and
'ping_misses', indicating how many stray NMI events occurred.  The stray
events are most likely perf events, so the counts show the overhead of the
perf NMI interrupts and how many MMR reads were avoided.
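
A minimal sketch of the ping check, assuming a fixed array of per-cpu flags
in place of real per_cpu data.  The names pinging, ping_cpu and
nmi_local_ping_handler are hypothetical, and the plain long counters stand
in for whatever counter type the kernel code uses.

```c
#include <stdatomic.h>

#define NR_CPUS 4

enum { NMI_DONE = 0, NMI_HANDLED = 1 };

/* One in-memory flag per cpu, set by whoever sends the IPI NMI. */
static atomic_int pinging[NR_CPUS];

/* Model of the two module variables described above. */
static long ping_count;   /* NMIs that were requested pings */
static long ping_misses;  /* stray NMIs, most likely perf events */

/* Mark a cpu as pinged; the real code would then send it an IPI NMI. */
static void ping_cpu(int cpu)
{
	atomic_store(&pinging[cpu], 1);
}

/* Local-chain handler: decides from memory alone, with no MMR read. */
static int nmi_local_ping_handler(int cpu)
{
	if (!atomic_load(&pinging[cpu])) {
		ping_misses++;
		return NMI_DONE;
	}
	ping_count++;
	atomic_store(&pinging[cpu], 0);
	return NMI_HANDLED;
}
```

The handler claims the NMI only on a cpu whose flag was set.  Every other
call counts as a miss and returns immediately.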

This patch also minimizes the reads of the MMRs by having the first
cpu entering the NMI handler on each node set a per HUB in-memory
atomic value.  (Having a per HUB value avoids sending lock traffic over
NumaLink.)  Both types of UV NMIs from the SMI layer are supported.
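
The per hub idea can be sketched as follows.  This is a simplified model,
not the patch's implementation: struct hub_nmi, hub_init and hub_check_nmi
are hypothetical, mmr_pending stands in for the hub MMR bit, and the locking
that serializes concurrent MMR readers is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct hub_nmi {
	atomic_int in_nmi;      /* nonzero once a cpu has confirmed the NMI */
	atomic_int cpu_owner;   /* cpu that confirmed it, -1 if none */
	int mmr_pending;        /* stand-in for the hub MMR NMI bit */
	int read_mmr_count;     /* number of simulated MMR reads */
};

static void hub_init(struct hub_nmi *hub, int mmr_pending)
{
	atomic_init(&hub->in_nmi, 0);
	atomic_init(&hub->cpu_owner, -1);
	hub->mmr_pending = mmr_pending;
	hub->read_mmr_count = 0;
}

/* Returns true if this hub is handling a real external NMI. */
static bool hub_check_nmi(struct hub_nmi *hub, int cpu)
{
	int expected = 0;

	/* Fast path: a cpu on this hub already confirmed the NMI. */
	if (atomic_load(&hub->in_nmi))
		return true;

	/* Slow path: read the (simulated) MMR. */
	hub->read_mmr_count++;
	if (!hub->mmr_pending)
		return false;

	/* The first cpu to flip the flag becomes the owner. */
	if (atomic_compare_exchange_strong(&hub->in_nmi, &expected, 1))
		atomic_store(&hub->cpu_owner, cpu);
	return true;
}
```

Once the first cpu on a hub has set in_nmi, later cpus on the same hub
return from the fast path without touching the MMR.  Because the flag lives
in that hub's local memory, the check does not cross NumaLink.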

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/include/asm/uv/uv_hub.h   |   57 +++
 arch/x86/include/asm/uv/uv_mmrs.h  |   31 ++
 arch/x86/kernel/apic/x2apic_uv_x.c |1 
 arch/x86/platform/uv/uv_nmi.c  |  551 ++---
 4 files changed, 599 insertions(+), 41 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv_hub.h
+++ linux/arch/x86/include/asm/uv/uv_hub.h
@@ -502,8 +502,8 @@ struct uv_blade_info {
unsigned short  nr_online_cpus;
unsigned short  pnode;
short   memory_nid;
-   spinlock_t  nmi_lock;
-   unsigned long   nmi_count;
+   spinlock_t  nmi_lock;   /* obsolete, see uv_hub_nmi */
+   unsigned long   nmi_count;  /* obsolete, see uv_hub_nmi */
 };
 extern struct uv_blade_info *uv_blade_info;
 extern short *uv_node_to_blade;
@@ -576,6 +576,59 @@ static inline int uv_num_possible_blades
return uv_possible_blades;
 }
 
+/* Per Hub NMI support */
+extern void uv_nmi_setup(void);
+
+/* BMC sets a bit this MMR non-zero before sending an NMI */
+#define UVH_NMI_MMR            UVH_SCRATCH5
+#define UVH_NMI_MMR_CLEAR      UVH_SCRATCH5_ALIAS
+#define UVH_NMI_MMR_SHIFT      63
+#define UVH_NMI_MMR_TYPE       "SCRATCH5"
+
+/* Newer SMM NMI handler, not present in all systems */
+#define UVH_NMI_MMRX           UVH_EVENT_OCCURRED0
+#define UVH_NMI_MMRX_CLEAR     UVH_EVENT_OCCURRED0_ALIAS
+#define UVH_NMI_MMRX_SHIFT     (is_uv1_hub() ? \
+                               UV1H_EVENT_OCCURRED0_EXTIO_INT0_SHFT :\
+                               UVXH_EVENT_OCCURRED0_EXTIO_INT0_SHFT)
+#define UVH_NMI_MMRX_TYPE      "EXTIO_INT0"
+
+/* Non-zero indicates newer SMM NMI handler present */
+#define UVH_NMI_MMRX_SUPPORTED UVH_EXTIO_INT0_BROADCAST
+
+/* Indicates to BIOS that we want to use the newer SMM NMI handler */
+#define UVH_NMI_MMRX_REQ       UVH_SCRATCH5_ALIAS_2
+#define UVH_NMI_MMRX_REQ_SHIFT 62
+
+struct uv_hub_nmi_s {
+   raw_spinlock_t  nmi_lock;
+   atomic_t        in_nmi;         /* flag this node in UV NMI IRQ */
+   atomic_t        cpu_owner;      /* last locker of this struct */
+   atomic_t        read_mmr_count; /* count of MMR reads */
+   atomic_t        nmi_count;      /* count of true UV NMIs */
+   unsigned long   nmi_value;      /* last value read from NMI MMR */
+};
+
+struct uv_cpu_nmi_s {
+   struct uv_hub_nmi_s *hub;
+

[PATCH 0/9] x86/UV/KDB/NMI: Updates for NMI/KDB handler for SGI UV

2013-09-05 Thread Mike Travis


V2:  Split KDB updates from NMI updates.  Broke up the big patch to
 uv_nmi.c into smaller patches.  Updated to the latest linux
 kernel version.

The current UV NMI handler has not been updated for the changes in the
system NMI handler and the perf operations.  The UV NMI handler reads
an MMR in the UV Hub to check to see if the NMI event was caused by
the external 'system NMI' that the operator can initiate on the System
Mgmt Controller.

The problem arises when the perf tools are running, causing millions of
perf events per second on very large CPU count systems.  Previously this
was okay because the perf NMI handler ran at a higher priority on the
NMI call chain and if the NMI was a perf event, it would stop calling
other NMI handlers remaining on the NMI call chain.

Now the system NMI handler calls all the handlers on the NMI call
chain including the UV NMI handler.  This causes the UV NMI handler
to read the MMRs at the same millions per second rate.  This can lead
to significant performance loss and possible system failures.  It can
also cause thousands of 'Dazed and Confused' messages to be sent to the
system console.  This effectively makes perf tools unusable on UV systems.

This patch set addresses this problem and allows the perf tools to run on
UV without impacting performance and causing system failures.


[PATCH 7/9] KGDB/KDB: add new system NMI entry code to KDB

2013-09-05 Thread Mike Travis

This patch adds a new "KDB_REASON" code (KDB_REASON_SYSTEM_NMI).  The
change is purely cosmetic: it distinguishes an operator-initiated system
NMI from the other reasons an NMI may occur, which usually follow an
error.  The registers are also not dumped, to more closely match what is
displayed when KDB is entered manually via the sysrq 'g' key.

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 include/linux/kdb.h |1 +
 include/linux/kgdb.h|1 +
 kernel/debug/debug_core.c   |5 +
 kernel/debug/kdb/kdb_debugger.c |5 -
 kernel/debug/kdb/kdb_main.c |3 +++
 5 files changed, 14 insertions(+), 1 deletion(-)

--- linux.orig/include/linux/kdb.h
+++ linux/include/linux/kdb.h
@@ -109,6 +109,7 @@ typedef enum {
KDB_REASON_RECURSE, /* Recursive entry to kdb;
 * regs probably valid */
KDB_REASON_SSTEP,   /* Single Step trap. - regs valid */
+   KDB_REASON_SYSTEM_NMI,  /* In NMI due to SYSTEM cmd; regs valid */
 } kdb_reason_t;
 
 extern int kdb_trap_printk;
--- linux.orig/include/linux/kgdb.h
+++ linux/include/linux/kgdb.h
@@ -52,6 +52,7 @@ extern int kgdb_connected;
 extern int kgdb_io_module_registered;
 
 extern atomic_t            kgdb_setting_breakpoint;
+extern atomic_t            kgdb_system_nmi;
 extern atomic_t            kgdb_cpu_doing_single_step;
 
 extern struct task_struct  *kgdb_usethread;
--- linux.orig/kernel/debug/debug_core.c
+++ linux/kernel/debug/debug_core.c
@@ -125,6 +125,7 @@ static atomic_t masters_in_kgdb;
 static atomic_t            slaves_in_kgdb;
 static atomic_t            kgdb_break_tasklet_var;
 atomic_t                   kgdb_setting_breakpoint;
+atomic_t                   kgdb_system_nmi;
 
 struct task_struct *kgdb_usethread;
 struct task_struct *kgdb_contthread;
@@ -760,7 +761,11 @@ int kgdb_nmicallin(int cpu, int trapnr,
 
/* Indicate there are slaves waiting */
kgdb_info[cpu].send_ready = send_ready;
+
+   /* Use new reason code "SYSTEM_NMI" */
+   atomic_inc(&kgdb_system_nmi);
kgdb_cpu_enter(ks, regs, DCPU_WANT_MASTER);
+   atomic_dec(&kgdb_system_nmi);
kgdb_do_roundup = save_kgdb_do_roundup;
kgdb_info[cpu].send_ready = NULL;
 
--- linux.orig/kernel/debug/kdb/kdb_debugger.c
+++ linux/kernel/debug/kdb/kdb_debugger.c
@@ -69,7 +69,10 @@ int kdb_stub(struct kgdb_state *ks)
if (atomic_read(&kgdb_setting_breakpoint))
reason = KDB_REASON_KEYBOARD;
 
-   if (in_nmi())
+   if (atomic_read(&kgdb_system_nmi))
+   reason = KDB_REASON_SYSTEM_NMI;
+
+   else if (in_nmi())
reason = KDB_REASON_NMI;
 
for (i = 0, bp = kdb_breakpoints; i < KDB_MAXBPT; i++, bp++) {
--- linux.orig/kernel/debug/kdb/kdb_main.c
+++ linux/kernel/debug/kdb/kdb_main.c
@@ -1200,6 +1200,9 @@ static int kdb_local(kdb_reason_t reason
   instruction_pointer(regs));
kdb_dumpregs(regs);
break;
+   case KDB_REASON_SYSTEM_NMI:
+   kdb_printf("due to System NonMaskable Interrupt\n");
+   break;
case KDB_REASON_NMI:
kdb_printf("due to NonMaskable Interrupt @ "
   kdb_machreg_fmt "\n",


[PATCH 6/9] x86/UV: Add call to KGDB/KDB from NMI handler

2013-09-05 Thread Mike Travis

This patch restores the capability to enter KDB (and KGDB) from the UV NMI
handler.  This is needed because the UV system console is not capable of
sending the 'break' signal to the serial console port.  It is also useful
when the kernel is hung in such a way that it isn't responding to normal
external I/O, so sending 'g' to sysrq-trigger does not work either.

Another benefit of the external NMI command is that all the cpus receive
the NMI signal at roughly the same time so they are more closely aligned
timewise.

It utilizes the newly added kgdb_nmicallin function to gain entry
to KGDB/KDB by the master.  The slaves still enter via the standard
kgdb_nmicallback function.  It also uses the new 'send_ready' pointer
to tell KGDB/KDB to signal the slaves when to proceed into the KGDB
slave loop.
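
The master/slave handshake can be sketched in user space like this.  The
numeric values of the SLAVE_* constants are assumptions (the patch shows
only the names), and debugger_signal_ready, master_report_failure and
slave_wait are hypothetical helpers that model the two sides.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed values; only "zero means keep waiting" is implied by the patch. */
enum { SLAVE_CLEAR = 0, SLAVE_CONTINUE = 1, SLAVE_EXIT = 2 };

static atomic_int slave_continue;

/* Models the debugger core telling the slaves to proceed via 'send_ready'. */
static void debugger_signal_ready(atomic_int *send_ready)
{
	atomic_store(send_ready, SLAVE_CONTINUE);
}

/* Models the master releasing the slaves after a failed debugger entry. */
static void master_report_failure(void)
{
	atomic_store(&slave_continue, SLAVE_EXIT);
}

/*
 * Slave side: spin until the flag leaves SLAVE_CLEAR, then report whether
 * the slave should enter the debugger (true) or simply exit (false).
 */
static bool slave_wait(void)
{
	int sig;

	while ((sig = atomic_load(&slave_continue)) == SLAVE_CLEAR)
		;	/* the kernel code calls cpu_relax() in this loop */
	return sig == SLAVE_CONTINUE;
}
```

The point of the single flag is that the slaves never have to guess: they
stay parked until the master's debugger entry either succeeds or fails, and
the value they read tells them which path to take.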

It is enabled when the nmi action is set to "kdb" and the kernel is
built with CONFIG_KDB enabled.

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/platform/uv/uv_nmi.c |   47 +-
 1 file changed, 46 insertions(+), 1 deletion(-)

--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -21,7 +21,9 @@
 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -32,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -146,8 +149,9 @@ module_param_named(retry_count, uv_nmi_r
  *  "dump" - dump process stack for each cpu
  *  "ips"  - dump IP info for each cpu
  *  "kdump"- do crash dump
+ *  "kdb"  - enter KDB/KGDB (default)
  */
-static char uv_nmi_action[8] = "dump";
+static char uv_nmi_action[8] = "kdb";
 module_param_string(action, uv_nmi_action, sizeof(uv_nmi_action), 0644);
 
 static inline bool uv_nmi_action_is(const char *action)
@@ -533,6 +537,43 @@ static inline void uv_nmi_kdump(int cpu,
 }
 #endif /* !CONFIG_KEXEC */
 
+#ifdef CONFIG_KGDB_KDB
+/* Call KDB from NMI handler */
+static void uv_call_kdb(int cpu, struct pt_regs *regs, int master)
+{
+   int ret;
+
+   if (master) {
+   /* call KGDB NMI handler as MASTER */
+   ret = kgdb_nmicallin(cpu, X86_TRAP_NMI, regs,
+   &uv_nmi_slave_continue);
+   if (ret) {
+   pr_alert("KDB returned error, is kgdboc set?\n");
+   atomic_set(&uv_nmi_slave_continue, SLAVE_EXIT);
+   }
+   } else {
+   /* wait for KGDB signal that it's ready for slaves to enter */
+   int sig;
+
+   do {
+   cpu_relax();
+   sig = atomic_read(&uv_nmi_slave_continue);
+   } while (!sig);
+
+   /* call KGDB as slave */
+   if (sig == SLAVE_CONTINUE)
+   ret = kgdb_nmicallback(cpu, regs);
+   }
+   uv_nmi_sync_exit(master);
+}
+
+#else /* !CONFIG_KGDB_KDB */
+static inline void uv_call_kdb(int cpu, struct pt_regs *regs, int master)
+{
+   pr_err("UV: NMI error: KGDB/KDB is not enabled in this kernel\n");
+}
+#endif /* !CONFIG_KGDB_KDB */
+
 /*
  * UV NMI handler
  */
@@ -565,6 +606,10 @@ int uv_handle_nmi(unsigned int reason, s
if (uv_nmi_action_is("ips") || uv_nmi_action_is("dump"))
uv_nmi_dump_state(cpu, regs, master);
 
+   /* Call KDB if enabled */
+   else if (uv_nmi_action_is("kdb"))
+   uv_call_kdb(cpu, regs, master);
+
/* Clear per_cpu "in nmi" flag */
atomic_set(&uv_cpu_nmi.state, UV_NMI_STATE_OUT);
 


[PATCH 1/9] x86/UV: Move NMI support

2013-09-05 Thread Mike Travis

This patch moves the UV NMI support from the x2apic file to a new
separate uv_nmi.c file in preparation for the next sequence of patches.
It prevents upcoming bloat of the x2apic file, and has the added benefit
of putting the upcoming /sys/module parameters under the name 'uv_nmi'
instead of 'x2apic_uv_x', which was obscure.

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/include/asm/uv/uv.h   |2 
 arch/x86/kernel/apic/x2apic_uv_x.c |   69 -
 arch/x86/platform/uv/Makefile  |2 
 arch/x86/platform/uv/uv_nmi.c  |  102 +
 4 files changed, 105 insertions(+), 70 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv.h
+++ linux/arch/x86/include/asm/uv/uv.h
@@ -12,6 +12,7 @@ extern enum uv_system_type get_uv_system
 extern int is_uv_system(void);
 extern void uv_cpu_init(void);
 extern void uv_nmi_init(void);
+extern void uv_register_nmi_notifier(void);
 extern void uv_system_init(void);
 extern const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
 struct mm_struct *mm,
@@ -25,6 +26,7 @@ static inline enum uv_system_type get_uv
 static inline int is_uv_system(void)   { return 0; }
 static inline void uv_cpu_init(void)   { }
 static inline void uv_system_init(void)    { }
+static inline void uv_register_nmi_notifier(void) { }
 static inline const struct cpumask *
 uv_flush_tlb_others(const struct cpumask *cpumask, struct mm_struct *mm,
unsigned long start, unsigned long end, unsigned int cpu)
--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -39,12 +39,6 @@
 #include 
 #include 
 
-/* BMC sets a bit this MMR non-zero before sending an NMI */
-#define UVH_NMI_MMR            UVH_SCRATCH5
-#define UVH_NMI_MMR_CLEAR      (UVH_NMI_MMR + 8)
-#define UV_NMI_PENDING_MASK    (1UL << 63)
-DEFINE_PER_CPU(unsigned long, cpu_last_nmi_count);
-
 DEFINE_PER_CPU(int, x2apic_extra_bits);
 
 #define PR_DEVEL(fmt, args...) pr_devel("%s: " fmt, __func__, args)
@@ -58,7 +52,6 @@ int uv_min_hub_revision_id;
 EXPORT_SYMBOL_GPL(uv_min_hub_revision_id);
 unsigned int uv_apicid_hibits;
 EXPORT_SYMBOL_GPL(uv_apicid_hibits);
-static DEFINE_SPINLOCK(uv_nmi_lock);
 
 static struct apic apic_x2apic_uv_x;
 
@@ -854,68 +847,6 @@ void uv_cpu_init(void)
set_x2apic_extra_bits(uv_hub_info->pnode);
 }
 
-/*
- * When NMI is received, print a stack trace.
- */
-int uv_handle_nmi(unsigned int reason, struct pt_regs *regs)
-{
-   unsigned long real_uv_nmi;
-   int bid;
-
-   /*
-* Each blade has an MMR that indicates when an NMI has been sent
-* to cpus on the blade. If an NMI is detected, atomically
-* clear the MMR and update a per-blade NMI count used to
-* cause each cpu on the blade to notice a new NMI.
-*/
-   bid = uv_numa_blade_id();
-   real_uv_nmi = (uv_read_local_mmr(UVH_NMI_MMR) & UV_NMI_PENDING_MASK);
-
-   if (unlikely(real_uv_nmi)) {
-   spin_lock(&uv_blade_info[bid].nmi_lock);
-   real_uv_nmi = (uv_read_local_mmr(UVH_NMI_MMR) & UV_NMI_PENDING_MASK);
-   if (real_uv_nmi) {
-   uv_blade_info[bid].nmi_count++;
-   uv_write_local_mmr(UVH_NMI_MMR_CLEAR, UV_NMI_PENDING_MASK);
-   }
-   spin_unlock(&uv_blade_info[bid].nmi_lock);
-   }
-
-   if (likely(__get_cpu_var(cpu_last_nmi_count) == uv_blade_info[bid].nmi_count))
-   return NMI_DONE;
-   return NMI_DONE;
-
-   __get_cpu_var(cpu_last_nmi_count) = uv_blade_info[bid].nmi_count;
-
-   /*
-* Use a lock so only one cpu prints at a time.
-* This prevents intermixed output.
-*/
-   spin_lock(&uv_nmi_lock);
-   pr_info("UV NMI stack dump cpu %u:\n", smp_processor_id());
-   dump_stack();
-   spin_unlock(&uv_nmi_lock);
-
-   return NMI_HANDLED;
-}
-
-void uv_register_nmi_notifier(void)
-{
-   if (register_nmi_handler(NMI_UNKNOWN, uv_handle_nmi, 0, "uv"))
-   printk(KERN_WARNING "UV NMI handler failed to register\n");
-}
-
-void uv_nmi_init(void)
-{
-   unsigned int value;
-
-   /*
-* Unmask NMI on all cpus
-*/
-   value = apic_read(APIC_LVT1) | APIC_DM_NMI;
-   value &= ~APIC_LVT_MASKED;
-   apic_write(APIC_LVT1, value);
-}
-
 void __init uv_system_init(void)
 {
union uvh_rh_gam_config_mmr_u  m_n_config;
--- linux.orig/arch/x86/platform/uv/Makefile
+++ linux/arch/x86/platform/uv/Makefile
@@ -1 +1 @@
-obj-$(CONFIG_X86_UV)   += tlb_uv.o bios_uv.o uv_irq.o uv_sysfs.o uv_time.o
+obj-$(CONFIG_X86_UV)   += tlb_uv.o bios_uv.o uv_irq.o uv_sysfs.o uv_time.o uv_nmi.o
--- /dev/null
+++ lin

Re: [PATCH 7/9] KGDB/KDB: add new system NMI entry code to KDB

2013-09-06 Thread Mike Travis



On 9/5/2013 10:00 PM, Jason Wessel wrote:
> On 09/05/2013 05:50 PM, Mike Travis wrote:
>> This patch adds a new "KDB_REASON" code (KDB_REASON_SYSTEM_NMI).  This
>> is purely cosmetic to distinguish it from the other various reasons that
>> NMI may occur and are usually after an error occurred.  Also the dumping
>> of registers is not done to more closely match what is displayed when KDB
>> is entered manually via the sysreq 'g' key.
> 
> 
> This patch is not quite right.   See below.
> 
> 
>>
>> Signed-off-by: Mike Travis 
>> Reviewed-by: Dimitri Sivanich 
>> Reviewed-by: Hedi Berriche 
>> ---
>>  include/linux/kdb.h |1 +
>>  include/linux/kgdb.h|1 +
>>  kernel/debug/debug_core.c   |5 +
>>  kernel/debug/kdb/kdb_debugger.c |5 -
>>  kernel/debug/kdb/kdb_main.c |3 +++
>>  5 files changed, 14 insertions(+), 1 deletion(-)
>>
>> --- linux.orig/include/linux/kdb.h
>> +++ linux/include/linux/kdb.h
>> @@ -109,6 +109,7 @@ typedef enum {
>>  KDB_REASON_RECURSE, /* Recursive entry to kdb;
>>   * regs probably valid */
>>  KDB_REASON_SSTEP,   /* Single Step trap. - regs valid */
>> +KDB_REASON_SYSTEM_NMI,  /* In NMI due to SYSTEM cmd; regs valid */
>>  } kdb_reason_t;
>>  
>>  extern int kdb_trap_printk;
>> --- linux.orig/include/linux/kgdb.h
>> +++ linux/include/linux/kgdb.h
>> @@ -52,6 +52,7 @@ extern int kgdb_connected;
>>  extern int kgdb_io_module_registered;
>>  
>>  extern atomic_t kgdb_setting_breakpoint;
>> +extern atomic_t kgdb_system_nmi;
> 
> 
> We don't need extra atomics. You should add another variable to the
> kgdb_state which is processor specific in this case.
> 
> Better yet, just set the ks->err_code properly in your
> kgdb_nmicallin() or in the origination call to kgdb_nmicallback() from
> your nmi handler (remember I still have the question pending if we
> actually need kgdb_nmicallin() in the first place). You already did the
> work of adding another NMI type to the enum. We just need to use the
> ks->err_code variable as well.

Good idea, I hadn't thought of using that field.  In fact, it
simplified the patch enough that I just folded it into the other.
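
For illustration only, here is roughly how a reason could be derived from
per-entry state instead of a global atomic.  The struct entry_state, the
enum kdb_reason_model and pick_reason are hypothetical names for this
sketch and are not the kgdb_state or kdb_reason_t definitions.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical, reduced set of reason codes for the sketch. */
enum kdb_reason_model { REASON_OTHER, REASON_NMI, REASON_SYSTEM_NMI };

/* Hypothetical per-entry state carried into the debugger stub. */
struct entry_state {
	int err_code;   /* set by the caller that originated the entry */
	int signo;      /* signal number associated with the entry */
};

/*
 * The originating NMI handler marks the entry through err_code, so the
 * stub needs no shared global to tell a system NMI from any other NMI.
 */
static enum kdb_reason_model pick_reason(const struct entry_state *ks,
					 bool in_nmi_ctx)
{
	if (ks->err_code == REASON_SYSTEM_NMI && ks->signo == SIGTRAP)
		return REASON_SYSTEM_NMI;
	if (in_nmi_ctx)
		return REASON_NMI;
	return REASON_OTHER;
}
```

Since the marker travels with the entry state, there is nothing to
increment before entering the debugger and nothing to decrement afterwards.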

I'll address your other question separately.

Thanks!
Mike
> 
> 
>>  extern atomic_t kgdb_cpu_doing_single_step;
>>  
>>  extern struct task_struct   *kgdb_usethread;
>> --- linux.orig/kernel/debug/debug_core.c
>> +++ linux/kernel/debug/debug_core.c
>> @@ -125,6 +125,7 @@ static atomic_t  masters_in_kgdb;
>>  static atomic_t slaves_in_kgdb;
>>  static atomic_t kgdb_break_tasklet_var;
>>  atomic_tkgdb_setting_breakpoint;
>> +atomic_tkgdb_system_nmi;
>>  
>>  struct task_struct  *kgdb_usethread;
>>  struct task_struct  *kgdb_contthread;
>> @@ -760,7 +761,11 @@ int kgdb_nmicallin(int cpu, int trapnr,
>>  
>>  /* Indicate there are slaves waiting */
>>  kgdb_info[cpu].send_ready = send_ready;
>> +
>> +/* Use new reason code "SYSTEM_NMI" */
>> +atomic_inc(&kgdb_system_nmi);
>>  kgdb_cpu_enter(ks, regs, DCPU_WANT_MASTER);
>> +atomic_dec(&kgdb_system_nmi);
>>  kgdb_do_roundup = save_kgdb_do_roundup;
>>  kgdb_info[cpu].send_ready = NULL;
>>  
>> --- linux.orig/kernel/debug/kdb/kdb_debugger.c
>> +++ linux/kernel/debug/kdb/kdb_debugger.c
>> @@ -69,7 +69,10 @@ int kdb_stub(struct kgdb_state *ks)
>>  if (atomic_read(&kgdb_setting_breakpoint))
>>  reason = KDB_REASON_KEYBOARD;
>>  
>> -if (in_nmi())
>> +if (atomic_read(&kgdb_system_nmi))
>> +reason = KDB_REASON_SYSTEM_NMI;
> 
> 
> This would get changed to if (ks->err == KDB_REASON_SYSNMI &&
> ks->signo == SIGTRAP)
> 
> Cheers,
> Jason.
> 
>> +
>> +else if (in_nmi())
>>  reason = KDB_REASON_NMI;
>>  
>>  for (i = 0, bp = kdb_breakpoints; i < KDB_MAXBPT; i++, bp++) {
>> --- linux.orig/kernel/debug/kdb/kdb_main.c
>> +++ linux/kernel/debug/kdb/kdb_main.c
>> @@ -1200,6 +1200,9 @@ static int kdb_local(kdb_reason_t reason
>> instruction_pointer(regs));
>>  kdb_dumpregs(regs);
>>  break;
>> +case KDB_REASON_SYSTEM_NMI:
>> +kdb_printf("due to System NonMaskable Interrupt\n");
>> +break;
>>  case KDB_REASON_NMI:
>>  kdb_printf("due to NonMaskable Interrupt @ "
>> kdb_machreg_fmt "\n",
>>
>> -- 
>>
> 

Re: [PATCH 9/9] x86/UV: Add ability to disable UV NMI handler

2013-09-09 Thread Mike Travis

On 9/9/2013 5:43 AM, Peter Zijlstra wrote:
> On Thu, Sep 05, 2013 at 05:50:41PM -0500, Mike Travis wrote:
>> For performance reasons, the NMI handler may be disabled to lessen the
>> performance impact caused by the multiple perf tools running concurently.
>> If the system nmi command is issued when the UV NMI handler is disabled,
>> the "Dazed and Confused" messages occur for all cpus.  The NMI handler is
>> disabled by setting the nmi disabled variable to '1'.  Setting it back to
>> '0' will re-enable the NMI handler.
> 
> I'm not entirely sure why this is still needed now that you've moved all
> really expensive bits into the UNKNOWN handler.
> 

Yes, it could be considered optional.  My primary use was to isolate
new bugs I found to see if my NMI changes were causing them.  But it
appears that they are not since the problems occur with or without
using the NMI entry into KDB.  So it can be safely removed.

(The basic problem is that if you hang out in KDB too long the machine
locks up.  Another problem is that the rcu stall detector does not have a
means to be "touched" like the nmi_watchdog_timer, so it fires off anywhere
from a few to many, many messages.  Also, any network connections will time
out if you are in KDB for more than, say, 20 or 30 seconds.)

One other problem is with the perf tool.  Running more than about 2 or 3
perf top instances on a medium (1k cpu threads) sized system makes them
start behaving badly, with a bunch of NMI stackdumps appearing on the
console.  Eventually the system becomes unusable.

On a large system (4k), the perf tools get an error message (sorry, I
don't have it handy at the moment) that basically implies that the
perf config option is not set.  Again, I wanted to remove the new
NMI handler to ensure that it wasn't doing something weird, and
it wasn't.

Thanks,
Mike

Re: [PATCH 9/9] x86/UV: Add ability to disable UV NMI handler

2013-09-12 Thread Mike Travis



On 9/12/2013 11:35 AM, Paul E. McKenney wrote:
> On Thu, Sep 12, 2013 at 10:27:31AM -0700, Paul E. McKenney wrote:
>> On Tue, Sep 10, 2013 at 11:03:49AM +0200, Peter Zijlstra wrote:
>>> On Mon, Sep 09, 2013 at 10:07:03AM -0700, Mike Travis wrote:
>>>> On 9/9/2013 5:43 AM, Peter Zijlstra wrote:
>>>>> On Thu, Sep 05, 2013 at 05:50:41PM -0500, Mike Travis wrote:
>>>>>> For performance reasons, the NMI handler may be disabled to lessen the
>>>>>> performance impact caused by the multiple perf tools running concurently.
>>>>>> If the system nmi command is issued when the UV NMI handler is disabled,
>>>>>> the "Dazed and Confused" messages occur for all cpus.  The NMI handler is
>>>>>> disabled by setting the nmi disabled variable to '1'.  Setting it back to
>>>>>> '0' will re-enable the NMI handler.
>>>>>
>>>>> I'm not entirely sure why this is still needed now that you've moved all
>>>>> really expensive bits into the UNKNOWN handler.
>>>>>
>>>>
>>>> Yes, it could be considered optional.  My primary use was to isolate
>>>> new bugs I found to see if my NMI changes were causing them.  But it
>>>> appears that they are not since the problems occur with or without
>>>> using the NMI entry into KDB.  So it can be safely removed.
>>>
>>> OK, as a debug option it might make sense, but removing it is (of course)
>>> fine with me ;-)
>>>
>>>> (The basic problem is that if you hang out in KDB too long the machine
>>>> locks up.  
>>>
>>> Yeah, known issue. Not much you can do about it either I suspect. The
>>> system generally isn't build for things like that.
>>>
>>>> Other problems like the rcu stall detector does not have a
>>>> means to be "touched" like the nmi_watchdog_timer so it fires off a
>>>> few to many, many messages.  
>>>
>>> That however might be easily cured if you ask Paul nicely ;-)
>>
>> RCU's grace-period mechanism is supposed to be what touches it.  ;-)
>>
>> But what is it that you are looking for?  If you want to silence it
>> completely, the rcu_cpu_stall_suppress boot/sysfs parameter is what
>> you want to use.
>>
>>>> Another, any network connections will time
>>>> out if you are in KDB more than say 20 or 30 seconds.)
>>
>> Ah, you are looking for RCU to refrain from complaining about grace
>> periods that have been delayed by breakpoints in the kernel?  Is there
>> some way that RCU can learn that a breakpoint has happened?  If so,
>> this should not be hard.
> 
> But wait...  RCU relies on the jiffies counter for RCU CPU stall warnings.
> Doesn't the jiffies counter stop during breakpoints?
> 
>   Thanx, Paul

All cpus entering the UV NMI event use local_irq_save (as does the
entry into KGDB/KDB).  So the question is really what happens after
all the cpus do the local_irq_restore.  The hardware clocks are of
course still running.

> 
>> If not, I must fall back on the rcu_cpu_stall_suppress that I mentioned
>> earlier.
>>
>>>> One other problem is with the perf tool.  It seems running more than
>>>> about 2 or 3 perf top instances on a medium (1k cpu threads) sized
>>>> system, they start behaving badly with a bunch of NMI stackdumps
>>>> appearing on the console.  Eventually the system become unusable.
>>>
>>> Yuck.. I haven't seen anything like that on the 'tiny' systems I have :/
>>
>> Indeed, with that definition of "medium", large must be truly impressive!
>>
>>  Thanx, Paul
>>
>>>> On a large system (4k), the perf tools get an error message (sorry
>>>> don't have it handy at the moment) the basically implies that the
>>>> perf config option is not set.  Again, I wanted to remove the new
>>>> NMI handler to insure that it wasn't doing something weird, and
>>>> it wasn't.
>>>
>>> Cute.. 
> 

Re: [PATCH 9/9] x86/UV: Add ability to disable UV NMI handler

2013-09-12 Thread Mike Travis



On 9/12/2013 10:27 AM, Paul E. McKenney wrote:
> On Tue, Sep 10, 2013 at 11:03:49AM +0200, Peter Zijlstra wrote:
>> On Mon, Sep 09, 2013 at 10:07:03AM -0700, Mike Travis wrote:
>>> On 9/9/2013 5:43 AM, Peter Zijlstra wrote:
>>>> On Thu, Sep 05, 2013 at 05:50:41PM -0500, Mike Travis wrote:
>>>>> For performance reasons, the NMI handler may be disabled to lessen the
>>>>> performance impact caused by the multiple perf tools running concurently.
>>>>> If the system nmi command is issued when the UV NMI handler is disabled,
>>>>> the "Dazed and Confused" messages occur for all cpus.  The NMI handler is
>>>>> disabled by setting the nmi disabled variable to '1'.  Setting it back to
>>>>> '0' will re-enable the NMI handler.
>>>>
>>>> I'm not entirely sure why this is still needed now that you've moved all
>>>> really expensive bits into the UNKNOWN handler.
>>>>
>>>
>>> Yes, it could be considered optional.  My primary use was to isolate
>>> new bugs I found to see if my NMI changes were causing them.  But it
>>> appears that they are not since the problems occur with or without
>>> using the NMI entry into KDB.  So it can be safely removed.
>>
>> OK, as a debug option it might make sense, but removing it is (of course)
>> fine with me ;-)
>>
>>> (The basic problem is that if you hang out in KDB too long the machine
>>> locks up.  
>>
>> Yeah, known issue. Not much you can do about it either I suspect. The
>> system generally isn't build for things like that.
>>
>>> Other problems like the rcu stall detector does not have a
>>> means to be "touched" like the nmi_watchdog_timer so it fires off a
>>> few to many, many messages.  
>>
>> That however might be easily cured if you ask Paul nicely ;-)
> 
> RCU's grace-period mechanism is supposed to be what touches it.  ;-)
> 
> But what is it that you are looking for?  If you want to silence it
> completely, the rcu_cpu_stall_suppress boot/sysfs parameter is what
> you want to use.

We have by default rcutree.rcu_cpu_stall_suppress=1 on the kernel
cmdline.  I'll double check if it was set during my testing.

> 
>>> Another, any network connections will time
>>> out if you are in KDB more than say 20 or 30 seconds.)
> 
> Ah, you are looking for RCU to refrain from complaining about grace
> periods that have been delayed by breakpoints in the kernel?  Is there
> some way that RCU can learn that a breakpoint has happened?  If so,
> this should not be hard.

Yes, exactly.  After a UV NMI event which might or might not call KDB,
but definitely can consume some time with the system stopped, I have
these notifications:

static void uv_nmi_touch_watchdogs(void)
{
	touch_softlockup_watchdog_sync();
	clocksource_touch_watchdog();
	rcu_cpu_stall_reset();
	touch_nmi_watchdog();
}


In all the cases I checked, I had all the cpus in the NMI event so
I don't think it was a straggler who triggered the problem.  One
question though, the above is called by all cpus exiting the NMI
event.  Should I limit that to only one cpu?

Note btw, that this also happens when KGDB/KDB is entered via the
sysrq-trigger 'g' event.

Perhaps there is some other timer that is going off?

> If not, I must fall back on the rcu_cpu_stall_suppress that I mentioned
> earlier.
> 
>>> One other problem is with the perf tool.  It seems running more than
>>> about 2 or 3 perf top instances on a medium (1k cpu threads) sized
>>> system, they start behaving badly with a bunch of NMI stackdumps
>>> appearing on the console.  Eventually the system becomes unusable.
>>
>> Yuck.. I haven't seen anything like that on the 'tiny' systems I have :/
> 
> Indeed, with that definition of "medium", large must be truly impressive!

I say medium because it's only one rack w/~4TB of memory (and quite
popular).  Large would be 4k cpus/64TB.  Not sure yet what is "huge",
at least in terms of an SSI system.

> 
>   Thanx, Paul
> 
>>> On a large system (4k), the perf tools get an error message (sorry
>>> don't have it handy at the moment) that basically implies that the
>>> perf config option is not set.  Again, I wanted to remove the new
>>> NMI handler to ensure that it wasn't doing something weird, and
>>> it wasn't.
>>
>> Cute.. 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
>>
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 7/7] x86/UV: Add uvtrace support

2013-09-23 Thread Mike Travis

This patch adds support for the uvtrace module by providing a skeleton
call to the registered trace function.  It also provides another separate
'NMI' tracer that is triggered by the system wide 'power nmi' command.
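As a rough illustration of the hook pattern described above, here is a minimal userspace sketch, not the kernel code.  The names trace_func, trace(), demo_tracer() and do_work() are hypothetical stand-ins; only the shape of the macro follows the uv_trace() definition in this patch.  The point is that a trace call costs one pointer test until something registers a function.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hook pointer; stays NULL until a (hypothetical) trace module registers. */
static void (*trace_func)(const char *f, int l, const char *fmt, ...);

/*
 * Same shape as the uv_trace() macro in the patch, minus unlikely().
 * ##__VA_ARGS__ is a GNU extension accepted by gcc and clang.
 */
#define trace(fmt, ...)							\
do {									\
	if (trace_func)							\
		(trace_func)(__func__, __LINE__, fmt, ##__VA_ARGS__);	\
} while (0)

static int trace_calls;		/* how many times the hook ran */
static char trace_last[128];	/* last formatted message */

/* Example hook: formats the message into trace_last. */
static void demo_tracer(const char *f, int l, const char *fmt, ...)
{
	va_list ap;

	(void)f;
	(void)l;
	assert(fmt != NULL);
	va_start(ap, fmt);
	vsnprintf(trace_last, sizeof(trace_last), fmt, ap);
	va_end(ap);
	trace_calls++;
}

/* Emits one trace point; a no-op while no hook is registered. */
static void do_work(int cpu)
{
	trace("cpu %d entered", cpu);
}
```

Calling do_work() before trace_func is assigned leaves trace_calls untouched; after assigning trace_func = demo_tracer, each call records one message.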

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/include/asm/uv/uv.h  |8 
 arch/x86/platform/uv/uv_nmi.c |   13 -
 2 files changed, 20 insertions(+), 1 deletion(-)

--- linux.orig/arch/x86/include/asm/uv/uv.h
+++ linux/arch/x86/include/asm/uv/uv.h
@@ -14,6 +14,13 @@ extern void uv_cpu_init(void);
 extern void uv_nmi_init(void);
 extern void uv_register_nmi_notifier(void);
 extern void uv_system_init(void);
+extern void (*uv_trace_nmi_func)(unsigned int reason, struct pt_regs *regs);
+extern void (*uv_trace_func)(const char *f, const int l, const char *fmt, ...);
+#define uv_trace(fmt, ...) \
+do {   \
+   if (unlikely(uv_trace_func))\
+   (uv_trace_func)(__func__, __LINE__, fmt, ##__VA_ARGS__);\
+} while (0)
 extern const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
 struct mm_struct *mm,
 unsigned long start,
@@ -26,6 +33,7 @@ static inline enum uv_system_type get_uv
 static inline int is_uv_system(void)   { return 0; }
 static inline void uv_cpu_init(void)   { }
 static inline void uv_system_init(void){ }
+static inline void uv_trace(void *fmt, ...){ }
 static inline void uv_register_nmi_notifier(void) { }
 static inline const struct cpumask *
 uv_flush_tlb_others(const struct cpumask *cpumask, struct mm_struct *mm,
--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -1,5 +1,5 @@
 /*
- * SGI NMI support routines
+ * SGI NMI/TRACE support routines
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -39,6 +39,13 @@
 #include 
 #include 
 
+void (*uv_trace_func)(const char *f, const int l, const char *fmt, ...);
+EXPORT_SYMBOL(uv_trace_func);
+
+void (*uv_trace_nmi_func)(unsigned int reason, struct pt_regs *regs);
+EXPORT_SYMBOL(uv_trace_nmi_func);
+
+
 /*
  * UV handler for NMI
  *
@@ -592,6 +599,10 @@ int uv_handle_nmi(unsigned int reason, s
return NMI_DONE;
}
 
+   /* Call possible NMI trace function */
+   if (unlikely(uv_trace_nmi_func))
+   (uv_trace_nmi_func)(reason, regs);
+
/* Indicate we are the first CPU into the NMI handler */
master = (atomic_read(&uv_nmi_cpu) == cpu);
 


[PATCH 0/7] x86/UV/KDB/NMI: Updates for NMI/KDB handler for SGI UV

2013-09-23 Thread Mike Travis


V3:  Reduce number of changes to KGDB/KDB code to simplify special
 handling of SYSTEM NMI.  Remove disable UV NMI function.

V2:  Split KDB updates from NMI updates.  Broke up the big patch to
 uv_nmi.c into smaller patches.  Updated to the latest linux
 kernel version.

The current UV NMI handler has not been updated for the changes in the
system NMI handler and the perf operations.  The UV NMI handler reads
an MMR in the UV Hub to check to see if the NMI event was caused by
the external 'system NMI' that the operator can initiate on the System
Mgmt Controller.

The problem arises when the perf tools are running, causing millions of
perf events per second on very large CPU count systems.  Previously this
was okay because the perf NMI handler ran at a higher priority on the
NMI call chain and if the NMI was a perf event, it would stop calling
other NMI handlers remaining on the NMI call chain.

Now the system NMI handler calls all the handlers on the NMI call
chain including the UV NMI handler.  This causes the UV NMI handler
to read the MMRs at the same millions per second rate.  This can lead
to significant performance loss and possible system failures.  It also
can cause thousands of 'Dazed and Confused' messages to be sent to the
system console.  This effectively makes perf tools unusable on UV systems.

This patch set addresses this problem and allows the perf tools to run on
UV without impacting performance and causing system failures.


[PATCH 1/7] x86/UV: Move NMI support

2013-09-23 Thread Mike Travis

This patch moves the UV NMI support from the x2apic file to a new
separate uv_nmi.c file in preparation for the next sequence of patches.
It prevents upcoming bloat of the x2apic file, and has the added benefit
of putting the upcoming /sys/module parameters under the name 'uv_nmi'
instead of 'x2apic_uv_x', which was obscure.

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/include/asm/uv/uv.h   |2 
 arch/x86/kernel/apic/x2apic_uv_x.c |   69 -
 arch/x86/platform/uv/Makefile  |2 
 arch/x86/platform/uv/uv_nmi.c  |  102 +
 4 files changed, 105 insertions(+), 70 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv.h
+++ linux/arch/x86/include/asm/uv/uv.h
@@ -12,6 +12,7 @@ extern enum uv_system_type get_uv_system
 extern int is_uv_system(void);
 extern void uv_cpu_init(void);
 extern void uv_nmi_init(void);
+extern void uv_register_nmi_notifier(void);
 extern void uv_system_init(void);
 extern const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
 struct mm_struct *mm,
@@ -25,6 +26,7 @@ static inline enum uv_system_type get_uv
 static inline int is_uv_system(void)   { return 0; }
 static inline void uv_cpu_init(void)   { }
 static inline void uv_system_init(void){ }
+static inline void uv_register_nmi_notifier(void) { }
 static inline const struct cpumask *
 uv_flush_tlb_others(const struct cpumask *cpumask, struct mm_struct *mm,
unsigned long start, unsigned long end, unsigned int cpu)
--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -39,12 +39,6 @@
 #include 
 #include 
 
-/* BMC sets a bit this MMR non-zero before sending an NMI */
-#define UVH_NMI_MMR		UVH_SCRATCH5
-#define UVH_NMI_MMR_CLEAR	(UVH_NMI_MMR + 8)
-#define UV_NMI_PENDING_MASK	(1UL << 63)
-DEFINE_PER_CPU(unsigned long, cpu_last_nmi_count);
-
 DEFINE_PER_CPU(int, x2apic_extra_bits);
 
 #define PR_DEVEL(fmt, args...) pr_devel("%s: " fmt, __func__, args)
@@ -58,7 +52,6 @@ int uv_min_hub_revision_id;
 EXPORT_SYMBOL_GPL(uv_min_hub_revision_id);
 unsigned int uv_apicid_hibits;
 EXPORT_SYMBOL_GPL(uv_apicid_hibits);
-static DEFINE_SPINLOCK(uv_nmi_lock);
 
 static struct apic apic_x2apic_uv_x;
 
@@ -854,68 +847,6 @@ void uv_cpu_init(void)
set_x2apic_extra_bits(uv_hub_info->pnode);
 }
 
-/*
- * When NMI is received, print a stack trace.
- */
-int uv_handle_nmi(unsigned int reason, struct pt_regs *regs)
-{
-   unsigned long real_uv_nmi;
-   int bid;
-
-   /*
-* Each blade has an MMR that indicates when an NMI has been sent
-* to cpus on the blade. If an NMI is detected, atomically
-* clear the MMR and update a per-blade NMI count used to
-* cause each cpu on the blade to notice a new NMI.
-*/
-   bid = uv_numa_blade_id();
-   real_uv_nmi = (uv_read_local_mmr(UVH_NMI_MMR) & UV_NMI_PENDING_MASK);
-
-   if (unlikely(real_uv_nmi)) {
-   spin_lock(&uv_blade_info[bid].nmi_lock);
-   real_uv_nmi = (uv_read_local_mmr(UVH_NMI_MMR) & UV_NMI_PENDING_MASK);
-   if (real_uv_nmi) {
-   uv_blade_info[bid].nmi_count++;
-   uv_write_local_mmr(UVH_NMI_MMR_CLEAR, UV_NMI_PENDING_MASK);
-   }
-   spin_unlock(&uv_blade_info[bid].nmi_lock);
-   }
-
-   if (likely(__get_cpu_var(cpu_last_nmi_count) == uv_blade_info[bid].nmi_count))
-   return NMI_DONE;
-
-   __get_cpu_var(cpu_last_nmi_count) = uv_blade_info[bid].nmi_count;
-
-   /*
-* Use a lock so only one cpu prints at a time.
-* This prevents intermixed output.
-*/
-   spin_lock(&uv_nmi_lock);
-   pr_info("UV NMI stack dump cpu %u:\n", smp_processor_id());
-   dump_stack();
-   spin_unlock(&uv_nmi_lock);
-
-   return NMI_HANDLED;
-}
-
-void uv_register_nmi_notifier(void)
-{
-   if (register_nmi_handler(NMI_UNKNOWN, uv_handle_nmi, 0, "uv"))
-   printk(KERN_WARNING "UV NMI handler failed to register\n");
-}
-
-void uv_nmi_init(void)
-{
-   unsigned int value;
-
-   /*
-* Unmask NMI on all cpus
-*/
-   value = apic_read(APIC_LVT1) | APIC_DM_NMI;
-   value &= ~APIC_LVT_MASKED;
-   apic_write(APIC_LVT1, value);
-}
-
 void __init uv_system_init(void)
 {
union uvh_rh_gam_config_mmr_u  m_n_config;
--- linux.orig/arch/x86/platform/uv/Makefile
+++ linux/arch/x86/platform/uv/Makefile
@@ -1 +1 @@
-obj-$(CONFIG_X86_UV)   += tlb_uv.o bios_uv.o uv_irq.o uv_sysfs.o uv_time.o
+obj-$(CONFIG_X86_UV)   += tlb_uv.o bios_uv.o uv_irq.o uv_sysfs.o uv_time.o uv_nmi.o
--- /dev/null
+++ lin

[PATCH 2/7] x86/UV: Update UV support for external NMI signals

2013-09-23 Thread Mike Travis

The current UV NMI handler has not been updated for the changes in the
system NMI handler and the perf operations.  The UV NMI handler reads
an MMR in the UV Hub to check to see if the NMI event was caused by
the external 'system NMI' that the operator can initiate on the System
Mgmt Controller.

The problem arises when the perf tools are running, causing millions of
perf events per second on very large CPU count systems.  Previously this
was okay because the perf NMI handler ran at a higher priority on the
NMI call chain and if the NMI was a perf event, it would stop calling
other NMI handlers remaining on the NMI call chain.

Now the system NMI handler calls all the handlers on the NMI call
chain including the UV NMI handler.  This causes the UV NMI handler
to read the MMRs at the same millions per second rate.  This can lead
to significant performance loss and possible system failures.  It also
can cause thousands of 'Dazed and Confused' messages to be sent to the
system console.  This effectively makes perf tools unusable on UV systems.

To avoid this excessive overhead when perf tools are running, this code
has been optimized to minimize reading of the MMRs as much as possible,
by moving to the NMI_UNKNOWN notifier chain.  This chain is called only
when all the users on the standard NMI_LOCAL call chain have been called
and none of them have claimed this NMI.
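To make the effect of the chain move concrete, here is a small userspace model, offered only as a sketch.  The handler names, the mmr_reads counter and dispatch_nmi() are hypothetical; the real dispatch lives in the x86 NMI core.  It assumes what the text above says: the unknown chain is consulted only when nothing on the local chain claimed the NMI.

```c
#include <assert.h>
#include <stddef.h>

#define NMI_DONE	0
#define NMI_HANDLED	1
#define ARRAY_LEN(a)	(sizeof(a) / sizeof((a)[0]))

typedef int (*nmi_handler_t)(void);

static int mmr_reads;		/* stands in for expensive hub MMR reads */
static int perf_pending;	/* 1 if the simulated NMI is a perf event */
static int uv_pending;		/* 1 if the simulated NMI is an external UV NMI */

/* Cheap handler on the "local" chain: claims perf NMIs only. */
static int perf_handler(void)
{
	return perf_pending ? NMI_HANDLED : NMI_DONE;
}

/* Expensive handler on the "unknown" chain: must read the MMR to decide. */
static int uv_handler(void)
{
	mmr_reads++;
	return uv_pending ? NMI_HANDLED : NMI_DONE;
}

static nmi_handler_t local_chain[] = { perf_handler };
static nmi_handler_t unknown_chain[] = { uv_handler };

/* Call every handler on a chain and return how many claimed the NMI. */
static int run_chain(nmi_handler_t *chain, size_t n)
{
	int handled = 0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		handled += chain[i]();
	return handled;
}

/* The unknown chain runs only if nobody on the local chain claimed the NMI. */
static int dispatch_nmi(void)
{
	int handled = run_chain(local_chain, ARRAY_LEN(local_chain));

	if (handled)
		return handled;
	return run_chain(unknown_chain, ARRAY_LEN(unknown_chain));
}
```

In this model a stream of perf NMIs never increments mmr_reads; the counter moves only when an NMI goes unclaimed by the local chain.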

There is an exception where the NMI_LOCAL notifier chain is used.  When
the perf tools are in use, it's possible that the UV NMI was captured by
some other NMI handler and then either ignored or mistakenly processed as
a perf event.  We set a per_cpu ('ping') flag for those CPUs that ignored
the initial NMI, and then send them an IPI NMI signal.  The NMI_LOCAL
handler on each cpu does not need to read the MMR, but instead checks the
in-memory flag indicating it was pinged.  There are two module variables:
'ping_count' indicating how many requested NMI events occurred, and
'ping_misses' indicating how many stray NMI events occurred.  The latter
are most likely perf events, so the count shows the overhead of the perf
NMI interrupts and how many MMR reads were avoided.

This patch also minimizes the reads of the MMRs by having the first
cpu entering the NMI handler on each node set a per HUB in-memory
atomic value.  (Having a per HUB value avoids sending lock traffic over
NumaLink.)  Both types of UV NMIs from the SMI layer are supported.
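The per-hub flag idea can be sketched in userspace with C11 atomics.  This is a simplified model and not the locking used in the patch: struct hub_nmi and hub_check_nmi() are hypothetical names, and mmr_pending merely stands in for the hub's NMI MMR bit.  It shows only the stated intent, that once one cpu on a hub has confirmed the NMI, later cpus on that hub test memory instead of reading the MMR.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-hub state, loosely modelled on uv_hub_nmi_s. */
struct hub_nmi {
	atomic_int in_nmi;		/* set once a cpu on this hub saw the NMI */
	atomic_int read_mmr_count;	/* number of simulated MMR reads */
	bool mmr_pending;		/* stands in for the hub's NMI MMR bit */
};

/* Returns true if this cpu should treat the event as a UV NMI. */
static bool hub_check_nmi(struct hub_nmi *hub)
{
	int expected = 0;

	/* Cheap path: another cpu on this hub already confirmed the NMI. */
	if (atomic_load(&hub->in_nmi))
		return true;

	/* Slow path: read the (simulated) MMR and count the access. */
	atomic_fetch_add(&hub->read_mmr_count, 1);
	if (!hub->mmr_pending)
		return false;

	/* First cpu to get here publishes the result for its hub mates. */
	atomic_compare_exchange_strong(&hub->in_nmi, &expected, 1);
	return true;
}
```

Because the flag is per hub, the atomic traffic in this model stays within one node's state, which is the property the commit message attributes to the per HUB value.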

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/include/asm/uv/uv_hub.h   |   57 +++
 arch/x86/include/asm/uv/uv_mmrs.h  |   31 ++
 arch/x86/kernel/apic/x2apic_uv_x.c |1 
 arch/x86/platform/uv/uv_nmi.c  |  551 ++---
 4 files changed, 599 insertions(+), 41 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv_hub.h
+++ linux/arch/x86/include/asm/uv/uv_hub.h
@@ -502,8 +502,8 @@ struct uv_blade_info {
unsigned short  nr_online_cpus;
unsigned short  pnode;
short   memory_nid;
-   spinlock_t  nmi_lock;
-   unsigned long   nmi_count;
+   spinlock_t  nmi_lock;   /* obsolete, see uv_hub_nmi */
+   unsigned long   nmi_count;  /* obsolete, see uv_hub_nmi */
 };
 extern struct uv_blade_info *uv_blade_info;
 extern short *uv_node_to_blade;
@@ -576,6 +576,59 @@ static inline int uv_num_possible_blades
return uv_possible_blades;
 }
 
+/* Per Hub NMI support */
+extern void uv_nmi_setup(void);
+
+/* BMC sets a bit this MMR non-zero before sending an NMI */
+#define UVH_NMI_MMR		UVH_SCRATCH5
+#define UVH_NMI_MMR_CLEAR	UVH_SCRATCH5_ALIAS
+#define UVH_NMI_MMR_SHIFT	63
+#define	UVH_NMI_MMR_TYPE	"SCRATCH5"
+
+/* Newer SMM NMI handler, not present in all systems */
+#define UVH_NMI_MMRX   UVH_EVENT_OCCURRED0
+#define UVH_NMI_MMRX_CLEAR UVH_EVENT_OCCURRED0_ALIAS
+#define UVH_NMI_MMRX_SHIFT (is_uv1_hub() ? \
+   UV1H_EVENT_OCCURRED0_EXTIO_INT0_SHFT :\
+   UVXH_EVENT_OCCURRED0_EXTIO_INT0_SHFT)
+#define	UVH_NMI_MMRX_TYPE	"EXTIO_INT0"
+
+/* Non-zero indicates newer SMM NMI handler present */
+#define UVH_NMI_MMRX_SUPPORTED UVH_EXTIO_INT0_BROADCAST
+
+/* Indicates to BIOS that we want to use the newer SMM NMI handler */
+#define UVH_NMI_MMRX_REQ   UVH_SCRATCH5_ALIAS_2
+#define UVH_NMI_MMRX_REQ_SHIFT 62
+
+struct uv_hub_nmi_s {
+	raw_spinlock_t	nmi_lock;
+	atomic_t	in_nmi;		/* flag this node in UV NMI IRQ */
+	atomic_t	cpu_owner;	/* last locker of this struct */
+	atomic_t	read_mmr_count;	/* count of MMR reads */
+	atomic_t	nmi_count;	/* count of true UV NMIs */
+	unsigned long	nmi_value;	/* last value read from NMI MMR */
+};
+
+struct uv_cpu_nmi_s {
+   struct uv_hub_nmi_s *hub;
+

[PATCH 3/7] x86/UV: Add summary of cpu activity to UV NMI handler

2013-09-23 Thread Mike Travis

The standard NMI handler dumps the states of all the cpus.  This includes
a full register dump and stack trace.  This can be way more information
than what is needed.  This patch adds a "summary" dump that is basically
a form of the "ps" command.  It includes the symbolic IP address as well
as the command field and basic process information.

It is enabled when the nmi action is changed to "ips".
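One detail of the action check is worth spelling out: uv_nmi_action_is() compares only strlen(action) characters, so it is a prefix match.  The sketch below reproduces that comparison in userspace; nmi_action and set_action() are hypothetical stand-ins for the module parameter and a write to it.  The assumption being illustrated is that a stored value carrying a trailing newline (as can happen when the value is written with echo) still selects the intended action.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static char nmi_action[8] = "dump";

/* Same comparison as uv_nmi_action_is() in the patch: a prefix match. */
static bool action_is(const char *action)
{
	return strncmp(nmi_action, action, strlen(action)) == 0;
}

/* Hypothetical helper standing in for a write to the module parameter. */
static void set_action(const char *val)
{
	strncpy(nmi_action, val, sizeof(nmi_action) - 1);
	nmi_action[sizeof(nmi_action) - 1] = '\0';
}
```

The flip side of prefix matching is that any stored string beginning with "ips" selects the summary dump, so action names must not be prefixes of one another.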

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/platform/uv/uv_nmi.c |   48 ++
 1 file changed, 44 insertions(+), 4 deletions(-)

--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -139,6 +139,19 @@ module_param_named(wait_count, uv_nmi_wa
 static int uv_nmi_retry_count = 500;
 module_param_named(retry_count, uv_nmi_retry_count, int, 0644);
 
+/*
+ * Valid NMI Actions:
+ *  "dump" - dump process stack for each cpu
+ *  "ips"  - dump IP info for each cpu
+ */
+static char uv_nmi_action[8] = "dump";
+module_param_string(action, uv_nmi_action, sizeof(uv_nmi_action), 0644);
+
+static inline bool uv_nmi_action_is(const char *action)
+{
+   return (strncmp(uv_nmi_action, action, strlen(action)) == 0);
+}
+
 /* Setup which NMI support is present in system */
 static void uv_nmi_setup_mmrs(void)
 {
@@ -367,13 +380,38 @@ static void uv_nmi_wait(int master)
atomic_read(&uv_nmi_cpus_in_nmi), num_online_cpus());
 }
 
+static void uv_nmi_dump_cpu_ip_hdr(void)
+{
+   printk(KERN_DEFAULT
+   "\nUV: %4s %6s %-32s %s   (Note: PID 0 not listed)\n",
+   "CPU", "PID", "COMMAND", "IP");
+}
+
+static void uv_nmi_dump_cpu_ip(int cpu, struct pt_regs *regs)
+{
+   printk(KERN_DEFAULT "UV: %4d %6d %-32.32s ",
+   cpu, current->pid, current->comm);
+
+   printk_address(regs->ip, 1);
+}
+
 /* Dump this cpu's state */
 static void uv_nmi_dump_state_cpu(int cpu, struct pt_regs *regs)
 {
const char *dots = " . ";
 
-   printk(KERN_DEFAULT "UV:%sNMI process trace for CPU %d\n", dots, cpu);
-   show_regs(regs);
+   if (uv_nmi_action_is("ips")) {
+   if (cpu == 0)
+   uv_nmi_dump_cpu_ip_hdr();
+
+   if (current->pid != 0)
+   uv_nmi_dump_cpu_ip(cpu, regs);
+
+   } else if (uv_nmi_action_is("dump")) {
+   printk(KERN_DEFAULT
+   "UV:%sNMI process trace for CPU %d\n", dots, cpu);
+   show_regs(regs);
+   }
atomic_set(&uv_cpu_nmi.state, UV_NMI_STATE_DUMP_DONE);
 }
 
@@ -420,7 +458,8 @@ static void uv_nmi_dump_state(int cpu, s
int ignored = 0;
int saved_console_loglevel = console_loglevel;
 
-   pr_alert("UV: tracing processes for %d CPUs from CPU %d\n",
+   pr_alert("UV: tracing %s for %d CPUs from CPU %d\n",
+   uv_nmi_action_is("ips") ? "IPs" : "processes",
atomic_read(&uv_nmi_cpus_in_nmi), cpu);
 
console_loglevel = uv_nmi_loglevel;
@@ -482,7 +521,8 @@ int uv_handle_nmi(unsigned int reason, s
uv_nmi_wait(master);
 
/* Dump state of each cpu */
-   uv_nmi_dump_state(cpu, regs, master);
+   if (uv_nmi_action_is("ips") || uv_nmi_action_is("dump"))
+   uv_nmi_dump_state(cpu, regs, master);
 
/* Clear per_cpu "in nmi" flag */
atomic_set(&uv_cpu_nmi.state, UV_NMI_STATE_OUT);


[PATCH 6/7] x86/UV: Add call to KGDB/KDB from NMI handler

2013-09-23 Thread Mike Travis

This patch restores the capability to enter KDB (and KGDB) from the UV NMI
handler.  This is needed because the UV system console is not capable of
sending the 'break' signal to the serial console port.  It is also useful
when the kernel is hung in such a way that it isn't responding to normal
external I/O, so sending 'g' to sysrq-trigger does not work either.

Another benefit of the external NMI command is that all the cpus receive
the NMI signal at roughly the same time so they are more closely aligned
timewise.

It utilizes the newly added kgdb_nmicallin function to gain entry
to KGDB/KDB by the master.  The slaves still enter via the standard
kgdb_nmicallback function.  It also uses the new 'send_ready' pointer
to tell KGDB/KDB to signal the slaves when to proceed into the KGDB
slave loop.
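The master/slave handshake can be modelled in a few lines of userspace C.  This is only a sketch: master_signal() and slave_wait() are hypothetical names, and the numeric values of the SLAVE_* constants are assumed (the patch text shown here does not list them, although KGDB sets the send_ready value to 1).  The slave spins until the master posts either "continue" or "exit".

```c
#include <assert.h>
#include <stdatomic.h>

/* Names mirror the SLAVE_* values used in the patch; numbers are assumed. */
enum { SLAVE_CLEAR = 0, SLAVE_CONTINUE = 1, SLAVE_EXIT = 2 };

static atomic_int slave_continue;	/* starts out as SLAVE_CLEAR */

/* Master side: report whether the debugger entry succeeded. */
static void master_signal(int debugger_ok)
{
	atomic_store(&slave_continue,
		     debugger_ok ? SLAVE_CONTINUE : SLAVE_EXIT);
}

/* Slave side: spin until the master posts a value, then return it. */
static int slave_wait(void)
{
	int sig;

	do {
		sig = atomic_load(&slave_continue);
	} while (sig == SLAVE_CLEAR);
	return sig;
}
```

A slave that reads SLAVE_CONTINUE would go on to enter the debugger; one that reads SLAVE_EXIT skips it and returns, which matches the error path in uv_call_kdb().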

It is enabled when the nmi action is set to "kdb" and the kernel is
built with CONFIG_KGDB_KDB enabled.

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/platform/uv/uv_nmi.c |   47 +-
 1 file changed, 46 insertions(+), 1 deletion(-)

--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -21,7 +21,9 @@
 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -32,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -146,8 +149,9 @@ module_param_named(retry_count, uv_nmi_r
  *  "dump" - dump process stack for each cpu
  *  "ips"  - dump IP info for each cpu
  *  "kdump"- do crash dump
+ *  "kdb"  - enter KDB/KGDB (default)
  */
-static char uv_nmi_action[8] = "dump";
+static char uv_nmi_action[8] = "kdb";
 module_param_string(action, uv_nmi_action, sizeof(uv_nmi_action), 0644);
 
 static inline bool uv_nmi_action_is(const char *action)
@@ -533,6 +537,43 @@ static inline void uv_nmi_kdump(int cpu,
 }
 #endif /* !CONFIG_KEXEC */
 
+#ifdef CONFIG_KGDB_KDB
+/* Call KDB from NMI handler */
+static void uv_call_kdb(int cpu, struct pt_regs *regs, int master)
+{
+   int ret;
+
+   if (master) {
+   /* call KGDB NMI handler as MASTER */
+   ret = kgdb_nmicallin(cpu, X86_TRAP_NMI, regs,
+   &uv_nmi_slave_continue);
+   if (ret) {
+   pr_alert("KDB returned error, is kgdboc set?\n");
+   atomic_set(&uv_nmi_slave_continue, SLAVE_EXIT);
+   }
+   } else {
+   /* wait for KGDB signal that it's ready for slaves to enter */
+   int sig;
+
+   do {
+   cpu_relax();
+   sig = atomic_read(&uv_nmi_slave_continue);
+   } while (!sig);
+
+   /* call KGDB as slave */
+   if (sig == SLAVE_CONTINUE)
+   ret = kgdb_nmicallback(cpu, regs);
+   }
+   uv_nmi_sync_exit(master);
+}
+
+#else /* !CONFIG_KGDB_KDB */
+static inline void uv_call_kdb(int cpu, struct pt_regs *regs, int master)
+{
+   pr_err("UV: NMI error: KGDB/KDB is not enabled in this kernel\n");
+}
+#endif /* !CONFIG_KGDB_KDB */
+
 /*
  * UV NMI handler
  */
@@ -565,6 +606,10 @@ int uv_handle_nmi(unsigned int reason, s
if (uv_nmi_action_is("ips") || uv_nmi_action_is("dump"))
uv_nmi_dump_state(cpu, regs, master);
 
+   /* Call KDB if enabled */
+   else if (uv_nmi_action_is("kdb"))
+   uv_call_kdb(cpu, regs, master);
+
/* Clear per_cpu "in nmi" flag */
atomic_set(&uv_cpu_nmi.state, UV_NMI_STATE_OUT);
 


[PATCH 5/7] KGDB/KDB: add support for external NMI handler to call KGDB/KDB.

2013-09-23 Thread Mike Travis

This patch adds a kgdb_nmicallin() interface that can be used by
external NMI handlers to call the KGDB/KDB handler.  The primary need
for this is for those types of NMI interrupts where all the CPUs
have already received the NMI signal.  Therefore no send_IPI(NMI)
is required, and in fact it will cause a 2nd unhandled NMI to occur.
This generates the "Dazed and Confused" messages.

Since all the CPUs are getting the NMI at roughly the same time, it's not
guaranteed that the first CPU that hits the NMI handler will manage to
enter KGDB and set the dbg_master_lock before the slaves start entering.
The new argument "send_ready" was added for KGDB to signal the NMI handler
to release the slave CPUs for entry into KGDB.

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 include/linux/kdb.h |1 +
 include/linux/kgdb.h|1 +
 kernel/debug/debug_core.c   |   30 +-
 kernel/debug/debug_core.h   |1 +
 kernel/debug/kdb/kdb_debugger.c |5 -
 kernel/debug/kdb/kdb_main.c |3 +++
 6 files changed, 39 insertions(+), 2 deletions(-)

--- linux.orig/include/linux/kdb.h
+++ linux/include/linux/kdb.h
@@ -109,6 +109,7 @@ typedef enum {
KDB_REASON_RECURSE, /* Recursive entry to kdb;
 * regs probably valid */
KDB_REASON_SSTEP,   /* Single Step trap. - regs valid */
+   KDB_REASON_SYSTEM_NMI,  /* In NMI due to SYSTEM cmd; regs valid */
 } kdb_reason_t;
 
 extern int kdb_trap_printk;
--- linux.orig/include/linux/kgdb.h
+++ linux/include/linux/kgdb.h
@@ -310,6 +310,7 @@ extern int
 kgdb_handle_exception(int ex_vector, int signo, int err_code,
  struct pt_regs *regs);
 extern int kgdb_nmicallback(int cpu, void *regs);
+extern int kgdb_nmicallin(int cpu, int trapnr, void *regs, atomic_t *snd_rdy);
 extern void gdbstub_exit(int status);
 
 extern int kgdb_single_step;
--- linux.orig/kernel/debug/debug_core.c
+++ linux/kernel/debug/debug_core.c
@@ -575,8 +575,12 @@ return_normal:
raw_spin_lock(&dbg_slave_lock);
 
 #ifdef CONFIG_SMP
+   /* If SYSTEM_NMI, slaves are already waiting */
+   if (ks->err_code == KDB_REASON_SYSTEM_NMI)
+   atomic_set(ks->send_ready, 1);
+
/* Signal the other CPUs to enter kgdb_wait() */
-   if ((!kgdb_single_step) && kgdb_do_roundup)
+   else if ((!kgdb_single_step) && kgdb_do_roundup)
kgdb_roundup_cpus(flags);
 #endif
 
@@ -729,6 +733,30 @@ int kgdb_nmicallback(int cpu, void *regs
return 0;
}
 #endif
+   return 1;
+}
+
+int kgdb_nmicallin(int cpu, int trapnr, void *regs, atomic_t *send_ready)
+{
+#ifdef CONFIG_SMP
+   if (!kgdb_io_ready(0) || !send_ready)
+   return 1;
+
+   if (kgdb_info[cpu].enter_kgdb == 0) {
+   struct kgdb_state kgdb_var;
+   struct kgdb_state *ks = &kgdb_var;
+
+   memset(ks, 0, sizeof(struct kgdb_state));
+   ks->cpu = cpu;
+   ks->ex_vector   = trapnr;
+   ks->signo   = SIGTRAP;
+   ks->err_code= KDB_REASON_SYSTEM_NMI;
+   ks->linux_regs  = regs;
+   ks->send_ready  = send_ready;
+   kgdb_cpu_enter(ks, regs, DCPU_WANT_MASTER);
+   return 0;
+   }
+#endif
return 1;
 }
 
--- linux.orig/kernel/debug/debug_core.h
+++ linux/kernel/debug/debug_core.h
@@ -26,6 +26,7 @@ struct kgdb_state {
 	unsigned long		threadid;
 	long			kgdb_usethreadid;
 	struct pt_regs		*linux_regs;
+	atomic_t		*send_ready;
 };
 
 /* Exception state values */
--- linux.orig/kernel/debug/kdb/kdb_debugger.c
+++ linux/kernel/debug/kdb/kdb_debugger.c
@@ -69,7 +69,10 @@ int kdb_stub(struct kgdb_state *ks)
if (atomic_read(&kgdb_setting_breakpoint))
reason = KDB_REASON_KEYBOARD;
 
-   if (in_nmi())
+   if (ks->err_code == KDB_REASON_SYSTEM_NMI)
+   reason = KDB_REASON_SYSTEM_NMI;
+
+   else if (in_nmi())
reason = KDB_REASON_NMI;
 
for (i = 0, bp = kdb_breakpoints; i < KDB_MAXBPT; i++, bp++) {
--- linux.orig/kernel/debug/kdb/kdb_main.c
+++ linux/kernel/debug/kdb/kdb_main.c
@@ -1200,6 +1200,9 @@ static int kdb_local(kdb_reason_t reason
   instruction_pointer(regs));
kdb_dumpregs(regs);
break;
+   case KDB_REASON_SYSTEM_NMI:
+   kdb_printf("due to System NonMaskable Interrupt\n");
+   break;
case KDB_REASON_NMI:
kdb_printf("due to NonMaskable Interrupt @ "
   kdb_machreg_fmt "\n",


[PATCH 4/7] x86/UV: Add kdump to UV NMI handler

2013-09-23 Thread Mike Travis

If a system has hung and it no longer responds to external events, this
patch adds the capability of doing a standard kdump and system reboot
when triggered by the system NMI command.

It is enabled when the nmi action is changed to "kdump" and the
kernel is built with CONFIG_KEXEC enabled.

Signed-off-by: Mike Travis 
Reviewed-by: Dimitri Sivanich 
Reviewed-by: Hedi Berriche 
---
 arch/x86/platform/uv/uv_nmi.c |   41 +
 1 file changed, 41 insertions(+)

--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -21,6 +21,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -70,6 +71,7 @@ static atomic_t   uv_in_nmi;
 static atomic_t uv_nmi_cpu = ATOMIC_INIT(-1);
 static atomic_t uv_nmi_cpus_in_nmi = ATOMIC_INIT(-1);
 static atomic_t uv_nmi_slave_continue;
+static atomic_t uv_nmi_kexec_failed;
 static cpumask_var_t uv_nmi_cpu_mask;
 
 /* Values for uv_nmi_slave_continue */
@@ -143,6 +145,7 @@ module_param_named(retry_count, uv_nmi_r
  * Valid NMI Actions:
  *  "dump" - dump process stack for each cpu
  *  "ips"  - dump IP info for each cpu
+ *  "kdump"- do crash dump
  */
 static char uv_nmi_action[8] = "dump";
 module_param_string(action, uv_nmi_action, sizeof(uv_nmi_action), 0644);
@@ -496,6 +499,40 @@ static void uv_nmi_touch_watchdogs(void)
touch_nmi_watchdog();
 }
 
+#if defined(CONFIG_KEXEC)
+static void uv_nmi_kdump(int cpu, int master, struct pt_regs *regs)
+{
+   /* Call crash to dump system state */
+   if (master) {
+   pr_emerg("UV: NMI executing crash_kexec on CPU%d\n", cpu);
+   crash_kexec(regs);
+
+   pr_emerg("UV: crash_kexec unexpectedly returned, ");
+   if (!kexec_crash_image) {
+   pr_cont("crash kernel not loaded\n");
+   atomic_set(&uv_nmi_kexec_failed, 1);
+   uv_nmi_sync_exit(1);
+   return;
+   }
+   pr_cont("kexec busy, stalling cpus while waiting\n");
+   }
+
+   /* If crash exec fails the slaves should return, otherwise stall */
+   while (atomic_read(&uv_nmi_kexec_failed) == 0)
+   mdelay(10);
+
+   /* Crash kernel most likely not loaded, return in an orderly fashion */
+   uv_nmi_sync_exit(0);
+}
+
+#else /* !CONFIG_KEXEC */
+static inline void uv_nmi_kdump(int cpu, int master, struct pt_regs *regs)
+{
+   if (master)
+   pr_err("UV: NMI kdump: KEXEC not supported in this kernel\n");
+}
+#endif /* !CONFIG_KEXEC */
+
 /*
  * UV NMI handler
  */
@@ -517,6 +554,10 @@ int uv_handle_nmi(unsigned int reason, s
/* Indicate we are the first CPU into the NMI handler */
master = (atomic_read(&uv_nmi_cpu) == cpu);
 
+   /* If NMI action is "kdump", then attempt to do it */
+   if (uv_nmi_action_is("kdump"))
+   uv_nmi_kdump(cpu, master, regs);
+
/* Pause as all cpus enter the NMI handler */
uv_nmi_wait(master);
 


Re: [PATCH 0/7] x86/UV/KDB/NMI: Updates for NMI/KDB handler for SGI UV

2013-09-24 Thread Mike Travis



On 9/24/2013 12:52 AM, Ingo Molnar wrote:
> 
> Hm, do you test-build your patches? 

Both build and test incessantly...

> This series produces the following
> annoying warning:
> 
>  arch/x86/platform/uv/uv_nmi.c: In function ‘uv_nmi_setup’:
>  arch/x86/platform/uv/uv_nmi.c:664:2: warning: the address of ‘uv_nmi_cpu_mask’ will always evaluate as ‘true’ [-Waddress]

I didn't hit the above warning since I never tried building without
CONFIG_CPUMASK_OFFSTACK defined.  I wonder if uv_nmi.c should not
be built if not on an enterprise sized system?

> 
> This:
> 
> alloc_cpumask_var(&uv_nmi_cpu_mask, GFP_KERNEL);
> BUG_ON(!uv_nmi_cpu_mask);
> 
> 
> the way to check for allocation failures is by checking the return value 
> of alloc_cpumask_var():
> 
>   BUG_ON(!alloc_cpumask_var(&uv_nmi_cpu_mask, GFP_KERNEL));
> 
> I've fixed this in the patch.

Thanks!!  I should have remembered this since it was my code. (doh!)
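For the record, a minimal userspace model of why the warning only shows up without CONFIG_CPUMASK_OFFSTACK.  It is a sketch: the CPUMASK_OFFSTACK macro, the simplified one-argument alloc_cpumask_var() and nmi_setup() are stand-ins, not the kernel API.  In the on-stack configuration cpumask_var_t is a one-element array, so testing the variable itself can never be false; the return value is the only check that means something in both configurations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct cpumask { unsigned long bits[4]; };

#ifdef CPUMASK_OFFSTACK
/* Off-stack flavour: the variable is a pointer that must be allocated. */
typedef struct cpumask *cpumask_var_t;

static bool alloc_cpumask_var(cpumask_var_t *mask)
{
	*mask = calloc(1, sizeof(**mask));
	return *mask != NULL;
}
#else
/* On-stack flavour: the variable is a one-element array, never NULL. */
typedef struct cpumask cpumask_var_t[1];

static bool alloc_cpumask_var(cpumask_var_t *mask)
{
	(void)mask;
	return true;
}
#endif

static cpumask_var_t nmi_cpu_mask;

/* Checks the return value, which is meaningful in both configurations. */
static bool nmi_setup(void)
{
	return alloc_cpumask_var(&nmi_cpu_mask);
}
```

Building the sketch with -DCPUMASK_OFFSTACK switches to the pointer flavour, where a failed allocation is reported through the same return value.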
> 
> Thanks,
> 
>   Ingo
> 

Re: [RFC 0/2] Delay initializing of large sections of memory

2013-06-21 Thread Mike Travis

On 6/21/2013 10:18 AM, Nathan Zimmer wrote:
> On 06/21/2013 12:03 PM, H. Peter Anvin wrote:
>> On 06/21/2013 09:51 AM, Greg KH wrote:
>>> On Fri, Jun 21, 2013 at 11:25:32AM -0500, Nathan Zimmer wrote:
>>>> This rfc patch set delays initializing large sections of memory until
>>>> we have started cpus.  This has the effect of reducing startup times on
>>>> large memory systems.  On 16TB it can take over an hour to boot and most
>>>> of that time is spent initializing memory.

On 32TB we went from 2:25 to around 20 minutes.

>>>> We avoid that bottleneck by delaying initialization until after we have
>>>> started multiple cpus and can initialize in a multithreaded manner.
>>>> This allows us to actually reduce boot time rather than just moving
>>>> around the point of initialization.
>>>>
>>>> Mike and I have worked on this set for a while, with him doing most of
>>>> the heavy lifting, and are eager for some feedback.
>>> Why make this a config option at all, why not just always do this if the
>>> memory size is larger than some specific number (like 8TB?)
>>>
>>> Otherwise the distros will always enable this option, and having it be a
>>> configuration choice doesn't make any sense.
>>>
>> Since you made it a compile time option, it would be good to know how
>> much code it adds, but otherwise I agree with Greg here... this really
>> shouldn't need to be an option.  It *especially* shouldn't need to be a
>> hand-set runtime option (which looks quite complex, to boot.)
> The patchset as a whole is just over 400 lines so it doesn't add a lot.
> If I were to pull the .config option it would probably remove 30 lines.
> 
> The command line option is too complex but some of the data I
> haven't found a way to get at runtime yet.

Specifically, the physical address space of each node and whether the
block size is 128M or 2G are needed.  The other params are really there as
a fallback, as we have not yet verified on the largest possible machine.
The parameter is intended to be set by a configurator.  There are far
too many kernel params to leave setting them correctly to chance.
On UV we use a utility called 'uvconfig'.

What we could do is default the values unless specifically set?
Perhaps only set the node address space?  Delaying the memory
insertion is mostly a debug aid for debugging the insertion functions.

>>
>> I suspect the cutoff for this should be a lot lower than 8 TB even, more
>> like 128 GB or so.  The only concern is to not set the cutoff so low
>> that we can end up running out of memory or with suboptimal NUMA
>> placement just because of this.

Exactly.

We test regularly on a machine that has ~4TB and the speedup is
negligible.  The problem seems to occur as the count of memory blocks
is increased over some limit.  I think going much lower might start
getting in the way of other things, like constructing transparent
huge pages, etc.

Also notice that Node 0 and the last Node already have all their memory.
There are so many other types in the memmap that it wasn't worth
the hassle.  So unless the system has at least 6 or 8 nodes you're not
gaining much.
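
Since the slowdown appears to track the count of memory blocks, it helps to
see how that count depends on the block size.  A small sketch of the
arithmetic, assuming only the two block sizes mentioned in this thread (128M
and 2G); model_block_count is a hypothetical helper, not a kernel function:

```c
#include <stdint.h>

/* Number of memory blocks needed to cover mem_bytes (hypothetical helper). */
static uint64_t model_block_count(uint64_t mem_bytes, uint64_t block_bytes)
{
	return mem_bytes / block_bytes;
}
```

For a 16TB system this gives 131072 blocks at 128M per block, but only 8192
blocks at 2G per block.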

> Even at lower amounts of ram there is a positive impact.  It knocks time
> off boot even with as little as 1TB of ram.
> 
>> Also, in case it is not bloody obvious: whatever memory the kernel image
>> was loaded into MUST be considered "online", even if it is loaded way
>> high.

Good point, we should add a check since we have that info at boot time.
Other checks might be if this is a kdump kernel, or even perhaps a KVM
kernel boot (though giving it 16TB is pretty wild.)

Thanks!
>>
>> -hpa

Re: [RFC 0/2] Delay initializing of large sections of memory

2013-06-21 Thread Mike Travis



On 6/21/2013 11:36 AM, Yinghai Lu wrote:
> On Fri, Jun 21, 2013 at 9:25 AM, Nathan Zimmer  wrote:
>> This rfc patch set delays initializing large sections of memory until we have
>> started cpus.  This has the effect of reducing startup times on large memory
>> systems.  On 16TB it can take over an hour to boot and most of that time
>> is spent initializing memory.
> 
> One hour on system with 16T ram? BIOS or OS?

The BIOS is about 20 minutes *before* the 1+ hour.  (When we started this
UV project way back when, 8TB took over 4 hours before we threw in the towel.)
> 
> I use wall clock to check bootime on one system with 3T and 16 pcie cards,
> Linux only takes about 3m and 30 seconds from bootloader.

I can send some stats on where the various delays are, but most of the time
was in memory initialization.  On average UV nodes carry 128 or 256G per node,
so 12 nodes would take about 3 or 4 minutes, perhaps more.
> 
> wonder if you boot delay is with so many cpu get onlined in serialized mode.

Nope.  If that were the case, delaying memory but initializing all the
cpus would not affect the time.
> 
> so can you try boot your system with "maxcpus=128" to get the boot time with
> wall clock ?

We could try it if you really think it will provide any useful info.  I'm not
sure exactly how the system will react to memory on nodes with no active cpus.
> 
> Thanks
> 
> Yinghai
> 

Re: [RFC 0/2] Delay initializing of large sections of memory

2013-06-21 Thread Mike Travis



On 6/21/2013 12:00 PM, Yinghai Lu wrote:
> On Fri, Jun 21, 2013 at 11:44 AM, Greg Kroah-Hartman
>  wrote:
>> On Fri, Jun 21, 2013 at 11:36:21AM -0700, Yinghai Lu wrote:
>>> On Fri, Jun 21, 2013 at 9:25 AM, Nathan Zimmer  wrote:
>>>> This rfc patch set delays initializing large sections of memory until we have
>>>> started cpus.  This has the effect of reducing startup times on large memory
>>>> systems.  On 16TB it can take over an hour to boot and most of that time
>>>> is spent initializing memory.
>>>
>>> One hour on system with 16T ram? BIOS or OS?
>>>
>>> I use wall clock to check bootime on one system with 3T and 16 pcie cards,
>>> Linux only takes about 3m and 30 seconds from bootloader.
>>>
>>> wonder if you boot delay is with so many cpu get onlined in serialized mode.
>>>
>>> so can you try boot your system with "maxcpus=128" to get the boot time with
>>> wall clock ?
>>
>> Why use the "wall clock" when we have the wonderful bootchart tools and
>> scripts that do this all for you, and can tell you exactly what part of
>> the kernel is taking what time, to help with fixing issues like this?
> 
> bootchart is not completed.
> 
> printk timestamp come after mem get initialized.
> 
> [0.004000] tsc: Fast TSC calibration using PIT
> 
> before that stamp are all 0.
> 
> Yinghai
> 

On UV the system console function has an option to include
timestamps, both sequential in HH:MM:SS and deltas to 100ms.
So we get both the BIOS and system times before printk time
is active.  We also have a custom "script" command that adds
timing info.

Re: [RFC 0/2] Delay initializing of large sections of memory

2013-06-21 Thread Mike Travis



On 6/21/2013 11:50 AM, Greg KH wrote:
> On Fri, Jun 21, 2013 at 11:44:22AM -0700, Yinghai Lu wrote:
>> On Fri, Jun 21, 2013 at 10:03 AM, H. Peter Anvin  wrote:
>>> On 06/21/2013 09:51 AM, Greg KH wrote:
>>>
>>> I suspect the cutoff for this should be a lot lower than 8 TB even, more
>>> like 128 GB or so.  The only concern is to not set the cutoff so low
>>> that we can end up running out of memory or with suboptimal NUMA
>>> placement just because of this.
>>
>> I would suggest another way:
>> only boot the system with boot node (include cpu, ram and pci root buses).
>> then after boot, could add other nodes.
> 
> What exactly do you mean by "after boot"?  Often, the boot process of
> userspace needs those additional cpus and ram in order to initialize
> everything (like the pci devices) properly.

Exactly.  That's why I left both low and high memory on each node.

> 
> thanks,
> 
> greg k-h
> 

Re: [RFC 0/2] Delay initializing of large sections of memory

2013-06-21 Thread Mike Travis



On 6/21/2013 1:08 PM, H. Peter Anvin wrote:
> Is this init code?  32K of unconditional runtime addition isn't completely 
> trivial.

The delay functions that move memory to the absent list are __init
but the read back of the list and memory insertion are not.  BTW, this
option is only available with the memory hotplug option which depends
on the sparse memory option.  So any usage of these distro configs will
normally be on enterprise class machines.

> 
> Nathan Zimmer  wrote:
> 
>> On 06/21/2013 12:28 PM, H. Peter Anvin wrote:
>>> On 06/21/2013 10:18 AM, Nathan Zimmer wrote:
>>>>> Since you made it a compile time option, it would be good to know how
>>>>> much code it adds, but otherwise I agree with Greg here... this really
>>>>> shouldn't need to be an option.  It *especially* shouldn't need to be a
>>>>> hand-set runtime option (which looks quite complex, to boot.)
>>>> The patchset as a whole is just over 400 lines so it doesn't add a lot.
>>>> If I were to pull the .config option it would probably remove 30 lines.
>>> I'm more concerned about bytes of code.
>> Oh, The difference is just under 32k.
>> 371843425 Jun 21 14:08 vmlinux.o /* DELAY_MEM_INIT is not set  */
>> 371875600 Jun 21 14:36 vmlinux.o /* DELAY_MEM_INIT=y */
>>
>>>
>>>> The command line option is too complex but some of the data I haven't
>>>> found a way to get at runtime yet.
>>> I think that is probably key.
>>>
>>>>> I suspect the cutoff for this should be a lot lower than 8 TB even, more
>>>>> like 128 GB or so.  The only concern is to not set the cutoff so low
>>>>> that we can end up running out of memory or with suboptimal NUMA
>>>>> placement just because of this.
>>>> Even at lower amounts of ram there is a positive impact.  It knocks time
>>>> off boot even with as little as 1TB of ram.
>>> I am not surprised.
>>>
>>> -hpa
>>>
> 

Re: [RFC 2/2] x86_64, mm: Reinsert the absent memory

2013-06-25 Thread Mike Travis

Thanks.  I went through a few iterations before settling
on the current patches.  Originally the memory was inserted
using the memory probe interface, but at 2G per addition,
it was still very slow.  Bypassing the memory driver
interface and going directly to add_memory and memory_online,
along with Nathan's changes to unlock the memory mutex on
a per node basis, sped it up dramatically.


On 6/25/2013 8:07 AM, H. Peter Anvin wrote:
> I have to say I really like this concept. It should have some very nice 
> properties including perhaps making THP work better?
> 
> Ingo Molnar  wrote:
> 
>>
>> * Nathan Zimmer  wrote:
>>
>>> On Sun, Jun 23, 2013 at 11:28:40AM +0200, Ingo Molnar wrote:

>>>> That's 4.5 GB/sec initialization speed - that feels a bit slow and the
>>>> boot time effect should be felt on smaller 'a couple of gigabytes'
>>>> desktop boxes as well. Do we know exactly where the 2 hours of boot
>>>> time on a 32 TB system is spent?
>>>
>>> There are other several spots that could be improved on a large system
>>> but memory initialization is by far the biggest.
>>
>> My feeling is that deferred/on-demand initialization triggered from the
>> buddy allocator is the better long term solution.
>>
>> That will also make it much easier to profile/test memory init
>> performance: boot up a large system and run a simple testprogram that
>> allocates a lot of RAM.
>>
>> ( It will also make people want to optimize the initialization sequence
>>  better, as it will be part of any freshly booted system's memory
>>  allocation overhead. )
>>
>> Thanks,
>>
>>  Ingo
> 

Re: [RFC 2/2] x86_64, mm: Reinsert the absent memory

2013-06-25 Thread Mike Travis



On 6/25/2013 12:38 AM, Ingo Molnar wrote:
> 
> * Nathan Zimmer  wrote:
> 
>> On Sun, Jun 23, 2013 at 11:28:40AM +0200, Ingo Molnar wrote:
>>>
>>> That's 4.5 GB/sec initialization speed - that feels a bit slow and the 
>>> boot time effect should be felt on smaller 'a couple of gigabytes' 
>>> desktop boxes as well. Do we know exactly where the 2 hours of boot 
>>> time on a 32 TB system is spent?
>>
>> There are other several spots that could be improved on a large system 
>> but memory initialization is by far the biggest.
> 
> My feeling is that deferred/on-demand initialization triggered from the 
> buddy allocator is the better long term solution.

I haven't caught up with all of Nathan's changes yet (just
got back from vacation), but there was an option to either
start the memory insertion on boot, or trigger it later
using the /sys/.../memory interface.  There is also a monitor
program that calculates the memory insertion rate.  This was
extremely useful to determine how changes in the kernel
affected the rate.

> 
> That will also make it much easier to profile/test memory init 
> performance: boot up a large system and run a simple testprogram that 
> allocates a lot of RAM.
> 
> ( It will also make people want to optimize the initialization sequence 
>   better, as it will be part of any freshly booted system's memory 
>   allocation overhead. )
> 
> Thanks,
> 
>   Ingo
> 

Re: [RFC 0/2] Delay initializing of large sections of memory

2013-06-25 Thread Mike Travis



On 6/21/2013 5:23 PM, Yinghai Lu wrote:
> On Fri, Jun 21, 2013 at 2:30 PM, Mike Travis  wrote:
>>
>>
>> On 6/21/2013 11:50 AM, Greg KH wrote:
>>> On Fri, Jun 21, 2013 at 11:44:22AM -0700, Yinghai Lu wrote:
>>>> On Fri, Jun 21, 2013 at 10:03 AM, H. Peter Anvin  wrote:
>>>>> On 06/21/2013 09:51 AM, Greg KH wrote:
>>>>>
>>>>> I suspect the cutoff for this should be a lot lower than 8 TB even, more
>>>>> like 128 GB or so.  The only concern is to not set the cutoff so low
>>>>> that we can end up running out of memory or with suboptimal NUMA
>>>>> placement just because of this.
>>>>
>>>> I would suggest another way:
>>>> only boot the system with boot node (include cpu, ram and pci root buses).
>>>> then after boot, could add other nodes.
>>>
>>> What exactly do you mean by "after boot"?  Often, the boot process of
>>> userspace needs those additional cpus and ram in order to initialize
>>> everything (like the pci devices) properly.
>>
>> Exactly.  That's why I left both low and high memory on each node.
> 
> looks like you assume every node have same ram, and before booting you
> you need to know memory layout to append the boot command line.
> 
> We have patchset that moving srat table parse early.
> git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
> for-x86-mm
> https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/log/?h=for-x86-mm
> 
> on top that, we could make your patch pass more simple command like
> 1/2^n of every node, and only need to pass n instead.

The two params that I couldn't figure out how to provide except via kernel
param option were the memory block size (128M or 2G) and the physical
address space per node.  The other 3 params can be automatically
set up by a script when the total system size is known.  As soon as we
verify on the 32TB system and surmise what will be needed for 64TB,
those 3 params can probably disappear.




> 
> Thanks
> 
> Yinghai
> 

Re: [RFC 0/2] Delay initializing of large sections of memory

2013-06-25 Thread Mike Travis

On 6/25/2013 11:17 AM, H. Peter Anvin wrote:
> On 06/25/2013 10:35 AM, Mike Travis wrote:
>>
>> The two params that I couldn't figure out how to provide except via kernel
>> param option was the memory block size (128M or 2G) and the physical
>> address space per node.  The other 3 params can be automatically
>> setup by a script when the total system size is known.  As soon as we
>> verify on the 32TB system and surmise what will be needed for 64TB,
>> then those 3 params can probably disappear.
>>
> 
> "Setup by script" is a no-go.  You *have* the total system size already,
> it is in the e820 tables (anything which isn't in e820 is hotplug, that
> automagically gets deferred.)

Okay, I'll figure something out.  If Yinghai's SRAT patch can help
with the node address space, then I might be able to determine if
the system is a UV which is the only system I see that uses 2G
memory blocks.  (Or make get_memory_block_size() a global.)

Then a simple param to start the insertion early or defer it until
the system is fully up is still useful, and that's easy to understand.

[I think we still want to keep the actual process of moving memory
to the absent list an option, yes?  If for no other reason than
to rule out this code when a problem crops up.  Or at least have a
way to disable the process if it's CONFIG'd in.]

> 
> However, please consider Ingo's counterproposal of doing this via the
> buddy allocator, i.e. hugepages being broken on demand.  That is a
> *very* powerful model, although would require more infrastructure.

We will certainly continue to make improvements as larger system sizes
become more commonplace (and customers continue to complain :).  But
we are cutting it close to including this into the nextgen distro
releases, so it would have to be a follow on project.  (I've been
working on this patch since last November.)

Thanks,
Mike

> 
>   -hpa
> 
> 

Re: [RFC 0/2] Delay initializing of large sections of memory

2013-06-25 Thread Mike Travis



On 6/25/2013 11:38 AM, Yinghai Lu wrote:
> On Tue, Jun 25, 2013 at 10:35 AM, Mike Travis  wrote:
>>
>>
>> On 6/21/2013 5:23 PM, Yinghai Lu wrote:
>>> On Fri, Jun 21, 2013 at 2:30 PM, Mike Travis  wrote:
>>>> Exactly.  That's why I left both low and high memory on each node.
>>>
>>> looks like you assume every node have same ram, and before booting you
>>> you need to know memory layout to append the boot command line.
>>>
>>> We have patchset that moving srat table parse early.
>>> git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
>>> for-x86-mm
>>> https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/log/?h=for-x86-mm
>>>
>>> on top that, we could make your patch pass more simple command like
>>> 1/2^n of every node, and only need to pass n instead.
>>
>> The two params that I couldn't figure out how to provide except via kernel
>> param option was the memory block size (128M or 2G) and the physical
>> address space per node.  The other 3 params can be automatically
>> setup by a script when the total system size is known.  As soon as we
>> verify on the 32TB system and surmise what will be needed for 64TB,
>> then those 3 params can probably disappear.
> 
> our "numa parsing early" patchset could provide "physical address
> space per node",
> also can calculate memory block size via alignment detection from numa info.

Thanks!  I'll try it out.
> 
> with that, user only can pass "delay_init_mem" only.
> 
> Thanks
> 
> Yinghai
> 

Re: [RFC 2/2] x86_64, mm: Reinsert the absent memory

2013-06-25 Thread Mike Travis



On 6/25/2013 11:43 AM, H. Peter Anvin wrote:
> On 06/25/2013 10:22 AM, Mike Travis wrote:
>>
>> On 6/25/2013 12:38 AM, Ingo Molnar wrote:
>>>
>>> * Nathan Zimmer  wrote:
>>>
>>>> On Sun, Jun 23, 2013 at 11:28:40AM +0200, Ingo Molnar wrote:
>>>>>
>>>>> That's 4.5 GB/sec initialization speed - that feels a bit slow and the 
>>>>> boot time effect should be felt on smaller 'a couple of gigabytes' 
>>>>> desktop boxes as well. Do we know exactly where the 2 hours of boot 
>>>>> time on a 32 TB system is spent?
>>>>
>>>> There are other several spots that could be improved on a large system 
>>>> but memory initialization is by far the biggest.
>>>
>>> My feeling is that deferred/on-demand initialization triggered from the 
>>> buddy allocator is the better long term solution.
>>
>> I haven't caught up with all of Nathan's changes yet (just
>> got back from vacation), but there was an option to either
>> start the memory insertion on boot, or trigger it later
>> using the /sys/.../memory interface.  There is also a monitor
>> program that calculates the memory insertion rate.  This was
>> extremely useful to determine how changes in the kernel
>> affected the rate.
>>
> 
> Sorry, I *totally* did not follow that comment.  It seemed like a
> complete non-sequitur?
> 
>   -hpa

It was I who was not following the question.  I'm still reverting
back to "work mode".

[There is more code in a separate patch that Nate has not sent
yet that instructs the kernel to start adding memory as early
as possible, or not.  That way you can start the insertion process
later and monitor its progress to determine how changes in the
kernel affect that process.  It is controlled by a separate
CONFIG option.]


> 
> 

Re: [RFC 0/2] Delay initializing of large sections of memory

2013-06-25 Thread Mike Travis



On 6/25/2013 11:44 AM, H. Peter Anvin wrote:
> On 06/25/2013 11:40 AM, Yinghai Lu wrote:
>> On Tue, Jun 25, 2013 at 11:17 AM, H. Peter Anvin  wrote:
>>> On 06/25/2013 10:35 AM, Mike Travis wrote:
>>
>>> However, please consider Ingo's counterproposal of doing this via the
>>> buddy allocator, i.e. hugepages being broken on demand.  That is a
>>> *very* powerful model, although would require more infrastructure.
>>
>> Can you or Ingo elaborate more about the buddy allocator proposal?
>>
> 
> Start by initializing 1G hyperpages only, but mark them so that the
> allocator knows that if it needs to break them apart it has to
> initialize the page structures for the 2M subpages.
> 
> Same thing with 2M -> 4K.
> 
>   -hpa
> 
> 

It is worth experimenting with, but the big question would be
whether it still avoids the very expensive "memmap_init_zone" and
its sub-functions using huge expanses of memory.  I'll do some
experimenting as soon as I can.  Our 32TB system is being
brought back to 16TB (we found a number of problems as we
get closer and closer to the 64TB limit), but that's still
a significant size.
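
For reference while experimenting, here is a toy userspace model of the
scheme hpa describes: only the 1G descriptors are touched at boot, and the
next level down is initialized the first time a block is split.  This is a
sketch under stated assumptions, not the buddy allocator; every model_ name
is hypothetical, and the fan-out of 512 comes from 1G = 512 x 2M and
2M = 512 x 4K.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_FANOUT 512	/* 1G = 512 x 2M, 2M = 512 x 4K */

struct model_block {
	int level;		/* 2: 1G block, 1: 2M block, 0: 4K page */
	bool children_ready;	/* sub-block descriptors already initialized? */
};

static unsigned long model_init_count;	/* descriptors initialized so far */

static void model_init_block(struct model_block *b, int level)
{
	b->level = level;
	b->children_ready = false;
	model_init_count++;
}

/* Boot time: touch only the top-level (1G) descriptors. */
static void model_boot_init(struct model_block *tops, size_t n)
{
	for (size_t i = 0; i < n; i++)
		model_init_block(&tops[i], 2);
}

/* Allocation time: initialize the next level down, once, when a block is split. */
static bool model_split(struct model_block *parent, struct model_block *children)
{
	if (parent->level == 0)
		return false;	/* a 4K page cannot be split further */
	if (!parent->children_ready) {
		for (size_t i = 0; i < MODEL_FANOUT; i++)
			model_init_block(&children[i], parent->level - 1);
		parent->children_ready = true;
	}
	return true;
}

/* Usage: four 1G blocks at boot, then the first one is split twice. */
static unsigned long model_demo(void)
{
	static struct model_block tops[4];
	static struct model_block subs[MODEL_FANOUT];

	model_init_count = 0;
	model_boot_init(tops, 4);
	model_split(&tops[0], subs);
	model_split(&tops[0], subs);	/* second split does no extra work */
	return model_init_count;
}
```

In the usage function the boot step initializes 4 descriptors and the first
split adds 512 more, so the deferred cost is paid per split rather than for
all of memory up front.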

Re: [RFC 2/2] x86_64, mm: Reinsert the absent memory

2013-06-26 Thread Mike Travis

On 6/26/2013 5:14 AM, Ingo Molnar wrote:
> 
> * Nathan Zimmer  wrote:
> 
>> perf seems to struggle with 512 cpus, but I did get some data.
> 
> Btw., mind outlining in what way it struggles?
> 
> Thanks,
> 
>   Ingo
> 

I submitted some patches to Jason Wessel quite a while ago that updated
the community support for KDB & NMI and addressed the issues
with perf and friends on UV.  But I have not heard back from him in a
couple of months.  Is there a new maintainer for NMI/PERF/KDB etc.?

The primary problem is that the current UV NMI handler is in the
primary NMI notifier chain, causing excessive reads of a register in
the UV hub.  When perf is running these approach millions per
second, and the MMIO read is not only expensive but also disrupts
the primary HUB activity of directing NUMA traffic.  Thus the
system slows down considerably, and perf can even lose events.

We've even updated the UV BIOS to make this a faster path but it
needs the newer NMI handler to use it.  Perhaps I should resubmit
them directly to you?

Thanks,
Mike
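
To make the cost argument concrete, here is a simplified userspace model of
the dispatch.  It assumes, as described earlier in this series, that every
handler on the primary (NMI_LOCAL) chain runs for every NMI, while a handler
on the NMI_UNKNOWN chain runs only when nothing else claimed the event.  All
model_ names are hypothetical and the hub register read is reduced to a
counter.

```c
#include <stdbool.h>

enum model_chain { MODEL_NMI_LOCAL, MODEL_NMI_UNKNOWN };

static unsigned long model_mmr_reads;	/* stand-in for UV hub register reads */

/* perf claims the NMI if its counters raised it. */
static bool model_perf_handler(bool is_perf_nmi)
{
	return is_perf_nmi;
}

/* The UV handler must read the hub register to learn whether the NMI is its own. */
static bool model_uv_handler(bool is_uv_nmi)
{
	model_mmr_reads++;
	return is_uv_nmi;
}

static void model_dispatch(bool is_perf_nmi, bool is_uv_nmi,
			   enum model_chain uv_chain)
{
	bool handled = model_perf_handler(is_perf_nmi);

	if (uv_chain == MODEL_NMI_LOCAL)
		handled |= model_uv_handler(is_uv_nmi);	/* runs on every NMI */
	else if (!handled)
		handled = model_uv_handler(is_uv_nmi);	/* runs only for unclaimed NMIs */
	(void)handled;
}

/* Hub register reads caused by n perf-generated NMIs. */
static unsigned long model_reads_for_perf_burst(unsigned long n,
						enum model_chain uv_chain)
{
	model_mmr_reads = 0;
	for (unsigned long i = 0; i < n; i++)
		model_dispatch(true, false, uv_chain);
	return model_mmr_reads;
}
```

With the UV handler on the primary chain, every perf NMI costs one hub
register read; on the unknown chain a burst of perf NMIs costs none.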

Re: [RFC] Transparent on-demand memory setup initialization embedded in the (GFP) buddy allocator

2013-06-26 Thread Mike Travis



On 6/26/2013 6:37 AM, Ingo Molnar wrote:
> 
> * Andrew Morton  wrote:
> 
>> On Wed, 26 Jun 2013 11:22:48 +0200 Ingo Molnar  wrote:
>>
>>> except that on 32 TB 
>>> systems we don't spend ~2 hours initializing 8,589,934,592 page heads.
>>
>> That's about a million a second which is crazy slow - even my 
>> prehistoric desktop is 100x faster than that.
>>
>> Where's all this time actually being spent?
> 
> See the earlier part of the thread - apparently it's spent initializing 
> the page heads - remote NUMA node misses from a single boot CPU, going 
> across a zillion cross-connects? I guess there's some other low hanging 
> fruits as well - so making this easier to profile would be nice. The 
> profile posted was not really usable.

This is one advantage of delayed memory init.  I can do it under
the profiler.  I will put everything together to accomplish this
and then send a perf report.
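
The figures quoted above can be checked with a few lines of integer
arithmetic, assuming 4K pages and the roughly 2 hour (7200 second) init time
mentioned in the thread.  The helper names are hypothetical.

```c
#include <stdint.h>

/* Number of pages (and therefore page structs) covering mem_bytes. */
static uint64_t model_page_count(uint64_t mem_bytes, uint64_t page_bytes)
{
	return mem_bytes / page_bytes;
}

/* Whole units processed per second, rounded down. */
static uint64_t model_rate_per_sec(uint64_t units, uint64_t seconds)
{
	return units / seconds;
}
```

32TB at 4K per page is 8,589,934,592 pages.  Over 7200 seconds that is about
1.19 million page structs per second, or about 4660 MiB of memory per second,
which matches the "about a million a second" and "4.5 GB/sec" estimates.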

> 
> Btw., NUMA locality would be another advantage of on-demand 
> initialization: actual users of RAM tend to allocate node-local 
> (especially on large clusters), so any overhead will be naturally lower.
> 
> Thanks,
> 
>   Ingo
> 

Re: [RFC 0/2] Delay initializing of large sections of memory

2013-06-27 Thread Mike Travis



On 6/26/2013 11:37 PM, Yinghai Lu wrote:
> On Tue, Jun 25, 2013 at 11:58 AM, Mike Travis  wrote:
>> experimenting as soon as I can.  Our 32TB system is being
>> brought back to 16TB (we found a number of problems as we
>> get closer and closer to the 64TB limit), but that's still
>> a significant size.
> 
> Hi, Mike,
> 
> Can you post e820 memory map on system that have 32TiB or more?
> 
> Is there one range size more than 16TiB? like [16TiB, 32TiB)...
> 
> Thanks
> 
> Yinghai
> 

Here is (was) the 32T system with 2048 cores with HT disabled:

BIOS-e820:  - 0007f000 (usable)
BIOS-e820: 0007f000 - 0008 (reserved)
BIOS-e820: 0008 - 000a (usable)
BIOS-e820: 0010 - 7ad54000 (usable)
BIOS-e820: 7ad54000 - 7ad55000 (reserved)
BIOS-e820: 7ad55000 - 7ad68000 (usable)
BIOS-e820: 7ad68000 - 7afa8000 (reserved)
BIOS-e820: 7afa8000 - 7bcb7000 (usable)
BIOS-e820: 7bcb7000 - 7bdb7000 (reserved)
BIOS-e820: 7bdb7000 - 7beb7000 (unusable)
BIOS-e820: 7beb7000 - 7bfb7000 (reserved)
BIOS-e820: 7bfb7000 - 7d1b7000 (ACPI NVS)
BIOS-e820: 7d1b7000 - 7e00 (ACPI data)
BIOS-e820: 7e00 - 7e318000 (usable)
BIOS-e820: 7e318000 - 7ef49000 (ACPI data)
BIOS-e820: 7ef49000 - 7f00 (usable)
BIOS-e820: 0001 - 001e (usable)
BIOS-e820: 0020 - 003dff00 (usable)
BIOS-e820: 0040 - 005dff00 (usable)
BIOS-e820: 0060 - 007dff00 (usable)


BIOS-e820: 0e00 - 0e1dff00 (usable)
BIOS-e820: 0e20 - 0e3dff00 (usable)
BIOS-e820: 0e40 - 0e5dff00 (usable)
BIOS-e820: 0e60 - 0e7dff00 (usable)

Note that on UV, there is some memory reserved at the end
of each node that is used for the directory RAM by the
UV HUB.  That is why the ranges are not contiguous.  I'd
have to double check with the BIOS guys, but I don't believe
they would have collapsed ranges across nodes even if the
directory RAM did not exist.

-Mike

[PATCH 02/15] KDB: fix errant character in KDB show regs

2013-03-25 Thread Mike Travis

When KDB prints the process regs and backtrace, every line is preceded
with the character 'd'.  This is the level argument to printk which
is not interpreted when KDB is printing.  Skip over this possible
printk level in the outgoing string to fix this.

Here is a small sample:

dRIP: 0010:[]  [] poll_idle+0x4a/0x90
dRSP: 0018:88081d5eddd8  EFLAGS: 0246
dRAX: 0004 RBX: 0216ae7fbf5d RCX: 021658a8e600
dRDX: 88081d5ec010 RSI: 819a7d20 RDI: 8193c140

Cc: Tim Bird 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 kernel/debug/kdb/kdb_io.c |   16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

--- linux.orig/kernel/debug/kdb/kdb_io.c
+++ linux/kernel/debug/kdb/kdb_io.c
@@ -559,6 +559,7 @@ int vkdb_printf(const char *fmt, va_list
int retlen = 0;
int fnd, len;
char *cp, *cp2, *cphold = NULL, replaced_byte = ' ';
+   const char *ostring;
char *moreprompt = "more> ";
struct console *c = console_drivers;
static DEFINE_SPINLOCK(kdb_printf_lock);
@@ -690,20 +691,21 @@ kdb_printit:
/*
 * Write to all consoles.
 */
-   retlen = strlen(kdb_buffer);
+   ostring = printk_skip_level(kdb_buffer);
+   retlen = strlen(ostring);
if (!dbg_kdb_mode && kgdb_connected) {
-   gdbstub_msg_write(kdb_buffer, retlen);
+   gdbstub_msg_write(ostring, retlen);
} else {
if (dbg_io_ops && !dbg_io_ops->is_console) {
len = retlen;
-   cp = kdb_buffer;
+   cp = (char *)ostring;
while (len--) {
dbg_io_ops->write_char(*cp);
cp++;
}
}
while (c) {
-   c->write(c, kdb_buffer, retlen);
+   c->write(c, ostring, retlen);
touch_nmi_watchdog();
c = c->next;
}
@@ -711,7 +713,7 @@ kdb_printit:
if (logging) {
saved_loglevel = console_loglevel;
console_loglevel = 0;
-   printk(KERN_INFO "%s", kdb_buffer);
+   pr_info("%s", ostring);
}
 
if (KDB_STATE(PAGER)) {
@@ -723,10 +725,10 @@ kdb_printit:
int got = 0;
len = retlen;
while (len--) {
-   if (kdb_buffer[len] == '\n') {
+   if (ostring[len] == '\n') {
kdb_nextline++;
got = 0;
-   } else if (kdb_buffer[len] == '\r') {
+   } else if (ostring[len] == '\r') {
got = 0;
} else {
got++;
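
The fix relies on printk_skip_level().  Below is a minimal userspace model of
that helper, assuming the kernel's encoding in which a log level is a
two-byte prefix: ASCII SOH followed by '0' to '7', or by 'd' for the default
level.  model_skip_level is a hypothetical stand-in, not the kernel function
itself.

```c
#include <string.h>

#define MODEL_SOH '\001'	/* assumed start-of-header byte that marks a level prefix */

/* Return a pointer past the level prefix, or the string itself if none is present. */
static const char *model_skip_level(const char *s)
{
	if (s[0] == MODEL_SOH && s[1]) {
		if ((s[1] >= '0' && s[1] <= '7') || s[1] == 'd')
			return s + 2;
	}
	return s;
}
```

Because SOH is not printable, writing the unskipped buffer to a console shows
only the level character, which is where the leading 'd' in the sample comes
from.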

-- 

[PATCH 03/15] KDB: up the default LINES value

2013-03-25 Thread Mike Travis

Currently the default for the # of lines displayed by the KDB pager
is 24.  This does not allow all of the lines for the entry messages,
reg dump and process trace.  Increase it to something more reasonable.

Cc: Tim Bird 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 kernel/debug/kdb/kdb_io.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux.orig/kernel/debug/kdb/kdb_io.c
+++ linux/kernel/debug/kdb/kdb_io.c
@@ -586,7 +586,7 @@ int vkdb_printf(const char *fmt, va_list
 
diag = kdbgetintenv("LINES", &linecount);
if (diag || linecount <= 1)
-   linecount = 24;
+   linecount = 60;
 
diag = kdbgetintenv("COLUMNS", &colcount);
if (diag || colcount <= 1)
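
The fallback pattern in this hunk can be restated as a small stand-alone
function.  This is only an illustrative sketch: model_env_or_default is a
hypothetical name, and lookup_failed plays the role of the nonzero diag
returned when the environment variable is missing or invalid.

```c
/* Use the configured value unless the lookup failed or the value is unusable. */
static int model_env_or_default(int lookup_failed, int value, int fallback)
{
	if (lookup_failed || value <= 1)
		return fallback;
	return value;
}
```

With the patch applied, an unset or nonsensical LINES falls back to 60
instead of 24, while an explicit setting is still honored.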

-- 

[PATCH 00/15] x86/UV/KDB/NMI: Updates for NMI/KDB handler for SGI UV

2013-03-25 Thread Mike Travis


These are kernel updates for the NMI/KDB handler on the SGI
Ultraviolet System.

* Fix problem where 'quit' to more> prompt doesn't stop output.
* Fix problem where reg dump shows letter 'd' in first column of
  every line.
* Up the number of LINES so the entire entry message is displayed.
* Moves KDB header defines to externally visible header so external
  KDB modules can be built.
* Exports some significant KDB functions for use by external modules.
* Consolidates the '| grep' support into new kdb_grep.c file.
* Updates kdb grep support to add some new options.
* Reestablishes support for kdump from the kdb prompt.
* Adds back in the 'pshelp' command.
* Adds new entry into KGDB for systems with global NMI support.
* Adds new reason code to enter KDB that's not considered an error.
* Adds support for external 'uvtrace' module.
* Updates NMI support for new internal system (SMM) NMI handler.
* Adds back the capability of NMI entering KDB/KGDB.


[PATCH 15/15] x86/UV: Add call to KGDB/KDB from NMI handler

2013-03-25 Thread Mike Travis

This patch restores the ability to enter KDB (and KGDB) from the UV
NMI handler.  It utilizes the newly added kgdb_nmicallin function
to gain entry to KGDB/KDB by the master.  The slaves still enter via
the standard kgdb_nmicallback function.

The handler also uses the new 'send_ready' pointer to tell KGDB/KDB
to signal the slaves when to proceed into the KGDB slave loop.
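
As a rough user-space model of the handshake described above (not the kernel code itself), the slave side can be pictured as a spin on a shared signal word.  The value 2 meaning "do a process dump" is taken from the patch below; the value 1 for "enter the debugger", the enum names and master_signal() are assumptions made for this sketch.  In the real patch the ready signal is raised by KGDB/KDB through the 'send_ready' pointer rather than by a helper like this.

```c
#include <assert.h>
#include <stdatomic.h>

/* Signal values; 2 matches the patch, 1 is an assumption of this sketch */
enum slave_signal { SLAVE_WAIT = 0, SLAVE_ENTER_KGDB = 1, SLAVE_DUMP = 2 };
enum slave_action { ACT_ENTER_KGDB, ACT_DUMP_STATE };

/* Plays the role of uv_nmi_slave_continue */
static atomic_int slave_continue;

/* Master side: release the slaves one way or the other (hypothetical) */
static void master_signal(int debugger_handled)
{
	atomic_store(&slave_continue,
		     debugger_handled ? SLAVE_ENTER_KGDB : SLAVE_DUMP);
}

/* Slave side: spin until the master decides, then report what to do */
static enum slave_action slave_wait(void)
{
	int sig;

	do {
		sig = atomic_load(&slave_continue);
	} while (sig == SLAVE_WAIT);

	assert(sig == SLAVE_ENTER_KGDB || sig == SLAVE_DUMP);
	return sig == SLAVE_DUMP ? ACT_DUMP_STATE : ACT_ENTER_KGDB;
}
```

In a single-threaded trial master_signal() has to be called before slave_wait(), otherwise the spin never ends; on real hardware the two sides run on different cpus.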

Cc: Alexander Gordeev 
Cc: Suresh Siddha 
Cc: "Michael S. Tsirkin" 
Cc: Steffen Persvold 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 arch/x86/platform/uv/uv_nmi.c |   73 --
 1 file changed, 71 insertions(+), 2 deletions(-)

--- linux.orig/arch/x86/platform/uv/uv_nmi.c
+++ linux/arch/x86/platform/uv/uv_nmi.c
@@ -21,6 +21,8 @@
 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -34,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -521,6 +524,68 @@ static void uv_nmi_touch_watchdogs(void)
touch_nmi_watchdog();
 }
 
+#ifdef CONFIG_KGDB_KDB
+
+/* Disable to force process dump instead of entering KDB or KGDB */
+static int uv_nmi_kdb_on = 1;
+module_param_named(kdb_on, uv_nmi_kdb_on, int, 0644);
+
+/* Call KDB from NMI handler */
+static void uv_call_kdb(int cpu, struct pt_regs *regs,
+   int master, unsigned long *flags)
+{
+   int ret;
+
+   if (master) {
+   /* call KGDB NMI handler as MASTER */
+   local_irq_restore(*flags);
+   ret = kgdb_nmicallin(cpu, X86_TRAP_NMI, regs,
+   &uv_nmi_slave_continue);
+   local_irq_save(*flags);
+
+   /*
+* if KGDB/KDB did not handle the NMI, then signal slaves
+*   to do process dump instead.
+*/
+   if (ret) {
+   uv_nmi_dump_state(cpu, regs, 1);
+   return;
+   }
+   } else {
+   int sig;
+
+   /* wait for KGDB to say it's ready for slaves to enter */
+   do {
+   cpu_relax();
+   sig = atomic_read(&uv_nmi_slave_continue);
+   } while (!sig);
+
+   /*
+* if KGDB/KDB did not handle the NMI for the master, then
+*   the master signals the slaves to do process dump instead.
+*/
+   if (sig == 2) {
+   uv_nmi_dump_state(cpu, regs, 0);
+   return;
+   }
+
+   /* call KGDB as slave */
+   local_irq_restore(*flags);
+   ret = kgdb_nmicallback(cpu, regs);
+   local_irq_save(*flags);
+   }
+   uv_nmi_sync_exit(master);
+}
+
+#else /* !CONFIG_KGDB_KDB */
+static inline void uv_call_kdb(int cpu, struct pt_regs *regs,
+   int master, unsigned long *flags)
+{
+   pr_err("UV: NMI error: KDB is not enabled in this kernel\n");
+   uv_nmi_dump_state(cpu, regs, master);
+}
+#endif /* !CONFIG_KGDB_KDB */
+
 /*
  * UV NMI handler
  */
@@ -547,8 +612,12 @@ int uv_handle_nmi(unsigned int reason, s
if (master && uv_nmi_kdump_requested)
uv_nmi_kdump(regs);
 
-   /* Dump state of each cpu */
-   uv_nmi_dump_state(cpu, regs, master);
+   /* Call KDB if enabled */
+   if (uv_nmi_kdb_on)
+   uv_call_kdb(cpu, regs, master, &flags);
+
+   else/* Otherwise dump state of each cpu */
+   uv_nmi_dump_state(cpu, regs, master);
 
/* Clear per_cpu "in nmi" flag */
atomic_set(&uv_cpu_nmi.state, UV_NMI_STATE_OUT);


[PATCH 07/15] KDB: clean up KDB grep code, add some options

2013-03-25 Thread Mike Travis

This patch cleans up the grep 'pipe' code in KDB and adds some new options:

* allows multiple '| grep' options to be used.
* adds '-v' flag to invert the search.
* adds '-o' flag for optional ('OR') patterns.
* adds '-u' flag to delay printing until match found.

Options may be mixed in any combination.
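
To picture how several '| grep' stages and the '-v' flag might combine, here is a small user-space sketch.  It assumes that chained stages behave like a shell pipeline (a line must pass every stage) and uses plain substring matching; struct grep_stage and line_passes() are names invented for this example and are not part of the patch.  The '-o' and '-u' flags are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One '| grep' stage: a substring pattern plus the '-v' (invert) flag */
struct grep_stage {
	const char *pattern;
	int invert;
};

/*
 * Return 1 if the line survives every stage, 0 otherwise.
 * A normal stage needs the pattern present; an inverted one needs it absent.
 */
static int line_passes(const struct grep_stage *stages, size_t n,
		       const char *line)
{
	size_t i;

	assert(line != NULL);
	for (i = 0; i < n; i++) {
		int found = strstr(line, stages[i].pattern) != NULL;

		if (found == !!stages[i].invert)
			return 0;	/* rejected by this stage */
	}
	return 1;
}
```

For example, stages {"cpu", 0} and {"idle", 1} would model something like "| grep cpu | grep -v idle": lines mentioning cpu are kept unless they also contain idle.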

Cc: Tim Bird 
Cc: Anton Vorontsov 
Cc: Sasha Levin 
Cc: Rusty Russell 
Cc: Greg Kroah-Hartman 
Cc: "Vincent Stehlé" 
Cc: Andrei Warkentin 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 kernel/debug/kdb/kdb_grep.c|  352 +
 kernel/debug/kdb/kdb_io.c  |   39 ++--
 kernel/debug/kdb/kdb_main.c|   10 -
 kernel/debug/kdb/kdb_private.h |   55 +-
 4 files changed, 361 insertions(+), 95 deletions(-)

--- linux.orig/kernel/debug/kdb/kdb_grep.c
+++ linux/kernel/debug/kdb/kdb_grep.c
@@ -11,80 +11,224 @@
 
 #include 
 #include 
+#include 
 #include 
 #include "kdb_private.h"
 
-#define GREP_LEN 256
-char kdb_grep_string[GREP_LEN];
-int kdb_grepping_flag;
+#define KDB_GREP_PATT_LEN 511
+#define KDB_GREP_MAX 8
+
+static char kdb_grep_patterns[KDB_GREP_PATT_LEN+1];
+static int kdb_grep_pattern_idx;
+
+/* Note: kdb_grep_stack[0] intentionally left zero */
+struct kdb_grep_stack_s kdb_grep_stack[KDB_GREP_MAX+1];
+int kdb_grepping_flag; /* now kdb_grep_stack index */
 EXPORT_SYMBOL(kdb_grepping_flag);
-int kdb_grep_leading;
-int kdb_grep_trailing;
 
-/*
- * The "str" argument may point to something like  | grep xyz
- */
-void kdb_grep_parse(const char *str)
+static void kdb_grep_stack_clear(void)
 {
-   int len;
-   char*cp = (char *)str, *cp2;
+   kdb_grep_stack[kdb_grepping_flag].flags = 0;
+   kdb_grep_stack[kdb_grepping_flag].pattern_idx = 0;
+}
+
+static int kdb_grep_push(void)
+{
+   if (kdb_grepping_flag < KDB_GREP_MAX) {
+   ++kdb_grepping_flag;
+   kdb_grep_stack_clear();
+   kdb_grep_set(enabled);
+   return 1;
+   }
+   return 0;
+}
+
+static void kdb_grep_pop(void)
+{
+   if (kdb_grepping_flag > 0) {
+   kdb_grep_pattern_idx =
+   kdb_grep_stack[kdb_grepping_flag].pattern_idx;
+
+   kdb_grep_patterns[kdb_grep_pattern_idx] = '\0';
+
+   if (kdb_grep(suspended))
+   kdb_grep_set_lvl(suspended, kdb_grepping_flag - 1);
+
+   kdb_grep_stack_clear();
+   --kdb_grepping_flag;
+
+   if (!kdb_grep(enabled))
+   kdb_grep_pop();
+   }
+}
+
+void kdb_grep_clear_all(void)
+{
+   kdb_grepping_flag = 0;
+   kdb_grep_pattern_idx = 0;
+   kdb_grep_patterns[0] = 0;
+   memset(kdb_grep_stack, 0, sizeof(kdb_grep_stack));
+}
+
+static int kdb_grep_error(const char *str)
+{
+   kdb_grep_clear_all();
+   kdb_printf("grep error: %s, see grephelp\n", str);
+   return -1;
+}
 
-   /* sanity check: we should have been called with the \ first */
+static const char *kdb_grep_pattern(int lvl)
+{
+   return &kdb_grep_patterns[kdb_grep_stack[lvl].pattern_idx];
+}
+
+static int kdb_grep_add_pattern(char *str)
+{
+   int len = strlen(str);
+
+   if (!len) {
+   kdb_grep_error("empty search pattern");
+   return 0;
+   }
+
+   if ((kdb_grep_pattern_idx + len) >= KDB_GREP_PATT_LEN) {
+   kdb_grep_error("search string(s) too long");
+   return 0;
+   }
+
+   /* copy string into pattern(s) buffer */
+   kdb_grep_stack[kdb_grepping_flag].pattern_idx = kdb_grep_pattern_idx;
+   strcpy((char *)kdb_grep_pattern(kdb_grepping_flag), str);
+   kdb_grep_pattern_idx += len + 1;
+   kdb_grep_patterns[kdb_grep_pattern_idx] = '\0';
+   return 1;
+}
+
+static char *is_grep(const char *cp)
+{
+   /* sanity check: we should have been called with the | first */
if (*cp != '|')
-   return;
+   return 0;
cp++;
while (isspace(*cp))
cp++;
+
if (strncmp(cp, "grep ", 5)) {
-   kdb_printf("invalid 'pipe', see grephelp\n");
-   return;
+   kdb_grep_error("invalid 'pipe'");
+   return NULL;
}
cp += 5;
+   return (char *)cp;
+}
+
+/*
+ * The "str" argument may point to something like  | grep xyz
+ */
+int kdb_grep_parse(char *str)
+{
+   int len;
+   char*cp, *cp2;
+   char*newgrep;
+
+   cp = is_grep(str);
+   if (!cp)
+   return -1;
+repeat:
+   if (!kdb_grep_push())
+   return kdb_grep_error("too many grep's");
+
+   newgrep = NULL;
+
while (isspace(*cp))
cp++

[PATCH 06/15] KDB: consolidate KDB grep code

2013-03-25 Thread Mike Travis

This patch consolidates various parts of the grep code in KDB
into a new file, kdb_grep.c, in preparation for various cleanups
and additions.

Cc: Tim Bird 
Cc: Anton Vorontsov 
Cc: Sasha Levin 
Cc: Rusty Russell 
Cc: Greg Kroah-Hartman 
Cc: "Vincent Stehlé" 
Cc: Andrei Warkentin 
Reviewed-by: Dimitri Sivanich 
Signed-off-by: Mike Travis 
---
 kernel/debug/kdb/Makefile  |2 
 kernel/debug/kdb/kdb_grep.c|  145 +
 kernel/debug/kdb/kdb_io.c  |   38 --
 kernel/debug/kdb/kdb_main.c|   92 --
 kernel/debug/kdb/kdb_private.h |4 +
 5 files changed, 152 insertions(+), 129 deletions(-)

--- linux.orig/kernel/debug/kdb/Makefile
+++ linux/kernel/debug/kdb/Makefile
@@ -7,7 +7,7 @@
 #
 
 CCVERSION  := $(shell $(CC) -v 2>&1 | sed -ne '$$p')
-obj-y := kdb_io.o kdb_main.o kdb_support.o kdb_bt.o gen-kdb_cmds.o kdb_bp.o kdb_debugger.o
+obj-y := kdb_io.o kdb_main.o kdb_support.o kdb_bt.o gen-kdb_cmds.o kdb_bp.o kdb_grep.o kdb_debugger.o
 obj-$(CONFIG_KDB_KEYBOARD)+= kdb_keyboard.o
 
 clean-files := gen-kdb_cmds.c
--- /dev/null
+++ linux/kernel/debug/kdb/kdb_grep.c
@@ -0,0 +1,145 @@
+/*
+ * Kernel Debugger Architecture Grep Support
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 1999-2004,2013 Silicon Graphics, Inc.  All Rights Reserved.
+ * Copyright (c) 2009 Wind River Systems, Inc.  All Rights Reserved.
+ */
+
+#include 
+#include 
+#include 
+#include "kdb_private.h"
+
+#define GREP_LEN 256
+char kdb_grep_string[GREP_LEN];
+int kdb_grepping_flag;
+EXPORT_SYMBOL(kdb_grepping_flag);
+int kdb_grep_leading;
+int kdb_grep_trailing;
+
+/*
+ * The "str" argument may point to something like  | grep xyz
+ */
+void kdb_grep_parse(const char *str)
+{
+   int len;
+   char*cp = (char *)str, *cp2;
+
+   /* sanity check: we should have been called with the \ first */
+   if (*cp != '|')
+   return;
+   cp++;
+   while (isspace(*cp))
+   cp++;
+   if (strncmp(cp, "grep ", 5)) {
+   kdb_printf("invalid 'pipe', see grephelp\n");
+   return;
+   }
+   cp += 5;
+   while (isspace(*cp))
+   cp++;
+   cp2 = strchr(cp, '\n');
+   if (cp2)
+   *cp2 = '\0'; /* remove the trailing newline */
+   len = strlen(cp);
+   if (len == 0) {
+   kdb_printf("invalid 'pipe', see grephelp\n");
+   return;
+   }
+   /* now cp points to a nonzero length search string */
+   if (*cp == '"') {
+   /* allow it be "x y z" by removing the "'s - there must
+  be two of them */
+   cp++;
+   cp2 = strchr(cp, '"');
+   if (!cp2) {
+   kdb_printf("invalid quoted string, see grephelp\n");
+   return;
+   }
+   *cp2 = '\0'; /* end the string where the 2nd " was */
+   }
+   kdb_grep_leading = 0;
+   if (*cp == '^') {
+   kdb_grep_leading = 1;
+   cp++;
+   }
+   len = strlen(cp);
+   kdb_grep_trailing = 0;
+   if (*(cp+len-1) == '$') {
+   kdb_grep_trailing = 1;
+   *(cp+len-1) = '\0';
+   }
+   len = strlen(cp);
+   if (!len)
+   return;
+   if (len >= GREP_LEN) {
+   kdb_printf("search string too long\n");
+   return;
+   }
+   strcpy(kdb_grep_string, cp);
+   kdb_grepping_flag++;
+   return;
+}
+
+
+/*
+ * search arg1 to see if it contains arg2
+ * (kdmain.c provides flags for ^pat and pat$)
+ *
+ * return 1 for found, 0 for not found
+ */
+int kdb_grep_search(char *searched)
+{
+   char firstchar, *cp;
+   char *searchfor = kdb_grep_string;
+   int len1, len2;
+
+   /* not counting the newline at the end of "searched" */
+   len1 = strlen(searched)-1;
+   len2 = strlen(searchfor);
+   if (len1 < len2)
+   return 0;
+   if (kdb_grep_leading && kdb_grep_trailing && len1 != len2)
+   return 0;
+   if (kdb_grep_leading) {
+   if (!strncmp(searched, searchfor, len2))
+   return 1;
+   } else if (kdb_grep_trailing) {
+   if (!strncmp(searched+len1-len2, searchfor, len2))
+   return 1;
+   } else {
+   firstchar = *searchfor;
+   cp = searched;
+   while ((cp = strchr(cp, firstchar))) {
+   if (!strncmp(cp, searchfor,
