+ Andreas.

Dude, look at this boot log below:

http://quora.org/2012/16-server-boot-2.txt

That's 192 F10h's!

On Mon, Oct 29, 2012 at 04:54:59PM +0800, Daniel J Blueman wrote:
> >A number of other callers lookup the PCI device based on index
> >0..amd_nb_num(), but we can't easily allocate contiguous northbridge IDs
> >from the PCI device in the first place.
> >
> >OTOH we can simply this code by changing amd_get_node_id to generate a
> >linear northbridge ID from the index of the matching entry in the
> >northbridge array.
> >
> >I'll get a patch together to see if there are any snags.

I suspected that after we have this nice approach, you guys would come
with non-contiguous node numbers. Maan, can't you build your systems so
that software people can have it easy at least for once??! :-)

> This really is a lot less intrusive [1] and boots well on top of
> 3.7-rc3 on one of our 16-server/192-core/512GB systems [2].
>
> If you're happy with this simpler approach for now, I'll present
> this and a separate patch cleaning up the inconsistent use of
> unsigned and u8 node ID variables to u16?

Sure, bring it on.

> diff --git a/arch/x86/include/asm/amd_nb.h b/arch/x86/include/asm/amd_nb.h
> index b3341e9..b88fc7a 100644
> --- a/arch/x86/include/asm/amd_nb.h
> +++ b/arch/x86/include/asm/amd_nb.h
> @@ -81,6 +81,18 @@ static inline struct amd_northbridge *node_to_amd_nb(int node)
>  	return (node < amd_northbridges.num) ? &amd_northbridges.nb[node] : NULL;
>  }
>
> +static inline u8 get_node_id(struct pci_dev *pdev)
> +{
> +	int i;
> +
> +	for (i = 0; i != amd_nb_num(); i++)
> +		if (pci_domain_nr(node_to_amd_nb(i)->misc->bus) == pci_domain_nr(pdev->bus) &&
> +		    PCI_SLOT(node_to_amd_nb(i)->misc->devfn) == PCI_SLOT(pdev->devfn))
> +			return i;

Looks ok, can you send the whole patch please?

> +	BUG();

I'm not sure about this - maybe WARN()? Are we absolutely sure we
unconditionally should panic after not finding an NB descriptor?
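
To make the BUG()-vs-WARN() question concrete, here is a minimal userspace
sketch of the same linear lookup with a non-fatal "not found" path. It is not
kernel code and not the patch itself: struct nb_addr, nb_linear_id() and
NB_ID_NONE are hypothetical names, the (domain, slot) pair only stands in for
what pci_domain_nr() and PCI_SLOT() return in the hunk above, and the
fprintf() stands in for a WARN().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the PCI domain and slot of a northbridge's misc device. */
struct nb_addr {
	unsigned int domain;
	unsigned int slot;
};

/* Hypothetical sentinel: the lookup found no matching northbridge. */
#define NB_ID_NONE (-1)

/*
 * Return the index of the entry in nbs[] whose domain and slot match dev,
 * i.e. a linear ID in 0..num-1 even when the hardware IDs are not
 * contiguous. On a miss, complain and return NB_ID_NONE rather than
 * aborting, so the caller decides how to recover.
 */
static int nb_linear_id(const struct nb_addr *nbs, size_t num,
			const struct nb_addr *dev)
{
	size_t i;

	for (i = 0; i < num; i++)
		if (nbs[i].domain == dev->domain && nbs[i].slot == dev->slot)
			return (int)i;

	fprintf(stderr, "warning: no northbridge for domain %04x slot %02x\n",
		dev->domain, dev->slot);
	return NB_ID_NONE;
}
```

The sketch returns a signed int so that a miss can be reported in-band. With
the u8 return type in the hunk above there is no spare value for that, so a
WARN() variant would presumably also need a wider or signed return type;
that is an assumption on my side, not something the patch does.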
> [2] http://quora.org/2012/16-server-boot-2.txt

That's just crazy:

[   45.987953] Brought up 192 CPUs

:-)

Btw, this shouldn't happen on those CPUs:

[   39.279131] TSC synchronization [CPU#0 -> CPU#12]:
[   39.287223] Measured 22750019569 cycles TSC warp between CPUs, turning off TSC clock.
[    0.030000] tsc: Marking TSC unstable due to check_tsc_sync_source failed

I guess TSCs are not starting at the same moment on all boards.

You definitely need ucode on those too:

[  113.392460] microcode: CPU0: patch_level=0x00000000

That's just crazy, hahahah.

Thanks.

--
Regards/Gruss,
Boris.