On Thu, Jan 18, 2007 at 11:08:04AM +0200, Gleb Natapov wrote:
>On Thu, Jan 18, 2007 at 03:52:19AM -0500, Robin Humble wrote:
>> On Wed, Jan 17, 2007 at 08:55:31AM -0700, Brian W. Barrett wrote:
>> >On Jan 17, 2007, at 2:39 AM, Gleb Natapov wrote:
>> >> On Wed, Jan 17, 2007 at 04:12:10AM -0500, Robin Humble wrote:
>> >>> basically I'm seeing wildly different bandwidths over InfiniBand 4x DDR
>> >>> when I use different kernels.
>> >> Try to load ib_mthca with tune_pci=1 option on those kernels that are
>> >> slow.
>> >when an application has high buffer reuse (like NetPIPE), which can
>> >be enabled by adding "-mca mpi_leave_pinned 1" to the mpirun command
>> >line.
>> thanks! :-)
>> tune_pci=1 makes a huge difference at the top end, and
>Well this is broken BIOS then. Look here for more explanation:
>https://staging.openfabrics.org/svn/openib/gen2/branches/1.1/ofed/docs/mthca_release_notes.txt
>search for "tune_pci=1".

ok. thanks :-/

>> -mca mpi_leave_pinned 1 adds lots of midrange bandwidth.
>>
>> latencies (~4us) and the low end performance are all unchanged.
>>
>> see attached for details.
>> most curves are for 2.6.19.2 except the last couple (tagged as old)
>> which are for 2.6.9-42.0.3.ELsmp and for which tune_pci changes nothing.
>>
>> why isn't tune_pci=1 the default I wonder?
>> files in /sys/module/ib_mthca/ tell me it's off by default in
>> 2.6.9-42.0.3.ELsmp, but the results imply that it's on... maybe PCIe
>> handling is very different in that kernel.
>This is explained in the link above.

hmmm... but (sorry to harp on about this)
/sys/module/ib_mthca/tune_pci is 0 for 2.6.9-42.0.3.ELsmp.
and even if that's lying, then mthca_tune_pci() appears identically
invoked in mthca_main.c from both 2.6.9-42.0.3.ELsmp and 2.6.19.2.

mthca_main.c is the only place in infiniband/hw/mthca that
pci_write_config_word() is called from, so you'd think that's got to be
how PCIe for IB was setup.

basically it's not clear to me how or if tune_pci is being set in
2.6.9-42.0.3.ELsmp, nor why it's any different to 2.6.19.2 :-/

maybe it's some other level in the kernel setting up PCIe differently?
but that would presumably be unrelated to OFED.

is there a way to check pci burst settings from userland? or BIOS?

BTW, the card appears to be Voltaire and system is SGI xe (210 and 240)
if that helps.
/sys/class/infiniband/mthca0/board_id is VLT0050010001

not that I'm blaming anyone! :-)

cheers,
robin
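
On the userland question, one possible approach is to read the card's PCI config space through sysfs and decode the PCIe Device Control register. The sketch below is only that: it assumes the setting of interest is the Max Read Request Size field (which is what tune_pci appears to adjust on PCIe cards), that the kernel exposes /sys/bus/pci/devices/<address>/config, and that it is run as root, since unprivileged reads are typically limited to the first 64 bytes. The function names and the example device address are made up for illustration.

```python
import struct
from pathlib import Path

# Standard PCI / PCIe config-space offsets and IDs.
PCI_STATUS = 0x06            # 16-bit status register
PCI_STATUS_CAP_LIST = 0x10   # status bit 4: capability list present
PCI_CAPABILITY_LIST = 0x34   # pointer to first capability
PCI_CAP_ID_EXP = 0x10        # PCI Express capability ID
PCI_EXP_DEVCTL = 0x08        # Device Control, relative to the PCIe capability


def read_config(address):
    """Read raw config space for a device such as '0000:0b:00.0' (hypothetical)."""
    return Path("/sys/bus/pci/devices", address, "config").read_bytes()


def find_capability(cfg, cap_id):
    """Walk the capability list; return the capability offset or None."""
    if len(cfg) < 64:
        return None
    status = struct.unpack_from("<H", cfg, PCI_STATUS)[0]
    if not status & PCI_STATUS_CAP_LIST:
        return None
    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC
    seen = set()
    # Stop on a null pointer, a truncated read, or a looping list.
    while pos >= 0x40 and pos + 1 < len(cfg) and pos not in seen:
        seen.add(pos)
        if cfg[pos] == cap_id:
            return pos
        pos = cfg[pos + 1] & 0xFC
    return None


def decode_pcie_devctl(cfg):
    """Return max payload / max read request sizes in bytes, or None."""
    cap = find_capability(cfg, PCI_CAP_ID_EXP)
    if cap is None or cap + PCI_EXP_DEVCTL + 2 > len(cfg):
        return None
    devctl = struct.unpack_from("<H", cfg, cap + PCI_EXP_DEVCTL)[0]
    return {
        "max_payload": 128 << ((devctl >> 5) & 0x7),        # bits 7:5
        "max_read_request": 128 << ((devctl >> 12) & 0x7),  # bits 14:12
    }

# Example (address is hypothetical; find the real one with lspci):
# print(decode_pcie_devctl(read_config("0000:0b:00.0")))
```

Running this under both kernels, with and without tune_pci=1, should show whether the read request size really differs. `lspci -vv` run as root reports the same fields as MaxPayload and MaxReadReq on the DevCtl line, which is a quicker cross-check.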