All runs use OFED 1.4 and kernel 2.6.32 (that's what I can get to today).

qib to qib:
# OSU MPI Latency Test v3.3
# Size      Latency (us)
0               0.29
1               0.32
2               0.31
4               0.32
8               0.32
16              0.35
32              0.35
64              0.47
128             0.47
256             0.50
512             0.53
1024            0.66
2048            0.88
4096            1.24
8192            1.89
16384           3.94
32768           5.94
65536           9.79
131072         18.93
262144         37.36
524288         71.90
1048576       189.62
2097152       478.55
4194304      1148.80

# OSU MPI Bandwidth Test v3.3
# Size      Bandwidth (MB/s)
1               2.48
2               5.00
4              10.04
8              20.02
16             33.22
32             67.32
64            134.65
128           260.30
256           486.44
512           860.77
1024         1385.54
2048         1940.68
4096         2231.20
8192         2343.30
16384        2944.99
32768        3213.77
65536        3174.85
131072       3220.07
262144       3259.48
524288       3277.05
1048576      3283.97
2097152      3288.91
4194304      3291.84

# OSU MPI Bi-Directional Bandwidth Test v3.3
# Size      Bi-Bandwidth (MB/s)
1               3.10
2               6.21
4              13.08
8              26.91
16             41.00
32             78.17
64            161.13
128           312.08
256           588.18
512           968.32
1024         1683.42
2048         2513.86
4096         2948.11
8192         2918.39
16384        3370.28
32768        3543.99
65536        4159.99
131072       4709.73
262144       4733.31
524288       4795.44
1048576      4753.69
2097152      4786.11
4194304      4779.40

mlx4 to mlx4:

# OSU MPI Latency Test v3.3
# Size      Latency (us)
0               1.62
1               1.66
2               1.67
4               1.66
8               1.70
16              1.71
32              1.75
64              1.91
128             3.11
256             3.32
512             3.66
1024            4.46
2048            5.57
4096            6.62
8192            8.95
16384          11.07
32768          15.94
65536          25.57
131072         44.93
262144         83.58
524288        160.85
1048576       315.47
2097152       624.68
4194304      1247.17

# OSU MPI Bandwidth Test v3.3
# Size      Bandwidth (MB/s)
1               1.80
2               4.21
4               8.79
8              18.14
16             35.79
32             68.58
64            132.72
128           221.89
256           399.62
512           724.13
1024         1267.36
2048         1959.22
4096         2354.26
8192         2519.50
16384        3225.44
32768        3227.86
65536        3350.76
131072       3369.86
262144       3378.76
524288       3384.02
1048576      3386.60
2097152      3387.97
4194304      3388.66

# OSU MPI Bi-Directional Bandwidth Test v3.3
# Size      Bi-Bandwidth (MB/s)
1               1.70
2               3.86
4              10.42
8              20.99
16             41.22
32             79.17
64            151.25
128           277.64
256           495.44
512           843.44
1024          162.53
2048         2427.23
4096         2989.63
8192         3587.58
16384        5391.08
32768        6051.56
65536        6314.33
131072       6439.04
262144       6506.51
524288       6539.51
1048576      6558.34
2097152      6567.24
4194304      6555.76

mixed:

# OSU MPI Latency Test v3.3
# Size      Latency (us)
0               3.81
1               3.88
2               3.86
4               3.85
8               3.92
16              3.93
32              3.93
64              4.02
128             4.60
256             4.80
512             5.14
1024            5.94
2048            7.26
4096            8.50
8192           10.98
16384          19.92
32768          26.35
65536          39.93
131072         64.45
262144        106.93
524288        191.89
1048576       358.31
2097152       694.25
4194304      1429.56

# OSU MPI Bandwidth Test v3.3
# Size      Bandwidth (MB/s)
1               0.64
2               1.39
4               2.76
8               5.58
16             11.03
32             22.17
64             43.70
128           100.49
256           179.83
512           305.87
1024          544.68
2048          838.22
4096         1187.74
8192         1542.07
16384        1260.93
32768        1708.54
65536        2180.45
131072       2482.28
262144       2624.89
524288       2680.55
1048576      2728.58
never gets past here

# OSU MPI Bi-Directional Bandwidth Test v3.3
# Size      Bi-Bandwidth (MB/s)
1               0.41
2               0.83
4               1.68
8               3.37
16              6.71
32             13.37
64             26.64
128            63.47
256           113.23
512           202.92
1024          362.48
2048          578.53
4096          830.31
8192         1143.16
16384        1303.02
32768        1913.07
65536        2463.83
131072       2793.83
262144       2918.32
524288       2987.92
1048576      3033.31
never gets past here

On 07/15/11 09:03, Jeff Squyres wrote:
> I don't think too many people have done combined QLogic + Mellanox runs, so this probably isn't a well-explored space.
>
> Can you run some microbenchmarks to see what kind of latency / bandwidth you're getting between nodes of the same type and nodes of different types?
>
> On Jul 14, 2011, at 8:21 PM, David Warren wrote:
>> On my test runs (wrf run just long enough to go beyond the spinup influence):
>> On just 6 of the old mlx4 machines I get about 00:05:30 runtime.
>> On 3 mlx4 and 3 qib nodes I get an average of 00:06:20.
>> So the slowdown is about 11+%.
>> When this is a full run, 11% becomes a very long time. This has held for some longer tests as well, before I went to OFED 1.6.
>>
>> On 07/14/11 05:55, Jeff Squyres wrote:
>>> On Jul 13, 2011, at 7:46 PM, David Warren wrote:
>>>> I finally got access to the systems again (the original ones are part of our real time system). I thought I would try one other test I had set up first. I went to OFED 1.6 and it started running with no errors. It must have been an OFED bug. Now I just have the speed problem. Anyone have a way to make the mixture of mlx4 and qlogic work together without slowing down?
>>>
>>> What do you mean by "slowing down"?
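
For anyone who wants to redo the slowdown arithmetic from the two WRF runtimes quoted above (00:05:30 on six mlx4 nodes, 00:06:20 on the 3 mlx4 + 3 qib mix), here is a minimal Python sketch. The helper names `to_seconds` and `slowdown_pct` are made up for illustration and are not part of any tool mentioned in this thread.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def slowdown_pct(baseline: str, slower: str) -> float:
    """Extra runtime of `slower` relative to `baseline`, in percent."""
    base = to_seconds(baseline)
    return (to_seconds(slower) - base) / base * 100.0


if __name__ == "__main__":
    mlx4_only = "00:05:30"  # 6 mlx4 nodes (from the thread)
    mixed = "00:06:20"      # 3 mlx4 + 3 qib nodes (from the thread)
    print(f"extra runtime: {slowdown_pct(mlx4_only, mixed):.1f}%")
```

Measured as extra runtime relative to the mlx4-only run, 330 s versus 380 s works out to roughly 15%; measured as lost throughput (1 - 330/380) it is about 13%. Either way it is at least the "11+%" mentioned above.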