for ...
rank 13=os221 slot=2
rank 14=os222 slot=2
rank 15=os224 slot=2
rank 16=os228 slot=4
rank 17=os229 slot=4
I tried it, and the same thing happened. Here are the results:

2010-08-12 11:09:28,814 59759 DEBUG [0x7fbd3fdce740] - RANK(0) Printing Times...
2010-08-12 11:09:28,814 59759 DEBUG [0x7fbd3fdce740] - os221 RANK(1) :24 sec
2010-08-12 11:09:28,814 59759 DEBUG [0x7fbd3fdce740] - os222 RANK(2) :27 sec
2010-08-12 11:09:28,814 59759 DEBUG [0x7fbd3fdce740] - os224 RANK(3) :27 sec
2010-08-12 11:09:28,814 59759 DEBUG [0x7fbd3fdce740] - os228 RANK(4) :41 sec
2010-08-12 11:09:28,814 59759 DEBUG [0x7fbd3fdce740] - os229 RANK(5) :42 sec
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - os223 RANK(6) :27 sec
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - os221 RANK(7) :28 sec
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - os222 RANK(8) :22 sec
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - os224 RANK(9) :22 sec
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - os228 RANK(10) :*40 sec*
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - os229 RANK(11) :24 sec
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - os223 RANK(12) :26 sec
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - os221 RANK(13) :28 sec
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - os222 RANK(14) :27 sec
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - os224 RANK(15) :27 sec
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - os228 RANK(16) :19 sec
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - os229 RANK(17) :*43 sec*
2010-08-12 11:09:28,815 59759 DEBUG [0x7fbd3fdce740] - TOTAL CORRELATION TIME: 43 sec

for ...
rank 12=os223 slot=2
rank 13=os221 slot=2
rank 14=os222 slot=2
rank 15=os224 slot=2
rank 16=os228 slot=2
rank 17=os229 slot=2

here are the results:

2010-08-12 11:19:33,916 54609 DEBUG [0x7f22881b5740] - os221 RANK(1) :23 sec
2010-08-12 11:19:33,916 54609 DEBUG [0x7f22881b5740] - os222 RANK(2) :23 sec
2010-08-12 11:19:33,916 54609 DEBUG [0x7f22881b5740] - os224 RANK(3) :24 sec
2010-08-12 11:19:33,916 54609 DEBUG [0x7f22881b5740] - os228 RANK(4) :20 sec
2010-08-12 11:19:33,916 54609 DEBUG [0x7f22881b5740] - os229 RANK(5) :20 sec
2010-08-12 11:19:33,916 54609 DEBUG [0x7f22881b5740] - os223 RANK(6) :24 sec
2010-08-12 11:19:33,916 54609 DEBUG [0x7f22881b5740] - os221 RANK(7) :23 sec
2010-08-12 11:19:33,916 54609 DEBUG [0x7f22881b5740] - os222 RANK(8) :22 sec
2010-08-12 11:19:33,916 54609 DEBUG [0x7f22881b5740] - os224 RANK(9) :22 sec
2010-08-12 11:19:33,917 54609 DEBUG [0x7f22881b5740] - os228 RANK(10) :19 sec
2010-08-12 11:19:33,917 54609 DEBUG [0x7f22881b5740] - os229 RANK(11) :*35 sec*
2010-08-12 11:19:33,917 54609 DEBUG [0x7f22881b5740] - os223 RANK(12) :23 sec
2010-08-12 11:19:33,917 54609 DEBUG [0x7f22881b5740] - os221 RANK(13) :23 sec
2010-08-12 11:19:33,917 54609 DEBUG [0x7f22881b5740] - os222 RANK(14) :23 sec
2010-08-12 11:19:33,917 54609 DEBUG [0x7f22881b5740] - os224 RANK(15) :23 sec
2010-08-12 11:19:33,917 54609 DEBUG [0x7f22881b5740] - os228 RANK(16) :19 sec
2010-08-12 11:19:33,917 54609 DEBUG [0x7f22881b5740] - os229 RANK(17) :*37 sec*

Again, the same thing happened. I also tried slots 0, 3, 7 and some other
combinations, but the result did not change. Sometimes the timings were
fairly normal, then I got strange ones again.
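
Whether two slot numbers land on the same physical core depends on how the logical CPUs are numbered on os228 and os229. A minimal sketch for checking that on Linux (not something taken from this thread): it groups the "processor" entries of /proc/cpuinfo by "physical id" and "core id", so logical CPUs listed together are hyper-thread siblings. The function name group_hardware_threads is made up for illustration, and whether rankfile slot numbers map directly onto these logical CPU numbers is an assumption that would need checking against the Open MPI version in use.

```python
from collections import defaultdict


def group_hardware_threads(cpuinfo_text):
    """Map (physical id, core id) -> sorted list of logical CPU numbers.

    Expects text in the Linux /proc/cpuinfo layout: one block of
    "key : value" lines per logical CPU, blocks separated by a blank line.
    """
    cores = defaultdict(list)
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        # Skip blocks that carry no topology information.
        if "processor" not in fields or "core id" not in fields:
            continue
        # Assumption: a missing "physical id" means a single socket (0).
        socket = int(fields.get("physical id", 0))
        core = int(fields["core id"])
        cores[(socket, core)].append(int(fields["processor"]))
    return {key: sorted(cpus) for key, cpus in cores.items()}
```

Calling group_hardware_threads(open("/proc/cpuinfo").read()) on os228 or os229 should give, for each physical core, the logical CPU numbers that share it; any list with two entries is one core exposed as two hardware threads, so two ranks pinned to those two numbers would compete for the same core.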
*I guess specifying the slot number doesn't affect the BIOS rank choice.*

The last test was as follows:

2010-08-12 11:25:02,599 55467 DEBUG [0x7f15af87a740] - os221 RANK(1) :24 sec
2010-08-12 11:25:02,599 55467 DEBUG [0x7f15af87a740] - os222 RANK(2) :23 sec
2010-08-12 11:25:02,599 55467 DEBUG [0x7f15af87a740] - os224 RANK(3) :23 sec
*2010-08-12 11:25:02,599 55467 DEBUG [0x7f15af87a740] - os228 RANK(4) :40 sec*
2010-08-12 11:25:02,599 55467 DEBUG [0x7f15af87a740] - os229 RANK(5) :20 sec
2010-08-12 11:25:02,599 55467 DEBUG [0x7f15af87a740] - os223 RANK(6) :24 sec
2010-08-12 11:25:02,599 55467 DEBUG [0x7f15af87a740] - os221 RANK(7) :24 sec
2010-08-12 11:25:02,599 55467 DEBUG [0x7f15af87a740] - os222 RANK(8) :22 sec
2010-08-12 11:25:02,599 55468 DEBUG [0x7f15af87a740] - os224 RANK(9) :22 sec
2010-08-12 11:25:02,599 55468 DEBUG [0x7f15af87a740] - os228 RANK(10) :20 sec
2010-08-12 11:25:02,599 55468 DEBUG [0x7f15af87a740] - os229 RANK(11) :21 sec
2010-08-12 11:25:02,599 55468 DEBUG [0x7f15af87a740] - os223 RANK(12) :23 sec
2010-08-12 11:25:02,599 55468 DEBUG [0x7f15af87a740] - os221 RANK(13) :24 sec
2010-08-12 11:25:02,599 55468 DEBUG [0x7f15af87a740] - os222 RANK(14) :24 sec
2010-08-12 11:25:02,599 55468 DEBUG [0x7f15af87a740] - os224 RANK(15) :23 sec
2010-08-12 11:25:02,599 55468 DEBUG [0x7f15af87a740] - os228 RANK(16) :38 sec
2010-08-12 11:25:02,599 55468 DEBUG [0x7f15af87a740] - os229 RANK(17) :21 sec
2010-08-12 11:25:02,599 55468 DEBUG [0x7f15af87a740] - TOTAL CORRELATION TIME: 40 sec

Now I'm going to try the other suggestions here, such as mpstat or -bynode.
If I find a solution, I'll post it here.

On Wed, Aug 11, 2010 at 6:23 PM, Eugene Loh <eugene....@oracle.com> wrote:

> The way MPI processes are being assigned to hardware threads is perhaps
> neither controlled nor optimal. On the HT nodes, two processes may end up
> sharing the same core, with poorer performance.
>
> Try submitting your job like this
>
> % cat myrankfile1
> rank 0=os223 slot=0
> rank 1=os221 slot=0
> rank 2=os222 slot=0
> rank 3=os224 slot=0
> rank 4=os228 slot=0
> rank 5=os229 slot=0
> rank 6=os223 slot=1
> rank 7=os221 slot=1
> rank 8=os222 slot=1
> rank 9=os224 slot=1
> rank 10=os228 slot=1
> rank 11=os229 slot=1
> rank 12=os223 slot=2
> rank 13=os221 slot=2
> rank 14=os222 slot=2
> rank 15=os224 slot=2
> rank 16=os228 slot=2
> rank 17=os229 slot=2
> % mpirun -host os221,os222,os223,os224,os228,os229 -np 18 --rankfile myrankfile1 ./a.out
>
> You can also try
>
> % cat myrankfile2
> rank 0=os223 slot=0
> rank 1=os221 slot=0
> rank 2=os222 slot=0
> rank 3=os224 slot=0
> rank 4=os228 slot=0
> rank 5=os229 slot=0
> rank 6=os223 slot=1
> rank 7=os221 slot=1
> rank 8=os222 slot=1
> rank 9=os224 slot=1
> rank 10=os228 slot=2
> rank 11=os229 slot=2
> rank 12=os223 slot=2
> rank 13=os221 slot=2
> rank 14=os222 slot=2
> rank 15=os224 slot=2
> rank 16=os228 slot=4
> rank 17=os229 slot=4
> % mpirun -host os221,os222,os223,os224,os228,os229 -np 18 --rankfile myrankfile2 ./a.out
>
> which one reproduces your problem and which one avoids it depends on how
> the BIOS numbers your HTs. Once you can confirm you understand the problem,
> you (with the help of this list) can devise a solution approach for your
> situation.
>
>
>
> Saygin Arkan wrote:
>
> Hello,
>
> I'm running mpi jobs in non-homogeneous cluster.
> 4 of my machines have the following properties, os221, os222, os223, os224:
>
> vendor_id : GenuineIntel
> cpu family : 6
> model : 23
> model name : Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
> stepping : 7
> cache size : 3072 KB
> physical id : 0
> siblings : 4
> core id : 3
> cpu cores : 4
> fpu : yes
> fpu_exception : yes
> cpuid level : 10
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
> bogomips : 4999.40
> clflush size : 64
> cache_alignment : 64
> address sizes : 36 bits physical, 48 bits virtual
>
> and the 2 problematic, hyper-threaded machines are as follows, os228 and
> os229:
>
> vendor_id : GenuineIntel
> cpu family : 6
> model : 26
> model name : Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
> stepping : 5
> cache size : 8192 KB
> physical id : 0
> siblings : 8
> core id : 3
> cpu cores : 4
> fpu : yes
> fpu_exception : yes
> cpuid level : 11
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm ida
> bogomips : 5396.88
> clflush size : 64
> cache_alignment : 64
> address sizes : 36 bits physical, 48 bits virtual
>
>
> The problem is: those 2 machines appear to have 8 cores (virtually; the
> actual core number is 4).
> When I submit an MPI job, I measure the comparison times in the cluster.
> I got strange results.
>
> I'm running the job on 6 nodes, 3 cores per node. Sometimes (I would say
> in 1/3 of the tests) os228 or os229 returns strange results. 2 cores are slow
> (slower than the first 4 nodes) but the 3rd core is extremely fast.
>
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - RANK(0) Printing Times...
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os221 RANK(1) :38 sec
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os222 RANK(2) :38 sec
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os224 RANK(3) :38 sec
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os228 RANK(4) :37 sec
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os229 RANK(5) :34 sec
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os223 RANK(6) :38 sec
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os221 RANK(7) :39 sec
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os222 RANK(8) :37 sec
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os224 RANK(9) :38 sec
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os228 RANK(10) : *48 sec*
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os229 RANK(11) :35 sec
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os223 RANK(12) :38 sec
> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os221 RANK(13) :37 sec
> 2010-08-05 14:30:58,926 50673 DEBUG [0x7fcadf98c740] - os222 RANK(14) :37 sec
> 2010-08-05 14:30:58,926 50673 DEBUG [0x7fcadf98c740] - os224 RANK(15) :38 sec
> 2010-08-05 14:30:58,926 50673 DEBUG [0x7fcadf98c740] - os228 RANK(16) : *43 sec*
> 2010-08-05 14:30:58,926 50673 DEBUG [0x7fcadf98c740] - os229 RANK(17) :35 sec
> TOTAL CORRELATION TIME: 48 sec
>
>
> or another test:
>
> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - RANK(0) Printing Times...
> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os221 RANK(1) :170 sec
> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os222 RANK(2) :161 sec
> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os224 RANK(3) :158 sec
> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os228 RANK(4) :142 sec
> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os229 RANK(5) : *256 sec*
> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os223 RANK(6) :156 sec
> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os221 RANK(7) :162 sec
> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os222 RANK(8) :159 sec
> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os224 RANK(9) :168 sec
> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os228 RANK(10) :141 sec
> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os229 RANK(11) :136 sec
> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os223 RANK(12) :173 sec
> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os221 RANK(13) :164 sec
> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os222 RANK(14) :171 sec
> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os224 RANK(15) :156 sec
> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os228 RANK(16) :136 sec
> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os229 RANK(17) : *250 sec*
> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - TOTAL CORRELATION TIME: 256 sec
>
>
> Do you have any idea why this is happening?
> I assume that it gives 2 jobs to 2 cores in os229, but actually those 2 are
> one core.
> If so, how can I fix it? The longest time determines the total time.
> A 100 sec delay is too much for a 250 sec comparison time,
> and it might have finished around 160 sec.
--
Saygin