I have good news.

After updating to a newer kernel on the Ubuntu Server nodes, sm is no longer a
problem for the Nehalem CPUs!
My older kernel was
Linux 2.6.32-22-server #36-Ubuntu SMP Thu Jun 3 20:38:33 UTC 2010 x86_64
GNU/Linux

and I upgraded to
Linux agua 2.6.32-24-server #39-Ubuntu SMP Wed Jul 28 06:21:40 UTC 2010
x86_64 GNU/Linux

That solved everything.
Gus, maybe the problem you had with Fedora can be solved in a similar way.

We should keep this for the records.
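
For the record, this is roughly how sm can be re-checked after a kernel
upgrade (a minimal sketch; connectivity_c is the example program shipped in
the Open MPI examples/ directory, and the -np value is arbitrary):

    # single node, forcing the shared-memory BTL; with the old kernel this
    # kind of run would crash the node, with the new one it completes
    mpirun -np 8 -mca btl sm,self ./connectivity_c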

regards
Cristobal






On Wed, Jul 28, 2010 at 6:45 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> Cristobal Navarro wrote:
>
>> Gus
>> my kernel for all nodes is this one:
>> Linux 2.6.32-22-server #36-Ubuntu SMP Thu Jun 3 20:38:33 UTC 2010 x86_64
>> GNU/Linux
>>
>>
> Kernel is not my league.
>
> However, it would be great if somebody clarified
> for good these issues with Nehalem/Westmere, HT,
> shared memory and what the kernel is doing,
> or how to make the kernel do the right thing.
> Maybe Intel could tell.
>
>
>> at least for the moment I will use this configuration for
>> development/testing of the parallel programs.
>> lag is minimal :)
>>
>> whenever I get another kernel update, I will test again to check if sm
>> works; it would be good to know if another distribution suddenly supports
>> sm on Nehalem.
>>
>> best regards and thanks again
>> Cristobal
>> ps: guess what the names of the other 2 nodes are lol
>>
>
> Acatenango (I said that before), and Pacaya.
>
> Maybe: Santa Maria, Santiaguito, Atitlan, Toliman, San Pedro,
> Cerro de Oro ... too many volcanoes, and some are multithreaded ...
> You need to buy more nodes!
>
> Gus
>
>
>>
>>
>> On Wed, Jul 28, 2010 at 5:50 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>>
>>    Hi Cristobal
>>
>>    Please, read my answer (way down the message) below.
>>
>>    Cristobal Navarro wrote:
>>
>>
>>
>>        On Wed, Jul 28, 2010 at 3:28 PM, Gus Correa
>>        <g...@ldeo.columbia.edu> wrote:
>>
>>           Hi Cristobal
>>
>>           Cristobal Navarro wrote:
>>
>>
>>
>>               On Wed, Jul 28, 2010 at 11:09 AM, Gus Correa
>>               <g...@ldeo.columbia.edu> wrote:
>>
>>                  Hi Cristobal
>>
>>                  In case you are not using the full path name for
>>                  mpiexec/mpirun,
>>                  what does "which mpirun" say?
>>
>>
>>               --> $which mpirun
>>                    /opt/openmpi-1.4.2
>>
>>
>>                  Oftentimes this is a source of confusion; old versions
>>                  may be first on the PATH.
>>
>>                  Gus
>>
>>
>>               the OpenMPI version problem is now gone, I can confirm that
>>               the version is consistent now :), thanks.
>>
>>
>>           This is good news.
>>
>>
>>               however, I keep getting this kernel crash randomly when I
>>               execute with -np higher than 5.
>>               these are Xeons, with Hyperthreading on; is that a problem??
>>
>>
>>           The problem may be with Hyperthreading, maybe not.
>>           Which Xeons?
>>
>>
>>        --> they are not so old, not so new either
>>        fcluster@agua:~$ cat /proc/cpuinfo | more
>>        processor : 0
>>        vendor_id : GenuineIntel
>>        cpu family : 6
>>        model : 26
>>        model name : Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
>>        stepping : 5
>>        cpu MHz : 1596.000
>>        cache size : 8192 KB
>>        physical id : 0
>>        siblings : 8
>>        core id : 0
>>        cpu cores : 4
>>        apicid : 0
>>        initial apicid : 0
>>        fpu : yes
>>        fpu_exception : yes
>>        cpuid level : 11
>>        wp : yes
>>        flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>>        cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss h
>>        t tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts
>>        rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_
>>        cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt
>>        lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
>>        bogomips : 4522.21
>>        clflush size : 64
>>        cache_alignment : 64
>>        address sizes : 40 bits physical, 48 bits virtual
>>        power management:
>>        ...same for cpu1, 2, 3, ..., 15.
>>
>>
>>    AHA! Nehalems!
>>
>>    Here they are E5540, just a different clock speed, I suppose.
>>
>>
>>        information on how the CPUs are distributed:
>>        fcluster@agua:~$ lstopo
>>        System(7992MB)
>>         Socket#0 + L3(8192KB)
>>           L2(256KB) + L1(32KB) + Core#0
>>             P#0
>>             P#8
>>           L2(256KB) + L1(32KB) + Core#1
>>             P#2
>>             P#10
>>           L2(256KB) + L1(32KB) + Core#2
>>             P#4
>>             P#12
>>           L2(256KB) + L1(32KB) + Core#3
>>             P#6
>>             P#14
>>         Socket#1 + L3(8192KB)
>>           L2(256KB) + L1(32KB) + Core#0
>>             P#1
>>             P#9
>>           L2(256KB) + L1(32KB) + Core#1
>>             P#3
>>             P#11
>>           L2(256KB) + L1(32KB) + Core#2
>>             P#5
>>             P#13
>>           L2(256KB) + L1(32KB) + Core#3
>>             P#7
>>             P#15
>>
>>
>>
>>           If I remember right, the old hyperthreading on old Xeons was
>>           problematic.
>>
>>           OTOH, about 1-2 months ago I had trouble with OpenMPI on a
>>           relatively new Xeon Nehalem machine with (the new) Hyperthreading
>>           turned on, and Fedora Core 13.
>>           The machine would hang with the OpenMPI connectivity example.
>>           I reported this to the list; you may find it in the archives.
>>
>>
>>        --> I found the thread in the archives about an hour ago; I was not
>>        sure if it was the same problem, but I disabled HT for testing by
>>        setting the online flag to 0 on the extra CPUs shown by lstopo.
>>        Unfortunately it also crashes, so HT may not be the problem.
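>>
>>        (For reference, this is roughly how the extra logical CPUs can be
>>        taken offline through sysfs; a minimal sketch, assuming the HT
>>        siblings are P#8-P#15 as lstopo reports above, and run as root:)
>>
>>        # take the second hardware thread of each core offline
>>        for cpu in 8 9 10 11 12 13 14 15; do
>>            echo 0 > /sys/devices/system/cpu/cpu$cpu/online
>>        done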
>>
>>
>>    It didn't fix the problem in our Nehalem machine here either,
>>    although it was FC13, and I don't know what OS and kernel you're using.
>>
>>
>>           Apparently other people got everything (OpenMPI with HT on Nehalem)
>>           working in more stable distributions (CentOS, RHEL, etc.).
>>
>>           That problem was likely to be in the FC13 kernel,
>>           because even turning off HT I still had the machine hanging.
>>           Nothing worked with shared memory turned on,
>>           so I had to switch OpenMPI to use tcp instead,
>>           which is kind of ridiculous in a standalone machine.
>>
>>
>>        --> very interesting, sm can be the problem
>>
>>
>>
>>               I'm trying to locate the kernel error in the logs, but after
>>               rebooting from a crash, the error is not in kern.log (nor in
>>               kern.log.1).
>>               All I remember is that it starts with "Kernel BUG..."
>>               and at some point it mentions a certain CPU X, where that CPU
>>               can be any from 0 to 15 (I'm testing only on the main node).
>>               Does anyone know where the kernel error could be logged?
>>
>>
>>           Have you tried to turn off hyperthreading?
>>
>>
>>        --> yes, tried, same crashes.
>>
>>           In any case, depending on the application, it may not help
>>           performance much to have HT on.
>>
>>           A more radical alternative is to try
>>           -mca btl tcp,self
>>           in the mpirun command line.
>>           That is what worked in the case I mentioned above.
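>>           For example (a sketch; the -np value and program name are just
>>           placeholders):
>>           mpirun -np 8 -mca btl tcp,self ./your_mpi_program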
>>
>>
>>        wow!, this really worked :), you pointed out the problem, it
>>        was shared memory.
>>
>>
>>    Great news!
>>    That's exactly the problem we had here.
>>    Glad that the same solution worked for you.
>>
>>    Over a year ago another fellow reported the same problem on Nehalem,
>>    in the very early days of Nehalem.
>>    The thread should be in the archives.
>>    Somebody back then (Ralph, or Jeff, or other?)
>>    suggested that turning off "sm" might work.
>>    So, I take no credit for this.
>>
>>
>>        I have 4 nodes, so there will be inter-node communication anyway; do
>>        you think I can rely on working with -mca btl tcp,self?? I don't
>>        mind a small lag.
>>
>>
>>    Well, this may be it, short of reinstalling the OS.
>>
>>    Some people reported everything works with OpenMPI+HT+sm in CentOS
>>    and RHEL, see the thread I mentioned in the archives from 1-2 months
>>    ago.
>>    I don't administer that machine, and didn't have the time to do OS
>>    reinstall either.
>>    So I left it with -mca btl tcp,self, and the user/machine owner
>>    is happy that he can run his programs right,
>>    and with a performance that he considers good.
>>
>>
>>        I just have one more question: is this a problem with the Ubuntu
>>        Server kernel?? with the Nehalem CPUs?? with OpenMPI (I don't
>>        think so)??
>>
>>
>>    I don't have any idea.
>>    It may be a problem with some kernels, not sure.
>>    Which kernel do you have?
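>>    ("uname -r" on the nodes will show the kernel release.)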
>>
>>    Ours was FC-13, maybe FC-12, I don't remember exactly.
>>    Currently that machine has kernel 2.6.33.6-147.fc13.x86_64 #1 SMP.
>>    However, it may have been a slightly older kernel when I installed
>>    OpenMPI there.
>>    It may have been 2.6.33.5-124.fc13.x86_64 or 2.6.32.14-127.fc12.x86_64.
>>    My colleague here updates the machines with yum,
>>    so it may have gotten a new kernel since then.
>>
>>    Our workhorse machines in the clusters that I take care
>>    of are AMD Opteron, never had this problem there.
>>    Maybe the kernels have yet to catch up with Nehalem,
>>    now Westmere, soon another one.
>>
>>
>>        and what would it take for sm to work in the future on the same
>>        configuration I have?? a kernel update?
>>
>>
>>    You may want to try CentOS or RHEL, but I can't guarantee the results.
>>    Somebody else in the list may have had the direct experience,
>>    and may speak out.
>>
>>    It may be worth the effort anyway.
>>    After all, intra-node communication should be
>>    running on shared memory.
>>    Having to turn it off is outrageous.
>>
>>    If you try another OS distribution,
>>    and if it works, please report the results back to the list:
>>    OS/distro, kernel, OpenMPI version, HT on or off,
>>    mca btl sm/tcp/self/etc choices, compilers, etc.
>>    This type of information is a real time saver for everybody.
>>
>>
>>
>>        Thanks very much Gus, really!
>>        Cristobal
>>
>>
>>
>>    My pleasure.
>>    Glad that there was a solution, even though not the best.
>>    Enjoy your cluster with volcano-named nodes!
>>    Have fun with OpenMPI and PETSc!
>>
>>    Gus Correa
>>    ---------------------------------------------------------------------
>>    Gustavo Correa
>>    Lamont-Doherty Earth Observatory - Columbia University
>>    Palisades, NY, 10964-8000 - USA
>>    ---------------------------------------------------------------------
>>
>>
>>           My $0.02
>>           Gus Correa
>>
>>
>>                  Cristobal Navarro wrote:
>>
>>
>>                      On Tue, Jul 27, 2010 at 7:29 PM, Gus Correa
>>                      <g...@ldeo.columbia.edu> wrote:
>>
>>                         Hi Cristobal
>>
>>                         Does it run on the head node alone?
>>                         (Fuego? Agua? Acatenango?)
>>                         Try to put only the head node on the hostfile
>>        and execute
>>                      with mpiexec.
>>
>>                      --> I will try with only the head node, and post
>>                      results back
>>                         This may help sort out what is going on.
>>                         Hopefully it will run on the head node.
>>
>>                         Also, do you have Infiniband connecting the nodes?
>>                         The error messages refer to the openib btl (i.e.
>>                         Infiniband), and complain of
>>
>>
>>                      no, we are just using a normal 100 Mbit/s network,
>>                      since I am just testing for now.
>>
>>
>>                         "perhaps a missing symbol, or compiled for a
>>                         different version of Open MPI?".
>>                         It sounds like a mixup of versions/builds.
>>
>>
>>                      --> I agree, somewhere there must be remains of the
>>                      older version
>>
>>                         Did you configure/build OpenMPI from source, or
>>                         did you install it with apt-get?
>>                         It may be easier/less confusing to install from
>>                         source.
>>                         If you did, what configure options did you use?
>>
>>
>>                      --> I installed from source: ./configure
>>                      --prefix=/opt/openmpi-1.4.2 --with-sge --without-xgrid
>>                      --disable-static
>>
>>                         Also, as for the OpenMPI runtime environment,
>>                         it is not enough to set it on the command line,
>>                         because that will be effective only on the head
>>                         node.
>>                         You need to either add the OpenMPI directories to
>>                         PATH and LD_LIBRARY_PATH in your .bashrc/.cshrc
>>                         files (assuming these files and your home directory
>>                         are *also* shared with the nodes via NFS),
>>                         or use the --prefix option of mpiexec to point to
>>                         the OpenMPI main directory.
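>>                         For example (a sketch; the prefix, -np value, and
>>                         program name are placeholders):
>>                         mpiexec --prefix /opt/openmpi-1.4.2 -np 4 ./my_program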
>>
>>
>>                      yes, all nodes have their PATH and LD_LIBRARY_PATH
>>                      set up properly inside the login scripts (.bashrc in
>>                      my case)
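>>
>>                      (for reference, roughly these lines, assuming the
>>                      /opt/openmpi-1.4.2 install prefix:)
>>
>>                      # make the OpenMPI binaries and libraries visible
>>                      # in login and non-interactive shells on every node
>>                      export PATH=/opt/openmpi-1.4.2/bin:$PATH
>>                      export LD_LIBRARY_PATH=/opt/openmpi-1.4.2/lib:$LD_LIBRARY_PATH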
>>
>>                         Needless to say, you need to check and ensure that
>>                         the OpenMPI directory (and maybe your home
>>                         directory, and your work directory) is (are)
>>                         really mounted on the nodes.
>>
>>
>>                      --> yes, double-checked that they are
>>
>>                         I hope this helps,
>>
>>
>>                      --> thanks really!
>>
>>                         Gus Correa
>>
>>                      Update: I just reinstalled OpenMPI with the same
>>                      parameters, and it seems that the problem is gone;
>>                      I couldn't test it entirely, but when I get back to
>>                      the lab I'll confirm.
>>
>>                      best regards! Cristobal