Hello Ralph & Jeff,

    This is the same issue, but this time the job is running on a single node.

The two systems on which the jobs are run have the same hardware/OS
configuration. The only differences are:

One node has 4 GB RAM and is part of the InfiniBand-connected nodes.

The other node has 8 GB RAM and is part of the gigabit-connected nodes.

For both jobs, only 4 processes are used.

All the processes are run on a single node.
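
Since process placement on the 8-core node may also matter, I can rerun
both jobs with processor affinity enabled. With Open MPI 1.3 this can be
done via the mpi_paffinity_alone MCA parameter; a sketch of the command
(cpmd.x and the input file name here are placeholders for the actual job):

    mpirun -np 4 --mca mpi_paffinity_alone 1 ./cpmd.x inp.run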

But why does the gigabit node take more time than the InfiniBand node?

(ELAPSED TIME = wall-clock time)
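
To check the CPU vs. wall-clock accounting outside CPMD, I can run a
small MPI test along these lines (a minimal sketch; MPI_Wtime and
getrusage are standard, the loop is just a stand-in for real work). A
large gap between the two times would mean the process is waiting, e.g.
on paging or I/O, rather than computing:

    #include <stdio.h>
    #include <sys/resource.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        double wall0 = MPI_Wtime();

        /* stand-in for the real computation */
        volatile double x = 0.0;
        for (long i = 0; i < 200000000L; i++)
            x += (double)i;

        double wall = MPI_Wtime() - wall0;

        /* per-process CPU time (user + system) */
        struct rusage ru;
        getrusage(RUSAGE_SELF, &ru);
        double cpu = ru.ru_utime.tv_sec + 1e-6 * ru.ru_utime.tv_usec
                   + ru.ru_stime.tv_sec + 1e-6 * ru.ru_stime.tv_usec;

        printf("cpu = %.2f s, wall = %.2f s\n", cpu, wall);
        MPI_Finalize();
        return 0;
    }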

I hope the problem is now clear.

Thanks,
Sangamesh
On Mon, Mar 9, 2009 at 10:56 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
> It depends on the characteristics of the nodes in question.  You mention the
> CPU speeds and the RAM, but there are other factors as well: cache size,
> memory architecture, how many MPI processes you're running, etc.  Memory
> access patterns, particularly across UMA machines like Clovertown and
> follow-on Intel architectures, can really get bogged down by the RAM
> bottleneck (all 8 cores hammering on memory simultaneously via a single
> memory bus).
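>
> To see that effect directly, a streaming kernel along these lines (a
> minimal sketch, not CPMD's actual kernel) can be launched with -np 1,
> 2, 4, and 8 on one node; on a bus-limited machine, the per-process
> bandwidth drops as ranks are added:
>
>     #include <stdio.h>
>     #include <stdlib.h>
>     #include <mpi.h>
>
>     #define N 20000000L   /* 20M doubles: ~160 MB per array, per rank */
>
>     int main(int argc, char **argv)
>     {
>         MPI_Init(&argc, &argv);
>         double *a = malloc(N * sizeof(double));
>         double *b = malloc(N * sizeof(double));
>         for (long i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; }
>
>         double t0 = MPI_Wtime();
>         for (long i = 0; i < N; i++)    /* memory-bound: read b, write a */
>             a[i] = 2.5 * b[i];
>         double t = MPI_Wtime() - t0;
>
>         int rank;
>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>         printf("rank %d: %.2f GB/s (chk %.1f)\n", rank,
>                2.0 * N * sizeof(double) / t / 1e9, a[0] + a[N - 1]);
>
>         free(a); free(b);
>         MPI_Finalize();
>         return 0;
>     }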
>
>
>
> On Mar 9, 2009, at 10:30 AM, Sangamesh B wrote:
>
>> Dear Open MPI team,
>>
>>      With Open MPI 1.3, the Fortran application CPMD is installed on
>> a Rocks-4.3 cluster of dual-processor quad-core Xeons @ 3 GHz (8
>> cores per node).
>>
>> Two jobs (4 processes each) are run separately, one on each node: one
>> node has an InfiniBand connection (4 GB RAM) and the other has a
>> gigabit connection (8 GB RAM).
>>
>> Note that the network connectivity is not (and need not be) used, as
>> the two jobs run in standalone mode.
>>
>> Since each job runs on a single node, there is no communication
>> between nodes, so the performance of the two jobs should be the same
>> irrespective of network connectivity. But that is not the case here:
>> the gigabit job takes double the time of the InfiniBand job.
>>
>> Following are the details of two jobs:
>>
>> InfiniBand Job:
>>
>>      CPU TIME :    0 HOURS 10 MINUTES 21.71 SECONDS
>>   ELAPSED TIME :    0 HOURS 10 MINUTES 23.08 SECONDS
>>  ***      CPMD| SIZE OF THE PROGRAM IS  301192/ 571044 kBYTES ***
>>
>> Gigabit Job:
>>
>>       CPU TIME :    0 HOURS 12 MINUTES  7.93 SECONDS
>>   ELAPSED TIME :    0 HOURS 21 MINUTES  0.07 SECONDS
>>  ***      CPMD| SIZE OF THE PROGRAM IS  123420/ 384344 kBYTES ***
>>
>> More details are attached in a file.
>>
>> Why is there such a large difference between CPU TIME and ELAPSED
>> TIME for the gigabit job?
>>
>> Could this be an issue with Open MPI itself? What could be the reason?
>>
>> Are there any flags that need to be set?
>>
>> Thanks in advance,
>> Sangamesh
>>
>> <cpmd_gb_ib_1node><ATT3915213.txt>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
