If at a certain message size x you achieve X performance (MB/s), and at 2x or
larger you achieve Y performance, with Y significantly lower than X, is it
possible to have a parameter that chops messages internally into x-sized pieces
in order to sustain X performance rather than letting it choke? A sort of flow
control to avoid congestion?
If that is possible, what would that parameter be for vader?
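
In case no single MCA parameter does exactly that, below is a minimal sketch of
doing the chopping by hand at the application level, just to test whether
staying near the peak message size helps. The 64 KiB chunk size, the 4 MB total
message, and the overall structure are my own assumptions for illustration, not
anything vader is documented to do internally:

/* Sketch only: send one large buffer as fixed-size pieces so every send
 * stays near the message size where the measured bandwidth peaked. */
#include <mpi.h>
#include <stdlib.h>

#define CHUNK   (64 * 1024)        /* bytes per piece (assumed)  */
#define MSGLEN  (4 * 1024 * 1024)  /* total message size (bytes) */

int main(int argc, char **argv)
{
    int rank;
    char *buf = malloc(MSGLEN);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (size_t off = 0; off < MSGLEN; off += CHUNK) {
        size_t n = (MSGLEN - off < CHUNK) ? MSGLEN - off : CHUNK;
        if (rank == 0)
            MPI_Send(buf + off, (int)n, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(buf + off, (int)n, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    free(buf);
    return 0;
}

I would run something like "mpirun -np 2 --mca btl self,vader ./chop" (binary
name is just an example) and compare against a single large send.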

Other than the source code, is there any detailed documentation or study of
vader-related parameters for improving bandwidth at large message sizes? I did
see some documentation for sm, but not for vader.
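
For what it is worth, "ompi_info --param btl vader --level 9" should at least
list the vader MCA parameters and their current values, but it does not explain
how to tune them for large messages.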

Thanks,
Joshua


------ Original Message ------
Received: 03:06 PM CDT, 03/17/2017
From: George Bosilca <bosi...@icl.utk.edu>
To: Joshua Mora <joshua_m...@usa.net>
Cc: Open MPI Users <users@lists.open-mpi.org>
Subject: Re: [OMPI users] tuning sm/vader for large messages

> On Fri, Mar 17, 2017 at 3:33 PM, Joshua Mora <joshua_m...@usa.net> wrote:
> 
> > Thanks for the quick reply.
> > This test is between 2 cores that are on different CPUs. Say data has to
> > traverse coherent fabric (eg. QPI,UPI, cHT).
> > It has to go to main memory independently of cache size. Wrong assumption?
> >
> 
> Depends on the usage pattern. Some benchmarks have options to clean/flush
> the cache before each round of tests.
> 
> 
> > Can data be evicted from cache and put into cache of second core on
> > different
> > CPU without placing it first in main memory ?
> >
> 
> It would depend on the memory coherency protocol. Usually it gets marked as
> shared, and as a result it might not need to be pushed into main memory
> right away.
> 
> 
> > I am more thinking that there is a parameter that splits large messages
> > into smaller ones at 64k or 128k?
> >
> 
> Pipelining is not the answer to all situations. Once your messages are
> larger than the caches, you already built memory pressure (by getting
> outside the cache size) so the pipelining is bound by the memory bandwidth.
> 
> 
> 
> > This seems (wrong assumption?) like the kind of parameter I would need for
> > large messages on a NIC. Coalescing data / large MTU, ...
> 
> 
> Sure, but there are hard limits imposed by the hardware, especially with
> regards to intranode communications. Once you saturate the memory bus, you
> hit a pretty hard limit.
> 
>   George.
> 
> 
> 
> >
> > Joshua
> >
> > ------ Original Message ------
> > Received: 02:15 PM CDT, 03/17/2017
> > From: George Bosilca <bosi...@icl.utk.edu>
> > To: Open MPI Users <users@lists.open-mpi.org>
> >
> > Subject: Re: [OMPI users] tuning sm/vader for large messages
> >
> > > Joshua,
> > >
> > > In shared memory the bandwidth depends on many parameters, including the
> > > process placement and the size of the different cache levels. In your
> > > particular case I guess after 128k you are outside the L2 cache (1/2 of the
> > > cache in fact) and the bandwidth will drop as the data need to be flushed
> > > to main memory.
> > >
> > >   George.
> > >
> > >
> > >
> > > On Fri, Mar 17, 2017 at 1:47 PM, Joshua Mora <joshua_m...@usa.net> wrote:
> > >
> > > > Hello,
> > > > I am trying to get the max bw for shared memory communications using
> > > > osu_[bw,bibw,mbw_mr] benchmarks.
> > > > I am observing a peak at ~64k/128K msg size and then drops instead of
> > > > sustaining it.
> > > > What parameters or linux config do I need to add to default openmpi
> > > > settings
> > > > to get this improved ?
> > > > I am already using vader and knem.
> > > >
> > > > See below one way bandwidth with peak at 64k.
> > > >
> > > > # Size      Bandwidth (MB/s)
> > > > 1                       1.02
> > > > 2                       2.13
> > > > 4                       4.03
> > > > 8                       8.48
> > > > 16                     11.90
> > > > 32                     23.29
> > > > 64                     47.33
> > > > 128                    88.08
> > > > 256                   136.77
> > > > 512                   245.06
> > > > 1024                  263.79
> > > > 2048                  405.49
> > > > 4096                 1040.46
> > > > 8192                 1964.81
> > > > 16384                2983.71
> > > > 32768                5705.11
> > > > 65536                7181.11
> > > > 131072               6490.55
> > > > 262144               4449.59
> > > > 524288               4898.14
> > > > 1048576              5324.45
> > > > 2097152              5539.79
> > > > 4194304              5669.76
> > > >
> > > > Thanks,
> > > > Joshua
> > > >

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
