Yes, Nathan has a few coll ml fixes queued up for 1.8.
On Mar 24, 2014, at 10:11 PM, tmish...@jcity.maeda.co.jp wrote:
I ran our application again using the final version of openmpi-1.7.5
with coll_ml_priority = 90.
Then coll/ml was actually activated, and I got the error messages
shown below:
[manage][[11217,1],0][coll_ml_lmngr.c:265:mca_coll_ml_lmngr_alloc] COLL-ML List manager is empty.
[manage][[11217,1
Thanks, I now roughly understand what coll_ml is and how you
plan to handle it.
As Ralph pointed out, I never confirmed that coll_ml was actually used.
I just assumed the slowdown meant it was. I'll check
later. The slowdown might be due to the expensive connectivity computation.
Tetsuya
One of the authors of ML mentioned to me off-list that he has an idea what
might have been causing the slowdown. They're actively working on tweaking and
making things better.
I told them to ping you -- the whole point is that ml is supposed to be
*better* than our existing collectives, so if
On Mar 20, 2014, at 5:56 PM, tmish...@jcity.maeda.co.jp wrote:
Hi Ralph, congratulations on releasing the new openmpi-1.7.5.
By the way, openmpi-1.7.5rc3 has been slowing down our application
with smaller test data sets, where the time-consuming part
of our application is the so-called sparse solver. It's negligible
with medium or large size data - more practi