On Jan 22, 2013, at 2:53 PM, Shamis, Pavel wrote:

>> Switching to SRQ and some guess of queue values selected appears to let the
>> code run.
>> S,4096,128:S,12288,128:S,65536,12
>>
>> Two questions,
>>
>> This is a ConnectX fabric, should I switch them to XRC queues? And should I
>> use the same queue size/count? That a safe assumption?
>> X,4096,128:X,12288,128:X,65536,12
>
> Yeah, I would use the same values as a starting point.

Thanks, the user's full-resolution job got further with shared queues. We are
going to do a test with XRC queues of the same count. But he keeps getting
Open MPI out of memory/reg fail messages.

>> When should I use one queue type over the other?
>
> Generally speaking, XRC transport has much better scalability than RC.

OK, so if we are using shared queues on ConnectX gear, default to XRC. Will do.

>> Is there a way to get stat feedback on the use of your shared queues (SRQ or
>> XRC)?
>>
>> Example, using code 'not from here' and would like to know, "hey you are
>> always running out of your queue of size X" Or "your queue of size Y is
>> never used"
>>
>> We are kinda blind for a lot of our applications :-)
>
> Right now we don't have such hooks in the openib BTL.
> It is not very difficult to add some code that will report stats for QP
> utilization.
>
> In your other email you mentioned MXM. I would recommend trying both XRC and
> MXM to see which one performs better. On a relatively small system I would
> guess XRC will perform better; on a large system MXM should demonstrate
> better performance. But again, it all depends on your application.
>
> - Pasha
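
For anyone repeating the SRQ-to-XRC switch discussed above, here is a minimal Python sketch (not part of Open MPI) that parses a receive-queue string of the form used in this thread, rewrites every queue to XRC with the same sizes and counts, and adds up size times count as a rough figure for receive-buffer payload per queue set. The helper names are hypothetical, the byte total ignores headers and however many queue sets a process actually creates, and `./app` in the printed command is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class ReceiveQueue:
    kind: str          # "P" (per-peer), "S" (shared/SRQ) or "X" (XRC)
    buffer_size: int   # bytes per receive buffer
    num_buffers: int   # number of receive buffers
    extra: tuple = ()  # optional trailing fields, carried through unchanged


def parse_receive_queues(spec):
    """Split 'S,4096,128:S,12288,128' into a list of ReceiveQueue records."""
    queues = []
    for entry in spec.split(":"):
        fields = entry.split(",")
        if len(fields) < 3 or fields[0] not in ("P", "S", "X"):
            raise ValueError("unrecognised queue entry: %r" % entry)
        queues.append(
            ReceiveQueue(fields[0], int(fields[1]), int(fields[2]), tuple(fields[3:]))
        )
    return queues


def format_receive_queues(queues):
    """Inverse of parse_receive_queues."""
    return ":".join(
        ",".join([q.kind, str(q.buffer_size), str(q.num_buffers), *q.extra])
        for q in queues
    )


def with_kind(queues, kind):
    """Return the same sizes and counts with every queue switched to `kind`."""
    return [ReceiveQueue(kind, q.buffer_size, q.num_buffers, q.extra) for q in queues]


def buffer_bytes(queues):
    """Rough payload estimate: sum of buffer_size * num_buffers per queue."""
    return sum(q.buffer_size * q.num_buffers for q in queues)


srq = parse_receive_queues("S,4096,128:S,12288,128:S,65536,12")
xrc = format_receive_queues(with_kind(srq, "X"))

print(xrc)
print(buffer_bytes(srq), "bytes of receive buffers per queue set (rough estimate)")
# ./app is a placeholder for the real executable.
print("mpirun --mca btl_openib_receive_queues " + xrc + " ./app")
```

The estimate is only meant as a quick sanity check when comparing candidate queue settings; it does not replace the utilization statistics asked about above, which the openib BTL does not currently report.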