Re: [deal.II] Cluster with Infiniband network

2016-11-24 Thread Wolfgang Bangerth
On 11/17/2016 05:41 AM, Denis Davydov wrote: On Wednesday, November 16, 2016 at 3:11:08 PM UTC+1, Wolfgang Bangerth wrote: Our rule of thumb is usually to use about 100,000 unknowns per MPI process. @Wolfgang: maybe this should be added to a stand-alone page in https://www.dealii.org …
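A back-of-the-envelope way to apply that rule of thumb is simply to divide the global number of unknowns by roughly 100,000 and round up. The sketch below is illustrative only; the problem size and the exact constant are assumptions, not anything from deal.II itself:

  // Sketch: estimate how many MPI processes to request for a given
  // problem size, using the ~100,000 unknowns per process rule of thumb.
  #include <cstdint>
  #include <iostream>

  int main()
  {
    const std::uint64_t n_dofs               = 96000000; // example problem size
    const std::uint64_t dofs_per_mpi_process = 100000;   // rule of thumb

    const std::uint64_t n_processes =
      (n_dofs + dofs_per_mpi_process - 1) / dofs_per_mpi_process; // round up

    std::cout << "request roughly " << n_processes << " MPI processes\n";
  }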

Re: [deal.II] Cluster with Infiniband network

2016-11-17 Thread Wolfgang Bangerth
On 11/17/2016 06:40 AM, Ali Dorostkar wrote: Assume that we have 1,000,000 DOFs per compute node, which means 1,000,000 DOFs per MPI process if there is one MPI process per node; this means roughly 10 GB of memory. Now if we use 10 cores per node, it means 100,000 DOFs per MPI process and the amount of memory per M…
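To make the arithmetic explicit, here is a small self-contained sketch. The ~10 GB per 1,000,000 DoFs figure is taken from the post above; the fixed per-rank overhead term is a hypothetical placeholder whose only purpose is to show why the total per-node footprint grows when more ranks share a node:

  #include <iostream>

  int main()
  {
    const double gb_per_dof    = 10.0 / 1.0e6;  // ~10 GB for 1e6 DoFs (from the post)
    const double overhead_gb   = 0.5;           // assumed fixed cost per MPI rank
    const unsigned int ranks_per_node = 10;
    const double dofs_per_node = 1.0e6;

    const double dofs_per_rank = dofs_per_node / ranks_per_node;
    const double gb_per_rank   = dofs_per_rank * gb_per_dof + overhead_gb;
    const double gb_per_node   = gb_per_rank * ranks_per_node;

    std::cout << "per rank: " << gb_per_rank << " GB, "
              << "per node: " << gb_per_node << " GB\n";
  }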

Re: [deal.II] Cluster with Infiniband network

2016-11-17 Thread Denis Davydov
On Wednesday, November 16, 2016 at 3:11:08 PM UTC+1, Wolfgang Bangerth wrote: > Our rule of thumb is usually to use about 100,000 unknowns per MPI process. @Wolfgang: maybe this should be added to a stand-alone page in https://www.dealii.org/developer/index.html within the "Information …

Re: [deal.II] Cluster with Infiniband network

2016-11-16 Thread Wolfgang Bangerth
On 11/16/2016 04:31 AM, Ashkan Dorostkar wrote: Could you elaborate on why I am using twice as much memory on each node if I use more than one core per node? If you use two cores per node, then you are running two programs within the same amount of memory, each of which will allocate their own data …
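A minimal MPI sketch of this point (the array sizes are made-up illustrations): anything that is not distributed gets allocated once per rank, so placing more ranks on a node multiplies that part of the footprint, while the distributed part shrinks per rank:

  #include <mpi.h>
  #include <cstddef>
  #include <cstdio>
  #include <vector>

  int main(int argc, char **argv)
  {
    MPI_Init(&argc, &argv);
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Replicated data: every rank holds its own full copy.
    std::vector<double> replicated(10000000);           // ~80 MB on each rank

    // Distributed data: each rank holds only its share of the global problem.
    const std::size_t n_global = 96000000;
    std::vector<double> local(n_global / size + 1);     // shrinks as ranks grow

    std::printf("rank %d: replicated %.0f MB, local %.0f MB\n",
                rank,
                replicated.size() * 8.0 / 1e6,
                local.size() * 8.0 / 1e6);

    MPI_Finalize();
    return 0;
  }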

Re: [deal.II] Cluster with Infiniband network

2016-11-14 Thread Ashkan Dorostkar
Hi again, Sorry for the late reply. Believe it or not, the internet connection is terrible here. The failure happens when the number of degrees of freedom is 96 million or higher. This happens every time I run the program. I have had this issue on another cluster but I managed to avoid it by using …

Re: [deal.II] Cluster with Infiniband network

2016-11-09 Thread Wolfgang Bangerth
On 11/09/2016 08:36 AM, Ashkan Dorostkar wrote:
[n49422:9059] *** An error occurred in MPI_Allreduce
[n49422:9059] *** reported by process [3040346113,140733193388063]
[n49422:9059] *** on communicator MPI_COMM_WORLD
[n49422:9059] *** MPI_ERR_IN_STATUS: error code in status
[n49422:9059] *** MPI…

[deal.II] Cluster with Infiniband network

2016-11-09 Thread Ashkan Dorostkar
Hello all, I am running a simulation of linear elasticity on the Lomonosov2 cluster (rank 41 in the top 500), which has an InfiniBand network. For problems larger than 90 million unknowns, Open MPI just aborts the program with this message:
[n49422:9059] *** An error occurred in MPI_Allreduce
[n49422:905…