Hi,

> It's probably due to one or both of the following:
>
> 1. configure with --enable-heterogeneous
>
> 2. execute with --hetero-apps
>
> Both are required for hetero operations
Unfortunately, this doesn't solve the problem. I thought that
-hetero-apps was only necessary for mixing 32- and 64-bit binaries and
that it was only available in openmpi-1.7.x and newer.

tyr matrix 105 ompi_info | grep -e Ident -e Hetero -e "Built on"
  Ident string: 1.6.2
  Built on: Fri Sep 28 19:25:06 CEST 2012
  Heterogeneous support: yes

tyr matrix 106 mpiexec -np 4 -host rs0,sunpc0 -hetero-apps mat_mult_1
mpiexec (OpenRTE) 1.6.2

Usage: mpiexec [OPTION]...  [PROGRAM]...
Start the given program using Open RTE
...

-hetero-apps is not available in openmpi-1.6.2, so I tried once more
with openmpi-1.9.

tyr matrix 117 file ~/SunOS/sparc/bin/mat_mult_1 \
  ~/SunOS/x86_64/bin/mat_mult_1
/home/fd1026/SunOS/sparc/bin/mat_mult_1:  ELF 64-bit MSB executable
  SPARCV9 Version 1, dynamically linked, not stripped
/home/fd1026/SunOS/x86_64/bin/mat_mult_1: ELF 64-bit LSB executable
  AMD64 Version 1 [SSE2 SSE FXSR FPU], dynamically linked, not stripped

tyr matrix 118 ompi_info | grep -e Ident -e Hetero -e "Built on"
  Ident string: 1.9a1r27380
  Built on: Fri Sep 28 22:57:29 CEST 2012
  Heterogeneous support: yes

tyr matrix 119 mpiexec -np 4 -host rs0,sunpc0 -hetero-apps mat_mult_1
Process 0 of 4 running on rs0.informatik.hs-fulda.de
Process 2 of 4 running on sunpc0
Process 1 of 4 running on rs0.informatik.hs-fulda.de
Process 3 of 4 running on sunpc0


(4,6)-matrix a:

     1     2     3     4     5     6
     7     8     9    10    11    12
    13    14    15    16    17    18
    19    20    21    22    23    24

(6,8)-matrix b:

    48    47    46    45    44    43    42    41
    40    39    38    37    36    35    34    33
    32    31    30    29    28    27    26    25
    24    23    22    21    20    19    18    17
    16    15    14    13    12    11    10     9
     8     7     6     5     4     3     2     1

(4,8)-result-matrix c = a * b :

   448   427   406   385   364   343   322   301
  1456  1399  1342  1285  1228  1171  1114  1057
49.015-3.56813e+304-3.1678e+296  -NaN6.6727e-315-7.40627e+304-3.1678e+296  -NaN
    48-3.56598e+304-3.18057e+296  -NaN2.122e-314-7.68057e+304-3.26998e+296  -NaN

Thank you very much in advance for any help. I assume that the
conversion from the internal format to the network format and vice
versa has a bug (a minimal broadcast test that might help to isolate
this step is sketched at the very end of this mail).


Kind regards

Siegmar


> On Oct 4, 2012, at 4:03 AM, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:
>
> > Hi,
> >
> > I have a small matrix multiplication program which computes wrong
> > results in a heterogeneous environment with different little-endian
> > and big-endian architectures. Every process computes one row (block)
> > of the result matrix.
> >
> >
> > Solaris 10 x86_64 and Linux x86_64:
> >
> > tyr matrix 162 mpiexec -np 4 -host sunpc0,sunpc1,linpc0,linpc1 mat_mult_1
> > Process 0 of 4 running on sunpc0
> > Process 1 of 4 running on sunpc1
> > Process 2 of 4 running on linpc0
> > Process 3 of 4 running on linpc1
> > ...
> > (4,8)-result-matrix c = a * b :
> >
> >    448   427   406   385   364   343   322   301
> >   1456  1399  1342  1285  1228  1171  1114  1057
> >   2464  2371  2278  2185  2092  1999  1906  1813
> >   3472  3343  3214  3085  2956  2827  2698  2569
> >
> >
> > Solaris Sparc:
> >
> > tyr matrix 167 mpiexec -np 4 -host tyr,rs0,rs1 mat_mult_1
> > Process 0 of 4 running on tyr.informatik.hs-fulda.de
> > Process 3 of 4 running on tyr.informatik.hs-fulda.de
> > Process 2 of 4 running on rs1.informatik.hs-fulda.de
> > Process 1 of 4 running on rs0.informatik.hs-fulda.de
> > ...
> > (4,8)-result-matrix c = a * b :
> >
> >    448   427   406   385   364   343   322   301
> >   1456  1399  1342  1285  1228  1171  1114  1057
> >   2464  2371  2278  2185  2092  1999  1906  1813
> >   3472  3343  3214  3085  2956  2827  2698  2569
> >
> >
> > Solaris Sparc and x86_64: Rows 1 and 3 are from sunpc0 (adding the
> > option "-hetero" doesn't change anything)
> >
> > tyr matrix 168 mpiexec -np 4 -host tyr,sunpc0 mat_mult_1
> > Process 1 of 4 running on sunpc0
> > Process 3 of 4 running on sunpc0
> > Process 0 of 4 running on tyr.informatik.hs-fulda.de
> > Process 2 of 4 running on tyr.informatik.hs-fulda.de
> > ...
> > (4,8)-result-matrix c = a * b :
> >
> >    448   427   406   385   364   343   322   301
> >     48-3.01737e+304-3.1678e+296  -NaN     0-7.40627e+304-3.16839e+296  -NaN
> >   2464  2371  2278  2185  2092  1999  1906  1813
> >     48-3.01737e+304-3.18057e+296  -NaN2.122e-314-7.68057e+304-3.26998e+296  -NaN
> >
> >
> > Solaris Sparc and Linux x86_64: Rows 1 and 3 are from linpc0
> >
> > tyr matrix 169 mpiexec -np 4 -host tyr,linpc0 mat_mult_1
> > Process 0 of 4 running on tyr.informatik.hs-fulda.de
> > Process 2 of 4 running on tyr.informatik.hs-fulda.de
> > Process 1 of 4 running on linpc0
> > Process 3 of 4 running on linpc0
> > ...
> > (4,8)-result-matrix c = a * b :
> >
> >    448   427   406   385   364   343   322   301
> >      0     0     0     0     0     08.10602e-3124.27085e-319
> >   2464  2371  2278  2185  2092  1999  1906  1813
> > 6.66666e-3152.86948e-3161.73834e-3101.39066e-3092.122e-3141.39066e-3091.39066e-3099.88131e-324
> >
> > In the past the program worked in a heterogeneous environment. This
> > is the main part of the program:
> >
> > ...
> > double a[P][Q], b[Q][R],  /* matrices to multiply  */
> >        c[P][R],           /* matrix for result     */
> >        row_a[Q],          /* one row of matrix "a" */
> >        row_c[R];          /* one row of matrix "c" */
> > ...
> > /* send matrix "b" to all processes */
> > MPI_Bcast (b, Q * R, MPI_DOUBLE, 0, MPI_COMM_WORLD);
> > /* send row i of "a" to process i */
> > MPI_Scatter (a, Q, MPI_DOUBLE, row_a, Q, MPI_DOUBLE, 0,
> >              MPI_COMM_WORLD);
> > for (j = 0; j < R; ++j)   /* compute i-th row of "c" */
> > {
> >   row_c[j] = 0.0;
> >   for (k = 0; k < Q; ++k)
> >   {
> >     row_c[j] = row_c[j] + row_a[k] * b[k][j];
> >   }
> > }
> > /* receive row i of "c" from process i */
> > MPI_Gather (row_c, R, MPI_DOUBLE, c, R, MPI_DOUBLE, 0,
> >             MPI_COMM_WORLD);
> > ...
> >
> >
> > Does anybody know why my program doesn't work? It blocks with
> > openmpi-1.7a1r27379 and openmpi-1.9a1r27380 (I had to add one more
> > machine because my local machine will not be used in these versions),
> > and it works as long as the machines have the same endianness.
> >
> > tyr matrix 110 mpiexec -np 4 -host tyr,linpc0,rs0 mat_mult_1
> > Process 0 of 4 running on linpc0
> > Process 1 of 4 running on linpc0
> > Process 3 of 4 running on rs0.informatik.hs-fulda.de
> > Process 2 of 4 running on rs0.informatik.hs-fulda.de
> > ...
> > (6,8)-matrix b:
> >
> >     48    47    46    45    44    43    42    41
> >     40    39    38    37    36    35    34    33
> >     32    31    30    29    28    27    26    25
> >     24    23    22    21    20    19    18    17
> >     16    15    14    13    12    11    10     9
> >      8     7     6     5     4     3     2     1
> >
> > ^CKilled by signal 2.
> > Killed by signal 2.
> >
> >
> > Thank you very much for any help in advance.
> >
> >
> > Kind regards
> >
> > Siegmar
> >
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include "mpi.h"
> >
> > #define P 4    /* # of rows           */
> > #define Q 6    /* # of columns / rows */
> > #define R 8    /* # of columns        */
> >
> > static void print_matrix (int p, int q, double **mat);
> >
> > int main (int argc, char *argv[])
> > {
> >   int    ntasks,      /* number of parallel tasks */
> >          mytid,       /* my task id               */
> >          namelen,     /* length of processor name */
> >          i, j, k,     /* loop variables           */
> >          tmp;         /* temporary value          */
> >   double a[P][Q], b[Q][R],  /* matrices to multiply  */
> >          c[P][R],           /* matrix for result     */
> >          row_a[Q],          /* one row of matrix "a" */
> >          row_c[R];          /* one row of matrix "c" */
> >   char   processor_name[MPI_MAX_PROCESSOR_NAME];
> >
> >   MPI_Init (&argc, &argv);
> >   MPI_Comm_rank (MPI_COMM_WORLD, &mytid);
> >   MPI_Comm_size (MPI_COMM_WORLD, &ntasks);
> >   MPI_Get_processor_name (processor_name, &namelen);
> >   fprintf (stdout, "Process %d of %d running on %s\n",
> >            mytid, ntasks, processor_name);
> >   fflush (stdout);
> >   MPI_Barrier (MPI_COMM_WORLD);    /* wait for all other processes */
> >
> >   if ((ntasks != P) && (mytid == 0))
> >   {
> >     fprintf (stderr, "\n\nI need %d processes.\n"
> >              "Usage:\n"
> >              "  mpiexec -np %d %s.\n\n",
> >              P, P, argv[0]);
> >   }
> >   if (ntasks != P)
> >   {
> >     MPI_Finalize ();
> >     exit (EXIT_FAILURE);
> >   }
> >   if (mytid == 0)
> >   {
> >     tmp = 1;
> >     for (i = 0; i < P; ++i)        /* initialize matrix a */
> >     {
> >       for (j = 0; j < Q; ++j)
> >       {
> >         a[i][j] = tmp++;
> >       }
> >     }
> >     printf ("\n\n(%d,%d)-matrix a:\n\n", P, Q);
> >     print_matrix (P, Q, (double **) a);
> >     tmp = Q * R;
> >     for (i = 0; i < Q; ++i)        /* initialize matrix b */
> >     {
> >       for (j = 0; j < R; ++j)
> >       {
> >         b[i][j] = tmp--;
> >       }
> >     }
> >     printf ("(%d,%d)-matrix b:\n\n", Q, R);
> >     print_matrix (Q, R, (double **) b);
> >   }
> >   /* send matrix "b" to all processes */
> >   MPI_Bcast (b, Q * R, MPI_DOUBLE, 0, MPI_COMM_WORLD);
> >   /* send row i of "a" to process i */
> >   MPI_Scatter (a, Q, MPI_DOUBLE, row_a, Q, MPI_DOUBLE, 0,
> >                MPI_COMM_WORLD);
> >   for (j = 0; j < R; ++j)          /* compute i-th row of "c" */
> >   {
> >     row_c[j] = 0.0;
> >     for (k = 0; k < Q; ++k)
> >     {
> >       row_c[j] = row_c[j] + row_a[k] * b[k][j];
> >     }
> >   }
> >   /* receive row i of "c" from process i */
> >   MPI_Gather (row_c, R, MPI_DOUBLE, c, R, MPI_DOUBLE, 0,
> >               MPI_COMM_WORLD);
> >   if (mytid == 0)
> >   {
> >     printf ("(%d,%d)-result-matrix c = a * b :\n\n", P, R);
> >     print_matrix (P, R, (double **) c);
> >   }
> >   MPI_Finalize ();
> >   return EXIT_SUCCESS;
> > }
> >
> >
> > void print_matrix (int p, int q, double **mat)
> > {
> >   int i, j;    /* loop variables */
> >
> >   for (i = 0; i < p; ++i)
> >   {
> >     for (j = 0; j < q; ++j)
> >     {
> >       printf ("%6g", *((double *) mat + i * q + j));
> >     }
> >     printf ("\n");
> >   }
> >   printf ("\n");
> > }
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
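P.S.: A small test along the following lines might help to confirm
whether the MPI_DOUBLE conversion itself is at fault. It is only a
sketch: rank 0 broadcasts a single known double and every process
prints what it received, so a wrong byte order between a big-endian
and a little-endian machine should show up immediately (the program
name bcast_test used below is just a placeholder).

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main (int argc, char *argv[])
{
  int    mytid;               /* my task id         */
  double value = 0.0;         /* value to broadcast */

  MPI_Init (&argc, &argv);
  MPI_Comm_rank (MPI_COMM_WORLD, &mytid);
  if (mytid == 0)
  {
    value = 1234.5678;        /* known test pattern */
  }
  /* if the heterogeneous conversion is broken, the non-root
   * processes print garbage instead of 1234.5678
   */
  MPI_Bcast (&value, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
  printf ("Process %d received %g\n", mytid, value);
  MPI_Finalize ();
  return EXIT_SUCCESS;
}

I would run it with something like
"mpiexec -np 2 -host rs0,sunpc0 bcast_test", i.e. with one big-endian
and one little-endian machine from the runs above.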