[PATCH] RE: gcc parallel make check

VandeVondele Joost Fri, 05 Sep 2014 07:27:29 -0700

> The splits are in the Makefiles, see check_gcc_parallelize

Attached is a patch to improve the parallel performance of 'make -jXX -k 
check-fortran'. For XX=16 this yields a ~50% speedup, and even with XX=4 we 
still gain ~15%. The measured slowdown at XX=1 (<2%) is within the noise of the 
measurements. The patch is a simple update of the 'check_gfortran_parallelize' 
variable, replacing its 2008 values with a set that I found to be roughly 
optimal based on several tests. Detailed timings are:


# timings/trunk-check-fortran
#cores      average    std. dev. #tests
     1      2955.32        75.06      3
     2      1735.30       122.26      3
     4       929.51        54.19      3
     8       470.29         7.85      3
    16       468.09         4.29      3
    32       466.06         1.24      3

# timings/patched-check-fortran
#cores      average    std. dev. #tests
     1      3008.89        16.38      3
     2      1534.17       118.33      3
     4       800.18        31.71      3
     8       418.71         0.20      2
    16       298.29         5.86      3
    32       299.84         1.34      3
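
As a sanity check on the percentages quoted above, the ratios can be recomputed 
from the averages in the two tables. The following is only a minimal sketch; 
the 'speedup' helper is a made-up name and not part of the patch.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch.
# speedup BEFORE AFTER: print the before/after ratio and the share of
# wall time saved, given two average timings from the tables.
speedup() {
  LC_ALL=C awk -v before="$1" -v after="$2" 'BEGIN {
    printf "ratio %.2f, time saved %.1f%%\n",
           before / after, 100 * (before - after) / before
  }'
}

speedup 468.09 298.29    # trunk vs. patched at 16 cores
speedup 929.51 800.18    # trunk vs. patched at 4 cores
speedup 2955.32 3008.89  # trunk vs. patched at 1 core
```

For the 16-core row this prints "ratio 1.57, time saved 36.3%", for 4 cores 
"ratio 1.16, time saved 13.9%", and for 1 core "ratio 0.98, time saved -1.8%".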

There is no effect on a full 'make -j32 -k check' as other goals run for much 
longer (to be looked at in a followup).

The second part of the patch is a new file, 'contrib/generate_tcl_patterns.sh', 
which generates the regexps needed for the split from a list of the files in 
the target directory. It groups the initial characters of the file names so 
that each regexp tries not to exceed a given maximum number of files; the 
number of files thus serves as a proxy for the runtime. While I don't feel too 
strongly about adding this (shell/gawk) script, it certainly is convenient, and 
it makes sure that no characters are missing from the regexps. The maximum 
number of files per regexp is an input. Testing (-j16) with 200, 300 and 400, I 
found that 300 was optimal for testsuite/gfortran.dg, but this will depend on 
many things.
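
To illustrate the grouping idea on a toy input, here is a simplified sketch. It 
is not the script from the patch: it uses sort/uniq and plain awk instead of 
gawk's asort, it ignores characters that start no file, and the function name 
'group_by_first_char' is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: group the first characters of
# the file names read from stdin so that each group stays at or below $1
# files where possible. Prints one "label count" line per group.
group_by_first_char() {
  sed -n 's/^\(.\).*$/\1/p' |          # keep the first character of each name
    LC_ALL=C sort | uniq -c |          # count names per first character
    LC_ALL=C sort -k1,1nr -k2,2 |      # most frequent character first
    awk -v max="$1" '
      { n[NR] = $1; c[NR] = $2 }
      END {
        label = ""; total = 0
        for (i = 1; i <= NR; i++) {
          label = label c[i]; total += n[i]
          # Close the group at the last character, or when adding the
          # next character would push the group over the limit.
          if (i == NR || total + n[i + 1] > max) {
            print label, total
            label = ""; total = 0
          }
        }
      }'
}

printf '%s\n' a1 a2 a3 b1 b2 c1 d1 | group_by_first_char 3
```

With a limit of 3 this prints the groups "a 3", "bc 3" and "d 1". A single 
character that already exceeds the limit still gets a group of its own, since 
a first character cannot be split any further.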

A sample run would look like:

gcc/gcc/testsuite/gfortran.dg> ls -1 | 
../../../contrib/generate_tcl_patterns.sh 300 "dg.exp=gfortran.dg/"
Adding label:  p matching files:499
Adding label:  c matching files:497
Adding label:  a matching files:448
Adding label:  i matching files:350
Adding label:  d matching files:245
Adding label:  s matching files:211
Adding label:  b matching files:206
Adding label:  t matching files:180
Adding label:  f matching files:173
Adding label:  e matching files:166
Adding label:  r matching files:165
Adding label:  n matching files:162
Adding label:  mu matching files:278
Adding label:  wlgo matching files:284
Adding label:  vhzPkqWx_-9876543210ZYXVUTSRQONMLKJIHGFEDCBAyj matching files:94
patterns:
dg.exp=gfortran.dg/p* \
dg.exp=gfortran.dg/c* \
dg.exp=gfortran.dg/a* \
dg.exp=gfortran.dg/i* \
dg.exp=gfortran.dg/\[wlgo\]* \
dg.exp=gfortran.dg/\[mu\]* \
dg.exp=gfortran.dg/d* \
dg.exp=gfortran.dg/s* \
dg.exp=gfortran.dg/b* \
dg.exp=gfortran.dg/t* \
dg.exp=gfortran.dg/f* \
dg.exp=gfortran.dg/e* \
dg.exp=gfortran.dg/r* \
dg.exp=gfortran.dg/n* \
dg.exp=gfortran.dg/\[vhzPkqWx_-9876543210ZYXVUTSRQONMLKJIHGFEDCBAyj\]* \
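
A cheap cross-check of such a run is to add up the "matching files" counts and 
compare the total with the number of files in the directory. The sketch below 
assumes the log format shown above; 'sum_matches' is a made-up helper name, 
not something the patch provides.

```shell
#!/bin/sh
# Hypothetical helper, not part of the patch: read the script's log on stdin
# and print "<number of groups> <total matching files>".
sum_matches() {
  awk -F'matching files:' '
    /^Adding label:/ { groups++; total += $2 }
    END { print groups + 0, total + 0 }'
}

sum_matches <<'EOF'
Adding label:  p matching files:499
Adding label:  mu matching files:278
patterns:
EOF
```

For the two sample lines fed in here it prints "2 777". For the complete log 
above, the counts add up to 3958 files in 15 groups.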

Is the attached patch OK for trunk?

contrib/ChangeLog

2014-09-05  Joost VandeVondele  <vond...@gcc.gnu.org>

       * generate_tcl_patterns.sh: New file.

gcc/fortran/ChangeLog

2014-09-05  Joost VandeVondele  <vond...@gcc.gnu.org>

       * Make-lang.in (check_gfortran_parallelize): Improve parallelism.

Index: contrib/generate_tcl_patterns.sh
===================================================================
--- contrib/generate_tcl_patterns.sh	(revision 0)
+++ contrib/generate_tcl_patterns.sh	(revision 0)
@@ -0,0 +1,86 @@
+#! /bin/sh
+
+#
+# based on a list of filenames as input,
+# generate regexps that match subsets trying to not exceed a
+# 'maxcount' parameter. Most useful to generate the
+# check_LANG_parallelize assignments needed to split
+# testsuite directories, defining prefix appropriately.
+#
+# Example usage:
+#   cd gcc/gcc/testsuite/gfortran.dg
+#   ls -1 | ../../../contrib/generate_tcl_patterns.sh 300 "dg.exp=gfortran.dg/"
+#
+# the first parameter is the maximum number of files.
+# the second parameter the prefix used for printing.
+#
+
+# Copyright (C) 2014 Free Software Foundation
+# Contributed by Joost VandeVondele <joost.vandevond...@mat.ethz.ch>
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING.  If not, write to
+# the Free Software Foundation, 51 Franklin Street, Fifth Floor,
+# Boston, MA 02110-1301, USA.
+
+gawk -v maxcount="$1" -v prefix="$2" '
+BEGIN{
+  # list of allowed starting chars for a file name in a dir to split
+  achars="abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_"
+}
+{
+  nfiles++ ; files[nfiles]=$1
+}
+END{
+  for(i=1; i<=length(achars); i++) count[substr(achars,i,1)]=0
+  for(i=1; i<=nfiles; i++) {
+     if (length(files[i])>0) { count[substr(files[i],1,1)]++ }
+  };
+  asort(count,ordered)
+  countsingle=0
+  groups=0
+  label=""
+  for(i=length(achars);i>=1;i--) {
+    countsingle=countsingle+ordered[i] 
+    for(j=1;j<=length(achars);j++) {
+       if(count[substr(achars,j,1)]==ordered[i]) found=substr(achars,j,1)
+    }
+    count[found]=-1
+    label=label found
+    if(i==1) { val=maxcount+1 } else { val=ordered[i-1] }
+    if(countsingle+val>maxcount) {
+      subset[label]=countsingle
+      print "Adding label: ", label, "matching files:" countsingle
+      groups++
+      countsingle=0
+      label=""
+    }
+  }
+  print "patterns:"
+  asort(subset,ordered)
+  for(i=groups;i>=1;i--) {
+    for(j in subset){
+      if(subset[j]==ordered[i]) found=j
+    }
+    subset[found]=-1
+    if (length(found)==1) {
+       printf("%s%s* \\\n",prefix,found)
+    } else {
+       printf("%s\\[%s\\]* \\\n",prefix,found)
+    }
+  }
+}
+'
+
Index: gcc/fortran/Make-lang.in
===================================================================
--- gcc/fortran/Make-lang.in	(revision 214949)
+++ gcc/fortran/Make-lang.in	(working copy)
@@ -168,12 +168,22 @@ check-fortran-subtargets : check-gfortra
 lang_checks += check-gfortran
 lang_checks_parallelized += check-gfortran
 # For description see comment above check_gcc_parallelize in gcc/Makefile.in.
-check_gfortran_parallelize = dg.exp=gfortran.dg/\[adAD\]* \
-			     dg.exp=gfortran.dg/\[bcBC\]* \
-			     dg.exp=gfortran.dg/\[nopNOP\]* \
-			     dg.exp=gfortran.dg/\[isuvISUV\]* \
-			     dg.exp=gfortran.dg/\[efhkqrxzEFHKQRXZ\]* \
-			     dg.exp=gfortran.dg/\[0-9gjlmtwyGJLMTWY\]*
+check_gfortran_parallelize = execute.exp \
+			dg.exp=gfortran.dg/p* \
+			dg.exp=gfortran.dg/c* \
+			dg.exp=gfortran.dg/a* \
+			dg.exp=gfortran.dg/i* \
+			dg.exp=gfortran.dg/\[wlgo\]* \
+			dg.exp=gfortran.dg/\[mu\]* \
+			dg.exp=gfortran.dg/d* \
+			dg.exp=gfortran.dg/s* \
+			dg.exp=gfortran.dg/b* \
+			dg.exp=gfortran.dg/t* \
+			dg.exp=gfortran.dg/f* \
+			dg.exp=gfortran.dg/e* \
+			dg.exp=gfortran.dg/r* \
+			dg.exp=gfortran.dg/n* \
+			dg.exp=gfortran.dg/\[vhzPkqWx_-9876543210ZYXVUTSRQONMLKJIHGFEDCBAyj\]*
 
 # GFORTRAN documentation.
 GFORTRAN_TEXI = \
