> On Sep 15, 2017, at 12:24 PM, David Jones <djo...@ena.com> wrote:
>> You kinda have to work backwards through the scripts to find what is
>> generating the scores-set0 file and turning it into 72_scores.cf. I am
>> grep'ing through the work dir on the SA server now but it contains a lot
>> of files. I need to find the large dirs and exclude them.
>
> you may have already done this, but if you modify the scripts to not
> overwrite (or save a copy) of the intermediate files (which may clue into
> exactly where the problem is being introduced). ie. runGA lines 57-59,
> 124-132 (for 50_scores.cf)

Yes, it would be very interesting to look at the intermediate / temp files,
especially everything generated by
/masses/rule-update-score-gen/generate-new-scores.sh and by the 'make' of
the freq files.

The scoreset files appear to be truncated: no rule names starting with 'S'
or a later letter are ever present in the scoresets. I think we need to
backtrack through the intermediate files to see at which stage the
truncation first happens. Then we would at least have an idea of which part
of the code the problem is in, because the whole 'make freqs' is pretty
dense.

> another 'easy' test I would try would be to set numcpus in runGA to 1 just
> in case the problem is that somewhere there are multiple writers
> overwriting parts of the same file
>
> --
> Daniel J. Luke
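
For the backtracking idea above, here is a rough sketch of how saved copies of the intermediate files could be scanned for the truncation. It is only an illustration: the function names, the `cutoff` parameter, and the regex for what counts as a rule name are my own assumptions, not anything taken from the masses scripts, and the real file formats may need a stricter pattern.

```python
import os
import re

# Assumption: rule names look like upper-case identifiers such as FOO_BAR_1.
# This pattern is a guess and may need tightening for the real files.
RULE_NAME = re.compile(r"\b[A-Z][A-Z0-9_]{2,}\b")


def highest_rule_initial(text):
    """Return the alphabetically highest first letter among rule-like names, or None."""
    initials = [m.group(0)[0] for m in RULE_NAME.finditer(text)]
    return max(initials) if initials else None


def find_truncated(root, cutoff="S"):
    """Yield (path, highest_initial) for files whose rule names all sort before cutoff."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as fh:
                    top = highest_rule_initial(fh.read())
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
            if top is not None and top < cutoff:
                yield path, top
```

Running `find_truncated()` over a directory of saved intermediate files lists the ones that contain no rule name starting with 'S' or later. Comparing that list against the order in which the scripts write those files should point at the first stage where the truncation shows up.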