> On Apr 17, 2020, at 22:39, Andreas Tille <andr...@fam-tille.de> wrote: > > Hi Matthew, > > thanks a lot for your detailed investigation. > > On Fri, Apr 17, 2020 at 04:28:23PM -0700, Matthew Fernandez wrote: >>> Program received signal SIGBUS, Bus error. >>> 0x5556a1b8 in PairDistances (distmat=0x7fff278c, mseq=0x55692a30, >>> pairdist_type=<optimized out>, bPercID=<optimized out>, istart=0, iend=3, >>> jstart=0, jend=3, fdist_in=0x0, >>> fdist_out=0x0) at pair_dist.c:346 >>> 346 NewProgress(&prProgress, LogGetFP(&rLog, LOG_INFO), >> >> OK, let me try a little harder :) >> >> $ # enable debugging symbols and Address Sanitizer >> $ CFLAGS="-g -fsanitize=address" CXXFLAGS="-g -fsanitize=address" >> ./configure >> … >> $ make clean && make >> … >> $ ./src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out >> temp_test.dnd -o temp_test.aln --outfmt clustal --force >> ================================================================= >> ==30264==ERROR: AddressSanitizer: dynamic-stack-buffer-overflow on >> address 0x7ffcfcbf5784 at pc 0x5620f0aa478c bp 0x7ffcfcbf56c0 sp >> 0x7ffcfcbf56b8 >> WRITE of size 4 at 0x7ffcfcbf5784 thread T0 >> #0 0x5620f0aa478b in PairDistances >> /home/matthew/clustal-omega-1.2.4/src/clustal/pair_dist.c:336 >> #1 0x5620f0a91d9f in AlignmentOrder >> /home/matthew/clustal-omega-1.2.4/src/clustal-omega.c:835 >> #2 0x5620f0a95c04 in Align >> /home/matthew/clustal-omega-1.2.4/src/clustal-omega.c:1221 >> #3 0x5620f0a90d76 in MyMain >> /home/matthew/clustal-omega-1.2.4/src/mymain.c:1192 >> #4 0x5620f0a88ca2 in main >> /home/matthew/clustal-omega-1.2.4/src/main.cpp:469 >> #5 0x7f3773d9009a in __libc_start_main ../csu/libc-start.c:308 >> #6 0x5620f0a89ad9 in _start >> (/home/matthew/clustal-omega-1.2.4/src/clustalo+0x2dad9) >> >> Address 0x7ffcfcbf5784 is located in stack of thread T0 >> SUMMARY: AddressSanitizer: dynamic-stack-buffer-overflow >> /home/matthew/clustal-omega-1.2.4/src/clustal/pair_dist.c:336 in >> PairDistances >> Shadow bytes around the buggy address: >> 0x10001f976aa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x10001f976ab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x10001f976ac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x10001f976ad0: 00 00 00 00 00 00 00 00 00 00 00 00 ca ca ca ca >> 0x10001f976ae0: 04 cb cb cb cb cb cb cb 00 00 00 00 ca ca ca ca >> =>0x10001f976af0:[04]cb cb cb cb cb cb cb 00 00 00 00 00 00 00 00 >> 0x10001f976b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x10001f976b10: f1 f1 f1 f1 00 f2 f2 f2 f2 f2 f2 f2 00 f2 f2 f2 >> 0x10001f976b20: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 f2 f2 f2 >> 0x10001f976b30: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x10001f976b40: 00 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2 f2 f2 >> Shadow byte legend (one shadow byte represents 8 application bytes): >> Addressable: 00 >> Partially addressable: 01 02 03 04 05 06 07 >> Heap left redzone: fa >> Freed heap region: fd >> Stack left redzone: f1 >> Stack mid redzone: f2 >> Stack right redzone: f3 >> Stack after return: f5 >> Stack use after scope: f8 >> Global redzone: f9 >> Global init order: f6 >> Poisoned by user: f7 >> Container overflow: fc >> Array cookie: ac >> Intra object redzone: bb >> ASan internal: fe >> Left alloca redzone: ca >> Right alloca redzone: cb >> ==30264==ABORTING >> >> Looking at line 336 of pair_dist.c, it looks like the bound on the >> containing loop is wrong. So let’s try adjusting that: >> >> $ vim src/clustal/pair_dist.c >> $ git diff src/clustal/pair_dist.c >> diff --git a/src/clustal/pair_dist.c b/src/clustal/pair_dist.c >> index e6dbdc3..bb79e61 100644 >> --- a/src/clustal/pair_dist.c >> +++ b/src/clustal/pair_dist.c >> @@ -321,7 +321,7 @@ PairDistances(symmatrix_t **distmat, mseq_t *mseq, >> int pairdist_type, bool bPerc >> >> /* FIXME: can get rid of iChunkStart, iChunkEnd now that we're >> using the arrays */ >> iChunkStart = iend; >> - for(iChunk = 0; iChunk <= iNumberOfThreads; iChunk++) >> + for(iChunk = 0; iChunk < iNumberOfThreads; iChunk++) >> { >> iChunkEnd = iChunkStart; >> if (iChunk == iNumberOfThreads - 1){ >> $ make >> … >> $ ./src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out >> temp_test.dnd -o temp_test.aln --outfmt clustal --force >> ================================================================= >> ==30601==ERROR: AddressSanitizer: global-buffer-overflow on address >> 0x561188847864 at pc 0x5611886da6e7 bp 0x7fffe6d77ef0 sp 0x7fffe6d77ee8 >> READ of size 4 at 0x561188847864 thread T0 >> #0 0x5611886da6e6 in FullAlignment::Build(HMM&, Hit&, char*) >> /home/matthew/clustal-omega-1.2.4/src/hhalign/hhfullalignment-C.h:250 >> #1 0x5611886df3eb in HitList::PrintAlignments(char**, char**, char*, >> char*, HMM&, char*, char) >> /home/matthew/clustal-omega-1.2.4/src/hhalign/hhhitlist-C.h:197 >> #2 0x5611886f379b in hhalign >> /home/matthew/clustal-omega-1.2.4/src/hhalign/hhalign.cpp:1211 >> #3 0x56118863f848 in HHalignWrapper >> /home/matthew/clustal-omega-1.2.4/src/clustal/hhalign_wrapper.c:1342 >> #4 0x561188637db1 in Align >> /home/matthew/clustal-omega-1.2.4/src/clustal-omega.c:1250 >> #5 0x561188632d76 in MyMain >> /home/matthew/clustal-omega-1.2.4/src/mymain.c:1192 >> #6 0x56118862aca2 in main >> /home/matthew/clustal-omega-1.2.4/src/main.cpp:469 >> #7 0x7f6d857f109a in __libc_start_main ../csu/libc-start.c:308 >> #8 0x56118862bad9 in _start >> (/home/matthew/clustal-omega-1.2.4/src/clustalo+0x2dad9) >> >> 0x561188847864 is located 60 bytes to the left of global variable 'Sim' >> defined in 'hhdecl-C.h:234:7' (0x5611888478a0) of size 1764 >> 0x561188847864 is located 0 bytes to the right of global variable 'S' >> defined in 'hhdecl-C.h:235:7' (0x561188847180) of size 1764 >> SUMMARY: AddressSanitizer: global-buffer-overflow >> /home/matthew/clustal-omega-1.2.4/src/hhalign/hhfullalignment-C.h:250 in >> FullAlignment::Build(HMM&, Hit&, char*) >> Shadow bytes around the buggy address: >> 0x0ac2b1100eb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x0ac2b1100ec0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x0ac2b1100ed0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x0ac2b1100ee0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x0ac2b1100ef0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> =>0x0ac2b1100f00: 00 00 00 00 00 00 00 00 00 00 00 00[04]f9 f9 f9 >> 0x0ac2b1100f10: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x0ac2b1100f20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x0ac2b1100f30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x0ac2b1100f40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 0x0ac2b1100f50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> Shadow byte legend (one shadow byte represents 8 application bytes): >> Addressable: 00 >> Partially addressable: 01 02 03 04 05 06 07 >> Heap left redzone: fa >> Freed heap region: fd >> Stack left redzone: f1 >> Stack mid redzone: f2 >> Stack right redzone: f3 >> Stack after return: f5 >> Stack use after scope: f8 >> Global redzone: f9 >> Global init order: f6 >> Poisoned by user: f7 >> Container overflow: fc >> Array cookie: ac >> Intra object redzone: bb >> ASan internal: fe >> Left alloca redzone: ca >> Right alloca redzone: cb >> ==30601==ABORTING >> >> Looking at line 250 of hhfullalignment-C.h, we can see it’s reading the >> array ‘S’ out of bounds here. Someone has helpfully left a debugging line >> below this, so let’s shuffle it ahead of the faulting access and remove the >> part where it is also performing the faulting access: >> >> $ vim src/hhalign/hhfullalignment-C.h >> $ git diff src/hhalign/hhfullalignment-C.h >> diff --git a/src/hhalign/hhfullalignment-C.h >> b/src/hhalign/hhfullalignment-C.h >> index 8f40fd1..fd759f9 100644 >> --- a/src/hhalign/hhfullalignment-C.h >> +++ b/src/hhalign/hhfullalignment-C.h >> @@ -247,8 +247,8 @@ FullAlignment::Build(HMM& q, Hit& hit, char zcError[]) >> char qc=qa->seq[ q.nfirst][ qa->m[ q.nfirst][hit.i[step]] >> ]; >> char tc=ta->seq[hit.nfirst][ ta->m[hit.nfirst][hit.j[step]] >> ]; >> if (qc==tc) identities++; // count identical amino acids >> + fprintf(stderr,"%3i %3i %3i %3i %3i %1c %1c %6.2f %6.2f >> %6.2f >> \n",step,hit.nsteps,hit.i[step],hit.j[step],int(state),qc,tc,score_sim,hit.P_posterior[step],hit.sum_of_probs); >> //DEBUG >> score_sim += S[(int)aa2i(qc)][(int)aa2i(tc)]; >> - // fprintf(stderr,"%3i %3i %3i %3i %3i %1c %1c >> %6.2f %6.2f %6.2f %6.2f >> \n",step,hit.nsteps,hit.i[step],hit.j[step],int(state),qc,tc,S[(int)aa2i(qc)][(int)aa2i(tc)],score_sim,hit.P_posterior[step],hit.sum_of_probs); >> //DEBUG >> } >> } >> >> $ make >> … >> $ ./src/clustalo -i debian/tests/biopython_testdata/f002 --guidetree-out >> temp_test.dnd -o temp_test.aln --outfmt clustal —force >> … >> 28 582 386 559 10 N - 127.25 0.01 2.91 >> ================================================================= >> ==30936==ERROR: AddressSanitizer: global-buffer-overflow on address >> 0x563d5b2258a4 at pc 0x563d5b0b79e8 bp 0x7ffd269e0e40 sp 0x7ffd269e0e38 >> READ of size 4 at 0x563d5b2258a4 thread T0 >> #0 0x563d5b0b79e7 in FullAlignment::Build(HMM&, Hit&, char*) >> /home/matthew/clustal-omega-1.2.4/src/hhalign/hhfullalignment-C.h:251 >> … >> >> So the values of qc and tc at this point are 'N' and '-', respectively. This >> results in an access to S[20][21], which is indeed out of range as S is a >> 21x21 array. To go further, I think I need some guidance from a domain >> expert. Is aa2i() ever expected to be called with a value that maps to GAP >> or ANY? Maybe S is actually meant to be a 22x22 array? Maybe the loop in >> hhfullalignment-C.h is meant to skip any iteration for which qc or tc map to >> GAP? >> >> By the way, Andreas, I am doing this debugging on the upstream 1.2.4 release >> on an x86-64 machine so I still have no certainty that this is related to >> the root cause of your observed problem on MIPS. > > Upstream is in the row of this investigation. Its quite interesting > that the issue could also observed on amd64. So probably this is a real > issue which is just exposed on mipsel. I think just bumping the array > boundaries is cheap. If there will be no response from upstream (or > somebody else who is comfortable with the algorithm which I'm not) I'll > check again on mipsel and in case it will work I'll upload.
Fair enough. While we’re discussing this, here’s another patch for a memory leak that fixes a typoed ifdef and a missing free(): diff --git a/src/squid/clustal.c b/src/squid/clustal.c index 650004a..bb1fec8 100644 --- a/src/squid/clustal.c +++ b/src/squid/clustal.c @@ -207,7 +207,7 @@ WriteClustal(FILE *fp, MSA *msa) int namelen; /* maximum name length used */ int pos; /* position counter */ #ifdef CLUSTALO - char *buf; /* buffer for writing seq */ + char *buf = NULL; /* buffer for writing seq */ int cpl = msa->alen<iWrap ? msa->alen+10 : (iWrap > 0 ? iWrap : 60); /* char per line (< 64) */ #else char buf[80]; /* buffer for writing seq */ @@ -410,8 +410,9 @@ WriteClustal(FILE *fp, MSA *msa) #endif } -#ifdef CLUSTAL +#ifdef CLUSTALO free(piResCnt); piResCnt = NULL; + free(buf); buf = NULL; #endif return;