Hi all,

I would like to submit a patch that:
    - speeds up 'cli_bm_scanbuff()'
    - reduces memory usage (allocated in 'cli_bm_init()')

What I am looking for here is your opinion on the best way to implement it.


I/ Description of the proposal :
==============================
When using the BM algorithm, if BM_MIN_LENGTH == BM_BLOCK_SIZE, the
root->bm_shift array is not useful. In cli_bm_addpatt we find:

  for(i = BM_MIN_LENGTH - BM_BLOCK_SIZE; i >= 0; i--) {
      idx = HASH(pt[i], pt[i + 1], pt[i + 2]);
      root->bm_shift[idx] = MIN(root->bm_shift[idx],
                                BM_MIN_LENGTH - BM_BLOCK_SIZE - i);
  }

In this case the loop body runs exactly once, with i == 0, and the value
passed to MIN is 0. So ALL the bm_shift entries that can be useful (i.e.
every bm_shift index that has a corresponding entry in the virus signature
hash table, bm_suffix) are set to 0.
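
To make the point concrete, here is a minimal standalone sketch of the same loop writing into a local table. The HASH definition, the table size, the initial table value and the value 3 for both constants are assumptions made for this demo, and shift_after_add() is a hypothetical helper, not a ClamAV function.

```c
#include <stdlib.h>

#define BM_MIN_LENGTH 3 /* assumed value for this demo */
#define BM_BLOCK_SIZE 3 /* assumed value for this demo */
/* Stand-in hash, assumed for the demo; the real macro lives in the source. */
#define HASH(a, b, c) (211 * (a) + 37 * (b) + (c))
#define TABLE_SIZE (HASH(255, 255, 255) + 1)
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Same loop as quoted from cli_bm_addpatt, writing into a caller's table. */
static void add_pattern_shift(int *bm_shift, const unsigned char *pt)
{
    int i, idx;

    for (i = BM_MIN_LENGTH - BM_BLOCK_SIZE; i >= 0; i--) {
        idx = HASH(pt[i], pt[i + 1], pt[i + 2]);
        bm_shift[idx] = MIN(bm_shift[idx], BM_MIN_LENGTH - BM_BLOCK_SIZE - i);
    }
}

/* Hypothetical helper: builds a fresh table, adds one pattern (at least
 * 3 bytes long) and returns the shift stored for it, or -1 if the
 * allocation fails. */
static int shift_after_add(const unsigned char *pt)
{
    int *bm_shift = malloc(TABLE_SIZE * sizeof(int));
    int i, result;

    if (!bm_shift)
        return -1;
    /* Assumed initial value: the largest shift the loop could ever store, plus 1. */
    for (i = 0; i < TABLE_SIZE; i++)
        bm_shift[i] = BM_MIN_LENGTH - BM_BLOCK_SIZE + 1;

    add_pattern_shift(bm_shift, pt);
    result = bm_shift[HASH(pt[0], pt[1], pt[2])];
    free(bm_shift);
    return result;
}
```

With both constants equal, shift_after_add() returns 0 for any pattern, because the loop only ever visits i == 0 and stores MIN(old value, 0).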

So this array carries no information that bm_suffix does not already
provide. We can avoid creating it (around 200 KB) and avoid consulting it in
'cli_bm_scanbuff()', which speeds up the test done for each byte of the
scanned files.
(shift = root->bm_shift[idx]; if(shift == 0)... is true for every index that
has a bm_suffix entry)
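
As a rough sketch of why the lookup can be dropped, the two functions below count candidate positions in a buffer, once through a shift table and once by testing only whether the index has a signature. Everything here is a simplified stand-in: the HASH definition, the default shift of 1, the has_suffix[] flag array (standing in for a non-empty bm_suffix entry) and all function names are assumptions for illustration, not the real scanner.

```c
#include <stddef.h>

/* Stand-in hash, assumed for the demo. */
#define HASH(a, b, c) (211 * (a) + 37 * (b) + (c))
#define TABLE_SIZE (HASH(255, 255, 255) + 1)

static int bm_shift[TABLE_SIZE];             /* 0 where a signature hashes */
static unsigned char has_suffix[TABLE_SIZE]; /* 1 where a signature hashes */

/* Resets both tables; the default shift of 1 is an assumption. */
static void init_tables(void)
{
    size_t i;

    for (i = 0; i < TABLE_SIZE; i++) {
        bm_shift[i] = 1;
        has_suffix[i] = 0;
    }
}

/* Registers the first 3 bytes of one signature in both tables. */
static void add_prefix(const unsigned char *pt)
{
    int idx = HASH(pt[0], pt[1], pt[2]);

    bm_shift[idx] = 0;
    has_suffix[idx] = 1;
}

/* Counts candidate positions by consulting the shift table first. */
static size_t candidates_with_shift(const unsigned char *buf, size_t len)
{
    size_t i = 0, hits = 0;

    while (i + 3 <= len) {
        int shift = bm_shift[HASH(buf[i], buf[i + 1], buf[i + 2])];

        if (shift == 0) {
            hits++;    /* the signature list would be walked here */
            shift = 1;
        }
        i += shift;
    }
    return hits;
}

/* Counts candidate positions without any shift table. */
static size_t candidates_without_shift(const unsigned char *buf, size_t len)
{
    size_t i, hits = 0;

    for (i = 0; i + 3 <= len; i++)
        if (has_suffix[HASH(buf[i], buf[i + 1], buf[i + 2])])
            hits++;
    return hits;
}
```

Under these assumptions both functions visit the same positions and report the same candidates, so the shift table only adds one extra memory read per byte.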

Clearly the array can be dropped ONLY if BM_MIN_LENGTH == BM_BLOCK_SIZE,
which is currently the case but was not in the past. So I would like to
implement it so that if someone changes the value of either constant, for
testing purposes for example, the previous behaviour is preserved.


II/ Idea of implementation :
==========================
2.1) define a macro (AVOID_BM_SHIFT ???) and guard the code with it
everywhere needed

2.2) add tests like: if (BM_MIN_LENGTH == BM_BLOCK_SIZE) ... where needed
and let the compiler detect the dead code and optimize it away

2.3) add a new inlined function (int cli_can_avoid_bm_shift() ???) that
performs this test

2.4) any other ideas ?


Personally, I think that 2.3 is the best approach.
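
A minimal sketch of what 2.3 could look like, assuming both constants are plain integer macros. Only the name cli_can_avoid_bm_shift() comes from the proposal; bm_shift_init(), BM_TABLE_SIZE, the constant values and the initial table value are hypothetical and only there to show a call site.

```c
#include <stddef.h>
#include <stdlib.h>

#define BM_MIN_LENGTH 3                 /* assumed value */
#define BM_BLOCK_SIZE 3                 /* assumed value */
#define BM_TABLE_SIZE ((size_t)63496)   /* hypothetical table size */

/* The test from 2.3: a compile-time constant once inlined, so the
 * compiler can drop whichever branch is dead at each call site. */
static inline int cli_can_avoid_bm_shift(void)
{
    return BM_MIN_LENGTH == BM_BLOCK_SIZE;
}

/* Hypothetical init helper: allocates the shift table only when it is
 * needed. Returns 0 on success and -1 on allocation failure; *table is
 * left NULL when the table is skipped. */
static int bm_shift_init(int **table)
{
    size_t i;

    *table = NULL;
    if (cli_can_avoid_bm_shift())
        return 0;

    *table = malloc(BM_TABLE_SIZE * sizeof(int));
    if (!*table)
        return -1;
    for (i = 0; i < BM_TABLE_SIZE; i++)
        (*table)[i] = BM_MIN_LENGTH - BM_BLOCK_SIZE + 1;
    return 0;
}
```

Compared with 2.1, both branches are always compiled, so a change to either constant cannot silently break the unused path; compared with 2.2, the condition is written in one place only.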


Thanks in advance for your comments



_______________________________________________
http://lurker.clamav.net/list/clamav-devel.html
