Hi all,

I would like to submit a patch which:
 - speeds up 'cli_bm_scanbuff()'
 - reduces memory usage (allocated in 'cli_bm_init()')
What I am looking for here is your opinion on the best way to implement it.

I/ Description of the proposal:
===============================

When using the BM algorithm, if BM_MIN_LENGTH == BM_BLOCK_SIZE, the
root->bm_shift array is not useful.

In cli_bm_addpatt we find:

    for(i = BM_MIN_LENGTH - BM_BLOCK_SIZE; i >= 0; i--) {
        idx = HASH(pt[i], pt[i + 1], pt[i + 2]);
        root->bm_shift[idx] = MIN(root->bm_shift[idx], BM_MIN_LENGTH - BM_BLOCK_SIZE - i);
    }

When the two constants are equal, this loop runs only once, with i == 0,
and the second argument of MIN is 0. So ALL the bm_shift entries that can
be useful (i.e. every bm_shift index that has a corresponding entry in the
virus signature hash table, bm_suffix) are set to 0.

The array therefore becomes completely useless. We can avoid creating it
(around 200 KB) and avoid using it in 'cli_bm_scanbuff()', which speeds up
the test done for each byte of the scanned files:

    shift = root->bm_shift[idx];
    if(shift == 0) ...

is always true for those indices.

Clearly, this can be avoided ONLY if BM_MIN_LENGTH == BM_BLOCK_SIZE, which
is currently the case but was not in the past. So I would like to implement
it in such a way that if someone changes the value of either constant, for
testing purposes for example, the previous behaviour is preserved.

II/ Implementation ideas:
=========================

2.1) define a macro (AVOID_BM_SHIFT ???) and wrap the code with it
     everywhere needed
2.2) add a test such as: if (BM_MIN_LENGTH == BM_BLOCK_SIZE) ... where
     needed and let the compiler detect the dead code and optimize it away
2.3) add a new inlined function (int cli_can_avoid_bm_shift() ???) that
     performs this test
2.4) any other idea?

Personally, I think that 2.3 is the best approach.

Thanks in advance for your comments
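
To make options 2.2 and 2.3 more concrete, here is a minimal, self-contained sketch of how they could be combined. It is not the actual ClamAV code: the constant values, the HASH macro, the toy_* names and the has_suffix array (a stand-in for bm_suffix) are assumptions made for illustration. Only cli_can_avoid_bm_shift() is the name suggested in 2.3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative values; the real constants are defined in the ClamAV sources. */
#define BM_MIN_LENGTH 3
#define BM_BLOCK_SIZE 3

/* Hypothetical stand-in for the matcher's hash of a 3-byte block. */
#define HASH(a, b, c) (211 * (a) + 37 * (b) + (c))
#define HASH_SIZE ((size_t)HASH(255, 255, 255) + 1)
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Option 2.3: the test is a constant expression, so an optimizing
 * compiler can fold it and drop the branches that become dead. */
static inline int cli_can_avoid_bm_shift(void)
{
    return BM_MIN_LENGTH == BM_BLOCK_SIZE;
}

/* Toy matcher state (hypothetical, much simpler than the real one). */
struct toy_matcher {
    uint8_t *bm_shift;   /* stays NULL when the shift table is not needed */
    uint8_t *has_suffix; /* 1 if some pattern hashes to this index */
};

static int toy_init(struct toy_matcher *m)
{
    size_t i;

    m->bm_shift = NULL;
    m->has_suffix = calloc(HASH_SIZE, sizeof(uint8_t));
    if (m->has_suffix == NULL)
        return -1;
    if (!cli_can_avoid_bm_shift()) {
        /* Previous behaviour: allocate and fill the shift table. */
        m->bm_shift = malloc(HASH_SIZE * sizeof(uint8_t));
        if (m->bm_shift == NULL) {
            free(m->has_suffix);
            m->has_suffix = NULL;
            return -1;
        }
        for (i = 0; i < HASH_SIZE; i++)
            m->bm_shift[i] = BM_MIN_LENGTH - BM_BLOCK_SIZE + 1;
    }
    return 0;
}

/* pt must point to at least BM_MIN_LENGTH bytes. */
static void toy_add_pattern(struct toy_matcher *m, const unsigned char *pt)
{
    const int last = BM_MIN_LENGTH - BM_BLOCK_SIZE;
    int i, idx;

    if (!cli_can_avoid_bm_shift()) {
        for (i = last; i >= 0; i--) {
            idx = HASH(pt[i], pt[i + 1], pt[i + 2]);
            m->bm_shift[idx] = MIN(m->bm_shift[idx], last - i);
        }
    }
    m->has_suffix[HASH(pt[last], pt[last + 1], pt[last + 2])] = 1;
}

/* Counts the positions where a full pattern comparison would be attempted. */
static size_t toy_count_candidates(const struct toy_matcher *m,
                                   const unsigned char *buf, size_t len)
{
    size_t i = BM_MIN_LENGTH - BM_BLOCK_SIZE, hits = 0, shift;
    int idx;

    assert(cli_can_avoid_bm_shift() || m->bm_shift != NULL);
    while (i + BM_BLOCK_SIZE <= len) {
        idx = HASH(buf[i], buf[i + 1], buf[i + 2]);
        /* With equal constants the table lookup disappears entirely. */
        shift = cli_can_avoid_bm_shift() ? 0 : m->bm_shift[idx];
        if (shift == 0) {
            if (m->has_suffix[idx])
                hits++;
            shift = 1;
        }
        i += shift;
    }
    return hits;
}

static void toy_free(struct toy_matcher *m)
{
    free(m->bm_shift);
    free(m->has_suffix);
    m->bm_shift = NULL;
    m->has_suffix = NULL;
}
```

With the values above, toy_init() never allocates bm_shift and the scan loop goes straight to the has_suffix lookup. Changing BM_MIN_LENGTH to a larger value brings the table and its lookup back without touching any call site, which is the property asked for in section I.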