Hi there,

On Fri, 31 Aug 2012, Maarten Broekman wrote:

> I see where your confusion comes from. I'm not generating pdb signatures. I'm generating ndb signatures ...

Sorry, bit of a senior moment there. They seem to be creeping up on me lately. :( I had to go back and read http://www.clamav.net/doc/latest/signatures.pdf again.

I'm still perplexed by the numbers here. You say that you have signatures of the order of 8k characters, and you want to save (O)10 characters here and there in the signatures. It seems like you're fighting an uphill battle, what else am I missing? Have you estimated the gains you're going to be able to make? How many occurrences of the target replacements do you expect to find in the signatures?

A *long* time ago I was faced with something superficially similar, in the context of trying to fit the descriptions for 50,000+ stationery products into 40 character strings. Descriptions were abbreviated, ad-hoc, apparently by careless staff for whom English was at best a second language. A very large number of corrections was necessary. It was a nightmare, and it needed to be done four times per annum, so I wrote a simple parser in Perl. Amongst other things, it used a kind of 'thesaurus' of text strings. Here's a brief extract:

...
*B/FILE
*BXFILE
*BOXFILE
BOX FILE
*BRACKETS
*BRCK
BRACKET
...

The asterisk is just a character which didn't often appear in the input descriptions. Your thesaurus would probably look something like

...
*hyyp://
*hyyps://
{7-8}
...

It's a very simple idea. The input is a catalogue which contains tens of thousands of single-line descriptions of products. A description line is matched against the thesaurus. If a string is found in the line which matches one of the strings in the thesaurus which you see prefixed by an asterisk, then it is replaced by the string following next in the thesaurus which is not prefixed by an asterisk.

It's an easy thing to do in Perl, but if Perl isn't your second language you might find it testing. If it's of interest please give me some more examples of your replacement requirements and I'll dust off the code.

--

73,
Ged.
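For anyone who wants to try the thesaurus idea described above, here is a minimal sketch in Python. It is not the original Perl parser, which is not shown in this message. The one-entry-per-line layout, the names `parse_thesaurus` and `apply_thesaurus`, the longest-pattern-first ordering, and the sample description are all assumptions made for illustration.

```python
# Hypothetical thesaurus text, laid out one entry per line (an assumption).
THESAURUS = """\
*B/FILE
*BXFILE
*BOXFILE
BOX FILE
*BRACKETS
*BRCK
BRACKET
"""


def parse_thesaurus(text):
    """Return (pattern, replacement) pairs from thesaurus text.

    Lines starting with '*' are strings to search for. The next line
    that does not start with '*' is the replacement for every starred
    line collected since the previous replacement.
    """
    pairs = []
    pending = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            pending.append(line[1:])
        else:
            pairs.extend((pattern, line) for pattern in pending)
            pending = []
    return pairs


def apply_thesaurus(description, pairs):
    """Replace every thesaurus pattern found in one description line."""
    # Longest patterns first, so a short pattern cannot clobber part of a
    # longer one. This ordering is a design choice of this sketch.
    for pattern, replacement in sorted(pairs, key=lambda p: len(p[0]), reverse=True):
        description = description.replace(pattern, replacement)
    return description


if __name__ == "__main__":
    pairs = parse_thesaurus(THESAURUS)
    print(apply_thesaurus("A4 BXFILE BLUE W/ BRCK", pairs))
    # prints: A4 BOX FILE BLUE W/ BRACKET
```

This uses plain substring replacement, so a replacement string that itself contains a later pattern would be rewritten again. A real thesaurus would need checking for that, or a single-pass matcher instead.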