On Fri, Oct 29, 2004 at 03:36:26PM -0400, [EMAIL PROTECTED] wrote:
>
> In short, when I run sa-learn --dump, I see a slew of binary tokens. I've
> isolated the problem by creating a test directory, pointing sa-dump to it via
> --dbpath, and creating a new db. Even after loading only a single spam
> message, my db dump still shows all binary/useless tokens. It seems to be
> like sa-learn and my berkeley db version don't jive, perhaps? I don't seem
> to be getting any bayesian matching out of this in spamassassin, so I'm
> concluding it is a real issue and not just aesthetic. Sample output (mind
> you after loading only ONE 32-line/304-word spam message).
>

It's just aesthetic. All bayes tokens are stored in binary form now. For the curious, it's the low-order 40 bits of the token's SHA1 hash. Because these values are binary, they wouldn't print very well in the sa-learn --dump output, so when you see them there, in the --restore output, or in the bayes_journal file, they are actually unpacked values of the binary token. They are then repacked before going into the database.

So yes, you are seeing binary tokens in your database, but no, they are not necessarily what you are seeing in the dump output (that is basically a hex representation of the binary token value). If you need access to the raw token value, you can write a plugin to dump the values from the bayes hooks.

Now, I'm not saying you don't have a problem with bayes; you did say it seems not to be working. If you run with -D you will see some bayes debug output that might point to why it isn't working, assuming it isn't.

Michael
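
For anyone who wants to see the pack/unpack round trip described above, here is a rough sketch in Python. It is not SpamAssassin's own code (that is Perl), and it makes two assumptions: that "low-order 40 bits" means the last five bytes of the 20-byte SHA1 digest, and that the token text is hashed as UTF-8 bytes. The function names are made up for illustration.

```python
import hashlib


def token_to_db_key(token: str) -> bytes:
    """Return the 5-byte (40-bit) binary form of a token.

    Assumption: the low-order 40 bits are the final five bytes
    of the SHA1 digest, and the token is hashed as UTF-8.
    """
    digest = hashlib.sha1(token.encode("utf-8")).digest()  # 20 bytes
    return digest[-5:]


def dump_form(db_key: bytes) -> str:
    """Unpack the binary key into printable hex, as a dump would show it."""
    return db_key.hex()


def repack(hex_token: str) -> bytes:
    """Turn the printable hex form back into the binary key."""
    return bytes.fromhex(hex_token)


if __name__ == "__main__":
    key = token_to_db_key("viagra")
    shown = dump_form(key)
    # The printable form is 10 hex characters; the stored form is 5 raw bytes.
    print(len(key), len(shown), shown)
    # Repacking the printable form gives back exactly the stored bytes.
    print(repack(shown) == key)
```

The point is only that the hex string in the dump and the raw bytes in the database are two views of the same value, and that the original token text cannot be recovered from either one.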