Hello, all.

> On Mon, Jan 15, 2007 at 01:54:07AM -0800, Stuart Robinson wrote:
> > I've searched around a bit, both on gmane and Google, but I haven't found
> > much more information regarding your two points. What IS stored in the
> > token field of the table bayes_token? And how is the SHA1 hash involved?
> 
> A SHA1 hash is taken of the original token value, and the bottom 40 bits are
> used as the token from then on.  There is a plugin call which can be used to
> store raw token -> hash value data, but otherwise the raw token information is
> lost after the message is processed.

Where could I find more information about the plugin call that allows me
to do this? 
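
Just to check my understanding of the hashing step, here is a minimal
sketch in Python. It assumes that "bottom 40 bits" means the last 5 bytes
of the binary SHA1 digest and that the token is encoded as UTF-8 before
hashing; the function name is my own and not anything from SpamAssassin.

```python
import hashlib


def bayes_token_hash(raw_token: str) -> bytes:
    """Return the last 5 bytes (40 bits) of the SHA1 digest of raw_token.

    Assumption: "bottom 40 bits" is read as the trailing 5 bytes of the
    20-byte binary digest. The UTF-8 encoding is also an assumption.
    """
    digest = hashlib.sha1(raw_token.encode("utf-8")).digest()  # 20 bytes
    return digest[-5:]


if __name__ == "__main__":
    # Print the 10 hex characters that would stand in for the raw token.
    print(bayes_token_hash("viagra").hex())
```

If that reading is right, the mapping is one-way, which would explain why
the raw token cannot be recovered from the database afterwards.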

> > Where can I find documentation of this? Any suggestions would be greatly
> > appreciated.
> 
> I don't think there's outright documentation about it.  There was a lot of
> chatter about it on the lists a couple of years ago when the change to
> using the hash happened.  I recall there being some talk about it recently
> too, though I can't find it via the archives right now either. :(

I'll keep looking around. It might be nice to have a configuration option
that says whether or not to store the raw tokens in the database along
with their associated hash values.

BTW, the reason I'm asking about this stuff is that I'd like to be able to
store the raw and hashed values side by side in the database so that a CGI
script can pull token info from the database (say, the Top 10 spammiest
tokens per individual user).
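
To make that concrete, here is a rough sketch of what I have in mind,
using Python and an in-memory SQLite database. The raw_token table, the
record() and top_spammy() helpers, and the "spam_count minus ham_count"
ranking are all my own assumptions for illustration; only the general
idea of a per-user bayes_token table keyed by the hashed token comes from
the discussion above.

```python
import hashlib
import sqlite3


def token_hash(raw: str) -> bytes:
    # Assumption: the stored token is the last 5 bytes of the SHA1 digest.
    return hashlib.sha1(raw.encode("utf-8")).digest()[-5:]


conn = sqlite3.connect(":memory:")
# Simplified stand-in for the per-user token table (id = user id).
conn.execute(
    "CREATE TABLE bayes_token "
    "(id INTEGER, token BLOB, spam_count INTEGER, ham_count INTEGER, "
    "PRIMARY KEY (id, token))"
)
# Hypothetical side table that keeps the raw token next to its hash.
conn.execute(
    "CREATE TABLE raw_token "
    "(id INTEGER, token BLOB, raw TEXT, PRIMARY KEY (id, token))"
)


def record(user_id: int, raw: str, spam: int, ham: int) -> None:
    """Store the counts and the raw -> hash mapping for one token."""
    h = token_hash(raw)
    conn.execute(
        "INSERT INTO bayes_token VALUES (?, ?, ?, ?)", (user_id, h, spam, ham)
    )
    conn.execute("INSERT INTO raw_token VALUES (?, ?, ?)", (user_id, h, raw))


def top_spammy(user_id: int, limit: int = 10):
    """Return (raw, spam_count, ham_count) rows, spammiest first.

    "Spammiest" is naively taken as spam_count - ham_count here.
    """
    return conn.execute(
        "SELECT r.raw, b.spam_count, b.ham_count "
        "FROM bayes_token b "
        "JOIN raw_token r ON r.id = b.id AND r.token = b.token "
        "WHERE b.id = ? "
        "ORDER BY b.spam_count - b.ham_count DESC, r.raw "
        "LIMIT ?",
        (user_id, limit),
    ).fetchall()
```

A CGI script would then only need to call something like top_spammy()
with the user's id and render the rows.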

Thanks,
Stuart
