I see there would be problems in naming your project "RSA". Nevertheless, is there any plan to offer the current rspamd features as a library, so that third parties can develop their own message handling interfaces around it?
Giampaolo

Vsevolod Stakhov <vsevo...@highsecure.ru> wrote:

Hello,

I've decided to write to the SA users list about the status of the rspamd project [1], since this is the second time rspamd has been mentioned on this list. I was not subscribed to it, however, so I cannot reply directly to the author of the original post.

The phrase quoted in that post, "With similar rules, rspamd is about ten times faster than SpamAssassin", was my mistake. It only describes a comparison of SA and rspamd on a rather specific ruleset, selected by filtering the overall SA ruleset against our own mail payload (this set included about 100 rules). I am sorry about this phrase, which is not true in the general case: rspamd does not support all the features of SA and does not have the same ruleset.

Nevertheless, while implementing rspamd I took into consideration the main performance problems I had found in SA: too many regexp checks for each action (for example, in the code that parses Received headers), too many repeated checks of the same text, and so on. Rspamd tries to fix these problems by using specialized finite state machines, using tries for pattern matching, having a rules planner that runs more probable checks before less probable ones, and so on. Moreover, rspamd can use thread pools for statistics and regexp checks, which lets it scale easily on multi-core machines.

As a result, on the rules that we selected for porting from SA to rspamd, rspamd was several times faster than SA. In fact, we could not afford SA's check speed given our volume of mail and our number of servers, and rspamd solved the problem at that time. I also focused on maximum performance while writing the code for other rspamd modules, such as DKIM, SPF and SURBL, trying to avoid resource-greedy libraries (like opendkim or libspf2).
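To illustrate the "more probable checks first" idea mentioned above, here is a minimal Python sketch. It is not rspamd's actual planner: the `Rule` class, the `hit_probability` field (assumed to come from past statistics), and the early stop once a score threshold is reached are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Rule:
    """Hypothetical rule record, for illustration only."""
    name: str
    check: Callable[[str], bool]
    score: float
    hit_probability: float  # assumed: estimated from earlier traffic


def plan(rules: List[Rule]) -> List[Rule]:
    # Put the rules most likely to fire at the front of the queue.
    return sorted(rules, key=lambda r: r.hit_probability, reverse=True)


def scan(message: str, rules: List[Rule], threshold: float) -> Tuple[float, List[str]]:
    total = 0.0
    fired: List[str] = []
    for rule in plan(rules):
        if rule.check(message):
            total += rule.score
            fired.append(rule.name)
            # Assumed optimisation: stop once the verdict is already decided.
            if total >= threshold:
                break
    return total, fired


rules = [
    Rule("RARE", lambda m: "viagra" in m, 5.0, 0.01),
    Rule("COMMON", lambda m: "free" in m, 5.0, 0.60),
    Rule("MID", lambda m: "click" in m, 3.0, 0.30),
]
total, fired = scan("free click viagra", rules, threshold=8.0)
```

With this ordering the least probable rule is never evaluated once the threshold has been reached, which is where the saving would come from.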
The statistics module is based on the Markovian Bayes algorithm with the OSB tokenizer from crm114, which behaved more accurately in my tests than the unigram Bayes that SA uses by default.

In conclusion, I'd like to add a few words about the immature state of the project. Unfortunately, I developed it with only a single client in mind. Rspamd therefore cannot be compared with SA in terms of the number of features; however, it can be useful for those who do not require every single feature of SA but want something oriented toward performance and statistical checks. I'm very keen to attract more users to the rspamd project, so if you have any questions or want to try rspamd, please feel free to contact me.

Finally, sorry for this message, which is not directly connected with the SA project.

[1]: https://bitbucket.org/vstakhov/rspamd/

-- 
Vsevolod Stakhov
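For readers unfamiliar with the OSB (orthogonal sparse bigram) tokenizer mentioned in the message, the following Python sketch shows the general idea: inside a sliding window, each word is paired with each of the following words, and the distance between them is kept as part of the feature. The function name, the window size of 5 and the tuple layout are assumptions for illustration; the real crm114 and rspamd code hashes these pairs and differs in detail.

```python
from typing import List, Tuple


def osb_tokens(words: List[str], window: int = 5) -> List[Tuple[str, int, str]]:
    """Return (first_word, distance, second_word) features for a word list."""
    features: List[Tuple[str, int, str]] = []
    for i, first in enumerate(words):
        # Pair the current word with up to window-1 words that follow it.
        for distance in range(1, window):
            j = i + distance
            if j >= len(words):
                break
            features.append((first, distance, words[j]))
    return features


pairs = osb_tokens(["buy", "cheap", "pills"], window=3)
```

A unigram tokenizer would yield only the three single words here, while the sparse bigrams also capture which words occur near each other, which is the extra signal a Markovian-style classifier can use.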