Hi,

While studying the Bayesian import matching algorithm further, I have reached the point where I want to understand how Bayes' formula is applied to the problem of matching transactions to accounts using tokens. I need more information, though, because it is not clear to me what is actually being calculated there.

The implementation can be found in the following functions in Account.cpp:

 * get_first_pass_probabilities()
 * build_probabilities()
 * highest_probability()

Actually, the last one can be left out of the discussion, since it only selects the account with the highest matching probability.

Judging from the code and its sparse comments, the implementation appears to be a variant of the naive Bayes classifier <https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Probabilistic_model>, with the tokens used as (independent) "features" and the accounts used as "classes". But comparing that algorithm with the code leaves several questions open.
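
As a point of comparison, the textbook model from the linked article can be sketched roughly as follows. This is not taken from Account.cpp and may well differ from what GnuCash does: the names (Training, naive_bayes_posterior), the add-one smoothing, and the prior derived from per-account token counts are all assumptions made only for illustration. It needs C++17.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical training data: account name -> (token -> times seen).
using TokenCounts = std::map<std::string, int>;
using Training = std::map<std::string, TokenCounts>;

// Textbook naive Bayes: P(A | t1..tn) is proportional to
// P(A) * P(t1 | A) * ... * P(tn | A), normalised over all accounts.
// Assumptions: P(A) is the account's share of all token counts, and
// P(t | A) uses add-one (Laplace) smoothing.
std::map<std::string, double>
naive_bayes_posterior(const Training& training,
                      const std::vector<std::string>& tokens)
{
    std::map<std::string, int> account_total;
    std::set<std::string> vocabulary;
    int grand_total = 0;
    for (const auto& [account, counts] : training) {
        for (const auto& [token, n] : counts) {
            account_total[account] += n;
            grand_total += n;
            vocabulary.insert(token);
        }
    }

    std::map<std::string, double> result;
    if (grand_total == 0)
        return result;  // nothing learned yet

    // Work in log space so that many small factors do not underflow.
    std::map<std::string, double> log_score;
    double max_log = -std::numeric_limits<double>::infinity();
    for (const auto& [account, counts] : training) {
        const double total = account_total[account];
        if (total == 0)
            continue;  // account without any training tokens
        double score = std::log(total / grand_total);  // log prior
        for (const auto& token : tokens) {
            const auto it = counts.find(token);
            const int n = (it == counts.end()) ? 0 : it->second;
            score += std::log((n + 1.0) / (total + vocabulary.size()));
        }
        log_score[account] = score;
        max_log = std::max(max_log, score);
    }

    // Normalise so that the probabilities sum to 1.
    double norm = 0.0;
    for (const auto& entry : log_score)
        norm += std::exp(entry.second - max_log);
    for (const auto& entry : log_score)
        result[entry.first] = std::exp(entry.second - max_log) / norm;
    return result;
}
```

With this sketch, the account that highest_probability() would pick corresponds to the entry with the largest value in the returned map.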

Does anybody know of a more precise description of the algorithm on which the GnuCash implementation is based?

Regards,
Christian

_______________________________________________
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
