Dedup detection is definitely far from perfect and was just something I tried at the time.
In the new version - beangulp, which Daniele is driving - dedup can be done by importer.
I think that per-importer custom dedup is best. For example, any importer that has a unique ID per transaction should leverage this.

On Tue, Mar 30, 2021, 06:57 redst...@gmail.com <redstre...@gmail.com> wrote:

> Reg. class SimilarityComparator in similarity.py:
>
> The final check is:
>     # Here, we have found at least one common account with a close
>     # amount. Now, we require that the set of accounts are equal or that
>     # one be a subset of the other.
>     return accounts1.issubset(accounts2) or accounts2.issubset(accounts1)
>
> I've been instead using a slightly modified version, where I just check
> for intersection:
>     return accounts1.intersection(accounts2)
>
> For my use cases, this has worked better in every case. The common case is
> an import of a credit card transaction that is modified post-import. On a
> subsequent import (with an overlapping date range), dedupe does not work
> with the original heuristic.
>
> I can't help but wonder if this would be universally better for everyone.
> Thoughts?
>
> If not, perhaps an option might help users fine tune for their use cases?
> Suggestions:
> --aggressive_match
> --heuristic=match_on_one_common_posting (--heuristic would take in a list)
>
> Making dedupe detection better further cuts down ingest effort
> <https://reds-rants.netlify.app/personal-finance/the-five-minute-ledger-update/>
> (links to 5min ledger update article).
>
> Martin, would you be opposed to one of the approaches above?
>
> Thanks,
> -red
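
To make the difference between the two checks quoted above concrete, here is a minimal sketch. It is not the code from similarity.py: the function names subset_match and intersection_match and the account names are made up for illustration, and plain Python sets of account names stand in for the accounts of two transactions. It only models the final account-set check, assuming the earlier steps (a common account with a close amount) have already passed.

```python
# Illustrative sketch only; names and data here are hypothetical and are
# not taken from beancount's similarity.py.


def subset_match(accounts1, accounts2):
    """Original final check: one account set must contain the other."""
    return accounts1.issubset(accounts2) or accounts2.issubset(accounts1)


def intersection_match(accounts1, accounts2):
    """Proposed relaxed check: a single shared account is enough."""
    return bool(accounts1.intersection(accounts2))


# A credit card purchase as first imported ...
imported = {"Liabilities:CreditCard", "Expenses:Uncategorized"}
# ... and the same entry after it was recategorized by hand in the ledger.
edited = {"Liabilities:CreditCard", "Expenses:Groceries"}

print(subset_match(imported, edited))        # False
print(intersection_match(imported, edited))  # True
```

With the subset check, the hand-edited entry no longer matches its re-imported copy, which is the case described in the quoted message. The intersection check does match it, at the cost of also accepting any two entries that merely share one account.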

--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/CAK21%2BhO%2Bz0MUf5O2OxxLwmDeKwO3ivKTLtonN68mL-KT5OvhSw%40mail.gmail.com.