According to examples/import.py, hooks take two parameters:

    Args:
      extracted_entries_list: A list of (filename, entries) pairs, where
        'entries' are the directives extract from 'filename'.
      ledger_entries: If provided, a list of directives from the existing
        ledger of the user. This is non-None if the user provided their
        ledger file as an option.
    Returns:
      A possibly different version of extracted_entries_list, a list of
      (filename, entries), to be printed.

But "ledger_entries" is never None: it is non-None even if the user didn't provide their ledger file as an option. This is because of the deduplication logic in __init__.py/_extract, which extends the existing entries with the newly extracted ones before calling the hooks:

    # Deduplicate.
    for filename, entries, account, importer in extracted:
        importer.deduplicate(entries, existing_entries)
        existing_entries.extend(entries)

    # Invoke hooks.
    for func in ctx.hooks:
        extracted = func(extracted, existing_entries)

This was somewhat surprising to me (especially since it contradicts the quasi-documentation/comment I quoted), as I wouldn't expect (or want) the newly imported entries to be merged into the existing entries before the hooks have run.

Is this intended? Is there another easy way to get the pristine set of entries from within a hook, short of just running beancount.loader myself?

My actual use case: I want to know the most recent Balance statement of an account in the ledger, which I am using as a proxy for "last imported date". But the most recent Balance statement I actually find will be the one auto-generated by beangulp and then merged into existing_entries.

Cheers,
Justus
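
P.S. For illustration, here is a minimal sketch of one possible workaround inside a hook. It assumes that the _extract code quoted above is what actually runs, i.e. that existing_entries is extended with the very same objects that appear in the extracted list, so they can be filtered out again by identity. The helper names pristine_entries and last_balance_date are hypothetical, and the Balance namedtuple is only a stand-in for beancount.core.data.Balance so that the snippet runs without beancount installed.

```python
from collections import namedtuple

# Hypothetical stand-in for beancount.core.data.Balance (assumption: the real
# directive also exposes .date and .account and is a class named "Balance").
Balance = namedtuple("Balance", "date account")


def pristine_entries(extracted_entries_list, ledger_entries):
    """Return ledger_entries without the entries that were just extracted."""
    # item[1] is 'entries' both for the documented (filename, entries) pairs
    # and for the 4-tuples seen in the quoted _extract loop.
    new_ids = {id(entry) for item in extracted_entries_list for entry in item[1]}
    # Filter by identity, assuming the same objects were appended to the ledger list.
    return [entry for entry in ledger_entries if id(entry) not in new_ids]


def last_balance_date(entries, account):
    """Return the date of the most recent Balance for 'account', or None."""
    dates = [
        entry.date
        for entry in entries
        if type(entry).__name__ == "Balance" and entry.account == account
    ]
    return max(dates, default=None)
```

A hook could then call last_balance_date(pristine_entries(extracted, existing_entries), account) to get the "last imported date" without the auto-generated Balance getting in the way. This only helps if the identity assumption holds; if entries were copied before being merged, the filter would not remove anything.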