Re: load_file omits some entries (balances)

Florian Lindner Sun, 19 May 2019 12:04:17 -0700

On 17.05.19 at 01:01, Martin Blais wrote:
> Alright now I see what you want to do.
> You want to rewrite your payees, but in the source file itself.
> That's a nice idea.


Thanks!

> However, I don't think you'll be able to put together a nice solution with 
> rewriting after processing.
> I would work off the source text itself.
> Or even better: as a combination of both.
> Here's what you could do: parse the entire thing, filter just the 
> transactions.
> For each transaction you have the filename and line number.
> Do whatever remapping / processing / cleaning you want to do on the payee 
> names in your script.
> Then, process each file, using a regexp to replace the first string that 
> occurs on the lines where you have transactions with renamed payees.
> 
> This is better than working purely from the source file because you won't 
> have to write a full alternative parser to make your replacements; all you 
> need to ace is replacement of the first string on those transaction lines and 
> leave all the other lines untouched. Should be pretty easy and robust enough 
> (tip: make sure you safeguard your files in a git/hg repo and diff just in 
> case). The benefit is your source files will keep all the other formatting 
> and comments and spacing and ordering and whatever else.
> 
> This is how I'd go about this.
> I think it would even be possible to template this and provide helper 
> functions.

OK, I understand what you're suggesting, but I am not sure that is the way to
go. For an easy case, such as replacing payees, it is fine. For more complex
tasks, like adding new metadata fields, changing accounts, or even splitting
transactions between accounts, I think a search-and-replace approach will
evolve into just rewriting entire transactions in the source file from the
Transaction object.
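
For the simple payee case, the line-based replacement suggested above could be
sketched roughly as follows. This is only a minimal sketch under some
assumptions: replace_payees and new_payees are hypothetical names, the line
numbers are taken to be the 1-based values from entry.meta["lineno"], and
escaped quotes inside strings are not handled. Note also that on a transaction
line with only one string, that string is the narration, not the payee.

```python
import re

# Matches the first double-quoted string on a line.
# Assumption: the string contains no escaped quotes.
FIRST_STRING = re.compile(r'"[^"]*"')


def replace_payees(lines, new_payees):
    """Return a copy of `lines` with the first quoted string replaced.

    `new_payees` (hypothetical) maps 1-based line numbers, e.g. taken from
    entry.meta["lineno"], to the new payee name. All other lines are
    returned untouched.
    """
    result = list(lines)
    for lineno, payee in new_payees.items():
        quoted = '"%s"' % payee
        # A function as replacement avoids backslash interpretation.
        result[lineno - 1] = FIRST_STRING.sub(
            lambda match: quoted, result[lineno - 1], count=1)
    return result
```

Everything besides the first string on the selected lines stays as it is, which
is the main benefit of that approach; anything beyond renaming a string would
need more than this.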

Right now, I think the best way for me is to read the beancount file into a
string, parse it using bc.parser.parser.parse_many, and perform the
transformations. Then, rewrite the entire file using
bc.parser.printer.print_entries.

You wrote:

> Still, when you write entries out, they won't look precisely the same as the 
> input. Numbers will have been filled in, cost bases will show up, etc. I 
> don't see the point.

Given my very simple transactions, e.g.,

2018-05-20 * "KREDITKARTENABRECHNUNG" "18.05.18 1234"
  buchungsart: "Lastschr.Kreditkarten"
  empfaenger: " / "
  hash: "e4a580e7002e606a4314f864f64f30a12fb8673f"
  Assets:Giro       -115.00 EUR
  Expenses:Unknown             

Rewriting the ledger file, as I mentioned above, does not change an entry like 
that.

As I just use simple beancount syntax, but potentially want to use more, do you 
consider that kind of rewriting a problem?

In the long run, I think a rewriting protocol would make a beneficial addition 
to beancount, as Stefano also suggested.

Maybe something like the importer protocol:

class MyRewriter:

  def rewriteTransaction(self, txn):
    return txn

  def rewriteBalance(self, bal):
    return bal

or similar, with one function for each directive type. Then you invoke
bean-rewrite on a file or a set of transactions. Just a first idea...  ;-)
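
To make the idea a bit more concrete, here is a rough sketch of how such a
protocol could dispatch entries to the per-type methods. Everything in it is
hypothetical: bean-rewrite does not exist, apply_rewriter and AmazonRewriter
are made-up names, and the two namedtuples are reduced stand-ins for
beancount's Transaction and Balance directives so that the snippet runs on its
own.

```python
from collections import namedtuple

# Reduced stand-ins for beancount's Transaction and Balance directives,
# so the sketch runs without beancount installed.
Transaction = namedtuple("Transaction", "meta date payee narration")
Balance = namedtuple("Balance", "meta date account amount")


class MyRewriter:
    """Hypothetical base class: every method returns the entry unchanged."""

    def rewriteTransaction(self, txn):
        return txn

    def rewriteBalance(self, bal):
        return bal


def apply_rewriter(rewriter, entries):
    """Call the method matching each entry's type; pass other types through."""
    methods = {
        Transaction: rewriter.rewriteTransaction,
        Balance: rewriter.rewriteBalance,
    }
    return [methods.get(type(entry), lambda e: e)(entry) for entry in entries]


class AmazonRewriter(MyRewriter):
    """Example rule: normalize payees that start with AMAZON."""

    def rewriteTransaction(self, txn):
        if txn.payee and txn.payee.startswith("AMAZON"):
            return txn._replace(payee="Amazon")
        return txn
```

A rewriter then only overrides the methods it cares about, and directives
without a matching method are passed through unchanged.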

Best Regards,
Florian


> On Wed, May 15, 2019 at 4:12 AM Florian Lindner <mailingli...@xgm.de> wrote:
> 
>     Hi,
> 
>     On 15.05.19 at 02:57, Martin Blais wrote:
>     > But why are you trying to do this? What's your purpose?
>
>     My importer applies a set of rules to convert payee names and assign
>     certain kinds of transactions to accounts:
> 
>     # List of tuples (regular expression, replacement)
>     payee_replacements = [
>         ("^AMAZON", "Amazon"),
>     ]
> 
>     # List of tuples (python expression to match, second account to set)
>     accounts_assignments = [
>         ("desc == 'Miete PSW 1'", "Expenses:Miete"),
>         ("payee in ['REWE', 'Kaufland', 'ALDI']", "Expenses:Groceries"),
>         ("True", "Expenses:Unknown")
>     ]
> 
> 
>     def transform_txn(txn):
>         payee = txn.payee
> 
>         for pattern, substitute in payee_replacements:
>             if re.match(pattern, payee):
>                 payee = substitute
>                 break
> 
>         txn = txn._replace(payee = payee)
> 
>         local_vars = {"payee" : txn.payee, "desc" : txn.narration,
>                       "buchungsart" : txn.meta["buchungsart"]}
>         if txn.postings[1].account == "Expenses:Unknown":
>             for expr, acc in accounts_assignments:
>                 if eval(expr, local_vars):
>                     account = acc
>                     break
> 
>             txn.postings[1] = txn.postings[1]._replace(account = account)
>            
>         return txn
> 
> 
>     These two rulesets are applied on import.
> 
>     I want to also apply them on existing ledgers.
> 
>     Use case: I identify a recurring transaction pattern, such as
>     "buchungsart == 'GAA,Spk.Netz'". All matching transactions, newly
>     imported as well as existing ones, should have the account
>     "Assets:Bargeld" assigned. For that, I need a method to read in all
>     transactions, transform them and write them to a beancount file.
> 
>     This is my solution to this question: 
> https://groups.google.com/forum/#!topic/beancount/e93VI4s4YCQ
> 
>     An alternative approach is plugins. As far as I understand, plugins
>     only apply live transformations, i.e., they transform data as it is
>     loaded from a file, but do not write the data back to the file.
> 
>     >> A workaround I see is to read in main.beancount and write out the
>     >> entries to different files based on entries[6].meta["filename"],
>     >> basically rewriting the entire ledger.
>     >
>     > I was going to suggest this.
>     > Still, when you write entries out, they won't look precisely the same
>     > as the input. Numbers will have been filled in, cost bases will show
>     > up, etc. I don't see the point.
>     Yes, I have noticed that, but that seems ok to me.
> 
>     I hope I was able to explain my use case. I am open to any thoughts and
>     ideas to achieve that differently.
> 
>     Best Regards,
>     Florian
> 
>     >
>     >
>     >
>     > On Mon, May 13, 2019 at 10:36 AM Florian Lindner <mailingli...@xgm.de> wrote:
>     >
>     >         I see.
>     >         Well FWIW, entries which have errors are not guaranteed to show
>     >         up in the output stream at all.
>     >         It's unclear to me whether this is always the best outcome, but
>     >         a long while ago I decided to do this for transactions and for
>     >         some other directives.
>     >         https://bitbucket.org/blais/beancount/src/d1b2cbf2841669e988f6692ec1d39db3708730cc/beancount/ops/balance.py#lines-119
>     >
>     >         I don't have a solution for you. This is an unusual case.
>     >
>     >
>     >     I tried to apply the workaround I mentioned:
>     >
>     >         entries, error, option_map = bc.loader.load_file(args.inputfile)
>     >         sorted_entries = {} # file -> list of entries
>     >
>     >         for e in entries:
>     >             entry = transform_txn(e) if type(e) == data.Transaction else e
>     >             name = entry.meta["filename"]
>     >             sorted_entries[name] = sorted_entries.get(name, []) + [entry]
>     >      
>     >         for filename in sorted_entries:
>     >             with open(filename, "w") as f:
>     >                 bc.parser.printer.print_entries(sorted_entries[filename], file = f)
>     >
>     >     A problem that shows up is that in main.beancount I have some
>     >     options set (e.g. operating_currency). They don't show up in
>     >     entries, but in option_map. However, I don't know how to write
>     >     them to file.
>     >
>     >     Another idea: At a first try, it seems that reading the entire file
>     >     into a string and using beancount.parser.parser.parse_many would
>     >     work and also parse the balances:
>     >
>     >         with open(args.inputfile, "r") as f:
>     >             instr = f.read()
>     >       
>     >         entries = bc.parser.parser.parse_many(instr)
>     >
>     >     Seems to work fine so far. What do you think?

-- 
You received this message because you are subscribed to the Google Groups 
"Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to beancount+unsubscr...@googlegroups.com.
To post to this group, send email to beancount@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/beancount/525d347f-6608-4690-8f60-3ca4a5fa9bc4%40xgm.de.
For more options, visit https://groups.google.com/d/optout.
