There are also some examples in the source code, here: https://bitbucket.org/blais/beancount/src/default/examples/ingest/office/
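On the categorizer question in the quoted thread below: the catch turned out to be that dumb_categorizer compares against txn.narration.lower(), so a search term containing a capital letter never matches. Here is a minimal, self-contained sketch of one way to avoid that. The Posting and Txn namedtuples are hypothetical stand-ins so the snippet runs without Beancount installed (they are not Beancount's real data classes), and RULES, categorize, and the account names are made up for illustration.

```python
from collections import namedtuple
from decimal import Decimal

# Hypothetical stand-ins for illustration only, NOT Beancount's real classes.
Posting = namedtuple("Posting", "account units")
Txn = namedtuple("Txn", "narration postings")

# Hypothetical rules: search term -> account for the other leg.
RULES = {
    "Nutella": "Expenses:Food",
    "BAR & GRILL": "Expenses:Restaurant",
}


def categorize(txn, rules=RULES):
    """Append a balancing posting when a rule matches the narration."""
    if not txn.postings:
        return txn
    first = txn.postings[0]
    narration = txn.narration.lower()
    for term, account in rules.items():
        # Lowercase both sides so a capitalised search term still matches.
        if term.lower() in narration:
            other = first._replace(account=account, units=-first.units)
            txn.postings.append(other)
            break
    return txn


txn = Txn("Withdrawal Debit Card SOME BAR & GRILL CITY ST",
          [Posting("Assets:Suncoast:Checking-G", Decimal("-59.83"))])
categorize(txn)
```

With the real CSV importer, the thread reports that the function is defined at the top of the config file and passed in as categorizer=...; the matching logic itself would stay the same.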
On Sun, Sep 16, 2018 at 12:15 AM <[email protected]> wrote:
> Hey,
>
> I'm in a very similar boat, were you able to post your importer files
> publicly? I think seeing the conversation of you working through this,
> along with your finished files would make your files a lot more easier to
> understand than the current examples I've seen.
>
> Cheers,
>
>
> On Friday, 20 July 2018 02:22:48 UTC+10, [email protected] wrote:
>>
>> I figured it out. The dumb_categorizer does .lower(): and I was passing
>> it a search term with a capital letter in it. Now I'm off to the races.. :)
>>
>> I think maybe I might publish my working setup once I get it all cleaned
>> up, as yet another example for others to follow.
>>
>> TRS-80
>>
>> --
>> Securely sent with Tutanota. Claim your encrypted mailbox today!
>> https://tutanota.com
>>
>> 19. Jul 2018 10:44 by [email protected]:
>>
>> OK, I am successfully calling dumb_categorizer from CSV Importer by
>> defining it at beginning of .config file, and then passing categorizer =
>> dumb_categorizer to CSV Importer. I know this because I replaced it with a
>> simple print("something") and I got a bunch of "something" on stdout. So
>> the categorizer is getting called, it's just either not matching or not
>> attaching the other leg... ?
>>
>> Any help would be greatly appreciated.
>>
>> TRS-80
>>
>> 19. Jul 2018 08:52 by [email protected]:
>>
>> I suppose I should have included a link to the CSV importer source:
>> https://bitbucket.org/blais/beancount/src/80d30d6896cf5fdcff8c1156cab77107ee8e0f96/beancount/ingest/importers/csv.py?at=default&fileviewer=file-view-default
>>
>> Down toward the bottom (line 283) is where the categorizer gets called.
>>
>> Last night at my local LUG, I volunteered to do a talk next month on
>> plain text accounting, and got the green light. So it would be nice to get
>> this working by then.
:)
>>
>> TRS-80
>>
>> 19. Jul 2018 08:32 by [email protected]:
>>
>> It is still unclear to me where to put this categorizer code? I have
>> tried putting it here, there, and everywhere. I am using the provided
>> generic CSV importer, which calls it, but I cannot figure out where to put
>> it or how to instantiate it or whatever it is you need to do in Python.
>>
>> Since I don't really know Python, I am happy to pay someone few bucks to
>> help me get this working.
>>
>> (from
>> https://bitbucket.org/blais/beancount/pull-requests/24/improve-ingestimporterscsv/diff
>> ):
>>
>> def dumb_categorizer(txn):
>>     # At this time the txn has only one posting
>>     try:
>>         posting1 = txn.postings[0]
>>     except IndexError:
>>         return txn
>>
>>     # Guess the account(s) of the other posting(s)
>>     if 'nutella' in txn.narration.lower():
>>         account = 'Expenses:Food'
>>     else:
>>         return txn
>>
>>     # Make the other posting(s)
>>     posting2 = posting1._replace(
>>         account=account,
>>         units=-posting1.units
>>     )
>>
>>     # Insert / Append the posting into the transaction
>>     if posting1.units < posting2.units:
>>         txn.postings.append(posting2)
>>     else:
>>         txn.postings.insert(0, posting2)
>>
>>     return txn
>>
>> 25. Jun 2018 16:33 by [email protected]:
>>
>> OK, stayed up late last night and actually got all my character stripping
>> accomplished in Python within the provided tools. Yay me (first Python code
>> I ever wrote)! :)
>>
>> OK so basic CSV importers are working, now trying to figure out where to
>> stick the categorizer code I found here:
>> https://bitbucket.org/blais/beancount/pull-requests/24/improve-ingestimporterscsv/diff
>>
>> I been trying here and there without success as of yet. Any
>> hints/pointers would be greatly appreciated!
>>
>> TRS-80
>>
>> 24. Jun 2018 15:21 by [email protected]:
>>
>> On Sun, Jun 24, 2018 at 11:58 AM <[email protected]> wrote:
>>
>>> [...]But by all means, please correct me if I am wrong, or have missed
>>> something.
>>>
>>> So now that I have attained some success, and see the light at the end
>>> of the tunnel, it looks like I will have to do ~ the following:
>>> 1.Manually download CSV file from bank.
>>>
>> Yes
>>
>>> 2.Do some pre-processing, either manually or with macros in Emacs, or
>>> (more likely) programatically, using scripts and sed, etc. to remove parens
>>> and $s.
>>>
>> You can write code in your importer to do that.
>>
>>> 3.Run the actual bean-import.
>>>
>> You mean bean-extract.
>>
>>> 4.Run some post processing (I would like to change date: metadata name to
>>> transaction_date: because I think it's more descriptive).
>>>
>> Do that in your importer code as well.
>>
>>> 5.And then finally hand copy these transactions into my main .beancount
>>> file, double checking and tweaking (aka "clearing") them in the process,
>>> categorizing remaining ones into Expense accounts and perhaps updating my
>>> scripts in the process.
>>>
>> Yes.
>>
>>> I suppose 2, 4, and 5 could be done all in Emacs, but I'll just have to
>>> figure out some workflow now that works for me.
>>>
>> Yes.
>>
>>>
>>> Also not mentioned is somehow programatically inserting the other leg of
>>> the transaction (which Expense account). I agree with Martin's basic
>>> philosophy on this, and still plan on manually reviewing everything,
>>> however I am already seeing that the bulk of transactions are the same
>>> places in my case and could easily be categorized with some simple matching
>>> (either in a post matching script or within bean-extract using
>>> categorizer). I need to look into this more, and also experiment or read up
>>> on how the de-duplication works, as I think it's probably related.
>>>
>>
>> You can write some function for your importer to do that with your
>> particular rules if it saves you time.
>>
>>> Anyway, I will continue to report on what I find as I go along, and even
>>> though I'm not getting any replies
>>>
>> Short emails with direct questions -> more replies more quickly
>>
>>> hopefully this will either encourage others to try and set this up or
>>> perhaps help other noobs who come along later looking for more in depth
>>> info (or perhaps stumble across similar error messages searching the
>>> internet) and it eventually helps someone.
>>>
>>> Helpful tips, encouraging words, or even just letting me know if anyone
>>> is actually reading my idiotic ramblings are always welcomed. :D
>>>
>>
>> Sounds like you're making great progress!
>> Unfortunately automating the importing still requires writing Python code
>> and I see no way around that, I wish it was easier.
>>
>>>
>>> TRS-80
>>>
>>> 22. Jun 2018 19:21 by [email protected]:
>>>
>>> Yeah I was completely on the wrong track before (I think). But I am on
>>> the right one now (I think)?
>>>
>>> So what I have done is just copy the csv.py file and save it as
>>> __init__.py in my importers/suncoast_g directory. Then I put the following
>>> into ledger.config:
>>> https://paste.pound-python.org/show/popHoa0wvVE2OiPCqIAL
>>>
>>> But now when doing bean-extract I get "ValueError: CSV config without
>>> header has non-index fields: {'[DATE]': 'Posted Date', '[TXN_DATE]':
>>> 'Transaction Date', '[NARRATION1]': 'Description', '[CREDIT]': 'Deposit',
>>> '[DEBIT]': 'Withdrawal', '[BALANCE]': 'Balance'}"
>>>
>>> Yes my CSV have headers. I been searching the internet for that error,
>>> but still scratching my head. Also tried to change '[DATE]' to 'DATE' etc.
>>> but that didn't seem to make a difference either.
>>>
>>> Of course, I could be completely off track (this is my fourth different
>>> approach). I been flailing around at this all day and a good part of
>>> yesterday too. Early in the morning until late at night. At this point I
>>> would be willing to send someone a few dollars to help me get this set up.
>>> I am sure I could get other accounts working and maintain it once I can
>>> just get the first one working.
>>>
>>> When I first saw my credit union's CSV file I thought "this should be
>>> easy" because it's very straightforward. I don't need all this complicated
>>> parsing like I have seen in some of the other Importers I have been
>>> studying. Just a straight CSV import. Or so I thought... :/
>>>
>>> Anyway, any help at all would be greatly appreciated at this point. Any
>>> clue might help!
>>>
>>> TRS-80
>>>
>>> 22. Jun 2018 14:19 by [email protected]:
>>>
>>> OK I sought and received some help in @python. I think I am on a much
>>> better track now. I don't know where I got my original __init__.py from,
>>> some similar thread here I think.
>>>
>>> But now I have downloaded from source the utrade one from:
>>> https://bitbucket.org/blais/beancount/src/65212d1176bb427a7883d2593edbd0e0545a145a/examples/ingest/office/importers/utrade/__init__.py?at=default&fileviewer=file-view-default
>>> and am modifying that to my needs. I now see that I missed a whole bunch of
>>> the methods listed in "Writing an Importer" section of "Importing External
>>> Data" Docs. It will take me a while to work through it but I will post
>>> something back later, including results. I just didn't want anyone to spend
>>> time posting a long reply in the meantime.
>>>
>>> Fun fun! :)
>>>
>>> TRS-80
>>>
>>> 22.
Jun 2018 12:08 by [email protected]:
>>>
>>> OK, so this is quite challenging for someone who doesn't really know
>>> Python. However I think it's a good exercise not only for myself but also
>>> to help other newbies who would like to try and get this awesome feature
>>> working.
>>>
>>> I have read everything I can in source and mailing list about CSV Import
>>> / Ingest and I've made some progress, but now I'm stuck.
>>>
>>> Apologies in advance for ugly formatting, Google Groups apparently do
>>> not support inline text formatting, and I am communicating with the group
>>> via email.
>>>
>>> I've tried to (mostly) follow the naming conventions in the examples but
>>> it seems they have changed over time. Anyway, file structure looks like so:
>>> ~/fin
>>> |---documents
>>> |---Downloads
>>> |---importers
>>> |   |---suncoast_g
>>> |   |   |---__init__.py (this file shared below)
>>> |   |---__init__.py (this file is empty)
>>> |---ledger.beancount
>>> |---ledger.config (I have seen this also referenced as .import in docs)
>>>
>>> Here is my ledger.config file:
>>> --------------------(begin ledger.config file)--------------------
>>> #!/usr/bin/env python3
>>> """Example import configuration."""
>>>
>>> # Insert our custom importers path here.
>>> # (In practice you might just change your PYTHONPATH environment.)
>>> import sys
>>> from os import path
>>> sys.path.insert(0, path.join(path.dirname(__file__)))
>>>
>>> from importers import suncoast_g
>>> #from importers import acme_pdf
>>>
>>> from beancount.ingest import extract
>>> #from beancount.ingest.importers import ofx
>>>
>>>
>>> # Setting this variable provides a list of importer instances.
>>> #
>>> # Removed the following from below to replace with my own, saved for reference
>>> #
>>> # utrade.Importer("USD",
>>> #                 "Assets:US:UTrade",
>>> #                 "Assets:US:UTrade:Cash",
>>> #                 "Income:US:UTrade:{}:Dividend",
>>> #                 "Income:US:UTrade:{}:Gains",
>>> #                 "Expenses:Financial:Fees",
>>> #                 "Assets:US:BofA:Checking"),
>>> #
>>> # ofx.Importer("379700001111222",
>>> #              "Liabilities:US:CreditCard",
>>> #              "bofa"),
>>> #
>>> # acme_pdf.Importer("Assets:US:AcmeBank"),
>>> #
>>> CONFIG = [
>>>     suncoast_g.Importer("Assets:Suncoast:Checking-G"),
>>> ]
>>>
>>>
>>> # Override the header on extracted text (if desired).
>>> extract.HEADER = ';; -*- mode: org; mode: beancount; coding: utf-8; -*-\n'
>>> --------------------(end ledger.config file)--------------------
>>>
>>> OK now the __init__.py that is in suncoast_g contains following:
>>> --------------------(begin __init__.py file)--------------------
>>> #!/usr/bin/env python3
>>>
>>> #
>>> # Configuration file for extracting Suncoast-G data
>>> #
>>>
>>> from beancount.ingest import regression
>>> from beancount.ingest.importers import csv
>>>
>>> from beancount.plugins import auto_accounts
>>>
>>>
>>> class Importer(csv.Importer):
>>>
>>>     config = {csv.Col.DATE: 'Posted Date',
>>>               csv.Col.TXN_DATE: 'Transaction Date',
>>>               csv.Col.NARRATION: 'Description',
>>>               csv.Col.AMOUNT_CREDIT: 'Deposit',
>>>               csv.Col.AMOUNT_DEBIT: 'Withdrawal',
>>>               csv.Col.BALANCE: 'Balance'}
>>>
>>>     def __init__(self, account):
>>>         csv.Importer.__init__(
>>>             self, self.config,
>>>             account, 'Currency',
>>>             ('Posted Date,Transaction Date,Description,'
>>>              'Deposit,Withdrawal,Balance'),
>>>             1)
>>>
>>>     def get_description(self, row):
>>>         payee, narration = super().get_description()
>>>         narration = '{} ({})'.format(narration, row.category)
>>>         return payee, narration
>>> --------------------(end __init__.py file)--------------------
>>>
>>> I have just copied this stuff and tried to figure it out.
I'm sure I've
>>> got something wrong in here but I don't really know what I'm doing. FYI
>>> here is what the data looks like which is in G.csv in Downloads:
>>>
>>> Posted Date,Transaction Date,Description,Deposit,Withdrawal,Balance
>>> 6/4/2018,6/4/2018,Withdrawal Debit Card SOME BAR & GRILL CITY ST Card XXXX,,($59.83),$229.15
>>>
>>> OK I think that's all the relevant info. So now when I do:
>>>
>>> ~/fin$ bean-identify ledger.config Downloads
>>>
>>> I get:
>>>
>>> **** /home/myname/fin/Downloads/A Sunnet History 6186156 23032018_21062018.csv
>>> **** /home/myname/fin/Downloads/G.csv
>>>
>>> Which I think means it is identifying those 2 files (the only ones in
>>> there) as CSV, correct? I will point out that G.csv is an Asset account and
>>> is my first target here. The other one is a Liability account (credit card)
>>> and therefore has different fields (only one amount, and no balance). But I
>>> figure once I get this one working, that other one (and subsequent others)
>>> should be pretty easy.
>>>
>>> OK so now when I do:
>>>
>>> ~/fin$ bean-extract ledger.config Downloads
>>>
>>> I get:
>>>
>>> **** /home/myname/fin/Downloads/A Sunnet History 6186156 23032018_21062018.csv
>>>
>>> **** /home/myname/fin/Downloads/G.csv
>>>
>>> ERROR:root:Importer importers.suncoast_g.Importer: "Assets:Suncoast:Checking-G".extract() raised an unexpected error: CSV config without header has non-index fields: {<Col.DATE: '[DATE]'>: 'Posted Date', <Col.TXN_DATE: '[TXN_DATE]'>: 'Transaction Date', <Col.NARRATION: '[NARRATION1]'>: 'Description', <Col.AMOUNT_CREDIT: '[CREDIT]'>: 'Deposit', <Col.AMOUNT_DEBIT: '[DEBIT]'>: 'Withdrawal', <Col.BALANCE: '[BALANCE]'>: 'Balance'}
>>>
>>> ERROR:root:Traceback: Traceback (most recent call last):
>>>   File "/usr/local/lib/python3.6/dist-packages/beancount/ingest/extract.py", line 187, in extract
>>>     allow_none_for_tags_and_links=allow_none_for_tags_and_links)
>>>   File "/usr/local/lib/python3.6/dist-packages/beancount/ingest/extract.py", line 69, in extract_from_file
>>>     new_entries = importer.extract(file, **kwargs)
>>>   File "/usr/local/lib/python3.6/dist-packages/beancount/ingest/importers/csv.py", line 189, in extract
>>>     iconfig, has_header = normalize_config(self.config, file.head())
>>>   File "/usr/local/lib/python3.6/dist-packages/beancount/ingest/importers/csv.py", line 340, in normalize_config
>>>     "{}".format(config))
>>> ValueError: CSV config without header has non-index fields: {<Col.DATE: '[DATE]'>: 'Posted Date', <Col.TXN_DATE: '[TXN_DATE]'>: 'Transaction Date', <Col.NARRATION: '[NARRATION1]'>: 'Description', <Col.AMOUNT_CREDIT: '[CREDIT]'>: 'Deposit', <Col.AMOUNT_DEBIT: '[DEBIT]'>: 'Withdrawal', <Col.BALANCE: '[BALANCE]'>: 'Balance'}
>>>
>>> ;; -*- mode: org; mode: beancount; coding: utf-8; -*-
>>>
>>> And this is where I'm currently stuck.
I feel like it's something dumb,
>>> something not pointing at something else correctly but I don't know enough
>>> Python (yet) to figure it out myself. Any halp would be greatly
>>> appreciated. :)
>>>
>>> TRS-80
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Beancount" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/beancount/LFcF9ZJ--3-0%40tutanota.com
>>> <https://groups.google.com/d/msgid/beancount/LFcF9ZJ--3-0%40tutanota.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
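For step 2 in the quoted thread (removing the parentheses and dollar signs), a small helper along these lines could live in the importer code instead of a separate sed pass. This is only a sketch: parse_amount is a hypothetical name, and treating a parenthesised value as negative is an assumption based on the sample row, where the Withdrawal column reads ($59.83).

```python
from decimal import Decimal


def parse_amount(text):
    """Convert strings such as '($59.83)' or '$229.15' to a Decimal.

    Hypothetical helper; parentheses are assumed to mark a negative amount.
    Returns None for an empty CSV cell.
    """
    text = text.strip()
    if not text:
        return None
    negative = text.startswith("(") and text.endswith(")")
    # Drop the parentheses, currency sign, and thousands separators.
    cleaned = text.strip("()").replace("$", "").replace(",", "")
    value = Decimal(cleaned)
    return -value if negative else value


withdrawal = parse_amount("($59.83)")
balance = parse_amount("$229.15")
```

Decimal is used rather than float so that cents are not lost to rounding.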
