TRS-80,

You may find it reassuring that every professional I know shares a more-or-less extreme bias toward automated workflows.

You might be interested in https://csingley.github.com/ofxtools (that's mine). It's a much better OFX client than what you're using (if I do say so myself), and it can also parse the downloaded files to extract the transactions and balances.

Cheers.

On Wednesday, July 15, 2020 at 1:32:33 PM UTC-5 TRS-80 wrote:

> OK so this got a little long. Go grab yourself your favorite tasty cold delicious adult beverage and get comfortable. I did put in some headings at least to mitigate the wall of text. :D
>
> I am on my second or third go 'round with Beancount, over a period of some years. I have had various levels of success or failure, but for me personally, I really felt like I was the most successful when automating as much as possible.
>
> This post will be about how I arrived at that conclusion, and some things I learned along the way. I hope it ends up being useful to others who may have similar ideas, but perhaps have not put all the pieces together yet. Or maybe need some encouragement, or...?
>
> * Know Thyself
>
> I guess I felt the need to make this post because Martin himself, throughout the docs, seems to put forth his more "manual" way of doing things (as a way to keep more "in touch" with his numbers, if I am reading him correctly).
>
> But perhaps I read too much into that. I can only say that generally I do try to follow the recommendations made in the docs by the founder and those more involved in a particular project (especially when I am just starting out). I suppose I figure there must be some reason for it, even if I do not yet understand what those reason(s) might be... So maybe this is just me finally gaining enough experience to know what is what, and perhaps more importantly "knowing myself" enough to recognize what works for me, personally.
>
> If you prefer the more "manual" approach (or any other approach, for that matter) I encourage you to "do what works for you."
> Thankfully we have such flexible tools available to us...
>
> * Automate as much as Possible
>
> For me, what seems to work is "as much automation as possible." I still end up manually doing some stuff of course, but for me if I can get that down to 10% or 5% (or whatever), the way I see it is I have reduced 90-95% of the work (and drudgery) involved. I mean, this is what computers are best at, isn't it?
>
> * Moving from CSV to OFX import
>
> Along those lines, I recently moved from CSV import to OFX. It's still early, but I am well on my way to nearly /completely/ automating my download, import, and categorization.
>
> Before, with CSV, I had to log on to my bank and click through stuff, save the file (and then remember my file naming scheme), etc. Sometimes that just became too much friction, and sooner or later I would start falling behind from the simple drudgery of it all.
>
> Further complicating the issue, my bank only keeps "transactions" around for 90 days, so if I got busy or fell behind, I would be back to /manually/ entering any "missed" transactions (yeah, right!).
>
> Enter OFX (via ofxclient), which solves these problems by being a completely scriptable (and thus automatable) tool.
>
> There were a couple of bugbears with ofxclient[0], however. The guy is not really actively maintaining it. However, after fixing a couple of missing apostrophes I /finally/ got it to work. I guess my Python must be getting a little better, because 1-2 years ago I had already failed once or twice at this exact same task. :)
>
> So, on to the next hurdle...
>
> * No (built in) OFX "categorizer"
>
> Anyway, so then it was a little disappointing to learn that there is no callable "categorizer" available in the OFX importer example the same way that there was in the CSV importer example.
>
> Until I found a recent post titled "Categorizing transactions automatically on import" which solved that particular part of the problem.
> I left a more fleshed out example as a reply to that thread for anyone who is interested (search the mailing list for that or "OFX categorizer" etc.).
>
> * Next steps (Selenium WebDriver)
>
> At this point I am satisfied enough with my progress (and have learned enough) that I felt it would be worth sharing that progress with others. But already I am looking forward to next steps. And I am getting excited about Beancount again. :)
>
> The last few days I have been reading up on the docs for Selenium WebDriver. I have heard about Selenium before of course, but what I think motivated me to really give it a try now was an article I recently came across over at plaintextaccounting.org[1] by Lee Yingtong Li titled "Using selenium to scrape/import bank transactions for ledger-cli."[2] This is quite a recent article (2020-04-29), as you can see by the link.
>
> Anyway, he is using it to get his "transactions" but that is not what I plan on using it for (I have OFX for that). For me, the only remaining piece of the puzzle that is left to automate is...
>
> * Automatically downloading PDF statements
>
> Like my "transactions", downloading these PDF "statements" was an exercise in drudgery, for all the reasons already mentioned above (clicking through the bank website, remembering the file naming convention, etc.).
>
> First I tried doing this through the OFX protocol itself. And maybe there is a way? The standard would seem to indicate maybe there is. But I made posts about this not only here but on the ledger mailing list before and have received exactly zero replies so far (which is also why I am not even going to bother looking them up in order to link to them). So I gave up on that way (for now).
>
> So then I got the idea to maybe automate this drudgery using Selenium (WebDriver).
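For what it's worth, here is a rough sketch of what that could look like with the Selenium Python bindings driving Firefox. Treat it as a starting point under stated assumptions, not a working recipe: the statements URL, the element ID of the download link, and the file naming scheme are all hypothetical placeholders, and the login step is left out entirely because it differs for every bank.

```python
from datetime import date
from pathlib import Path


def statement_filename(account: str, statement_date: date) -> str:
    """Build a predictable name (hypothetical scheme: DATE.account.statement.pdf)."""
    slug = account.strip().lower().replace(" ", "-")
    return f"{statement_date.isoformat()}.{slug}.statement.pdf"


def download_statement(statements_url: str, link_id: str, download_dir: Path,
                       account: str, statement_date: date,
                       timeout: float = 60.0) -> Path:
    """Click one statement link in Firefox and rename the PDF that arrives."""
    # Imported here so the naming helper above works without Selenium installed.
    import time
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.options import Options

    download_dir.mkdir(parents=True, exist_ok=True)
    before = set(download_dir.glob("*.pdf"))

    options = Options()
    options.set_preference("browser.download.folderList", 2)  # 2 = custom directory
    options.set_preference("browser.download.dir", str(download_dir.resolve()))
    options.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/pdf")
    options.set_preference("pdfjs.disabled", True)  # save PDFs instead of previewing

    driver = webdriver.Firefox(options=options)
    try:
        driver.get(statements_url)
        # Logging in is bank specific and omitted from this sketch.
        driver.find_element(By.ID, link_id).click()  # link_id is a placeholder

        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            new_files = set(download_dir.glob("*.pdf")) - before
            # Firefox keeps a *.part file next to the target while downloading.
            if new_files and not any(download_dir.glob("*.part")):
                target = download_dir / statement_filename(account, statement_date)
                new_files.pop().rename(target)
                return target
            time.sleep(0.5)
        raise TimeoutError("no PDF appeared in the download directory")
    finally:
        driver.quit()
```

The naming helper is kept separate from the browser code on purpose, so the "remember my file naming scheme" part is handled in one place and can be checked without launching a browser.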
>
> * Arguments for Selenium WebDriver (in general)
>
> Now, I have not even got this actually working yet, and the implementation details will of course be very bank (web site) dependent. So why bother bringing it up now (or at all, for that matter)?
>
> Well, for the same reason as posted very early on: mainly I have heard this sort of thing referred to as "too much trouble" and took that assessment at face value. But is it? Some things I learned in my research the last few days started to change my mind:
>
> 1. So far, the Selenium WebDriver docs[3] seem to be very good. Simple and to the point.
>
> 2. There are bindings for several different languages. And the language bindings (I was looking at Python mostly) seem to be quite clean, straightforward, and easy to remember / intuitive.
>
> 3. It appears to be quite a mature and reliable thing nowadays, with browser vendors like Google and Mozilla (and others) actually maintaining their own drivers for each particular browser. No more "PhantomJS" and feeling like you are in some neverending cat and mouse game with an opponent.
>
> 4. Not only that, apparently the whole notion of automated browser / site testing has actually become a W3C recommendation by now(!). [4]
>
> It really appears to me to be a completely different dynamic nowadays. Therefore I would challenge the notion that the ROI is not there. Not only is this looking quite easy, but dare I say, /well supported/ even! :)
>
> Of course if I run into some brick wall (or get along swimmingly) I will try to make some time and remember to report back in either case. :) Which leads me into my final point...
>
> * Choice of tools
>
> At some point during this whole adventure (a while back) I thought long and hard about choice of tools.
>
> There are other ways to accomplish "automation." Mainly online "aggregators" like Plaid, Mint, and probably some others.
> I actually had signed up for a Plaid developer account at one point, before getting ofxclient working. Those are certainly viable, perhaps even preferable, depending on your personal proclivities. But not for me, and here is why.
>
> First, it is a matter of dependence. Do I want to come to rely on some centralized service, who could change their API or "developer" terms at any time and lock me out? Personally, no, I do not.
>
> The second aspect is trust/security. Do I really trust a third party to hold all my various banking credentials? Personally, no, I do not.
>
> And finally, independence and learning new skills in general. We all have very limited resources (mostly time). Do I want to spend my valuable time learning one particular (likely proprietary) API? Or should I instead spend it learning a much more general (and F/LOSS) tool (like Selenium) which also has the benefit of being able to solve lots of other problems, in addition to this particular one I am trying to solve right now? Personally, <s>I think</s> I know that I prefer the latter.
>
> So that is why I have chosen to go this particular route.
>
> I'd love to hear anyone's thoughts on any or all of the above. Please also chime in if you have gotten stuck at any particular point along the way, and maybe I (or others) can help you get un-stuck. Thanks for sticking with me if you made it this far. :)
>
> Cheers,
>
> TRS-80
>
> [0] https://github.com/captin411/ofxclient
> [1] https://plaintextaccounting.org/#articles-blog-posts
> [2] https://yingtongli.me/blog/2020/04/29/hbs-scrape.html
> [3] https://www.selenium.dev/documentation/en/webdriver
> [4] https://www.w3.org/TR/webdriver1
>

--
You received this message because you are subscribed to the Google Groups "Beancount" group. To unsubscribe from this group and stop receiving emails from it, send an email to beancount+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/1830dade-1b3b-44cb-9b47-f47e83c68a2an%40googlegroups.com.