TRS-80,

You may find it reassuring that every professional I know shares a 
more-or-less extreme bias toward automated workflows.

You might be interested in https://csingley.github.com/ofxtools (that's 
mine).  It's a much better OFX client than what you're using (if I do say 
so myself), and it can also parse the downloaded files to extract the 
transactions/balances.
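
For a rough idea of what extracting transactions from a downloaded file involves, here is a minimal standard-library sketch. It is not the ofxtools API: the sample data and the helper names `field` and `extract_transactions` are made up for illustration, and it assumes the OFX 1.x SGML layout with one element per line. A real parser also has to deal with headers, encodings, and the OFX 2.x XML form, all of which this ignores.

```python
import re
from datetime import date
from decimal import Decimal

# Hypothetical, hand-written sample in OFX 1.x SGML style (not real bank data).
SAMPLE_OFX = """
<STMTTRN>
<TRNTYPE>DEBIT
<DTPOSTED>20200710
<TRNAMT>-42.17
<FITID>0001
<NAME>GROCERY MART
</STMTTRN>
<STMTTRN>
<TRNTYPE>CREDIT
<DTPOSTED>20200714120000
<TRNAMT>1500.00
<FITID>0002
<NAME>ACME PAYROLL
</STMTTRN>
"""


def field(block, tag):
    """Return the text after <TAG> up to the next tag or line end, or None."""
    match = re.search(rf"<{tag}>([^<\r\n]*)", block)
    return match.group(1).strip() if match else None


def extract_transactions(ofx_text):
    """Collect date, amount, id, and payee name from each STMTTRN block."""
    txns = []
    for block in re.findall(r"<STMTTRN>(.*?)</STMTTRN>", ofx_text, re.DOTALL):
        posted = field(block, "DTPOSTED")  # YYYYMMDD, optionally followed by a time
        txns.append({
            "date": date(int(posted[0:4]), int(posted[4:6]), int(posted[6:8])),
            "amount": Decimal(field(block, "TRNAMT")),
            "fitid": field(block, "FITID"),
            "name": field(block, "NAME"),
        })
    return txns


transactions = extract_transactions(SAMPLE_OFX)
for t in transactions:
    print(t["date"], t["amount"], t["name"])
```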

Cheers.

On Wednesday, July 15, 2020 at 1:32:33 PM UTC-5 TRS-80 wrote:

> OK so this got a little long. Go grab yourself your favorite tasty cold
> delicious adult beverage and get comfortable. I did put in some headings
> at least to mitigate the wall of text. :D
>
> I am on my second or third go 'round with Beancount, over a period of some
> years. I have had various levels of success or failure, but for me
> personally, I really felt like I was the most successful when automating as
> much as possible.
>
> This post will be about how I arrived at that conclusion, and some things I
> learned along the way. I hope it ends up being useful to others who may
> have similar ideas, but perhaps not put all the pieces together yet. Or
> maybe need some encouragement, or...?
>
> * Know Thyself
>
> I guess I felt the need to make this post because Martin himself throughout
> the docs seems to put forth his more "manual" way of doing things (as a way
> to keep more "in touch" with his numbers, if I am reading him correctly).
>
> But perhaps I read too much into that. I can only say that generally I do
> try and follow the recommendations made in the docs by the founder and those
> more involved in a particular project (especially when I am just starting
> out). I suppose I figure there must be some reason for it, even if I do not
> yet understand what those reason(s) might be... So maybe this is just me
> finally gaining enough experience to know what is what, and perhaps more
> importantly "knowing myself" enough to recognize what works for me,
> personally.
>
> If you prefer the more "manual" approach (or any other approach, for that
> matter) I encourage you to "do what works for you." Thankfully we have
> such flexible tools available to us...
>
> * Automate as much as Possible
>
> For me, what seems to work is "as much automation as possible." I still
> end up manually doing some stuff of course, but for me if I can get that
> down to 10% or 5% (or whatever) the way I see it is I have reduced 90-95%
> of the work (and drudgery) involved. I mean, this is what computers are
> best at, isn't it?
>
> * Moving from CSV to OFX import
>
> Along those lines, I recently moved from CSV import to OFX. It's still
> early, but I am well on my way to nearly /completely/ automating my
> download, import, and categorization.
>
> Before with CSV I had to log on to my bank and click through stuff, save
> the file (and then remember my file naming scheme), etc. and sometimes
> that just became too much friction and sooner or later I would start
> falling behind from the simple drudgery of it all.
>
> Further complicating the issue, my bank only keeps "transactions" around
> for 90 days, so if I got busy or fell behind, I would be back to /manually/
> entering any "missed" transactions (yeah, right!).
>
> Enter OFX (via ofxclient), which solves these problems by being a completely
> scriptable (and thus automatable) tool.
>
> There were a couple bugbears with ofxclient[0] however. The guy is not
> really actively maintaining it. However after fixing a couple missing
> apostrophes I /finally/ got it to work. I guess my Python must be getting a
> little better, because 1-2 years ago I had already failed once or twice at
> this exact same task. :)
>
> So, on to the next hurdle...
>
> * No (built in) OFX "categorizer"
>
> Anyway so then it was a little disappointing to learn that there is no
> callable "categorizer" available in the OFX importer example the same way
> that there was in the CSV importer example.
>
> Until I found a recent post titled "Categorizing transactions automatically
> on import" which solved that particular part of the problem. I left a more
> fleshed out example as a reply to that thread for anyone who is interested
> (search the mailing list for that or "OFX categorizer" etc.).
>
> * Next steps (Selenium WebDriver)
>
> At this point I am satisfied enough in my progress (and have learned
> enough) that I felt it would be worth sharing that progress with others.
> But already I am looking forward to next steps. And I am getting excited
> about Beancount again. :)
>
> For the last few days I have been reading the Selenium WebDriver docs.
> I have heard about Selenium before of course, but what I think motivated me
> to really give it a try now was an article I recently came across over at
> plaintextaccounting.org[1] by Lee Yingtong Li titled "Using selenium to
> scrape/import bank transactions for ledger-cli."[2] This is a quite recent
> article (2020-04-29) as you can see by the link.
>
> Anyway he is using it to get his "transactions" but that is not what I plan
> on using it for (I have OFX for that). For me, the only remaining piece of
> the puzzle to automate is...
>
> * Automatically downloading PDF statements
>
> Like my "transactions", downloading these PDF "statements" was an exercise
> in drudgery, for all the reasons already mentioned above (clicking through
> bank website, remembering file naming convention, etc.).
>
> First I tried doing this through the OFX protocol itself. And maybe there is
> a way? The standard would seem to indicate maybe there is. But I made posts
> about this not only here but on the ledger mailing list before and received
> exactly zero replies so far (which is also why I am not even going to
> bother looking them up in order to link to them). So I gave up on that way
> (for now).
>
> So then I got the idea to maybe automate this drudgery using Selenium
> (WebDriver).
>
> * Arguments for Selenium WebDriver (in general)
>
> Now, I have not even got this actually working yet, and the implementation
> details will of course be very bank (web site) dependent. So why bother
> bringing it up now (or at all, for that matter)?
>
> Well, for the same reason I gave early on: I have mostly heard this sort of
> thing described as "too much trouble" and took that assessment at face
> value. But is it? Some things I learned in my research the last few days
> started to change my mind:
>
> 1. So far, the Selenium WebDriver docs[3] seem to be very good. Simple and
>    to the point.
>
> 2. There are bindings for several different languages. And the language
>    bindings (I was looking at Python mostly) seem to be quite clean,
>    straightforward, and easy to remember / intuitive.
>
> 3. It appears to be quite a mature and reliable thing nowadays, with
>    browser vendors like Google and Mozilla (and others) actually
>    maintaining their own drivers for each particular browser. No more
>    "PhantomJS" and feeling like you are in some never-ending game of cat
>    and mouse with an opponent.
>
> 4. Not only that, apparently the whole notion of automated browser / site
>    testing has actually become a W3C recommendation by now(!). [4]
>
> It really appears to me to be a completely different dynamic nowadays.
> Therefore I would challenge the notion that the ROI is not there. Not only
> is this looking quite easy, but dare I say, /well supported/ even! :)
>
> Of course if I run into some brick wall (or get along swimmingly) I will
> try and make some time and remember to report back in either case. :)
> Which leads me into my final point...
>
> * Choice of tools
>
> At some point during this whole adventure (a while back) I thought long and
> hard about choice of tools.
>
> There are other ways to accomplish "automation." Mainly online
> "aggregators" like Plaid, Mint, and probably some others. I actually had
> signed up for a Plaid developer account at one point, before getting
> ofxclient working. Those are certainly viable, perhaps even preferable,
> depending on your personal proclivities. But not for me and here is why.
>
> First, it is a matter of dependence. Do I want to come to rely on some
> centralized service, which could change its API or "developer" terms at any
> time and lock me out? Personally, no, I do not.
>
> The second aspect is trust/security. Do I really trust a third party to hold
> all my various banking credentials? Personally, no, I do not.
>
> And finally, independence and learning new skills in general. We all have
> very limited resources (mostly time). Do I want to spend my valuable time
> learning one particular (likely proprietary) API? Or should I instead
> spend it learning a much more general (and F/LOSS) tool (like Selenium)
> which also has the benefit of being able to solve lots of other problems,
> in addition to this particular one I am trying to solve right now?
> Personally, <s>I think</s> I know that I prefer the latter.
>
> So that is why I have chosen to go this particular route.
>
> I'd love to hear anyone's thoughts on any or all of the above. Please also
> chime in if you have gotten stuck at any particular point along the way,
> and maybe I (or others) can help you get un-stuck. Thanks for
> sticking with me if you made it this far. :)
>
> Cheers,
>
> TRS-80
>
> [0] https://github.com/captin411/ofxclient
> [1] https://plaintextaccounting.org/#articles-blog-posts
> [2] https://yingtongli.me/blog/2020/04/29/hbs-scrape.html
> [3] https://www.selenium.dev/documentation/en/webdriver
> [4] https://www.w3.org/TR/webdriver1
>
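
The "categorizer" idea in the quoted post (assigning an account to each imported transaction based on its payee text) can be sketched roughly as below. This is not the code from the mailing-list thread mentioned above, and it does not use Beancount's importer API; the rule table, the account names, and the `categorize` helper are all hypothetical, chosen only to show the first-match-wins lookup.

```python
import re

# Hypothetical rules: (pattern matched against the payee text, account to post to).
# Order matters, because the first matching rule wins.
RULES = [
    (re.compile(r"grocery|supermarket", re.IGNORECASE), "Expenses:Food:Groceries"),
    (re.compile(r"payroll", re.IGNORECASE), "Income:Salary"),
    (re.compile(r"electric|water", re.IGNORECASE), "Expenses:Utilities"),
]

# Fallback for anything no rule recognizes, to be reviewed by hand later.
DEFAULT_ACCOUNT = "Expenses:Uncategorized"


def categorize(payee):
    """Return the account of the first rule whose pattern matches the payee."""
    for pattern, account in RULES:
        if pattern.search(payee):
            return account
    return DEFAULT_ACCOUNT


for payee in ["GROCERY MART #12", "ACME PAYROLL", "CITY WATER DEPT", "UNKNOWN SHOP"]:
    print(f"{payee:20} -> {categorize(payee)}")
```

Keeping the rules in a plain list makes the remaining manual work visible: whatever lands in the fallback account is the part still left to sort by hand.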
