TL;DR: it would be feasible to port the entire Whimsy code base over to
Node.js and get it up and running on whimsy-vm6 hosted on Ubuntu 20.04
(Focal Fossa). There would be a number of advantages to doing so.
The difference wouldn't merely be one of language syntax. The result
would likely be closer to an actor model than an object oriented model.
https://en.wikipedia.org/wiki/Actor_model
https://en.wikipedia.org/wiki/Object-oriented_modeling
A number of quasi-independent observations led me to this conclusion.
Warning: the third part is rather lengthy.
- - - - -
I wrote a test for posting an item to the agenda:
https://github.com/rubys/whimsy-board-agenda-nodejs/blob/d2b46afa81ccd980c416109f6c9c4fea198b508f/src/server/__tests__/post.js#L10
It looks remarkably similar to the analogous test for the Ruby
implementation:
https://github.com/apache/whimsy/blob/1316a898d5e8c91e8a33d89565d12efc4842dd56/www/board/agenda/spec/actions_spec.rb#L20
With this test in place, I now feel that I have one of everything needed
to make a complete application work. The question is whether this
version of the codebase will attract a sustainable community.
My current thoughts are that if I were to start from scratch, I would
definitely do so in Node.js, but at the moment, that's not what I am
facing. The Ruby base is more mature, and the Node.js base is just
starting.
- - - - -
It appears that svn added a new command line parameter,
--password-from-stdin. I added support for this parameter to the
Node.js board agenda tool yesterday:
https://github.com/rubys/whimsy-board-agenda-nodejs/blob/d2b46afa81ccd980c416109f6c9c4fea198b508f/src/server/svn.js#L54
If you are running on a Mac, you may or may not have a version of svn
that supports it. brew upgrade svn will get it for you.
The versions of svn in the repositories for Ubuntu 16.04 and 18.04 don't
support this parameter. The version in the Ubuntu 20.04 repository
does.
If we are writing new code, it is relatively straightforward to handle
this correctly. Updating all of the existing code, on the other hand,
represents a technical debt at this point.
- - - - -
While Ruby and JavaScript have very different surface syntaxes, their
runtime models have a lot of superficial similarities. There are some
subtle differences, which I will oversimplify as follows:
Ruby tends to encourage a more object oriented approach to solving
problems. JavaScript tends to encourage a more event driven approach.
The current Ruby model can be seen here:
https://whimsy.apache.org/docs/api/
We have some obvious domain model object classes: Person, Committee,
CCLAFiles. There are some cross-cutting concerns shared by each, and
those tend to be broken out into classes: LDAP, Git, SVN. Along the
way, the reads and writes for any given data type tend to be clustered
together.
I continued with this approach on the client, where I had an Agenda
class which contained a list of items, each of which was an object that
responded to method calls indicating whether that item was read for
review, flagged, or whatever.
There are a number of drawbacks to this approach. As an example, if you
don't take care, performance suffers: a number of operations take a
while because they require a large number of LDAP requests.
Caching can help, but then there are cache invalidation issues to worry
about. I built a clever solution using weak references, but clever is
generally a sign that there is a flaw in this approach.
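For what it's worth, a rough JavaScript analogue of that kind of
weak-reference cache looks like the sketch below. The Ruby original
differs; WeakCache and its fetch parameter are invented names, and
WeakRef requires Node.js 14.6 or later:

```javascript
// Sketch: memoize an expensive lookup (say, an LDAP query) behind
// weak references, so the garbage collector can reclaim entries.
// Cached values must be objects, since WeakRef cannot hold primitives.
class WeakCache {
  constructor(fetch) {
    this.fetch = fetch;     // the expensive lookup function
    this.refs = new Map();  // key -> WeakRef to the cached object
  }

  get(key) {
    const cached = this.refs.get(key)?.deref();
    if (cached !== undefined) return cached;  // cache hit
    const value = this.fetch(key);            // cache miss: go fetch
    this.refs.set(key, new WeakRef(value));
    return value;
  }
}
```

Clever, as noted, but you still have to reason about when entries
vanish, which is exactly the kind of complexity the data-first approach
below avoids.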
Going out to svn every time there is a request would be a problem, but
that can be mitigated by keeping a local working copy up to date with
cron jobs. This, too, is a form of caching.
We can improve on that with pubsub, and the infrastructure team is
working on that.
But if you ignore the caches, the flow for displaying an agenda item on
the client is very linear: you start with getting a file from svn, you
issue a bunch of LDAP requests, you get another file from svn, issue
more LDAP requests, package up a JSON response that is sent to the
client which pulls it apart, and renders the result in the DOM.
None of that would be feasible without caches.
The JavaScript approach is different. Instead of starting with the
objects and propping up the architecture with caches, you start with the
data (what you previously would call a cache) and build a number of
quasi independent units of work that operate on the data.
When I undertook the port of the board agenda tool to Node.js, I took
whatever code I needed, figuring that I would find a way to factor it
out into libraries later. The code to parse committee-info.txt is a
prime example of something that would be useful to many tools.
On the client, I decided to replace my custom models, routing, and
event handling with the ones favored by the React community.
Stepping back, I see that the code tends to be considerably less linear.
Previously, pushing a button on the client would do something directly,
and often needed explicit code in place to cause data in another
component to rerender.
Now pushing a button will generally do one of three things: surface a
modal dialog, change state within the dialog, or dispatch an action to
the Redux store (possibly based on data retrieved from the server).
That's it. In the first two cases, everything is local. In the latter
case, it is somebody else's problem to do something with the data.
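To illustrate that dispatch-and-forget pattern, here is a minimal store
in the spirit of Redux, written without the library so it stands alone.
The action type and state shape are invented for this sketch, not the
board agenda tool's actual store:

```javascript
// Minimal Redux-style store: dispatch computes new state through a
// reducer, then notifies subscribers. The dispatcher doesn't know or
// care who reacts.
function createStore(reducer) {
  let state = reducer(undefined, { type: '@@init' });
  const listeners = [];
  return {
    getState: () => state,
    subscribe: fn => listeners.push(fn),
    dispatch: action => {
      state = reducer(state, action);  // compute the new state
      listeners.forEach(fn => fn());   // everyone else reacts
    }
  };
}

// Hypothetical reducer: a button handler's only job is to dispatch.
function agendaReducer(state = { items: [] }, action) {
  switch (action.type) {
    case 'POST_ITEM':
      return { ...state, items: [...state.items, action.item] };
    default:
      return state;
  }
}
```

The button handler dispatches and moves on; any component that cares
about agenda items subscribes and rerenders itself.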
A similar thing happened on the server. I have cache files that
represent the parsed version of the agenda, committee-info.txt, member
data from LDAP, and the like. If cached files change, the client is
notified, and it has the option to load that information in the store.
The infrastructure team has already enabled pubsub for LDAP data, and is
working on pubsub for private svn repositories. The role of these
functions will be to update the source of truth on the servers
(i.e., the caches).
Instead of having shared libraries for parsing committee-info.txt, there
can be a canonical JSON file for this data, and multiple tools can have
file watchers that trigger when this file changes.
In fact, we already have these types of files; you can see them in
https://whimsy.apache.org/public/. We can create some more and put them
in a private directory (and perhaps even make them available to
authenticated requests). But we mostly make this data available for
other tools; we don't use it much ourselves.
I'm finding that I like the result more and more. Instead of looking
at an object and a description of what a method should return, and
trying to figure out what's broken when things don't work as expected,
you can directly inspect the data to see whether the problem is in the
production of the data or in its consumption.
In other words, when something goes wrong, the first thing you do is
examine the Redux store on the client (I enabled this via the "=" key
in the board agenda tool) or go to https://whimsy.apache.org/public/
or the equivalent.
And when we are done, we can not only be a pubsub consumer, but perhaps
we can look to be a pubsub source (likely by hosting an instance of the
pypubsub package). Different tools running on different machines,
perhaps written in different languages, can all collaborate in this manner.
- Sam Ruby