Re: [PATCH] ob-shell-test, test-ob-shell and introduction

Thomas S. Dye Sun, 02 Jan 2022 10:58:38 -0800

Aloha all,

FWIW, as a user actively pursuing reproducible research with Org and a contributor to documentation about Org and Babel intended for other users (rather than Org mode elisp coders), I'd be pleased if Org's code custodians look favorably on this proposal.


+1

All the best,
Tom

Matt <m...@excalamus.com> writes:

Apologies for the book. I've been sitting on this stuff for two months and am wondering how to proceed.
IANAL but AFAIK/CT, my contract contains an exception for making contributions to projects like Org. I've gotten confirmation from my manager and from HR. However, until the CEO signs the FSF disclaimer, I can't officially contribute. I'm confident that I can publish changes (e.g. to a personal website); the FSF just can't accept my changes (yet).
I could start working on ob-shell.el separately now and publish that myself. It's not clear how I would then contribute those changes back once I'm cleared by the FSF. I'm inclined towards some refactoring, and I'm not sure how I could break that down in such a way that, if it took two more months to get the copyright stuff worked out, anyone could make sense of the changes. I would much prefer to gradually submit patches, discuss them, and then eventually have them merged in turn. If I have a heap of changes elsewhere, I feel like it would be harder to have a conversation about them.
Regardless, I think I should define test cases first. If those are considered valid, then any refactoring would be moot if they pass, right? With (agreed upon) test cases, it shouldn't matter if I refactor as long as functionality remains the same?
Overall, what advice do you have?
It looks to me like ob-shell.el could use some love in other respects besides async evaluation. I've detailed where below, partly for my own organization and partly for posterity, but mainly because this isn't my house, so to speak, and I don't want to barge in and start rearranging the furniture and eating whatever's in the fridge. Or, is it like Worg in that once I have the keys I can drive where I like, so long as there're no crashes?
I'm interested in people's thoughts on these notes on ob-shell.el:
* Tests
There are several code paths, like shebang, cmdline, and basic execution, which don't always have something to do with one another in the code. Having tests would be really helpful to make sure everything jibes. Before doing anything with the code base, I intend to create tests for all the functionality.
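A rough sketch of what one such test could look like with ERT, for the basic execution path only. The test name =my-ob-shell/echo-output= is hypothetical, and the sketch assumes an =sh= executable is available on the machine running it:
#+begin_src emacs-lisp
(require 'ert)
(require 'org)
(require 'ob-shell)

;; Hypothetical test name; not part of Org's own test suite.
(ert-deftest my-ob-shell/echo-output ()
  "A plain sh block with :results output returns what it echoes."
  (with-temp-buffer
    (org-mode)
    (insert "#+begin_src sh :results output\necho hello\n#+end_src\n")
    ;; Put point on the block so Babel can find it.
    (goto-char (point-min))
    ;; Skip the interactive confirmation prompt for this test only.
    (let ((org-confirm-babel-evaluate nil))
      (should (string-match-p "hello" (org-babel-execute-src-block))))))
#+end_src
Running =M-x ert RET my-ob-shell/echo-output RET= executes it. Similar tests for the shebang and cmdline paths would follow the same shape, with the header arguments changed.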
* 2D-array
I documented two options for the =:var= header[fn:1]. The ob-shell.el code actually defines a third option for 2D-arrays. I couldn't get it to work. It was one of several things not documented anywhere, at least as far as I could find, and which I had to figure out straight from the code. Between not being great at shell scripting and having a hard time deciphering that ob-shell.el code, I'm not sure 2D-arrays are actually fully or correctly implemented.
* M-up behavior <<M-up>>
The =org-babel-load-session:shell= code path only works when M-up is used on a code block[fn:2]. Is M-up actually documented anywhere? Furthermore, =org-babel-load-session:shell= only works for the "shell" language, which is not actually a "proper" shell (i.e. it's not listed in =org-babel-shell-names=). M-up defaults to using $ESHELL or shell-file-name through the =shell= command.
For example, try calling M-up on these:

#+comment: (opaquely) calls the system shell
#+begin_src shell :session my-shell
echo "hello, world!"
#+end_src

#+comment: fails
#+begin_src sh :session my-sh
echo "hello, world!"
#+end_src

#+comment: fails
#+begin_src bash :session my-bash
echo "hello, world!"
#+end_src

To fix this, there needs to be an =org-babel-load-session:<lang>= for each language in =org-babel-shell-names=. This would probably make the most sense in =org-babel-shell-initialize=. However, that function [[org-babel-shell-initialize][needs attention]].
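An untested sketch of the idea, written as a standalone loop rather than as a change to =org-babel-shell-initialize=. It assumes that rebinding =shell-file-name= around the existing =org-babel-load-session:shell= is enough to select the right shell for the session; that is an assumption, not something verified against the session code:
#+begin_src emacs-lisp
(require 'ob-shell)

;; Define `org-babel-load-session:<lang>' for every entry of
;; `org-babel-shell-names' by delegating to the "shell" version.
(dolist (name org-babel-shell-names)
  (defalias (intern (concat "org-babel-load-session:" name))
    ;; Backquoted so NAME is spliced in as a literal string and the
    ;; definition does not depend on lexical binding.
    `(lambda (session body params)
       ,(format "Load BODY into a %s SESSION." name)
       ;; Assumption: binding `shell-file-name' picks the shell.
       (let ((shell-file-name ,name))
         (org-babel-load-session:shell session body params)))))
#+end_src
With the default =org-babel-shell-names=, this makes symbols such as =org-babel-load-session:sh= and =org-babel-load-session:bash= fbound, which is what M-up looks for before loading a block into a session.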
* Refactoring <<refactoring>>
The ob-shell.el code appears inconsistent to me. Clearly, this is somewhat subjective. I've tried to give a rationale for each point to make it less so. My goal is to be maintainer of ob-shell.el, but that's not forever. These things were an obstacle for me and my aim is to remove them for the next person.
** =org-babel-shell-initialize= <<org-babel-shell-initialize>>
As alluded to elsewhere, =org-babel-shell-initialize= doesn't appear to adhere to the (elisp) Coding Conventions,
#+begin_quote
• Constructs that define a function or variable should be macros, not functions, and their names should start with ‘define-’. The macro should receive the name to be defined as the first argument. That will help various tools find the definition automatically. Avoid constructing the names in the macro itself, since that would confuse these tools.
#+end_quote
The =org-babel-shell-initialize= function defines =org-babel-execute:<lang>=, =org-babel-variable-assignments:<lang>=, and =org-babel-default-header-args:<lang>= for the "languages" in =org-babel-shell-names=. As it stands, the fact that =org-babel-shell-initialize= is a function does no harm (aside from being confusing by way of straying from convention). However, if the [[M-up][M-up]] issue is to be resolved, it seems to me that =org-babel-shell-initialize= should be updated to match the convention (i.e. be a macro).
** Organization
Definitions are introduced in different orders. For example, the =org-babel-shell-initialize= function, whose sole purpose is to work with =org-babel-shell-names=, is defined before the reader knows what =org-babel-shell-names= is. Later, this pattern of defining the component pieces after they're used is reversed. For example, =org-babel-load-session:shell= relies on =org-babel-prep-session:shell=, which is defined first. I find defining terms before they're used makes a document easier to comprehend than otherwise. I want to reorder the definitions.
Similarly, some functions could use breaking out. The most important is probably =org-babel-sh-evaluate=, which handles the various header arguments. There are various paths of execution, each requiring separate logic, yet all live in the one large function. Breaking these out would allow them to have separate docstrings and, I expect, be easier to understand since the logic of one would be (lexically) separated from the rest.
Other functionality might be better served by consolidating functions. There's a lot of fiddly code dedicated to variable assignment. Actually, much of the ob-shell.el file is related to variable assignment. The assignments are done using separate functions, yet all apply to the same task. They'll never be used for anything else. Do they need to be split out? Is there a technical reason? I don't see one. Does it help comprehension? When split out as they are, I found it hard to make sense of the context they're used in since they're defined apart from the logic that uses them (i.e. what header uses them, what form does the header take, etc.). I think it's worth seeing if there's a better way to present that part of the code base.
** Naming
The following apply to all shells, not just "sh", and should be updated to be "shell". This looks like cruft from when ob-shell.el was called ob-sh.el AFAICT.
- =org-babel-sh-evaluate=
- =org-babel-sh-eoe-indicator=
- =org-babel-sh-eoe-output=
- =org-babel--variable-assignments:sh-generic=
- =org-babel-sh-var-to-sh=
- =org-babel-sh-var-to-string=
- =org-babel-sh-initiate-session=
- =org-babel-sh-strip-weird-long-prompt=
Generally speaking, I find the Org Babel code base tricky to read (especially at first). I spent a good deal of time untangling what lived where and who did what. I can play along fine now that I'm familiar. However, since understanding took longer than I think was necessary, I want to detail the pain points as they have made contributing to Babel harder.
Overall, Babel somewhat breaks the (elisp) Coding Conventions for naming,
#+begin_quote
You should choose a short word to distinguish your program from other Lisp programs.
#+end_quote
I understand the variable/function name prefix to be the file name, typically. The file name is often the package name, or more precisely the feature provided by the file[fn:3]. For Org Babel, there's not a solid file-to-prefix relation. We say "Org Babel", but the main functionality is in ob-core and the various "ob-" files either extend or implement implied behavior (e.g. =org-babel-<lang>-execute=). Is the "program" ob-core, ob-lang, or the whole suite of files? This is a subjective question which the Org Babel "program" answers with, "the whole suite of files". All components, across all "ob-" files, bear the name "org-babel-". This is still something that trips me up: is the current symbol core or not? Who is responsible for what?
I would expect the core API to have its own prefix. The extensions would then define their code and have a different prefix, "ob-<lang>-". This way, readers/contributors could open the pertinent ob-* file, see the expected symbol prefix (e.g. "ob-shell-") and another prefix (e.g. "org-babel-") and be able to distinguish which is which. As it stands, ob-core.el could be renamed to org-babel.el or the "org-babel-" prefix could be changed to "ob-core-".
Another possible solution, or a stopgap, would be to have a document detailing the Org Babel API[fn:4].
* Process interaction
Emacs has several different ways of interacting with processes. The ob-shell.el code uses a few of them. Since async is another way to interact with a process, a single process pattern could be used. The goal would be to make each of the different functionalities provided by ob-shell.el have a similar implementation. The expectation is that this would benefit maintenance.
* Footnotes
[fn:1] https://orgmode.org/worg/org-contrib/babel/languages/ob-doc-shell.html#orgfa6b7c5
[fn:2] M-up is bound to =org-metaup-hook= and =ob-core:org-babel-load-in-session= by default.
[fn:3] It's not clear to me if there's a technical definition for an Emacs package.
[fn:4] I may extend my personal notes into a document detailing the Org API. http://excalamus.com/2021-11-03-org_babel.html



--
Thomas S. Dye
https://tsdye.online/tsdye
