I don’t have a solution, but I completely agree with the problem description.

 

I guess one small step would be for package authors to follow the patterns 
in Base, if there are any. 

 

From: julia-users@googlegroups.com [mailto:julia-users@googlegroups.com] On 
Behalf Of Gabriel Gellner
Sent: Friday, October 7, 2016 8:36 AM
To: julia-users <julia-users@googlegroups.com>
Subject: [julia-users] Julia and the Tower of Babel

 

Something that I have been noticing, as I convert more of my research code over 
to Julia, is how the super easy to use package manager (which I love), coupled 
with the talent base of the Julia community seems to have a detrimental effect 
on the API consistency of the many “micro” packages that cover what I would 
consider the de-facto standard library.

What I mean is that a commercial package like Matlab or Mathematica, being 
written under one large umbrella, will largely (clearly not always) choose 
consistent names for similar API keyword arguments, and have similar calling 
conventions for master-function-like tools (`optimize` versus `lbfgs`, etc.). 
I am starting to realize that this is one of the great selling points of 
these packages for an end user. I can usually guess what a keyword will be in 
Mathematica, whereas even after a year of using Julia almost exclusively I find 
I have to look at the documentation (or the source code, depending on the 
documentation ...) to figure out the keyword names in many common packages.

Similarly, in my experience with open source tools, the complexity of package 
management means we get large “batteries included” distributions that cover a 
lot of the standard stuff for doing science, like Python’s numpy + scipy 
combination. In Julia, by contrast, the equivalent of scipy is split over many 
separately developed packages (Base, Optim.jl, NLopt.jl, Roots.jl, NLsolve.jl, 
ODE.jl/DifferentialEquations.jl). Many of these packages are stupid awesome, 
but they can have dramatically different naming conventions and calling 
behavior for essentially equivalent functionality. Recently I noticed that 
tolerances, for example, are named `atol/rtol` versus `abstol/reltol` versus 
`abs_tol/rel_tol`, which means it is extremely easy to have a piece of 
scientific code that needs to use all three conventions across different calls 
to seemingly similar libraries. 
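
To make the tolerance example concrete, here is a minimal sketch of the kind 
of shim user code ends up needing. It assumes Julia 1.x; the alias table, 
`normalize_tols`, and `solve_stub` are hypothetical names invented for 
illustration and are not part of any of the packages mentioned above.

```julia
# Hypothetical alias table: maps each spelling to one canonical keyword name.
const TOL_ALIASES = Dict(
    :atol => :abstol, :abs_tol => :abstol, :abstol => :abstol,
    :rtol => :reltol, :rel_tol => :reltol, :reltol => :reltol,
)

# Rewrite keyword names to the canonical spelling.
# Keywords that are not in the alias table pass through unchanged.
function normalize_tols(; kwargs...)
    out = Dict{Symbol,Any}()
    for (name, value) in kwargs
        canonical = get(TOL_ALIASES, name, name)
        haskey(out, canonical) && error("tolerance $canonical given more than once")
        out[canonical] = value
    end
    return out
end

# Hypothetical solver that only understands the canonical names.
solve_stub(; abstol=1e-8, reltol=1e-6) = (abstol, reltol)

# Mixed spellings from two different conventions end up in the same place.
solve_stub(; normalize_tols(atol=1e-10, rel_tol=1e-4)...)
```

A shim like this only papers over the problem on the caller's side; it does 
not make the underlying packages agree with each other.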

Having brought this up, I find that the community is largely sympathetic and, 
in general, would support a common convention. The issue, I have slowly 
realized, is that it is rarely that straightforward. In the above example, 
abstol/reltol versus abs_tol/rel_tol seems like an easy case of something that 
can be tidied up, but the underscored names are consistent with the naming 
convention Optim.jl uses for its other tolerances, so that community is 
reluctant to change. Similarly, I think there would be little interest in 
changing abstol/reltol to the underscored version in packages like Base, ODE.jl 
etc., as the current names feel consistent within each of these code bases. 
Hence I have started to think that the problem is the micro-packaging. It is 
much easier to look for consistency within a package than across similar 
packages, and since Julia distributes so many of the essential tools within 
very narrow boundaries of functionality, I am not sure that naming conventions 
will ever reach the consistency of something like SciPy, or the even higher 
standard of commercial packages like Matlab/Mathematica. (I am sure there are 
many more examples, like `maxiter` versus `iterations` for describing stopping 
criteria in iterative solvers ...)

Going even further, I have noticed that even when packages try to stay 
consistent with each other, for example Optim.jl <-> Roots.jl <-> NLsolve.jl, 
the consistency fractures quickly once one package changes how it does things 
(Optim.jl moving to delegation on types for method choice). We now have a 
common divide between type-based dispatch and `:method` symbol names across 
those packages (not to mention the whole in-place versus not-in-place question 
for function arguments …)
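
For readers who have not met the two styles, the divide can be sketched 
roughly as follows. This assumes Julia 1.x, and every name here 
(`SolverMethod`, `QuasiNewton`, `DerivativeFree`, `run_solver`, 
`METHOD_SYMBOLS`) is made up for illustration; none of it is the actual API of 
Optim.jl, Roots.jl, or NLsolve.jl.

```julia
# Hypothetical method types; not taken from any real package.
abstract type SolverMethod end
struct QuasiNewton <: SolverMethod end
struct DerivativeFree <: SolverMethod end

# Style 1: the algorithm is chosen by dispatching on the type of the argument.
run_solver(::QuasiNewton) = "quasi-Newton path"
run_solver(::DerivativeFree) = "derivative-free path"

# Style 2: the algorithm is chosen by a symbol that is looked up at run time.
const METHOD_SYMBOLS = Dict{Symbol,SolverMethod}(
    :quasi_newton => QuasiNewton(),
    :derivative_free => DerivativeFree(),
)

function run_solver(method::Symbol)
    haskey(METHOD_SYMBOLS, method) || throw(ArgumentError("unknown method: $method"))
    return run_solver(METHOD_SYMBOLS[method])
end

# Both calls select the same algorithm, but the call sites look different.
run_solver(QuasiNewton())
run_solver(:quasi_newton)
```

A lookup table like the one above lets the two styles coexist in one package, 
but a user moving between packages still has to remember which style each one 
expects.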

Do people with more experience in scientific package ecosystems feel this is 
solvable? Or do micro distributions just lead to many, many varying API 
conventions that need to be learned by end users? Is this common in 
communities that use C++ etc.? I ask because I wonder how much can be done 
about this kind of thing when making small packages is so easy.
