I don't really see how you can solve this without a single dictator who controls the package ecosystem. I'm not enough of an expert in Python to say how well things work there, but the R ecosystem is vastly less organized than the Julia ecosystem. Insofar as it's getting better, it's because the community has agreed to make Hadley Wickham their benevolent dictator.
--John

On Friday, October 7, 2016 at 8:35:46 AM UTC-7, Gabriel Gellner wrote:
>
> Something that I have been noticing, as I convert more of my research code over to Julia, is how the super easy to use package manager (which I love), coupled with the talent base of the Julia community, seems to have a detrimental effect on the API consistency of the many “micro” packages that cover what I would consider the de-facto standard library.
>
> What I mean is that a commercial package like Matlab/Mathematica etc., being written under one large umbrella, will largely (clearly not always) choose consistent names for similar API keyword arguments, and have similar calling conventions for master-function-like tools (`optimize` versus `lbfgs`, etc). I am starting to realize that this is one of the great selling points of these packages as an end user. I can usually guess what a keyword will be in Mathematica, whereas even after a year of using Julia almost exclusively I find I have to look at the documentation (or the source code, depending on the documentation ...) to figure out the keyword names in many common packages.
>
> Similarly, in my experience with open source tools, due to the complexity of the package management, we get large “batteries included” distributions that cover a lot of the standard stuff for doing science, like Python’s numpy + scipy combination. In Julia, by contrast, the equivalent of scipy is split over many separately developed packages (Base, Optim.jl, NLopt.jl, Roots.jl, NLsolve.jl, ODE.jl/DifferentialEquations.jl). Many of these packages are stupid awesome, but they can have dramatically different naming conventions and calling behavior for essentially equivalent behavior.
> Recently I noticed that tolerances, for example, are named `atol/rtol` versus `abstol/reltol` versus `abs_tol/rel_tol`, which means it is extremely easy to have a piece of scientific code that needs to use all three conventions across different calls to seemingly similar libraries.
>
> Having brought this up, I find that the community is largely sympathetic and, in general, would support a common convention. The issue, I have slowly realized, is that it is rarely that straightforward. In the above example, abstol/reltol versus abs_tol/rel_tol seems like an easy example of what can be tidied up, but the latter underscored name is consistent with similar naming conventions from Optim.jl for other tolerances, so that community is reluctant to change the convention. Similarly, I think there would be little interest in changing abstol/reltol to the underscored version in packages like Base, ODE.jl etc, as this feels consistent with each of these code bases. Hence I have started to think that the problem is the micro-packaging. It is much easier to look for consistency within a package than across similar packages, and since Julia seems to distribute so many of the essential tools in very narrow boundaries of functionality, I am not sure that this kind of naming convention will ever be able to reach something like a Scipy, or the even higher standard of commercial packages like Matlab/Mathematica. (I am sure there are many more examples, like using maxiter versus iterations for describing stopping criteria in iterative solvers ...)
>
> Even further, I have noticed that even when packages try to find consistency across packages, for example Optim.jl <-> Roots.jl <-> NLsolve.jl, when one package changes how they do things (Optim.jl moving to delegation on types for method choice), the consistency again fractures quickly. We now have a common divide between typed dispatch keywords and :method symbol names across the previous packages (not to mention the whole inplace versus not-inplace for function arguments …)
>
> Do people with more experience in scientific package ecosystems feel this is solvable? Or do micro distributions just lead to many, many varying degrees of API conventions that need to be learned by end users? Is this common in communities that use C++ etc? I ask as I wonder how much this kind of thing can be worried about when making small packages is so easy.
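
For what it is worth, the three tolerance spellings quoted above (`atol/rtol`, `abstol/reltol`, `abs_tol/rel_tol`) can be papered over on the user side with a thin adapter. The following is only a rough sketch in Python, not anything the packages above provide: `solve_a`, `solve_b`, `solve_c`, `TOLERANCE_ALIASES`, and `call_with_tolerances` are hypothetical names invented for illustration, and only the keyword spellings come from the thread.

```python
import inspect

# Hypothetical stand-ins for three separately developed solvers. Only the
# keyword spellings are taken from the thread; these are not real APIs.
def solve_a(f, x0, atol=1e-8, rtol=1e-6):
    return ("a", atol, rtol)

def solve_b(f, x0, abstol=1e-8, reltol=1e-6):
    return ("b", abstol, reltol)

def solve_c(f, x0, abs_tol=1e-8, rel_tol=1e-6):
    return ("c", abs_tol, rel_tol)

# Map every known spelling to one canonical meaning (assumed alias table).
TOLERANCE_ALIASES = {
    "atol": "abs", "abstol": "abs", "abs_tol": "abs",
    "rtol": "rel", "reltol": "rel", "rel_tol": "rel",
}

def call_with_tolerances(solver, *args, abs_tol, rel_tol, **kwargs):
    """Call `solver`, passing the tolerances under whichever names it declares."""
    canonical = {"abs": abs_tol, "rel": rel_tol}
    # Inspect the solver's signature and fill in only the spellings it accepts.
    translated = {
        name: canonical[TOLERANCE_ALIASES[name]]
        for name in inspect.signature(solver).parameters
        if name in TOLERANCE_ALIASES
    }
    return solver(*args, **translated, **kwargs)

# The caller writes one spelling, whichever solver sits underneath.
results = [
    call_with_tolerances(solver, abs, 1.0, abs_tol=1e-3, rel_tol=1e-2)
    for solver in (solve_a, solve_b, solve_c)
]
```

An adapter like this only hides the inconsistency from one user's code. It does nothing about the documentation, and it has to be updated whenever a package renames a keyword, which is arguably the maintenance burden the original post is worried about.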