Hi,

I'd like to solicit some opinions about a PR in scipy.interpolate. The PR
itself is https://github.com/scipy/scipy/pull/19633 --- it has several
moving parts; here on list I'd be most thankful for input on its
(relatively) high-level design. (Of course, reviews of internals in the PR
itself would be very welcome, too, if you've spare cycles!).

To briefly recap the current state in the main branch
---------------------------------------------------------------------

- Our current recommendation for N-dimensional interpolation on a grid is
the RegularGridInterpolator class.
- Its spline modes (method="cubic" etc) are very slow: they contain a
Python-level loop of size `num_data_points * num_dimensions` with a lot
of Python overhead on each iteration.
As a result, it is two to three orders of magnitude slower than the
now-deprecated interp2d et al (see e.g.
https://github.com/scipy/scipy/issues/18010).
- The PR, https://github.com/scipy/scipy/pull/19633, brings performance to
within a factor of 1-3 of the specialized 2D `interp2d` and
`*BivariateSpline`s without losing N-D generality, adds features, and allows
further enhancements. Under the hood, it pre-constructs an instance of an
NdBSpline and forwards the evaluations to it. Constructing the NdBSpline is
the finicky part.

The current state of the PR
------------------------------------

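Currently, https://github.com/scipy/scipy/pull/19633 puts the new spline
implementations under the existing mode names (e.g. "cubic") and keeps the
existing implementations available under names with a trailing underscore
(e.g. "cubic_" --- note the trailing underscore!).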

The existing methods are very quick to set up and slow to evaluate; the new
ones are fast to evaluate, but the construction is potentially slow and
might require some user control (there is large sparse linear algebra
involved, so a user may need to choose between a direct and an iterative
solver etc).
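
To make the direct-vs-iterative choice concrete, here is a minimal sketch
using only existing `scipy.sparse.linalg` calls. The tridiagonal matrix is a
made-up, well-conditioned stand-in and is not the collocation matrix the PR
actually builds; the sizes and tolerances are likewise just for illustration.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres, spsolve

n = 200
# Stand-in sparse system (an assumption): diagonally dominant tridiagonal matrix.
A = diags([1.0, 4.0, 1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# Direct solver: factorizes the matrix, no tolerances to tune.
x_direct = spsolve(A, b)

# Iterative solver: returns the solution and a status flag (0 means converged);
# tolerances such as `atol` are solver-specific knobs.
x_iter, info = gmres(A, b, atol=1e-6)
```

For a small, well-conditioned system like this one both approaches agree;
which one wins for a large N-D problem is exactly the kind of thing a user
may have to experiment with.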

Here's how it looks:

>>> from scipy.interpolate import RegularGridInterpolator as RGI
>>> RGI((x, y), values, method="cubic_")         # trailing underscore!

is the classic fast-to-construct, slow-to-evaluate method, and

>>> RGI((x, y), values, method="cubic")

is a new slow-to-construct, fast-to-evaluate method.

Additionally, the RGI constructor needs optional keyword arguments for
sparse linalg solvers --- the PR offers reasonable defaults, but they will
not cover all cases.
This way, call signatures can be

>>> RGI((x, y), values, method="cubic", solver=gmres, atol=1e-6)   # atol is for gmres

but this will raise (old methods do not accept solver kwargs):

>>> RGI((x, y), values, method="cubic_", solver=gmres, atol=1e-6)   # cubic_
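
A rough sketch of what such method-dependent argument checking could look
like, in plain Python. Everything here is hypothetical and is not the code in
the PR: the helper name `check_solver_args`, the two method sets, and the
choice of `ValueError` are all assumptions made for illustration.

```python
# Hypothetical method sets; only the two names mentioned above are listed.
LEGACY_METHODS = {"cubic_"}   # old implementations: no linear solve at construction
NEW_METHODS = {"cubic"}       # new implementations: accept solver options


def check_solver_args(method, solver=None, **solver_args):
    """Return (solver, solver_args), rejecting them for legacy methods."""
    if method not in LEGACY_METHODS | NEW_METHODS:
        raise ValueError(f"unknown method {method!r}")
    if method in LEGACY_METHODS and (solver is not None or solver_args):
        # Assumption: the exception type is a guess; the PR may raise differently.
        raise ValueError(f"method={method!r} does not accept solver arguments")
    return solver, solver_args
```

With this sketch, `check_solver_args("cubic", solver="gmres", atol=1e-6)`
passes the options through, while the same call with "cubic_" raises.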


What I'd like input on
------------------------------------------

The story with method="cubic" and "cubic_" and method-dependent kwargs is a
bit messy :-).

Possible options include:

1. Keep the PR as is.

2. Keep method="cubic" to be what it is in main, make new methods have
underscores, "cubic_" etc.

3. As above, plus invent better names for new methods (am open to
suggestions!)

4. Remove the old methods completely.

The backwards compat effect is not known: I cannot confidently guarantee
the behavior of the new methods in all cases. So I would prefer to keep the
old ones as fallbacks in some form.

5. Keep the RGI class intact, offer a parallel API for the new methods.

The most straightforward way would be to have a factory function,
`make_nd_spline` (a made-up name) to construct and return an NdBSpline
instance. This would parallel the 1D `make_interp_spline` factory.
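
For reference, this is the existing 1D pattern such a factory would mirror: a
function that does the (potentially expensive) construction once and returns
a spline object that is cheap to evaluate. The sample data below is made up;
`make_nd_spline` itself does not exist and is not shown.

```python
import numpy as np
from scipy.interpolate import make_interp_spline

# Made-up sample data on a 1D grid.
x = np.linspace(0.0, 2.0 * np.pi, 25)
y = np.sin(x)

# Construction: solves a linear system for the B-spline coefficients once.
spl = make_interp_spline(x, y, k=3)

# Evaluation: the returned BSpline object is called like a function.
x_new = np.linspace(0.0, 2.0 * np.pi, 200)
y_new = spl(x_new)
```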

My preference would be option 1.

Pros:
- improves performance in common cases with no user intervention;
- offers multiple knobs to tweak things for use cases where defaults are
not adequate
- does not bring new names to an already large API surface.

Cons:
- not fully backwards compatible: some users might need to change their
code to either keep strict backwards compat (add an underscore to the method
name) or experiment with the solver and solver arguments.

So, I'd greatly appreciate inputs on these possible options --- or if
you've other alternatives, great, I'm all ears!

If you're an RGI user, test-driving
https://github.com/scipy/scipy/pull/19633 would be extremely valuable, too!

Cheers,

Evgeni