On 24Sep2014 00:48, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> 
wrote:
I have a certain calculation which can be performed two radically different
ways. With the first algorithm, let's call it SHORT, performance is very
fast for small values of the argument, but terrible for large values. For
the second algorithm, LARGE, performance is quite poor for small values,
but excellent for large values.

To add to the complexity, I perform this calculation repeatedly, in a loop:

value = 1
while True:
   x = SHORT(value)  # Or should I use LARGE? Decisions, decisions...
   process(x)
   value += some_increment()
   if condition: break

Luckily, the value never gets smaller. So if I could somehow determine a
cut-over point, CUTOFF, I might write my loop like this:

value = 1
while True:
   f = SHORT if value < CUTOFF else LARGE
   x = f(value)
   process(x)
   value += some_increment()
   if condition: break

alas, the CUTOVER point is likely to be machine-dependent. Take it as a
given that inserting a fixed CUTOVER point into the source code (say,
``CUTOVER = 123456``) is not likely to be very effective, and dynamically
calculating it at import time is impractical.

I have two suggestions. The first suggestion presumes multiple CPUs, and is a bit of a brute force hack. If the total runtime domain for SHORT (total runtime for when SHORT is better) is small compared to the domain for LARGE it may be worthwhile.

FIRST SUGGESTION

Run SHORT and LARGE in subprocesses, flat out. When either stops, kill the other. Once LARGE is the first to stop, do not bother with SHORT any more.

FIRST SUGGESTION

Chris Angelico's approach looks pretty cool, but seems unstable (in choices) initially.

Are the input values predictable? Or a fixed sequences?

I am wondering if there's any scope for gathering all the values (or all the values up to comfortably past the unknown cutover point) before running SHORT or LARGE at all?

I was thinking bisecting to find the CUTOFF, but the cost of the wrong algorithm for values well over the CUTOFF would be nasty.

Alternatively:

Did you say SHORT becomes quadratic in cost as it grows and LARGE roughtly linear? Pretend they really are purely quadratic and linear respectively.

Then you could start with both SHORT and LARGE for your first value.
Run SHORT and LARGE. Measure the elapsed time for each. Compute their ratio.

I think you should be able to estimate the cutover point as the point where the two curves intersect. Run the SHORT algorithm until them. Then run both. Remeasure. Re-estimate. Repeat until definitely LARGE domain (estimate points into the past, maybe with an error margin). Then just use LARGE.

Thoughts?

Cheers,
Cameron Simpson <c...@zip.com.au>
--
https://mail.python.org/mailman/listinfo/python-list

Reply via email to