Hi *,
I work with *huge* (sometimes 100+ MB per workbook) calculation models in Excel 
through POI, and I was looking for a way to optimize calculation performance. 

It turns out that if any Excel ref/numeric/name/etc. error occurs while 
collecting operands or during evaluation, a corresponding EvaluationException 
is thrown to exit early. I would do the same; it makes perfect sense.

However, the cost of filling in the stack trace is overwhelming. 
Here's a simple demo case:

@Test
public void throwDemo() {
    Function function = AggregateFunction.SUM;
    // Four valid operands followed by one #REF! error operand.
    ValueEval[] args = new ValueEval[]{NumberEval.ZERO, NumberEval.ZERO,
            NumberEval.ZERO, NumberEval.ZERO, ErrorEval.REF_INVALID};

    int N = 1_000_000;

    // System.nanoTime() returns a long; keep it as such until the final division.
    long start = System.nanoTime();
    for (int i = 0; i < N; i++) {
        function.evaluate(args, 0, 0);
    }
    long stop = System.nanoTime();
    double seconds = (stop - start) / 1.0e9;

    System.out.printf("Cycle time: %.3f s, throughput: %.1f evals/s%n",
            seconds, N / seconds);
}

On my Core-i5 2500K it yields cycle time ~2.750 s, throughput ~370K evals/s.

Now, if I disable stack trace filling in the EvaluationException constructor:
public EvaluationException(ErrorEval errorEval) {
    super(errorEval.getErrorString(), null, false, false);
    _errorEval = errorEval;
}...

the test now yields cycle time ~0.44 s, throughput ~2.3 *M* evals/s, which is 
more than 6 times higher.
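
To show the mechanism outside of POI, here is a minimal standalone sketch. 
The class names are made up for illustration and are not POI classes; the 
only real API used is the protected four-argument constructor of 
java.lang.Exception, whose last argument (writableStackTrace) controls 
whether the stack trace is captured at all.

```java
public class StacklessExceptionDemo {

    /** Hypothetical stand-in for an exception with a normally filled stack trace. */
    static class TracedException extends Exception {
        TracedException(String message) {
            super(message);
        }
    }

    /** Same kind of exception, but created with writableStackTrace = false. */
    static class StacklessException extends Exception {
        StacklessException(String message) {
            // cause = null, enableSuppression = false, writableStackTrace = false
            super(message, null, false, false);
        }
    }

    /** Number of stack frames recorded in the given exception. */
    static int traceDepth(Exception e) {
        return e.getStackTrace().length;
    }

    public static void main(String[] args) {
        System.out.println("traced frames:    " + traceDepth(new TracedException("#REF!")));
        System.out.println("stackless frames: " + traceDepth(new StacklessException("#REF!")));
    }
}
```

The stackless variant reports zero frames, because the JVM never walks the 
stack when the exception is constructed; that walk is the cost being saved.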

One of my models used to take about 15 seconds to evaluate, now it's about 6.5 
seconds.

I understand that an exception with no stack trace is not a nice thing to 
debug, but as it's a checked exception its scope is quite limited, so 
hopefully the speedup outweighs the missing stack trace.
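
If the missing stack trace turns out to be a problem for debugging, one 
possible compromise is to make the behaviour switchable. The following is 
only a sketch under assumptions: the property name and the class are 
hypothetical, nothing like this exists in POI, and it is not part of the 
proposed change.

```java
public class ToggleableTraceDemo {

    /** Hypothetical system property name, invented for this sketch. */
    static final String TRACE_PROPERTY = "demo.evaluation.stacktrace";

    /** Hypothetical stand-in for EvaluationException. */
    static class DemoEvaluationException extends Exception {
        DemoEvaluationException(String message, boolean withStackTrace) {
            // No cause, suppression disabled; the stack trace is filled only on request.
            super(message, null, false, withStackTrace);
        }
    }

    /** Reads the property on every call so that it can be flipped at runtime. */
    static DemoEvaluationException create(String message) {
        return new DemoEvaluationException(message, Boolean.getBoolean(TRACE_PROPERTY));
    }

    public static void main(String[] args) {
        System.setProperty(TRACE_PROPERTY, "false");
        System.out.println("fast mode frames:  " + create("#REF!").getStackTrace().length);

        System.setProperty(TRACE_PROPERTY, "true");
        System.out.println("debug mode frames: " + create("#REF!").getStackTrace().length);
    }
}
```

Reading the property on each call keeps the sketch short; a real 
implementation would more likely cache the flag in a static final field so 
that the lookup itself does not show up in the hot path.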

The proposed change doesn't break any tests.

Looking forward to hearing your comments,

Vladislav

