Gnumeric parsing efficiency

Kenneth Dakin Tue, 17 Nov 2009 09:18:34 -0800

As a dedicated program efficiency enthusiast, I take exception to examples of 
code that could be improved significantly with almost no effort. I am 'of the 
school' that believes wasting CPU cycles is not at all clever when they can be 
used for some other processing in the same program or to simply reduce the need 
for more or faster processors.


For quite some time now, I have noticed that the Switch statement is used 
extensively in Gnumeric code (and presumably in other Gnome software). This is 
fine - up to a point.

The prevailing view of high level language programmers is that the switch 
statement, as well as being a concise method of multiway branching, is very 
efficient and usually optimized by the C compiler. This however is only true 
for 'small' ranges of values (eg 0-9 etc with few gaps). 
By looking at various examples of distributed Gnumeric source code, I noticed 
that it frequently 'parses' for combinations of characters such as " *,/,+,-, 
(,)" etc. using Switch.

I believe that the C compiler will generally simply produce a long sequence of 
compare and branch instructions for these cases (optimized or not). I am not 
familiar with the Gnumeric C compiler and maybe it takes these as special cases 
to always generate a 'branch table' regardless of the apparent gaps - or maybe 
not!

As a programmer with around 40 years experience, mainly in test/debugging and 
performance tools, I can assure you that it is entirely possible (and extremely 
easy) to mechanically apply a simple but effective global source translation to 
all the Gnumeric code to optimize the switch statements to operate in constant 
time - irrespective of the number of characters tested for in the switch 
statement. This can be accomplished in around 5 machine instructions (or about 
the same or less in C statements, I estimate).

Because of the mechanical nature of these changes and the sheer simplicity of 
the transformation, no new bugs can be introduced by manual coding errors, the 
resultant code will be, on average (assuming random spread of input values), up 
to 41 times faster and the binary size also reduced significantly.

In practice, a smaller improvement than this will be nevertheless achievable 
but likely still  very worthwhile.

To test this suggestion beforehand is extremely easy. 

1. Simply code up a stand-alone switch statement with say 10-20 different input 
values and execute it say 10,000,000 times while timing the test. 
2. Run the translation (in this case manually change the switch statement with 
a few lines of my supplied code!)  and 
3. test again.

Simple !

Please let me know if :-

a) your gnumeric compiler always produces an efficient branch table in these 
circumstances (and therefore you are not interested!) or
b) you are interested in my suggestion and you will email a sample switch 
statement by return  for me to make the simple transformation and send back for 
you to try out yourselves.
c) you are not interested in improving efficiency, especially from outsiders!
d) you think it may impact existing code reliability (I can convince you of the 
opposite)
e) you are too busy, however much better the code will become

(P.S. I have had limited experience with gnumeric when, about 24 months ago)  I 
discovered a very significant and undiscovered basic bug (a cut/paste error) 
while using Editgrid and afterwards got it fixed pretty quickly by the 
Editgrid/gnumeric team).

Best Regards
Ken Dakin

_______________________________________________
gnumeric-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/gnumeric-list

Gnumeric parsing efficiency

Reply via email to