For months I have been wanting to experiment with DSNLEXER::findToken() and see 
if there
were faster techniques than the binary search of C strings that I initially 
used.  I had a
hunch that a hashtable might be faster.

After bringing in boost unordered_map< std::string, int > and instrumenting 
files.cpp to
tell how many microseconds a PLUGIN::Load() would take, the results were 
surprising to me.
The moral of this story is never guess what you can measure, unless you are 
willing to be
wrong.  I was able to measure before and after results.

The hashtable unordered_map< std::string ..> was slower than the bsearch().  
Then I
swapped out the hash function with a custom one.  This was nearly identical in 
results.

About the time I was ready then to yank out the experimental code, i.e. give 
up, I said
"well why not try one last thing".  How about a tricked out hashtable that uses 
C strings
rather than std::string, and a hashfunction that works directly on C string and 
an
equality test that distills down to C's strcmp()?


The result was again surprising.  I bagged a 13% reduction/speedup in overall 
*.kicad_pcb
file loading times on larger files.  Note that this was a specialized hashtable 
that I
basically invented, you will probably not find this in books.  Because if you 
pass "const
char*" pointer to a boost hashtable, you basically get const char* (32 or 64 
bits) being
treated as a large integer, and no de-referencing is made to the actual C 
string.  Of
course the reason is that no storage is provided for the C string within the 
hashtable.

Yours truly was rewarded with a little experimentation and out of the box 
thinking.

I cannot image how much faster this technique is relative to 
DSNLEXER::findToken()
considered alone.  What I am reporting is the effect on all of board loading 
time, so you
know that findToken() has to be enormously much faster now than the 13% effect 
on over all
load time.


Regards,

Dick


P.S.

The bad news is that *.kicad_pcb file loading is still slower than legacy.  And 
it will
probably always be so.  But we are now within ball park, so don't let it bother 
you, it
won't bother me.  I gave it my best, and don't think there are any more low 
hanging fruit
to reap here.

A certain portion of this gain, say about 20% of the total gain, was due to 
giving up case
independent compare.  Now your keywords must match, you don't get case 
independence on
token matching.  (I forgot that was even in there, and I don't think any 
s-expression
files became dependent on upper case keywords, since we complain in CMake about 
them in
the grammar files.)

If it breaks anything, yell.



_______________________________________________
Mailing list: https://launchpad.net/~kicad-developers
Post to     : kicad-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kicad-developers
More help   : https://help.launchpad.net/ListHelp

Reply via email to