For months I have been wanting to experiment with DSNLEXER::findToken() and see if there were faster techniques than the binary search of C strings that I initially used. I had a hunch that a hashtable might be faster.
After bringing in boost unordered_map< std::string, int > and instrumenting files.cpp to tell how many microseconds a PLUGIN::Load() would take, the results were surprising to me. The moral of this story is never guess what you can measure, unless you are willing to be wrong. I was able to measure before and after results. The hashtable unordered_map< std::string ..> was slower than the bsearch(). Then I swapped out the hash function with a custom one. This was nearly identical in results. About the time I was ready then to yank out the experimental code, i.e. give up, I said "well why not try one last thing". How about a tricked out hashtable that uses C strings rather than std::string, and a hashfunction that works directly on C string and an equality test that distills down to C's strcmp()? The result was again surprising. I bagged a 13% reduction/speedup in overall *.kicad_pcb file loading times on larger files. Note that this was a specialized hashtable that I basically invented, you will probably not find this in books. Because if you pass "const char*" pointer to a boost hashtable, you basically get const char* (32 or 64 bits) being treated as a large integer, and no de-referencing is made to the actual C string. Of course the reason is that no storage is provided for the C string within the hashtable. Yours truly was rewarded with a little experimentation and out of the box thinking. I cannot image how much faster this technique is relative to DSNLEXER::findToken() considered alone. What I am reporting is the effect on all of board loading time, so you know that findToken() has to be enormously much faster now than the 13% effect on over all load time. Regards, Dick P.S. The bad news is that *.kicad_pcb file loading is still slower than legacy. And it will probably always be so. But we are now within ball park, so don't let it bother you, it won't bother me. I gave it my best, and don't think there are any more low hanging fruit to reap here. A certain portion of this gain, say about 20% of the total gain, was due to giving up case independent compare. Now your keywords must match, you don't get case independence on token matching. (I forgot that was even in there, and I don't think any s-expression files became dependent on upper case keywords, since we complain in CMake about them in the grammar files.) If it breaks anything, yell. _______________________________________________ Mailing list: https://launchpad.net/~kicad-developers Post to : kicad-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~kicad-developers More help : https://help.launchpad.net/ListHelp