On 2016-08-06 22:18, Pedro Giffuni wrote: > On 06/08/2016 15:13, Ed Schouten wrote: >> 2016-08-04 17:27 GMT+02:00 Pedro F. Giffuni <p...@freebsd.org>: >>> Log: >>> indent(1): Use bsearch() for looking up type keywords. >> You're never doing any deletions, right? Would it make more sense to >> use hsearch_r() in that case? >> > Indeed, good idea, although it may be an issue if portability to Windows > is desired.
I think I could have done better job describing the reasons behind that commit, because there are two of those. The main thing is that I wanted to get rid of arbitrary limit of 16384 "specials" (keywords, including typedef names) -- and the malloc/realloc/bsearch based implementation of looking up type keywords solved that, while being relatively simple and easy to follow. It was also very nice to notice a significant performance improvement since I would re-indent whole src/ directory of the PostgreSQL project many times while testing changes I made. But it's not that important after all, because PostgreSQL sources get re-indented very rarely. Having said that, I felt compelled to compare performance of bsearch() to that of hsearch(). First thing to say here is that I did the testing on a Linux+glibc platform, because 1) that's my main development platform, and 2) it's probably the only one where anybody re-indents hundreds of thousands of lines of code with indent(1) in one go, often, and multiple times a day -- and therefore can actually notice the difference. Secondly, while hsearch_r() being a GNU extension is an argument against using it, in this particular case there would be no advantage to be gained by preferring the re-entrant version, I believe -- which renders the lower portability argument moot. So what I tested was hsearch(). And my implementation of hsearch() based typedef name lookup (diff attached) wasn't significantly faster than bsearch(). Obviously I tested the glibc version and perhaps FreeBSD hsearch() is faster, but I don't expect it to be significantly faster. The most important conclusion, though, is that at least the glibc implementation imposes an arbitrary limit on how many items you can add to the data structure: The argument nel specifies the maximum number of entries in the table. (This maximum cannot be changed later, so choose it wisely.) Adding new items to the data structure (or searching them - I didn't investigate) actually _stopped working_ when approximately /nel/ number of elements was reached (I accidentally set it to 2000 while PostgreSQL has nearly 3000 typedef names). I don't know if FreeBSD's implementation does the same, but it's a show-stopper to me anyway, so personally I won't be pursuing this idea of replacing bsearch() with hsearch().
--- lexi.c 2016-08-07 10:10:54.115382621 +0200 +++ lexi.c.hcreate 2016-08-07 10:09:33.548394456 +0200 @@ -52,6 +52,7 @@ #include <ctype.h> #include <stdlib.h> #include <string.h> +#include <search.h> #include "indent_globs.h" #include "indent_codes.h" #include "indent.h" @@ -270,13 +271,14 @@ sizeof(specials[0]), strcmp_type); if (p == NULL) { /* not a special keyword... */ + ENTRY item; char *u; + item.key = s_token; + /* ... so maybe a type_t or a typedef */ if ((auto_typedefs && ((u = strrchr(s_token, '_')) != NULL) && - strcmp(u, "_t") == 0) || (typename_top >= 0 && - bsearch(s_token, typenames, typename_top + 1, - sizeof(typenames[0]), strcmp_type))) { + strcmp(u, "_t") == 0) || hsearch(item, FIND) != NULL) { ps.keyword = 4; /* a type name */ ps.last_u_d = true; goto found_typename; @@ -590,40 +592,15 @@ void alloc_typenames(void) { - - typenames = (const char **)malloc(sizeof(typenames[0]) * - (typename_count = 16)); - if (typenames == NULL) - err(1, NULL); + hcreate(10000); } void add_typename(const char *key) { - int comparison; - - if (typename_top + 1 >= typename_count) { - typenames = realloc((void *)typenames, - sizeof(typenames[0]) * (typename_count *= 2)); - if (typenames == NULL) - err(1, NULL); - } - if (typename_top == -1) - typenames[++typename_top] = key; - else if ((comparison = strcmp(key, typenames[typename_top])) >= 0) { - /* take advantage of sorted input */ - if (comparison != 0) /* remove duplicates */ - typenames[++typename_top] = key; - } - else { - int p; - - for (p = 0; (comparison = strcmp(key, typenames[p])) >= 0; p++) - /* find place for the new key */; - if (comparison == 0) /* remove duplicates */ - return; - memmove(&typenames[p + 1], &typenames[p], - sizeof(typenames[0]) * (++typename_top - p)); - typenames[p] = key; - } + ENTRY item; + + item.key = key; + item.data = NULL; + hsearch(item, ENTER); }
_______________________________________________ svn-src-all@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"