On 10/09/2012 12:52 PM, Jim Meyering wrote:> Eric Blake wrote: >> On 07/26/2011 02:33 PM, marcel partap wrote: >>> Here's a patch. Adds STRCASEEQ_LEN macro for case insensitive extension >>> matching. >>> #regards/marcel. >> >> Your patch would make the new behavior unconditional. But I like case >> sensitivity, and think that case insensitivity should be an opt-in >> process that I request, with coordination between dircolors to >> generate a new string for LS_COLORS to be honored by ls. Furthermore, >> the patch is lacking in NEWS, documentation, and testsuite coverage. >> >> Additionally, you should be aware that strncasecmp() has undefined >> behavior in non-C multibyte locales. It would probably be better to >> use c_strncasecmp(), so that you are guaranteed defined behavior >> regardless of the current locale. >> >> Would you care to tackle those additional issues? And are you set up >> for copyright assignment, since the patch will probably be non-trivial >> by that point in time? > > I saw this while going back through old bugs, so started poking around. > How about this: if a suffix is not matched, convert it to lower case > and check again. That way, anyone who cares to highlight .TaR or .TAR > differently from .tar can still easily do so, yet names ending with > uppercase .TAR, .JPG, .FLAC, etc. will now be highlighted by default. > > Does anyone think it's worth making this new fallback-to-case-insensitivity > an option?
Not me anyway. > While looking at that, I realized that comparing each name in an ls'd > directory with each of nearly 100 suffixes should leave nontrivial room > for improvement. Sure enough, once I'd replaced that linear search > with a hash-table lookup, I see a measurable improvement. > and taking best-of-10 times, I see a 0.34 -> 0.23 (~30%) improvement. Very nice. > Of course, I have deliberately used the case that shows the most improvement: > - a directory with very many files > - no suffix match, so the old code searches all suffixes > - using tmpfs: minimal stat overhead > @@ -4343,21 +4412,11 @@ print_color_indicator (const struct fileinfo *f, bool symlink_target) > } > > /* Check the file's suffix only if still classified as C_FILE. */ > - ext = NULL; > - if (type == C_FILE) > - { > - /* Test if NAME has a recognized suffix. */ > - > - len = strlen (name); > - name += len; /* Pointer to final \0. */ > - for (ext = color_ext_list; ext != NULL; ext = ext->next) > - { > - if (ext->ext.len <= len > - && STREQ_LEN (name - ext->ext.len, ext->ext.string, > - ext->ext.len)) > - break; > - } > - } > + char *suffix; > + struct sufco *ext; /* Color extension */ > + ext = (type == C_FILE && (suffix = strrchr (name, '.')) > + ? suffix_color_lookup (suffix) > + : NULL); So do we now only support suffixes delimited by '.' ? Previously the delimiter was arbitrary or optional: touch star; LS_COLORS="*tar=01;31" /bin/ls --color *tar cheers, Pádraig.