Hi Florian,

"pelzflorian (Florian Pelz)" <pelzflor...@pelzflorian.de> skribis:

> On Mon, Mar 09, 2020 at 06:02:40PM +0100, Ludovic Courtès wrote:
>> To me it’s not a bug in Guile, but simply the fact that regexps, as
>> implemented by the C library, are locale-dependent.
>>
>
> (use-modules (ice-9 regex))
> (regexp-exec (make-regexp "^([a-z]+)$")
>              "iyiyim")
> ⇒ #f
>
> Guile’s behavior that i is not among [a-z] has been confirmed as
> unexpected by a natively Turkish friend of mine.  It is different from
> the behavior of current glibc:
>
> florian@florianmacbook ~$ cat iyiyim.c
> #include <regex.h>
> #include <stdio.h>
> #include <stdlib.h>
> #define STR "iyiyım"
> int main (int argc,
>           char** argv)
> {

You’re seeing a different behavior because you forgot a:

  setlocale (LC_ALL, "");

call here.

>> The patch you proposed looks good to me, though perhaps we could
>> explicitly list all the alphabet in the regexp?
>>
>> A better option is to reimplement ‘store-path-package-name’ in a way
>> similar to ‘store-path-hash-part’, as in commit
>> 35eb77b09d957019b2437e7681bd88013d67d3cd.
>
> I suppose it would be better to cache the compiled regexp.  What is
> this mcached syntax inside (guix store)?  Or do I use Scheme’s 'delay'
> and 'force' for caching?

I lean towards avoiding regexps altogether, as I wrote above.  WDYT?

> The attached patch fixes the regexp.  Shall I push the attached patch
> and then try making it cache the compiled regexp or do you still
> prefer an implementation without regexps?  Why would not using a
> regexp be better?

It reduces reliance on libc, reduces complexity, and performs better,
as noted in the commit log of
35eb77b09d957019b2437e7681bd88013d67d3cd.

Thanks,
Ludo’.
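
A minimal sketch of the locale point above, assuming a POSIX system
with <regex.h>.  It is not the program from the thread: the helper name
‘matches_in_locale’ is hypothetical, and it only shows that the result
of regexec depends on which locale was selected with setlocale before
the regexp was compiled.

```c
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if STR matches the extended regexp PATTERN under LOCALE_NAME,
   0 if it does not match, and -1 if the locale is unavailable or the
   pattern fails to compile.  Passing "" as LOCALE_NAME selects the
   locale from the environment, like 'setlocale (LC_ALL, "")'.
   Note: the process-wide locale is left changed on return.
   (Hypothetical helper, for illustration only.)  */
int
matches_in_locale (const char *locale_name, const char *pattern,
                   const char *str)
{
  regex_t re;
  int result;

  /* Without this call a C program runs in the "C" locale, whatever
     LANG or LC_ALL say in the environment.  */
  if (setlocale (LC_ALL, locale_name) == NULL)
    return -1;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;

  result = regexec (&re, str, 0, NULL, 0) == 0 ? 1 : 0;
  regfree (&re);
  return result;
}
```

Calling it with "C" and then with "" (or an installed locale name such
as "tr_TR.UTF-8") on the same pattern and string lets one compare the
two results directly; a return value of -1 for a named locale usually
means that locale is not installed.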