dsanders created this revision.
dsanders added reviewers: mclow.lists, hans.
dsanders added a subscriber: cfe-commits.

On glibc, the bits used for the various character classes are endian-dependent
(see _ISbit() in ctype.h), but __regex_word does not account for this and uses
a bit that is spare only on little-endian. On big-endian, it overlaps with the
bit for graphic characters, which causes '-', '@', etc. to be considered word
characters.

Fixed this by defining the value using _ISbit(15) on glibc systems.
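
To illustrate the overlap, here is a minimal sketch that reimplements both
byte-order variants of _ISbit() as constexpr functions so they can be compared
on any host. The function and constant names are hypothetical and are not part
of glibc or libc++; the bit formulas and the position of the graphic class
(bit 7) follow my reading of glibc's ctype.h and should be treated as an
assumption.

```cpp
// Assumed shape of glibc's _ISbit() on a big-endian target.
constexpr unsigned isbit_big_endian(unsigned bit) {
    return 1u << bit;
}

// Assumed shape of glibc's _ISbit() on a little-endian target:
// the two bytes of the 16-bit mask are swapped.
constexpr unsigned isbit_little_endian(unsigned bit) {
    return bit < 8 ? (1u << bit) << 8 : (1u << bit) >> 8;
}

// Bit index assumed for the graphic class (_ISgraph == _ISbit(7)).
constexpr unsigned graph_bit = 7;

// The previously hard-coded value of __regex_word.
constexpr unsigned old_regex_word = 0x80;

// Little-endian: 0x80 equals _ISbit(15) and differs from the graphic bit.
static_assert(isbit_little_endian(15) == old_regex_word,
              "0x80 is _ISbit(15) on little-endian");
static_assert(isbit_little_endian(graph_bit) != old_regex_word,
              "no clash with the graphic bit on little-endian");

// Big-endian: 0x80 is exactly the graphic bit, which is the reported bug.
static_assert(isbit_big_endian(graph_bit) == old_regex_word,
              "0x80 clashes with the graphic bit on big-endian");

// _ISbit(15) avoids the graphic bit on both byte orders.
static_assert(isbit_big_endian(15) != isbit_big_endian(graph_bit),
              "_ISbit(15) is distinct from the graphic bit on big-endian");
```

All checks run at compile time, so compiling the snippet as a translation unit
(for example with -std=c++11 -c) is enough to confirm them.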

Fixes PR26476.

http://reviews.llvm.org/D17132

Files:
  include/regex

Index: include/regex
===================================================================
--- include/regex
+++ include/regex
@@ -976,7 +976,12 @@
     typedef locale                  locale_type;
     typedef ctype_base::mask        char_class_type;
 
+#if defined(__GLIBC__)
+    static const char_class_type __regex_word = static_cast<char_class_type>(_ISbit(15));
+#else
     static const char_class_type __regex_word = 0x80;
+#endif
+
 private:
     locale __loc_;
     const ctype<char_type>* __ct_;


_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
