[issue1084] ''.find() gives wrong result in Python built with ICC
Changes by Simon Anders:

--
components: Build, Interpreter Core
severity: normal
status: open
title: ''.find() gives wrong result in Python built with ICC
versions: Python 2.5

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1084>
__
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue1084] ''.find() gives wrong result in Python built with ICC
New submission from Simon Anders:

I have just encountered a strange bug affecting Python 2.5.1 on x86_64 Linux, but only when it is compiled with the Intel C Compiler (ICC) 10.0, not when it is compiled with GCC. On my Intel-compiled build, which otherwise seems to work fine, ''.find() returns a wrong result. I have narrowed the issue down to this simple test case:

    "foo2/**bar**/".find ("/**bar**/")

Observe: On a GCC-compiled Python 2.5.1, the command works as expected and returns 4:

    [EMAIL PROTECTED] tmp]$ /usr/site/hc-2.6/python/gnu/2.5.1/bin/python2.5
    Python 2.5.1 (r251:54863, Aug 30 2007, 16:21:23)
    [GCC 3.4.6 20060404 (Red Hat 3.4.6-8)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> print "foo2/**bar**/".find ("/**bar**/")
    4

On my Python 2.5.1 installation, which was compiled from source with the Intel C Compiler (ICC) for Linux, version 10.0, '-1' is returned:

    [EMAIL PROTECTED] tmp]$ /usr/site/hc-2.6/python/intel/2.5.1/bin/python2.5
    Python 2.5.1 (r251:54863, Aug 30 2007, 16:20:06)
    [GCC Intel(R) C++ gcc 3.4 mode] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> print "foo2/**bar**/".find ("/**bar**/")
    -1

What could possibly have gone wrong here? Admittedly, this smacks more of a bug in icc than in Python, but I report it here because I am at a loss as to what else to do with it.

Obvious first question: Does anyone else out here have an ICC-compiled Python handy to check whether the bug reproduces elsewhere?

Any idea what kind of oddity I have stumbled over here? Obviously, it could simply be that something went wrong when compiling Python from source with ICC, but it should not happen that the interpreter nevertheless starts up and then fails only silently.

Additional information:

- I stumbled over the problem when trying to install NumPy 1.0.3.1: the build failed at the point where a script 'conv_template.py', which is part of NumPy's installation system, is started to do some pattern replacements in a file called 'scalartypes.inc.src'. My test case is reduced from this script.

- The system is the master node of a compute cluster with AMD Opteron CPUs. The cluster is not involved; everything was done on the master node. The box runs RedHat Enterprise Linux 4.0 Advanced Server. It replies to 'uname -a' with:

    Linux hc-ma.uibk.ac.at 2.6.9-42.0.10.ELsmp #1 SMP Fri Feb 16 17:13:42 EST 2007 x86_64 x86_64 x86_64 GNU/Linux

- The dynamic dependencies of the GCC-compiled and the ICC-compiled Python binaries are:

    [EMAIL PROTECTED] tmp]$ ldd /usr/site/hc-2.6/python/gnu/2.5.1/bin/python2.5
        libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x00370290)
        libdl.so.2 => /lib64/libdl.so.2 (0x003701d0)
        libutil.so.1 => /lib64/libutil.so.1 (0x00370390)
        libm.so.6 => /lib64/tls/libm.so.6 (0x003701b0)
        libc.so.6 => /lib64/tls/libc.so.6 (0x00370180)
        /lib64/ld-linux-x86-64.so.2 (0x00370160)

    [EMAIL PROTECTED] tmp]$ ldd /usr/site/hc-2.6/python/intel/2.5.1/bin/python2.5
        libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x00370290)
        libdl.so.2 => /lib64/libdl.so.2 (0x003701d0)
        libutil.so.1 => /lib64/libutil.so.1 (0x00370390)
        libimf.so => /usr/site/hc-2.6/intel/10.0/cc/lib/libimf.so (0x002a95579000)
        libm.so.6 => /lib64/tls/libm.so.6 (0x003701b0)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00370580)
        libc.so.6 => /lib64/tls/libc.so.6 (0x00370180)
        /lib64/ld-linux-x86-64.so.2 (0x00370160)

- The precise revision of Python is "Python 2.5.1 (r251:54863)".

- The test case no longer fails if the string is altered even slightly, e.g. if the word 'foo', the word 'bar', one of the asterisks or one of the slashes is cut out of both the search string and the target string.

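For anyone who wants to check another build, the reduced test case can be wrapped in a small self-check. This is only a sketch: the expected index 4 is the value from the GCC-compiled session above, and the names HAYSTACK, NEEDLE and find_is_correct are made up for illustration.

```python
# Hypothetical self-check, not part of the original report.
# The strings are the ones from the reduced test case; the expected
# index 4 is what the GCC-compiled interpreter returned.
HAYSTACK = "foo2/**bar**/"
NEEDLE = "/**bar**/"
EXPECTED = 4


def find_is_correct():
    """Return True if str.find() locates NEEDLE at the expected index."""
    return HAYSTACK.find(NEEDLE) == EXPECTED


if find_is_correct():
    print("find() OK")
else:
    print("find() returned %d, expected %d" % (HAYSTACK.find(NEEDLE), EXPECTED))
```

Running the file with each interpreter binary in turn should show whether a given build is affected.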
--
nosy: +sanders_muc
[issue1084] ''.find() gives wrong result in Python built with ICC
Simon Anders added the comment:

Martin, you are right: it is related to compiler optimization. I have boiled it down to a call of stringlib_find (defined in Python-2.5.1/Objects/stringlib/find.h), and this runs fine with 'icc -O2' but incorrectly with 'icc -O3'. (The test code is attached.)

So, it seems that the lesson is simply, once again: Do not use '-O3' with Intel's C compiler. (At least for me, it is not the first time that this has caused trouble.)

On the other hand, Python's ./configure script is quite clear in its preference for GCC anyway: it more or less ignores the option '--without-gcc' and uses the content of the CC environment variable only very occasionally.

    /* Configure stringlib for plain 8-bit strings, as stringobject.c does. */
    #define STRINGLIB_CHAR char
    #define STRINGLIB_CMP memcmp
    #define STRINGLIB_LEN PyString_GET_SIZE
    #define STRINGLIB_NEW PyString_FromStringAndSize
    #define STRINGLIB_STR PyString_AS_STRING
    #define STRINGLIB_EMPTY nullstring

    /* Python.h also pulls in <stdio.h> and <string.h>. */
    #include "/usr/site/hc-2.6/python/gnu/2.5.1/include/python2.5/Python.h"
    #include "../Python-2.5.1/Objects/stringlib/fastsearch.h"
    #include "../Python-2.5.1/Objects/stringlib/find.h"

    int main ()
    {
       STRINGLIB_CHAR* str = "foo2/**bar**/";
       Py_ssize_t str_len = strlen (str);
       STRINGLIB_CHAR* sub = "/**bar**/";
       Py_ssize_t sub_len = strlen (sub);
       Py_ssize_t offset = 0;
       Py_ssize_t res;

       res = stringlib_find(str, str_len, sub, sub_len, offset);

       /* Py_ssize_t is wider than int on x86_64, so cast before printing. */
       printf ("%ld\n", (long) res);
       return 0;
    }
[issue1084] ''.find() gives wrong result in Python built with ICC
Simon Anders added the comment:

Martin: I've boiled down the test case a bit more and removed all Python-specific types and macros, so that it can now be compiled stand-alone. (Updated test case 'findtest.c' attached.)

I didn't feel like diving into the code much deeper, and so I have sent it to Intel Premier Support as Issue #448807. Let's see if they bother to investigate it further.

    /*

    Testcase for problem with 'icc -O3':

    The function 'fastsearch' is taken from the source code of Python 2.5
    and looks for the substring 'p' (of length 'm') within the string 's'
    (of length 'n'). If 'mode' is 'FAST_COUNT' the number of occurrences
    of p in s is returned, and for 'FAST_SEARCH', the position of the
    first occurrence.

    For the specific values used in main() below, the function correctly
    returns '4' if compiled with at most optimization level '-O2', but
    '-1' for optimization level '-O3'.

    I have just changed the Python-specific types to standard ones;
    otherwise fastsearch() is as defined in file
    Objects/stringlib/fastsearch.h of the Python 2.5.1 source code.
    It has been written by Fredrik Lundh and is described in his blog
    here: http://effbot.org/zone/stringlib.htm

       Simon Anders, [EMAIL PROTECTED], 2007-09-02

    */

    #include <stdio.h>
    #include <string.h>

    #define FAST_COUNT 0
    #define FAST_SEARCH 1

    inline int
    fastsearch(const char* s, int n, const char* p, int m, int mode)
    {
        long mask;
        int skip, count = 0;
        int i, j, mlast, w;

        w = n - m;

        if (w < 0)
            return -1;

        /* look for special cases */
        if (m <= 1) {
            if (m <= 0)
                return -1;
            /* use special case for 1-character strings */
            if (mode == FAST_COUNT) {
                for (i = 0; i < n; i++)
                    if (s[i] == p[0])
                        count++;
                return count;
            } else {
                for (i = 0; i < n; i++)
                    if (s[i] == p[0])
                        return i;
            }
            return -1;
        }

        mlast = m - 1;

        /* create compressed boyer-moore delta 1 table */
        skip = mlast - 1;
        /* process pattern[:-1] */
        for (mask = i = 0; i < mlast; i++) {
            mask |= (1 << (p[i] & 0x1F));
            if (p[i] == p[mlast])
                skip = mlast - i - 1;
        }
        /* process pattern[-1] outside the loop */
        mask |= (1 << (p[mlast] & 0x1F));

        for (i = 0; i <= w; i++) {
            /* note: using mlast in the skip path slows things down on x86 */
            if (s[i+m-1] == p[m-1]) {
                /* candidate match */
                for (j = 0; j < mlast; j++)
                    if (s[i+j] != p[j])
                        break;
                if (j == mlast) {
                    /* got a match! */
                    if (mode != FAST_COUNT)
                        return i;
                    count++;
                    i = i + mlast;
                    continue;
                }
                /* miss: check if next character is part of pattern */
                if (!(mask & (1 << (s[i+m] & 0x1F))))
                    i = i + m;
                else
                    i = i + skip;
            } else {
                /* skip: check if next character is part of pattern */
                if (!(mask & (1 << (s[i+m] & 0x1F))))
                    i = i + m;
            }
        }

        if (mode != FAST_COUNT)
            return -1;
        return count;
    }

    int main ()
    {
       char* str = "foo2/**bar**/";
       int str_len = strlen (str);
       char* sub = "/**bar**/";
       int sub_len = strlen (sub);
       int offset = 0;
       int res;

       res = fastsearch (str, str_len, sub, sub_len, FAST_SEARCH);

       printf ("%d\n", res);
       return 0;
    }
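To see what the routine in the test case is expected to return without a C compiler at hand, the same algorithm can be transliterated to Python. The following is only an illustrative sketch, not code from the Python source tree: the helper names _bit and _may_be_in_pattern are made up, the lengths n and m are taken from len() instead of being passed in, and the C code's read of the terminating NUL at s[n] is emulated with "\0", which is an assumption about how the C strings are laid out.

```python
FAST_COUNT = 0
FAST_SEARCH = 1


def _bit(ch):
    """Bit used for ch in the 32-bit 'compressed delta 1' mask."""
    return 1 << (ord(ch) & 0x1F)


def _may_be_in_pattern(s, idx, mask):
    """True if s[idx] might occur in the pattern, judging by the mask.

    The C code reads the terminating NUL when idx == len(s); this
    sketch emulates that with "\0" (an assumption, see lead-in).
    """
    ch = s[idx] if idx < len(s) else "\0"
    return bool(mask & _bit(ch))


def fastsearch(s, p, mode=FAST_SEARCH):
    """Python transliteration of the C fastsearch() test case."""
    n, m = len(s), len(p)
    w = n - m
    if w < 0 or m == 0:
        return -1
    if m == 1:
        # special case for 1-character patterns
        hits = [i for i in range(n) if s[i] == p]
        if mode == FAST_COUNT:
            return len(hits)
        return hits[0] if hits else -1

    mlast = m - 1
    skip = mlast - 1
    mask = 0
    for i in range(mlast):          # process pattern[:-1]
        mask |= _bit(p[i])
        if p[i] == p[mlast]:
            skip = mlast - i - 1
    mask |= _bit(p[mlast])          # process pattern[-1]

    count = 0
    i = 0
    while i <= w:
        if s[i + mlast] == p[mlast]:
            # candidate match: compare the first mlast characters
            j = 0
            while j < mlast and s[i + j] == p[j]:
                j += 1
            if j == mlast:
                if mode != FAST_COUNT:
                    return i
                count += 1
                i += mlast
            elif not _may_be_in_pattern(s, i + m, mask):
                i += m
            else:
                i += skip
        elif not _may_be_in_pattern(s, i + m, mask):
            i += m
        i += 1                      # the C for-loop's i++
    return count if mode == FAST_COUNT else -1


print(fastsearch("foo2/**bar**/", "/**bar**/"))
```

For the strings used in main(), the last character of the pattern first lines up with a '/' in the haystack at offset 4, so that is the only candidate position the loop compares in full.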
[issue8158] documentation of 'optparse' module incomplete
New submission from Simon Anders:

The class optparse.OptionParser supports a number of useful keyword arguments to the initializer, which are not documented in the Python Standard Library documentation, here: http://docs.python.org/library/optparse.html

This is a bit unfortunate. For example, I wanted to add a description to the top of my script's help page and a copyright notice to the foot, and was already about to subclass OptionParser in order to override the format_help method, when I noticed that the optional keyword arguments 'description' and 'epilog' are provided for precisely this purpose. The 'epilog' attribute is at least mentioned in the class's docstring, while the 'description' argument is completely undocumented. I doubt that this was done on purpose.

I'd suggest going over the documentation page for optparse and filling in the missing bits; at minimum, list all keyword arguments to optparse.OptionParser.__init__.

--
assignee: georg.brandl
components: Documentation
messages: 101177
nosy: georg.brandl, sanders
severity: normal
status: open
title: documentation of 'optparse' module incomplete
versions: Python 2.6

___
Python tracker
<http://bugs.python.org/issue8158>
___
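A minimal sketch of the use case described in this report, using the 'description' and 'epilog' keyword arguments. The usage string, the '--verbose' option and all the texts are placeholders invented for illustration.

```python
from optparse import OptionParser

# All strings below are placeholders for illustration.
parser = OptionParser(
    usage="%prog [options] FILE",
    description="Hypothetical tool that counts the records in FILE.",
    epilog="Copyright notice goes here (placeholder).",
)
parser.add_option("-v", "--verbose", action="store_true", default=False,
                  help="print progress messages")

# format_help() returns the text that '--help' would print: the usage
# line, then the description, then the option list, then the epilog.
help_text = parser.format_help()
print(help_text)
```

With these two arguments there is no need to subclass OptionParser and override format_help just to add a header and a footer to the help page.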
[issue1084] ''.find() gives wrong result in Python built with ICC
Simon Anders added the comment:

Update to the story: After I submitted the bug report to Intel, they investigated and quickly confirmed it to be a compiler bug, which they then managed to fix.

I have just received an e-mail from Intel saying that the newest available version of ICC, namely version l_cc_c_10.1.008, contains the fix.

In principle the problem should vanish now, but I have not found the time to verify that.
[issue5704] Command line option '-3' should imply '-t'
New submission from Simon Anders:

The '-3' command line option in Python 2.6 is supposed to warn whenever encountering something that would throw an error in Python 3. Mixing of tabs and spaces has become illegal in Python 3. However, Python 2.6, called with '-3', passes silently over this unless '-t' was given, too.

Would it not be more consistent to let '-3' imply '-t'?

--
components: Interpreter Core
messages: 85581
nosy: sanders_muc
severity: normal
status: open
title: Command line option '-3' should imply '-t'
type: behavior
versions: Python 2.6, Python 2.7

___
Python tracker
<http://bugs.python.org/issue5704>
___
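To illustrate the kind of source this is about, here is a small sketch, assuming a Python 3 interpreter: a block whose first line is indented with a tab and whose second line is indented with eight spaces. The function name mixes_tabs_and_spaces and the sample strings are made up for illustration.

```python
# Indentation of the block: first line one tab, second line eight spaces.
MIXED = "if True:\n\tx = 1\n        y = 2\n"
# Same block indented consistently with four spaces.
CLEAN = "if True:\n    x = 1\n    y = 2\n"


def mixes_tabs_and_spaces(source):
    """Return True if compiling source fails with TabError."""
    try:
        compile(source, "<example>", "exec")
    except TabError:
        return True
    return False


print(mixes_tabs_and_spaces(MIXED))
print(mixes_tabs_and_spaces(CLEAN))
```

Under Python 3 the first call is expected to report the inconsistent indentation, which is the error that '-3' does not warn about in Python 2.6 unless '-t' is given as well.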