New submission from Daniel Farina <dan...@heroku.com>:

I seem to be encountering a somewhat rare infinite loop in hash table probing while importing _socket, triggered by init_socket (Modules/socketmodule.c) in Python 2.6, as patched and shipped with Ubuntu 10.04 LTS. The problem only reproduces on 32-bit machines, on both -O2 and -O0 builds (the -O0 builds are how I managed to retrieve the detailed stack traces below).

To cut to the chase, the bottom of the stack trace invariably looks like this; in particular, the "key" (and therefore "hash") value is always the same:
#0 0x08088637 in lookdict_string (mp=0xa042714, key='SO_RCVTIMEO', hash=612808203) at ../Objects/dictobject.c:421
#1 0x080886cd in insertdict (mp=0xa042714, key='SO_RCVTIMEO', hash=612808203, value=20) at ../Objects/dictobject.c:450
#2 0x08088cac in PyDict_SetItem (op=<unknown at remote 0x37>, key='SO_RCVTIMEO', value=20) at ../Objects/dictobject.c:701
#3 0x0808b8d4 in PyDict_SetItemString (v={'AF_INET6': 10, 'SocketType': <type at remote 0x8275e00>, 'getaddrinfo': <built-in function getaddrinfo>, 'TIPC_MEDIUM_IMPORTANCE': 1, 'htonl': <built-in function htonl>, 'AF_UNSPEC': 0, 'TIPC_DEST_DROPPABLE': 129, 'TIPC_ADDR_ID': 3, 'PF_PACKET': 17, 'AF_WANPIPE': 25, 'PACKET_OTHERHOST': 3, 'AF_AX25': 3, 'PACKET_BROADCAST': 1, 'PACKET_FASTROUTE': 6, 'TIPC_NODE_SCOPE': 3, 'inet_pton': <built-in function inet_pton>, 'AF_ATMPVC': 8, 'NETLINK_IP6_FW': 13, 'NETLINK_ROUTE': 0, 'TIPC_PUBLISHED': 1, 'TIPC_WITHDRAWN': 2, 'AF_ECONET': 19, 'AF_LLC': 26, '__name__': '_socket', 'AF_NETROM': 6, 'SOCK_RDM': 4, 'AF_IRDA': 23, 'htons': <built-in function htons>, 'SOCK_RAW': 3, 'inet_ntoa': <built-in function inet_ntoa>, 'AF_NETBEUI': 13, 'AF_NETLINK': 16, 'TIPC_WAIT_FOREVER': -1, 'AF_UNIX': 1, 'TIPC_SUB_PORTS': 1, 'HCI_TIME_STAMP': 3, 'gethostbyname_ex': <built-in function gethostbyname_ex>, 'SO_RCVBUF': 8, 'AF_APPLETALK': 5, 'SOCK_SEQPACKET': 5, 'AF_DECnet': 12, 'PACKET_OUTGOING': 4, 'SO_SNDLOWAT': 19, 'TIPC_SRC_DROPPABLE':...(truncated), key=0x81ac5fb "SO_RCVTIMEO", item=20) at ../Objects/dictobject.c:2301
#4 0x080f6c98 in PyModule_AddObject (m=<module at remote 0xb73cac8c>, name=0x81ac5fb "SO_RCVTIMEO", o=20) at ../Python/modsupport.c:615
#5 0x080f6d0b in PyModule_AddIntConstant (m=<module at remote 0xb73cac8c>, name=0x81ac5fb "SO_RCVTIMEO", value=20) at ../Python/modsupport.c:627
#6 0x081321fd in init_socket () at ../Modules/socketmodule.c:4708

Here, we never escape from lookdict_string.
The key is not in the dictionary, but at this stage Python is still trying to establish that, and it cannot exit the probe loop because the table has no unused (NULL-key) slot left to terminate the search.

Furthermore, every reproduced case has a dictionary that appears to violate an invariant that I believe is communicated by the source of dictobject.c. Note the values of ma_fill, ma_used, and ma_mask, which are identical in every reproduced case. Per the comments in the source, no hash table should ever get this full:

$3 = {ob_refcnt = 1, ob_type = 0x81c3aa0, ma_fill = 128, ma_used = 128, ma_mask = 127,
  ma_table = 0xa06b4a8, ma_lookup = 0x8088564 <lookdict_string>,
  ma_smalltable = {
    {me_hash = 0, me_key = 0x0, me_value = 0x0},
    {me_hash = 1023053529, me_key = '__name__', me_value = '_socket'},
    {me_hash = 1679430097, me_key = 'gethostbyname', me_value = <built-in function gethostbyname>},
    {me_hash = 0, me_key = 0x0, me_value = 0x0},
    {me_hash = 779452068, me_key = 'gethostbyname_ex', me_value = <built-in function gethostbyname_ex>},
    {me_hash = -322108099, me_key = '__doc__', me_value = None},
    {me_hash = -1649837379, me_key = 'gethostbyaddr', me_value = <built-in function gethostbyaddr>},
    {me_hash = 1811348911, me_key = '__package__', me_value = None}}}

The Python program that is running afoul of this bug uses gevent, but the stack traces suggest that all gevent is doing at the time of the hang is importing "socket", and this happens at the very beginning of program execution.

Finally, what's especially strange is that I had been running this exact version of Python, libraries, and application quite frequently for a very long time; the problem suddenly started cropping up a little while ago (maybe a few weeks). It could be just coincidence, but if there are code paths in init_socket that are somehow sensitive to the network, this could be related.
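
As a rough illustration of the two observations above (a probe loop that never exits, and a fill level that should not be reachable), here is a minimal Python sketch. It is not the interpreter's code. The probe recurrence, PERTURB_SHIFT = 5, and the two-thirds growth test are written from memory of the 2.x dictobject.c and should be treated as assumptions; the helper names and the filler keys are hypothetical, and the probe limit exists only so that the demonstration terminates.

```python
PERTURB_SHIFT = 5          # assumed value from 2.x dictobject.c
SIZE_T_MASK = 0xFFFFFFFF   # emulate a 32-bit size_t


def needs_resize(ma_fill, ma_mask):
    """Assumed growth test after an insert: table at least 2/3 full."""
    return ma_fill * 3 >= (ma_mask + 1) * 2


def lookup(table, key, hash_value, probe_limit=10000):
    """Probe in the style of lookdict_string.

    Stops on an empty (None) slot or on a slot holding `key` and returns
    its index. Returns None if probe_limit probes found neither; the C
    loop has no such limit and would simply keep going.
    """
    mask = len(table) - 1
    perturb = hash_value & SIZE_T_MASK
    i = perturb & mask
    for _ in range(probe_limit):
        slot = table[i & mask]
        if slot is None or slot == key:
            return i & mask
        i = ((i << 2) + i + perturb + 1) & SIZE_T_MASK
        perturb >>= PERTURB_SHIFT
    return None


# Hypothetical 128-slot table with every slot occupied
# (ma_fill = 128, ma_mask = 127, as in the dump).
full_table = ["key%d" % n for n in range(128)]
stuck = lookup(full_table, "SO_RCVTIMEO", 612808203)

# The same table with one unused slot: the probe sequence reaches it.
one_free = list(full_table)
one_free[77] = None
found = lookup(one_free, "SO_RCVTIMEO", 612808203)

# Smallest fill at which the assumed growth test fires for 128 slots.
first_resize_fill = next(f for f in range(129) if needs_resize(f, 127))

print(stuck, found, first_resize_fill, needs_resize(128, 127))
```

Under these assumptions the full table exhausts the probe limit (`stuck` is None), the table with a single free slot terminates at index 77, and the growth test for a 128-slot table would already fire at a fill of 86, which is why ma_fill = 128 with ma_mask = 127 looks like a violated invariant.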

I also have a limited suspicion that a particularly unlucky OOM could be related (these systems are configured so that malloc and friends can return NULL, i.e. no overcommit on Linux).

----------
components: Interpreter Core
messages: 161527
nosy: Daniel.Farina
priority: normal
severity: normal
status: open
title: dictobject infinite loop on 2.6.5 on 32-bit x86
type: behavior
versions: 3rd party, Python 2.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14903>
_______________________________________