Bugs item #1467080, was opened at 2006-04-08 18:09 Message generated for change (Comment added) made by sobomax You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1467080&group_id=5470
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Python 2.4 Status: Open Resolution: None Priority: 5 Submitted By: Maxim Sobolev (sobomax) Assigned to: Nobody/Anonymous (nobody) Summary: Many functions in socket module are not thread safe Initial Comment: The socket module make a great effort to be thread-safe, but misses one big point - it uses single per-instance buffer to hold resolved sockaddr_xxx structures. Therefore, on SMP system it is possible that several threads calling functions that perform address resolution in parallel will stomp on each other resulting in incorrect or invalid address to be used in each case. For example, two threads calling sendto() in parallel can result in packets to be sent to incorrect addresses - packets from thread one from time to time will be sent to address requested by thread two and vice versa. Another, smaller issue is that the call to getnameinfo() is not protected with netdb mutex on systems that don't have thread-safe resolver. ---------------------------------------------------------------------- >Comment By: Maxim Sobolev (sobomax) Date: 2006-04-24 15:31 Message: Logged In: YES user_id=24670 OK, it looks like the only way to get this bug fixed is to reimplement patch to use stack instead of heap, so that here we go. Attached please find new version of the patch which allocates address buffer on stack. -Maxim ---------------------------------------------------------------------- Comment By: Maxim Sobolev (sobomax) Date: 2006-04-17 16:05 Message: Logged In: YES user_id=24670 > The big win is in simplification of the code. What do you call "big simplification"? Several malloc/free calls and appropriate NULL checks? Aside from stack usage issues the big loss here is portability - we either need to use some fixed length buffer with the size sufficient to hold any type of address (ugly!) or use sockaddr_storage, which may not exist on all platforms supported by the python. -Maxim ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2006-04-17 15:51 Message: Logged In: YES user_id=21627 The big win is in simplification of the code. Also, we are not talking about 10k here. On Linux, sock_addr_t is 128 bytes. Take a look at set_error: it allocates 100 bytes for an error buffer. Or sock_repr: 512 bytes for a buffer. Or socket_gethostname: 1024 bytes. Or socket_gethostbyname_ex: 16384 bytes. socket_getaddrinfo: 30 bytes. os_init: 100 bytes Or, looking at other modules: symtable.c:check_unoptimized: 300 bytes. Py_GetVersion: 250 bytes. PySys_SetArgv: 2*MAXPATHLEN+1 (on Linux, this is 8193 bytes). I could go on. Don't worry about stack consumption. Or, if you do, analyse the Python source code, and fix the big offenders first. Premature optimization is the root of all evil. ---------------------------------------------------------------------- Comment By: Maxim Sobolev (sobomax) Date: 2006-04-17 15:10 Message: Logged In: YES user_id=24670 > The address space available to each thread typically doesn't > depend on the number of threads. Instead, the stack size is > pre-determined, so it's vice versa: the number of threads > supported depends on that stack-size, which (currently) isn't tunable. Yes, but my point is that on low-end and/or embedded system the system can be configured with as low stack per thread as possible (since with for example 100 threads, every extra 10K of stack per thread means 1M of extra memory, which in the absence of paging needs to be wired down) and if only one thread needs this stack 990K of it will be effectively wasted. And since getaddrinfo()-family already uses heap for its results I don't see any big win in avoiding extra malloc()/free() call. Expecially considering that we are dealing with i/o here, so that system call overhead will be much more than that anyway, even if the program calls those functions heavily. -Maxim ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2006-04-17 14:22 Message: Logged In: YES user_id=21627 The argument of "hundreds of threads" is a red herring. The address space available to each thread typically doesn't depend on the number of threads. Instead, the stack size is pre-determined, so it's vice versa: the number of threads supported depends on that stack-size, which (currently) isn't tunable. Also, stack space is typically a scarce resource only for recursive functions. For leave functions, it doesn't matter, unless a single function consumes the majority of the stack (which certainly wouldn't be the case here). Allocation on the stack is cheap, and over-allocation doesn't hurt. Please change the patch to use automatic variables. It will become simpler and more maintainable that way (in addition, it should also become minimally faster). ---------------------------------------------------------------------- Comment By: Maxim Sobolev (sobomax) Date: 2006-04-17 13:26 Message: Logged In: YES user_id=24670 > Why is it necessary to allocate the memory dynamically? > Couldn't the caller provide the memory on the stack (i.e. > through a local variable)? Local variable is not very good IMHO since in threading environment with hundreds of threads running at the same time stack can be a scarce resource. Another issue is that the caller doesn't know the actual size it has to allocate until the resolution has been done, therefore it would need to overallocate in most cases. -Maxim ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2006-04-10 22:22 Message: Logged In: YES user_id=21627 Why is it necessary to allocate the memory dynamically? Couldn't the caller provide the memory on the stack (i.e. through a local variable)? ---------------------------------------------------------------------- Comment By: Maxim Sobolev (sobomax) Date: 2006-04-10 17:08 Message: Logged In: YES user_id=24670 Sorry, for some reason the patch did not go through first time. See attached. ---------------------------------------------------------------------- Comment By: Martin v. Löwis (loewis) Date: 2006-04-09 14:18 Message: Logged In: YES user_id=21627 I wonder why the sock_addr member is there in the socket object in the first place. AFAICT, there is no need for it; it was introduced in r4509. Would you like to work on a patch? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1467080&group_id=5470 _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com