> Please show the *exact* error message, including the traceback, by
> copying and pasting it. Do not retype it by hand, or summarize it, or put
> it into your own words.

Unfortunately this is not possible. The logging system I designed only gives
the following information. As we have millions of log entries per day for
custom exceptions, I did not include the full traceback. Here is all I have:

1448) 15/09/10 20:02:08 - [*] ERROR: Physical max client limit reached. Please contact maintenance.filedescriptor out of range in select()[scSocketServer.py:215:][Port:515]

The code generating the error is:

    try:
        self.__ReadersInCycle, self.__WritersInCycle, e = \
            select(self.__Sockets, self.__WritersInCycle, [],
                   base.scOptions.scOPT_SELECT_TIMEOUT)
    except ValueError, e:
        LogError('Physical max client limit reached.' \
                 ' Please contact maintenance.' + str(e))
        self.scSvr_OnClientPhysicalLimitReached()  # define a policy here
        continue

> > First of all, in our entire application there is no line of code like
> > remove(x), meaning there is no x variable.
>
> Look at this example:
>
> >>> sockets = []
> >>> sockets.remove("Hello world")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: list.remove(x): x not in list

Ok. Thanks.

> Anything is possible, but it's not likely. What's far more likely is that
> you have a bug in your code, and that somehow, under rare circumstances,
> it tries to remove something from a list that was never inserted into the
> list. Or it tries to remove it twice.
>
> My guess is something like this:
>
> try:
>     socket = get_socket()
>     self._sockets.append(socket)
> except SomeError:
>     pass
> # later on
> self._sockets.remove(socket)

Hmm, might be, but the self.__Sockets list also contains the ListenSocket,
which is the real listening socket. Naturally, I pass it in the read list of
select() on every server cycle. The weird thing is that the ListenSocket
itself triggers the "not in list" exception, too! And one thing I am sure of
is that I have not written any code that removes the listen socket from the
list; that is just impossible.

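On the traceback question: a minimal sketch of how the full traceback could be captured only on the failing path, so the volume of normal log entries would not grow. The names `remove_logged` and `log` are hypothetical; `log` stands in for whatever the real logging call is (LogError in the code above), and the function would have to wrap the actual removal sites. It is written for a current Python 3 interpreter, although traceback.format_exc() has been in the standard library since Python 2.4.

```python
import traceback


def remove_logged(items, item, log):
    """Remove item from items; if it is missing, log the full traceback.

    `log` is a hypothetical callable taking one string (a stand-in for
    the application's own logging function).
    """
    try:
        items.remove(item)
        return True
    except ValueError:
        # format_exc() returns the traceback of the exception currently
        # being handled, including the file and line of the remove() call.
        log("remove failed:\n" + traceback.format_exc())
        return False


messages = []
sockets = ["listen_socket"]
remove_logged(sockets, "listen_socket", messages.append)  # succeeds
remove_logged(sockets, "listen_socket", messages.append)  # second removal is logged
```

With something like this in place, the next "not in list" entry would carry the line that attempted the removal, which is the piece of information currently missing.
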
Additionally, there are very few places where I traverse the __Sockets list,
for optimization. The only places where I delete something from the __Sockets
list are:

1) a user disconnects (normal disconnect, authentication or ping timeout)
3) server is being stopped or restarted

Other than that, there is no access to that variable from outside objects; as
can be seen, it is also private. Please keep in mind that this bug has been
there for about a year, so many code reviews have passed without noticing the
type of error you are suggesting.

More information on the system: I am running Python 2.4 on CentOS.

By the way, after digging through the logs and the system, it turns out
select(..) is hitting the per-process FD limit. Although the system-wide
ulimit is unlimited, I think Python's "selectmodule.c" enforces a limit of
1024. I get the error after hitting that limit and somehow, as I just
explained, the __ListenSocket is removed from the read list, so it is lost
and the Server instance is lost forever.

Putting a try..except around that code and re-initializing the server port is
a solution, but I guess a bad one, because I will not have found the root
cause.

Thanks in advance,

--
http://mail.python.org/mailman/listinfo/python-list
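
A note on the 1024 ceiling described above: select() rejects any descriptor numbered at or above FD_SETSIZE (commonly 1024 on Linux), which is what produces "filedescriptor out of range in select()". poll() does not have that restriction. The following is only a sketch of the idea, assuming a Unix-like platform where select.poll() is available; `readable_sockets` is a hypothetical helper, not part of the server code, and it is written for a current Python 3 interpreter. It says nothing about why the listen socket disappears from the list.

```python
import select
import socket


def readable_sockets(socks, timeout_ms):
    """Return the sockets in socks that are ready for reading, via poll().

    Hypothetical helper: a real server would register each socket once
    and keep the poll object, instead of rebuilding it on every cycle.
    """
    poller = select.poll()
    by_fd = {}
    for s in socks:
        poller.register(s.fileno(), select.POLLIN)
        by_fd[s.fileno()] = s
    # poll() returns (fd, event_mask) pairs for descriptors with activity.
    return [by_fd[fd] for fd, event in poller.poll(timeout_ms)
            if event & select.POLLIN]


a, b = socket.socketpair()
a.sendall(b"ping")               # makes b readable; a has nothing to read
ready = readable_sockets([a, b], 1000)
a.close()
b.close()
```

Here `ready` should contain only `b`. Whether poll() is a drop-in fit depends on how the rest of the server cycle uses the read and write lists.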