Hello, I have written a script using bottle.py. The app works fine most of times. Sometimes though, the server takes time to respond and the web browser eventually drops the connection to the server after a certain time (timeout), showing this page :
""" Connection reset The connection to the server was reset while the page was loading. This site may be temporarily unavailable or too busy. Try again in a few moments. If you can not load any pages, check your computer's network connection. If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the web. """ In the app, I have setup faulthandler (https://pypi.python.org/pypi/faulthandler/2.4) in my application to respond to SIGUSER1. When I trigger the signal (while the server is hung), here's the traceback I get. It shows precisely where is my application hanging : Current thread 0x00007fb1a0bfe700 (most recent call first): File "/usr/lib/python2.7/socket.py", line 447 in readline File "/usr/lib/python2.7/wsgiref/simple_server.py", line 116 in handle File "/usr/lib/python2.7/SocketServer.py", line 649 in __init__ File "/usr/lib/python2.7/SocketServer.py", line 334 in finish_request File "/usr/lib/python2.7/SocketServer.py", line 321 in process_request File "/usr/lib/python2.7/SocketServer.py", line 295 in _handle_request_noblock File "/usr/lib/python2.7/SocketServer.py", line 238 in serve_forever File "/usr/local/lib/python2.7/dist-packages/infomaniak/bottle.py", line 2680 in run File "/usr/local/lib/python2.7/dist-packages/infomaniak/bottle.py", line 3048 in run File "/usr/local/lib/python2.7/dist-packages/infomaniak/server.py", line 69 in <module> File "/usr/bin/infomaniak", line 2 in <module> Instead of the readline call to timeout after a certain amount of time, it seems it gets stuck there forever. When I strace my app it, I find it stuck at the recvfrom system call : root@audio-mon[10.10.10.82] ~ # service infomaniak status infomaniak is running with pid 2149 root@audio-mon[10.10.10.82] ~ # strace -fp 2149 Process 2149 attached - interrupt to quit recvfrom(7, ^C <unfinished ...> Process 2149 detached root@audio-mon[10.10.10.82] ~ # It is stuck there for hours (7 hours now). How can I fix this ? this behaviour is also hard to reproduce, I don't know what triggers it. The app was working fine for almost a month, no problem at all. Here's code : #!/usr/bin/env python #-*- encoding=utf-8 -*- # infomaniak.getFlux() et infomaniak.login() import infomaniak # Pour la localisation des fichiers HTML import pkg_resources import logging import ConfigParser # pour attrapper l'exception requests.exceptions.ConnectionError import requests # pour faire tourner le schmilblick import bottle # service infomaniak reload import signal # service infomaniak trace import faulthandler cp = ConfigParser.ConfigParser() cp.read("/etc/infomaniak.conf") PORT = cp.getint("SERVER","PORT") USER = cp.get("LOGIN","USER") PASS = cp.get("LOGIN","PASS") LOG_LEVEL = cp.get("LOG","LOG_LEVEL") logging.basicConfig(filename='/var/log/infomaniak/main.log', level=getattr(logging,LOG_LEVEL), format='%(asctime)s [%(levelname)s] %(message)s' ) logging.info("========================= START ==========================") logging.info("Reading config file Ok.. Serving on port %s" % (PORT)) def reload_conf(signal,frame): logging.info("========================= RELOAD ==========================") logging.info("Reading config file Ok.. Serving on port %s" % (PORT)) cp = ConfigParser.ConfigParser() cp.read("/etc/infomaniak.conf") LOG_LEVEL = cp.get("LOG","LOG_LEVEL") logging.getLogger().setLevel(getattr(logging,LOG_LEVEL)) logging.info("LOG LEVEL %s" % (LOG_LEVEL)) signal.signal(signal.SIGUSR2,reload_conf) faulthandler.register(signal.SIGUSR1,file=open("/var/log/infomaniak/main.log","a")) @bottle.route("/<url:path>") def get(url): # print "thing",url # logging.info("--------------------------------------------------------------------------") # logging.info("[%s:%s] get %s" % (self.client_address[0],self.client_address[1],self.path)) # MUST BE OF THE FORM : # https://statslive.infomaniak.com/radio/config/formatsflux.php/g3377s3i4402 return infomaniak.getFlux(url.lstrip("/"),USER,PASS) logged_in = False attempts = 1 while (logged_in == False) : logging.info("login...(tentative %d)" % attempts) try: infomaniak.login(USER,PASS) logged_in = True except requests.exceptions.ConnectionError,e: logging.critical("Impossible de se connecter au serveur infomaniak") logging.critical(e) attempts+=1 logging.info("login OK (tentatives : %d)" % attempts) bottle.run(host='',port=PORT) logging.info("========================= STOP ==========================") Any ideas on how to debug this or try to reproduce it ? anyone encountered similar behaviour ? I should mention that I don't query my server directly, but through web page (on yet another server on the same LAN) that has an HTML page with like 11 <iframe>s, all the iframes having their hrefs to my application, like this : <iframe src="http://10.10.10.82:8081/url1..."></iframe> <iframe src="http://10.10.10.82:8081/url2..."></iframe> <iframe src="http://10.10.10.82:8081/url3..."></iframe> etc. My app being sereved on 10.10.10.82:8081. Thanks for any help. -- https://mail.python.org/mailman/listinfo/python-list