> I don't think it's a problem with that particular app - it was basically
> a vanilla django install - and it works fine after the restart.
>
> the real problem is that it cascades.  once one vassal starts to
> experience the problem, then any new vassals created from that point on,
> or any restarted, also start to see problems...
>
> just happened again :/

One thing I fear is not clear: when you say that every new vassal
created or restarted sees the problem, do you mean that you get the kill()
error, or that you simply do not see anything in the logs?


>
>
> --
> Harry Percival
> Developer
> [email protected]
>
> PythonAnywhere - a fully browser-based Python development and hosting
> environment
> <http://www.pythonanywhere.com/>
>
> PythonAnywhere LLP
> 17a Clerkenwell Road, London EC1M 5RD, UK
> VAT No.: GB 893 5643 79
> Registered in England and Wales as company number OC378414.
> Registered address: 28 Ely Place, 3rd Floor, London EC1N 6TD, UK
>
> On 10/03/14 13:56, Roberto De Ioris wrote:
>>> Hi there,
>>>
>>> Happened again today, I tried to snapshot some more debug info:
>>>
>>> here are the logs from the emperor, when i try to reload the vassal:
>>>
>>>      2014-03-10 12:19:28 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
>>>      2014-03-10 12:19:31 +0000 EMPEROR - emperor_respawn/write(): Broken pipe [core/emperor.c line 656]
>>>      2014-03-10 12:19:31 +0000 EMPEROR - [emperor] reload the uwsgi instance redacted.pythonanywhere.com.ini
>>>      2014-03-10 12:19:31 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
>>>      2014-03-10 12:19:34 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
>>>      2014-03-10 12:19:37 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
>>>
>>> You can see the "no such process" error keeps happening, every couple
>>> of seconds.
>>>
>>> here are the logs from the vassal server log:
>>>
>>>      2014-03-10 11:58:51 VACUUM: unix socket
>>>      /var/sockets/redacted.pythonanywhere.com/socket removed.
>>>      2014-03-10 11:58:53 *** Starting uWSGI 2.0 (64bit) on [Mon Mar 10
>>>      11:58:52 2014] ***
>>>      2014-03-10 11:58:53 compiled with version: 4.8.1 on 07 February
>>> 2014
>>>      19:06:17
>>>      2014-03-10 11:58:53 os: Linux-3.11.0-15-generic #25-Ubuntu SMP Thu
>>>      Jan 30 17:22:01 UTC 2014
>>>      2014-03-10 11:58:53 nodename: giles-liveweb2
>>>      2014-03-10 11:58:53 machine: x86_64
>>>      2014-03-10 11:58:53 clock source: unix
>>>      2014-03-10 11:58:53 pcre jit disabled
>>>      2014-03-10 11:58:53 detected number of CPU cores: 4
>>>      2014-03-10 11:58:53 current working directory: /etc/uwsgi/vassals
>>>      2014-03-10 11:58:53 detected binary path: /usr/local/bin/uwsgi
>>>      2014-03-10 11:58:53 using Linux cgroup
>>>      /mnt/cgroups/cpu/user_types/free with mode 700
>>>      2014-03-10 11:58:53 assigned process 16789 to cgroup
>>>      /mnt/cgroups/cpu/user_types/free/tasks
>>>      2014-03-10 11:58:53 using Linux cgroup
>>>      /mnt/cgroups/cpuacct/users/Redacted with mode 700
>>>      2014-03-10 11:58:53 assigned process 16789 to cgroup
>>>      /mnt/cgroups/cpuacct/users/Redacted/tasks
>>>      2014-03-10 11:58:53 using Linux cgroup
>>>      /mnt/cgroups/memory/user_types/free with mode 700
>>>      2014-03-10 11:58:53 assigned process 16789 to cgroup
>>>      /mnt/cgroups/memory/user_types/free/tasks
>>>      2014-03-10 11:58:53 uWSGI running as root, you can use
>>>      --uid/--gid/--chroot options
>>>      2014-03-10 11:58:53 chroot() to /mnt/chroots/Redacted
>>>      2014-03-10 11:58:53 setgid() to 60000
>>>      2014-03-10 11:58:53 setuid() to 231762
>>>      2014-03-10 11:58:53 limiting number of processes to 64...
>>>      2014-03-10 11:58:53 your processes number limit is 64
>>>      2014-03-10 11:58:53 your memory page size is 4096 bytes
>>>      2014-03-10 11:58:53 detected max file descriptor number: 123456
>>>      2014-03-10 11:58:53 building mime-types dictionary from file
>>>      /etc/mime.types...
>>>      2014-03-10 11:58:53 536 entry found
>>>      2014-03-10 11:58:53 lock engine: pthread robust mutexes
>>>      2014-03-10 11:58:53 thunder lock: disabled (you can enable it with
>>>      --thunder-lock)
>>>      2014-03-10 11:58:53 uwsgi socket 0 bound to UNIX address
>>>      /var/sockets/redacted.pythonanywhere.com/socket fd 7
>>>      2014-03-10 11:58:53 Python version: 2.7.5+ (default, Sep 19 2013,
>>>      13:52:09)  [GCC 4.8.1]
>>>      2014-03-10 11:58:53 *** Python threads support is disabled. You
>>> can
>>>      enable it with --enable-threads ***
>>>      2014-03-10 11:58:53 Python main interpreter initialized at
>>> 0x1021bb0
>>>      2014-03-10 11:58:53 your server socket listen backlog is limited
>>> to
>>>      100 connections
>>>      2014-03-10 11:58:53 your mercy for graceful operations on workers
>>> is
>>>      60 seconds
>>>      2014-03-10 11:58:53 setting request body buffering size to 65536
>>> bytes
>>>      2014-03-10 11:58:53 mapped 333936 bytes (326 KB) for 1 cores
>>>      2014-03-10 11:58:53 *** Operational MODE: single process ***
>>>      2014-03-10 11:58:53 WSGI app 0 (mountpoint='') ready in 1 seconds
>>> on
>>>      interpreter 0x1021bb0 pid: 16789 (default app)
>>>      2014-03-10 11:58:53 *** uWSGI is running in multiple interpreter
>>>      mode ***
>>>      2014-03-10 11:58:53 spawned uWSGI master process (pid: 16789)
>>>      2014-03-10 11:58:53 spawned uWSGI worker 1 (pid: 16790, cores: 1)
>>>      2014-03-10 11:58:53 spawned 2 offload threads for uWSGI worker 1
>>>      2014-03-10 11:58:57 announcing my loyalty to the Emperor...
>>>      2014-03-10 12:01:14 Mon Mar 10 12:01:14 2014 - received message 0
>>>      from emperor
>>>      2014-03-10 12:01:14 SIGINT/SIGQUIT received...killing workers...
>>>      2014-03-10 12:01:15 worker 1 buried after 1 seconds
>>>      2014-03-10 12:01:15 goodbye to uWSGI.
>>>      2014-03-10 12:01:15 chdir(): No such file or directory [core/uwsgi.c line 1472]
>>>      2014-03-10 12:01:15 VACUUM: unix socket
>>>      /var/sockets/redacted.pythonanywhere.com/socket removed.
>>>
>>> You'll notice the logs are from an earlier reload. Later reloads don't
>>> seem to even log any more.
>>>
>>> And here is the vassal config:
>>>
>>>      [uwsgi]
>>>      plugins = python27
>>>      uid = 231762
>>>      gid = 60000
>>>
>>>      if-not-exists = /mnt/chroots/Redacted/bin/ls
>>>      exec-pre-jail = python /home/anywhere/django/anywhere/jails/create.py Redacted
>>>      endif =
>>>      chroot = /mnt/chroots/Redacted
>>>      limit-nproc = 64
>>>      # shutdown app (but not master) after 26hrs of no hits
>>>      idle=93600
>>>      # kill any requests that take too long to process
>>>      harakiri = 300
>>>      buffer-size = 32768
>>>      post-buffering = 65536
>>>      vacuum =
>>>      # chrooted master cannot reload itself, so just exit
>>>      exit-on-reload = true
>>>      # file lock prevents respawning vassals from racing dying ones
>>>      flock = %p
>>>
>>>      log-encoder = format redacted.pythonanywhere.com ${strftime:%%F %%T} ${msg}
>>>      logger = rsyslog:10.124.106.197:10515,uwsgi,142
>>>
>>>      workers = 1
>>>      cgroup = /mnt/cgroups/cpu/user_types/free
>>>      cgroup = /mnt/cgroups/cpuacct/users/Redacted
>>>      cgroup = /mnt/cgroups/memory/user_types/free
>>>
>>>      auto-procname
>>>      procname-prefix-spaced = Redacted Redacted.pythonanywhere.com
>>>      disable-logging = true
>>>
>>>      check-static=/var/www/static
>>>
>>>      static-map = /static/admin/=/home/Redacted/.virtualenvs/django16/lib/python2.7/site-packages/django/contrib/admin/static/admin
>>>
>>>      static-index = index.html
>>>      offload-threads = 2
>>>
>>>      touch-reload = /var/www/redacted_pythonanywhere_com_wsgi.py
>>>      socket = /var/sockets/redacted.pythonanywhere.com/socket
>>>      chmod-socket = 666
>>>      chdir = /var/www
>>>      env = HOST_NAME=redacted.pythonanywhere.com
>>>      env = WSGI_MODULE=redacted_pythonanywhere_com_wsgi
>>>
>>>      env = no_proxy=localhost,127.0.0.1,localaddress,.localdomain.com
>>>
>>>      env = HOME=/home/Redacted
>>>
>>>      env = http_proxy=http://proxy.server:3128
>>>
>>>      env = PYENCHANT_LIBRARY_PATH=/usr/lib/libenchant.so.1
>>>
>>>      env = https_proxy=http://proxy.server:3128
>>>
>>>      env = PATH=/home/Redacted/.local/bin:/usr/local/bin:/usr/bin:/bin
>>>      unenv = UWSGI_EMPEROR_FD
>>>      unenv = SHLVL
>>>      unenv = SSH_TTY
>>>      unenv = PWD
>>>      unenv = UWSGI_RELOADS
>>>      unenv = SSH_CLIENT
>>>      unenv = LOGNAME
>>>      unenv = UWSGI_ORIGINAL_PROC_NAME
>>>      unenv = MAIL
>>>      unenv = SSH_CONNECTION
>>>      unenv = _
>>>
>>>      file = /bin/user_wsgi_wrapper.py
>>>
>>>
>>> I've checked the stats server; there aren't any vassals in the
>>> blacklist.
>>>
>>>
>>> Bouncing uWSGI fixes the problem, but obviously it involves downtime, so
>>> we'd rather avoid it if possible.
>>>
>>>
>>
>> Hi Harry, do not do it. Basically your vassal is not removed from the
>> linked list because the process mapped to it is no longer available (hard
>> to say why). Removing the file from the vassal dir (well, renaming it to
>> .off) should be enough.
>>
>> By the way, the latest code improves that corner case too:
>>
>> https://github.com/unbit/uwsgi/commit/c118c75bfe5ed6b26668aa48ae076dddcf31a5b9
>>
>>
>> Basically, if killing the process is not possible, the memory area is
>> removed from the list (so the vassal can be restarted). If for some reason
>> the pid has changed, you will get a zombie, but the master will clear it
>> sooner or later.
>>
>> If you use the pid namespace (this is very easy, just add
>> emperor-use-clone = pid to your emperor config) you can be sure that once
>> the vassal master is dead no user processes (even daemons possibly
>> spawned by your customers) are left, as the master is the new init for
>> the vassal.
>>
>>
>> Let me know
>>
>>
>
> _______________________________________________
> uWSGI mailing list
> [email protected]
> http://lists.unbit.it/cgi-bin/mailman/listinfo/uwsgi
>
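
For anyone following along, here is a rough Python sketch of the behaviour described in the quoted message for the newer code: if kill() fails with "No such process", the stale entry is dropped from the list so the vassal can be respawned. This is only an illustration of the idea, not the actual C code in core/emperor.c. The names `vassals`, `stop_vassal` and `spawn_and_reap` are hypothetical, and a plain dict stands in for the Emperor's linked list.

```python
import errno
import os
import signal
import subprocess
import sys


def stop_vassal(vassals, name, sig=signal.SIGTERM):
    """Signal a tracked vassal; forget it if its pid no longer exists.

    `vassals` is a hypothetical dict mapping config name -> pid, standing in
    for the Emperor's linked list. Returns True if the entry was dropped.
    """
    pid = vassals[name]
    try:
        os.kill(pid, sig)
    except OSError as exc:
        if exc.errno != errno.ESRCH:
            raise  # e.g. EPERM: a different problem, do not hide it
        # "No such process": nothing left to signal, so drop the stale entry
        # instead of retrying forever.
        del vassals[name]
        return True
    return False


def spawn_and_reap():
    """Start a short-lived child and wait for it, yielding a pid that is gone."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()
    return child.pid
```

Calling `stop_vassal(vassals, name)` on an entry whose pid has already gone removes the entry rather than logging the same kill error every few seconds. Note that this assumes the pid has not been reused by an unrelated process in the meantime.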


-- 
Roberto De Ioris
http://unbit.it
