All,

We are currently experiencing a sporadic issue with our keystone endpoints. At various points throughout the day, keystone stops responding on both the admin and public endpoints, which causes all services to hang. Restarting apache2 fixes the issue for a while, but it inevitably reappears later. Here is what I am seeing:

keystone: /var/log/apache2/keystone.log

2018-02-20 21:50:38.830302 Timeout when reading response headers from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin
2018-02-20 21:50:50.799587 Timeout when reading response headers from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin
2018-02-20 21:51:02.857266 Timeout when reading response headers from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin
2018-02-20 21:51:02.879630 mod_wsgi (pid=1221): Exception occurred processing WSGI script '/usr/bin/keystone-wsgi-admin'.
2018-02-20 21:51:02.879796 IOError: failed to write data
2018-02-20 21:51:07.005702 mod_wsgi (pid=1220): Exception occurred processing WSGI script '/usr/bin/keystone-wsgi-admin'.
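
To see whether the timeouts cluster around particular times, the keystone.log entries can be counted per minute and per daemon process. The following is only a rough sketch: the function name count_timeouts and the SAMPLE list are made up for illustration, and the regular expression assumes the timestamp format shown in the keystone.log excerpt above (it will not match the bracketed format in Horizon's error.log).

```python
import re
from collections import Counter

# Assumes keystone.log lines shaped like the excerpt above, e.g.
# "2018-02-20 21:50:38.830302 Timeout when reading response headers
#  from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin"
TIMEOUT_RE = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\.\d+ "
    r"Timeout when reading response headers from daemon process '([^']+)'"
)


def count_timeouts(lines):
    """Return a Counter keyed by (minute, daemon process name)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Sample input copied from the log excerpt in this message.
SAMPLE = [
    "2018-02-20 21:50:38.830302 Timeout when reading response headers from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin",
    "2018-02-20 21:50:50.799587 Timeout when reading response headers from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin",
    "2018-02-20 21:51:02.857266 Timeout when reading response headers from daemon process 'keystone-admin': /usr/bin/keystone-wsgi-admin",
    "2018-02-20 21:51:02.879630 mod_wsgi (pid=1221): Exception occurred processing WSGI script '/usr/bin/keystone-wsgi-admin'.",
]

for (minute, daemon), n in sorted(count_timeouts(SAMPLE).items()):
    print(minute, daemon, n)
```

To run it against the real file, pass an open file handle instead of SAMPLE, for example count_timeouts(open("/var/log/apache2/keystone.log")).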

horizon: /var/log/apache2/error.log

[Tue Feb 20 21:47:02.582511 2018] [wsgi:error] [pid 1227:tid 140591048296192] [client 10.10.5.200:57462] Timeout when reading response headers from daemon process 'horizon': /usr/share/openstack-dashboard/openstack_dashboard/wsgi/django.wsgi, referer: https://vta.cybbh.space/horizon/project/instances/900e9d57-752d-488c-8dba-ffc098e1a51a/
[Tue Feb 20 21:48:03.962589 2018] [wsgi:error] [pid 1225:tid 140591249823488] ERROR openstack_auth.user Unable to retrieve project list.
[Tue Feb 20 21:48:03.962646 2018] [wsgi:error] [pid 1225:tid 140591249823488] Traceback (most recent call last):
[Tue Feb 20 21:48:03.962656 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/openstack_auth/user.py", line 350, in authorized_tenants
[Tue Feb 20 21:48:03.962665 2018] [wsgi:error] [pid 1225:tid 140591249823488] is_federated=self.is_federated)
[Tue Feb 20 21:48:03.962673 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/openstack_auth/utils.py", line 372, in get_project_list
[Tue Feb 20 21:48:03.962734 2018] [wsgi:error] [pid 1225:tid 140591249823488] projects = client.projects.list(user=kwargs.get('user_id'))
[Tue Feb 20 21:48:03.962744 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/positional/__init__.py", line 101, in inner
[Tue Feb 20 21:48:03.962752 2018] [wsgi:error] [pid 1225:tid 140591249823488] return wrapped(*args, **kwargs)
[Tue Feb 20 21:48:03.962759 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/keystoneclient/v3/projects.py", line 119, in list
[Tue Feb 20 21:48:03.962767 2018] [wsgi:error] [pid 1225:tid 140591249823488] **kwargs)
[Tue Feb 20 21:48:03.962774 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/keystoneclient/base.py", line 75, in func
[Tue Feb 20 21:48:03.962782 2018] [wsgi:error] [pid 1225:tid 140591249823488] return f(*args, **new_kwargs)
[Tue Feb 20 21:48:03.962789 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/keystoneclient/base.py", line 390, in list
[Tue Feb 20 21:48:03.962796 2018] [wsgi:error] [pid 1225:tid 140591249823488] self.collection_key)
[Tue Feb 20 21:48:03.962803 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/keystoneclient/base.py", line 125, in _list
[Tue Feb 20 21:48:03.962811 2018] [wsgi:error] [pid 1225:tid 140591249823488] resp, body = self.client.get(url, **kwargs)
[Tue Feb 20 21:48:03.962818 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 288, in get
[Tue Feb 20 21:48:03.962826 2018] [wsgi:error] [pid 1225:tid 140591249823488] return self.request(url, 'GET', **kwargs)
[Tue Feb 20 21:48:03.962833 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 447, in request
[Tue Feb 20 21:48:03.962841 2018] [wsgi:error] [pid 1225:tid 140591249823488] resp = super(LegacyJsonAdapter, self).request(*args, **kwargs)
[Tue Feb 20 21:48:03.962848 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 192, in request
[Tue Feb 20 21:48:03.962855 2018] [wsgi:error] [pid 1225:tid 140591249823488] return self.session.request(url, method, **kwargs)
[Tue Feb 20 21:48:03.962863 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/positional/__init__.py", line 101, in inner
[Tue Feb 20 21:48:03.962870 2018] [wsgi:error] [pid 1225:tid 140591249823488] return wrapped(*args, **kwargs)
[Tue Feb 20 21:48:03.962877 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 703, in request
[Tue Feb 20 21:48:03.962885 2018] [wsgi:error] [pid 1225:tid 140591249823488] resp = send(**kwargs)
[Tue Feb 20 21:48:03.962892 2018] [wsgi:error] [pid 1225:tid 140591249823488] File "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 777, in _send_request
[Tue Feb 20 21:48:03.962899 2018] [wsgi:error] [pid 1225:tid 140591249823488] raise exceptions.ConnectFailure(msg)
[Tue Feb 20 21:48:03.962907 2018] [wsgi:error] [pid 1225:tid 140591249823488] ConnectFailure: Unable to establish connection to https://*******:5000/v3/users/7e68b998ee1ec26139d3482818c9643d1ce3b5aff532c865cff65e1c9fe01306/projects?: ('Connection aborted.', BadStatusLine("''",))


I get the same behavior regardless of which service is involved, and regardless of whether I use the CLI or Horizon. All signs point to keystone being the culprit.

I have adjusted my /etc/apache2/sites-available/keystone.conf:

WSGIDaemonProcess keystone-public processes=8 threads=4 user=keystone group=keystone display-name=%{GROUP}
WSGIDaemonProcess keystone-admin processes=8 threads=4 user=keystone group=keystone display-name=%{GROUP}

I have also ensured that WSGIApplicationGroup %{GLOBAL} is present.
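
For reference, in mod_wsgi daemon mode each daemon process group can service at most processes x threads requests at once, so the settings above allow 32 concurrent requests per group on each keystone node. If every one of those slots were tied up by requests that never return, new requests would queue until Apache logs "Timeout when reading response headers"; that is one possible explanation, not something the logs above confirm. A small sketch of the arithmetic, where the helper name daemon_capacity is hypothetical and the fallback values (1 process, 15 threads) are the mod_wsgi defaults used when an option is omitted:

```python
def daemon_capacity(directive):
    """Return (group name, max concurrent requests) for one WSGIDaemonProcess line."""
    parts = directive.split()
    if len(parts) < 2 or parts[0] != "WSGIDaemonProcess":
        raise ValueError("not a WSGIDaemonProcess directive")
    name = parts[1]
    # Collect key=value options such as processes=8 and threads=4.
    options = dict(p.split("=", 1) for p in parts[2:] if "=" in p)
    # mod_wsgi defaults when an option is not given: 1 process, 15 threads.
    processes = int(options.get("processes", 1))
    threads = int(options.get("threads", 15))
    return name, processes * threads


line = (
    "WSGIDaemonProcess keystone-public processes=8 threads=4 "
    "user=keystone group=keystone display-name=%{GROUP}"
)
print(daemon_capacity(line))
```

With two keystone nodes behind haproxy, that would be 64 slots per endpoint in total, assuming both backends are up and traffic is spread evenly (the admin listener uses balance source, so it may not be).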

haproxy is sitting in between keystone and all other services, and is configured as follows:

defaults
  log  global
  maxconn  16384
  option  redispatch
  retries  3
  timeout  http-request 30s
  timeout  queue 1m
  timeout  connect 30s
  timeout  client 2m
  timeout  server 2m
  timeout  check 10s

...

listen keystone_admin_cluster
  bind 10.10.5.200:35357
  balance  source
  option  tcpka
  option  httpchk
  option  tcplog
  server keystone-0 10.10.5.120:35357 check inter 2000 rise 2 fall 5
  server keystone-1 10.10.5.121:35357 check inter 2000 rise 2 fall 5

listen keystone_public_internal_cluster
  bind 10.50.10.0:5000 ssl crt /etc/letsencrypt/live/*****/master.pem
  bind 10.10.5.200:5000
  balance  roundrobin
  option  tcpka
  option  httpchk
  option  tcplog
  server keystone-0 10.10.5.120:5000 check inter 2000 rise 2 fall 5
  server keystone-1 10.10.5.121:5000 check inter 2000 rise 2 fall 5

...


Any ideas on where else I should look?

Thanks in advance,

---
v/r

Chris Apsey
bitskr...@bitskrieg.net
https://www.bitskrieg.net

_______________________________________________
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
