On 2014-06-17 07:30, John Wilkins wrote:
You followed this installation guide:
http://ceph.com/docs/master/install/install-ceph-gateway/ [16]

And then you followed this configuration guide:
http://ceph.com/docs/master/radosgw/config/ [1]
and then you executed:

sudo /etc/init.d/ceph-radosgw start
And there was no ceph-radosgw script? We need to verify that first,
and file a bug if we're not getting an init script in CentOS packages.


I took another look, and the package I had installed appears to have come from EPEL and did not contain the init script. I started from scratch with a minimal install of CentOS 6 that hadn't been used for anything else. The package from the ceph repo does indeed have the init script.

Unfortunately, I'm still running into the same issue. I removed all the rgw pools, started ceph-radosgw, and it recreated a few of them:
.rgw.root
.rgw.control
.rgw
.rgw.gc
.users.uid

Manually creating the rest of them has no effect. It complains about acquiring locks and listing objects:

2014-06-17 00:31:45.150494 7f86ec450820  0 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74), process radosgw, pid 1704
2014-06-17 00:31:45.150556 7f86ec450820 -1 WARNING: libcurl doesn't support curl_multi_wait()
2014-06-17 00:31:45.150590 7f86ec450820 -1 WARNING: cross zone / region transfer performance may be affected
2014-06-17 00:32:02.469894 7f86ec450820  0 framework: fastcgi
2014-06-17 00:32:02.469958 7f86ec450820  0 starting handler: fastcgi
2014-06-17 00:32:13.455904 7f86bffff700 -1 failed to list objects pool_iterate returned r=-2
2014-06-17 00:32:13.455918 7f86bffff700  0 ERROR: lists_keys_next(): ret=-2
2014-06-17 00:32:13.455924 7f86bffff700  0 ERROR: sync_all_users() returned ret=-2
2014-06-17 00:32:13.611812 7f86d95f9700  0 RGWGC::process() failed to acquire lock on gc.16
2014-06-17 00:32:14.105180 7f86d95f9700  0 RGWGC::process() failed to acquire lock on gc.0
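
For reading the log above: assuming the negative r= / ret= values are negated POSIX errno codes (a common convention, but an assumption here rather than something the log states), -2 would correspond to ENOENT. A minimal Python sketch for decoding such values; describe_ret is a hypothetical helper name, not part of ceph:

```python
import errno
import os


def describe_ret(ret):
    """Map a return code such as -2 to (symbolic name, message).

    Assumes the value is a (possibly negated) POSIX errno. That is a
    guess about the log format, not something confirmed by the log.
    """
    code = abs(ret)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# Decode the value seen in the "failed to list objects" lines.
name, message = describe_ret(-2)
print(-2, name, message)
```

If that assumption holds, the listing failures would mean that something radosgw expected to find (for example a pool or object) does not exist, rather than a permissions problem.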



If I make a request, the server eventually fills up with so many radosgw processes that the apache user can no longer fork any new ones.
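
To put a number on the pile-up, something like the following could be run while a request is hanging. It is only a rough sketch: it assumes Python 3.7 or later, a ps that accepts "-e -o user= -o comm=", and that the gateway processes show up under the command name radosgw. Both count_processes and run_ps are hypothetical helper names.

```python
import subprocess
from collections import Counter


def count_processes(ps_output, command="radosgw"):
    """Count processes per user for one command name.

    ps_output is the text printed by `ps -e -o user= -o comm=`:
    one "USER COMMAND" pair per line, with no header line.
    """
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[1].strip() == command:
            counts[fields[0]] += 1
    return counts


def run_ps():
    """Return the current process list as 'USER COMMAND' lines."""
    result = subprocess.run(
        ["ps", "-e", "-o", "user=", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling count_processes(run_ps()) a few seconds apart should show whether the count for the apache user keeps growing with each request.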

This is an strace from apache:

read(13, "GET / HTTP/1.1\r\nUser-Agent: Wget/1.15 (linux-gnu)\r\nAccept: */*\r\nHost: gateway.ceph.chc.tlocal\r\nConnection: Keep-Alive\r\n\r\n", 8000) = 121
stat("/s3gw.fcgi", 0x7fffcc662580) = -1 ENOENT (No such file or directory)
stat("/var/www/html/s3gw.fcgi", {st_mode=S_IFREG|0755, st_size=81, ...}) = 0
open("/var/www/html/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/var/www/html/s3gw.fcgi/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOTDIR (Not a directory)
open("/var/www/html/s3gw.fcgi", O_RDONLY|O_CLOEXEC) = 14
fcntl(14, F_GETFD)                      = 0x1 (flags FD_CLOEXEC)
fcntl(14, F_SETFD, FD_CLOEXEC)          = 0
read(14, "#!/bin/sh\nexec /usr/bin/radosgw -c /etc/ceph/ceph.conf -n client.radosgw.gateway\n", 4096) = 81
stat("/var/www/html/s3gw.fcgi", {st_mode=S_IFREG|0755, st_size=81, ...}) = 0
brk(0x7fab146ce000)                     = 0x7fab146ce000
write(2, "[Mon Jun 16 22:26:03 2014] [warn] FastCGI: 10.30.85.51 GET http://gateway.ceph.chc.tlocal/ auth \n", 97) = 97
stat("/var/run/mod_fastcgi/dynamic/2a13e94a006b7f947a721cf995159615", {st_mode=S_IFSOCK|0600, st_size=0, ...}) = 0
socket(PF_FILE, SOCK_STREAM, 0)         = 15
connect(15, {sa_family=AF_FILE, path="/var/run/mod_fastcgi/dynamic/2a13e94a006b7f947a721cf995159615"}, 63) = 0
fcntl(15, F_GETFL)                      = 0x2 (flags O_RDWR)
fcntl(15, F_SETFL, O_RDWR|O_NONBLOCK)   = 0
select(16, [15], [15], NULL, {3, 99784}) = 1 (out [15], left {3, 99781})
write(15, "\1\1\0\1\0\10\0\0\0\1\0\0\0\0\0\0\1\4\0\1\0\r\0\0\n\1SCRIPT_URL/\1\4\0\1\0+\0\0\n\37SCRIPT_URIhttp://gateway.ceph.chc.tlocal/\1\4\0\1\0\24\0\0\22\0HTTP_AUTHORIZATION\1\4\0\1\0&\0\0\17\25HTTP_USER_AGENTWget/1.15 (linux-gnu)\1\4\0\1\0\20\0\0\v\3HTTP_ACCEPT*/*\1\4\0\1\0\"\0\0\t\27HTTP_HOSTgateway.ceph.chc.tlocal\1\4\0\1\0\33\0\0\17\nHTTP_CONNECTIONKee"..., 841) = 841
select(16, [15], [], NULL, {3, 99624})  = 0 (Timeout)
write(12, "T /var/www/html/s3gw.fcgi 0 0*", 30) = 30
select(16, [15], [], NULL, {2, 996562}) = 0 (Timeout)
write(12, "T /var/www/html/s3gw.fcgi 0 0*", 30) = 30
select(16, [15], [], NULL, {2, 996992}) = 0 (Timeout)
write(12, "T /var/www/html/s3gw.fcgi 0 0*", 30) = 30
select(16, [15], [], NULL, {2, 996700}^C <unfinished ...>
...


This continues until it times out.

In /var/log/ceph/client.radosgw.gateway.log, the following repeats as each of the other processes starts:



2014-06-16 22:27:28.411653 7f84742fb820  0 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74), process radosgw, pid 12225
2014-06-16 22:27:28.411668 7f84742fb820 -1 WARNING: libcurl doesn't support curl_multi_wait()
2014-06-16 22:27:28.411672 7f84742fb820 -1 WARNING: cross zone / region transfer performance may be affected
2014-06-16 22:27:28.420286 7f84742fb820 -1 asok(0x8f2fe0) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-client.radosgw.gateway.asok': (17) File exists
2014-06-16 22:27:28.954327 7f84742fb820  0 framework: fastcgi
2014-06-16 22:27:28.954344 7f84742fb820  0 starting handler: fastcgi
2014-06-16 22:27:29.168225 7f84620fb700  0 RGWGC::process() failed to acquire lock on gc.11
2014-06-16 22:27:29.170141 7f84620fb700  0 RGWGC::process() failed to acquire lock on gc.12
2014-06-16 22:27:29.396012 7f84620fb700  0 RGWGC::process() failed to acquire lock on gc.22
2014-06-16 22:27:29.396800 7f8460cf9700 -1 failed to list objects pool_iterate returned r=-2
2014-06-16 22:27:29.396815 7f8460cf9700  0 ERROR: lists_keys_next(): ret=-2
2014-06-16 22:27:29.396821 7f8460cf9700  0 ERROR: sync_all_users() returned ret=-2
2014-06-16 22:27:29.920754 7f84620fb700  0 RGWGC::process() failed to acquire lock on gc.26
2014-06-16 22:27:29.922741 7f84620fb700  0 RGWGC::process() failed to acquire lock on gc.27
...
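
One way to check that the log really shows many separate radosgw instances starting is to collect the distinct pids from the startup banner lines. A minimal sketch, assuming every instance logs a line containing "process radosgw, pid N" as in the excerpt above; startup_pids is a hypothetical helper name:

```python
import re

# Matches the startup banner, e.g. "... process radosgw, pid 12225"
BANNER = re.compile(r"process radosgw, pid (\d+)")


def startup_pids(log_text):
    """Return pids from radosgw startup banners, in order of first appearance."""
    seen = []
    for match in BANNER.finditer(log_text):
        pid = int(match.group(1))
        if pid not in seen:
            seen.append(pid)
    return seen
```

Feeding it the contents of /var/log/ceph/client.radosgw.gateway.log and taking len() of the result gives the number of instances that started. Several instances running under the same client name would also be consistent with the "(17) File exists" error on the admin socket, though that is an inference from the log and not something it states directly.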
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
