On 2014-06-17 07:30, John Wilkins wrote:
> You followed this installation guide:
> http://ceph.com/docs/master/install/install-ceph-gateway/ [16]
> And then you followed this http://ceph.com/docs/master/radosgw/config/
> [1] configuration guide, and then you executed:
> sudo /etc/init.d/ceph-radosgw start
> And there was no ceph-radosgw script? We need to verify that first,
> and file a bug if we're not getting an init script in CentOS packages.
I took another look, and the package I had installed seems to have come
from EPEL, and it did not contain the init script. I started from scratch
with a minimal install of CentOS 6 that hadn't been used for anything
else. The package from the ceph repo does indeed have the init script.
Unfortunately, I'm still running into the same issue. I removed all the
rgw pools, started ceph-radosgw, and it recreated a few of them:
.rgw.root
.rgw.control
.rgw
.rgw.gc
.users.uid
Manually creating the rest of them has no effect. It complains about
failing to acquire locks and list objects:
2014-06-17 00:31:45.150494 7f86ec450820 0 ceph version 0.80.1
(a38fe1169b6d2ac98b427334c12d7cf81f809b74), process radosgw, pid 1704
2014-06-17 00:31:45.150556 7f86ec450820 -1 WARNING: libcurl doesn't
support curl_multi_wait()
2014-06-17 00:31:45.150590 7f86ec450820 -1 WARNING: cross zone / region
transfer performance may be affected
2014-06-17 00:32:02.469894 7f86ec450820 0 framework: fastcgi
2014-06-17 00:32:02.469958 7f86ec450820 0 starting handler: fastcgi
2014-06-17 00:32:13.455904 7f86bffff700 -1 failed to list objects
pool_iterate returned r=-2
2014-06-17 00:32:13.455918 7f86bffff700 0 ERROR: lists_keys_next():
ret=-2
2014-06-17 00:32:13.455924 7f86bffff700 0 ERROR: sync_all_users()
returned ret=-2
2014-06-17 00:32:13.611812 7f86d95f9700 0 RGWGC::process() failed to
acquire lock on gc.16
2014-06-17 00:32:14.105180 7f86d95f9700 0 RGWGC::process() failed to
acquire lock on gc.0
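For reference, the r=-2 and ret=-2 values in the log above look like negated errno codes, which is the usual convention in Ceph. A minimal Python sketch for decoding them (the describe() helper is hypothetical, only for illustration):

```python
import errno
import os


def describe(ret):
    """Map a negative return code (e.g. -2) to its errno name and message.

    Assumes the value is a negated errno, which is how the r=/ret= values
    in the radosgw log appear to be reported.
    """
    code = -ret
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


if __name__ == "__main__":
    # On Linux, -2 decodes to ENOENT ("No such file or directory").
    print(describe(-2))
```

If that reading is right, the list operation is failing because something it expects (a pool or object) does not exist, not because of a permissions problem.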
If I make a request, the server eventually fills up with so many radosgw
processes that the apache user can no longer fork any new ones.
This is an strace from apache:
read(13, "GET / HTTP/1.1\r\nUser-Agent: Wget/1.15 (linux-gnu)\r\nAccept:
*/*\r\nHost: gateway.ceph.chc.tlocal\r\nConnection: Keep-Alive\r\n\r\n",
8000) = 121
stat("/s3gw.fcgi", 0x7fffcc662580) = -1 ENOENT (No such file or
directory)
stat("/var/www/html/s3gw.fcgi", {st_mode=S_IFREG|0755, st_size=81, ...})
= 0
open("/var/www/html/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such
file or directory)
open("/var/www/html/s3gw.fcgi/.htaccess", O_RDONLY|O_CLOEXEC) = -1
ENOTDIR (Not a directory)
open("/var/www/html/s3gw.fcgi", O_RDONLY|O_CLOEXEC) = 14
fcntl(14, F_GETFD) = 0x1 (flags FD_CLOEXEC)
fcntl(14, F_SETFD, FD_CLOEXEC) = 0
read(14, "#!/bin/sh\nexec /usr/bin/radosgw -c /etc/ceph/ceph.conf -n
client.radosgw.gateway\n", 4096) = 81
stat("/var/www/html/s3gw.fcgi", {st_mode=S_IFREG|0755, st_size=81, ...})
= 0
brk(0x7fab146ce000) = 0x7fab146ce000
write(2, "[Mon Jun 16 22:26:03 2014] [warn] FastCGI: 10.30.85.51 GET
http://gateway.ceph.chc.tlocal/ auth \n", 97) = 97
stat("/var/run/mod_fastcgi/dynamic/2a13e94a006b7f947a721cf995159615",
{st_mode=S_IFSOCK|0600, st_size=0, ...}) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 15
connect(15, {sa_family=AF_FILE,
path="/var/run/mod_fastcgi/dynamic/2a13e94a006b7f947a721cf995159615"},
63) = 0
fcntl(15, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(15, F_SETFL, O_RDWR|O_NONBLOCK) = 0
select(16, [15], [15], NULL, {3, 99784}) = 1 (out [15], left {3, 99781})
write(15,
"\1\1\0\1\0\10\0\0\0\1\0\0\0\0\0\0\1\4\0\1\0\r\0\0\n\1SCRIPT_URL/\1\4\0\1\0+\0\0\n\37SCRIPT_URIhttp://gateway.ceph.chc.tlocal/\1\4\0\1\0\24\0\0\22\0HTTP_AUTHORIZATION\1\4\0\1\0&\0\0\17\25HTTP_USER_AGENTWget/1.15
(linux-gnu)\1\4\0\1\0\20\0\0\v\3HTTP_ACCEPT*/*\1\4\0\1\0\"\0\0\t\27HTTP_HOSTgateway.ceph.chc.tlocal\1\4\0\1\0\33\0\0\17\nHTTP_CONNECTIONKee"...,
841) = 841
select(16, [15], [], NULL, {3, 99624}) = 0 (Timeout)
write(12, "T /var/www/html/s3gw.fcgi 0 0*", 30) = 30
select(16, [15], [], NULL, {2, 996562}) = 0 (Timeout)
write(12, "T /var/www/html/s3gw.fcgi 0 0*", 30) = 30
select(16, [15], [], NULL, {2, 996992}) = 0 (Timeout)
write(12, "T /var/www/html/s3gw.fcgi 0 0*", 30) = 30
select(16, [15], [], NULL, {2, 996700}^C <unfinished ...>
...
This continues until it times out.
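To make the write(15, ...) payload in the strace easier to read: it appears to be a stream of FastCGI records, each with an 8-byte header (version, type, request id, content length, padding length, reserved). Below is a rough Python sketch that decodes the first two records. The sample bytes are copied from the start of that payload, with strace's octal escapes rewritten as hex; parse_records() and parse_pair() are hypothetical helper names, and parse_pair() assumes both lengths fit in one byte.

```python
import struct

# Only the FastCGI record types that matter for this trace.
FCGI_TYPES = {1: "BEGIN_REQUEST", 4: "PARAMS", 5: "STDIN"}


def parse_records(data):
    """Yield (type_name, request_id, content) for each complete record."""
    offset = 0
    while offset + 8 <= len(data):
        # Header: version, type, requestId (16 bit), contentLength (16 bit),
        # paddingLength, and one reserved byte (skipped with 'x').
        version, rtype, request_id, length, padding = struct.unpack_from(
            ">BBHHBx", data, offset
        )
        start = offset + 8
        content = data[start:start + length]
        yield FCGI_TYPES.get(rtype, str(rtype)), request_id, content
        offset = start + length + padding


def parse_pair(content):
    """Decode one name/value pair, assuming one-byte lengths (< 128)."""
    name_len, value_len = content[0], content[1]
    name = content[2:2 + name_len]
    value = content[2 + name_len:2 + name_len + value_len]
    return name.decode("ascii"), value.decode("ascii")


# First 37 bytes of the payload written to fd 15 in the strace.
sample = (
    b"\x01\x01\x00\x01\x00\x08\x00\x00"   # header: BEGIN_REQUEST, id 1, 8 bytes
    b"\x00\x01\x00\x00\x00\x00\x00\x00"   # body: role 1, flags 0
    b"\x01\x04\x00\x01\x00\x0d\x00\x00"   # header: PARAMS, id 1, 13 bytes
    b"\x0a\x01SCRIPT_URL/"                # name length 10, value length 1
)

records = list(parse_records(sample))

if __name__ == "__main__":
    for type_name, request_id, content in records:
        print(type_name, request_id, content)
    print(parse_pair(records[1][2]))
```

So apache does hand the request to the socket (the write returns 841), and the repeated select() timeouts afterwards show that nothing on the other end answers.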
In /var/log/ceph/client.radosgw.gateway.log, the following repeats as each
of the other processes starts:
2014-06-16 22:27:28.411653 7f84742fb820 0 ceph version 0.80.1
(a38fe1169b6d2ac98b427334c12d7cf81f809b74), process radosgw, pid 12225
2014-06-16 22:27:28.411668 7f84742fb820 -1 WARNING: libcurl doesn't
support curl_multi_wait()
2014-06-16 22:27:28.411672 7f84742fb820 -1 WARNING: cross zone / region
transfer performance may be affected
2014-06-16 22:27:28.420286 7f84742fb820 -1 asok(0x8f2fe0)
AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed
to bind the UNIX domain socket to
'/var/run/ceph/ceph-client.radosgw.gateway.asok': (17) File exists
2014-06-16 22:27:28.954327 7f84742fb820 0 framework: fastcgi
2014-06-16 22:27:28.954344 7f84742fb820 0 starting handler: fastcgi
2014-06-16 22:27:29.168225 7f84620fb700 0 RGWGC::process() failed to
acquire lock on gc.11
2014-06-16 22:27:29.170141 7f84620fb700 0 RGWGC::process() failed to
acquire lock on gc.12
2014-06-16 22:27:29.396012 7f84620fb700 0 RGWGC::process() failed to
acquire lock on gc.22
2014-06-16 22:27:29.396800 7f8460cf9700 -1 failed to list objects
pool_iterate returned r=-2
2014-06-16 22:27:29.396815 7f8460cf9700 0 ERROR: lists_keys_next():
ret=-2
2014-06-16 22:27:29.396821 7f8460cf9700 0 ERROR: sync_all_users()
returned ret=-2
2014-06-16 22:27:29.920754 7f84620fb700 0 RGWGC::process() failed to
acquire lock on gc.26
2014-06-16 22:27:29.922741 7f84620fb700 0 RGWGC::process() failed to
acquire lock on gc.27
...
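In case it helps anyone comparing notes, here is a small Python sketch for summarizing a log like the one above: how many radosgw startups it contains, how many admin socket bind failures, and which gc shards failed to lock. The summarize() function is hypothetical, and the sample text is a shortened copy of the excerpt above. Since the lines are wrapped in this mail, whitespace is collapsed before matching.

```python
import re
from collections import Counter


def summarize(log_text):
    """Count radosgw startups, asok bind failures, and gc lock failures."""
    # Collapse line wraps so patterns can match across wrapped lines.
    flat = " ".join(log_text.split())
    pids = re.findall(r"process radosgw, pid (\d+)", flat)
    shards = re.findall(r"failed to acquire lock on (gc\.\d+)", flat)
    return {
        "starts": len(pids),
        "pids": sorted(set(pids), key=int),
        "asok_bind_failures": flat.count(
            "failed to bind the UNIX domain socket"
        ),
        "gc_lock_failures": Counter(shards),
    }


sample_log = """\
2014-06-16 22:27:28.411653 7f84742fb820 0 ceph version 0.80.1
(a38fe1169b6d2ac98b427334c12d7cf81f809b74), process radosgw, pid 12225
2014-06-16 22:27:28.420286 7f84742fb820 -1 asok(0x8f2fe0)
AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed
to bind the UNIX domain socket to
'/var/run/ceph/ceph-client.radosgw.gateway.asok': (17) File exists
2014-06-16 22:27:29.168225 7f84620fb700 0 RGWGC::process() failed to
acquire lock on gc.11
2014-06-16 22:27:29.170141 7f84620fb700 0 RGWGC::process() failed to
acquire lock on gc.12
"""

summary = summarize(sample_log)

if __name__ == "__main__":
    print(summary)
```

Run against the full log file, the "starts" count should show how many radosgw instances were launched, which can be compared with the number of processes visible on the box.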