Good day! Is it possible to change the frontend to something other than Apache, for example Nginx?
Regards,
Artem Silenkov

2013/11/30 Sebastian <webmas...@mailz.de>

> Hi Yehuda,
>
> > It's interesting, the responses are received but it seems that they
> > aren't being handled (hence the following pings). There are a few
> > things that you could look at. First, try to connect to the admin
> > socket and see if you get any useful information from there. This
> > could include in-flight requests; look for other requests that have
> > not completed. Also see if there's any indication of request throttling.
>
> Do you refer to the methods mentioned here?
> http://ceph.com/docs/dumpling/radosgw/troubleshooting/?
> Unfortunately the socket file is not present. Do I have to activate it in
> the config somehow? I could not find any reference to that in the docs. Is
> it already included in my radosgw version?
>
> radosgw -v
> ceph version 0.67.4 (ad85b8bfafea6232d64cb7ba76a8b6e8252fa0c7)
>
> > Another thing to look at would be the seemingly unrelated timeout
> > messages. These should not happen and might indicate that there's
> > something that is holding you up that shouldn't. Try searching for the
> > same thread id that is specified in these messages (omit the 0x
> > prefix), and see what's the last thing that it's doing.
>
> I checked that:
> http://pastebin.com/Z23PWwjt
> I do not see anything unusual before the messages happen, but maybe you
> see something odd.
>
> > You could also try turning on 'debug objecter = 20', see if it
> > provides more info (it's very verbose though).
>
> Did that, but that is way too verbose for me ;) I uploaded it here:
> http://pastebin.com/VBPAVP6z
> There might be some requests mixed into it, but the one for
> cdn/52974400c6dd6ca719000004/source.avi is the one that stalled.
>
> > How much are you loading the gateway before that happens? We've seen a
> > similar issue in the past that was related to the fcgi library that is
> > dynamically linked with the radosgw process (that is, not the apache
> > mod_fastcgi module). This, however, would only happen when there's
> > heavy load and the fd numbers handled by the radosgw surpassed 1024
> > (buggy library that was using select() instead of poll()).
>
> There are not that many requests on the storage, maybe 10-20 req/min. The
> cluster serves as a source for a CDN, so once a resource is fetched it
> should not be fetched again soon. I checked the open files, and there
> are only about 10-20 open file handles for the radosgw process. So this
> is probably not the issue.
>
> Sebastian
>
> > Yehuda
> >
> > On Fri, Nov 29, 2013 at 7:28 AM, Sebastian <webmas...@mailz.de> wrote:
> >> Hi,
> >>
> >> thanks for the hint. I tried this again and noticed that the timeout
> >> message does seem to be unrelated. Here is the log file for a stalling
> >> request with debug turned on:
> >> http://pastebin.com/DcQuc9wP
> >>
> >> I cannot find a real "error" in the log. The download stalls at about
> >> 500kb at that point, though. Restarting radosgw fixes it for 1 download
> >> only; the next one is broken again. But as I said, this does not happen
> >> for all files.
> >>
> >> Sebastian
> >>
> >> On 27.11.2013, at 21:53, Yehuda Sadeh wrote:
> >>
> >>> On Wed, Nov 27, 2013 at 4:46 AM, Sebastian <webmas...@mailz.de> wrote:
> >>>> Hi,
> >>>>
> >>>> we have a setup of 4 servers running ceph and radosgw. We use it as
> >>>> an internal S3 service for our files. The servers run Debian Squeeze
> >>>> with Ceph 0.67.4.
> >>>>
> >>>> The cluster has been running smoothly for quite a while, but we are
> >>>> currently experiencing issues with the radosgw. For some files the
> >>>> HTTP download just stalls at around 500kb.
> >>>>
> >>>> The Apache error log just says:
> >>>> [error] [client ] FastCGI: comm with server "/var/www/s3gw.fcgi" aborted: idle timeout (30 sec)
> >>>> [error] [client ] Handler for fastcgi-script returned invalid result code 1
> >>>>
> >>>> radosgw logging:
> >>>> 7f00bc66a700 1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f00934bb700' had timed out after 600
> >>>> 7f00bc66a700 1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f00ab4eb700' had timed out after 600
> >>>>
> >>>> The interesting thing is that the cluster health is fine and only
> >>>> some files are not working properly. Most of them just work fine. A
> >>>> restart of radosgw fixes the issue. The other ceph logs are also clean.
> >>>>
> >>>> Any idea why this happens?
> >>>>
> >>>
> >>> No, but you can turn on 'debug ms = 1' on your gateway ceph.conf, and
> >>> that might give some better indication.
> >>>
> >>> Yehuda
> >>
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
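
For the thread-id search suggested in the quoted thread (take the thread id from the timeout message, drop the 0x prefix, and look at the last thing that thread logged), here is a minimal Python sketch. It assumes the log layout shown in the excerpts above, where the id of the logging thread appears as its own whitespace-separated field on each line. The function names are made up for illustration and are not part of any Ceph tooling.

```python
import re

# Matches the thread id inside a heartbeat timeout message such as
# "... 'RGWProcess::m_tp thread 0x7f00934bb700' had timed out after 600"
TIMEOUT_RE = re.compile(r"thread 0x([0-9a-f]+)' had timed out")


def timed_out_threads(lines):
    """Return thread ids (without the 0x prefix) named in timeout
    messages, in the order they are first seen."""
    seen = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


def last_activity(lines, thread_id):
    """Return the last line logged by thread_id itself, or None.

    Assumption: the logging thread's id is a separate field on the line.
    Timeout messages *about* the thread are skipped automatically,
    because there the id is embedded in the token "0x<id>'".
    """
    last = None
    for line in lines:
        if thread_id in line.split():
            last = line
    return last


# Usage (the log path is an assumption, adjust to your setup):
#   with open("/var/log/ceph/radosgw.log") as f:
#       lines = f.read().splitlines()
#   for tid in timed_out_threads(lines):
#       print(tid, "->", last_activity(lines, tid))
```

With 'debug ms = 1' or 'debug objecter = 20' enabled the log gets large, so filtering by thread id first may make the pastebin output easier to read.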
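
On the select() vs. poll() point from the quoted thread: what matters for select() is the highest fd number, not only how many files are open. A rough Python sketch for checking this on Linux follows. It assumes /proc is available and uses 1024 as the limit, which is the usual FD_SETSIZE on Linux/glibc. The helper names are hypothetical.

```python
import os

# Usual select() limit on Linux/glibc: fd numbers must be below this.
FD_SETSIZE = 1024


def open_fd_numbers(pid):
    """List the numeric fds currently open in process `pid`.

    Linux only: reads /proc/<pid>/fd, which needs permission to
    inspect the target process (same user or root).
    """
    return sorted(int(name) for name in os.listdir(f"/proc/{pid}/fd"))


def select_limit_at_risk(fds, limit=FD_SETSIZE):
    """True if any fd number is at or above the select() limit."""
    return any(fd >= limit for fd in fds)


# Usage (12345 is a placeholder for the radosgw pid):
#   fds = open_fd_numbers(12345)
#   print(len(fds), "open fds, highest:", max(fds))
#   print("select() limit at risk:", select_limit_at_risk(fds))
```

With only 10-20 open handles, as reported above, the highest fd number is very likely far below 1024, which fits the conclusion that this is not the cause.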