Re: [ceph-users] OpTracker optimization
Sam/Sage, I saw Giant is forked off today. We need the pull request (https://github.com/ceph/ceph/pull/2440) to be in Giant. So, could you please merge this into Giant when it will be ready ? Thanks & Regards Somnath -Original Message- From: Samuel Just [mailto:sam.j...@inktank.com] Sent: Thursday, September 11, 2014 11:31 AM To: Somnath Roy Cc: Sage Weil; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com Subject: Re: OpTracker optimization Just added it to wip-sam-testing. -Sam On Thu, Sep 11, 2014 at 11:30 AM, Somnath Roy wrote: > Sam/Sage, > I have addressed all of your comments and pushed the changes to the same pull > request. > > https://github.com/ceph/ceph/pull/2440 > > Thanks & Regards > Somnath > > -Original Message- > From: Sage Weil [mailto:sw...@redhat.com] > Sent: Wednesday, September 10, 2014 8:33 PM > To: Somnath Roy > Cc: Samuel Just; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com > Subject: RE: OpTracker optimization > > I had two substantiative comments on the first patch and then some trivial > whitespace nits.Otherwise looks good! > > tahnks- > sage > > On Thu, 11 Sep 2014, Somnath Roy wrote: > >> Sam/Sage, >> I have incorporated all of your comments. Please have a look at the same >> pull request. >> >> https://github.com/ceph/ceph/pull/2440 >> >> Thanks & Regards >> Somnath >> >> -Original Message- >> From: Samuel Just [mailto:sam.j...@inktank.com] >> Sent: Wednesday, September 10, 2014 3:25 PM >> To: Somnath Roy >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> ceph-users@lists.ceph.com >> Subject: Re: OpTracker optimization >> >> Oh, I changed my mind, your approach is fine. I was unclear. >> Currently, I just need you to address the other comments. >> -Sam >> >> On Wed, Sep 10, 2014 at 3:13 PM, Somnath Roy wrote: >> > As I understand, you want me to implement the following. >> > >> > 1. Keep this implementation one sharded optracker for the ios going >> > through ms_dispatch path. >> > >> > 2. Additionally, for ios going through ms_fast_dispatch, you want >> > me to implement optracker (without internal shard) per opwq shard >> > >> > Am I right ? >> > >> > Thanks & Regards >> > Somnath >> > >> > -Original Message- >> > From: Samuel Just [mailto:sam.j...@inktank.com] >> > Sent: Wednesday, September 10, 2014 3:08 PM >> > To: Somnath Roy >> > Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> > ceph-users@lists.ceph.com >> > Subject: Re: OpTracker optimization >> > >> > I don't quite understand. >> > -Sam >> > >> > On Wed, Sep 10, 2014 at 2:38 PM, Somnath Roy >> > wrote: >> >> Thanks Sam. >> >> So, you want me to go with optracker/shadedopWq , right ? >> >> >> >> Regards >> >> Somnath >> >> >> >> -Original Message- >> >> From: Samuel Just [mailto:sam.j...@inktank.com] >> >> Sent: Wednesday, September 10, 2014 2:36 PM >> >> To: Somnath Roy >> >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> >> ceph-users@lists.ceph.com >> >> Subject: Re: OpTracker optimization >> >> >> >> Responded with cosmetic nonsense. Once you've got that and the other >> >> comments addressed, I can put it in wip-sam-testing. 
>> >> -Sam >> >> >> >> On Wed, Sep 10, 2014 at 1:30 PM, Somnath Roy >> >> wrote: >> >>> Thanks Sam..I responded back :-) >> >>> >> >>> -Original Message- >> >>> From: ceph-devel-ow...@vger.kernel.org >> >>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel >> >>> Just >> >>> Sent: Wednesday, September 10, 2014 11:17 AM >> >>> To: Somnath Roy >> >>> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> >>> ceph-users@lists.ceph.com >> >>> Subject: Re: OpTracker optimization >> >>> >> >>> Added a comment about the approach. >> >>> -Sam >> >>> >> >>> On Tue, Sep 9, 2014 at 1:33 PM, Somnath Roy >> >>> wrote: >> Hi Sam/Sage, >> >> As we discussed earlier, enabling the present OpTracker code >> degrading performance severely. For example, in my setup a >> single OSD node with >> 10 clients is reaching ~103K read iops with io served from >> memory while optracking is disabled but enabling optracker it is >> reduced to ~39K iops. >> Probably, running OSD without enabling OpTracker is not an >> option for many of Ceph users. >> >> Now, by sharding the Optracker:: ops_in_flight_lock (thus xlist >> ops_in_flight) and removing some other bottlenecks I am able to >> match the performance of OpTracking enabled OSD with OpTracking >> disabled, but with the expense of ~1 extra cpu core. >> >> In this process I have also fixed the following tracker. >> >> >> >> http://tracker.ceph.com/issues/9384 >> >> >> >> and probably http://tracker.ceph.com/issues/8885 too. >> >> >> >> I have created following pull request for the same. Please review it. >> >> >> >> https://github.com/cep
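For anyone who wants to reproduce the enabled-vs-disabled comparison Somnath describes, op tracking can be switched off per OSD. A minimal sketch, assuming the osd_enable_op_tracker option present in Firefly-era builds and using osd.0 purely as a placeholder id; depending on the release the change may only take effect after an OSD restart:

# ceph.conf, [osd] section -- start OSDs with op tracking disabled
#   osd enable op tracker = false

# confirm what a running OSD currently has (admin socket, run on the OSD host)
ceph daemon osd.0 config get osd_enable_op_tracker

# restart the OSD and re-run the read benchmark
service ceph restart osd.0   # sysvinit example; use your init system's equivalent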
Re: [ceph-users] OpTracker optimization
Hi, as ceph user, It could be wonderfull to have it for Giant, optracker performance impact is really huge (See my ssd benchmark on ceph user mailing) Regards, Alexandre Derumier - Mail original - De: "Somnath Roy" À: "Samuel Just" Cc: "Sage Weil" , ceph-de...@vger.kernel.org, ceph-users@lists.ceph.com Envoyé: Samedi 13 Septembre 2014 10:03:52 Objet: Re: [ceph-users] OpTracker optimization Sam/Sage, I saw Giant is forked off today. We need the pull request (https://github.com/ceph/ceph/pull/2440) to be in Giant. So, could you please merge this into Giant when it will be ready ? Thanks & Regards Somnath -Original Message- From: Samuel Just [mailto:sam.j...@inktank.com] Sent: Thursday, September 11, 2014 11:31 AM To: Somnath Roy Cc: Sage Weil; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com Subject: Re: OpTracker optimization Just added it to wip-sam-testing. -Sam On Thu, Sep 11, 2014 at 11:30 AM, Somnath Roy wrote: > Sam/Sage, > I have addressed all of your comments and pushed the changes to the same pull > request. > > https://github.com/ceph/ceph/pull/2440 > > Thanks & Regards > Somnath > > -Original Message- > From: Sage Weil [mailto:sw...@redhat.com] > Sent: Wednesday, September 10, 2014 8:33 PM > To: Somnath Roy > Cc: Samuel Just; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com > Subject: RE: OpTracker optimization > > I had two substantiative comments on the first patch and then some trivial > whitespace nits. Otherwise looks good! > > tahnks- > sage > > On Thu, 11 Sep 2014, Somnath Roy wrote: > >> Sam/Sage, >> I have incorporated all of your comments. Please have a look at the same >> pull request. >> >> https://github.com/ceph/ceph/pull/2440 >> >> Thanks & Regards >> Somnath >> >> -Original Message- >> From: Samuel Just [mailto:sam.j...@inktank.com] >> Sent: Wednesday, September 10, 2014 3:25 PM >> To: Somnath Roy >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> ceph-users@lists.ceph.com >> Subject: Re: OpTracker optimization >> >> Oh, I changed my mind, your approach is fine. I was unclear. >> Currently, I just need you to address the other comments. >> -Sam >> >> On Wed, Sep 10, 2014 at 3:13 PM, Somnath Roy >> wrote: >> > As I understand, you want me to implement the following. >> > >> > 1. Keep this implementation one sharded optracker for the ios going >> > through ms_dispatch path. >> > >> > 2. Additionally, for ios going through ms_fast_dispatch, you want >> > me to implement optracker (without internal shard) per opwq shard >> > >> > Am I right ? >> > >> > Thanks & Regards >> > Somnath >> > >> > -Original Message- >> > From: Samuel Just [mailto:sam.j...@inktank.com] >> > Sent: Wednesday, September 10, 2014 3:08 PM >> > To: Somnath Roy >> > Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> > ceph-users@lists.ceph.com >> > Subject: Re: OpTracker optimization >> > >> > I don't quite understand. >> > -Sam >> > >> > On Wed, Sep 10, 2014 at 2:38 PM, Somnath Roy >> > wrote: >> >> Thanks Sam. >> >> So, you want me to go with optracker/shadedopWq , right ? >> >> >> >> Regards >> >> Somnath >> >> >> >> -Original Message- >> >> From: Samuel Just [mailto:sam.j...@inktank.com] >> >> Sent: Wednesday, September 10, 2014 2:36 PM >> >> To: Somnath Roy >> >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> >> ceph-users@lists.ceph.com >> >> Subject: Re: OpTracker optimization >> >> >> >> Responded with cosmetic nonsense. Once you've got that and the other >> >> comments addressed, I can put it in wip-sam-testing. 
>> >> -Sam >> >> >> >> On Wed, Sep 10, 2014 at 1:30 PM, Somnath Roy >> >> wrote: >> >>> Thanks Sam..I responded back :-) >> >>> >> >>> -Original Message- >> >>> From: ceph-devel-ow...@vger.kernel.org >> >>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel >> >>> Just >> >>> Sent: Wednesday, September 10, 2014 11:17 AM >> >>> To: Somnath Roy >> >>> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> >>> ceph-users@lists.ceph.com >> >>> Subject: Re: OpTracker optimization >> >>> >> >>> Added a comment about the approach. >> >>> -Sam >> >>> >> >>> On Tue, Sep 9, 2014 at 1:33 PM, Somnath Roy >> >>> wrote: >> Hi Sam/Sage, >> >> As we discussed earlier, enabling the present OpTracker code >> degrading performance severely. For example, in my setup a >> single OSD node with >> 10 clients is reaching ~103K read iops with io served from >> memory while optracking is disabled but enabling optracker it is >> reduced to ~39K iops. >> Probably, running OSD without enabling OpTracker is not an >> option for many of Ceph users. >> >> Now, by sharding the Optracker:: ops_in_flight_lock (thus xlist >> ops_in_flight) and removing some other bottlenecks I a
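Worth noting for anyone tempted to simply leave op tracking off: the OpTracker being optimized here is what serves the in-flight and historic op dumps on the OSD admin socket, which is the main reason a fix rather than a disable is attractive. A quick sketch of those queries, with osd.0 as a placeholder and a default admin socket path assumed:

# requests the OSD is currently tracking
ceph daemon osd.0 dump_ops_in_flight

# recently completed requests kept around for slow-request analysis
ceph daemon osd.0 dump_historic_ops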
Re: [ceph-users] error while installing ceph in cluster node
sorry. that was the wrong log. there was some issue with ceph user while doing yum remotely. So I tried to install ceph using root user. Below is the log [root@ceph-admin ~]# ceph-deploy install ceph-osd [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf [ceph_deploy.cli][INFO ] Invoked (1.5.14): /usr/bin/ceph-deploy install ceph-osd [ceph_deploy.install][DEBUG ] Installing stable version firefly on cluster ceph hosts ceph-osd [ceph_deploy.install][DEBUG ] Detecting platform for host ceph-osd ... [ceph-osd][DEBUG ] connected to host: ceph-osd [ceph-osd][DEBUG ] detect platform information from remote host [ceph-osd][DEBUG ] detect machine type [ceph_deploy.install][INFO ] Distro info: CentOS 6.4 Final [ceph-osd][INFO ] installing ceph on ceph-osd [ceph-osd][INFO ] Running command: yum clean all [ceph-osd][DEBUG ] Loaded plugins: fastestmirror, security [ceph-osd][DEBUG ] Cleaning repos: base extras updates [ceph-osd][DEBUG ] Cleaning up Everything [ceph-osd][INFO ] Running command: yum -y install wget [ceph-osd][DEBUG ] Loaded plugins: fastestmirror, security [ceph-osd][WARNIN] You need to be root to perform this command. [ceph-osd][ERROR ] RuntimeError: command returned non-zero exit status: 1 [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: yum -y install wget Regards, Subhadip 9741779086 --- On Fri, Sep 12, 2014 at 6:54 PM, Alfredo Deza wrote: > The issue is right there in the logs :) Somehow it couldn't reach that > fedora URL? > > > > On Fri, Sep 12, 2014 at 8:34 AM, Subhadip Bagui wrote: > > Hi, > > > > As you suggested in 2nd option, I created the ssh keys as root user and > > copied the ssh key with other ceph nodes like osd, mon and mds, using > below > > command. > > > > > > [root@ceph-admin ~]# ssh -keygen > > > > [root@ceph-admin ~]# ssh-copy-id root@ceph-osd > > [root@ceph-admin ~]# ssh-copy-id root@ceph-mds > > [root@ceph-admin ~]# ssh-copy-id root@ceph-monitor > > > > And I'm trying to install ceph using below command. But getting error. > > Please let me know what is the issue. > > > > [ceph@ceph-admin ~]$ ceph-deploy install ceph-osd > > > > [ceph_deploy.conf][DEBUG ] found configuration file at: > > /home/ceph/.cephdeploy.conf > > [ceph_deploy.cli][INFO ] Invoked (1.5.14): /usr/bin/ceph-deploy install > > ceph-osd > > [ceph_deploy.install][DEBUG ] Installing stable version firefly on > cluster > > ceph hosts ceph-osd > > [ceph_deploy.install][DEBUG ] Detecting platform for host ceph-osd ... 
> > [ceph-osd][DEBUG ] connected to host: ceph-osd > > [ceph-osd][DEBUG ] detect platform information from remote host > > [ceph-osd][DEBUG ] detect machine type > > [ceph_deploy.install][INFO ] Distro info: CentOS 6.4 Final > > [ceph-osd][INFO ] installing ceph on ceph-osd > > [ceph-osd][INFO ] Running command: sudo yum clean all > > [ceph-osd][DEBUG ] Loaded plugins: fastestmirror, security > > [ceph-osd][DEBUG ] Cleaning repos: base extras updates > > [ceph-osd][DEBUG ] Cleaning up Everything > > [ceph-osd][DEBUG ] Cleaning up list of fastest mirrors > > [ceph-osd][INFO ] Running command: sudo yum -y install wget > > [ceph-osd][DEBUG ] Loaded plugins: fastestmirror, security > > [ceph-osd][DEBUG ] Determining fastest mirrors > > [ceph-osd][DEBUG ] * base: centos.excellmedia.net > > [ceph-osd][DEBUG ] * extras: centos.excellmedia.net > > [ceph-osd][DEBUG ] * updates: centos.excellmedia.net > > [ceph-osd][DEBUG ] Setting up Install Process > > [ceph-osd][DEBUG ] Package wget-1.12-1.11.el6_5.x86_64 already installed > and > > latest version > > [ceph-osd][DEBUG ] Nothing to do > > [ceph-osd][INFO ] adding EPEL repository > > [ceph-osd][INFO ] Running command: sudo wget > > > http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm > > [ceph-osd][WARNIN] No data was received after 300 seconds, > disconnecting... > > [ceph-osd][INFO ] Running command: sudo rpm -Uvh --replacepkgs > > epel-release-6*.rpm > > [ceph-osd][WARNIN] error: File not found by glob: epel-release-6*.rpm > > [ceph-osd][ERROR ] RuntimeError: command returned non-zero exit status: 1 > > [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: rpm -Uvh > > --replacepkgs epel-release-6*.rpm > > > > > > Regards, > > Subhadip > > > > > --- > > > > On Thu, Sep 11, 2014 at 6:29 PM, Alfredo Deza > > wrote: > >> > >> We discourage users from using `root` to call ceph-deploy or to call > >> it with `sudo` for this reason. > >> > >> We have a warning in the docs about it if you are getting started in > >> the Ceph Node Setup section: > >> > >> > http://ceph.com/docs/v0.80.5/start/quick-start-preflight/#ceph-deploy-setup > >> > >> The reason for this is that if you configure ssh to login to the > >> remote server as a non-root user (say us
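The preflight steps Alfredo points to boil down to creating a dedicated non-root user with passwordless sudo on every node and letting ceph-deploy connect as that user, so remote commands run as "sudo yum ..." (as in the earlier successful log) rather than plain "yum ...". A rough sketch under those assumptions -- the user name "ceph" and the host names are only the examples from the guide:

# on each node (ceph-osd, ceph-monitor, ceph-mds): create the deploy user
sudo useradd -d /home/ceph -m ceph
sudo passwd ceph

# give it passwordless sudo
echo "ceph ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ceph
sudo chmod 0440 /etc/sudoers.d/ceph
# on CentOS 6, also allow sudo over ssh for this user
# (visudo: add "Defaults:ceph !requiretty")

# on the admin node, as the ceph user (not root): distribute keys and install
ssh-keygen
ssh-copy-id ceph@ceph-osd
ceph-deploy install ceph-osd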
Re: [ceph-users] OpTracker optimization
On Sat, 13 Sep 2014, Alexandre DERUMIER wrote: > Hi, > as ceph user, It could be wonderfull to have it for Giant, > optracker performance impact is really huge (See my ssd benchmark on ceph > user mailing) Definitely. More importantly, it resolves a few crashes we've observed. It's going through some testing right now, but once that's done it'll go into giant. sage > > Regards, > > Alexandre Derumier > > - Mail original - > > De: "Somnath Roy" > ?: "Samuel Just" > Cc: "Sage Weil" , ceph-de...@vger.kernel.org, > ceph-users@lists.ceph.com > Envoy?: Samedi 13 Septembre 2014 10:03:52 > Objet: Re: [ceph-users] OpTracker optimization > > Sam/Sage, > I saw Giant is forked off today. We need the pull request > (https://github.com/ceph/ceph/pull/2440) to be in Giant. So, could you please > merge this into Giant when it will be ready ? > > Thanks & Regards > Somnath > > -Original Message- > From: Samuel Just [mailto:sam.j...@inktank.com] > Sent: Thursday, September 11, 2014 11:31 AM > To: Somnath Roy > Cc: Sage Weil; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com > Subject: Re: OpTracker optimization > > Just added it to wip-sam-testing. > -Sam > > On Thu, Sep 11, 2014 at 11:30 AM, Somnath Roy > wrote: > > Sam/Sage, > > I have addressed all of your comments and pushed the changes to the same > > pull request. > > > > https://github.com/ceph/ceph/pull/2440 > > > > Thanks & Regards > > Somnath > > > > -Original Message- > > From: Sage Weil [mailto:sw...@redhat.com] > > Sent: Wednesday, September 10, 2014 8:33 PM > > To: Somnath Roy > > Cc: Samuel Just; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com > > Subject: RE: OpTracker optimization > > > > I had two substantiative comments on the first patch and then some trivial > > whitespace nits. Otherwise looks good! > > > > tahnks- > > sage > > > > On Thu, 11 Sep 2014, Somnath Roy wrote: > > > >> Sam/Sage, > >> I have incorporated all of your comments. Please have a look at the same > >> pull request. > >> > >> https://github.com/ceph/ceph/pull/2440 > >> > >> Thanks & Regards > >> Somnath > >> > >> -Original Message- > >> From: Samuel Just [mailto:sam.j...@inktank.com] > >> Sent: Wednesday, September 10, 2014 3:25 PM > >> To: Somnath Roy > >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> ceph-users@lists.ceph.com > >> Subject: Re: OpTracker optimization > >> > >> Oh, I changed my mind, your approach is fine. I was unclear. > >> Currently, I just need you to address the other comments. > >> -Sam > >> > >> On Wed, Sep 10, 2014 at 3:13 PM, Somnath Roy > >> wrote: > >> > As I understand, you want me to implement the following. > >> > > >> > 1. Keep this implementation one sharded optracker for the ios going > >> > through ms_dispatch path. > >> > > >> > 2. Additionally, for ios going through ms_fast_dispatch, you want > >> > me to implement optracker (without internal shard) per opwq shard > >> > > >> > Am I right ? > >> > > >> > Thanks & Regards > >> > Somnath > >> > > >> > -Original Message- > >> > From: Samuel Just [mailto:sam.j...@inktank.com] > >> > Sent: Wednesday, September 10, 2014 3:08 PM > >> > To: Somnath Roy > >> > Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> > ceph-users@lists.ceph.com > >> > Subject: Re: OpTracker optimization > >> > > >> > I don't quite understand. > >> > -Sam > >> > > >> > On Wed, Sep 10, 2014 at 2:38 PM, Somnath Roy > >> > wrote: > >> >> Thanks Sam. > >> >> So, you want me to go with optracker/shadedopWq , right ? 
> >> >> > >> >> Regards > >> >> Somnath > >> >> > >> >> -Original Message- > >> >> From: Samuel Just [mailto:sam.j...@inktank.com] > >> >> Sent: Wednesday, September 10, 2014 2:36 PM > >> >> To: Somnath Roy > >> >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> >> ceph-users@lists.ceph.com > >> >> Subject: Re: OpTracker optimization > >> >> > >> >> Responded with cosmetic nonsense. Once you've got that and the other > >> >> comments addressed, I can put it in wip-sam-testing. > >> >> -Sam > >> >> > >> >> On Wed, Sep 10, 2014 at 1:30 PM, Somnath Roy > >> >> wrote: > >> >>> Thanks Sam..I responded back :-) > >> >>> > >> >>> -Original Message- > >> >>> From: ceph-devel-ow...@vger.kernel.org > >> >>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel > >> >>> Just > >> >>> Sent: Wednesday, September 10, 2014 11:17 AM > >> >>> To: Somnath Roy > >> >>> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> >>> ceph-users@lists.ceph.com > >> >>> Subject: Re: OpTracker optimization > >> >>> > >> >>> Added a comment about the approach. > >> >>> -Sam > >> >>> > >> >>> On Tue, Sep 9, 2014 at 1:33 PM, Somnath Roy > >> >>> wrote: > >> Hi Sam/Sage, > >> > >> As we discussed earlier, enabling the present OpTracker code > >>
Re: [ceph-users] OpTracker optimization
Thanks Sage! -Original Message- From: Sage Weil [mailto:sw...@redhat.com] Sent: Saturday, September 13, 2014 7:32 AM To: Alexandre DERUMIER Cc: Somnath Roy; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com; Samuel Just Subject: Re: [ceph-users] OpTracker optimization On Sat, 13 Sep 2014, Alexandre DERUMIER wrote: > Hi, > as ceph user, It could be wonderfull to have it for Giant, optracker > performance impact is really huge (See my ssd benchmark on ceph user > mailing) Definitely. More importantly, it resolves a few crashes we've observed. It's going through some testing right now, but once that's done it'll go into giant. sage > > Regards, > > Alexandre Derumier > > - Mail original - > > De: "Somnath Roy" > ?: "Samuel Just" > Cc: "Sage Weil" , ceph-de...@vger.kernel.org, > ceph-users@lists.ceph.com > Envoy?: Samedi 13 Septembre 2014 10:03:52 > Objet: Re: [ceph-users] OpTracker optimization > > Sam/Sage, > I saw Giant is forked off today. We need the pull request > (https://github.com/ceph/ceph/pull/2440) to be in Giant. So, could you please > merge this into Giant when it will be ready ? > > Thanks & Regards > Somnath > > -Original Message- > From: Samuel Just [mailto:sam.j...@inktank.com] > Sent: Thursday, September 11, 2014 11:31 AM > To: Somnath Roy > Cc: Sage Weil; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com > Subject: Re: OpTracker optimization > > Just added it to wip-sam-testing. > -Sam > > On Thu, Sep 11, 2014 at 11:30 AM, Somnath Roy > wrote: > > Sam/Sage, > > I have addressed all of your comments and pushed the changes to the same > > pull request. > > > > https://github.com/ceph/ceph/pull/2440 > > > > Thanks & Regards > > Somnath > > > > -Original Message- > > From: Sage Weil [mailto:sw...@redhat.com] > > Sent: Wednesday, September 10, 2014 8:33 PM > > To: Somnath Roy > > Cc: Samuel Just; ceph-de...@vger.kernel.org; > > ceph-users@lists.ceph.com > > Subject: RE: OpTracker optimization > > > > I had two substantiative comments on the first patch and then some > > trivial whitespace nits. Otherwise looks good! > > > > tahnks- > > sage > > > > On Thu, 11 Sep 2014, Somnath Roy wrote: > > > >> Sam/Sage, > >> I have incorporated all of your comments. Please have a look at the same > >> pull request. > >> > >> https://github.com/ceph/ceph/pull/2440 > >> > >> Thanks & Regards > >> Somnath > >> > >> -Original Message- > >> From: Samuel Just [mailto:sam.j...@inktank.com] > >> Sent: Wednesday, September 10, 2014 3:25 PM > >> To: Somnath Roy > >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> ceph-users@lists.ceph.com > >> Subject: Re: OpTracker optimization > >> > >> Oh, I changed my mind, your approach is fine. I was unclear. > >> Currently, I just need you to address the other comments. > >> -Sam > >> > >> On Wed, Sep 10, 2014 at 3:13 PM, Somnath Roy > >> wrote: > >> > As I understand, you want me to implement the following. > >> > > >> > 1. Keep this implementation one sharded optracker for the ios going > >> > through ms_dispatch path. > >> > > >> > 2. Additionally, for ios going through ms_fast_dispatch, you want > >> > me to implement optracker (without internal shard) per opwq shard > >> > > >> > Am I right ? 
> >> > > >> > Thanks & Regards > >> > Somnath > >> > > >> > -Original Message- > >> > From: Samuel Just [mailto:sam.j...@inktank.com] > >> > Sent: Wednesday, September 10, 2014 3:08 PM > >> > To: Somnath Roy > >> > Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> > ceph-users@lists.ceph.com > >> > Subject: Re: OpTracker optimization > >> > > >> > I don't quite understand. > >> > -Sam > >> > > >> > On Wed, Sep 10, 2014 at 2:38 PM, Somnath Roy > >> > wrote: > >> >> Thanks Sam. > >> >> So, you want me to go with optracker/shadedopWq , right ? > >> >> > >> >> Regards > >> >> Somnath > >> >> > >> >> -Original Message- > >> >> From: Samuel Just [mailto:sam.j...@inktank.com] > >> >> Sent: Wednesday, September 10, 2014 2:36 PM > >> >> To: Somnath Roy > >> >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> >> ceph-users@lists.ceph.com > >> >> Subject: Re: OpTracker optimization > >> >> > >> >> Responded with cosmetic nonsense. Once you've got that and the other > >> >> comments addressed, I can put it in wip-sam-testing. > >> >> -Sam > >> >> > >> >> On Wed, Sep 10, 2014 at 1:30 PM, Somnath Roy > >> >> wrote: > >> >>> Thanks Sam..I responded back :-) > >> >>> > >> >>> -Original Message- > >> >>> From: ceph-devel-ow...@vger.kernel.org > >> >>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel > >> >>> Just > >> >>> Sent: Wednesday, September 10, 2014 11:17 AM > >> >>> To: Somnath Roy > >> >>> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> >>> ceph-users@lists.ceph.com > >> >>> Subject: Re: OpTracker optimization > >> >>> > >> >>> Added a comment abou
[ceph-users] Cache tier unable to auto flush data to storage tier
Hello Cephers,

I have created a cache pool and it looks like the cache tiering agent is not able to flush/evict data as per the defined policy. However, when I manually evict/flush data, it migrates data from the cache tier to the storage tier.

Kindly advise if there is something wrong with the policy or anything else I am missing.

Ceph Version: 0.80.5
OS: CentOS 6.4

Cache pool created using the following commands:

ceph osd tier add data cache-pool
ceph osd tier cache-mode cache-pool writeback
ceph osd tier set-overlay data cache-pool
ceph osd pool set cache-pool hit_set_type bloom
ceph osd pool set cache-pool hit_set_count 1
ceph osd pool set cache-pool hit_set_period 300
ceph osd pool set cache-pool target_max_bytes 1
ceph osd pool set cache-pool target_max_objects 100
ceph osd pool set cache-pool cache_min_flush_age 60
ceph osd pool set cache-pool cache_min_evict_age 60

[root@ceph-node1 ~]# date
Sun Sep 14 00:49:59 EEST 2014
[root@ceph-node1 ~]# rados -p data put file1 /etc/hosts
[root@ceph-node1 ~]# rados -p data ls
[root@ceph-node1 ~]# rados -p cache-pool ls
file1
[root@ceph-node1 ~]#

[root@ceph-node1 ~]# date
Sun Sep 14 00:59:33 EEST 2014
[root@ceph-node1 ~]# rados -p data ls
[root@ceph-node1 ~]#
[root@ceph-node1 ~]# rados -p cache-pool ls
file1
[root@ceph-node1 ~]#

[root@ceph-node1 ~]# date
Sun Sep 14 01:08:02 EEST 2014
[root@ceph-node1 ~]# rados -p data ls
[root@ceph-node1 ~]# rados -p cache-pool ls
file1
[root@ceph-node1 ~]#

[root@ceph-node1 ~]# rados -p cache-pool cache-flush-evict-all
file1
[root@ceph-node1 ~]#
[root@ceph-node1 ~]# rados -p data ls
file1
[root@ceph-node1 ~]# rados -p cache-pool ls
[root@ceph-node1 ~]#

Regards
Karan Singh

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
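Before tuning anything, it can help to read back what the tiering agent actually has to work with; as the reply further down notes, flushing and eviction are driven by the dirty/full ratios relative to target_max_bytes and target_max_objects, while cache_min_flush_age/cache_min_evict_age only delay them. A short sketch for inspecting the pool, assuming these get keys are available in 0.80.x:

# the pool line in the osd map shows the cache settings in one place
ceph osd dump | grep cache-pool

# or query them individually
ceph osd pool get cache-pool target_max_bytes
ceph osd pool get cache-pool target_max_objects
ceph osd pool get cache-pool cache_target_dirty_ratio
ceph osd pool get cache-pool cache_target_full_ratio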
[ceph-users] writing to rbd mapped device produces hang tasks
Hello guys, I've been trying to map an rbd disk to run some testing and I've noticed that while I can successfully read from the rbd image mapped to /dev/rbdX, I am failing to reliably write to it. Sometimes write tests work perfectly well, especially if I am using large block sizes. But often writes hang for a considerable amount of time and I have kernel hang task messages (shown below) in my dmesg. the hang tasks show particularly frequently when using 4K block size. However, with large block sizes writes also sometimes hang, but for sure less frequent I am using simple dd tests like: dd if=/dev/zero of= bs=4K count=250K. I am running firefly on Ubuntu 12.04 on all osd/mon servers. The rbd image is mapped on one of the osd servers. All osd servers are running kernel version 3.15.10-031510-generic . Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.439974] INFO: task jbd2/rbd0-8:3505 blocked for more than 120 seconds. Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.441586] Not tainted 3.15.10-031510-generic #201408132333 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.443022] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444862] jbd2/rbd0-8 D 0003 0 3505 2 0x Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444870] 8803a10a7c48 0002 88007963b288 8803a10a7fd8 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444874] 00014540 00014540 880469f63260 880866969930 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444876] 8803a10a7c58 8803a10a7d88 88034d142100 880848519824 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444879] Call Trace: Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444893] [] schedule+0x29/0x70 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444901] [] jbd2_journal_commit_transaction+0x240/0x1510 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444908] [] ? sched_clock_cpu+0x85/0xc0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444920] [] ? arch_vtime_task_switch+0x8a/0x90 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444923] [] ? vtime_common_task_switch+0x3d/0x50 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444928] [] ? __wake_up_sync+0x20/0x20 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444933] [] ? try_to_del_timer_sync+0x4f/0x70 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444938] [] kjournald2+0xb8/0x240 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444941] [] ? __wake_up_sync+0x20/0x20 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444943] [] ? commit_timeout+0x10/0x10 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444949] [] kthread+0xc9/0xe0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444952] [] ? flush_kthread_worker+0xb0/0xb0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444965] [] ret_from_fork+0x7c/0xb0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444969] [] ? flush_kthread_worker+0xb0/0xb0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.445141] INFO: task dd:21180 blocked for more than 120 seconds. Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.446595] Not tainted 3.15.10-031510-generic #201408132333 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.448070] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449910] dd D 0002 0 21180 19562 0x0002 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449913] 880485a1b5d8 0002 880485a1b5e8 880485a1bfd8 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449916] 00014540 00014540 880341833260 88011086cb90 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449919] 880485a1b5a8 88046fc94e40 88011086cb90 81204da0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449921] Call Trace: Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449927] [] ? __wait_on_buffer+0x30/0x30 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449930] [] schedule+0x29/0x70 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449934] [] io_schedule+0x8f/0xd0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449936] [] sleep_on_buffer+0xe/0x20 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449940] [] __wait_on_bit+0x62/0x90 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449945] [] ? bio_alloc_bioset+0xa0/0x1d0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449947] [] ? __wait_on_buffer+0x30/0x30 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449951] [] out_of_line_wait_on_bit+0x7c/0x90 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449954] [] ? wake_atomic_t_function+0x40/0x40 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449957] [] __wait_on_buffer+0x2e/0x30 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449962] [] ext4_wait_block_bitmap+0xd8/0xe0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449969] [] ext4
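Two things may help narrow this down. The traces above are in jbd2/ext4, i.e. the writes go through a filesystem and the page cache on top of /dev/rbd0, so repeating the dd with O_DIRECT separates krbd write behaviour from dirty-page and journal flushing; and mapping a kernel RBD image on a host that also runs OSDs is generally discouraged because it can deadlock under memory pressure. A sketch of the direct-I/O variants, with /dev/rbd0 taken from the trace and /mnt/rbd/testfile as a hypothetical mount point (the raw-device form is destructive, scratch images only):

# raw block device, bypassing filesystem and page cache (destroys data on the image)
dd if=/dev/zero of=/dev/rbd0 bs=4M count=1000 oflag=direct

# same 4K pattern as the original test, but through the mounted fs with direct I/O
dd if=/dev/zero of=/mnt/rbd/testfile bs=4K count=250K oflag=direct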
Re: [ceph-users] Cache tier unable to auto flush data to storage tier
Hi Karan, May be setting the dirty byte ratio (flush) and the full ratio (eviction). Just try to see if it makes any difference - cache_target_dirty_ratio .1 - cache_target_full_ratio .2 Tune the percentage as desired relatively to target_max_bytes and target_max_objects. The first threshold reached will trigger flush or eviction (num objects or num bytes) JC On Sep 13, 2014, at 15:23, Karan Singh wrote: > Hello Cephers > > I have created a Cache pool and looks like cache tiering agent is not able to > flush/evict data as per defined policy. However when i manually evict / flush > data , it migrates data from cache-tier to storage-tier > > Kindly advice if there is something wrong with policy or anything else i am > missing. > > Ceph Version: 0.80.5 > OS : Cent OS 6.4 > > Cache pool created using the following commands : > > ceph osd tier add data cache-pool > ceph osd tier cache-mode cache-pool writeback > ceph osd tier set-overlay data cache-pool > ceph osd pool set cache-pool hit_set_type bloom > ceph osd pool set cache-pool hit_set_count 1 > ceph osd pool set cache-pool hit_set_period 300 > ceph osd pool set cache-pool target_max_bytes 1 > ceph osd pool set cache-pool target_max_objects 100 > ceph osd pool set cache-pool cache_min_flush_age 60 > ceph osd pool set cache-pool cache_min_evict_age 60 > > > [root@ceph-node1 ~]# date > Sun Sep 14 00:49:59 EEST 2014 > [root@ceph-node1 ~]# rados -p data put file1 /etc/hosts > [root@ceph-node1 ~]# rados -p data ls > [root@ceph-node1 ~]# rados -p cache-pool ls > file1 > [root@ceph-node1 ~]# > > > [root@ceph-node1 ~]# date > Sun Sep 14 00:59:33 EEST 2014 > [root@ceph-node1 ~]# rados -p data ls > [root@ceph-node1 ~]# > [root@ceph-node1 ~]# rados -p cache-pool ls > file1 > [root@ceph-node1 ~]# > > > [root@ceph-node1 ~]# date > Sun Sep 14 01:08:02 EEST 2014 > [root@ceph-node1 ~]# rados -p data ls > [root@ceph-node1 ~]# rados -p cache-pool ls > file1 > [root@ceph-node1 ~]# > > > > [root@ceph-node1 ~]# rados -p cache-pool cache-flush-evict-all > file1 > [root@ceph-node1 ~]# > [root@ceph-node1 ~]# rados -p data ls > file1 > [root@ceph-node1 ~]# rados -p cache-pool ls > [root@ceph-node1 ~]# > > > Regards > Karan Singh > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
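For reference, a sketch of the commands that set the two thresholds mentioned above, using the pool name from the original post; the 0.1/0.2 values are JC's examples and are interpreted relative to target_max_bytes / target_max_objects:

ceph osd pool set cache-pool cache_target_dirty_ratio 0.1
ceph osd pool set cache-pool cache_target_full_ratio 0.2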
Re: [ceph-users] Fixed all active+remapped PGs stuck forever (but I have no clue why)
David Moreau Simard writes: > > Hi, > > Trying to update my continuous integration environment.. same deployment method with the following specs: > - Ubuntu Precise, Kernel 3.2, Emperor (0.72.2) - Yields a successful, healthy cluster. > - Ubuntu Trusty, Kernel 3.13, Firefly (0.80.5) - I have stuck placement groups. > > Here’s some relevant bits from the Trusty/Firefly setup before I move on to what I’ve done/tried: > http://pastebin.com/eqQTHcxU <— This was about halfway through PG healing. > > So, the setup is three monitors, two other hosts on which there are 9 OSDs each. > At the beginning, all my placement groups were stuck unclean. > > I tried the easy things first: > - set crush tunables to optimal > - run repairs/scrub on OSDs > - restart OSDs > > Nothing happened. All ~12000 PGs remained stuck unclean since forever active+remapped. > Next, I played with the crush map. I deleted the default replicated_ruleset rule and created a (basic) rule > for each pool for the time being. > I set the pools to use their respective rule and also reduced their size to 2 and min_size to 1. > > Still nothing, all PGs stuck. > I’m not sure why but I tried setting the crush tunables to legacy - I guess in a trial and error attempt. > > Half my PGs healed almost immediately. 6082 PGs remained in active+remapped. > I try running scrubs/repairs - it won’t heal the other half. I set the tunables back to optimal, still nothing. > > I set tunables to legacy again and most of them end up healing with only 1335 left in active+remapped. > > The remainder of the PGs healed when I restarted the OSDs. > > Does anyone have a clue why this happened ? > It looks like switching back and forth between tunables fixed the stuck PGs ? > > I can easily reproduce this if anyone wants more info. > > Let me know ! > -- > David Moreau Simard > I recently encountered the exact same problem. I have been working on a new cloud deployment using vagrant to simulate the physical hosts. I have 4 hosts, each is both a mon and osd for testing purposes. System details: Ubuntu Trusty (14.04) Kernel 3.13 Firefly 0.80.5 On deployment of a new cluster, all of my pgs were stuck (HEALTH_WARN 320 pgs incomplete; 320 pgs stuck inactive; 320 pgs stuck unclean). I tried a ton of recommended processes for getting them working and nothing could get them to budge. I did `ceph osd crush tunables legacy` and all 320 pgs went from stuck to active. This is definitely repeatable as I can deploy a new cluster with vagrant/puppet and this happens every time. So, thank you for posting this work-around. Peter ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
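For anyone else hitting this, a short sketch of the commands involved in the work-around described above, all standard ceph CLI:

# see which pgs are stuck and why
ceph health detail
ceph pg dump_stuck unclean

# inspect the tunables currently in effect
ceph osd crush show-tunables

# the work-around: drop to legacy tunables (and back, if desired)
ceph osd crush tunables legacy
ceph osd crush tunables optimal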