Re: [ceph-users] OpTracker optimization
Sam/Sage, I saw Giant is forked off today. We need the pull request (https://github.com/ceph/ceph/pull/2440) to be in Giant. So, could you please merge this into Giant when it will be ready ? Thanks & Regards Somnath -Original Message- From: Samuel Just [mailto:sam.j...@inktank.com] Sent: Thursday, September 11, 2014 11:31 AM To: Somnath Roy Cc: Sage Weil; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com Subject: Re: OpTracker optimization Just added it to wip-sam-testing. -Sam On Thu, Sep 11, 2014 at 11:30 AM, Somnath Roy wrote: > Sam/Sage, > I have addressed all of your comments and pushed the changes to the same pull > request. > > https://github.com/ceph/ceph/pull/2440 > > Thanks & Regards > Somnath > > -Original Message- > From: Sage Weil [mailto:sw...@redhat.com] > Sent: Wednesday, September 10, 2014 8:33 PM > To: Somnath Roy > Cc: Samuel Just; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com > Subject: RE: OpTracker optimization > > I had two substantiative comments on the first patch and then some trivial > whitespace nits.Otherwise looks good! > > tahnks- > sage > > On Thu, 11 Sep 2014, Somnath Roy wrote: > >> Sam/Sage, >> I have incorporated all of your comments. Please have a look at the same >> pull request. >> >> https://github.com/ceph/ceph/pull/2440 >> >> Thanks & Regards >> Somnath >> >> -Original Message- >> From: Samuel Just [mailto:sam.j...@inktank.com] >> Sent: Wednesday, September 10, 2014 3:25 PM >> To: Somnath Roy >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> ceph-users@lists.ceph.com >> Subject: Re: OpTracker optimization >> >> Oh, I changed my mind, your approach is fine. I was unclear. >> Currently, I just need you to address the other comments. >> -Sam >> >> On Wed, Sep 10, 2014 at 3:13 PM, Somnath Roy wrote: >> > As I understand, you want me to implement the following. >> > >> > 1. Keep this implementation one sharded optracker for the ios going >> > through ms_dispatch path. >> > >> > 2. Additionally, for ios going through ms_fast_dispatch, you want >> > me to implement optracker (without internal shard) per opwq shard >> > >> > Am I right ? >> > >> > Thanks & Regards >> > Somnath >> > >> > -Original Message- >> > From: Samuel Just [mailto:sam.j...@inktank.com] >> > Sent: Wednesday, September 10, 2014 3:08 PM >> > To: Somnath Roy >> > Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> > ceph-users@lists.ceph.com >> > Subject: Re: OpTracker optimization >> > >> > I don't quite understand. >> > -Sam >> > >> > On Wed, Sep 10, 2014 at 2:38 PM, Somnath Roy >> > wrote: >> >> Thanks Sam. >> >> So, you want me to go with optracker/shadedopWq , right ? >> >> >> >> Regards >> >> Somnath >> >> >> >> -Original Message- >> >> From: Samuel Just [mailto:sam.j...@inktank.com] >> >> Sent: Wednesday, September 10, 2014 2:36 PM >> >> To: Somnath Roy >> >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> >> ceph-users@lists.ceph.com >> >> Subject: Re: OpTracker optimization >> >> >> >> Responded with cosmetic nonsense. Once you've got that and the other >> >> comments addressed, I can put it in wip-sam-testing. 
>> >> -Sam >> >> >> >> On Wed, Sep 10, 2014 at 1:30 PM, Somnath Roy >> >> wrote: >> >>> Thanks Sam..I responded back :-) >> >>> >> >>> -Original Message- >> >>> From: ceph-devel-ow...@vger.kernel.org >> >>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel >> >>> Just >> >>> Sent: Wednesday, September 10, 2014 11:17 AM >> >>> To: Somnath Roy >> >>> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> >>> ceph-users@lists.ceph.com >> >>> Subject: Re: OpTracker optimization >> >>> >> >>> Added a comment about the approach. >> >>> -Sam >> >>> >> >>> On Tue, Sep 9, 2014 at 1:33 PM, Somnath Roy >> >>> wrote: >> Hi Sam/Sage, >> >> As we discussed earlier, enabling the present OpTracker code >> degrading performance severely. For example, in my setup a >> single OSD node with >> 10 clients is reaching ~103K read iops with io served from >> memory while optracking is disabled but enabling optracker it is >> reduced to ~39K iops. >> Probably, running OSD without enabling OpTracker is not an >> option for many of Ceph users. >> >> Now, by sharding the Optracker:: ops_in_flight_lock (thus xlist >> ops_in_flight) and removing some other bottlenecks I am able to >> match the performance of OpTracking enabled OSD with OpTracking >> disabled, but with the expense of ~1 extra cpu core. >> >> In this process I have also fixed the following tracker. >> >> >> >> http://tracker.ceph.com/issues/9384 >> >> >> >> and probably http://tracker.ceph.com/issues/8885 too. >> >> >> >> I have created following pull request for the same. Please review it. >> >> >> >> https://github.com/cep
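For anyone who wants to reproduce the enabled-vs-disabled comparison Somnath describes, op tracking can be switched off per OSD. A minimal sketch, assuming the osd_enable_op_tracker option present in Firefly-era builds and using osd.0 purely as a placeholder id; depending on the release the change may only take effect after an OSD restart:

# ceph.conf, [osd] section -- start OSDs with op tracking disabled
#   osd enable op tracker = false

# confirm what a running OSD currently has (admin socket, run on the OSD host)
ceph daemon osd.0 config get osd_enable_op_tracker

# restart the OSD and re-run the read benchmark
service ceph restart osd.0   # sysvinit example; use your init system's equivalent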
Re: [ceph-users] OpTracker optimization
Hi, as ceph user, It could be wonderfull to have it for Giant, optracker performance impact is really huge (See my ssd benchmark on ceph user mailing) Regards, Alexandre Derumier - Mail original - De: "Somnath Roy" À: "Samuel Just" Cc: "Sage Weil" , ceph-de...@vger.kernel.org, ceph-users@lists.ceph.com Envoyé: Samedi 13 Septembre 2014 10:03:52 Objet: Re: [ceph-users] OpTracker optimization Sam/Sage, I saw Giant is forked off today. We need the pull request (https://github.com/ceph/ceph/pull/2440) to be in Giant. So, could you please merge this into Giant when it will be ready ? Thanks & Regards Somnath -Original Message- From: Samuel Just [mailto:sam.j...@inktank.com] Sent: Thursday, September 11, 2014 11:31 AM To: Somnath Roy Cc: Sage Weil; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com Subject: Re: OpTracker optimization Just added it to wip-sam-testing. -Sam On Thu, Sep 11, 2014 at 11:30 AM, Somnath Roy wrote: > Sam/Sage, > I have addressed all of your comments and pushed the changes to the same pull > request. > > https://github.com/ceph/ceph/pull/2440 > > Thanks & Regards > Somnath > > -Original Message- > From: Sage Weil [mailto:sw...@redhat.com] > Sent: Wednesday, September 10, 2014 8:33 PM > To: Somnath Roy > Cc: Samuel Just; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com > Subject: RE: OpTracker optimization > > I had two substantiative comments on the first patch and then some trivial > whitespace nits. Otherwise looks good! > > tahnks- > sage > > On Thu, 11 Sep 2014, Somnath Roy wrote: > >> Sam/Sage, >> I have incorporated all of your comments. Please have a look at the same >> pull request. >> >> https://github.com/ceph/ceph/pull/2440 >> >> Thanks & Regards >> Somnath >> >> -Original Message- >> From: Samuel Just [mailto:sam.j...@inktank.com] >> Sent: Wednesday, September 10, 2014 3:25 PM >> To: Somnath Roy >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> ceph-users@lists.ceph.com >> Subject: Re: OpTracker optimization >> >> Oh, I changed my mind, your approach is fine. I was unclear. >> Currently, I just need you to address the other comments. >> -Sam >> >> On Wed, Sep 10, 2014 at 3:13 PM, Somnath Roy >> wrote: >> > As I understand, you want me to implement the following. >> > >> > 1. Keep this implementation one sharded optracker for the ios going >> > through ms_dispatch path. >> > >> > 2. Additionally, for ios going through ms_fast_dispatch, you want >> > me to implement optracker (without internal shard) per opwq shard >> > >> > Am I right ? >> > >> > Thanks & Regards >> > Somnath >> > >> > -Original Message- >> > From: Samuel Just [mailto:sam.j...@inktank.com] >> > Sent: Wednesday, September 10, 2014 3:08 PM >> > To: Somnath Roy >> > Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> > ceph-users@lists.ceph.com >> > Subject: Re: OpTracker optimization >> > >> > I don't quite understand. >> > -Sam >> > >> > On Wed, Sep 10, 2014 at 2:38 PM, Somnath Roy >> > wrote: >> >> Thanks Sam. >> >> So, you want me to go with optracker/shadedopWq , right ? >> >> >> >> Regards >> >> Somnath >> >> >> >> -Original Message- >> >> From: Samuel Just [mailto:sam.j...@inktank.com] >> >> Sent: Wednesday, September 10, 2014 2:36 PM >> >> To: Somnath Roy >> >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> >> ceph-users@lists.ceph.com >> >> Subject: Re: OpTracker optimization >> >> >> >> Responded with cosmetic nonsense. Once you've got that and the other >> >> comments addressed, I can put it in wip-sam-testing. 
>> >> -Sam >> >> >> >> On Wed, Sep 10, 2014 at 1:30 PM, Somnath Roy >> >> wrote: >> >>> Thanks Sam..I responded back :-) >> >>> >> >>> -Original Message- >> >>> From: ceph-devel-ow...@vger.kernel.org >> >>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel >> >>> Just >> >>> Sent: Wednesday, September 10, 2014 11:17 AM >> >>> To: Somnath Roy >> >>> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; >> >>> ceph-users@lists.ceph.com >> >>> Subject: Re: OpTracker optimization >> >>> >> >>> Added a comment about the approach. >> >>> -Sam >> >>> >> >>> On Tue, Sep 9, 2014 at 1:33 PM, Somnath Roy >> >>> wrote: >> Hi Sam/Sage, >> >> As we discussed earlier, enabling the present OpTracker code >> degrading performance severely. For example, in my setup a >> single OSD node with >> 10 clients is reaching ~103K read iops with io served from >> memory while optracking is disabled but enabling optracker it is >> reduced to ~39K iops. >> Probably, running OSD without enabling OpTracker is not an >> option for many of Ceph users. >> >> Now, by sharding the Optracker:: ops_in_flight_lock (thus xlist >> ops_in_flight) and removing some other bottlenecks I a
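Worth noting for anyone tempted to simply leave op tracking off: the OpTracker being optimized here is what serves the in-flight and historic op dumps on the OSD admin socket, which is the main reason a fix rather than a disable is attractive. A quick sketch of those queries, with osd.0 as a placeholder and a default admin socket path assumed:

# requests the OSD is currently tracking
ceph daemon osd.0 dump_ops_in_flight

# recently completed requests kept around for slow-request analysis
ceph daemon osd.0 dump_historic_ops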
Re: [ceph-users] error while installing ceph in cluster node
sorry. that was the wrong log. there was some issue with ceph user while doing yum remotely. So I tried to install ceph using root user. Below is the log [root@ceph-admin ~]# ceph-deploy install ceph-osd [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf [ceph_deploy.cli][INFO ] Invoked (1.5.14): /usr/bin/ceph-deploy install ceph-osd [ceph_deploy.install][DEBUG ] Installing stable version firefly on cluster ceph hosts ceph-osd [ceph_deploy.install][DEBUG ] Detecting platform for host ceph-osd ... [ceph-osd][DEBUG ] connected to host: ceph-osd [ceph-osd][DEBUG ] detect platform information from remote host [ceph-osd][DEBUG ] detect machine type [ceph_deploy.install][INFO ] Distro info: CentOS 6.4 Final [ceph-osd][INFO ] installing ceph on ceph-osd [ceph-osd][INFO ] Running command: yum clean all [ceph-osd][DEBUG ] Loaded plugins: fastestmirror, security [ceph-osd][DEBUG ] Cleaning repos: base extras updates [ceph-osd][DEBUG ] Cleaning up Everything [ceph-osd][INFO ] Running command: yum -y install wget [ceph-osd][DEBUG ] Loaded plugins: fastestmirror, security [ceph-osd][WARNIN] You need to be root to perform this command. [ceph-osd][ERROR ] RuntimeError: command returned non-zero exit status: 1 [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: yum -y install wget Regards, Subhadip 9741779086 --- On Fri, Sep 12, 2014 at 6:54 PM, Alfredo Deza wrote: > The issue is right there in the logs :) Somehow it couldn't reach that > fedora URL? > > > > On Fri, Sep 12, 2014 at 8:34 AM, Subhadip Bagui wrote: > > Hi, > > > > As you suggested in 2nd option, I created the ssh keys as root user and > > copied the ssh key with other ceph nodes like osd, mon and mds, using > below > > command. > > > > > > [root@ceph-admin ~]# ssh -keygen > > > > [root@ceph-admin ~]# ssh-copy-id root@ceph-osd > > [root@ceph-admin ~]# ssh-copy-id root@ceph-mds > > [root@ceph-admin ~]# ssh-copy-id root@ceph-monitor > > > > And I'm trying to install ceph using below command. But getting error. > > Please let me know what is the issue. > > > > [ceph@ceph-admin ~]$ ceph-deploy install ceph-osd > > > > [ceph_deploy.conf][DEBUG ] found configuration file at: > > /home/ceph/.cephdeploy.conf > > [ceph_deploy.cli][INFO ] Invoked (1.5.14): /usr/bin/ceph-deploy install > > ceph-osd > > [ceph_deploy.install][DEBUG ] Installing stable version firefly on > cluster > > ceph hosts ceph-osd > > [ceph_deploy.install][DEBUG ] Detecting platform for host ceph-osd ... 
> > [ceph-osd][DEBUG ] connected to host: ceph-osd > > [ceph-osd][DEBUG ] detect platform information from remote host > > [ceph-osd][DEBUG ] detect machine type > > [ceph_deploy.install][INFO ] Distro info: CentOS 6.4 Final > > [ceph-osd][INFO ] installing ceph on ceph-osd > > [ceph-osd][INFO ] Running command: sudo yum clean all > > [ceph-osd][DEBUG ] Loaded plugins: fastestmirror, security > > [ceph-osd][DEBUG ] Cleaning repos: base extras updates > > [ceph-osd][DEBUG ] Cleaning up Everything > > [ceph-osd][DEBUG ] Cleaning up list of fastest mirrors > > [ceph-osd][INFO ] Running command: sudo yum -y install wget > > [ceph-osd][DEBUG ] Loaded plugins: fastestmirror, security > > [ceph-osd][DEBUG ] Determining fastest mirrors > > [ceph-osd][DEBUG ] * base: centos.excellmedia.net > > [ceph-osd][DEBUG ] * extras: centos.excellmedia.net > > [ceph-osd][DEBUG ] * updates: centos.excellmedia.net > > [ceph-osd][DEBUG ] Setting up Install Process > > [ceph-osd][DEBUG ] Package wget-1.12-1.11.el6_5.x86_64 already installed > and > > latest version > > [ceph-osd][DEBUG ] Nothing to do > > [ceph-osd][INFO ] adding EPEL repository > > [ceph-osd][INFO ] Running command: sudo wget > > > http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm > > [ceph-osd][WARNIN] No data was received after 300 seconds, > disconnecting... > > [ceph-osd][INFO ] Running command: sudo rpm -Uvh --replacepkgs > > epel-release-6*.rpm > > [ceph-osd][WARNIN] error: File not found by glob: epel-release-6*.rpm > > [ceph-osd][ERROR ] RuntimeError: command returned non-zero exit status: 1 > > [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: rpm -Uvh > > --replacepkgs epel-release-6*.rpm > > > > > > Regards, > > Subhadip > > > > > --- > > > > On Thu, Sep 11, 2014 at 6:29 PM, Alfredo Deza > > wrote: > >> > >> We discourage users from using `root` to call ceph-deploy or to call > >> it with `sudo` for this reason. > >> > >> We have a warning in the docs about it if you are getting started in > >> the Ceph Node Setup section: > >> > >> > http://ceph.com/docs/v0.80.5/start/quick-start-preflight/#ceph-deploy-setup > >> > >> The reason for this is that if you configure ssh to login to the > >> remote server as a non-root user (say us
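The preflight steps Alfredo points to boil down to creating a dedicated non-root user with passwordless sudo on every node and letting ceph-deploy connect as that user, so remote commands run as "sudo yum ..." (as in the earlier successful log) rather than plain "yum ...". A rough sketch under those assumptions -- the user name "ceph" and the host names are only the examples from the guide:

# on each node (ceph-osd, ceph-monitor, ceph-mds): create the deploy user
sudo useradd -d /home/ceph -m ceph
sudo passwd ceph

# give it passwordless sudo
echo "ceph ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ceph
sudo chmod 0440 /etc/sudoers.d/ceph
# on CentOS 6, also allow sudo over ssh for this user
# (visudo: add "Defaults:ceph !requiretty")

# on the admin node, as the ceph user (not root): distribute keys and install
ssh-keygen
ssh-copy-id ceph@ceph-osd
ceph-deploy install ceph-osd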
Re: [ceph-users] OpTracker optimization
On Sat, 13 Sep 2014, Alexandre DERUMIER wrote: > Hi, > as ceph user, It could be wonderfull to have it for Giant, > optracker performance impact is really huge (See my ssd benchmark on ceph > user mailing) Definitely. More importantly, it resolves a few crashes we've observed. It's going through some testing right now, but once that's done it'll go into giant. sage > > Regards, > > Alexandre Derumier > > - Mail original - > > De: "Somnath Roy" > ?: "Samuel Just" > Cc: "Sage Weil" , ceph-de...@vger.kernel.org, > ceph-users@lists.ceph.com > Envoy?: Samedi 13 Septembre 2014 10:03:52 > Objet: Re: [ceph-users] OpTracker optimization > > Sam/Sage, > I saw Giant is forked off today. We need the pull request > (https://github.com/ceph/ceph/pull/2440) to be in Giant. So, could you please > merge this into Giant when it will be ready ? > > Thanks & Regards > Somnath > > -Original Message- > From: Samuel Just [mailto:sam.j...@inktank.com] > Sent: Thursday, September 11, 2014 11:31 AM > To: Somnath Roy > Cc: Sage Weil; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com > Subject: Re: OpTracker optimization > > Just added it to wip-sam-testing. > -Sam > > On Thu, Sep 11, 2014 at 11:30 AM, Somnath Roy > wrote: > > Sam/Sage, > > I have addressed all of your comments and pushed the changes to the same > > pull request. > > > > https://github.com/ceph/ceph/pull/2440 > > > > Thanks & Regards > > Somnath > > > > -Original Message- > > From: Sage Weil [mailto:sw...@redhat.com] > > Sent: Wednesday, September 10, 2014 8:33 PM > > To: Somnath Roy > > Cc: Samuel Just; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com > > Subject: RE: OpTracker optimization > > > > I had two substantiative comments on the first patch and then some trivial > > whitespace nits. Otherwise looks good! > > > > tahnks- > > sage > > > > On Thu, 11 Sep 2014, Somnath Roy wrote: > > > >> Sam/Sage, > >> I have incorporated all of your comments. Please have a look at the same > >> pull request. > >> > >> https://github.com/ceph/ceph/pull/2440 > >> > >> Thanks & Regards > >> Somnath > >> > >> -Original Message- > >> From: Samuel Just [mailto:sam.j...@inktank.com] > >> Sent: Wednesday, September 10, 2014 3:25 PM > >> To: Somnath Roy > >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> ceph-users@lists.ceph.com > >> Subject: Re: OpTracker optimization > >> > >> Oh, I changed my mind, your approach is fine. I was unclear. > >> Currently, I just need you to address the other comments. > >> -Sam > >> > >> On Wed, Sep 10, 2014 at 3:13 PM, Somnath Roy > >> wrote: > >> > As I understand, you want me to implement the following. > >> > > >> > 1. Keep this implementation one sharded optracker for the ios going > >> > through ms_dispatch path. > >> > > >> > 2. Additionally, for ios going through ms_fast_dispatch, you want > >> > me to implement optracker (without internal shard) per opwq shard > >> > > >> > Am I right ? > >> > > >> > Thanks & Regards > >> > Somnath > >> > > >> > -Original Message- > >> > From: Samuel Just [mailto:sam.j...@inktank.com] > >> > Sent: Wednesday, September 10, 2014 3:08 PM > >> > To: Somnath Roy > >> > Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> > ceph-users@lists.ceph.com > >> > Subject: Re: OpTracker optimization > >> > > >> > I don't quite understand. > >> > -Sam > >> > > >> > On Wed, Sep 10, 2014 at 2:38 PM, Somnath Roy > >> > wrote: > >> >> Thanks Sam. > >> >> So, you want me to go with optracker/shadedopWq , right ? 
> >> >> > >> >> Regards > >> >> Somnath > >> >> > >> >> -Original Message- > >> >> From: Samuel Just [mailto:sam.j...@inktank.com] > >> >> Sent: Wednesday, September 10, 2014 2:36 PM > >> >> To: Somnath Roy > >> >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> >> ceph-users@lists.ceph.com > >> >> Subject: Re: OpTracker optimization > >> >> > >> >> Responded with cosmetic nonsense. Once you've got that and the other > >> >> comments addressed, I can put it in wip-sam-testing. > >> >> -Sam > >> >> > >> >> On Wed, Sep 10, 2014 at 1:30 PM, Somnath Roy > >> >> wrote: > >> >>> Thanks Sam..I responded back :-) > >> >>> > >> >>> -Original Message- > >> >>> From: ceph-devel-ow...@vger.kernel.org > >> >>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel > >> >>> Just > >> >>> Sent: Wednesday, September 10, 2014 11:17 AM > >> >>> To: Somnath Roy > >> >>> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> >>> ceph-users@lists.ceph.com > >> >>> Subject: Re: OpTracker optimization > >> >>> > >> >>> Added a comment about the approach. > >> >>> -Sam > >> >>> > >> >>> On Tue, Sep 9, 2014 at 1:33 PM, Somnath Roy > >> >>> wrote: > >> Hi Sam/Sage, > >> > >> As we discussed earlier, enabling the present OpTracker code > >>
Re: [ceph-users] OpTracker optimization
Thanks Sage! -Original Message- From: Sage Weil [mailto:sw...@redhat.com] Sent: Saturday, September 13, 2014 7:32 AM To: Alexandre DERUMIER Cc: Somnath Roy; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com; Samuel Just Subject: Re: [ceph-users] OpTracker optimization On Sat, 13 Sep 2014, Alexandre DERUMIER wrote: > Hi, > as ceph user, It could be wonderfull to have it for Giant, optracker > performance impact is really huge (See my ssd benchmark on ceph user > mailing) Definitely. More importantly, it resolves a few crashes we've observed. It's going through some testing right now, but once that's done it'll go into giant. sage > > Regards, > > Alexandre Derumier > > - Mail original - > > De: "Somnath Roy" > ?: "Samuel Just" > Cc: "Sage Weil" , ceph-de...@vger.kernel.org, > ceph-users@lists.ceph.com > Envoy?: Samedi 13 Septembre 2014 10:03:52 > Objet: Re: [ceph-users] OpTracker optimization > > Sam/Sage, > I saw Giant is forked off today. We need the pull request > (https://github.com/ceph/ceph/pull/2440) to be in Giant. So, could you please > merge this into Giant when it will be ready ? > > Thanks & Regards > Somnath > > -Original Message- > From: Samuel Just [mailto:sam.j...@inktank.com] > Sent: Thursday, September 11, 2014 11:31 AM > To: Somnath Roy > Cc: Sage Weil; ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com > Subject: Re: OpTracker optimization > > Just added it to wip-sam-testing. > -Sam > > On Thu, Sep 11, 2014 at 11:30 AM, Somnath Roy > wrote: > > Sam/Sage, > > I have addressed all of your comments and pushed the changes to the same > > pull request. > > > > https://github.com/ceph/ceph/pull/2440 > > > > Thanks & Regards > > Somnath > > > > -Original Message- > > From: Sage Weil [mailto:sw...@redhat.com] > > Sent: Wednesday, September 10, 2014 8:33 PM > > To: Somnath Roy > > Cc: Samuel Just; ceph-de...@vger.kernel.org; > > ceph-users@lists.ceph.com > > Subject: RE: OpTracker optimization > > > > I had two substantiative comments on the first patch and then some > > trivial whitespace nits. Otherwise looks good! > > > > tahnks- > > sage > > > > On Thu, 11 Sep 2014, Somnath Roy wrote: > > > >> Sam/Sage, > >> I have incorporated all of your comments. Please have a look at the same > >> pull request. > >> > >> https://github.com/ceph/ceph/pull/2440 > >> > >> Thanks & Regards > >> Somnath > >> > >> -Original Message- > >> From: Samuel Just [mailto:sam.j...@inktank.com] > >> Sent: Wednesday, September 10, 2014 3:25 PM > >> To: Somnath Roy > >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> ceph-users@lists.ceph.com > >> Subject: Re: OpTracker optimization > >> > >> Oh, I changed my mind, your approach is fine. I was unclear. > >> Currently, I just need you to address the other comments. > >> -Sam > >> > >> On Wed, Sep 10, 2014 at 3:13 PM, Somnath Roy > >> wrote: > >> > As I understand, you want me to implement the following. > >> > > >> > 1. Keep this implementation one sharded optracker for the ios going > >> > through ms_dispatch path. > >> > > >> > 2. Additionally, for ios going through ms_fast_dispatch, you want > >> > me to implement optracker (without internal shard) per opwq shard > >> > > >> > Am I right ? 
> >> > > >> > Thanks & Regards > >> > Somnath > >> > > >> > -Original Message- > >> > From: Samuel Just [mailto:sam.j...@inktank.com] > >> > Sent: Wednesday, September 10, 2014 3:08 PM > >> > To: Somnath Roy > >> > Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> > ceph-users@lists.ceph.com > >> > Subject: Re: OpTracker optimization > >> > > >> > I don't quite understand. > >> > -Sam > >> > > >> > On Wed, Sep 10, 2014 at 2:38 PM, Somnath Roy > >> > wrote: > >> >> Thanks Sam. > >> >> So, you want me to go with optracker/shadedopWq , right ? > >> >> > >> >> Regards > >> >> Somnath > >> >> > >> >> -Original Message- > >> >> From: Samuel Just [mailto:sam.j...@inktank.com] > >> >> Sent: Wednesday, September 10, 2014 2:36 PM > >> >> To: Somnath Roy > >> >> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> >> ceph-users@lists.ceph.com > >> >> Subject: Re: OpTracker optimization > >> >> > >> >> Responded with cosmetic nonsense. Once you've got that and the other > >> >> comments addressed, I can put it in wip-sam-testing. > >> >> -Sam > >> >> > >> >> On Wed, Sep 10, 2014 at 1:30 PM, Somnath Roy > >> >> wrote: > >> >>> Thanks Sam..I responded back :-) > >> >>> > >> >>> -Original Message- > >> >>> From: ceph-devel-ow...@vger.kernel.org > >> >>> [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Samuel > >> >>> Just > >> >>> Sent: Wednesday, September 10, 2014 11:17 AM > >> >>> To: Somnath Roy > >> >>> Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > >> >>> ceph-users@lists.ceph.com > >> >>> Subject: Re: OpTracker optimization > >> >>> > >> >>> Added a comment abou
[ceph-users] Cache tier unable to auto flush data to storage tier
Hello Cephers,

I have created a cache pool and it looks like the cache tiering agent is not able to flush/evict data as per the defined policy. However, when I manually evict/flush data, it migrates data from the cache tier to the storage tier.

Kindly advise if there is something wrong with the policy or anything else I am missing.

Ceph Version: 0.80.5
OS: CentOS 6.4

Cache pool created using the following commands:

ceph osd tier add data cache-pool
ceph osd tier cache-mode cache-pool writeback
ceph osd tier set-overlay data cache-pool
ceph osd pool set cache-pool hit_set_type bloom
ceph osd pool set cache-pool hit_set_count 1
ceph osd pool set cache-pool hit_set_period 300
ceph osd pool set cache-pool target_max_bytes 1
ceph osd pool set cache-pool target_max_objects 100
ceph osd pool set cache-pool cache_min_flush_age 60
ceph osd pool set cache-pool cache_min_evict_age 60

[root@ceph-node1 ~]# date
Sun Sep 14 00:49:59 EEST 2014
[root@ceph-node1 ~]# rados -p data put file1 /etc/hosts
[root@ceph-node1 ~]# rados -p data ls
[root@ceph-node1 ~]# rados -p cache-pool ls
file1
[root@ceph-node1 ~]#

[root@ceph-node1 ~]# date
Sun Sep 14 00:59:33 EEST 2014
[root@ceph-node1 ~]# rados -p data ls
[root@ceph-node1 ~]#
[root@ceph-node1 ~]# rados -p cache-pool ls
file1
[root@ceph-node1 ~]#

[root@ceph-node1 ~]# date
Sun Sep 14 01:08:02 EEST 2014
[root@ceph-node1 ~]# rados -p data ls
[root@ceph-node1 ~]# rados -p cache-pool ls
file1
[root@ceph-node1 ~]#

[root@ceph-node1 ~]# rados -p cache-pool cache-flush-evict-all
file1
[root@ceph-node1 ~]#
[root@ceph-node1 ~]# rados -p data ls
file1
[root@ceph-node1 ~]# rados -p cache-pool ls
[root@ceph-node1 ~]#

Regards
Karan Singh

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
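Before tuning anything, it can help to read back what the tiering agent actually has to work with; as the reply further down notes, flushing and eviction are driven by the dirty/full ratios relative to target_max_bytes and target_max_objects, while cache_min_flush_age/cache_min_evict_age only delay them. A short sketch for inspecting the pool, assuming these get keys are available in 0.80.x:

# the pool line in the osd map shows the cache settings in one place
ceph osd dump | grep cache-pool

# or query them individually
ceph osd pool get cache-pool target_max_bytes
ceph osd pool get cache-pool target_max_objects
ceph osd pool get cache-pool cache_target_dirty_ratio
ceph osd pool get cache-pool cache_target_full_ratio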
[ceph-users] writing to rbd mapped device produces hang tasks
Hello guys, I've been trying to map an rbd disk to run some testing and I've noticed that while I can successfully read from the rbd image mapped to /dev/rbdX, I am failing to reliably write to it. Sometimes write tests work perfectly well, especially if I am using large block sizes. But often writes hang for a considerable amount of time and I have kernel hang task messages (shown below) in my dmesg. the hang tasks show particularly frequently when using 4K block size. However, with large block sizes writes also sometimes hang, but for sure less frequent I am using simple dd tests like: dd if=/dev/zero of= bs=4K count=250K. I am running firefly on Ubuntu 12.04 on all osd/mon servers. The rbd image is mapped on one of the osd servers. All osd servers are running kernel version 3.15.10-031510-generic . Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.439974] INFO: task jbd2/rbd0-8:3505 blocked for more than 120 seconds. Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.441586] Not tainted 3.15.10-031510-generic #201408132333 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.443022] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444862] jbd2/rbd0-8 D 0003 0 3505 2 0x Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444870] 8803a10a7c48 0002 88007963b288 8803a10a7fd8 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444874] 00014540 00014540 880469f63260 880866969930 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444876] 8803a10a7c58 8803a10a7d88 88034d142100 880848519824 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444879] Call Trace: Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444893] [] schedule+0x29/0x70 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444901] [] jbd2_journal_commit_transaction+0x240/0x1510 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444908] [] ? sched_clock_cpu+0x85/0xc0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444920] [] ? arch_vtime_task_switch+0x8a/0x90 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444923] [] ? vtime_common_task_switch+0x3d/0x50 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444928] [] ? __wake_up_sync+0x20/0x20 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444933] [] ? try_to_del_timer_sync+0x4f/0x70 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444938] [] kjournald2+0xb8/0x240 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444941] [] ? __wake_up_sync+0x20/0x20 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444943] [] ? commit_timeout+0x10/0x10 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444949] [] kthread+0xc9/0xe0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444952] [] ? flush_kthread_worker+0xb0/0xb0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444965] [] ret_from_fork+0x7c/0xb0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.444969] [] ? flush_kthread_worker+0xb0/0xb0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.445141] INFO: task dd:21180 blocked for more than 120 seconds. Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.446595] Not tainted 3.15.10-031510-generic #201408132333 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.448070] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449910] dd D 0002 0 21180 19562 0x0002 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449913] 880485a1b5d8 0002 880485a1b5e8 880485a1bfd8 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449916] 00014540 00014540 880341833260 88011086cb90 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449919] 880485a1b5a8 88046fc94e40 88011086cb90 81204da0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449921] Call Trace: Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449927] [] ? __wait_on_buffer+0x30/0x30 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449930] [] schedule+0x29/0x70 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449934] [] io_schedule+0x8f/0xd0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449936] [] sleep_on_buffer+0xe/0x20 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449940] [] __wait_on_bit+0x62/0x90 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449945] [] ? bio_alloc_bioset+0xa0/0x1d0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449947] [] ? __wait_on_buffer+0x30/0x30 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449951] [] out_of_line_wait_on_bit+0x7c/0x90 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449954] [] ? wake_atomic_t_function+0x40/0x40 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449957] [] __wait_on_buffer+0x2e/0x30 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449962] [] ext4_wait_block_bitmap+0xd8/0xe0 Sep 13 21:24:30 arh-ibstorage2-ib kernel: [11880.449969] [] ext4
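Two things may help narrow this down. The traces above are in jbd2/ext4, i.e. the writes go through a filesystem and the page cache on top of /dev/rbd0, so repeating the dd with O_DIRECT separates krbd write behaviour from dirty-page and journal flushing; and mapping a kernel RBD image on a host that also runs OSDs is generally discouraged because it can deadlock under memory pressure. A sketch of the direct-I/O variants, with /dev/rbd0 taken from the trace and /mnt/rbd/testfile as a hypothetical mount point (the raw-device form is destructive, scratch images only):

# raw block device, bypassing filesystem and page cache (destroys data on the image)
dd if=/dev/zero of=/dev/rbd0 bs=4M count=1000 oflag=direct

# same 4K pattern as the original test, but through the mounted fs with direct I/O
dd if=/dev/zero of=/mnt/rbd/testfile bs=4K count=250K oflag=direct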
Re: [ceph-users] Cache tier unable to auto flush data to storage tier
Hi Karan, May be setting the dirty byte ratio (flush) and the full ratio (eviction). Just try to see if it makes any difference - cache_target_dirty_ratio .1 - cache_target_full_ratio .2 Tune the percentage as desired relatively to target_max_bytes and target_max_objects. The first threshold reached will trigger flush or eviction (num objects or num bytes) JC On Sep 13, 2014, at 15:23, Karan Singh wrote: > Hello Cephers > > I have created a Cache pool and looks like cache tiering agent is not able to > flush/evict data as per defined policy. However when i manually evict / flush > data , it migrates data from cache-tier to storage-tier > > Kindly advice if there is something wrong with policy or anything else i am > missing. > > Ceph Version: 0.80.5 > OS : Cent OS 6.4 > > Cache pool created using the following commands : > > ceph osd tier add data cache-pool > ceph osd tier cache-mode cache-pool writeback > ceph osd tier set-overlay data cache-pool > ceph osd pool set cache-pool hit_set_type bloom > ceph osd pool set cache-pool hit_set_count 1 > ceph osd pool set cache-pool hit_set_period 300 > ceph osd pool set cache-pool target_max_bytes 1 > ceph osd pool set cache-pool target_max_objects 100 > ceph osd pool set cache-pool cache_min_flush_age 60 > ceph osd pool set cache-pool cache_min_evict_age 60 > > > [root@ceph-node1 ~]# date > Sun Sep 14 00:49:59 EEST 2014 > [root@ceph-node1 ~]# rados -p data put file1 /etc/hosts > [root@ceph-node1 ~]# rados -p data ls > [root@ceph-node1 ~]# rados -p cache-pool ls > file1 > [root@ceph-node1 ~]# > > > [root@ceph-node1 ~]# date > Sun Sep 14 00:59:33 EEST 2014 > [root@ceph-node1 ~]# rados -p data ls > [root@ceph-node1 ~]# > [root@ceph-node1 ~]# rados -p cache-pool ls > file1 > [root@ceph-node1 ~]# > > > [root@ceph-node1 ~]# date > Sun Sep 14 01:08:02 EEST 2014 > [root@ceph-node1 ~]# rados -p data ls > [root@ceph-node1 ~]# rados -p cache-pool ls > file1 > [root@ceph-node1 ~]# > > > > [root@ceph-node1 ~]# rados -p cache-pool cache-flush-evict-all > file1 > [root@ceph-node1 ~]# > [root@ceph-node1 ~]# rados -p data ls > file1 > [root@ceph-node1 ~]# rados -p cache-pool ls > [root@ceph-node1 ~]# > > > Regards > Karan Singh > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
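For reference, a sketch of the commands that set the two thresholds mentioned above, using the pool name from the original post; the 0.1/0.2 values are JC's examples and are interpreted relative to target_max_bytes / target_max_objects:

ceph osd pool set cache-pool cache_target_dirty_ratio 0.1
ceph osd pool set cache-pool cache_target_full_ratio 0.2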
Re: [ceph-users] Fixed all active+remapped PGs stuck forever (but I have no clue why)
David Moreau Simard writes: > > Hi, > > Trying to update my continuous integration environment.. same deployment method with the following specs: > - Ubuntu Precise, Kernel 3.2, Emperor (0.72.2) - Yields a successful, healthy cluster. > - Ubuntu Trusty, Kernel 3.13, Firefly (0.80.5) - I have stuck placement groups. > > Here’s some relevant bits from the Trusty/Firefly setup before I move on to what I’ve done/tried: > http://pastebin.com/eqQTHcxU <— This was about halfway through PG healing. > > So, the setup is three monitors, two other hosts on which there are 9 OSDs each. > At the beginning, all my placement groups were stuck unclean. > > I tried the easy things first: > - set crush tunables to optimal > - run repairs/scrub on OSDs > - restart OSDs > > Nothing happened. All ~12000 PGs remained stuck unclean since forever active+remapped. > Next, I played with the crush map. I deleted the default replicated_ruleset rule and created a (basic) rule > for each pool for the time being. > I set the pools to use their respective rule and also reduced their size to 2 and min_size to 1. > > Still nothing, all PGs stuck. > I’m not sure why but I tried setting the crush tunables to legacy - I guess in a trial and error attempt. > > Half my PGs healed almost immediately. 6082 PGs remained in active+remapped. > I try running scrubs/repairs - it won’t heal the other half. I set the tunables back to optimal, still nothing. > > I set tunables to legacy again and most of them end up healing with only 1335 left in active+remapped. > > The remainder of the PGs healed when I restarted the OSDs. > > Does anyone have a clue why this happened ? > It looks like switching back and forth between tunables fixed the stuck PGs ? > > I can easily reproduce this if anyone wants more info. > > Let me know ! > -- > David Moreau Simard > I recently encountered the exact same problem. I have been working on a new cloud deployment using vagrant to simulate the physical hosts. I have 4 hosts, each is both a mon and osd for testing purposes. System details: Ubuntu Trusty (14.04) Kernel 3.13 Firefly 0.80.5 On deployment of a new cluster, all of my pgs were stuck (HEALTH_WARN 320 pgs incomplete; 320 pgs stuck inactive; 320 pgs stuck unclean). I tried a ton of recommended processes for getting them working and nothing could get them to budge. I did `ceph osd crush tunables legacy` and all 320 pgs went from stuck to active. This is definitely repeatable as I can deploy a new cluster with vagrant/puppet and this happens every time. So, thank you for posting this work-around. Peter ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
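For anyone else hitting this, a short sketch of the commands involved in the work-around described above, all standard ceph CLI:

# see which pgs are stuck and why
ceph health detail
ceph pg dump_stuck unclean

# inspect the tunables currently in effect
ceph osd crush show-tunables

# the work-around: drop to legacy tunables (and back, if desired)
ceph osd crush tunables legacy
ceph osd crush tunables optimal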