[ceph-users] jewel - rgw blocked on deep-scrub of bucket index pg

2017-05-05 Thread Sam Wouters
Hi,

we have a small cluster running jewel 10.2.7; NL-SAS disks only, with osd
data and journal co-located on the same disks; its main purpose is serving
as an rgw secondary zone.

Since the upgrade to jewel, whenever a deep scrub starts on one of the
rgw index pool pgs, slow requests start piling up and rgw requests end up
blocked after a few hours.
The deep scrub doesn't seem to finish (still running after 11+ hours),
and the only escape I've found so far is restarting the primary osd
holding the pg.

Maybe important to know: we have some large rgw buckets in terms of object
count (3+ million objects), with an index shard count of only 8.

scrub related settings:
osd scrub sleep = 0.1
osd scrub during recovery = False
osd scrub priority = 1
osd deep scrub stride = 1048576
osd scrub chunk min = 1
osd scrub chunk max = 1
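
(For reference, these can be applied to the running osds without a restart,
and deep scrubs can be paused while debugging; a rough sketch, not a fix:)

  # stop new deep scrubs cluster-wide while investigating
  ceph osd set nodeep-scrub
  # push the scrub settings above to the running osds without a restart
  ceph tell osd.* injectargs '--osd_scrub_sleep 0.1 --osd_scrub_chunk_min 1 --osd_scrub_chunk_max 1'
  # list pgs that are currently (deep-)scrubbing
  ceph pg dump pgs_brief 2>/dev/null | grep -i scrub
  # re-enable deep scrubs afterwards
  ceph osd unset nodeep-scrub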

Any help on debugging / resolving would be very much appreciated...

regards,
Sam



Re: [ceph-users] How does ceph pg repair work in jewel or later versions of ceph?

2017-05-05 Thread David Turner
This was covered in depth on this ML within the last 3 months; see this thread:

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-February/016373.html
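
The short version of the jewel workflow looks roughly like this (a sketch; the
pg id 1.2f is just a placeholder for whatever ceph health detail reports as
inconsistent):

  # find pgs flagged inconsistent by scrubbing
  ceph health detail | grep inconsistent
  # inspect which objects/shards actually disagree before repairing (jewel and later)
  rados list-inconsistent-obj 1.2f --format=json-pretty
  # then trigger the repair
  ceph pg repair 1.2f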

On Fri, May 5, 2017, 2:58 AM shadow_lin  wrote:

> I have read that pg repair simply copies the data from the primary
> osd to the other osds. Is that true? Or have later versions of ceph
> improved that?
>
> 2017-05-05
> --
> lin.yunfan


[ceph-users] Installing pybind manually from source

2017-05-05 Thread Henry Ngo
It appears that the pybind folder needs to be installed manually when
building Ceph from source. Where exactly does it need to be copied on
CentOS? I have tried /usr/local/lib, /usr/local/lib/python2.7/, and
/usr/local/lib/python2.7/site-packages with no luck.

http://tracker.ceph.com/issues/7968
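
Would something like the following be the expected approach instead of copying
files around (a sketch only; the paths are placeholders and depend on the
build layout)?

  # see which directories the interpreter actually searches
  python -c 'import sys; print("\n".join(sys.path))'
  # or point python at the in-tree bindings and the freshly built libraries
  export PYTHONPATH=/path/to/ceph/src/pybind
  export LD_LIBRARY_PATH=/path/to/ceph/build/lib
  python -c 'import rados; print(rados.__file__)'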


Re: [ceph-users] Changing replica size of a running pool

2017-05-05 Thread Alejandro Comisario
Thanks David!
Anyone else? More thoughts? (A rough sketch of the commands being discussed is
appended at the end of this message.)

On Wed, May 3, 2017 at 3:38 PM, David Turner  wrote:

> Those are both things that people have done and both work.  Neither is
> optimal, but both options work fine.  The best option is definitely to just
> get a third node now, as you aren't going to be getting any additional
> space out of it later.  Your usable space between a 2 node size 2 cluster and
> a 3 node size 3 cluster is identical.
>
> If getting a third node is not possible, I would recommend a size 2
> min_size 2 configuration.  You will block writes if either of your nodes or
> any copy of your data is down, but you will not get into an inconsistent
> state that can happen with min_size of 1 (and you can always set the
> min_size of a pool to 1 on the fly to perform maintenance).  If you go with
> the option to use the failure domain of OSDs instead of hosts and have size
> 3, then a single node going down will block writes into your cluster.  The
> only thing you gain from this is having 3 physical copies of the data until
> you get a third node, but you'll also get a lot of backfilling when you
> change the crush rule.
>
> A more complex option that I think would be a better solution than your 2
> options would be to create 2 hosts in your crush map for each physical host
> and split the OSDs in each host evenly between them.  That way you can have
> 2 copies of data in a given node, but never all 3 copies.  You have your 3
> copies of data and a guarantee that not all 3 are on the same host.
> Assuming min_size of 2, you will still block writes if you restart either
> node.
>
> If modifying the hosts in your crush map doesn't sound daunting, then I
> would recommend going that route... For most people that is more complex
> than they'd like to get into, and I would say size 2 min_size 2 would be the way
> to go until you get a third node.  #my2cents
>
> On Wed, May 3, 2017 at 12:41 PM Maximiliano Venesio 
> wrote:
>
>> Hi guys.
>>
>> I have a Jewel cluster composed of two storage servers, which are
>> configured in the crush map as different buckets to store data.
>>
>> I have to configure two new pools on this cluster, with the certainty
>> that I'll have to add more servers in the short term.
>>
>> Taking into account that the recommended replication size for every
>> pool is 3, I'm thinking of two possible scenarios:
>>
>> 1) Set the replica size to 2 now, and in the future change the replica
>> size to 3 on a running pool.
>> Is that possible? Could I run into serious issues with the rebalancing of
>> the pgs when changing the pool size on the fly?
>>
>> 2) Set the replica size to 3, and change the ruleset to replicate by
>> OSD instead of HOST now, and in the future change this rule in the
>> ruleset to replicate by host again on a running pool.
>> Is that possible? Could I run into serious issues with the rebalancing of
>> the pgs when changing the ruleset on a running pool?
>>
>> Which do you think is the best option?
>>
>>
>> Thanks in advance.
>>
>>
>> Maximiliano Venesio
>> Chief Cloud Architect | NUBELIU
>> E-mail: massimo@nubeliu.com  Cell: +54 9 11 3770 1853
>> _
>> www.nubeliu.com
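
For anyone wanting to try this, a rough sketch of the commands involved in
both the on-the-fly pool changes and the split-host crush layout David
describes (pool name, bucket names, ids and weights are made up, and this is
untested):

  # change size / min_size of an existing, running pool (expect backfill)
  ceph osd pool set mypool size 3
  ceph osd pool set mypool min_size 2

  # split each physical node into two crush "host" buckets so that size=3
  # with a host failure domain never puts all copies on one physical box
  ceph osd crush add-bucket node1-a host
  ceph osd crush add-bucket node1-b host
  ceph osd crush move node1-a root=default
  ceph osd crush move node1-b root=default
  ceph osd crush set osd.0 1.0 host=node1-a   # repeat for each osd and node
  ceph osd crush set osd.1 1.0 host=node1-b

  # point a pool at a different crush rule (jewel syntax)
  ceph osd pool set mypool crush_ruleset 1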


-- 
Alejandro Comisario
CTO | NUBELIU
E-mail: alejandro@nubeliu.com  Cell: +54 9 11 3770 1857
_
www.nubeliu.com


[ceph-users] RGW: removal of support for fastcgi

2017-05-05 Thread Yehuda Sadeh-Weinraub
RGW has supported fastcgi since forever. Originally it was the only supported
frontend, and nowadays it is the least preferred one.

Rgw was first developed over fastcgi + lighttpd, but there were some
issues with this setup, so we switched to fastcgi + apache as our main
supported configuration. This was also sub-optimal, as there wasn't a
good, supported, and maintained fastcgi module for apache that we could
find. At the time there were two modules: mod_fcgid and mod_fastcgi.
The former had a major flaw: it buffered all PUTs before sending
them to the backend (rgw). It also didn't really support 100-continue.
The latter was mostly unmaintained, and didn't quite support
100-continue either. We ended up maintaining a fork of mod_fastcgi for
quite a while, which was a pain. Later came mod-proxy-fcgi, which also
didn't fully support 100-continue (iirc), but it was maintained, and
was good enough to use out of the box, so we settled on it. At that
time we already had civetweb as a frontend, so we didn't really worry
about it.
Now, I'd like to know whether anyone actually uses and needs fastcgi.
I get an occasional request to remove it altogether, and I'd like to
have some more info before we go and do it. The requests for removal
usually cite security reasons, but I can also add 'we don't really
want to maintain it anymore'.
A valid replacement for fastcgi could be using civetweb directly, or
using mod-proxy (in apache; I'd expect a similar solution in other
webservers) with civetweb as the rgw frontend.
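
For example, something along these lines (only a sketch; the section name,
port and hostname are made up):

  # ceph.conf
  [client.rgw.gateway]
  rgw frontends = civetweb port=7480

  # apache vhost proxying to civetweb via mod_proxy
  <VirtualHost *:80>
      ServerName s3.example.com
      ProxyPreserveHost On
      ProxyPass / http://127.0.0.1:7480/
      ProxyPassReverse / http://127.0.0.1:7480/
  </VirtualHost>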

TL;DR: Does anyone care if we remove support for fastcgi in rgw?

Yehuda


Re: [ceph-users] RGW: removal of support for fastcgi

2017-05-05 Thread Roger Brown
I'm using fastcgi/apache2 instead of civetweb (CentOS 7) because I couldn't
get civetweb to work with SSL on port 443 and in a subdomain of my main
website.
So I have domain.com, www.domain.com, s3.domain.com (RGW), and
*.s3.domain.com for the RGW buckets. As long as you can do the same with
civetweb for RGW, then maybe I don't care.
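
Is something like this the expected way to do it with civetweb (just a sketch;
the cert path is a placeholder, and as I understand it the pem has to contain
both the certificate and the key)?

  # ceph.conf
  [client.rgw.gateway]
  rgw frontends = civetweb port=443s ssl_certificate=/etc/ceph/private/s3.domain.com.pem
  rgw dns name = s3.domain.com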

Roger


On Fri, May 5, 2017 at 1:51 PM Yehuda Sadeh-Weinraub 
wrote:

> TL;DR: Does anyone care if we remove support for fastcgi in rgw?
>
> Yehuda


Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-05 Thread Stefan Priebe - Profihost AG
Hello Jason,

while doing further testing, it happens only with images that were created
under hammer, got upgraded to jewel, AND had exclusive lock enabled.

Greets,
Stefan

On 04.05.2017 at 14:20, Jason Dillaman wrote:
> Odd. Can you re-run "rbd rm" with "--debug-rbd=20" added to the
> command and post the resulting log to a new ticket at [1]? I'd also be
> interested if you could re-create that
> "librbd::object_map::InvalidateRequest" issue repeatably.
> 
> [1] http://tracker.ceph.com/projects/rbd/issues
> 
> On Thu, May 4, 2017 at 3:45 AM, Stefan Priebe - Profihost AG
>  wrote:
>> Example:
>> # rbd rm cephstor2/vm-136-disk-1
>> Removing image: 99% complete...
>>
>> Stuck at 99% and never completes. This is an image which got corrupted
>> for an unknown reason.
>>
>> Greets,
>> Stefan
>>
>> On 04.05.2017 at 08:32, Stefan Priebe - Profihost AG wrote:
>>> I'm not sure whether this is related, but our backup system uses rbd
>>> snapshots and sometimes reports messages like these:
>>> 2017-05-04 02:42:47.661263 7f3316ffd700 -1
>>> librbd::object_map::InvalidateRequest: 0x7f3310002570 should_complete: r=0
>>>
>>> Stefan
>>>
>>>
>>> Am 04.05.2017 um 07:49 schrieb Stefan Priebe - Profihost AG:
>>>> Hello,
>>>>
>>>> since we've upgraded from hammer to jewel 10.2.7 and enabled
>>>> exclusive-lock,object-map,fast-diff we've had problems with corrupted VM
>>>> filesystems.
>>>>
>>>> Sometimes the VMs are just crashing with FS errors and a restart can
>>>> solve the problem. Sometimes the whole VM is not even bootable and we
>>>> need to import a backup.
>>>>
>>>> All of them have the same problem that you can't revert to an older
>>>> snapshot. The rbd command just hangs at 99% forever.
>>>>
>>>> Is this a known issue - anything we can check?
>>>>
>>>> Greets,
>>>> Stefan

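
For completeness, the debug run Jason asks for above would look roughly like
this (a sketch; the log path is made up, and the --log-file override assumes
the usual ceph cli config handling):

  rbd rm cephstor2/vm-136-disk-1 --debug-rbd=20 --log-file=/tmp/rbd-rm-debug.log
  # alternatively, set these under [client] in ceph.conf before re-running:
  #   debug rbd = 20
  #   log file = /var/log/ceph/rbd-client.log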