Re: [ceph-users] zone, zonegroup and resharding bucket on luminous

2017-10-02 Thread Orit Wasserman
On Fri, Sep 29, 2017 at 5:56 PM, Yoann Moulin  wrote:
> Hello,
>
> I'm doing some tests on the radosgw on luminous (12.2.1), I have a few 
> questions.
>
> In the documentation[1], there is a reference to "radosgw-admin region get",
> but it no longer seems to be available.
> It should be "radosgw-admin zonegroup get", I guess.
>
> 1. http://docs.ceph.com/docs/luminous/install/install-ceph-gateway/
>
> I have installed my luminous cluster with ceph-ansible playbook.
>
> but when I try to manipulate zonegroup or zone, I have this
>
>> # radosgw-admin zonegroup get
>> failed to init zonegroup: (2) No such file or directory
>

try with --rgw-zonegroup=default

>> # radosgw-admin  zone get
>> unable to initialize zone: (2) No such file or directory
>
try with --rgw-zone=default

> I guess that's because I don't have a realm set, and no default zone and
> zonegroup?
>

The default zone and zonegroup are part of the realm, so without a
realm you cannot set them as defaults.
This means you have to specify --rgw-zonegroup=default and --rgw-zone=default.
I am guessing our documentation needs updating :(
I think we can improve this behavior and make those commands work
without a realm, i.e. return the default zonegroup and zone. I will
open a tracker issue for that.
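
For reference, on such a realm-less deployment the explicit flags are all
that is needed; a quick sketch, assuming the stock names created on a
fresh install:

```shell
# No realm means no implicit defaults, so name the zonegroup/zone
# explicitly -- both are called "default" on a fresh gateway:
radosgw-admin zonegroup get --rgw-zonegroup=default
radosgw-admin zone get --rgw-zone=default
```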

>> # radosgw-admin realm list
>> {
>> "default_info": "",
>> "realms": []
>> }
>
>> # radosgw-admin zonegroup list
>> {
>> "default_info": "",
>> "zonegroups": [
>> "default"
>> ]
>> }
>
>> # radosgw-admin zone list
>> {
>> "default_info": "",
>> "zones": [
>> "default"
>> ]
>> }
>
> Is it the default behaviour not to create a default realm on a fresh
> radosgw? Or is it a side effect of the ceph-ansible installation?
>
It is the default behavior, there is no default realm.

> I have a bucket that refers to a zonegroup but no realm. Can I create
> a default realm? Is that safe for the bucket data that has already been
> uploaded?
>
Yes, you can create a realm and add the zonegroup to it.
Don't forget to run "radosgw-admin period update --commit" to commit the changes.
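
A sketch of that sequence, with a hypothetical realm name "gold" (flag
spellings as in the Luminous radosgw-admin; double-check against your
version before running):

```shell
# Create a realm and make it the default
radosgw-admin realm create --rgw-realm=gold --default

# Mark the existing zonegroup and zone as master/default under it
radosgw-admin zonegroup modify --rgw-zonegroup=default --master --default
radosgw-admin zone modify --rgw-zone=default --master --default

# Commit a new period so the configuration takes effect
radosgw-admin period update --commit
```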

> On the "default" zonegroup (which is not set as default), the  
> "bucket_index_max_shards" is set to "0", can I modify it without reaml ?
>
I just updated this section in this pr: https://github.com/ceph/ceph/pull/18063
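
The usual way to change it is to round-trip the zonegroup JSON; a sketch
(the new value 16 is only an example, and running radosgw instances need a
restart to pick it up):

```shell
# Export the zonegroup, bump bucket_index_max_shards, and import it back
radosgw-admin zonegroup get --rgw-zonegroup=default > zonegroup.json
sed -i 's/"bucket_index_max_shards": 0/"bucket_index_max_shards": 16/' zonegroup.json
radosgw-admin zonegroup set --rgw-zonegroup=default < zonegroup.json
```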

Regards,
Orit
> some useful information (I guess) :
>
>> # radosgw-admin zonegroup get --rgw-zonegroup=default
>> {
>> "id": "43d23097-56b9-48a6-ad52-de42341be4bd",
>> "name": "default",
>> "api_name": "",
>> "is_master": "true",
>> "endpoints": [],
>> "hostnames": [],
>> "hostnames_s3website": [],
>> "master_zone": "69d2fd65-fcf9-461b-865f-3dbb053803c4",
>> "zones": [
>> {
>> "id": "69d2fd65-fcf9-461b-865f-3dbb053803c4",
>> "name": "default",
>> "endpoints": [],
>> "log_meta": "false",
>> "log_data": "false",
>> "bucket_index_max_shards": 0,
>> "read_only": "false",
>> "tier_type": "",
>> "sync_from_all": "true",
>> "sync_from": []
>> }
>> ],
>> "placement_targets": [
>> {
>> "name": "default-placement",
>> "tags": []
>> }
>> ],
>> "default_placement": "default-placement",
>> "realm_id": ""
>> }
>
>> # radosgw-admin zone get --rgw-zone=default
>> {
>> "id": "69d2fd65-fcf9-461b-865f-3dbb053803c4",
>> "name": "default",
>> "domain_root": "default.rgw.meta:root",
>> "control_pool": "default.rgw.control",
>> "gc_pool": "default.rgw.log:gc",
>> "lc_pool": "default.rgw.log:lc",
>> "log_pool": "default.rgw.log",
>> "intent_log_pool": "default.rgw.log:intent",
>> "usage_log_pool": "default.rgw.log:usage",
>> "reshard_pool": "default.rgw.log:reshard",
>> "user_keys_pool": "default.rgw.meta:users.keys",
>> "user_email_pool": "default.rgw.meta:users.email",
>> "user_swift_pool": "default.rgw.meta:users.swift",
>> "user_uid_pool": "default.rgw.meta:users.uid",
>> "system_key": {
>> "access_key": "",
>> "secret_key": ""
>> },
>> "placement_pools": [
>> {
>> "key": "default-placement",
>> "val": {
>> "index_pool": "default.rgw.buckets.index",
>> "data_pool": "default.rgw.buckets.data",
>> "data_extra_pool": "default.rgw.buckets.non-ec",
>> "index_type": 0,
>> "compression": ""
>> }
>> }
>> ],
>> "metadata_heap": "",
>> "tier_config": [],
>> "realm_id": ""
>> }
>
>> # radosgw-admin metadata get bucket:image-net
>> {
>> "key": "bucket:image-net",
>> "ver": {
>> "tag": "_2_RFnI5pKQV7XEc5s2euJJW",
>> "ver": 1
>> },
>> "mtime": "2017-08-28 12:27:35.629882Z",
>> "data": {
>> "bucket": {
>> "name": "image-net",
>> 

Re: [ceph-users] tunable question

2017-10-02 Thread Manuel Lausch
Hi, 

We have similar issues.
After upgrading from hammer to jewel, the tunable "chooseleaf_stable"
was introduced. If we activate it, nearly all data will be moved. The
cluster has 2400 OSDs on 40 nodes across two datacenters and is filled
with 2.5 PB of data.

We tried to enable it, but the backfill traffic is too high to be
handled without impacting other services on the network.

Does someone know whether it is necessary to enable this tunable? And
could it be a problem in the future if we want to upgrade to newer
versions without it enabled?
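
For reference, the commands involved in checking and staging such a change
look roughly like this (command names per jewel-era Ceph; the throttle
values are only illustrative):

```shell
# Show which tunables profile the cluster currently uses
ceph osd crush show-tunables

# Throttle backfill before flipping the tunable, to soften the impact
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

# Enable the jewel profile (includes chooseleaf_stable=1);
# expect a large rebalance
ceph osd crush tunables jewel
```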

Regards,
Manuel Lausch

On Thu, 28 Sep 2017 10:29:58 +0200, Dan van der Ster wrote:

> Hi,
> 
> How big is your cluster and what is your use case?
> 
> For us, we'll likely never enable the recent tunables that need to
> remap *all* PGs -- it would simply be too disruptive for marginal
> benefit.
> 
> Cheers, Dan
> 
> 
> On Thu, Sep 28, 2017 at 9:21 AM, mj  wrote:
> > Hi,
> >
> > We have completed the upgrade to jewel, and we set tunables to
> > hammer. Cluster again HEALTH_OK. :-)
> >
> > But now, we would like to proceed in the direction of luminous and
> > bluestore OSDs, and we would like to ask for some feedback first.
> >
> > From the jewel ceph docs on tunables: "Changing tunable to
> > "optimal" on an existing cluster will result in a very large amount
> > of data movement as almost every PG mapping is likely to change."
> >
> > Given the above, and the fact that we would like to proceed to
> > luminous/bluestore in the not too far away future: What is cleverer:
> >
> > 1 - keep the cluster at tunable hammer now, upgrade to luminous in
> > a little while, change OSDs to bluestore, and then set tunables to
> > optimal
> >
> > or
> >
> > 2 - set tunable to optimal now, take the impact of "almost all PG
> > remapping", and when that is finished, upgrade to luminous,
> > bluestore etc.
> >
> > Which route is the preferred one?
> >
> > Or is there a third (or fourth?) option..? :-)
> >
> > MJ
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
76135 Karlsruhe | Germany Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Thomas Ludwig, Jan Oetjen


Member of United Internet


This e-mail may contain confidential and/or privileged information. If
you are not the intended recipient of this e-mail, you are hereby
notified that saving, distribution or use of the content of this e-mail
in any way is prohibited. If you have received this e-mail in error,
please notify the sender and delete the e-mail.


[ceph-users] BlueStore questions about workflow and performance

2017-10-02 Thread Sam Huracan
Hi,

I'm reading this document:
 http://storageconference.us/2017/Presentations/CephObjectStore-slides.pdf

I have 3 questions:

1. Does BlueStore write data (to the raw block device) and metadata (to
RocksDB) simultaneously, or sequentially?

2. In my opinion, BlueStore's performance cannot match FileStore with an
SSD journal, because writing to the raw disk is slower than writing to a
buffer (that is the journal's purpose). What do you think?

3. Does placing the RocksDB and RocksDB WAL on an SSD enhance only write
performance, or read performance as well?

Hoping for your answers,


[ceph-users] Ceph monitoring

2017-10-02 Thread Osama Hasebou
Hi Everyone, 

Is there a guide/tutorial on how to set up a Ceph monitoring system using
collectd / grafana / graphite? Other suggestions are welcome as well!

I found some GitHub solutions but not much documentation on how to implement them.

Thanks. 

Regards, 
Ossi 



Re: [ceph-users] [Ceph-announce] Luminous v12.2.1 released

2017-10-02 Thread Fabian Grünbichler
On Thu, Sep 28, 2017 at 05:46:30PM +0200, Abhishek wrote:
> This is the first bugfix release of Luminous v12.2.x long term stable
> release series. It contains a range of bug fixes and a few features
> across CephFS, RBD & RGW. We recommend all the users of 12.2.x series
> update.
> 
> For more details, refer to the release notes entry at the official
> blog[1] and the complete changelog[2]
> 
> Notable Changes
> ---
> 
> [ snip ]
> 
> * The maximum number of PGs per OSD before the monitor issues a
>warning has been reduced from 300 to 200 PGs.  200 is still twice
>the generally recommended target of 100 PGs per OSD.  This limit can
>be adjusted via the ``mon_max_pg_per_osd`` option on the
>monitors.  The older ``mon_pg_warn_max_per_osd`` option has been
> removed.
> 
> * Creating pools or adjusting pg_num will now fail if the change would
>make the number of PGs per OSD exceed the configured
>``mon_max_pg_per_osd`` limit.  The option can be adjusted if it
>is really necessary to create a pool with more PGs.
> 
> [ snip ]
> 
> Getting Ceph
> 
> 
> [ snip ]
> 
> [1]: http://ceph.com/releases/v12-2-1-luminous-released/
> [2]: https://github.com/ceph/ceph/blob/master/doc/changelog/v12.2.1.txt
> 

Those release notes should be corrected: [1] apparently did not make the
cut for 12.2.1, yet it makes up a third of the notable changes.

1: https://github.com/ceph/ceph/pull/17814




Re: [ceph-users] RGW how to delete orphans

2017-10-02 Thread Webert de Souza Lima
Hey Christian,

On 29 Sep 2017 12:32 a.m., "Christian Wuerdig" 
> wrote:
>
>> I'm pretty sure the orphan find command does exactly just that -
>> finding orphans. I remember some emails on the dev list where Yehuda
>> said he wasn't 100% comfortable of automating the delete just yet.
>> So the purpose is to run the orphan find tool and then delete the
>> orphaned objects once you're happy that they all are actually
>> orphaned.
>>
>>
So what you mean is that one should manually remove the objects listed
in the tool's output?


Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*


Re: [ceph-users] Ceph OSD on Hardware RAID

2017-10-02 Thread Vincent Godin
In addition to the points that you made:

I noticed that on RAID0 disks, read I/O errors are not always trapped
by Ceph, leading to unexpected behaviour of the impacted OSD daemon.

On both RAID0 and non-RAID disks, an I/O error is logged in /var/log/messages:

Oct  2 15:20:37 os-ceph05 kernel: sd 0:1:0:7: [sdh] tag#0 FAILED
Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct  2 15:20:37 os-ceph05 kernel: sd 0:1:0:7: [sdh] tag#0 Sense Key :
Medium Error [current]
Oct  2 15:20:37 os-ceph05 kernel: sd 0:1:0:7: [sdh] tag#0 Add. Sense:
Unrecovered read error
Oct  2 15:20:37 os-ceph05 kernel: sd 0:1:0:7: [sdh] tag#0 CDB:
Read(16) 88 00 00 00 00 00 00 00 37 e0 00 00 00 08 00 00
Oct  2 15:20:37 os-ceph05 kernel: blk_update_request: critical medium
error, dev sdh, sector 14304

On a non-RAID disk, we can see the I/O error in the OSD log:

2017-09-27 00:55:52.100678 7faceba7b700 -1 filestore(/var/lib/ceph/osd/ceph-276)
FileStore::read(9.103_head/#9:c086eeb2:::rbd_data.6592c12eb141f2.00058795:head#)
pread error: (5) Input/output error
2017-09-27 00:55:52.128147 7faceba7b700 -1 os/filestore/FileStore.cc:
In function 'virtual int FileStore::read(const coll_t&, const
ghobject_t&, uint64_t, size_t, ceph::bufferlist&, uint32_t, bool) '
thread 7faceba7b700 time 2017-09-27 00:55:52.101208
os/filestore/FileStore.cc: 3016: FAILED assert(0 == "eio on pread")

On a RAID0 disk, we only see thread timeouts in the OSD log:

2017-10-02 15:20:26.360683 7f3240154700  1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f3250c3c700' had timed out after 15
2017-10-02 15:20:26.360729 7f3240053700  1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f3250c3c700' had timed out after 15
2017-10-02 15:20:26.413488 7f323f144700  1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f3250c3c700' had timed out after 15
2017-10-02 15:20:26.413574 7f323f043700  1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f3250c3c700' had timed out after 15
2017-10-02 15:20:26.536500 7f323f548700  1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f3250c3c700' had timed out after 15

On a non-RAID disk, an I/O error (up to jewel) will restart the OSD,
making the inconsistent PG (and the others) peer on another OSD with
clean data.

On a RAID0 disk, the I/O error can lead to anything from an increasing
number of slow requests, blocking the whole cluster when the load is
high, to just a few slow requests and an I/O error from the client's
point of view when the load is low.

The RAID0 volumes I'm talking about are built on an HP Smart Array
P420i in an SL4540, so this may be specific to that hardware.


Re: [ceph-users] 1 osd Segmentation fault in test cluster

2017-10-02 Thread Gregory Farnum
Please file a tracker ticket with all the info you have for stuff like
this. They’re a lot harder to lose than emails are. ;)
On Sat, Sep 30, 2017 at 8:31 AM Marc Roos  wrote:

> Is this useful for someone?
>
>
>
> [Sat Sep 30 15:51:11 2017] libceph: osd5 192.168.10.113:6809 socket
> closed (con state OPEN)
> [Sat Sep 30 15:51:11 2017] libceph: osd5 192.168.10.113:6809 socket
> closed (con state CONNECTING)
> [Sat Sep 30 15:51:11 2017] libceph: osd5 down
> [Sat Sep 30 15:51:11 2017] libceph: osd5 down
> [Sat Sep 30 15:52:52 2017] libceph: osd5 up
> [Sat Sep 30 15:52:52 2017] libceph: osd5 up
>
>
>
> 2017-09-30 15:48:08.542202 7f7623ce9700  0 log_channel(cluster) log
> [WRN] : slow request 31.456482 seconds old, received at 2017-09-30
> 15:47:37.085589: osd_op(mds.0.9227:1289186 20.2b 20.9af42b6b (undecoded)
> ondisk+write+known_if_redirected+full_force e15675) currently
> queued_for_pg
> 2017-09-30 15:48:08.542207 7f7623ce9700  0 log_channel(cluster) log
> [WRN] : slow request 31.456086 seconds old, received at 2017-09-30
> 15:47:37.085984: osd_op(mds.0.9227:1289190 20.13 20.e44f3f53 (undecoded)
> ondisk+write+known_if_redirected+full_force e15675) currently
> queued_for_pg
> 2017-09-30 15:48:08.542212 7f7623ce9700  0 log_channel(cluster) log
> [WRN] : slow request 31.456005 seconds old, received at 2017-09-30
> 15:47:37.086065: osd_op(mds.0.9227:1289194 20.2b 20.6733bdeb (undecoded)
> ondisk+write+known_if_redirected+full_force e15675) currently
> queued_for_pg
> 2017-09-30 15:51:12.592490 7f7611cc5700  0 log_channel(cluster) log
> [DBG] : 20.3f scrub starts
> 2017-09-30 15:51:24.514602 7f76214e4700 -1 *** Caught signal
> (Segmentation fault) **
>  in thread 7f76214e4700 thread_name:bstore_mempool
>
>  ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous
> (stable)
>  1: (()+0xa29511) [0x7f762e5b2511]
>  2: (()+0xf370) [0x7f762afa5370]
>  3: (BlueStore::TwoQCache::_trim(unsigned long, unsigned long)+0x2df)
> [0x7f762e481a2f]
>  4: (BlueStore::Cache::trim(unsigned long, float, float, float)+0x1d1)
> [0x7f762e4543e1]
>  5: (BlueStore::MempoolThread::entry()+0x14d) [0x7f762e45a71d]
>  6: (()+0x7dc5) [0x7f762af9ddc5]
>  7: (clone()+0x6d) [0x7f762a09176d]
>  NOTE: a copy of the executable, or `objdump -rdS ` is
> needed to interpret this.
>
> --- begin dump of recent events ---
> -1> 2017-09-30 15:51:05.105915 7f76284ac700  5 --
> 192.168.10.113:0/27661 >> 192.168.10.111:6810/6617 conn(0x7f766b736000
> :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=19 cs=1 l=1). rx
> osd.0 seq 19546 0x7f76a2daf000 osd_ping(ping_reply e15675 stamp
> 2017-09-30 15:51:05.105439) v4
>  -> 2017-09-30 15:51:05.105963 7f760fcc1700  1 -- 10.0.0.13:0/27661
> --> 10.0.0.11:6805/6491 -- osd_ping(ping e15675 stamp 2017-09-30
> 15:51:05.105439) v4 -- 0x7f7683e98a00 con 0
>  -9998> 2017-09-30 15:51:05.105960 7f76284ac700  1 --
> 192.168.10.113:0/27661 <== osd.0 192.168.10.111:6810/6617 19546 
> osd_ping(ping_reply e15675 stamp 2017-09-30 15:51:05.105439) v4 
> 2004+0+0 (1212154800 0 0) 0x7f76a2daf000 con 0x7f766b736000
>  -9997> 2017-09-30 15:51:05.105961 7f76274aa700  5 -- 10.0.0.13:0/27661
> >> 10.0.0.11:6808/6646 conn(0x7f766b745800 :-1
> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=24 cs=1 l=1). rx osd.3
> seq 19546 0x7f769b95f200 osd_ping(ping_reply e15675 stamp 2017-09-30
> 15:51:05.105439) v4
>  -9996> 2017-09-30 15:51:05.105983 7f760fcc1700  1 --
> 192.168.10.113:0/27661 --> 192.168.10.111:6805/6491 -- osd_ping(ping
> e15675 stamp 2017-09-30 15:51:05.105439) v4 -- 0x7f7683e97600 con 0
>  -9995> 2017-09-30 15:51:05.106001 7f76274aa700  1 -- 10.0.0.13:0/27661
> <== osd.3 10.0.0.11:6808/6646 19546  osd_ping(ping_reply e15675
> stamp 2017-09-30 15:51:05.105439) v4  2004+0+0 (1212154800 0 0)
> 0x7f769b95f200 con 0x7f766b745800
>  -9994> 2017-09-30 15:51:05.106015 7f760fcc1700  1 -- 10.0.0.13:0/27661
> --> 10.0.0.11:6807/6470 -- osd_ping(ping e15675 stamp 2017-09-30
> 15:51:05.105439) v4 -- 0x7f7683e99800 con 0
>  -9993> 2017-09-30 15:51:05.106035 7f760fcc1700  1 --
> 192.168.10.113:0/27661 --> 192.168.10.111:6808/6470 -- osd_ping(ping
> e15675 stamp 2017-09-30 15:51:05.105439) v4 -- 0x7f763b72a200 con 0
>  -9992> 2017-09-30 15:51:05.106072 7f760fcc1700  1 -- 10.0.0.13:0/27661
> --> 10.0.0.11:6809/6710 -- osd_ping(ping e15675 stamp 2017-09-30
> 15:51:05.105439) v4 -- 0x7f768633dc00 con 0
>  -9991> 2017-09-30 15:51:05.106093 7f760fcc1700  1 --
> 192.168.10.113:0/27661 --> 192.168.10.111:6804/6710 -- osd_ping(ping
> e15675 stamp 2017-09-30 15:51:05.105439) v4 -- 0x7f76667d3600 con 0
>  -9990> 2017-09-30 15:51:05.106114 7f760fcc1700  1 -- 10.0.0.13:0/27661
> --> 10.0.0.12:6805/1949 -- osd_ping(ping e15675 stamp 2017-09-30
> 15:51:05.105439) v4 -- 0x7f768fcd6200 con 0
>  -9989> 2017-09-30 15:51:05.106134 7f760fcc1700  1 --
> 192.168.10.113:0/27661 --> 192.168.10.112:6805/1949 -- osd_ping(ping
> e15675 stamp 2017-09-30 15:51:05.105439) v4 -- 0x7f765f27a800 con 0
>
>
> ...
>
>
>   -29> 2017

Re: [ceph-users] Ceph monitoring

2017-10-02 Thread David
If you take Ceph out of your search string, you should find loads of
tutorials on setting up the popular collectd/influxdb/grafana stack. Once
you've got that in place, the Ceph bit should be fairly easy. There are
Ceph collectd plugins out there, or you could write your own.



On Mon, Oct 2, 2017 at 12:34 PM, Osama Hasebou  wrote:

> Hi Everyone,
>
> Is there a guide/tutorial about how to setup Ceph monitoring system using
> collectd / grafana / graphite ? Other suggestions are welcome as well !
>
> I found some GitHub solutions but not much documentation on how to
> implement.
>
> Thanks.
>
> Regards,
> Ossi
>
>


[ceph-users] Discontiune of cn.ceph.com

2017-10-02 Thread Shengjing Zhu
Hi,

According to regulations in China, we, the mirror site
mirrors.ustc.edu.cn, are no longer able to serve the domain
cn.ceph.com, which has no ICP license[1].

Please either disable the CNAME record of cn.ceph.com or change it to a
mirror like hk.ceph.com.

People can still access our mirror via https://mirrors.ustc.edu.cn/ceph

Sorry for the inconvenience.

[1] https://en.wikipedia.org/wiki/ICP_license

Shengjing Zhu,
on behalf of admins of mirrors.ustc.edu.cn




Re: [ceph-users] Ceph monitoring

2017-10-02 Thread German Anders
Prometheus has a nice Ceph data exporter built in Go; you can then feed
the metrics into Grafana or any other tool:

https://github.com/digitalocean/ceph_exporter

*German*

2017-10-02 8:34 GMT-03:00 Osama Hasebou :

> Hi Everyone,
>
> Is there a guide/tutorial about how to setup Ceph monitoring system using
> collectd / grafana / graphite ? Other suggestions are welcome as well !
>
> I found some GitHub solutions but not much documentation on how to
> implement.
>
> Thanks.
>
> Regards,
> Ossi
>
>


Re: [ceph-users] Ceph monitoring

2017-10-02 Thread Matthew Vernon
On 02/10/17 12:34, Osama Hasebou wrote:
> Hi Everyone,
> 
> Is there a guide/tutorial about how to setup Ceph monitoring system
> using collectd / grafana / graphite ? Other suggestions are welcome as
> well !

We just installed the collectd plugin for Ceph and pointed it at our
graphite server; that did most of what we wanted (we also needed a
script to monitor wear on our SSD devices).

Making a dashboard is rather a matter of personal preference - we plot
client and S3 I/O, network, server load & CPU use, and have indicator
plots for the numbers of OSDs up & in, and we monitor quorum.

[I could share our dashboard JSON, but it's obviously specific to our
data sources]
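
For completeness, the collectd side is just the stock ceph plugin pointed
at the daemons' admin sockets; a minimal sketch (daemon names and socket
paths follow the default layout and will differ per host):

```
LoadPlugin ceph
<Plugin ceph>
  <Daemon "osd.0">
    SocketPath "/var/run/ceph/ceph-osd.0.asok"
  </Daemon>
  <Daemon "mon.a">
    SocketPath "/var/run/ceph/ceph-mon.a.asok"
  </Daemon>
</Plugin>
```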

Regards,

Matthew


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


Re: [ceph-users] Ceph monitoring

2017-10-02 Thread Erik McCormick
On Mon, Oct 2, 2017 at 11:55 AM, Matthew Vernon  wrote:
> On 02/10/17 12:34, Osama Hasebou wrote:
>> Hi Everyone,
>>
>> Is there a guide/tutorial about how to setup Ceph monitoring system
>> using collectd / grafana / graphite ? Other suggestions are welcome as
>> well !
>
> We just installed the collectd plugin for ceph, and pointed it at our
> grahphite server; that did most of what we wanted (we also needed a
> script to monitor wear on our SSD devices).
>
> Making a dashboard is rather a matter of personal preference - we plot
> client and s3 i/o, network, server load & CPU use, and have indicator
> plots for numbers of osds up&in, and monitor quorum.
>
> [I could share our dashboard JSON, but it's obviously specific to our
> data sources]
>
> Regards,
>
> Matthew
>
>

I for one would love to see your dashboard. host and data source names
can be easily replaced :)

-Erik

> --
>  The Wellcome Trust Sanger Institute is operated by Genome Research
>  Limited, a charity registered in England with number 1021457 and a
>  company registered in England with number 2742969, whose registered
>  office is 215 Euston Road, London, NW1 2BE.


[ceph-users] decreasing number of PGs

2017-10-02 Thread Andrei Mikhailovsky
Hello everyone, 

What is the safest way to decrease the number of PGs in the cluster?
Currently, I have too many per OSD.

Thanks 


Re: [ceph-users] Ceph monitoring

2017-10-02 Thread Reed Dier
As someone currently running collectd/influxdb/grafana stack for monitoring, I 
am curious if anyone has seen issues moving Jewel -> Luminous.

I thought I remembered reading that collectd wasn't working perfectly in
Luminous, likely not helped by the new MGR daemon.

I also thought about trying Telegraf, but permissions issues in Jewel caused
me to punt (see: https://github.com/influxdata/telegraf/issues/1657).
It looked like Telegraf could supply pool-level performance ops that I
wasn't seeing in collectd.

Was planning on starting the Luminous upgrades this week, so this thread seemed 
a good time to ask.

> ii  collectd 5.6.2.37.gfd01cdd-1~xenial

It looks like I'm running the 5.6 branch of collectd. I don't see any Ceph
changes in the 5.7 branch, so I won't rock the boat by upgrading to 5.7.x
immediately.

Just curious what the early Luminous users are seeing.

Thanks,

Reed

> On Oct 2, 2017, at 2:26 PM, Erik McCormick  wrote:
> 
> On Mon, Oct 2, 2017 at 11:55 AM, Matthew Vernon  > wrote:
>> On 02/10/17 12:34, Osama Hasebou wrote:
>>> Hi Everyone,
>>> 
>>> Is there a guide/tutorial about how to setup Ceph monitoring system
>>> using collectd / grafana / graphite ? Other suggestions are welcome as
>>> well !
>> 
>> We just installed the collectd plugin for ceph, and pointed it at our
>> grahphite server; that did most of what we wanted (we also needed a
>> script to monitor wear on our SSD devices).
>> 
>> Making a dashboard is rather a matter of personal preference - we plot
>> client and s3 i/o, network, server load & CPU use, and have indicator
>> plots for numbers of osds up&in, and monitor quorum.
>> 
>> [I could share our dashboard JSON, but it's obviously specific to our
>> data sources]
>> 
>> Regards,
>> 
>> Matthew
>> 
>> 
> 
> I for one would love to see your dashboard. host and data source names
> can be easily replaced :)
> 
> -Erik
> 
>> --
>> The Wellcome Trust Sanger Institute is operated by Genome Research
>> Limited, a charity registered in England with number 1021457 and a
>> company registered in England with number 2742969, whose registered
>> office is 215 Euston Road, London, NW1 2BE.


Re: [ceph-users] decreasing number of PGs

2017-10-02 Thread Jack
You cannot;


On 02/10/2017 21:43, Andrei Mikhailovsky wrote:
> Hello everyone, 
> 
> what is the safest way to decrease the number of PGs in the cluster. 
> Currently, I have too many per osd. 
> 
> Thanks 
> 
> 
> 


Re: [ceph-users] decreasing number of PGs

2017-10-02 Thread David Turner
Adding more OSDs or deleting/recreating pools that have too many PGs are
your only two options to reduce the number of PGs per OSD.  The ability to
decrease pg_num is on the Ceph roadmap, but is not a currently supported
feature.  You can alternatively raise the warning threshold, but it is
still a problem you should address in your cluster.
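
For context, the warning compares the per-OSD PG count, where each PG
counts once per replica; a sketch of the arithmetic with made-up numbers:

```shell
# Hypothetical cluster: 24 OSDs, two replicated pools of size 3
num_osds=24
pg_pool_a=1024    # pg_num of the first pool
pg_pool_b=512     # pg_num of the second pool
size=3            # replica count of both pools

# Each PG places one copy on `size` OSDs, so the average per-OSD count is:
pg_per_osd=$(( (pg_pool_a + pg_pool_b) * size / num_osds ))
echo "$pg_per_osd"    # 192 -- just under Luminous' new warning limit of 200
```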

On Mon, Oct 2, 2017 at 4:02 PM Jack  wrote:

> You cannot;
>
>
> On 02/10/2017 21:43, Andrei Mikhailovsky wrote:
> > Hello everyone,
> >
> > what is the safest way to decrease the number of PGs in the cluster.
> Currently, I have too many per osd.
> >
> > Thanks
> >
> >
> >


Re: [ceph-users] MDS crashes shortly after startup while trying to purge stray files.

2017-10-02 Thread Patrick Donnelly
On Thu, Sep 28, 2017 at 5:16 AM, Micha Krause  wrote:
> Hi,
>
> I had a chance to catch John Spray at the Ceph Day, and he suggested that I
> try to reproduce this bug in luminos.

Did you edit the code before trying Luminous? I also noticed from your
original mail that you appear to be using multiple active metadata
servers. If so, that's not stable in Jewel; you may have tripped on
one of many bugs fixed in Luminous for that configuration.

-- 
Patrick Donnelly


Re: [ceph-users] BlueStore questions about workflow and performance

2017-10-02 Thread Sam Huracan
Can anyone help me?

On Oct 2, 2017 17:56, "Sam Huracan"  wrote:

> Hi,
>
> I'm reading this document:
>  http://storageconference.us/2017/Presentations/CephObjectStore-slides.pdf
>
> I have 3 questions:
>
> 1. BlueStore writes both data (to raw block device) and metadata (to
> RockDB) simultaneously, or sequentially?
>
> 2. From my opinion, performance of BlueStore can not compare to FileStore
> using SSD Journal, because performance of raw disk is less than using
> buffer. (this is buffer purpose). How do you think?
>
> 3.  Do setting Rock DB and Rock DB Wal in SSD only enhance write, read
> performance? or both?
>
> Hope your answer,
>
>
>


[ceph-users] Ceph on ARM meeting canceled

2017-10-02 Thread Leonardo Vaz
Hey Cephers,

My apologies for the short notice, but the Ceph on ARM meeting scheduled
for tomorrow (Oct 3) has been canceled.

Kindest regards,

Leo

-- 
Leonardo Vaz
Ceph Community Manager
Open Source and Standards Team


Re: [ceph-users] RGW how to delete orphans

2017-10-02 Thread Christian Wuerdig
yes, at least that's how I'd interpret the information given in this
thread: 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-February/016521.html
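
Concretely, the workflow reads roughly like this (a sketch; the pool name
and job id are placeholders, and the deletion step is deliberately manual):

```shell
# Scan for RADOS objects not referenced by any bucket index;
# the scan is resumable under a job id
radosgw-admin orphans find --pool=default.rgw.buckets.data --job-id=orphans1

# Review the objects the scan reports, then delete them yourself once
# satisfied, e.g.: rados -p default.rgw.buckets.data rm <object>

# Finally clean up the scan's own bookkeeping
radosgw-admin orphans finish --job-id=orphans1
```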

On Tue, Oct 3, 2017 at 1:11 AM, Webert de Souza Lima
 wrote:
> Hey Christian,
>
>> On 29 Sep 2017 12:32 a.m., "Christian Wuerdig"
>>  wrote:
>>>
>>> I'm pretty sure the orphan find command does exactly just that -
>>> finding orphans. I remember some emails on the dev list where Yehuda
>>> said he wasn't 100% comfortable of automating the delete just yet.
>>> So the purpose is to run the orphan find tool and then delete the
>>> orphaned objects once you're happy that they all are actually
>>> orphaned.
>>>
>
> so what you mean is that one should manually remove the result listed
> objects that are output?
>
>
> Regards,
>
> Webert Lima
> DevOps Engineer at MAV Tecnologia
> Belo Horizonte - Brasil
>
>


Re: [ceph-users] Discontiune of cn.ceph.com

2017-10-02 Thread Leonardo Vaz
On Mon, Oct 02, 2017 at 11:47:47PM +0800, Shengjing Zhu wrote:
> Hi,
> 
> According to the regulation in China, we, the mirror site of
> mirrors.ustc.edu.cn, is no longer able to serve the domain
> cn.ceph.com, which has no ICP license[1].
> 
> Please either disable the CNAME record of cn.ceph.com or change it to a
> mirror like hk.ceph.com.
> 
> People can still access our mirror via https://mirrors.ustc.edu.cn/ceph
> 
> Sorry for the inconvenience.
> 
> [1] https://en.wikipedia.org/wiki/ICP_license
> 
> Shengjing Zhu,
> on behalf of admins of mirrors.ustc.edu.cn

Thanks for contacting us Shengjing, we are going to remove the CNAME.

Kindest regards,

Leo

-- 
Leonardo Vaz
Ceph Community Manager
Open Source and Standards Team


Re: [ceph-users] Ceph monitoring

2017-10-02 Thread Jasper Spaans
Hi,

On 02/10/2017 13:34, Osama Hasebou wrote:

> Hi Everyone,
>
> Is there a guide/tutorial about how to setup Ceph monitoring system
> using collectd / grafana / graphite ? Other suggestions are welcome as
> well !
>
> I found some GitHub solutions but not much documentation on how to
> implement.
>

I tried setting up Prometheus to add monitoring to my Ceph
single-node cluster at home using the new ceph-mgr goodies, but that
didn't really work out of the box[0]. This is because there are some issues
with the identifier names generated by the prometheus module for
ceph-mgr in luminous, which appear to have been solved in the master branch.

Just plugging in a fresh prometheus/module.py[1] and restarting the mgr
daemon allowed me to actually scrape the target with Prometheus, though.
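
For anyone trying the same, the mgr side is a single command (port 9283 is
the module's default; the scrape config below is an assumed minimal
example):

```shell
# Enable the prometheus module on the mgr; it serves metrics on :9283
ceph mgr module enable prometheus

# Point Prometheus at it, e.g. in prometheus.yml:
#   scrape_configs:
#     - job_name: 'ceph'
#       static_configs:
#         - targets: ['ceph-mgr-host:9283']
```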

Now to find or build a pretty dashboard with all of these metrics. I
wasn't able to find anything among the Grafana-supplied dashboards, and
haven't spent enough time on openATTIC to extract a dashboard from
there. Any pointers appreciated!

As a side note, during his talk at the NL Ceph day, John Spray spoke
about being more liberal with updates/backports for those modules. Would
this be a candidate for such a policy, as the current one is dysfunctional?


Cheers,
Jasper


[0] I'm running the Ceph-supplied packages on Debian, currently at
12.2.1-1~bpo90+1.
[1]
https://github.com/ceph/ceph/blob/master/src/pybind/mgr/prometheus/module.py