Re: [ceph-users] More problems building Ceph....
Noah Watkins writes: > Oh, it looks like autogen.sh is smart about that now. If you using the > latest master, my suggestion may not be the solution. > > On Fri, Jul 25, 2014 at 11:51 AM, Noah Watkins > wrote: >> Make sure you are intializing the sub-modules.. the autogen.sh script >> should probably notify users when these are missing and/or initialize >> them automatically.. >> >> git submodule init >> git submodule update >> >> or alternatively, git clone --recursive ... >> >> On Fri, Jul 25, 2014 at 11:48 AM, Deven Phillips >> wrote: >>> I'm trying to build DEB packages for my armhf devices, but my most recent >>> efforts are dying. Anny suggestions would be MOST welcome! >>> >>> make[5]: Entering directory `/home/cubie/Source/ceph/src/java' >>> jar cf libcephfs.jar -C java com/ceph/fs/CephMount.class -C java >>> com/ceph/fs/CephStat.class -C java com/ceph/fs/CephStatVFS.class -C java >>> com/ceph/fs/CephNativeLoader.class -C java >>> com/ceph/fs/CephNotMountedException.class -C java >>> com/ceph/fs/CephFileAlreadyExistsException.class -C java >>> com/ceph/fs/CephAlreadyMountedException.class -C java >>> com/ceph/fs/CephNotDirectoryException.class -C java >>> com/ceph/fs/CephPoolException.class -C java com/ceph/fs/CephFileExtent.class >>> -C java com/ceph/crush/Bucket.class >>> export CLASSPATH=:/usr/share/java/junit4.jar:java/:test/ ; \ >>> javac -source 1.5 -target 1.5 -Xlint:-options >>> test/com/ceph/fs/*.java >>> jar cf libcephfs-test.jar -C test com/ceph/fs/CephDoubleMountTest.class -C >>> test com/ceph/fs/CephMountCreateTest.class -C test >>> com/ceph/fs/CephMountTest.class -C test com/ceph/fs/CephUnmountedTest.class >>> -C test com/ceph/fs/CephAllTests.class >>> make[5]: Leaving directory `/home/cubie/Source/ceph/src/java' >>> make[4]: Leaving directory `/home/cubie/Source/ceph/src/java' >>> Making all in libs3 >>> make[4]: Entering directory `/home/cubie/Source/ceph/src/libs3' >>> make[4]: *** No rule to make target `all'. Stop. >>> make[4]: Leaving directory `/home/cubie/Source/ceph/src/libs3' >>> make[3]: *** [all-recursive] Error 1 >>> make[3]: Leaving directory `/home/cubie/Source/ceph/src' >>> make[2]: *** [all] Error 2 >>> make[2]: Leaving directory `/home/cubie/Source/ceph/src' >>> make[1]: *** [all-recursive] Error 1 >>> make[1]: Leaving directory `/home/cubie/Source/ceph' >>> make: *** [build-stamp] Error 2 >>> dpkg-buildpackage: error: debian/rules build gave error exit status 2 For me (ubuntu trusty) building via dpkg-buildpackage seems to work perfectly fine. However the other day when I tried building libs3 as a standalone, it errored out. Here I found that the Makefile has a default version number of trunk.trunk which breaks the debian rules. Changing that to a numeric value seemed to work. >>> >>> Thanks in advance! >>> >>> Deven >>> -- Abhishek >>> ___ >>> ceph-users mailing list >>> ceph-users@lists.ceph.com >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>> > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Fw: single node installation
Lorieri writes: > http://ceph.com/docs/dumpling/start/quick-ceph-deploy/ These steps work against the current ceph release (firefly) as well, for me, as long as the config file has the setting osd crush chooseleaf type = 0 -- Abhishek L pgp: 69CF 4838 8EE3 746C 5ED4 1F16 F9F0 641F 1B65 ED5F ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
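For reference, a minimal single-node ceph.conf sketch with that setting (values are only illustrative; adjust the replica count to your setup):

    [global]
    # allow replicas to be placed on different OSDs of the same host
    osd crush chooseleaf type = 0
    # assumption: on a single node you will likely also want fewer replicas
    osd pool default size = 2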
Re: [ceph-users] Fw: single node installation
pragya jain writes: > hi abhishek and lorieri! > > the link you have mentioned has two node installation: one is admin-node and another is server-node. > for this installation, as i understand, i need two Ubuntu VMs - one for each node. > > Am i right? > You can do the same steps on a single node, i.e. install the mons & osds on that one node. Alternatively, if you plan on running ceph as a backend for openstack glance & cinder, you could try the latest devstack http://techs.enovance.com/6572/brace-yourself-devstack-ceph-is-here Regards -- Abhishek L pgp: 69CF 4838 8EE3 746C 5ED4 1F16 F9F0 641F 1B65 ED5F ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] updation of container and account while using Swift API
Hi pragya jain writes: > Hello all! > I have some basic questions about the process followed by Ceph > software when a user uses the Swift APIs for accessing its > storage. 1. According to my understanding, to keep the objects listing > in containers and the containers listing in an account, Ceph software > maintains different pools for accounts and containers. To what > extent is this right? Yes, they are maintained in different pools. The .rgw pool stores the buckets, the users/accounts are stored in the .users (or .users.uid) pool; IIRC, the list of buckets per user is stored as omap keys of the user object. > 2. When a user uploads an object using the Swift APIs, what procedure > does Ceph software follow to update the object listing and bytes used in > the container and account? Please help in this regard. The objects are stored in the .rgw.buckets pool, and a bucket stores the list of objects it contains as omap keys of the bucket object. So this would be the place to look. I'm not sure where the size info for objects/buckets is stored while doing a HEAD on the swift account, though it would be interesting to know. I had written some notes on this, mostly from different mailing list discussions in ceph-devel[1], though they were not updated as much as I wanted. [1] https://github.com/theanalyst/notes/blob/master/rgw.org#buckets > Thank you - Regards Pragya Jain Department of Computer Science University of > Delhi Delhi, India ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
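A rough sketch of how to poke at these structures with rados (pool and object names assume the pre-Jewel defaults mentioned above and may differ between versions; placeholders in angle brackets are not real names):

    $ rados -p .users.uid ls                                  # one metadata object per user
    $ rados -p .users.uid listomapkeys <uid>.buckets          # bucket list kept as omap keys of the user object
    $ rados -p .rgw.buckets.index listomapkeys .dir.<bucket-marker>   # per-bucket object listing
                                                              # (the index pool name depends on the placement config)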
Re: [ceph-users] Shadow files
Yehuda Sadeh-Weinraub writes: >> Is there a quick way to see which shadow files are safe to delete >> easily? > > There's no easy process. If you know that a lot of the removed data is on > buckets that shouldn't exist anymore then you could start by trying to > identify that. You could do that by: > > $ radosgw-admin metadata list bucket > > then, for each bucket: > > $ radosgw-admin metadata get bucket: > > This will give you the bucket markers of all existing buckets. Each data > object (head and shadow objects) is prefixed by bucket markers. Objects that > don't have valid bucket markers can be removed. Note that I would first list > all objects, then get the list of valid bucket markers, as the operation is > racy and new buckets can be created in the mean time. > > We did discuss a new garbage cleanup tool that will address your specific > issue, and we have a design for it, but it's not there yet. > Could you share the design/ideas for making the cleanup tool. After an initial search I could only find two issues [1] http://tracker.ceph.com/issues/10342 [2] http://tracker.ceph.com/issues/9604 though not much details are there to get started. -- Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
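For reference, a hedged sketch of the marker-matching approach Yehuda describes (default pool names assumed; treat the results as candidates only and verify carefully before removing anything):

    $ radosgw-admin metadata list bucket
    $ radosgw-admin metadata get bucket:<bucket-name>      # note the "marker" field in the output
    $ rados -p .rgw.buckets ls > all-objects.txt            # head + shadow objects, prefixed by their bucket marker
    # objects whose prefix matches none of the valid markers are orphan candidates;
    # list the objects first and collect markers afterwards, since new buckets can appear in between (racy)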
[ceph-users] Radosgw multi-region user creation question
Hi I'm trying to set up a POC multi-region radosgw configuration (with different ceph clusters). Following the official docs[1], the part about creating the zone system users was not very clear. Take an example configuration of 2 regions, US (master zone us-dc1) and EU (master zone eu-dc1), with secondary zones of the other region also created in each region. If I create the zone users separately in the 2 regions, i.e. a us-dc1 zone user & an eu-dc1 zone user, the metadata sync does occur, but if I try to create a bucket with the location passed as the secondary region, it fails with a 403 (access denied), as the system user of the secondary region is unknown to the master region. I was able to bypass this by creating a system user for the secondary zone of the secondary region in the master region (i.e. creating a system user for the eu secondary zone in the us region) and then recreating the user in the secondary region by passing the --access-key & --secret-key parameters to recreate the same user with the same keys. This seemed to work, however I'm not sure whether this is the right direction to proceed, as the docs do not mention a step like this. [1] http://ceph.com/docs/master/radosgw/federated-config/#configure-a-secondary-region -- Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
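For illustration, the workaround described above would look roughly like this (the uid, display name and keys are made-up placeholders, not values from the docs):

    # in the master region, create the secondary region's system user
    $ radosgw-admin user create --uid=eu-system --display-name="EU system user" \
        --access-key=<key> --secret=<secret> --system
    # in the secondary region, recreate the same user with identical keys
    $ radosgw-admin user create --uid=eu-system --display-name="EU system user" \
        --access-key=<key> --secret=<secret> --system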
[ceph-users] Radosgw startup failures & misdirected client requests
We've had a hammer (0.94.1) (virtual) 3 node/3 osd cluster with radosgws failing to start, failing continously with the following error: --8<---cut here---start->8--- 2015-05-06 04:40:38.815545 7f3ef9046840 0 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process radosgw, pid 32580 2015-05-06 04:40:38.834785 7f3ef9046840 -1 failed reading default region info: (6) No such device or address 2015-05-06 04:40:38.839987 7f3ef9046840 -1 Couldn't init storage provider (RADOS) --8<---cut here---end--->8--- The ceph logs at the same moment reveal the following: --8<---cut here---start->8--- 2015-05-06 04:39:56.287522 mon.0 X.X.X.6:6789/0 304 : cluster [INF] HEALTH_OK 2015-05-06 04:40:03.721690 osd.0 X.X.X.10:6800/26741 61 : cluster [WRN] client.14340 X.X.X.7:0/1029229 misdirected client.14340.0:1 pg 4.9a566808 to osd.0 not [2,1,0] in e17/17 2015-05-06 04:40:38.834364 osd.0 X.X.X.10:6800/26741 62 : cluster [WRN] client.24328 X.X.X.10:0/1032583 misdirected client.24328.0:1 pg 4.9a566808 to osd.0 not [2,1,0] in e17/17 2015-05-06 04:40:53.521053 osd.0 X.X.X.10:6800/26741 63 : cluster [WRN] client.24337 X.X.X.6:0/1030829 misdirected client.24337.0:1 pg 4.9a566808 to osd.0 not [2,1,0] in e17/17 2015-05-06 04:41:01.949213 osd.0 X.X.X.10:6800/26741 64 : cluster [WRN] client.24346 X.X.X.10:0/1001510 misdirected client.24346.0:1 pg4.9a566808 to osd.0 not [2,1,0] in e17/17 --8<---cut here---end--->8--- Radosgw was failing forever with the same log, with the ceph cluster showing similiar logs as above. On looking at the pools created by radosgw only `.rgw.root` pool could be found with no objects inside the pool. Pools were not manually assigned to rgw, so rgw should create this at the first time, so looking back at the beginning of radosgw's logs, during the first initialization cycle we see: --8<---cut here---start->8--- 2015-05-06 04:35:43.355111 7f5214bab840 0 ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process radosgw, pid 24495 2015-05-06 04:37:00.084655 7f5214bab840 0 couldn't find old data placement pools config, setting up new ones for the zone 2015-05-06 04:37:00.091554 7f5214bab840 -1 error storing zone params: (6) No such device or address 2015-05-06 04:37:00.095111 7f5214bab840 -1 Couldn't init storage provider (RADOS) --8<---cut here---end--->8--- At this point the ceph cluster was still at 2 osds; the ceph logs as below: --8<---cut here---start->8--- 2015-05-06 04:35:53.971872 mon.0 X.X.X.6:6789/0 145 : cluster [INF] osdmap e10: 2 osds: 1 up, 1 in 2015-05-06 04:35:54.231036 mon.0 X.X.X.6:6789/0 146 : cluster [INF] pgmap v23: 120 pgs: 120 undersized+degraded+peered; 0 bytes data, 4129 MB used, 4610 MB / 8740 MB avail 2015-05-06 04:35:57.844222 mon.0 X.X.X.6:6789/0 148 : cluster [INF] pgmap v24: 120 pgs: 120 undersized+degraded+peered; 0 bytes data, 4129 MB used, 4610 MB / 8740 MB avail 2015-05-06 04:36:14.237299 osd.0 X.X.X.10:6800/26741 47 : cluster [WRN] 6 slow requests, 1 included below; oldest blocked for > 1128.060992 secs 2015-05-06 04:36:14.237310 osd.0 X.X.X.10:6800/26741 48 : cluster [WRN] slow request 30.869139 seconds old, received at 2015-05-06 04:35:43.368075: osd_op(client.6515.0:1 default.region [getxattrs,stat] 4.9a566808 ack+read+known_if_redirected e9) currently reached_pg 2015-05-06 04:36:20.142985 mon.0 X.X.X.6:6789/0 165 : cluster [INF] osdmap e11: 2 osds: 1 up, 1 in 2015-05-06 04:36:20.249438 mon.0 X.X.X.6:6789/0 166 : cluster [INF] pgmap v25: 120 pgs: 120 undersized+degraded+peered; 0 bytes data, 4129 MB used, 4610 MB / 8740 MB avail 
2015-05-06 04:36:21.437516 mon.0 X.X.X.6:6789/0 167 : cluster [INF] osd.1 X.X.X.6:6800/25630 boot 2015-05-06 04:36:21.465908 mon.0 X.X.X.6:6789/0 168 : cluster [INF] osdmap e12: 2 osds: 2 up, 2 in 2015-05-06 04:36:21.551237 mon.0 X.X.X.6:6789/0 169 : cluster [INF] pgmap v26: 120 pgs: 120 undersized+degraded+peered; 0 bytes data, 4129 MB used, 4610 MB / 8740 MB avail 2015-05-06 04:36:22.752684 mon.0 X.X.X.6:6789/0 170 : cluster [INF] osdmap e13: 2 osds: 2 up, 2 in 2015-05-06 04:36:22.845573 mon.0 X.X.X.6:6789/0 171 : cluster [INF] pgmap v27: 120 pgs: 120 undersized+degraded+peered; 0 bytes data, 4129 MB used, 4610 MB / 8740 MB avail 2015-05-06 04:36:15.237727 osd.0 X.X.X.10:6800/26741 49 : cluster [WRN] 7 slow requests, 1 included below; oldest blocked for > 1129.061337 secs 2015-05-06 04:36:15.237752 osd.0 X.X.X.10:6800/26741 50 : cluster [WRN] slow request 30.746163 seconds old, received at 2015-05-06 04:35:44.491396: osd_op(client.6524.0:1 default.region [getxattrs,stat] 4.9a566808 ack+read+known_if_redirected e9) currently reached_pg 2015-05-06 04:36:27.497253 mon.0 X.X.X.6:6789/0 174 : cluster [INF] pgmap v28: 120 pgs: 29 cr
Re: [ceph-users] Radosgw startup failures & misdirected client requests
On Tue, May 12, 2015 at 9:13 PM, Abhishek L wrote: > > We've had a hammer (0.94.1) (virtual) 3 node/3 osd cluster with radosgws > failing to start, failing continously with the following error: > > --8<---cut here---start->8--- > 2015-05-06 04:40:38.815545 7f3ef9046840 0 ceph version 0.94.1 > (e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process radosgw, pid 32580 > 2015-05-06 04:40:38.834785 7f3ef9046840 -1 failed reading default region > info: (6) No such device or address > 2015-05-06 04:40:38.839987 7f3ef9046840 -1 Couldn't init storage provider > (RADOS) > --8<---cut here---end--->8--- > > The ceph logs at the same moment reveal the following: > > --8<---cut here---start->8--- > 2015-05-06 04:39:56.287522 mon.0 X.X.X.6:6789/0 304 : cluster [INF] HEALTH_OK > 2015-05-06 04:40:03.721690 osd.0 X.X.X.10:6800/26741 61 : cluster [WRN] > client.14340 X.X.X.7:0/1029229 misdirected client.14340.0:1 pg 4.9a566808 to > osd.0 not [2,1,0] in e17/17 > 2015-05-06 04:40:38.834364 osd.0 X.X.X.10:6800/26741 62 : cluster [WRN] > client.24328 X.X.X.10:0/1032583 misdirected client.24328.0:1 pg 4.9a566808 to > osd.0 not [2,1,0] in e17/17 > 2015-05-06 04:40:53.521053 osd.0 X.X.X.10:6800/26741 63 : cluster [WRN] > client.24337 X.X.X.6:0/1030829 misdirected client.24337.0:1 pg 4.9a566808 to > osd.0 not [2,1,0] in e17/17 > 2015-05-06 04:41:01.949213 osd.0 X.X.X.10:6800/26741 64 : cluster [WRN] > client.24346 X.X.X.10:0/1001510 misdirected client.24346.0:1 pg4.9a566808 to > osd.0 not [2,1,0] in e17/17 > --8<---cut here---end--->8--- > > Radosgw was failing forever with the same log, with the ceph cluster > showing similiar logs as above. On looking at the pools created by > radosgw only `.rgw.root` pool could be found with no objects inside the > pool. Pools were not manually assigned to rgw, so rgw should create > this at the first time, so looking back at the beginning of radosgw's > logs, during the first initialization cycle we see: > > --8<---cut here---start->8--- > 2015-05-06 04:35:43.355111 7f5214bab840 0 ceph version 0.94.1 > (e4bfad3a3c51054df7e537a724c8d0bf9be972ff), process radosgw, pid 24495 > 2015-05-06 04:37:00.084655 7f5214bab840 0 couldn't find old data placement > pools config, setting up new ones for the zone > 2015-05-06 04:37:00.091554 7f5214bab840 -1 error storing zone params: (6) No > such device or address > 2015-05-06 04:37:00.095111 7f5214bab840 -1 Couldn't init storage provider > (RADOS) > --8<---cut here---end--->8--- > > At this point the ceph cluster was still at 2 osds; the ceph logs as > below: > > --8<---cut here---start->8--- > 2015-05-06 04:35:53.971872 mon.0 X.X.X.6:6789/0 145 : cluster [INF] osdmap > e10: 2 osds: 1 up, 1 in > 2015-05-06 04:35:54.231036 mon.0 X.X.X.6:6789/0 146 : cluster [INF] pgmap > v23: 120 pgs: 120 undersized+degraded+peered; 0 bytes data, 4129 MB used, > 4610 MB / 8740 MB avail > 2015-05-06 04:35:57.844222 mon.0 X.X.X.6:6789/0 148 : cluster [INF] pgmap > v24: 120 pgs: 120 undersized+degraded+peered; 0 bytes data, 4129 MB used, > 4610 MB / 8740 MB avail > 2015-05-06 04:36:14.237299 osd.0 X.X.X.10:6800/26741 47 : cluster [WRN] 6 > slow requests, 1 included below; oldest blocked for > 1128.060992 secs > 2015-05-06 04:36:14.237310 osd.0 X.X.X.10:6800/26741 48 : cluster [WRN] slow > request 30.869139 seconds old, received at 2015-05-06 04:35:43.368075: > osd_op(client.6515.0:1 default.region [getxattrs,stat] 4.9a566808 > ack+read+known_if_redirected e9) currently reached_pg > 2015-05-06 04:36:20.142985 mon.0 X.X.X.6:6789/0 165 : cluster [INF] osdmap > e11: 2 
osds: 1 up, 1 in > 2015-05-06 04:36:20.249438 mon.0 X.X.X.6:6789/0 166 : cluster [INF] pgmap > v25: 120 pgs: 120 undersized+degraded+peered; 0 bytes data, 4129 MB used, > 4610 MB / 8740 MB avail > 2015-05-06 04:36:21.437516 mon.0 X.X.X.6:6789/0 167 : cluster [INF] osd.1 > X.X.X.6:6800/25630 boot > 2015-05-06 04:36:21.465908 mon.0 X.X.X.6:6789/0 168 : cluster [INF] osdmap > e12: 2 osds: 2 up, 2 in > 2015-05-06 04:36:21.551237 mon.0 X.X.X.6:6789/0 169 : cluster [INF] pgmap > v26: 120 pgs: 120 undersized+degraded+peered; 0 bytes data, 4129 MB used, > 4610 MB / 8740 MB avail > 2015-05-06 04:36:22.752684 mon.0 X.X.X.6:6789/0 170 : cluster [INF] osdmap > e13: 2 osds: 2 up, 2 in > 2015-05-06 04:36:22.845573 mon.0 X.X.X.6:6789/0 171 : cluster [INF] pgmap > v27: 120 pgs: 120 undersized+degraded+peered; 0 bytes data, 4129 MB used,
Re: [ceph-users] Radosgw startup failures & misdirected client requests
[..] > Seeing this in the firefly cluster as well. Tried a couple of rados > commands on the .rgw.root pool; this is what is happening: > > abhi@st:~$ sudo rados -p .rgw.root put test.txt test.txt > error putting .rgw.root/test.txt: (6) No such device or address > > abhi@st:~$ sudo ceph osd map .rgw.root test.txt > osdmap e83 pool '.rgw.root' (6) object 'test.txt' -> pg 6.8b0b6108 > (6.0) -> up ([1,2,0], p1) acting ([1,2,0], p1) > > abhi@st:~$ sudo ceph pg map 6.8b0b6108 > osdmap e83 pg 6.8b0b6108 (6.0) -> up [0,2,1] acting [0,2,1] > > Looks like the osd map says the object must go to primary osd 1, > whereas pg map says that the pg is hosted with 0 as primary. > [..] Solved the problem; just posting it here in case anyone comes across this same error. Primarily the issue was due to a misconfiguration from our config management system, where `osd pool default pgp num` got set in ceph.conf and `pg num` didn't, which led to the rgw pools having the default pg num (8) and a pgp_num of 128. While commands like `ceph osd pool create` fail without specifying the pg count, `rados mkpool` does allow pool creation without specifying one, falling back to the default values, which probably explains what happened to the .rgw.root pool. An easy way to simulate this error would be a setting like `ceph tell mon.0 injectargs --osd_pool_default_pgp_num 128` and then starting a fresh radosgw (assuming it's not installed previously), or creating any pool with rados commands, which will fail when putting objects because the pgp count exceeds the pg count. -- Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
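A quick way to check for, and recover from, this kind of mismatch (assuming the affected pool is .rgw.root; the target value 128 is just an example):

    $ ceph osd pool get .rgw.root pg_num
    $ ceph osd pool get .rgw.root pgp_num
    # pgp_num should never exceed pg_num; pg_num can only be increased, so raise it to match
    $ ceph osd pool set .rgw.root pg_num 128
    $ ceph osd pool set .rgw.root pgp_num 128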
[ceph-users] PG object skew settings
Hi Is it safe to tweak the value of `mon pg warn max object skew` from the default value of 10 to a higher value of 20-30 or so? What would be a safe upper limit for this value? Also, what does exceeding this ratio signify in terms of cluster health? We are sometimes hitting this limit in the bucket index pool (.rgw.buckets.index), which has the same number of pgs as a few other pools which host almost no data, like the gc pool, root pool etc. Cheers! Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
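For reference, this option only controls when the HEALTH_WARN is raised, so raising it hides the imbalance rather than fixing it. A sketch of how it could be bumped:

    # ceph.conf, [mon] section
    mon pg warn max object skew = 20

    # or at runtime, without a restart
    $ ceph tell mon.* injectargs '--mon_pg_warn_max_object_skew 20'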
Re: [ceph-users] ceph.conf boolean value for mon_cluster_log_to_syslog
Gregory Farnum writes: > On Thu, May 21, 2015 at 8:24 AM, Kenneth Waegeman > wrote: >> Hi, >> >> Some strange issue wrt boolean values in the config: >> >> this works: >> >> osd_crush_update_on_start = 0 -> osd not updated >> osd_crush_update_on_start = 1 -> osd updated >> >> In a previous version we could set boolean values in the ceph.conf file with >> the integers 1(true) and false(0) also for mon_cluster_log_to_syslog, but >> this does not work anymore..: >> >> mon_cluster_log_to_syslog = true >> works, but >> mon_cluster_log_to_syslog = 1 >> does not. >> >> Is mon_cluster_log_to_syslog not a real boolean anymore? Or what could this >> be? > > Looking at src/common/config_opts.h, mon_cluster_log_to_syslog is a > string type now. I presume the code is interpreting it and there are > different options but I don't know when or why it got changed. :) Git blame tells me that it was introduced in https://github.com/ceph/ceph/pull/2118, (b97b06) where it was changed From bool to string. Though I can't answer the why :) > -Greg > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] xattrs vs. omap with radosgw
On Wed, Jun 17, 2015 at 1:02 PM, Nathan Cutler wrote: >> We've since merged something >> that stripes over several small xattrs so that we can keep things inline, >> but it hasn't been backported to hammer yet. See >> c6cdb4081e366f471b372102905a1192910ab2da. > > Hi Sage: > > You wrote "yet" - should we earmark it for hammer backport? > I'm guessing https://github.com/ceph/ceph/pull/4973 is the backport for hammer (issue http://tracker.ceph.com/issues/11981) Regards Abhishek ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] 'pgs stuck unclean ' problem
jan.zel...@id.unibe.ch writes: > Hi, > > as I had the same issue in a little virtualized test environment (3 x 10g lvm > volumes) I would like to understand the 'weight' thing. > I did not find any "userfriendly explanation" for that kind of problem. > > The only explanation I found is on > http://ceph.com/docs/master/rados/operations/crush-map/ : > > "specify a weight relative to the total capacity/capability of its item(s) > ... Items may have a weight that reflects the relative weight of the item." > > Can someone point me in the right direction ? In the same document, http://docs.ceph.com/docs/master/rados/operations/crush-map/#crush-map-bucket-hierarchy there is a note on "Weighting Bucket items", which roughly describes how that could be done. > > --- > > Jan > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
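As a concrete, purely illustrative example: weights are usually set proportional to capacity, commonly about 1.0 per TB, and can be adjusted per OSD:

    $ ceph osd tree                         # shows the current CRUSH weight of each host/OSD
    $ ceph osd crush reweight osd.3 1.819   # e.g. for a ~2 TB disk; the osd id and value here are examples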
Re: [ceph-users] Health WARN, ceph errors looping
Steve Dainard writes: > Hello, > > Ceph 0.94.1 > 2 hosts, Centos 7 > > I have two hosts, one which ran out of / disk space which crashed all > the osd daemons. After cleaning up the OS disk storage and restarting > ceph on that node, I'm seeing multiple errors, then health OK, then > back into the errors: > > # ceph -w > http://pastebin.com/mSKwNzYp Is the error still consistently happening? (the last lines shows active+clean) Wild guess, but is it possible some sort of iptables/firewall rules are preventing communication between the osds? > > Any help is appreciated. > > Thanks, > Steve > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
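If it is still flapping, a few generic things worth checking on both nodes (ports and paths below are the defaults):

    $ ceph osd stat                    # how many OSDs are currently up/in
    $ ceph health detail               # which PGs/OSDs are involved in the warnings
    $ ss -tlnp | grep ceph             # mon listens on 6789, OSDs on 6800-7300 by default
    $ iptables -L -n                   # make sure those ports aren't blocked between the hosts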
Re: [ceph-users] pg_num docs conflict with Hammer PG count warning
On Thu, Aug 6, 2015 at 1:55 PM, Hector Martin wrote: > On 2015-08-06 17:18, Wido den Hollander wrote: >> >> The amount of PGs is cluster wide and not per pool. So if you have 48 >> OSDs the rule of thumb is: 48 * 100 / 3 = 1600 PGs cluster wide. >> >> Now, with enough memory you can easily have 100 PGs per OSD, but keep in >> mind that the PG count is cluster-wide and not per pool. > > > I understand that now, but that is not what the docs say. The docs say 4096 > PGs per pool (i.e. in the "ceph osd pool create" command) for 48 OSDs. Which > seems to be off by a factor of 2.5x from the actual do-the-math > recommendation for one 3x pool, and successively larger factors as you add > pools. > 4096 was the count with *all* pools in mind; since you have 4 pools you should consider reducing the number. Also follow the rule Wido gave in the earlier mail for the calculation, i.e. n_OSD*100/3. BTW http://ceph.com/pgcalc/ might help you in selecting this number better. ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
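Spelling the arithmetic out (numbers are the ones from this thread; pool names and the split are only an illustration):

    #   48 OSDs * 100 PGs-per-OSD / 3 replicas = 1600  ->  round up to ~2048 PGs cluster-wide
    # split that budget across pools by expected data share, e.g.
    $ ceph osd pool create bigpool 1024 1024
    $ ceph osd pool create smallpool1 128 128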
Re: [ceph-users] Repair inconsistent pgs..
Voloshanenko Igor writes: > Hi Irek, Please read careful ))) > You proposal was the first, i try to do... That's why i asked about > help... ( > > 2015-08-18 8:34 GMT+03:00 Irek Fasikhov : > >> Hi, Igor. >> >> You need to repair the PG. >> >> for i in `ceph pg dump| grep inconsistent | grep -v 'inconsistent+repair' >> | awk {'print$1'}`;do ceph pg repair $i;done >> >> С уважением, Фасихов Ирек Нургаязович >> Моб.: +79229045757 >> >> 2015-08-18 8:27 GMT+03:00 Voloshanenko Igor : >> >>> Hi all, at our production cluster, due high rebalancing ((( we have 2 pgs >>> in inconsistent state... >>> >>> root@temp:~# ceph health detail | grep inc >>> HEALTH_ERR 2 pgs inconsistent; 18 scrub errors >>> pg 2.490 is active+clean+inconsistent, acting [56,15,29] >>> pg 2.c4 is active+clean+inconsistent, acting [56,10,42] >>> >>> From OSD logs, after recovery attempt: >>> >>> root@test:~# ceph pg dump | grep -i incons | cut -f 1 | while read i; do >>> ceph pg repair ${i} ; done >>> dumped all in format plain >>> instructing pg 2.490 on osd.56 to repair >>> instructing pg 2.c4 on osd.56 to repair >>> >>> /var/log/ceph/ceph-osd.56.log:51:2015-08-18 07:26:37.035910 7f94663b3700 >>> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 >>> f5759490/rbd_data.1631755377d7e.04da/head//2 expected clone >>> 90c59490/rbd_data.eb486436f2beb.7a65/141//2 >>> /var/log/ceph/ceph-osd.56.log:52:2015-08-18 07:26:37.035960 7f94663b3700 >>> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 >>> fee49490/rbd_data.12483d3ba0794b.522f/head//2 expected clone >>> f5759490/rbd_data.1631755377d7e.04da/141//2 >>> /var/log/ceph/ceph-osd.56.log:53:2015-08-18 07:26:37.036133 7f94663b3700 >>> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 >>> a9b39490/rbd_data.12483d3ba0794b.37b3/head//2 expected clone >>> fee49490/rbd_data.12483d3ba0794b.522f/141//2 >>> /var/log/ceph/ceph-osd.56.log:54:2015-08-18 07:26:37.036243 7f94663b3700 >>> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 >>> bac19490/rbd_data.1238e82ae8944a.032e/head//2 expected clone >>> a9b39490/rbd_data.12483d3ba0794b.37b3/141//2 >>> /var/log/ceph/ceph-osd.56.log:55:2015-08-18 07:26:37.036289 7f94663b3700 >>> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 >>> 98519490/rbd_data.123e9c2ae8944a.0807/head//2 expected clone >>> bac19490/rbd_data.1238e82ae8944a.032e/141//2 >>> /var/log/ceph/ceph-osd.56.log:56:2015-08-18 07:26:37.036314 7f94663b3700 >>> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 >>> c3c09490/rbd_data.1238e82ae8944a.0c2b/head//2 expected clone >>> 98519490/rbd_data.123e9c2ae8944a.0807/141//2 >>> /var/log/ceph/ceph-osd.56.log:57:2015-08-18 07:26:37.036363 7f94663b3700 >>> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 >>> 28809490/rbd_data.edea7460fe42b.01d9/head//2 expected clone >>> c3c09490/rbd_data.1238e82ae8944a.0c2b/141//2 >>> /var/log/ceph/ceph-osd.56.log:58:2015-08-18 07:26:37.036432 7f94663b3700 >>> -1 log_channel(cluster) log [ERR] : deep-scrub 2.490 >>> e1509490/rbd_data.1423897545e146.09a6/head//2 expected clone >>> 28809490/rbd_data.edea7460fe42b.01d9/141//2 >>> /var/log/ceph/ceph-osd.56.log:59:2015-08-18 07:26:38.548765 7f94663b3700 >>> -1 log_channel(cluster) log [ERR] : 2.490 deep-scrub 17 errors >>> >>> So, how i can solve "expected clone" situation by hand? >>> Thank in advance! I've had an inconsistent pg once, but it was a different sort of an error (some sort of digest mismatch, where the secondary object copies had later timestamps). 
This was fixed by moving the object away and restarting the osd, which got fixed when the osd peered, similar to what was mentioned in Sébastien Han's blog[1]. I'm guessing the same method will solve this error as well, but I'm not completely sure; maybe someone else who has seen this particular error could guide you better. [1]: http://www.sebastien-han.fr/blog/2015/04/27/ceph-manually-repair-object/ -- Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
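For completeness, a rough outline of the procedure from that blog post (it fixed the digest-mismatch case above; whether it applies to the "expected clone" errors is uncertain, so proceed carefully and keep a copy of anything you move):

    $ ceph health detail | grep inconsistent        # find the pg id and its acting set
    # on the OSD holding the bad copy:
    $ stop ceph-osd id=<id>                         # or the systemd/sysvinit equivalent
    $ ceph-osd -i <id> --flush-journal
    $ find /var/lib/ceph/osd/ceph-<id>/current/<pgid>_head/ -name '*<object-name>*'   # move the match aside
    $ start ceph-osd id=<id>
    $ ceph pg repair <pgid>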
Re: [ceph-users] Why are RGW pools all prefixed with a period (.)?
On Thu, Aug 27, 2015 at 3:01 PM, Wido den Hollander wrote: > On 08/26/2015 05:17 PM, Yehuda Sadeh-Weinraub wrote: >> On Wed, Aug 26, 2015 at 6:26 AM, Gregory Farnum wrote: >>> On Wed, Aug 26, 2015 at 9:36 AM, Wido den Hollander wrote: Hi, It's something which has been 'bugging' me for some time now. Why are RGW pools prefixed with a period? I tried setting the root pool to 'rgw.root', but RGW (0.94.1) refuses to start: ERROR: region root pool name must start with a period I'm sending pool statistics to Graphite and when sending a key like this you 'break' Graphite: ceph.pools.stats..kb_read A pool like .rgw.root will break this since Graphite splits on periods. So is there any reason why this is? What's the reasoning behind it? >>> >>> This might just be a leftover from when we were mapping buckets into >>> RADOS pools. Yehuda, is there some more current reason? >> >> No current reason. Moreover, I removed the need for that in the new >> multi-site work, so this requirement will be gone sooner or later >> (probably for Jewel). Note that users will still be able to prefix >> pools with a period, and some pools will stay like that for backward >> compatibility, so anything that breaks with these strings should be >> fixed. >> > > Ah, understood. Nice to know. > > I personally would like to see that all default pools from RGW are > prefixed with 'rgw.' so that it's easy to identify which pools belong to > RGW. > > No existing setups should be touched, but only new ones or people who > change the mapping of their pools. > BTW the current master pull request (https://github.com/ceph/ceph/pull/4944) does this ie. removing the dependencies for periods on poolnames ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] How to back up RGW buckets or RBD snapshots
Somnath Roy writes: > Hi, > I wanted to know how RGW users are backing up the bucket contents , so that > in the disaster scenario user can recreate the setup. > I know there is geo replication support but it could be an expensive > proposition. > I wanted to know if there is any simple solution like plugging in traditional > backup application to RGW. > The same problem applies for RBD as well, how people are backing up RBD > snapshots ? > I am sure production ceph users have something already in place and > appreciate any suggestion on this. As far as RBD is concerned you could backup the volumes to a different pool (or even a different cluster), as explained here: https://ceph.com/dev-notes/incremental-snapshots-with-rbd/ ie. doing a deep copy first and then copying incremental snapshots above that. Openstack cinder , for eg., supports this functionality in form of *backups* allowing backups to a different pool or cluster. > > Thanks & Regards > Somnath > > > > > PLEASE NOTE: The information contained in this electronic mail message is > intended only for the use of the designated recipient(s) named above. If the > reader of this message is not the intended recipient, you are hereby notified > that you have received this message in error and that any review, > dissemination, distribution, or copying of this message is strictly > prohibited. If you have received this communication in error, please notify > the sender by telephone or e-mail (as shown above) immediately and destroy > any and all copies of this message in your possession (whether hard copies or > electronically stored copies). > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
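A minimal sketch of that incremental workflow (pool, image and snapshot names are made up):

    # one-time: create an empty destination image of the same size in the backup pool/cluster
    $ rbd create backup/vol1 --size 10240
    # full copy expressed as a diff, then incrementals between snapshots
    $ rbd snap create rbd/vol1@snap1
    $ rbd export-diff rbd/vol1@snap1 - | rbd import-diff - backup/vol1
    $ rbd snap create rbd/vol1@snap2
    $ rbd export-diff --from-snap snap1 rbd/vol1@snap2 - | rbd import-diff - backup/vol1

import-diff also creates the end snapshot on the destination image, so the --from-snap chain stays consistent across runs.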
Re: [ceph-users] Ceph fs has error: no valid command found; 10 closest matches: fsid
Huynh Dac Nguyen writes: > Hi Chris, > > > I see. > I'm runing on version 0.80.7. > How do we know which part of document for our version? As you see, we > have only one ceph document here, It make us confused. > Could you show me the document for ceph version 0.80.7? > Tried ceph.com/docs/firefly [..] -- Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Query about osd pool default flags & hashpspool
Hi I was going through various conf options to customize a ceph cluster and came across `osd pool default flags` in the pool-pg config ref[1]. The value is specified as an integer, but I couldn't find a mention of the possible values it can take in the docs. Looking a bit deeper into the ceph sources [2], a set of flags is defined in osd_types.h:
FLAG_HASHPSPOOL = 1<<0, // hash pg seed and pool together (instead of adding)
FLAG_FULL = 1<<1, // pool is full
FLAG_DEBUG_FAKE_EC_POOL = 1<<2, // require ReplicatedPG to act like an EC pg
FLAG_INCOMPLETE_CLONES = 1<<3, // may have incomplete clones (bc we are/were an overlay)
Are these the configurable options for the osd pool flags? Also, in particular, what does the `hashpspool' option do? [1] http://ceph.com/docs/next/rados/configuration/pool-pg-config-ref/ [2] https://github.com/ceph/ceph/blob/giant/src/osd/osd_types.h#L815-820 -- Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
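For what it's worth, the flags actually set on a pool can be inspected from the CLI, and hashpspool can be toggled on existing pools (the pool name below is a placeholder):

    $ ceph osd dump | grep '^pool'                # shows e.g. "flags hashpspool" per pool
    $ ceph osd pool set <pool> hashpspool true    # sets FLAG_HASHPSPOOL on an existing pool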
Re: [ceph-users] normalizing radosgw
Sage Weil writes: [..] > Thoughts? Suggestions? > [..] Suggestion: radosgw should handle injectargs like other ceph clients do? This is not a major annoyance, but it would be nice to have. -- Abhishek signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
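In the meantime, the admin socket can serve as a workaround, assuming radosgw was started with an admin socket configured (the path below follows the usual default pattern; adjust it to your rgw client name):

    $ ceph daemon /var/run/ceph/ceph-client.rgw.<name>.asok config show | grep rgw
    $ ceph daemon /var/run/ceph/ceph-client.rgw.<name>.asok config set debug_rgw 20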
Re: [ceph-users] Ceph.conf
On Thu, Sep 10, 2015 at 2:51 PM, Shinobu Kinjo wrote: > Thank you for your really really quick reply, Greg. > > > Yes. A bunch shouldn't ever be set by users. > > Anyhow, this is one of my biggest concern right now -; > > rgw_keystone_admin_password = > > > MUST not be there. I know the dangers of this (ie keystone admin password being visible); but isn't this already visible in ceph/radosgw configuration file as well if you configure keystone.[1] [1]: http://ceph.com/docs/master/radosgw/keystone/#integrating-with-openstack-keystone > Shinobu > > - Original Message - > From: "Gregory Farnum" > To: "Shinobu Kinjo" > Cc: "ceph-users" , "ceph-devel" > > Sent: Thursday, September 10, 2015 5:57:52 PM > Subject: Re: [ceph-users] Ceph.conf > > On Thu, Sep 10, 2015 at 9:44 AM, Shinobu Kinjo wrote: >> Hello, >> >> I'm seeing 859 parameters in the output of: >> >> $ ./ceph --show-config | wc -l >> *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH *** >> 859 >> >> In: >> >> $ ./ceph --version >> *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH *** >> ceph version 9.0.2-1454-g050e1c5 >> (050e1c5c7471f8f237d9fa119af98c1efa9a8479) >> >> Since I'm quite new to Ceph, so my question is: >> >> Where can I know what each parameter exactly mean? >> >> I am probably right. Some parameters are just for tes- >> ting purpose. > > Yes. A bunch shouldn't ever be set by users. A lot of the ones that > should be are described as part of various operations in > ceph.com/docs, but I don't know which ones of interest are missing > from there. It's not very discoverable right now, unfortunately. > -Greg > >> >> Thank you for your help in advance. >> >> Shinobu >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] RGW Keystone interaction (was Ceph.conf)
On Thu, Sep 10, 2015 at 3:27 PM, Shinobu Kinjo wrote: > Thank you for letting me know your thought, Abhishek!! > > > > The Ceph Object Gateway will query Keystone periodically > > for a list of revoked tokens. These requests are encoded > > and signed. Also, Keystone may be configured to provide > > self-signed tokens, which are also encoded and signed. > Looked a bit more into this, swift apis seem to support the use of an admin tenant, user & token for validating the bearer token, similar to other openstack service which use a service tenant credentials for authenticating. Though it needs documentation, the configurables `rgw keystone admin tenant`, `rgw keystone admin user` and `rgw keystone admin password` make this possible, so as to avoid configuring the keystone shared admin password compoletely. S3 APIs with keystone seem to be a bit more different, apparently s3tokens interface does seem to allow authenticating without an `X-Auth-Token` in the headers and validates based on the access key, secret key provided to it. So basically not configuring `rgw_keystone_admin_password` seem to work, can you also see if this is the case for you. > This is completely absolutely out of scope of my original > question. > > But I would like to ask you if above implementation that > **periodically** talks to keystone with tokens is really > secure or not. > > I'm just asking you. Because I'm just thinking of keysto- > ne federation. > But you can ignore me anyhow or point out anything to me -; > Shinobu > > - Original Message - > From: "Abhishek L" > To: "Shinobu Kinjo" > Cc: "Gregory Farnum" , "ceph-users" > , "ceph-devel" > Sent: Thursday, September 10, 2015 6:35:31 PM > Subject: Re: [ceph-users] Ceph.conf > > On Thu, Sep 10, 2015 at 2:51 PM, Shinobu Kinjo wrote: >> Thank you for your really really quick reply, Greg. >> >> > Yes. A bunch shouldn't ever be set by users. >> >> Anyhow, this is one of my biggest concern right now -; >> >> rgw_keystone_admin_password = >> >> >> MUST not be there. > > > I know the dangers of this (ie keystone admin password being visible); > but isn't this already visible in ceph/radosgw configuration file as > well if you configure keystone.[1] > > [1]: > http://ceph.com/docs/master/radosgw/keystone/#integrating-with-openstack-keystone > >> Shinobu >> >> - Original Message - >> From: "Gregory Farnum" >> To: "Shinobu Kinjo" >> Cc: "ceph-users" , "ceph-devel" >> >> Sent: Thursday, September 10, 2015 5:57:52 PM >> Subject: Re: [ceph-users] Ceph.conf >> >> On Thu, Sep 10, 2015 at 9:44 AM, Shinobu Kinjo wrote: >>> Hello, >>> >>> I'm seeing 859 parameters in the output of: >>> >>> $ ./ceph --show-config | wc -l >>> *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH *** >>> 859 >>> >>> In: >>> >>> $ ./ceph --version >>> *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH *** >>> ceph version 9.0.2-1454-g050e1c5 >>> (050e1c5c7471f8f237d9fa119af98c1efa9a8479) >>> >>> Since I'm quite new to Ceph, so my question is: >>> >>> Where can I know what each parameter exactly mean? >>> >>> I am probably right. Some parameters are just for tes- >>> ting purpose. >> >> Yes. A bunch shouldn't ever be set by users. A lot of the ones that >> should be are described as part of various operations in >> ceph.com/docs, but I don't know which ones of interest are missing >> from there. It's not very discoverable right now, unfortunately. >> -Greg >> >>> >>> Thank you for your help in advance. 
>>> >>> Shinobu >>> ___ >>> ceph-users mailing list >>> ceph-users@lists.ceph.com >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
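Coming back to the `rgw keystone admin *` options mentioned above, a rough example of how the gateway section could look (the section name and values are placeholders):

    [client.radosgw.gateway]
    rgw keystone url = http://keystone-host:35357
    rgw keystone admin user = rgw
    rgw keystone admin password = <service-user-password>
    rgw keystone admin tenant = service
    rgw keystone accepted roles = Member, admin
    # with these set, the shared keystone admin token should not need to be configured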
Re: [ceph-users] radosgw and keystone version 3 domains
On Fri, Sep 18, 2015 at 4:38 AM, Robert Duncan wrote: > > Hi > > > > It seems that radosgw cannot find users in Keystone V3 domains, that is, > > When keystone is configured for domain specific drivers radossgw cannot find > the users in the keystone users table (as they are not there) > > I have a deployment in which ceph providers object block ephemeral and user > storage, however any user outside of the ‘default’ sql backed domain cannot > be found by radosgw. > > Has anyone seen this issue before when using ceph in openstack? Is it > possible to configure radosgw to use a keystone v3 url? I'm not sure whether keystone v3 support for radosgw is there yet, particularly for the swift api. Currently keystone v2 api is supported, and due to the change in format between v2 and v3 tokens, I'm not sure whether swift apis will work with v3 yet, though keystone v3 *might* just work on the s3 interface due to the different format used. > > > Thanks, > > Rob. > > > > The information contained and transmitted in this e-mail is confidential > information, and is intended only for the named recipient to which it is > addressed. The content of this e-mail may not have been sent with the > authority of National College of Ireland. Any views or opinions presented are > solely those of the author and do not necessarily represent those of National > College of Ireland. If the reader of this message is not the named recipient > or a person responsible for delivering it to the named recipient, you are > notified that the review, dissemination, distribution, transmission, printing > or copying, forwarding, or any other use of this message or any part of it, > including any attachments, is strictly prohibited. If you have received this > communication in error, please delete the e-mail and destroy all record of > this communication. Thank you for your assistance. > > > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] v12.1.0 Luminous RC released
This is the first release candidate for Luminous, the next long term stable release. Ceph Luminous will be the foundation for the next long-term stable release series. There have been major changes since Kraken (v11.2.z) and Jewel (v10.2.z). Major Changes from Kraken - - *General*: * Ceph now has a simple, built-in web-based dashboard for monitoring cluster status. - *RADOS*: * *BlueStore*: - The new *BlueStore* backend for *ceph-osd* is now stable and the new default for newly created OSDs. BlueStore manages data stored by each OSD by directly managing the physical HDDs or SSDs without the use of an intervening file system like XFS. This provides greater performance and features. - BlueStore supports *full data and metadata checksums* of all data stored by Ceph. - BlueStore supports inline compression using zlib, snappy, or LZ4. (Ceph also supports zstd for RGW compression but zstd is not recommended for BlueStore for performance reasons.) * *Erasure coded* pools now have full support for *overwrites*, allowing them to be used with RBD and CephFS. * *ceph-mgr*: - There is a new daemon, *ceph-mgr*, which is a required part of any Ceph deployment. Although IO can continue when *ceph-mgr* is down, metrics will not refresh and some metrics-related calls (e.g., ``ceph df``) may block. We recommend deploying several instances of *ceph-mgr* for reliability. See the notes on `Upgrading`_ below. - The *ceph-mgr* daemon includes a REST-based management API. The API is still experimental and somewhat limited but will form the basis for API-based management of Ceph going forward. * The overall *scalability* of the cluster has improved. We have successfully tested clusters with up to 10,000 OSDs. * Each OSD can now have a *device class* associated with it (e.g., `hdd` or `ssd`), allowing CRUSH rules to trivially map data to a subset of devices in the system. Manually writing CRUSH rules or manual editing of the CRUSH is normally not required. * You can now *optimize CRUSH weights* can now be optimized to maintain a *near-perfect distribution of data* across OSDs. * There is also a new `upmap` exception mechanism that allows individual PGs to be moved around to achieve a *perfect distribution* (this requires luminous clients). * Each OSD now adjusts its default configuration based on whether the backing device is an HDD or SSD. Manual tuning generally not required. * The prototype *mclock QoS queueing algorithm* is now available. * There is now a *backoff* mechanism that prevents OSDs from being overloaded by requests to objects or PGs that are not currently able to process IO. * There is a *simplified OSD replacement process* that is more robust. * You can query the supported features and (apparent) releases of all connected daemons and clients with ``ceph features``. * You can configure the oldest Ceph client version you wish to allow to connect to the cluster via ``ceph osd set-require-min-compat-client`` and Ceph will prevent you from enabling features that will break compatibility with those clients. * Several `sleep` settings, include ``osd_recovery_sleep``, ``osd_snap_trim_sleep``, and ``osd_scrub_sleep`` have been reimplemented to work efficiently. (These are used in some cases to work around issues throttling background work.) - *RGW*: * RGW *metadata search* backed by ElasticSearch now supports end user requests service via RGW itself, and also supports custom metadata fields. A query language a set of RESTful APIs were created for users to be able to search objects by their metadata. 
New APIs that allow control of custom metadata fields were also added. * RGW now supports *dynamic bucket index sharding*. As the number of objects in a bucket grows, RGW will automatically reshard the bucket index in response. No user intervention or bucket size capacity planning is required. * RGW introduces *server side encryption* of uploaded objects with three options for the management of encryption keys: automatic encryption (only recommended for test setups), customer provided keys similar to Amazon SSE-C specification, and through the use of an external key management service (Openstack Barbician) similar to Amazon SSE-KMS specification. * RGW now has preliminary AWS-like bucket policy API support. For now, policy is a means to express a range of new authorization concepts. In the future it will be the founation for additional auth capabilities such as STS and group policy. * RGW has consolidated the several metadata index pools via the use of rados namespaces. - *RBD*: * RBD now has full, stable support for *erasure coded pools* via the new ``--data-pool`` option to ``rbd create``.
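A small sketch of creating such an image in practice (pool names are examples; EC overwrites have to be enabled explicitly and require BlueStore OSDs):

    $ ceph osd pool create rbd_ec_data 64 64 erasure
    $ ceph osd pool set rbd_ec_data allow_ec_overwrites true
    $ rbd create --size 10G --data-pool rbd_ec_data rbd/myimage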
Re: [ceph-users] Stealth Jewel release?
On Wed, Jul 12, 2017 at 9:13 PM, Xiaoxi Chen wrote: > +However, it also introduced a regression that could cause MDS damage. > +Therefore, we do *not* recommend that Jewel users upgrade to this version - > +instead, we recommend upgrading directly to v10.2.9 in which the regression > is > +fixed. > > It looks like this version is NOT production ready. Curious why we > want a not-recwaended version to be released? We found a regression in MDS right after packages were built, and the release was about to be announced. This is why we didn't announce the release. We're currently running tests after the fix for MDS was merged. So when we do announce the release we'll announce 10.2.9 so that users can upgrade from 10.2.7->10.2.9 Best, Abhishek > 2017-07-12 22:44 GMT+08:00 David Turner : >> The lack of communication on this makes me tentative to upgrade to it. Are >> the packages available to Ubuntu/Debian systems production ready and >> intended for upgrades? >> >> On Tue, Jul 11, 2017 at 8:33 PM Brad Hubbard wrote: >>> >>> On Wed, Jul 12, 2017 at 12:58 AM, David Turner >>> wrote: >>> > I haven't seen any release notes for 10.2.8 yet. Is there a document >>> > somewhere stating what's in the release? >>> >>> https://github.com/ceph/ceph/pull/16274 for now although it should >>> make it into the master doc tree soon. >>> >>> > >>> > On Mon, Jul 10, 2017 at 1:41 AM Henrik Korkuc wrote: >>> >> >>> >> On 17-07-10 08:29, Christian Balzer wrote: >>> >> > Hello, >>> >> > >>> >> > so this morning I was greeted with the availability of 10.2.8 for >>> >> > both >>> >> > Jessie and Stretch (much appreciated), but w/o any announcement here >>> >> > or >>> >> > updated release notes on the website, etc. >>> >> > >>> >> > Any reason other "Friday" (US time) for this? >>> >> > >>> >> > Christian >>> >> >>> >> My guess is that they didn't have time to announce it yet. Maybe pkgs >>> >> were not ready yet on friday? >>> >> >>> >> ___ >>> >> ceph-users mailing list >>> >> ceph-users@lists.ceph.com >>> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>> > >>> > >>> > ___ >>> > ceph-users mailing list >>> > ceph-users@lists.ceph.com >>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>> > >>> >>> >>> >>> -- >>> Cheers, >>> Brad >> >> >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] v11.0.2 released
This development checkpoint release includes a lot of changes and improvements on the way to Kraken. It is the first release introducing ceph-mgr, a new daemon which provides additional monitoring & interfaces to external monitoring/management systems. There are also many improvements to BlueStore, and RGW introduces sync modules, copy-part for multipart uploads, and metadata search via Elasticsearch as a tech preview. We've had to skip releasing 11.0.1 due to an issue with git tags and package versions as we were transitioning from autotools to cmake as the new build system. Due to the really long changelog in this release, please read the list of changes here: http://ceph.com/releases/kraken-11-0-2-released/ The debian and rpm packages are available at the usual locations at http://download.ceph.com/debian-kraken/ and http://download.ceph.com/rpm-kraken respectively. For more details refer below. Getting Ceph * Git at git://github.com/ceph/ceph.git * Tarball at http://download.ceph.com/tarballs/ceph-11.0.2.tar.gz * For packages, see http://ceph.com/docs/master/install/get-packages * For ceph-deploy, see http://ceph.com/docs/master/install/install-ceph-deploy -- Abhishek Lekshmanan SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) signature.asc Description: PGP signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] How to create two isolated rgw services in one ceph cluster?
piglei writes: > Hi, I am a ceph newbie. I want to create two isolated rgw services in a > single ceph cluster, the requirements: > > * Two radosgw will have different hosts, such as radosgw-x.site.com and > radosgw-y.site.com. File uploaded to rgw-xcannot be accessed via rgw-y. > * Isolated bucket and user namespaces is not necessary, because I could > prepend term to bucket name and user name, like "x-bucket" or "y-bucket" > > At first I thought region and zone may be the solution, but after a little > more researchs, I found that region and zone are for different geo locations, > they share the same metadata (buckets and users) and objects instead of > isolated copies. > > After that I noticed ceph's multi-tenancy feature since jewel release, which > is probably what I'm looking for, here is my solution using multi-tenancy: > > * using two tenant called x and y, each rgw service matches one tenant. > * Limit incoming requests to rgw in it's own tenant, which means you can only > retrieve resources belongs to buckets "x:bucket" when > callingradosgw-x.site.com. This can be archived by some custom nginx rules. > > Is this the right approach or Should I just use two different clusters > instead? Looking forward to your awesome advises. > Since jewel, you can also consider looking into realms which sort of provide for isolated namespaces within a zone or zonegroup. -- Abhishek ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
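As a rough sketch (realm names are placeholders; each gateway would then be pointed at its own realm/zonegroup/zone through its own config section):

    $ radosgw-admin realm create --rgw-realm=tenant-x --default
    $ radosgw-admin realm create --rgw-realm=tenant-y
    $ radosgw-admin realm list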
[ceph-users] 10.2.4 Jewel released
This point release fixes several important bugs in RBD mirroring, RGW multi-site, CephFS, and RADOS. We recommend that all v10.2.x users upgrade. Also note the following when upgrading from hammer Upgrading from hammer - When the last hammer OSD in a cluster containing jewel MONs is upgraded to jewel, as of 10.2.4 the jewel MONs will issue this warning: "all OSDs are running jewel or later but the 'require_jewel_osds' osdmap flag is not set" and change the cluster health status to HEALTH_WARN. This is a signal for the admin to do "ceph osd set require_jewel_osds" - by doing this, the upgrade path is complete and no more pre-Jewel OSDs may be added to the cluster. Notable Changes --- * build/ops: aarch64: Compiler-based detection of crc32 extended CPU type is broken (issue#17516 , pr#11492 , Alexander Graf) * build/ops: allow building RGW with LDAP disabled (issue#17312 , pr#11478 , Daniel Gryniewicz) * build/ops: backport 'logrotate: Run as root/ceph' (issue#17381 , pr#11201 , Boris Ranto) * build/ops: ceph installs stuff in %_udevrulesdir but does not own that directory (issue#16949 , pr#10862 , Nathan Cutler) * build/ops: ceph-osd-prestart.sh fails confusingly when data directory does not exist (issue#17091 , pr#10812 , Nathan Cutler) * build/ops: disable LTTng-UST in openSUSE builds (issue#16937 , pr#10794 , Michel Normand) * build/ops: i386 tarball gitbuilder failure on master (issue#16398 , pr#10855 , Vikhyat Umrao, Kefu Chai) * build/ops: include more files in "make dist" tarball (issue#17560 , pr#11431 , Ken Dreyer) * build/ops: incorrect value of CINIT_FLAG_DEFER_DROP_PRIVILEGES (issue#16663 , pr#10278 , Casey Bodley) * build/ops: remove SYSTEMD_RUN from initscript (issue#7627 , issue#16441 , issue#16440 , pr#9872 , Vladislav Odintsov) * build/ops: systemd: add install section to rbdmap.service file (issue#17541 , pr#11158 , Jelle vd Kooij) * common: Enable/Disable of features is allowed even the features are already enabled/disabled (issue#16079 , pr#11460 , Lu Shi) * common: Log.cc: Assign LOG_INFO priority to syslog calls (issue#15808 , pr#11231 , Brad Hubbard) * common: Proxied operations shouldn't result in error messages if replayed (issue#16130 , pr#11461 , Vikhyat Umrao) * common: Request exclusive lock if owner sends -ENOTSUPP for proxied maintenance op (issue#16171 , pr#10784 , Jason Dillaman) * common: msgr/async: Messenger thread long time lock hold risk (issue#15758 , pr#10761 , Wei Jin) * doc: fix description for rsize and rasize (issue#17357 , pr#11171 , Andreas Gerstmayr) * filestore: can get stuck in an unbounded loop during scrub (issue#17859 , pr#12001 , Sage Weil) * fs: Failure in snaptest-git-ceph.sh (issue#17172 , pr#11419 , Yan, Zheng) * fs: Log path as well as ino when detecting metadata damage (issue#16973 , pr#11418 , John Spray) * fs: client: FAILED assert(root_ancestor->qtree == __null) (issue#16066 , issue#16067 , pr#10107 , Yan, Zheng) * fs: client: add missing client_lock for get_root (issue#17197 , pr#10921 , Patrick Donnelly) * fs: client: fix shutdown with open inodes (issue#16764 , pr#10958 , John Spray) * fs: client: nlink count is not maintained correctly (issue#16668 , pr#10877 , Jeff Layton) * fs: multimds: allow_multimds not required when max_mds is set in ceph.conf at startup (issue#17105 , pr#10997 , Patrick Donnelly) * librados: memory leaks from ceph::crypto (WITH_NSS) (issue#17205 , pr#11409 , Casey Bodley) * librados: modify Pipe::connect() to return the error code (issue#15308 , pr#11193 , Vikhyat Umrao) * librados: remove new setxattr 
overload to avoid breaking the C++ ABI (issue#18058 , pr#12207 , Josh Durgin) * librbd: cannot disable journaling or remove non-mirrored, non-primary image (issue#16740 , pr#11337 , Jason Dillaman) * librbd: discard after write can result in assertion failure (issue#17695 , pr#11644 , Jason Dillaman) * librbd::Operations: update notification failed: (2) No such file or directory (issue#17549 , pr#11420 , Jason Dillaman) * mds: Crash in Client::_invalidate_kernel_dcache when reconnecting during unmount (issue#17253 , pr#11414 , Yan, Zheng) * mds: Duplicate damage table entries (issue#17173 , pr#11412 , John Spray) * mds: Failure in dirfrag.sh (issue#17286 , pr#11416 , Yan, Zheng) * mds: Failure in snaptest-git-ceph.sh (issue#17271 , pr#11415 , Yan, Zheng) * mon: Ceph Status - Segmentation Fault (issue#16266 , pr#11408 , Brad Hubbard) * mon: Display full flag in ceph status if full flag is set (issue#15809 , pr#9388 , Vikhyat Umrao) * mon: Error EINVAL: removing mon.a at 172.21.15.16:6789/0, there will be 1 monitors (issue#17725 , pr#12267 , Joao Eduardo Luis) * mon: OSDMonitor: only reject MOSDBoot based on up_from if inst matches (issue#17899 , pr#12067 , Samuel Just) * mon: OSDMonitor: Missing nearfull flag set (issue#17390 , pr#11272 , Igor Podoski) * mon: Upgrading 0.94.6 -> 0.94.9 saturating mon node networking (issue#17365 , issue#17386 , pr#
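To make the hammer-to-jewel finalization step described at the top of these notes concrete, here is a minimal sketch; it assumes an admin node with a working client.admin keyring, and the version check via "ceph tell" is just one convenient way to confirm every OSD is already running jewel:

   # confirm all OSDs report a 10.2.x (jewel) version
   $ ceph tell osd.* version
   # acknowledge the HEALTH_WARN and finalize the upgrade; pre-jewel OSDs can no longer be added afterwards
   $ ceph osd set require_jewel_osds
   # health should drop back to HEALTH_OK (barring unrelated issues)
   $ ceph health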
[ceph-users] v11.1.0 kraken candidate released
Hi everyone, This is the first release candidate for Kraken, the next stable release series. There have been major changes from jewel with many features being added. Please note the upgrade process from jewel before upgrading. Major Changes from Jewel - *RADOS*: * The new *BlueStore* backend now has a stable disk format and is passing our failure and stress testing. Although the backend is still flagged as experimental, we encourage users to try it out for non-production clusters and non-critical data sets. * RADOS now has experimental support for *overwrites on erasure-coded* pools. Because the disk format and implementation are not yet finalized, there is a special pool option that must be enabled to test the new feature. Enabling this option on a cluster will permanently bar that cluster from being upgraded to future versions. * We now default to the AsyncMessenger (``ms type = async``) instead of the legacy SimpleMessenger. The most noticeable difference is that we now use a fixed sized thread pool for network connections (instead of two threads per socket with SimpleMessenger). * Some OSD failures are now detected almost immediately, whereas previously the heartbeat timeout (which defaults to 20 seconds) had to expire. This prevents IO from blocking for an extended period for failures where the host remains up but the ceph-osd process is no longer running. * There is a new ``ceph-mgr`` daemon. It is currently collocated with the monitors by default, and is not yet used for much, but the basic infrastructure is now in place. * The size of encoded OSDMaps has been reduced. * The OSDs now quiesce scrubbing when recovery or rebalancing is in progress. - *RGW*: * RGW now supports a new zone type that can be used for metadata indexing via Elasticsearch. * RGW now supports the S3 multipart object copy-part API. * It is possible now to reshard an existing bucket. Note that bucket resharding currently requires that all IO (especially writes) to the specific bucket is quiesced. * RGW now supports data compression for objects. * Civetweb version has been upgraded to 1.8 * The Swift static website API is now supported (S3 support has been added previously). * S3 bucket lifecycle API has been added. Note that currently it only supports object expiration. * Support for custom search filters has been added to the LDAP auth implementation. * Support for NFS version 3 has been added to the RGW NFS gateway. * A Python binding has been created for librgw. - *RBD*: * RBD now supports images stored in an *erasure-coded* RADOS pool using the new (experimental) overwrite support. Images must be created using the new rbd CLI "--data-pool " option to specify the EC pool where the backing data objects are stored. Attempting to create an image directly on an EC pool will not be successful since the image's backing metadata is only supported on a replicated pool. * The rbd-mirror daemon now supports replicating dynamic image feature updates and image metadata key/value pairs from the primary image to the non-primary image. * The number of image snapshots can be optionally restricted to a configurable maximum. * The rbd Python API now supports asynchronous IO operations. - *CephFS*: * libcephfs function definitions have been changed to enable proper uid/gid control. The library version has been increased to reflect the interface change. * Standby replay MDS daemons now consume less memory on workloads doing deletions. * Scrub now repairs backtrace, and populates `damage ls` with discovered errors.
* A new `pg_files` subcommand to `cephfs-data-scan` can identify files affected by a damaged or lost RADOS PG. * The false-positive "failing to respond to cache pressure" warnings have been fixed. Upgrading from Jewel * All clusters must first be upgraded to Jewel 10.2.z before upgrading to Kraken 11.2.z (or, eventually, Luminous 12.2.z). * The ``sortbitwise`` flag must be set on the Jewel cluster before upgrading to Kraken. The latest Jewel (10.2.4+) releases issue a health warning if the flag is not set, so this is probably already set. If it is not, Kraken OSDs will refuse to start and will print an error message in their log. Upgrading - * The list of monitor hosts/addresses for building the monmap can now be obtained from DNS SRV records. The service name used when querying DNS is defined in the "mon_dns_srv_name" config option, which defaults to "ceph-mon". * The 'osd class load list' config option is a list of object class names that the OSD is permitted to load (or '*' for all classes). By default it contains all existing in-tree classes for backwards compatibility. * The 'osd class default list' config option is a list of
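As a rough illustration of the monitor DNS SRV discovery and the object class whitelist options mentioned just above, a sketch follows; the example.com domain, host names, and TTLs are made up, and the SRV layout (priority, weight, port, target) is the standard convention, so adjust for your own DNS setup:

   # ceph.conf (illustrative): discover monitors via DNS SRV records
   # mon_dns_srv_name already defaults to "ceph-mon"; it is shown here only for clarity
   [global]
   mon_dns_srv_name = ceph-mon
   # allow the OSD to load any object class ('*'), or list specific class names instead
   osd class load list = *

   ; hypothetical matching zone entries, one per monitor (6789 is the default mon port)
   _ceph-mon._tcp.example.com. 3600 IN SRV 10 60 6789 mon1.example.com.
   _ceph-mon._tcp.example.com. 3600 IN SRV 10 60 6789 mon2.example.com.
   _ceph-mon._tcp.example.com. 3600 IN SRV 10 60 6789 mon3.example.com.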
[ceph-users] v11.1.1 Kraken rc released
Hi everyone, This is the second release candidate for kraken, the next stable release series. Major Changes from Jewel - *RADOS*: * The new *BlueStore* backend now has a change in the on-disk format from the previous release candidate 11.1.0, and there may be a further change before the final release is cut. (A rough sketch of rebuilding a test OSD on the new format follows after the change list below.) Notable Changes: - * bluestore: ceph-disk: adjust bluestore default device sizes (pr#12530, Sage Weil) * bluestore: os/bluestore: avoid resharding if the last shard size fall below shar… (pr#12447, Igor Fedotov) * bluestore: os/bluestore: clear omap flag if parent has none (pr#12351, xie xingguo) * bluestore: os/bluestore: don’t implicitly create the source object for clone (pr#12353, xie xingguo) * bluestore: os/bluestore: fix ondisk encoding for blobs (pr#12488, Varada Kari, Sage Weil) * bluestore: os/bluestore: fix potential csum_order overflow (pr#12333, xie xingguo) * bluestore: os/bluestore: fix target_buffer value overflow in Cache::trim() (pr#12507, Igor Fedotov) * bluestore: os/bluestore: include modified objects in flush list even if onode unchanged (pr#12541, Sage Weil) * bluestore: os/bluestore: preserve source collection cache during split (pr#12574, Sage Weil) * bluestore: os/bluestore: remove ‘extents’ from shard_info (pr#12629, Sage Weil) * bluestore: os/bluestore: simplified allocator interfaces to single apis (pr#12355, Ramesh Chander) * bluestore: os/bluestore: simplify allocator release flow (pr#12343, Sage Weil) * build/ops,common: common/str_list.h: fix clang warning about std::move (pr#12570, Willem Jan Withagen) * build/ops: CMakeLists: add vstart-base target (pr#12476, Sage Weil) * build/ops: systemd: Fix startup of ceph-mgr on Debian 8 (pr#12555, Mark Korenberg) * build/ops: upstart: fix ceph-crush-location default (issue#6698, pr#803, Jason Dillaman) * cephfs,cleanup: ceph-fuse: start up log on parent process before shutdown (issue#18157, pr#12358, Kefu Chai) * cephfs: client/mds: Clear setuid bits when writing or truncating (issue#18131, pr#12412, Jeff Layton) * cephfs: client: fix mutex name typos (pr#12401, Yunchuan Wen) * cephfs: client: set metadata[“root”] from mount method when it’s called with … (pr#12505, Jeff Layton) * cephfs: get new fsmap after marking clusters down (issue#7271, issue#17894, pr#1262, Patrick Donnelly) * cephfs: libcephfs: add readlink function in cephfs.pyx (pr#12384, huanwen ren) * cephfs: mon/MDSMonitor: fix iterating over mutated map (issue#18166, pr#12395, John Spray) * cephfs: systemd: add ceph-fuse service file (pr#11542, Patrick Donnelly) * cephfs: test fragment size limit (issue#16164, pr#1069, Patrick Donnelly) * cephfs: test readahead is working (issue#16024, pr#1046, Patrick Donnelly) * cephfs: update tests to enable multimds when needed (pr#933, Greg Farnum) * cephfs: Port/bootstrap (pr#827, Yan, Zheng) * cleanup,common: common/blkdev: use realpath instead of readlink to resolve the recurs… (pr#12462, Xinze Chi) * cleanup,rbd: journal: avoid logging an error when a watch is blacklisted (issue#18243, pr#12473, Jason Dillaman) * cleanup,rbd: journal: prevent repetitive error messages after being blacklisted (issue#18243, pr#12497, Jason Dillaman) * cleanup: Wip ctypos (pr#12495, xianxiaxiao) * cleanup: fix typos (pr#12502, xianxiaxiao) * cleanup: remove unused declaration (pr#12466, Li Wang, Yunchuan Wen) * common: fix clang compilation error (pr#12565, Mykola Golub) * common: osd/osdmap: fix divide by zero error (pr#12521, Yunchuan Wen) *
common: client/Client.cc: fix/silence “logically dead code” CID-Error (pr#291, Yehuda Sadeh) * core,cleanup: ceph-disk: do not create bluestore wal/db partitions by default (issue#18291, pr#12531, Loic Dachary) * core,cleanup: osd/ReplicatedPG: remove redundant check for balance/localize read (pr#10209, runsisi) * core,cleanup: src: rename ReplicatedPG to PrimaryLogPG (pr#12487, Samuel Just) * core,performance: osd/PrimaryLogPG: don’t truncate if we don’t have to for WRITEFULL (pr#12534, Samuel Just) * core,tests: test/rados/list.cc: Memory leak in ceph_test_rados_api_list (issue#18250, pr#12479, Brad Hubbard) * core: mon: make it more clearly to debug for paxos state (pr#12438, song baisen) * core: FreeBSD/OSD.cc: add client_messenger to the avoid_ports set. (pr#12463, Willem Jan Withagen) * core: ceph.in: allow ‘flags’ to not be present in cmddescs (issue#18297, pr#12540, Dan Mick) * core: erasure-code: synchronize with upstream gf-complete (issue#18092, pr#12382, Loic Dachary) * core: osd/PG: add “down” pg state (distinct from down+peering) (pr#12289, Sage Weil) * core: osd/ReplicatedPG::record_write_error: don’t leak orig_reply on cancel (issue#18180, pr#12450, Samuel Just) * core: remove spurious executable permissions on source code files (pr#1061, Samuel Just) * doc: doc/dev/osd_internals: add pgpool.rst (pr#12500, Brad Hubbard) * doc: document osd tell ben
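Because the on-disk format changed after 11.1.0, any test BlueStore OSDs built on the earlier candidate have to be rebuilt rather than carried forward (the 11.2.0 notes further down in this digest say as much). A rough sketch of tearing down and re-preparing a single test OSD, assuming osd.5 on /dev/sdb and a ceph-disk based deployment; the exact flags are an assumption, so check ceph-disk --help on your build, and remember that pre-Luminous BlueStore still needs the experimental-features switch in ceph.conf:

   # classic removal sequence for the old RC OSD
   $ ceph osd out 5
   $ systemctl stop ceph-osd@5
   $ ceph osd crush remove osd.5
   $ ceph auth del osd.5
   $ ceph osd rm 5
   # ceph.conf must already contain something like:
   #   enable experimental unrecoverable data corrupting features = bluestore
   # then wipe and re-prepare the device on the new format
   $ ceph-disk zap /dev/sdb
   $ ceph-disk prepare --bluestore /dev/sdb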
[ceph-users] v11.2.0 kraken released
This is the first release of the Kraken series. It is suitable for use in production deployments and will be maintained until the next stable release, Luminous, is completed in the Spring of 2017. Major Changes from Jewel - *RADOS*: * The new *BlueStore* backend now has a stable disk format and is passing our failure and stress testing. Although the backend is still flagged as experimental, we encourage users to try it out for non-production clusters and non-critical data sets. * RADOS now has experimental support for *overwrites on erasure-coded* pools. Because the disk format and implementation are not yet finalized, there is a special pool option that must be enabled to test the new feature. Enabling this option on a cluster will permanently bar that cluster from being upgraded to future versions. * We now default to the AsyncMessenger (``ms type = async``) instead of the legacy SimpleMessenger. The most noticeable difference is that we now use a fixed sized thread pool for network connections (instead of two threads per socket with SimpleMessenger). * Some OSD failures are now detected almost immediately, whereas previously the heartbeat timeout (which defaults to 20 seconds) had to expire. This prevents IO from blocking for an extended period for failures where the host remains up but the ceph-osd process is no longer running. * There is a new ``ceph-mgr`` daemon. It is currently collocated with the monitors by default, and is not yet used for much, but the basic infrastructure is now in place. * The size of encoded OSDMaps has been reduced. * The OSDs now quiesce scrubbing when recovery or rebalancing is in progress. - *RGW*: * RGW now supports a new zone type that can be used for metadata indexing via ElasticSearch. * RGW now supports the S3 multipart object copy-part API. * It is possible now to reshard an existing bucket. Note that bucket resharding currently requires that all IO (especially writes) to the specific bucket is quiesced. * RGW now supports data compression for objects. * Civetweb version has been upgraded to 1.8 * The Swift static website API is now supported (S3 support has been added previously). * S3 bucket lifecycle API has been added. Note that currently it only supports object expiration. * Support for custom search filters has been added to the LDAP auth implementation. * Support for NFS version 3 has been added to the RGW NFS gateway. * A Python binding has been created for librgw. - *RBD*: * RBD now supports images stored in an *erasure-coded* RADOS pool using the new (experimental) overwrite support. Images must be created using the new rbd CLI "--data-pool " option to specify the EC pool where the backing data objects are stored. Attempting to create an image directly on an EC pool will not be successful since the image's backing metadata is only supported on a replicated pool. * The rbd-mirror daemon now supports replicating dynamic image feature updates and image metadata key/value pairs from the primary image to the non-primary image. * The number of image snapshots can be optionally restricted to a configurable maximum. * The rbd Python API now supports asynchronous IO operations. - *CephFS*: * libcephfs function definitions have been changed to enable proper uid/gid control. The library version has been increased to reflect the interface change. * Standby replay MDS daemons now consume less memory on workloads doing deletions. * Scrub now repairs backtrace, and populates `damage ls` with discovered errors. 
* A new `pg_files` subcommand to `cephfs-data-scan` can identify files affected by a damaged or lost RADOS PG. * The false-positive "failing to respond to cache pressure" warnings have been fixed. Upgrading from Kraken release candidate 11.1.0 -- * The new *BlueStore* backend had an on-disk format change after 11.1.0. Any BlueStore OSDs created with 11.1.0 will need to be destroyed and recreated. Upgrading from Jewel * All clusters must first be upgraded to Jewel 10.2.z before upgrading to Kraken 11.2.z (or, eventually, Luminous 12.2.z). * The ``sortbitwise`` flag must be set on the Jewel cluster before upgrading to Kraken. The latest Jewel (10.2.4+) releases issue a health warning if the flag is not set, so this is probably already set. If it is not, Kraken OSDs will refuse to start and will print an error message in their log. * You may upgrade OSDs, Monitors, and MDSs in any order. RGW daemons should be upgraded last. * When upgrading, new ceph-mgr daemon instances will be created automatically alongside any monitors. This will be true for Jewel to Kraken and Jewel to Luminous upgrades, but likely not be true for future upgrades bey
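To make the erasure-coded data-pool support described in the RBD section above concrete, a minimal sketch follows; the pool names, PG counts, and image size are made up, and the special pool option needed to allow EC overwrites is deliberately omitted because these notes only refer to it without naming it:

   # replicated pool for image metadata, EC pool for the data objects
   $ ceph osd pool create rbd-meta 64 64 replicated
   $ ceph osd pool create rbd-ecdata 64 64 erasure
   # ... enable the experimental EC overwrite option on rbd-ecdata per the documentation ...
   # create the image so its data objects land in the EC pool while metadata stays replicated
   $ rbd create --size 10G --data-pool rbd-ecdata rbd-meta/myimage
   $ rbd info rbd-meta/myimage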
[ceph-users] v12.0.0 Luminous (dev) released
This is the first development checkpoint release of the Luminous series, the next long term stable release. We're off to a good start to release Luminous in the spring of '17. Major changes from Kraken - * When a network is assigned to the public network and not to the cluster network, the public network specification will be used for the cluster network as well. In older versions this would lead to cluster services being bound to 0.0.0.0:, thus making the cluster service even more publicly available than the public services. When only specifying a cluster network it will still result in the public services binding to 0.0.0.0. (A minimal ceph.conf sketch illustrating this appears after the change list below.) * Some variants of the omap_get_keys and omap_get_vals librados functions have been deprecated in favor of omap_get_vals2 and omap_get_keys2. The new methods include an output argument indicating whether there are additional keys left to fetch. Previously this had to be inferred from the requested key count vs the number of keys returned, but this breaks with new OSD-side limits on the number of keys or bytes that can be returned by a single omap request. These limits were introduced by kraken but are effectively disabled by default (by setting a very large limit of 1 GB) because users of the newly deprecated interface cannot tell whether they should fetch more keys or not. In the case of the standalone calls in the C++ interface (IoCtx::get_omap_{keys,vals}), librados has been updated to loop on the client side to provide a correct result via multiple calls to the OSD. In the case of the methods used for building multi-operation transactions, however, client-side looping is not practical, and the methods have been deprecated. Note that use of either the IoCtx methods on older librados versions or the deprecated methods on any version of librados will lead to incomplete results if/when the new OSD limits are enabled. * In previous versions, if a client sent an op to the wrong OSD, the OSD would reply with ENXIO. The rationale here is that the client or OSD is clearly buggy and we want to surface the error as clearly as possible. We now only send the ENXIO reply if the osd_enxio_on_misdirected_op option is enabled (it's off by default). This means that a VM using librbd that previously would have gotten an EIO and gone read-only will now see a blocked/hung IO instead. * When configuring ceph-fuse mounts in /etc/fstab, a new syntax is available that uses "ceph.=" in the options column, instead of putting configuration in the device column. The old style syntax still works. See the documentation page "Mount CephFS in your file systems table" for details.
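As an illustration of the new fstab syntax, a hedged example follows; the mount point, client id, and conf path are made up, and the ceph.* option names shown are the common ones from the documentation, so double-check them against the page referenced above:

   # /etc/fstab (illustrative): options column carries the ceph.* settings, device column stays "none"
   none  /mnt/cephfs  fuse.ceph  ceph.id=admin,ceph.conf=/etc/ceph/ceph.conf,_netdev,defaults  0 0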
More detailed information regarding the release is available on the official Ceph blog at http://ceph.com/releases/v12-0-0-luminous-dev-released/ Notable Changes --- * bluestore: avoid unnecessary copy with coll_t (pr#12576, Yunchuan Wen) * bluestore: fixed compilation error when enable spdk (pr#12672, Pan Liu) * bluestore: os/bluestore: add a debug option to bypass block device writes for bl… (pr#12464, Igor Fedotov) * bluestore: os/bluestore: Add bluestore pextent vector to mempool (pr#12946, Igor Fedotvo, Igor Fedotov) * bluestore: os/bluestore: add perf variable for throttle info in bluestore (pr#12583, Pan Liu) * bluestore: os/bluestore: allow multiple SPDK BlueStore OSD instances (issue#16966, pr#12604, Orlando Moreno) * bluestore: os/bluestore/BitmapFreelistManager: readability improvements (pr#12719, xie xingguo) * bluestore: os/bluestore/BlueFS: fix reclaim_blocks (issue#18368, pr#12725, Sage Weil) * bluestore: os/bluestore: conditionally load crr option (pr#12877, xie xingguo) * bluestore: os/bluestore: fix Allocator::allocate() int truncation (issue#18595, pr#13010, Sage Weil) * bluestore: os/bluestore: fix min_alloc_size at mkfs time (pr#13192, Sage Weil) * bluestore: os/bluestore: fix NVMEDevice::open failure if serial number ends with a … (pr#12956, Hongtong Liu) * bluestore: os/bluestore: fix OnodeSizeTracking testing (pr#12684, xie xingguo) * bluestore: os/bluestore: fix potential assert in cache _trim method. (pr#13234, Igor Fedotov) * bluestore: os/bluestore: fix reclaim_blocks and clean up Allocator interface (issue#18573, pr#12963, Sage Weil) * bluestore: os/bluestore: include logical object offset in crc error (pr#13074, Sage Weil) * bluestore: os/bluestore/KernelDevice: fix debug message (pr#13135, Sage Weil) * bluestore: os/bluestore/KernelDevice: kill zeros (pr#12856, xie xingguo) * bluestore: os/bluestore: kill BufferSpace.empty() (pr#12871, xie xingguo) * bluestore: os/bluestore: kill orphan declaration of do_write_check_depth() (pr#12853, xie xingguo) * bluestore: os/bluestore: miscellaneous fixes to BitAllocator (pr#12696, xie xingguo) * bluestore: os/bluestore: nullptr in OmapIteratorImpl::valid (pr#12900, Xinze Chi) *
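Relating to the public/cluster network behavior described at the top of this announcement, a minimal ceph.conf sketch (the subnets are invented):

   # ceph.conf (illustrative)
   [global]
   public network = 192.168.100.0/24
   # with no "cluster network" line, replication traffic now also binds within
   # 192.168.100.0/24 instead of falling back to 0.0.0.0
   # cluster network = 10.10.0.0/24   (uncomment to keep cluster traffic on a separate subnet)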
[ceph-users] v0.94.10 Hammer released
This Hammer point release fixes several bugs and adds a couple of new features. We recommend that all hammer v0.94.x users upgrade. Please note that Hammer will be retired when Luminous is released later this spring. Until then, the focus will be primarily on bugs that would hinder upgrades to Jewel. New Features * ceph-objectstore-tool and ceph-monstore-tool now enable users to rebuild the monitor database from OSDs. (This feature is especially useful when all monitors fail to boot due to leveldb corruption.) A rough sketch of the recovery procedure appears after the change list below. * In RADOS Gateway, it is now possible to reshard an existing bucket's index using an off-line tool. Usage: $ radosgw-admin bucket reshard --bucket= --num_shards= This will create a new linked bucket instance that points to the newly created index objects. The old bucket instance still exists and currently it's up to the user to manually remove the old bucket index objects. (Note that bucket resharding currently requires that all IO (especially writes) to the specific bucket is quiesced.) Other Notable Changes - * build/ops: ceph-create-keys loops forever (issue#17753, pr#12805, Alfredo Deza) * build/ops: improve ceph.in error message (issue#11101, pr#10905, Kefu Chai) * build/ops: make stop.sh more portable (issue#16918, pr#10569, Mykola Golub) * build/ops: remove SYSTEMD_RUN from initscript (issue#16440, issue#7627, pr#9873, Vladislav Odintsov) * cephx: Fix multiple segfaults due to attempts to encrypt or decrypt (issue#16266, pr#11930, Brad Hubbard) * common: SIGABRT in TrackedOp::dump() via dump_ops_in_flight() (issue#8885, pr#12121, Jianpeng Ma, Zhiqiang Wang, David Zafman) * common: os/ObjectStore: fix _update_op for split dest_cid (issue#15345, pr#12071, Sage Weil) * crush: reset bucket->h.items[i] when removing tree item (issue#16525, pr#10724, Kefu Chai) * doc: add "Upgrading to Hammer" section (issue#17386, pr#11372, Kefu Chai) * doc: add orphan options to radosgw-admin --help and man page (issue#17281, issue#17280, pr#11140, Abhishek Lekshmanan, Casey Bodley, Ken Dreyer, Thomas Serlin) * doc: clarify that RGW bucket object versioning is supported (issue#16574, pr#10437, Yuan Zhou, shawn chen) * librados: bad flags can crash the osd (issue#16012, pr#11936, Jianpeng Ma, Sage Weil) * librbd: ceph 10.2.2 rbd status on image format 2 returns "(2) No such file or directory" (issue#16887, pr#10987, Jason Dillaman) * librbd: diffs to clone's first snapshot should include parent diffs (issue#18068, pr#12446, Jason Dillaman) * librbd: image.stat() call in librbdpy fails sometimes (issue#17310, pr#11949, Jason Dillaman) * librbd: request exclusive lock if current owner cannot execute op (issue#16171, pr#12018, Mykola Golub) * mds: fix cephfs-java ftruncate unit test failure (issue#11258, pr#11939, Yan, Zheng) * mon: %USED of ceph df is wrong (issue#16933, pr#11934, Kefu Chai) * mon: MonmapMonitor should return success when MON will be removed (issue#17725, pr#12006, Joao Eduardo Luis) * mon: OSDMonitor: Missing nearfull flag set (issue#17390, pr#11273, Igor Podoski) * mon: OSDs marked OUT wrongly after monitor failover (issue#17719, pr#11946, Dong Wu) * mon: fix memory leak in prepare_beacon (issue#17285, pr#10238, Igor Podoski) * mon: osd flag health message is misleading (issue#18175, pr#12687, Sage Weil) * mon: prepare_pgtemp needs to only update up_thru if newer than the existing one (issue#16185, pr#11937, Samuel Just) * mon: return size_t from MonitorDBStore::Transaction::size() (issue#14217, pr#10904, Kefu Chai) * mon: send updated monmap to its subscribers
(issue#17558, pr#11457, Kefu Chai) * msgr: OpTracker needs to release the message throttle in _unregistered (issue#14248, pr#11938, Samuel Just) * msgr: simple/Pipe: error decoding addr (issue#18072, pr#12266, Sage Weil) * osd: PG::_update_calc_stats wrong for CRUSH_ITEM_NONE up set items (issue#16998, pr#11933, Samuel Just) * osd: PG::choose_acting valgrind error or ./common/hobject.h: 182: FAILED assert(!max || (*this == hobject_t(hobject_t::get_max( (issue#13967, pr#11932, Tao Chang) * osd: ReplicatedBackend::build_push_op: add a second config to limit omap entries/chunk independently of object data (issue#16128, pr#12417, Wanlong Gao) * osd: crash on EIO during deep-scrubbing (issue#16034, pr#11935, Nathan Cutler) * osd: filestore: FALLOC_FL_PUNCH_HOLE must be used with FALLOC_FL_KEEP_SIZE (issue#18446, pr#13041, xinxin shu) * osd: fix cached_removed_snaps bug in PGPool::update after map gap (issue#18628, issue#15943, pr#12906, Samuel Just) * osd: fix collection_list shadow return value (issue#17713, pr#11927, Haomai Wang) * osd: fix fiemap issue in xfs when #extents > 1364 (issue#17610, pr#11615, Kefu Chai, Ning Yao) * osd: update PGPool to detect map gaps and reset cached_removed_snaps (issue#15943, pr#11676, Samuel Just) * rbd: export diff should open image as read-only (issue#17671, pr#11948, liyan
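The monitor database rebuild mentioned under New Features above broadly follows the upstream mon-store recovery procedure; a heavily abridged sketch, assuming a single OSD data path and the usual admin keyring location, and with the caveat that the exact flags of the hammer backport should be verified against each tool's --help output:

   # collect cluster maps from every OSD into a scratch store (repeat for each OSD data path)
   $ ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op update-mon-db --mon-store-path /tmp/mon-store
   # rebuild the monitor database from the collected maps
   $ ceph-monstore-tool /tmp/mon-store rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring
   # finally, back up and replace each failed monitor's store.db with the rebuilt copy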
Re: [ceph-users] Hammer update
Sasha Litvak writes: > Hello everyone, > > Hammer 0.94.10 update was announced in the blog a week ago. However, there > are no packages available for either version of redhat. Can someone tell me > what is going on? I see the packages at http://download.ceph.com/rpm-hammer/el7/x86_64/. Are you able to see the packages after following the instructions at http://docs.ceph.com/docs/master/install/get-packages/ ? Best, Abhishek -- SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
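In case it helps, here is a rough sketch of a matching yum repo entry; the file name and the gpgkey location are assumptions, and the get-packages page linked above remains the authoritative reference:

   # /etc/yum.repos.d/ceph.repo (illustrative)
   [ceph-hammer]
   name=Ceph hammer packages for el7
   baseurl=http://download.ceph.com/rpm-hammer/el7/x86_64/
   enabled=1
   gpgcheck=1
   gpgkey=https://download.ceph.com/keys/release.asc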
[ceph-users] Jewel v10.2.6 released
This point release fixes several important bugs in RBD mirroring, RGW multi-site, CephFS, and RADOS. We recommend that all v10.2.x users upgrade. For more detailed information, see the complete changelog[1] and the release notes[2] Notable Changes --- * build/ops: add hostname sanity check to run-{c}make-check.sh (issue#18134 , pr#12302 , Nathan Cutler) * build/ops: add ldap lib to rgw lib deps based on build config (issue#17313 , pr#13183 , Nathan Cutler) * build/ops: ceph-create-keys loops forever (issue#17753 , pr#11884 , Alfredo Deza) * build/ops: ceph daemons DUMPABLE flag is cleared by setuid preventing coredumps (issue#17650 , pr#11736 , Patrick Donnelly) * build/ops: fixed compilation error when --with-radowsgw=no (issue#18512 , pr#12729 , Pan Liu) * build/ops: fixed the issue when --disable-server, compilation fails. (issue#18120 , pr#12239 , Pan Liu) * build/ops: fix undefined crypto references with --with-xio (issue#18133 , pr#12296 , Nathan Cutler) * build/ops: install-deps.sh based on /etc/os-release (issue#18466 , issue#18198 , pr#12405 , Jan Fajerski, Nitin A Kamble, Nathan Cutler) * build/ops: Remove the runtime dependency on lsb_release (issue#17425 , pr#11875 , John Coyle, Brad Hubbard) * build/ops: rpm: /etc/ceph/rbdmap is packaged with executable access rights (issue#17395 , pr#11855 , Ken Dreyer) * build/ops: selinux: Allow ceph to manage tmp files (issue#17436 , pr#13048 , Boris Ranto) * build/ops: systemd: Restart Mon after 10s in case of failure (issue#18635 , pr#13058 , Wido den Hollander) * build/ops: systemd restarts Ceph Mon to quickly after failing to start (issue#18635 , pr#13184 , Wido den Hollander) * ceph-disk: fix flake8 errors (issue#17898 , pr#11976 , Ken Dreyer) * cephfs: fuse client crash when adding a new osd (issue#17270 , pr#11860 , John Spray) * cli: ceph-disk: convert none str to str before printing it (issue#18371 , pr#13187 , Kefu Chai) * client: Fix lookup of "/.." in jewel (issue#18408 , pr#12766 , Jeff Layton) * client: fix stale entries in command table (issue#17974 , pr#12137 , John Spray) * client: populate metadata during mount (issue#18361 , pr#13085 , John Spray) * cli: implement functionality for adding, editing and removing omap values with binary keys (issue#18123 , pr#12755 , Jason Dillaman) * common: Improve linux dcache hash algorithm (issue#17599 , pr#11529 , Yibo Cai) * common: utime.h: fix timezone issue in round_to_* funcs. 
(issue#14862 , pr#11508 , Zhao Chao) * doc: Python Swift client commands in Quick Developer Guide don't match configuration in vstart.sh (issue#17746 , pr#13043 , Ronak Jain) * librbd: allow to open an image without opening parent image (issue#18325 , pr#13130 , Ricardo Dias) * librbd: metadata_set API operation should not change global config setting (issue#18465 , pr#13168 , Mykola Golub) * librbd: new API method to force break a peer's exclusive lock (issue#15632 , issue#16773 , issue#17188 , issue#16988 , issue#17210 , issue#17251 , issue#18429 , issue#17227 , issue#18327 , issue#17015 , pr#12890 , Danny Al-Gaaf, Mykola Golub, Jason Dillaman) * librbd: properly order concurrent updates to the object map (issue#16176 , pr#12909 , Jason Dillaman) * librbd: restore journal access when force disabling mirroring (issue#17588 , pr#11916 , Mykola Golub) * mds: Cannot create deep directories when caps contain path=/somepath (issue#17858 , pr#12154 , Patrick Donnelly) * mds: cephfs metadata pool: deep-scrub error omap_digest != best guess omap_digest (issue#17177 , pr#12380 , Yan, Zheng) * mds: cephfs test failures (ceph.com/qa is broken, should be download.ceph.com/qa) (issue#18574 , pr#13023 , John Spray) * mds: ceph-fuse crash during snapshot tests (issue#18460 , pr#13120 , Yan, Zheng) * mds: ceph_volume_client: fix recovery from partial auth update (issue#17216 , pr#11656 , Ramana Raja) * mds: ceph_volume_client.py : Error: Can't handle arrays of non-strings (issue#17800 , pr#12325 , Ramana Raja) * mds: Cleanly reject session evict command when in replay (issue#17801 , pr#12153 , Yan, Zheng) * mds: client segfault on ceph_rmdir path / (issue#9935 , pr#13029 , Michal Jarzabek) * mds: Clients without pool-changing caps shouldn't be allowed to change pool_namespace (issue#17798 , pr#12155 , John Spray) * mds: Decode errors on backtrace will crash MDS (issue#18311 , pr#12836 , Nathan Cutler, John Spray) * mds: false failing to respond to cache pressure warning (issue#17611 , pr#11861 , Yan, Zheng) * mds: finish clientreplay requests before requesting active state (issue#18461 , pr#13113 , Yan, Zheng) * mds: fix incorrect assertion in Server::_dir_is_nonempty() (issue#18578 , pr#13459 , Yan, Zheng) * mds: fix MDSMap upgrade decoding (issue#17837 , pr#13139 , John Spray, Patrick Donnelly) * mds: fix missing ll_get for ll_walk (issue#18086 , pr#13125 , Gui Hecheng) * mds: Fix mount root for ceph_mount users and change tarball format (issue#18312 , issue#18254 , pr#12592 ,
[ceph-users] v12.0.3 Luminous (dev) released
This is the fourth development checkpoint release of Luminous, the next long term stable release. This would most likely be the final development checkpoint release before we move to a release candidate soon. This release introduces several improvements in bluestore, monitor, rbd & rgw. Major changes from v12.0.2 -- * The "journaler allow split entries" config setting has been removed. For a detailed list of changes that went into this release refer to http://ceph.com/releases/ceph-v12-0-3-luminous-dev-released/ Getting Ceph * Git at git://github.com/ceph/ceph.git * Tarball at http://download.ceph.com/tarballs/ceph-12.0.3.tar.gz * For packages, see http://docs.ceph.com/docs/master/install/get-packages/ * For ceph-deploy, see http://docs.ceph.com/docs/master/install/install-ceph-deploy * Release sha1: f2337d1b42fa49dbb0a93e4048a42762e3dffbbf -- Abhishek Lekshmanan SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
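Tying the Getting Ceph pointers together, one way to fetch the source for this checkpoint, assuming the v12.0.3 tag points at the release sha1 listed above:

   $ git clone --branch v12.0.3 git://github.com/ceph/ceph.git
   $ cd ceph
   # should print f2337d1b42fa49dbb0a93e4048a42762e3dffbbf, the release sha1 above
   $ git rev-parse HEAD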