[ceph-users] Significant slowdown of osds since v0.67 Dumpling
Hey all,

This is a copy of Bug #6040 (http://tracker.ceph.com/issues/6040) I created in the tracker. Thought I would pass it through the list as well, to get an idea if anyone else is running into it. It may only show under higher loads. More info about my setup is in the bug-report above. Here goes:

I'm running a Ceph-cluster with 3 nodes, each of which runs a mon, osd and mds. I'm using RBD on this cluster as storage for KVM, CephFS is unused at this time. While still on v0.61.7 Cuttlefish, I got 70-100+MB/sec on simple linear writes to a file with `dd' inside a VM on this cluster under regular load and the osds usually averaged 20-100% CPU-utilisation in `top'. After the upgrade to Dumpling, CPU-usage for the osds shot up to 100% to 400% in `top' (multi-core system) and the speed for my writes with `dd' inside a VM dropped to 20-40MB/sec. Users complained that disk-access inside the VMs was significantly slower and the backups of the RBD-store I was running also got behind quickly.

After downgrading only the osds to v0.61.7 Cuttlefish and leaving the rest at 0.67 Dumpling, speed and load returned to normal. I have repeated this performance-hit upon upgrade on a similar test-cluster under no additional load at all. Although CPU-usage for the osds wasn't as dramatic during these tests because there was no base-load from other VMs, I/O-performance dropped significantly after upgrading during these tests as well, and returned to normal after downgrading the osds.

I'm not sure what to make of it. There are no visible errors in the logs and everything runs and reports good health, it's just a lot slower, with a lot more CPU-usage.

Regards,

Oliver

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
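For anyone wanting to reproduce the comparison, a minimal sketch of the kind of linear-write test described above. The exact dd invocation is not given in the report, so the block size, count, target path and flags here are assumptions:

# Run inside a test VM backed by RBD -- illustrative only; adjust size and flags to taste.
dd if=/dev/zero of=/tmp/ddtest bs=1M count=4096 oflag=direct    # bypass the guest page cache
dd if=/dev/zero of=/tmp/ddtest bs=1M count=4096 conv=fdatasync  # or include a final flush instead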
Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
On 08/17/2013 06:13 AM, Oliver Daudey wrote:
> Hey all,
>
> This is a copy of Bug #6040 (http://tracker.ceph.com/issues/6040) I created in the tracker. Thought I would pass it through the list as well, to get an idea if anyone else is running into it. It may only show under higher loads. More info about my setup is in the bug-report above. Here goes:
>
> I'm running a Ceph-cluster with 3 nodes, each of which runs a mon, osd and mds. I'm using RBD on this cluster as storage for KVM, CephFS is unused at this time. While still on v0.61.7 Cuttlefish, I got 70-100+MB/sec on simple linear writes to a file with `dd' inside a VM on this cluster under regular load and the osds usually averaged 20-100% CPU-utilisation in `top'. After the upgrade to Dumpling, CPU-usage for the osds shot up to 100% to 400% in `top' (multi-core system) and the speed for my writes with `dd' inside a VM dropped to 20-40MB/sec. Users complained that disk-access inside the VMs was significantly slower and the backups of the RBD-store I was running also got behind quickly.
>
> After downgrading only the osds to v0.61.7 Cuttlefish and leaving the rest at 0.67 Dumpling, speed and load returned to normal. I have repeated this performance-hit upon upgrade on a similar test-cluster under no additional load at all. Although CPU-usage for the osds wasn't as dramatic during these tests because there was no base-load from other VMs, I/O-performance dropped significantly after upgrading during these tests as well, and returned to normal after downgrading the osds.
>
> I'm not sure what to make of it. There are no visible errors in the logs and everything runs and reports good health, it's just a lot slower, with a lot more CPU-usage.

Hi Oliver,

If you have access to the perf command on this system, could you try running:

"sudo perf top"

And if that doesn't give you much,

"sudo perf record -g"

then:

"sudo perf report | less"

during the period of high CPU usage? This will give you a call graph. There may be symbols missing, but it might help track down what the OSDs are doing.

Mark

> Regards,
> Oliver

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
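A rough sketch of that workflow, narrowed to a single OSD daemon so the profile is not diluted by other processes. The pgrep pattern, the osd id and the 30-second sample length are assumptions, not from the thread:

OSD_PID=$(pgrep -f 'ceph-osd -i 0' | head -n 1)   # pick one osd daemon to profile
sudo perf top -p "$OSD_PID"                       # live view of the hottest symbols
sudo perf record -g -p "$OSD_PID" -- sleep 30     # sample call graphs for 30 seconds
sudo perf report | less                           # browse the recorded profile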
[ceph-users] radosgw creating pool with empty name after upgrade from 0.61.7 cuttlefish to 0.67 dumpling
Hi,

This seems like a bug.

# ceph df
NAME                 ID   USED   %USED   OBJECTS
.rgw.root            26   778    0       3
.rgw                 27   1118   0       8
.rgw.gc              28   0      0       32
                     30   0      0       8
.users.uid           31   369    0       2
.users               32   20     0       2
.rgw.buckets.index   33   0      0       4
.rgw.buckets         34   3519   0       1

Notice the empty pool name.

# rados -p "" ls
notify.0
notify.1
notify.2
notify.3
notify.4
notify.5
notify.6
notify.7

Based on https://github.com/ceph/ceph/blob/dumpling/src/rgw/rgw_rados.cc it seems this pool is really supposed to be named ".rgw.control".

I am running Ubuntu 12.10 Quantal with ceph packages from sources.list: "deb http://eu.ceph.com/debian-dumpling/ quantal main"

Note: I deleted all rgw pools before I upgraded to 0.67, so I'm not bringing rgw data over from 0.61.7.

Other than the empty pool name, radosgw seems to be working just fine. I have uploaded and retrieved objects without any problems.

Regards,
Øystein

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
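A quick way to spot an unnamed pool like this one -- a sketch only, assuming you run it on a cluster node with client.admin credentials:

rados lspools | cat -A          # each name ends in "$"; a bare "$" line is the empty-named pool
ceph osd dump | grep '^pool'    # cross-check pool ids against their (quoted) names
rados -p "" ls                  # inspect what the unnamed pool actually holds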
Re: [ceph-users] large memory leak on scrubbing
Hi Dominic,

A bug that caused excessive memory consumption during scrub was fixed a couple of months back. You can upgrade to the latest 'bobtail' branch. See http://ceph.com/docs/master/install/debian/#development-testing-packages

Installing that package should clear this up.

sage

On Fri, 16 Aug 2013, Mostowiec Dominik wrote:
> Hi,
> We noticed some issues on our CEPH/S3 cluster which I think are related to scrubbing: large memory leaks.
>
> Logs 09.xx: https://www.dropbox.com/s/4z1fzg239j43igs/ceph-osd.4.log_09xx.tar.gz
> From 09.30 to 09.44 (14 minutes) the osd.4 process grows up to 28G.
>
> I think this is something curious:
> 2013-08-16 09:43:48.801331 7f6570d2e700 0 log [WRN] : slow request 32.794125 seconds old, received at 2013-08-16 09:43:16.007104: osd_sub_op(unknown.0.0:0 16.113d 0//0//-1 [scrub-reserve] v 0'0 snapset=0=[]:[] snapc=0=[]) v7 currently no flag points reached
>
> We have a large rgw index and a lot of large files on this cluster.
> ceph version 0.56.6 (95a0bda7f007a33b0dc7adf4b330778fa1e5d70c)
> Setup:
> - 12 servers x 12 OSD
> - 3 mons
> Default scrubbing settings.
> Journal and filestore settings:
> journal aio = true
> filestore flush min = 0
> filestore flusher = false
> filestore fiemap = false
> filestore op threads = 4
> filestore queue max ops = 4096
> filestore queue max bytes = 10485760
> filestore queue committing max bytes = 10485760
> journal max write bytes = 10485760
> journal queue max bytes = 10485760
> ms dispatch throttle bytes = 10485760
> objecter infilght op bytes = 10485760
>
> Is this a known bug in this version?
> (Do you know some workaround to fix this?)
>
> ---
> Regards
> Dominik

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
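For confirming which daemon balloons while scrubbing runs, a simple sampling loop on an OSD host can help; the interval and output format here are arbitrary choices, not something prescribed in the thread:

# Sample resident memory of all ceph-osd processes once a minute.
while true; do
    date
    ps -C ceph-osd -o pid=,rss=,args= | sort -k2 -nr | head -5   # top 5 OSDs by RSS (KB)
    sleep 60
done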
[ceph-users] v0.67.1 Dumpling released
This is a bug fix release for Dumpling that resolves a problem with the librbd python bindings (triggered by OpenStack) and a hang in librbd when caching is disabled. OpenStack users are encouraged to upgrade. No other serious bugs have come up since v0.67 came out earlier this week.

Notable changes:

* librados, librbd: fix constructor for python bindings with certain usages (in particular, that used by OpenStack)
* librados, librbd: fix aio_flush wakeup when cache is disabled
* librados: fix locking for aio completion refcounting
* fixes 'ceph --admin-daemon ...' command error code on error
* fixes 'ceph daemon ... config set ...' command for boolean config options.

For more detailed information, see the complete release notes and changelog:

* http://ceph.com/docs/master/release-notes/#v0-67-1-dumpling
* http://ceph.com/docs/master/_downloads/v0.67.1.txt

You can download v0.67.1 Dumpling from the usual locations:

* Git at git://github.com/ceph/ceph.git
* Tarball at http://ceph.com/download/ceph-0.67.1.tar.gz
* For Debian/Ubuntu packages, see http://ceph.com/docs/master/install/debian
* For RPMs, see http://ceph.com/docs/master/install/rpm

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
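For Debian/Ubuntu hosts already pointed at the ceph.com dumpling repository, a point-release upgrade is roughly the following. This is a sketch only: the package set and restart order depend on your installation, and it assumes the sysvinit service script is in use:

apt-get update
apt-get install --only-upgrade ceph ceph-common librados2 librbd1
service ceph restart mon   # monitors first
service ceph restart osd   # then osds
service ceph restart mds   # then mds, if any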
Re: [ceph-users] Mds lock
[moving to ceph-devel]

On Fri, 16 Aug 2013, Jun Jun8 Liu wrote:
> Hi all,
>
> I am doing some research about the mds.
>
> There are so many lock types and states, but I haven't found any document describing them.
>
> Can anybody tell me what "loner" and "lock_mix" are?

'loner' tracks when a single client has some exclusive capabilities on the file. For example, when it is the only client with the file open, it can buffer reads and perform setattr operations locally.

lock_mix (lock->mix) is a transition state from lock (primary mds copy exclusive lock, sort of) to mix (shared write lock between multiple mds's). The MDS locking is quite complex, but unfortunately there is not much in the way of documentation for the code. :(

sage

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] performance questions
Hi,

When we rebuilt our ceph cluster, we opted to make our rbd storage replication level 3 rather than the previously configured replication level 2. Things are MUCH slower (5 nodes, 13 OSDs) than before even though most of our I/O is read. Is this to be expected? What are the recommended ways of seeing who/what is consuming the largest amount of disk/network bandwidth?

Thanks!

Jeff

--
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
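Some starting points for both questions -- a sketch; the tool choices and the interface name are assumptions, not recommendations from the thread:

ceph osd pool get rbd size   # confirm the pool really is running with size 3
ceph -w                      # watch live cluster activity, including client and recovery io
iostat -x 5                  # per-disk utilisation on each osd node (sysstat package)
iftop -i eth0                # per-connection network bandwidth; replace eth0 with your NIC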
[ceph-users] ceph 0.67, 0.67.1: ceph_init bug
Hi,

I'm having trouble with ceph_init (after a test reboot):

# ceph_init restart osd
# ceph_init restart osd.0
/usr/lib/ceph/ceph_init.sh: osd.0 not found (/etc/ceph/ceph.conf defines mon.xxx , /var/lib/ceph defines mon.xxx)
1
# ceph-disk list
[...]
/dev/sdc :
 /dev/sdc1 ceph data, prepared, cluster ceph, osd.0
/dev/sdd :
 /dev/sdd1 ceph data, prepared, cluster ceph, osd.1
/dev/sde :
 /dev/sde1 ceph data, prepared, cluster ceph, osd.2
/dev/sdf :
 /dev/sdf1 ceph data, prepared, cluster ceph, osd.3
/dev/sdg :
 /dev/sdg1 ceph data, prepared, cluster ceph, osd.4
/dev/sdh :
 /dev/sdh1 ceph data, prepared, cluster ceph, osd.5

I see at the end of ceph_init that there's a ceph-disk activate-all, but it does nothing as far as I can see:

# ceph_init start osd.0
/usr/lib/ceph/ceph_init.sh: osd.0 not found (/etc/ceph/ceph.conf defines mon.xxx , /var/lib/ceph defines mon.xxx)
1
# ceph-disk activate-all
# mount | grep ceph
/dev/mapper/ssd1-ceph--mon on /var/lib/ceph/mon/ceph-xxx type ext4 (rw,noatime,nodiratime)

Anything I missed?

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] performance questions
On Sat, 17 Aug 2013, Jeff Moskow wrote:
> Hi,
>
> When we rebuilt our ceph cluster, we opted to make our rbd storage replication level 3 rather than the previously configured replication level 2.
>
> Things are MUCH slower (5 nodes, 13 OSDs) than before even though most of our I/O is read. Is this to be expected? What are the recommended ways of seeing who/what is consuming the largest amount of disk/network bandwidth?

It really doesn't sound like the replica count is the source of the performance difference. What else has changed?

sage

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug
On Sun, 18 Aug 2013, Mikaël Cluseau wrote:
> Hi,
>
> I'm having trouble with ceph_init (after a test reboot):
>
> # ceph_init restart osd
> # ceph_init restart osd.0
> /usr/lib/ceph/ceph_init.sh: osd.0 not found (/etc/ceph/ceph.conf defines mon.xxx , /var/lib/ceph defines mon.xxx)
> 1
> # ceph-disk list
> [...]
> /dev/sdc :
>  /dev/sdc1 ceph data, prepared, cluster ceph, osd.0
> /dev/sdd :
>  /dev/sdd1 ceph data, prepared, cluster ceph, osd.1
> /dev/sde :
>  /dev/sde1 ceph data, prepared, cluster ceph, osd.2
> /dev/sdf :
>  /dev/sdf1 ceph data, prepared, cluster ceph, osd.3
> /dev/sdg :
>  /dev/sdg1 ceph data, prepared, cluster ceph, osd.4
> /dev/sdh :
>  /dev/sdh1 ceph data, prepared, cluster ceph, osd.5
>
> I see at the end of ceph_init that there's a ceph-disk activate-all, but it does nothing as far as I can see:
>
> # ceph_init start osd.0
> /usr/lib/ceph/ceph_init.sh: osd.0 not found (/etc/ceph/ceph.conf defines mon.xxx , /var/lib/ceph defines mon.xxx)
> 1
> # ceph-disk activate-all
> # mount | grep ceph
> /dev/mapper/ssd1-ceph--mon on /var/lib/ceph/mon/ceph-xxx type ext4 (rw,noatime,nodiratime)
>
> Anything I missed?

The ceph-disk activate-all command is looking for partitions that are marked with the ceph type uuid. Maybe the journals are missing? What does

 ceph-disk -v activate /dev/sdc1

say? Or

 ceph-disk -v activate-all

Where does the 'journal' symlink in the ceph data partitions point to?

sage

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] ceph-deploy mon create / gatherkeys problems
Hi everyone,

We're trying to get to the bottom of the problems people have been having with ceph-deploy mon create .. and ceph-deploy gatherkeys. There seem to be a series of common pitfalls that are causing these problems. So far we've been chasing them in emails on this list and in various bugs in the tracker, but it is hard to keep track!

* If you are still seeing any problems here, please reply to this thread!

* If you previously had a problem here and then realized you were doing something not quite right and got it working, please reply and share what it was so we can make sure others avoid the problem!

We have a couple of feature tickets open to streamline this process to be simpler (1 or 2 steps instead of 4 or 5), but before we rush off and implement it I want to make sure we fully understand where all of the problems lie.

Thanks!

sage

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
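For reference, the sequence under discussion typically looks like the following. This is a sketch: the hostnames mon1-mon3 are hypothetical, and it assumes passwordless ssh to each target plus monitor names that match `hostname -s` on those hosts, a mismatch there being one of the common pitfalls behind gatherkeys failures:

ceph-deploy new mon1 mon2 mon3          # write the initial ceph.conf and monitor keyring
ceph-deploy mon create mon1 mon2 mon3   # install and start the monitors
ceph-deploy gatherkeys mon1             # should pull client.admin and the bootstrap keys back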
Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug
On 08/18/2013 08:35 AM, Sage Weil wrote:
> The ceph-disk activate-all command is looking for partitions that are marked with the ceph type uuid. Maybe the journals are missing? What does
>
>  ceph-disk -v activate /dev/sdc1
>
> say? Or
>
>  ceph-disk -v activate-all
>
> Where does the 'journal' symlink in the ceph data partitions point to?

ummm ceph-disk activate seems to work:

# ceph-disk -v activate-all
DEBUG:ceph-disk-python2.7:Scanning /dev/disk/by-parttypeuuid
# ceph-disk -v activate /dev/sdc1
DEBUG:ceph-disk-python2.7:Mounting /dev/sdc1 on /var/lib/ceph/tmp/mnt.zGfWOj with options noatime
DEBUG:ceph-disk-python2.7:Cluster uuid is d2f0857a-8bea-4b7e-af0c-ee164bc7ecf7
DEBUG:ceph-disk-python2.7:Cluster name is ceph
DEBUG:ceph-disk-python2.7:OSD uuid is ee7dcd25-d65c-47ba-85cb-3c64566ba980
DEBUG:ceph-disk-python2.7:OSD id is 0
DEBUG:ceph-disk-python2.7:Marking with init system sysvinit
DEBUG:ceph-disk-python2.7:ceph osd.0 data dir is ready at /var/lib/ceph/tmp/mnt.zGfWOj
DEBUG:ceph-disk-python2.7:Moving mount to final location...
DEBUG:ceph-disk-python2.7:Starting ceph osd.0...

The journal symlink points to journal partitions on my SSDs:

# for d in d e f g h ; do ceph-disk activate /dev/sd${d}1; done
# ls -l /var/lib/ceph/osd/ceph-?/journal
lrwxrwxrwx 1 root root 9 Aug 17 09:05 /var/lib/ceph/osd/ceph-0/journal -> /dev/sda5
lrwxrwxrwx 1 root root 9 Aug 17 09:05 /var/lib/ceph/osd/ceph-1/journal -> /dev/sda6
lrwxrwxrwx 1 root root 9 Aug 17 09:06 /var/lib/ceph/osd/ceph-2/journal -> /dev/sda7
lrwxrwxrwx 1 root root 9 Aug 17 09:06 /var/lib/ceph/osd/ceph-3/journal -> /dev/sdb5
lrwxrwxrwx 1 root root 9 Aug 17 09:06 /var/lib/ceph/osd/ceph-4/journal -> /dev/sdb6
lrwxrwxrwx 1 root root 9 Aug 17 09:06 /var/lib/ceph/osd/ceph-5/journal -> /dev/sdb7

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug
On 08/18/2013 08:39 AM, Mikaël Cluseau wrote:
> # ceph-disk -v activate-all
> DEBUG:ceph-disk-python2.7:Scanning /dev/disk/by-parttypeuuid

Maybe /dev/disk/by-parttypeuuid is specific? It doesn't exist here:

# ls -l /dev/disk
total 0
drwxr-xr-x 2 root root 1220 Aug 18 07:01 by-id
drwxr-xr-x 2 root root   60 Aug 18 07:01 by-partlabel
drwxr-xr-x 2 root root  160 Aug 18 07:01 by-partuuid
drwxr-xr-x 2 root root  600 Aug 18 07:01 by-path
drwxr-xr-x 2 root root  200 Aug 18 07:01 by-uuid

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Significant slowdown of osds since v0.67 Dumpling
Hey Mark,

On za, 2013-08-17 at 08:16 -0500, Mark Nelson wrote:
> On 08/17/2013 06:13 AM, Oliver Daudey wrote:
> > Hey all,
> >
> > This is a copy of Bug #6040 (http://tracker.ceph.com/issues/6040) I created in the tracker. Thought I would pass it through the list as well, to get an idea if anyone else is running into it. It may only show under higher loads. More info about my setup is in the bug-report above. Here goes:
> >
> > I'm running a Ceph-cluster with 3 nodes, each of which runs a mon, osd and mds. I'm using RBD on this cluster as storage for KVM, CephFS is unused at this time. While still on v0.61.7 Cuttlefish, I got 70-100+MB/sec on simple linear writes to a file with `dd' inside a VM on this cluster under regular load and the osds usually averaged 20-100% CPU-utilisation in `top'. After the upgrade to Dumpling, CPU-usage for the osds shot up to 100% to 400% in `top' (multi-core system) and the speed for my writes with `dd' inside a VM dropped to 20-40MB/sec. Users complained that disk-access inside the VMs was significantly slower and the backups of the RBD-store I was running also got behind quickly.
> >
> > After downgrading only the osds to v0.61.7 Cuttlefish and leaving the rest at 0.67 Dumpling, speed and load returned to normal. I have repeated this performance-hit upon upgrade on a similar test-cluster under no additional load at all. Although CPU-usage for the osds wasn't as dramatic during these tests because there was no base-load from other VMs, I/O-performance dropped significantly after upgrading during these tests as well, and returned to normal after downgrading the osds.
> >
> > I'm not sure what to make of it. There are no visible errors in the logs and everything runs and reports good health, it's just a lot slower, with a lot more CPU-usage.
>
> Hi Oliver,
>
> If you have access to the perf command on this system, could you try running:
>
> "sudo perf top"
>
> And if that doesn't give you much,
>
> "sudo perf record -g"
>
> then:
>
> "sudo perf report | less"
>
> during the period of high CPU usage? This will give you a call graph. There may be symbols missing, but it might help track down what the OSDs are doing.

Thanks for your help! I did a couple of runs on my test-cluster, loading it with writes from 3 VMs concurrently and measuring the results at the first node with all 0.67 Dumpling-components and with the osds replaced by 0.61.7 Cuttlefish. I let `perf top' run and settle for a while, then copied anything that showed in red and green into this post. Here are the results (sorry for the word-wraps):

First, with 0.61.7 osds:

 19.91%  [kernel]            [k] intel_idle
 10.18%  [kernel]            [k] _raw_spin_lock_irqsave
  6.79%  ceph-osd            [.] ceph_crc32c_le
  4.93%  [kernel]            [k] default_send_IPI_mask_sequence_phys
  2.71%  [kernel]            [k] copy_user_generic_string
  1.42%  libc-2.11.3.so      [.] memcpy
  1.23%  [kernel]            [k] find_busiest_group
  1.13%  librados.so.2.0.0   [.] ceph_crc32c_le_intel
  1.11%  [kernel]            [k] _raw_spin_lock
  0.99%  kvm                 [.] 0x1931f8
  0.92%  [igb]               [k] igb_poll
  0.87%  [kernel]            [k] native_write_cr0
  0.80%  [kernel]            [k] csum_partial
  0.78%  [kernel]            [k] __do_softirq
  0.63%  [kernel]            [k] hpet_legacy_next_event
  0.53%  [ip_tables]         [k] ipt_do_table
  0.50%  libc-2.11.3.so      [.] 0x74433

Second test, with 0.67 osds:

 18.32%  [kernel]            [k] intel_idle
  7.58%  [kernel]            [k] _raw_spin_lock_irqsave
  7.04%  ceph-osd            [.] PGLog::undirty()
  4.39%  ceph-osd            [.] ceph_crc32c_le_intel
  3.92%  [kernel]            [k] default_send_IPI_mask_sequence_phys
  2.25%  [kernel]            [k] copy_user_generic_string
  1.76%  libc-2.11.3.so      [.] memcpy
  1.56%  librados.so.2.0.0   [.] ceph_crc32c_le_intel
  1.40%  libc-2.11.3.so      [.] vfprintf
  1.12%  libc-2.11.3.so      [.] 0x7217b
  1.05%  [kernel]            [k] _raw_spin_lock
  1.01%  [kernel]            [k] find_busiest_group
  0.83%  kvm                 [.] 0x193ab8
  0.80%  [kernel]            [k] native_write_cr0
  0.76%  [kernel]            [k] __do_softirq
  0.73%  libc-2.11.3.so      [.] _IO_default_xsputn
  0.70%  [kernel]            [k] csum_partial
  0.68%  [igb]               [k] igb_poll
  0.58%  [kernel]            [k] hpet_legacy_next_event
  0.54%  [kernel]            [k] __schedule

What jumps right out is the "PGLog::undirty()", which doesn't show up with 0.61.7.
Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug
On 08/18/2013 08:44 AM, Mikaël Cluseau wrote:
> On 08/18/2013 08:39 AM, Mikaël Cluseau wrote:
> > # ceph-disk -v activate-all
> > DEBUG:ceph-disk-python2.7:Scanning /dev/disk/by-parttypeuuid
>
> Maybe /dev/disk/by-parttypeuuid is specific? It doesn't exist here:
>
> # ls -l /dev/disk
> total 0
> drwxr-xr-x 2 root root 1220 Aug 18 07:01 by-id
> drwxr-xr-x 2 root root   60 Aug 18 07:01 by-partlabel
> drwxr-xr-x 2 root root  160 Aug 18 07:01 by-partuuid
> drwxr-xr-x 2 root root  600 Aug 18 07:01 by-path
> drwxr-xr-x 2 root root  200 Aug 18 07:01 by-uuid

ok, it seems the udev rules are missing from my packaging, I'll have to take a better look at your debian packages ;)

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
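For anyone hitting the same thing on a source-built install, a quick check along these lines may help. This is a sketch under the assumption that the udev rules simply were not installed; the rule file name reflects what the upstream packages ship, but verify against your own build:

ls /lib/udev/rules.d/ | grep -i ceph                   # expect something like 95-ceph-osd.rules
udevadm trigger --subsystem-match=block --action=add   # re-run the rules against existing disks
ls -l /dev/disk/by-parttypeuuid                        # the osd data/journal partitions should now show up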
Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug
On Sun, 18 Aug 2013, Mikaël Cluseau wrote:
> On 08/18/2013 08:44 AM, Mikaël Cluseau wrote:
> > On 08/18/2013 08:39 AM, Mikaël Cluseau wrote:
> > > # ceph-disk -v activate-all
> > > DEBUG:ceph-disk-python2.7:Scanning /dev/disk/by-parttypeuuid
> >
> > Maybe /dev/disk/by-parttypeuuid is specific? It doesn't exist here:
> >
> > # ls -l /dev/disk
> > total 0
> > drwxr-xr-x 2 root root 1220 Aug 18 07:01 by-id
> > drwxr-xr-x 2 root root   60 Aug 18 07:01 by-partlabel
> > drwxr-xr-x 2 root root  160 Aug 18 07:01 by-partuuid
> > drwxr-xr-x 2 root root  600 Aug 18 07:01 by-path
> > drwxr-xr-x 2 root root  200 Aug 18 07:01 by-uuid
>
> ok, it seems the udev rules are missing from my packaging, I'll have to take a better look at your debian packages ;)

Yep! What distro is this?

sage

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug
On Sun, 18 Aug 2013, Mikaël Cluseau wrote:
> On 08/18/2013 08:35 AM, Sage Weil wrote:
> > The ceph-disk activate-all command is looking for partitions that are marked with the ceph type uuid. Maybe the journals are missing? What does
> >
> >  ceph-disk -v activate /dev/sdc1
> >
> > say? Or
> >
> >  ceph-disk -v activate-all
> >
> > Where does the 'journal' symlink in the ceph data partitions point to?
>
> ummm ceph-disk activate seems to work:
>
> # ceph-disk -v activate-all
> DEBUG:ceph-disk-python2.7:Scanning /dev/disk/by-parttypeuuid

find /dev/disk/by-parttypeuuid -ls ?

> # ceph-disk -v activate /dev/sdc1
> DEBUG:ceph-disk-python2.7:Mounting /dev/sdc1 on /var/lib/ceph/tmp/mnt.zGfWOj with options noatime
> DEBUG:ceph-disk-python2.7:Cluster uuid is d2f0857a-8bea-4b7e-af0c-ee164bc7ecf7
> DEBUG:ceph-disk-python2.7:Cluster name is ceph
> DEBUG:ceph-disk-python2.7:OSD uuid is ee7dcd25-d65c-47ba-85cb-3c64566ba980
> DEBUG:ceph-disk-python2.7:OSD id is 0
> DEBUG:ceph-disk-python2.7:Marking with init system sysvinit
> DEBUG:ceph-disk-python2.7:ceph osd.0 data dir is ready at /var/lib/ceph/tmp/mnt.zGfWOj
> DEBUG:ceph-disk-python2.7:Moving mount to final location...
> DEBUG:ceph-disk-python2.7:Starting ceph osd.0...
>
> The journal symlink points to journal partitions on my SSDs:
>
> # for d in d e f g h ; do ceph-disk activate /dev/sd${d}1; done
> # ls -l /var/lib/ceph/osd/ceph-?/journal
> lrwxrwxrwx 1 root root 9 Aug 17 09:05 /var/lib/ceph/osd/ceph-0/journal -> /dev/sda5
> lrwxrwxrwx 1 root root 9 Aug 17 09:05 /var/lib/ceph/osd/ceph-1/journal -> /dev/sda6
> lrwxrwxrwx 1 root root 9 Aug 17 09:06 /var/lib/ceph/osd/ceph-2/journal -> /dev/sda7
> lrwxrwxrwx 1 root root 9 Aug 17 09:06 /var/lib/ceph/osd/ceph-3/journal -> /dev/sdb5
> lrwxrwxrwx 1 root root 9 Aug 17 09:06 /var/lib/ceph/osd/ceph-4/journal -> /dev/sdb6
> lrwxrwxrwx 1 root root 9 Aug 17 09:06 /var/lib/ceph/osd/ceph-5/journal -> /dev/sdb7

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug
On 08/18/2013 08:53 AM, Sage Weil wrote:
> Yep! What distro is this?

I'm working on Gentoo packaging to get a full stack of ceph and openstack. Overlay here:

git clone https://git.isi.nc/cloud/cloud-overlay.git

And a small fork of ceph-deploy to add gentoo support:

git clone https://git.isi.nc/cloud/ceph-deploy.git

I'll put these on github eventually, but I'd like to get at least my own cluster working first :)

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug
On 08/18/2013 08:53 AM, Sage Weil wrote:
> Yep!

It's working without any change in the udev rules files ;)

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] radosgw creating pool with empty name after upgrade from 0.61.7 cuttlefish to 0.67 dumpling
On Sat, Aug 17, 2013 at 6:25 AM, Øystein Lønning Nerhus wrote:
> Hi,
>
> This seems like a bug.
>
> # ceph df
> NAME                 ID   USED   %USED   OBJECTS
> .rgw.root            26   778    0       3
> .rgw                 27   1118   0       8
> .rgw.gc              28   0      0       32
>                      30   0      0       8
> .users.uid           31   369    0       2
> .users               32   20     0       2
> .rgw.buckets.index   33   0      0       4
> .rgw.buckets         34   3519   0       1
>
> Notice the empty pool name.
>
> # rados -p "" ls
> notify.0
> notify.1
> notify.2
> notify.3
> notify.4
> notify.5
> notify.6
> notify.7
>
> Based on https://github.com/ceph/ceph/blob/dumpling/src/rgw/rgw_rados.cc it seems this pool is really supposed to be named ".rgw.control".
>
> I am running Ubuntu 12.10 Quantal with ceph packages from sources.list: "deb http://eu.ceph.com/debian-dumpling/ quantal main"
>
> Note: I deleted all rgw pools before I upgraded to 0.67, so I'm not bringing rgw data over from 0.61.7.
>
> Other than the empty pool name, radosgw seems to be working just fine. I have uploaded and retrieved objects without any problems.

Sounds like a bug, I opened issue #6046.

Thanks,
Yehuda

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com