Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3,2K IOPS
>>The results are with journal and data configured in the same SSD ?

yes

>>Also, how are you configuring your journal device, is it a block device ?

yes.

~ceph-deploy osd create node:sdb

# parted /dev/sdb
GNU Parted 2.3
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: ATA Crucial_CT1024M5 (scsi)
Disk /dev/sdb: 1024GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system  Name          Flags
 2      1049kB  5369MB  5368MB               ceph journal
 1      5370MB  1024GB  1019GB  xfs          ceph data

>>If journal and data are not in the same device result may change.

Yes, sure, of course.

>>BTW, there are SSDs like SanDisk optimas drives that is using capacitor
>>backed DRAM and thus always ignore these CMD_FLUSH command since drive
>>guarantees that once data reaches drive, it will power fail safe. So, you
>>don't need kernel patch.

Oh, good to know! Note that the kernel patch is really useful for these cheap consumer Crucial M550s, but I don't see much difference for the Intel S3500.

>>Optimus random write performance is ~15K (4K io_size). Presently, I don't
>>have any write performance data (on ceph) with that, I will run some test
>>with that soon and share.

Impressive results! I haven't chosen the SSD model for my production cluster yet (target 2015); I'll have a look at these Optimus drives.

----- Original Message -----
From: "Somnath Roy"
To: "Mark Kirkwood", "Alexandre DERUMIER", "Sebastien Han"
Cc: ceph-users@lists.ceph.com
Sent: Wednesday, 17 September 2014 03:22:05
Subject: RE: [ceph-users] [Single OSD performance on SSD] Can't go over 3,2K IOPS

Hi Mark/Alexandre,

The results are with journal and data configured in the same SSD ? Also, how are you configuring your journal device, is it a block device ? If journal and data are not in the same device the result may change.

BTW, there are SSDs like the SanDisk Optimus drives that use capacitor-backed DRAM and thus always ignore the CMD_FLUSH command, since the drive guarantees that once data reaches the drive it is power-fail safe. So, you don't need the kernel patch.

Optimus random write performance is ~15K (4K io_size). Presently, I don't have any write performance data (on Ceph) with that; I will run some tests with that soon and share.

Thanks & Regards
Somnath

-----Original Message-----
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Kirkwood
Sent: Tuesday, September 16, 2014 3:36 PM
To: Alexandre DERUMIER; Sebastien Han
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3,2K IOPS

On 17/09/14 08:39, Alexandre DERUMIER wrote:
> Hi,
>
>>> I'm just surprised that you're only getting 5299 with 0.85 since
>>> I've been able to get 6,4K, well I was using the 200GB model
>
> Your model is the DC S3700
>
> mine is the DC S3500
>
> with lower writes, so that could explain the difference.
>

Interesting - I was getting 8K IOPS with 0.85 on a 128G M550 - this suggests that the bottleneck is not only sync write performance (as your S3500 does much better there), but write performance generally (where the M550 is faster).

Cheers

Mark
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
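For anyone wanting to reproduce this kind of comparison, a minimal fio invocation of the sort commonly used to measure O_DSYNC 4K write IOPS on a journal SSD might look like the following sketch (the device path /dev/sdb2 and the runtime are only placeholders, and the test overwrites that partition, so run it on a scratch device only):

# destructive test: synchronous 4K writes against the raw journal partition
fio --name=journal-sync-test --filename=/dev/sdb2 --direct=1 --sync=1 \
    --rw=write --bs=4k --iodepth=1 --numjobs=1 --runtime=60 --time_based \
    --group_reporting

Drives that honour CMD_FLUSH on every sync (most consumer SSDs) report far lower numbers here than in a plain --direct=1 run, which is what the kernel-patch discussion above is about.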
[ceph-users] Multiple cephfs filesystems per cluster
Hi Cephalopods,

Browsing the list archives, I know this has come up before, but I thought I'd check in for an update.

I'm in an environment where it would be useful to run a file system per department in a single cluster (or at a pinch enforcing some client / fs tree security). Has there been much progress recently?

Many thanks,
Dave
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Multiple cephfs filesystems per cluster
On 09/17/2014 12:11 PM, David Barker wrote:
> Hi Cephalopods,
>
> Browsing the list archives, I know this has come up before, but I thought
> I'd check in for an update.
>
> I'm in an environment where it would be useful to run a file system per
> department in a single cluster (or at a pinch enforcing some client / fs
> tree security). Has there been much progress recently?
>

No, that's not possible. It's a single hierarchy. However, you can create subdirectories per department and do a subtree mount. ACLs and tree security aren't implemented yet, however.

> Many thanks,
>
> Dave
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
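To illustrate the subtree-mount approach Wido describes, a per-department setup with the kernel client could look roughly like this (the monitor address, directory names and admin credentials are placeholders; note that nothing currently stops a client holding a filesystem key from mounting a different subtree, which is the missing security piece mentioned above):

# mount the filesystem root once to create the per-department directories
mount -t ceph 10.0.0.1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret
mkdir /mnt/cephfs/dept-a /mnt/cephfs/dept-b
umount /mnt/cephfs
# each department then mounts only its own subtree
mount -t ceph 10.0.0.1:6789:/dept-a /mnt/dept-a -o name=admin,secretfile=/etc/ceph/admin.secret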
[ceph-users] TypeError: unhashable type: 'list'
Hi all,

Has anyone been successful in replicating data across two zones of a federated gateway configuration? I am getting a "TypeError: unhashable type: 'list'" error, and I am not seeing the data part getting replicated.

Verbose log:

application/json; charset=UTF-8
Wed, 17 Sep 2014 09:59:22 GMT
/admin/log
2014-09-17T15:29:22.219 15995:DEBUG:boto:Signature: AWS V280N25RDUA6EQ55T28V:woT+s+oqufKWoHyMIxdK/++Hz7U=
2014-09-17T15:29:22.220 15995:DEBUG:boto:url = 'http://cephog1.santhosh.com:81/admin/log?lock' params={'locker-id': 'cephOG1:15984', 'length': 60, 'zone-id': u'in-west', 'type': 'metadata', 'id': 52} headers={'Date': 'Wed, 17 Sep 2014 09:59:22 GMT', 'Content-Length': '0', 'Content-Type': 'application/json; charset=UTF-8', 'Authorization': 'AWS V280N25RDUA6EQ55T28V:woT+s+oqufKWoHyMIxdK/++Hz7U=', 'User-Agent': 'Boto/2.20.1 Python/2.7.6 Linux/3.13.0-24-generic'} data=None
2014-09-17T15:29:22.222 15995:INFO:urllib3.connectionpool:Starting new HTTP connection (1): cephog1.santhosh.com
2014-09-17T15:29:22.223 15995:DEBUG:urllib3.connectionpool:Setting read timeout to None
2014-09-17T15:29:22.257 15995:DEBUG:urllib3.connectionpool:"POST /admin/log?lock&locker-id=cephOG1%3A15984&length=60&zone-id=in-west&type=metadata&id=52 HTTP/1.1" 200 None
2014-09-17T15:29:22.258 15995:ERROR:radosgw_agent.worker:syncing entries for shard 52 failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/radosgw_agent/worker.py", line 151, in run
    new_retries = self.sync_entries(log_entries, retries)
  File "/usr/lib/python2.7/dist-packages/radosgw_agent/worker.py", line 437, in sync_entries
    for section, name in mentioned.union(split_retries):
TypeError: unhashable type: 'list'
2014-09-17T15:29:22.259 15995:DEBUG:radosgw_agent.lock:release and clear lock
2014-09-17T15:29:22.259 15995:DEBUG:boto:path=/admin/log?unlock
2014-09-17T15:29:22.260 15995:DEBUG:boto:auth_path=/admin/log?unlock
2014-09-17T15:29:22.260 15995:DEBUG:boto:StringToSign:
POST

Any idea/help on how to resolve this error?

Regards,
Santhosh
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
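The exception itself is just Python refusing to put a list into a set: set.union() has to hash every element of the iterable it is given, and lists aren't hashable. A Python 2 one-liner reproduces the error class (this only demonstrates the mechanism; where exactly radosgw-agent builds the offending list-of-lists before worker.py line 437 is an assumption I haven't verified):

python -c "set().union([['bucket', 'object']])"
# -> TypeError: unhashable type: 'list'
python -c "print set().union([('bucket', 'object')])"
# -> set([('bucket', 'object')])  -- tuples are hashable, which is the usual style of fix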
Re: [ceph-users] Multiple cephfs filesystems per cluster
Hi David, We haven't written any code for the multiple filesystems feature so far, but the new "fs new"/"fs rm"/"fs ls" management commands were designed with this in mind -- currently only supporting one filesystem, but to allow slotting in the multiple filesystems feature without too much disruption. There is some design work to be done as well, such as how the system should handle standby MDSs (assigning to a particular filesystem, floating between filesystems, etc). Cheers, John On Wed, Sep 17, 2014 at 11:11 AM, David Barker wrote: > Hi Cephalopods, > > Browsing the list archives, I know this has come up before, but I thought > I'd check in for an update. > > I'm in an environment where it would be useful to run a file system per > department in a single cluster (or at a pinch enforcing some client / fs > tree security). Has there been much progress recently? > > Many thanks, > > Dave > > > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Dumpling cluster can't resolve peering failures, ceph pg query blocks, auth failures in logs
Thanks, I did check on that too as I'd seen this before and this was "the usual drill", but alas, no, that wasn't the problem. This cluster is having other issues too, though, so I probably need to look into those first. Cheers, Florian On Mon, Sep 15, 2014 at 7:29 PM, Gregory Farnum wrote: > Not sure, but have you checked the clocks on their nodes? Extreme > clock drift often results in strange cephx errors. > -Greg > Software Engineer #42 @ http://inktank.com | http://ceph.com > > > On Sun, Sep 14, 2014 at 11:03 PM, Florian Haas wrote: >> Hi everyone, >> >> [Keeping this on the -users list for now. Let me know if I should >> cross-post to -devel.] >> >> I've been asked to help out on a Dumpling cluster (a system >> "bequeathed" by one admin to the next, currently on 0.67.10, was >> originally installed with 0.67.5 and subsequently updated a few >> times), and I'm seeing a rather odd issue there. The cluster is >> relatively small, 3 MONs, 4 OSD nodes; each OSD node hosts a rather >> non-ideal 12 OSDs but its performance issues aren't really the point >> here. >> >> "ceph health detail" shows a bunch of PGs peering, but the usual >> troubleshooting steps don't really seem to work. >> >> For some PGs, "ceph pg query" just blocks, doesn't return >> anything. Adding --debug_ms=10 shows that it's simply not getting a >> response back from one of the OSDs it's trying to talk to, as if >> packets dropped on the floor or were filtered out. However, opening a >> simple TCP connection to the OSD's IP and port works perfectly fine >> (netcat returns a Ceph signature). >> >> (Note, though, that because of a daemon flapping issue they at some >> point set both "noout" and "nodown", so the cluster may not be >> behaving as normally expected when OSDs fail to respond in time.) >> >> Then there are some PGs where "ceph pg query" is a little more >> verbose, though not exactly more successful: >> >> From ceph health detail: >> >> pg 6.c10 is stuck inactive for 1477.781394, current state peering, >> last acting [85,16] >> >> ceph pg 6.b1 query: >> >> 2014-09-15 01:06:48.200418 7f29a6efc700 0 cephx: verify_reply >> couldn't decrypt with error: error decoding block for decryption >> 2014-09-15 01:06:48.200428 7f29a6efc700 0 -- 10.47.17.1:0/1020420 >> >> 10.47.16.33:6818/15630 pipe(0x2c00b00 sd=4 :43263 s=1 pgs=0 cs=0 l=1 >> c=0x2c00d90).failed verifying authorize reply >> 2014-09-15 01:06:48.200465 7f29a6efc700 0 -- 10.47.17.1:0/1020420 >> >> 10.47.16.33:6818/15630 pipe(0x2c00b00 sd=4 :43263 s=1 pgs=0 cs=0 l=1 >> c=0x2c00d90).fault >> 2014-09-15 01:06:48.201000 7f29a6efc700 0 cephx: verify_reply >> couldn't decrypt with error: error decoding block for decryption >> 2014-09-15 01:06:48.201008 7f29a6efc700 0 -- 10.47.17.1:0/1020420 >> >> 10.47.16.33:6818/15630 pipe(0x2c00b00 sd=4 :43264 s=1 pgs=0 cs=0 l=1 >> c=0x2c00d90).failed verifying authorize reply >> >> Oops. Now the admins swear they didn't touch the keys, but they are >> also (understandably) reluctant to just kill and redeploy all those >> OSDs, as these issues are basically scattered over a bunch of PGs >> touching many OSDs. How would they pinpoint this to be sure that >> they're not being bitten by a bug or misconfiguration? >> >> Not sure if people have seen this before — if so, I'd be grateful for >> some input. Loïc, Sébastien perhaps? Or João, Greg, Sage? >> >> Thanks in advance for any insight people might be able to share. 
:) >> >> Cheers, >> Florian ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
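One way to rule the keys in or out without redeploying anything is to compare what the monitors think an OSD's key is with what that OSD actually has on disk, and to double-check time sync while at it (osd.85 and the default data path are only examples; adjust for the OSDs named in the "failed verifying authorize reply" messages):

# key as registered in the cluster
ceph auth get osd.85
# key the daemon actually uses, on the OSD's host (default path for cluster "ceph")
cat /var/lib/ceph/osd/ceph-85/keyring
# clock sanity on every mon/OSD host
ntpq -p; date

If the two keys differ for any affected OSD, that keyring is the problem; if they all match and the clocks are sane, it points more towards a bug than a misconfiguration.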
Re: [ceph-users] Multiple cephfs filesystems per cluster
Thanks John - It did look like it was heading in that direction! I did wonder if a 'fs map' & 'fs unmap' would be useful too; filesystem backups, migrations between clusters & async DR could be facilitated by moving underlying pool objects around between clusters. Dave On Wed, Sep 17, 2014 at 11:22 AM, John Spray wrote: > Hi David, > > We haven't written any code for the multiple filesystems feature so > far, but the new "fs new"/"fs rm"/"fs ls" management commands were > designed with this in mind -- currently only supporting one > filesystem, but to allow slotting in the multiple filesystems feature > without too much disruption. There is some design work to be done as > well, such as how the system should handle standby MDSs (assigning to > a particular filesystem, floating between filesystems, etc). > > Cheers, > John > > On Wed, Sep 17, 2014 at 11:11 AM, David Barker > wrote: > > Hi Cephalopods, > > > > Browsing the list archives, I know this has come up before, but I thought > > I'd check in for an update. > > > > I'm in an environment where it would be useful to run a file system per > > department in a single cluster (or at a pinch enforcing some client / fs > > tree security). Has there been much progress recently? > > > > Many thanks, > > > > Dave > > > > > > > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] monitor quorum
Hi,

I have a ceph cluster running 0.80.1 on Ubuntu 14.04. I have 3 monitors and 4 OSD nodes currently.

Everything has been running great up until today where I've got an issue with the monitors. I moved mon03 to a different switchport so it would have temporarily lost connectivity. Since then, the cluster is reporting that that mon is down, although it's definitely up. I've tried restarting the mon services on all three mons, but that hasn't made a difference. I definitely, 100% do not have any clock skew on any of the mons. This has been triple-checked as the ceph docs seem to suggest that might be the cause of this issue.

Here is what ceph -s and ceph health detail are reporting, as well as the mon_status for each monitor:

# ceph -s ; ceph health detail
    cluster XXX
     health HEALTH_WARN 1 mons down, quorum 0,1 ceph-mon-01,ceph-mon-02
     monmap e2: 3 mons at {ceph-mon-01=10.1.1.64:6789/0,ceph-mon-02=10.1.1.65:6789/0,ceph-mon-03=10.1.1.66:6789/0}, election epoch 932, quorum 0,1 ceph-mon-01,ceph-mon-02
     osdmap e49213: 80 osds: 80 up, 80 in
      pgmap v18242952: 4864 pgs, 5 pools, 69910 GB data, 17638 kobjects
            197 TB used, 95904 GB / 290 TB avail
                   8 active+clean+scrubbing+deep
                4856 active+clean
  client io 6893 kB/s rd, 5657 kB/s wr, 2090 op/s
HEALTH_WARN 1 mons down, quorum 0,1 ceph-mon-01,ceph-mon-02
mon.ceph-mon-03 (rank 2) addr 10.1.1.66:6789/0 is down (out of quorum)

{ "name": "ceph-mon-01",
  "rank": 0,
  "state": "leader",
  "election_epoch": 932,
  "quorum": [ 0, 1],
  "outside_quorum": [],
  "extra_probe_peers": [],
  "sync_provider": [],
  "monmap": { "epoch": 2,
      "fsid": "XXX",
      "modified": "0.00",
      "created": "0.00",
      "mons": [
            { "rank": 0, "name": "ceph-mon-01", "addr": "10.1.1.64:6789\/0"},
            { "rank": 1, "name": "ceph-mon-02", "addr": "10.1.1.65:6789\/0"},
            { "rank": 2, "name": "ceph-mon-03", "addr": "10.1.1.66:6789\/0"}]}}

{ "name": "ceph-mon-02",
  "rank": 1,
  "state": "peon",
  "election_epoch": 932,
  "quorum": [ 0, 1],
  "outside_quorum": [],
  "extra_probe_peers": [],
  "sync_provider": [],
  "monmap": { "epoch": 2,
      "fsid": "XXX",
      "modified": "0.00",
      "created": "0.00",
      "mons": [
            { "rank": 0, "name": "ceph-mon-01", "addr": "10.1.1.64:6789\/0"},
            { "rank": 1, "name": "ceph-mon-02", "addr": "10.1.1.65:6789\/0"},
            { "rank": 2, "name": "ceph-mon-03", "addr": "10.1.1.66:6789\/0"}]}}

{ "name": "ceph-mon-03",
  "rank": 2,
  "state": "electing",
  "election_epoch": 931,
  "quorum": [],
  "outside_quorum": [],
  "extra_probe_peers": [],
  "sync_provider": [],
  "monmap": { "epoch": 2,
      "fsid": "XXX",
      "modified": "0.00",
      "created": "0.00",
      "mons": [
            { "rank": 0, "name": "ceph-mon-01", "addr": "10.1.1.64:6789\/0"},
            { "rank": 1, "name": "ceph-mon-02", "addr": "10.1.1.65:6789\/0"},
            { "rank": 2, "name": "ceph-mon-03", "addr": "10.1.1.66:6789\/0"}]}}

Any help or advice is appreciated.

Regards
James
___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] monitor quorum
On Wed, Sep 17, 2014 at 1:58 PM, James Eckersall wrote: > Hi, > > I have a ceph cluster running 0.80.1 on Ubuntu 14.04. I have 3 monitors and > 4 OSD nodes currently. > > Everything has been running great up until today where I've got an issue > with the monitors. > I moved mon03 to a different switchport so it would have temporarily lost > connectivity. > Since then, the cluster is reporting that that mon is down, although it's > definitely up. > I've tried restarting the mon services on all three mons, but that hasn't > made a difference. > I definitely, 100% do not have any clock skew on any of the mons. This has > been triple-checked as the ceph docs seem to suggest that might be the cause > of this issue. > > Here is what ceph -s and ceph health detail are reporting as well as the > mon_status for each monitor: > > > # ceph -s ; ceph health detail > cluster XXX > health HEALTH_WARN 1 mons down, quorum 0,1 ceph-mon-01,ceph-mon-02 > monmap e2: 3 mons at > {ceph-mon-01=10.1.1.64:6789/0,ceph-mon-02=10.1.1.65:6789/0,ceph-mon-03=10.1.1.66:6789/0}, > election epoch 932, quorum 0,1 ceph-mon-01,ceph-mon-02 > osdmap e49213: 80 osds: 80 up, 80 in > pgmap v18242952: 4864 pgs, 5 pools, 69910 GB data, 17638 kobjects > 197 TB used, 95904 GB / 290 TB avail >8 active+clean+scrubbing+deep > 4856 active+clean > client io 6893 kB/s rd, 5657 kB/s wr, 2090 op/s > HEALTH_WARN 1 mons down, quorum 0,1 ceph-mon-01,ceph-mon-02 > mon.ceph-mon-03 (rank 2) addr 10.1.1.66:6789/0 is down (out of quorum) > > > { "name": "ceph-mon-01", > "rank": 0, > "state": "leader", > "election_epoch": 932, > "quorum": [ > 0, > 1], > "outside_quorum": [], > "extra_probe_peers": [], > "sync_provider": [], > "monmap": { "epoch": 2, > "fsid": "XXX", > "modified": "0.00", > "created": "0.00", > "mons": [ > { "rank": 0, > "name": "ceph-mon-01", > "addr": "10.1.1.64:6789\/0"}, > { "rank": 1, > "name": "ceph-mon-02", > "addr": "10.1.1.65:6789\/0"}, > { "rank": 2, > "name": "ceph-mon-03", > "addr": "10.1.1.66:6789\/0"}]}} > > > { "name": "ceph-mon-02", > "rank": 1, > "state": "peon", > "election_epoch": 932, > "quorum": [ > 0, > 1], > "outside_quorum": [], > "extra_probe_peers": [], > "sync_provider": [], > "monmap": { "epoch": 2, > "fsid": "XXX", > "modified": "0.00", > "created": "0.00", > "mons": [ > { "rank": 0, > "name": "ceph-mon-01", > "addr": "10.1.1.64:6789\/0"}, > { "rank": 1, > "name": "ceph-mon-02", > "addr": "10.1.1.65:6789\/0"}, > { "rank": 2, > "name": "ceph-mon-03", > "addr": "10.1.1.66:6789\/0"}]}} > > > { "name": "ceph-mon-03", > "rank": 2, > "state": "electing", > "election_epoch": 931, > "quorum": [], > "outside_quorum": [], > "extra_probe_peers": [], > "sync_provider": [], > "monmap": { "epoch": 2, > "fsid": "XXX", > "modified": "0.00", > "created": "0.00", > "mons": [ > { "rank": 0, > "name": "ceph-mon-01", > "addr": "10.1.1.64:6789\/0"}, > { "rank": 1, > "name": "ceph-mon-02", > "addr": "10.1.1.65:6789\/0"}, > { "rank": 2, > "name": "ceph-mon-03", > "addr": "10.1.1.66:6789\/0"}]}} > > > Any help or advice is appreciated. It looks like your mon has been unable to communicate with the other hosts, presumably since the time you un-/replugged it. Check your switch port configuration. Also, make sure that from 10.1.1.66, you can not only ping 10.1.1.64 and 10.1.1.65, but make a TCP connection on port 6789. With that out of the way, check your mon log on ceph-mon-03 (in /var/log/ceph/mon); it should provide some additional insight into the problem. 
Cheers, Florian ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
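A quick way to do the reachability check Florian describes, run from ceph-mon-03 towards the two healthy mons (and then the same in the opposite direction), is something like:

nc -zv 10.1.1.64 6789
nc -zv 10.1.1.65 6789

A successful TCP connect here combined with continued election failures would point at something subtler than the switch port (MTU mismatch, asymmetric firewalling, and so on).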
Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU
Hi Craig, just dug this up in the list archives. On Fri, Mar 28, 2014 at 2:04 AM, Craig Lewis wrote: > In the interest of removing variables, I removed all snapshots on all pools, > then restarted all ceph daemons at the same time. This brought up osd.8 as > well. So just to summarize this: your 100% CPU problem at the time went away after you removed all snapshots, and the actual cause of the issue was never found? I am seeing a similar issue now, and have filed http://tracker.ceph.com/issues/9503 to make sure it doesn't get lost again. Can you take a look at that issue and let me know if anything in the description sounds familiar? You mentioned in a later message in the same thread that you would keep your snapshot script running and "repeat the experiment". Did the situation change in any way after that? Did the issue come back? Or did you just stop using snapshots altogether? Cheers, Florian ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] monitor quorum
Hi, Thanks for the advice. I feel pretty dumb as it does indeed look like a simple networking issue. You know how you check things 5 times and miss the most obvious one... J On 17 September 2014 16:04, Florian Haas wrote: > On Wed, Sep 17, 2014 at 1:58 PM, James Eckersall > wrote: > > Hi, > > > > I have a ceph cluster running 0.80.1 on Ubuntu 14.04. I have 3 monitors > and > > 4 OSD nodes currently. > > > > Everything has been running great up until today where I've got an issue > > with the monitors. > > I moved mon03 to a different switchport so it would have temporarily lost > > connectivity. > > Since then, the cluster is reporting that that mon is down, although it's > > definitely up. > > I've tried restarting the mon services on all three mons, but that hasn't > > made a difference. > > I definitely, 100% do not have any clock skew on any of the mons. This > has > > been triple-checked as the ceph docs seem to suggest that might be the > cause > > of this issue. > > > > Here is what ceph -s and ceph health detail are reporting as well as the > > mon_status for each monitor: > > > > > > # ceph -s ; ceph health detail > > cluster XXX > > health HEALTH_WARN 1 mons down, quorum 0,1 ceph-mon-01,ceph-mon-02 > > monmap e2: 3 mons at > > {ceph-mon-01= > 10.1.1.64:6789/0,ceph-mon-02=10.1.1.65:6789/0,ceph-mon-03=10.1.1.66:6789/0 > }, > > election epoch 932, quorum 0,1 ceph-mon-01,ceph-mon-02 > > osdmap e49213: 80 osds: 80 up, 80 in > > pgmap v18242952: 4864 pgs, 5 pools, 69910 GB data, 17638 kobjects > > 197 TB used, 95904 GB / 290 TB avail > >8 active+clean+scrubbing+deep > > 4856 active+clean > > client io 6893 kB/s rd, 5657 kB/s wr, 2090 op/s > > HEALTH_WARN 1 mons down, quorum 0,1 ceph-mon-01,ceph-mon-02 > > mon.ceph-mon-03 (rank 2) addr 10.1.1.66:6789/0 is down (out of quorum) > > > > > > { "name": "ceph-mon-01", > > "rank": 0, > > "state": "leader", > > "election_epoch": 932, > > "quorum": [ > > 0, > > 1], > > "outside_quorum": [], > > "extra_probe_peers": [], > > "sync_provider": [], > > "monmap": { "epoch": 2, > > "fsid": "XXX", > > "modified": "0.00", > > "created": "0.00", > > "mons": [ > > { "rank": 0, > > "name": "ceph-mon-01", > > "addr": "10.1.1.64:6789\/0"}, > > { "rank": 1, > > "name": "ceph-mon-02", > > "addr": "10.1.1.65:6789\/0"}, > > { "rank": 2, > > "name": "ceph-mon-03", > > "addr": "10.1.1.66:6789\/0"}]}} > > > > > > { "name": "ceph-mon-02", > > "rank": 1, > > "state": "peon", > > "election_epoch": 932, > > "quorum": [ > > 0, > > 1], > > "outside_quorum": [], > > "extra_probe_peers": [], > > "sync_provider": [], > > "monmap": { "epoch": 2, > > "fsid": "XXX", > > "modified": "0.00", > > "created": "0.00", > > "mons": [ > > { "rank": 0, > > "name": "ceph-mon-01", > > "addr": "10.1.1.64:6789\/0"}, > > { "rank": 1, > > "name": "ceph-mon-02", > > "addr": "10.1.1.65:6789\/0"}, > > { "rank": 2, > > "name": "ceph-mon-03", > > "addr": "10.1.1.66:6789\/0"}]}} > > > > > > { "name": "ceph-mon-03", > > "rank": 2, > > "state": "electing", > > "election_epoch": 931, > > "quorum": [], > > "outside_quorum": [], > > "extra_probe_peers": [], > > "sync_provider": [], > > "monmap": { "epoch": 2, > > "fsid": "XXX", > > "modified": "0.00", > > "created": "0.00", > > "mons": [ > > { "rank": 0, > > "name": "ceph-mon-01", > > "addr": "10.1.1.64:6789\/0"}, > > { "rank": 1, > > "name": "ceph-mon-02", > > "addr": "10.1.1.65:6789\/0"}, > > { "rank": 2, > > "name": "ceph-mon-03", > > "addr": "10.1.1.66:6789\/0"}]}} > > > > > > Any help or advice is appreciated. 
> > It looks like your mon has been unable to communicate with the other > hosts, presumably since the time you un-/replugged it. Check your > switch port configuration. Also, make sure that from 10.1.1.66, you > can not only ping 10.1.1.64 and 10.1.1.65, but make a TCP connection > on port 6789. With that out of the way, check your mon log on > ceph-mon-03 (in /var/log/ceph/mon); it should provide some additional > insight into the problem. > > Cheers, > Florian > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU
Hi Florian, > On 17 Sep 2014, at 17:09, Florian Haas wrote: > > Hi Craig, > > just dug this up in the list archives. > > On Fri, Mar 28, 2014 at 2:04 AM, Craig Lewis > wrote: >> In the interest of removing variables, I removed all snapshots on all pools, >> then restarted all ceph daemons at the same time. This brought up osd.8 as >> well. > > So just to summarize this: your 100% CPU problem at the time went away > after you removed all snapshots, and the actual cause of the issue was > never found? > > I am seeing a similar issue now, and have filed > http://tracker.ceph.com/issues/9503 to make sure it doesn't get lost > again. Can you take a look at that issue and let me know if anything > in the description sounds familiar? Could your ticket be related to the snap trimming issue I’ve finally narrowed down in the past couple days? http://tracker.ceph.com/issues/9487 Bump up debug_osd to 20 then check the log during one of your incidents. If it is busy logging the snap_trimmer messages, then it’s the same issue. (The issue is that rbd pools have many purged_snaps, but sometimes after backfilling a PG the purged_snaps list is lost and thus the snap trimmer becomes very busy whilst re-trimming thousands of snaps. During that time (a few minutes on my cluster) the OSD is blocked.) Cheers, Dan > > You mentioned in a later message in the same thread that you would > keep your snapshot script running and "repeat the experiment". Did the > situation change in any way after that? Did the issue come back? Or > did you just stop using snapshots altogether? > > Cheers, > Florian > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
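If it helps, the debug level can also be raised temporarily at runtime rather than via ceph.conf plus a restart (osd.8 below is only an example id; remember to drop the level back afterwards, since level 20 logging is very chatty):

ceph tell osd.8 injectargs '--debug-osd 20'
# reproduce the incident, then check /var/log/ceph/ceph-osd.8.log for snap_trimmer messages
ceph tell osd.8 injectargs '--debug-osd 0/5'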
Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU
On Wed, Sep 17, 2014 at 5:24 PM, Dan Van Der Ster wrote: > Hi Florian, > >> On 17 Sep 2014, at 17:09, Florian Haas wrote: >> >> Hi Craig, >> >> just dug this up in the list archives. >> >> On Fri, Mar 28, 2014 at 2:04 AM, Craig Lewis >> wrote: >>> In the interest of removing variables, I removed all snapshots on all pools, >>> then restarted all ceph daemons at the same time. This brought up osd.8 as >>> well. >> >> So just to summarize this: your 100% CPU problem at the time went away >> after you removed all snapshots, and the actual cause of the issue was >> never found? >> >> I am seeing a similar issue now, and have filed >> http://tracker.ceph.com/issues/9503 to make sure it doesn't get lost >> again. Can you take a look at that issue and let me know if anything >> in the description sounds familiar? > > > Could your ticket be related to the snap trimming issue I’ve finally narrowed > down in the past couple days? > > http://tracker.ceph.com/issues/9487 > > Bump up debug_osd to 20 then check the log during one of your incidents. If > it is busy logging the snap_trimmer messages, then it’s the same issue. (The > issue is that rbd pools have many purged_snaps, but sometimes after > backfilling a PG the purged_snaps list is lost and thus the snap trimmer > becomes very busy whilst re-trimming thousands of snaps. During that time (a > few minutes on my cluster) the OSD is blocked.) That sounds promising, thank you! debug_osd=10 should actually be sufficient as those snap_trim messages get logged at that level. :) Do I understand your issue report correctly in that you have found setting osd_snap_trim_sleep to be ineffective, because it's being applied when iterating from PG to PG, rather than from snap to snap? If so, then I'm guessing that that can hardly be intentional... Cheers, Florian ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU
Hi, (Sorry for top posting, mobile now). That's exactly what I observe -- one sleep per PG. The problem is that the sleep can't simply be moved since AFAICT the whole PG is locked for the duration of the trimmer. So the options I proposed are to limit the number of snaps trimmed per call to e.g 16, or to fix the loss of purged_snaps after backfilling. Actually, probably both of those are needed. But a real dev would know better. Cheers, Dan From: Florian Haas Sent: Sep 17, 2014 5:33 PM To: Dan Van Der Ster Cc: Craig Lewis ;ceph-users@lists.ceph.com Subject: Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU On Wed, Sep 17, 2014 at 5:24 PM, Dan Van Der Ster wrote: > Hi Florian, > >> On 17 Sep 2014, at 17:09, Florian Haas wrote: >> >> Hi Craig, >> >> just dug this up in the list archives. >> >> On Fri, Mar 28, 2014 at 2:04 AM, Craig Lewis >> wrote: >>> In the interest of removing variables, I removed all snapshots on all pools, >>> then restarted all ceph daemons at the same time. This brought up osd.8 as >>> well. >> >> So just to summarize this: your 100% CPU problem at the time went away >> after you removed all snapshots, and the actual cause of the issue was >> never found? >> >> I am seeing a similar issue now, and have filed >> http://tracker.ceph.com/issues/9503 to make sure it doesn't get lost >> again. Can you take a look at that issue and let me know if anything >> in the description sounds familiar? > > > Could your ticket be related to the snap trimming issue I’ve finally narrowed > down in the past couple days? > > http://tracker.ceph.com/issues/9487 > > Bump up debug_osd to 20 then check the log during one of your incidents. If > it is busy logging the snap_trimmer messages, then it’s the same issue. (The > issue is that rbd pools have many purged_snaps, but sometimes after > backfilling a PG the purged_snaps list is lost and thus the snap trimmer > becomes very busy whilst re-trimming thousands of snaps. During that time (a > few minutes on my cluster) the OSD is blocked.) That sounds promising, thank you! debug_osd=10 should actually be sufficient as those snap_trim messages get logged at that level. :) Do I understand your issue report correctly in that you have found setting osd_snap_trim_sleep to be ineffective, because it's being applied when iterating from PG to PG, rather than from snap to snap? If so, then I'm guessing that that can hardly be intentional... Cheers, Florian ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU
On Wed, Sep 17, 2014 at 5:42 PM, Dan Van Der Ster wrote: > From: Florian Haas > Sent: Sep 17, 2014 5:33 PM > To: Dan Van Der Ster > Cc: Craig Lewis ;ceph-users@lists.ceph.com > Subject: Re: [ceph-users] RGW hung, 2 OSDs using 100% CPU > > On Wed, Sep 17, 2014 at 5:24 PM, Dan Van Der Ster > wrote: >> Hi Florian, >> >>> On 17 Sep 2014, at 17:09, Florian Haas wrote: >>> >>> Hi Craig, >>> >>> just dug this up in the list archives. >>> >>> On Fri, Mar 28, 2014 at 2:04 AM, Craig Lewis >>> wrote: In the interest of removing variables, I removed all snapshots on all pools, then restarted all ceph daemons at the same time. This brought up osd.8 as well. >>> >>> So just to summarize this: your 100% CPU problem at the time went away >>> after you removed all snapshots, and the actual cause of the issue was >>> never found? >>> >>> I am seeing a similar issue now, and have filed >>> http://tracker.ceph.com/issues/9503 to make sure it doesn't get lost >>> again. Can you take a look at that issue and let me know if anything >>> in the description sounds familiar? >> >> >> Could your ticket be related to the snap trimming issue I’ve finally >> narrowed down in the past couple days? >> >> http://tracker.ceph.com/issues/9487 >> >> Bump up debug_osd to 20 then check the log during one of your incidents. >> If it is busy logging the snap_trimmer messages, then it’s the same issue. >> (The issue is that rbd pools have many purged_snaps, but sometimes after >> backfilling a PG the purged_snaps list is lost and thus the snap trimmer >> becomes very busy whilst re-trimming thousands of snaps. During that time (a >> few minutes on my cluster) the OSD is blocked.) > > That sounds promising, thank you! debug_osd=10 should actually be > sufficient as those snap_trim messages get logged at that level. :) > > Do I understand your issue report correctly in that you have found > setting osd_snap_trim_sleep to be ineffective, because it's being > applied when iterating from PG to PG, rather than from snap to snap? > If so, then I'm guessing that that can hardly be intentional... > > Cheers, > Florian > > Hi, > (Sorry for top posting, mobile now). I've taken the liberty to reformat. :) > That's exactly what I observe -- one sleep per PG. The problem is that the > sleep can't simply be moved since AFAICT the whole PG is locked for the > duration of the trimmer. So the options I proposed are to limit the number > of snaps trimmed per call to e.g 16, or to fix the loss of purged_snaps > after backfilling. Actually, probably both of those are needed. But a real > dev would know better. Okay. Certainly worth a try. Thanks again! I'll let you know when I know more. Cheers, Florian ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
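For reference, the throttle being discussed is the osd_snap_trim_sleep option; a sketch of setting it, either persistently or at runtime, looks like this (0.05 s is an arbitrary example value, and per Dan's observation the sleep currently applies per PG rather than per trimmed snap, so it may not help much with this particular problem):

# ceph.conf, [osd] section
osd snap trim sleep = 0.05

# or injected at runtime on all OSDs
ceph tell osd.* injectargs '--osd-snap-trim-sleep 0.05'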
Re: [ceph-users] monitor quorum
On Wed, Sep 17, 2014 at 5:21 PM, James Eckersall wrote: > Hi, > > Thanks for the advice. > > I feel pretty dumb as it does indeed look like a simple networking issue. > You know how you check things 5 times and miss the most obvious one... > > J No worries at all .:) Cheers, Florian ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] monitor quorum
Hi, Now I feel dumb for jumping to the conclusion that it was a simple networking issue - it isn't. I've just checked connectivity properly and I can ping and telnet 6789 from all mon servers to all other mon servers. I've just restarted the mon03 service and the log is showing the following: 2014-09-17 16:49:02.355148 7f7ef9f8c800 0 starting mon.ceph-mon-03 rank 2 at 10.1.1.66:6789/0 mon_data /var/lib/ceph/mon/ceph-ceph-mon-03 fsid 74069c87-b361-4bb8-8ce8-6ae9deb8a9bd 2014-09-17 16:49:02.355375 7f7ef9f8c800 1 mon.ceph-mon-03@-1(probing) e2 preinit fsid 74069c87-b361-4bb8-8ce8-6ae9deb8a9bd 2014-09-17 16:49:02.356347 7f7ef9f8c800 1 mon.ceph-mon-03@-1(probing).paxosservice(pgmap 18241250..18241952) refresh upgraded, format 0 -> 1 2014-09-17 16:49:02.356360 7f7ef9f8c800 1 mon.ceph-mon-03@-1(probing).pg v0 on_upgrade discarding in-core PGMap 2014-09-17 16:49:02.400316 7f7ef9f8c800 0 mon.ceph-mon-03@-1(probing).mds e1 print_map epoch 1 flags 0 created 2013-12-09 10:19:58.534310 modified 2013-12-09 10:19:58.534332 tableserver 0 root 0 session_timeout 60 session_autoclose 300 max_file_size 1099511627776 last_failure 0 last_failure_osd_epoch 0 compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding} max_mds 1 in up {} failed stopped data_pools 0 metadata_pool 1 inline_data disabled 2014-09-17 16:49:02.402373 7f7ef9f8c800 0 mon.ceph-mon-03@-1(probing).osd e49212 crush map has features 1107558400, adjusting msgr requires 2014-09-17 16:49:02.402384 7f7ef9f8c800 0 mon.ceph-mon-03@-1(probing).osd e49212 crush map has features 1107558400, adjusting msgr requires 2014-09-17 16:49:02.402386 7f7ef9f8c800 0 mon.ceph-mon-03@-1(probing).osd e49212 crush map has features 1107558400, adjusting msgr requires 2014-09-17 16:49:02.402388 7f7ef9f8c800 0 mon.ceph-mon-03@-1(probing).osd e49212 crush map has features 1107558400, adjusting msgr requires 2014-09-17 16:49:02.403725 7f7ef9f8c800 1 mon.ceph-mon-03@-1(probing).paxosservice(auth 26001..26154) refresh upgraded, format 0 -> 1 2014-09-17 16:49:02.404834 7f7ef9f8c800 0 mon.ceph-mon-03@-1(probing) e2 my rank is now 2 (was -1) 2014-09-17 16:49:02.407439 7f7ef331b700 1 mon.ceph-mon-03@2(synchronizing) e2 sync_obtain_latest_monmap 2014-09-17 16:49:02.407588 7f7ef331b700 1 mon.ceph-mon-03@2(synchronizing) e2 sync_obtain_latest_monmap obtained monmap e2 2014-09-17 16:49:09.514365 7f7ef331b700 0 log [INF] : mon.ceph-mon-03 calling new monitor election 2014-09-17 16:49:09.514523 7f7ef331b700 1 mon.ceph-mon-03@2(electing).elector(931) init, last seen epoch 931 2014-09-17 16:49:09.514658 7f7ef331b700 1 mon.ceph-mon-03@2(electing).paxos(paxos recovering c 31223899..31224482) is_readable now=2014-09-17 16:49:09.514659 lease_expire=0.00 has v0 lc 31224482 2014-09-17 16:49:09.514665 7f7ef331b700 1 mon.ceph-mon-03@2(electing).paxos(paxos recovering c 31223899..31224482) is_readable now=2014-09-17 16:49:09.514666 lease_expire=0.00 has v0 lc 31224482 2014-09-17 16:49:15.533876 7f7ef3b1c700 1 mon.ceph-mon-03@2(electing).elector(933) init, last seen epoch 933 2014-09-17 16:49:21.578269 7f7ef3b1c700 1 mon.ceph-mon-03@2(electing).elector(935) init, last seen epoch 935 2014-09-17 16:49:26.578526 7f7ef3b1c700 1 mon.ceph-mon-03@2(electing).elector(935) init, last seen epoch 935 2014-09-17 16:49:31.578790 7f7ef3b1c700 1 mon.ceph-mon-03@2(electing).elector(935) init, last seen epoch 935 2014-09-17 16:49:36.579044 7f7ef3b1c700 1 mon.ceph-mon-03@2(electing).elector(935) init, 
last seen epoch 935 The last lines about "electing" repeat forever. The other mons are logging far more entries than I have seen them log before. They look like the following (note the timestamps - all of these log lines are from just a 2 second period): 2014-09-17 16:55:10.019407 7fd5a479a700 1 mon.ceph-mon-02@1(peon).paxos(paxos active c 31224401..31225038) is_readable now=2014-09-17 16:55:10.019408 lease_expire=2014-09-17 16:55:14.518716 has v0 lc 31225038 2014-09-17 16:55:10.019418 7fd5a479a700 1 mon.ceph-mon-02@1(peon).paxos(paxos active c 31224401..31225038) is_readable now=2014-09-17 16:55:10.019418 lease_expire=2014-09-17 16:55:14.518716 has v0 lc 31225038 2014-09-17 16:55:10.180220 7fd5a479a700 1 mon.ceph-mon-02@1(peon).paxos(paxos active c 31224401..31225038) is_readable now=2014-09-17 16:55:10.180222 lease_expire=2014-09-17 16:55:14.518716 has v0 lc 31225038 2014-09-17 16:55:10.180233 7fd5a479a700 1 mon.ceph-mon-02@1(peon).paxos(paxos active c 31224401..31225038) is_readable now=2014-09-17 16:55:10.180234 lease_expire=2014-09-17 16:55:14.518716 has v0 lc 31225038 2014-09-17 16:55:10.192668 7fd5a479a700 1 mon.ceph-mon-02@1(peon).paxos(paxos active c 31224401..31225038) is_readable now=2014-09-17 16:55:10.192670 lease_expire=2014-09-17 16:55:14.518716 has v0 lc 31225038 2014-09-17 16:55:10.192691 7fd5a479a700 1 mon.ceph-mon-02@1(peon).paxos(paxos active c 31224401..3122
Re: [ceph-users] [Ceph-community] Can't Start-up MDS
That looks like the beginning of an mds creation to me. What's your problem in more detail, and what's the output of "ceph -s"? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Sep 15, 2014 at 5:34 PM, Shun-Fa Yang wrote: > Hi all, > > I'm installed ceph v 0.80.5 on Ubuntu 14.04 server version by using > apt-get... > > The log of mds shows as following: > > 2014-09-15 17:24:58.291305 7fd6f6d47800 0 ceph version 0.80.5 > (38b73c67d375a2552d8ed67843c8a65c2c0feba6), process ceph-mds, pid 10487 > > 2014-09-15 17:24:58.302164 7fd6f6d47800 -1 mds.-1.0 *** no OSDs are up as of > epoch 8, waiting > > 2014-09-15 17:25:08.302930 7fd6f6d47800 -1 mds.-1.-1 *** no OSDs are up as > of epoch 8, waiting > > 2014-09-15 17:25:19.322092 7fd6f1938700 1 mds.-1.0 handle_mds_map standby > > 2014-09-15 17:25:19.325024 7fd6f1938700 1 mds.0.3 handle_mds_map i am now > mds.0.3 > > 2014-09-15 17:25:19.325026 7fd6f1938700 1 mds.0.3 handle_mds_map state > change up:standby --> up:creating > > 2014-09-15 17:25:19.325196 7fd6f1938700 0 mds.0.cache creating system inode > with ino:1 > > 2014-09-15 17:25:19.325377 7fd6f1938700 0 mds.0.cache creating system inode > with ino:100 > > 2014-09-15 17:25:19.325381 7fd6f1938700 0 mds.0.cache creating system inode > with ino:600 > > 2014-09-15 17:25:19.325449 7fd6f1938700 0 mds.0.cache creating system inode > with ino:601 > > 2014-09-15 17:25:19.325489 7fd6f1938700 0 mds.0.cache creating system inode > with ino:602 > > 2014-09-15 17:25:19.325538 7fd6f1938700 0 mds.0.cache creating system inode > with ino:603 > > 2014-09-15 17:25:19.325564 7fd6f1938700 0 mds.0.cache creating system inode > with ino:604 > > 2014-09-15 17:25:19.325603 7fd6f1938700 0 mds.0.cache creating system inode > with ino:605 > > 2014-09-15 17:25:19.325627 7fd6f1938700 0 mds.0.cache creating system inode > with ino:606 > > 2014-09-15 17:25:19.325655 7fd6f1938700 0 mds.0.cache creating system inode > with ino:607 > > 2014-09-15 17:25:19.325682 7fd6f1938700 0 mds.0.cache creating system inode > with ino:608 > > 2014-09-15 17:25:19.325714 7fd6f1938700 0 mds.0.cache creating system inode > with ino:609 > > 2014-09-15 17:25:19.325738 7fd6f1938700 0 mds.0.cache creating system inode > with ino:200 > > Could someone tell me how to solve it? > > Thanks. > > -- > 楊順發(yang shun-fa) > > ___ > Ceph-community mailing list > ceph-commun...@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
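For completeness, the cluster-side state Greg is asking about can be pulled on any node with an admin keyring:

ceph -s
ceph osd stat   # confirms whether the "no OSDs are up" condition has cleared
ceph mds stat   # shows whether the MDS is still stuck in up:creating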
[ceph-users] Next Week: Ceph Day San Jose
Hey everyone! We just posted the agenda for next week’s Ceph Day in San Jose: http://ceph.com/cephdays/san-jose/ This Ceph Day will be held in a beautiful facility provided by our friends at Brocade. We have a lot of great speakers from Brocade, Red Hat, Dell, Fujitsu, HGST, and Supermicro, so if you’re in the area we welcome you to join us. To register with a 25% discount, use this link: https://cephdaysanjose.eventbrite.com/?discount=Community We hope to see you there! Cheers, Ross -- Ross Turk Director, Ceph Marketing & Community @rossturk @ceph "Sufficiently advanced technology is indistinguishable from magic." -- Arthur C. Clarke ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] getting ulimit set error while installing ceph in admin node
Hi, any suggestions ? Regards, Subhadip --- On Wed, Sep 17, 2014 at 9:05 AM, Subhadip Bagui wrote: > Hi > > I'm getting the below error while installing ceph in admin node. Please > let me know how to resolve the same. > > > [ceph@ceph-admin ceph-cluster]$* ceph-deploy mon create-initial > ceph-admin* > > > [ceph_deploy.conf][DEBUG ] found configuration file at: > /home/ceph/.cephdeploy.conf > > [ceph_deploy.cli][INFO ] Invoked (1.5.14): /usr/bin/ceph-deploy mon > create-initial ceph-admin > > [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-admin > > [ceph_deploy.mon][DEBUG ] detecting platform for host ceph-admin ... > > [ceph-admin][DEBUG ] connected to host: ceph-admin > > [ceph-admin][DEBUG ] detect platform information from remote host > > [ceph-admin][DEBUG ] detect machine type > > [ceph_deploy.mon][INFO ] distro info: CentOS 6.5 Final > > [ceph-admin][DEBUG ] determining if provided host has same hostname in > remote > > [ceph-admin][DEBUG ] get remote short hostname > > [ceph-admin][DEBUG ] deploying mon to ceph-admin > > [ceph-admin][DEBUG ] get remote short hostname > > [ceph-admin][DEBUG ] remote hostname: ceph-admin > > [ceph-admin][DEBUG ] write cluster configuration to > /etc/ceph/{cluster}.conf > > [ceph-admin][DEBUG ] create the mon path if it does not exist > > [ceph-admin][DEBUG ] checking for done path: > /var/lib/ceph/mon/ceph-ceph-admin/done > > [ceph-admin][DEBUG ] done path does not exist: > /var/lib/ceph/mon/ceph-ceph-admin/done > > [ceph-admin][INFO ] creating keyring file: > /var/lib/ceph/tmp/ceph-ceph-admin.mon.keyring > > [ceph-admin][DEBUG ] create the monitor keyring file > > [ceph-admin][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs > -i ceph-admin --keyring /var/lib/ceph/tmp/ceph-ceph-admin.mon.keyring > > [ceph-admin][DEBUG ] ceph-mon: set fsid to > a36227e3-a39f-41cb-bba1-fea098a4fc65 > > [ceph-admin][DEBUG ] ceph-mon: created monfs at > /var/lib/ceph/mon/ceph-ceph-admin for mon.ceph-admin > > [ceph-admin][INFO ] unlinking keyring file > /var/lib/ceph/tmp/ceph-ceph-admin.mon.keyring > > [ceph-admin][DEBUG ] create a done file to avoid re-doing the mon > deployment > > [ceph-admin][DEBUG ] create the init path if it does not exist > > [ceph-admin][DEBUG ] locating the `service` executable... > > [ceph-admin][INFO ] Running command: sudo /sbin/service ceph -c > /etc/ceph/ceph.conf start mon.ceph-admin > > [ceph-admin][DEBUG ] === mon.ceph-admin === > > [ceph-admin][DEBUG ] Starting Ceph mon.ceph-admin on ceph-admin... > > [ceph-admin][DEBUG ] *failed: 'ulimit -n 32768; /usr/bin/ceph-mon -i > ceph-admin --pid-file /var/run/ceph/mon.ceph-admin.pid -c > /etc/ceph/ceph.conf --cluster ceph '* > > [ceph-admin][DEBUG ] Starting ceph-create-keys on ceph-admin... > > [ceph-admin][WARNIN] No data was received after 7 seconds, disconnecting... 
> > [ceph-admin][INFO ] Running command: sudo ceph --cluster=ceph > --admin-daemon /var/run/ceph/ceph-mon.ceph-admin.asok mon_status > > [ceph-admin][ERROR ] admin_socket: exception getting command descriptions: > [Errno 2] No such file or directory > > [ceph-admin][WARNIN] monitor: mon.ceph-admin, might not be running yet > > [ceph-admin][INFO ] Running command: sudo ceph --cluster=ceph > --admin-daemon /var/run/ceph/ceph-mon.ceph-admin.asok mon_status > > [ceph-admin][ERROR ] admin_socket: exception getting command descriptions: > [Errno 2] No such file or directory > > [ceph-admin][WARNIN] ceph-admin is not defined in `mon initial members` > > [ceph-admin][WARNIN] monitor ceph-admin does not exist in monmap > > [ceph-admin][WARNIN] neither `public_addr` nor `public_network` keys are > defined for monitors > > [ceph-admin][WARNIN] monitors may not be able to form quorum > [ceph_deploy.mon][INFO ] processing monitor mon.ceph-monitor > > > > Regards, > Subhadip > > --- > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] ceph issue: rbd vs. qemu-kvm
I am trying to use Ceph as a data store with OpenNebula 4.6 and have followed the instructions in OpenNebula's documentation at http://docs.opennebula.org/4.8/administration/storage/ceph_ds.html and compared them against the "using libvirt with ceph" http://ceph.com/docs/master/rbd/libvirt/ We are using the ceph-recompiled qemu-kvm and qemu-img as found at http://ceph.com/packages/qemu-kvm/ under Scientific Linux 6.5 which is a Redhat clone. Also a kernel-lt-3.10 kernel. [root@fgtest15 qemu]# kvm -version QEMU PC emulator version 0.12.1 (qemu-kvm-0.12.1.2), Copyright (c) 2003-2008 Fabrice Bellard From qemu-img Supported formats: raw cow qcow vdi vmdk cloop dmg bochs vpc vvfat qcow2 qed parallels nbd blkdebug host_cdrom host_floppy host_device file rbd -- Libvirt is trying to execute the following KVM command: 2014-09-17 19:50:12.774+: starting up LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name one-60 -S -M rhel6.3.0 -enable-kvm -m 4096 -smp 2,sockets=2,cores=1,threads=1 -uuid 572499bf-07f3-3014-8d6a-dfa1ebb99aa4 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-60.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/var/lib/one//datastores/102/60/disk.1,if=none,id=drive-virtio-disk1,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 -drive file=/var/lib/one//datastores/102/60/disk.2,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=22,id=hostnet0,vhost=on,vhostfd=23 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=54:52:00:02:0b:04,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:60 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 char device redirected to /dev/pts/3 qemu-kvm: -drive file=rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789,if=none,id=drive-virtio-disk0,format=qcow2,cache=none: could not open disk image rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789: Invalid argument 2014-09-17 19:50:12.980+: shutting down --- just to show that from the command line I can see the rbd pool fine [root@fgtest15 qemu]# rbd list one foo one-19 one-19-58-0 one-19-60-0 [root@fgtest15 qemu]# rbd info one/one-19-60-0 rbd image 'one-19-60-0': size 40960 MB in 10240 objects order 22 (4096 kB objects) block_name_prefix: rb.0.3c39.238e1f29 format: 1 and even mount stuff with rbd map, etc. It's only inside libvirt that we had the problem. 
At first we were getting "permission denied" but then I upped the permissions allowed to the libvirt user (client.libvirt2) and then we are just getting "invalid argument" client.libvirt2 key: AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA== caps: [mon] allow r caps: [osd] allow *, allow rwx pool=one -- Any idea why kvm doesn't like the argument I am delivering in the file= argument? Better--does anyone have a working kvm command out of either opennebula or openstack against which I can compare? Thanks Steve Timm -- Steven C. Timm, Ph.D (630) 840-8525 t...@fnal.gov http://home.fnal.gov/~timm/ Fermilab Scientific Computing Division, Scientific Computing Services Quad. Grid and Cloud Services Dept., Associate Dept. Head for Cloud Computing ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
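One sanity check worth trying before digging further into libvirt: ask qemu directly what it thinks of the image. The -drive line above declares format=qcow2, while "rbd info" reports a format 1 RBD image, i.e. a raw block device with no qcow2 header. A minimal check, assuming the admin keyring in /etc/ceph is readable (as it evidently was for the rbd commands):

qemu-img info rbd:one/one-19-60-0

If that reports "file format: raw", then pointing the deployment at format=raw for that disk (or letting it default to raw) is the likely fix.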
Re: [ceph-users] getting ulimit set error while installing ceph in admin node
Subhadip, I updated the master branch of the preflight docs here: http://ceph.com/docs/master/start/ We did encounter some issues that were resolved with those preflight steps. I think it might be either requiretty or SELinux. I will keep you posted. Let me know if it helps. On Wed, Sep 17, 2014 at 12:13 PM, Subhadip Bagui wrote: > Hi, > > any suggestions ? > > Regards, > Subhadip > > --- > > On Wed, Sep 17, 2014 at 9:05 AM, Subhadip Bagui wrote: >> >> Hi >> >> I'm getting the below error while installing ceph in admin node. Please >> let me know how to resolve the same. >> >> >> [ceph@ceph-admin ceph-cluster]$ ceph-deploy mon create-initial ceph-admin >> >> >> [ceph_deploy.conf][DEBUG ] found configuration file at: >> /home/ceph/.cephdeploy.conf >> >> [ceph_deploy.cli][INFO ] Invoked (1.5.14): /usr/bin/ceph-deploy mon >> create-initial ceph-admin >> >> [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-admin >> >> [ceph_deploy.mon][DEBUG ] detecting platform for host ceph-admin ... >> >> [ceph-admin][DEBUG ] connected to host: ceph-admin >> >> [ceph-admin][DEBUG ] detect platform information from remote host >> >> [ceph-admin][DEBUG ] detect machine type >> >> [ceph_deploy.mon][INFO ] distro info: CentOS 6.5 Final >> >> [ceph-admin][DEBUG ] determining if provided host has same hostname in >> remote >> >> [ceph-admin][DEBUG ] get remote short hostname >> >> [ceph-admin][DEBUG ] deploying mon to ceph-admin >> >> [ceph-admin][DEBUG ] get remote short hostname >> >> [ceph-admin][DEBUG ] remote hostname: ceph-admin >> >> [ceph-admin][DEBUG ] write cluster configuration to >> /etc/ceph/{cluster}.conf >> >> [ceph-admin][DEBUG ] create the mon path if it does not exist >> >> [ceph-admin][DEBUG ] checking for done path: >> /var/lib/ceph/mon/ceph-ceph-admin/done >> >> [ceph-admin][DEBUG ] done path does not exist: >> /var/lib/ceph/mon/ceph-ceph-admin/done >> >> [ceph-admin][INFO ] creating keyring file: >> /var/lib/ceph/tmp/ceph-ceph-admin.mon.keyring >> >> [ceph-admin][DEBUG ] create the monitor keyring file >> >> [ceph-admin][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs >> -i ceph-admin --keyring /var/lib/ceph/tmp/ceph-ceph-admin.mon.keyring >> >> [ceph-admin][DEBUG ] ceph-mon: set fsid to >> a36227e3-a39f-41cb-bba1-fea098a4fc65 >> >> [ceph-admin][DEBUG ] ceph-mon: created monfs at >> /var/lib/ceph/mon/ceph-ceph-admin for mon.ceph-admin >> >> [ceph-admin][INFO ] unlinking keyring file >> /var/lib/ceph/tmp/ceph-ceph-admin.mon.keyring >> >> [ceph-admin][DEBUG ] create a done file to avoid re-doing the mon >> deployment >> >> [ceph-admin][DEBUG ] create the init path if it does not exist >> >> [ceph-admin][DEBUG ] locating the `service` executable... >> >> [ceph-admin][INFO ] Running command: sudo /sbin/service ceph -c >> /etc/ceph/ceph.conf start mon.ceph-admin >> >> [ceph-admin][DEBUG ] === mon.ceph-admin === >> >> [ceph-admin][DEBUG ] Starting Ceph mon.ceph-admin on ceph-admin... >> >> [ceph-admin][DEBUG ] failed: 'ulimit -n 32768; /usr/bin/ceph-mon -i >> ceph-admin --pid-file /var/run/ceph/mon.ceph-admin.pid -c >> /etc/ceph/ceph.conf --cluster ceph ' >> >> [ceph-admin][DEBUG ] Starting ceph-create-keys on ceph-admin... >> >> [ceph-admin][WARNIN] No data was received after 7 seconds, >> disconnecting... 
>> >> [ceph-admin][INFO ] Running command: sudo ceph --cluster=ceph >> --admin-daemon /var/run/ceph/ceph-mon.ceph-admin.asok mon_status >> >> [ceph-admin][ERROR ] admin_socket: exception getting command descriptions: >> [Errno 2] No such file or directory >> >> [ceph-admin][WARNIN] monitor: mon.ceph-admin, might not be running yet >> >> [ceph-admin][INFO ] Running command: sudo ceph --cluster=ceph >> --admin-daemon /var/run/ceph/ceph-mon.ceph-admin.asok mon_status >> >> [ceph-admin][ERROR ] admin_socket: exception getting command descriptions: >> [Errno 2] No such file or directory >> >> [ceph-admin][WARNIN] ceph-admin is not defined in `mon initial members` >> >> [ceph-admin][WARNIN] monitor ceph-admin does not exist in monmap >> >> [ceph-admin][WARNIN] neither `public_addr` nor `public_network` keys are >> defined for monitors >> >> [ceph-admin][WARNIN] monitors may not be able to form quorum >> >> [ceph_deploy.mon][INFO ] processing monitor mon.ceph-monitor >> >> >> >> Regards, >> Subhadip >> >> --- > > > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- John Wilkins Senior Technical Writer Inktank john.wilk...@inktank.com (415) 425-9599 http://inktank.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
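If it is one of those two, the usual checks on a CentOS 6 node look roughly like this (a sketch only, since the exact setup is unknown; the sudoers change is the standard workaround for ceph-deploy's non-tty sudo calls, and setenforce is a temporary test rather than a fix):

# check whether sudo demands a tty (look for "Defaults requiretty")
sudo grep requiretty /etc/sudoers
# relax it for the deploying user only, via visudo:
#   Defaults:ceph !requiretty
# check SELinux
getenforce
sudo setenforce 0   # temporarily permissive, to see whether the mon then starts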
Re: [ceph-users] radosgw-admin pools list error
Does radosgw-admin have authentication keys available and with appropriate permissions? http://ceph.com/docs/master/radosgw/config/#create-a-user-and-keyring On Fri, Sep 12, 2014 at 3:13 AM, Santhosh Fernandes wrote: > Hi, > > Anyone help me why my radosgw-admin pool list give me this error > > #radosgw-admin pools list > couldn't init storage provider > > But the rados lspools list all the pools, > > Regards, > Santhosh > > > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > -- John Wilkins Senior Technical Writer Inktank john.wilk...@inktank.com (415) 425-9599 http://inktank.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
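For reference, the page linked above boils down to roughly the following, run on a node that has the admin keyring; client.radosgw.gateway is just the example instance name from the docs, so substitute whatever [client.radosgw.*] section your ceph.conf actually defines:

sudo ceph-authtool --create-keyring /etc/ceph/ceph.client.radosgw.keyring
sudo chmod +r /etc/ceph/ceph.client.radosgw.keyring
sudo ceph-authtool /etc/ceph/ceph.client.radosgw.keyring -n client.radosgw.gateway --gen-key
sudo ceph-authtool -n client.radosgw.gateway --cap osd 'allow rwx' --cap mon 'allow rwx' /etc/ceph/ceph.client.radosgw.keyring
sudo ceph auth add client.radosgw.gateway -i /etc/ceph/ceph.client.radosgw.keyring

After that, radosgw-admin needs to be told which identity to use, e.g. radosgw-admin pools list -n client.radosgw.gateway.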
Re: [ceph-users] ceph issue: rbd vs. qemu-kvm
Hi, >From the ones we managed to configure in our lab here. I noticed that using >image format "raw" instead of "qcow2" worked for us. Regards, Luke -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Steven Timm Sent: Thursday, 18 September, 2014 5:01 AM To: ceph-users@lists.ceph.com Subject: [ceph-users] ceph issue: rbd vs. qemu-kvm I am trying to use Ceph as a data store with OpenNebula 4.6 and have followed the instructions in OpenNebula's documentation at http://docs.opennebula.org/4.8/administration/storage/ceph_ds.html and compared them against the "using libvirt with ceph" http://ceph.com/docs/master/rbd/libvirt/ We are using the ceph-recompiled qemu-kvm and qemu-img as found at http://ceph.com/packages/qemu-kvm/ under Scientific Linux 6.5 which is a Redhat clone. Also a kernel-lt-3.10 kernel. [root@fgtest15 qemu]# kvm -version QEMU PC emulator version 0.12.1 (qemu-kvm-0.12.1.2), Copyright (c) 2003-2008 Fabrice Bellard >From qemu-img Supported formats: raw cow qcow vdi vmdk cloop dmg bochs vpc vvfat qcow2 qed parallels nbd blkdebug host_cdrom host_floppy host_device file rbd -- Libvirt is trying to execute the following KVM command: 2014-09-17 19:50:12.774+: starting up LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name one-60 -S -M rhel6.3.0 -enable-kvm -m 4096 -smp 2,sockets=2,cores=1,threads=1 -uuid 572499bf-07f3-3014-8d6a-dfa1ebb99aa4 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-60.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/var/lib/one//datastores/102/60/disk.1,if=none,id=drive-virtio-disk1,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 -drive file=/var/lib/one//datastores/102/60/disk.2,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=22,id=hostnet0,vhost=on,vhostfd=23 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=54:52:00:02:0b:04,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:60 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 char device redirected to /dev/pts/3 qemu-kvm: -drive file=rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789,if=none,id=drive-virtio-disk0,format=qcow2,cache=none: could not open disk image rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789: Invalid argument 2014-09-17 19:50:12.980+: shutting down --- just to show that from the command line I can see the rbd pool fine [root@fgtest15 qemu]# rbd list one foo one-19 one-19-58-0 one-19-60-0 [root@fgtest15 qemu]# rbd info one/one-19-60-0 rbd image 'one-19-60-0': size 40960 MB in 10240 objects order 22 (4096 kB objects) block_name_prefix: 
rb.0.3c39.238e1f29 format: 1 and even mount stuff with rbd map, etc. It's only inside libvirt that we had the problem. At first we were getting "permission denied" but then I upped the permissions allowed to the libvirt user (client.libvirt2) and then we are just getting "invalid argument" client.libvirt2 key: AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA== caps: [mon] allow r caps: [osd] allow *, allow rwx pool=one -- Any idea why kvm doesn't like the argument I am delivering in the file= argument? Better--does anyone have a working kvm command out of either opennebula or openstack against which I can compare? Thanks Steve Timm -- Steven C. Timm, Ph.D (630) 840-8525 t...@fnal.gov http://home.fnal.gov/~timm/ Fermilab Scientific Computing Division, Scientific Computing Services Quad. Grid and Cloud Services Dept., Associate Dept. Head for Cloud Computing ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
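One detail that stands out in the libvirt-generated command above is format=qcow2 on the rbd drive, while the image itself is an RBD volume that normally holds raw data in the ceph datastore. Telling qemu to treat it as qcow2 can fail with exactly this kind of "Invalid argument", which is what Luke's raw-vs-qcow2 observation points at. In libvirt terms the disk would need a raw driver type, roughly like this sketch (the secret uuid and target device are placeholders; OpenNebula normally generates this XML from the image's registered format, so the real fix is to register the image as raw rather than qcow2):

<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <auth username='libvirt2'>
    <secret type='ceph' uuid='UUID-OF-THE-LIBVIRT-SECRET'/>
  </auth>
  <source protocol='rbd' name='one/one-19-60-0'>
    <host name='stkendca01a' port='6789'/>
    <host name='stkendca02a' port='6789'/>
    <host name='stkendca04a' port='6789'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>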
Re: [ceph-users] radosgw-admin pools list error
Hi John, I specified the name and then got this error. #radosgw-admin pools list -n client.radosgw.in-west-1 could not list placement set: (2) No such file or directory Regards, Santhosh On Thu, Sep 18, 2014 at 3:44 AM, John Wilkins wrote: > Does radosgw-admin have authentication keys available and with > appropriate permissions? > > http://ceph.com/docs/master/radosgw/config/#create-a-user-and-keyring > > On Fri, Sep 12, 2014 at 3:13 AM, Santhosh Fernandes > wrote: > > Hi, > > > > Anyone help me why my radosgw-admin pool list give me this error > > > > #radosgw-admin pools list > > couldn't init storage provider > > > > But the rados lspools list all the pools, > > > > Regards, > > Santhosh > > > > > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > > > > -- > John Wilkins > Senior Technical Writer > Inktank > john.wilk...@inktank.com > (415) 425-9599 > http://inktank.com > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
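The two errors may be at different layers, so it is probably worth separating them before digging further; a couple of checks, assuming your radosgw-admin build has these subcommands:

# is the key registered and readable under that name?
ceph auth get client.radosgw.in-west-1
# can that identity read the region/zone configuration it expects?
radosgw-admin regions list -n client.radosgw.in-west-1
radosgw-admin zone get -n client.radosgw.in-west-1

If "zone get" also comes back with No such file or directory, the zone/placement configuration for that gateway instance has not been written yet, which would be consistent with "could not list placement set".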
[ceph-users] ceph mds unable to start with 0.85
dear, my ceph cluster worked for about two weeks, mds crashed every 2-3 days, Now it stuck on replay , looks like replay crash and restart mds process again what can i do for this? 1015 => # ceph -s cluster 07df7765-c2e7-44de-9bb3-0b13f6517b18 health HEALTH_ERR 56 pgs inconsistent; 56 scrub errors; mds cluster is degraded; noscrub,nodeep-scrub flag(s) set monmap e1: 2 mons at {storage-1-213=10.1.0.213:6789/0,storage-1-214=10.1.0.214:6789/0}, election epoch 26, quorum 0,1 storage-1-213,storage-1-214 mdsmap e624: 1/1/1 up {0=storage-1-214=up:replay}, 1 up:standby osdmap e1932: 18 osds: 18 up, 18 in flags noscrub,nodeep-scrub pgmap v732381: 500 pgs, 3 pools, 2155 GB data, 39187 kobjects 4479 GB used, 32292 GB / 36772 GB avail 444 active+clean 56 active+clean+inconsistent client io 125 MB/s rd, 31 op/s MDS log here: 014-09-18 12:36:10.684841 7f8240512700 5 mds.-1.-1 handle_mds_map epoch 620 from mon.0 2014-09-18 12:36:10.684888 7f8240512700 1 mds.-1.0 handle_mds_map standby 2014-09-18 12:38:55.584370 7f8240512700 5 mds.-1.0 handle_mds_map epoch 621 from mon.0 2014-09-18 12:38:55.584432 7f8240512700 1 mds.0.272 handle_mds_map i am now mds.0.272 2014-09-18 12:38:55.584436 7f8240512700 1 mds.0.272 handle_mds_map state change up:standby --> up:replay 2014-09-18 12:38:55.584440 7f8240512700 1 mds.0.272 replay_start 2014-09-18 12:38:55.584456 7f8240512700 7 mds.0.cache set_recovery_set 2014-09-18 12:38:55.584460 7f8240512700 1 mds.0.272 recovery set is 2014-09-18 12:38:55.584464 7f8240512700 1 mds.0.272 need osdmap epoch 1929, have 1927 2014-09-18 12:38:55.584467 7f8240512700 1 mds.0.272 waiting for osdmap 1929 (which blacklists prior instance) 2014-09-18 12:38:55.584523 7f8240512700 5 mds.0.272 handle_mds_failure for myself; not doing anything 2014-09-18 12:38:55.585662 7f8240512700 2 mds.0.272 boot_start 0: opening inotable 2014-09-18 12:38:55.585864 7f8240512700 2 mds.0.272 boot_start 0: opening sessionmap 2014-09-18 12:38:55.586003 7f8240512700 2 mds.0.272 boot_start 0: opening mds log 2014-09-18 12:38:55.586049 7f8240512700 5 mds.0.log open discovering log bounds 2014-09-18 12:38:55.586136 7f8240512700 2 mds.0.272 boot_start 0: opening snap table 2014-09-18 12:38:55.586984 7f8240512700 5 mds.0.272 ms_handle_connect on 10.1.0.213:6806/6114 2014-09-18 12:38:55.587037 7f8240512700 5 mds.0.272 ms_handle_connect on 10.1.0.213:6811/6385 2014-09-18 12:38:55.587285 7f8240512700 5 mds.0.272 ms_handle_connect on 10.1.0.213:6801/6110 2014-09-18 12:38:55.591700 7f823ca08700 4 mds.0.log Waiting for journal 200 to recover... 2014-09-18 12:38:55.593297 7f8240512700 5 mds.0.272 ms_handle_connect on 10.1.0.214:6806/6238 2014-09-18 12:38:55.600952 7f823ca08700 4 mds.0.log Journal 200 recovered. 
2014-09-18 12:38:55.600967 7f823ca08700 4 mds.0.log Recovered journal 200 in format 1 2014-09-18 12:38:55.600973 7f823ca08700 2 mds.0.272 boot_start 1: loading/discovering base inodes 2014-09-18 12:38:55.600979 7f823ca08700 0 mds.0.cache creating system inode with ino:100 2014-09-18 12:38:55.601279 7f823ca08700 0 mds.0.cache creating system inode with ino:1 2014-09-18 12:38:55.602557 7f8240512700 5 mds.0.272 ms_handle_connect on 10.1.0.214:6811/6276 2014-09-18 12:38:55.607234 7f8240512700 2 mds.0.272 boot_start 2: replaying mds log 2014-09-18 12:38:55.675025 7f823ca08700 7 mds.0.cache adjust_subtree_auth -1,-2 -> -2,-2 on [dir 1 / [2,head] auth v=0 cv=0/0 state=1073741824 f() n() hs=0+0,ss=0+0 0x5da] 2014-09-18 12:38:55.675055 7f823ca08700 7 mds.0.cache current root is [dir 1 / [2,head] auth v=0 cv=0/0 state=1073741824 f() n() hs=0+0,ss=0+0 | subtree=1 0x5da] 2014-09-18 12:38:55.675065 7f823ca08700 7 mds.0.cache adjust_subtree_auth -1,-2 -> -2,-2 on [dir 100 ~mds0/ [2,head] auth v=0 cv=0/0 state=1073741824 f() n() hs=0+0,ss=0+0 0x5da03b8] 2014-09-18 12:38:55.675076 7f823ca08700 7 mds.0.cache current root is [dir 100 ~mds0/ [2,head] auth v=0 cv=0/0 state=1073741824 f() n() hs=0+0,ss=0+0 | subtree=1 0x5da03b8] 2014-09-18 12:38:55.675087 7f823ca08700 7 mds.0.cache adjust_bounded_subtree_auth -2,-2 -> 0,-2 on [dir 1 / [2,head] auth v=1076158 cv=0/0 dir_auth=-2 state=1073741824 f(v0 m2014-09-09 17:49:20.00 1=0+1) n(v87567 rc2014-09-16 12:44:41.750069 b1824476527135 31747410=31708953+38457)/n(v87567 rc2014-09-16 12:44:38.450226 b1824464654503 31746894=31708437+38457) hs=0+0,ss=0+0 | subtree=1 0x5da] bound_dfs [] 2014-09-18 12:38:55.675116 7f823ca08700 7 mds.0.cache adjust_bounded_subtree_auth -2,-2 -> 0,-2 on [dir 1 / [2,head] auth v=1076158 cv=0/0 dir_auth=-2 state=1073741824 f(v0 m2014-09-09 17:49:20.00 1=0+1) n(v87567 rc2014-09-16 12:44:41.750069 b1824476527135 31747410=31708953+38457)/n(v87567 rc2014-09-16 12:44:38.450226 b1824464654503 31746894=31708437+38457) hs=0+0,ss=0+0 | subtree=1 0x5da] bounds 2014-09-18 12:38:55.675129 7f823ca08700 7 mds.0.cache current root is [dir 1 / [2,head] auth v=1076158 cv=0/0 dir_auth=-2 state=1073741824 f(v0 m2014-09-09 17:49:20
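It is probably worth treating the 56 inconsistent PGs and the replay loop as two separate problems. A rough first pass, before filing a bug with a full log, might look like:

# list the inconsistent PGs and the OSDs they map to
ceph health detail | grep inconsistent
# after checking the corresponding OSD logs for the scrub errors, repair them one at a time
ceph pg repair <pgid>

# to capture a useful replay log, raise MDS debugging in ceph.conf on the MDS host, then restart the MDS:
[mds]
debug mds = 20
debug journaler = 20

If the MDS still dies in replay with that logging enabled, the resulting log is what the developers will want to see.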
Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
Have anyone ever testing multi volume performance on a *FULL* SSD setup? We are able to get ~18K IOPS for 4K random read on a single volume with fio (with rbd engine) on a 12x DC3700 Setup, but only able to get ~23K (peak) IOPS even with multiple volumes. Seems the maximum random write performance we can get on the entire cluster is quite close to single volume performance. Thanks Jian -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Sebastien Han Sent: Tuesday, September 16, 2014 9:33 PM To: Alexandre DERUMIER Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS Hi, Thanks for keeping us updated on this subject. dsync is definitely killing the ssd. I don't have much to add, I'm just surprised that you're only getting 5299 with 0.85 since I've been able to get 6,4K, well I was using the 200GB model, that might explain this. On 12 Sep 2014, at 16:32, Alexandre DERUMIER wrote: > here the results for the intel s3500 > > max performance is with ceph 0.85 + optracker disabled. > intel s3500 don't have d_sync problem like crucial > > %util show almost 100% for read and write, so maybe the ssd disk performance > is the limit. > > I have some stec zeusram 8GB in stock (I used them for zfs zil), I'll try to > bench them next week. > > > > > > > INTEL s3500 > --- > raw disk > > > randread: fio --filename=/dev/sdb --direct=1 --rw=randread --bs=4k > --iodepth=32 --group_reporting --invalidate=0 --name=abc > --ioengine=aio bw=288207KB/s, iops=72051 > > Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz > avgqu-sz await r_await w_await svctm %util > sdb 0,00 0,00 73454,000,00 293816,00 0,00 8,00 > 30,960,420,420,00 0,01 99,90 > > randwrite: fio --filename=/dev/sdb --direct=1 --rw=randwrite --bs=4k > --iodepth=32 --group_reporting --invalidate=0 --name=abc --ioengine=aio > --sync=1 bw=48131KB/s, iops=12032 > Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz > avgqu-sz await r_await w_await svctm %util > sdb 0,00 0,000,00 24120,00 0,00 48240,00 4,00 > 2,080,090,000,09 0,04 100,00 > > > ceph 0.80 > - > randread: no tuning: bw=24578KB/s, iops=6144 > > > randwrite: bw=10358KB/s, iops=2589 > Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz > avgqu-sz await r_await w_await svctm %util > sdb 0,00 373,000,00 8878,00 0,00 34012,50 7,66 > 1,630,180,000,18 0,06 50,90 > > > ceph 0.85 : > - > > randread : bw=41406KB/s, iops=10351 > Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz > avgqu-sz await r_await w_await svctm %util > sdb 2,00 0,00 10425,000,00 41816,00 0,00 8,02 > 1,360,130,130,00 0,07 75,90 > > randwrite : bw=17204KB/s, iops=4301 > > Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz > avgqu-sz await r_await w_await svctm %util > sdb 0,00 333,000,00 9788,00 0,00 57909,0011,83 > 1,460,150,000,15 0,07 67,80 > > > ceph 0.85 tuning op_tracker=false > > > randread : bw=86537KB/s, iops=21634 > Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz > avgqu-sz await r_await w_await svctm %util > sdb 25,00 0,00 21428,000,00 86444,00 0,00 8,07 > 3,130,150,150,00 0,05 98,00 > > randwrite: bw=21199KB/s, iops=5299 > Device: rrqm/s wrqm/s r/s w/srkB/swkB/s avgrq-sz > avgqu-sz await r_await w_await svctm %util > sdb 0,00 1563,000,00 9880,00 0,00 75223,5015,23 > 2,090,210,000,21 0,07 80,00 > > > - Mail original - > > De: "Alexandre DERUMIER" > À: "Cedric Lemarchand" > Cc: ceph-users@lists.ceph.com > Envoyé: Vendredi 12 Septembre 2014 08:15:08 > Objet: Re: [ceph-users] [Single OSD performance on SSD] Can't go over > 3, 2K IOPS > > 
results of fio on rbd with kernel patch > > > > fio rbd crucial m550 1 osd 0.85 (osd_enable_op_tracker true or false, same > result): > --- > bw=12327KB/s, iops=3081 > > So no much better than before, but this time, iostat show only 15% > utils, and latencies are lower > > Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await > r_await w_await svctm %util sdb 0,00 29,00 0,00 3075,00 0,00 36748,50 > 23,90 0,29 0,10 0,00 0,10 0,05 15,20 > > > So, the write bottleneck seem to be in ceph. > > > > I will send s3500 result today > > - Mail original - > > De: "Alexandre DERUMIER" > À: "Cedric Lemarchand" > Cc: ceph-users@lists.ceph.com > Envoyé: Vendredi 12 Septem
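For anyone trying to reproduce these numbers with fio's rbd engine rather than against the raw device, a minimal job file looks roughly like the sketch below; it assumes a test image already exists, e.g. created with "rbd create fio_test --size 10240 --pool rbd", and that "admin" is the cephx user (without the client. prefix):

[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio_test
rw=randwrite
bs=4k
iodepth=32
runtime=60
time_based

[rbd_iodepth32]

Run it with "fio rbd.fio"; switching rw to randread gives the read side of the comparison.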
Re: [ceph-users] ceph issue: rbd vs. qemu-kvm
hi steven, we ran into issues when trying to use a non-default user ceph user in opennebula (don't remeber what the default was; but it's probably not libvirt2 ), patches are in https://github.com/OpenNebula/one/pull/33, devs sort-of confirmed they will be in 4.8.1. this way you can set CEPH_USER in the datastore template. (but if this is the case, i think that onedatastore list fails to show size of the datastore) stijn On 09/18/2014 04:38 AM, Luke Jing Yuan wrote: Hi, From the ones we managed to configure in our lab here. I noticed that using image format "raw" instead of "qcow2" worked for us. Regards, Luke -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Steven Timm Sent: Thursday, 18 September, 2014 5:01 AM To: ceph-users@lists.ceph.com Subject: [ceph-users] ceph issue: rbd vs. qemu-kvm I am trying to use Ceph as a data store with OpenNebula 4.6 and have followed the instructions in OpenNebula's documentation at http://docs.opennebula.org/4.8/administration/storage/ceph_ds.html and compared them against the "using libvirt with ceph" http://ceph.com/docs/master/rbd/libvirt/ We are using the ceph-recompiled qemu-kvm and qemu-img as found at http://ceph.com/packages/qemu-kvm/ under Scientific Linux 6.5 which is a Redhat clone. Also a kernel-lt-3.10 kernel. [root@fgtest15 qemu]# kvm -version QEMU PC emulator version 0.12.1 (qemu-kvm-0.12.1.2), Copyright (c) 2003-2008 Fabrice Bellard From qemu-img Supported formats: raw cow qcow vdi vmdk cloop dmg bochs vpc vvfat qcow2 qed parallels nbd blkdebug host_cdrom host_floppy host_device file rbd -- Libvirt is trying to execute the following KVM command: 2014-09-17 19:50:12.774+: starting up LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name one-60 -S -M rhel6.3.0 -enable-kvm -m 4096 -smp 2,sockets=2,cores=1,threads=1 -uuid 572499bf-07f3-3014-8d6a-dfa1ebb99aa4 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-60.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/var/lib/one//datastores/102/60/disk.1,if=none,id=drive-virtio-disk1,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 -drive file=/var/lib/one//datastores/102/60/disk.2,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=22,id=hostnet0,vhost=on,vhostfd=23 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=54:52:00:02:0b:04,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:60 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 char device redirected to /dev/pts/3 qemu-kvm: -drive file=rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789,if=none,id=drive-virtio-disk0,format=qcow2,cache=none: could not open disk image 
rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789: Invalid argument 2014-09-17 19:50:12.980+: shutting down --- just to show that from the command line I can see the rbd pool fine [root@fgtest15 qemu]# rbd list one foo one-19 one-19-58-0 one-19-60-0 [root@fgtest15 qemu]# rbd info one/one-19-60-0 rbd image 'one-19-60-0': size 40960 MB in 10240 objects order 22 (4096 kB objects) block_name_prefix: rb.0.3c39.238e1f29 format: 1 and even mount stuff with rbd map, etc. It's only inside libvirt that we had the problem. At first we were getting "permission denied" but then I upped the permissions allowed to the libvirt user (client.libvirt2) and then we are just getting "invalid argument" client.libvirt2 key: AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA== caps: [mon] allow r caps: [osd] allow *, allow rwx pool=one -- Any idea why kvm doesn't like the argument I am delivering in the file= argument? Better--does anyone have a working kvm command out of either opennebula or openstack against which I can compare? Thanks Steve Timm -- Steven C. Timm, Ph.D (630) 840-8525 t...@fnal.gov http://home.fnal.gov/~timm/ Fermilab Scientific Computing Division, Sci
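For completeness, once CEPH_USER is honoured (4.8.1 with the patches above), the non-default user would live in the ceph datastore template, roughly as sketched below; the attribute names are the ones I understand from the 4.8 ceph_ds documentation linked earlier (please double-check against your version), and all values here are only examples for this particular setup:

NAME        = ceph_ds
DS_MAD      = ceph
TM_MAD      = ceph
DISK_TYPE   = RBD
POOL_NAME   = one
CEPH_HOST   = "stkendca01a:6789 stkendca02a:6789 stkendca04a:6789"
CEPH_USER   = libvirt2
CEPH_SECRET = "uuid-of-the-libvirt-secret"
BRIDGE_LIST = "fgtest15"

so that the generated libvirt XML carries the right auth username instead of the default.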
Re: [ceph-users] ceph issue: rbd vs. qemu-kvm
On 2014年09月18日 10:38, Luke Jing Yuan wrote: Hi, From the ones we managed to configure in our lab here. I noticed that using image format "raw" instead of "qcow2" worked for us. Regards, Luke -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Steven Timm Sent: Thursday, 18 September, 2014 5:01 AM To: ceph-users@lists.ceph.com Subject: [ceph-users] ceph issue: rbd vs. qemu-kvm I am trying to use Ceph as a data store with OpenNebula 4.6 and have followed the instructions in OpenNebula's documentation at http://docs.opennebula.org/4.8/administration/storage/ceph_ds.html and compared them against the "using libvirt with ceph" http://ceph.com/docs/master/rbd/libvirt/ We are using the ceph-recompiled qemu-kvm and qemu-img as found at http://ceph.com/packages/qemu-kvm/ under Scientific Linux 6.5 which is a Redhat clone. Also a kernel-lt-3.10 kernel. [root@fgtest15 qemu]# kvm -version QEMU PC emulator version 0.12.1 (qemu-kvm-0.12.1.2), Copyright (c) 2003-2008 Fabrice Bellard From qemu-img Supported formats: raw cow qcow vdi vmdk cloop dmg bochs vpc vvfat qcow2 qed parallels nbd blkdebug host_cdrom host_floppy host_device file rbd -- Libvirt is trying to execute the following KVM command: 2014-09-17 19:50:12.774+: starting up LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name one-60 -S -M rhel6.3.0 -enable-kvm -m 4096 -smp 2,sockets=2,cores=1,threads=1 -uuid 572499bf-07f3-3014-8d6a-dfa1ebb99aa4 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/one-60.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/var/lib/one//datastores/102/60/disk.1,if=none,id=drive-virtio-disk1,format=raw,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 -drive file=/var/lib/one//datastores/102/60/disk.2,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=22,id=hostnet0,vhost=on,vhostfd=23 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=54:52:00:02:0b:04,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:60 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 char device redirected to /dev/pts/3 qemu-kvm: -drive file=rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789,if=none,id=drive-virtio-disk0,format=qcow2,cache=none: could not open disk image rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789: Invalid argument The error is from qemu-kvm. You need to check whether your qemu-kvm supports all the arguments listed above for option "-drive". As you mentioned, the qemu-kvm is built by youself. It's likely that you missed something or the qemu-kvm version is old, and doesn't support some of the arguments. 
Regards, Osier ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
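A quick way to test Osier's point without involving libvirt at all is to let the same qemu build open the image directly with qemu-img, reusing the exact connection string from the failing command (single quotes keep the backslash escapes intact for qemu's rbd option parser):

qemu-img info 'rbd:one/one-19-60-0:id=libvirt2:key=AQAV5BlU2OV7NBAApurqxG0K8UkZlQVy6hKmkA==:auth_supported=cephx\;none:mon_host=stkendca01a\:6789\;stkendca04a\:6789\;stkendca02a\:6789'

If that works, the rbd support in the qemu build is fine and the problem is in what libvirt puts on the command line (the format=qcow2 discussed earlier being the obvious suspect); if it fails the same way, the qemu-kvm package itself is the place to look.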