Re: [ceph-users] [cephfs] About feature 'snapshot'

2016-03-18 Thread Gregory Farnum
Snapshots are disabled by default in Jewel as well. Depending on user
feedback about what's most important, we hope to have them ready for Kraken
or the L release (but we'll see).
-Greg

On Friday, March 18, 2016, 施柏安  wrote:

> Hi John,
> Thank you very much for your help, and sorry for asking such a basic
> question about the setting...
> So isn't this feature ready in Jewel? I found some information saying that
> the features (snapshot, quota, ...) become stable in Jewel.
>
> Thank you
>
> 2016-03-18 21:07 GMT+09:00 John Spray  >:
>
>> On Fri, Mar 18, 2016 at 1:33 AM, 施柏安 > > wrote:
>> > Hi John,
>> > How to set this feature on?
>>
>> ceph mds set allow_new_snaps true --yes-i-really-mean-it
>>
>> John
>>
>> > Thank you
>> >
>> > 2016-03-17 21:41 GMT+08:00 Gregory Farnum > >:
>> >>
>> >> On Thu, Mar 17, 2016 at 3:49 AM, John Spray > > wrote:
>> >> > Snapshots are disabled by default:
>> >> >
>> >> >
>> http://docs.ceph.com/docs/hammer/cephfs/early-adopters/#most-stable-configuration
>> >>
>> >> Which makes me wonder if we ought to be hiding the .snaps directory
>> >> entirely in that case. I haven't previously thought about that, but it
>> >> *is* a bit weird.
>> >> -Greg
>> >>
>> >> >
>> >> > John
>> >> >
>> >> > On Thu, Mar 17, 2016 at 10:02 AM, 施柏安 > > wrote:
>> >> >> Hi all,
>> >> >> I've run into a problem with the cephfs snapshot feature. It seems that the
>> >> >> folder '.snap' exists, but using 'll -a' doesn't show it. And when I enter
>> >> >> that folder and create a folder in it, it reports an error about using
>> >> >> snapshots.
>> >> >>
>> >> >> Please check : http://imgur.com/elZhQvD
>> >> >>
>> >> >>


Re: [ceph-users] ZFS or BTRFS for performance?

2016-03-18 Thread Lionel Bouton
Hi,

On 18/03/2016 20:58, Mark Nelson wrote:
> FWIW, from purely a performance perspective Ceph usually looks pretty
> fantastic on a fresh BTRFS filesystem.  In fact it will probably
> continue to look great until you do small random writes to large
> objects (like say to blocks in an RBD volume).  Then COW starts
> fragmenting the objects into oblivion.  I've seen sequential read
> performance drop by 3x after 5 minutes of 4K random writes to the
> same RBD blocks.
>
> Autodefrag might help.

With 3.19 it wasn't enough for our workload and we had to develop our
own defragmentation scheduler, see https://github.com/jtek/ceph-utils.
We tried autodefrag again with a 4.0.5 kernel but it wasn't good enough
yet (and based on my reading of the linux-btrfs list I don't think any
work is being done on it currently).
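
For anyone wanting to see how bad it gets: filefrag on the OSD's object
files shows the extent counts, and a manual defrag can be triggered per
directory (illustrative commands only, paths are placeholders):

  filefrag /var/lib/ceph/osd/ceph-X/current/<pg>_head/<object-file>
  btrfs filesystem defragment -r /var/lib/ceph/osd/ceph-X/current/<pg>_head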

>   A long time ago I recall Josef told me it was dangerous to use (I
> think it could run the node out of memory and corrupt the FS), but it
> may be that it's safer now.

No problem here (as long as we use our defragmentation scheduler,
otherwise the performance degrades over time/amount of rewrites).

>   In any event we don't really do a lot of testing with BTRFS these
> days as bluestore is indeed the next gen OSD backend.

Will bluestore provide the same protection against bitrot as BTRFS?
I.e.: with BTRFS the deep-scrubs detect inconsistencies *and* the OSD(s)
with invalid data get IO errors when trying to read corrupted data, and
as such can't be used as the source for repairs even if they are primary
OSD(s). So with BTRFS you get pretty good overall protection against
bitrot in Ceph (it allowed us to automate the repair process in the most
common cases). With XFS, IIRC, unless you override the default behavior
the primary OSD is always the source for repairs (even if all the
secondaries agree on another version of the data).
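
(To make that concrete, the automated handling boils down to the usual
scrub/repair sequence -- a simplified sketch with a placeholder PG id,
not our exact script:

  ceph pg deep-scrub 3.1f   # detect the inconsistency; the bad replica gets EIO on BTRFS
  ceph pg repair 3.1f       # rewrite the bad copy from a good replica

The EIO from BTRFS is what keeps a corrupted primary from being used as
the repair source.)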

Best regards,

Lionel


Re: [ceph-users] inconsistent PG -> unfound objects on an erasure coded system

2016-03-18 Thread Jeffrey McDonald
Thanks Sam.

Since I have prepared a script for this, I decided to go ahead with the
checks. (Patience isn't one of my extended attributes.)

I've got a script that searches the full erasure-coded space and does your
checklist below. I have operated only on one PG so far, the 70.459 one
that we've been discussing. There was only one file that I found to be
out of place -- the one we already discussed/found -- and it has been
removed.

The pg is still marked as inconsistent.   I've scrubbed it a couple of
times now and what I've seen is:

2016-03-17 09:29:53.202818 7f2e816f8700  0 log_channel(cluster) log [INF] :
70.459 deep-scrub starts
2016-03-17 09:36:38.436821 7f2e816f8700 -1 log_channel(cluster) log [ERR] :
70.459s0 deep-scrub stat mismatch, got 22319/22321 objects, 0/0 clones,
22319/22321 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts,
68440088914/68445454633 bytes,0/0 hit_set_archive bytes.
2016-03-17 09:36:38.436844 7f2e816f8700 -1 log_channel(cluster) log [ERR] :
70.459 deep-scrub 1 errors
2016-03-17 09:44:23.592302 7f2e816f8700  0 log_channel(cluster) log [INF] :
70.459 deep-scrub starts
2016-03-17 09:47:01.237846 7f2e816f8700 -1 log_channel(cluster) log [ERR] :
70.459s0 deep-scrub stat mismatch, got 22319/22321 objects, 0/0 clones,
22319/22321 dirty, 0/0 omap, 0/0 hit_set_archive, 0/0 whiteouts,
68440088914/68445454633 bytes,0/0 hit_set_archive bytes.
2016-03-17 09:47:01.237880 7f2e816f8700 -1 log_channel(cluster) log [ERR] :
70.459 deep-scrub 1 errors


Should the scrub be sufficient to remove the inconsistent flag? I took
the osd offline during the repairs. I've looked at the files in all of the
osds in the placement group and I'm not finding any more problem files.
The vast majority of files do not have the user.cephos.lfn3 attribute.
There are 22321 objects that I've seen and only about 230 have the
user.cephos.lfn3 file attribute. The files have other attributes,
just not user.cephos.lfn3.
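
For reference, I'm checking for the attribute with getfattr, along these
lines (the path is shortened to a placeholder):

  getfattr -d -m 'user.cephos.*' \
    /var/lib/ceph/osd/ceph-N/current/70.459s0_head/DIR_9/DIR_5/.../<chunk-file>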

Regards,
Jeff


On Wed, Mar 16, 2016 at 3:53 PM, Samuel Just  wrote:

> Ok, like I said, most files with _long at the end are *not orphaned*.
> The generation number also is *not* an indication of whether the file
> is orphaned -- some of the orphaned files will have 
> as the generation number and others won't.  For each long filename
> object in a pg you would have to:
> 1) Pull the long name out of the attr
> 2) Parse the hash out of the long name
> 3) Turn that into a directory path
> 4) Determine whether the file is at the right place in the path
> 5) If not, remove it (or echo it to be checked)
>
> You probably want to wait for someone to get around to writing a
> branch for ceph-objectstore-tool.  Should happen in the next week or
> two.
> -Sam
>
>
-- 

Jeffrey McDonald, PhD
Assistant Director for HPC Operations
Minnesota Supercomputing Institute
University of Minnesota Twin Cities
599 Walter Library   email: jeffrey.mcdon...@msi.umn.edu
117 Pleasant St SE   phone: +1 612 625-6905
Minneapolis, MN 55455fax:   +1 612 624-8861


[ceph-users] RGW quota

2016-03-18 Thread Derek Yarnell
Hi,

We have a user with a 50GB quota who now has a single bucket with 20GB
of files.  They had previous buckets created and removed, but the quota
usage has not decreased.  I understand that we do garbage collection, but
it has been significantly longer than the default intervals (which we have
not overridden).  They get 403 QuotaExceeded when trying to write additional
data to a new bucket or to the existing bucket.

# radosgw-admin user info --uid=username
...
"user_quota": {
"enabled": true,
"max_size_kb": 52428800,
"max_objects": -1
},

# radosgw-admin bucket stats --bucket=start
...
"usage": {
"rgw.main": {
"size_kb": 21516505,
"size_kb_actual": 21516992,
"num_objects": 243
}
},

# radosgw-admin user stats --uid=username
...
{
"stats": {
"total_entries": 737,
"total_bytes": 55060794604,
"total_bytes_rounded": 55062102016
},
"last_stats_sync": "2016-03-16 14:16:25.205060Z",
"last_stats_update": "2016-03-16 14:16:25.190605Z"
}
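
I assume the cached stats can be forced to resync and GC forced to run
with something like the following, though I haven't confirmed whether it
helps here:

# radosgw-admin user stats --uid=username --sync-stats
# radosgw-admin gc list --include-all
# radosgw-admin gc process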

Thanks,
derek

-- 
Derek T. Yarnell
University of Maryland
Institute for Advanced Computer Studies


Re: [ceph-users] data corruption with hammer

2016-03-18 Thread Robert LeBlanc

Sage,

Your patch seems to have resolved the issue for us. We can't reproduce
the problem with ceph_test_rados or our VM test. I also figured out
that those are all backports that were cherry-picked, so it was showing
the original commit date. There has been quite a bit of work on
ReplicatedPG.cc since 0.94.6, so it probably only makes sense to wait
for 0.94.7 for this fix.

Thanks for looking into this so quickly!

As a workaround for 0.94.6, our testing shows that setting
min_read_recency_for_promote to 1 does not exhibit the corruption, as it
keeps the original behavior. Something for people to be aware of with
0.94.6 and cache tiers.
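
For anyone needing that workaround, it's just the pool setting on the
cache pool, e.g. with a placeholder pool name:

  ceph osd pool set my-cache-pool min_read_recency_for_promote 1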

Hopefully there is a way to detect this in a unittest.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Thu, Mar 17, 2016 at 11:55 AM, Robert LeBlanc  wrote:
> Cherry-picking that commit onto v0.94.6 wasn't clean so I'm just
> building your branch. I'm not sure what the difference between your
> branch and 0.94.6 is, I don't see any commits against
> osd/ReplicatedPG.cc in the last 5 months other than the one you did
> today.
> 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>
>
> On Thu, Mar 17, 2016 at 11:38 AM, Robert LeBlanc  wrote:
>> Yep, let me pull and build that branch. I tried installing the dbg
>> packages and running it in gdb, but it didn't load the symbols.
>> 
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>>
>>
>> On Thu, Mar 17, 2016 at 11:36 AM, Sage Weil  wrote:
>>> On Thu, 17 Mar 2016, Robert LeBlanc wrote:
 Also, is this ceph_test_rados rewriting objects quickly? I think that
 the issue is with rewriting objects so if we can tailor the
 ceph_test_rados to do that, it might be easier to reproduce.
>>>
>>> It's doing lots of overwrites, yeah.
>>>
>>> I was able to reproduce--thanks!  It looks like it's specific to
>>> hammer.  The code was rewritten for jewel so it doesn't affect the
>>> latest.  The problem is that maybe_handle_cache may proxy the read and
>>> also still try to handle the same request locally (if it doesn't trigger a
>>> promote).
>>>
>>> Here's my proposed fix:
>>>
>>> https://github.com/ceph/ceph/pull/8187
>>>
>>> Do you mind testing this branch?
>>>
>>> It doesn't appear to be directly related to flipping between writeback and
>>> forward, although it may be that we are seeing two unrelated issues.  I
>>> seemed to be able to trigger it more easily when I flipped modes, but the
>>> bug itself was a simple issue in the writeback mode logic.  :/
>>>
>>> Anyway, please see if this fixes it for you (esp with the RBD workload).
>>>
>>> Thanks!
>>> sage
>>>
>>>
>>>
>>>
 
 Robert LeBlanc
 PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


 On Thu, Mar 17, 2016 at 11:05 AM, Robert LeBlanc  
 wrote:
 > I'll miss the Ceph community as well. There were a few things I really
 > wanted to work in with Ceph.
 >
 > I got this:
 >
 > update_object_version oid 13 v 1166 (ObjNum 1028 snap 0 seq_num 1028)
 > dirty exists
 > 1038:  left oid 13 (ObjNum 1028 snap 0 seq_num 1028)
 > 1040:  finishing write tid 1 to nodez23350-256
 > 1040:  finishing write tid 2 to nodez23350-256
 > 1040:  finishing write tid 3 to nodez23350-256
 > 1040:  finishing write tid 4 to nodez23350-256
 > 1040:  finishing write tid 6 to nodez23350-256
 > 1035: done (4 left)
 > 1037: done (3 left)
 > 1038: done (2 left)
 > 1043: read oid 430 snap -1
 > 1043:  expect (ObjNum 429 snap 0 seq_num 429)
 > 1040:  finishing write tid 7 to nodez23350-256
 > update_object_version oid 256 v 661 (ObjNum 1029 snap 0 seq_num 1029)
 > dirty exists
 > 1040:  left oid 256 (ObjNum 1029 snap 0 seq_num 1029)
 > 1042:  expect (ObjNum 664 snap 0 seq_num 664)
 > 1043: Error: oid 430 read returned error code -2
 > ./test/osd/RadosModel.h: In function 'virtual void
 > ReadOp::_fini

[ceph-users] cephfs infernalis (ceph version 9.2.1) - bonnie++

2016-03-18 Thread Michael Hanscho
Hi!

Trying to run bonnie++ on cephfs mounted via the kernel driver on a
centos 7.2.1511 machine resulted in:

# bonnie++ -r 128 -u root -d /data/cephtest/bonnie2/
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...Bonnie: drastic I/O error (rmdir):
Directory not empty
Cleaning up test directory after error.

# ceph -w
cluster 
 health HEALTH_OK
 monmap e3: 3 mons at
{cestor4=:6789/0,cestor5=:6789/0,cestor6=:6789/0}
election epoch 62, quorum 0,1,2 cestor4,cestor5,cestor6
 mdsmap e30: 1/1/1 up {0=cestor2=up:active}, 1 up:standby
 osdmap e703: 60 osds: 60 up, 60 in
flags sortbitwise
  pgmap v135437: 1344 pgs, 4 pools, 4315 GB data, 2315 kobjects
7262 GB used, 320 TB / 327 TB avail
1344 active+clean

Any ideas?

Regards
Michael


Re: [ceph-users] Local SSD cache for ceph on each compute node.

2016-03-18 Thread Sebastien Han
I’d rather like to see this implemented at the hypervisor level, i.e.: QEMU, so 
we can have a common layer for all the storage backends.
Although this is less portable...

> On 17 Mar 2016, at 11:00, Nick Fisk  wrote:
> 
> 
> 
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Daniel Niasoff
>> Sent: 16 March 2016 21:02
>> To: Nick Fisk ; 'Van Leeuwen, Robert'
>> ; 'Jason Dillaman' 
>> Cc: ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] Local SSD cache for ceph on each compute node.
>> 
>> Hi Nick,
>> 
>> Your solution requires manual configuration for each VM and cannot be
>> setup as part of an automated OpenStack deployment.
> 
> Absolutely, potentially flaky as well.
> 
>> 
>> It would be really nice if it was a hypervisor based setting as opposed to
> a VM
>> based setting.
> 
> Yes, I can't wait until we can just specify "rbd_cache_device=/dev/ssd" in
> the ceph.conf and get it to write to that instead. Ideally ceph would also
> provide some sort of lightweight replication for the cache devices, but
> otherwise an iSCSI SSD farm or switched SAS could be used so that the caching
> device is not tied to one physical host.
> 
>> 
>> Thanks
>> 
>> Daniel
>> 
>> -Original Message-
>> From: Nick Fisk [mailto:n...@fisk.me.uk]
>> Sent: 16 March 2016 08:59
>> To: Daniel Niasoff ; 'Van Leeuwen, Robert'
>> ; 'Jason Dillaman' 
>> Cc: ceph-users@lists.ceph.com
>> Subject: RE: [ceph-users] Local SSD cache for ceph on each compute node.
>> 
>> 
>> 
>>> -Original Message-
>>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
>>> Of Daniel Niasoff
>>> Sent: 16 March 2016 08:26
>>> To: Van Leeuwen, Robert ; Jason Dillaman
>>> 
>>> Cc: ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] Local SSD cache for ceph on each compute node.
>>> 
>>> Hi Robert,
>>> 
 Caching writes would be bad because a hypervisor failure would result
 in
>>> loss of the cache which pretty much guarantees inconsistent data on
>>> the ceph volume.
 Also live-migration will become problematic compared to running
>>> everything from ceph since you will also need to migrate the
>> local-storage.
>> 
>> I tested a solution using iSCSI for the cache devices. Each VM was using
>> flashcache with a combination of a iSCSI LUN from a SSD and a RBD. This
> gets
>> around the problem of moving things around or if the hypervisor goes down.
>> It's not local caching but the write latency is at least 10x lower than
> the RBD.
>> Note I tested it, I didn't put it into production :-)
>> 
>>> 
>>> My understanding of how a writeback cache should work is that it
>>> should only take a few seconds for writes to be streamed onto the
>>> network and is focussed on resolving the speed issue of small sync
>>> writes. The writes
>> would
>>> be bundled into larger writes that are not time sensitive.
>>> 
>>> So there is potential for a few seconds data loss but compared to the
>> current
>>> trend of using ephemeral storage to solve this issue, it's a major
>>> improvement.
>> 
>> Yeah, problem is a couple of seconds data loss mean different things to
>> different people.
>> 
>>> 
 (considering the time required for setting up and maintaining the
 extra
>>> caching layer on each vm, unless you work for free ;-)
>>> 
>>> Couldn't agree more there.
>>> 
>>> I am just so surprised how the openstack community haven't looked to
>>> resolve this issue. Ephemeral storage is a HUGE compromise unless you
>>> have built in failure into every aspect of your application but many
>>> people use openstack as a general purpose devstack.
>>> 
>>> (Jason pointed out his blueprint but I guess it's at least a year or 2
>> away -
>>> http://tracker.ceph.com/projects/ceph/wiki/Rbd_-_ordered_crash-
>>> consistent_write-back_caching_extension)
>>> 
>>> I see articles discussing the idea such as this one
>>> 
>>> http://www.sebastien-han.fr/blog/2014/06/10/ceph-cache-pool-tiering-
>>> scalable-cache/
>>> 
>>> but no real straightforward  validated setup instructions.
>>> 
>>> Thanks
>>> 
>>> Daniel
>>> 
>>> 
>>> -Original Message-
>>> From: Van Leeuwen, Robert [mailto:rovanleeu...@ebay.com]
>>> Sent: 16 March 2016 08:11
>>> To: Jason Dillaman ; Daniel Niasoff
>>> 
>>> Cc: ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] Local SSD cache for ceph on each compute node.
>>> 
 Indeed, well understood.
 
 As a shorter term workaround, if you have control over the VMs, you
 could
>>> always just slice out an LVM volume from local SSD/NVMe and pass it
>>> through to the guest.  Within the guest, use dm-cache (or similar) to
>>> add
>> a
>>> cache front-end to your RBD volume.
>>> 
>>> If you do this you need to setup your cache as read-cache only.
>>> Caching writes would be bad because a hypervisor failure would result
>>> in
>> loss
>>> of the cache which pretty much guarantees inconsistent data on the
>>> ceph volume.
>>> Also live-migration w

Re: [ceph-users] data corruption with hammer

2016-03-18 Thread Gregory Farnum
This tracker ticket happened to go by my eyes today:
http://tracker.ceph.com/issues/12814 . There isn't a lot of detail
there but the headline matches.
-Greg

On Wed, Mar 16, 2016 at 2:02 AM, Nick Fisk  wrote:
>
>
>> -Original Message-
>> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
>> Christian Balzer
>> Sent: 16 March 2016 07:08
>> To: Robert LeBlanc 
>> Cc: Robert LeBlanc ; ceph-users > us...@lists.ceph.com>; William Perkins 
>> Subject: Re: [ceph-users] data corruption with hammer
>>
>>
>> Hello Robert,
>>
>> On Tue, 15 Mar 2016 10:54:20 -0600 Robert LeBlanc wrote:
>>
>> > There are no monitors on the new node.
>> >
>> So one less possible source of confusion.
>>
>> > It doesn't look like there has been any new corruption since we
>> > stopped changing the cache modes. Upon closer inspection, some files
>> > have been changed such that binary files are now ASCII files and visa
>> > versa. These are readable ASCII files and are things like PHP or
>> > script files. Or C files where ASCII files should be.
>> >
>> What would be most interesting is if the objects containing those
> corrupted
>> files did reside on the new OSDs (primary PG) or the old ones, or both.
>>
>> Also, what cache mode was the cluster in before the first switch
> (writeback I
>> presume from the timeline) and which one is it in now?
>>
>> > I've seen this type of corruption before when a SAN node misbehaved
>> > and both controllers were writing concurrently to the backend disks.
>> > The volume was only mounted by one host, but the writes were split
>> > between the controllers when it should have been active/passive.
>> >
>> > We have killed off the OSDs on the new node as a precaution and will
>> > try to replicate this in our lab.
>> >
>> > I suspicion is that is has to do with the cache promotion code update,
>> > but I'm not sure how it would have caused this.
>> >
>> While blissfully unaware of the code, I have a hard time imagining how it
>> would cause that as well.
>> Potentially a regression in the code that only triggers in one cache mode
> and
>> when wanting to promote something?
>>
>> Or if it is actually the switching action, not correctly promoting things
> as it
>> happens?
>> And thus referencing a stale object?
>
> I can't think of any other way the recency setting would break things.
> Can the OP confirm what recency setting is being used?
>
> When you switch to writeback, if you haven't reached the required recency
> yet, all reads will be proxied, previous behaviour would have pretty much
> promoted all the time regardless. So unless something is happening where
> writes are getting sent to one tier in forward mode and then read from a
> different tier in WB mode, I'm out of ideas.  I'm pretty sure the code says
> Proxy Read then check for promotion, so I'm not even convinced that there
> should be any difference anyway.
>
> I note the documentation states that in forward mode, modified objects get
> written to the backing tier; I'm not sure that sounds correct to me. But if
> that is what is happening, that could also be related to the problem.
>
> I think this might be easyish to reproduce using the get/put commands with a
> couple of objects on a test pool if anybody out there is running 94.6 on the
> whole cluster.
>
>>
>> Christian
>>
>> > 
>> > Robert LeBlanc
>> > PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>> >
>> >
>> > On Mon, Mar 14, 2016 at 9:35 PM, Christian Balzer  wrote:
>> > >
>> > > Hello,
>> > >
>> > > On Mon, 14 Mar 2016 20:51:04 -0600 Mike Lovell wrote:
>> > >
>> > >> something weird happened on one of the ceph clusters that i
>> > >> administer tonight which resulted in virtual machines using rbd
>> > >> volumes seeing corruption in multiple forms.
>> > >>
>> > >> when everything was fine earlier in the day, the cluster was a
>> > >> number of storage nodes spread a

Re: [ceph-users] [cephfs] About feature 'snapshot'

2016-03-18 Thread John Spray
On Thu, Mar 17, 2016 at 1:41 PM, Gregory Farnum  wrote:
> On Thu, Mar 17, 2016 at 3:49 AM, John Spray  wrote:
>> Snapshots are disabled by default:
>> http://docs.ceph.com/docs/hammer/cephfs/early-adopters/#most-stable-configuration
>
> Which makes me wonder if we ought to be hiding the .snaps directory
> entirely in that case. I haven't previously thought about that, but it
> *is* a bit weird.

Hmm, we could use the ever_allowed_snaps field to hide .snap.

However, we would still want to prevent people creating a directory
with that name, because if they ever enabled snapshots, we wouldn't
have a way of resolving that.  So it would be weird to omit .snap from
the directory listing, but then give an error if someone tries to
create a folder with that name.  Perhaps showing the folder (even if
snaps are disabled) is the lesser evil.
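
(For context, snapshots are taken by creating a directory inside that
.snap dir, e.g. "mkdir somedir/.snap/mysnapshot", which is why the name
needs to stay reserved even while the feature is disabled.)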

John

> -Greg
>
>>
>> John
>>
>> On Thu, Mar 17, 2016 at 10:02 AM, 施柏安  wrote:
>>> Hi all,
>>> I've run into a problem with the cephfs snapshot feature. It seems that the folder
>>> '.snap' exists, but using 'll -a' doesn't show it. And when I enter that folder and
>>> create a folder in it, it reports an error about using snapshots.
>>>
>>> Please check : http://imgur.com/elZhQvD
>>>
>>>


[ceph-users] ZFS or BTRFS for performance?

2016-03-18 Thread Schlacta, Christ
Insofar as I've been able to tell, both BTRFS and ZFS provide similar
capabilities back to CEPH, and both are sufficiently stable for the
basic CEPH use case (Single disk -> single mount point), so the
question becomes this:  Which actually provides better performance?
Which is the more highly optimized single write path for ceph?  Does
anybody have a handful of side-by-side benchmarks?  I'm more
interested in higher IOPS, since you can always scale-out throughput,
but throughput is also important.
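
(For anyone who has both filesystems handy and wants to produce comparable
numbers, a quick micro-benchmark on a test mount on each, e.g. something
like the following -- adjust flags to taste, path is a placeholder --
would already answer most of the IOPS question:

  fio --name=fs-compare --directory=/mnt/test --rw=randwrite --bs=4k \
      --size=1G --iodepth=32 --ioengine=libaio
)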


Re: [ceph-users] data corruption with hammer

2016-03-18 Thread Sage Weil
On Thu, 17 Mar 2016, Robert LeBlanc wrote:
> Also, is this ceph_test_rados rewriting objects quickly? I think that
> the issue is with rewriting objects so if we can tailor the
> ceph_test_rados to do that, it might be easier to reproduce.

It's doing lots of overwrites, yeah.
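
(Something along these lines, pointed at the base pool that has the cache
tier on it, generates that overwrite-heavy pattern -- flags from memory,
so double-check them:

  ceph_test_rados --pool <base-pool> --max-ops 4000 --objects 500 \
      --max-in-flight 16 --op read 100 --op write 100 --op delete 50
)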

I was able to reproduce--thanks!  It looks like it's specific to
hammer.  The code was rewritten for jewel so it doesn't affect the 
latest.  The problem is that maybe_handle_cache may proxy the read and 
also still try to handle the same request locally (if it doesn't trigger a 
promote).

Here's my proposed fix:

https://github.com/ceph/ceph/pull/8187

Do you mind testing this branch?

It doesn't appear to be directly related to flipping between writeback and 
forward, although it may be that we are seeing two unrelated issues.  I 
seemed to be able to trigger it more easily when I flipped modes, but the 
bug itself was a simple issue in the writeback mode logic.  :/

Anyway, please see if this fixes it for you (esp with the RBD workload).

Thanks!
sage




> 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> 
> 
> On Thu, Mar 17, 2016 at 11:05 AM, Robert LeBlanc  wrote:
> > I'll miss the Ceph community as well. There were a few things I really
> > wanted to work in with Ceph.
> >
> > I got this:
> >
> > update_object_version oid 13 v 1166 (ObjNum 1028 snap 0 seq_num 1028)
> > dirty exists
> > 1038:  left oid 13 (ObjNum 1028 snap 0 seq_num 1028)
> > 1040:  finishing write tid 1 to nodez23350-256
> > 1040:  finishing write tid 2 to nodez23350-256
> > 1040:  finishing write tid 3 to nodez23350-256
> > 1040:  finishing write tid 4 to nodez23350-256
> > 1040:  finishing write tid 6 to nodez23350-256
> > 1035: done (4 left)
> > 1037: done (3 left)
> > 1038: done (2 left)
> > 1043: read oid 430 snap -1
> > 1043:  expect (ObjNum 429 snap 0 seq_num 429)
> > 1040:  finishing write tid 7 to nodez23350-256
> > update_object_version oid 256 v 661 (ObjNum 1029 snap 0 seq_num 1029)
> > dirty exists
> > 1040:  left oid 256 (ObjNum 1029 snap 0 seq_num 1029)
> > 1042:  expect (ObjNum 664 snap 0 seq_num 664)
> > 1043: Error: oid 430 read returned error code -2
> > ./test/osd/RadosModel.h: In function 'virtual void
> > ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fa1bf7fe700 time
> > 2016-03-17 10:47:19.085414
> > ./test/osd/RadosModel.h: 1109: FAILED assert(0)
> > ceph version 0.94.6 (e832001feaf8c176593e0325c8298e3f16dfb403)
> > 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> > const*)+0x76) [0x4db956]
> > 2: (ReadOp::_finish(TestOp::CallbackInfo*)+0xec) [0x4c959c]
> > 3: (()+0x9791d) [0x7fa1d472191d]
> > 4: (()+0x72519) [0x7fa1d46fc519]
> > 5: (()+0x13c178) [0x7fa1d47c6178]
> > 6: (()+0x80a4) [0x7fa1d425a0a4]
> > 7: (clone()+0x6d) [0x7fa1d2bd504d]
> > NOTE: a copy of the executable, or `objdump -rdS ` is
> > needed to interpret this.
> > terminate called after throwing an instance of 'ceph::FailedAssertion'
> > Aborted
> >
> > I had to toggle writeback/forward and min_read_recency_for_promote a
> > few times to get it, but I don't know if it is because I only have one
> > job running. Even with six jobs running, it is not easy to trigger
> > with ceph_test_rados, but it is very instant in the RBD VMs.
> >
> > Here are the six run crashes (I have about the last 2000 lines of each
> > if needed):
> >
> > nodev:
> > update_object_version oid 1015 v 1255 (ObjNum 1014 snap 0 seq_num
> > 1014) dirty exists
> > 1015:  left oid 1015 (ObjNum 1014 snap 0 seq_num 1014)
> > 1016:  finishing write tid 1 to nodev21799-1016
> > 1016:  finishing write tid 2 to nodev21799-1016
> > 1016:  finishing write tid 3 to nodev21799-1016
> > 1016:  finishing write tid 4 to nodev21799-1016
> > 1016:  finishing write tid 6 to nodev21799-1016
> > 1016:  finishing write tid 7 to nodev21799-1016
> > update_object_version oid 1016 v 1957 (ObjNum 1015 snap 0 seq_num
> > 1015) dirty exists
> > 1016:  left oid 1016 (ObjNum 1015 snap 0 seq_num 1015)
> > 1017:  finishing write tid 1 to nodev21799-1017
> > 1017:  finishing write tid 2 to nodev21799-1017
> > 1017:  finishing write tid 3 to nodev21799-1017
> > 1017:  finishing write tid 5 to nodev21799-1017
> > 1017:  finishing write tid 6 to nodev21799-1017
> > update_object_version oid 1017 v 1010 (ObjNum 1016 snap 0 seq_num
> > 1016) dirty exists
> > 1017:  left oid 1017 (ObjNum 1016 snap 0 seq_num 1016)
> > 1018:  finishing write tid 1 to nodev21799-1018
> > 1018:  finishing write tid 2 to nodev21799-1018
> > 1018:  finishing write tid 3 to nodev21799-1018
> > 1018:  finishing write tid 4 to nodev21799-1018
> > 1018:  finishing write tid 6 to nodev21799-1018
> > 1018:  finishing write tid 7 to nodev21799-1018
> > update_object_version oid 1018 v 1093 (ObjNum 1017 snap 0 seq_num
> > 1017) dirty exists
> > 1018:  left oid 1018 (ObjNum 1017 snap 0 seq_num 1017)
> > 1019:  finishing write tid 1 to nodev21799-1019
> > 1019:  finishing write tid 2 to node

Re: [ceph-users] DONTNEED fadvise flag

2016-03-18 Thread Gregory Farnum
On Wed, Mar 16, 2016 at 9:46 AM, Kenneth Waegeman
 wrote:
> Hi all,
>
> Quick question: Does cephFS pass the fadvise DONTNEED flag and take it into
> account?
> I want to use the --drop-cache option of rsync 3.1.1 to not fill the cache
> when rsyncing to cephFS

It looks like ceph-fuse unfortunately does not. I'm not sure about the
kernel client though.
-Greg


Re: [ceph-users] Is there an api to list all s3 user

2016-03-18 Thread Mikaël Guichard

Hi

The PHP AWS SDK with some custom additions can do that.
First, you need a functional php_aws_sdk working with your radosgw and an
account (access/secret key) with metadata caps.


I use AWS SDK version 2.8.22:

    $aws = Aws::factory('config.php');
    $this->s3client = $aws->get('s3');

http://docs.aws.amazon.com/aws-sdk-php/v2/guide/configuration.html

Example config.php file:

    return array(
        'includes' => array('_aws'),
        'services' => array(
            // All AWS clients extend from 'default_settings'. Here we are
            // overriding 'default_settings' with our default credentials and
            // providing a default region setting.
            'default_settings' => array(
                'params' => array(
                    'version' => 'latest',
                    'region' => 'us-west-1',
                    'endpoint' => HOST,
                    'signature_version' => 'v2',
                    'credentials' => array(
                        'key'    => AWS_KEY,
                        'secret' => AWS_SECRET_KEY,
                    ),
                    'bucket_endpoint' => false,
                    'debug'           => true,
                )
            )
        )
    );



And second, this is an example of my updates:

I add a new ServiceDescription to Guzzle:

    $aws = Aws::factory(YOUR_PHP_RGW_CONFIG_FILE);
    $this->s3client = $aws->get('s3');
    $cephCommand = include __DIR__.'/ceph-services.php';
    $description = new \Guzzle\Service\Description\ServiceDescription($cephCommand);

    $default = \Guzzle\Service\Command\Factory\CompositeFactory::getDefaultChain($this->s3client);
    $default->add(
        new \Guzzle\Service\Command\Factory\ServiceDescriptionFactory($description),
        'Guzzle\Service\Command\Factory\ServiceDescriptionFactory');

    $this->s3client->setCommandFactory($default);


The ceph-services.php file contains:

    return array(
        'apiVersion' => '2015-12-08',
        'serviceFullName' => 'Ceph Gateway',
        'serviceAbbreviation' => 'CEPH ULR',
        'serviceType' => 'rest-xml',
        'operations' => array(
            'ListAllUsers' => array(
                'httpMethod' => 'GET',
                'uri' => '/admin/metadata/user',
                'class' => 'Aws\S3\Command\S3Command',
                'responseClass' => 'ListAllUsersOutput',
                'responseType' => 'model',
                'parameters' => array(
                    'format' => array(
                        'type' => 'string',
                        'location' => 'query',
                        'sentAs' => 'format',
                        'require' => true,
                        'default' => 'xml',
                    ),
                ),
            ),
        ),
        'models' => array(
            'ListAllUsersOutput' => array(
                'type' => 'object',
                'additionalProperties' => true,
                'properties' => array(
                    'Keys' => array(
                        'type' => 'string',
                        'location' => 'xml',
                    ),
                ),
            ),
        )
    );


Just call it with:

    $result = $this->s3client->listAllUsers();


Good luck ...

Mikaël

On 16/03/2016 07:51, Mika c wrote:

Hi all,
Hi, I'm trying to find an API that can list all S3 users, like the command
'radosgw-admin metadata list user' does.
But I cannot find any related documentation. Does anyone know how to get
this information?

Any comments will be much appreciated!
​​

Best wishes,
Mika




Re: [ceph-users] SSDs for journals vs SSDs for a cache tier, which is better?

2016-03-18 Thread Christian Balzer

Hello,

On Wed, 16 Mar 2016 16:22:06 + Stephen Harker wrote:

> On 2016-02-17 11:07, Christian Balzer wrote:
> > 
> > On Wed, 17 Feb 2016 10:04:11 +0100 Piotr Wachowicz wrote:
> > 
> >> > > Let's consider both cases:
> >> > > Journals on SSDs - for writes, the write operation returns right
> >> > > after data lands on the Journal's SSDs, but before it's written to
> >> > > the backing HDD. So, for writes, SSD journal approach should be
> >> > > comparable to having a SSD cache tier.
> >> > Not quite, see below.
> >> >
> >> >
> >> Could you elaborate a bit more?
> >> 
> >> Are you saying that with a Journal on a SSD writes from clients, 
> >> before
> >> they can return from the operation to the client, must end up on both 
> >> the
> >> SSD (Journal) *and* HDD (actual data store behind that journal)?
> > 
> > No, your initial statement is correct.
> > 
> > However that burst of speed doesn't last indefinitely.
> > 
> > Aside from the size of the journal (which is incidentally NOT the most
> > limiting factor) there are various "filestore" parameters in Ceph, in
> > particular the sync interval ones.
> > There was a more in-depth explanation by a developer about this in
> > this ML,
> > try your google-foo.
> > 
> > For short bursts of activity, the journal helps a LOT.
> > If you send a huge number of for example 4KB writes to your cluster, 
> > the
> > speed will eventually (after a few seconds) go down to what your 
> > backing
> > storage (HDDs) are capable of sustaining.
> > 
> >> > (Which SSDs do you plan to use anyway?)
> >> >
> >> 
> >> Intel DC S3700
> >> 
> > Good choice, with the 200GB model prefer the 3700 over the 3710 (higher
> > sequential write speed).
> 
> Hi All,
> 
> I am looking at using PCI-E SSDs as journals in our (4) Ceph OSD nodes, 
> each of which has 6 4TB SATA drives within. I had my eye on these:
> 
> 400GB Intel P3500 DC AIC SSD, HHHL PCIe 3.0
> 
> but reading through this thread, it might be better to go with the P3700 
> given the improved iops. So a couple of questions.
> 
The 3700's will also last significantly longer than the 3500's.
IOPS (of the device) are mostly irrelevant, sequential write speed is
where it's at.
In the same vein, remember that journals are never ever read from unless
there was a crash.
 
> * Are the PCI-E versions of these drives different in any other way than 
> the interface?
> 
> * Would one of these as a journal for 6 4TB OSDs be overkill 
> (connectivity is 10GE, or will be shortly anyway), would the SATA S3700 
> be sufficient?
> 
Overkill, but not insanely so.

From my (not insignificant) experience you want to match your journal(s)
firstly towards your network speed and then the devices behind them.

A SATA HDD can write indeed about 180MB/s sequentially, but that's firmly
in the land of theory when it comes to Ceph.

Ceph/RBD writes are 4MB objects at the largest, they are spread out all
over the cluster and of course most likely interspersed with competing
(seeking) reads and other writes to the same OSD.
That is before all the IO and thus seeks needed for file system
operations, LevelDB updates, etc.
I thus spec my journals to 100MB/s write speed per SATA based HDD and
that's already generous.
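
To put rough numbers on your case: 6 HDDs x 100MB/s = 600MB/s of journal
bandwidth to aim for, while 10GbE tops out at around 1.2GB/s on the wire.
(Rule-of-thumb figures, not benchmarks.)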

Concrete case in point, 4 node cluster, 4 DC S3700 100GB SSDs with 2
journals each, 8 7.2k 3TB SATA HDDs, Infiniband network. 
That cluster is very lightly loaded.

Doing this fio from a client VM:
---
fio --size=6G --ioengine=libaio --invalidate=1 --direct=1 --numjobs=1 
--rw=randwrite --name=fiojob --blocksize=4M --iodepth=32
---
and watching all 4 nodes simultaneously with atop shows us that the HDDs
are pushed up to around 80% utilization while writing only about 50MB/s.
The journal SSDs (which can handle 200MB/s writes) are consequently
semi-bored at about 45% utilization writing around 95MB/s.

As others mentioned, the P series will give you significantly lower
latencies if that's important in your use case (small writes that in their
sum do not exceed the abilities  of your backing storage and CPUs).

Also a lot of this depends on your actual HW (cases), how many hot-swap
bays do you have, how many free PCIe slots, etc.
With entirely new HW you could go for something that has 1-2 NVMe hot-swap
bays and get the best of both worlds.

Summing things up, the 400GB P3700 matches your network speed and thus can
deal with short bursts at full speed. 
However it is overkill for your 6 HDDs, especially once they get busy
(like backfilling or tests as above). 
I'd be surprised to see them handle more than 400MB/s writes combined. 

If you're trying to economize, a single 200GB DC S3700 or 2 100GB ones
(smaller failure domains) should do the trick, too.

> Given they're not hot-swappable, it'd be good if they didn't wear out in 
> 6 months too.
> 
See above. 
I haven't been able to make more than 1% impact in the media wearout of
200GB DC S3700s that receive a constant write stream of 3MB/s over 500
days of operation.
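
(That's going by the Media_Wearout_Indicator SMART attribute, e.g.
something like "smartctl -A /dev/sdX | grep -i wearout" on the Intel DCs.)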
 
Christian
-- 
Christian

Re: [ceph-users] ceph-disk from jewel has issues on redhat 7

2016-03-18 Thread Vasu Kulkarni
I can raise a tracker for this issue since it looks intermittent and mostly
dependent on specific hardware -- or, better, you could add all the
hardware/OS details in tracker.ceph.com yourself. Also, from your logs it
looks like you have a resource-busy issue: Error: Failed to add partition 2
(Device or resource busy)
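
(A manual rescan sometimes clears that kind of race, e.g. re-running
"partprobe /dev/sdb" followed by "udevadm settle --timeout=600" -- but
that's only a guess at the race; the tracker should capture the real fix.)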

From my test run logs on CentOS 7.2, 10.0.5 (
http://qa-proxy.ceph.com/teuthology/vasu-2016-03-15_15:34:41-selinux-master---basic-mira/62626/teuthology.log
)

2016-03-15T18:49:56.305
INFO:teuthology.orchestra.run.mira041.stderr:[ceph_deploy.osd][DEBUG ]
Preparing host mira041 disk /dev/sdb journal None activate True
2016-03-15T18:49:56.305
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][DEBUG ] find
the location of an executable
2016-03-15T18:49:56.309
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][INFO  ] Running
command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type
xfs -- /dev/sdb
2016-03-15T18:49:56.546
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
command: Running command: /usr/bin/ceph-osd --cluster=ceph
--show-config-value=fsid
2016-03-15T18:49:56.611
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
command: Running command: /usr/bin/ceph-osd --check-allows-journal -i
0 --cluster ceph
2016-03-15T18:49:56.643
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0
--cluster ceph
2016-03-15T18:49:56.708
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0
--cluster ceph
2016-03-15T18:49:56.708
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
/sys/dev/block/8:16/dm/uuid
2016-03-15T18:49:56.709
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
set_type: Will colocate journal with data on /dev/sdb
2016-03-15T18:49:56.709
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
command: Running command: /usr/bin/ceph-osd --cluster=ceph
--show-config-value=osd_journal_size
2016-03-15T18:49:56.774
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
/sys/dev/block/8:16/dm/uuid
2016-03-15T18:49:56.774
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
/sys/dev/block/8:16/dm/uuid
2016-03-15T18:49:56.775
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
/sys/dev/block/8:16/dm/uuid
2016-03-15T18:49:56.775
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
command: Running command: /usr/bin/ceph-conf --cluster=ceph
--name=osd. --lookup osd_mkfs_options_xfs
2016-03-15T18:49:56.777
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
command: Running command: /usr/bin/ceph-conf --cluster=ceph
--name=osd. --lookup osd_fs_mkfs_options_xfs
2016-03-15T18:49:56.809
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
command: Running command: /usr/bin/ceph-conf --cluster=ceph
--name=osd. --lookup osd_mount_options_xfs
2016-03-15T18:49:56.841
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
command: Running command: /usr/bin/ceph-conf --cluster=ceph
--name=osd. --lookup osd_fs_mount_options_xfs
2016-03-15T18:49:56.857
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
/sys/dev/block/8:16/dm/uuid
2016-03-15T18:49:56.858
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
/sys/dev/block/8:16/dm/uuid
2016-03-15T18:49:56.858
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
ptype_tobe_for_name: name = journal
2016-03-15T18:49:56.859
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
get_dm_uuid: get_dm_uuid /dev/sdb uuid path is
/sys/dev/block/8:16/dm/uuid
2016-03-15T18:49:56.859
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
create_partition: Creating journal partition num 2 size 5120 on
/dev/sdb
2016-03-15T18:49:56.859
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
command_check_call: Running command: /sbin/sgdisk --new=2:0:+5120M
--change-name=2:ceph journal
--partition-guid=2:d4b2fa8d-3f2a-4ce9-a2fe-2a3872d7e198
--typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 --mbrtogpt --
/dev/sdb
2016-03-15T18:49:57.927
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][DEBUG ] The
operation has completed successfully.
2016-03-15T18:49:57.927
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
update_partition: Calling partprobe on created device /dev/sdb
2016-03-15T18:49:57.928
INFO:teuthology.orchestra.run.mira041.stderr:[mira041][WARNING]
command_check_call: Running command: /usr/bin/udevadm settle
--timeout=600
2016-03-15T18:49:58.393
INFO:teuthology.orchestra.run.mira041.stderr:[mira04