Re: [ceph-users] Self serve / automated S3 key creation?

2019-02-01 Thread Burkhard Linke

Hi,

On 1/31/19 6:11 PM, shubjero wrote:
Has anyone automated the ability to generate S3 keys for OpenStack 
users in Ceph? Right now we take in a user's request manually (Hey we 
need an S3 API key for our OpenStack project 'X', can you help?). We 
as cloud/ceph admins just use radosgw-admin to create them an 
access/secret key pair for their specific OpenStack project and 
provide it to them manually. Was just wondering if there was a 
self-serve way to do that. Curious to hear what others have done in 
regards to this.



You can link RGW to Keystone, and pass authentication / signature check 
requests to it. The user can create project scoped EC2 credentials in 
Openstack (via API/CLI/web interface), and use these credentials for 
authentication to the RGW S3 API.
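For reference, the RGW side of such a setup boils down to a handful of
ceph.conf options plus EC2 credentials created in OpenStack. A rough sketch
(all values are placeholders; double-check the option names against your
release's RGW documentation):

[client.rgw.gateway]
rgw keystone url = http://keystone.example.com:5000
rgw keystone api version = 3
rgw keystone admin user = rgw
rgw keystone admin password = secret
rgw keystone admin domain = Default
rgw keystone admin project = service
rgw s3 auth use keystone = true

The user side is then just:

openstack ec2 credentials create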



Works well on our side. You may want to ensure that default quotas for 
bucket/objects/size are in place.
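For example (a sketch; the default-quota option names below are our
understanding and should be verified against your release):

# ceph.conf defaults applied to newly created users/buckets
rgw user default quota max size = 1099511627776
rgw bucket default quota max objects = 1000000

# or per existing user, via radosgw-admin:
radosgw-admin quota set --quota-scope=user --uid=<uid> --max-size=1099511627776
radosgw-admin quota enable --quota-scope=user --uid=<uid>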



The main drawback is the extra latency introduced by the Keystone 
upcall. The EC2 credentials are not sent to the RGW, so _each_ S3 
request has to be authenticated via the Keystone API. Add TCP and SSL 
handshake overhead (not sure whether RGW uses a persistent connection)...



You can still use "local" authentication with credentials created via 
radosgw-admin. AFAIK there's also a setting to define the order in which 
authentication methods are tried, so special users and services can get a 
local set of credentials (and thus lower latency, but more administrative 
overhead), while normal users go through the Keystone calls and are 
completely self-service.
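If that setting is what we think it is (the `rgw s3 auth order` option found
in newer releases; treat both the name and its availability as an assumption),
it would look roughly like:

# try local (radosgw-admin created) credentials before the Keystone upcall
rgw s3 auth order = local, external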



Regards,

Burkhard


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v12.2.11 Luminous released

2019-02-01 Thread Mark Schouten
On Fri, Feb 01, 2019 at 08:44:51AM +0100, Abhishek wrote:
> * This release fixes the pg log hard limit bug that was introduced in
>   12.2.9, https://tracker.ceph.com/issues/36686.  A flag called
>   `pglog_hardlimit` has been introduced, which is off by default. Enabling
>   this flag will limit the length of the pg log.  In order to enable
>   that, the flag must be set by running `ceph osd set pglog_hardlimit`
>   after completely upgrading to 12.2.11. Once the cluster has this flag
>   set, the length of the pg log will be capped by a hard limit. Once set,
>   this flag *must not* be unset anymore.

I'm confused about this. I have a cluster running 12.2.9, but should I
just upgrade and be done with it, or should I execute the steps
mentioned above? The pglog_hardlimit is off by default, which suggests I
should not do anything. But since it is related to this bug which I may
or may not be hitting, I'm not sure.

> * There have been fixes to RGW dynamic and manual resharding, which no
> longer
>   leaves behind stale bucket instances to be removed manually. For finding
> and
>   cleaning up older instances from a reshard a radosgw-admin command
> `reshard
>   stale-instances list` and `reshard stale-instances rm` should do the
> necessary
>   cleanup.


Very happy about this! It will clean up my cluster for sure! This also closes
https://tracker.ceph.com/issues/23651 I think?
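For reference, the cleanup described above amounts to:

radosgw-admin reshard stale-instances list   # review the stale bucket instances first
radosgw-admin reshard stale-instances rm     # then remove them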

-- 
Mark Schouten  | Tuxis Internet Engineering
KvK: 61527076  | http://www.tuxis.nl/
T: 0318 200208 | i...@tuxis.nl

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v12.2.11 Luminous released

2019-02-01 Thread Sean Purdy
On Fri, 1 Feb 2019 08:47:47 +0100
Wido den Hollander  wrote:

> 
> 
> On 2/1/19 8:44 AM, Abhishek wrote:
> > We are glad to announce the eleventh bug fix release of the Luminous
> > v12.2.x long term stable release series. We recommend that all users

> > * There have been fixes to RGW dynamic and manual resharding, which no
> > longer
> >   leaves behind stale bucket instances to be removed manually. For
> > finding and
> >   cleaning up older instances from a reshard a radosgw-admin command
> > `reshard
> >   stale-instances list` and `reshard stale-instances rm` should do the
> > necessary
> >   cleanup.
> > 
> 
> Great news! I hope this works! This has been biting a lot of people in
> the last year. I have helped a lot of people to manually clean this up,
> but it's great that this is now available as a regular command.
> 
> Wido

I hope so too, especially when bucket lifecycles and versioning are enabled.

Sean
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] CephFS MDS journal

2019-02-01 Thread Mahmoud Ismail
Hello,

I'm a bit confused about how the journaling actually works in the MDS.

I was reading about these two configuration parameters: (journal write head
interval) and (mds early reply). Does the MDS flush the journal
synchronously after each operation, and does setting mds early reply to true
allow operations to return without flushing? If so, what does the other
parameter (journal write head interval) do, or is it not used by the MDS? Also,
can all operations return without flushing when mds early reply is set, or is
it specific to a subset of operations?

Another question, are open operations also written to the journal?

Regards,
Mahmoud
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Bluestore deploys to tmpfs?

2019-02-01 Thread Stuart Longland
Hi all,

I'm just in the process of migrating my 3-node Ceph cluster from
BTRFS-backed Filestore over to Bluestore.

Last weekend I did this with my first node, and while the migration went
fine, I noted that the OSD did not survive a reboot test: after
rebooting /var/lib/ceph/osd/ceph-0 was completely empty and
/etc/init.d/ceph-osd.0 (I run OpenRC init on Gentoo) would refuse to start.

https://stuartl.longlandclan.id.au/blog/2019/01/28/solar-cluster-adventures-in-ceph-migration/

I managed to recover it, but tonight I'm trying with my second node.
I've provisioned a temporary OSD (plugged in via USB3) for it to migrate
to using BlueStore.  The ceph cluster called it osd.4.

One thing I note is that `ceph-volume` seems to have created a `tmpfs`
mount for the new OSD:

> tmpfs on /var/lib/ceph/osd/ceph-4 type tmpfs (rw,relatime)

Admittedly this is just a temporary OSD, tomorrow I'll be blowing away
the *real* OSD on this node (osd.1) and provisioning it again using
BlueStore.

I really don't want the ohh crap moment I had on Monday afternoon (as
one does on the Australia Day long weekend) frantically digging through
man pages and having to do the `ceph-bluestore-tool prime-osd-dir` dance.

I think mounting tmpfs for something that should be persistent is highly
dangerous.  Is there some flag I should be using when creating the
BlueStore OSD to avoid that issue?
-- 
Stuart Longland (aka Redhatter, VK4MSL)

I haven't lost my mind...
  ...it's backed up on a tape somewhere.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore deploys to tmpfs?

2019-02-01 Thread Burkhard Linke

Hi,

On 2/1/19 11:40 AM, Stuart Longland wrote:

Hi all,

I'm just in the process of migrating my 3-node Ceph cluster from
BTRFS-backed Filestore over to Bluestore.

Last weekend I did this with my first node, and while the migration went
fine, I noted that the OSD did not survive a reboot test: after
rebooting /var/lib/ceph/osd/ceph-0 was completely empty and
/etc/init.d/ceph-osd.0 (I run OpenRC init on Gentoo) would refuse to start.

https://stuartl.longlandclan.id.au/blog/2019/01/28/solar-cluster-adventures-in-ceph-migration/

I managed to recover it, but tonight I'm trying with my second node.
I've provisioned a temporary OSD (plugged in via USB3) for it to migrate
to using BlueStore.  The ceph cluster called it osd.4.

One thing I note is that `ceph-volume` seems to have created a `tmpfs`
mount for the new OSD:


tmpfs on /var/lib/ceph/osd/ceph-4 type tmpfs (rw,relatime)

Admittedly this is just a temporary OSD, tomorrow I'll be blowing away
the *real* OSD on this node (osd.1) and provisioning it again using
BlueStore.

I really don't want the ohh crap moment I had on Monday afternoon (as
one does on the Australia Day long weekend) frantically digging through
man pages and having to do the `ceph-bluestore-tool prime-osd-dir` dance.

I think mounting tmpfs for something that should be persistent is highly
dangerous.  Is there some flag I should be using when creating the
BlueStore OSD to avoid that issue?



The tmpfs setup is expected. All persistent data for bluestore OSDs 
set up with LVM is stored in LVM metadata. The LVM/udev handler for 
bluestore volumes creates these tmpfs filesystems on the fly and populates 
them with the information from the metadata.



None of our ceph nodes have any persistent data in /var/lib/ceph/osd 
anymore:


root@bcf-01:~# mount
...

/dev/sdm1 on /boot type ext4 (rw,relatime,data=ordered)
tmpfs on /var/lib/ceph/osd/ceph-125 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-128 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-130 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-3 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-1 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-2 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-129 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-5 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-127 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-131 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-6 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-4 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-126 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-124 type tmpfs (rw,relatime)



This works fine on machines using systemd. If your setup does not 
support this, you might want to use the 'simple' ceph-volume mode 
instead of the 'lvm' one. AFAIK it uses the gpt partition type method 
that has been around for years.
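If you want to see what is actually stored there, something along these lines
shows the metadata ceph-volume keeps as LVM tags (output format varies by
release):

ceph-volume lvm list
lvs -o lv_name,lv_tags   # the raw ceph.* tags on the logical volumes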


Regards,

Burkhard


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Self serve / automated S3 key creation?

2019-02-01 Thread Matthew Vernon

Hi,

On 31/01/2019 17:11, shubjero wrote:
Has anyone automated the ability to generate S3 keys for OpenStack users 
in Ceph? Right now we take in a user's request manually (Hey we need an 
S3 API key for our OpenStack project 'X', can you help?). We as 
cloud/ceph admins just use radosgw-admin to create them an access/secret 
key pair for their specific OpenStack project and provide it to them 
manually. Was just wondering if there was a self-serve way to do that. 
Curious to hear what others have done in regards to this.


We've set something up so our Service Desk folks can do this; they use 
"rundeck", so we made a script for rundeck to run which works, in very 
brief outline, thus:


ssh to one of our RGW machines, as a restricted user with forced-command

that user calls a userv service

the userv service does some sanity-checking, then calls a script that 
executes the radosgw-admin command(s) and returns the new keys


the rundeck user has access to user home directories, so it makes a .s3cfg 
file with the returned keys, places it in the user's home directory[0], 
and emails the user (including our "getting started with S3" docs).


...with a similar setup for quota adjustments and the like.
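For reference, the radosgw-admin part of such a script is essentially just
key generation for a (possibly new) user; the uid handling below is an
assumption, not our exact script:

radosgw-admin user info --uid="$PROJECT_ID" >/dev/null 2>&1 ||
    radosgw-admin user create --uid="$PROJECT_ID" --display-name="Project $PROJECT_ID"
radosgw-admin key create --uid="$PROJECT_ID" --key-type=s3 --gen-access-key --gen-secret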

We quota S3 space separately from Openstack volumes and suchlike.

Regards,

Matthew

[0] strictly, the users can override this behaviour with a userv service 
of their own



--
The Wellcome Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE. 
___

ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore deploys to tmpfs?

2019-02-01 Thread Alfredo Deza
On Fri, Feb 1, 2019 at 6:28 AM Burkhard Linke
 wrote:
>
> Hi,
>
> On 2/1/19 11:40 AM, Stuart Longland wrote:
> > Hi all,
> >
> > I'm just in the process of migrating my 3-node Ceph cluster from
> > BTRFS-backed Filestore over to Bluestore.
> >
> > Last weekend I did this with my first node, and while the migration went
> > fine, I noted that the OSD did not survive a reboot test: after
> > rebooting /var/lib/ceph/osd/ceph-0 was completely empty and
> > /etc/init.d/ceph-osd.0 (I run OpenRC init on Gentoo) would refuse to start.
> >
> > https://stuartl.longlandclan.id.au/blog/2019/01/28/solar-cluster-adventures-in-ceph-migration/
> >
> > I managed to recover it, but tonight I'm trying with my second node.
> > I've provisioned a temporary OSD (plugged in via USB3) for it to migrate
> > to using BlueStore.  The ceph cluster called it osd.4.
> >
> > One thing I note is that `ceph-volume` seems to have created a `tmpfs`
> > mount for the new OSD:
> >
> >> tmpfs on /var/lib/ceph/osd/ceph-4 type tmpfs (rw,relatime)
> > Admittedly this is just a temporary OSD, tomorrow I'll be blowing away
> > the *real* OSD on this node (osd.1) and provisioning it again using
> > BlueStore.
> >
> > I really don't want the ohh crap moment I had on Monday afternoon (as
> > one does on the Australia Day long weekend) frantically digging through
> > man pages and having to do the `ceph-bluestore-tool prime-osd-dir` dance.
> >
> > I think mounting tmpfs for something that should be persistent is highly
> > dangerous.  Is there some flag I should be using when creating the
> > BlueStore OSD to avoid that issue?
>
>
> The tmpfs setup is expected. All persistent data for bluestore OSDs
> setup with LVM are stored in LVM metadata. The LVM/udev handler for
> bluestore volumes create these tmpfs filesystems on the fly and populate
> them with the information from the metadata.

That is mostly what happens. There isn't a dependency on UDEV anymore
(yay), but the reason the files live on tmpfs is that *bluestore* spits
them out on activation, which makes the path fully ephemeral (a great
thing!)

The step-by-step is documented in this summary section of 'activate':
http://docs.ceph.com/docs/master/ceph-volume/lvm/activate/#summary

Filestore doesn't have any of these capabilities, which is why it does
have an actual persistent path (vs. tmpfs), and the files come from the
data partition that gets mounted.

>
>
> All our ceph nodes do not have any persistent data in /var/lib/ceph/osd
> anymore:
>
> root@bcf-01:~# mount
> ...
>
> /dev/sdm1 on /boot type ext4 (rw,relatime,data=ordered)
> tmpfs on /var/lib/ceph/osd/ceph-125 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-128 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-130 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-3 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-1 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-2 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-129 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-5 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-127 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-131 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-6 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-4 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-126 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-124 type tmpfs (rw,relatime)
> 
>
>
> This works fine on machines using systemd. If your setup does not
> support this, you might want to use the 'simple' ceph-volume mode
> instead of the 'lvm' one. AFAIK it uses the gpt partition type method
> that has been around for years.
>
> Regards,
>
> Burkhard
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Some objects in the tier pool after detaching.

2019-02-01 Thread Andrey Groshev

Hi, PPL!

I disconnected the tier pool from the data pool.
"rados -p tier.pool ls" shows that there are no objects in the pool.
But "rados df -p=tier.pool" shows:

POOL_NAME USED    OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS  RD      WR_OPS  WR
tier.pool 148 KiB 960     0      2880   0                  0       0        1341533 131 GiB 5276867 2.8 TiB

What are these objects? How can I see them, and are they needed at all?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Explanation of perf dump of rbd

2019-02-01 Thread Jason Dillaman
On Fri, Feb 1, 2019 at 2:31 AM Sinan Polat  wrote:
>
> Thanks for the clarification!
>
> Great that the next release will include the feature. We are running on Red 
> Hat Ceph, so we might have to wait longer before having the feature available.
>
> Another related (simple) question:
> We are using
> /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
> in ceph.conf, can we include the volume name in the path?

Unfortunately there are no metavariables to translate down to the pool
and/or image names. The pool and image names are available within the
perf metrics dump parent object, but you would need to check all RBD
asok files for the correct image if you weren't planning to scrape all
the sockets periodically.
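A rough sketch of such a scrape, assuming the librbd perf sections are named
starting with "librbd" and that jq is available on the host:

for sock in /var/run/ceph/*.asok; do
    ceph --admin-daemon "$sock" perf dump 2>/dev/null |
      jq -r 'to_entries[]
             | select(.key | startswith("librbd"))
             | "\(.key): rd_bytes=\(.value.rd_bytes) wr_bytes=\(.value.wr_bytes)"'
done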

> Sinan
>
> > Op 1 feb. 2019 om 00:44 heeft Jason Dillaman  het 
> > volgende geschreven:
> >
> >> On Thu, Jan 31, 2019 at 12:16 PM Paul Emmerich  
> >> wrote:
> >>
> >> "perf schema" has a description field that may or may not contain
> >> additional information.
> >>
> >> My best guess for these fields would be bytes read/written since
> >> startup of this particular librbd instance. (Based on how these
> >> counters usually work)
> >
> > Correct -- they should be strictly increasing while the image is
> > in-use. If you periodically scrape the values (along w/ the current
> > timestamp), you can convert these values to the rates between the
> > current and previous metrics.
> >
> > On a semi-related subject: the forthcoming Nautilus release will
> > include new "rbd perf image iotop" and "rbd perf image iostat"
> > commands to monitor metrics by RBD image.
> >
> >> Paul
> >>
> >> --
> >> Paul Emmerich
> >>
> >> Looking for help with your Ceph cluster? Contact us at https://croit.io
> >>
> >> croit GmbH
> >> Freseniusstr. 31h
> >> 81247 München
> >> www.croit.io
> >> Tel: +49 89 1896585 90
> >>
> >>> On Thu, Jan 31, 2019 at 3:41 PM Sinan Polat  wrote:
> >>>
> >>> Hi,
> >>>
> >>> I finally figured out how to measure the statistics of a specific RBD 
> >>> volume;
> >>>
> >>> $ ceph --admin-daemon  perf dump
> >>>
> >>>
> >>> It outputs a lot, but I don't know what it means, is there any 
> >>> documentation about the output?
> >>>
> >>> For now the most important values are:
> >>>
> >>> - bytes read
> >>>
> >>> - bytes written
> >>>
> >>>
> >>> I think I need to look at this:
> >>>
> >>> {
> >>> "rd": 1043,
> >>> "rd_bytes": 28242432,
> >>> "rd_latency": {
> >>> "avgcount": 1768,
> >>> "sum": 2.375461133,
> >>> "avgtime": 0.001343586
> >>> },
> >>> "wr": 76,
> >>> "wr_bytes": 247808,
> >>> "wr_latency": {
> >>> "avgcount": 76,
> >>> "sum": 0.970222300,
> >>> "avgtime": 0.012766082
> >>> }
> >>> }
> >>>
> >>>
> >>> But what is 28242432 (rd_bytes) and 247808 (wr_bytes). Is that 28242432 
> >>> bytes read and 247808 bytes written during the last minute/hour/day? Or 
> >>> is it since mounted, or...?
> >>>
> >>>
> >>> Thanks!
> >>>
> >>>
> >>> Sinan
> >>>
> >>> ___
> >>> ceph-users mailing list
> >>> ceph-users@lists.ceph.com
> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> >
> > --
> > Jason
>


-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Correct syntax for "mon host" line in ceph.conf?

2019-02-01 Thread Will Dennis
I am using the "ceph-ansible" set of Ansible playbooks to try to get a test 
cluster up and running (in Vagrant). I am deploying Mimic (13.2.4) on Ubuntu 
16.04, with one (for now) monitor and three OSD servers.

I have a play in the Ansible that is erroring out, and in troubleshooting what 
that play does manually, I see:

vagrant@mon0:~$ sudo ceph --cluster ceph -n mon. -k 
/var/lib/ceph/mon/ceph-mon0/keyring mon_status --format json 
server name not found: [v2:192.168.42.10:3300 (Name or service not known) 
unable to parse addrs in '[v2:192.168.42.10:3300,v1:192.168.42.10:6789]'
2019-02-01 04:26:52.446 7f8a66e0e700 -1 monclient: get_monmap_and_config cannot 
identify monitors to contact [errno 22] error connecting to the cluster

The /etc/ceph/ceph.conf file on mon0 has:

vagrant@mon0:~$ less /etc/ceph/ceph.conf 
# Please do not change this file directly since it is managed by Ansible and 
will be overwritten
[global]
mon initial members = mon0
fsid = 039e920d-5f75-4fa3-aee1-f5a90b073f9e
mon host = [v2:192.168.42.10:3300,v1:192.168.42.10:6789]
public network = 192.168.42.0/24
cluster network = 192.168.42.0/24

If I change the "mon host" line to:

mon host = 192.168.42.10

Then the "mon_status" command works:

vagrant@mon0:~$ sudo ceph --cluster ceph -n mon. -k 
/var/lib/ceph/mon/ceph-mon0/keyring mon_status --format json

{"name":"mon0","rank":0,"state":"leader","election_epoch":5,"quorum":[0],"features":{"required_con":"144115738102218752","required_mon":["kraken","luminous","mimic","osdmap-prune"],"quorum_con":"4611087854031142907","quorum_mon":["kraken","luminous","mimic","osdmap-prune"]},"outside_quorum":[],"extra_probe_peers":[],"sync_provider":[],"monmap":{"epoch":1,"fsid":"839b40b9-4260-485d-ad74-4c4fbb223689","modified":"0.00","created":"0.00","features":{"persistent":["kraken","luminous","mimic","osdmap-prune"],"optional":[]},"mons":[{"rank":0,"name":"mon0","addr":"192.168.42.10:6789/0","public_addr":"192.168.42.10:6789/0"}]},"feature_map":{"mon":[{"features":"0x3ffddff8ffa4fffb","release":"luminous","num":1}],"client":[{"features":"0x3ffddff8ffa4fffb","release":"luminous","num":1}]}}


So, my question is: is the syntax of the "mon host" line that ceph-ansible is 
generating correct?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Correct syntax for "mon host" line in ceph.conf?

2019-02-01 Thread Will Dennis
So the problem was that I was using the "master" branch of ceph-ansible 
instead of a tagged branch...

   From: Sebastien Han [mailto:s...@redhat.com] 
   Sent: Friday, February 01, 2019 9:40 AM
   To: Will Dennis
   Cc: ceph-ansi...@lists.ceph.com
   Subject: Re: [Ceph-ansible] Problem with "ceph-mon : waiting for the 
monitor(s) to form the quorum" play

   Are you trying to deploy mimic or luminous with the latest ceph-ansible master?
   If so, then your failure is expected.

   Use stable-3.2 to deploy L or M; current master works only with Nautilus 
(not released yet).

   Thanks!
   -
   Sébastien Han
   Principal Software Engineer, Storage Architect

Sorry for the bother.


-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Will 
Dennis
Sent: Friday, February 01, 2019 10:02 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Correct syntax for "mon host" line in ceph.conf?

I am using the "ceph-ansible" set of Ansible playbooks to try to get a test 
cluster up and running (in Vagrant.) I am deploying Mimic (13.2.4) on Ubuntu 
16.04, with one (for now) monitor, and three osd servers.

I have a play in the Ansible that is erroring out, and in troubleshooting what 
that play does manually, I see:

vagrant@mon0:~$ sudo ceph --cluster ceph -n mon. -k 
/var/lib/ceph/mon/ceph-mon0/keyring mon_status --format json server name not 
found: [v2:192.168.42.10:3300 (Name or service not known) unable to parse addrs 
in '[v2:192.168.42.10:3300,v1:192.168.42.10:6789]'
2019-02-01 04:26:52.446 7f8a66e0e700 -1 monclient: get_monmap_and_config cannot 
identify monitors to contact [errno 22] error connecting to the cluster

The /etc/ceph/ceph.conf file on mon0 has:

vagrant@mon0:~$ less /etc/ceph/ceph.conf # Please do not change this file 
directly since it is managed by Ansible and will be overwritten [global] mon 
initial members = mon0 fsid = 039e920d-5f75-4fa3-aee1-f5a90b073f9e
mon host = [v2:192.168.42.10:3300,v1:192.168.42.10:6789]
public network = 192.168.42.0/24
cluster network = 192.168.42.0/24

If I change the "mon host" line to:

mon host = 192.168.42.10

Then the "mon_status" command works:

vagrant@mon0:~$ sudo ceph --cluster ceph -n mon. -k 
/var/lib/ceph/mon/ceph-mon0/keyring mon_status --format json

{"name":"mon0","rank":0,"state":"leader","election_epoch":5,"quorum":[0],"features":{"required_con":"144115738102218752","required_mon":["kraken","luminous","mimic","osdmap-prune"],"quorum_con":"4611087854031142907","quorum_mon":["kraken","luminous","mimic","osdmap-prune"]},"outside_quorum":[],"extra_probe_peers":[],"sync_provider":[],"monmap":{"epoch":1,"fsid":"839b40b9-4260-485d-ad74-4c4fbb223689","modified":"0.00","created":"0.00","features":{"persistent":["kraken","luminous","mimic","osdmap-prune"],"optional":[]},"mons":[{"rank":0,"name":"mon0","addr":"192.168.42.10:6789/0","public_addr":"192.168.42.10:6789/0"}]},"feature_map":{"mon":[{"features":"0x3ffddff8ffa4fffb","release":"luminous","num":1}],"client":[{"features":"0x3ffddff8ffa4fffb","release":"luminous","num":1}]}}


So, my question is: is the syntax of the "mon host" line that ceph-ansible is 
generating correct?
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Bluestore HDD Cluster Advice

2019-02-01 Thread John Petrini
Hello,

We'll soon be building out four new luminous clusters with Bluestore.
Our current clusters are running filestore so we're not very familiar
with Bluestore yet and I'd like to have an idea of what to expect.

Here are the OSD hardware specs (5x per cluster):
2x 3.0GHz 18c/36t
22x 1.8TB 10K SAS (RAID1 OS + 20 OSD's)
5x 480GB Intel S4610 SSD's (WAL and DB)
192 GB RAM
4X Mellanox 25GB NIC
PERC H730p

With filestore we've found that we can achieve sub-millisecond write
latency by running very fast journals (currently Intel S4610's). My
main concern is that Bluestore doesn't use journals and instead writes
directly to the higher latency HDD; in theory resulting in slower acks
and higher write latency. How does Bluestore handle this? Can we
expect similar or better performance than our current filestore
clusters?

I've heard it repeated that Bluestore performs better than Filestore,
but I've also heard some people claim this is not always the case
with HDDs. Is there any truth to that, and if so, is there a
configuration we can use to achieve this same type of performance with
Bluestore?

Thanks all.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v12.2.11 Luminous released

2019-02-01 Thread Neha Ojha
On Fri, Feb 1, 2019 at 1:11 AM Mark Schouten  wrote:
>
> On Fri, Feb 01, 2019 at 08:44:51AM +0100, Abhishek wrote:
> > * This release fixes the pg log hard limit bug that was introduced in
> >   12.2.9, https://tracker.ceph.com/issues/36686.  A flag called
> >   `pglog_hardlimit` has been introduced, which is off by default. Enabling
> >   this flag will limit the length of the pg log.  In order to enable
> >   that, the flag must be set by running `ceph osd set pglog_hardlimit`
> >   after completely upgrading to 12.2.11. Once the cluster has this flag
> >   set, the length of the pg log will be capped by a hard limit. Once set,
> >   this flag *must not* be unset anymore.
>
> I'm confused about this. I have a cluster runnine 12.2.9, but should a
> just upgrade and be done with it, or should I execute the steps
> mentioned above? The pglog_hardlimit is off by default, which suggests I
> should not do anything. But since it is related to this bug which I may
> or may not be hitting, I'm not sure.

If you had hit the bug, you would have seen failures like
https://tracker.ceph.com/issues/36686.
Yes, pglog_hardlimit is off by default in 12.2.11. Since you are
running 12.2.9 (which has the patch that allows you to limit the length
of the pg log), you can follow the steps above: upgrade to 12.2.11 and
set the flag.
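In other words, the sequence would roughly be:

ceph versions                  # confirm every mon/osd reports 12.2.11
ceph osd set pglog_hardlimit   # only after the whole cluster is upgraded; must not be unset afterwards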

>
> > * There have been fixes to RGW dynamic and manual resharding, which no
> > longer
> >   leaves behind stale bucket instances to be removed manually. For finding
> > and
> >   cleaning up older instances from a reshard a radosgw-admin command
> > `reshard
> >   stale-instances list` and `reshard stale-instances rm` should do the
> > necessary
> >   cleanup.
>
>
> Very happy about this! It will cleanup my cluster for sure! This also closes
> https://tracker.ceph.com/issues/23651 I think?
>
> --
> Mark Schouten  | Tuxis Internet Engineering
> KvK: 61527076  | http://www.tuxis.nl/
> T: 0318 200208 | i...@tuxis.nl
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore deploys to tmpfs?

2019-02-01 Thread Stuart Longland
On 1/2/19 10:43 pm, Alfredo Deza wrote:
>>> I think mounting tmpfs for something that should be persistent is highly
>>> dangerous.  Is there some flag I should be using when creating the
>>> BlueStore OSD to avoid that issue?
>>
>> The tmpfs setup is expected. All persistent data for bluestore OSDs
>> setup with LVM are stored in LVM metadata. The LVM/udev handler for
>> bluestore volumes create these tmpfs filesystems on the fly and populate
>> them with the information from the metadata.
> That is mostly what happens. There isn't a dependency on UDEV anymore
> (yay), but the reason why files are mounted on tmpfs
> is because *bluestore* spits them out on activation, this makes the
> path fully ephemeral (a great thing!)
> 
> The step-by-step is documented in this summary section of  'activate'
> http://docs.ceph.com/docs/master/ceph-volume/lvm/activate/#summary
> 
> Filestore doesn't have any of these capabilities and it is why it does
> have an actual existing path (vs. tmpfs), and the files come from the
> data partition that
> gets mounted.
> 

Well, for whatever reason, ceph-osd isn't calling the activate script
before it starts up.

It is worth noting that the systems I'm using do not use systemd out of
simplicity.  I might need to write an init script to do that.  It wasn't
clear last weekend what commands I needed to run to activate a BlueStore
OSD.

For now though, it sounds like tarring up the data directory, unmounting
the tmpfs, then unpacking the tar is a good-enough work-around.  That's
what I've done for my second node (now that I know of the problem), so it
should survive a reboot now.

The only other two steps were to ensure `lvm` was marked to start at
boot (so it would bring up all the volume groups) and that there was a
UDEV rule in place to set the ownership on the LVM VGs for Ceph.
-- 
Stuart Longland (aka Redhatter, VK4MSL)

I haven't lost my mind...
  ...it's backed up on a tape somewhere.



signature.asc
Description: OpenPGP digital signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore deploys to tmpfs?

2019-02-01 Thread Alfredo Deza
On Fri, Feb 1, 2019 at 3:08 PM Stuart Longland
 wrote:
>
> On 1/2/19 10:43 pm, Alfredo Deza wrote:
> >>> I think mounting tmpfs for something that should be persistent is highly
> >>> dangerous.  Is there some flag I should be using when creating the
> >>> BlueStore OSD to avoid that issue?
> >>
> >> The tmpfs setup is expected. All persistent data for bluestore OSDs
> >> setup with LVM are stored in LVM metadata. The LVM/udev handler for
> >> bluestore volumes create these tmpfs filesystems on the fly and populate
> >> them with the information from the metadata.
> > That is mostly what happens. There isn't a dependency on UDEV anymore
> > (yay), but the reason why files are mounted on tmpfs
> > is because *bluestore* spits them out on activation, this makes the
> > path fully ephemeral (a great thing!)
> >
> > The step-by-step is documented in this summary section of  'activate'
> > http://docs.ceph.com/docs/master/ceph-volume/lvm/activate/#summary
> >
> > Filestore doesn't have any of these capabilities and it is why it does
> > have an actual existing path (vs. tmpfs), and the files come from the
> > data partition that
> > gets mounted.
> >
>
> Well, for whatever reason, ceph-osd isn't calling the activate script
> before it starts up.

ceph-osd doesn't call the activate script. Systemd is the one that
calls ceph-volume to activate OSDs.
>
> It is worth noting that the systems I'm using do not use systemd out of
> simplicity.  I might need to write an init script to do that.  It wasn't
> clear last weekend what commands I needed to run to activate a BlueStore
> OSD.

If deployed with ceph-volume, you can just do:

ceph-volume lvm activate --all

>
> For now though, sounds like tarring up the data directory, unmounting
> the tmpfs then unpacking the tar is a good-enough work-around.  That's
> what I've done for my second node (now I know of the problem) and so it
> should survive a reboot now.

There is no need to tar anything. Calling out to ceph-volume to
activate everything should just work.

>
> The only other two steps were to ensure `lvm` was marked to start at
> boot (so it would bring up all the volume groups) and that there was a
> UDEV rule in place to set the ownership on the LVM VGs for Ceph.

Right, you do need to ensure LVM is installed/enabled. But *for sure*
there is no need for UDEV rules to set any ownership for Ceph; this is
a task that ceph-volume handles.
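For a host without systemd, a minimal boot hook could be as simple as the
sketch below; the /etc/local.d path and the OpenRC service name are
assumptions for a Gentoo setup, while the ceph-volume flags themselves are real:

#!/bin/sh
# /etc/local.d/ceph-volume.start
vgchange -ay                                  # make sure the ceph VGs are active
ceph-volume lvm activate --all --no-systemd   # recreate the tmpfs OSD dirs from the LVM tags
# then start the OSDs through the init system, e.g.:
# rc-service ceph-osd.4 start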

> --
> Stuart Longland (aka Redhatter, VK4MSL)
>
> I haven't lost my mind...
>   ...it's backed up on a tape somewhere.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v12.2.11 Luminous released

2019-02-01 Thread Robert Sander
On 01.02.19 at 19:06, Neha Ojha wrote:

> If you would have hit the bug, you should have seen failures like
> https://tracker.ceph.com/issues/36686.
> Yes, pglog_hardlimit is off by default in 12.2.11. Since you are
> running 12.2.9(which has the patch that allows you to limit the length
> of the pg log), you could follow the steps and upgrade to 12.2.11 and
> set this flag.

The question is: If I am now on 12.2.9 and see no issues, do I have to
set this flag after upgrading to 12.2.11?

Regards
-- 
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Mandatory disclosures per §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Managing director: Peer Heinlein -- Registered office: Berlin



signature.asc
Description: OpenPGP digital signature
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v12.2.11 Luminous released

2019-02-01 Thread Neha Ojha
On Fri, Feb 1, 2019 at 1:09 PM Robert Sander
 wrote:
>
> On 01.02.19 at 19:06, Neha Ojha wrote:
>
> > If you would have hit the bug, you should have seen failures like
> > https://tracker.ceph.com/issues/36686.
> > Yes, pglog_hardlimit is off by default in 12.2.11. Since you are
> > running 12.2.9(which has the patch that allows you to limit the length
> > of the pg log), you could follow the steps and upgrade to 12.2.11 and
> > set this flag.
>
> The question is: If I am now on 12.2.9 and see no issues, do I have to
> set this flag after upgrading to 12.2.11?
You don't have to.
This flag lets you restrict the length of your pg logs, so if you do
not want to use this functionality, there is no need to set it.

>
> Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> http://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Mandatory disclosures per §35a GmbHG:
> HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> Managing director: Peer Heinlein -- Registered office: Berlin
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RBD default pool

2019-02-01 Thread solarflow99
I thought a new cluster would have the 'rbd' pool already created; has this
changed?  I'm using Mimic.


# rbd ls
rbd: error opening default pool 'rbd'
Ensure that the default pool has been created or specify an alternate pool
name.
rbd: list: (2) No such file or directory
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD default pool

2019-02-01 Thread Alan Johnson
Confirmed: no pools are created by default with Mimic.
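If you want the old behaviour back, you can simply create the pool yourself,
e.g. (the pg_num value is just an example):

ceph osd pool create rbd 128
rbd pool init rbd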

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
solarflow99
Sent: Friday, February 1, 2019 2:28 PM
To: Ceph Users 
Subject: [ceph-users] RBD default pool

I thought a new cluster would have the 'rbd' pool already created, has this 
changed?  I'm using mimic.


# rbd ls
rbd: error opening default pool 'rbd'
Ensure that the default pool has been created or specify an alternate pool name.
rbd: list: (2) No such file or directory


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Problem replacing osd with ceph-deploy

2019-02-01 Thread Shain Miley

Hi,

I went to replace a disk today (which I had not had to do in a while) 
and after I added it the results looked rather odd compared to times past:


I was attempting to replace /dev/sdk on one of our osd nodes:

#ceph-deploy disk zap hqosd7 /dev/sdk
#ceph-deploy osd create --data /dev/sdk hqosd7

[ceph_deploy.conf][DEBUG ] found configuration file at: 
/root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/local/bin/ceph-deploy 
osd create --data /dev/sdk hqosd7

[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose   : False
[ceph_deploy.cli][INFO  ]  bluestore : None
[ceph_deploy.cli][INFO  ]  cd_conf   : 


[ceph_deploy.cli][INFO  ]  cluster   : ceph
[ceph_deploy.cli][INFO  ]  fs_type   : xfs
[ceph_deploy.cli][INFO  ]  block_wal : None
[ceph_deploy.cli][INFO  ]  default_release   : False
[ceph_deploy.cli][INFO  ]  username  : None
[ceph_deploy.cli][INFO  ]  journal   : None
[ceph_deploy.cli][INFO  ]  subcommand    : create
[ceph_deploy.cli][INFO  ]  host  : hqosd7
[ceph_deploy.cli][INFO  ]  filestore : None
[ceph_deploy.cli][INFO  ]  func  : at 0x7fa3b14b3398>

[ceph_deploy.cli][INFO  ]  ceph_conf : None
[ceph_deploy.cli][INFO  ]  zap_disk  : False
[ceph_deploy.cli][INFO  ]  data  : /dev/sdk
[ceph_deploy.cli][INFO  ]  block_db  : None
[ceph_deploy.cli][INFO  ]  dmcrypt   : False
[ceph_deploy.cli][INFO  ]  overwrite_conf    : False
[ceph_deploy.cli][INFO  ]  dmcrypt_key_dir   : 
/etc/ceph/dmcrypt-keys

[ceph_deploy.cli][INFO  ]  quiet : False
[ceph_deploy.cli][INFO  ]  debug : False
[ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device 
/dev/sdk

[hqosd7][DEBUG ] connected to host: hqosd7
[hqosd7][DEBUG ] detect platform information from remote host
[hqosd7][DEBUG ] detect machine type
[hqosd7][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: Ubuntu 16.04 xenial
[ceph_deploy.osd][DEBUG ] Deploying osd to hqosd7
[hqosd7][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[hqosd7][DEBUG ] find the location of an executable
[hqosd7][INFO  ] Running command: /usr/sbin/ceph-volume --cluster ceph 
lvm create --bluestore --data /dev/sdk

[hqosd7][DEBUG ] Running command: /usr/bin/ceph-authtool --gen-print-key
[hqosd7][DEBUG ] Running command: /usr/bin/ceph --cluster ceph --name 
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring 
-i - osd new c98a11d1-9b7f-487e-8c69-72fc662927d4
[hqosd7][DEBUG ] Running command: vgcreate --force --yes 
ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1 /dev/sdk

[hqosd7][DEBUG ]  stdout: Physical volume "/dev/sdk" successfully created
[hqosd7][DEBUG ]  stdout: Volume group 
"ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1" successfully created
[hqosd7][DEBUG ] Running command: lvcreate --yes -l 100%FREE -n 
osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4 
ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1
[hqosd7][DEBUG ]  stdout: Logical volume 
"osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4" created.

[hqosd7][DEBUG ] Running command: /usr/bin/ceph-authtool --gen-print-key
[hqosd7][DEBUG ] Running command: mount -t tmpfs tmpfs 
/var/lib/ceph/osd/ceph-81

[hqosd7][DEBUG ] Running command: chown -R ceph:ceph /dev/dm-0
[hqosd7][DEBUG ] Running command: ln -s 
/dev/ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1/osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4 
/var/lib/ceph/osd/ceph-81/block
[hqosd7][DEBUG ] Running command: ceph --cluster ceph --name 
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring 
mon getmap -o /var/lib/ceph/osd/ceph-81/activate.monmap

[hqosd7][DEBUG ]  stderr: got monmap epoch 2
[hqosd7][DEBUG ] Running command: ceph-authtool 
/var/lib/ceph/osd/ceph-81/keyring --create-keyring --name osd.81 
--add-key AQCyyFRcSwWqGBAAKZR8rcWIEknj/o3rsehOdA==

[hqosd7][DEBUG ]  stdout: creating /var/lib/ceph/osd/ceph-81/keyring
[hqosd7][DEBUG ]  stdout: added entity osd.81 auth auth(auid = 
18446744073709551615 key=AQCyyFRcSwWqGBAAKZR8rcWIEknj/o3rsehOdA== with 0 
caps)
[hqosd7][DEBUG ] Running command: chown -R ceph:ceph 
/var/lib/ceph/osd/ceph-81/keyring
[hqosd7][DEBUG ] Running command: chown -R ceph:ceph 
/var/lib/ceph/osd/ceph-81/
[hqosd7][DEBUG ] Running command: /usr/bin/ceph-osd --cluster ceph 
--osd-objectstore bluestore --mkfs -i 81 --monmap 
/var/lib/ceph/osd/ceph-81/activate.monmap --keyfile - --osd-data 
/var/lib/ceph/osd/ceph-81/ --osd-uuid 
c98a11d1-9b7f-487e-8c69-72fc662927d4 --setuser ceph --setgroup ceph

[hqosd7][DEBUG ] --> ceph-volume lvm prepare successful for: /dev/sdk
[hqosd7][DEBUG ] Running command: cep

Re: [ceph-users] Problem replacing osd with ceph-deploy

2019-02-01 Thread Vladimir Prokofev
Your output looks a bit weird, but still, this is normal for bluestore. It
creates a small separate metadata directory that is presented as a filesystem
(tmpfs with ceph-volume lvm) mounted in /var/lib/ceph/osd, while the real data
partition is hidden as a raw (bluestore) block device.
It's no longer possible to check disk utilisation with df when using bluestore.
To check your OSD capacity, use 'ceph osd df'.

Sat, 2 Feb 2019 at 02:07, Shain Miley wrote:

> Hi,
>
> I went to replace a disk today (which I had not had to do in a while)
> and after I added it the results looked rather odd compared to times past:
>
> I was attempting to replace /dev/sdk on one of our osd nodes:
>
> #ceph-deploy disk zap hqosd7 /dev/sdk
> #ceph-deploy osd create --data /dev/sdk hqosd7
>
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /root/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/local/bin/ceph-deploy
> osd create --data /dev/sdk hqosd7
> [ceph_deploy.cli][INFO  ] ceph-deploy options:
> [ceph_deploy.cli][INFO  ]  verbose   : False
> [ceph_deploy.cli][INFO  ]  bluestore : None
> [ceph_deploy.cli][INFO  ]  cd_conf   :
> 
> [ceph_deploy.cli][INFO  ]  cluster   : ceph
> [ceph_deploy.cli][INFO  ]  fs_type   : xfs
> [ceph_deploy.cli][INFO  ]  block_wal : None
> [ceph_deploy.cli][INFO  ]  default_release   : False
> [ceph_deploy.cli][INFO  ]  username  : None
> [ceph_deploy.cli][INFO  ]  journal   : None
> [ceph_deploy.cli][INFO  ]  subcommand: create
> [ceph_deploy.cli][INFO  ]  host  : hqosd7
> [ceph_deploy.cli][INFO  ]  filestore : None
> [ceph_deploy.cli][INFO  ]  func  :  at 0x7fa3b14b3398>
> [ceph_deploy.cli][INFO  ]  ceph_conf : None
> [ceph_deploy.cli][INFO  ]  zap_disk  : False
> [ceph_deploy.cli][INFO  ]  data  : /dev/sdk
> [ceph_deploy.cli][INFO  ]  block_db  : None
> [ceph_deploy.cli][INFO  ]  dmcrypt   : False
> [ceph_deploy.cli][INFO  ]  overwrite_conf: False
> [ceph_deploy.cli][INFO  ]  dmcrypt_key_dir   :
> /etc/ceph/dmcrypt-keys
> [ceph_deploy.cli][INFO  ]  quiet : False
> [ceph_deploy.cli][INFO  ]  debug : False
> [ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device
> /dev/sdk
> [hqosd7][DEBUG ] connected to host: hqosd7
> [hqosd7][DEBUG ] detect platform information from remote host
> [hqosd7][DEBUG ] detect machine type
> [hqosd7][DEBUG ] find the location of an executable
> [ceph_deploy.osd][INFO  ] Distro info: Ubuntu 16.04 xenial
> [ceph_deploy.osd][DEBUG ] Deploying osd to hqosd7
> [hqosd7][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
> [hqosd7][DEBUG ] find the location of an executable
> [hqosd7][INFO  ] Running command: /usr/sbin/ceph-volume --cluster ceph
> lvm create --bluestore --data /dev/sdk
> [hqosd7][DEBUG ] Running command: /usr/bin/ceph-authtool --gen-print-key
> [hqosd7][DEBUG ] Running command: /usr/bin/ceph --cluster ceph --name
> client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
> -i - osd new c98a11d1-9b7f-487e-8c69-72fc662927d4
> [hqosd7][DEBUG ] Running command: vgcreate --force --yes
> ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1 /dev/sdk
> [hqosd7][DEBUG ]  stdout: Physical volume "/dev/sdk" successfully created
> [hqosd7][DEBUG ]  stdout: Volume group
> "ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1" successfully created
> [hqosd7][DEBUG ] Running command: lvcreate --yes -l 100%FREE -n
> osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4
> ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1
> [hqosd7][DEBUG ]  stdout: Logical volume
> "osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4" created.
> [hqosd7][DEBUG ] Running command: /usr/bin/ceph-authtool --gen-print-key
> [hqosd7][DEBUG ] Running command: mount -t tmpfs tmpfs
> /var/lib/ceph/osd/ceph-81
> [hqosd7][DEBUG ] Running command: chown -R ceph:ceph /dev/dm-0
> [hqosd7][DEBUG ] Running command: ln -s
> /dev/ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1/osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4
>
> /var/lib/ceph/osd/ceph-81/block
> [hqosd7][DEBUG ] Running command: ceph --cluster ceph --name
> client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
> mon getmap -o /var/lib/ceph/osd/ceph-81/activate.monmap
> [hqosd7][DEBUG ]  stderr: got monmap epoch 2
> [hqosd7][DEBUG ] Running command: ceph-authtool
> /var/lib/ceph/osd/ceph-81/keyring --create-keyring --name osd.81
> --add-key AQCyyFRcSwWqGBAAKZR8rcWIEknj/o3rsehOdA==
> [hqosd7][DEBUG ]  stdout: creating /var/lib/ceph/osd/ceph-81/keyring
> [hqosd7][DEBUG ]  stdout: added entity osd.81 auth auth(auid =
> 18446744073709551615 key=AQCyyFRcSwWqGBAAKZR8rcWIEknj/o3rsehOdA== with 0
> caps)
> [hqosd7][DEBUG ] Running command: cho

Re: [ceph-users] Problem replacing osd with ceph-deploy

2019-02-01 Thread Shain Miley

O.k., thank you!

I removed the OSD after the fact just in case, but I will re-add it and 
update the thread if things still don't look right.


Shain

On 2/1/19 6:35 PM, Vladimir Prokofev wrote:
Your output looks a bit weird, but still, this is normal for 
bluestore. It creates a small separate metadata directory that is presented 
as a filesystem (tmpfs with ceph-volume lvm) mounted in /var/lib/ceph/osd, 
while the real data partition is hidden as a raw (bluestore) block device.

It's no longer possible to check disk utilisation with df when using bluestore.
To check your OSD capacity, use 'ceph osd df'.

Sat, 2 Feb 2019 at 02:07, Shain Miley wrote:


Hi,

I went to replace a disk today (which I had not had to do in a while)
and after I added it the results looked rather odd compared to
times past:

I was attempting to replace /dev/sdk on one of our osd nodes:

#ceph-deploy disk zap hqosd7 /dev/sdk
#ceph-deploy osd create --data /dev/sdk hqosd7

[ceph_deploy.conf][DEBUG ] found configuration file at:
/root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/local/bin/ceph-deploy
osd create --data /dev/sdk hqosd7
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose   : False
[ceph_deploy.cli][INFO  ]  bluestore : None
[ceph_deploy.cli][INFO  ]  cd_conf   :

[ceph_deploy.cli][INFO  ]  cluster   : ceph
[ceph_deploy.cli][INFO  ]  fs_type   : xfs
[ceph_deploy.cli][INFO  ]  block_wal : None
[ceph_deploy.cli][INFO  ]  default_release   : False
[ceph_deploy.cli][INFO  ]  username  : None
[ceph_deploy.cli][INFO  ]  journal   : None
[ceph_deploy.cli][INFO  ]  subcommand    : create
[ceph_deploy.cli][INFO  ]  host  : hqosd7
[ceph_deploy.cli][INFO  ]  filestore : None
[ceph_deploy.cli][INFO  ]  func  :

[ceph_deploy.cli][INFO  ]  ceph_conf : None
[ceph_deploy.cli][INFO  ]  zap_disk  : False
[ceph_deploy.cli][INFO  ]  data  : /dev/sdk
[ceph_deploy.cli][INFO  ]  block_db  : None
[ceph_deploy.cli][INFO  ]  dmcrypt   : False
[ceph_deploy.cli][INFO  ]  overwrite_conf    : False
[ceph_deploy.cli][INFO  ]  dmcrypt_key_dir   :
/etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO  ]  quiet : False
[ceph_deploy.cli][INFO  ]  debug : False
[ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data
device
/dev/sdk
[hqosd7][DEBUG ] connected to host: hqosd7
[hqosd7][DEBUG ] detect platform information from remote host
[hqosd7][DEBUG ] detect machine type
[hqosd7][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: Ubuntu 16.04 xenial
[ceph_deploy.osd][DEBUG ] Deploying osd to hqosd7
[hqosd7][DEBUG ] write cluster configuration to
/etc/ceph/{cluster}.conf
[hqosd7][DEBUG ] find the location of an executable
[hqosd7][INFO  ] Running command: /usr/sbin/ceph-volume --cluster
ceph
lvm create --bluestore --data /dev/sdk
[hqosd7][DEBUG ] Running command: /usr/bin/ceph-authtool
--gen-print-key
[hqosd7][DEBUG ] Running command: /usr/bin/ceph --cluster ceph --name
client.bootstrap-osd --keyring
/var/lib/ceph/bootstrap-osd/ceph.keyring
-i - osd new c98a11d1-9b7f-487e-8c69-72fc662927d4
[hqosd7][DEBUG ] Running command: vgcreate --force --yes
ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1 /dev/sdk
[hqosd7][DEBUG ]  stdout: Physical volume "/dev/sdk" successfully
created
[hqosd7][DEBUG ]  stdout: Volume group
"ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1" successfully created
[hqosd7][DEBUG ] Running command: lvcreate --yes -l 100%FREE -n
osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4
ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1
[hqosd7][DEBUG ]  stdout: Logical volume
"osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4" created.
[hqosd7][DEBUG ] Running command: /usr/bin/ceph-authtool
--gen-print-key
[hqosd7][DEBUG ] Running command: mount -t tmpfs tmpfs
/var/lib/ceph/osd/ceph-81
[hqosd7][DEBUG ] Running command: chown -R ceph:ceph /dev/dm-0
[hqosd7][DEBUG ] Running command: ln -s

/dev/ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1/osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4

/var/lib/ceph/osd/ceph-81/block
[hqosd7][DEBUG ] Running command: ceph --cluster ceph --name
client.bootstrap-osd --keyring
/var/lib/ceph/bootstrap-osd/ceph.keyring
mon getmap -o /var/lib/ceph/osd/ceph-81/activate.monmap
[hqosd7][DEBUG ]  stderr: got monmap epoch 2
[hqosd7][DEBUG ] Running comman

Re: [ceph-users] block.db on a LV? (Re: Mixed SSD+HDD OSD setup recommendation)

2019-02-01 Thread ceph
Hello @all,

On 18 January 2019 14:29:42 CET, Alfredo Deza wrote:
>On Fri, Jan 18, 2019 at 7:21 AM Jan Kasprzak  wrote:
>>
>> Eugen Block wrote:
>> : Hi Jan,
>> :
>> : I think you're running into an issue reported a couple of times.
>> : For the use of LVM you have to specify the name of the Volume Group
>> : and the respective Logical Volume instead of the path, e.g.
>> :
>> : ceph-volume lvm prepare --bluestore --block.db ssd_vg/ssd00 --data
>/dev/sda
>>
>> Eugen,
>>
>> thanks, I will try it. In the meantime, I have discovered another way
>> how to get around it: convert my SSDs from MBR to GPT partition
>table,
>> and then create 15 additional GPT partitions for the respective
>block.dbs
>> instead of 2x15 LVs.
>
>This is because ceph-volume can accept both LVs or GPT partitions for
>block.db
>
>Another way around this, that doesn't require you to create the LVs is
>to use the `batch` sub-command, that will automatically
>detect your HDD and put data on it, and detect the SSD and create the
>block.db LVs. The command could look something like:
>
>
>ceph-volume lvm batch --bluestore /dev/sda /dev/sdb /dev/sdc /dev/sdd
>/dev/nvme0n1
>
>Would create 4 OSDs, place data on: sda, sdb, sdc, and sdd. And create
>4 block.db LVs on nvme0n1
>

How would you replace, let's say, sdc (osd.2) in this case?

Could you please give a short step-by-step howto?

Thanks in advance for your great job on ceph-volume, @alfredo

- Mehmet 

>
>
>>
>> -Yenya
>>
>> --
>> | Jan "Yenya" Kasprzak private}> |
>> | http://www.fi.muni.cz/~kas/ GPG:
>4096R/A45477D5 |
>>  This is the world we live in: the way to deal with computers is to
>google
>>  the symptoms, and hope that you don't have to watch a video. --P.
>Zaitcev
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD default pool

2019-02-01 Thread Carlos Mogas da Silva

On 01/02/2019 22:40, Alan Johnson wrote:

Confirm that no pools are created by default with Mimic.


I can confirm that. Mimic deploy doesn't create any pools.



From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of solarflow99
Sent: Friday, February 1, 2019 2:28 PM
To: Ceph Users 
Subject: [ceph-users] RBD default pool

I thought a new cluster would have the 'rbd' pool already created, has this 
changed?  I'm using mimic.

# rbd ls
rbd: error opening default pool 'rbd'
Ensure that the default pool has been created or specify an alternate pool name.
rbd: list: (2) No such file or directory


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] backfill_toofull after adding new OSDs

2019-02-01 Thread Fyodor Ustinov
Hi!

Right now, after adding an OSD:

# ceph health detail
HEALTH_ERR 74197563/199392333 objects misplaced (37.212%); Degraded data 
redundancy (low space): 1 pg backfill_toofull
OBJECT_MISPLACED 74197563/199392333 objects misplaced (37.212%)
PG_DEGRADED_FULL Degraded data redundancy (low space): 1 pg backfill_toofull
pg 6.eb is active+remapped+backfill_wait+backfill_toofull, acting [21,0,47]

# ceph pg ls-by-pool iscsi backfill_toofull
PG   OBJECTS DEGRADED MISPLACED UNFOUND BYTES      LOG  STATE                                          STATE_STAMP                VERSION   REPORTED   UP         ACTING       SCRUB_STAMP                DEEP_SCRUB_STAMP
6.eb 6450  1290   0 1645654016 3067 active+remapped+backfill_wait+backfill_toofull 2019-02-02 00:20:32.975300 7208'6567 9790:16214 [5,1,21]p5 [21,0,47]p21 2019-01-18 04:13:54.280495 2019-01-18 04:13:54.280495

All OSDs have less than 40% USE.

ID CLASS WEIGHT  REWEIGHT SIZEUSE AVAIL   %USE  VAR  PGS
 0   hdd 9.56149  1.0 9.6 TiB 3.2 TiB 6.3 TiB 33.64 1.31 313
 1   hdd 9.56149  1.0 9.6 TiB 3.3 TiB 6.3 TiB 34.13 1.33 295
 5   hdd 9.56149  1.0 9.6 TiB 756 GiB 8.8 TiB  7.72 0.30 103
47   hdd 9.32390  1.0 9.3 TiB 3.1 TiB 6.2 TiB 33.75 1.31 306

(all other OSDs are also below 40%)

ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)

Maybe the developers will take notice of this message and say something?

- Original Message -
From: "Fyodor Ustinov" 
To: "Caspar Smit" 
Cc: "Jan Kasprzak" , "ceph-users" 
Sent: Thursday, 31 January, 2019 16:50:24
Subject: Re: [ceph-users] backfill_toofull after adding new OSDs

Hi!

I have seen the same thing several times when adding a new OSD to the cluster: 
one or two PGs in "backfill_toofull" state.

In all versions of Mimic.

- Original Message -
From: "Caspar Smit" 
To: "Jan Kasprzak" 
Cc: "ceph-users" 
Sent: Thursday, 31 January, 2019 15:43:07
Subject: Re: [ceph-users] backfill_toofull after adding new OSDs

Hi Jan, 

You might be hitting the same issue as Wido here: 

https://www.spinics.net/lists/ceph-users/msg50603.html 

Kind regards, 
Caspar 

On Thu, 31 Jan 2019 at 14:36, Jan Kasprzak <k...@fi.muni.cz> wrote: 


Hello, ceph users, 

I see the following HEALTH_ERR during cluster rebalance: 

Degraded data redundancy (low space): 8 pgs backfill_toofull 

Detailed description: 
I have upgraded my cluster to mimic and added 16 new bluestore OSDs 
on 4 hosts. The hosts are in a separate region in my crush map, and crush 
rules prevented data to be moved on the new OSDs. Now I want to move 
all data to the new OSDs (and possibly decomission the old filestore OSDs). 
I have created the following rule: 

# ceph osd crush rule create-replicated on-newhosts newhostsroot host 

after this, I am slowly moving the pools one-by-one to this new rule: 

# ceph osd pool set test-hdd-pool crush_rule on-newhosts 

When I do this, I get the above error. This is misleading, because 
ceph osd df does not suggest the OSDs are getting full (the most full 
OSD is about 41 % full). After rebalancing is done, the HEALTH_ERR 
disappears. Why am I getting this error? 

# ceph -s 
cluster: 
id: ...my UUID... 
health: HEALTH_ERR 
1271/3803223 objects misplaced (0.033%) 
Degraded data redundancy: 40124/3803223 objects degraded (1.055%), 65 pgs 
degraded, 67 pgs undersized 
Degraded data redundancy (low space): 8 pgs backfill_toofull 

services: 
mon: 3 daemons, quorum mon1,mon2,mon3 
mgr: mon2(active), standbys: mon1, mon3 
osd: 80 osds: 80 up, 80 in; 90 remapped pgs 
rgw: 1 daemon active 

data: 
pools: 13 pools, 5056 pgs 
objects: 1.27 M objects, 4.8 TiB 
usage: 15 TiB used, 208 TiB / 224 TiB avail 
pgs: 40124/3803223 objects degraded (1.055%) 
1271/3803223 objects misplaced (0.033%) 
4963 active+clean 
41 active+recovery_wait+undersized+degraded+remapped 
21 active+recovery_wait+undersized+degraded 
17 active+remapped+backfill_wait 
5 active+remapped+backfill_wait+backfill_toofull 
3 active+remapped+backfill_toofull 
2 active+recovering+undersized+remapped 
2 active+recovering+undersized+degraded+remapped 
1 active+clean+remapped 
1 active+recovering+undersized+degraded 

io: 
client: 6.6 MiB/s rd, 2.7 MiB/s wr, 75 op/s rd, 89 op/s wr 
recovery: 2.0 MiB/s, 92 objects/s 

Thanks for any hint, 

-Yenya 

-- 
| Jan "Yenya" Kasprzak http://fi.muni.cz/ | fi.muni.cz ] - work | [ 
http://yenya.net/ | yenya.net ] - private}> | 
| [ http://www.fi.muni.cz/~kas/ | http://www.fi.muni.cz/~kas/ ] GPG: 
4096R/A45477D5 | 
This is the world we live in: the way to deal with computers is to google 
the symptoms, and hope that you don't have to watch a video. --P. Zaitcev 
___ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com