Re: [ceph-users] Were fixed CephFS lock ups when it's running on nodes with OSDs?

2019-04-23 Thread Dan van der Ster
On Mon, 22 Apr 2019, 22:20 Gregory Farnum,  wrote:

> On Sat, Apr 20, 2019 at 9:29 AM Igor Podlesny  wrote:
> >
> > I remember seeing reports in this regard, but it's been a while now.
> > Can anyone tell?
>
> No, this hasn't changed. It's unlikely it ever will; I think NFS
> resolved the issue but it took a lot of ridiculous workarounds and
> imposes a permanent memory cost on the client.
>

On the other hand, we've been running osds and local kernel mounts through
some ior stress testing and managed to lock up only one node, only once
(and that was with a 2TB shared output file).

Maybe the necessary memory pressure conditions get less likely as the
number of clients and osds gets larger? (i.e. it's probably easy to trigger
with one single node/osd because all IO is local, but for large clusters
most IO is remote).

.. Dan


-Greg
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


Re: [ceph-users] Were fixed CephFS lock ups when it's running on nodes with OSDs?

2019-04-23 Thread Patrick Hein
I have been running a Ceph cluster on 5 servers, each with a single OSD and
also acting as a (kernel) client, for nearly half a year now, and haven't
encountered a lockup yet. Total storage is 3.25TB with about 600GB raw
storage used, if that matters.

Dan van der Ster  schrieb am Di., 23. Apr. 2019, 09:33:

> On Mon, 22 Apr 2019, 22:20 Gregory Farnum,  wrote:
>
>> On Sat, Apr 20, 2019 at 9:29 AM Igor Podlesny  wrote:
>> >
>> > I remember seeing reports in this regard, but it's been a while now.
>> > Can anyone tell?
>>
>> No, this hasn't changed. It's unlikely it ever will; I think NFS
>> resolved the issue but it took a lot of ridiculous workarounds and
>> imposes a permanent memory cost on the client.
>>
>
> On the other hand, we've been running osds and local kernel mounts through
> some ior stress testing and managed to lock up only one node, only once
> (and that was with a 2TB shared output file).
>
> Maybe the necessary memory pressure conditions get less likely as the
> number of clients and osds gets larger? (i.e. it's probably easy to trigger
> with one single node/osd because all IO is local, but for large clusters
> most IO is remote).
>
> .. Dan
>
>
> -Greg


Re: [ceph-users] Osd update from 12.2.11 to 12.2.12

2019-04-23 Thread Marc Roos
 

I have only this in the default section; I think it is related to not 
having any configuration for some of these OSDs. I 'forgot' to add the 
[osd.x] sections for the most recently added node. But in any case, nothing 
that I'm aware of that should make them behave differently.

[osd]
osd journal size = 1024
osd pool default size = 3
osd pool default min size = 2
osd pool default pg num = 8
osd pool default pgp num = 8
# osd objectstore = bluestore
# osd max object size = 134217728
# osd max object size = 26843545600
osd scrub min interval = 172800

And these in the custom section

[osd.x]
public addr = 192.168.10.x
cluster addr = 10.0.0.x
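For what it's worth, the way Ceph resolves an option from these sections can be modeled roughly as below. This is a minimal sketch (not Ceph's actual parser), assuming the usual precedence of a daemon-specific section [osd.N] over the daemon-type section [osd] over [global]; the config contents are hypothetical:

```python
import configparser

# Hypothetical ceph.conf mirroring the snippets above.
CEPH_CONF = """
[global]
osd journal size = 512

[osd]
osd journal size = 1024
osd scrub min interval = 172800

[osd.3]
public addr = 192.168.10.3
"""

def lookup(conf, daemon, key):
    """Resolve an option the way Ceph does: the daemon-specific section
    wins over the daemon-type section, which wins over [global]."""
    daemon_type = daemon.split(".")[0]
    for section in (daemon, daemon_type, "global"):
        if conf.has_section(section) and conf.has_option(section, key):
            return conf.get(section, key)
    return None

conf = configparser.ConfigParser()
conf.read_string(CEPH_CONF)

print(lookup(conf, "osd.3", "osd journal size"))  # 1024 (from [osd])
print(lookup(conf, "osd.3", "public addr"))       # 192.168.10.3 (from [osd.3])
print(lookup(conf, "osd.5", "osd journal size"))  # 1024 ([osd.5] absent, falls back)
```

Under this model, an OSD missing its [osd.x] section still picks up everything from [osd] and [global], which is consistent with the point that the missing sections shouldn't change behavior.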





-Original Message-
From: David Turner [mailto:drakonst...@gmail.com] 
Sent: 22 April 2019 22:34
To: Marc Roos
Cc: ceph-users
Subject: Re: [ceph-users] Osd update from 12.2.11 to 12.2.12

Do you perhaps have anything in the ceph.conf files on the servers with 
those OSDs that would attempt to tell the daemon that they are filestore 
osds instead of bluestore?  I'm sure you know that the second part [1] 
of the output in both cases only shows up after an OSD has been 
rebooted.  I'm sure this too could be cleaned up by adding that line to 
the ceph.conf file.

[1] rocksdb_separate_wal_dir = 'false' (not observed, change may require 
restart)

On Sun, Apr 21, 2019 at 8:32 AM  wrote:




Just updated luminous, and setting max_scrubs value back. Why do I 
get 
osd's reporting differently 


I get these:
osd.18: osd_max_scrubs = '1' (not observed, change may require 
restart) 
osd_objectstore = 'bluestore' (not observed, change may require 
restart) 
rocksdb_separate_wal_dir = 'false' (not observed, change may 
require 
restart)
osd.19: osd_max_scrubs = '1' (not observed, change may require 
restart) 
osd_objectstore = 'bluestore' (not observed, change may require 
restart) 
rocksdb_separate_wal_dir = 'false' (not observed, change may 
require 
restart)
osd.20: osd_max_scrubs = '1' (not observed, change may require 
restart) 
osd_objectstore = 'bluestore' (not observed, change may require 
restart) 
rocksdb_separate_wal_dir = 'false' (not observed, change may 
require 
restart)
osd.21: osd_max_scrubs = '1' (not observed, change may require 
restart) 
osd_objectstore = 'bluestore' (not observed, change may require 
restart) 
rocksdb_separate_wal_dir = 'false' (not observed, change may 
require 
restart)
osd.22: osd_max_scrubs = '1' (not observed, change may require 
restart) 
osd_objectstore = 'bluestore' (not observed, change may require 
restart) 
rocksdb_separate_wal_dir = 'false' (not observed, change may 
require 
restart)


And I get osd's reporting like this:
osd.23: osd_max_scrubs = '1' (not observed, change may require 
restart) 
rocksdb_separate_wal_dir = 'false' (not observed, change may 
require 
restart)
osd.24: osd_max_scrubs = '1' (not observed, change may require 
restart) 
rocksdb_separate_wal_dir = 'false' (not observed, change may 
require 
restart)
osd.25: osd_max_scrubs = '1' (not observed, change may require 
restart) 
rocksdb_separate_wal_dir = 'false' (not observed, change may 
require 
restart)
osd.26: osd_max_scrubs = '1' (not observed, change may require 
restart) 
rocksdb_separate_wal_dir = 'false' (not observed, change may 
require 
restart)
osd.27: osd_max_scrubs = '1' (not observed, change may require 
restart) 
rocksdb_separate_wal_dir = 'false' (not observed, change may 
require 
restart)
osd.28: osd_max_scrubs = '1' (not observed, change may require 
restart) 
rocksdb_separate_wal_dir = 'false' (not observed, change may 
require 
restart)









Re: [ceph-users] Ceph inside Docker containers inside VirtualBox

2019-04-23 Thread Marc Roos
 

I am not sure about your background knowledge of Ceph, but if you are just 
starting, maybe first try to get Ceph working in a virtual environment; 
that should not be too much of a problem. Then try migrating it to your 
container. Right now you are probably fighting too many issues at the same 
time.  





-Original Message-
From: Varun Singh [mailto:varun.si...@gslab.com] 
Sent: 22 April 2019 07:46
To: ceph-us...@ceph.com
Subject: Re: [ceph-users] Ceph inside Docker containers inside 
VirtualBox

On Fri, Apr 19, 2019 at 6:53 PM Varun Singh  
wrote:
>
> On Fri, Apr 19, 2019 at 10:44 AM Varun Singh  
wrote:
> >
> > On Thu, Apr 18, 2019 at 9:53 PM Siegfried Höllrigl 
> >  wrote:
> > >
> > > Hi !
> > >
> > > I am not 100% sure, but I think --net=host does not propagate 
> > > /dev/ inside the container.
> > >
> > >  From the Error Message :
> > >
> > > 2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: ERROR- 

> > > The device pointed by OSD_DEVICE (/dev/vdd) doesn't exist !
> > >
> > >
> > > I would say you should add something like --device=/dev/vdd to 
the docker run command for the osd.
> > >
> > > Br
> > >
> > >
> > > Am 18.04.2019 um 14:46 schrieb Varun Singh:
> > > > Hi,
> > > > I am trying to setup Ceph through Docker inside a VM. My host 
> > > > machine is Mac. My VM is an Ubuntu 18.04. Docker version is 
> > > > 18.09.5, build e8ff056.
> > > > I am following the documentation present on ceph/daemon Docker 
> > > > Hub page. The idea is, if I spawn docker containers as mentioned 

> > > > on the page, I should get a ceph setup without KV store. I am 
> > > > not worried about KV store as I just want to try it out. 
> > > > Following are the commands I am firing to bring the containers 
up:
> > > >
> > > > Monitor:
> > > > docker run -d --net=host -v /etc/ceph:/etc/ceph -v 
> > > > /var/lib/ceph/:/var/lib/ceph/ -e MON_IP=10.0.2.15 -e
> > > > CEPH_PUBLIC_NETWORK=10.0.2.0/24 ceph/daemon mon
> > > >
> > > > Manager:
> > > > docker run -d --net=host -v /etc/ceph:/etc/ceph -v 
> > > > /var/lib/ceph/:/var/lib/ceph/ ceph/daemon mgr
> > > >
> > > > OSD:
> > > > docker run -d --net=host --pid=host --privileged=true -v 
> > > > /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -v 
> > > > /dev/:/dev/ -e OSD_DEVICE=/dev/vdd ceph/daemon osd
> > > >
> > > >  From the above commands I am able to spawn monitor and manager 
> > > > properly. I verified this by firing this command on both monitor 

> > > > and manager containers:
> > > > sudo docker exec d1ab985 ceph -s
> > > >
> > > > I get following outputs for both:
> > > >
> > > >cluster:
> > > >  id: 14a6e40a-8e54-4851-a881-661a84b3441c
> > > >  health: HEALTH_OK
> > > >
> > > >services:
> > > >  mon: 1 daemons, quorum serverceph-VirtualBox (age 62m)
> > > >  mgr: serverceph-VirtualBox(active, since 56m)
> > > >  osd: 0 osds: 0 up, 0 in
> > > >
> > > >data:
> > > >  pools:   0 pools, 0 pgs
> > > >  objects: 0 objects, 0 B
> > > >  usage:   0 B used, 0 B / 0 B avail
> > > >  pgs:
> > > >
> > > > However when I try to bring up OSD using above command, it 
> > > > doesn't work. Docker logs show this output:
> > > > 2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: 
static:
> > > > does not generate config
> > > > 2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: 
> > > > ERROR- The device pointed by OSD_DEVICE (/dev/vdd) doesn't exist 
!
> > > >
> > > > I am not sure why the doc asks to pass /dev/vdd to OSD_DEVICE 
env var.
> > > > I know there are five different ways to spawning the OSD, but I 
> > > > am not able to figure out which one would be suitable for a 
> > > > simple deployment. If you could please let me know how to spawn 
> > > > OSDs using Docker, it would help a lot.
> > > >
> > > >
> >
> > Thanks Br, I will try this out today.
> >
> > --
> > Regards,
> > Varun Singh
>
> Hi,
> So following your suggestion I tried following two commands:
> 1. I added the --device=/dev/vdd switch without removing the OSD_DEVICE env 
> var. This resulted in the same error as before: docker run -d --net=host 
> --pid=host --privileged=true --device=/dev/vdd -v /etc/ceph:/etc/ceph 
> -v /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/ -e OSD_DEVICE=/dev/vdd 

> ceph/daemon osd
>
>
> 2. Then I removed OSD_DEVICE env var and just added --device=/dev/vdd 
> switch docker run -d --net=host --pid=host --privileged=true 
> --device=/dev/vdd -v /etc/ceph:/etc/ceph -v 
> /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/  ceph/daemon osd
>
> The OSD_DEVICE related error went away and I think ceph created an OSD 
> successfully. But it wasn't able to connect to the cluster. Is it because 
> I did not give any network-related information? I get the following 
> error now:
>
> 2019-04-18 08:30:47  /opt/ceph-container/bin/entrypoint.sh: static:
> does not generate config
> 2019-04-18 08:30:47  /opt/ceph-container/bin/entrypoint.sh:
> Bootstrapped OSD(s) found; using OSD directory
> 2019-04-18 08:30:47  /opt/ceph-container/bin/en

Re: [ceph-users] Bluestore with so many small files

2019-04-23 Thread Frédéric Nass
Hi, 

You probably forgot to recreate the OSD after changing 
bluestore_min_alloc_size. 

Regards, 
Frédéric. 

- Le 22 Avr 19, à 5:41, 刘 俊  a écrit : 

> Hi All,
> I still see this issue with the latest Ceph Luminous 12.2.11 and 12.2.12.
> I have set bluestore_min_alloc_size = 4096 before the test.
> When I write 10 small objects less than 64KB through rgw, the RAW USED
> shown in "ceph df" looks incorrect.
> For example, I tested three times, cleaning up the rgw data pool each time;
> the object size for the first time was 4KB, for the second 32KB, and for the
> third 64KB.
> The RAW USED shown in "ceph df" was the same each time (18GB), which looks
> like it always equals 64KB*10/1024*3 (replication is 3 here).
> Any thoughts?
> Jamie
> Hi Behnam,

> On 2/12/2018 4:06 PM, Behnam Loghmani wrote:
>> Hi there,
>>
>> I am using ceph Luminous 12.2.2 with: 3 osds (each osd is 100G) - no
>> WAL/DB separation, 3 mons, 1 rgw, cluster size 3.
>> I stored lots of thumbnails with very small size on ceph with radosgw.
>> Actual size of files is something about 32G but it filled 70G of each osd.
>> What's the reason of this high disk usage?
> Most probably the major reason is BlueStore allocation granularity. E.g.
> an object of 1K bytes length needs 64K of disk space if the default
> bluestore_min_alloc_size_hdd (=64K) is applied.
> Additional inconsistency in space reporting might also appear since
> BlueStore adds up DB volume space when accounting total store space,
> while free space is taken from the block device only. As a result, the
> reported "Used" space always contains that total DB space part (i.e.
> Used = Total(Block+DB) - Free(Block)). That correlates to other
> comments in this thread about RocksDB space usage.
> There is a pending PR to fix that:
> https://github.com/ceph/ceph/pull/19454/commits/144fb9663778f833782bdcb16acd707c3ed62a86
> You may look for "Bluestore: inaccurate disk usage statistics problem"
> in this mailing list for previous discussion as well.

>> should I change "bluestore_min_alloc_size_hdd"? And if I change it and set it
>> to a smaller size, does it impact performance?
> Unfortunately I haven't benchmarked "small writes over hdd" cases much,
> hence I don't have exact answers here. Indeed this 'min_alloc_size'
> family of parameters might impact performance quite significantly.
>> what is the best practice for storing small files on bluestore?
>>
>> Best regards,
>> Behnam Loghmani
>>
>> On Mon, Feb 12, 2018 at 5:06 PM, David Turner wrote:
>> > Some of your overhead is the WAL and rocksdb that are on the OSDs.
>> > The WAL is pretty static in size, but rocksdb grows with the amount
>> > of objects you have. You also have copies of the osdmap on each osd.
>> > There's just overhead that adds up. The biggest is going to be
>> > rocksdb with how many objects you have.

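Igor's allocation-granularity point above is easy to check with a little arithmetic. The sketch below (allocation overhead only; RocksDB/WAL overhead comes on top) rounds each object up to bluestore_min_alloc_size — 64 KiB by default on HDD — and multiplies by the replica count:

```python
def raw_used(object_sizes, min_alloc=64 * 1024, replicas=3):
    """Bytes of raw space consumed when each object is rounded up to
    the allocation unit and stored `replicas` times."""
    # -(-a // b) is ceiling division: round each object up to min_alloc.
    allocated = sum(-(-size // min_alloc) * min_alloc for size in object_sizes)
    return allocated * replicas

KiB = 1024
# A 1 KiB thumbnail still costs 64 KiB per replica with the HDD default...
print(raw_used([1 * KiB]))                  # 196608 bytes (3 * 64 KiB)
# ...but only 4 KiB per replica once min_alloc_size is 4096:
print(raw_used([1 * KiB], min_alloc=4096))  # 12288 bytes (3 * 4 KiB)
```

This is also why a cluster full of small thumbnails can show roughly 2x raw usage over the logical data size, as in the 32G-of-data / 70G-per-osd report above — and why the OSDs must be recreated after changing the option, since the allocation unit is fixed at mkfs time.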



[ceph-users] Recovery 13.2.5 Slow

2019-04-23 Thread Andrew Cassera
Hello,

I have a cluster with 6 OSD nodes each with 10 SATA 8TB drives.  Node 6 was
just added.  All nodes are 10Gbps on the network with Jumbo frames.  S3
application access is working as expected but recovery is extremely slow.
Based on past posts I attempted to do the following:

Alter the osd_recovery_sleep_hdd.  I tried 0 and 0.1.  0 seems to improve
the speed slightly but it is still very slow.  I also attempted to change
osd_max_backfills to 16 from 8 and osd_recovery_max_active to 8 from 4.
This showed no noticeable improvement.  The cluster is running 13.2.5.
Here is the output from ceph -s

  cluster:
id: xx
health: HEALTH_ERR
3 large omap objects
67164650/268993641 objects misplaced (24.969%)
Degraded data redundancy: 612258/268993641 objects degraded
(0.228%), 8 pgs degraded, 8 pgs undersized
Degraded data redundancy (low space): 9 pgs backfill_toofull

  services:
mon: 3 daemons, quorum mon1,mon2,mon3
mgr: mon2(active), standbys: mon1
osd: 55 osds: 50 up, 50 in; 531 remapped pgs
rgw: 3 daemons active

  data:
pools:   15 pools, 1476 pgs
objects: 89.66 M objects, 49 TiB
usage:   159 TiB used, 205 TiB / 364 TiB avail
pgs: 612258/268993641 objects degraded (0.228%)
 67164650/268993641 objects misplaced (24.969%)
 945 active+clean
 507 active+remapped+backfill_wait
 9   active+remapped+backfill_wait+backfill_toofull
 7   active+remapped+backfilling
 4   active+undersized+degraded+remapped+backfill_wait
 4   active+undersized+degraded+remapped+backfilling

  io:
client:   5.3 MiB/s rd, 3.9 MiB/s wr, 844 op/s rd, 81 op/s wr
recovery: 19 MiB/s, 33 objects/s

Any clue as to what I could look at further to investigate the slow
recovery would be appreciated.
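As a sanity check on the status output above, the degraded/misplaced percentages in ceph -s are just object-replica counts over the total (here roughly 89.66M objects x 3 replicas ≈ 268,993,641 instances); a quick recomputation from the quoted figures:

```python
# Figures taken from the `ceph -s` output above.
total = 268_993_641      # total object replicas
degraded = 612_258
misplaced = 67_164_650

print(f"degraded:  {100 * degraded / total:.3f}%")   # 0.228%
print(f"misplaced: {100 * misplaced / total:.3f}%")  # 24.969%
```

The ~25% misplaced figure is consistent with having just added a sixth node's worth of capacity, so most of the wait here is backfill of remapped PGs rather than degraded-object recovery.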


Re: [ceph-users] Default Pools

2019-04-23 Thread David Turner
You should be able to see all pools in use in a RGW zone from the
radosgw-admin command. This [1] is probably overkill for most, but I deal
with multi-realm clusters so I generally think like this when dealing with
RGW.  Running this as is will create a file in your current directory for
each zone in your deployment (likely to be just one file).  My rough guess
for what you would find in that file based on your pool names would be this
[2].

If you identify any pools not listed from the zone get command, then you
can rename [3] the pool to see if it is being created and/or used by rgw
currently.  The process here would be to stop all RGW daemons, rename the
pools, start a RGW daemon, stop it again, and see which pools were
recreated.  Clean up the pools that were freshly made and rename the
original pools back into place before starting your RGW daemons again.
Please note that .rgw.root is a required pool in every RGW deployment and
will not be listed in the zones themselves.


[1]
for realm in $(radosgw-admin realm list --format=json | jq '.realms[]' -r);
do
  for zonegroup in $(radosgw-admin --rgw-realm=$realm zonegroup list
--format=json | jq '.zonegroups[]' -r); do
for zone in $(radosgw-admin --rgw-realm=$realm
--rgw-zonegroup=$zonegroup zone list --format=json | jq '.zones[]' -r); do
  echo $realm.$zonegroup.$zone.json
  radosgw-admin --rgw-realm=$realm --rgw-zonegroup=$zonegroup
--rgw-zone=$zone zone get > $realm.$zonegroup.$zone.json
done
  done
done

[2] default.default.default.json
{
"id": "{{ UUID }}",
"name": "default",
"domain_root": "default.rgw.meta",
"control_pool": "default.rgw.control",
"gc_pool": ".rgw.gc",
"log_pool": "default.rgw.log",
"user_email_pool": ".users.email",
"user_uid_pool": ".users.uid",
"system_key": {
},
"placement_pools": [
{
"key": "default-placement",
"val": {
"index_pool": "default.rgw.buckets.index",
"data_pool": "default.rgw.buckets.data",
"data_extra_pool": "default.rgw.buckets.non-ec",
"index_type": 0,
"compression": ""
}
}
],
"metadata_heap": "",
"tier_config": [],
"realm_id": "{{ UUID }}"
}

[3] ceph osd pool rename <current-pool-name> <new-pool-name>

On Thu, Apr 18, 2019 at 10:46 AM Brent Kennedy  wrote:

> Yea, that was a cluster created during firefly...
>
> Wish there was a good article on the naming and use of these, or perhaps a
> way I could make sure they are not used before deleting them.  I know RGW
> will recreate anything it uses, but I don’t want to lose data because I
> wanted a clean system.
>
> -Brent
>
> -Original Message-
> From: Gregory Farnum 
> Sent: Monday, April 15, 2019 5:37 PM
> To: Brent Kennedy 
> Cc: Ceph Users 
> Subject: Re: [ceph-users] Default Pools
>
> On Mon, Apr 15, 2019 at 1:52 PM Brent Kennedy  wrote:
> >
> > I was looking around the web for the reason for some of the default
> pools in Ceph and I can't find anything concrete.  Here is our list; some
> show no use at all.  Can any of these be deleted (or is there an article
> my google-fu failed to find that covers the default pools)?
> >
> > We only use buckets, so I took out .rgw.buckets, .users and
> > .rgw.buckets.index…
> >
> > Name
> > .log
> > .rgw.root
> > .rgw.gc
> > .rgw.control
> > .rgw
> > .users.uid
> > .users.email
> > .rgw.buckets.extra
> > default.rgw.control
> > default.rgw.meta
> > default.rgw.log
> > default.rgw.buckets.non-ec
>
> All of these are created by RGW when you run it, not by the core Ceph
> system. I think they're all used (although they may report sizes of 0, as
> they mostly make use of omap).
>
> > metadata
>
> Except this one used to be created-by-default for CephFS metadata, but
> that hasn't been true in many releases. So I guess you're looking at an old
> cluster? (In which case it's *possible* some of those RGW pools are also
> unused now but were needed in the past; I haven't kept good track of them.)
> -Greg
>


Re: [ceph-users] showing active config settings

2019-04-23 Thread solarflow99
Thanks, but does this not work on Luminous maybe?  I am on the mon hosts
trying this:


# ceph config set osd osd_recovery_max_active 4
Invalid command: unused arguments: [u'4']
config set <who> <name> <value> :  Set a configuration option at runtime (not
persistent)
Error EINVAL: invalid command

# ceph daemon osd.0 config diff|grep -A5 osd_recovery_max_active
admin_socket: exception getting command descriptions: [Errno 2] No such
file or directory
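For what it's worth, the centralized `ceph config set` store only arrived in Mimic, which would explain the "Invalid command" on Luminous. The per-daemon `config diff` output Brad shows layers values roughly as below — a simplified model, assuming the default → mon → override precedence his output suggests (the "file" layer is an assumption about other possible sources):

```python
def final_value(layers):
    """Effective value of a `config diff`-style entry: the highest-precedence
    layer that is present wins, falling back to the compiled-in default."""
    value = layers["default"]
    for source in ("file", "mon", "override"):  # increasing precedence (assumed)
        if source in layers:
            value = layers[source]
    return value

# Entry shaped like Brad's `ceph daemon osd.0 config diff` output:
print(final_value({"default": 3, "mon": 4, "override": 4}))  # 4
# With no runtime override, the compiled-in default is what's in effect:
print(final_value({"default": 3}))                           # 3
```

The "No such file or directory" error from `ceph daemon osd.0 ...` is separate: that command talks to the local admin socket, so it only works when run on the host where osd.0 actually lives.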


On Tue, Apr 16, 2019 at 4:04 PM Brad Hubbard  wrote:

> $ ceph config set osd osd_recovery_max_active 4
> $ ceph daemon osd.0 config diff|grep -A5 osd_recovery_max_active
> "osd_recovery_max_active": {
> "default": 3,
> "mon": 4,
> "override": 4,
> "final": 4
> },
>
> On Wed, Apr 17, 2019 at 5:29 AM solarflow99  wrote:
> >
> > I wish there was a way to query the running settings from one of the MGR
> hosts, and it doesn't help that ansible doesn't even copy the keyring to
> the OSD nodes so commands there wouldn't work anyway.
> > I'm still puzzled why it doesn't show any change when I run this no
> matter what I set it to:
> >
> > # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> > osd_recovery_max_active = 3
> >
> > in fact it doesn't matter if I use an OSD number that doesn't exist,
> same thing if I use ceph get
> >
> >
> >
> > On Tue, Apr 16, 2019 at 1:18 AM Brad Hubbard 
> wrote:
> >>
> >> On Tue, Apr 16, 2019 at 6:03 PM Paul Emmerich 
> wrote:
> >> >
> >> > This works, it just says that it *might* require a restart, but this
> >> > particular option takes effect without a restart.
> >>
> >> We've already looked at changing the wording once to make it more
> palatable.
> >>
> >> http://tracker.ceph.com/issues/18424
> >>
> >> >
> >> > Implementation detail: this message shows up if there's no internal
> >> > function to be called when this option changes, so it can't be sure if
> >> > the change is actually doing anything because the option might be
> >> > cached or only read on startup. But in this case this option is read
> >> > in the relevant path every time and no notification is required. But
> >> > the injectargs command can't know that.
> >>
> >> Right on all counts. The functions are referred to as observers and
> >> register to be notified if the value changes, hence "not observed."
> >>
> >> >
> >> > Paul
> >> >
> >> > On Mon, Apr 15, 2019 at 11:38 PM solarflow99 
> wrote:
> >> > >
> >> > > Then why doesn't this work?
> >> > >
> >> > > # ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
> >> > > osd.0: osd_recovery_max_active = '4' (not observed, change may
> require restart)
> >> > > osd.1: osd_recovery_max_active = '4' (not observed, change may
> require restart)
> >> > > osd.2: osd_recovery_max_active = '4' (not observed, change may
> require restart)
> >> > > osd.3: osd_recovery_max_active = '4' (not observed, change may
> require restart)
> >> > > osd.4: osd_recovery_max_active = '4' (not observed, change may
> require restart)
> >> > >
> >> > > # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> >> > > osd_recovery_max_active = 3
> >> > >
> >> > >
> >> > >
> >> > > On Wed, Apr 10, 2019 at 7:21 AM Eugen Block  wrote:
> >> > >>
> >> > >> > I always end up using "ceph --admin-daemon
> >> > >> > /var/run/ceph/name-of-socket-here.asok config show | grep ..."
> to get what
> >> > >> > is in effect now for a certain daemon.
> >> > >> > Needs you to be on the host of the daemon of course.
> >> > >>
> >> > >> Me too, I just wanted to try what OP reported. And after trying
> that,
> >> > >> I'll keep it that way. ;-)
> >> > >>
> >> > >>
> >> > >> Zitat von Janne Johansson :
> >> > >>
> >> > >> > Den ons 10 apr. 2019 kl 13:37 skrev Eugen Block :
> >> > >> >
> >> > >> >> > If you don't specify which daemon to talk to, it tells you
> what the
> >> > >> >> > defaults would be for a random daemon started just now using
> the same
> >> > >> >> > config as you have in /etc/ceph/ceph.conf.
> >> > >> >>
> >> > >> >> I tried that, too, but the result is not correct:
> >> > >> >>
> >> > >> >> host1:~ # ceph -n osd.1 --show-config | grep
> osd_recovery_max_active
> >> > >> >> osd_recovery_max_active = 3
> >> > >> >>
> >> > >> >
> >> > >> > I always end up using "ceph --admin-daemon
> >> > >> > /var/run/ceph/name-of-socket-here.asok config show | grep ..."
> to get what
> >> > >> > is in effect now for a certain daemon.
> >> > >> > Needs you to be on the host of the daemon of course.
> >> > >> >
> >> > >> > --
> >> > >> > May the most significant bit of your life be positive.
> >> > >>
> >> > >>
> >> > >>

Re: [ceph-users] ceph-iscsi: problem when discovery auth is disabled, but gateway receives auth requests

2019-04-23 Thread Mike Christie
On 04/18/2019 06:24 AM, Matthias Leopold wrote:
> Hi,
> 
> the Ceph iSCSI gateway has a problem when receiving discovery auth
> requests when discovery auth is not enabled. Target discovery fails in
> this case (see below). This is especially annoying with oVirt (KVM
> management platform) where you can't separate the two authentication
> phases. This leads to a situation where you are forced to use discovery
> auth and have the same credentials for target auth (for oVirt target).
> These credentials (for discovery auth) would then have to be shared for
> other targets on the same gateway, this is not acceptable. I saw that
> other iSCSI vendors (FreeNAS) don't have this problem. I don't know if
> this is Ceph gateway specific or a general LIO target problem. In any
> case I would be very happy if this could be resolved. I think that
> smooth integration of Ceph iSCSI gateway and oVirt should be of broader
> interest. Please correct me if I got anything wrong.
> 
> kernel messages when discovery_auth is disabled, but auth requests are
> received
> 
> Apr 18 13:05:01 ceiscsi0 kernel: CHAP user or password not set for
> Initiator ACL
> Apr 18 13:05:01 ceiscsi0 kernel: Security negotiation failed.
> Apr 18 13:05:01 ceiscsi0 kernel: iSCSI Login negotiation failed.
> 

Incase other people hit this, it can be tracked here:

https://github.com/ceph/ceph-iscsi/issues/68

It's a kernel bug: when discovery_auth is disabled, the kernel still
requires CHAP.


[ceph-users] How to minimize IO starvations while Bluestore try to delete WAL files

2019-04-23 Thread I Gede Iswara Darmawan
Hello,

Recently I have had an issue where, when BlueStore tries to delete WAL files
(as indicated by the OSD log), the IO utilization of the disk (spinning HDD)
reaches 100% and introduces slow requests to the cluster.

Is there any way to throttle this operation down or completely disable it?

Thanks

Regards,
I Gede Iswara Darmawan
Information System - School of Industrial and System Engineering
Telkom University
P / SMS / WA : 081 322 070719
E : iswaradr...@gmail.com / iswaradr...@live.com


Re: [ceph-users] Ceph inside Docker containers inside VirtualBox

2019-04-23 Thread Varun Singh
On Tue, Apr 23, 2019 at 2:58 PM Marc Roos  wrote:
>
>
>
> I am not sure about your background knowledge of Ceph, but if you are just
> starting, maybe first try to get Ceph working in a virtual environment;
> that should not be too much of a problem. Then try migrating it to your
> container. Right now you are probably fighting too many issues at the same
> time.
>
>
>
>
>
> -Original Message-
> From: Varun Singh [mailto:varun.si...@gslab.com]
> Sent: 22 April 2019 07:46
> To: ceph-us...@ceph.com
> Subject: Re: [ceph-users] Ceph inside Docker containers inside
> VirtualBox
>
> On Fri, Apr 19, 2019 at 6:53 PM Varun Singh 
> wrote:
> >
> > On Fri, Apr 19, 2019 at 10:44 AM Varun Singh 
> wrote:
> > >
> > > On Thu, Apr 18, 2019 at 9:53 PM Siegfried Höllrigl
> > >  wrote:
> > > >
> > > > Hi !
> > > >
> > > > I am not 100% sure, but I think --net=host does not propagate
> > > > /dev/ inside the container.
> > > >
> > > >  From the Error Message :
> > > >
> > > > 2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: ERROR-
>
> > > > The device pointed by OSD_DEVICE (/dev/vdd) doesn't exist !
> > > >
> > > >
> > > > I would say you should add something like --device=/dev/vdd to
> the docker run command for the osd.
> > > >
> > > > Br
> > > >
> > > >
> > > > Am 18.04.2019 um 14:46 schrieb Varun Singh:
> > > > > Hi,
> > > > > I am trying to setup Ceph through Docker inside a VM. My host
> > > > > machine is Mac. My VM is an Ubuntu 18.04. Docker version is
> > > > > 18.09.5, build e8ff056.
> > > > > I am following the documentation present on ceph/daemon Docker
> > > > > Hub page. The idea is, if I spawn docker containers as mentioned
>
> > > > > on the page, I should get a ceph setup without KV store. I am
> > > > > not worried about KV store as I just want to try it out.
> > > > > Following are the commands I am firing to bring the containers
> up:
> > > > >
> > > > > Monitor:
> > > > > docker run -d --net=host -v /etc/ceph:/etc/ceph -v
> > > > > /var/lib/ceph/:/var/lib/ceph/ -e MON_IP=10.0.2.15 -e
> > > > > CEPH_PUBLIC_NETWORK=10.0.2.0/24 ceph/daemon mon
> > > > >
> > > > > Manager:
> > > > > docker run -d --net=host -v /etc/ceph:/etc/ceph -v
> > > > > /var/lib/ceph/:/var/lib/ceph/ ceph/daemon mgr
> > > > >
> > > > > OSD:
> > > > > docker run -d --net=host --pid=host --privileged=true -v
> > > > > /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -v
> > > > > /dev/:/dev/ -e OSD_DEVICE=/dev/vdd ceph/daemon osd
> > > > >
> > > > >  From the above commands I am able to spawn monitor and manager
> > > > > properly. I verified this by firing this command on both monitor
>
> > > > > and manager containers:
> > > > > sudo docker exec d1ab985 ceph -s
> > > > >
> > > > > I get following outputs for both:
> > > > >
> > > > >cluster:
> > > > >  id: 14a6e40a-8e54-4851-a881-661a84b3441c
> > > > >  health: HEALTH_OK
> > > > >
> > > > >services:
> > > > >  mon: 1 daemons, quorum serverceph-VirtualBox (age 62m)
> > > > >  mgr: serverceph-VirtualBox(active, since 56m)
> > > > >  osd: 0 osds: 0 up, 0 in
> > > > >
> > > > >data:
> > > > >  pools:   0 pools, 0 pgs
> > > > >  objects: 0 objects, 0 B
> > > > >  usage:   0 B used, 0 B / 0 B avail
> > > > >  pgs:
> > > > >
> > > > > However when I try to bring up OSD using above command, it
> > > > > doesn't work. Docker logs show this output:
> > > > > 2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh:
> static:
> > > > > does not generate config
> > > > > 2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh:
> > > > > ERROR- The device pointed by OSD_DEVICE (/dev/vdd) doesn't exist
> !
> > > > >
> > > > > I am not sure why the doc asks to pass /dev/vdd to OSD_DEVICE
> env var.
> > > > > I know there are five different ways to spawning the OSD, but I
> > > > > am not able to figure out which one would be suitable for a
> > > > > simple deployment. If you could please let me know how to spawn
> > > > > OSDs using Docker, it would help a lot.
> > > > >
> > > > >
> > >
> > > Thanks Br, I will try this out today.
> > >
> > > --
> > > Regards,
> > > Varun Singh
> >
> > Hi,
> > So following your suggestion I tried following two commands:
> > 1. I added the --device=/dev/vdd switch without removing the OSD_DEVICE env
> > var. This resulted in the same error as before: docker run -d --net=host
> > --pid=host --privileged=true --device=/dev/vdd -v /etc/ceph:/etc/ceph
> > -v /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/ -e OSD_DEVICE=/dev/vdd
>
> > ceph/daemon osd
> >
> >
> > 2. Then I removed OSD_DEVICE env var and just added --device=/dev/vdd
> > switch docker run -d --net=host --pid=host --privileged=true
> > --device=/dev/vdd -v /etc/ceph:/etc/ceph -v
> > /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/  ceph/daemon osd
> >
> > The OSD_DEVICE related error went away and I think ceph created an OSD
> > successfully. But it wasn't able to connect to the cluster. Is it because
> > I did not give any network-related information? I get the following
> 

[ceph-users] getting pg inconsistent periodly

2019-04-23 Thread Zhenshi Zhou
Hi,

I've been running a cluster for some time now, and recently it has
frequently run into an unhealthy state.

With 'ceph health detail', one or two PGs are inconsistent. Moreover,
the PGs in the wrong state are not placed on the same disk each day,
so I don't think it's a disk problem.

The cluster is using version 12.2.5. Any idea about this strange issue?

Thanks