Re: [ceph-users] Were fixed CephFS lock ups when it's running on nodes with OSDs?
On Mon, 22 Apr 2019, 22:20 Gregory Farnum wrote:
> On Sat, Apr 20, 2019 at 9:29 AM Igor Podlesny wrote:
> >
> > I remember seeing reports in this regard, but it's been a while now.
> > Can anyone tell?
>
> No, this hasn't changed. It's unlikely it ever will; I think NFS
> resolved the issue, but it took a lot of ridiculous workarounds and
> imposes a permanent memory cost on the client.
>
> -Greg

On the other hand, we've been running OSDs and local kernel mounts
through some ior stress testing and managed to lock up only one node,
only once (and that was with a 2 TB shared output file).

Maybe the necessary memory pressure conditions get less likely as the
number of clients and OSDs gets larger? (I.e. it's probably easy to
trigger with a single node/OSD because all IO is local, but for large
clusters most IO is remote.)

.. Dan
Re: [ceph-users] Were fixed CephFS lock ups when it's running on nodes with OSDs?
I'm running a Ceph cluster on 5 servers, each with a single OSD and also
acting as a (kernel) client, for nearly half a year now, and haven't
encountered a lockup yet. Total storage is 3.25 TB with about 600 GB raw
storage used, if that matters.

Dan van der Ster wrote on Tue, 23 Apr 2019, 09:33:
> On Mon, 22 Apr 2019, 22:20 Gregory Farnum wrote:
>> On Sat, Apr 20, 2019 at 9:29 AM Igor Podlesny wrote:
>> >
>> > I remember seeing reports in this regard, but it's been a while now.
>> > Can anyone tell?
>>
>> No, this hasn't changed. It's unlikely it ever will; I think NFS
>> resolved the issue, but it took a lot of ridiculous workarounds and
>> imposes a permanent memory cost on the client.
>>
>> -Greg
>
> On the other hand, we've been running OSDs and local kernel mounts
> through some ior stress testing and managed to lock up only one node,
> only once (and that was with a 2 TB shared output file).
>
> Maybe the necessary memory pressure conditions get less likely as the
> number of clients and OSDs gets larger? (I.e. it's probably easy to
> trigger with a single node/OSD because all IO is local, but for large
> clusters most IO is remote.)
>
> .. Dan
Re: [ceph-users] Osd update from 12.2.11 to 12.2.12
I have only this in the default section. I think it is related to not
having any configuration for some of these OSDs -- I 'forgot' to add the
[osd.x] sections for the most recently added node. But in any case,
nothing that I know of should make them behave differently.

[osd]
osd journal size = 1024
osd pool default size = 3
osd pool default min size = 2
osd pool default pg num = 8
osd pool default pgp num = 8
# osd objectstore = bluestore
# osd max object size = 134217728
# osd max object size = 26843545600
osd scrub min interval = 172800

And these in the custom sections:

[osd.x]
public addr = 192.168.10.x
cluster addr = 10.0.0.x

-----Original Message-----
From: David Turner [mailto:drakonst...@gmail.com]
Sent: 22 April 2019 22:34
To: Marc Roos
Cc: ceph-users
Subject: Re: [ceph-users] Osd update from 12.2.11 to 12.2.12

Do you perhaps have anything in the ceph.conf files on the servers with
those OSDs that would attempt to tell the daemon that they are filestore
OSDs instead of bluestore? I'm sure you know that the second part [1] of
the output in both cases only shows up after an OSD has been rebooted.
I'm sure this too could be cleaned up by adding that line to the
ceph.conf file.

[1] rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)

On Sun, Apr 21, 2019 at 8:32 AM wrote:

    Just updated luminous, and am setting the max_scrubs value back. Why
    do I get OSDs reporting differently?

    I get these:

    osd.18: osd_max_scrubs = '1' (not observed, change may require restart)
    osd_objectstore = 'bluestore' (not observed, change may require restart)
    rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
    osd.19: osd_max_scrubs = '1' (not observed, change may require restart)
    osd_objectstore = 'bluestore' (not observed, change may require restart)
    rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
    osd.20: osd_max_scrubs = '1' (not observed, change may require restart)
    osd_objectstore = 'bluestore' (not observed, change may require restart)
    rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
    osd.21: osd_max_scrubs = '1' (not observed, change may require restart)
    osd_objectstore = 'bluestore' (not observed, change may require restart)
    rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
    osd.22: osd_max_scrubs = '1' (not observed, change may require restart)
    osd_objectstore = 'bluestore' (not observed, change may require restart)
    rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)

    And I get OSDs reporting like this:

    osd.23: osd_max_scrubs = '1' (not observed, change may require restart)
    rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
    osd.24: osd_max_scrubs = '1' (not observed, change may require restart)
    rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
    osd.25: osd_max_scrubs = '1' (not observed, change may require restart)
    rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
    osd.26: osd_max_scrubs = '1' (not observed, change may require restart)
    rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
    osd.27: osd_max_scrubs = '1' (not observed, change may require restart)
    rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
    osd.28: osd_max_scrubs = '1' (not observed, change may require restart)
    rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)
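For what it's worth, you can check what each OSD daemon is actually
running with, independently of what ceph.conf says, by asking the cluster
or the daemon itself. A short sketch -- osd.23 is just one of the IDs from
the list above, and the "ceph daemon" commands have to be run on the host
where that OSD lives:

# From any node with an admin keyring: what does the daemon report about itself?
ceph osd metadata 23 | grep -E 'osd_objectstore|hostname'

# On the host running osd.23: the configuration actually loaded by the daemon
ceph daemon osd.23 config get osd_objectstore
ceph daemon osd.23 config show | grep -E 'osd_max_scrubs|osd_objectstore'

If the metadata already says "bluestore" for all OSDs, the difference in
the injectargs output is only about which settings exist in each daemon's
config, not about the objectstore itself.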
Re: [ceph-users] Ceph inside Docker containers inside VirtualBox
I am not sure about your background knowledge of Ceph, but if you are just
starting, maybe first try to get Ceph working in a plain virtual
environment; that should not be too much of a problem. Then try migrating
it to your containers. Right now you are probably fighting too many issues
at the same time.

-----Original Message-----
From: Varun Singh [mailto:varun.si...@gslab.com]
Sent: 22 April 2019 07:46
To: ceph-us...@ceph.com
Subject: Re: [ceph-users] Ceph inside Docker containers inside VirtualBox

On Fri, Apr 19, 2019 at 6:53 PM Varun Singh wrote:
>
> On Fri, Apr 19, 2019 at 10:44 AM Varun Singh wrote:
> >
> > On Thu, Apr 18, 2019 at 9:53 PM Siegfried Höllrigl wrote:
> > >
> > > Hi!
> > >
> > > I am not 100% sure, but I think --net=host does not propagate
> > > /dev/ inside the container.
> > >
> > > From the error message:
> > >
> > > 2019-04-18 07:30:06 /opt/ceph-container/bin/entrypoint.sh: ERROR-
> > > The device pointed by OSD_DEVICE (/dev/vdd) doesn't exist !
> > >
> > > I would say you should add something like --device=/dev/vdd to the
> > > docker run command for the osd.
> > >
> > > Br
> > >
> > > On 18.04.2019 at 14:46, Varun Singh wrote:
> > > > Hi,
> > > > I am trying to set up Ceph through Docker inside a VM. My host
> > > > machine is a Mac. My VM is Ubuntu 18.04. Docker version is
> > > > 18.09.5, build e8ff056.
> > > > I am following the documentation present on the ceph/daemon Docker
> > > > Hub page. The idea is that if I spawn Docker containers as described
> > > > on the page, I should get a Ceph setup without a KV store. I am
> > > > not worried about the KV store as I just want to try it out.
> > > > These are the commands I am running to bring the containers up:
> > > >
> > > > Monitor:
> > > > docker run -d --net=host -v /etc/ceph:/etc/ceph -v
> > > > /var/lib/ceph/:/var/lib/ceph/ -e MON_IP=10.0.2.15 -e
> > > > CEPH_PUBLIC_NETWORK=10.0.2.0/24 ceph/daemon mon
> > > >
> > > > Manager:
> > > > docker run -d --net=host -v /etc/ceph:/etc/ceph -v
> > > > /var/lib/ceph/:/var/lib/ceph/ ceph/daemon mgr
> > > >
> > > > OSD:
> > > > docker run -d --net=host --pid=host --privileged=true -v
> > > > /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -v
> > > > /dev/:/dev/ -e OSD_DEVICE=/dev/vdd ceph/daemon osd
> > > >
> > > > With the above commands I am able to spawn the monitor and manager
> > > > properly. I verified this by running the following command on both
> > > > the monitor and manager containers:
> > > > sudo docker exec d1ab985 ceph -s
> > > >
> > > > I get the following output for both:
> > > >
> > > >   cluster:
> > > >     id:     14a6e40a-8e54-4851-a881-661a84b3441c
> > > >     health: HEALTH_OK
> > > >
> > > >   services:
> > > >     mon: 1 daemons, quorum serverceph-VirtualBox (age 62m)
> > > >     mgr: serverceph-VirtualBox(active, since 56m)
> > > >     osd: 0 osds: 0 up, 0 in
> > > >
> > > >   data:
> > > >     pools:   0 pools, 0 pgs
> > > >     objects: 0 objects, 0 B
> > > >     usage:   0 B used, 0 B / 0 B avail
> > > >     pgs:
> > > >
> > > > However, when I try to bring up the OSD using the above command, it
> > > > doesn't work. Docker logs show this output:
> > > > 2019-04-18 07:30:06 /opt/ceph-container/bin/entrypoint.sh: static:
> > > > does not generate config
> > > > 2019-04-18 07:30:06 /opt/ceph-container/bin/entrypoint.sh:
> > > > ERROR- The device pointed by OSD_DEVICE (/dev/vdd) doesn't exist !
> > > >
> > > > I am not sure why the doc asks to pass /dev/vdd to the OSD_DEVICE
> > > > env var. I know there are five different ways of spawning the OSD,
> > > > but I am not able to figure out which one would be suitable for a
> > > > simple deployment. If you could please let me know how to spawn
> > > > OSDs using Docker, it would help a lot.
> > > >
> > > > Thanks
> >
> > Br, I will try this out today.
> > --
> > Regards,
> > Varun Singh
>
> Hi,
> So following your suggestion I tried the following two commands:
>
> 1. I added the --device=/dev/vdd switch without removing the OSD_DEVICE
> env var. This resulted in the same error as before:
> docker run -d --net=host --pid=host --privileged=true --device=/dev/vdd
> -v /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/
> -e OSD_DEVICE=/dev/vdd ceph/daemon osd
>
> 2. Then I removed the OSD_DEVICE env var and just added the
> --device=/dev/vdd switch:
> docker run -d --net=host --pid=host --privileged=true --device=/dev/vdd
> -v /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/
> ceph/daemon osd
>
> The OSD_DEVICE related error went away and I think Ceph created an OSD
> successfully, but it wasn't able to connect to the cluster. Is it
> because I did not give any network related information? I get the
> following error now:
>
> 2019-04-18 08:30:47 /opt/ceph-container/bin/entrypoint.sh: static:
> does not generate config
> 2019-04-18 08:30:47 /opt/ceph-container/bin/entrypoint.sh:
> Bootstrapped OSD(s) found; using OSD directory
> 2019-04-18 08:30:47 /opt/ceph-container/bin/en
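For readers skimming this thread: the variant that got past the
OSD_DEVICE error was passing the block device into the container with
--device. A rough sketch of that invocation and a quick follow-up check,
using the same /dev/vdd device and ceph/daemon image as above (the
container ID is a placeholder; this is the command reported in the
thread, not a verified recipe):

# OSD container, with the device passed in explicitly
docker run -d --net=host --pid=host --privileged=true \
  --device=/dev/vdd \
  -v /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/ \
  ceph/daemon osd

# Then check from the monitor container whether the OSD registered and came up
sudo docker exec <mon-container-id> ceph osd tree
sudo docker exec <mon-container-id> ceph -s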
Re: [ceph-users] Bluestore with so many small files
Hi,

You probably forgot to recreate the OSDs after changing
bluestore_min_alloc_size.

Regards,
Frédéric.

On 22 Apr 19, at 5:41, 刘 俊 wrote:

> Hi All,
>
> I still see this issue with the latest Ceph Luminous 12.2.11 and 12.2.12.
> I had set bluestore_min_alloc_size = 4096 before the test.
>
> When I write 10 small objects of less than 64KB through rgw, the RAW USED
> shown in "ceph df" looks incorrect.
>
> For example, I tested three times and cleaned up the rgw data pool each
> time; the object size for the first run was 4KB, for the second 32KB, and
> for the third 64KB. The RAW USED shown in "ceph df" is the same each time
> (18GB) -- it looks like it is always equal to 64KB*10/1024*3 (the replica
> count is 3 here).
>
> Any thoughts?
>
> Jamie
>
> Hi Behnam,
>
> On 2/12/2018 4:06 PM, Behnam Loghmani wrote:
>> Hi there,
>>
>> I am using ceph Luminous 12.2.2 with:
>>
>> 3 osds (each osd is 100G) - no WAL/DB separation.
>> 3 mons
>> 1 rgw
>> cluster size 3
>>
>> I stored lots of thumbnails with very small size on ceph with radosgw.
>>
>> Actual size of files is something about 32G, but it filled 70G of each
>> osd.
>>
>> What's the reason for this high disk usage?
>
> Most probably the major reason is BlueStore allocation granularity, e.g.
> an object of 1K bytes length needs 64K of disk space if the default
> bluestore_min_alloc_size_hdd (=64K) is applied.
>
> Additional inconsistency in space reporting might also appear since
> BlueStore adds up DB volume space when accounting total store space,
> while free space is taken from the block device only. As a result, the
> reported "Used" space always contains that total DB space part (i.e.
> Used = Total(Block+DB) - Free(Block)). That correlates with other
> comments in this thread about RocksDB space usage.
>
> There is a pending PR to fix that:
> https://github.com/ceph/ceph/pull/19454/commits/144fb9663778f833782bdcb16acd707c3ed62a86
>
> You may look for "Bluestore: inaccurate disk usage statistics problem"
> in this mailing list for previous discussion as well.
>
>> Should I change "bluestore_min_alloc_size_hdd"? And if I change it and
>> set it to a smaller size, does it impact performance?
>
> Unfortunately I haven't benchmarked "small writes over hdd" cases much,
> hence I don't have exact answers here. Indeed this 'min_alloc_size'
> family of parameters might impact performance quite significantly.
>
>> What is the best practice for storing small files on bluestore?
>>
>> Best regards,
>> Behnam Loghmani
>>
>> On Mon, Feb 12, 2018 at 5:06 PM, David Turner <drakonstein at gmail.com> wrote:
>>> Some of your overhead is the WAL and rocksdb that are on the OSDs.
>>> The WAL is pretty static in size, but rocksdb grows with the amount
>>> of objects you have. You also have copies of the osdmap on each osd.
>>> There's just overhead that adds up. The biggest is going to be
>>> rocksdb with how many objects you have.
>>>
>>> On Mon, Feb 12, 2018, 8:06 AM Behnam Loghmani <behnam.loghmani at gmail.com> wrote:
>>>> Hi there,
>>>>
>>>> I am using ceph Luminous 12.2.2 with:
>>>>
>>>> 3 osds (each osd is 100G) - no WAL/DB separation.
>>>> 3 mons
>>>> 1 rgw
>>>> cluster size 3
>>>>
>>>> I stored lots of thumbnails with very small size on ceph with radosgw.
>>>>
>>>> Actual size of files is something about 32G, but it filled 70G of
>>>> each osd.
>>>>
>>>> What's the reason for this high disk usage?
>>>> Should I change "bluestore_min_alloc_size_hdd"? And if I change it
>>>> and set it to a smaller size, does it impact performance?
>>>>
>>>> What is the best practice for storing small files on bluestore?
>>>>
>>>> Best regards,
>>>> Behnam Loghmani
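To make Frédéric's point concrete: bluestore_min_alloc_size(_hdd/_ssd) is
applied when an OSD's BlueStore is created (at mkfs time), so changing it
in ceph.conf does nothing for existing OSDs. A rough sketch of redeploying
a single OSD so the new value takes effect -- osd.5 and /dev/sdb are only
placeholders, and you would let the cluster recover between OSDs:

# In ceph.conf on the OSD host, before recreating the OSD:
# [osd]
# bluestore_min_alloc_size_hdd = 4096

# Drain and remove the existing OSD
ceph osd out 5
# ...wait for rebalancing to finish, then:
systemctl stop ceph-osd@5
ceph osd purge 5 --yes-i-really-mean-it
ceph-volume lvm zap --destroy /dev/sdb

# Recreate it; the new min_alloc_size is baked in at mkfs time
ceph-volume lvm create --data /dev/sdb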
[ceph-users] Recovery 13.2.5 Slow
Hello,

I have a cluster with 6 OSD nodes, each with 10 SATA 8TB drives. Node 6
was just added. All nodes are on a 10Gbps network with jumbo frames. S3
application access is working as expected, but recovery is extremely slow.

Based on past posts I attempted the following:

Alter osd_recovery_sleep_hdd. I tried 0 and 0.1; 0 seems to improve the
speed slightly, but it is still very slow. I also attempted to change
osd_max_backfills from 8 to 16 and osd_recovery_max_active from 4 to 8.
This showed no noticeable improvement.

The cluster is running 13.2.5. Here is the output from ceph -s:

  cluster:
    id:     xx
    health: HEALTH_ERR
            3 large omap objects
            67164650/268993641 objects misplaced (24.969%)
            Degraded data redundancy: 612258/268993641 objects degraded
            (0.228%), 8 pgs degraded, 8 pgs undersized
            Degraded data redundancy (low space): 9 pgs backfill_toofull

  services:
    mon: 3 daemons, quorum mon1,mon2,mon3
    mgr: mon2(active), standbys: mon1
    osd: 55 osds: 50 up, 50 in; 531 remapped pgs
    rgw: 3 daemons active

  data:
    pools:   15 pools, 1476 pgs
    objects: 89.66 M objects, 49 TiB
    usage:   159 TiB used, 205 TiB / 364 TiB avail
    pgs:     612258/268993641 objects degraded (0.228%)
             67164650/268993641 objects misplaced (24.969%)
             945 active+clean
             507 active+remapped+backfill_wait
             9   active+remapped+backfill_wait+backfill_toofull
             7   active+remapped+backfilling
             4   active+undersized+degraded+remapped+backfill_wait
             4   active+undersized+degraded+remapped+backfilling

  io:
    client:   5.3 MiB/s rd, 3.9 MiB/s wr, 844 op/s rd, 81 op/s wr
    recovery: 19 MiB/s, 33 objects/s

Any clue as to what I can look at further to investigate the slow
recovery would be appreciated.
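For reference, a sketch of injecting those recovery settings at runtime,
and of the usual next step for the backfill_toofull PGs (the values are
simply the ones mentioned above, not recommendations):

# Inject recovery tuning into all OSDs at runtime
ceph tell 'osd.*' injectargs '--osd-recovery-sleep-hdd 0 --osd-max-backfills 16 --osd-recovery-max-active 8'

# backfill_toofull usually means some target OSDs are over the
# backfillfull ratio; check per-OSD utilisation and the ratios
ceph osd df tree
ceph osd dump | grep ratio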
Re: [ceph-users] Default Pools
You should be able to see all pools in use in a RGW zone from the
radosgw-admin command. This [1] is probably overkill for most, but I deal
with multi-realm clusters, so I generally think like this when dealing
with RGW. Running it as-is will create a file in your current directory
for each zone in your deployment (likely to be just one file). My rough
guess for what you would find in that file, based on your pool names,
would be this [2].

If you identify any pools not listed by the zone get command, you can
rename [3] the pool to see if it is being created and/or used by RGW
currently. The process here would be to stop all RGW daemons, rename the
pools, start a RGW daemon, stop it again, and see which pools were
recreated. Clean up the pools that were freshly made and rename the
original pools back into place before starting your RGW daemons again.
Please note that .rgw.root is a required pool in every RGW deployment and
will not be listed in the zones themselves.

[1]
for realm in $(radosgw-admin realm list --format=json | jq '.realms[]' -r); do
  for zonegroup in $(radosgw-admin --rgw-realm=$realm zonegroup list --format=json | jq '.zonegroups[]' -r); do
    for zone in $(radosgw-admin --rgw-realm=$realm --rgw-zonegroup=$zonegroup zone list --format=json | jq '.zones[]' -r); do
      echo $realm.$zonegroup.$zone.json
      radosgw-admin --rgw-realm=$realm --rgw-zonegroup=$zonegroup --rgw-zone=$zone zone get > $realm.$zonegroup.$zone.json
    done
  done
done

[2] default.default.default.json
{
    "id": "{{ UUID }}",
    "name": "default",
    "domain_root": "default.rgw.meta",
    "control_pool": "default.rgw.control",
    "gc_pool": ".rgw.gc",
    "log_pool": "default.rgw.log",
    "user_email_pool": ".users.email",
    "user_uid_pool": ".users.uid",
    "system_key": {},
    "placement_pools": [
        {
            "key": "default-placement",
            "val": {
                "index_pool": "default.rgw.buckets.index",
                "data_pool": "default.rgw.buckets.data",
                "data_extra_pool": "default.rgw.buckets.non-ec",
                "index_type": 0,
                "compression": ""
            }
        }
    ],
    "metadata_heap": "",
    "tier_config": [],
    "realm_id": "{{ UUID }}"
}

[3] ceph osd pool rename <current-pool-name> <new-pool-name>

On Thu, Apr 18, 2019 at 10:46 AM Brent Kennedy wrote:
> Yea, that was a cluster created during firefly...
>
> Wish there was a good article on the naming and use of these, or perhaps
> a way I could make sure they are not used before deleting them. I know
> RGW will recreate anything it uses, but I don't want to lose data because
> I wanted a clean system.
>
> -Brent
>
> -----Original Message-----
> From: Gregory Farnum
> Sent: Monday, April 15, 2019 5:37 PM
> To: Brent Kennedy
> Cc: Ceph Users
> Subject: Re: [ceph-users] Default Pools
>
> On Mon, Apr 15, 2019 at 1:52 PM Brent Kennedy wrote:
> >
> > I was looking around the web for the reason for some of the default
> > pools in Ceph and I can't find anything concrete. Here is our list;
> > some show no use at all. Can any of these be deleted (or is there an
> > article my googlefu failed to find that covers the default pools)?
> >
> > We only use buckets, so I took out .rgw.buckets, .users and
> > .rgw.buckets.index…
> >
> > Name
> > .log
> > .rgw.root
> > .rgw.gc
> > .rgw.control
> > .rgw
> > .users.uid
> > .users.email
> > .rgw.buckets.extra
> > default.rgw.control
> > default.rgw.meta
> > default.rgw.log
> > default.rgw.buckets.non-ec
>
> All of these are created by RGW when you run it, not by the core Ceph
> system. I think they're all used (although they may report sizes of 0,
> as they mostly make use of omap).
>
> > metadata
>
> Except this one used to be created by default for CephFS metadata, but
> that hasn't been true in many releases. So I guess you're looking at an
> old cluster? (In which case it's *possible* some of those RGW pools are
> also unused now but were needed in the past; I haven't kept good track
> of them.)
> -Greg
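A sketch of the rename-and-watch procedure David describes, using .rgw.gc
purely as an example suspect pool (check your own zone output first; pool
deletion also requires mon_allow_pool_delete to be enabled):

# stop all RGW daemons on every gateway host first
systemctl stop ceph-radosgw.target

# park the suspect pool under a temporary name
ceph osd pool rename .rgw.gc .rgw.gc.parked

# start one RGW daemon briefly, then stop it again
systemctl start ceph-radosgw.target
systemctl stop ceph-radosgw.target

# did RGW recreate the pool? if so, it is still in use
ceph osd lspools | grep 'rgw.gc'

# clean up the freshly created pool (requires mon_allow_pool_delete=true)
ceph osd pool delete .rgw.gc .rgw.gc --yes-i-really-really-mean-it

# and put the original back before starting the gateways again
ceph osd pool rename .rgw.gc.parked .rgw.gc
systemctl start ceph-radosgw.target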
Re: [ceph-users] showing active config settings
Thanks, but does this not work on Luminous maybe? I am on the mon hosts
trying this:

# ceph config set osd osd_recovery_max_active 4
Invalid command: unused arguments: [u'4']
config set : Set a configuration option at runtime (not persistent)
Error EINVAL: invalid command

# ceph daemon osd.0 config diff | grep -A5 osd_recovery_max_active
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory

On Tue, Apr 16, 2019 at 4:04 PM Brad Hubbard wrote:
> $ ceph config set osd osd_recovery_max_active 4
> $ ceph daemon osd.0 config diff|grep -A5 osd_recovery_max_active
>     "osd_recovery_max_active": {
>         "default": 3,
>         "mon": 4,
>         "override": 4,
>         "final": 4
>     },
>
> On Wed, Apr 17, 2019 at 5:29 AM solarflow99 wrote:
> >
> > I wish there was a way to query the running settings from one of the
> > MGR hosts, and it doesn't help that ansible doesn't even copy the
> > keyring to the OSD nodes, so commands there wouldn't work anyway.
> > I'm still puzzled why it doesn't show any change when I run this, no
> > matter what I set it to:
> >
> > # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> > osd_recovery_max_active = 3
> >
> > In fact it doesn't matter if I use an OSD number that doesn't exist;
> > same thing if I use ceph get.
> >
> > On Tue, Apr 16, 2019 at 1:18 AM Brad Hubbard wrote:
> >>
> >> On Tue, Apr 16, 2019 at 6:03 PM Paul Emmerich wrote:
> >> >
> >> > This works, it just says that it *might* require a restart, but this
> >> > particular option takes effect without a restart.
> >>
> >> We've already looked at changing the wording once to make it more
> >> palatable.
> >>
> >> http://tracker.ceph.com/issues/18424
> >>
> >> >
> >> > Implementation detail: this message shows up if there's no internal
> >> > function to be called when this option changes, so it can't be sure if
> >> > the change is actually doing anything, because the option might be
> >> > cached or only read on startup. But in this case the option is read
> >> > in the relevant path every time and no notification is required. The
> >> > injectargs command can't know that, though.
> >>
> >> Right on all counts. The functions are referred to as observers and
> >> register to be notified if the value changes, hence "not observed."
> >>
> >> >
> >> > Paul
> >> >
> >> > On Mon, Apr 15, 2019 at 11:38 PM solarflow99 wrote:
> >> > >
> >> > > Then why doesn't this work?
> >> > >
> >> > > # ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
> >> > > osd.0: osd_recovery_max_active = '4' (not observed, change may require restart)
> >> > > osd.1: osd_recovery_max_active = '4' (not observed, change may require restart)
> >> > > osd.2: osd_recovery_max_active = '4' (not observed, change may require restart)
> >> > > osd.3: osd_recovery_max_active = '4' (not observed, change may require restart)
> >> > > osd.4: osd_recovery_max_active = '4' (not observed, change may require restart)
> >> > >
> >> > > # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> >> > > osd_recovery_max_active = 3
> >> > >
> >> > > On Wed, Apr 10, 2019 at 7:21 AM Eugen Block wrote:
> >> > >>
> >> > >> > I always end up using "ceph --admin-daemon
> >> > >> > /var/run/ceph/name-of-socket-here.asok config show | grep ..."
> >> > >> > to get what is in effect now for a certain daemon.
> >> > >> > Needs you to be on the host of the daemon of course.
> >> > >>
> >> > >> Me too, I just wanted to try what the OP reported. And after
> >> > >> trying that, I'll keep it that way. ;-)
> >> > >>
> >> > >> Quoting Janne Johansson:
> >> > >>
> >> > >> > On Wed, 10 Apr 2019 at 13:37, Eugen Block wrote:
> >> > >> >
> >> > >> >> > If you don't specify which daemon to talk to, it tells you
> >> > >> >> > what the defaults would be for a random daemon started just
> >> > >> >> > now using the same config as you have in /etc/ceph/ceph.conf.
> >> > >> >>
> >> > >> >> I tried that, too, but the result is not correct:
> >> > >> >>
> >> > >> >> host1:~ # ceph -n osd.1 --show-config | grep osd_recovery_max_active
> >> > >> >> osd_recovery_max_active = 3
> >> > >> >>
> >> > >> >
> >> > >> > I always end up using "ceph --admin-daemon
> >> > >> > /var/run/ceph/name-of-socket-here.asok config show | grep ..."
> >> > >> > to get what is in effect now for a certain daemon.
> >> > >> > Needs you to be on the host of the daemon of course.
> >> > >> >
> >> > >> > --
> >> > >> > May the most significant bit of your life be positive.
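To summarise for Luminous users: the centralised "ceph config set" command
only exists from Mimic onwards, "--show-config" does not reflect runtime
changes, and the admin socket has to be queried on the host that runs the
daemon. A short sketch -- osd.1 and the default socket path are just
examples:

# change the value cluster-wide at runtime (works on Luminous)
ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'

# verify on the OSD's own host, via its admin socket
ceph daemon osd.1 config get osd_recovery_max_active
# or equivalently
ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok config show | grep osd_recovery_max_active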
Re: [ceph-users] ceph-iscsi: problem when discovery auth is disabled, but gateway receives auth requests
On 04/18/2019 06:24 AM, Matthias Leopold wrote:
> Hi,
>
> The Ceph iSCSI gateway has a problem when receiving discovery auth
> requests while discovery auth is not enabled. Target discovery fails in
> this case (see below). This is especially annoying with oVirt (KVM
> management platform), where you can't separate the two authentication
> phases. This leads to a situation where you are forced to use discovery
> auth and have the same credentials for target auth (for the oVirt
> target). These credentials (for discovery auth) would then have to be
> shared with other targets on the same gateway, which is not acceptable.
> I saw that other iSCSI vendors (FreeNAS) don't have this problem. I
> don't know if this is Ceph gateway specific or a general LIO target
> problem. In any case I would be very happy if this could be resolved. I
> think that smooth integration of the Ceph iSCSI gateway and oVirt should
> be of broader interest. Please correct me if I got anything wrong.
>
> Kernel messages when discovery_auth is disabled, but auth requests are
> received:
>
> Apr 18 13:05:01 ceiscsi0 kernel: CHAP user or password not set for Initiator ACL
> Apr 18 13:05:01 ceiscsi0 kernel: Security negotiation failed.
> Apr 18 13:05:01 ceiscsi0 kernel: iSCSI Login negotiation failed.

In case other people hit this, it can be tracked here:

https://github.com/ceph/ceph-iscsi/issues/68

It's a kernel bug: when discovery_auth is disabled, the kernel still
requires CHAP.
[ceph-users] How to minimize IO starvations while Bluestore try to delete WAL files
Hello,

Recently I have an issue where, when BlueStore tries to delete WAL files
(indicated in the OSD log), the IO utilisation of the disk (HDD -
spinning) reaches 100% and introduces slow requests into the cluster.

Is there any way to throttle this operation down or disable it completely?

Thanks.

Regards,
I Gede Iswara Darmawan
Information System - School of Industrial and System Engineering
Telkom University
P / SMS / WA : 081 322 070719
E : iswaradr...@gmail.com / iswaradr...@live.com
Re: [ceph-users] Ceph inside Docker containers inside VirtualBox
On Tue, Apr 23, 2019 at 2:58 PM Marc Roos wrote:
>
> I am not sure about your background knowledge of ceph, but if you are
> starting, maybe first try and get ceph working in a virtual
> environment; that should not be too much of a problem. Then try
> migrating it to your container. Now you are probably fighting too many
> issues at the same time.
>
> [...]
[ceph-users] getting pg inconsistent periodly
Hi,

I've been running a cluster for a while now, and recently it keeps ending
up in an unhealthy state: with 'ceph health detail', one or two PGs are
inconsistent. What's more, the PGs in the wrong state are not placed on
the same disk from one day to the next, so I don't think it's a disk
problem.

The cluster is running version 12.2.5.

Any idea about this strange issue?

Thanks
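Not an answer to the root cause, but a sketch of the usual way to inspect
and repair an inconsistent PG, which at least shows which OSD and which
object are involved each time (the PG id 2.1ab is only a placeholder):

# lists the inconsistent PGs, e.g. "pg 2.1ab is active+clean+inconsistent ..."
ceph health detail

# shows which object and which shard/OSD reported the scrub error
rados list-inconsistent-obj 2.1ab --format=json-pretty

# trigger a repair of that PG
ceph pg repair 2.1ab

If the same OSD keeps showing up in those reports, its disk is still worth
checking (dmesg, SMART) even though the affected PGs move around.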